-
Notifications
You must be signed in to change notification settings - Fork 70
fix: Disambiguate last_modified timezone better
#502
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| RESOURCE_ADEQUACY_TEST_DATES = [ | ||
| ( | ||
| (pd.Timestamp.now(tz=default_timezone) - pd.Timedelta(days=119)).strftime( | ||
| (pd.Timestamp.now(tz=default_timezone) - pd.Timedelta(days=92)).strftime( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'll continue to dial this in
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dialing in continues...
| "%Y-%m-%d", | ||
| ), | ||
| (pd.Timestamp.now(tz=default_timezone) + pd.Timedelta(days=30)).strftime( | ||
| (pd.Timestamp.now(tz=default_timezone) + pd.Timedelta(days=34)).strftime( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is the farthest out it forecasts, according to the data spec from the ISO
gridstatus/ieso.py
Outdated
| if last_modified: | ||
| last_modified = pd.Timestamp(last_modified, tz=self.default_timezone) | ||
| if last_modified.tz is None: | ||
| last_modified = last_modified.tz_localize("UTC") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand the conversion to UTC. Aren't the last modified times from this page in EST?
date and end are converted to EST so it would be inconsistent to convert last_modified to UTC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To clarify, when a function is decorated with @support_date_range, the date and end parameters get converted to the local timezone using the utils._handle_date method. So, maybe we should pass last_modified through utils._handle_date with the default_timezone.
gridstatus/gridstatus/decorators.py
Lines 95 to 108 in 88acf30
| default_timezone = args_dict["self"].default_timezone | |
| # For today with sub daily data, create a range that spans the day | |
| if ( | |
| self.frequency in ["HOUR_START", "5_MIN"] | |
| and args_dict.get("date") == "today" | |
| ): # noqa | |
| args_dict["date"] = pd.Timestamp.now(tz=default_timezone).floor("D") | |
| args_dict["end"] = args_dict["date"] + pd.Timedelta(days=1) | |
| args_dict["date"] = utils._handle_date( | |
| args_dict["date"], | |
| default_timezone, | |
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah mostly I was converting to UTC here to make it unambiguous and explicit. The issue that was occurring was that a (timezone-naive) UTC time was being given to the kwarg last_modified which was then being (incorrectly) localized to local time here, resulting in no data.
So while yes the last modified times from the page are in EST, the date comparison used to filter out the files can compare two timestamps in different timezones correctly, as long as there is a known timezone to each. (If one is timezone-naive and the other is timezone-aware, the date comparison will error out with the classic Error: Cannot convert tz-naive Timestamp, use tz_localize to localize, which is why the localization was added here in the first place). But if the naive time is actually in UTC, then localizing to EST is not correct.
So the options are:
- Expect a timezone-aware time, error if not
- Handle a timezone-naive time to something (but only if it is naive).
I opt for 2, but can change that handling to EST rather than UTC (perhaps using utils._handle_date).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I opt for 2, but can change that handling to EST rather than UTC (perhaps using utils._handle_date).
this would be consistent with the rest of the library.
|
|
||
| if last_modified: | ||
| last_modified = pd.Timestamp(last_modified, tz=self.default_timezone) | ||
| if last_modified.tz is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if needed, maybe you should raise an error if a tz-naive timestamp is passed.
if you don't like the raising an error approach and need to pick a timezone, we generally assume tz-naive timestamps are in the the default timezone of the market throughout this library
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think either is fine, I'm doing the latter now albeit to UTC, which I can change to EST (via self.default_timezone).
gridstatus/ieso.py
Outdated
| if last_modified: | ||
| last_modified = pd.Timestamp(last_modified, tz=self.default_timezone) | ||
| if last_modified.tz is None: | ||
| last_modified = last_modified.tz_localize("UTC") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to do the same handling here as above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops, yes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated.
WillKoehrsen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.


Summary
Before, we were localizing the given
last_modifiedtime parameter into the ISO local time (always EST). However, this is not the desired behavior and reallylast_modifiedcan have any (or no) timezone.Details
Particularly this would cause an issue if the last modified was meant to be in UTC, as it was localized to EST, resulting in a radically different time than was given to the function and used to compare.