Skip to content

Conversation

@Kladar
Copy link
Contributor

@Kladar Kladar commented Dec 11, 2024

Summary

Before, we were localizing the given last_modified time parameter into the ISO local time (always EST). However, this is not the desired behavior and really last_modified can have any (or no) timezone.

Details

Particularly this would cause an issue if the last modified was meant to be in UTC, as it was localized to EST, resulting in a radically different time than was given to the function and used to compare.

@Kladar Kladar added the bug Something isn't working label Dec 11, 2024
@Kladar Kladar requested a review from WillKoehrsen December 11, 2024 00:24
@Kladar Kladar self-assigned this Dec 11, 2024
@Kladar
Copy link
Contributor Author

Kladar commented Dec 11, 2024

Looks like the files have rolled off, causing tests to fail
CleanShot 2024-12-10 at 16 29 56@2x

I will update the tests

RESOURCE_ADEQUACY_TEST_DATES = [
(
(pd.Timestamp.now(tz=default_timezone) - pd.Timedelta(days=119)).strftime(
(pd.Timestamp.now(tz=default_timezone) - pd.Timedelta(days=92)).strftime(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'll continue to dial this in

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dialing in continues...

"%Y-%m-%d",
),
(pd.Timestamp.now(tz=default_timezone) + pd.Timedelta(days=30)).strftime(
(pd.Timestamp.now(tz=default_timezone) + pd.Timedelta(days=34)).strftime(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the farthest out it forecasts, according to the data spec from the ISO

if last_modified:
last_modified = pd.Timestamp(last_modified, tz=self.default_timezone)
if last_modified.tz is None:
last_modified = last_modified.tz_localize("UTC")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the conversion to UTC. Aren't the last modified times from this page in EST?

date and end are converted to EST so it would be inconsistent to convert last_modified to UTC.

Copy link
Collaborator

@WillKoehrsen WillKoehrsen Dec 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify, when a function is decorated with @support_date_range, the date and end parameters get converted to the local timezone using the utils._handle_date method. So, maybe we should pass last_modified through utils._handle_date with the default_timezone.

default_timezone = args_dict["self"].default_timezone
# For today with sub daily data, create a range that spans the day
if (
self.frequency in ["HOUR_START", "5_MIN"]
and args_dict.get("date") == "today"
): # noqa
args_dict["date"] = pd.Timestamp.now(tz=default_timezone).floor("D")
args_dict["end"] = args_dict["date"] + pd.Timedelta(days=1)
args_dict["date"] = utils._handle_date(
args_dict["date"],
default_timezone,
)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah mostly I was converting to UTC here to make it unambiguous and explicit. The issue that was occurring was that a (timezone-naive) UTC time was being given to the kwarg last_modified which was then being (incorrectly) localized to local time here, resulting in no data.

So while yes the last modified times from the page are in EST, the date comparison used to filter out the files can compare two timestamps in different timezones correctly, as long as there is a known timezone to each. (If one is timezone-naive and the other is timezone-aware, the date comparison will error out with the classic Error: Cannot convert tz-naive Timestamp, use tz_localize to localize, which is why the localization was added here in the first place). But if the naive time is actually in UTC, then localizing to EST is not correct.

So the options are:

  1. Expect a timezone-aware time, error if not
  2. Handle a timezone-naive time to something (but only if it is naive).

I opt for 2, but can change that handling to EST rather than UTC (perhaps using utils._handle_date).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opt for 2, but can change that handling to EST rather than UTC (perhaps using utils._handle_date).

this would be consistent with the rest of the library.


if last_modified:
last_modified = pd.Timestamp(last_modified, tz=self.default_timezone)
if last_modified.tz is None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if needed, maybe you should raise an error if a tz-naive timestamp is passed.

if you don't like the raising an error approach and need to pick a timezone, we generally assume tz-naive timestamps are in the the default timezone of the market throughout this library

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think either is fine, I'm doing the latter now albeit to UTC, which I can change to EST (via self.default_timezone).

@Kladar
Copy link
Contributor Author

Kladar commented Dec 11, 2024

Also, rolloff seems to be happening daily now
CleanShot 2024-12-11 at 09 33 21@2x

if last_modified:
last_modified = pd.Timestamp(last_modified, tz=self.default_timezone)
if last_modified.tz is None:
last_modified = last_modified.tz_localize("UTC")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to do the same handling here as above?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops, yes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated.

Copy link
Collaborator

@WillKoehrsen WillKoehrsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

@Kladar Kladar merged commit 8d1d638 into main Dec 12, 2024
12 of 16 checks passed
@Kladar Kladar deleted the kladar/fix/ieso-adequacy-timezone branch December 12, 2024 00:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants