Have a way to have Time objects in pandas Series and Dataframe objects #17495
Comments
As a concrete example of something that doesn't work: pd.Series(index=[astropy.time.Time("2024-01-01")]).resample("5Min")
File ~/micromamba/envs/sunpy-dev/lib/python3.12/site-packages/pandas/core/generic.py:9771, in NDFrame.resample(self, rule, axis, closed, label, convention, kind, on, level, origin, offset, group_keys)
9768 else:
9769 convention = "start"
-> 9771 return get_resampler(
9772 cast("Series | DataFrame", self),
9773 freq=rule,
9774 label=label,
9775 closed=closed,
9776 axis=axis,
9777 kind=kind,
9778 convention=convention,
9779 key=on,
9780 level=level,
9781 origin=origin,
9782 offset=offset,
9783 group_keys=group_keys,
9784 )
File ~/micromamba/envs/sunpy-dev/lib/python3.12/site-packages/pandas/core/resample.py:2050, in get_resampler(obj, kind, **kwds)
2046 """
2047 Create a TimeGrouper and return our resampler.
2048 """
2049 tg = TimeGrouper(obj, **kwds) # type: ignore[arg-type]
-> 2050 return tg._get_resampler(obj, kind=kind)
File ~/micromamba/envs/sunpy-dev/lib/python3.12/site-packages/pandas/core/resample.py:2272, in TimeGrouper._get_resampler(self, obj, kind)
2263 elif isinstance(ax, TimedeltaIndex):
2264 return TimedeltaIndexResampler(
2265 obj,
2266 timegrouper=self,
(...)
2269 gpr_index=ax,
2270 )
-> 2272 raise TypeError(
2273 "Only valid with DatetimeIndex, "
2274 "TimedeltaIndex or PeriodIndex, "
2275 f"but got an instance of '{type(ax).__name__}'"
2276 )
TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'Index'
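The traceback shows resample() accepting only the three native time index types. Here is a minimal sketch that reproduces the same error without astropy. It assumes that an object-dtype Index of strings is a fair stand-in for an Index holding Time instances. The sample values and timestamps are made up for illustration.

```python
import pandas as pd

values = [1.0, 2.0, 3.0]
stamps = ["2024-01-01T00:00:00", "2024-01-01T00:03:00", "2024-01-01T00:07:00"]

# A generic object-dtype Index (assumed stand-in for an Index of Time objects)
# is rejected as soon as resample() is called.
generic = pd.Series(values, index=pd.Index(stamps, dtype=object))
try:
    generic.resample("5min")
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# The same data on a native DatetimeIndex resamples without complaint.
native = pd.Series(values, index=pd.DatetimeIndex(stamps))
binned = native.resample("5min").sum()

print(error_message)
print(binned)
```

Converting the Time values to numpy datetime64 first (for example through astropy's datetime64 format) would presumably get past this check. That route goes through datetime64, which has no representation for leap seconds, so it does not address the request in this issue.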
To my mind the right answer is for those data container packages to allow non-native column types (à la astropy mixin columns) if they satisfy the array protocol. On the astropy end, we can certainly make that happen.
I think that is one thing (and maybe pandas does a bit of this already), but to use Time as an index specifically, we will need to do more work to allow for things like time-based resampling.
What is the problem this feature will solve?
Some heliophysics data includes leap seconds, and some has high-precision timing requirements. However, many people are familiar with pandas and xarray and want to use those tools to work with their data.
Currently, we cannot use astropy.time.Time objects as the index for a pandas.DataFrame or an xarray.Dataset, which would allow using leap seconds and other good stuff with pandas & xarray.
I don't have a good technical understanding of how to solve this problem, so this is a pretty speculative issue. I am hoping we can get some input from the pandas side.
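To make the leap-second point concrete, here is a small sketch using numpy's datetime64, the type underlying pandas' DatetimeIndex. It uses the real UTC leap second at the end of 2016. It is only an illustration of the limitation, not a statement about how any particular dataset is stored.

```python
import numpy as np

# 2016-12-31T23:59:60 was a real UTC leap second.
leap_second = "2016-12-31T23:59:60"

# datetime64 has no representation for second 60, so parsing fails.
try:
    np.datetime64(leap_second)
    parsed = True
except ValueError:
    parsed = False

# Arithmetic across the leap second also ignores it: two SI seconds elapsed
# between these instants, but datetime64 reports one.
before = np.datetime64("2016-12-31T23:59:59")
after = np.datetime64("2017-01-01T00:00:00")
elapsed_seconds = (after - before) / np.timedelta64(1, "s")

print(parsed, elapsed_seconds)
```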
Describe the desired outcome
There is an equivalent to DatetimeIndex which is backed with Time instead.
This would allow us to have timeseries as DataFrame objects using Time as the index, and would also allow us to use Time objects as indices in xarray (I am unsure of the exact details of how xarray uses the pandas Index; anyone with more understanding, please chime in).
Additional context
xarray has a CFTimeIndex, which I think is what I am proposing but for cftime instead of astropy Time, though I am unsure of the details.
This sunpy issue is one of the main motivations for this: sunpy/sunpy#5422
Bigger picture: if we had support for this, the Quantity 2 work, and some other bits, we should be able to make use of large chunks of the astropy ecosystem within xarray / pandas. This would make a massive difference for being able to use astropy with heliophysics data.