Problem of building own dataset #90

@chufall

Hi, I'm trying to build an xarray dataset from my own data, using the following code:

```python
import numpy as np
import pandas as pd
import xarray

# `data`, `start_year` and `end_year` are defined elsewhere.
numpy_array = data

lons = np.linspace(-180, 180, 73)
lats = np.linspace(87.5, -87.5, 71)

# mydatetime=
# dims = ('batch', 'time', 'channel', 'longitude', 'latitude')
# Lead times as hourly timedeltas, one per step along the time axis.
times = pd.to_timedelta([f'{i:2d}:00:00' for i in range(data.shape[1])])
# Absolute hourly timestamps covering start_year through end_year.
datetimes = pd.date_range(start=f'{start_year}-01-01', end=f'{end_year}-12-31 23:00:00', freq='H')

coords = {
    'batch': range(data.shape[0]),
    'time': ('time', times),
    'channel': [0],
    'lon': ('lon', lons),
    'lat': ('lat', lats),
    'datetime': (('batch', 'time'), [datetimes])
}

# Keep only the first channel, preserving the channel axis.
data_vars = {'Temp': (('batch', 'time', 'channel', 'lat', 'lon'), numpy_array[:, :, 0:1, :, :])}
xr_dataset = xarray.Dataset(data_vars=data_vars, coords=coords)
```

However, in the resulting dataset, the "datetime" coord (an xarray.DataArray) has "batch" and "time" both as dims and as coords of its own, like this:

<xarray.DataArray 'datetime' (batch: 1, time: 17544)> Size: 140kB
array([['2020-01-01T00:00:00.000000000', '2020-01-01T01:00:00.000000000',
        '2020-01-01T02:00:00.000000000', ...,
        '2021-12-31T21:00:00.000000000', '2021-12-31T22:00:00.000000000',
        '2021-12-31T23:00:00.000000000']], dtype='datetime64[ns]')
Coordinates:
  * batch     (batch) int64 8B 0
  * time      (time) timedelta64[ns] 140kB 00:00:00 ... 730 days 23:00:00
    datetime  (batch, time) datetime64[ns] 140kB 2020-01-01 ... 2021-12-31T23...

In contrast, when I load the example dataset "source-era5_date-2022-01-01_res-0.25_levels-37_steps-01.nc", its "datetime" coord has the "batch" and "time" dims but only the "time" coord, like this:

<xarray.DataArray 'datetime' (batch: 1, time: 3)> Size: 24B
array([['2022-01-01T00:00:00.000000000', '2022-01-01T06:00:00.000000000',
        '2022-01-01T12:00:00.000000000']], dtype='datetime64[ns]')
Coordinates:
  * time      (time) timedelta64[ns] 24B 00:00:00 06:00:00 12:00:00
    datetime  (batch, time) datetime64[ns] 24B 2022-01-01 ... 2022-01-01T12:0...
Dimensions without coordinates: batch

So my question is: how can I remove the "batch" coord from "datetime" in my dataset?

Thanks a lot!

Sincerely, Qc
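
One possible approach, sketched here with a small stand-in array rather than the real data: the "batch" coord appears because 'batch' is listed in the `coords` dict, so either leaving that entry out or calling `drop_vars("batch")` on the finished dataset should keep "batch" as a dimension without a coordinate. The array shape, the variable names `ds` and `ds_no_batch`, and the start date below are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Stand-in data (assumption): shape is (batch, time, channel, lat, lon).
data = np.zeros((1, 4, 1, 3, 5), dtype=np.float32)

# Hourly lead times and the matching absolute timestamps.
times = pd.to_timedelta(np.arange(data.shape[1]), unit="h")
datetimes = pd.Timestamp("2020-01-01") + times

ds = xr.Dataset(
    data_vars={"Temp": (("batch", "time", "channel", "lat", "lon"), data)},
    coords={
        "batch": np.arange(data.shape[0]),
        "time": ("time", times),
        "lat": ("lat", np.linspace(87.5, -87.5, 3)),
        "lon": ("lon", np.linspace(-180, 180, 5)),
        # 2-D coordinate with a leading batch axis of length 1.
        "datetime": (("batch", "time"), datetimes.values[np.newaxis, :]),
    },
)

# Drop the "batch" index coordinate; the "batch" dimension itself remains.
ds_no_batch = ds.drop_vars("batch")

print(ds_no_batch["datetime"])
```

Omitting the 'batch' entry from `coords` when constructing the dataset should give the same result without the extra step.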
