isimip_utils.xarray
Functions for working with xarray datasets for ISIMIP data.
init_dataset(lon=720, lat=360, time=None, dims=None, attrs=None, **variables)
Initialize a new xarray dataset with standard ISIMIP dimensions.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `lon` | `int \| ndarray` | Number of longitude points, or longitude array, or None to omit (default: 720). | `720` |
| `lat` | `int \| ndarray` | Number of latitude points, or latitude array, or None to omit (default: 360). | `360` |
| `time` | `int \| ndarray` | Number of time steps, or time array, or None to omit time dimension (default: None). | `None` |
| `attrs` | `dict` | Dictionary of attributes for variables and global attributes. | `None` |
| `dims` | `list` | List of dimensions (default time, lat, lon). | `None` |
| `**variables` | `ndarray` | Data variables to include in the dataset. | `{}` |
Returns:
| Type | Description |
|---|---|
| `Dataset` | Initialized xarray Dataset with coordinates and data variables. |
open_dataset(path, decode_cf=True, load=False)
Open a NetCDF dataset using xarray.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `str \| Path` | Path to the NetCDF file. | *required* |
| `decode_cf` | `bool` | Whether to decode CF conventions (default: True). | `True` |
| `load` | `bool` | Whether to load data into memory immediately (default: False). | `False` |
Returns:
| Type | Description |
|---|---|
| `Dataset` | Xarray Dataset object. |
Note
Handles non-standard time units such as growing seasons and years by converting
them to common_years with a 365_day calendar. Time units of months are read with the 360_day calendar.
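
As a rough illustration of what a 365_day calendar implies, the sketch below maps an offset in common_years onto a calendar date, treating every year as exactly 365 days. The helper `common_years_to_date` and the reference year are hypothetical and not part of isimip_utils; how the library performs the conversion internally is not shown here.

```python
# Month lengths in a 365_day (no-leap) calendar.
DAYS_IN_MONTH = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)
DAYS_PER_COMMON_YEAR = 365


def common_years_to_date(offset, ref_year):
    """Return (year, month, day) for `offset` common_years since ref_year-01-01.

    Hypothetical helper for illustration; assumes a 365_day calendar.
    """
    total_days = round(offset * DAYS_PER_COMMON_YEAR)
    year = ref_year + total_days // DAYS_PER_COMMON_YEAR
    day_of_year = total_days % DAYS_PER_COMMON_YEAR  # zero-based
    for month, length in enumerate(DAYS_IN_MONTH, start=1):
        if day_of_year < length:
            return year, month, day_of_year + 1
        day_of_year -= length
    raise AssertionError("unreachable: day_of_year is always below 365")
```

Because no leap days exist in this calendar, a whole number of common_years always lands on the same month and day as the reference date.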
load_dataset(path, decode_cf=True)
Open a NetCDF dataset using xarray and load data into memory immediately.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `str \| Path` | Path to the NetCDF file. | *required* |
| `decode_cf` | `bool` | Whether to decode CF conventions (default: True). | `True` |
Returns:
| Type | Description |
|---|---|
| `Dataset` | Xarray Dataset object. |
Note
Handles non-standard time units such as growing seasons and years by converting
them to common_years with a 365_day calendar. Time units of months are read with the 360_day calendar.
This is a shortcut for open_dataset(path, decode_cf, load=True).
write_dataset(ds, path)
Write an xarray dataset to a NetCDF file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `ds` | `Dataset` | Xarray Dataset to write. | *required* |
| `path` | `str \| Path` | Path where the NetCDF file will be written. | *required* |
Note
Automatically adds fill values, orders variables, adds compression, and sets time as the unlimited dimension.
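
Fill values, compression, and an unlimited time dimension correspond to standard options of xarray's `Dataset.to_netcdf`. The sketch below builds the per-variable `encoding` dictionary such a call could use. The helper `build_encoding`, the fill value `1e20`, and the variable name `tas` are assumptions made for illustration and are not taken from isimip_utils.

```python
FILL_VALUE = 1e20  # assumed fill value, for illustration only


def build_encoding(data_vars, complevel=5, fill_value=FILL_VALUE):
    """Build an xarray-style encoding dict for the given data variable names."""
    return {
        name: {
            "_FillValue": fill_value,  # fill value written to the file
            "zlib": True,              # enable deflate compression
            "complevel": complevel,    # compression level
        }
        for name in data_vars
    }


encoding = build_encoding(["tas"])

# With an xarray Dataset `ds`, these options would be passed as:
# ds.to_netcdf("output.nc", encoding=encoding, unlimited_dims=["time"])
```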
order_variables(ds)
Reorder dataset variables with coordinates first, then data variables.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `ds` | `Dataset` | Xarray Dataset to reorder. | *required* |
Returns:
| Type | Description |
|---|---|
| `Dataset` | Dataset with reordered variables. |
get_attrs(ds)
Get all attributes from coordinates and data variables.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `ds` | `Dataset` | Xarray Dataset. | *required* |
Returns:
| Type | Description |
|---|---|
| `dict` | Dictionary mapping variable names to their attributes. |
set_attrs(ds, attrs)
Set attributes on coordinates and data variables.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `ds` | `Dataset` | Xarray Dataset to modify. | *required* |
| `attrs` | `dict` | Dictionary mapping variable names to their attributes. | *required* |
Returns:
| Type | Description |
|---|---|
| `Dataset` | Modified dataset with updated attributes. |
set_fill_value_to_nan(ds)
Replace fill values with NaN in data variables. This is only needed for datasets that are read with decode_cf=False and whose encoding does not contain _FillValue.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `ds` | `Dataset` | Xarray Dataset to modify. | *required* |
Returns:
| Type | Description |
|---|---|
| `Dataset` | Dataset with fill values replaced by NaN. |
set_nan_to_fill_value(ds)
Replace NaN values with fill values in data variables. This is only needed for datasets that are read with decode_cf=False and whose encoding does not contain _FillValue.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `ds` | `Dataset` | Xarray Dataset to modify. | *required* |
Returns:
| Type | Description |
|---|---|
| `Dataset` | Dataset with NaN values replaced by fill values. |
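
set_fill_value_to_nan and set_nan_to_fill_value act as inverses of each other on the data values. A minimal sketch of the idea on a plain list of floats follows, assuming a fill value of `1e20` (an assumption; in practice the value would be read from the variable's metadata) and using hypothetical helper names that are not part of isimip_utils:

```python
import math

FILL_VALUE = 1e20  # assumed for illustration


def fill_to_nan(values, fill_value=FILL_VALUE):
    """Replace every occurrence of the fill value with NaN."""
    return [math.nan if v == fill_value else v for v in values]


def nan_to_fill(values, fill_value=FILL_VALUE):
    """Replace every NaN with the fill value."""
    return [fill_value if math.isnan(v) else v for v in values]


raw = [1.5, 1e20, 3.0]
masked = fill_to_nan(raw)       # second entry becomes NaN
restored = nan_to_fill(masked)  # round trip gives back the raw values
```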
remove_fill_value_from_coords(ds)
Remove _FillValue and missing_value attributes from the coords.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `ds` | `Dataset` | Xarray Dataset to modify. | *required* |
Returns:
| Type | Description |
|---|---|
| `Dataset` | Dataset with fill value removed for the coords. |
add_fill_value_to_data_vars(ds)
Add _FillValue and missing_value to data_vars if no encoding is present. This is the case for a newly created Dataset.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `ds` | `Dataset` | Xarray Dataset to modify. | *required* |
Returns:
| Type | Description |
|---|---|
| `Dataset` | Dataset with encoding added for the data_vars. |
add_compression_to_data_vars(ds, complevel=5)
Add compression to data variables.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `ds` | `Dataset` | Xarray Dataset to modify. | *required* |
| `complevel` | `int` | Compression level. | `5` |
Returns:
| Type | Description |
|---|---|
| `Dataset` | Dataset with updated encoding. |
compute_time(ds, timestamp)
Convert a datetime to numeric time value for dataset.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `ds` | `Dataset` | Dataset with time coordinate containing units and calendar. | *required* |
| `timestamp` | `datetime \| date \| None` | Timestamp to convert, or None. | *required* |
Returns:
| Type | Description |
|---|---|
| `float \| None` | Numeric time value in the dataset's units, or `None`. |
compute_offset(ds1, ds2)
Compute time offset between two datasets with different time units.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `ds1` | `Dataset` | First dataset with time coordinate. | *required* |
| `ds2` | `Dataset` | Second dataset with time coordinate. | *required* |
Returns:
| Type | Description |
|---|---|
| `DataArray \| None` | Time offset to apply to |
create_mask(ds, df, layer)
Create a spatial mask from a geometry layer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `ds` | `Dataset` | Xarray Dataset with lat/lon coordinates. | *required* |
| `df` | `DataFrame` | GeoDataFrame with geometry column. | *required* |
| `layer` | `int` | Index of the layer to use from the GeoDataFrame. | *required* |
Returns:
| Type | Description |
|---|---|
| `Dataset` | Xarray dataset with a |
convert_time(time, units='days since 1601-1-1 00:00:00', calendar='proleptic_gregorian')
Convert a time coordinate array to np.float64 using cftime.date2num.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `time` | `ndarray` | Time coordinate array. | *required* |
| `units` | `str` | Units for the time coordinate (default: `'days since 1601-1-1 00:00:00'`). | `'days since 1601-1-1 00:00:00'` |
| `calendar` | `str` | Calendar type for time coordinate (default: `'proleptic_gregorian'`). | `'proleptic_gregorian'` |
Returns:
| Name | Type | Description |
|---|---|---|
| `time` | `ndarray` | Time coordinate array as `np.float64`. |
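
With the default units and calendar, the numeric value is the number of days elapsed since 1601-01-01 in the proleptic Gregorian calendar. Python's standard `datetime` uses the same calendar, so the arithmetic can be sketched without cftime. The helper `days_since` is hypothetical, not part of isimip_utils, and covers only this one calendar.

```python
from datetime import datetime

REFERENCE = datetime(1601, 1, 1)  # from 'days since 1601-1-1 00:00:00'
SECONDS_PER_DAY = 86400


def days_since(timestamp, reference=REFERENCE):
    """Return the fractional days between reference and timestamp as a float."""
    return (timestamp - reference).total_seconds() / SECONDS_PER_DAY


print(days_since(datetime(2000, 1, 1)))         # 145731.0
print(days_since(datetime(2000, 1, 1, 12, 0)))  # 145731.5
```

For other calendars, such as 365_day or 360_day, the standard library is not sufficient and a calendar-aware routine like cftime.date2num is needed.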
to_dataframe(ds)
Convert an xarray Dataset to a pandas DataFrame.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `ds` | `Dataset` | Xarray Dataset to convert. | *required* |
Returns:
| Type | Description |
|---|---|
| `DataFrame` | Pandas DataFrame with coordinates and data variables as columns. Attributes are preserved in `df.attrs['coords']` and `df.attrs['data_vars']`. |
Note
Time coordinates are converted to datetime64[ns] format.
Data variables are converted to float64.