isimip_utils.extractions

Data extraction and manipulation utilities for xarray datasets.

select_time(ds, timestamp)

Select a single time point from a dataset.

Parameters:

Name Type Description Default
ds Dataset

Dataset with time dimension.

required
timestamp datetime

Timestamp to select.

required

Returns:

Type Description
Dataset | None

Dataset at the selected time, or None if timestamp is outside range.
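The behaviour described above can be approximated with plain xarray. This is a minimal sketch, not the library's implementation: `select_time_sketch` is a hypothetical name, and taking the nearest time step inside the range is an assumption.

```python
from datetime import datetime

import numpy as np
import pandas as pd
import xarray as xr


def select_time_sketch(ds, timestamp):
    """Return the dataset at `timestamp`, or None if it lies outside the time axis."""
    ts = np.datetime64(timestamp)
    times = ds["time"].values
    if ts < times.min() or ts > times.max():
        return None
    # Assumption: the nearest time step is taken; the real function may differ.
    return ds.sel(time=ts, method="nearest")


# Ten daily values starting on 2000-01-01.
ds = xr.Dataset(
    {"tas": ("time", np.arange(10.0))},
    coords={"time": pd.date_range("2000-01-01", periods=10, freq="D")},
)
inside = select_time_sketch(ds, datetime(2000, 1, 3))
outside = select_time_sketch(ds, datetime(1999, 1, 1))
```

Because the return value may be None, callers should check for it before accessing variables.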

select_period(ds, start, end)

Select a time period from a dataset.

Parameters:

Name Type Description Default
ds Dataset

Dataset with time dimension.

required
start datetime | None

Start of period, or None for beginning.

required
end datetime | None

End of period, or None for end.

required

Returns:

Type Description
Dataset

Dataset with time dimension sliced to the period.

Raises:

Type Description
ExtractionError

If no time axis remains after selection.
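A rough equivalent in plain xarray, shown only to illustrate the open-ended bounds and the error case. `select_period_sketch` and the local `ExtractionError` class are stand-ins introduced here (the library's import path for the exception is not shown on this page), and treating an empty time axis as the error condition is an assumption.

```python
from datetime import datetime

import numpy as np
import pandas as pd
import xarray as xr


class ExtractionError(Exception):
    """Local stand-in for the library's exception."""


def select_period_sketch(ds, start=None, end=None):
    # A None bound leaves that side of the slice open.
    selected = ds.sel(time=slice(start, end))
    # Assumption: an empty time axis is what triggers the error.
    if selected.sizes.get("time", 0) == 0:
        raise ExtractionError("no time axis remains after selection")
    return selected


ds = xr.Dataset(
    {"tas": ("time", np.arange(10.0))},
    coords={"time": pd.date_range("2000-01-01", periods=10, freq="D")},
)
window = select_period_sketch(ds, datetime(2000, 1, 3), datetime(2000, 1, 5))
head = select_period_sketch(ds, None, datetime(2000, 1, 2))
```

Label-based slices in xarray include both end points, so `window` covers 3, 4 and 5 January.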

select_point(ds, lat, lon)

Select a single geographic point from a dataset.

Parameters:

Name Type Description Default
ds Dataset

Dataset with lat/lon dimensions.

required
lat float

Latitude (-90 to 90).

required
lon float

Longitude (-180 to 180).

required

Returns:

Type Description
Dataset

Dataset at the nearest grid point.

Raises:

Type Description
ValidationError

If lat/lon are out of valid range.
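Nearest-point selection with range checks can be sketched as follows. `select_point_sketch` and the local `ValidationError` class are hypothetical stand-ins, and the exact checks performed by the library are an assumption based on the parameter descriptions above.

```python
import numpy as np
import xarray as xr


class ValidationError(Exception):
    """Local stand-in for the library's exception."""


def select_point_sketch(ds, lat, lon):
    # Reject coordinates outside the documented ranges.
    if not -90 <= lat <= 90:
        raise ValidationError(f"lat={lat} is outside [-90, 90]")
    if not -180 <= lon <= 180:
        raise ValidationError(f"lon={lon} is outside [-180, 180]")
    # Pick the grid cell whose centre is closest to the requested point.
    return ds.sel(lat=lat, lon=lon, method="nearest")


ds = xr.Dataset(
    {"tas": (("lat", "lon"), np.arange(25.0).reshape(5, 5))},
    coords={
        "lat": [-60.0, -30.0, 0.0, 30.0, 60.0],
        "lon": [-120.0, -60.0, 0.0, 60.0, 120.0],
    },
)
point = select_point_sketch(ds, lat=31.2, lon=58.0)
```

The result keeps the coordinates of the grid cell that was actually selected, not the requested ones.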

select_bbox(ds, west, east, south, north)

Select a bounding box region from a dataset.

Parameters:

Name Type Description Default
ds Dataset

Dataset with lat/lon dimensions.

required
west float

Western longitude boundary (-180 to 180).

required
east float

Eastern longitude boundary (-180 to 180).

required
south float

Southern latitude boundary (-90 to 90).

required
north float

Northern latitude boundary (-90 to 90).

required

Returns:

Type Description
Dataset

Dataset with lat/lon dimensions sliced to the bounding box.

Raises:

Type Description
ValidationError

If coordinates are out of valid range.

ExtractionError

If no lat or lon axis remains after selection.
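The sketch below shows one way to slice a bounding box with plain xarray. `select_bbox_sketch` and the local `ExtractionError` class are stand-ins introduced here, range validation is omitted for brevity, and the handling of descending latitude axes is an assumption about typical gridded data rather than a statement about the library.

```python
import numpy as np
import xarray as xr


class ExtractionError(Exception):
    """Local stand-in for the library's exception."""


def select_bbox_sketch(ds, west, east, south, north):
    # Label-based slices follow the coordinate order, so the bounds are
    # swapped when latitudes are stored from north to south.
    lat = ds["lat"].values
    lat_slice = slice(south, north) if lat[0] <= lat[-1] else slice(north, south)
    selected = ds.sel(lat=lat_slice, lon=slice(west, east))
    # Assumption: an empty lat or lon axis is what triggers the error.
    if selected.sizes["lat"] == 0 or selected.sizes["lon"] == 0:
        raise ExtractionError("no lat or lon axis remains after selection")
    return selected


ds = xr.Dataset(
    {"tas": (("lat", "lon"), np.arange(25.0).reshape(5, 5))},
    coords={
        "lat": [60.0, 30.0, 0.0, -30.0, -60.0],  # descending latitudes
        "lon": [-120.0, -60.0, 0.0, 60.0, 120.0],
    },
)
box = select_bbox_sketch(ds, west=-60, east=60, south=-30, north=30)
```

Unlike masking, this reduces the size of the lat and lon dimensions.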

mask_bbox(ds, west, east, south, north)

Mask a dataset to a bounding box, setting values outside to NaN.

Parameters:

Name Type Description Default
ds Dataset

Dataset with lat/lon dimensions.

required
west float

Western longitude boundary (-180 to 180).

required
east float

Eastern longitude boundary (-180 to 180).

required
south float

Southern latitude boundary (-90 to 90).

required
north float

Northern latitude boundary (-90 to 90).

required

Returns:

Type Description
Dataset

Dataset with values outside bounding box masked as NaN.

Raises:

Type Description
ValidationError

If coordinates are out of valid range.
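Masking, in contrast to slicing, keeps the grid shape and replaces values outside the box with NaN. A minimal sketch with `Dataset.where`, where `mask_bbox_sketch` is a hypothetical name, inclusive boundaries are an assumption, and range validation is omitted:

```python
import numpy as np
import xarray as xr


def mask_bbox_sketch(ds, west, east, south, north):
    # Boolean (lat, lon) field that is True inside the box.
    inside = (
        (ds["lat"] >= south)
        & (ds["lat"] <= north)
        & (ds["lon"] >= west)
        & (ds["lon"] <= east)
    )
    # where() keeps values where the condition holds and fills NaN elsewhere.
    return ds.where(inside)


ds = xr.Dataset(
    {"tas": (("lat", "lon"), np.ones((5, 5)))},
    coords={
        "lat": [60.0, 30.0, 0.0, -30.0, -60.0],
        "lon": [-120.0, -60.0, 0.0, 60.0, 120.0],
    },
)
masked = mask_bbox_sketch(ds, west=-60, east=60, south=-30, north=30)
```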

mask_mask(ds, mask_ds, mask_var='mask', inverse=False)

Apply a mask dataset to another dataset.

Parameters:

Name Type Description Default
ds Dataset

Dataset to mask.

required
mask_ds Dataset

Dataset containing mask variable.

required
mask_var str

Name of mask variable (default: 'mask').

'mask'
inverse bool

Whether to invert the mask (default: False).

False

Returns:

Type Description
Dataset

Masked dataset that keeps values where the mask is 1 (or where it is 0 if inverse=True).
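The effect of `inverse` can be illustrated with a small sketch. `mask_mask_sketch` is a hypothetical stand-in, and it assumes that the mask variable holds 0/1 values on the same lat/lon grid as the data.

```python
import numpy as np
import xarray as xr


def mask_mask_sketch(ds, mask_ds, mask_var="mask", inverse=False):
    # Keep cells where the mask is 1, or where it is 0 when inverted.
    keep_value = 0 if inverse else 1
    return ds.where(mask_ds[mask_var] == keep_value)


coords = {"lat": [10.0, -10.0], "lon": [0.0, 20.0]}
ds = xr.Dataset(
    {"tas": (("lat", "lon"), np.array([[1.0, 2.0], [3.0, 5.0]]))}, coords=coords
)
mask_ds = xr.Dataset(
    {"mask": (("lat", "lon"), np.array([[1, 0], [0, 1]]))}, coords=coords
)
kept = mask_mask_sketch(ds, mask_ds)
inverted = mask_mask_sketch(ds, mask_ds, inverse=True)
```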

compute_aggregation(ds, type, dim=None, weights=None)

Compute aggregated values over selected dimensions and add dummy dimensions like CDO.

Parameters:

Name Type Description Default
ds Dataset

Dataset to process.

required
type str

Type of aggregation.

required
dim str | Iterable

Dimensions along which to apply the aggregation. If None, ('lat', 'lon') is used.

None
weights DataArray | None

Weights for averaging over lat/lon. If None, uses latitude-dependent weights.

None

Returns:

Type Description
Dataset

Dataset with aggregated values over selected dimensions.
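To make the weighting concrete, the sketch below computes a weighted mean with xarray's `weighted` API. It covers only the mean case, `weighted_mean_sketch` is a hypothetical name, and two things are assumptions: that the latitude-dependent weights are the cosine of the latitude, and that the dummy dimensions are length-1 lat/lon dimensions.

```python
import numpy as np
import xarray as xr


def weighted_mean_sketch(ds, dim=("lat", "lon"), weights=None):
    if weights is None:
        # Assumption: grid cells are weighted by cos(latitude).
        weights = np.cos(np.deg2rad(ds["lat"]))
    mean = ds.weighted(weights).mean(dim=dim)
    # Assumption: the reduced dimensions come back with length 1.
    return mean.expand_dims({name: 1 for name in dim})


# Value 4 at the equator and 1 at 60N/60S, constant in time and longitude.
values = np.broadcast_to(
    np.array([1.0, 4.0, 1.0])[None, :, None], (2, 3, 2)
).copy()
ds = xr.Dataset(
    {"tas": (("time", "lat", "lon"), values)},
    coords={"time": [0, 1], "lat": [-60.0, 0.0, 60.0], "lon": [0.0, 90.0]},
)
result = weighted_mean_sketch(ds)
```

With weights of 0.5, 1 and 0.5, the weighted mean is (0.5·1 + 1·4 + 0.5·1) / 2 = 2.5, compared with an unweighted mean of 2.0.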

compute_mean(ds, dim=None, weights=None)

Compute mean values over selected dimensions and add dummy dimensions like CDO. Wrapper for compute_aggregation.

Parameters:

Name Type Description Default
ds Dataset

Dataset to process.

required
dim str | Iterable

Dimensions along which to apply the mean. If None, ('lat', 'lon') is used.

None
weights DataArray | None

Weights for averaging over lat/lon. If None, uses latitude-dependent weights.

None

Returns:

Type Description
Dataset

Dataset with mean values over selected dimensions.

compute_std(ds, dim=None, weights=None)

Compute the standard deviation over selected dimensions and add dummy dimensions like CDO. Wrapper for compute_aggregation.

Parameters:

Name Type Description Default
ds Dataset

Dataset to process.

required
dim str | Iterable

Dimensions along which to compute the standard deviation. If None, ('lat', 'lon') is used.

None
weights DataArray | None

Weights for averaging over lat/lon. If None, uses latitude-dependent weights.

None

Returns:

Type Description
Dataset

Dataset with the standard deviation over selected dimensions.

compute_sum(ds, dim=None, weights=None)

Compute the sum over selected dimensions and add dummy dimensions like CDO. Wrapper for compute_aggregation.

Parameters:

Name Type Description Default
ds Dataset

Dataset to process.

required
dim str | Iterable

Dimensions along which to compute the sum. If None, ('lat', 'lon') is used.

None
weights DataArray | None

Weights for averaging over lat/lon. If None, uses latitude-dependent weights.

None

Returns:

Type Description
Dataset

Dataset with the sum over selected dimensions.

compute_min(ds, dim=None)

Compute minimum values over selected dimensions and add dummy dimensions like CDO. Wrapper for compute_aggregation.

Parameters:

Name Type Description Default
ds Dataset

Dataset to process.

required
dim str | Iterable

Dimensions along which to compute the minimum. If None, ('lat', 'lon') is used.

None

Returns:

Type Description
Dataset

Dataset with minimum values over selected dimensions.

compute_max(ds, dim=None)

Compute maximum values over selected dimensions and add dummy dimensions like CDO. Wrapper for compute_aggregation.

Parameters:

Name Type Description Default
ds Dataset

Dataset to process.

required
dim str | Iterable

Dimensions along which to compute the maximum. If None, ('lat', 'lon') is used.

None

Returns:

Type Description
Dataset

Dataset with maximum values over selected dimensions.

count_values(ds, dim=None)

Count non-NaN values over the selected dimensions (lat/lon by default).

Parameters:

Name Type Description Default
ds Dataset

Dataset with lat/lon dimensions.

required
dim str | Iterable

Dimensions along which to count [default: ('lat', 'lon')]

None

Returns:

Type Description
Dataset

Dataset with count of non-NaN values per time step.
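Counting valid cells per time step corresponds to xarray's `Dataset.count`. The sketch below uses a hypothetical `count_values_sketch` and does not add any dummy dimensions, which the library function may or may not do.

```python
import numpy as np
import xarray as xr


def count_values_sketch(ds, dim=("lat", "lon")):
    # count() tallies the non-NaN entries along the given dimensions.
    return ds.count(dim=dim)


values = np.ones((2, 2, 2))
values[0, 0, 0] = np.nan  # one missing cell in the first time step
ds = xr.Dataset(
    {"tas": (("time", "lat", "lon"), values)},
    coords={"time": [0, 1], "lat": [10.0, -10.0], "lon": [0.0, 20.0]},
)
counts = count_values_sketch(ds)
```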

concat_extraction(ds1, ds2)

Concatenate two datasets along time dimension with offset correction.

Parameters:

Name Type Description Default
ds1 Dataset | None

First dataset, or None.

required
ds2 Dataset

Second dataset to concatenate.

required

Returns:

Type Description
Dataset

Concatenated dataset, or copy of ds2 if ds1 is None.
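The None handling makes the function convenient for accumulating results in a loop. The sketch below shows that pattern with `xarray.concat`. `concat_extraction_sketch` is a hypothetical stand-in, and it leaves out the offset correction because this page does not describe how it works.

```python
import numpy as np
import pandas as pd
import xarray as xr


def concat_extraction_sketch(ds1, ds2):
    # First chunk: nothing to append to yet, so return a copy of ds2.
    if ds1 is None:
        return ds2.copy()
    # Note: the offset correction mentioned above is NOT reproduced here.
    return xr.concat([ds1, ds2], dim="time")


def make_chunk(start, periods):
    """Build a small daily time series beginning at `start`."""
    time = pd.date_range(start, periods=periods, freq="D")
    return xr.Dataset({"tas": ("time", np.arange(float(periods)))}, coords={"time": time})


chunks = [make_chunk("2000-01-01", 3), make_chunk("2000-01-04", 2)]
combined = None
for chunk in chunks:
    combined = concat_extraction_sketch(combined, chunk)
```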