isimip_utils.pandas

Pandas DataFrame utilities for ISIMIP data.

get_coords(df)

Get the coordinate names from DataFrame attributes.

Parameters:

Name Type Description Default
df DataFrame

DataFrame with 'coords' in attrs.

required

Returns:

Type Description
tuple

Name of the coordinates.
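
The layout of the attributes is not spelled out here, so the following is only a minimal sketch. It assumes `df.attrs['coords']` is a dict keyed by coordinate name, with per-coordinate metadata such as `long_name`, `units`, and `axis`. The helper `coord_names` is hypothetical and merely mimics the documented behaviour; it is not the library's implementation.

```python
import pandas as pd

# Assumed layout: attrs['coords'] maps each coordinate name to its metadata.
df = pd.DataFrame({
    "time": pd.date_range("2000-01-01", periods=3, freq="D"),
    "tas": [280.1, 281.4, 279.8],
})
df.attrs["coords"] = {"time": {"long_name": "Time", "axis": "T"}}
df.attrs["data_vars"] = {"tas": {"long_name": "Near-Surface Air Temperature", "units": "K"}}


def coord_names(frame: pd.DataFrame) -> tuple:
    """Hypothetical stand-in for get_coords: coordinate names as a tuple."""
    return tuple(frame.attrs["coords"])


names = coord_names(df)   # all coordinate names
first = names[0]          # what get_first_coord is documented to return
```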

get_first_coord(df)

Get the first coordinate name from DataFrame attributes.

Parameters:

Name Type Description Default
df DataFrame

DataFrame with 'coords' in attrs.

required

Returns:

Type Description
str

Name of the first coordinate.

get_coord_labels(df)

Get formatted labels for all coordinates, with units.

Parameters:

Name Type Description Default
df DataFrame

DataFrame with 'coords' in attrs.

required

Returns:

Type Description
tuple

Formatted strings like "Coordinate Name [units]", or just the name for coordinates without units.
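
The label format described above can be sketched as follows. This assumes each coordinate's metadata is a dict with optional `long_name` and `units` keys; `format_label` is a hypothetical helper, not part of the documented API.

```python
import pandas as pd

df = pd.DataFrame({
    "time": pd.to_datetime(["2000-01-01", "2000-01-02"]),
    "lat": [10.0, 20.0],
    "tas": [280.0, 285.0],
})
# Assumed layout: one metadata dict per coordinate name.
df.attrs["coords"] = {
    "time": {"long_name": "Time"},
    "lat": {"long_name": "Latitude", "units": "degrees_north"},
}


def format_label(name: str, meta: dict) -> str:
    """Hypothetical helper: 'Long Name [units]', or just the name without units."""
    label = meta.get("long_name", name)
    units = meta.get("units")
    return f"{label} [{units}]" if units else label


labels = tuple(format_label(n, m) for n, m in df.attrs["coords"].items())
```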

get_first_coord_label(df)

Get a formatted label for the first coordinate, with units.

Parameters:

Name Type Description Default
df DataFrame

DataFrame with 'coords' in attrs.

required

Returns:

Type Description
str | None

Formatted string like "Coordinate Name [units]" or just the name if no units.

get_coord_axes(df)

Get the axis attribute for all coordinates.

Parameters:

Name Type Description Default
df DataFrame

DataFrame with 'coords' in attrs.

required

Returns:

Type Description
tuple

Axis attributes (e.g., 'T', 'X', 'Y'), one per coordinate.

get_first_coord_axis(df)

Get the axis attribute for the first coordinate.

Parameters:

Name Type Description Default
df DataFrame

DataFrame with 'coords' in attrs.

required

Returns:

Type Description
str | None

Axis attribute (e.g., 'T', 'X', 'Y'), or None if not set.
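
A short sketch of how the axis lookup might behave when a coordinate has no `axis` entry. It assumes the same dict-per-coordinate layout in `df.attrs['coords']`; whether the library returns `None` for missing entries in the tuple form is an assumption here.

```python
import pandas as pd

df = pd.DataFrame({
    "time": pd.to_datetime(["2000-01-01"]),
    "depth": [5.0],
    "thetao": [290.0],
})
# Assumed layout: 'depth' deliberately has no 'axis' entry.
df.attrs["coords"] = {"time": {"axis": "T"}, "depth": {"units": "m"}}

# One entry per coordinate; missing axis attributes become None.
axes = tuple(meta.get("axis") for meta in df.attrs["coords"].values())
first_axis = axes[0] if axes else None
```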

get_data_vars(df)

Get the data variable names from DataFrame attributes.

Parameters:

Name Type Description Default
df DataFrame

DataFrame with 'data_vars' in attrs.

required

Returns:

Type Description
tuple

Names of the data variables.

get_first_data_var(df)

Get the first data variable name from DataFrame attributes.

Parameters:

Name Type Description Default
df DataFrame

DataFrame with 'data_vars' in attrs.

required

Returns:

Type Description
str

Name of the first data variable.

get_data_var_labels(df)

Get formatted labels for the data variables, with units.

Parameters:

Name Type Description Default
df DataFrame

DataFrame with 'data_vars' in attrs.

required

Returns:

Type Description
str

Formatted string like "Variable Name [units]" or just the name if no units.

get_first_data_var_label(df)

Get a formatted label for the first data variable, with units.

Parameters:

Name Type Description Default
df DataFrame

DataFrame with 'data_vars' in attrs.

required

Returns:

Type Description
str

Formatted string like "Variable Name [units]" or just the name if no units.

compute_average(df, data_var=None, area=True, type='annual')

Compute yearly or monthly averages with optional standard deviation bounds.

Parameters:

Name Type Description Default
df DataFrame

DataFrame with time column and data variable.

required
data_var str

Name of the data variable (default: first data var).

None
area bool

Whether to include lower/upper bounds derived from the standard deviation (default: True).

True
type annual | monthly

Whether to compute annual or monthly averages.

'annual'

Returns:

Type Description
DataFrame

DataFrame with yearly or monthly aggregated data, depending on type.
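
The annual case can be approximated in plain pandas as below. This is a sketch under stated assumptions, not the library's code: the column names `year`, `mean`, `lower`, and `upper`, the `time` column name, and the use of mean ± one sample standard deviation for the bounds are all assumptions. The helper `annual_average` is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "time": pd.to_datetime(["2000-01-01", "2000-07-01", "2001-01-01", "2001-07-01"]),
    "tas": [1.0, 3.0, 5.0, 9.0],
})


def annual_average(frame: pd.DataFrame, data_var: str, area: bool = True) -> pd.DataFrame:
    """Hypothetical sketch: yearly mean, optionally with mean +/- std bounds."""
    grouped = frame.groupby(frame["time"].dt.year)[data_var]
    result = grouped.mean().rename("mean").to_frame()
    if area:
        std = grouped.std()  # sample standard deviation (ddof=1)
        result["lower"] = result["mean"] - std
        result["upper"] = result["mean"] + std
    result.index.name = "year"
    return result.reset_index()


with_bounds = annual_average(df, "tas")
without_bounds = annual_average(df, "tas", area=False)
```

A monthly variant would group on a different key; how the library defines that key is not stated here.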

group_by_day(df, data_var=None)

Group data by day of year and compute mean.

Parameters:

Name Type Description Default
df DataFrame

DataFrame with time column and data variable.

required
data_var str

Name of the data variable (default: first data var).

None

Returns:

Type Description
DataFrame

DataFrame grouped by day of year (1-365/366).
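
In plain pandas, a day-of-year mean can be sketched as follows. The `time` column name and the output column name `day` are assumptions for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "time": pd.to_datetime(["2000-01-01", "2001-01-01", "2000-01-02", "2001-01-02"]),
    "tas": [1.0, 3.0, 2.0, 6.0],
})

# Average all values that share the same day of year across years.
by_day = (
    df.groupby(df["time"].dt.dayofyear)["tas"]
    .mean()
    .rename_axis("day")
    .reset_index()
)
```

Note that day 366 only receives values from leap years, and that calendar dates after 29 February map to different day-of-year numbers in leap and non-leap years.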

group_by_month(df, data_var=None)

Group data by month and compute mean.

Parameters:

Name Type Description Default
df DataFrame

DataFrame with time column and data variable.

required
data_var str

Name of the data variable (default: first data var).

None

Returns:

Type Description
DataFrame

DataFrame grouped by month (1-12).
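
The monthly grouping follows the same pattern with the month number as the key. Again, the `time` and `month` column names are assumptions.

```python
import pandas as pd

df = pd.DataFrame({
    "time": pd.to_datetime(["2000-01-15", "2001-01-15", "2000-02-15"]),
    "tas": [1.0, 3.0, 5.0],
})

# Average all values that fall in the same calendar month, across years.
by_month = (
    df.groupby(df["time"].dt.month)["tas"]
    .mean()
    .rename_axis("month")
    .reset_index()
)
```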

normalize(df, data_var=None)

Normalize data variable using z-score normalization.

Parameters:

Name Type Description Default
df DataFrame

DataFrame with data variable to normalize.

required
data_var str

Name of the data variable (default: first data var).

None

Returns:

Type Description
DataFrame

DataFrame with normalized data variable (mean=0, std=1).
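
Z-score normalization subtracts the mean and divides by the standard deviation. The sketch below uses the pandas default (sample standard deviation, `ddof=1`); which variant the library uses is an assumption, and `zscore` is a hypothetical helper.

```python
import pandas as pd

df = pd.DataFrame({"tas": [1.0, 2.0, 3.0]})


def zscore(frame: pd.DataFrame, data_var: str) -> pd.DataFrame:
    """Hypothetical sketch: return a copy with data_var scaled to mean 0, std 1."""
    out = frame.copy()
    column = out[data_var]
    out[data_var] = (column - column.mean()) / column.std()
    return out


normalized = zscore(df, "tas")
```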

create_label(df, labels)

Add a label column to DataFrame by joining label strings.

Parameters:

Name Type Description Default
df DataFrame

DataFrame to add label to.

required
labels list[str]

List of label strings to join with spaces.

required

Returns:

Type Description
DataFrame

DataFrame with added 'label' column.
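
A minimal sketch of the described behaviour, assuming the joined string is written to every row. `add_label` is a hypothetical stand-in, and the label values are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"year": [2000, 2001], "tas": [280.0, 281.0]})


def add_label(frame: pd.DataFrame, labels: list) -> pd.DataFrame:
    """Hypothetical sketch: add a constant 'label' column built from the parts."""
    out = frame.copy()
    out["label"] = " ".join(labels)
    return out


labeled = add_label(df, ["model-a", "historical"])
```

A constant label column like this is convenient when several DataFrames are concatenated and later distinguished by label, for example in a plot legend.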