isimip_utils.pandas
Pandas DataFrame utilities for ISIMIP data.
get_coords(df)
Get the coordinate names from DataFrame attributes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | DataFrame with 'coords' in attrs. | required |
Returns:
| Type | Description |
|---|---|
| tuple | Names of the coordinates. |
get_first_coord(df)
Get the first coordinate name from DataFrame attributes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | DataFrame with 'coords' in attrs. | required |
Returns:
| Type | Description |
|---|---|
| str | Name of the first coordinate. |
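
The two accessors above only read metadata from `df.attrs`. The following is a minimal sketch of that idea, not the library's implementation: it assumes `attrs['coords']` is a dict keyed by coordinate name, and `coords_sketch` and `first_coord_sketch` are hypothetical stand-ins for `get_coords` and `get_first_coord`.

```python
import pandas as pd

df = pd.DataFrame({"lat": [10.0, 10.0], "lon": [20.0, 20.5], "tas": [280.1, 281.4]})

# Assumption: coordinate metadata is stored as a dict keyed by coordinate name.
df.attrs["coords"] = {
    "lat": {"long_name": "Latitude", "units": "degrees_north", "axis": "Y"},
    "lon": {"long_name": "Longitude", "units": "degrees_east", "axis": "X"},
}


def coords_sketch(frame: pd.DataFrame) -> tuple:
    """Hypothetical stand-in for get_coords: coordinate names in stored order."""
    return tuple(frame.attrs["coords"])


def first_coord_sketch(frame: pd.DataFrame) -> str:
    """Hypothetical stand-in for get_first_coord."""
    return coords_sketch(frame)[0]


coord_names = coords_sketch(df)
first_name = first_coord_sketch(df)
```

Because Python dicts preserve insertion order, the "first" coordinate in this sketch is simply the first key that was stored.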
get_coord_labels(df)
Get formatted labels for all coordinates, including units.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | DataFrame with 'coords' in attrs. | required |
Returns:
| Type | Description |
|---|---|
| tuple | Formatted strings like "Coordinate Name [units]", or just the name if no units are set, one per coordinate. |
get_first_coord_label(df)
Get a formatted label for the first coordinate, including units.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | DataFrame with 'coords' in attrs. | required |
Returns:
| Type | Description |
|---|---|
| str \| None | Formatted string like "Coordinate Name [units]" or just the name if no units. |
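
The label format described above ("Coordinate Name [units]", falling back to the bare name) can be sketched as follows. This is an illustration under assumptions: the `long_name` and `units` keys and the helper `coord_labels_sketch` are hypothetical and may not match how the library stores or formats its metadata.

```python
import pandas as pd

df = pd.DataFrame({"time": pd.to_datetime(["2000-01-01"]), "depth": [5.0]})

# Assumption: each coordinate carries optional 'long_name' and 'units' entries.
df.attrs["coords"] = {
    "time": {"long_name": "Time"},
    "depth": {"long_name": "Depth", "units": "m"},
}


def coord_labels_sketch(frame: pd.DataFrame) -> tuple:
    """Hypothetical stand-in for get_coord_labels."""
    labels = []
    for name, meta in frame.attrs["coords"].items():
        title = meta.get("long_name", name)  # fall back to the column name
        units = meta.get("units")
        labels.append(f"{title} [{units}]" if units else title)
    return tuple(labels)


labels = coord_labels_sketch(df)
first_label = labels[0] if labels else None
```

Such labels are typically passed straight to a plotting library as axis titles.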
get_coord_axes(df)
Get the axis attribute for all coordinates.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | DataFrame with 'coords' in attrs. | required |
Returns:
| Type | Description |
|---|---|
| tuple | Axis attributes (e.g., 'T', 'X', 'Y'), one per coordinate. |
get_first_coord_axis(df)
Get the axis attribute for the first coordinate.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | DataFrame with 'coords' in attrs. | required |
Returns:
| Type | Description |
|---|---|
| str \| None | Axis attribute (e.g., 'T', 'X', 'Y'), or None if not set. |
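
As a rough sketch of how the axis lookups might behave when a coordinate has no axis set, assuming an `axis` key inside each coordinate's metadata dict (`coord_axes_sketch` is a hypothetical stand-in, not the library function):

```python
import pandas as pd

df = pd.DataFrame({"time": pd.to_datetime(["2000-01-01"]), "level": [1]})

# Assumption: the axis is stored under an 'axis' key; 'level' has none here.
df.attrs["coords"] = {"time": {"axis": "T"}, "level": {}}


def coord_axes_sketch(frame: pd.DataFrame) -> tuple:
    """Hypothetical stand-in for get_coord_axes: None marks a missing axis."""
    return tuple(meta.get("axis") for meta in frame.attrs["coords"].values())


axes = coord_axes_sketch(df)
first_axis = axes[0] if axes else None
is_time_series = first_axis == "T"  # e.g. to choose a line plot over a map
```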
get_data_vars(df)
Get the data variable names from DataFrame attributes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | DataFrame with 'data_vars' in attrs. | required |
Returns:
| Type | Description |
|---|---|
| tuple | Names of the data variables. |
get_first_data_var(df)
Get the first data variable name from DataFrame attributes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | DataFrame with 'data_vars' in attrs. | required |
Returns:
| Type | Description |
|---|---|
| str | Name of the first data variable. |
get_data_var_labels(df)
Get formatted labels for the data variables, including units.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | DataFrame with 'data_vars' in attrs. | required |
Returns:
| Type | Description |
|---|---|
| str | Formatted string like "Variable Name [units]" or just the name if no units. |
get_first_data_var_label(df)
Get a formatted label for the first data variable, including units.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | DataFrame with 'data_vars' in attrs. | required |
Returns:
| Type | Description |
|---|---|
| str | Formatted string like "Variable Name [units]" or just the name if no units. |
compute_average(df, data_var=None, area=True, type='annual')
Compute yearly or monthly average with optional standard deviation bounds.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | DataFrame with time column and data variable. | required |
| data_var | str | Name of the data variable (default: first data var). | None |
| area | bool | Whether to include lower/upper bounds using std (default: True). | True |
| type | annual \| monthly | Compute annual or monthly averages. | 'annual' |
Returns:
| Type | Description |
|---|---|
| DataFrame | DataFrame with annually or monthly aggregated data. |
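
To make the "average with standard deviation bounds" idea concrete, here is a minimal pandas sketch of the annual case only. It is not the library's implementation: the `time` column name, the `lower`/`upper` output columns, and the helper `annual_average_sketch` are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "time": pd.to_datetime(["2000-01-15", "2000-07-15", "2001-01-15", "2001-07-15"]),
    "tas": [1.0, 3.0, 2.0, 6.0],
})


def annual_average_sketch(frame: pd.DataFrame, data_var: str, area: bool = True) -> pd.DataFrame:
    """Hypothetical annual mean with optional mean -/+ std bounds."""
    grouped = frame.groupby(frame["time"].dt.year)[data_var]
    result = grouped.mean().to_frame(data_var)
    if area:
        std = grouped.std()  # pandas default: sample std (ddof=1)
        result["lower"] = result[data_var] - std
        result["upper"] = result[data_var] + std
    return result.rename_axis("year").reset_index()


annual = annual_average_sketch(df, "tas")
without_bounds = annual_average_sketch(df, "tas", area=False)
```

The `lower` and `upper` columns are the kind of values a plot would use to shade a band around the mean line, which is presumably why the flag is called `area`.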
group_by_day(df, data_var=None)
Group data by day of year and compute mean.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | DataFrame with time column and data variable. | required |
| data_var | str | Name of the data variable (default: first data var). | None |
Returns:
| Type | Description |
|---|---|
| DataFrame | DataFrame grouped by day of year (1-365/366). |
group_by_month(df, data_var=None)
Group data by month and compute mean.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | DataFrame with time column and data variable. | required |
| data_var | str | Name of the data variable (default: first data var). | None |
Returns:
| Type | Description |
|---|---|
| DataFrame | DataFrame grouped by month (1-12). |
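
Grouping by day of year or by month, as the two functions above do, amounts to collapsing all years onto one calendar cycle. A minimal sketch with plain pandas follows; it assumes a datetime column named `time`, and `group_by_sketch` is a hypothetical helper rather than part of `isimip_utils`.

```python
import pandas as pd

df = pd.DataFrame({
    "time": pd.to_datetime(["2000-01-01", "2000-02-01", "2001-01-01", "2001-02-01"]),
    "pr": [1.0, 10.0, 3.0, 20.0],
})


def group_by_sketch(frame: pd.DataFrame, data_var: str, unit: str) -> pd.DataFrame:
    """Mean of data_var per calendar unit; unit is 'dayofyear' or 'month'."""
    key = getattr(frame["time"].dt, unit)  # e.g. frame["time"].dt.month
    return frame.groupby(key)[data_var].mean().rename_axis(unit).reset_index()


by_day = group_by_sketch(df, "pr", "dayofyear")
by_month = group_by_sketch(df, "pr", "month")
```

With a day-of-year key like this one, day 366 is only populated by leap years, and dates after 29 February receive a day number one higher in leap years than in other years.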
normalize(df, data_var=None)
Normalize data variable using z-score normalization.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | DataFrame with data variable to normalize. | required |
| data_var | str | Name of the data variable (default: first data var). | None |
Returns:
| Type | Description |
|---|---|
| DataFrame | DataFrame with normalized data variable (mean=0, std=1). |
create_label(df, labels)
Add a label column to DataFrame by joining label strings.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | DataFrame to add label to. | required |
| labels | list[str] | List of label strings to join with spaces. | required |
Returns:
| Type | Description |
|---|---|
| DataFrame | DataFrame with added 'label' column. |
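
A label column is mainly useful when several DataFrames are combined and each row needs to remember its origin. The following sketch shows that pattern under assumptions: `create_label_sketch` is a hypothetical stand-in, and the label strings are made-up examples.

```python
import pandas as pd


def create_label_sketch(frame: pd.DataFrame, labels: list[str]) -> pd.DataFrame:
    """Hypothetical stand-in for create_label: same label on every row."""
    out = frame.copy()
    out["label"] = " ".join(labels)
    return out


# Made-up example inputs: two series that should be told apart after concat.
first = create_label_sketch(pd.DataFrame({"tas": [280.0, 281.0]}), ["model-a", "historical"])
second = create_label_sketch(pd.DataFrame({"tas": [282.0]}), ["model-b", "historical"])

combined = pd.concat([first, second], ignore_index=True)
```

The combined frame can then be grouped or colored by `label` in a plot.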