icclim.index()#
icclim exposes a main entry point, icclim.index(). It can be used
to compute ECA&D indices, DCSC indices or generic indices. There are
numerous arguments, but only a few are needed to compute simple
indices. Our How to recipes are also a good starting point to get an idea
of how to use icclim.index.
Note
Since version 5.2.0, the icclim API includes each individual index as a standalone function. Check ECA&D indices to see how to call them.
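As a rough illustration of how few arguments a simple index needs, the sketch below assembles the keyword arguments for a summer days (SU) computation. The file names tasmax.nc and su_jja.nc and the helper names build_su_request and run are hypothetical. The actual call to icclim.index is kept in a separate function so that it only runs when icclim and the input file are available.

```python
from datetime import datetime


def build_su_request():
    """Keyword arguments for a simple SU computation (hypothetical file names)."""
    return {
        "index_name": "SU",
        "in_files": "tasmax.nc",  # hypothetical input file
        "slice_mode": "JJA",  # one value per summer season
        "time_range": [datetime(1990, 1, 1), datetime(2000, 12, 31)],
        "out_file": "su_jja.nc",  # hypothetical output path, overwritten if it exists
    }


def run(request):
    """Pass the request to icclim.index; requires icclim and the input file."""
    import icclim  # imported lazily so the sketch can be read without icclim installed

    return icclim.index(**request)
```

Calling run(build_su_request()) would then return the computed dataset and write it to the out_file path.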
Compute climate indices#
- icclim.index(**kwargs)[source]
Compute climate index.
This is the main entry point for icclim.
- Parameters:
in_files (str | list[str] | Dataset | DataArray | InputDictionary) – Absolute path(s) to NetCDF dataset(s), including OPeNDAP URLs, or path to a zarr store, or an xarray.Dataset or xarray.DataArray.
index_name (str | StandardIndex) – Climate index name. For an ECA&D index, case-insensitive name used to look up the index. For a user index, it’s the name of the output variable.
var_name (str | list[str] | None) – optional. Target variable name to process, corresponding to in_files. If None (default) on an ECA&D index, the variable is guessed based on the climate index wanted. Mandatory for a user index.
slice_mode (FrequencyLike | Frequency) – Type of temporal aggregation. The possible values are {"year", "month", "DJF", "MAM", "JJA", "SON", "ONDJFM" or "AMJJAS", ("season", [1,2,3]), ("month", [1,2,3,])} (where season and month lists can be customized) or any valid pandas frequency. A season can also be defined between two exact dates: ("season", ("19 july", "14 august")). Default is “year”. See slice_mode for details.
time_range (list[datetime.datetime] | list[str] | tuple[str, str] | None) – optional. Temporal range: upper and lower bounds for temporal subsetting. If None, the whole period of the input files will be processed. The dates can be given either as instances of datetime.datetime or as string values. For strings, many formats are accepted. Default is None.
out_file (str | None) – Output NetCDF file name (default: “icclim_out.nc” in the current directory). If the input in_files is a Dataset, the out_file field is ignored. Use the function's return value instead to retrieve the computed value. If out_file already exists, icclim will overwrite it!
threshold (float | list[float] | None) – optional. User-defined threshold for certain indices. The default depends on the index, see their individual definitions. When a list of thresholds is provided, the index will be computed for each threshold.
transfer_limit_Mbytes (float) – Deprecated, does not have any effect.
callback (Callable[[int], None]) – optional. Progress bar printing. If None, the progress bar will not be printed.
callback_percentage_start_value (int) – optional. Initial value of percentage of the progress bar (default: 0).
callback_percentage_total (int) – optional. Total percentage value (default: 100).
base_period_time_range (list[datetime.datetime] | list[str] | tuple[str, str] | None) – optional. Temporal range of the reference period. The dates can be given either as instances of datetime.datetime or as string values. It is used either (1) to compute percentiles if threshold is filled. When missing, the studied period is used to compute percentiles. The study period is either the dataset filtered by time_range or the whole dataset if time_range is missing. For day of year percentiles (doy_per), on extreme percentiles the overlapping period between base_period_time_range and the study period is bootstrapped. Or (2) to compute a reference period for indices such as difference_of_mean (a.k.a. anomaly) if a single variable is given in input.
doy_window_width (int) – optional. Window width used to aggregate day of year values when computing day of year percentiles (doy_per). Default: 5 (5 days).
min_spell_length (int) – optional. Minimum spell duration to be taken into account when computing the sum_of_spell_lengths.
rolling_window_width (int) – optional. Window width of the rolling window for indicators such as {max_of_rolling_sum, max_of_rolling_average, min_of_rolling_sum, min_of_rolling_average}.
only_leap_years (bool) – optional. Option for February 29th (default: False).
ignore_Feb29th (bool) – optional. Ignoring or not February 29th (default: False).
interpolation (str | QuantileInterpolation | None) – optional. Interpolation method to compute percentile values: {"linear", "median_unbiased"}. Default is “median_unbiased”, a.k.a. type 8 or method 8. Ignored for non percentile based indices.
out_unit (str | None) – optional. Output unit for certain indices: “days” or “%” (default: “days”).
netcdf_version (str | NetcdfVersion) – optional. NetCDF version to create (default: “NETCDF3_CLASSIC”).
user_index (UserIndexDict) – optional. A dictionary with parameters for a user-defined index. See Create your own index with user_index. Ignored for ECA&D indices.
save_thresholds (bool) – optional. True if the thresholds should be saved within the resulting netcdf file (default: False).
date_event (bool) – When True, the date of the event (such as when a maximum is reached) will be stored in coordinate variables. Warning: this option may significantly slow down computation.
logs_verbosity (str | Verbosity) – optional. Configure how verbose icclim is. Possible values: {"LOW", "HIGH", "SILENT"} (default: “LOW”).
sampling_method (str) – Choose whether the output sampling configured in slice_mode is a groupby operation or a resample operation (as per xarray definitions). Possible values: {"groupby", "resample", "groupby_ref_and_resample_study"} (default: “resample”). groupby_ref_and_resample_study may only be used when computing the difference_of_means (a.k.a. the anomaly).
indice_name (str | None) – DEPRECATED, use index_name instead.
user_indice (dict | None) – DEPRECATED, use user_index instead.
window_width (int) – DEPRECATED, use doy_window_width, min_spell_length or rolling_window_width instead.
save_percentile (bool) – DEPRECATED, use save_thresholds instead.
Note
For the variable names see the correspondence table “index - source variable”
Below is some additional information about input parameters.
in_files and var_name#
The in_files parameter can be:
- A string path to a netCDF file or a zarr store
- A list of strings to represent multiple netCDF files to combine
- A xarray.Dataset
- A xarray.DataArray
- A python dictionary
var_name is an optional string, or a list of strings, to clarify which variables
must be used from the input in_files.
var_name can be omitted if the variable names can be guessed for standard indices.
For example, the Cold and Wet (CW()) index needs a daily mean temperature variable named ‘tas’ (or any of its aliases)
and a precipitation variable ‘pr’ (or any of its aliases).
See StandardVariable for a list of all the aliases of each standardized variable.
So, if ‘mean_temperatures.nc’ contains a tas variable and ‘precipitations.nc’ a pr variable,
the following is sufficient to compute CW:
icclim.cw(in_files=["mean_temperatures.nc", "precipitations.nc"]).compute().CW
If the variable names cannot be guessed, you can explicitly name the variables you wish to read from the input files:
icclim.cw(in_files={"custom_tas": "mean_temperatures.nc", "pr": "precipitations.nc"})
# equivalent to
icclim.cw(in_files=["mean_temperatures.nc", "precipitations.nc"], var_name=["custom_tas", "pr"])
The order in which variables are passed matters and must follow the input_variables
property defined in the respective index of EcadIndexRegistry.
Starting with icclim 5.3, in_files can describe variable names,
formerly set in var_name, in dictionary format. The dictionary keys
are variable names and the values are the usual in_files types (netCDF,
zarr, Dataset, DataArray).
in_files = {"tasmax": "tasmax.nc", "pr": "precip.zarr"}
Moreover, this new dictionary syntax can be used to specify a different set of files for percentiles.
in_files = {"tasmax": "tasmax.nc", "thresholds": "tasmax-90p.zarr"}
The thresholds input should contain percentile thresholds that will
be used in place of computing them. This allows reusing pre-computed percentiles stored in a file,
but these percentiles will not be bootstrapped.
Note
Percentiles can be saved with the save_thresholds parameter of icclim.index (save_percentile is deprecated).
slice_mode#
The slice_mode parameter defines the desired temporal aggregation.
Each index can be calculated at annual, winter half-year, summer
half-year, winter, spring, summer, autumn and monthly frequency:
Value (string) | Description |
---|---|
year | annual |
month | monthly (all months) |
ONDJFM | winter half-year |
AMJJAS | summer half-year |
DJF | winter |
MAM | spring |
JJA | summer |
SON | autumn |
["month", [4, 5, 11]] | monthly sampling filtered |
["season", [4, 5, 6, 7]] | seasonal (1 value per season) |
["clipped_season", [4, 5, 6, 7]] | seasonal (1 value per season), spells starting before the season start are not taken into account |
3W | A valid pandas frequency (3 weeks here) |
The winter season (DJF) of 2000 is composed of December 2000,
January 2001 and February 2001.
The winter half-year (ONDJFM) of 2000 includes October
2000, November 2000, December 2000, January 2001, February 2001 and
March 2001.
Monthly time series filter#
Monthly time series with months selected by user (the keyword can be either month or months):
# index will be computed only for April, May and November
slice_mode = ["month", [4, 5, 11]]
# index will be computed only for April
slice_mode = ["month", [4]]
User defined seasons#
slice_mode = ["season", [4, 5, 6, 7]]  # April to July un-clipped
slice_mode = ["clipped_season", [4, 5, 6, 7]]  # April to July clipped
slice_mode = ["season", [11, 12, 1]]  # November to January un-clipped
slice_mode = ["clipped_season", [11, 12, 1]]  # November to January clipped
Additionally, you can define a season between two exact dates:
slice_mode = ["season", ["07-19", "08-14"]]
slice_mode = ["clipped_season", ["07-19", "08-14"]]
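The way a season that crosses the year boundary maps onto calendar months, as in the DJF and ONDJFM notes above, can be sketched with a small helper. The function season_months is hypothetical and only illustrates the labelling convention; it is not part of icclim.

```python
def season_months(start_year, months):
    """Return (year, month) pairs for a season labelled by its starting year.

    The year is incremented whenever the month number decreases,
    i.e. when the season wraps from December to January.
    """
    pairs = []
    year = start_year
    previous = None
    for month in months:
        if previous is not None and month < previous:
            year += 1  # the season crossed the year boundary
        pairs.append((year, month))
        previous = month
    return pairs


# Winter (DJF) of 2000: December 2000, January 2001, February 2001
print(season_months(2000, [12, 1, 2]))
# User-defined season from November to January
print(season_months(2000, [11, 12, 1]))
```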
Note
Since 5.3.0, icclim accepts pandas frequency strings for slice_mode to resample the output data to a given frequency. Multiple combinations are possible, such as “2YS-FEB” to resample data on two (2) years (Y) starting (S) in February (FEB). For further information, refer to pandas offset aliases.
threshold#
It is possible to set a user-defined threshold for the following indices:
SU (default threshold: 25°C)
CSU (default threshold: 25°C)
TR (default threshold: 20°C)
CSDI (default 10th percentile)
WSDI (default 90th percentile)
TX90p (default 90th percentile)
TG90p (default 90th percentile)
TN90p (default 90th percentile)
TX10p (default 10th percentile)
TG10p (default 10th percentile)
TN10p (default 10th percentile)
The threshold could be one value:
threshold = 30
or a list of values:
threshold = [20, 25, 30]
Note
Thresholds should be floats. The unit is expected to be degrees Celsius, or unit-less for percentiles.
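To make the effect of a threshold list concrete, here is a minimal sketch that counts the days strictly above each threshold, in the spirit of SU. The temperature values and the helper count_days_above are invented for illustration and are not icclim code. icclim operates on gridded datasets rather than on plain lists.

```python
def count_days_above(daily_max_celsius, thresholds):
    """Count, for each threshold, the days whose value is strictly above it."""
    return {
        threshold: sum(1 for value in daily_max_celsius if value > threshold)
        for threshold in thresholds
    }


# Hypothetical daily maximum temperatures in degrees Celsius
daily_max = [18.0, 24.5, 25.0, 26.1, 31.2, 29.9]
counts = count_days_above(daily_max, [20, 25, 30])
print(counts)  # one count per threshold, as with threshold = [20, 25, 30]
```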
transfer_limit_Mbytes#
!Deprecated
transfer_limit_Mbytes is now ignored and will be deleted in a future
version. See how to chunk data and parallelize computation
to configure dask chunking.
callback#
!Deprecated
The callback can be used to output an estimated progress of the computation.
However, when using dask, the computations are done lazily at the very end
of icclim’s process. Thus the values transmitted to callback are
irrelevant with dask.
ignore_Feb29th#
If it is True, February 29th is excluded.
Computing percentile thresholds#
Percentile thresholds are used as thresholds for the calculation of
percentile-based indices. They are computed from values inside a reference
period, named the base period, which is usually 30 years long
(base_period_time_range parameter).
There are two methods for calculation of percentile thresholds:
1. For temperature indices, these thresholds are computed for each calendar day from samples (5-day window centred on the calendar day in the base period), which depend on the window_width, only_leap_years and ignore_Feb29th parameters.
In icclim these thresholds represent a dictionary with 365 (if ignore_Feb29th is True) or 366 (if ignore_Feb29th is False) calendar days as keys, and 2D arrays as values.
Note
A calendar day key of the dictionary is composed of the corresponding month and day, separated by a comma. For example, the 2D array with percentiles for April 13th is retrieved with my_perc_dict[4,13].
The percentile thresholds are different for “in-base” years (years inside the base period) and “out-of-base” years. For “in-base” years, icclim uses the bootstrapping procedure, which is explained in this article: Avoiding Inhomogeneity in Percentile-Based Indices of Temperature Extremes (Zhang et al.) - see the resampling algorithm in the section 4. Removing the “jump”.
Warning
Computing percentile thresholds with the bootstrapping procedure may take some time! For example, a 30-yr base period requires computing percentiles (30-1) times for each “in-base” year, i.e. 30*(30-1) times in total (+1 time without bootstrapping for “out-of-base” years).
2. For precipitation indices, the thresholds are computed from the set of wet days (i.e. days when the daily precipitation amount is >= 1.0 mm) in the base period. In icclim these thresholds are represented as a 2D array.
Both methods can use either of two interpolation types.
The calc_percentiles.py module contains the get_percentile_dict and get_percentile_arr functions for the described methods.
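The cost quoted in the bootstrapping warning can be worked out with a few lines of arithmetic. The helper percentile_computations below is hypothetical and only reproduces the count described in the text; it does not perform any bootstrapping.

```python
def percentile_computations(base_period_years):
    """Number of percentile computations for a base period of the given length.

    Each "in-base" year needs (n - 1) computations, and one extra
    computation without bootstrapping serves the "out-of-base" years.
    """
    in_base = base_period_years * (base_period_years - 1)
    out_of_base = 1
    return in_base + out_of_base


print(percentile_computations(30))  # 30 * 29 + 1 = 871
```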
window_width#
The window width is used for getting samples for percentile
computation and is set to 5: percentile-based indices use a 5-day window.
The window is centred on a certain calendar day, for example:
- April 13th: we take the values for April 11th, April 12th, April 13th, April 14th and April 15th of each year of the base period.
- January 1st: we take all days of December 30th, December 31st, January 1st, January 2nd and January 3rd.
Hence, for a base period of 30 years and a 5-day window width, each calendar day (except February 29th) has 150 values (30 * 5) to compute its percentile value.
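The sampling described above can be sketched with the standard library. The helper window_dates and the 1961-1990 base period are assumptions made for illustration. The sketch does not clip the window at the edges of the base period and is not icclim's implementation.

```python
from datetime import date, timedelta


def window_dates(month, day, years, width=5):
    """Dates sampled for one calendar day, using a centred window in each year."""
    half = width // 2
    samples = []
    for year in years:
        centre = date(year, month, day)
        # Days from (centre - half) to (centre + half), inclusive
        samples.extend(centre + timedelta(days=k) for k in range(-half, half + 1))
    return samples


base_period = range(1961, 1991)  # hypothetical 30-year base period
april_13 = window_dates(4, 13, base_period)
print(len(april_13))  # 30 years * 5 days = 150 samples
print(window_dates(1, 1, [1981]))  # the window reaches back into December 1980
```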
only_leap_years#
The only_leap_years parameter selects which of two methods to use
for calculating a percentile value for the calendar day of February
29th:
- if True, we take only leap years. For example, for the base period of 1980-1990 and a 5-day window width, we take the values corresponding to the following dates:
1980-02-27, 1980-02-28, 1980-02-29, 1980-03-01, 1980-03-02,
1984-02-27, 1984-02-28, 1984-02-29, 1984-03-01, 1984-03-02,
1988-02-27, 1988-02-28, 1988-02-29, 1988-03-01, 1988-03-02
- if False, for the same base period and window width, we have:
1980-02-27, 1980-02-28, 1980-02-29, 1980-03-01, 1980-03-02,
1981-02-27, 1981-02-28, 1981-03-01, 1981-03-02,
1982-02-27, 1982-02-28, 1982-03-01, 1982-03-02,
1983-02-27, 1983-02-28, 1983-03-01, 1983-03-02,
1984-02-27, 1984-02-28, 1984-02-29, 1984-03-01, 1984-03-02,
1985-02-27, 1985-02-28, 1985-03-01, 1985-03-02,
1986-02-27, 1986-02-28, 1986-03-01, 1986-03-02,
1987-02-27, 1987-02-28, 1987-03-01, 1987-03-02,
1988-02-27, 1988-02-28, 1988-02-29, 1988-03-01, 1988-03-02,
1989-02-27, 1989-02-28, 1989-03-01, 1989-03-02,
1990-02-27, 1990-02-28, 1990-03-01, 1990-03-02
The second way is preferable, because we have more samples.
Warning
If ignore_Feb29th is True, the only_leap_years parameter is meaningless!
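The two sampling methods for February 29th can be compared by counting the dates each one selects. The helper feb29_samples is hypothetical and written only to reproduce the date lists shown in this section; it is not icclim's code.

```python
import calendar
from datetime import date, timedelta


def feb29_samples(years, only_leap_years, width=5):
    """Dates sampled around February 29th for each year of the base period."""
    half = width // 2
    samples = []
    for year in years:
        if calendar.isleap(year):
            centre = date(year, 2, 29)
            samples.extend(centre + timedelta(days=k) for k in range(-half, half + 1))
        elif not only_leap_years:
            # No February 29th this year: keep the neighbouring days only
            march_1 = date(year, 3, 1)
            samples.extend(march_1 - timedelta(days=k) for k in range(half, 0, -1))
            samples.extend(march_1 + timedelta(days=k) for k in range(half))
    return samples


base_period = range(1980, 1991)  # 1980-1990, as in the example above
print(len(feb29_samples(base_period, only_leap_years=True)))   # 3 leap years * 5 days
print(len(feb29_samples(base_period, only_leap_years=False)))  # 3 * 5 + 8 * 4 days
```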
interpolation#
A percentile value can be computed using either linear interpolation, also known as type
7 in other software, or the interpolation proposed by Hyndman and Fan
(1996), named hyndman_fan in icclim (listed as “median_unbiased” in the parameter description) and also known as type
8.
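The difference between the two interpolation types can be illustrated with the plotting-position formulation of Hyndman and Fan, where type 7 corresponds to alpha = beta = 1 and type 8 to alpha = beta = 1/3. The function quantile below is a pure-Python sketch written for this page. It is not icclim's implementation and works on a plain list rather than on gridded data.

```python
import math


def quantile(values, p, alpha, beta):
    """Quantile of `values` at probability p using plotting positions (alpha, beta)."""
    xs = sorted(values)
    n = len(xs)
    # 1-based virtual position of the quantile in the sorted sample
    h = alpha + p * (n + 1 - alpha - beta)
    h = min(max(h, 1.0), float(n))  # clip to the valid range [1, n]
    lower = math.floor(h)
    upper = min(lower + 1, n)
    fraction = h - lower
    # Linear interpolation between the two neighbouring order statistics
    return xs[lower - 1] + fraction * (xs[upper - 1] - xs[lower - 1])


sample = list(range(1, 11))  # hypothetical sample: 1, 2, ..., 10
type_7 = quantile(sample, 0.9, alpha=1.0, beta=1.0)      # "linear"
type_8 = quantile(sample, 0.9, alpha=1 / 3, beta=1 / 3)  # "median_unbiased"
print(type_7, type_8)
```

On this sample the type 8 estimate of the 90th percentile is higher than the type 7 estimate, so the choice of interpolation matters most for extreme percentiles on short samples.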
out_unit#
Percentile-based indices (TX10p, TX90p, TN10p, TN90p, TG10p, TG90p,
R75p, R95p and R99p) can be returned as a number of days (default) or as a
percentage of days (out_unit = “%”).
Custom indices#
Custom indices are now described in their own chapter: Create your own index with user_index
Correspondence table “index - source variable”#
Using common names for the source variable, icclim is able to lookup the proper variable in the given input to compute an index.
Index | Source variable |
---|---|
TG, GD4, HD17, TG10p, TG90p | daily mean temperature |
TN, TNx, TNn, TR, FD, CFD, TN10p, TN90p, CSDI | daily minimum temperature |
TX, TXx, TXn, SU, CSU, ID, TX10p, TX90p, WSDI | daily maximum temperature |
DTR, ETR, vDTR | daily maximum + daily minimum temperature |
PRCPTOT, RR1, SDII, CWD, CDD, R10mm, R20mm, RX1day, RX5day, R75p, R75pTOT, R95p, R95pTOT, R99p, R99pTOT | daily precipitation flux (liquid phase) |
SD, SD1, SD5cm, SD50cm | daily snowfall flux (solid phase) |
CD, CW, WD, WW | daily mean temperature + daily precipitation flux (liquid phase) |