{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Compatible with icclim 7.1.0+**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Tutorial on calculating a climate index for Summer Days\n",
"\n",
"## About\n",
"\n",
"This tutorial demonstrates how to calculate a climate index with the icclim package. The example is the Summer Days (SU) index, but the same principles apply to many of the other single- or multi-variable indices available in icclim.\n",
"\n",
"The data is provided by the Copernicus Climate Change Service (C3S) and consists of daily gridded meteorological data for Europe from 1950 to present, derived from in-situ observations (E-OBS) of maximum temperature, minimum temperature, and precipitation.\n",
"\n",
"The tutorial first shows how to download the necessary data from the C3S Climate Data Store (CDS). It then describes how to calculate the Summer Days index, and finally plots a map of the Summer Days climatology over Europe.\n",
"\n",
"The steps shown in this tutorial can be applied to any other climate dataset that provides the variable required by the chosen index. Summer Days counts the days on which the daily maximum temperature exceeds 25 °C, so the daily maximum temperature variable is needed."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## How to access the notebook\n",
"\n",
"This tutorial is in the form of a [Jupyter notebook](https://jupyter.org/). You will not need to install any software for the training as there are a number of free cloud-based services to create, edit, run and export Jupyter notebooks such as this. Here are some suggestions (simply click on one of the links below to run the notebook):"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"|NBViewer|\n",
"|:-:|\n",
"|[Open in NBViewer](https://nbviewer.org/github/cerfacs-globc/copernicus-training/blob/master/C3S_climate-indices_icclim.ipynb)|\n",
"|(this will not run the notebook, only render it)|"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you would like to run this notebook in your own environment, we suggest you install [Anaconda](https://docs.anaconda.com/anaconda/install/), which contains most of the libraries you will need.\n",
"This notebook requires Python 3.8 or later because of the requirements of some of its packages.\n",
"\n",
"You will need to install [icclim](https://github.com/cerfacs-globc/icclim) (`%pip install icclim`) to calculate the climate indices, and the CDS API (`%pip install cdsapi`) to download data programmatically from the CDS. You will also need matplotlib (`%pip install matplotlib`) and cartopy (`%conda install cartopy`) to plot the results. The `%` prefix ensures that the installation occurs in the environment of the running notebook. Installing cartopy this way requires a conda environment."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Search, download and view data\n",
"\n",
"Before we begin, we must prepare our environment. This includes installing the Application Programming Interface (API) of the CDS and importing the Python libraries that we will need.\n",
"\n",
"### Install CDS API\n",
"\n",
"To install the CDS API, run the following command if it is not already installed in your environment."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%pip install cdsapi"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Install icclim\n",
"\n",
"To install icclim, run the following command if it is not already installed in your environment."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%pip install icclim"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Install matplotlib and cartopy\n",
"\n",
"To install matplotlib and cartopy for plotting, run the following commands if they are not already installed in your environment.\n",
"A conda environment is expected, as cartopy cannot be correctly installed with pip."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%pip install matplotlib"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%conda install -y cartopy"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Import libraries\n",
"\n",
"We will be working with data in NetCDF format and calculating climate indices. We will use the icclim package and its dependencies for working with multidimensional arrays, in particular Xarray. For plotting and viewing data, we will use Matplotlib and Cartopy."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"# CDS API\n",
"import cdsapi\n",
"\n",
"# icclim package for calculating climate indices\n",
"import icclim\n",
"\n",
"# To unzip dataset files from the CDS\n",
"from zipfile import ZipFile\n",
"\n",
"# Libraries for plotting and visualising data\n",
"import cartopy.crs as ccrs\n",
"import matplotlib.pyplot as plt\n",
"\n",
"# Disable warnings for data download via API\n",
"import urllib3\n",
"\n",
"urllib3.disable_warnings()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Enter your CDS API key\n",
"\n",
"We will request data from the Climate Data Store (CDS) programmatically with the help of the CDS API. Here we set the CDS API credentials manually by defining two variables, `URL` and `KEY`. The `KEY` string combines your personal User ID and your CDS API key. To obtain it, first register or log in to the CDS (http://cds.climate.copernicus.eu), then visit https://cds.climate.copernicus.eu/api-how-to and copy the string of characters listed after \"key:\". Replace the `#########` below with this string."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"URL = \"https://cds.climate.copernicus.eu/api/v2\"\n",
"KEY = \"#########\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here we specify a data directory in which we will download our data and all output files that we will generate:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"DATADIR = \"./\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Search for climate data to calculate summer days\n",
"\n",
"The Summer Days index takes one input parameter: the 2 m near-surface daily maximum air temperature. This parameter is available in the E-OBS daily gridded meteorological data for Europe from 1950 to present. To keep the download fast, we will select only the period from 2011 to 2021. We will search for this data on the CDS website: http://cds.climate.copernicus.eu. The specific dataset we will use is `E-OBS daily gridded meteorological data for Europe from 1950 to present derived from in-situ observations`."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Having selected the dataset, we now need to specify the product type, variables, and temporal and geographic coverage we are interested in. These can all be selected in the **\"Download data\"** tab. In this tab a form appears in which we will select the following parameters to download:\n",
"\n",
"- Product type: `Ensemble mean`\n",
"- Variable: `Maximum temperature`\n",
"- Grid resolution: `0.1deg`\n",
"- Period: `2011_2021`\n",
"- Version: `25.0e` (Latest version)\n",
"- Format: `Zip file (.zip)`\n",
"\n",
"At the end of the download form, select **\"Show API request\"**. This will reveal a block of code, which you can simply copy and paste into a cell of your Jupyter Notebook (see cells below). You will do this once, for maximum temperature.\n",
"\n",
"### Download data\n",
"\n",
"Having copied the API request into the cell below, run it to retrieve and download the requested data into your local directory. Before you run the cell, however, the **terms and conditions** of this particular dataset need to have been accepted in the CDS. The option to view and accept these conditions is given at the end of the download form, just above the **\"Show API request\"** option."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"c = cdsapi.Client(url=URL, key=KEY)\n",
"\n",
"# For the full period, use 'period': 'full_period', but the download takes a long time\n",
"# To download the last decade, use 'period': '2011_2021'\n",
"\n",
"c.retrieve(\n",
"    \"insitu-gridded-observations-europe\",\n",
"    {\n",
"        \"format\": \"zip\",\n",
"        \"product_type\": \"ensemble_mean\",\n",
"        \"variable\": \"maximum_temperature\",\n",
"        \"grid_resolution\": \"0.1deg\",\n",
"        \"period\": \"2011_2021\",\n",
"        \"version\": \"25.0e\",\n",
"    },\n",
"    \"eobs_tasmax.zip\",\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Inspect Data\n",
"\n",
"We have requested the data in a zip archive. This zip archive contains a file in the NetCDF format. This is a commonly used format for array-oriented scientific data. To read and process this data we will make use of the underlying Xarray library that is used by the software to calculate the climate index. Xarray is an open source project and Python package that makes working with labelled multi-dimensional arrays simple and efficient. We will uncompress the archive and retrieve the filename(s). The archive could contain several files, but since we requested only one time period, we have a list of one file."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"['tx_ens_mean_0.1deg_reg_v25.0e.nc']\n"
]
}
],
"source": [
"# Create a ZipFile object and load eobs_tasmax.zip in it\n",
"with ZipFile(\"eobs_tasmax.zip\", \"r\") as zip_obj:\n",
"    # Get a list of all archived file names from the zip\n",
"    list_of_file_names = zip_obj.namelist()\n",
"    # Extract all the contents of the zip file into the current directory\n",
"    zip_obj.extractall()\n",
"\n",
"# List the NetCDF filenames of the dataset\n",
"print(list_of_file_names)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Calculate Summer Days index using icclim"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's calculate the Summer Days (SU) index for the months of June, July and August (JJA) using the [icclim](https://icclim.readthedocs.io/en/stable/) software. We retrieve the calculated values in an Xarray dataset. Alternatively, we could write the values directly to an output NetCDF file by passing the `out_file` keyword when calling the icclim function."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"2022-08-10 14:14:22,569 --- icclim 5.4.0\n",
"2022-08-10 14:14:22,570 --- BEGIN EXECUTION\n",
"2022-08-10 14:14:22,570 Processing: 0%\n",
"2022-08-10 14:14:22,606 Calculating climate index: SU\n",
"/Users/page/miniconda3/envs/icclimv5/lib/python3.8/site-packages/xclim/core/cfchecks.py:41: UserWarning: Variable has a non-conforming cell_methods: Got `time: mean`, which do not include the expected `time: maximum`\n",
" _check_cell_methods(\n",
"/Users/page/miniconda3/envs/icclimv5/lib/python3.8/site-packages/xarray/core/indexing.py:1228: PerformanceWarning: Slicing is producing a large chunk. To accept the large\n",
"chunk and silence this warning, set the option\n",
" >>> with dask.config.set(**{'array.slicing.split_large_chunks': False}):\n",
" ... array[indexer]\n",
"\n",
"To avoid creating the large chunks, set the option\n",
" >>> with dask.config.set(**{'array.slicing.split_large_chunks': True}):\n",
" ... array[indexer]\n",
" return self.array[key]\n",
"2022-08-10 14:14:24,005 Processing: 100%\n",
"2022-08-10 14:14:24,005 --- icclim 5.4.0\n",
"2022-08-10 14:14:24,006 --- CPU SECS = 17.507 \n",
"2022-08-10 14:14:24,006 --- END EXECUTION\n"
]
}
],
"source": [
"# Add out_file='out_icclim.nc' to also output data into a NetCDF file\n",
"summer_days = icclim.index(\n",
" index_name=\"SU\", in_files=list_of_file_names, slice_mode=\"JJA\"\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before we plot the results, we can query our newly created Xarray dataset ..."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
<xarray.Dataset>\n",
"Dimensions: (time: 73, longitude: 705, latitude: 465, bounds: 2)\n",
"Coordinates:\n",
" * time (time) datetime64[ns] 1949-07-16T12:00:00 ... 2021-07-16T12:...\n",
" * longitude (longitude) float64 -24.95 -24.85 -24.75 ... 45.25 45.35 45.45\n",
" * latitude (latitude) float64 25.05 25.15 25.25 ... 71.25 71.35 71.45\n",
" * bounds (bounds) int64 0 1\n",
"Data variables:\n",
" SU (time, latitude, longitude) float64 dask.array<chunksize=(1, 86, 131), meta=np.ndarray>\n",
" time_bounds (time, bounds) datetime64[ns] 1949-06-01 ... 2021-08-31\n",
"Attributes:\n",
" title: heat index SU\n",
" references: ATBD of the ECA&D indices calculation (https://knmi-ecad-as...\n",
" institution: Climate impact portal (https://climate4impact.eu)\n",
" history: Mon Mar 28 12:22:18 2022: ncks --no-abc -d time,0,26297 /da...\n",
" source: \n",
" Conventions: CF-1.6