Spatial Extremes

A step-by-step curriculum on modelling climate extremes in space: how often will a temperature this high recur, and how does that risk vary across a region? We build up from a single station to a full spatial model, one concept per notebook, on real station data from the Copernicus Climate Data Store (CDS).

Four packages do the heavy lifting, one per layer:

| Layer | Package | Role |
| --- | --- | --- |
| Data | xrreader | pull + cache CDS in-situ land stations over Iberia |
| Extremes | xtremax | block-maxima extraction, GEV distribution, return levels |
| Gaussian processes | pyrox | kernels, latent GP fields, variational inference |
| Dynamics | diffrax | ODE/SDE integration for the time-varying trends (NB10–12) |

The build-up

Each notebook is short and adds exactly one idea.

00 — Data. Pull daily near-surface air temperature for Iberian land stations from CDS with xrreader, cache it, and look at it.

01–03 — Extreme-value foundations (one station). 01 turns a daily series into annual maxima (xtremax.extraction); 02 fits a Generalized Extreme Value (GEV) distribution to one station and interprets the location, scale, and shape parameters $(\mu, \sigma, \xi)$; 03 covers the extremal-types theorem and turns the fit into return levels with posterior uncertainty.
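As a rough illustration of the two mechanical steps in this block (block maxima, then return levels), here is a minimal NumPy sketch. It does not use xtremax; the function names, the synthetic daily series, and the GEV parameter values are all assumptions made up for the example. The return level follows the standard closed form $z_T = \mu - (\sigma/\xi)\,[1 - y^{-\xi}]$ with $y = -\log(1 - 1/T)$.

```python
import numpy as np

def annual_maxima(daily, days_per_year=365):
    """Block maxima: the largest value in each consecutive block of days."""
    daily = np.asarray(daily, dtype=float)
    n_years = daily.size // days_per_year
    blocks = daily[: n_years * days_per_year].reshape(n_years, days_per_year)
    return blocks.max(axis=1)

def return_level(T, mu, sigma, xi):
    """Level exceeded by the annual maximum with probability 1/T in any year."""
    y = -np.log1p(-1.0 / np.asarray(T, dtype=float))   # y = -log(1 - 1/T)
    if abs(xi) < 1e-8:                                  # Gumbel limit as xi -> 0
        return mu - sigma * np.log(y)
    return mu - (sigma / xi) * (1.0 - y ** (-xi))

# Synthetic 40-year daily series (illustrative only, not CDS data)
rng = np.random.default_rng(0)
days = np.arange(40 * 365)
daily = 18.0 + 10.0 * np.sin(2 * np.pi * days / 365) + rng.normal(0.0, 3.0, days.size)
maxima = annual_maxima(daily)

# Made-up GEV parameters; in the notebooks these come from a fitted posterior
T = np.array([10.0, 50.0, 100.0])
levels = return_level(T, mu=38.0, sigma=1.5, xi=-0.2)
```

With a negative shape the return levels flatten out toward the finite upper bound $\mu - \sigma/\xi$, which is one way to read what $\xi$ means for a temperature record.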

04–06 — Pooling and Gaussian processes. 04 fits every station independently (04 with NUTS, 04b with a fast Laplace approximation) and maps the parameters — the noisy result motivates pooling. 05 pools them with a hierarchical Bayesian model. 06 is a Gaussian-process primer with pyrox: interpolate a field over (lon, lat), then add physical features (elevation, distance-to-coast, slope) and use ARD to see which actually matter.
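To make the ARD idea concrete, the sketch below implements a squared-exponential kernel with one lengthscale per input column and a plain GP regression mean in NumPy. This is not the pyrox API; the function names, the toy features, and the hand-set lengthscales are assumptions for illustration. In practice the lengthscales would be learned, and a feature whose lengthscale grows very large is one the model has effectively switched off.

```python
import numpy as np

def ard_rbf(X1, X2, lengthscales, variance=1.0):
    """Squared-exponential kernel with one lengthscale per input column (ARD)."""
    A = np.asarray(X1, dtype=float) / lengthscales
    B = np.asarray(X2, dtype=float) / lengthscales
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return variance * np.exp(-0.5 * np.maximum(sq, 0.0))

def gp_posterior_mean(X, y, X_new, lengthscales, variance=1.0, noise=0.1):
    """Posterior mean of a zero-mean GP regression with Gaussian noise."""
    K = ard_rbf(X, X, lengthscales, variance) + noise**2 * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return ard_rbf(X_new, X, lengthscales, variance) @ alpha

# Toy data: columns 0 and 1 drive the response, column 2 is irrelevant
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(60, 3))
y = np.sin(3.0 * X[:, 0]) + 0.5 * X[:, 1] + rng.normal(0.0, 0.05, 60)

# A very long lengthscale on column 2 makes the kernel ignore that feature
ls = np.array([0.5, 1.0, 1e3])
X_a = np.array([[0.2, -0.1, -1.0]])
X_b = np.array([[0.2, -0.1, 1.0]])    # differs from X_a only in the irrelevant column
pred_a = gp_posterior_mean(X, y, X_a, ls)
pred_b = gp_posterior_mean(X, y, X_b, ls)  # nearly identical to pred_a
```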

07–09 — Spatial GEV models. Tie the strands together: the GEV parameters become latent GP fields, inferred with NumPyro. 07 makes the location $\mu(s)$ spatial; 08 adds a spatial scale $\sigma(s)$ driven by an elevation covariate; 09 frees the shape $\xi(s)$ too, and asks honestly whether the tail carries any recoverable geography.
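The structure of the simplest spatial model (a latent GP for the location, shared scale and shape) can be sketched as a forward simulation. This is only the generative side, written in NumPy under assumed values; it is not the NumPyro model from the notebooks and performs no inference. The site coordinates, kernel hyperparameters, and GEV parameters are invented for the example.

```python
import numpy as np

def rbf(S, lengthscale, variance):
    """Isotropic squared-exponential kernel over site coordinates."""
    d2 = np.sum((S[:, None, :] - S[None, :, :]) ** 2, axis=-1)
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def sample_gev(rng, mu, sigma, xi, size):
    """Inverse-CDF draw for xi != 0: z = mu + sigma * ((-log U)^(-xi) - 1) / xi."""
    u = rng.uniform(size=size)
    return mu + sigma * ((-np.log(u)) ** (-xi) - 1.0) / xi

rng = np.random.default_rng(2)
n_sites, n_years = 25, 40

# Random sites in a rough lon/lat box around Iberia (assumed, not real stations)
S = rng.uniform([-9.0, 36.0], [3.0, 44.0], size=(n_sites, 2))

# Latent location field mu(s): GP draw around a 36 degree baseline
K = rbf(S, lengthscale=2.0, variance=4.0) + 1e-6 * np.eye(n_sites)
mu_s = 36.0 + np.linalg.cholesky(K) @ rng.standard_normal(n_sites)

# Shared scale and shape; notebooks 08 and 09 relax these in turn
sigma, xi = 1.5, -0.1
maxima = sample_gev(rng, mu_s[None, :], sigma, xi, size=(n_years, n_sites))
```

Inference reverses this picture: given `maxima` and `S`, recover `mu_s` and the hyperparameters. Nearby sites share information through `K`, which is what smooths the noisy independent fits.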

10–12 — Non-stationary in time (one long station). Switch axes: take the single longest record (Albacete, 1901–2025) and let the GEV location drift as the climate warms, three escalating ways. 10 fits a parametric linear trend $\mu(t) = \mu_0 + \mu_1 z(t)$ (Coles’ model) and turns it into time-varying return levels. 11 replaces the line with a mechanistic ODE: a forced energy-balance relaxation integrated with diffrax inside NUTS. 12 goes nonparametric with a state-space Gaussian process (a local-linear-trend / integrated random walk, the stochastic sibling of the ODE), shows why a free stationary GP over-fits a short record, and puts all three trends on one set of axes.
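For the linear-trend case, a time-varying return level is just the stationary formula with $\mu$ replaced by $\mu(t)$. The sketch below assumes $z(t)$ is a standardised year index, which the text above does not specify, and uses made-up parameter values rather than fitted ones.

```python
import numpy as np

def return_level_trend(T, z_t, mu0, mu1, sigma, xi):
    """T-year return level when only the location drifts: mu(t) = mu0 + mu1 * z(t)."""
    y = -np.log1p(-1.0 / T)                         # y = -log(1 - 1/T)
    mu_t = mu0 + mu1 * np.asarray(z_t, dtype=float)
    return mu_t - (sigma / xi) * (1.0 - y ** (-xi))  # valid for xi != 0

years = np.arange(1901, 2026)
z_t = (years - years.mean()) / years.std()   # assumed form of the covariate z(t)

# Illustrative parameters only; the notebooks draw these from a posterior
rl100 = return_level_trend(100.0, z_t, mu0=38.0, mu1=0.6, sigma=1.5, xi=-0.2)
shift = rl100[-1] - rl100[0]                 # change in the 100-year level over the record
```

Because scale and shape are held fixed, the whole return-level curve moves in parallel with $\mu(t)$, so `shift` equals $\mu_1\,(z(2025) - z(1901))$ for every return period.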

Running it

Notebooks use a shared loader, spatial_extremes.data, that serves real CDS data when cached and a deterministic synthetic series otherwise — so the whole curriculum runs offline, no credentials required.
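The cache-or-synthetic pattern can be sketched in a few lines. This is a hypothetical stand-in, not the actual spatial_extremes.data loader: the function name, the `.npy` cache format, and the shape of the synthetic series are all assumptions. The point is only that a fixed seed makes the fallback reproducible.

```python
from pathlib import Path

import numpy as np

def load_daily_series(cache_file, n_days=3650, seed=0):
    """Return (values, source): cached data if the file exists, else a seeded synthetic series."""
    path = Path(cache_file)
    if path.exists():
        return np.load(path), "cache"
    rng = np.random.default_rng(seed)      # fixed seed -> identical series on every call
    t = np.arange(n_days)
    values = 18.0 + 10.0 * np.sin(2 * np.pi * t / 365.25) + rng.normal(0.0, 3.0, n_days)
    return values, "synthetic"

# No cache file present, so this falls back to the synthetic series
values, source = load_daily_series("no_such_cache_file.npy")
print(f"running on {source} data ({values.size} days)")
```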

Set up the project environment with uv (from projects/spatial_extremes/):

uv sync --extra notebooks      # build .venv with the full stack, notebook tooling + MyST
.venv/bin/python -m ipykernel install --user --name spatial-extremes
.venv/bin/myst build --html    # execute the notebooks and render the static site

To use real data, accept the CDS licence, add credentials (see .env.example), and fetch once into the cache the notebooks read:

.venv/bin/python scripts/fetch_cds_insitu.py   # download the real CDS record
.venv/bin/python scripts/build_features.py     # derive nb 06 covariates (needs the cache)

Without credentials, just open any notebook — it will report that it is running on the synthetic fallback and otherwise behave identically.