
System

We assume that there is an underlying process that can explain the temperature extremes:

$$
\begin{aligned}
\text{Measurement}: && && y_n &= \boldsymbol{y}(\mathbf{s}_n, t_n), && && \boldsymbol{y}: \mathbb{R}^{D_s}\times \mathbb{R}^+ \rightarrow \mathbb{R}
\end{aligned}
$$

where $t_n$ is the time stamp of acquisition and $\mathbf{s}_n$ is the vector of station coordinates. More concretely, the input parameters of this unknown function are

$$
\begin{aligned}
\text{Spatial Coordinates}: && && \mathbf{s}_n&\in\Omega\subseteq\mathbb{R}^{D_s} && [\text{Degrees}, \text{Degrees}, \text{Meters}] \\
\text{Temporal Coordinates}: && && t_n&\in\mathcal{T}\subseteq\mathbb{R}^+ && [\text{Days}]
\end{aligned}
$$

where $D_s = [\text{Latitude}, \text{Longitude}, \text{Altitude}]$. We have an alternative representation when we want to stress the dependencies in the spatial domain:

$$
\begin{aligned}
\text{Measurement}: && && \mathbf{y}_n &= \boldsymbol{y}(\boldsymbol{\Omega}, t_n), && && \boldsymbol{y}: \mathbb{R}^{D_\Omega}\times \mathbb{R}^+ \rightarrow \mathbb{R}^{D_\Omega}
\end{aligned}
$$

Covariate

We assume that there is a covariate which is correlated with the increase in extremes. In this case, we are interested in the Global Mean Surface Temperature (GMST).

$$
\begin{aligned}
\text{Covariate}: && && x_n &= x(t_n), && && x: \mathbb{R}^+ \rightarrow \mathbb{R}
\end{aligned}
$$

Data

We assume that we have a sequence of data points available.

$$
\begin{aligned}
\mathcal{D} &= \left\{ (t_n, \mathbf{s}_n), x_n, y_n \right\}_{n=1}^N && && \mathbf{x} \in \mathbb{R}^{N} && \mathbf{y} \in \mathbb{R}^{N} && \mathbf{S} \in \mathbb{R}^{N\times D_s} && \mathbf{T} \in \mathbb{R}^{N}
\end{aligned}
$$

where $N=N_s\times N_\Omega$ is the total number of spatial and temporal coordinates available.

For convenience, throughout this paper, we will often stack each of these into vectors or matrices. In addition, we might use a different notation to denote the dependencies between the spatial points:

$$
\begin{aligned}
\mathcal{D} &= \left\{ (t_n, \mathbf{S}_n), x_n, \mathbf{y}_n \right\}_{n=1}^N && && \mathbf{y}_n \in \mathbb{R}^{D_\Omega} && && \mathbf{Y} \in \mathbb{R}^{D_\Omega \times N} && \mathbf{S}_\Omega \in \mathbb{R}^{N\times D_\Omega \times D_s}
\end{aligned}
$$

Joint Distribution

We also assume that there is a joint distribution over the observations, $\mathbf{y}$, a latent process, $\mathbf{z}$, and a set of parameters, $\boldsymbol{\theta}$. We decompose this joint distribution into a likelihood and priors: the observations are explained by the process parameters, which are in turn governed by prior parameters.

$$
p(\mathbf{y},\mathbf{z},\boldsymbol{\theta}) = p(\mathbf{y}|\mathbf{z})\,p(\mathbf{z}|\boldsymbol{\theta})\,p(\boldsymbol{\theta})
$$

The likelihood term is an arbitrary distribution, and the prior terms place distributions over the parameters of the likelihood.

$$
\begin{aligned}
\text{Data Likelihood}: && && \mathbf{y} &\sim p(\mathbf{y}|\mathbf{z}) \\
\text{Process Parameters}: && && \mathbf{z} &\sim p(\mathbf{z}|\boldsymbol{\theta}) \\
\text{Prior Parameters}: && && \boldsymbol{\theta} &\sim p(\boldsymbol{\theta})
\end{aligned}
$$

where $\boldsymbol{\theta} = \left\{\mu,\sigma,\kappa\right\}$. The full posterior is given by

$$
p(\boldsymbol{\theta}|\mathbf{y}) = \frac{1}{Z}p(\mathbf{y}|\boldsymbol{\theta})\,p(\boldsymbol{\theta})
$$

where $Z$ is a normalizing constant. The problematic term is the normalizing constant because it is an integral with respect to all of the parameters:

$$
Z=\int p(\mathbf{y}|\boldsymbol{\theta})\,p(\boldsymbol{\theta})\,d\boldsymbol{\theta}
$$

This is intractable because there is no closed form given the non-linearities in the GEVD PDF as seen in (3) and (4).


Data Likelihood

We are interested in extreme values, so it is natural to use distributions designed specifically for extremes.


GEVD

The first option is the generalized extreme value distribution (GEVD). The cumulative distribution function (CDF) of the GEVD is given by

$$
\boldsymbol{F}(y;\boldsymbol{\theta}) = \exp \left[ -\boldsymbol{t}(y;\boldsymbol{\theta}) \right]
$$

where the function $\boldsymbol{t}(y;\boldsymbol{\theta})$ is defined as

$$
\boldsymbol{t}(y;\boldsymbol{\theta}) = \begin{cases} \left[ 1 + \kappa \left( \frac{y-\mu}{\sigma} \right)\right]_+^{-1/\kappa}, && \kappa\neq 0 \\ \exp\left(-\frac{y-\mu}{\sigma}\right), && \kappa=0 \end{cases}
$$
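As a sanity check, the $\kappa\neq 0$ branch of $\boldsymbol{t}$ and the resulting CDF can be sketched in a few lines of Python and compared against `scipy.stats.genextreme`. Note that scipy uses the opposite sign convention for the shape parameter ($c = -\kappa$); the parameter values below are arbitrary illustrations.

```python
import numpy as np
from scipy.stats import genextreme

def t_gevd(y, mu, sigma, kappa):
    """The truncated t(y; theta) function from the GEVD CDF (kappa != 0 branch)."""
    z = (y - mu) / sigma
    return np.maximum(1.0 + kappa * z, 0.0) ** (-1.0 / kappa)

def cdf_gevd(y, mu, sigma, kappa):
    """GEVD CDF: F(y; theta) = exp(-t(y; theta))."""
    return np.exp(-t_gevd(y, mu, sigma, kappa))

# compare against scipy, which uses the opposite sign convention (c = -kappa)
y = np.linspace(-2.0, 10.0, 101)
mu, sigma, kappa = 1.0, 2.0, 0.1
assert np.allclose(cdf_gevd(y, mu, sigma, kappa),
                   genextreme.cdf(y, c=-kappa, loc=mu, scale=sigma))
```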

Parameters

For this distribution, we have the following free parameters

$$
\boldsymbol{\theta}_\text{GEVD} = \left\{ \mu, \sigma, \kappa\right\}
$$

Log Probability

If we have a set of observations, we can maximize the log probability of the observations. We can define the probability density function as

$$
p_\text{GEVD}(y;\boldsymbol{\theta}) = \frac{1}{\sigma}t\left(y;\boldsymbol{\theta}\right)^{\kappa+1}e^{-t\left(y;\boldsymbol{\theta}\right)}
$$

where $t(y;\boldsymbol{\theta})$ is defined in (12). Subsequently, we can take the log-probability to get a loss function.

$$
\log p(\boldsymbol{y}_{1:N}|\boldsymbol{\theta}) = - N_b \log \sigma - (1+1/\kappa)\sum_{n=1}^{N_b} \log \left[ 1 + \kappa z_n\right]_+ - \sum_{n=1}^{N_b} \left[ 1 + \kappa z_n\right]_+^{-1/\kappa}
$$

where $z_n=(y_n - \mu)/\sigma$, $[1 + \kappa z_n]_+ = \max(1 + \kappa z_n,0)$, and $N_{b}$ is the number of blocks.
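The log-likelihood above translates directly into code. The following is a minimal sketch of the $\kappa\neq 0$ case, checked against the sum of scipy log-densities (again assuming scipy's $c=-\kappa$ convention and arbitrary illustrative parameter values):

```python
import numpy as np
from scipy.stats import genextreme

def gevd_log_likelihood(y, mu, sigma, kappa):
    """Log-likelihood of N_b block maxima under the GEVD (kappa != 0 branch)."""
    z = (y - mu) / sigma
    zp = np.maximum(1.0 + kappa * z, 0.0)
    return (-y.size * np.log(sigma)
            - (1.0 + 1.0 / kappa) * np.sum(np.log(zp))
            - np.sum(zp ** (-1.0 / kappa)))

# simulated block maxima; scipy's shape parameter is c = -kappa
mu, sigma, kappa = 25.0, 3.0, 0.1
y = genextreme.rvs(c=-kappa, loc=mu, scale=sigma, size=500, random_state=0)
assert np.isclose(gevd_log_likelihood(y, mu, sigma, kappa),
                  genextreme.logpdf(y, c=-kappa, loc=mu, scale=sigma).sum())
```

Maximizing this quantity (or minimizing its negative) over $(\mu, \sigma, \kappa)$ recovers the maximum-likelihood fit.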


GPD

Another option is the generalized Pareto distribution (GPD). This follows the peaks-over-threshold (POT) approach, which models the events conditioned on exceeding a given threshold, $y_0$.

$$
p(Y\leq y|Y\geq y_0) := F(y;\boldsymbol{\theta})
$$

We can define the CDF as:

$$
\boldsymbol{F}(y;\boldsymbol{\theta}) = \begin{cases} 1 - \left[ 1 + \kappa^* \left( \frac{y-y_0}{\sigma^*} \right)\right]^{-1/\kappa^*} , && \kappa^*\neq 0 \\ 1 - \exp\left(-\frac{y-y_0}{\sigma^*}\right), && \kappa^*=0 \end{cases}
$$
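A minimal sketch of the $\kappa^*\neq 0$ branch, checked against `scipy.stats.genpareto` (whose convention matches directly: shape $c=\kappa^*$, location $y_0$, scale $\sigma^*$; the values below are arbitrary):

```python
import numpy as np
from scipy.stats import genpareto

def cdf_gpd(y, y0, sigma_star, kappa_star):
    """GPD CDF for values above the threshold y0 (kappa* != 0 branch)."""
    z = (y - y0) / sigma_star
    return 1.0 - np.maximum(1.0 + kappa_star * z, 0.0) ** (-1.0 / kappa_star)

# scipy's genpareto uses the same sign convention: c = kappa*, loc = y0
y = np.linspace(30.0, 50.0, 101)
y0, sigma_star, kappa_star = 30.0, 2.0, 0.1
assert np.allclose(cdf_gpd(y, y0, sigma_star, kappa_star),
                   genpareto.cdf(y, c=kappa_star, loc=y0, scale=sigma_star))
```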

Parameters

The free parameters available for this distribution are

$$
\boldsymbol{\theta}_\text{GPD} = \left\{ y_0, \sigma^*, \kappa^* \right\}
$$

The threshold parameter, $y_0$, needs to be decided before fitting the other two parameters. If there are no strong concerns about the choice of threshold, we typically choose a reasonable quantile, e.g., the 95th percentile. The remaining free parameters can be directly related to the GEVD parameters in equation (13) like so

$$
\begin{aligned}
\sigma^* &= \sigma + \kappa (y_0 - \mu) \\
\kappa^* &= \kappa
\end{aligned}
$$
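One way to see this relation: with $\sigma^* = \sigma + \kappa(y_0 - \mu)$, the GEVD ratio $t(y)/t(y_0)$ collapses exactly to the GPD survival function for exceedances of $y_0$. A minimal numerical check with arbitrary illustrative values:

```python
import numpy as np

# arbitrary GEVD parameters and threshold for illustration
mu, sigma, kappa = 25.0, 3.0, 0.1
y0 = 30.0
sigma_star = sigma + kappa * (y0 - mu)   # the GPD scale implied by the GEVD

def t_gevd(y):
    return (1.0 + kappa * (y - mu) / sigma) ** (-1.0 / kappa)

# the GEVD ratio t(y)/t(y0) coincides exactly with the GPD survival
# function for exceedances of y0, parameterized by (sigma*, kappa)
y = np.linspace(30.0, 45.0, 51)
gpd_sf = (1.0 + kappa * (y - y0) / sigma_star) ** (-1.0 / kappa)
assert np.allclose(t_gevd(y) / t_gevd(y0), gpd_sf)
```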

Log Probability

If we have a set of observations, we can maximize the log probability of the observations. We can define the probability density function for the GPD as

$$
p_\text{GPD}(y;\boldsymbol{\theta}) = \frac{1}{\sigma^*}\left[ 1 + \kappa^* \left( \frac{y-y_0}{\sigma^*} \right)\right]^{-\frac{1}{\kappa^*} - 1}_+
$$

where $a_+=\max(a,0)$. Subsequently, taking the log will result in

$$
\log p_\text{GPD}(\boldsymbol{y}_{1:N_{y_0}}|\boldsymbol{\theta}) = - N_{y_0} \log \sigma^* - (1+1/\kappa^*)\sum_{n=1}^{N_{y_0}} \log \left[ 1 + \kappa^* z_n\right]_+
$$

where $z_n=(y_n - y_0)/\sigma^*$, $[1 + \kappa^* z_n]_+ = \max(1 + \kappa^* z_n,0)$, and $N_{y_0}$ is the number of exceedances above the threshold, $y_0$.
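As with the GEVD, this log-likelihood is a few lines of code. A sketch of the $\kappa^*\neq 0$ case, verified against `scipy.stats.genpareto` log-densities (arbitrary illustrative parameter values):

```python
import numpy as np
from scipy.stats import genpareto

def gpd_log_likelihood(y, y0, sigma_star, kappa_star):
    """Log-likelihood of the N_{y0} threshold exceedances under the GPD."""
    z = (y - y0) / sigma_star
    zp = np.maximum(1.0 + kappa_star * z, 0.0)
    return (-y.size * np.log(sigma_star)
            - (1.0 + 1.0 / kappa_star) * np.sum(np.log(zp)))

# simulated exceedances above the threshold y0
y0, sigma_star, kappa_star = 30.0, 2.0, 0.1
y = genpareto.rvs(c=kappa_star, loc=y0, scale=sigma_star, size=500, random_state=1)
assert np.isclose(gpd_log_likelihood(y, y0, sigma_star, kappa_star),
                  genpareto.logpdf(y, c=kappa_star, loc=y0, scale=sigma_star).sum())
```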


Rate

The rate parameter is defined as the expected number of events per event period, $T$.

$$
\begin{aligned}
\lambda_{y_0} &= \left[ 1 + \kappa z \right]^{- \frac{1}{\kappa}}, && && z = (y_0 - \mu)/\sigma
\end{aligned}
$$


This parameterization is useful for both the GEVD and the GPD.
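A small numerical check of the rate, under the $\kappa\neq 0$ GEVD with arbitrary illustrative values: the rate equals $-\log F_\text{GEVD}(y_0)$, so $\exp(-\lambda_{y_0})$ is the probability that a block maximum stays below the threshold (scipy's shape parameter is $c=-\kappa$).

```python
import numpy as np
from scipy.stats import genextreme

def rate(y0, mu, sigma, kappa):
    """Expected number of exceedances of y0 per block period (kappa != 0)."""
    z = (y0 - mu) / sigma
    return (1.0 + kappa * z) ** (-1.0 / kappa)

# the rate equals -log F_GEVD(y0), so exp(-lambda) is the probability
# of no exceedance within a block
mu, sigma, kappa = 25.0, 3.0, 0.1
y0 = 30.0
lam = rate(y0, mu, sigma, kappa)
assert np.isclose(np.exp(-lam), genextreme.cdf(y0, c=-kappa, loc=mu, scale=sigma))
```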


Return Period

For the GEVD, we have the return period defined as:

$$
y = \begin{cases} \mu + \frac{\sigma}{\kappa}\left\{\left[-\log\left(1-1/T_R\right)\right]^{-\kappa}-1\right\}, && \kappa\neq 0 \\ \mu - \sigma \log \left[ - \log \left(1 - 1/T_R \right) \right], && \kappa=0 \end{cases}
$$

For the GPD, we have the return period defined as:

$$
y = \begin{cases} y_0 + \frac{\sigma}{\kappa} \left[ (\lambda_{y_0} T_R)^{\kappa} - 1 \right], && \kappa \neq 0 \\ y_0 + \sigma \log (\lambda_{y_0} T_R), && \kappa = 0 \end{cases}
$$
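For the $\kappa\neq 0$ GEVD case, the return level is simply the quantile at probability $1-1/T_R$, so it can be checked against `scipy.stats.genextreme.ppf` (again with scipy's $c=-\kappa$ convention; the parameter values are arbitrary illustrations):

```python
import numpy as np
from scipy.stats import genextreme

def gevd_return_level(T_R, mu, sigma, kappa):
    """GEVD return level (kappa != 0): exceeded on average once per T_R blocks."""
    return mu + (sigma / kappa) * ((-np.log(1.0 - 1.0 / T_R)) ** (-kappa) - 1.0)

# the return level is the (1 - 1/T_R) quantile of the GEVD
mu, sigma, kappa = 25.0, 3.0, 0.1
T_R = np.array([10.0, 50.0, 100.0])
assert np.allclose(gevd_return_level(T_R, mu, sigma, kappa),
                   genextreme.ppf(1.0 - 1.0 / T_R, c=-kappa, loc=mu, scale=sigma))
```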

Process Parameterization

Concretely, we treat the parameters of the extreme value distribution as a latent variable:

$$
\begin{aligned}
\text{Latent Variable}: && && \boldsymbol{\theta} := \mathbf{z} &= \begin{bmatrix} \boldsymbol{z}_{\boldsymbol{\mu}} \\ \boldsymbol{z}_{\boldsymbol{\sigma}} \\ \boldsymbol{z}_{\boldsymbol{\kappa}} \end{bmatrix} && && \mathbf{z} \in \mathbb{R}^{D_\Omega}
\end{aligned}
$$

We define functions for each of these latent variables (which are input parameters for the respective methods).

$$
\begin{aligned}
\text{Location Parameter}: && && \boldsymbol{z}_{\boldsymbol{\mu}} &\approx \boldsymbol{z}_{\boldsymbol{\mu}}(\mathbf{s},x;\boldsymbol{\theta}), && && \boldsymbol{z}_{\boldsymbol{\mu}}: \mathbb{R}^+\times\mathbb{R}^{D_x}\times\Theta \rightarrow \mathbb{R}^{D_\Omega} \\
\text{Scale Parameter}: && && \boldsymbol{z}_{\boldsymbol{\sigma}} &\approx \boldsymbol{z}_{\boldsymbol{\sigma}}(\mathbf{s},x;\boldsymbol{\theta}), && && \boldsymbol{z}_{\boldsymbol{\sigma}}: \mathbb{R}^+\times\mathbb{R}^{D_x}\times\Theta \rightarrow \mathbb{R}^{D_\Omega} \\
\text{Shape Parameter}: && && \boldsymbol{z}_{\boldsymbol{\kappa}} &\approx \boldsymbol{z}_{\boldsymbol{\kappa}}(\mathbf{s},x;\boldsymbol{\theta}), && && \boldsymbol{z}_{\boldsymbol{\kappa}}: \mathbb{R}^+\times\mathbb{R}^{D_x}\times\Theta \rightarrow \mathbb{R}^{D_\Omega}
\end{aligned}
$$

The hypothesis is that the location parameter of the extremes distribution is correlated with the GMST covariate. We also conjecture that the location parameter is highly correlated across stations. However, we do not postulate the same for the scale and shape parameters.


Latent Parameter

$$
\begin{aligned}
\text{Latent Variable}: && && \mathbf{z}(t,\boldsymbol{\theta}) &= \mathbf{z}_0 + \mathbf{z}_1\psi(t;\boldsymbol{\theta}) + \mathbf{z}_2\psi(\mathbf{s};\boldsymbol{\theta}) + \epsilon
\end{aligned}
$$

Location Parameters

We have a range of use cases for the location parameter.

$$
\begin{aligned}
\text{Location}: && && \mathbf{z}_{\boldsymbol{\mu}} &= \boldsymbol{\mu}_0 + \mu_1 \phi(t;\boldsymbol{\theta}) + \mu_2 \phi(\mathbf{s};\boldsymbol{\theta}) + \epsilon
\end{aligned}
$$

where $\mathbf{z}_{\boldsymbol{\mu}}\in\mathbb{R}^{D_\Omega}$.
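The location model above can be sketched with hypothetical coefficients and a placeholder basis, $\phi(t)=t$, standing in for a learned feature map; the per-station spatial effect is likewise simulated here for illustration only.

```python
import numpy as np

# hypothetical coefficients and bases: phi(t) = t as a placeholder for a
# learned temporal feature, and a simulated per-station spatial effect
rng = np.random.default_rng(2)
n_stations, n_times = 5, 100
t = np.linspace(0.0, 1.0, n_times)
phi_s = rng.normal(size=n_stations)

mu0, mu1, mu2 = 25.0, 1.5, 1.0
# broadcast to one location parameter per (station, time) pair
z_mu = mu0 + mu1 * t[None, :] + mu2 * phi_s[:, None]
assert z_mu.shape == (n_stations, n_times)
```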


Scale & Shape Parameters

For the scale and shape parameters, we impose only a constant form on each. We allow the scale and shape parameters to be different for each station.

$$
\begin{aligned}
\text{Scale}: && && \log \mathbf{z}_{\boldsymbol{\sigma}} &= \boldsymbol{\sigma}_0, && && \boldsymbol{\sigma}_0 \in \mathbb{R}^{D_\Omega} \\
\text{Shape}: && && \mathbf{z}_{\boldsymbol{\kappa}} &= \boldsymbol{\kappa}_0, && && \boldsymbol{\kappa}_0 \in \mathbb{R}^{D_\Omega}
\end{aligned}
$$

We use the $\log$ transformation to ensure that the scale parameter is positive.
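A minimal sketch of this transform, with hypothetical per-station values: the scale is parameterized on the log scale and exponentiated, which guarantees positivity without constraining the optimizer.

```python
import numpy as np

# one unconstrained parameter per station (hypothetical values)
sigma0 = np.array([-0.5, 0.0, 0.3])   # log-scale parameters
kappa0 = np.array([0.1, -0.05, 0.2])  # shape parameters, left unconstrained
z_sigma = np.exp(sigma0)              # inverts log z_sigma = sigma_0
assert np.all(z_sigma > 0.0)
```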


ODE Formulation

The DMT is formulated as an ordinary differential equation (ODE). First, we will define it as a system of ODEs whereby we have a state variable

$$
\begin{aligned}
\text{State}: && && \mathbf{z} &= \begin{bmatrix} x \\ y \end{bmatrix}, && && \mathbf{z}\in\mathbb{R}^2
\end{aligned}
$$

Now, we can define an equation of motion which describes the temporal dynamics of the system.

$$
\begin{aligned}
\text{Equation of Motion}: && && \frac{d\mathbf{z}}{dt} &= \boldsymbol{f}(\mathbf{z},t,\theta), && && \boldsymbol{f}:\mathbb{R}^2 \times \mathbb{R}^+ \times \Theta \rightarrow \mathbb{R}^2
\end{aligned}
$$

We also have initial measurements of the system

$$
\begin{aligned}
\text{Initial Values}: && && \mathbf{z}(0) &= \begin{bmatrix} x(0) \\ y(0) \end{bmatrix} := \mathbf{z}_0
\end{aligned}
$$

From the fundamental theorem of calculus, we know that the solution of said ODE is a temporal integration with respect to time

$$
\begin{aligned}
\text{TimeStepper}: && && \mathbf{z}_t = \mathbf{z}_0 + \int_0^t \boldsymbol{f}(\mathbf{z}_\tau, \tau, \theta)\,d\tau
\end{aligned}
$$

Conventionally, we use ODE solvers like Euler, Heun, or Runge-Kutta.

$$
\begin{aligned}
\text{ODESolver}: && && \mathbf{z}_t = \text{ODESolve}(\boldsymbol{f}, \mathbf{z}_0, t, \theta)
\end{aligned}
$$
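The ODESolve abstraction can be sketched with the simplest of these schemes, forward Euler, tested on a linear decay problem whose exact solution is known (the dynamics and parameter values here are illustrative, not the DMT system):

```python
import numpy as np

def odesolve_euler(f, z0, ts, theta):
    """A minimal forward-Euler ODESolve over the time grid ts."""
    zs = [np.asarray(z0, dtype=float)]
    for t0, t1 in zip(ts[:-1], ts[1:]):
        # one explicit Euler step: z_{k+1} = z_k + dt * f(z_k, t_k, theta)
        zs.append(zs[-1] + (t1 - t0) * f(zs[-1], t0, theta))
    return np.stack(zs)

# exponential decay dz/dt = -theta * z has the exact solution z0 * exp(-theta * t)
f = lambda z, t, theta: -theta * z
ts = np.linspace(0.0, 1.0, 1001)
zs = odesolve_euler(f, np.array([1.0, 2.0]), ts, theta=0.5)
assert np.allclose(zs[-1], np.array([1.0, 2.0]) * np.exp(-0.5), atol=1e-3)
```

Higher-order schemes (Heun, Runge-Kutta) follow the same interface but take better-informed steps.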

Non-Dimensionalization

We will reparameterize this ODE to remove the explicit dependence on time. Dividing the equation of motion for $y$ by that for $x$ and applying the chain rule gives

$$
\frac{dy}{dt}\frac{dt}{dx} = f(y,x,\theta)
$$

Parameterization

There are many special forms of ODEs which are known from the literature.

$$
\begin{aligned}
\text{1st Order ODE}: && && \boldsymbol{f}(y,x,\theta) &= \boldsymbol{f}_1(x) - \boldsymbol{f}_2(x)\cdot y
\end{aligned}
$$

An example form would be the following:

$$
\boldsymbol{f}(y,x,\theta) = a_0 + a_1 x + a_2 y
$$

Constant Form. The first form assumes that we have a constant change in DMT with respect to the GMST.

$$
\begin{aligned}
\text{Constant}: && && \boldsymbol{f}(y,x,\theta) &= a_0 \\
\text{Linear Solution}: && && y(x) &= y_0 + a_0 x
\end{aligned}
$$

Linear Form. The second form assumes that the change in DMT is linear in the GMST.

$$
\begin{aligned}
\text{Linear}: && && \boldsymbol{f}(y,x,\theta) &= a_0 + a_1 x \\
\text{Quadratic Solution}: && && y(x) &= y_0 + a_0 x + \frac{1}{2}a_1x^2
\end{aligned}
$$

Multiplicative Form. The third form assumes that the change in DMT is proportional to the DMT itself.

$$
\begin{aligned}
\text{Multiplicative}: && && \boldsymbol{f}(y,x,\theta) &= a_2 y \\
\text{Exponential Solution}: && && y(x) &= y_0 \exp \left( a_2x \right)
\end{aligned}
$$
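These closed-form solutions can be verified numerically. The sketch below integrates the multiplicative form $dy/dx = a_2 y$ with small forward-Euler steps in $x$ and compares against the exponential solution (the coefficient and initial value are arbitrary illustrations):

```python
import numpy as np

# integrate dy/dx = a2 * y with small forward-Euler steps in x and
# compare with the closed-form solution y(x) = y0 * exp(a2 * x)
a2, y0_init = 0.3, 1.5
xs = np.linspace(0.0, 2.0, 20001)
y = y0_init
for x0, x1 in zip(xs[:-1], xs[1:]):
    y = y + (x1 - x0) * (a2 * y)
assert np.isclose(y, y0_init * np.exp(a2 * 2.0), atol=1e-4)
```

The constant and linear forms can be checked the same way by swapping the right-hand side.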