This is a location-scale family distribution.
Parameters

$$
\begin{aligned}
\text{Location}: && y_0 &\in \mathbb{R} \\
\text{Scale}: && \sigma &\in \mathbb{R}^+ \\
\text{Shape}: && \kappa &\in \mathbb{R}
\end{aligned}
$$

Interpretation

We can interpret the shape parameter as follows:
- $\kappa = 0$: a type 1, short-tailed distribution with exponential decay.
- $\kappa > 0$: a type 2, heavy-tailed distribution with a slow power-law decay.
- $\kappa < 0$: a type 3, thin-tailed distribution with polynomial decay and a finite upper bound.
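To make the three tail regimes concrete, we can sketch the standardized survival function $[1+\kappa z]_+^{-1/\kappa}$ in pure Python (a minimal sketch; the function name is ours):

```python
import math

def gpd_survival_std(z, shape):
    """Survival function of the standardized GPD, [1 + kappa*z]_+^(-1/kappa)."""
    if shape == 0.0:
        # type 1: exponential decay
        return math.exp(-z)
    t = 1.0 + shape * z
    if t <= 0.0:
        # for kappa < 0 the support ends at z = -1/kappa
        return 0.0
    return t ** (-1.0 / shape)

# heavy tail (kappa > 0) decays slower than the exponential tail (kappa = 0)
assert gpd_survival_std(10.0, 0.5) > gpd_survival_std(10.0, 0.0)
# thin tail (kappa < 0) has a finite upper bound at z = -1/kappa = 2
assert gpd_survival_std(2.5, -0.5) == 0.0
```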
Probability Density Function

This denotes the probability density of our random variable $Y$ at some specific value, $y$, conditioned on the values exceeding the threshold $y_0$.

$$p(Y=y|y\geq y_0) := f(y;\boldsymbol{\theta})$$

We can define the probability density function as
$$
\boldsymbol{f}(y;\boldsymbol{\theta}) = \frac{1}{\sigma}\left[ 1 + \kappa \left( \frac{y-y_0}{\sigma} \right)\right]^{-\frac{1}{\kappa} - 1}_+
$$

where $a_+=\max(a,0)$.
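As a sketch, we can evaluate this density directly in pure Python (the function name is ours):

```python
import math

def gpd_pdf(y, loc, scale, shape):
    """GPD density: (1/sigma) * [1 + kappa*z]_+^(-1/kappa - 1)."""
    z = (y - loc) / scale
    if shape == 0.0:
        # exponential limit as kappa -> 0
        return math.exp(-z) / scale if z >= 0 else 0.0
    t = max(1.0 + shape * z, 0.0)
    if t == 0.0:
        # outside the support
        return 0.0
    return (1.0 / scale) * t ** (-1.0 / shape - 1.0)

# at y = y0 (z = 0), the density is 1/sigma for any shape
assert abs(gpd_pdf(0.0, 0.0, 2.0, 0.3) - 0.5) < 1e-12
assert abs(gpd_pdf(0.0, 0.0, 2.0, 0.0) - 0.5) < 1e-12
```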
Cumulative Distribution Function

This denotes the probability that our random variable $Y$ will be less than or equal to some specific value, $y$, conditioned on the values exceeding the threshold $y_0$.

$$p(Y\leq y|y\geq y_0) := F(y;\boldsymbol{\theta})$$

The cumulative distribution function is defined as

$$
\boldsymbol{F}(y;\boldsymbol{\theta}) =
\begin{cases}
1 - \left[ 1 + \kappa \left( \frac{y-y_0}{\sigma} \right)\right]^{-1/\kappa}, & \kappa\neq 0 \\
1 - \exp\left(-\frac{y-y_0}{\sigma}\right), & \kappa=0
\end{cases}
$$

Survival Function

This is the exceedance probability of $Y$ above some value $y$.
In this case, the probability is conditioned on exceeding the threshold value, $y_0$.

$$\boldsymbol{S}(y):= p(Y>y|Y>y_0) = 1 - p(Y\leq y|Y>y_0)$$

We denote this as the survival function of the GPD.
It is simply one minus the CDF:

$$\boldsymbol{S}(y;\boldsymbol{\theta}) = 1 - \boldsymbol{F}(y;\boldsymbol{\theta})$$

Plugging the CDF into this equation gives

$$
\boldsymbol{S}(y;\boldsymbol{\theta}) =
\begin{cases}
\left[ 1 + \kappa \left( \frac{y-y_0}{\sigma} \right)\right]^{-\frac{1}{\kappa}}, & \kappa\neq 0 \\
\exp\left(-\frac{y-y_0}{\sigma}\right), & \kappa=0
\end{cases}
$$

Quantile Function

This is also known as the point-percentile function or the inverse CDF.
This function maps a probability, $y_p$, to the value $y$ such that the probability of $Y$ being less than or equal to $y$ is $y_p$.

$$y_p = \boldsymbol{F}(y;\boldsymbol{\theta})$$

Taking the inverse of this function gives the inverse CDF, which we denote as the quantile function,

$$y = \boldsymbol{F}^{-1}(y_p;\boldsymbol{\theta}) := \boldsymbol{Q}(y_p;\boldsymbol{\theta})$$

where $y_p\in[0,1]$ is the data within the probability transform domain.
This can be computed in closed form:
$$
\boldsymbol{Q}(y_p) =
\begin{cases}
y_0 + \frac{\sigma}{\kappa} \left[ (1 - y_p)^{-\kappa} - 1 \right], & \kappa \neq 0 \\
y_0 - \sigma \log (1 - y_p), & \kappa = 0
\end{cases}
$$

Derivation, $\kappa\neq 0$

Here, we derive the quantile function for the case $\kappa\neq 0$.
First, we write out the CDF and set it equal to the probability $y_p$,

$$\boldsymbol{F}(y;\boldsymbol{\theta}):=y_p = 1 - \left(1 + \kappa z \right)^{-1/\kappa}$$

where $\kappa\neq 0$ and $z=(y - y_0)/\sigma$. We rearrange the terms to get
$$
\begin{aligned}
y_p &= 1 - \left(1 + \kappa z \right)^{-1/\kappa} \\
\left(1+\kappa z \right)^{-1/\kappa} &= 1 - y_p \\
-\frac{1}{\kappa}\log(1 + \kappa z) &= \log (1 - y_p) \\
\log (1 + \kappa z) &= - \kappa \log (1 - y_p) \\
1 + \kappa z &= (1 - y_p)^{-\kappa} \\
\kappa z &= (1 - y_p)^{-\kappa} - 1 \\
z &= \frac{1}{\kappa}\left[ (1 - y_p)^{-\kappa}-1\right]
\end{aligned}
$$

Now, we plug in $z$ to get our final expression
$$y = y_0 + \frac{\sigma}{\kappa}\left[(1 - y_p)^{-\kappa} - 1 \right]$$

Derivation, $\kappa = 0$

Here, we derive the quantile function for the case $\kappa = 0$.
First, we write out the CDF with $\kappa = 0$ and set it equal to the probability $y_p$,

$$\boldsymbol{F}(y;\boldsymbol{\theta}):=y_p = 1 - \exp(-z)$$

where $z=(y - y_0)/\sigma$. We rearrange the terms to get
$$
\begin{aligned}
y_p &= 1 - \exp(-z) \\
\exp(-z) &= 1 - y_p \\
-z &= \log (1 - y_p) \\
z &= - \log (1 - y_p)
\end{aligned}
$$

Now, we plug in $z$ to get our final expression.
$$y = y_0 - \sigma \log ( 1 - y_p)$$

Code Snippet

We can write some naive functions for calculating the quantile function based on equation (11):
```python
from math import log

# quantile function for kappa != 0
def gpd_quantile(yp, loc, scale, shape):
    # loc + scale / shape * ((1 - yp)^(-shape) - 1)
    return loc + scale / shape * ((1 - yp) ** (-shape) - 1)

# quantile function for kappa = 0
def gpd_quantile_shape_zero(yp, loc, scale):
    # loc - scale * log(1 - yp)
    return loc - scale * log(1 - yp)
```
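As a quick sanity check, the quantile function should invert the CDF. Here is a small pure-Python sketch (helper names are ours):

```python
from math import exp, log

# GPD CDF for kappa != 0 and kappa = 0 (see the CDF section above)
def cdf_nonzero(y, loc, scale, shape):
    z = (y - loc) / scale
    return 1.0 - (1.0 + shape * z) ** (-1.0 / shape)

def cdf_zero(y, loc, scale):
    z = (y - loc) / scale
    return 1.0 - exp(-z)

# closed-form quantile functions
def quantile_nonzero(yp, loc, scale, shape):
    return loc + scale / shape * ((1 - yp) ** (-shape) - 1)

def quantile_zero(yp, loc, scale):
    return loc - scale * log(1 - yp)

# round trip: F(Q(p)) == p
p = 0.9
assert abs(cdf_nonzero(quantile_nonzero(p, 1.0, 2.0, 0.3), 1.0, 2.0, 0.3) - p) < 1e-9
assert abs(cdf_zero(quantile_zero(p, 1.0, 2.0), 1.0, 2.0) - p) < 1e-9
```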
However, we should account for some of the numerical errors that can arise.
We can view the expressions in equation (11) as functions of the shape parameter only.
$$
\boldsymbol{F}_z(y_p;\kappa) =
\begin{cases}
\frac{1}{\kappa} \left[ (1-y_p)^{-\kappa} - 1\right], & \kappa \neq 0 \\
- \log (1 - y_p), & \kappa = 0
\end{cases}
$$

We can do some manipulation to write this in terms of a single function
$$
\boldsymbol{F}_z(y_p;\kappa) =
\begin{cases}
\frac{1}{\kappa}\left\{\exp\left[-\kappa\log (1 - y_p)\right] - 1\right\}, & \kappa \neq 0 \\
- \log (1 - y_p), & \kappa = 0
\end{cases}
$$

This makes it easier to code while accounting for numerical errors, because we can use a simple `where` statement.
```python
import jax.numpy as jnp

# location-scale inverse: y = loc + scale * z
def location_scale_inverse(z, location, scale):
    return location + scale * z

# location-scale forward: z = (y - loc) / scale
def location_scale_forward(y, location, scale):
    return (y - location) / scale

def gpd_quantile_shape(y_p, shape):
    # boolean mask for the kappa = 0 branch
    is_shape_zero = jnp.equal(shape, 0.0)
    # replace zero shapes with a safe value to avoid division by zero
    safe_shape = jnp.where(is_shape_zero, 1.0, shape)
    # negative log term, -log(1 - yp)
    neglog1mp = -jnp.log1p(-y_p)
    # kappa != 0 branch: [exp(-kappa * log(1 - yp)) - 1] / kappa
    shape_nonzero_term = jnp.expm1(safe_shape * neglog1mp) / safe_shape
    # select the appropriate branch
    return jnp.where(is_shape_zero, neglog1mp, shape_nonzero_term)

# quantile function for any kappa
def gpd_quantile(y_p, loc, scale, shape):
    # calculate the standardized quantile term
    z = gpd_quantile_shape(y_p, shape)
    # apply the inverse of the location-scale transform
    return location_scale_inverse(z, loc, scale)
```
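The same `expm1` trick can be checked in pure Python (a sketch, separate from the JAX code; the function name is ours): for tiny $\kappa$, the stabilized branch should approach the $\kappa=0$ branch, $-\log(1-y_p)$.

```python
import math

def stable_quantile_term(y_p, shape):
    """Standardized quantile via expm1: [exp(-kappa*log(1 - y_p)) - 1] / kappa."""
    neglog1mp = -math.log1p(-y_p)  # -log(1 - y_p)
    if shape == 0.0:
        return neglog1mp
    return math.expm1(shape * neglog1mp) / shape

# for tiny kappa, the two branches agree closely
y_p = 0.99
near_zero = stable_quantile_term(y_p, 1e-12)
exact_zero = stable_quantile_term(y_p, 0.0)
assert abs(near_zero - exact_zero) < 1e-8
```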
Inverse Survival Function

The inverse survival function maps a survival probability, $y_s$, to the value $y$ such that the probability of exceeding $y$ is $y_s$.

$$y_s = Pr[Y > y]$$

Recall that the survival function is one minus the CDF.
This implies that the inverse survival function is the quantile function evaluated at $1 - y_s$.

$$y_s = \boldsymbol{S}(y;\boldsymbol{\theta}) = 1 - \boldsymbol{F}(y;\boldsymbol{\theta})$$

Taking the inverse of this function gives

$$y = \boldsymbol{S}^{-1}(y_s;\boldsymbol{\theta}) = \boldsymbol{Q}(1 - y_s;\boldsymbol{\theta})$$

where $y_s\in[0,1]$ is the survival probability within the probability transform domain.
This can be computed in closed form:
$$
\boldsymbol{S}^{-1}(y_s) =
\begin{cases}
y_0 + \frac{\sigma}{\kappa} \left[ y_s^{-\kappa} - 1 \right], & \kappa \neq 0 \\
y_0 - \sigma \log y_s, & \kappa = 0
\end{cases}
$$

where $y_s=1-y_p$ is the survival probability.
Code Snippet
We can reuse the code snippet from before.
```python
# inverse survival function for any kappa
def gpd_isf(y_s, loc, scale, shape):
    # the survival probability y_s corresponds to the quantile 1 - y_s
    z = gpd_quantile_shape(1 - y_s, shape)
    # apply the inverse of the location-scale transform
    return location_scale_inverse(z, loc, scale)
```
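As a quick sanity check, the closed-form inverse survival function should round-trip through the survival function. A pure-Python sketch for the $\kappa\neq 0$ case (helper names are ours):

```python
# survival function for kappa != 0: S(y) = [1 + kappa*z]^(-1/kappa)
def survival_closed_form(y, loc, scale, shape):
    z = (y - loc) / scale
    return (1.0 + shape * z) ** (-1.0 / shape)

# closed-form inverse survival function for kappa != 0
def isf_closed_form(y_s, loc, scale, shape):
    return loc + scale / shape * (y_s ** (-shape) - 1.0)

# round trip: S(S^{-1}(y_s)) == y_s
y_s = 0.05
y = isf_closed_form(y_s, 1.0, 2.0, 0.3)
assert abs(survival_closed_form(y, 1.0, 2.0, 0.3) - y_s) < 1e-9
```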
Joint Distribution

We can write the likelihood that the observations, $y$, follow the GPD.
So, given some observations, $\mathcal{D}=\{y_n\}_{n=1}^{N}$, which we believe follow the GPD, we can write the joint distribution decomposition as

$$
p(y_{1:N}>y_0, \boldsymbol{\theta}) = p(\boldsymbol{\theta}) \prod_{n=1}^N p(y_n|y_n>y_0,\boldsymbol{\theta})\,p(y_n>y_0|\boldsymbol{\theta})
$$

This implies that the global prior parameters come from some distribution
$$\boldsymbol{\theta} \sim p(\boldsymbol{\theta})$$

and that these parameters get passed through our data likelihood term

$$y_n \sim p(y|\boldsymbol{\theta})$$

Log Probability

Let's say we are given some samples,

$$\mathcal{D} = \left\{ y_n\right\}_{n=1}^N$$

where $N$ is the number of exceedances above our threshold, $y_0$.
Recall the GPD PDF for our iid samples is

$$
p(y_{1:N}|\boldsymbol{\theta}) = \prod_{n=1}^N \frac{1}{\sigma}\left[ 1 + \kappa \left( \frac{y_n-y_0}{\sigma} \right)\right]^{-\frac{1}{\kappa} - 1}_+
$$

Taking the log gives

$$
\log p(\boldsymbol{y}_{1:N}|\boldsymbol{\theta}) = \sum_{n=1}^N \log \left( \frac{1}{\sigma} \left[ 1 + \kappa \left( \frac{y_n-y_0}{\sigma} \right)\right]^{-\frac{1}{\kappa} - 1}_+ \right)
$$

which reduces to

$$
\log p(\boldsymbol{y}_{1:N}|\boldsymbol{\theta}) = - N \log \sigma - (1+1/\kappa)\sum_{n=1}^N \log \left[ 1 + \kappa z_n\right]_+
$$

where $z_n=(y_n - y_0)/\sigma$ and $[1 + \kappa z_n]_+ = \max(1 + \kappa z_n,0)$.
Code Snippet
We can create a log-likelihood function for this.
```python
import numpy as np

def gpd_logpdf(y, location, scale, shape):
    # location-scale transform: z = (y - y0) / sigma
    z = (y - location) / scale
    # t(z) = max(1 + kappa * z, 0)
    t = np.maximum(1.0 + shape * z, 0.0)
    # term 1: -log(sigma)
    t1 = -np.log(scale)
    # term 2: -(1 + 1/kappa) * log(1 + kappa * z)
    t2 = -(1.0 / shape + 1.0) * np.log(t)
    return t1 + t2
```
Instead of calculating the full scheme explicitly, we can simply apply a vectorized operation:

```python
y: Array["T"] = ...
params: PyTree = ...
# apply the vectorized operation
log_probs: Array["T"] = vectorize(gpd_logpdf, y, *params)
# take the sum and negate to get the negative log-likelihood
nll: Scalar = - sum(log_probs)
```
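We can sanity-check the reduced closed form against the sum of per-sample log-densities. A pure-Python sketch with made-up sample values:

```python
from math import log

def logpdf_single(y, loc, scale, shape):
    # per-sample log density: -log(sigma) - (1 + 1/kappa) * log(1 + kappa*z)
    z = (y - loc) / scale
    return -log(scale) - (1.0 / shape + 1.0) * log(1.0 + shape * z)

# hypothetical exceedances above the threshold y0 = 1.0
samples = [1.5, 2.0, 4.2]
loc, scale, shape = 1.0, 2.0, 0.3

# sum of the per-sample terms
total = sum(logpdf_single(y, loc, scale, shape) for y in samples)

# reduced closed form: -N*log(sigma) - (1 + 1/kappa) * sum(log[1 + kappa*z_n])
n = len(samples)
closed = -n * log(scale) - (1.0 + 1.0 / shape) * sum(
    log(1.0 + shape * (y - loc) / scale) for y in samples
)
assert abs(total - closed) < 1e-12
```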
Proof of Log-Probability, $\kappa\neq 0$

We are interested in calculating the log probability function

$$\log p(\boldsymbol{y}_{1:N}|\boldsymbol{\theta}) = \sum_{n=1}^N \log p(y_n|\boldsymbol{\theta})$$

Let's consider only a single input, $y_n$.
We plug $y_n$ into the likelihood term.

$$p(y_n|\boldsymbol{\theta}) = \frac{1}{\sigma}\left[ 1 + \kappa \left( \frac{y_n-y_0}{\sigma} \right)\right]^{-\frac{1}{\kappa} - 1}_+$$

Now, we apply the log function
$$
\log p(y_n|\boldsymbol{\theta}) = \log \left( \frac{1}{\sigma} \left[ 1 + \kappa \left( \frac{y_n-y_0}{\sigma} \right)\right]^{-\frac{1}{\kappa} - 1}_+ \right)
$$

We can separate each of the terms

$$
\log p(y_n|\boldsymbol{\theta}) = \log \left(\frac{1}{\sigma}\right) + \log \left( \left[ 1 + \kappa \left( \frac{y_n-y_0}{\sigma} \right)\right]^{-\frac{1}{\kappa} - 1}_+ \right)
$$

Now we can apply some log rules to simplify the terms
$$
\log p(y_n|\boldsymbol{\theta}) = -\log \sigma + \left(-\frac{1}{\kappa} - 1\right)\log \left[ 1 + \kappa \left( \frac{y_n-y_0}{\sigma} \right)\right]_+
$$

Let $z_n=\frac{y_n-y_0}{\sigma}$ to get a more compact form.
$$
\log p(y_n|\boldsymbol{\theta}) = -\log \sigma + \left(-\frac{1}{\kappa} - 1\right) \log \left[ 1 + \kappa z_n\right]_+
$$

Now, we can plug this into the sum
$$
\log p(\boldsymbol{y}_{1:N}|\boldsymbol{\theta}) = \sum_{n=1}^N \left( -\log \sigma - (1+1/\kappa) \log \left[ 1 + \kappa z_n\right]_+ \right)
$$

We can factor out the constant terms
$$
\log p(\boldsymbol{y}_{1:N}|\boldsymbol{\theta}) = - N \log \sigma - (1+1/\kappa)\sum_{n=1}^N \log \left[ 1 + \kappa z_n\right]_+
$$

Return Period

We can calculate the return period (RP) using equation (8).
$$1/T_R = Pr[Y>y|Y>y_0]\,Pr[Y>y_0]$$

We assume that the occurrence of events over the threshold, $y_0$, is given by a Poisson process,

$$\lambda_p = Pr[Y>y_0]$$

Practically, we set the conditional term to the survival function of the GPD.

$$1/T_R = \lambda_p\boldsymbol{S}(y;\boldsymbol{\theta}) = \lambda_p\left(1-\boldsymbol{F}(y;\boldsymbol{\theta})\right)$$

If we rearrange this equation, we get

$$y_p := \boldsymbol{F}(y;\boldsymbol{\theta}) = 1 - 1 / (\lambda_p T_R)$$

To make things simpler, we can use the quantile function in equation (11) with the probability set to

$$y_p = 1 - 1 / (\lambda_p T_R)$$

If we expand this out, we get
$$
y =
\begin{cases}
y_0 + \frac{\sigma}{\kappa} \left[ (\lambda_p T_R)^{\kappa} - 1 \right], & \kappa \neq 0 \\
y_0 + \sigma \log (\lambda_p T_R), & \kappa = 0
\end{cases}
$$

Proof
We can expand the rearranged equation to isolate the survival function

$$1/(\lambda_p T_R) = \boldsymbol{S}(y;\boldsymbol{\theta})$$

Plugging in the $\kappa \neq 0$ term, we get

$$
\begin{aligned}
1/(\lambda_p T_R) &= [1 + \kappa z]_+^{-1/\kappa} \\
-\log (\lambda_p T_R) &= -\frac{1}{\kappa}\log[1 + \kappa z] \\
\log(1+\kappa z) &= \kappa\log (\lambda_p T_R) \\
1+\kappa z &= (\lambda_p T_R)^{\kappa} \\
z &= \frac{1}{\kappa}\left[(\lambda_p T_R)^{\kappa}-1\right]
\end{aligned}
$$

Now, we apply the inverse location-scale transform to get

$$y = y_0 + \frac{\sigma}{\kappa}\left[(\lambda_p T_R)^{\kappa}-1\right]$$

We can do the same for the $\kappa = 0$ term

$$
\begin{aligned}
1/(\lambda_p T_R) &= \exp(-z) \\
-z &= -\log(\lambda_p T_R) \\
z &= \log(\lambda_p T_R)
\end{aligned}
$$

Now, we apply the inverse location-scale transform to get

$$y = y_0 + \sigma \log (\lambda_p T_R)$$

Average Recurrence Interval

We can calculate the average recurrence interval (ARI) using equation (14).
$$\exp(-1/\bar{T}) = \exp\left(-Pr[Y>y|Y>y_0]\,Pr[Y>y_0]\right)$$

Removing the exponentials, we are left with

$$1/\bar{T} = Pr[Y>y|Y>y_0]\,Pr[Y>y_0]$$

We assume that the occurrence of events over the threshold, $y_0$, is given by a Poisson process,

$$\lambda_p = Pr[Y>y_0]$$

Practically, we set the conditional term to the survival function of the GPD.

$$1/\bar{T} = \lambda_p\boldsymbol{S}(y;\boldsymbol{\theta}) = \lambda_p\left(1-\boldsymbol{F}(y;\boldsymbol{\theta})\right)$$

If we rearrange this equation, we get

$$y_p := \boldsymbol{F}(y;\boldsymbol{\theta}) = 1 - 1 / (\lambda_p \bar{T})$$

To make things simpler, we can use the quantile function in equation (11) with the probability set to

$$y_p = 1 - 1 / (\lambda_p \bar{T})$$

If we expand this out, we get
$$
y =
\begin{cases}
y_0 + \frac{\sigma}{\kappa} \left[ (\lambda_p \bar{T})^{\kappa} - 1 \right], & \kappa \neq 0 \\
y_0 + \sigma \log (\lambda_p \bar{T}), & \kappa = 0
\end{cases}
$$

Proof
We can expand the RHS of the equation to include the survival function

$$\exp(-1/\bar{T}) = \exp\left(-\lambda_p\boldsymbol{S}(y;\boldsymbol{\theta})\right)$$

and we can reduce this to

$$1/(\lambda_p \bar{T}) = \boldsymbol{S}(y;\boldsymbol{\theta})$$

Plugging in the $\kappa \neq 0$ term, we get

$$
\begin{aligned}
\lambda_p\bar{T} &= [1 + \kappa z]_+^{1/\kappa} \\
\kappa \log (\lambda_p\bar{T}) &= \log [ 1 + \kappa z] \\
1 + \kappa z &= (\lambda_p\bar{T})^{\kappa} \\
z &= \frac{1}{\kappa}\left[ (\lambda_p\bar{T})^{\kappa}-1\right]
\end{aligned}
$$

Now, we apply the inverse location-scale transform to get

$$y = y_0 + \frac{\sigma}{\kappa}\left[ (\lambda_p\bar{T})^{\kappa}-1\right]$$

We can do the same for the $\kappa = 0$ term

$$
\begin{aligned}
1/(\lambda_p\bar{T}) &= \exp(-z) \\
z &= \log (\lambda_p\bar{T})
\end{aligned}
$$

Now, we apply the inverse location-scale transform to get

$$y = y_0 + \sigma\log (\lambda_p\bar{T})$$

Reparameterization

In this instance, we assume that there is a threshold parameter, $y_0$.
We can write the reparameterization of the GPD in terms of the GEV distribution:

$$
\begin{aligned}
\mu_{y_0} &= \mu + \frac{\sigma}{\kappa}\left(1 - \lambda_{y_0}^{\kappa} \right), &
\sigma_{y_0} &= \sigma\lambda_{y_0}^{\kappa}, &
\kappa&\neq0 \\
\mu_{y_0} &= \mu - \sigma\ln\lambda_{y_0}, &
\sigma_{y_0} &= \sigma\lambda_{y_0}^{\kappa}, &
\kappa&=0
\end{aligned}
$$

We can also reparameterize $\lambda_{y_0}$ in terms of the GPD and GEV parameters.
$$
\begin{aligned}
\lambda_{y_0} &= \left[ 1 + \kappa z \right]^{- \frac{1}{\kappa}}, & z &= (y - \mu_{y_0})/\sigma \\
\sigma_{y_0} &= \sigma + \kappa(y - \mu_{y_0}) \\
\kappa_{y_0} &= \kappa
\end{aligned}
$$

Lastly, we can write something similar in log form:
$$
\begin{aligned}
\log\lambda &= - \frac{1}{\kappa}\ln \left[ 1 + \kappa \frac{y_0 - \mu}{\sigma} \right] \\
\sigma_{y_0} &= \sigma + \kappa(y_0 - \mu) \\
\kappa_{y_0} &= \kappa
\end{aligned}
$$

There has also been a similar reparameterization found in the literature,

$$\lambda = 1 - \exp\left\{ - h \left[ 1 + \kappa(y_0 - \mu)/\sigma\right]^{-1/\kappa}\right\}$$

$$\sigma_{y_0} = \log[\sigma + \kappa(y_0 - \mu)/\sigma]$$

Similarly, another reparameterization is
$$\sigma_{y_0} = \sigma + \kappa (y_0 - \mu)$$

Background

$$p(Y\leq y+y_0|Y>y_0) = 1 - \frac{1-F(y+y_0)}{1-F(y_0)}$$

where $y>0$.
If we let $y_0\rightarrow\infty$, then this leads to an approximate family of distributions given by the GPD.
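As a concrete illustration of this construction: for an exponential parent distribution, the conditional excess distribution is exactly a GPD with $\kappa=0$ and $\sigma=1$. A pure-Python sketch (helper names are ours):

```python
import math

def exp_cdf(y):
    # parent distribution: exponential with unit rate
    return 1.0 - math.exp(-y)

def conditional_excess_cdf(y, y0):
    # p(Y <= y + y0 | Y > y0) = 1 - (1 - F(y + y0)) / (1 - F(y0))
    return 1.0 - (1.0 - exp_cdf(y + y0)) / (1.0 - exp_cdf(y0))

def gpd_cdf_shape_zero(y, scale):
    # GPD CDF with kappa = 0: 1 - exp(-y / sigma)
    return 1.0 - math.exp(-y / scale)

# the excess distribution of an exponential matches GPD(kappa=0, sigma=1)
for y in (0.1, 1.0, 3.0):
    assert abs(conditional_excess_cdf(y, 2.0) - gpd_cdf_shape_zero(y, 1.0)) < 1e-12
```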
Marginal Survival Function

We are interested in the marginal probability of occurrence above an arbitrary value, $y$.
We can write the joint distribution of both quantities, factored as follows.

$$p(Y>y,Y>y_0)=p(Y>y|Y>y_0)\,p(Y>y_0)$$

The first term is the rate of exceedances above some quantity $y$ given some threshold, $y_0$.
The second term is the probability that an event is above some threshold, $y_0$.
We could also describe it as the rate of exceedances above the threshold, $y_0$.
Let's define an arrival rate, $\lambda$, to be the average number of events per year larger than the threshold, $y_0$.
This is analogous to a Poisson distribution.
$$p(Y>y_0) := \text{Pois}(Y=k)$$

where $k$ is the number of occurrences within some period $T$.
We can write down this distribution as

$$f(k;\lambda)= \frac{\lambda^k}{k!}e^{-\lambda}$$

We know that the expected value is simply the parameter $\lambda$.

$$\mathbb{E}\left[p(Y>y_0) \right] = \lambda$$

We can calculate this approximately by summing the number of events over the threshold, $y_0$, and dividing by the total number of events, $N_{y_0}$:

$$\hat{\lambda} = \frac{1}{N_{y_0}}\sum_{n=1}^{N_{y_0}} \boldsymbol{I}(y_n > y_0)$$

To relate this back to our function, we need the rate parameter $\lambda$ in units of events per year,

$$\lambda_{year} = \lambda t \hspace{5mm} [\text{events}][\text{year}]^{-1}$$

where $t$ is some conversion factor from the original time unit to years.
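A minimal sketch of the empirical rate estimate and the unit conversion (the data values are made up, and we assume daily observations):

```python
# hypothetical daily observations; count exceedances above the threshold y0
y0 = 10.0
daily_values = [8.2, 12.5, 9.9, 15.1, 10.3, 7.7, 11.0, 9.5, 10.1, 13.4]

# empirical exceedance rate per observation: (1/N) * sum(I(y_n > y0))
n_exceed = sum(1 for y in daily_values if y > y0)
lam = n_exceed / len(daily_values)

# conversion factor from days to years: t = 365.25
lam_year = lam * 365.25  # units: [events][year]^-1
```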
Literature Review

There are many cases where the GPD is used within the literature.
We split this section into theory and applications.
However, most of the theoretical work presenting this method is grounded in applied settings, in particular hydrology.
Theory.
From a more theoretical perspective,
Davison & Smith (1990), Martins & Stedinger (2001), and Chavez-Demoulin et al. (2005) introduce the Poisson-GPD method and also relate the AEP and ARI.
Coles (2001, Chapter 4.3.3 and Appendix A.1) is a staple reference that introduces the Poisson-GPD method.
Wang & Holmes (2020) discuss some of the key differences between the annual exceedance probability (AEP) and the average recurrence interval (ARI).
Birkhäuser Basel (2007) is another good book with a few chapters dedicated to BM, POT, and PP methods.
Applications.
Nemukula & Sigauke (2020) study maximum daily temperature in South Africa, applying a non-stationary Poisson-GPD.
Thiombiano et al. (2016) study how the Arctic Oscillation and Pacific North American covariates relate to extreme daily precipitation in Southeastern Canada, applying a Poisson-GPD method with spline functions to map the covariates to the GPD parameters.
Silva et al. (2015) study how the El Niño-Southern Oscillation can affect the flood regime in the Itajaí river basin in Southern Brazil, applying a non-stationary Poisson-GPD. Cid et al. (2015) look at how the North Atlantic Oscillation affects storm surges in the Atlantic and North Atlantic regions, applying a non-stationary Poisson-GPD.
Katz et al. (2005) investigate ecological disturbances and extremes for paleoecology, applying a non-stationary Poisson-GPD.
Algorithms.
Silva et al. (2015) showcase how we can use the delta method (aka the Laplace approximation) to calculate uncertainty intervals on the parameters.
MacDonald et al. (2011) consider a kernel density estimator for parameterizing the event occurrences.