
Background

CSIC
UCM
IGEO

Probability Density Function

The probability that a variate, $Y$, has the value $y$.

$$
Pr[Y=y] := f(y)
$$

Cumulative Distribution Function

The probability that a variate, $Y$, takes a value less than or equal to $y$.

$$
Pr[Y \leq y] := F(y)
$$

From a temporal point process (TPP) perspective, this is known as the lifetime distribution.

$$
F(t) = Pr[T\leq t] = 1 - S(t)
$$

Survival Function

The probability that a variate, $Y$, takes a value greater than $y$. In other words, this gives the probability that an event will happen past a value $y$, e.g., time.

$$
Pr[Y > y] = 1 - Pr[Y \leq y] = 1 - F(y) := S(y)
$$

From a TPP perspective, i.e., $S(t)$ where $t\in[0,\infty)$, we have the following properties:

  1. The survival function is non-increasing.
  2. At $t=0$, $S(t)=1$, i.e., the probability of surviving past time 0 is 1.
  3. At $t=\infty$, $S(t=\infty)=0$, i.e., as time goes to infinity, the survival curve goes to 0.

In theory, the survival function is smooth. However, in practice, we may observe events on a discrete scale. For example, on a time scale we may have days, weeks, or months.
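As a minimal sketch of the three properties above, the snippet below evaluates a survival function on a discrete daily grid. The exponential lifetime model, the rate value, and the function names are illustrative assumptions, not something the text prescribes.

```python
import math


def exp_cdf(t: float, rate: float) -> float:
    """CDF of an exponential lifetime, F(t) = 1 - exp(-rate * t), for t >= 0."""
    return 1.0 - math.exp(-rate * t)


def exp_survival(t: float, rate: float) -> float:
    """Survival function, S(t) = 1 - F(t) = exp(-rate * t)."""
    return math.exp(-rate * t)


rate = 0.1  # assumed rate per day (illustrative value)
days = list(range(0, 366))  # discrete daily time scale
survival = [exp_survival(t, rate) for t in days]

# Property 1: the survival function is non-increasing.
is_non_increasing = all(a >= b for a, b in zip(survival, survival[1:]))
# Property 2: S(0) = 1.
starts_at_one = survival[0] == 1.0
# Property 3: S(t) approaches 0 as t grows.
tail_value = survival[-1]

print(is_non_increasing, starts_at_one, tail_value)
```

On a coarser scale (weeks or months) only the grid changes; the same checks apply.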


Quantile Function

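Let $y_p$ denote the probability that the variate does not exceed $y$

$$
y_p = Pr[Y\leq y]
$$

We can invert this relationship and write it as the quantile function

$$
y = F^{-1}(y_p) := Q(y_p)
$$

We often use this function for frequency estimation, e.g., to compute return levels from the AEP or the ARI.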


Inverse Survival Function

This is the same as the quantile function given in equation (6), except that we set the probability in terms of the survival probability, $y_s$

$$
y_p = 1 - y_s
$$
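A minimal sketch of the quantile and inverse survival functions, assuming an exponential distribution because its CDF can be inverted in closed form. The distribution, the rate, and the function names are illustrative assumptions.

```python
import math


def exp_cdf(y: float, rate: float) -> float:
    """F(y) = 1 - exp(-rate * y) for y >= 0."""
    return 1.0 - math.exp(-rate * y)


def exp_quantile(y_p: float, rate: float) -> float:
    """Q(y_p) = F^{-1}(y_p) = -log(1 - y_p) / rate for 0 <= y_p < 1."""
    if not 0.0 <= y_p < 1.0:
        raise ValueError("y_p must lie in [0, 1)")
    return -math.log1p(-y_p) / rate


def exp_inverse_survival(y_s: float, rate: float) -> float:
    """Inverse survival function: the quantile evaluated at y_p = 1 - y_s."""
    return exp_quantile(1.0 - y_s, rate)


rate = 2.0  # assumed rate (illustrative value)
median = exp_quantile(0.5, rate)
round_trip = exp_cdf(median, rate)  # recovers the probability 0.5
upper_decile = exp_inverse_survival(0.1, rate)  # value exceeded with probability 0.1

print(median, round_trip, upper_decile)
```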

Annual Exceedance Probability

The recurrence interval is a measure of how often an event is expected to occur, based on the probability of exceeding a given stage threshold in any one year. This probability is called the annual exceedance probability. We can express it in terms of the return period (in years) as

$$
R_a = R_a(T_a) = \frac{1}{T_a}
$$

where $R_a$ is the annual exceedance probability (AEP) and $T_a$ is the return period in years (Wang & Holmes, 2020). The AEP has a domain between 0 and 1, $R_a\in[0,1]$, and the return period, $T_a$, has a domain between 1 and infinity, $T_a\in[1,\infty)$. This can be limiting when we consider sub-annual events, whose return periods would be less than 1 year. In addition, it can give incorrect results when values are wrongly interpolated between 100 and 1.
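The reciprocal relationship and its domain restrictions can be sketched as follows. The function names are hypothetical, and the domain checks simply encode the ranges stated above.

```python
def aep_from_return_period(t_a: float) -> float:
    """R_a = 1 / T_a, defined for return periods T_a >= 1 year."""
    if t_a < 1.0:
        raise ValueError("the return period T_a must be at least 1 year")
    return 1.0 / t_a


def return_period_from_aep(r_a: float) -> float:
    """T_a = 1 / R_a, defined for probabilities 0 < R_a <= 1."""
    if not 0.0 < r_a <= 1.0:
        raise ValueError("the AEP R_a must lie in (0, 1]")
    return 1.0 / r_a


for t_a in (1.0, 2.0, 10.0, 100.0):
    print(f"T_a = {t_a:6.1f} years -> R_a = {aep_from_return_period(t_a):.4f}")
```

A sub-annual return period such as `aep_from_return_period(0.5)` raises an error here, which is the limitation described above.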

Figure 1: A figure showing the return period [years] vs the probability of exceedance, $R_a$.


Derivation

This section is based on Wang & Holmes, 2020 and Davison & Smith, 1990. Let $Y_t$ be an indicator variable that indicates whether or not at least one event occurs in $(t,t+1]$

$$
Y_t = \begin{cases} 1, & \text{when }N_{t+1}-N_t >0 \\ 0, & \text{otherwise} \end{cases}
$$

Then, $Y_t$ follows a Bernoulli distribution with the probabilities

$$
F(t) = Pr[T\leq t] = 1 - S(t)
$$

Usage

In practice, we can use this to calculate the return level given any arbitrary CDF

$$
R_a = Pr[Y > y] = 1 - Pr[Y \leq y] = 1 - F(y; \boldsymbol{\theta})
$$

We then solve this for the quantity $y$ in terms of $R_a = 1/T_a$.

$$
\frac{1}{T_a} = 1 - \boldsymbol{F}(y;\boldsymbol{\theta})
$$

After we simplify the expression, we get the following relationship

$$
y = \boldsymbol{F}^{-1}(y_p;\boldsymbol{\theta}) = \boldsymbol{Q}(y_p;\boldsymbol{\theta})
$$

where $\boldsymbol{Q}$ is the quantile function, i.e., the inverse CDF, and $y_p = 1 - R_a = 1 - 1/T_a$.
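A minimal sketch of this recipe, assuming a Gumbel distribution for $F$ because its quantile function has a closed form. The choice of distribution, the parameter values, and the function names are assumptions made for illustration only.

```python
import math


def gumbel_cdf(y: float, loc: float, scale: float) -> float:
    """F(y; theta) = exp(-exp(-(y - loc) / scale))."""
    return math.exp(-math.exp(-(y - loc) / scale))


def gumbel_quantile(y_p: float, loc: float, scale: float) -> float:
    """Q(y_p; theta) = loc - scale * log(-log(y_p)) for 0 < y_p < 1."""
    if not 0.0 < y_p < 1.0:
        raise ValueError("y_p must lie strictly between 0 and 1")
    return loc - scale * math.log(-math.log(y_p))


def return_level_aep(t_a: float, loc: float, scale: float) -> float:
    """Return level for a return period T_a > 1, using y_p = 1 - 1 / T_a."""
    if t_a <= 1.0:
        raise ValueError("T_a must be greater than 1 for the Gumbel quantile")
    y_p = 1.0 - 1.0 / t_a
    return gumbel_quantile(y_p, loc, scale)


loc, scale = 10.0, 2.0  # assumed Gumbel parameters (illustrative values)
y_100 = return_level_aep(100.0, loc, scale)
# Plugging the return level back into 1 - F recovers the AEP, 1 / T_a.
recovered_aep = 1.0 - gumbel_cdf(y_100, loc, scale)

print(y_100, recovered_aep)
```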


Average Recurrence Interval

The average recurrence interval (ARI) is the average time between events for a specified duration at a given location. This term is associated with partial duration series (PDS) or peak-over-thresholds (POTs). This is also known as the Mean Inter-Arrival Time or the Mean Recurrence Interval.

$$
R_p = R_p(T_p) = 1 - \exp\left(- \frac{1}{T_p}\right)
$$

where $T_p$ is the mean inter-arrival time measured in years (Wang & Holmes, 2020).

Figure 2: A figure showing the average recurrence interval [years] vs the probability of recurrence, $R_p$.


Derivation

This section is based on Wang & Holmes, 2020. We assume that we have a counting process, $N(A)$, which is a Poisson process with a rate of occurrence, $\lambda$. Then the probability that there is at least 1 event in the time interval, $A=(0,T]$, is given by the CDF of the exponential distribution, i.e., one minus its survival function:

$$
Pr[N(A) \geq 1] = 1 - Pr[N(A)=0] = 1 - \exp \left(-\lambda T\right)
$$

The mean inter-arrival time is given as

$$
\mathbb{E}[Y] = \frac{1}{\lambda} := \bar{T}, \hspace{10mm} \bar{T}\in[0,\infty)
$$

The probability of at least 1 event in the interval $(0,T]$ is given as

$$
Pr[N(A) \geq 1] = 1 - \exp \left(-T/ \bar{T}\right)
$$

and the probability that there is at least 1 event within 1 unit time interval is given as

$$
Pr[N(A) \geq 1] = 1 - \exp \left(-1/ \bar{T}\right)
$$
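The two probabilities above can be evaluated with a short helper. This is a sketch: the function name and the example mean inter-arrival time are assumptions, and the helper requires a strictly positive mean to avoid dividing by zero.

```python
import math


def prob_at_least_one(interval: float, mean_interarrival: float) -> float:
    """Pr[N(A) >= 1] = 1 - exp(-T / T_bar) for a Poisson process on (0, T]."""
    if interval < 0.0 or mean_interarrival <= 0.0:
        raise ValueError("need interval >= 0 and mean_interarrival > 0")
    # -expm1(-x) computes 1 - exp(-x) accurately even when x is small.
    return -math.expm1(-interval / mean_interarrival)


t_bar = 10.0  # assumed mean inter-arrival time in years (illustrative value)
p_one_year = prob_at_least_one(1.0, t_bar)
p_ten_years = prob_at_least_one(10.0, t_bar)

print(p_one_year, p_ten_years)
```

Note that the probability over one mean inter-arrival time is $1 - e^{-1}$, not 1, and the one-year probability is slightly below $1/\bar{T}$.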

Note: we can extend this to distributions where we have multiple criteria. For example, in a marked homogeneous Poisson process (HPP), we could have a 2D Poisson process over time and the mark, with intensity

$$
\lambda(t,y) = \lambda f(y;\theta)
$$

So essentially, we state that

$$
Pr[Y>y|Y> y_0]Pr[Y>y_0] = \lambda \left(1 - F(y;\boldsymbol{\theta})\right)
$$

So, the probability of no exceedances of $y$ over a 1-year period is given by the Poisson distribution

$$
F_a(y) = \exp\left[ -\lambda S(y)\right]
$$
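As a sketch of how this formula behaves, assume that the excesses over the threshold $y_0$ are exponentially distributed, so that $S(y)$ is the survival function of the excesses. The resulting annual CDF is then a Gumbel CDF with location $y_0 + \sigma\log\lambda$. The exponential model, all parameter values, and the function names are assumptions made for illustration.

```python
import math


def excess_survival(y: float, threshold: float, scale: float) -> float:
    """S(y) = Pr[Y > y | Y > threshold] for exponential excesses (assumed model)."""
    if y < threshold:
        return 1.0
    return math.exp(-(y - threshold) / scale)


def annual_max_cdf(y: float, rate: float, threshold: float, scale: float) -> float:
    """F_a(y) = exp(-rate * S(y)): probability of no exceedance of y in one year."""
    return math.exp(-rate * excess_survival(y, threshold, scale))


def gumbel_cdf(y: float, loc: float, scale: float) -> float:
    """Gumbel CDF, exp(-exp(-(y - loc) / scale))."""
    return math.exp(-math.exp(-(y - loc) / scale))


rate, threshold, scale = 3.0, 5.0, 1.5  # illustrative values
loc = threshold + scale * math.log(rate)  # implied Gumbel location
y = 9.0
from_poisson = annual_max_cdf(y, rate, threshold, scale)
from_gumbel = gumbel_cdf(y, loc, scale)

print(from_poisson, from_gumbel)
```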

Usage

In practice, we can use this to calculate the return level given any arbitrary CDF

$$
R_T = Pr[Y > y] = 1 - Pr[Y \leq y] = 1 - F(y;\boldsymbol{\theta})
$$

Once we solve this for the quantity $y$ in terms of $R_T$, we get the following relationship

$$
y_T = \boldsymbol{Q}(y_p;\boldsymbol{\theta})
$$

where $\boldsymbol{Q}$ is the quantile function, i.e., the inverse CDF, and $y_p = 1 - R_T = \exp\left(-1/T_p\right)$.
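A minimal sketch of the ARI-based return level, again assuming a Gumbel distribution for $F$ (an illustrative choice, as are the parameter values and function names). For this particular distribution, $y_p = \exp(-1/T_p)$ simplifies the return level to $\mu + \beta\log T_p$, which the snippet uses as a cross-check.

```python
import math


def gumbel_quantile(y_p: float, loc: float, scale: float) -> float:
    """Q(y_p; theta) = loc - scale * log(-log(y_p)) for 0 < y_p < 1."""
    if not 0.0 < y_p < 1.0:
        raise ValueError("y_p must lie strictly between 0 and 1")
    return loc - scale * math.log(-math.log(y_p))


def return_level_ari(t_p: float, loc: float, scale: float) -> float:
    """Return level for a mean inter-arrival time T_p > 0, using y_p = exp(-1 / T_p)."""
    if t_p <= 0.0:
        raise ValueError("T_p must be positive")
    y_p = math.exp(-1.0 / t_p)
    return gumbel_quantile(y_p, loc, scale)


loc, scale = 10.0, 2.0  # assumed Gumbel parameters (illustrative values)
y_50 = return_level_ari(50.0, loc, scale)
closed_form = loc + scale * math.log(50.0)  # Gumbel-specific simplification

print(y_50, closed_form)
```

Unlike the AEP-based version, this accepts sub-annual values such as $T_p = 0.5$.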


AEP vs ARI

These two quantities are related. Setting the two probabilities equal, we can write:

$$
\begin{aligned} R_p &= R_a \\ \frac{1}{T_a} &= 1 - \exp\left(- \frac{1}{T_p}\right) \end{aligned}
$$

Figure 3 showcases the AEP vs the probability of recurrence. We see that they are almost the same except near the upper tail. Figure 4 demonstrates the relationship better. We see that the ARI has the domain $T_p \in [0, \infty)$ whereas the return period (RP) has the domain $T_a \in [1, \infty)$. So, there is a relationship between the two quantities but they are not the same due to the differences in the domain.
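Solving the equality above for either period gives a direct conversion, sketched below. The function names are hypothetical.

```python
import math


def ari_to_return_period(t_p: float) -> float:
    """T_a = 1 / (1 - exp(-1 / T_p)) for T_p > 0."""
    if t_p <= 0.0:
        raise ValueError("T_p must be positive")
    return 1.0 / -math.expm1(-1.0 / t_p)


def return_period_to_ari(t_a: float) -> float:
    """T_p = -1 / log(1 - 1 / T_a) for T_a > 1."""
    if t_a <= 1.0:
        raise ValueError("T_a must be greater than 1")
    return -1.0 / math.log1p(-1.0 / t_a)


for t_a in (2.0, 10.0, 100.0):
    t_p = return_period_to_ari(t_a)
    print(f"T_a = {t_a:6.1f} years -> T_p = {t_p:8.3f} years")
```

For long periods the two differ by about half a year, since $T_a \approx T_p + 1/2$ when $T_p$ is large; the gap matters mostly for frequent events.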

Figure 3: A figure showing the probability of occurrence, $R_a$, vs the probability of exceedance, $R_p$ (tabs: Probabilities, Periods).


Hazard Function

The ratio of the probability density function to the survival function, $h(y) = f(y)/S(y)$, aka the conditional failure density function. Integrating it gives the cumulative hazard function

$$
H(y) = \int_{-\infty}^y h(\tau)d\tau = -\log\left(1-F(y)\right) = - \log S(y)
$$
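The identity between the integrated hazard and $-\log S(y)$ can be checked numerically. This sketch assumes a Weibull lifetime, whose support starts at 0, so the lower integration limit becomes 0; the distribution, parameter values, and function names are illustrative assumptions.

```python
import math


def weibull_hazard(t: float, shape: float, scale: float) -> float:
    """h(t) = f(t) / S(t) = (shape / scale) * (t / scale) ** (shape - 1)."""
    return (shape / scale) * (t / scale) ** (shape - 1.0)


def weibull_survival(t: float, shape: float, scale: float) -> float:
    """S(t) = exp(-(t / scale) ** shape)."""
    return math.exp(-((t / scale) ** shape))


def cumulative_hazard_numeric(t: float, shape: float, scale: float,
                              n_steps: int = 10_000) -> float:
    """Midpoint-rule approximation of the integral of h from 0 to t."""
    dt = t / n_steps
    total = sum(weibull_hazard((i + 0.5) * dt, shape, scale) for i in range(n_steps))
    return total * dt


shape, scale, t = 1.5, 2.0, 3.0  # illustrative values
numeric = cumulative_hazard_numeric(t, shape, scale)
exact = -math.log(weibull_survival(t, shape, scale))  # equals (t / scale) ** shape

print(numeric, exact)
```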

Counting Process

$$
\begin{aligned} N(A) &= \#\left\{n\in\mathbb{N}^+: T_n \in A \right\} \\ &= \sum_{n=1}^\infty \mathcal{1} (T_n \in A) \end{aligned}
$$
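For a finite list of observed event times, the counting process reduces to a sum of indicators. A minimal sketch, assuming half-open intervals $A=(a,b]$ and hypothetical arrival times:

```python
from typing import Sequence


def count_events(event_times: Sequence[float], start: float, end: float) -> int:
    """N(A) for A = (start, end]: the number of event times T_n that fall in A."""
    return sum(1 for t_n in event_times if start < t_n <= end)


event_times = [0.4, 1.1, 1.7, 2.5, 4.0]  # hypothetical arrival times
n_first = count_events(event_times, 0.0, 2.0)   # counts 0.4, 1.1, 1.7
n_second = count_events(event_times, 2.0, 4.0)  # counts 2.5, 4.0
n_total = count_events(event_times, 0.0, 4.0)

print(n_first, n_second, n_total)
```

Because the intervals are disjoint and half-open, the counts add up: `n_first + n_second == n_total`.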

Survival Function

This is the probability that the time of death is later than some specified time, $t$.

$$
S(t) = Pr[T>t] = \int_t^\infty f(\tau)d\tau = 1 - F(t)
$$

Event Density

This is the rate of death/failure events per unit time

$$
f(t) = F'(t) = \frac{d}{dt}F(t)
$$

Survival Event Density

$$
\begin{aligned} s(t) &= S'(t) = \frac{d}{dt}S(t) \\ &= \frac{d}{dt}\int_t^\infty f(\tau)d\tau \\ &= \frac{d}{dt}\left[1 - F(t) \right] \\ &= - f(t) \end{aligned}
$$
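The result $s(t) = -f(t)$ can be checked with a finite difference. This sketch assumes an exponential lifetime; the model, the rate, and the helper names are illustrative assumptions.

```python
import math
from typing import Callable


def density(t: float, rate: float) -> float:
    """f(t) = rate * exp(-rate * t)."""
    return rate * math.exp(-rate * t)


def survival(t: float, rate: float) -> float:
    """S(t) = exp(-rate * t)."""
    return math.exp(-rate * t)


def central_difference(func: Callable[[float], float], t: float,
                       step: float = 1e-5) -> float:
    """Second-order finite-difference approximation of func'(t)."""
    return (func(t + step) - func(t - step)) / (2.0 * step)


rate, t = 0.5, 2.0  # illustrative values
s_prime = central_difference(lambda u: survival(u, rate), t)

# The derivative of the survival function matches minus the density.
print(s_prime, -density(t, rate))
```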

Conditional Intensity Function

This is the instantaneous rate of arrival of new events at time, $t$, given a history of past events, $\mathcal{H}_t$. This is also known as the hazard function.

$$
\lambda^*(t) = \frac{f^*(t)}{1-F^*(t)}
$$

We can rewrite this using the relationship with the survival function, $S^*(t) = 1 - F^*(t)$

$$
\lambda^*(t) = \frac{f^*(t)}{S^*(t)}
$$

We can also rewrite this using the relationship between the survival function and the cumulative hazard function, $S^*(t) = \exp\left(-\Lambda^*(t)\right)$

$$
\lambda^*(t) = \frac{f^*(t)}{\exp\left( -\Lambda^*(t) \right)}
$$

Cumulative Hazard Function

In general, there are four properties it needs to satisfy

$$
\begin{aligned} \Lambda^*(t) &> 0 \\ \Lambda^*(t_n) &= 0 \\ \lim_{t\rightarrow \infty} \Lambda^*(t) &= \infty \\ \frac{d \Lambda^*(t)}{dt} &> 0 \end{aligned}
$$

This is achieved by parameterizing the hazard function so that its output is always positive.
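One possible way to get an always-positive hazard is to pass an unconstrained function through a softplus, sketched below. The softplus choice, the linear form, the parameter values, and the function names are all assumptions for illustration; the text does not prescribe a particular parameterization. Positivity gives the first, second, and fourth properties; the limit property additionally requires that the hazard does not decay too quickly, which holds here because the assumed weight is non-negative.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x)), which is strictly positive."""
    return math.log1p(math.exp(-abs(x))) + max(x, 0.0)


def hazard(t: float, weight: float, bias: float) -> float:
    """Hypothetical hazard: softplus of an unconstrained linear function of t."""
    return softplus(weight * t + bias)


def cumulative_hazard(t: float, t_last: float, weight: float, bias: float,
                      n_steps: int = 1_000) -> float:
    """Midpoint-rule integral of the hazard from the last event time t_last to t."""
    dt = (t - t_last) / n_steps
    total = sum(hazard(t_last + (i + 0.5) * dt, weight, bias) for i in range(n_steps))
    return total * dt


t_last, weight, bias = 1.0, 0.3, -0.2  # illustrative values
times = (1.0, 1.5, 2.0, 4.0)
values = [cumulative_hazard(t, t_last, weight, bias) for t in times]

# Zero at the last event time, then strictly increasing.
print(values)
```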


Probability Density Function

We can write the conditional probability density function in terms of the hazard and cumulative hazard function

$$
f^*(t) = \lambda^*(t) \exp\left( -\Lambda^*(t) \right)
$$

We can also write it using the hazard function and the survival function

$$
f^*(t) = \lambda^*(t)S^*(t)
$$

And lastly, we can write it in terms of the hazard function and the CDF.

$$
f^*(t) = \lambda^*(t)\left(1-F^*(t)\right)
$$
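As a worked check of the three forms above, consider the special case of a constant hazard, $\lambda^*(t) = \lambda$, with time measured from the last event. This special case is an illustrative assumption rather than part of the original derivation.

$$
\begin{aligned} \Lambda^*(t) &= \int_0^t \lambda \, d\tau = \lambda t \\ S^*(t) &= \exp\left(-\Lambda^*(t)\right) = \exp\left(-\lambda t\right) \\ F^*(t) &= 1 - S^*(t) = 1 - \exp\left(-\lambda t\right) \\ f^*(t) &= \lambda^*(t) S^*(t) = \lambda \exp\left(-\lambda t\right) \end{aligned}
$$

This is the exponential density, and dividing it by $1 - F^*(t)$ recovers the constant hazard, $\lambda$.
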
References
  1. Wang, C.-H., & Holmes, J. D. (2020). Exceedance rate, exceedance probability, and the duality of GEV and GPD for extreme hazard analysis. Natural Hazards, 102(3), 1305–1321. 10.1007/s11069-020-03968-z
  2. Davison, A. C., & Smith, R. L. (1990). Models for Exceedances Over High Thresholds. Journal of the Royal Statistical Society Series B: Statistical Methodology, 52(3), 393–425. 10.1111/j.2517-6161.1990.tb01796.x