In this section, we look at point processes (PPs). In particular, we use point processes to outline a framework for jointly modeling extreme event occurrences and magnitudes. We start with temporal point processes, which address extreme event occurrences. Then we move on to marked temporal point processes, which incorporate magnitudes. Finally, we add the spatial component.
Temporal Point Process¶
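These processes model sequences of random events in continuous time. Let’s say we have an ordered sequence of $N$ events at times $t_1 < t_2 < \dots < t_N$. We denote this as

$$\mathcal{H} = \{t_n\}_{n=1}^{N}, \qquad t_n \in \mathcal{T}. \tag{1}$$

Typically, $\mathcal{T} = [0, T]$, but it can be between any two arbitrary time endpoints, e.g., $\mathcal{T} = [T_0, T_1]$. We will also use the notation for the historical events predating our time of interest, $t$. We denote this as

$$\mathcal{H}_t = \{t_n \in \mathcal{H} : t_n < t\}.$$

Lastly, we define the conditional intensity function (aka the hazard function) as

$$\lambda^*(t) = \lambda(t \mid \mathcal{H}_t) = \lim_{dt \to 0} \frac{\Pr\left(\text{event in } [t, t + dt) \mid \mathcal{H}_t\right)}{dt},$$

where $[t, t + dt)$ is the infinitesimal time interval containing $t$.
We use the common shorthand, $\lambda^*(t)$, to denote the conditional dependence on the historical dataset, $\mathcal{H}_t$.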
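We can write the conditional density of the next event time, given all of the history, as

$$f^*(t) = f(t \mid \mathcal{H}_t).$$

This can be decomposed into the conditional intensity and the probability that no event occurs between the most recent event, $t_{n-1}$, and $t$:

$$f^*(t) = \lambda^*(t) \exp\left(-\int_{t_{n-1}}^{t} \lambda^*(\tau)\, d\tau\right).$$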
Learning¶
In general, we are interested in finding the best parameters of our model given access to potentially many sequences of events. So naturally, we can simply maximize the log-likelihood.
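We can write out the joint log-likelihood of observing $\mathcal{H}$ within a time interval $[0, T]$, which is given by

$$\log p(\mathcal{H}) = \sum_{n=1}^{N} \log \lambda^*(t_n) - \int_0^T \lambda^*(\tau)\, d\tau.$$

The first term is the log-likelihood of the events at the specific times, $t_n$, at which we observe them. The second term is the log-probability that we do not observe events anywhere else within the time interval of interest. We can also shorten the notation by introducing the cumulative hazard function as

$$\Lambda^*(T) = \int_0^T \lambda^*(\tau)\, d\tau. \tag{9}$$

This will leave us with

$$\log p(\mathcal{H}) = \sum_{n=1}^{N} \log \lambda^*(t_n) - \Lambda^*(T). \tag{10}$$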
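As a minimal sketch of the log-likelihood above (the sum of log-intensities at the event times minus the cumulative hazard), the snippet below evaluates it for an intensity that does not depend on the history. The function name `tpp_log_likelihood`, the trapezoid rule, and the grid size are illustrative assumptions, not part of the original text.

```python
import math
from typing import Callable, Sequence


def tpp_log_likelihood(
    event_times: Sequence[float],
    intensity: Callable[[float], float],
    t_end: float,
    n_grid: int = 10_000,
) -> float:
    """Log-likelihood: sum_n log lambda(t_n) - integral_0^T lambda(s) ds."""
    # First term: log-intensity evaluated at the observed event times.
    log_term = sum(math.log(intensity(t)) for t in event_times)
    # Second term: cumulative hazard, approximated with the trapezoid rule.
    dt = t_end / n_grid
    values = [intensity(i * dt) for i in range(n_grid + 1)]
    cumulative_hazard = dt * (sum(values) - 0.5 * (values[0] + values[-1]))
    return log_term - cumulative_hazard


# Usage: a constant intensity of 2 events per unit time on [0, 3].
ll = tpp_log_likelihood([0.5, 1.5, 2.5], lambda t: 2.0, t_end=3.0)
```

For a history-dependent intensity, the callable would also need access to the past events; that case is left out here to keep the sketch short.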
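There are alternatives to maximizing the log-likelihood. In general, the loss function can look like

$$\mathcal{L}(\boldsymbol{\theta}) = \mathbb{E}_{\mathcal{H} \sim p_{\boldsymbol{\theta}}}\left[ \ell(\mathcal{H}) \right],$$

where $\mathcal{H}$ is described by some parametric distribution, $p_{\boldsymbol{\theta}}$, and $\ell$ is some criterion. In the above example, our criterion is simply the log-likelihood function. We can also use other generative modeling methods, such as: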
- GANs use a parametric model and the loss is some sample quality metric
- reinforcement learning uses some policy and the loss is some reward function
- variational inference uses some approximate posterior and the loss is the evidence lower bound.
Usages¶
Prediction. The first obvious use case is prediction. In this case, we have some observed data over a period, $[0, T]$, and we would like to know what will happen in a forecast period, $(T, T + \Delta T]$.
- How much time, $\tau$, until the next event?
- What type of mark, $y$, will the next event have?
- How many events of the type, $y$, will happen?
We can even sample some potential trajectories of events of the future which helps to answer how many events could possibly happen.
100-Year Events. If the marks distribution is parametric, we can do some post-analysis about the occurrence of events. This can be done through the use of return periods (RPs) or average recurrence intervals (ARI). See sections Annual Exceedence Probability and Average Recurrence Interval for more details.
Example I: Homogeneous Poisson Process (HPP)¶
In this case, we have a dataset of the number of exceedances along a timeline, as seen in equation (1). Essentially, we have a vector of counts per unit time. According to the traditional PP, we can define our assumptions as:
- The numbers of events in any two disjoint intervals are independent.
- The number of events in any interval $[a, b]$ for $a < b$ follows a Poisson distribution with mean $\lambda (b - a)$.
- The inter-event times are iid random variables that follow the exponential distribution with rate parameter $\lambda$.
Following these assumptions, we let our intensity function be a constant parameter with no dependence on time.

$$\lambda^*(t) = \lambda$$

This means that our cumulative hazard function, $\Lambda^*(T)$, does not depend on any of the historical events and grows linearly with the length of the interval. Plugging our terms into the cumulative hazard function in equation (9) results in

$$\Lambda^*(T) = \int_0^T \lambda\, d\tau = \lambda T,$$

where $T$ is the length of the interval of interest, e.g., the number of years. So, we can plug these two quantities into our log-likelihood function in equation (10) as

$$\log p(\mathcal{H}) = N \log \lambda - \lambda T. \tag{14}$$
As mentioned above, the inter-arrival time is an exponential distribution. Please see section ... for more details.
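To make the HPP concrete, here is a small sketch that simulates event times from exponential inter-arrival times and evaluates the log-likelihood $N \log \lambda - \lambda T$. Setting its derivative to zero gives the maximum likelihood estimate $\hat{\lambda} = N / T$. The function names and the chosen rate are hypothetical, used only for illustration.

```python
import math
import random
from typing import List


def simulate_hpp(rate: float, t_end: float, rng: random.Random) -> List[float]:
    """Simulate event times on [0, t_end] using exponential inter-arrival times."""
    times, t = [], 0.0
    while True:
        t += rng.expovariate(rate)  # inter-event time ~ Exponential(rate)
        if t > t_end:
            return times
        times.append(t)


def hpp_log_likelihood(n_events: int, rate: float, t_end: float) -> float:
    """HPP log-likelihood: N log(lambda) - lambda T."""
    return n_events * math.log(rate) - rate * t_end


def hpp_mle(n_events: int, t_end: float) -> float:
    """Maximizer of the HPP log-likelihood: lambda_hat = N / T."""
    return n_events / t_end


# Usage: simulate with a true rate of 2 events per unit time, then re-estimate it.
rng = random.Random(0)
times = simulate_hpp(rate=2.0, t_end=1000.0, rng=rng)
rate_hat = hpp_mle(len(times), t_end=1000.0)
```

With a long observation window, `rate_hat` should land close to the rate used in the simulation.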
Example II: Inhomogeneous Poisson Process¶
In this case, we again have a dataset of the number of exceedances along a timeline, as seen in equation (1). However, we let our intensity function be a function of time with no dependence on any historical events.

$$\lambda^*(t) = \lambda(t)$$

This means that our cumulative hazard function, $\Lambda(T)$, will also not depend on any of the historical events, but it will depend on time.

$$\Lambda(T) = \int_0^T \lambda(\tau)\, d\tau$$

So, we can plug these two quantities into the log-likelihood function in equation (10):

$$\log p(\mathcal{H}) = \sum_{n=1}^{N} \log \lambda(t_n) - \int_0^T \lambda(\tau)\, d\tau.$$

The difficult part of this equation is the second term, which is an integral; however, there are many ways to deal with this. For example, we can use a parametric form for the intensity,

$$\lambda(t) = \lambda(t; \boldsymbol{\theta}),$$

chosen so that the integral has a closed form. For example, we could use a log-linear model, a Cox process, or a Hawkes process, to name a few. The game is to:
- use a simple parametric function that has a closed-form integral, or
- use a more complex parametric function and approximate the integral using quadrature or discretization strategies.
See the section Temporal Dependencies for more ideas of temporal parameterizations.
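The two strategies for the integral term can be compared on a log-linear intensity, $\lambda(t) = \exp(a + b t)$, whose cumulative hazard is $e^{a}(e^{bT} - 1)/b$. The sketch below is only an illustration under that assumed form; the parameter names `a` and `b` and the midpoint rule are choices made here, not taken from the text.

```python
import math
from typing import Sequence


def cumulative_hazard_closed_form(t_end: float, a: float, b: float) -> float:
    """Closed-form integral of exp(a + b t) over [0, t_end]."""
    if abs(b) < 1e-12:  # reduces to the homogeneous case
        return math.exp(a) * t_end
    return math.exp(a) * math.expm1(b * t_end) / b


def cumulative_hazard_midpoint(t_end: float, a: float, b: float, n_grid: int = 2000) -> float:
    """Midpoint-rule approximation of the same integral."""
    dt = t_end / n_grid
    return dt * sum(math.exp(a + b * (i + 0.5) * dt) for i in range(n_grid))


def ipp_log_likelihood(event_times: Sequence[float], t_end: float, a: float, b: float) -> float:
    """IPP log-likelihood: sum_n log lambda(t_n) - integral_0^T lambda(s) ds."""
    log_term = sum(a + b * t for t in event_times)  # log lambda(t) = a + b t
    return log_term - cumulative_hazard_closed_form(t_end, a, b)


# Usage: both integral strategies should agree closely.
exact = cumulative_hazard_closed_form(5.0, a=0.1, b=0.3)
approx = cumulative_hazard_midpoint(5.0, a=0.1, b=0.3)
```

When no closed form exists, only the quadrature route remains, and the grid size becomes a tuning choice.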
Marked Temporal Point Process¶
These processes model sequences of random events in continuous time along with some additional metadata, i.e., marks. Marks can be whatever type of metadata we have available. For example, we could have some magnitude, e.g., temperature or earthquake magnitude. We could also have some spatial information, e.g., latitude, longitude, and/or altitude.
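First, we have some underlying process, $y(t)$, which depends on time.
Let’s say we have a sequence of time stamps, $\{t_n\}_{n=1}^{N}$, and their associated marks, $\{y_n\}_{n=1}^{N}$, with $y_n \in \mathcal{Y}$.
This is given as a sequence of events

$$\mathcal{H} = \{(t_n, y_n)\}_{n=1}^{N}.$$

We will also use the notation for the historical events predating time $t$.

$$\mathcal{H}_t = \{(t_n, y_n) \in \mathcal{H} : t_n < t\}$$

From a PP perspective, we can model this as a 2D PP over the time and mark domains, $\mathcal{T} \times \mathcal{Y}$, which results in a joint intensity function, $\lambda(t, y)$.
Lastly, we define the conditional intensity function

$$\lambda^*(t, y) = \lambda(t, y \mid \mathcal{H}_t).$$

We use the common shorthand, $\lambda^*(t, y)$, to denote the conditional dependence on the historical dataset, $\mathcal{H}_t$.
We can write out the joint density autoregressively, where each arrival time, $t_n$, and mark, $y_n$, are conditioned upon the history.

$$p(\mathcal{H}) = \prod_{n=1}^{N} p(t_n, y_n \mid \mathcal{H}_{t_n})$$

We can write out the joint log-likelihood of observing $\mathcal{H}$ within a time interval $[0, T]$, which is given by

$$\log p(\mathcal{H}) = \sum_{n=1}^{N} \log \lambda^*(t_n, y_n) - \int_0^T \int_{\mathcal{Y}} \lambda^*(\tau, y)\, dy\, d\tau.$$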
Marks Parameterizations¶
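Now, let’s dive a bit into the marks distribution. In general, we can model the marks in three ways: 1) conditionally independent, 2) conditioned on time, and 3) time conditioned on the marks. These can be seen in these equations

$$\lambda^*(t, y) = \lambda^*(t)\, f^*(y), \qquad \lambda^*(t, y) = \lambda^*(t)\, f^*(y \mid t), \qquad \lambda^*(t, y) = \lambda^*(t \mid y)\, f^*(y).$$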
In the first case, we say there is no dependence between the marks and time. The second case gives us a more flexible parameterization of the marks, which governs how the marks behave with respect to time. The third case is the most flexible parameterization, and arguably the most correct, because we state that the occurrence of events is also conditioned on the marks.
There are some known special cases of these marks. These include:
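- A compound Poisson process if $\lambda^*(t) = \lambda(t)$ and $f^*(y \mid t) = f(y \mid t)$ for deterministic functions $\lambda(t)$ and $f(y \mid t)$.
- A process with independent marks if $\lambda^*(t)$ depends only on the history of the ground process (i.e., it equals the $\mathcal{H}^g$-intensity) and $f^*(y \mid t) = f(y \mid t)$.
- A process with unpredictable marks if $f^*(y \mid t) = f(y \mid t)$.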
Example III: MPP 4 Extremes¶
We have a marked point process which we represent as a 2D point process, as shown in equation (22), where we write the joint intensity function over the temporal plane and the mark plane. However, we simplify it to a parametric form,

$$\lambda^*(t, y) = f(y; \boldsymbol{\theta}),$$

where $f$ is some parametric function in terms of the marks. Now, it’s easier to reason about the cumulative hazard function because it’s an integral of some parametric PDF, which has a closed-form double integral.

$$\Lambda(T) = \int_0^T \int_{y_0}^{\infty} f(y; \boldsymbol{\theta})\, dy\, d\tau$$

We recognize that the inner integral over the mark domain is simply the survival function, $S$, of the parametric PDF, $f$.

$$S(y_0; \boldsymbol{\theta}) = \int_{y_0}^{\infty} f(y; \boldsymbol{\theta})\, dy$$

We take the threshold of interest, $y_0$, to be the lower bound of the mark space. So, the remaining outer integral over the temporal domain is that of a simple homogeneous Poisson process, as was done previously in equation (9). Plugging our expression into this equation leaves us with

$$\Lambda(T) = T\, S(y_0; \boldsymbol{\theta}).$$

Our final log-likelihood expression will be

$$\log p(\mathcal{H}) = \sum_{n=1}^{N} \log f(y_n; \boldsymbol{\theta}) - T\, S(y_0; \boldsymbol{\theta}).$$
We can use whatever PDF we want for the marks, e.g., a Normal, LogNormal, or Student's t distribution. However, in the extreme value literature, we typically use the generalized Pareto distribution (GPD) or, in some cases, the generalized extreme value distribution (GEVD).
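As a rough sketch of this example, the log-likelihood is the sum of the log mark densities at the observed marks minus the interval length times the survival function at the threshold. The function below takes the log-density and survival function as arguments so that any mark distribution can be plugged in. The exponential density used in the usage lines is a hypothetical stand-in chosen for brevity, not a recommendation from the text.

```python
import math
from typing import Callable, Sequence


def mpp_log_likelihood(
    marks: Sequence[float],
    t_end: float,
    threshold: float,
    log_pdf: Callable[[float], float],
    survival: Callable[[float], float],
) -> float:
    """Sum of log f(y_n) minus T * S(y_0), with y_0 the threshold."""
    mark_term = sum(log_pdf(y) for y in marks)
    # Cumulative hazard: time interval length times the mark survival function.
    cumulative_hazard = t_end * survival(threshold)
    return mark_term - cumulative_hazard


# Usage with a hypothetical exponential mark density f(y) = beta * exp(-beta * y).
beta = 1.0
ll = mpp_log_likelihood(
    marks=[1.0, 2.0],
    t_end=10.0,
    threshold=0.5,
    log_pdf=lambda y: math.log(beta) - beta * y,
    survival=lambda y: math.exp(-beta * y),
)
```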
Annual Exceedence Probabilities. We can also compute return periods, where we try to find the annual exceedence probability or the average recurrence interval. See sections Annual Exceedence Probability and Average Recurrence Interval for more details. For our case, as shown in equations (8) and (14), we equate these to our cumulative distribution function. After solving for the respective and , we arrive at
where is the quantile function for the PDF/CDF and the probability in the domain, .
Decoupled Marked Temporal Point Process¶
We can decompose this joint intensity measure into its conditional dependencies, i.e., the mark depends on the time.

$$\lambda(t, y) = \lambda(t)\, f(y \mid t)$$

The term, $f(y \mid t)$, is either a probability density function or a probability mass function depending upon whether the marks are continuous or discrete. Now, we can write the conditional intensity for the marked TPP as

$$\lambda^*(t, y) = \lambda^*(t)\, f^*(y \mid t),$$

where $\lambda^*(t)$ is the ground intensity and $f^*(y \mid t)$ is the conditional mark density function. Notice how the arrival times are treated similarly to the unmarked case, except that now this intensity measure may depend on past marks.
Proof: Joint Intensity Function
Notice that this decomposition is very similar to the joint distribution decomposition. Let’s say we have $t$ and $y$ composed as a joint distribution, which we factorize as follows.

$$p(t, y) = p(t)\, p(y \mid t)$$

As shown above, we can decompose the joint intensity function into its conditional parts

$$\lambda^*(t, y) = \lambda^*(t)\, f^*(y \mid t).$$

Using some rules from survival analysis, we can rewrite this using only PDFs and CDFs,

$$\lambda^*(t, y) = \frac{p^*(t, y)}{1 - F^*(t)},$$

where $p^*(t, y)$ is the joint density (in a broad sense) of the time, $t$, and mark, $y$, conditioned on the past times and marks. The term $F^*(t)$ is the conditional CDF of $t$, also conditioned on the past times and marks.
We can simplify this even more by considering the survival function, $S^*(t) = 1 - F^*(t)$:

$$\lambda^*(t, y) = \frac{p^*(t, y)}{S^*(t)}.$$
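Finally, we can write out the joint log-likelihood of observing $\mathcal{H}$ within a time interval $[0, T]$, which is given by

$$\log p(\mathcal{H}) = \sum_{n=1}^{N} \log \lambda^*(t_n) - \int_0^T \lambda^*(\tau)\, d\tau + \sum_{n=1}^{N} \log f^*(y_n \mid t_n).$$

The first two terms are the ground intensity likelihood for the temporal rate, and the third term is the marks likelihood.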
Example IV: MHPP 4 Extremes¶
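In this example, we have a decoupled marked temporal point process (DMTPP) for extremes. We have a marked point process which we represent as a 2D point process, as shown in equation (22). In this case, we decouple the intensity function as shown in the section above. For the ground intensity, we have an HPP. For the marks, we have an iid parametric distribution.

$$\lambda^*(t) = \lambda, \qquad f^*(y \mid t) = f(y; \boldsymbol{\theta})$$

We can plug these terms into equation (41) to obtain:

$$\log p(\mathcal{H}) = \sum_{n=1}^{N} \log \lambda - \int_0^T \lambda\, d\tau + \sum_{n=1}^{N} \log f(y_n; \boldsymbol{\theta}).$$

Because we have a homogeneous rate parameter, we get the Marked Homogeneous Poisson Process (MHPP). Using the HPP portion, we can plug the terms found in equation (14) into our equation above. This gives us

$$\log p(\mathcal{H}) = N \log \lambda - \lambda T + \sum_{n=1}^{N} \log f(y_n; \boldsymbol{\theta}).$$

We see that the likelihood terms are decoupled: there are no shared parameters between the temporal likelihood (the first two terms) and the marks likelihood (the third term), so they can be maximized independently. In the case of extremes, one option is to use the GPD as the mark distribution. We can write out the new log-likelihood as

$$\log p(\mathcal{H}) = N \log \lambda - \lambda T - N \log \sigma - \left(1 + \frac{1}{\xi}\right) \sum_{n=1}^{N} \log\left[1 + \xi\, \frac{y_n - y_0}{\sigma}\right],$$

where $y_0$ is the threshold, $\sigma > 0$ is the scale, and $\xi \neq 0$ is the shape of the GPD.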
This is known as the Poisson-GPD algorithm within the EVT literature (Chavez-Demoulin et al., 2005; Nemukula & Sigauke, 2020). We can also parameterize the intensity with the parameterization in equation (65), where we have some new free parameters.
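A minimal sketch of the Poisson-GPD log-likelihood follows, assuming a constant rate for the exceedance times and a GPD with scale and shape parameters for the excesses over the threshold. The function names are hypothetical, and the density is written out by hand so the block has no dependencies. Because the temporal and mark terms share no parameters, each could be maximized on its own.

```python
import math
from typing import Sequence


def gpd_log_pdf(excess: float, scale: float, shape: float) -> float:
    """Log-density of the GPD evaluated at an excess over the threshold."""
    if scale <= 0.0 or excess < 0.0:
        return -math.inf
    if abs(shape) < 1e-12:  # exponential limit as the shape goes to zero
        return -math.log(scale) - excess / scale
    z = 1.0 + shape * excess / scale
    if z <= 0.0:  # outside the support of the distribution
        return -math.inf
    return -math.log(scale) - (1.0 / shape + 1.0) * math.log(z)


def poisson_gpd_log_likelihood(
    marks: Sequence[float],
    t_end: float,
    threshold: float,
    rate: float,
    scale: float,
    shape: float,
) -> float:
    """Temporal HPP term plus GPD mark term."""
    n_events = len(marks)
    temporal_term = n_events * math.log(rate) - rate * t_end
    mark_term = sum(gpd_log_pdf(y - threshold, scale, shape) for y in marks)
    return temporal_term + mark_term


# Usage: two exceedances of the threshold 1.0 observed over 4 time units.
ll = poisson_gpd_log_likelihood(
    marks=[1.5, 2.0], t_end=4.0, threshold=1.0, rate=0.5, scale=1.0, shape=0.0
)
```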
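Alternatively, we can use the GEVD as the mark distribution.

$$\log p(\mathcal{H}) = N \log \lambda - \lambda T - N \log \sigma - \sum_{n=1}^{N}\left[\left(1 + \frac{1}{\xi}\right) \log z_n + z_n^{-1/\xi}\right], \qquad z_n = 1 + \xi\, \frac{y_n - \mu}{\sigma}$$

This has no specific name in the EVT literature; however, there are a few papers which use this distribution to motivate the GPD via the PP. In this work, we name this the Poisson-GEVD. We can parameterize the GEVD parameters in terms of the GPD parameters, which are needed for the intensity parameter given in equations (53), where we introduce new free parameters.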
Annual Exceedence Probability. Lastly, something of great interest within the EVT community is to characterize the return periods. Recall the section Average Recurrence Interval, where we showed that the ARI can be related to the conditional CDF. Recall that this is given as

$$T_R = \frac{1}{\lambda\, S(y)},$$

where $S$ is the survival function of the marks distribution and $\lambda$ is the average rate of occurrences over the threshold, $y_0$. After rearranging this equation and simplifying, we can recognize that the return level is simply the quantile function of the marks distribution,

$$y = Q\left(1 - \frac{1}{\lambda T_R}\right),$$

where $Q$ is the quantile function for the PDF/CDF and the probability lies in the domain $[0, 1]$.
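The return-level calculation can be sketched for GPD marks as follows. It assumes that exceedances of a level $y$ arrive at rate $\lambda S(y)$, so the level with average recurrence interval $T_R$ solves $\lambda S(y) = 1 / T_R$. The function names and the parameter values in the usage lines are hypothetical.

```python
import math


def gpd_quantile(p: float, threshold: float, scale: float, shape: float) -> float:
    """Quantile function of the GPD for a probability p in [0, 1)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must lie in [0, 1)")
    if abs(shape) < 1e-12:  # exponential limit
        return threshold - scale * math.log1p(-p)
    return threshold + scale / shape * ((1.0 - p) ** (-shape) - 1.0)


def gpd_survival(y: float, threshold: float, scale: float, shape: float) -> float:
    """Survival function of the GPD at a level y at or above the threshold."""
    if abs(shape) < 1e-12:
        return math.exp(-(y - threshold) / scale)
    return (1.0 + shape * (y - threshold) / scale) ** (-1.0 / shape)


def return_level(return_period: float, rate: float, threshold: float,
                 scale: float, shape: float) -> float:
    """Level exceeded on average once every `return_period` time units."""
    if rate * return_period <= 1.0:
        raise ValueError("rate * return_period must exceed 1")
    p = 1.0 - 1.0 / (rate * return_period)
    return gpd_quantile(p, threshold, scale, shape)


# Usage: 100-year level with 3 exceedances per year on average (made-up values).
level = return_level(100.0, rate=3.0, threshold=10.0, scale=1.5, shape=0.2)
```

Plugging `level` back into the survival function and multiplying by the rate and the return period should give approximately one.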
Example VII: MIPP 4 Extremes¶
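In this example, we have a DMTPP for extremes. We have a marked point process which we represent as a 2D point process, as shown in equation (22). In this case, we decouple the intensity function as shown in the section above. For the ground intensity, we have an HPP or IPP. For the marks, we have an independent parametric distribution

$$\lambda^*(t) = \lambda(t), \qquad f^*(y \mid t) = f(y; \boldsymbol{\theta}(t)),$$

where the parameters of the marks distribution, $\boldsymbol{\theta}(t)$, are time dependent.
We can plug these terms into equation (41) to obtain:

$$\log p(\mathcal{H}) = \sum_{n=1}^{N} \log \lambda(t_n) - \int_0^T \lambda(\tau)\, d\tau + \sum_{n=1}^{N} \log f(y_n; \boldsymbol{\theta}(t_n)).$$

Here, the free parameters of our model are those of the ground intensity, $\lambda(t)$, and of the time-dependent mark parameters, $\boldsymbol{\theta}(t)$. Similar to the HPP case, we can use the GEVD or the GPD.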
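One possible instance of this model is sketched below, under assumptions made here rather than in the text: a log-linear ground intensity, $\lambda(t) = \exp(a + b t)$, and GPD marks whose scale varies in time as $\sigma(t) = \exp(s_0 + s_1 t)$ with a constant shape. All parameter names are hypothetical.

```python
import math
from typing import Sequence


def mipp_gpd_log_likelihood(
    times: Sequence[float],
    marks: Sequence[float],
    t_end: float,
    threshold: float,
    a: float,
    b: float,
    s0: float,
    s1: float,
    shape: float,
) -> float:
    """IPP ground term plus GPD mark term with a time-dependent scale."""
    # Cumulative hazard of exp(a + b t) over [0, t_end] in closed form.
    if abs(b) < 1e-12:
        hazard = math.exp(a) * t_end
    else:
        hazard = math.exp(a) * math.expm1(b * t_end) / b
    total = -hazard
    for t, y in zip(times, marks):
        scale = math.exp(s0 + s1 * t)  # assumed time-dependent GPD scale
        excess = y - threshold
        z = 1.0 + shape * excess / scale
        if excess < 0.0 or z <= 0.0:  # outside the GPD support
            return -math.inf
        if abs(shape) < 1e-12:
            log_f = -math.log(scale) - excess / scale
        else:
            log_f = -math.log(scale) - (1.0 / shape + 1.0) * math.log(z)
        total += (a + b * t) + log_f  # log lambda(t_n) + log f(y_n; theta(t_n))
    return total


# Usage: with b = 0 and s1 = 0 this reduces to the homogeneous Poisson-GPD case.
ll = mipp_gpd_log_likelihood(
    times=[1.0, 2.0], marks=[1.5, 2.0], t_end=4.0, threshold=1.0,
    a=math.log(0.5), b=0.0, s0=0.0, s1=0.0, shape=0.0,
)
```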
Return Periods. Lastly, something of great interest within the EVT community is to characterize the return periods. Recall the section Average Recurrence Interval, where we showed that the ARI can be related to the conditional CDF. Recall that this is given as

$$T_R = \frac{1}{\lambda\, S(y)},$$

where $S$ is the survival function of the marks distribution and $\lambda$ is the average rate of occurrences over the threshold, $y_0$. After rearranging this equation and simplifying, we can recognize that the return level is simply the quantile function of the marks distribution,

$$y = Q\left(1 - \frac{1}{\lambda T_R}\right),$$

where $Q$ is the quantile function for the PDF/CDF and the probability lies in the domain $[0, 1]$.
Spatial Point Process¶
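These processes model sequences of random events in continuous space and time. Let’s say we have a sequence of event times, $t_n$, and spatial locations, $\mathbf{s}_n \in \mathcal{S}$:

$$\mathcal{H} = \{(t_n, \mathbf{s}_n)\}_{n=1}^{N}.$$

We will also use the notation for the historical events predating time $t$.

$$\mathcal{H}_t = \{(t_n, \mathbf{s}_n) \in \mathcal{H} : t_n < t\}$$

Lastly, we define the conditional intensity function

$$\lambda^*(t, \mathbf{s}) = \lambda(t, \mathbf{s} \mid \mathcal{H}_t).$$

We use the common shorthand, $\lambda^*(t, \mathbf{s})$, to denote the conditional dependence on the historical dataset, $\mathcal{H}_t$.
Finally, we can write out the joint log-likelihood of observing $\mathcal{H}$ within a time interval $[0, T]$ and spatial domain $\mathcal{S}$, which is given by

$$\log p(\mathcal{H}) = \sum_{n=1}^{N} \log \lambda^*(t_n, \mathbf{s}_n) - \int_0^T \int_{\mathcal{S}} \lambda^*(\tau, \mathbf{s})\, d\mathbf{s}\, d\tau.$$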
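As an illustrative sketch, the spatiotemporal log-likelihood (the sum of log-intensities at the observed events minus the integral of the intensity over time and space) can be approximated on a rectangular spatial domain with a midpoint rule. The 2D rectangle, the grid size, and the function name are assumptions made here, and the intensity is taken to be independent of the history.

```python
import math
from itertools import product
from typing import Callable, Sequence, Tuple


def stpp_log_likelihood(
    events: Sequence[Tuple[float, float, float]],
    intensity: Callable[[float, float, float], float],
    t_end: float,
    x_range: Tuple[float, float],
    y_range: Tuple[float, float],
    n_grid: int = 20,
) -> float:
    """Sum of log-intensities at events (t, x, y) minus the space-time integral."""
    log_term = sum(math.log(intensity(t, x, y)) for t, x, y in events)
    (x0, x1), (y0, y1) = x_range, y_range
    dt = t_end / n_grid
    dx = (x1 - x0) / n_grid
    dy = (y1 - y0) / n_grid
    # Midpoint rule over the time x space box.
    integral = 0.0
    for i, j, k in product(range(n_grid), repeat=3):
        integral += intensity((i + 0.5) * dt, x0 + (j + 0.5) * dx, y0 + (k + 0.5) * dy)
    integral *= dt * dx * dy
    return log_term - integral


# Usage: constant intensity of 3 events per unit time and unit area.
events = [(0.5, 0.2, 1.0), (1.5, 0.8, 0.4)]
ll = stpp_log_likelihood(events, lambda t, x, y: 3.0, 2.0, (0.0, 1.0), (0.0, 2.0))
```

The cost of the grid grows cubically with `n_grid`, which is why closed-form or Monte Carlo integration is often preferred for larger domains.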
Marked Spatiotemporal Point Process¶
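Marked spatiotemporal point processes model sequences of random events in continuous space and time which come with some underlying function for the marks. First, we have some underlying process, $y(t, \mathbf{s})$, which depends on time and space.
Now, let’s say we have a sequence

$$\mathcal{H} = \{(t_n, \mathbf{s}_n, y_n)\}_{n=1}^{N}.$$

We will also use the notation for the historical events predating time $t$.

$$\mathcal{H}_t = \{(t_n, \mathbf{s}_n, y_n) \in \mathcal{H} : t_n < t\}$$

Lastly, we define the conditional intensity function

$$\lambda^*(t, \mathbf{s}, y) = \lambda(t, \mathbf{s}, y \mid \mathcal{H}_t).$$

We use the common shorthand, $\lambda^*(t, \mathbf{s}, y)$, to denote the conditional dependence on the historical dataset, $\mathcal{H}_t$.
Finally, we can write out the joint log-likelihood of observing $\mathcal{H}$ within a time interval $[0, T]$ and spatial domain $\mathcal{S}$, which is given by

$$\log p(\mathcal{H}) = \sum_{n=1}^{N} \log \lambda^*(t_n, \mathbf{s}_n, y_n) - \int_0^T \int_{\mathcal{S}} \int_{\mathcal{Y}} \lambda^*(\tau, \mathbf{s}, y)\, dy\, d\mathbf{s}\, d\tau.$$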
Literature Review¶
Applications. Mannshardt-Shamseldin et al., 2010 investigate the differences in precipitation extremes during the 21st century as a trend stemming from global warming. Cooley & Sain, 2010 compare the extreme precipitation simulated by a regional climate model over its spatial domain, applying a Bayesian hierarchical model to an MHPP.
- Chavez-Demoulin, V., Davison, A. C., & McNeil, A. J. (2005). Estimating value-at-risk: a point process approach. Quantitative Finance, 5(2), 227–234. 10.1080/14697680500039613
- Nemukula, M. M., & Sigauke, C. (2020). A Point Process Characterisation of Extreme Temperatures: an Application to South African Data. Environmental Modeling & Assessment, 26(2), 163–177. 10.1007/s10666-020-09718-6
- Mannshardt-Shamseldin, E. C., Smith, R. L., Sain, S. R., Mearns, L. O., & Cooley, D. (2010). Downscaling extremes: A comparison of extreme value distributions in point-source and gridded precipitation data. The Annals of Applied Statistics, 4(1). 10.1214/09-aoas287
- Cooley, D., & Sain, S. R. (2010). Spatial Hierarchical Modeling of Precipitation Extremes From a Regional Climate Model. Journal of Agricultural, Biological, and Environmental Statistics, 15(3), 381–402. 10.1007/s13253-010-0023-9