2.1 - Extreme Value Theory
Three Interpretations
Extreme value theory has three complementary interpretations. In a nutshell, each one corresponds to a way of selecting extreme values from the data and then defining a likelihood function.
- Max Values -> GEVD (generalized extreme value distribution)
- Threshold + Max Values -> GPD (generalized Pareto distribution)
- Threshold + Max Values + Counts + Summary Statistic -> PP (point process)
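For reference, the distribution functions behind the first two interpretations are usually written as below. The notation is an assumption made here, not taken from these notes: location $\mu$, scale $\sigma > 0$ (or $\tilde{\sigma} > 0$ for the GPD), shape $\xi$, and threshold $u$.

$$
G(z) = \exp\left\{ -\left[ 1 + \xi \left( \frac{z - \mu}{\sigma} \right) \right]^{-1/\xi} \right\}, \qquad 1 + \xi \, \frac{z - \mu}{\sigma} > 0
$$

$$
H(y) = 1 - \left( 1 + \frac{\xi \, y}{\tilde{\sigma}} \right)^{-1/\xi}, \qquad y = x - u > 0, \quad 1 + \frac{\xi \, y}{\tilde{\sigma}} > 0
$$

Here $G$ is the GEVD for block maxima $z$ and $H$ is the GPD for threshold excesses $y$. In both cases the $\xi \to 0$ limit gives the Gumbel and exponential forms, respectively.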
Problems
- Spatiotemporal Dependencies - representation
- Measurements - very few observations; events are rare or extremely rare, and complex
- Modeling - difficult with few measurements; even with simulations, models are complex and computationally heavy, and they lose interpretability
- Experiment - what is the counterfactual?
- Causality - event attribution and direction
Calculating Extremes
Block Maxima
We define a spatiotemporal block and take the maximum value within each block.
Algorithm
- Define spatiotemporal block
- Select maximum/minimum values
Clon_coords: Array[""] = ...
Clat_coords: Array[""] = ...
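The two steps above can be sketched with NumPy as follows. This is a minimal illustration, assuming blocks are defined along the time axis only and that the data has shape (time, space); the function name `block_maxima` and the synthetic data are hypothetical, not part of these notes.

```python
import numpy as np


def block_maxima(y: np.ndarray, block_size: int) -> np.ndarray:
    """Maximum of each non-overlapping block along the first (time) axis."""
    n_blocks = y.shape[0] // block_size
    # Drop the incomplete trailing block, if any.
    trimmed = y[: n_blocks * block_size]
    # Reshape to (n_blocks, block_size, ...) and reduce over the block axis.
    blocks = trimmed.reshape(n_blocks, block_size, *y.shape[1:])
    return blocks.max(axis=1)


rng = np.random.default_rng(0)
y = rng.normal(size=(365 * 3, 4))  # 3 "years" of daily values at 4 locations
y_max = block_maxima(y, block_size=365)
print(y_max.shape)
```

Spatial blocks could be handled the same way by reshaping and reducing over the spatial axes; replacing `max` with `min` gives block minima.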
Peak-Over-Threshold
We select the values that exceed a predefined threshold and discard the rest. Optionally, we can discretize further by taking the maximum within a predefined spatiotemporal block. The POT method can be viewed as the limiting case of the block maxima method: as the size of the spatiotemporal block goes to zero, each individual point becomes its own maximum. The result is an irregular grid, because there is no guarantee that exactly one exceedance of the threshold occurs within each predefined spatiotemporal block. One could also use irregular blocks or shapes, but this makes processing much harder. Finally, one could discretize further by counting exceedances (and recording their intensity).
Algorithm
- Define maximum/minimum threshold values
- Select values above/below threshold
- Define spatiotemporal block (Optional)
- Summary statistic of values within spatiotemporal block (Optional)
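A minimal sketch of the two required steps (threshold, then selection) on a 1D series. The choice of the 95th percentile as threshold and the name `peaks_over_threshold` are assumptions for illustration only.

```python
import numpy as np


def peaks_over_threshold(y: np.ndarray, threshold: float):
    """Return the indices of exceedances and their excesses (value - threshold)."""
    idx = np.flatnonzero(y > threshold)
    return idx, y[idx] - threshold


rng = np.random.default_rng(0)
y = rng.gumbel(size=1000)
u = np.quantile(y, 0.95)  # assumed threshold choice, not prescribed by the notes
idx, excess = peaks_over_threshold(y, u)
print(idx.size, excess.min() > 0)
```

The returned indices are irregularly spaced, which is the irregular grid mentioned above. For values below a threshold, the comparison and the sign of the excess are flipped.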
Point Processes
This method is similar to the POT method with spatiotemporal blocks. However, we also count the number of exceedances within each block and take a summary statistic of the values within it.
- Point Process Analysis | Point Process NBs | PP w/ PyTorch | Marked Spatiotemporal Point Process Simulator
- neural spatial temporal point process | point process and models
- spatial point process w Paula
Algorithm
- Define maximum/minimum threshold values
- Select values above/below threshold
- Define spatiotemporal block
- Count the number of occurrences within spatiotemporal block
- Summary statistic of values within spatiotemporal block
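The steps above can be sketched for a 1D time series as follows. The mean excess is used as the summary statistic purely as an assumption; the function name `block_exceedance_summary` is hypothetical.

```python
import numpy as np


def block_exceedance_summary(y: np.ndarray, threshold: float, block_size: int):
    """Per-block exceedance count and mean excess over the threshold."""
    n_blocks = y.shape[0] // block_size
    blocks = y[: n_blocks * block_size].reshape(n_blocks, block_size)
    mask = blocks > threshold
    counts = mask.sum(axis=1)
    # Sum of excesses per block; non-exceedances contribute zero.
    excess_sum = np.where(mask, blocks - threshold, 0.0).sum(axis=1)
    # Mean excess is undefined (NaN) for blocks with no exceedances.
    mean_excess = np.full(n_blocks, np.nan)
    has_events = counts > 0
    mean_excess[has_events] = excess_sum[has_events] / counts[has_events]
    return counts, mean_excess


y = np.array([0.0, 3.0, 1.0, 5.0, 0.0, 0.0, 1.0, 0.0, 4.0, 0.0, 0.0, 0.0])
counts, mean_excess = block_exceedance_summary(y, threshold=2.0, block_size=4)
print(counts, mean_excess)
```

The counts feed the occurrence part of a point-process likelihood, while the summary statistic describes the intensity of the events in each block.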
Core Operations
Resampling
We need to choose the temporal frequency at which to resample the data.
Declustering
We need to merge observations that belong to the same event.
- Method I - we take non-overlapping blocks of the data at the minimum spatiotemporal resolution we accept and keep the maximum value in each block.
- Method II - we take a radius-neighbors-based approach on non-overlapping spatial regions at a desired temporal frequency.
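As a one-dimensional illustration of merging nearby observations, the sketch below uses runs declustering, a standard time-series technique that is swapped in here and is not the exact method described above. Exceedances closer than `min_gap` time steps are treated as one cluster, and only the cluster peak is kept. All names are hypothetical.

```python
import numpy as np


def decluster_runs(y: np.ndarray, threshold: float, min_gap: int) -> np.ndarray:
    """Indices of cluster peaks, where clusters are runs of nearby exceedances."""
    idx = np.flatnonzero(y > threshold)
    if idx.size == 0:
        return idx
    # A new cluster starts wherever consecutive exceedances are >= min_gap apart.
    breaks = np.flatnonzero(np.diff(idx) >= min_gap) + 1
    clusters = np.split(idx, breaks)
    # Keep the index of the largest value in each cluster.
    return np.array([c[np.argmax(y[c])] for c in clusters])


y = np.array([0, 3, 4, 0, 0, 0, 5, 0, 3, 0, 0, 0, 0, 6.0])
peaks = decluster_runs(y, threshold=2.0, min_gap=3)
print(peaks, y[peaks])
```

The temporal gap plays the role of the radius in Method II; a spatial version would replace it with a distance between locations.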
Examples
In this section, we take a deeper dive into how one can further preprocess the data to extract extreme values.
- Spatially Aggregate Data (Optional)
- Temporally Aggregate Data (Optional)
- Stitching, SuperImposing, Aggregating, Batch Sampling - PoPPY
Examples
- 1D Data Recorded in a sequence of distance or time
- 2D sampling for spatial interpolation
- 3D sampling for spatial interpolation
- Spatiotemporal
Spatial Scale
- Changes -> Mean, Variance, Tails, Range, Distribution Shape
- Tools -> Variogram, Predict the scale
- Rescale:
- DownScale/SuperResolution/UpSample
- Upscaling/Coarsen/DownSample -> Arithmetic Average, Power Law Average, Harmonic, Geometric
- Aggregations
- Creating location weights - https://youtu.be/k9VbyqafnPk?si=biWcgcqwuXVe8RfG
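The upscaling averages listed above can all be written as one power-law average, $(\text{mean}(x^p))^{1/p}$, with $p = 1$ arithmetic, $p = -1$ harmonic, and $p \to 0$ geometric. A minimal sketch, assuming a 2D field of strictly positive values and a square coarsening window; the function name `coarsen` is hypothetical.

```python
import numpy as np


def coarsen(field: np.ndarray, factor: int, power: float = 1.0) -> np.ndarray:
    """Upscale a 2D positive field by power-law averaging factor x factor cells."""
    ny, nx = field.shape
    ny_c, nx_c = ny // factor, nx // factor
    # Trim edges that do not fill a complete coarse cell, then split into blocks.
    blocks = field[: ny_c * factor, : nx_c * factor].reshape(ny_c, factor, nx_c, factor)
    if power == 0:
        # Geometric mean is the limit of the power-law average as power -> 0.
        return np.exp(np.log(blocks).mean(axis=(1, 3)))
    return (blocks ** power).mean(axis=(1, 3)) ** (1.0 / power)


field = np.array([[1.0, 2.0], [4.0, 8.0]])
arith = coarsen(field, 2, power=1.0)
harm = coarsen(field, 2, power=-1.0)
geom = coarsen(field, 2, power=0.0)
print(arith, harm, geom)
```

On the same cell the harmonic average is the smallest and the arithmetic the largest, which is one way the choice of upscaling changes the mean and the tails of the coarsened field.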
# filtering - remove high/low frequency signals
# spatiotemporal peaks - spatial,temporal dependencies
# remove climatology - temporal dependencies
# spatial aggregation - spatial dependencies
# rolling mean - spatial, temporal dependencies
Cookbook
- Spatial Statistics with Declustering Weights -> Grid Cell Size vs Declustered Mean
- Lat-Lon Spatial Averages using weights at poles
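For the lat-lon case, a common choice is to weight each grid cell by the cosine of its latitude, since cells on a regular lat-lon grid shrink towards the poles. The sketch below assumes a regular grid with shape (n_lat, n_lon); the function name `area_weighted_mean` is hypothetical.

```python
import numpy as np


def area_weighted_mean(field: np.ndarray, lat_deg: np.ndarray) -> float:
    """Mean over a regular lat-lon grid using cos(latitude) weights."""
    weights = np.cos(np.deg2rad(lat_deg))
    # Repeat the per-latitude weight across all longitudes.
    weights_2d = np.broadcast_to(weights[:, None], field.shape)
    return float(np.average(field, weights=weights_2d))


lat = np.array([-60.0, 0.0, 60.0])
field = np.array([[1.0, 1.0], [4.0, 4.0], [1.0, 1.0]])
weighted = area_weighted_mean(field, lat)
unweighted = float(field.mean())
print(weighted, unweighted)
```

The unweighted mean over-represents the high-latitude rows, so the two numbers differ whenever the field varies with latitude.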
Example Pseudocode
First, we need some spatiotemporal data. This could be any spatiotemporal field containing the extreme values we wish to extract.
y: Array["Dt Dy"] = ...
Now, we need some preprocessing steps so that the dataset is approximately i.i.d. We will remove some of the excess effects, such as high-frequency signals and the climatology.
# filter high frequency signals
y: Array["Dt Dy"] = low_pass_filter(y, params)
# remove climatology
climatology: Array["Dclim"] = calculate_climatology(y, reference_period, params)
y: Array["Dt Dy"] = remove_climatology(y, climatology, params)
# spatial aggregation
y: Array["Dt"] = spatial_aggregator(y, params)
Now, we need to select some extreme values.
y_max: Array["Dt"] = block_maximum(y, params)
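The pseudocode above can be turned into a small runnable sketch with NumPy. Everything below is an assumption made for illustration: the moving average stands in for the low-pass filter, the climatology is a per-day mean over the whole record rather than a reference period, the spatial aggregator is a plain mean, and the explicit arguments replace the generic `params`.

```python
import numpy as np


def low_pass_filter(y, window):
    """Centered moving average along time; edges are zero-padded by np.convolve."""
    kernel = np.ones(window) / window
    return np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="same"), 0, y)


def calculate_climatology(y, period):
    """Mean value at each position in the cycle, shape (period, Dy)."""
    n_cycles = y.shape[0] // period
    return y[: n_cycles * period].reshape(n_cycles, period, -1).mean(axis=0)


def remove_climatology(y, climatology):
    """Subtract the climatological value matching each time step's phase."""
    phase = np.arange(y.shape[0]) % climatology.shape[0]
    return y - climatology[phase]


def spatial_aggregator(y):
    """Unweighted mean over the spatial axis."""
    return y.mean(axis=1)


def block_maximum(y, block_size):
    """Maximum of each non-overlapping temporal block."""
    n_blocks = y.shape[0] // block_size
    return y[: n_blocks * block_size].reshape(n_blocks, block_size).max(axis=1)


# Synthetic data: 10 "years" of daily values with a seasonal cycle at 5 locations.
rng = np.random.default_rng(0)
n_years, period, n_space = 10, 365, 5
t = np.arange(n_years * period)
seasonal = 10.0 * np.sin(2 * np.pi * t / period)
y = seasonal[:, None] + rng.normal(size=(t.size, n_space))

y = low_pass_filter(y, window=5)
climatology = calculate_climatology(y, period)
y = remove_climatology(y, climatology)
y = spatial_aggregator(y)
y_max = block_maximum(y, block_size=period)
print(y_max.shape)
```

The resulting `y_max` holds one maximum per block and is the kind of sample a GEVD would be fitted to.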