TLDR: We will incorporate spatiotemporal effects into our model assumptions. We will use nonparametric Gaussian process models to account for the spatial and temporal autocorrelations present in the data. We assume that each parameter of the data likelihood is a random field arising from an underlying latent spatial process, so that the extreme values are conditionally independent given those parameters. These latent models parameterize the data likelihood. Covariates can enter either through the kernel function (Deep Kernel Learning) or through the mean function. Heterogeneity can be accounted for by giving the scale parameter of the data likelihood its own GP. To handle the large number of spatial samples, we can use approximations such as inducing points or spectral methods.
Questions
How do we go beyond univariate, site-by-site, or spatially aggregated modeling?
Can we take into account the pairwise spatial dependence among different sites?
Assumptions
The idea is that standard EVT methods can benefit greatly from spatial considerations. We assume that spatial methods can map risk at an individual location while borrowing strength from spatial neighbours, which helps estimate rare-event probabilities. We assume that accounting for this spatial dependence is both necessary and valid for inference.
Conditional Independence. The extreme values, $y$, are independent given the distribution parameters.
Process Layer. The parameters, $\boldsymbol{\theta}$, of the distribution, $F$, are random fields and result from an unobservable latent spatiotemporal process.
Model
Data Layer. We make the following assumption for the data layer:

$$y(\mathbf{s}, t) \mid \boldsymbol{\theta}, \boldsymbol{\alpha} \sim F\left(y; \boldsymbol{\theta}(\mathbf{s}, t)\right)$$

More concretely, the extreme observations, $y$, stem from a conditional distribution parameterized by some process, $\boldsymbol{\theta}$, and the hyperparameters of the process, $\boldsymbol{\alpha}$.
Process Layer. We assume that the process layer has the form:

$$\boldsymbol{\theta}(\mathbf{s}, t) \sim \mathcal{GP}\left(m(\mathbf{x}; \boldsymbol{\alpha}),\; k(\mathbf{x}, \mathbf{x}'; \boldsymbol{\alpha})\right)$$

More concretely, we believe the parameters of the data distribution are parameterized by some covariates and hyperparameters.
For example, we could have the following linear mean function:

$$m(\mathbf{x}) = \mathbf{w}^\top \mathbf{x} + b$$

where the hyperparameters are $\{\mathbf{w}, b\}$.
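As a minimal sketch of the two layers, the generative model can be simulated directly. The choices below are illustrative assumptions: a squared-exponential kernel for the latent process, and a Gumbel data likelihood (the $\xi \to 0$ simplification of the GEV) so that everything runs with NumPy alone.

```python
import numpy as np

rng = np.random.default_rng(0)


def sq_exp_kernel(s1, s2, sigma=1.0, rho=0.5):
    # squared-exponential (RBF) kernel over 1-D site coordinates
    d2 = (s1[:, None] - s2[None, :]) ** 2
    return sigma**2 * np.exp(-0.5 * d2 / rho**2)


# process layer: latent location parameter mu(s) drawn from a GP
sites = np.linspace(0.0, 1.0, 50)
K = sq_exp_kernel(sites, sites) + 1e-6 * np.eye(50)  # jitter for stability
mu = rng.multivariate_normal(np.zeros(50), K)

# data layer: extremes are conditionally independent given mu(s)
# (Gumbel likelihood as a stand-in for the full GEV)
y = rng.gumbel(loc=mu, scale=1.0, size=50)
```

Nearby sites share a similar latent `mu`, which is exactly the "borrowing strength from neighbours" assumed above.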
Previous Work
Gaussian Processes
- Exact
- Inference - MLE, MAP, Variational, MCMC
- Scale (Hardware) - Cola, GPUs, Parallel GPUs
- Scale (Moment)- Inducing Points
- Covariates - Additive, Separable, Mixture Kernels
- Noise - Heteroskedastic
- Physics Informed - Deep Kernel Learning, Custom Mean Function
- Scale (Spectral) - RFF
- Scale (Time Series) - Markovian, Variational Markovian
- Uncertain Inputs - MC, Taylor (1st, 2nd Order), Unscented, Moment-Matching, GPLVM
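As an illustration of the spectral scaling option above, here is a minimal NumPy sketch of Random Fourier Features (RFF) approximating the squared-exponential kernel. The feature count and lengthscale are illustrative assumptions, not tuned values.

```python
import numpy as np

rng = np.random.default_rng(1)


def rff_features(X, n_features=2000, lengthscale=1.0):
    # random Fourier features: phi(x) @ phi(x') approximates the
    # RBF kernel k(x, x') = exp(-||x - x'||^2 / (2 * lengthscale^2))
    d = X.shape[1]
    W = rng.normal(0.0, 1.0 / lengthscale, size=(d, n_features))
    b = rng.uniform(0.0, 2 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)


X = rng.normal(size=(100, 2))
Phi = rff_features(X)
K_approx = Phi @ Phi.T  # O(N D) features instead of an O(N^2) kernel matrix

# exact RBF kernel for comparison
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K_exact = np.exp(-0.5 * sq_dists)
```

Downstream, the GP regression can then be solved in the feature space at cost linear in the number of samples.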
Heteroskedastic GP
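A minimal sketch of the heteroskedastic idea: an input-dependent noise scale enters the GP likelihood as a non-constant diagonal. Here a hand-picked log-noise function stands in for the second latent GP that the full model would place on the scale parameter; all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

# latent signal on a 1-D grid
x = np.linspace(0.0, 1.0, 200)
f = np.sin(2 * np.pi * x)

# heteroskedastic noise: an input-dependent scale sigma(x)
# (assumed smooth log-noise function; in the full model this
# would itself get a GP prior)
log_sigma = -2.0 + 1.5 * x  # noise grows with x
y = f + rng.normal(0.0, np.exp(log_sigma))

# in the GP likelihood this enters as a non-constant diagonal
K_noise = np.diag(np.exp(log_sigma) ** 2)
```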
Max-Stable Processes
These papers adapt asymptotic results for multivariate extremes. They typically measure the spatial dependence among maxima.
Schlather & Tawn, 2003 | Pereira & Andraz, 2005 | Vannitsem & Naveau, 2007
Bayesian or Latent Models
Spatial structure is modeled indirectly via the distribution of the EVT parameters, i.e., conditioning on covariates and Bayesian hierarchical modeling.
Spatial Interpolation of Return Levels in Colorado - Cooley et al, 2007 | Coles & Tawn, 1996
Downscaling Extremes - Vrac & Naveau, 2007
Dynamical Models
Auto-regressive spatio-temporal heavy-tailed processes.
Gaussian Anamorphosis
Transforming the field into a Gaussian one.
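A minimal sketch of the empirical normal-score transform that underlies Gaussian anamorphosis, using only NumPy and the standard library's `NormalDist`; the Pareto sample stands in for a heavy-tailed field.

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(3)


def normal_score_transform(y):
    # empirical normal-score (anamorphosis) transform:
    # map each value through its empirical CDF, then through
    # the standard normal quantile function
    n = len(y)
    ranks = np.argsort(np.argsort(y)) + 1
    u = ranks / (n + 1)  # empirical CDF values in (0, 1)
    return np.array([NormalDist().inv_cdf(p) for p in u])


# heavy-tailed field values become approximately Gaussian,
# while the spatial ordering of values is preserved
y = rng.pareto(3.0, size=1000)
z = normal_score_transform(y)
```

Standard Gaussian geostatistics can then be applied to `z`, with the inverse transform mapping predictions back to the original scale.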
PseudoCode
Gaussian Process Model
Mean Function

```python
def mean_function(
    X: Array["N Ds+Dt+Dx"], weight: Array["Dy Dx"], bias: Array["Dy"]
) -> Array["N Dy"]:
    # extract the covariate columns only (assumed to be the last Dx columns)
    return X[:, -weight.shape[1]:] @ weight.T + bias
```
```python
import jax.numpy as jnp
import numpyro
import numpyro.distributions as dist
from tinygp import GaussianProcess, kernels


def gp_model(S: Array["N"], T: Array["N"], X: Array["N Dx"], Y: Array["N Dy"]):
    # mean function hyperparameters
    mean = numpyro.sample("mean", dist.Normal(0.0, prior_sigma))
    jitter = numpyro.sample("jitter", dist.HalfNormal(prior_sigma))
    # hyperparameters for the spatial kernel
    s_sigma = numpyro.sample("s_sigma", dist.HalfNormal(prior_sigma))
    s_rho = numpyro.sample("s_rho", dist.HalfNormal(prior_sigma))
    s_tau = numpyro.sample("s_tau", dist.HalfNormal(prior_sigma))
    kernel_s = s_sigma**2 * kernels.ExpSquared(s_tau) * kernels.Cosine(s_rho)
    # additive kernel over space, time, and covariates
    # (kernel_t and kernel_x are built analogously to kernel_s)
    kernel = kernel_s + kernel_t + kernel_x
    # concatenate inputs into a single design matrix
    inputs: Array["N 3"] = jnp.column_stack([S, T, X])
    # sample Y according to the standard Gaussian process formula
    gp = GaussianProcess(kernel, inputs, diag=jitter, mean=mean)
    numpyro.sample("gp", gp.numpyro_dist(), obs=Y)
    # optionally expose posterior-mean predictions at test inputs
    # (X_test assumed given)
    if Y is not None:
        numpyro.deterministic("pred", gp.condition(Y, X_test).gp.loc)
```
- Schlather, M., & Tawn, J. A. (2003). A dependence measure for multivariate and spatial extreme values: Properties and inference. Biometrika, 90(1), 139–156. 10.1093/biomet/90.1.139
- Pereira, A. M., & Andraz, J. M. (2005). Public Investment in Transportation Infrastructure and Economic Performance in Portugal. Review of Development Economics, 9(2), 177–196. 10.1111/j.1467-9361.2005.00271.x
- Vannitsem, S., & Naveau, P. (2007). Spatial dependences among precipitation maxima over Belgium. Nonlinear Processes in Geophysics, 14(5), 621–630. 10.5194/npg-14-621-2007
- Cooley, D., Nychka, D., & Naveau, P. (2007). Bayesian Spatial Modeling of Extreme Precipitation Return Levels. Journal of the American Statistical Association, 102(479), 824–840. 10.1198/016214506000000780
- Coles, S. G., & Tawn, J. A. (1996). A Bayesian Analysis of Extreme Rainfall Data. Applied Statistics, 45(4), 463. 10.2307/2986068