GP Tutorial Master List
A reconciled, exhaustive curriculum spanning what currently exists in gaussx, pyrox, and research_notebook, plus gaps surfaced from the gaussx + pyrox public APIs, open GitHub issues, and pyrox design_docs/. Goal: the most complete GP tutorial sequence we could ship.
Bayesian NN / NeRF / basis-function-regression tutorials live in ../bayesian_nns/TUTORIAL_MASTER_LIST.md. Cross-listed items (RFF, deep kernels, BLR, last-layer-Bayes) are flagged 🔁.
Legend — Source columns:
- G = exists in gaussx (docs/notebooks/<name>)
- P = exists in pyrox (docs/notebooks/<name>)
- R = exists in research_notebook (projects/gaussian_processes/notebooks/<path>)
- — = does not exist yet (gap)
Scope tag: 🧱 fundamental · 🔬 research · 🌉 bridge · 🔁 cross-listed
Refs column: gh#N = open GitHub issue · dd:path = pyrox design_docs/pyrox/<path> · api:foo = gaussx exported symbol.
Curriculum at a glance
A bird’s-eye view of the parts and their subparts. Skim this first to orient; the detailed per-tutorial tables live below.
- Part 0 — Linear Algebra & Gaussian Foundations
- 0.A — The Multivariate Gaussian
- 0.B — Parameterizations
- 0.C — Bayesian Updates & Conditioning
- 0.D — Numerical Mechanics
- Part 1 — Structured Linear Operators
- 1.A — Operator Zoo (catalog)
- 1.B — Matrix Identities & Decompositions
- 1.C — Matrix-Free / Implicit
- 1.D — Solvers
- 1.E — Trace, Log-Det, Roots
- Part 2 — Kernels
- 2.A — Standard kernels
- 2.B — Deep kernels
- 2.C — Multi-output kernels
- 2.D — Spherical / localized kernels
- 2.E — Kernel-based statistics & utilities
- 2.F — Non-Euclidean & operator-valued kernels
- Part 3 — Exact GP Regression
- 3.A — Foundations
- 3.B — Diagnostics
- 3.C — Heteroscedastic noise
- 3.D — High-level API
- 3.E — Constrained & Physics-informed GPs
- Part 4 — Structured GPs
- 4.A — Kronecker GPs
- 4.B — Grid / Toeplitz GPs
- 4.C — Sparse-precision (mesh / GMRF)
- Part 5 — Approximations & Scalability
- 5.A — Random features
- 5.B — Inducing-point fundamentals
- 5.C — Inter-domain features
- 5.D — Iterative-solver scaling
- 5.E — Deep GPs
- Part 6 — Non-Conjugate Likelihoods & Inference
- 6.A — Likelihood & integrator zoos
- 6.B — Classification
- 6.C — Newton / Gauss-Newton family
- 6.D — Variational inference
- 6.E — Expectation Propagation
- 6.F — Bayesian linear regression & non-standard outputs
- 6.G — Aggregate Bayesian methods
- Part 7 — Spectral GPs
- 7.A — Spectral foundations
- 7.B — Spectral kernel models
- 7.C — Random Fourier features
- 7.D — Hilbert-space methods
- 7.E — Variational spectral methods
- 7.F — Spherical / periodic spectral
- Part 8 — Markov / State-Space GPs
- 8.A — Foundations
- 8.B — SDE kernel zoo
- 8.C — Markov GP workflows
- 8.D — Parallel & scalable filtering
- 8.E — Nonlinear filtering
- 8.F — Ensemble methods
- 8.G — Steady-state & structured-Gaussian surfaces
- 8.H — Non-conjugate temporal case studies
- Part 9 — Sampling, Pathwise, Conditioning
- 9.A — Pathwise sampling
- 9.B — Matheron’s-rule conditioning
- Part 10 — Uncertainty Propagation & UQ
- 10.A — Foundations
- 10.B — Uncertain inputs
- 10.C — Analytic moments
- 10.D — BGPLVM
- 10.E — Special integrators & quantiles
- Part 11 — Probabilistic Programming Integration
- 11.A — gaussx + NumPyro
- 11.B — pyrox patterns
- 11.C — Hierarchical & sampling
- Part 12 — Ensembles
- Part 13 — Data Pipelines
- Part 14 — Applied Case Studies (research_notebook)
- 14.A — Spatial extremes
- 14.B — SVGP applied
- 14.C — Geophysics & emulation
- 14.D — Optimization & decision
- 14.E — Causal & event data
- 14.F — Practical
- Part 15 — Metrics & Calibration
Part 0 — Linear Algebra & Gaussian Foundations
0.A — The Multivariate Gaussian
Key equations / models:
- Density: $p(x) = (2\pi)^{-d/2} |\Sigma|^{-1/2} \exp\!\big(-\tfrac12 (x-\mu)^\top \Sigma^{-1} (x-\mu)\big)$
- Reparameterized sample: $x = \mu + L\varepsilon$, $\varepsilon \sim \mathcal N(0, I)$, $LL^\top = \Sigma$
- Entropy: $H = \tfrac12 \log\det(2\pi e\, \Sigma)$
- KL: $\mathrm{KL}(\mathcal N_0 \,\|\, \mathcal N_1) = \tfrac12 \big[\operatorname{tr}(\Sigma_1^{-1}\Sigma_0) + (\mu_1-\mu_0)^\top \Sigma_1^{-1} (\mu_1-\mu_0) - d + \log\tfrac{|\Sigma_1|}{|\Sigma_0|}\big]$
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 0.1 | The Multivariate Gaussian: density, sampling, conditioning | R multivariate_gaussian | 🧱 | pedagogical entry — three sampling routes, marginal & Schur conditioning, jitter |
| 0.2 | MultivariateNormal & MultivariateNormalPrecision distribution API | R mvn_distribution_api | 🧱 | covariance vs precision parameterisation, GMRF / banded Λ, round-trip equivalence |
| 0.3 | Quadratic forms, entropy, KL between Gaussians | R gaussian_quantities | 🧱 | api: gaussian_entropy, dist_kl_divergence, kl_standard_normal, quadratic_form, gaussian_expected_log_lik — extended to cover score, cross-entropy, expected log-likelihood, mutual information, mini-ELBO |
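To make 0.1–0.2 concrete, here is a minimal plain-JAX sketch of reparameterized sampling and Schur conditioning — no gaussx imports; `sample_mvn` / `condition` are illustrative names, not library API:

```python
import jax
import jax.numpy as jnp

def sample_mvn(key, mu, Sigma, n):
    # Reparameterized sampling: x = mu + L @ eps, with L L^T = Sigma.
    L = jnp.linalg.cholesky(Sigma)
    eps = jax.random.normal(key, (n, mu.shape[0]))
    return mu + eps @ L.T

def condition(mu, Sigma, idx_lat, idx_obs, x_obs):
    # Schur-complement conditioning p(x_lat | x_obs);
    # idx_lat / idx_obs are integer index arrays.
    S11 = Sigma[jnp.ix_(idx_lat, idx_lat)]
    S12 = Sigma[jnp.ix_(idx_lat, idx_obs)]
    S22 = Sigma[jnp.ix_(idx_obs, idx_obs)]
    K = jnp.linalg.solve(S22, S12.T).T           # Sigma_12 Sigma_22^{-1}
    mu_cond = mu[idx_lat] + K @ (x_obs - mu[idx_obs])
    Sigma_cond = S11 - K @ S12.T                 # Schur complement
    return mu_cond, Sigma_cond
```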
0.B — Parameterizations
Key equations / models:
- Natural parameters: $\eta_1 = \Sigma^{-1}\mu$, $\eta_2 = -\tfrac12 \Sigma^{-1}$
- Expectation parameters: $m_1 = \mathbb E[x] = \mu$, $m_2 = \mathbb E[xx^\top] = \Sigma + \mu\mu^\top$
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 0.4 | Three parameterizations: mean-cov ↔ natural ↔ expectation | R natural_parameters | 🧱 | api: mean_cov_to_natural, natural_to_mean_cov, natural_to_expectation, expectation_to_natural, damped_natural_update — round-trip identities, conjugate update as natural-form addition, moment matching, damped VI/EP primitive, use-case map across the curriculum |
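A from-scratch sketch of the 0.4 round trips; the function names mirror the listed gaussx symbols, but the bodies are re-derived here and use dense `inv` for clarity only:

```python
import jax.numpy as jnp

def mean_cov_to_natural(mu, Sigma):
    # eta1 = Sigma^{-1} mu, eta2 = -1/2 Sigma^{-1}
    Lam = jnp.linalg.inv(Sigma)
    return Lam @ mu, -0.5 * Lam

def natural_to_mean_cov(eta1, eta2):
    Sigma = jnp.linalg.inv(-2.0 * eta2)
    return Sigma @ eta1, Sigma

def mean_cov_to_expectation(mu, Sigma):
    # m1 = E[x], m2 = E[x x^T] = Sigma + mu mu^T
    return mu, Sigma + jnp.outer(mu, mu)
```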
0.C — Bayesian Updates & Conditioning
Key equations / models:
- Sequential conjugate update: $\eta_{\text{post}} = \eta_{\text{prior}} + \sum_i \eta_{\text{lik},i}$ (addition in natural form)
- Schur conditional: $\mu_{1|2} = \mu_1 + \Sigma_{12}\Sigma_{22}^{-1}(x_2 - \mu_2)$, $\Sigma_{1|2} = \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}$
- Structured-MVN sample: $x = \mu + R\varepsilon$ with root $R$ ($RR^\top = \Sigma$) dispatched per operator type
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 0.6 | Bayesian updates from scratch (sequential conjugate) | R bayesian_updates | 🧱 | natural-form addition recursion, batch = sequential = any order, GP regression as single application |
| 0.7 | Conditional distributions & Schur complement | R conditional_distributions | 🧱 | api: gaussx.conditional, schur_complement, conditional_variance, cov_transform; GP regression as joint conditioning |
| 0.8 | Structured MVN sampling dispatch | R structured_sampling | 🧱 | api: gaussx.cholesky, gaussx.sqrt; dispatch on Diagonal / Kronecker / BlockDiag / BlockTriDiag; LowRank additive sampling; fast-sampling tracking issues gaussx#168 (Toeplitz), #169 (KroneckerSum), #170 (SumKronecker) |
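The 0.6 recursion in code, under the simplest identity observation model $y_i = x + \varepsilon$ (an assumption for illustration): each conjugate update is literal addition of natural parameters, so any update order reaches the same posterior:

```python
import jax.numpy as jnp

def natural_update(eta1, eta2, y, noise_var):
    # Gaussian likelihood N(y | x, noise_var * I) adds its natural params.
    d = y.shape[0]
    return eta1 + y / noise_var, eta2 - 0.5 / noise_var * jnp.eye(d)

d = 2
post = (jnp.zeros(d), -0.5 * jnp.eye(d))     # N(0, I) prior in natural form
ys = jnp.array([[1.0, 0.0], [0.0, 2.0]])
for y in ys:                                 # any permutation of ys agrees
    post = natural_update(*post, y, noise_var=0.5)
```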
0.D — Numerical Mechanics
Key equations / models:
- Joseph-form covariance update: $P^+ = (I - KH)P(I - KH)^\top + KRK^\top$ (PSD-preserving)
- Cholesky: $\Sigma = LL^\top$, $\log|\Sigma| = 2\sum_i \log L_{ii}$
- Implicit diff through solve: for $Ax = b$, $\partial x = A^{-1}(\partial b - \partial A\, x)$
- Jacobi’s formula: $\partial \log\det A = \operatorname{tr}(A^{-1}\, \partial A)$
- Jitter / safe Cholesky: $\Sigma + \varepsilon I$, doubling ε until SPD
- Stable squared distances: mixed-precision to avoid catastrophic cancellation
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 0.5 | Joseph-form covariance update | R joseph_form_update | 🧱 | four equivalent covariance updates (standard / symmetric / information / Joseph), float32 stress test, connection to natural-parameter addition; preservation counterpart to 0.11’s recovery tools |
| 0.9 | Cholesky, log-det, trace primitives tour | R cholesky_logdet_trace | 🧱 | api: gaussx.cholesky, gaussx.logdet, gaussx.trace, gaussx.diag — closed-form identities / compute / storage tables, theoretical-order plots, Hutchinson stochastic trace |
| 0.10 | Differentiating through solve | R differentiating_solve | 🧱 | implicit-function-theorem JVP/VJP via lineax, Jacobi’s formula for logdet gradients, GP marginal-likelihood ascent in one jax.grad call |
| 0.11 | Numerical stability: jitter, safe Cholesky, condition number | R numerical_stability | 🧱 | api: gaussx.add_jitter, gaussx.safe_cholesky — condition-number diagnostic, bias–stability U-curve trade-off, float32 stress; jitter as recovery vs Joseph as preservation |
| 0.12 | Stable RBF & squared distances | R stable_rbf_distances | 🧱 | api: gaussx.stable_rbf_kernel, gaussx.stable_squared_distances — mixed-precision recipe, catastrophic cancellation, three-stage robustness pipeline (stable distances → jitter / safe Cholesky → Joseph form) |
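A hedged sketch of the 0.11 recipe — gaussx.safe_cholesky is the real API; this illustrates only the doubling-ε idea, relying on JAX returning NaNs (rather than raising) for non-SPD inputs:

```python
import jax.numpy as jnp

def safe_cholesky(Sigma, eps=1e-8, max_tries=10):
    # Add jitter eps*I and double eps until the factorization succeeds.
    for _ in range(max_tries):
        L = jnp.linalg.cholesky(Sigma + eps * jnp.eye(Sigma.shape[0]))
        if not jnp.any(jnp.isnan(L)):    # NaNs signal a failed factorization
            return L, eps
        eps = 2.0 * eps
    raise ValueError("matrix not SPD even after jitter")
```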
Part 1 — Structured Linear Operators
1.A — Operator Zoo (catalog)
Key equations / models:
- Kronecker: $(A \otimes B)\operatorname{vec}(X) = \operatorname{vec}(B X A^\top)$ via Roth’s lemma
- BlockDiag: $\operatorname{blockdiag}(A_1, \dots, A_k)$ — each block acts independently
- LowRankUpdate: $A + UCU^\top$
- Toeplitz: $T_{ij} = t_{i-j}$, matvec via FFT in $O(n \log n)$
- BlockTriDiag: tridiagonal precision Λ for Markov chains
- KroneckerSum: $A \oplus B = A \otimes I + I \otimes B$ (separable Laplacian)
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 1.1 | Operator basics + structural tags & dispatch (Dense, Diagonal, Kronecker, BlockDiag, LowRankUpdate; tag inventory; isinstance dispatch; bring-your-own-operator Circulant demo) | G operator_basics | 🧱 | ✅ merged 1.1+1.7; replaces former basics/operator_zoo stubs |
| 1.2 | Lazy operator algebra (Sum, Scaled, Product) | G lazy_algebra | 🧱 | ✅ |
| 1.3 | KroneckerSum vs SumKronecker (additive vs superposed) | G kronecker_sum_vs_sum_kronecker | 🧱 | ✅ |
| 1.4 | Toeplitz operators for stationary 1-D / 2-D grids | G toeplitz | 🧱 | ✅ |
| 1.5 | BlockTriDiag (Markov / Kalman precision form) + Lower/Upper variants | G block_tridiag | 🧱 | ✅ |
| 1.6 | MaskedOperator for missing data on a structured grid (MVN / Toeplitz / Kron / BlockTriDiag bases) | G masked_operator | 🧱 | ✅ |
1.B — Matrix Identities & Decompositions
Key equations / models:
- Kron eigendecomp: $A \otimes B = (Q_A \otimes Q_B)(\Lambda_A \otimes \Lambda_B)(Q_A \otimes Q_B)^\top$
- Sherman–Morrison–Woodbury: $(A + UCV)^{-1} = A^{-1} - A^{-1}U(C^{-1} + VA^{-1}U)^{-1}VA^{-1}$
- Det lemma: $\det(A + UCV) = \det(C^{-1} + VA^{-1}U)\,\det(C)\,\det(A)$
- Operator sandwich: $A P A^\top$ assembled lazily
- UDL of block-tridiagonal: $\Lambda = U D U^\top$ with unit block-bidiagonal $U$, block-diagonal $D$
- Discrete Lyapunov: $P = APA^\top + Q$
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 1.8 | Kronecker eigendecomposition | G kronecker_eigen | 🧱 | |
| 1.9 | Sherman–Morrison–Woodbury walkthrough | G woodbury_solve | 🧱 | |
| 1.10 | Operator sandwich A P Aᵀ without materialization | — | 🧱 | GAP — gh:gaussx#163 |
| 1.11 | UDL decomposition for block-tridiagonal precision | — | 🧱 | GAP — gh:gaussx#65 |
| 1.12 | Discrete Lyapunov solve (stationary covariance of LTI) | — | 🧱 | GAP — api: discrete_lyapunov_solve |
1.C — Matrix-Free / Implicit
Key equations / models:
- Matrix-free matvec: $v \mapsto Kv$ computed row-by-row via nested vmap, never forming $K$
- Cross-kernel matvec: $K_{*f}\,v$ for prediction without allocating the cross-covariance
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 1.13 | Matrix-free / implicit operators | G implicit_kernel | 🧱 | extend with ImplicitCrossKernelOperator (GAP) |
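The 1.13 pattern in plain JAX: a nested-vmap matvec that never materializes $K$, composed with jax.scipy's CG, which accepts a callable operator. The RBF kernel and noise level are illustrative assumptions:

```python
import jax
import jax.numpy as jnp
from jax.scipy.sparse.linalg import cg

def rbf(x1, x2, ell=1.0):
    return jnp.exp(-0.5 * jnp.sum((x1 - x2) ** 2) / ell**2)

def kernel_matvec(X, v, noise=1e-2):
    # row_i = sum_j k(x_i, x_j) v_j — one row of K per inner vmap call
    row = lambda xi: jnp.dot(jax.vmap(lambda xj: rbf(xi, xj))(X), v)
    return jax.vmap(row)(X) + noise * v          # (K + noise I) v

X = jax.random.normal(jax.random.PRNGKey(0), (200, 2))
b = jnp.ones(200)
x_sol, _ = cg(lambda v: kernel_matvec(X, v), b)  # matrix-free solve
```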
1.D — Solvers
Key equations / models:
- Cholesky solve: $x = L^{-\top}(L^{-1}b)$ for $A = LL^\top$
- Conjugate Gradient: minimize $\tfrac12 x^\top A x - b^\top x$ in the Krylov subspace $\mathcal K_k(A, b)$
- BBMM: batched matvec drives solve + logdet + grad simultaneously (Gardner et al. 2018)
- MINRES: indefinite symmetric · LSMR: rectangular least squares
- Preconditioned CG: solve $P^{-1}Ax = P^{-1}b$ with $P \approx A$
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 1.14 | Solver strategies overview (dense, CG, Lanczos) | G solver_strategies, G solver_comparison | 🧱 | merge candidates |
| 1.15 | Preconditioned CG | — | 🧱 | GAP — api: PreconditionedCGSolver |
| 1.16 | BBMM — Black-Box Matrix-Matrix Multiplication | — | 🧱 | GAP — api: BBMMSolver; GPyTorch-style |
| 1.17 | Indefinite/non-PSD: MINRES / LSMR | — | 🧱 | GAP — api: MINRESSolver, LSMRSolver |
| 1.18 | Auto-dispatch (AutoSolver, ComposedSolver) | — | 🧱 | GAP |
1.E — Trace, Log-Det, Roots
Key equations / models:
- Hutchinson trace: $\operatorname{tr}(A) = \mathbb E[z^\top A z]$, $z$ Rademacher
- Stochastic Lanczos Quadrature: $\log\det A = \operatorname{tr}(\log A)$ via Lanczos tridiagonalization
- Contour-integral root: $A^{1/2}b$ via quadrature of $\tfrac{1}{2\pi i}\oint z^{1/2}(zI - A)^{-1}b\, dz$
- Joint inv-quad-logdet: shared CG/Lanczos passes return $y^\top A^{-1} y$ and $\log\det A$ together
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 1.19 | Stochastic Lanczos Quadrature log-det | — | 🧱 | GAP — api: SLQLogdet, IndefiniteSLQLogdet |
| 1.20 | RNLA — randomized numerical linear algebra port | — | 🧱 | GAP — gh:gaussx#156 |
| 1.21 | Contour-integral sqrt_inv_matmul / sqrt_matmul | — | 🧱 | GAP — gh:gaussx#43 |
| 1.22 | Root & inverse-root decompositions | — | 🧱 | GAP — gh:gaussx#40 |
| 1.23 | Joint inverse-quadratic + log-det | — | 🧱 | GAP — gh:gaussx#39 |
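Hutchinson's estimator is small enough to sketch directly from the 1.E identity; only matvecs are required, so the same code works against any lazy operator:

```python
import jax
import jax.numpy as jnp

def hutchinson_trace(matvec, dim, key, num_probes=64):
    # tr(A) = E[z^T A z] with Rademacher probes z in {-1, +1}^dim.
    z = jax.random.rademacher(key, (num_probes, dim), dtype=jnp.float32)
    return jnp.mean(jax.vmap(lambda zi: zi @ matvec(zi))(z))

A = jnp.diag(jnp.arange(1.0, 11.0))              # true trace = 55
est = hutchinson_trace(lambda v: A @ v, 10, jax.random.PRNGKey(0))
```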
Part 2 — Kernels
2.A — Standard kernels
Key equations / models:
- RBF: $k(x, x') = \sigma^2 \exp\!\big(-\tfrac{\|x - x'\|^2}{2\ell^2}\big)$
- Matérn-ν: $k(r) = \sigma^2 \tfrac{2^{1-\nu}}{\Gamma(\nu)} \big(\tfrac{\sqrt{2\nu}\, r}{\ell}\big)^{\nu} K_\nu\!\big(\tfrac{\sqrt{2\nu}\, r}{\ell}\big)$, $r = \|x - x'\|$
- Periodic: $k(x, x') = \sigma^2 \exp\!\big(-\tfrac{2\sin^2(\pi |x - x'| / p)}{\ell^2}\big)$
- Linear: $k = \sigma^2 x^\top x'$ · Polynomial: $k = (x^\top x' + c)^d$
- ARD lengthscale: $r^2 = \sum_d (x_d - x'_d)^2 / \ell_d^2$
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 2.1 | Kernel cookbook: RBF, Matérn, Periodic, Linear, Polynomial | — | 🧱 | GAP |
| 2.2 | Kernel composition: sum, product, warping | — | 🧱 | GAP |
| 2.3 | ARD & lengthscale interpretation | — | 🧱 | GAP |
| 2.4 | Stationary vs non-stationary kernels | — | 🧱 | GAP |
| 2.13 | Pytree kernel composition — sum / product / scaled as pytrees | — | 🧱 | GAP — canonical JAX pattern, important for gaussx/pyrox users |
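The 2.A formulas transcribe almost line-for-line into jax.numpy; a sketch, with ARD entering through a per-dimension lengthscale vector:

```python
import jax.numpy as jnp

def rbf(x, y, ell, sigma=1.0):
    r2 = jnp.sum(((x - y) / ell) ** 2)           # ARD squared distance
    return sigma**2 * jnp.exp(-0.5 * r2)

def matern32(x, y, ell, sigma=1.0):
    # Matérn-3/2; small epsilon keeps the gradient finite at r = 0
    r = jnp.sqrt(jnp.sum(((x - y) / ell) ** 2) + 1e-12)
    return sigma**2 * (1 + jnp.sqrt(3.0) * r) * jnp.exp(-jnp.sqrt(3.0) * r)

def periodic(x, y, ell, period, sigma=1.0):
    s = jnp.sin(jnp.pi * jnp.abs(x - y) / period)
    return sigma**2 * jnp.exp(-2.0 * jnp.sum((s / ell) ** 2))
```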
2.B — Deep kernels
Spectral kernels (Bochner / Spectral Mixture) live in Part 7 — Spectral GPs. This subsection covers neural-network-warped kernels only.
Key equations / models:
- Deep kernel: $k(x, x') = k_\theta\big(g_\phi(x), g_\phi(x')\big)$ for a NN feature map $g_\phi$
- ArcCosine-$n$ (Cho & Saul 2009): order-1 case $k(x, x') = \tfrac{\|x\|\,\|x'\|}{\pi}\big(\sin\theta + (\pi - \theta)\cos\theta\big)$, NN-correspondence
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 2.6 | Deep kernels (NN-warped inputs) | R pyroxgp/04_svgp_rff_nn | 🌉 🔁 | |
| 2.7 | ArcCosine kernel (NN-correspondence) | — | 🧱 🔁 | GAP — dd:features/gp/gpflow.md |
2.C — Multi-output kernels
Key equations / models:
- LMC (Linear Model of Coregionalization): $f_d(x) = \sum_q a_{dq}\, g_q(x)$, $K_{dd'}(x, x') = \sum_q B^{(q)}_{dd'} k_q(x, x')$
- ICM (rank-1 LMC): $K = B \otimes k$ with output covariance $B = aa^\top$
- OILMM: orthogonal projection $H$ such that $H^\top y$ decouples outputs
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 2.8 | Multi-output: LMC, ICM, OILMM | P multioutput_gp | 🧱 | |
| 2.9 | OILMM mechanics: project / back-project | — | 🧱 | GAP — api: oilmm_project, oilmm_back_project |
2.D — Spherical / localized kernels (moved)
Spherical / harmonic / Slepian kernels are now grouped under Part 7.F — Spherical / periodic spectral, where they sit alongside the full spectral toolkit (Bochner, RFF, HSGP). Riemannian / graph kernels remain in 2.F.
2.E — Kernel-based statistics & utilities
Key equations / models:
- Centered kernel: $\tilde K = HKH$, $H = I - \tfrac1n \mathbf{1}\mathbf{1}^\top$
- HSIC: $\mathrm{HSIC} = \tfrac{1}{(n-1)^2} \operatorname{tr}(\tilde K \tilde L)$
- MMD²: $\mathrm{MMD}^2 = \mathbb E[k(x, x')] - 2\,\mathbb E[k(x, y)] + \mathbb E[k(y, y')]$
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 2.11 | Kernel centering & KPCA | — | 🧱 | GAP — api: center_kernel, centering_operator |
| 2.12 | Kernel-based statistics: HSIC & MMD | — | 🧱 | GAP — api: hsic, mmd_squared |
2.F — Non-Euclidean & operator-valued kernels
Key equations / models:
- Operator-valued kernel: $K(x, x')$ is a linear operator on the output space — predicts function-valued outputs (velocity fields, spectral curves)
- Graph heat kernel: $K = \exp(-tL)$ for graph Laplacian $L$
- Geodesic RBF on manifold: $k(x, x') = \sigma^2 \exp\!\big(-\tfrac{d_g(x, x')^2}{2\ell^2}\big)$, $d_g$ = geodesic distance
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 2.15 | Operator-valued kernel regression — function-valued outputs (velocity fields, spectral curves) | — | 🔬 | GAP |
| 2.16 | GP regression on graphs — heat kernel | — | 🔬 | GAP |
| 2.17 | GP regression on Riemannian manifolds — geodesic distance kernels | — | 🔬 | GAP |
Part 3 — Exact GP Regression
3.A — Foundations
Key equations / models:
- GP prior: $f \sim \mathcal{GP}(m, k)$ with $m(x) = \mathbb E[f(x)]$, $k(x, x') = \operatorname{Cov}\big(f(x), f(x')\big)$
- Posterior mean: $\mu_* = K_{*f}(K_{ff} + \sigma^2 I)^{-1} y$
- Posterior variance: $\Sigma_* = K_{**} - K_{*f}(K_{ff} + \sigma^2 I)^{-1} K_{f*}$
- Log marginal likelihood: $\log p(y) = -\tfrac12 y^\top (K_{ff} + \sigma^2 I)^{-1} y - \tfrac12 \log|K_{ff} + \sigma^2 I| - \tfrac n2 \log 2\pi$
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 3.1 | Kernel ridge regression / GP “hello world” | G kernel_regression | 🧱 | |
| 3.2 | Exact GP regression — three patterns | P exact_gp_regression | 🧱 | |
| 3.3 | Hyperparameter learning: marginal likelihood | — | 🧱 | GAP |
| 3.8 | GP regression with mean function — constant / linear / NN mean; posterior shift under strong prior | — | 🧱 | GAP |
| 3.9 | Empirical Bayes / type-II MLE for hyperparameter priors — log-normal priors on kernel hyperparameters, joint optimisation | — | 🧱 | GAP |
| 3.10 | Batch GP regression — vmap over independent GPs simultaneously | — | 🧱 | GAP — canonical JAX pattern |
| 3.11 | GPU-accelerated exact GP / tile-based Cholesky — block-Cholesky for large-n exact GPs | — | 🔬 | GAP |
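The 3.A equations as a compact Cholesky-based sketch (not the gaussx/pyrox API); both functions are jit- and grad-compatible, which is what 3.3's marginal-likelihood training relies on:

```python
import jax.numpy as jnp
from jax.scipy.linalg import cho_factor, cho_solve

def gp_posterior(K_ff, K_sf, K_ss, y, noise):
    cf = cho_factor(K_ff + noise * jnp.eye(K_ff.shape[0]), lower=True)
    alpha = cho_solve(cf, y)                     # (K + sigma^2 I)^{-1} y
    mu = K_sf @ alpha                            # posterior mean
    cov = K_ss - K_sf @ cho_solve(cf, K_sf.T)    # posterior covariance
    return mu, cov

def log_marginal_likelihood(K_ff, y, noise):
    n = y.shape[0]
    cf = cho_factor(K_ff + noise * jnp.eye(n), lower=True)
    quad = y @ cho_solve(cf, y)
    logdet = 2.0 * jnp.sum(jnp.log(jnp.diag(cf[0])))
    return -0.5 * (quad + logdet + n * jnp.log(2.0 * jnp.pi))
```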
3.B — Diagnostics
Key equations / models:
- LOO-CV (LOVE): $\mu_{-i} = y_i - \tfrac{[\Lambda y]_i}{\Lambda_{ii}}$, $\sigma^2_{-i} = \tfrac{1}{\Lambda_{ii}}$ with $\Lambda = (K_{ff} + \sigma^2 I)^{-1}$
- Probability integral transform (PIT) for calibration
- Coverage at level $\alpha$: empirical fraction of $y_i$ inside the $\alpha$-credible interval
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 3.4 | LOVE: fast leave-one-out CV | G love_crossval | 🌉 | |
| 3.5 | Predictive variance & calibration diagnostics | — | 🧱 | GAP |
| 3.12 | Bayesian model selection — compare kernels via log marginal likelihood, Bayes factors, WAIC | — | 🧱 | GAP |
| 3.13 | Predictive distribution anatomy — decompose posterior mean vs variance; under/oversmoothing regimes | — | 🧱 | GAP — pedagogical |
3.C — Heteroscedastic noise
Key equations / models:
- Two-GP heteroscedastic: $y = f(x) + \varepsilon(x)$, $\varepsilon(x) \sim \mathcal N\big(0, e^{g(x)}\big)$ with $f \sim \mathcal{GP}$ and $g \sim \mathcal{GP}$
- Joint ELBO over $(f, g)$ via posterior linearization or coupled cubature
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 3.6 | Heteroscedastic GP — two coupled latent GPs | — | 🧱 | GAP — dd:examples/gp/moments.md |
3.D — High-level API
Key equations / models:
- `GPEstimator(kernel, ...).fit(X, y).predict(X*, quantiles=...)` — sklearn-style facade
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 3.7 | sklearn-style GPEstimator facade | — | 🧱 | GAP — gh:pyrox#71 |
3.E — Constrained & Physics-informed GPs
Key equations / models:
- Monotone GP: derivative observations $f'(x_i) \ge 0$ via a linear operator on the prior; or projection to monotone function space
- Convex GP: Hessian positivity as inequality constraint
- Boundary-condition GP: zero mean at domain boundary via Dirichlet eigenfunction basis with $\phi_j|_{\partial\Omega} = 0$
- PDE-constrained GP (Raissi et al.): encode $\mathcal L f = g$ as derivative / linear-operator observations; cross-covariance $\operatorname{Cov}\big(\mathcal L f(x), f(x')\big) = \mathcal L_x k(x, x')$
- Student-t process: $f \sim \mathcal{TP}(\nu, m, k)$; posterior analytically tractable with heavier tails than GP
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 3.14 | Monotone GP — derivative observations or monotone projection | — | 🔬 | GAP |
| 3.15 | Convex GP — Hessian positivity constraints | — | 🔬 | GAP |
| 3.16 | Boundary-condition GP — zero mean at domain boundary via eigenfunction basis | — | 🔬 | GAP |
| 3.17 | PDE-constrained GP — encode as linear operator observations (Raissi et al.) | — | 🔬 | GAP — generalises monotone GP to arbitrary linear operators |
| 3.18 | Student-t process — heavier-tailed alternative to GP with tractable posterior | — | 🧱 | GAP |
Part 4 — Structured GPs
GPs whose covariance has direct algebraic structure (Kronecker, Toeplitz, grid, sparse-precision) — exploited by Part 1 operators.
4.A — Kronecker GPs
Key equations / models:
- 2D-grid GP: $K = K_x \otimes K_t$, solve in $O(n_x^3 + n_t^3)$ via Roth’s lemma
- Kronecker + low-rank: $\Sigma = K_1 \otimes K_2 + UU^\top$, posterior via Woodbury
- Sum-of-Kronecker: $K = \sum_j A_j \otimes B_j$ (additive separable)
- Additive decomposition: $f = f_{\text{trend}} + f_{\text{seasonal}} + f_{\text{residual}}$
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 4.1 | GPs on 2D grids with Kronecker structure | G gp_2d_grid | 🌉 | |
| 4.2 | Combined Kronecker + low-rank | G structured_gp | 🌉 | |
| 4.3 | Sum-of-Kronecker (additive space + time) | R kronecker/01_spain_extremes (uses) | 🔬 | could be broken out into a fundamental tutorial |
| 4.4 | Separable spatiotemporal & additive (trend + seasonal + residual) | — | 🧱 | GAP — dd:examples/gp/moments.md |
| 4.5 | Kronecker marginal log-likelihood & posterior predictive | — | 🧱 | GAP — api: kronecker_mll, kronecker_posterior_predictive |
4.B — Grid / Toeplitz GPs
Key equations / models:
- KISS-GP / SKI: $K \approx W K_{UU} W^\top$ with cubic local interpolation weights $W$ to grid points
- Toeplitz matvec via FFT: embed $T$ in a circulant $C$, $Cv = \mathcal F^{-1}\big(\mathcal F(c) \odot \mathcal F(v)\big)$, $O(n \log n)$
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 4.6 | KISS-GP / SKI on grids | — | 🧱 | GAP — api: InterpolatedOperator, cubic_interpolation_weights, grid_data, create_grid |
| 4.7 | Lattice / Toeplitz GPs for stationary 1D | — | 🧱 | GAP — pairs with 1.4 |
4.C — Sparse-precision (mesh / GMRF)
Key equations / models:
- SPDE (Lindgren et al. 2011): $(\kappa^2 - \Delta)^{\alpha/2} f = \mathcal W$ → Matérn kernel
- FEM precision: $Q = \kappa^4 C + 2\kappa^2 G + G C^{-1} G$ (α = 2) on triangulated mesh, sparse banded
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 4.8 | SPDE / FEM Matérn — triangulated mesh GMRF, O(n^{3/2}) | — | 🌉 | GAP — gh:pyrox#50, dd:features/gp/spde_fem.md |
Part 5 — Approximations & Scalability
GPs that scale to large N via inducing points, random features, or iterative solvers — distinct from Part 4 in that the structure is imposed rather than inherent to the data geometry.
5.A — Random features (moved)
Random Fourier Features, Nyström, and FastFood live under Part 7.C — Random Fourier features. This subsection is intentionally empty in Part 5 — the inducing-point / variational story (5.B–5.E) does not depend on RFF derivations, only on the spectral approximation result they deliver.
5.B — Inducing-point fundamentals
Key equations / models:
- Inducing variables: $u = f(Z)$ with $p(u) = \mathcal N(0, K_{uu})$; FITC: $Q_{ff} = K_{fu} K_{uu}^{-1} K_{uf}$, noise $\operatorname{diag}(K_{ff} - Q_{ff}) + \sigma^2 I$
- SVGP ELBO (Hensman 2013): $\mathcal L = \sum_i \mathbb E_{q(f_i)}[\log p(y_i \mid f_i)] - \mathrm{KL}\big(q(u)\,\|\,p(u)\big)$
- Whitened: $u = Lv$, $v \sim \mathcal N(0, I)$, $LL^\top = K_{uu}$ → isotropic optimization
- Collapsed ELBO (Titsias 2009): $\mathcal L = \log \mathcal N\big(y \mid 0,\, Q_{ff} + \sigma^2 I\big) - \tfrac{1}{2\sigma^2} \operatorname{tr}(K_{ff} - Q_{ff})$
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 5.4 | Inducing point methods (FITC, DTC, VFE) — theory | — | 🧱 | GAP |
| 5.5 | Sparse Variational GP (Titsias/Hensman) | G sparse_variational_gp, R pyroxgp/01_svgp_standard | 🌉 | DUP |
| 5.6 | Whitening mechanics: whiten_covariance, unwhiten, unwhiten_covariance | — | 🧱 | GAP |
| 5.7 | Whitened SVGP & Bayesian linear regression view | G whitened_svgp | 🌉 🔁 | |
| 5.8 | Collapsed ELBO | — | 🧱 | GAP — api: collapsed_elbo |
| 5.9 | Mini-batched SVGP / stochastic VI | R pyroxgp/02_svgp_batched | 🔬 | |
| 5.10 | Full SVGP tutorial — 6 guide families incl. orthogonal decoupled | — | 🧱 | GAP — dd:examples/gp/svgp_numpyro.py |
| 5.19 | Collapsed vs uncollapsed SVGP — explicit comparison of Titsias vs Hensman objectives, bias/variance tradeoff | — | 🧱 | GAP — pedagogical |
| 5.20 | Online sparse GP (Csató & Opper 2002) — sequential Bayesian update of inducing set without full retraining | — | 🔬 | GAP — complements streaming filter tutorial 8.33 |
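Titsias' collapsed bound from the 5.B list, sketched with dense kernels for readability (a real implementation would keep everything in Woodbury form); `jitter` is an illustrative assumption:

```python
import jax.numpy as jnp
from jax.scipy.stats import multivariate_normal

def collapsed_elbo(K_ff, K_fu, K_uu, y, s2, jitter=1e-6):
    # L = log N(y | 0, Q_ff + s2 I) - tr(K_ff - Q_ff) / (2 s2)
    L = jnp.linalg.cholesky(K_uu + jitter * jnp.eye(K_uu.shape[0]))
    A = jnp.linalg.solve(L, K_fu.T)              # so that Q_ff = A^T A
    Q_ff = A.T @ A
    n = y.shape[0]
    logp = multivariate_normal.logpdf(y, jnp.zeros(n), Q_ff + s2 * jnp.eye(n))
    return logp - 0.5 / s2 * jnp.trace(K_ff - Q_ff)
```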
5.C — Inter-domain features
Inter-domain features that are fundamentally spectral (VFF, VISH, Laplacian eigenfunctions) live under Part 7.E — Variational spectral methods and Part 7.F — Spherical / periodic spectral. This subsection now covers only the generic decoupled-basis pattern.
Key equations / models:
- Inter-domain inducing variables: $u_m = \int f(x)\, \phi_m(x)\, dx$ for chosen basis $\phi_m$
- Decoupled: separate basis for posterior mean (large) and covariance (small)
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 5.14 | Decoupled inter-domain features (mixed spatial + spectral) | — | 🧱 | GAP — gh:pyrox#49 |
5.D — Iterative-solver scaling
Key equations / models:
- CG-based GP: $(K + \sigma^2 I)^{-1} y$ via CG; logdet via SLQ
- CGLB: lower bound on $\log p(y)$ from Lanczos eigenvalue bounds
- Preconditioned CG: $P \approx K + \sigma^2 I$ with Nyström / pivoted-Cholesky preconditioner
- EigenPro: SGD with eigen-spectrum preconditioner
- Falkon (Rudi 2017): Newton iteration on Nyström-reduced system
- LogFalkon: extends Falkon to GSC losses (logistic, exponential)
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 5.15 | CG for exact GPs at scale + CGLB | — | 🧱 | GAP — pairs with 1.16; dd:features/gp/gpflow.md |
| 5.16 | EigenPro spectral preconditioning | — | 🧱 | GAP — gh:gaussx#63 |
| 5.17 | Falkon: Nyström preconditioner + solve recipe | — | 🌉 | GAP — gh:gaussx#49 |
| 5.18 | LogFalkon / GSC-Falkon — Newton outer + preconditioned CG | — | 🌉 | GAP — gh:pyrox#50, dd:features/gp/logfalkon.md |
5.E — Deep GPs
Key equations / models:
- Deep GP (Salimbeni & Deisenroth 2017): $f = f^{(L)} \circ \cdots \circ f^{(1)}$, each layer $f^{(\ell)} \sim \mathcal{GP}$
- Doubly stochastic ELBO: $\sum_i \mathbb E_{q(f^{(L)}_i)}[\log p(y_i \mid f^{(L)}_i)] - \sum_\ell \mathrm{KL}\big(q(u^{(\ell)})\,\|\,p(u^{(\ell)})\big)$
- Convolutional GP (van der Wilk 2017): patch-level inducing features, $f(x) = \tfrac1P \sum_p g\big(x^{[p]}\big)$ over patches
- Inter-domain inducing variables per patch; exploits translation-equivariance
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 5.21 | Deep GP — doubly stochastic VI ELBO, layer-by-layer sampling (Salimbeni & Deisenroth 2017) | — | 🔬 | GAP — clear gap in Part 5 hierarchy |
| 5.22 | Convolutional GP — patch-level inducing features for image data (van der Wilk 2017) | — | 🔬 | GAP — natural extension after VISH/VFF |
Part 6 — Non-Conjugate Likelihoods & Inference
6.A — Likelihood & integrator zoos
Key equations / models:
- Generic non-conjugate factorization: $p(y, f) = p(f) \prod_i p(y_i \mid f_i)$
- Bernoulli (sigmoid/probit link), Poisson (exp link), Student-t, Beta, Gamma, Exponential, Softmax
- ELL: $\mathbb E_{q(f_i)}[\log p(y_i \mid f_i)]$ via integrator
- Gauss–Hermite: $\mathbb E_{\mathcal N(\mu, \sigma^2)}[g(f)] \approx \tfrac{1}{\sqrt\pi} \sum_k w_k\, g\big(\mu + \sqrt2\,\sigma t_k\big)$
- Sigma points (Unscented): $2d + 1$ deterministic points
- 5th-order cubature: $2d^2 + 1$ symmetric points
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 6.1 | Likelihood zoo: Bernoulli, Poisson, StudentT, Softmax, Heteroscedastic, Exponential, Beta, Gamma, Multi-latent | — | 🧱 | GAP — gh:pyrox#48 |
| 6.2 | Integrator zoo: Gauss–Hermite, MC, Unscented, Taylor, Assumed-Density Filter | — | 🧱 | GAP — api: gaussx _quadrature |
| 6.3 | Sigma points & cubature | — | 🧱 | GAP — api: sigma_points, cubature_points |
| 6.4 | Fifth-order symmetric cubature integrator | — | 🧱 | GAP — gh:gaussx#26 |
| 6.5 | Statistical Linear Regression via cubature (SLR) | — | 🧱 | GAP — gh:gaussx#25 |
6.B — Classification
Key equations / models:
- Latent GP classification: $f \sim \mathcal{GP}$, $p(y = 1 \mid f) = \sigma(f)$
- Multi-class: $p(y = c \mid f) = \operatorname{softmax}(f)_c$, $C$ latent GPs
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 6.6 | Latent GP classification — three patterns (Bernoulli + softmax) | P latent_gp_classification | 🧱 | extend to multi-class per dd |
6.C — Newton / Gauss-Newton family
Key equations / models:
- Laplace: $\hat f = \arg\max_f \log p(y \mid f) + \log p(f)$, $\Sigma = (K^{-1} + W)^{-1}$, $W = -\nabla^2_f \log p(y \mid f)$
- Newton update: $f \leftarrow f - (\nabla^2 L)^{-1} \nabla L$ with damping
- Gauss-Newton / GGN: $H \approx J^\top J$ (drops 2nd-order terms)
- Posterior linearization (SLR): $p(y \mid f) \approx \mathcal N(y \mid Af + b, \Omega)$ matched on the current Gaussian
- Hutchinson Hessian diag: $\operatorname{diag}(H) = \mathbb E[z \odot Hz]$
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 6.7 | Laplace approximation | P advanced_gp_laplace | 🧱 | |
| 6.8 | Gauss–Newton inference | P advanced_gp_gauss_newton | 🧱 | |
| 6.9 | Quasi-Newton inference (L-BFGS sites) | P advanced_gp_qn | 🧱 | |
| 6.10 | Posterior Linearization (Bayes-Newton) | P advanced_gp_pl | 🧱 | |
| 6.11 | Newton & damped natural updates | — | 🧱 | GAP — api: newton_update, damped_natural_update |
| 6.12 | Gauss–Newton & GGN diagonal | — | 🧱 | GAP — api: gauss_newton_precision, ggn_diagonal |
| 6.13 | Hutchinson Hessian diagonal & Riemannian PSD correction | — | 🧱 🔁 | GAP |
6.D — Variational inference
Key equations / models:
- Variational guides: delta · diagonal mean-field · low-rank ($D + UU^\top$) · full-rank Cholesky · normalizing flow · whitened
- ELBO: $\mathcal L = \mathbb E_q[\log p(y \mid f)] - \mathrm{KL}\big(q(f)\,\|\,p(f)\big)$
- Natural gradient: $\tilde\nabla = F^{-1} \nabla$, $F$ = Fisher information matrix
- CVI sites (Khan & Lin 2017): damped natural-parameter site updates $\lambda \leftarrow (1 - \rho)\lambda + \rho\, \tilde\lambda$
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 6.14 | Variational guides — full-rank, mean-field, low-rank, whitened, delta, flow | — | 🧱 | GAP — dd:examples/gp/vgp_numpyro.py + features/gp/variational_families.md |
| 6.15 | Natural gradient VI | G natural_gradient_vi | 🌉 | |
| 6.16 | Conjugate VI for GPs (CVI sites) | — | 🧱 | GAP — api: cvi_update_sites, site_natural_from_tilted |
| 6.23 | Full VGP (non-sparse) — variational parameters $m$, $S$ over all $n$ latents, no inducing-point approximation | — | 🧱 | GAP — closes gap between sparse and exact tutorials |
6.E — Expectation Propagation
Key equations / models:
- Cavity: $q_{-i}(f) \propto q(f) / t_i(f)$
- Tilted: $\hat p_i(f) \propto q_{-i}(f)\, p(y_i \mid f_i)$
- Site update: choose $t_i$ such that $\mathrm{KL}\big(\hat p_i \,\|\, q_{-i}\, t_i\big)$ is minimized → moment match
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 6.17 | Expectation Propagation | P advanced_gp_ep, G expectation_propagation | 🌉 | DUP |
| 6.18 | EP cavity & tilted moments mechanics | — | 🧱 | GAP — api: cavity_distribution, ep_tilted_moments; gh:gaussx#24 |
6.F — Bayesian linear regression & non-standard outputs
Key equations / models:
- BLR posterior: $\Sigma_N = \big(\Sigma_0^{-1} + \sigma^{-2} X^\top X\big)^{-1}$, $\mu_N = \Sigma_N \big(\Sigma_0^{-1}\mu_0 + \sigma^{-2} X^\top y\big)$
- Sequential update via Sherman–Morrison: rank-1 covariance update on each new observation
- Log-Gaussian Cox Process: $\lambda(x) = e^{f(x)}$, $f \sim \mathcal{GP}$, observations from a Poisson process
- Warped GP (Snelson 2003): model $g(y) \sim \mathcal{GP}$ with monotone bijection $g$, transformed likelihood
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 6.19 | Bayesian linear regression updates | — | 🧱 🔁 | GAP — api: blr_diag_update, blr_full_update |
| 6.20 | Log-Gaussian Cox Process (spatial point-process intensity) | — | 🔬 | GAP — dd:examples/gp/moments.md |
| 6.21 | Warped GP (Box–Cox for skewed targets) | — | 🧱 | GAP — dd:examples/gp/moments.md |
| 6.24 | Warped GP with normalizing flows — learnable bijection extends Box–Cox to NF-parameterized warpings | — | 🔬 | GAP |
6.G — Aggregate Bayesian methods
Key equations / models:
- INLA (Rue et al. 2009): $p(\theta \mid y) \propto \dfrac{p(y, f, \theta)}{\tilde p_G(f \mid \theta, y)}\Big|_{f = \hat f(\theta)}$ at the Laplace mode
- Numerical integration over θ on a grid, marginal posteriors of the latent field
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 6.22 | R-INLA port — integrated nested Laplace approximation | — | 🌉 | GAP — gh:gaussx#155 |
Part 7 — Spectral GPs
Spectral methods unify several threads from earlier parts: stationary kernels via Bochner’s theorem (extending 2.A), spectral kernel families (former 2.B), random-feature approximations (former 5.A), and spectral inducing-feature methods (subset of former 5.C). This part collects them with a coherent through-line: every stationary kernel is the Fourier transform of a positive measure, and every approximation in this part is a clever way of sampling, parameterizing, or projecting that measure.
7.A — Spectral foundations
Key equations / models:
- Bochner’s theorem: $k(x - x') = \int e^{i\omega^\top (x - x')}\, d\mu(\omega)$ — stationary $k$ ↔ finite measure $\mu$ (the power spectral density).
- Wiener–Khinchin: covariance ↔ spectral density via Fourier transform; Toeplitz–circulant duality on a regular grid (pairs with 1.4).
- Karhunen–Loève expansion: $f(x) = \sum_j \sqrt{\lambda_j}\, \xi_j\, \phi_j(x)$, $\xi_j \sim \mathcal N(0, 1)$, $(\lambda_j, \phi_j)$ eigenpairs of the kernel integral operator.
- Mercer’s theorem: $k(x, x') = \sum_j \lambda_j\, \phi_j(x)\, \phi_j(x')$, basis for the spectral representations in 7.D and 7.E.
- Sampling theorem connection: bandlimited prior + grid spacing $\Delta \le \pi / \omega_{\max}$ → no aliasing in Toeplitz / FFT matvec.
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 7.1 | Bochner’s theorem — stationary kernels as Fourier transforms of spectral densities | — | 🧱 | GAP — pedagogical anchor for the rest of Part 7 |
| 7.2 | Wiener–Khinchin & power spectral density estimation — covariance ↔ spectrum, periodogram, Welch | — | 🧱 | GAP — connects 1.4 Toeplitz–FFT machinery to spectrum estimation |
| 7.3 | Karhunen–Loève expansion — eigenpairs of the kernel integral operator | — | 🧱 | GAP — bridge to Mercer / HSGP |
| 7.4 | Sampling theorem & aliasing — bandlimited priors on regular grids | — | 🧱 | GAP — practical guidance for Toeplitz/Kronecker setups |
7.B — Spectral kernel models
Key equations / models:
- Spectral mixture (Wilson & Adams 2013): $S(\omega) = \sum_q w_q\, \mathcal N(\omega \mid \mu_q, \sigma_q^2)$ → $k(\tau) = \sum_q w_q\, e^{-2\pi^2 \sigma_q^2 \tau^2} \cos(2\pi \mu_q \tau)$ — closed form via inverse FT.
- Spectral mixture product (multi-D): outer product of 1-D SM along each axis.
- Sparse spectrum kernel (Lázaro-Gredilla 2010): finite mixture of point masses in frequency space, $k(\tau) = \tfrac{\sigma^2}{M} \sum_m \cos(2\pi s_m^\top \tau)$.
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 7.5 | Spectral kernels — visual guide | P spectral_kernel_models | 🧱 🔁 | moved from 2.5 |
| 7.6 | Spectral Mixture (SM) kernel fitting — auto-discover periodicity from data (Wilson & Adams 2013) | — | 🔬 | GAP — visualise learned spectral components; moved from 2.14 |
| 7.7 | Sparse spectrum kernel — point-mass spectral approximation | — | 🧱 | GAP |
7.C — Random Fourier features
Key equations / models:
- RFF (Rahimi & Recht 2007): $\phi(x) = \sqrt{2/M}\, \cos(\Omega x + b)$, rows $\omega_m \sim S(\omega)$, $k(x, x') \approx \phi(x)^\top \phi(x')$.
- SSGP / VSSGP: Bayesian linear regression in RFF space; hierarchical / variational priors over ω.
- Nyström: $K \approx K_{nm} K_{mm}^{-1} K_{mn}$, rank-$m$ approximation; data-dependent counterpart to RFF.
- FastFood: $\Omega \propto S H G \Pi H B$ — structured frequency matrix, $O(M \log d)$ matvec.
- Orthogonal random features: orthogonalized frequency directions, uniform on the sphere (variance reduction).
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 7.8 | Random Fourier Features → SSGP → VSSGP | P random_fourier_features | 🧱 🔁 | moved from 5.1 |
| 7.9 | Kernel approximations: Nyström vs RFF | G kernel_approximations, P kernel_approximation | 🧱 | DUP — pick one home; moved from 5.2 |
| 7.10 | FastFood structured random features | — | 🧱 | GAP — gh:gaussx#62; moved from 5.3 |
| 7.11 | Orthogonal random features (ORF) — variance reduction over vanilla RFF | — | 🧱 | GAP |
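The 7.8 recipe for the RBF case, whose spectral density is Gaussian; a plain-JAX sketch where the lengthscale and feature count are illustrative:

```python
import jax
import jax.numpy as jnp

def rff_features(key, X, num_features, ell=1.0):
    # RBF spectral density is N(0, ell^{-2} I): sample Omega, phase b.
    d = X.shape[1]
    k_omega, k_b = jax.random.split(key)
    Omega = jax.random.normal(k_omega, (d, num_features)) / ell
    b = jax.random.uniform(k_b, (num_features,), maxval=2.0 * jnp.pi)
    return jnp.sqrt(2.0 / num_features) * jnp.cos(X @ Omega + b)

# phi(x)^T phi(x') -> k(x, x') as num_features grows
X = jax.random.normal(jax.random.PRNGKey(1), (5, 3))
phi = rff_features(jax.random.PRNGKey(0), X, 2048)
K_approx = phi @ phi.T
```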
7.D — Hilbert-space methods
Key equations / models:
- Hilbert-space GP (Solin & Särkkä 2020): on a bounded domain Ω with Dirichlet boundary, Laplacian eigenpairs give $k(x, x') \approx \sum_{j=1}^m S\big(\sqrt{\lambda_j}\big)\, \phi_j(x)\, \phi_j(x')$ — diagonalized prior, linear-in-$n$ inference.
- Periodic kernel via truncated Fourier series: orthonormal basis on the circle.
- Convergence of HSGP to exact GP as $m \to \infty$ for stationary kernels (Mercer rate).
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 7.12 | Hilbert-space GP (HSGP, Solin–Särkkä) — Laplace-eigenbasis spectral approximation | — | 🧱 | GAP — heavy use in Bayesian time-series / NumPyro contrib |
| 7.13 | Periodic kernel via truncated Fourier basis — exact spectral representation | — | 🧱 | GAP |
| 7.14 | HSGP convergence diagnostics — error vs number of basis functions $m$, boundary-effect calibration | — | 🧱 | GAP |
7.E — Variational spectral methods
Key equations / models:
- Variational Fourier Features (Hensman 2018): Fourier projections of $f$ on $[a, b]$ → structured (diagonal-plus-low-rank) $K_{uu}$, closed-form $K_{uf}$.
- Inter-domain spectral inducing variables: choose basis to be eigenfunctions of the kernel operator → diagonalized $K_{uu}$.
- Variational Sparse Spectrum GP (VSSGP, Gal & Turner 2015): variational distribution over RFF frequencies.
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 7.15 | Variational Fourier Features (VFF, Hensman 2018) — Fourier basis on bounded intervals, structured $K_{uu}$ | — | 🧱 | GAP — moved from 5.12; gh:pyrox#49, dd:features/gp/inducing_features.md |
| 7.16 | Laplacian-eigenfunction inducing features (manifolds, graphs) | — | 🧱 | GAP — moved from 5.13; gh:pyrox#49 |
| 7.17 | VSSGP — variational distribution over RFF frequencies (Gal & Turner 2015) | — | 🔬 | GAP — natural next step after 7.8 |
7.F — Spherical / periodic spectral
Key equations / models:
- Spherical harmonics on $\mathbb S^{d-1}$: orthonormal eigenfunctions of the Laplacian; zonal kernels $k(x, x') = \sum_\ell a_\ell \sum_m Y_{\ell m}(x)\, Y_{\ell m}(x')$.
- Funk–Hecke: convolution of a zonal function with the spherical-harmonic basis is diagonal in $\ell$.
- Variational Inducing Spherical Harmonics (VISH, Dutordoir 2020): inducing variables = spherical-harmonic projections; diagonal $K_{uu}$.
- Spherical Slepian: solve the concentration eigenproblem to maximize energy in a region $R \subset \mathbb S^2$.
- Fourier features on the torus / circle: periodic boundary conditions → exact spectral kernel.
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 7.18 | VISH — Variational Inducing Spherical Harmonics | R pyroxgp/03_svgp_spherical_harmonics | 🔬 | moved from 5.11; gh:pyrox#49 |
| 7.19 | Slepian positional encodings (spherical, localized) | — | 🧱 🔁 | moved from 2.10; GAP — gh:pyrox#125 |
| 7.20 | Fourier features on the torus / circle — periodic BC, exact spectral kernel | — | 🧱 | GAP |
| 7.21 | Zonal spectral kernels on the sphere — Funk–Hecke diagonalization | — | 🧱 | GAP |
Part 8 — Markov / State-Space GPs
8.A — Foundations
Key equations / models:
- Discrete SSM: $x_{k+1} = A x_k + q_k$, $q_k \sim \mathcal N(0, Q)$, $y_k = H x_k + r_k$, $r_k \sim \mathcal N(0, R)$
- Kalman predict: $m^-_k = A m_{k-1}$, $P^-_k = A P_{k-1} A^\top + Q$
- Kalman update: $K = P^- H^\top (H P^- H^\top + R)^{-1}$, $m = m^- + K(y - H m^-)$, $P = (I - KH)P^-$
- RTS smoother: backward gain $G_k = P_k A^\top (P^-_{k+1})^{-1}$
- Joseph form: $P = (I - KH) P^- (I - KH)^\top + K R K^\top$ (numerically stable)
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 8.1 | Kalman filter + RTS smoother (pure SSM) | G kalman_filter | 🧱 | |
| 8.2 | SSM ↔ natural / expectation parameterizations | — | 🧱 | GAP — api: ssm_to_naturals, naturals_to_ssm, expectations_to_ssm |
| 8.3 | Pairwise marginals & sites | — | 🧱 | GAP — api: pairwise_marginals, GaussianSites, sites_to_precision |
| 8.4 | SDE autocovariance & process noise | — | 🧱 | GAP — api: sde_autocovariance, process_noise_covariance |
| 8.5 | Joseph-form Kalman update standalone | — | 🧱 | GAP |
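One 8.A predict/update cycle, Joseph form included, as a self-contained sketch (shapes and variable names are illustrative, not the gaussx _ssm API):

```python
import jax.numpy as jnp

def kalman_step(m, P, y, A, Q, H, R):
    # Predict
    m_pred = A @ m
    P_pred = A @ P @ A.T + Q
    # Update
    S = H @ P_pred @ H.T + R                      # innovation covariance
    K = jnp.linalg.solve(S, H @ P_pred).T         # Kalman gain P H^T S^{-1}
    m_new = m_pred + K @ (y - H @ m_pred)
    I_KH = jnp.eye(P.shape[0]) - K @ H
    P_new = I_KH @ P_pred @ I_KH.T + K @ R @ K.T  # Joseph form (PSD-safe)
    return m_new, P_new
```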
8.B — SDE kernel zoo
Key equations / models:
- LTI SDE: $dx = F x\, dt + L\, d\beta$, observation $f(t) = H x(t)$
- Kernel ↔ SDE map (Hartikainen & Särkkä 2010): Matérn-3/2 → 2-D state; Matérn-5/2 → 3-D state
- Discretization: $A = e^{F \Delta t}$, $Q$ via the Lyapunov equation
- Periodic: truncated Fourier, block-diagonal $F$ · QuasiPeriodic: Matérn × Periodic via Kronecker-structured state matrices
- Drift-KL: KL between SDE path measures with shared diffusion, $\tfrac12 \int \mathbb E\,\|b_1(x_t) - b_2(x_t)\|^2_{\Sigma^{-1}}\, dt$ (Girsanov)
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 8.6 | Matérn kernels in state-space form | P markov_gp_sde_kernels | 🧱 | |
| 8.7 | Full SDE kernel zoo: Periodic, QuasiPeriodic, Cosine, Constant, Sum, Product, Subband Matérn | — | 🧱 | GAP — api: gaussx _ssm SDE kernels |
| 8.8 | SDE linearization & drift-KL helpers | — | 🧱 | GAP — gh:gaussx#70 |
8.C — Markov GP workflows
Key equations / models:
- Marginal log-likelihood: $\log p(y_{1:T}) = \sum_k \log \mathcal N\big(y_k \mid H m^-_k,\, S_k\big)$ from the filter pass
- Hyperparameter learning: gradient through the Kalman filter
- Sparse variational Markov GP: ELBO via filter over inducing time points
- KalmanGuide: pseudo-observations $\tilde y_k$, $\tilde R_k$ → standard Kalman + RTS for the posterior
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 8.9 | Markov GP with Kalman filtering | P markov_gp_kalman | 🧱 | |
| 8.10 | Markov GP hyperparameter training | P markov_gp_training | 🧱 | |
| 8.11 | Non-Gaussian Markov GP | P markov_gp_nongauss | 🧱 | |
| 8.12 | Sparse variational Markov GP | P sparse_markov_gp | 🧱 | |
| 8.13 | KalmanGuide — Bayes-Newton via pseudo-observations + RTS | — | 🧱 | GAP — dd:features/gp/variational_families.md |
8.D — Parallel & scalable filtering
Key equations / models:
- Parallel scan (Särkkä & García-Fernández 2021): associative op on filtering-element pairs, depth $O(\log T)$
- Square-root form: propagate Cholesky factors of $P$ instead of $P$ for numerical stability
- SpInGP: parallel-in-time + sparse (banded) state representation
- Mean-field Kalman: block-diagonal $P$ across independent dimensions
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 8.14 | Parallel / batched Kalman filter | G parallel_kalman | 🌉 | |
| 8.15 | Square-root parallel Kalman filter / RTS | — | 🧱 | GAP — gh:gaussx#165 |
| 8.16 | SpInGP — sparse parallel-in-time GP | — | 🧱 | GAP — api: spingp_log_likelihood, spingp_posterior |
| 8.17 | Mean-field block-diagonal Kalman filter | — | 🧱 | GAP — gh:gaussx#29 |
8.E — Nonlinear filtering
Key equations / models:
- EKF: linearize around $m$ via Jacobian $J = \partial h / \partial x$
- UKF: propagate sigma points through $h$, recompute moments
- CKF: $2d$ cubature points, no tuning parameter
- Innovation cov $S = H P H^\top + R$ as LowRankUpdate when $R$ has structure
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 8.18 | Nonlinear Gaussian Filter (UKF/EKF generalization) | — | 🧱 | GAP — gh:gaussx#161 |
| 8.19 | Extended Kalman Smoother (Taylor(1)) | — | 🧱 | GAP — dd:examples/gp/integration_detail.md |
| 8.20 | Unscented Kalman Smoother (PL + SigmaPoints + Kalman) | — | 🧱 | GAP — dd:examples/gp/integration_detail.md |
| 8.21 | Cubature Kalman Smoother | — | 🧱 | GAP — dd:examples/gp/integration_detail.md |
| 8.22 | Innovation cov as structured LowRankUpdate | — | 🧱 | GAP — gh:gaussx#164 |
8.F — Ensemble methods
Key equations / models:
- EnKF analysis: $x^a_i = x^f_i + \tilde K\big(y + \epsilon_i - H x^f_i\big)$, $\tilde K = \hat P H^\top (H \hat P H^\top + R)^{-1}$
- Bessel-corrected covariance: divide by $N - 1$, not $N$
- Ensemble Kalman gain: low-rank $\hat P$ from ensemble anomalies
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 8.23 | Ensemble Kalman Filter on Lorenz-63 | G ensemble_kalman | 🔬 | |
| 8.24 | Bessel-corrected EnKF + ensemble_kalman_gain | — | 🌉 | GAP — gh:gaussx#127 |
8.G — Steady-state & structured-Gaussian surfaces
Key equations / models:
- DARE: $P = A P A^\top - A P H^\top (H P H^\top + R)^{-1} H P A^\top + Q$ → unique SPD solution
- Infinite-horizon Kalman: solve the DARE once, reuse the steady-state gain
- MarkovGaussian surface: structured Gaussian with the SSM blocks as a PyTree, exposes filter/smoother/sample
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 8.25 | Infinite-horizon Kalman & DARE | — | 🧱 | GAP — api: infinite_horizon_filter/smoother, dare |
| 8.26 | DARE via Optimistix fixed-point + implicit diff | — | 🧱 | GAP — gh:gaussx#97 |
| 8.27 | MarkovGaussian structured surface | — | 🧱 | GAP — gh:gaussx#76 |
| 8.28 | Spatiotemporal SDE GPs | — | 🔬 | GAP |
8.H — Non-conjugate temporal case studies
Key equations / models:
- Laplace + Kalman: iterate site moments via Newton on each $\log p(y_k \mid f_k)$, run filter + smoother per iteration
- EP + Kalman: same loop with EP cavity / tilted moments
- Changepoints (additive): $f = f_{\text{slow}} + f_{\text{fast}}$, Matérn-5/2 + Matérn-1/2
- Streaming filter-only: discard the smoothing pass, keep a fixed window
- Time-varying lengthscale: $\ell(t)$ as a random walk in numpyro.scan
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 8.29 | GP classification with Laplace + Kalman | — | 🧱 | GAP — dd:examples/gp/state_space.md |
| 8.30 | Poisson counts with EP + Kalman | — | 🧱 | GAP — dd:examples/gp/state_space.md |
| 8.31 | Changepoint detection via additive temporal GPs | — | 🌉 | GAP — dd:examples/gp/state_space.md |
| 8.32 | Latent temporal GP in a BHM | — | 🌉 | GAP — dd:examples/gp/state_space.md |
| 8.33 | Online / streaming GP (filter-only mode) | — | 🌉 | GAP — dd:examples/gp/state_space.md |
| 8.34 | Non-LTI temporal model via numpyro.scan | — | 🧱 | GAP — dd:examples/gp/state_space.md |
Part 9 — Sampling, Pathwise, Conditioning
9.A — Pathwise sampling
Key equations / models:
- Pathwise (Wilson 2020): $f_* = f_{\text{prior}}(x_*) + K_{*f}(K_{ff} + \sigma^2 I)^{-1}\big(y - f_{\text{prior}}(X) - \varepsilon\big)$ with $\varepsilon \sim \mathcal N(0, \sigma^2 I)$
- Decoupled SVGP sampling: parametric prior + non-parametric update from inducing variables only
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 9.1 | Pathwise GP posterior sampling (Wilson 2020) | P gp_pathwise | 🧱 | |
| 9.2 | Pathwise sampling with NumPyro | P gp_pathwise_numpyro | 🧱 | |
| 9.3 | Decoupled sampling for SVGP | — | 🧱 | GAP |
9.B — Matheron’s-rule conditioning
Key equations / models:
- Matheron’s rule: $(f \mid y) \stackrel{d}{=} f + K_{\cdot f} K_{ff}^{-1}\big(y - f(X)\big)$ where the $f$ are independent prior samples
- Partitioned joint: factor the joint covariance into blocks $\Sigma_{11}, \Sigma_{12}, \Sigma_{22}$ with a shared Cholesky
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 9.4 | Matheron’s-rule conditioning by sampling | — | 🧱 | GAP — gh:gaussx#77 |
| 9.5 | Partitioned joint conditional sampling | — | 🧱 | GAP — gh:gaussx#79 |
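Matheron's rule from 9.B as a sketch: draw joint prior samples once, then apply the pathwise correction per sample; noise-free conditioning is assumed for brevity:

```python
import jax
import jax.numpy as jnp

def matheron_condition(key, mu, Sigma, idx_lat, idx_obs, y_obs, n_samples):
    # Joint prior draws over all coordinates (latent + observed).
    L = jnp.linalg.cholesky(Sigma + 1e-8 * jnp.eye(Sigma.shape[0]))
    eps = jax.random.normal(key, (n_samples, mu.shape[0]))
    prior = mu + eps @ L.T
    K_lo = Sigma[jnp.ix_(idx_lat, idx_obs)]
    K_oo = Sigma[jnp.ix_(idx_obs, idx_obs)]
    # f_lat | y = f_lat + K_lo K_oo^{-1} (y - f_obs), per sample
    corr = jnp.linalg.solve(K_oo, (y_obs - prior[:, idx_obs]).T).T
    return prior[:, idx_lat] + corr @ K_lo.T
```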
Part 10 — Uncertainty Propagation & UQ
10.A — Foundations
Key equations / models:
- Moment matching: match $\mathbb E[g(x)]$ and $\operatorname{Cov}[g(x)]$ for $x \sim \mathcal N(\mu, \Sigma)$
- Linearization (Taylor-1): $g(x) \approx g(\mu) + J(x - \mu)$ → output covariance $J \Sigma J^\top$
- Unscented: $2d + 1$ sigma points
- Gauss–Hermite ELL: $\mathbb E_q[\log p(y \mid f)]$ via quadrature
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 10.1 | Moment matching, unscented transform, linearization | — | 🧱 | GAP |
| 10.2 | Gauss–Hermite quadrature for ELL | — | 🧱 | GAP — used in R kronecker series |
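A sketch of the 10.A unscented transform with the standard $2d+1$ points; the (alpha, beta, kappa) defaults below follow common UT conventions and are assumptions, not a library default:

```python
import jax
import jax.numpy as jnp

def unscented_moments(g, mu, Sigma, alpha=1.0, beta=2.0, kappa=0.0):
    d = mu.shape[0]
    lam = alpha**2 * (d + kappa) - d
    L = jnp.linalg.cholesky((d + lam) * Sigma)
    # Center point plus mu +/- columns of the scaled Cholesky factor.
    pts = jnp.vstack([mu[None, :], mu + L.T, mu - L.T])
    wm = jnp.concatenate([jnp.array([lam / (d + lam)]),
                          jnp.full(2 * d, 0.5 / (d + lam))])
    wc = wm.at[0].add(1.0 - alpha**2 + beta)
    Y = jax.vmap(g)(pts)
    mean = wm @ Y
    cov = (wc[:, None] * (Y - mean)).T @ (Y - mean)
    return mean, cov

mean, cov = unscented_moments(jnp.sin, jnp.array([0.5, -0.2]),
                              0.1 * jnp.eye(2))
```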
10.B — Uncertain inputs
Key equations / models:
- GP at uncertain $x_* \sim \mathcal N(\mu_x, \Sigma_x)$: $m = \mathbb E_{x_*}[\mu(x_*)]$, $v = \mathbb E_{x_*}[\sigma^2(x_*)] + \operatorname{Var}_{x_*}[\mu(x_*)]$
- PILCO chain: iterate moment-matched GP predictions $T$ steps ahead, track $(\mu_t, \Sigma_t)$
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 10.3 | Uncertainty propagation through nonlinear functions | G uncertainty_propagation | 🌉 | |
| 10.4 | GPs with uncertain inputs (PILCO-style) | G uncertain_gp_inputs | 🔬 | |
| 10.5 | Multi-step-ahead PILCO autoregressive forecasting | — | 🔬 | GAP — dd:examples/gp/integration_detail.md |
10.C — Analytic moments
Key equations / models:
- Ψ-statistics for RBF (Titsias & Lawrence 2010): $\psi_0 = \operatorname{tr}\, \mathbb E[K_{ff}]$, $\Psi_1 = \mathbb E[K_{fu}]$, $\Psi_2 = \mathbb E[K_{uf} K_{fu}]$ under $q(X)$
- Closed form for RBF: products of Gaussian integrals with kernel parameters
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 10.6 | Ψ-statistics & exact RBF closed form for uncertain inputs | — | 🧱 | GAP — api: compute_psi_statistics, AnalyticalPsiStatistics |
| 10.7 | Uncertain SVGP / VGP prediction (sigma-point + analytic) | — | 🧱 | GAP — api: uncertain_svgp_predict, uncertain_vgp_predict |
| 10.8 | Cost / mean / gradient expectations under Gaussian inputs | — | 🧱 | GAP — api: cost_expectation, mean_expectation, gradient_expectation |
10.D — BGPLVM
Key equations / models:
- Bayesian GPLVM: $y_n = f(x_n) + \varepsilon$, $x_n \sim \mathcal N(0, I)$, marginalize $X$ via Ψ-statistics
- ELBO uses $\psi_0$, $\Psi_1$, $\Psi_2$
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 10.9 | Bayesian GPLVM with uncertain inputs | — | 🔬 | GAP — api: uncertain_bgplvm_predict |
| 10.12 | GP-LVM (Lawrence 2004) — unsupervised manifold learning vs BGPLVM; Bayesian nonlinear PCA vs PCA/UMAP | — | 🔬 | GAP — distinct from 10.9 (Bayesian extension) |
| 10.13 | Supervised GPLVM — classification via latent GP representation | — | 🔬 | GAP |
10.E — Special integrators & quantiles
Key equations / models:
- Mixture quantile root-find: solve $F(q) = \alpha$ for the mixture CDF $F$ via Brent / Optimistix
- Importance-weighted MC: $\mathbb E_p[g] = \mathbb E_q[g\, p/q]$, $w_i = p(x_i)/q(x_i)$, for rare-event tails
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 10.10 | Mixture-quantile root-finder | — | 🧱 | GAP — gh:gaussx#121 |
| 10.11 | Custom integrator: importance-weighted MC for rare events | — | 🧱 | GAP — dd:examples/gp/integration_detail.md |
| 10.14 | GP quadrature / Bayesian cubature — $\int g(x)\,dx$ with a GP prior on $g$; Bayesian numerical integration | — | 🔬 | GAP |
Part 11 — Probabilistic Programming Integration
11.A — gaussx + NumPyro
Key equations / models:
- `numpyro.factor("gp", log_p)` with `log_p = log_marginal_likelihood`
- Precision-form Gaussian: $\mathcal N(\mu, \Lambda^{-1})$ for sparse Λ (avoids Cholesky of Σ)
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 11.1 | GP regression with NumPyro + gaussx | G numpyro_gp | 🧱 | |
| 11.2 | Bayesian linear regression in precision form | G numpyro_precision | 🧱 🔁 |
11.B — pyrox patterns
Key equations / models:
- Pattern 1 — `eqx.tree_at(model, replace, sampled_values)`
- Pattern 2 — `PyroxModule.pyrox_sample(name, dist)`
- Pattern 3 — `Parameterized` with `register_param` + `set_prior`
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 11.3 | Three-pattern regression masterclass: tree_at / pyrox_sample / Parameterized | P regression_masterclass_treeat, _pyrox_sample, _parameterized | 🧱 🔁 |
11.C — Hierarchical & sampling
Key equations / models:
- Hierarchical: $\theta_g \sim p(\theta \mid \phi)$, $f_g \sim \mathcal{GP}(0, k_{\theta_g})$, shared hyperprior $\phi$
- NUTS: HMC with auto step size + tree-based termination
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 11.4 | Hierarchical / multi-task GPs in NumPyro | — | 🌉 | GAP |
| 11.5 | MCMC for GP hyperparameters (NUTS) | — | 🧱 | GAP |
Part 12 — Ensembles
Key equations / models:
- Ensemble predictive: $p(y_*) = \tfrac1M \sum_{m=1}^M p(y_* \mid \theta_m)$
- vmap over PRNG keys: `jax.vmap(fit)(jax.random.split(key, M))` for $M$ members
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 12.1 | Ensemble primitives — three ways | P ensemble_primitives_tutorial | 🧱 | |
| 12.2 | EnsembleMAP & EnsembleVI runners | P ensemble_runner_tutorial | 🧱 | |
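The Part-12 pattern is essentially one line of JAX; `fit` below is a hypothetical stand-in for any per-member training routine:

```python
import jax

def fit(key):                 # hypothetical member-training stub
    return jax.random.normal(key, (3,))

keys = jax.random.split(jax.random.PRNGKey(0), 8)
params = jax.vmap(fit)(keys)  # leading axis = ensemble members
```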
Part 13 — Data Pipelines
Key equations / models:
- Spatial encoding: lat/lon ↔ Cartesian unit vector on $\mathbb S^2$
- Time encoding: cyclic $(\sin, \cos)$ features
- Standardization: $x \mapsto (x - \mu)/\sigma$ per dimension
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 13.1 | Spatiotemporal preprocessing (geo + time + pandas) | P spatiotemporal_preprocessing | 🧱 | |
| 13.2 | Loading climate data: xarray, zarr, ERA5 | — | 🔬 | GAP |
Part 14 — Applied Case Studies (research_notebook)
14.A — Spatial extremes
Key equations / models:
- GEV CDF: $F(y) = \exp\!\big\{-\big[1 + \xi \tfrac{y - \mu}{\sigma}\big]^{-1/\xi}\big\}$
- Multiplicative model: GEV parameters scaled by a spatial GP factor
- Non-stationary tails: $\mu(s)$, $\sigma(s)$, $\xi(s)$ each as spatial GPs
- Gaussian copula: $C(u) = \Phi_\Sigma\big(\Phi^{-1}(u_1), \dots, \Phi^{-1}(u_d)\big)$ on residuals
- BHM: $y \sim \mathrm{GEV}\big(\mu(s), \sigma(s), \xi(s)\big)$ with GP priors on each parameter
- Time-varying GEV: $\mu(t)$, $\sigma(t)$, $\xi(t)$ as temporal GPs
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 14.1 | Kronecker GP + GEV likelihood (Spain extremes) | R kronecker/01_spain_extremes | 🔬 | |
| 14.2 | Kronecker-multiplicative GP (spatial warming rates) | R kronecker/02_spain_multiplicative | 🔬 | |
| 14.3 | Non-stationary GEV (location-dependent tails) | R kronecker/03_spain_nonstationary | 🔬 | |
| 14.4 | Gaussian copula spatial dependence | R kronecker/04_spain_copula | 🔬 | |
| 14.5 | BHM with GEV + spatial GPs (methane / precipitation extremes) | — | 🔬 | GAP — dd:examples/gp/moments.md |
| 14.6 | Temporal extremes: GEV with time-varying μ(t), σ(t), ξ(t) | — | 🔬 | GAP — dd:examples/gp/state_space.md |
14.B — SVGP applied
Key equations / models:
- Mini-batch ELBO: $\mathcal L \approx \tfrac{N}{|B|} \sum_{i \in B} \mathbb E_{q(f_i)}[\log p(y_i \mid f_i)] - \mathrm{KL}\big(q(u)\,\|\,p(u)\big)$
- Inter-domain SVGP: spherical-harmonic inducing projections on real climate data
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 14.7 | SVGP on real climate data (large-N) | R pyroxgp/01–04 | 🔬 | could split: standard / batched / SH / deep-kernel |
14.C — Geophysics & emulation
Key equations / models:
- GP emulator: $\hat g \sim \mathcal{GP}$ trained on simulator input–output pairs
- somax composition: GP prior on diffusivity field feeds ocean PDE
- DA composition: learn dynamics as GP, plug into EnKF / 4D-Var
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 14.8 | GP for ocean / SST / sea-level extremes | — | 🔬 | GAP |
| 14.9 | GP emulator for a numerical model | — | 🔬 | GAP |
| 14.10 | GP + somax — spatially smooth GP priors for ocean parameters | — | 🔬 | GAP — dd:examples/integration.md |
| 14.11 | GP + ekalmX/vardax — learned GP dynamics for DA | — | 🔬 | GAP — dd:examples/integration.md |
| 14.16 | Multi-fidelity GP (Kennedy & O’Hagan 2000) — fuse cheap (coarse) + expensive (fine) simulators via autoregressive GP | — | 🔬 | GAP |
| 14.17 | ABC-GP emulator — use GP surrogate to bypass expensive likelihood; sample θ via ABC with GP-matched summary statistics | — | 🔬 | GAP |
14.D — Optimization & decision
Key equations / models:
- Expected Improvement: $\mathrm{EI}(x) = \sigma(x)\big[\gamma \Phi(\gamma) + \phi(\gamma)\big]$, $\gamma = \tfrac{f_{\min} - \mu(x)}{\sigma(x)}$
- Thompson sampling via pathwise posterior samples
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 14.12 | Bayesian optimization with GPs (Expected Improvement) | — | 🔬 | GAP — dd:examples/gp/integration_detail.md |
| 14.18 | GP-UCB acquisition — Upper Confidence Bound | — | 🔬 | GAP |
| 14.19 | Probability of Improvement (PI) — simpler BO baseline alongside EI | — | 🔬 | GAP |
| 14.20 | Thompson sampling for BO — use pathwise posterior sampling (connects to 9.1) | — | 🔬 | GAP |
| 14.21 | Multi-objective BO — Pareto front approximation via GP surrogates | — | 🔬 | GAP — relevant for simulator calibration |
| 14.22 | GP for contextual bandits — GP reward model with online UCB/Thompson updates | — | 🔬 | GAP — connects BO and online learning |
| 14.23 | Optimal experimental design — sensor placement via mutual information maximisation | — | 🔬 | GAP |
14.E — Causal & event data
Key equations / models:
- Counterfactual GP: condition on a hypothetical intervention
- Marked TPP: intensity $\lambda(t) = e^{f(t)}$, marks $m_i \sim p(m \mid t_i)$
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 14.13 | Causal inference / counterfactual GPs | — | 🔬 | GAP |
| 14.14 | Marked temporal point process + GP intensity (seismology, methane plumes) | — | 🔬 | GAP — dd:examples/gp/moments.md |
14.F — Practical
Key equations / models:
- Masked likelihood: $\log p(y_{\text{obs}} \mid f) = \sum_{i \in \text{obs}} \log p(y_i \mid f_i)$, automatic imputation of $y_{\text{miss}}$ via the posterior
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 14.15 | Missing data / partial observations with masked likelihood | — | 🌉 | GAP — dd:examples/gp/moments.md |
| 14.24 | Covariate-shift GP — importance-weighted marginal likelihood for train/test distribution mismatch | — | 🔬 | GAP |
Part 15 — Metrics & Calibration
Key equations / models:
- NLPD: $-\tfrac1N \sum_i \log p(y_i \mid x_i, \mathcal D)$ → decomposes into calibration + sharpness
- ECE: $\sum_b \tfrac{n_b}{N}\, \big|\mathrm{acc}(b) - \mathrm{conf}(b)\big|$
- CRPS: $\mathrm{CRPS}(F, y) = \int \big(F(z) - \mathbf 1\{z \ge y\}\big)^2\, dz$; closed form for Gaussian
- Coverage at level $\alpha$: empirical fraction of $y_i$ inside the $\alpha$-interval
- Interval width (sharpness): mean width of the $\alpha$-interval
| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 15.1 | NLPD decomposition: calibration + sharpness | — | 🧱 | GAP — dd:features/gp/metrics.md |
| 15.2 | Expected Calibration Error (ECE) & coverage diagnostics | — | 🧱 | GAP |
| 15.3 | Continuous Ranked Probability Score (CRPS) | — | 🧱 | GAP |
| 15.4 | RMSE / MAE / R² / interval width | — | 🧱 | GAP |
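Two of the Part-15 metrics have tidy closed forms worth sketching — Gaussian CRPS and central-interval coverage (plain JAX; not the dd:features/gp/metrics.md API):

```python
import jax.numpy as jnp
from jax.scipy.stats import norm

def crps_gaussian(y, mu, sigma):
    # Closed form for N(mu, sigma^2): sigma * [z(2Phi(z)-1) + 2phi(z) - 1/sqrt(pi)]
    z = (y - mu) / sigma
    return sigma * (z * (2.0 * norm.cdf(z) - 1.0)
                    + 2.0 * norm.pdf(z) - 1.0 / jnp.sqrt(jnp.pi))

def coverage(y, mu, sigma, alpha=0.95):
    # Fraction of targets inside the central alpha-credible interval.
    half = norm.ppf(0.5 + alpha / 2.0) * sigma
    return jnp.mean((y >= mu - half) & (y <= mu + half))
```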
Summary of dups to reconcile
| Topic | Locations | Suggestion |
|---|---|---|
| Kernel approximations / RFF / Nyström | G kernel_approximations, P kernel_approximation, P random_fourier_features | Keep pyrox as canonical; gaussx version → low-level mechanics |
| Sparse VGP | G sparse_variational_gp, G whitened_svgp, R pyroxgp/01_svgp_standard | research_notebook = applied; gaussx ones = linear-algebra view |
| Expectation Propagation | G expectation_propagation, P advanced_gp_ep | gaussx = mechanics-from-scratch; pyrox = library API |
| Schur / conditioning | G conditional_distributions, G sugar_ops | merge |
| Operator basics | G basics, G operator_zoo | merge |
| Solver strategies | G solver_strategies, G solver_comparison | merge |
Proposed final homes (high-level)
- gaussx/docs/notebooks/ → Parts 0, 1, 9.A (mechanics), small subset of 5 (linear-algebra view), 10.A, 14
- pyrox/docs/notebooks/ → Parts 2, 3, 5 (mostly), 6, 7, 8, 10.B, 11, 12.1
- research_notebook/projects/gaussian_processes/ → Part 4 (applied), 9.B–9.D, 13, plus migrated gaussx fully-fledged items