filterax Tutorial Master List
A reconciled, exhaustive curriculum spanning what currently exists in filterax, gaussx, pipekit, and research_notebook, plus gaps surfaced from the filterax public API, open GitHub issues, and the design_docs/ series under filterax/docs/design_docs/. Goal: the most complete ensemble-DA / data-assimilation tutorial sequence we could ship.
GP / SVI tutorials live in ../../gaussian_processes/TUTORIAL_MASTER_LIST.md. Cross-listed items (state-space GPs, ensemble VI, structured Gaussians, sigma points, shrinkage) are flagged 🔗.

Legend – Source columns:

- F = exists in filterax (docs/notebooks/<name>)
- G = exists in gaussx (docs/notebooks/<name>)
- K = exists in pipekit (docs/notebooks/<name>)
- R = exists in research_notebook (projects/assimilation/notebooks/<path>)
- – = does not exist yet (gap)

Scope tag: 🧱 fundamental · 🔬 research · 🌉 bridge · 🔗 cross-listed (GP master list)

Refs column: gh#N = open GitHub issue · dd:path = filterax docs/design_docs/<path> · api:foo = filterax exported symbol.
Curriculum at a glance

- Part 0 – Bayesian Filtering & DA Foundations
  - 0.A – The filtering problem
  - 0.B – Linear-Gaussian Kalman filter
  - 0.C – The forecast → analysis → inflate cycle
  - 0.D – Variational DA (3D/4D-Var) contrast
  - 0.E – Information vs covariance form
- Part 1 – Layer 0 Primitives
  - 1.A – Ensemble statistics
  - 1.B – Gain & innovation
  - 1.C – Likelihood & innovation statistics
  - 1.D – Perturbations
  - 1.E – Localisation kernels
  - 1.F – Inflation primitives
  - 1.G – Patches & domain decomposition
- Part 2 – Layer 1 Sequential Filters
  - 2.A – Stochastic / perturbed-observation
  - 2.B – Deterministic square-root family
  - 2.C – Localised
  - 2.D – Symmetry-breaking variants
  - 2.E – Parametric (non-ensemble)
  - 2.F – Selection guide
- Part 3 – Layer 2 Forecast-Analysis Loops
  - 3.A – Protocols & extension points
  - 3.B – L2 model walkthroughs
  - 3.C – Inflator integration
  - 3.D – Stochastic key handling
- Part 4 – Backward-Pass Smoothers
  - 4.A – Sequential smoothers
  - 4.B – Square-root smoothers
  - 4.C – Iterative smoothers
  - 4.D – Selection & memory trade-offs
- Part 5 – Ensemble Kalman Processes (Inversion & Sampling)
  - 5.A – Inversion (EKI family)
  - 5.B – Sampling (EKS family)
  - 5.C – Parametric (UKI)
  - 5.D – Regularised / sparse
  - 5.E – Schedulers
- Part 6 – Localisation, Inflation, Calibration
  - 6.A – Why localisation
  - 6.B – Why inflation
  - 6.C – Adaptive variants
  - 6.D – Shrinkage estimators
- Part 7 – Diagnostics & Verification
  - 7.A – Basic spread / RMSE / rank
  - 7.B – Innovation diagnostics
  - 7.C – Reliability & sharpness
  - 7.D – Predictive likelihood
- Part 8 – Differentiable DA
  - 8.A – Theory & gradient stability
  - 8.B – differentiable_assimilate mechanics
  - 8.C – Training patterns
  - 8.D – Memory & remat
  - 8.E – Loss zoo
- Part 9 – optax Integration
  - 9.A – Process transforms
  - 9.B – Composition with optax chains
  - 9.C – Hybrid SGD + EKI
- Part 10 – Sequential Variational Inference
  - 10.A – Foundations
  - 10.B – Particle filters & SMC
  - 10.C – Variational SMC
  - 10.D – Ensemble VI for SSMs
  - 10.E – Amortised / streaming inference
  - 10.F – Sequential VB comparison
- Part 11 – Ecosystem Integrations
  - 11.A – gaussx (structured covariances)
  - 11.B – pipekit (orchestration)
  - 11.C – somax (SDE dynamics)
  - 11.D – geo_toolz / xr_assimilate (xarray)
  - 11.E – plumax (Tier IV)
- Part 12 – Applied Case Studies (research_notebook)
  - 12.A – Canonical DA benchmarks
  - 12.B – Atmospheric & remote sensing
  - 12.C – Inverse problems
  - 12.D – Online / streaming
- Part 13 – Reference Surfaces (Zoo)
  - 13.A – Continuous-time
  - 13.B – Toy dynamical systems
  - 13.C – Hybrid Var-EnKF

Part 0 – Bayesian Filtering & DA Foundations

0.A – The filtering problem

Key equations / models:
- Prior, dynamics, likelihood factorisation: $p(x_{0:T}, y_{1:T}) = p(x_0)\prod_t p(x_t \mid x_{t-1})\,p(y_t \mid x_t)$
- Filtering target: $p(x_t \mid y_{1:t})$
- Smoothing target: $p(x_t \mid y_{1:T})$, $T > t$
- Forecasting: $p(x_{t+h} \mid y_{1:t})$
- Chapman-Kolmogorov: $p(x_t \mid y_{1:t-1}) = \int p(x_t \mid x_{t-1})\, p(x_{t-1} \mid y_{1:t-1})\,dx_{t-1}$

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 0.1 | The filtering problem from scratch – joint factorisation, filtering vs smoothing vs forecasting | – | 🧱 | pedagogical entry – graphical-model diagram, three target densities, recursion identity; sets the language for Parts 1–4 |
| 0.2 | Sequential Bayesian inference as natural-form addition | – | 🧱 🔗 | mirrors GP 0.6 / 0.4; conjugate update $\eta_{t+1} = \eta_t + H^\top R^{-1} y_t$; batch = sequential = any order |
0.B – Linear-Gaussian Kalman filter

Key equations / models:

- Forecast: $\bar x^f_t = M_t \bar x^a_{t-1}$, $P^f_t = M_t P^a_{t-1} M_t^\top + Q_t$
- Analysis: $K_t = P^f_t H_t^\top S_t^{-1}$, $\bar x^a_t = \bar x^f_t + K_t(y_t - H_t \bar x^f_t)$, $P^a_t = (I - K_t H_t) P^f_t$
- Joseph form: $P^a_t = (I - K_t H_t)\, P^f_t\, (I - K_t H_t)^\top + K_t R_t K_t^\top$
- Log-marginal: $\log p(y_t \mid y_{1:t-1}) = -\tfrac{1}{2}\bigl[N_y \log 2\pi + \log|S_t| + v_t^\top S_t^{-1} v_t\bigr]$

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 0.3 | Kalman filter from scratch – derivation, closed-form recursion, log-marginal likelihood | – | 🧱 | uses SquareRootKF for parametric ground truth; visual covariance ellipses; six-equation cheat sheet |
| 0.4 | Joseph-form covariance update – float32 stress test, PSD preservation | – | 🧱 🔗 | mirrors GP 0.5; four equivalent forms (standard / symmetric / information / Joseph) with PSD checks |
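The analysis equations above can be checked numerically with a minimal NumPy sketch. It assumes the standard definitions $v_t = y_t - H_t \bar x^f_t$ and $S_t = H_t P^f_t H_t^\top + R_t$, which the list uses but does not spell out. `kalman_analysis` and the toy numbers are hypothetical illustrations, not filterax API.

```python
import numpy as np


def kalman_analysis(x_f, P_f, y, H, R):
    """One linear-Gaussian analysis step (hypothetical helper, not filterax API)."""
    v = y - H @ x_f                          # innovation v_t
    S = H @ P_f @ H.T + R                    # innovation covariance S_t (assumed definition)
    K = np.linalg.solve(S, H @ P_f).T        # K = P_f H^T S^{-1}, using symmetry of S and P_f
    x_a = x_f + K @ v                        # analysis mean
    I_KH = np.eye(x_f.size) - K @ H
    P_a = I_KH @ P_f @ I_KH.T + K @ R @ K.T  # Joseph form
    _, logdet_S = np.linalg.slogdet(S)
    log_p = -0.5 * (y.size * np.log(2.0 * np.pi) + logdet_S + v @ np.linalg.solve(S, v))
    return x_a, P_a, log_p


# Toy problem: two states, one observation of the first state.
x_f = np.array([0.0, 1.0])
P_f = np.array([[2.0, 0.5], [0.5, 1.0]])
H = np.array([[1.0, 0.0]])
R = np.array([[0.5]])
y = np.array([1.0])

x_a, P_a, log_p = kalman_analysis(x_f, P_f, y, H, R)
```

In exact arithmetic the Joseph form agrees with $(I - K_t H_t) P^f_t$; its advantage is that it stays symmetric and positive semi-definite under rounding, which is what the float32 stress test in 0.4 is meant to expose.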
0.C – The forecast → analysis → inflate cycle

Key equations / models:

- Cycle structure: forecast (apply dynamics) → analysis (assimilate obs) → inflate (counteract sample collapse)
- Ensemble representation: $X \in \mathbb{R}^{N_e \times N_x}$, rows = members
- Sample covariance: $P_e = (N_e - 1)^{-1} (X - \bar X)^\top (X - \bar X)$, rank $\le N_e - 1$

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 0.5 | Anatomy of one DA cycle – forecast / analysis / inflate, why each step exists | – | 🧱 | dd:architecture.md; three-panel diagram (prior cloud → posterior cloud → inflated cloud) |
| 0.6 | Why ensembles? – sample-covariance limits, rank $\le N_e - 1$, when ensemble beats parametric | – | 🧱 | eigenvalue-spectrum plot vs $N_e$; rank deficit and the null direction visualised |
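The rank bound on the sample covariance is easy to see numerically. The following is a small NumPy sketch with arbitrary, made-up sizes ($N_e = 5$, $N_x = 12$); it uses plain NumPy rather than any filterax primitive.

```python
import numpy as np

rng = np.random.default_rng(0)
N_e, N_x = 5, 12                         # fewer members than state dimensions
X = rng.normal(size=(N_e, N_x))          # rows = members

A = X - X.mean(axis=0)                   # anomalies: subtract the ensemble mean from every row
P_e = A.T @ A / (N_e - 1)                # (N_x, N_x) Bessel-corrected sample covariance

rank = np.linalg.matrix_rank(P_e)        # at most N_e - 1, because the anomaly rows sum to zero
eigvals = np.linalg.eigvalsh(P_e)[::-1]  # descending; only the leading N_e - 1 are non-negligible
```

Plotting `eigvals` for several values of `N_e` gives the eigenvalue-spectrum picture that tutorial 0.6 describes.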
0.D – Variational DA (3D/4D-Var) contrast

Key equations / models:

- 3D-Var cost: $J(x) = \tfrac{1}{2}(x - x^b)^\top B^{-1}(x - x^b) + \tfrac{1}{2}(y - H x)^\top R^{-1}(y - H x)$
- 4D-Var window: $J(x_0) = \tfrac{1}{2}\|x_0 - x^b_0\|^2_{B^{-1}} + \sum_t \tfrac{1}{2}\|y_t - H_t M_{1:t} x_0\|^2_{R_t^{-1}}$
- Adjoint vs autodiff: see Part 8

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 0.7 | 3D-Var vs Kalman – duality (Kalman = sequential 3D-Var with $B = P^f$) | – | 🧱 🌉 | pairs with R projects/plume_simulation/notebooks/assimilation/00_3dvar_derivation.md; minimisation = closed-form same answer |
| 0.8 | 4D-Var with adjoints – and how differentiable EnKF (Part 8) compares | – | 🧱 🌉 | dd:features/differentiable_da.md §4; cost / memory / Jacobian comparison table |
0.E – Information vs covariance form

Key equations / models:
- Information matrix $\Lambda = \Sigma^{-1}$, information vector $\eta = \Sigma^{-1} \mu$
- Conjugate update in natural form: $\eta_{t+1} = \eta_t + H^\top R^{-1} y_t$, $\Lambda_{t+1} = \Lambda_t + H^\top R^{-1} H$
- When to prefer information form: GMRF priors, banded Λ, sequential-by-obs updates

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 0.9 | Information vs covariance form – when each wins, conjugate-update identities | – | 🧱 🔗 | mirrors GP 0.4; sparsity-of-Λ vs density-of-Σ table; round-trip cost diagram |
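Because the natural-form update is pure addition, the order in which observation batches arrive cannot matter. The sketch below illustrates this with NumPy; `info_update`, `assimilate`, and the random batches are hypothetical stand-ins, and the prior $\mathcal{N}(0, I)$ is an arbitrary choice.

```python
import numpy as np


def info_update(eta, Lam, H, R, y):
    """Natural-form conjugate update: add H^T R^{-1} y and H^T R^{-1} H."""
    Rinv_H = np.linalg.solve(R, H)                # R^{-1} H (R symmetric)
    return eta + Rinv_H.T @ y, Lam + H.T @ Rinv_H


rng = np.random.default_rng(1)
N_x = 3
eta0, Lam0 = np.zeros(N_x), np.eye(N_x)           # prior N(0, I) in natural parameters

# Four made-up observation batches (H, R, y), each holding two observations.
batches = [
    (rng.normal(size=(2, N_x)), np.diag(rng.uniform(0.5, 2.0, size=2)), rng.normal(size=2))
    for _ in range(4)
]


def assimilate(order):
    """Apply the batches in the given order and return the final (eta, Lambda)."""
    eta, Lam = eta0, Lam0
    for k in order:
        eta, Lam = info_update(eta, Lam, *batches[k])
    return eta, Lam


eta_fwd, Lam_fwd = assimilate([0, 1, 2, 3])
eta_rev, Lam_rev = assimilate([3, 2, 1, 0])
mu_post = np.linalg.solve(Lam_fwd, eta_fwd)       # back to the mean: mu = Lambda^{-1} eta
```

The only solve happens at the end, when converting back to $\mu$; this is the round-trip cost that the 0.9 comparison table refers to.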
Part 1 – Layer 0 Primitives

filterax's pure-function building blocks. Every L1 / L2 algorithm composes from these.

1.A – Ensemble statistics

Key equations / models:

- $\bar x = N_e^{-1} \sum_j x^{(j)}$
- $X' = X - \mathbf{1}\bar x^\top$ (the member rows sum to the zero vector)
- $P_e = (N_e - 1)^{-1} X'^\top X'$ – Bessel-corrected; returned as gaussx.LowRankUpdate
- Cross-cov $C^{xH} = (N_e - 1)^{-1} X'^\top (HX)'$

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 1.1 | Ensemble mean, anomalies, sample covariance – the three primitives every filter starts from | – | 🧱 | api: ensemble_mean, ensemble_anomalies, ensemble_covariance; rank-deficit visualised; equivalence $P_e = X'^\top X' / (N_e-1)$ via QR / SVD |
| 1.2 | Cross-covariance for nonlinear $H$ – implicit derivative-free linearisation | – | 🧱 | api: cross_covariance; sanity check against finite-difference Jacobian; identity $C^{xH} = P_e H^\top$ for linear $H$ |
| 1.3 | Low-rank covariance as a structured operator – Woodbury identity preview | – | 🧱 🔗 | bridges to GP 1.B / 1.10; api: gaussx.LowRankUpdate; solve / logdet routing diagram |
1.B – Gain & innovation

Key equations / models:

- Ensemble Kalman gain: $K = C^{xH}(C^{HH} + R)^{-1}$
- Innovation covariance: $S = C^{HH} + R$ (low-rank update over $R$)
- Innovation: $v = y - \bar H X$

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 1.4 | The ensemble Kalman gain – Bessel correction, Woodbury dispatch for structured $R$ | – | 🧱 | api: kalman_gain; dd:architecture.md; cost table for dense / diagonal / Toeplitz $R$ |
| 1.5 | Innovation covariance & gaussx structural dispatch – diag / low-rank / Toeplitz $R$ | – | 🧱 🔗 | api: innovation_covariance; pairs with GP 1.4 (Toeplitz) and 1.3 (Kronecker) |
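A dense NumPy sketch of the gain formula above, checked against the textbook form $P_e H^\top (H P_e H^\top + R)^{-1}$ for a linear operator. `ensemble_gain` is a hypothetical helper written for this illustration; it is not the filterax `kalman_gain` and does none of the structured-$R$ dispatch mentioned in 1.4.

```python
import numpy as np


def ensemble_gain(X, HX, R):
    """Sample-based gain K = C^{xH} (C^{HH} + R)^{-1}; rows of X and HX are members."""
    N_e = X.shape[0]
    Xp = X - X.mean(axis=0)                  # state anomalies X'
    Yp = HX - HX.mean(axis=0)                # observation-space anomalies (HX)'
    C_xH = Xp.T @ Yp / (N_e - 1)             # (N_x, N_y) cross-covariance
    C_HH = Yp.T @ Yp / (N_e - 1)             # (N_y, N_y) predicted-observation covariance
    S = C_HH + R                             # innovation covariance
    K = np.linalg.solve(S, C_xH.T).T         # C_xH S^{-1}; valid because S is symmetric
    return K, S


rng = np.random.default_rng(2)
N_e, N_x = 20, 4
X = rng.normal(size=(N_e, N_x))
H = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])         # linear operator observing x_0 and x_2
R = 0.25 * np.eye(2)

K, S = ensemble_gain(X, X @ H.T, R)

# For linear H the derivative-free gain coincides with the form built from P_e.
P_e = np.cov(X, rowvar=False)
K_ref = P_e @ H.T @ np.linalg.inv(H @ P_e @ H.T + R)
```

The function only ever sees `HX`, never `H` itself, which is why the same code applies unchanged to a nonlinear observation operator.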
1.C – Likelihood & innovation statistics

Key equations / models:

- $\log p(y \mid \text{forecast}) = -\tfrac{1}{2}[N_y \log 2\pi + \log|S| + v^\top S^{-1} v]$
- InnovationStatistics – packaged $v$, $S$, $\log p$ for diagnostics & training

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 1.6 | Predictive log-likelihood as the universal training signal | – | 🧱 | api: log_likelihood, innovation_statistics, InnovationStatistics; gradient sanity check ($-S^{-1}v$); feeds Parts 7 and 8 |
1.D – Perturbations

Key equations / models:

- $\epsilon^{(j)} \sim \mathcal{N}(0, R)$, structure-aware via gaussx.root_decomposition
- Diagonal-$R$ fast path: $\epsilon^{(j)}_k = z^{(j)}_k \sqrt{R_{kk}}$
- Determinism: identical key → identical draws

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 1.7 | Perturbed observations & R-aware sampling | – | 🧱 | api: perturbed_observations; fast-path vs dense-fallback paths, deterministic-key reproducibility |
1.E – Localisation kernels

Key equations / models:

- Gaspari-Cohn (compact, $C^4$ at origin, support $[0, 2r]$)
- Gaussian taper: $\exp(-d^2/2r^2)$
- SOAR: $(1 + d/r)\exp(-d/r)$
- Hard cutoff (non-differentiable!) & adaptive (Anderson 2007 / 2012)

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 1.8 | Localisation taper zoo – visual + differentiability table | – | 🧱 🔗 | api: gaspari_cohn, gaussian_taper, hard_cutoff, soar_taper; pairs with GP 2.A (kernel zoo); plot $\rho(d/r)$ side-by-side |
| 1.9 | Generic localize(cov, coords, taper_fn) – assembling localised covariances | – | 🧱 | api: localize; ETKF-localized vs LETKF-localized comparison |
| 1.10 | Adaptive localisation (Anderson) – empirical correlation $\to$ taper | – | 🔬 | api: adaptive_localization; dd:features/localization_inflation.md |
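The three smooth tapers listed above can be written down directly. This is a standalone NumPy sketch: `gc_taper`, `gauss_taper`, and `soar` are hypothetical names chosen to avoid confusion with the filterax functions, and the Gaspari-Cohn coefficients are the usual fifth-order piecewise rational form with half-support $r$, which is assumed to be what the library implements.

```python
import numpy as np


def gc_taper(d, r):
    """Gaspari-Cohn fifth-order piecewise rational taper with support [0, 2r]."""
    z = np.abs(np.asarray(d, dtype=float)) / r
    inner = -0.25 * z**5 + 0.5 * z**4 + 0.625 * z**3 - (5.0 / 3.0) * z**2 + 1.0
    z_safe = np.where(z > 0.0, z, 1.0)       # avoids dividing by zero in the branch discarded at z = 0
    outer = (z**5 / 12.0 - 0.5 * z**4 + 0.625 * z**3 + (5.0 / 3.0) * z**2
             - 5.0 * z + 4.0 - 2.0 / (3.0 * z_safe))
    return np.where(z <= 1.0, inner, np.where(z < 2.0, outer, 0.0))


def gauss_taper(d, r):
    """Gaussian taper exp(-d^2 / (2 r^2)); never exactly zero."""
    return np.exp(-np.asarray(d, dtype=float) ** 2 / (2.0 * r**2))


def soar(d, r):
    """Second-order autoregressive taper (1 + d/r) exp(-d/r)."""
    x = np.abs(np.asarray(d, dtype=float)) / r
    return (1.0 + x) * np.exp(-x)


d = np.linspace(0.0, 3.0, 7)                 # distances 0, 0.5, ..., 3 in units of r = 1
table = np.stack([gc_taper(d, 1.0), gauss_taper(d, 1.0), soar(d, 1.0)])
```

Only the Gaspari-Cohn row of `table` reaches exactly zero (from $d = 2r$ onward); the other two merely decay, which matters when a taper is used to select which observations enter a local analysis.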
1.F – Inflation primitives

Key equations / models:

- Multiplicative: $X'_a \leftarrow \lambda X'_a$, posterior cov $\lambda^2 P_a$
- Additive: $X_a \leftarrow X_a + \xi$, $\xi \sim \mathcal{N}(0, Q_\text{add})$
- RTPS (Whitaker-Hamill 2012): $X'_a \leftarrow X'_a [\alpha\, \sigma^f/\sigma^a + (1-\alpha)]$
- RTPP (Zhang 2004): $X'_a \leftarrow \alpha X'_f + (1-\alpha) X'_a$
- Ledoit-Wolf shrinkage: blend sample cov with structured target

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 1.11 | Multiplicative & additive inflation primitives | – | 🧱 | api: inflate_multiplicative, inflate_additive; spread-vs-step diagram |
| 1.12 | Relaxation: RTPS vs RTPP | – | 🧱 | api: inflate_rtps, inflate_rtpp; spread-recovery curves; $\alpha=0$ / $\alpha=1$ limiting cases |
| 1.13 | Adaptive inflation & Ledoit-Wolf shrinkage | – | 🔬 🔗 | api: inflate_adaptive, ledoit_wolf_shrinkage; pairs with GP 0.10 (jitter / shrinkage) |
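The two relaxation schemes above differ only in what they relax towards: RTPP blends the anomalies themselves, RTPS rescales the analysis anomalies to recover spread. A minimal NumPy sketch follows; `rtpp` and `rtps` are hypothetical helpers, and taking $\sigma^f$, $\sigma^a$ as per-state-variable ensemble standard deviations is an assumption, since the list does not say how they are computed.

```python
import numpy as np


def rtpp(X_f, X_a, alpha):
    """Relaxation to prior perturbations: blend forecast and analysis anomalies."""
    A_f = X_f - X_f.mean(axis=0)
    A_a = X_a - X_a.mean(axis=0)
    return X_a.mean(axis=0) + alpha * A_f + (1.0 - alpha) * A_a


def rtps(X_f, X_a, alpha):
    """Relaxation to prior spread: rescale analysis anomalies per state variable."""
    s_f = X_f.std(axis=0, ddof=1)            # forecast spread (assumed per-variable)
    s_a = X_a.std(axis=0, ddof=1)            # analysis spread
    factor = alpha * s_f / s_a + (1.0 - alpha)
    return X_a.mean(axis=0) + (X_a - X_a.mean(axis=0)) * factor


rng = np.random.default_rng(3)
X_f = rng.normal(size=(10, 3))                                  # forecast ensemble, rows = members
X_a = 0.2 + X_f.mean(axis=0) + 0.5 * (X_f - X_f.mean(axis=0))   # toy "analysis": shifted mean, halved spread

X_rtps = rtps(X_f, X_a, alpha=1.0)           # full relaxation: spread restored to the forecast value
X_rtpp = rtpp(X_f, X_a, alpha=0.5)           # half-way blend of the two anomaly sets
```

Both leave the analysis mean untouched, and both reduce to the identity at $\alpha = 0$, which are the limiting cases tutorial 1.12 plots.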
1.G – Patches & domain decomposition

Key equations / models:

- Patch index sets $P_i \subseteq \{1, \ldots, N_x\}$, obs-to-patch assignment, blending for overlapping patches
- Per-patch ETKF analysis assembled with smooth weighting in overlaps

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 1.14 | Domain decomposition – patches for high-dim spatial DA | – | 🔬 | api: create_patches, assign_obs_to_patches, blend_patches; dd:architecture.md; 2D grid example with overlap visualisation |
Part 2 – Layer 1 Sequential Filters

Each filter gets its own tutorial, verified against the closed-form Kalman update on a linear-Gaussian problem; the existing tests/test_filters.py baselines are the template.

2.A – Stochastic / perturbed-observation

Key equations / models:

- $X^a_j = X^f_j + K(y + \epsilon^{(j)} - H X^f_j)$, $\epsilon^{(j)} \sim \mathcal{N}(0, R)$
- Monte Carlo gain noise scales as $1/\sqrt{N_e}$
- Per-window PRNG key: jr.fold_in(base_key, step)

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 2.1 | Stochastic EnKF (Evensen 1994) – perturbed-obs analysis | – | 🧱 | api: filters.StochasticEnKF; dd:features/filters.md; MC-noise vs $N_e$ plot; pairs with 3.9 (key threading) |
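A NumPy sketch of the perturbed-observation update above for a linear $H$. `stochastic_enkf_update` is a hypothetical helper, not `filters.StochasticEnKF`. Seeding one generator per window from the pair `(base_seed, step)` is used here as a NumPy stand-in for the `jr.fold_in(base_key, step)` pattern; the two mechanisms are analogous, not identical.

```python
import numpy as np


def stochastic_enkf_update(X_f, y, H, R, rng):
    """Perturbed-observation analysis for a linear H; rows of X_f are members."""
    N_e = X_f.shape[0]
    Y_f = X_f @ H.T                                        # H applied to every member
    Xp, Yp = X_f - X_f.mean(axis=0), Y_f - Y_f.mean(axis=0)
    S = Yp.T @ Yp / (N_e - 1) + R                          # innovation covariance
    K = np.linalg.solve(S, Yp.T @ Xp / (N_e - 1)).T        # K = C^{xH} S^{-1}
    eps = rng.multivariate_normal(np.zeros(len(y)), R, size=N_e)   # eps^(j) ~ N(0, R)
    X_a = X_f + (y + eps - Y_f) @ K.T                      # member-wise update
    return X_a, K, eps


base_seed = 0
H = np.array([[1.0, 0.0, 0.0]])
R = np.array([[0.1]])
y = np.array([0.7])
X_f = np.random.default_rng(42).normal(size=(50, 3))

# One generator per assimilation window, derived from (base_seed, step).
X_a0, K0, eps0 = stochastic_enkf_update(X_f, y, H, R, np.random.default_rng([base_seed, 0]))
X_a1, K1, eps1 = stochastic_enkf_update(X_f, y, H, R, np.random.default_rng([base_seed, 1]))
```

The gain is identical in both windows because it depends only on the forecast ensemble; the perturbations differ, which is exactly what reusing a single key across windows would fail to deliver.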
2.B – Deterministic square-root family

Key equations / models:

- ETKF transform precision: $\tilde C = (N_e - 1) I + Y' R^{-1} Y'^\top$
- Rank-$N_y$ spectrum trick (no eigh of degenerate $(N_e, N_e)$ matrix) – fixed in #82
- $W_a = \sqrt{(N_e - 1) \tilde C^{-1}}$ applied to anomalies via $g(\tilde C) v = g(N_e-1) v + U_y \mathrm{diag}(g_\Delta) U_y^\top v$

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 2.2 | ETKF (Bishop 2001) – ensemble transform, symmetric sqrt | – | 🧱 | api: filters.ETKF; dd:features/filters.md; rank-$N_y$ spectrum trick walked through; gradient-stability sanity check |
| 2.3 | EnSRF batch form (Whitaker & Hamill 2002) – separate mean & perturbation updates | – | 🧱 | api: filters.EnSRF; equivalence with ETKF in batch mode (Tippett 2003 §3) |
| 2.4 | Serial EnSRF – scalar obs processing, no eigh | – | 🧱 | api: filters.EnSRF_Serial; per-obs scalar gain; diagonal-$R$ requirement |
| 2.5 | ESTKF (Nerger 2012) – $(N_e - 1)$ error subspace, mean-preserving projection | – | 🧱 | api: filters.ESTKF; $L \in \mathbb{R}^{N_e \times (N_e-1)}$ Householder construction; reduced eigh cost |
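A reference ETKF in NumPy, useful for checking the transform against the closed-form Kalman update on a linear problem. This sketch deliberately uses a plain `np.linalg.eigh` on the full $(N_e, N_e)$ matrix, not the rank-$N_y$ spectrum trick named above, so it says nothing about gradient stability. `etkf_analysis` is a hypothetical helper, not `filters.ETKF`.

```python
import numpy as np


def etkf_analysis(X_f, Y_f, y, R):
    """ETKF with a symmetric square root; rows of X_f (states) and Y_f (predicted obs) are members."""
    N_e = X_f.shape[0]
    x_bar, y_bar = X_f.mean(axis=0), Y_f.mean(axis=0)
    Xp, Yp = X_f - x_bar, Y_f - y_bar
    Rinv_YpT = np.linalg.solve(R, Yp.T)                    # R^{-1} Y'^T, shape (N_y, N_e)
    C = (N_e - 1) * np.eye(N_e) + Yp @ Rinv_YpT            # transform precision C~
    lam, U = np.linalg.eigh(C)                             # plain eigh; fine when no gradients are needed
    W_a = (U * np.sqrt((N_e - 1) / lam)) @ U.T             # symmetric sqrt of (N_e - 1) C~^{-1}
    w_bar = (U / lam) @ U.T @ (Rinv_YpT.T @ (y - y_bar))   # C~^{-1} Y' R^{-1} (y - y_bar)
    return x_bar + Xp.T @ w_bar + W_a @ Xp                 # analysis mean plus transformed anomalies


rng = np.random.default_rng(4)
N_e, N_x = 8, 3
X_f = rng.normal(size=(N_e, N_x))
H = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 1.0]])
R = np.diag([0.5, 0.2])
y = np.array([0.3, -0.4])

X_a = etkf_analysis(X_f, X_f @ H.T, y, R)

# Reference: closed-form Kalman update using the sample covariance P_e.
P_e = np.cov(X_f, rowvar=False)
K = P_e @ H.T @ np.linalg.inv(H @ P_e @ H.T + R)
x_ref = X_f.mean(axis=0) + K @ (y - H @ X_f.mean(axis=0))
P_ref = (np.eye(N_x) - K @ H) @ P_e
```

The symmetric square root maps the ones vector to itself, so the transformed anomalies still sum to zero and the ensemble mean is exactly the Kalman mean.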
2.C – Localised

Key equations / models:

- R-localisation (Hunt et al. 2007): scale local $R^{-1}$ by per-obs taper ρ (equivalently, inflate the local $R$)
- Per-grid-point local ETKF, vmapped across $N_x$ points
- Hard cutoff at radius: obs beyond $r$ excluded entirely (Gaspari-Cohn nonzero out to $2r$)

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 2.6 | LETKF (Hunt 2007) – local ETKF with R-localisation | – | 🧱 | api: filters.LETKF; requires diagonal $R$; per-point compute diagram |
| 2.7 | LETKF hard-cutoff vs taper-only – why explicit cutoff matters | – | 🌉 | regression context for test_letkf_hard_cutoff_at_radius; far-obs invariance demo |
2.D – Symmetry-breaking variants

Key equations / models:

- $W_a^\text{rot} = W_a \cdot \Theta$, $\Theta \in O(N_e)$, $\Theta \mathbf{1} = \mathbf{1}$
- Counteracts preferred-direction drift over many cycles

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 2.8 | ETKF_Livings – mean-preserving random rotation | – | 🔬 | api: filters.ETKF_Livings; rotation construction via Householder + random $O(N_e-1)$; cov preserved, ensemble realisations differ |
2.E – Parametric (non-ensemble)

Key equations / models:

- Cholesky-form mean & covariance propagation
- Marginal likelihood from the innovation sequence
- PSD-preserving across $T$ steps by construction

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 2.9 | SquareRootKF – Cholesky-form parametric KF | – | 🧱 🔗 | api: filters.SquareRootKF; ground truth for Part 0.B; pairs with GP 8.D (parametric Kalman) |
2.F – Selection guide

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 2.10 | Filter selection – when each L1 variant wins (table + worked examples) | – | 🧱 | reads like an extended design_docs/decisions.md D-row; deterministic vs stochastic, batch vs serial, localised vs global |
Part 3 – Layer 2 Forecast-Analysis Loops

3.A – Protocols & extension points

Key equations / models:

- AbstractDynamics(state, t0, t1) -> state – vmap over members
- AbstractObsOperator(state) -> obs
- AbstractInflator(particles, forecast, **kwargs) -> particles
- AbstractLocalizer(cov, coords) -> cov
- AbstractScheduler.get_dt(state) for processes

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 3.1 | The protocol family – AbstractDynamics / AbstractObsOperator / AbstractInflator / AbstractLocalizer / AbstractScheduler | – | 🧱 | api: filterax._src._protocols; dd:architecture.md; class diagram with extension points |
| 3.2 | Plugging in a JAX dynamics model | – | 🧱 | identity / linear / Lorenz-63 / SDE wrappers; pure-function rule |
| 3.3 | Plugging in a nonlinear obs operator – neural decoder warm-up for Part 8 | – | 🌉 | dd:features/differentiable_da.md §6.B; equinox-based eqx.nn.MLP example |
3.B – L2 model walkthroughs

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 3.4 | L2 ETKF end-to-end – forecast → analysis → likelihood logging | – | 🧱 | api: ETKF; mirrors test_l2_etkf_assimilate_smoke; AssimilationResult field walkthrough |
| 3.5 | L2 EnSRF & L2 StochasticEnKF – when batch vs perturbed-obs matters | – | 🧱 | api: EnSRF, StochasticEnKF; side-by-side spread plots |
| 3.6 | L2 LETKF with state_coords / obs_coords | – | 🧱 | api: LETKF; 1D-grid worked example with localisation radius sweep |

3.C – Inflator integration

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 3.7 | Adding MultiplicativeInflator / RTPS / RTPP to the L2 loop | – | 🧱 | api: MultiplicativeInflator, RTPS, RTPP; spread-trajectory comparison |
| 3.8 | AdditiveInflator & per-cycle key threading | – | 🔬 | api: AdditiveInflator; jr.fold_in(base_key, step) pattern |

3.D – Stochastic key handling

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 3.9 | jr.fold_in per window – why naive StochasticEnKF repeats draws (regression for test_stochastic_enkf_l2_uses_independent_keys_per_window) | – | 🌉 | api: StochasticEnKF.assimilate; demo of identical-draw bug without folding |
Part 4 – Backward-Pass Smoothers

4.A – Sequential smoothers

Key equations / models:

- Smoother gain: $G_t = C^{af}_{t,t+1}\,(C^{ff}_{t+1})^{-1}$
- Recursion: $X^s_t = X^a_t + G_t(X^s_{t+1} - X^f_{t+1})$
- Dual ensemble-space form: $G_t (X^s_{t+1} - X^f_{t+1}) = D F^\top (F F^\top)^+ A$ – avoids materialising $(N_x, N_x)$ cov

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 4.1 | EnKS (Evensen & van Leeuwen 2000) – standard backward pass | – | 🧱 | api: smoothers.EnKS; dd:features/smoothers.md; backward-scan diagram; final-time identity |
| 4.2 | EnsembleRTS – RTS interpretation, model-error placeholder | – | 🧱 | api: smoothers.EnsembleRTS; equivalence with EnKS without explicit $Q$ |
| 4.3 | FixedLagSmoother – windowed lookahead, online interpretation | – | 🔬 | api: smoothers.FixedLagSmoother; lag=0 / lag=T-1 limits; rolling-buffer interpretation |
4.B – Square-root smoothers

Key equations / models:

- Decompose mean + perturbation; apply symmetric sqrt to anomalies in ensemble space
- $\Lambda = I + K_e (D^\top D - F^\top F) K_e^\top$, $K_e = (F F^\top)^+ F$
- Smoothed perts $X^{s'}_t = \Lambda^{1/2} A_t$ live in the row span of $A_t$

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 4.4 | EnsembleSqrtSmoother – deterministic sqrt backward pass | – | 🔬 | api: smoothers.EnsembleSqrtSmoother; dd:features/smoothers.md Gap 4; perts-in-column-span demo |
4.C – Iterative smoothers

Key equations / models:

- Chen-Oliver IES update with prior anchor: $\theta^j_{i+1} = (1 - \alpha)\,\theta^j_i + \alpha\bigl[\theta^j_0 + K_i(y + \epsilon^{(j)} - G(\theta^j_i))\bigr]$
- $K_i = C^{\theta G}_i (C^{GG}_i + \Gamma_y)^{-1}$ recomputed each iteration

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 4.5 | IES (Chen & Oliver 2013) – iterative ensemble smoother for inverse problems | – | 🔬 | api: smoothers.IES; dd:features/smoothers.md Gap 5; anchor-to-$\theta_0$ visualisation; α ablation |
4.D – Selection & memory trade-offs

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 4.6 | Smoother selection guide – offline vs online, deterministic vs iterative, memory budget | – | 🧱 | extends features/smoothers.md §5 comparison table; flow chart for picking smoother |
Part 5 – Ensemble Kalman Processes (Inversion & Sampling)

5.A – Inversion (EKI family)

Key equations / models:

- EKI update: $\theta^j_{n+1} = \theta^j_n + \Delta t_n\, C^{\theta G}_n (C^{GG}_n + \Delta t_n^{-1} \Gamma)^{-1}(y - G(\theta^j_n))$
- Tempered $\Delta t^{-1}\Gamma$ noise as low-rank update over Γ
- Algo-time $\to 1$ collapse: spread $\to 0$, MAP recovered

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 5.1 | EKI (Iglesias, Law & Stuart 2013) – iterative ensemble inversion | – | 🧱 | api: EKI, processes.EKI; dd:features/processes.md; one-step Kalman equivalence on the sample cov |
| 5.2 | TEKI – Tikhonov-regularised EKI with prior pull | – | 🌉 | api: processes.TEKI; augmented-identity block; unidentifiable-parameter shrinkage demo |
| 5.3 | GNKI – Gauss-Newton with explicit ensemble Jacobian | – | 🔬 | api: processes.GNKI; requires $J > N_p$; one-step linear-Gaussian convergence |
| 5.4 | ETKI – deterministic / sqrt EKI variant | – | 🔬 | api: processes.ETKI; deterministic transform analog |
| 5.5 | SparseInversion – L¹ proximal soft-threshold on EKI step | – | 🔬 | api: processes.SparseInversion; Schneider-Stuart-Wu 2022; inactive-parameter-to-zero demo |
5.B – Sampling (EKS family)

Key equations / models:

- EKS / interacting Langevin: $\theta^j_{n+1} = \theta^j_n + \Delta t\, C^{\theta G}_n (\ldots) + \sqrt{2\Delta t\, C^{\theta\theta}_n}\,\xi^j_n$
- Approaches the Bayesian posterior as $\Delta t \to 0$, $N_e \to \infty$
- Ergodic – spread doesn't collapse

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 5.6 | EKS (Garbuno-Iñigo 2020) – ergodic sampler, no collapse | – | 🧱 🔗 | api: EKS, processes.EKS_Process; cross-listed with GP 12.x (ensemble VI); spread-vs-time vs EKI plot |
5.C – Parametric (UKI)

Key equations / models:

- Sigma-point propagation: $\{\theta_k\} = \mu \pm \sqrt{(N_p + \kappa)\Sigma}_i$
- Closed-form mean + cov update; no random perturbations
- Exact second-moment match in linear-Gaussian limit

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 5.7 | UKI – unscented Kalman inversion, parametric mean / cov | – | 🧱 | api: UKI, processes.UKI; dd:features/processes.md; sigma-point cloud diagram |
| 5.8 | Sigma-point utilities – reusable for filter ops in Part 2 | – | 🧱 🔗 | api: processes.sigma_points; mirrors GP 6.3; reconstruction-of-mean-and-cov sanity check |
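The reconstruction-of-mean-and-cov check mentioned for 5.8 can be sketched in a few lines of NumPy. `unscented_points` is a hypothetical helper, not `processes.sigma_points`. The centre point and the weights $\kappa/(N_p+\kappa)$ and $1/(2(N_p+\kappa))$ are the standard unscented-transform choice and are an assumption here; the library may use a different parameterisation.

```python
import numpy as np


def unscented_points(mu, Sigma, kappa=1.0):
    """2 N_p + 1 sigma points and weights (standard unscented choice, assumed here)."""
    N_p = mu.size
    L = np.linalg.cholesky((N_p + kappa) * Sigma)      # columns L[:, i] play the role of sqrt((N_p + kappa) Sigma)_i
    pts = np.vstack([mu, mu + L.T, mu - L.T])          # centre point, then the +/- pairs (one per row)
    w = np.full(2 * N_p + 1, 0.5 / (N_p + kappa))      # weight of each +/- point
    w[0] = kappa / (N_p + kappa)                       # weight of the centre point
    return pts, w


mu = np.array([1.0, -2.0, 0.5])
Sigma = np.array([[2.0, 0.3, 0.0],
                  [0.3, 1.0, 0.2],
                  [0.0, 0.2, 0.5]])

pts, w = unscented_points(mu, Sigma)

# Weighted moments of the point cloud reproduce the inputs exactly.
mu_rec = w @ pts
dev = pts - mu_rec
Sigma_rec = (w[:, None] * dev).T @ dev
```

No random numbers are involved, which is the sense in which the list calls the update free of random perturbations.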
5.D – Regularised / sparse

Covered in 5.A (TEKI, SparseInversion); listed here for navigation.
5.E – Schedulers

Key equations / models:

- Fixed: $\Delta t_n = \text{const}$
- Data-misfit controller (Iglesias 2016): adapt $\Delta t$ from current misfit, freeze past $\text{algo\_time}=1$
- EKS-stable: $\Delta t = h / (\text{trace}(C^{GG}) + \delta)$ (avoids ensemble blow-up)

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 5.9 | Scheduler zoo – fixed, data-misfit, EKS-stable | – | 🧱 | api: FixedScheduler, DataMisfitController, EKSStableScheduler; convergence-trajectory comparison |
| 5.10 | DataMisfitController past convergence – algo_time ≥ 1 safety | – | 🌉 | regression context for test_eki_update_is_finite_after_algo_time_one; $\Delta t = 0$ floor; no-NaN guarantee |
Part 6 – Localisation, Inflation, Calibration

6.A – Why localisation

Key equations / models:

- Spurious correlations: $|C_{ij}| \sim 1/\sqrt{N_e}$ for unrelated state pairs
- Sample-cov rank $\le N_e - 1$ in state space; rank-deficient inverse undefined without regularisation

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 6.1 | Spurious correlations & sample-cov rank – visual + spectrum demo | – | 🧱 | dd:features/localization_inflation.md; correlation heat-map at small / large $N_e$ |
| 6.2 | R-localisation vs B-localisation – when each is correct | – | 🧱 | Hunt 2007 §2; B-loc preserves PSD only with specific tapers; R-loc requires diagonal $R$ |
6.B – Why inflation

Key equations / models:

- Filter divergence: posterior cov collapses → obs rejected → estimates drift
- Recovery: $\lambda > 1$ multiplicative or $Q_\text{add}$ additive

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 6.3 | Filter divergence – overconfident analysis rejects obs | – | 🧱 | dd:features/localization_inflation.md; trajectory-plot demo with / without inflation |
| 6.4 | Multiplicative vs RTPS vs RTPP – when each wins | – | 🧱 | calibration table; spread-trajectory comparison; failure modes |
6.C – Adaptive variants

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 6.5 | Adaptive localisation in operation | – | 🔬 | api: adaptive_localization; empirical-correlation threshold demo |
| 6.6 | Adaptive inflation (Anderson 2007 / Miyoshi 2011) | – | 🔬 | api: inflate_adaptive; observation-space hierarchical inflation update |

6.D – Shrinkage estimators

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 6.7 | Ledoit-Wolf shrinkage for ensemble cov | – | 🔬 🔗 | api: ledoit_wolf_shrinkage; pairs with GP 0.11 (jitter / safe Cholesky); analytic shrinkage intensity |
Part 7 – Diagnostics & Verification

7.A – Basic spread / RMSE / rank

Key equations / models:

- Ensemble spread: $\sigma_t = \sqrt{(N_e-1)^{-1}\sum_j \|x^{(j)}_t - \bar x_t\|^2 / N_x}$
- RMSE: $\sqrt{\langle (\bar x_t - x^\text{true}_t)^2\rangle_t}$
- Rank-histogram of obs vs ensemble (Talagrand)

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 7.1 | Ensemble spread & RMSE – minimal pair for "is the filter alive" | – | 🧱 | dd:features/diagnostics.md; trajectory plots; spread-RMSE ratio reading |
| 7.2 | Rank histograms & reliability | – | 🧱 | Talagrand diagrams; under- / well- / over-dispersive signatures |
7.B – Innovation diagnostics

Key equations / models:

- Mahalanobis: $v^\top S^{-1} v \sim \chi^2_{N_y}$ if filter is consistent
- Desroziers (2005): $\mathbb{E}[(y - H\bar x^a)(y - H\bar x^f)^\top] = R$

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 7.3 | Innovation chi-squared & Mahalanobis | – | 🧱 | api: innovation_statistics; pass/fail thresholds; cycle-averaged plots |
| 7.4 | Desroziers diagnostic – observation-error tuning | – | 🔬 | dd:features/diagnostics.md; recovering $R$ from posterior innovations |
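The chi-squared consistency check above has a simple operational reading: averaged over many cycles, the squared Mahalanobis distance of the innovation should sit near $N_y$. The NumPy sketch below uses synthetic innovations drawn from $\mathcal{N}(0, S)$, i.e. an idealised consistent filter; `mahalanobis_sq` and the matrix `S` are made up for the illustration.

```python
import numpy as np


def mahalanobis_sq(v, S):
    """v^T S^{-1} v for a single innovation vector."""
    return float(v @ np.linalg.solve(S, v))


rng = np.random.default_rng(5)
N_y, n_cycles = 3, 2000
S = np.array([[1.0, 0.3, 0.0],
              [0.3, 2.0, 0.4],
              [0.0, 0.4, 0.5]])              # made-up innovation covariance
L = np.linalg.cholesky(S)

# Innovations drawn from N(0, S): each row of V is L @ z for standard-normal z.
Z = rng.normal(size=(n_cycles, N_y))
V = Z @ L.T

d2 = np.array([mahalanobis_sq(v, S) for v in V])
mean_d2 = d2.mean()                          # close to N_y when S matches the innovations
```

Repeating the experiment with `S` scaled by a factor in the solve but not in the sampling shifts `mean_d2` away from `N_y`, which is how an over- or under-confident filter shows up in the cycle-averaged plots of 7.3.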
7.C – Reliability & sharpness

Key equations / models:

- Spread-skill: $\sigma_t \approx \text{RMSE}_t$ when the filter is calibrated
- CRPS: $\int_{-\infty}^\infty \bigl(F(z) - \mathbf{1}\{y \le z\}\bigr)^2 dz$
- DFS: $\text{tr}(I - P^a (P^f)^{-1})$
- ESS: $(\sum w^{(j)})^2 / \sum w^{(j)2}$

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 7.5 | Spread-skill relationship | – | 🌉 | calibration scatter $\sigma_t$ vs … |
| 7.6 | CRPS – calibration without Gaussianity | – | 🌉 🔗 | mirrors GP calibration tutorials; ensemble-vs-pointwise CRPS |
| 7.7 | DFS – degrees of freedom for signal | – | 🔬 | observability metric; per-obs contribution table |
| 7.8 | Effective sample size (ESS) | – | 🔬 🌉 | bridges to particle filters in Part 10; degeneracy threshold |
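Two of the scores above reduce to one-liners for a finite ensemble. When $F$ is the empirical CDF of the members, the CRPS integral equals $\mathbb{E}|X - y| - \tfrac{1}{2}\mathbb{E}|X - X'|$, which is what the sketch below evaluates; `crps_ensemble` and `effective_sample_size` are hypothetical helper names.

```python
import numpy as np


def crps_ensemble(x, y):
    """CRPS of the empirical CDF of members x against a scalar observation y."""
    x = np.asarray(x, dtype=float)
    term_obs = np.abs(x - y).mean()                       # E|X - y|
    term_pair = np.abs(x[:, None] - x[None, :]).mean()    # E|X - X'| over all member pairs
    return term_obs - 0.5 * term_pair


def effective_sample_size(w):
    """ESS = (sum w)^2 / sum w^2 for non-negative weights."""
    w = np.asarray(w, dtype=float)
    return w.sum() ** 2 / (w**2).sum()


crps_two = crps_ensemble([0.0, 2.0], 1.0)                  # 0.5, matching the integral evaluated by hand
ess_uniform = effective_sample_size(np.ones(10))           # 10.0: no degeneracy
ess_onehot = effective_sample_size([1.0, 0.0, 0.0, 0.0])   # 1.0: fully degenerate
```

For the two-member case, $F$ equals $0.5$ on $[0, 2)$, so the integrand is $0.25$ on an interval of length 2, giving $0.5$. With a single member the CRPS collapses to the absolute error, which is the ensemble-vs-pointwise contrast noted for 7.6.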
7.D – Predictive likelihood

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 7.9 | Log predictive density as the training loss – connects to Part 8 | – | 🧱 | api: InnovationStatistics.log_likelihood; running sum over windows; sanity gradient sign |

## Part 8 - Differentiable DA

See dd:features/differentiable_da.md.

### 8.A - Theory & gradient stability

Key equations / models:

- Pure-function filter → `jax.grad` flows for free
- Stochastic vs deterministic filters: perturbed-obs draws inject non-smooth randomness
- `eigh` degeneracy: $(N_e, N_e)$ transform precision has $N_e - N_y$ repeated eigenvalues → NaN gradient
- Rank-$N_y$ QR-based fix: stable at any $N_e$

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 8.1 | Why differentiable - learning dynamics, obs ops, hyperparams end-to-end | - | 🧱 | dd:features/differentiable_da.md §1; four motivating use cases |
| 8.2 | Stochastic vs deterministic filters under grad - `eigh` degeneracy & the rank-$N_y$ trick | - | 🔬 | regression context for #82; api: `_etkf_inner_spectrum`; before / after gradient plot |
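
The repeated-eigenvalue claim is easy to see numerically. The sketch below assumes the transform precision has the textbook ETKF form $I + Y^\top R^{-1} Y / (N_e - 1)$, with $Y$ the centred observation-space anomalies. That form is an assumption for illustration and may differ from what `_etkf_inner_spectrum` computes.

```python
import numpy as np

rng = np.random.default_rng(0)
n_e, n_y = 10, 3

# Obs-space anomalies (N_y, N_e), centred across the ensemble.
Y = rng.normal(size=(n_y, n_e))
Y -= Y.mean(axis=1, keepdims=True)
R_inv = np.eye(n_y)

# (N_e, N_e) transform precision: identity plus a rank-N_y term.
T_prec = np.eye(n_e) + Y.T @ R_inv @ Y / (n_e - 1)
eigvals = np.linalg.eigvalsh(T_prec)

# Every direction outside the rank-N_y subspace has eigenvalue exactly 1.
n_repeated = int(np.sum(np.isclose(eigvals, 1.0)))
```

Here `n_repeated` equals `n_e - n_y`. Eigenvector derivatives involve reciprocals of eigenvalue gaps, so a cluster like this is where reverse-mode through `eigh` becomes ill-defined.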

### 8.B - `differentiable_assimilate` mechanics

Key equations / models:

- $X^a_t = \text{EnKF}(X^f_t = f_\theta(X^a_{t-1}), y_t)$ unrolled as `jax.lax.scan`
- Memory under reverse-mode: $O(T \cdot N_e \cdot N_x)$ → $O(\sqrt{T}\cdot N_e \cdot N_x)$ with checkpoint

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 8.3 | The `scan` + `vmap` + remat idiom - single fused XLA While | - | 🧱 | api: `differentiable_assimilate`; dd:features/differentiable_da.md §8 |
| 8.4 | Carry-dtype unification & extension kwargs (LETKF coords) | - | 🌉 | regression context for `test_diff_assimilate_handles_mixed_time_dtypes`; mixed-dtype trace error reproduction |
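
As a rough sketch of the unrolled recursion, the toy below runs a scalar perturbed-observation EnKF under `jax.lax.scan` and differentiates a Gaussian NLL with respect to a dynamics parameter. It is not `differentiable_assimilate`: the model $x \mapsto \theta x$, $H = 1$, the pre-drawn perturbations `eps`, and all function names are assumptions for illustration.

```python
import jax
import jax.numpy as jnp

def enkf_step(theta, ens, y, eps, r=0.1):
    """One forecast-analysis cycle for a scalar state with H = 1 (toy assumption)."""
    ens_f = theta * ens                       # forecast
    p_f = jnp.var(ens_f, ddof=1)              # forecast variance
    gain = p_f / (p_f + r)
    ens_a = ens_f + gain * (y + eps - ens_f)  # perturbed-obs update, eps drawn in advance
    nll = 0.5 * (jnp.log(p_f + r) + (y - ens_f.mean()) ** 2 / (p_f + r))
    return ens_a, nll

def loss(theta, ens0, ys, eps, remat=True):
    # Optionally rematerialise each step during the backward pass.
    step = jax.checkpoint(enkf_step) if remat else enkf_step
    def body(ens, inputs):
        y, e = inputs
        return step(theta, ens, y, e)
    _, nlls = jax.lax.scan(body, ens0, (ys, eps))
    return nlls.sum()

k0, k1, k2 = jax.random.split(jax.random.PRNGKey(0), 3)
T, n_e = 50, 16
ens0 = jax.random.normal(k0, (n_e,))
ys = 0.9 ** jnp.arange(1, T + 1) + 0.1 * jax.random.normal(k1, (T,))
eps = jnp.sqrt(0.1) * jax.random.normal(k2, (T, n_e))

g_remat = jax.grad(loss)(0.8, ens0, ys, eps, True)
g_plain = jax.grad(loss)(0.8, ens0, ys, eps, False)
```

The two gradients agree up to floating-point noise. Checkpointing changes what is stored for the backward pass, not the value that comes out.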

### 8.C - Training patterns

Key equations / models:

- Pattern A: $\theta_f$ via $\nabla_{\theta_f} (-\sum_t \log p(y_t \mid \text{forecast}_t))$
- Pattern B: $\theta_H$ via the same loss, gradient flows through obs operator
- Pattern C: inflation factor, $r_\text{loc}$, $R$ diag via reparameterised `log_factor` etc.

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 8.5 | Pattern A - learn dynamics parameters (Neural ODE through filter) | - | 🔬 | dd:features/differentiable_da.md §6.A; loss-vs-epoch curve; gradient-sign sanity |
| 8.6 | Pattern B - learn observation operator (neural decoder) | - | 🔬 | dd:features/differentiable_da.md §6.B; neural RTM example (plumax Tier IV v2) |
| 8.7 | Pattern C - meta-learn inflation / localisation radius / $R$ diag | - | 🔬 | dd:features/differentiable_da.md §6.C; constrained-via-exp reparameterisation |
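
Pattern C's constrained-via-exp idea can be illustrated on a one-parameter problem: optimise an unconstrained `log_factor` and recover a positive inflation as `exp(log_factor)`. The sketch assumes scalar innovations and a fixed forecast variance. The loss and variable names are hypothetical, chosen only to show the reparameterisation.

```python
import jax
import jax.numpy as jnp

def nll(log_factor, innov, p_f, r):
    """Gaussian NLL of scalar innovations with forecast variance inflated by exp(log_factor)."""
    s = jnp.exp(log_factor) * p_f + r   # innovation variance, positive by construction
    return 0.5 * jnp.sum(jnp.log(s) + innov ** 2 / s)

innov = jnp.array([1.5, -2.0, 0.5, 2.5, -1.0])
p_f, r = 0.5, 0.25

grad_fn = jax.jit(jax.grad(nll))
log_factor = 0.0
for _ in range(200):
    # Plain gradient descent in the unconstrained variable.
    log_factor = log_factor - 0.1 * grad_fn(log_factor, innov, p_f, r)

inflation = jnp.exp(log_factor)
```

The NLL is minimised when the innovation variance matches the mean squared innovation (2.75 here), which corresponds to an inflation of 5. The optimiser never needs a positivity constraint.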

### 8.D - Memory & remat

Key equations / models:

- `jax.checkpoint(step)` placed on the scan body
- Binomial checkpointing schedule: $O(\sqrt{T})$ memory at 2-3× compute
- ROAD-EnKF (Chen et al. 2023): $O(1)$ memory, approximate gradient

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 8.8 | `jax.checkpoint` placement - $O(\sqrt{T})$ memory under reverse-mode | - | 🧱 | dd:features/differentiable_da.md §5.1; checkpoint-on-body vs checkpoint-on-scan |
| 8.9 | ROAD-EnKF - local-gradient approximation, $O(1)$ memory | - | 🔬 | dd:features/differentiable_da.md §6.D; not yet implemented - gap |
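
Where the $\sqrt{T}$ comes from can be seen with simple counting. The sketch below models uniform two-level segmenting, which is a simpler scheme than the binomial schedule named above and is swapped in here only for illustration. It counts stored states, not bytes.

```python
import math

def stored_states(T, segment):
    """Peak number of states held under uniform segment checkpointing:
    one checkpoint per segment plus the states of the segment being replayed."""
    n_segments = math.ceil(T / segment)
    return n_segments + segment

T = 400
# Scan all segment lengths and keep the one with the smallest peak.
best = min(range(1, T + 1), key=lambda s: stored_states(T, s))
peak = stored_states(T, best)
```

For `T = 400` the best segment length is 20 and the peak is 40 states, that is $2\sqrt{T}$, compared with 401 when every step is stored.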

### 8.E - Loss zoo

Key equations / models:

- NLL: $\sum_t -\log p(y_t \mid \bar x^f_t)$
- MSE: $\sum_t \|v_t\|^2$
- CRPS: $\sum_t \text{CRPS}(\text{ens}_t, y_t)$
- Spread-skill: $\sum_t (\sigma_t - \text{err}_t)^2$

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 8.10 | NLL vs MSE vs CRPS vs spread-skill - pick your gradient signal | - | 🧱 | dd:features/differentiable_da.md §3; calibration vs accuracy trade-off table |
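
One way to see the calibration-versus-accuracy trade-off is to score two ensembles that share a mean but differ in spread. The sketch below is a NumPy toy with a scalar state and $H = 1$. The numbers and helper names are assumptions for illustration.

```python
import numpy as np

def gaussian_nll(y, mean, var):
    """Negative log density of y under N(mean, var)."""
    return 0.5 * (np.log(2 * np.pi * var) + (y - mean) ** 2 / var)

def sq_error(y, mean):
    """Squared innovation of the ensemble mean."""
    return (y - mean) ** 2

y, r = 1.0, 0.1
rng = np.random.default_rng(0)
base = rng.normal(size=50)
base = (base - base.mean()) / base.std(ddof=1)   # zero mean, unit sample variance

tight, wide = 0.1 * base, 1.0 * base             # same mean, different spread

mse_tight, mse_wide = sq_error(y, tight.mean()), sq_error(y, wide.mean())
nll_tight = gaussian_nll(y, tight.mean(), tight.var(ddof=1) + r)
nll_wide = gaussian_nll(y, wide.mean(), wide.var(ddof=1) + r)
```

The squared error is identical for both ensembles, while the NLL heavily penalises the over-confident one. This is why MSE alone gives no gradient signal for spread-related hyperparameters.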

## Part 9 - optax Integration

Key equations / models:

- Each process exposed as an `optax.GradientTransformation`: `init(params) -> state`, `update(grad, state, params) -> (updates, state)`
- The “gradient” slot is unused; updates come from the ensemble Kalman process internally
- `optax.apply_updates(params, updates)` advances the parameter mean

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 9.1 | `filterax.optax.eki` - EKI as a gradient transform | - | 🧱 | api: `filterax.optax.eki`; dd:features/optax_ekp.md; three-iter convergence smoke |
| 9.2 | `filterax.optax.eks` - EKS as a gradient transform | - | 🧱 | api: `filterax.optax.eks`; per-step key-advance demonstration |
| 9.3 | `filterax.optax.uki` - UKI with parametric carry | - | 🌉 | api: `filterax.optax.uki`; mean / covariance both updated |
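
The `init` / `update` contract can be mimicked without optax to show where the ensemble lives and why the gradient slot is ignored. Everything below is a hypothetical NumPy sketch: `make_eki` and `EKIState` are invented for the example, the update is a deterministic EKI step without observation perturbations, and none of it is the `filterax.optax.eki` implementation.

```python
from typing import NamedTuple
import numpy as np

class EKIState(NamedTuple):
    ensemble: np.ndarray   # (N_e, N_theta)

def make_eki(forward, y, gamma, n_ens, seed=0):
    """Hypothetical factory returning an (init, update) pair shaped like an optax transform."""
    def init(params):
        rng = np.random.default_rng(seed)
        return EKIState(params + rng.normal(size=(n_ens, params.size)))

    def update(grad, state, params):
        del grad                                    # the "gradient" slot is unused
        ens = state.ensemble
        g = np.stack([forward(th) for th in ens])   # (N_e, N_y) forward evaluations
        d_th = ens - ens.mean(axis=0)
        d_g = g - g.mean(axis=0)
        c_tg = d_th.T @ d_g / (n_ens - 1)           # parameter-output cross-covariance
        c_gg = d_g.T @ d_g / (n_ens - 1)            # output covariance
        gain = c_tg @ np.linalg.inv(c_gg + gamma)
        new_ens = ens + (y - g) @ gain.T
        updates = new_ens.mean(axis=0) - params     # delta that moves params to the new mean
        return updates, EKIState(new_ens)

    return init, update

A = np.array([[1.0, 2.0], [0.0, 1.0], [1.0, -1.0]])
theta_true = np.array([0.5, -1.0])
y = A @ theta_true
init, update = make_eki(lambda th: A @ th, y, gamma=1e-2 * np.eye(3), n_ens=20)

params = np.zeros(2)
state = init(params)
for _ in range(10):
    updates, state = update(None, state, params)
    params = params + updates   # what optax.apply_updates does for a single array leaf
```

Because `updates` is an additive delta on the mean, the transform composes with anything that post-processes updates, which is the property the chain-composition tutorials rely on.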

### 9.B - Composition with optax chains

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 9.4 | Composing with `optax.chain` - gradient clipping, masking, scheduling on top of EKI | - | 🌉 | dd:features/optax_ekp.md; `clip_by_global_norm` example; mask-by-param-name |

### 9.C - Hybrid SGD + EKI

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 9.5 | Hybrid pipelines - SGD on neural-net params + EKI on physics params, single `optax.chain` | - | 🔬 | bridges to Part 8 (Pattern A/B); per-leaf transform via `optax.multi_transform` |

## Part 10 - Sequential Variational Inference

The bridge into broader filtering / sequential-VI work. Each tutorial sits next to a filterax / pyrox primitive and points at the GP master list where overlap exists.

### 10.A - Foundations

Key equations / models:

- Sequential VI: $q_t(x_t) \approx p(x_t \mid y_{1:t})$ updated as obs arrive
- Recursive ELBO: $\mathcal{L}_t = \mathbb{E}_{q_t}[\log p(y_t \mid x_t)] - \text{KL}(q_t \Vert q_{t-1}^\text{forecast})$
- Connection to Kalman: linear-Gaussian $q$ → exact Kalman recursion

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 10.1 | Sequential VI from scratch - recursive ELBO, Gaussian limit = Kalman | - | 🧱 🔗 | pairs with GP 6.14 (variational guides); duality diagram |
| 10.2 | Variational EnKF - interpreting ETKF as ELBO ascent | - | 🔬 | research bridge; minimisation-vs-update derivation |
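
The Kalman connection can be verified in the scalar case, where the recursive ELBO has a closed form. The sketch below assumes $y \mid x \sim \mathcal{N}(x, r)$ and a Gaussian forecast $\mathcal{N}(m_f, p_f)$. The function name is hypothetical.

```python
import math

def elbo(m, s2, y, m_f, p_f, r):
    """Recursive ELBO for q = N(m, s2), likelihood N(y; x, r), forecast N(m_f, p_f)."""
    exp_loglik = -0.5 * math.log(2 * math.pi * r) - ((y - m) ** 2 + s2) / (2 * r)
    kl = 0.5 * (math.log(p_f / s2) + (s2 + (m - m_f) ** 2) / p_f - 1.0)
    return exp_loglik - kl

y, m_f, p_f, r = 1.2, 0.0, 2.0, 0.5

# Scalar Kalman update.
gain = p_f / (p_f + r)
m_a = m_f + gain * (y - m_f)
p_a = (1.0 - gain) * p_f

# Log marginal likelihood of y under the forecast.
log_evidence = -0.5 * math.log(2 * math.pi * (p_f + r)) - (y - m_f) ** 2 / (2 * (p_f + r))
best = elbo(m_a, p_a, y, m_f, p_f, r)
```

At the Kalman posterior the bound is tight, so `best` equals `log_evidence`. Any other mean or variance gives a strictly smaller ELBO.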

### 10.B - Particle filters & SMC

Key equations / models:

- Importance sampling + resampling: $w^{(j)} \propto p(y_t \mid x^{(j)}_t)$
- ESS: $N_\text{eff} = (\sum w^{(j)})^2 / \sum (w^{(j)})^2$
- Resampling schemes: multinomial, stratified, systematic

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 10.3 | Bootstrap particle filter - proposal = prior | - | 🧱 | not in filterax core; build from primitives; reweighting & degeneracy demo |
| 10.4 | Auxiliary particle filter | - | 🌉 | optimal proposal $q(x_t \mid x_{t-1}, y_t)$; variance-reduction plots |
| 10.5 | SMC samplers - annealed posteriors | - | 🔬 🔗 | overlaps GP MCMC tutorials; tempered-sequence visualisation |
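
A single bootstrap step with systematic resampling might look like the NumPy sketch below. The random-walk prior, the Gaussian likelihood, the ESS threshold of half the particle count, and the function names are all assumptions for illustration. Nothing here is filterax API.

```python
import numpy as np

def systematic_resample(weights, rng):
    """Ancestor indices from normalised weights, using one shared uniform offset."""
    n = weights.size
    positions = (rng.uniform() + np.arange(n)) / n
    cumsum = np.cumsum(weights)
    cumsum[-1] = 1.0                                  # guard against round-off
    return np.searchsorted(cumsum, positions, side="right")

def bootstrap_step(particles, y, rng, q=0.1, r=0.5):
    """Propagate with the prior, weight by the likelihood, resample when ESS is low."""
    particles = particles + rng.normal(scale=np.sqrt(q), size=particles.shape)
    log_w = -0.5 * (y - particles) ** 2 / r
    w = np.exp(log_w - log_w.max())                   # stabilised before normalising
    w /= w.sum()
    n_eff = 1.0 / np.sum(w ** 2)                      # ESS for normalised weights
    if n_eff < 0.5 * particles.size:
        particles = particles[systematic_resample(w, rng)]
        w = np.full(particles.size, 1.0 / particles.size)
    return particles, w, n_eff

rng = np.random.default_rng(0)
particles = rng.normal(size=500)
particles, w, n_eff = bootstrap_step(particles, y=3.0, rng=rng)
```

An observation far in the prior tail, as with `y=3.0` here, concentrates the weights on a few particles. That is the degeneracy the ESS threshold is meant to catch.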

### 10.C - Variational SMC

Key equations / models:

- VSMC ELBO (Naesseth 2018): tighter than IWAE via SMC structure
- Filtering variational objectives (Maddison 2017): per-step lower-bound

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 10.6 | Variational SMC (Naesseth 2018) - amortised proposal trained against ELBO | - | 🔬 | research bridge; learnable-proposal example |
| 10.7 | Filtering variational objectives (Maddison 2017) - IWAE-style filtering | - | 🔬 | comparison with bootstrap PF on Lorenz |

### 10.D - Ensemble VI for SSMs

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 10.8 | EKS as ensemble VI - Garbuno-Iñigo 2020 reading | - | 🔬 🔗 | cross-listed with GP 12.x; api: `EKS`; ergodicity & posterior recovery |
| 10.9 | Reich-style ensemble VI - coupling-based posterior approximation | - | 🔬 | research; OT-coupling-as-resampling |

### 10.E - Amortised / streaming inference

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 10.10 | Amortised filter - neural encoder $q_\phi(x_t \mid y_{1:t})$ trained through `differentiable_assimilate` | - | 🔬 | builds on Part 8; encoder ELBO recipe |
| 10.11 | Streaming amortised inference - online updates without retraining | - | 🔬 | bounded-memory amortisation; recurrent encoder |

### 10.F - Sequential VB comparison

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 10.12 | Sequential VB (Beal & Ghahramani 2003; Honkela 2003) - when classical VB beats ensemble | - | 🌉 🔗 | pairs with GP 6.16 (CVI); cost / accuracy table |

## Part 11 - Ecosystem Integrations

### 11.A - gaussx (structured covariances)

Key equations / models:

- $S = C^{HH} + R$ as `gaussx.LowRankUpdate` → Woodbury solve in $O(N_e^2 N_y + N_e^3)$ vs $O(N_y^3)$
- Diagonal / Toeplitz / Kronecker $R$ never densified

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 11.1 | Diagonal $R$ → Woodbury gain - never densify | - | 🧱 🔗 | api: `gaussx.LowRankUpdate`, `gaussx.solve_rows`; pairs with GP 1.9 |
| 11.2 | Toeplitz / Kronecker $R$ - spatial obs noise | - | 🌉 🔗 | dd:integrations/geostack.md; pairs with GP 1.4 / 1.3 |
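
The cost claim follows from the Woodbury identity when $C^{HH} = U U^\top$, with $U$ the scaled observation-space anomalies. The sketch below uses plain NumPy in place of `gaussx.LowRankUpdate`, whose actual interface is not shown here. The function name and shapes are assumptions for illustration.

```python
import numpy as np

def woodbury_solve(U, r_diag, v):
    """Solve (U U^T + diag(r_diag)) x = v without forming the N_y x N_y matrix."""
    rinv_v = v / r_diag
    rinv_U = U / r_diag[:, None]
    inner = np.eye(U.shape[1]) + U.T @ rinv_U     # (N_e, N_e) capacitance matrix
    return rinv_v - rinv_U @ np.linalg.solve(inner, U.T @ rinv_v)

rng = np.random.default_rng(0)
n_y, n_e = 200, 8
Y = rng.normal(size=(n_y, n_e))
Y -= Y.mean(axis=1, keepdims=True)                # centred obs-space anomalies
U = Y / np.sqrt(n_e - 1)
r_diag = rng.uniform(0.5, 2.0, size=n_y)          # diagonal of R
v = rng.normal(size=n_y)

x_fast = woodbury_solve(U, r_diag, v)
x_dense = np.linalg.solve(U @ U.T + np.diag(r_diag), v)   # reference, O(N_y^3)
```

Forming `inner` costs $O(N_e^2 N_y)$ and solving with it costs $O(N_e^3)$, matching the complexity quoted above. $R$ is only ever touched through its diagonal.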

### 11.B - pipekit (orchestration)

Key equations / models:

- D11 wrapper pattern: `FilterAsAnalysisStep`, `DynamicsAsForwardModel`
- Sequential / Graph / Cycle composition of analysis steps

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 11.3 | filterax filter as `pipekit.AnalysisStep` (wrapper pattern D11) | - | 🧱 | dd:integrations/pipekit.md; api: `FilterAsAnalysisStep` (user wrapper) |
| 11.4 | Sequential / Graph / Cycle composition | - | 🌉 | full multi-step Tier IV pipeline; pipekit-side notebook |

### 11.C - somax (SDE dynamics)

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 11.5 | somax SDE as `AbstractDynamics` | - | 🌉 | dd:examples/integration.md; stochastic forward model worked example |

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 11.6 | xarray-aware DA - coordinate-driven assimilation | - | 🌉 🔗 | pairs with coordax tutorials; named-axis filter API |

### 11.E - plumax (Tier IV)

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 11.7 | Multi-instrument methane retrieval - `JointObsOperator`, `SequentialAssimilation`, `GeoLocalizer`, fixed-lag smoother | - | 🔬 | dd:integrations/plumax.md; the canonical end-to-end demo |

## Part 12 - Applied Case Studies (research_notebook)

### 12.A - Canonical DA benchmarks

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 12.1 | Lorenz-63 toy DA - ETKF / EnSRF / LETKF on the standard benchmark | - | 🧱 | mirrors gaussx `ensemble_kalman` notebook; RMSE-vs-time across filters |
| 12.2 | Lorenz-96 spatially-extended DA - LETKF + adaptive inflation | - | 🔬 | operational analogue; per-grid-point posterior |

### 12.B - Atmospheric & remote sensing

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 12.3 | 1D heat-equation DA - pedagogical PDE state-space | - | 🧱 | ground-truth-from-PDE; visualised assimilation |
| 12.4 | Plume dispersion DA - `plume_simulation/matched_filter` → EnKF | - | 🔬 | pairs with projects/plume_simulation; emission-rate estimation |
| 12.5 | Multi-instrument retrieval - TROPOMI/EMIT/GHGSat-style joint observation | - | 🔬 | extends 11.7; per-instrument $H$ stack |

### 12.C - Inverse problems

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 12.6 | Inverse heat conduction - EKI parameter estimation | - | 🔬 | thermal-conductivity recovery; ensemble-vs-truth contour plot |
| 12.7 | Subsurface flow history matching - IES end-to-end | - | 🔬 | reservoir engineering analogue; full Chen-Oliver iteration |

### 12.D - Online / streaming

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 12.8 | Online streaming smoother - fixed-lag with rolling buffer, memory budget | - | 🔬 | api: `FixedLagSmoother`; live-data DA recipe |
| 12.9 | Differentiable end-to-end - learn dynamics through 100-step assimilation | - | 🔬 | full Pattern A demo with checkpointing |

## Part 13 - Reference Surfaces (Zoo)

Explicitly not maintained as core API; lives under `zoo/` (gap, planned for #61) for educational / benchmarking use.

### 13.A - Continuous-time

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 13.1 | Continuous-time EnKF - derivation from continuous Kalman | - | 🔬 | zoo; SDE form of the ensemble update |
| 13.2 | 4D-EnKF - observation-time-aware ensemble update | - | 🔬 | zoo; multi-obs-window equivalence |

### 13.B - Toy dynamical systems

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 13.3 | Toy systems catalog - Lorenz-63 / Lorenz-96, sinusoid, double-well, Brownian, OU | - | 🌉 | benchmarking fixtures; reusable `AbstractDynamics` implementations |

### 13.C - Hybrid Var-EnKF

| # | Tutorial | Source | Scope | Refs / Notes |
|---|---|---|---|---|
| 13.4 | EnVar - hybrid 3D/4D-Var-EnKF (Bocquet 2010) | - | 🔬 | bridges Part 0.D and Parts 2-3; static-B-plus-ensemble-B cost |

## References