Problem Setting¶
Data assimilation is the problem of estimating the state of a dynamical system given partial, noisy observations and prior knowledge. It sits at the intersection of Bayesian inference, optimisation, and numerical analysis: the prior comes from a physical model (or a learned one), the likelihood comes from instruments, and the posterior is what we want.
The Bayesian picture¶
Let \(x \in \mathbb{R}^n\) denote the state and \(y \in \mathbb{R}^m\) the observations. Bayes' theorem gives the posterior
Three ingredients:
- Prior \(p(x)\). What we know about the state before seeing the
data. Most commonly Gaussian: \(x \sim \mathcal{N}(x_b, B)\), with
background mean \(x_b\) and background-error covariance \(B\). In
FourDVarNetthe prior is replaced by a learned autoencoder; inAmortizedPosteriorit is absorbed into the trained head. - Likelihood \(p(y \mid x)\). How observations relate to the state. We model \(y = H(x) + \varepsilon\) with \(\varepsilon \sim \mathcal{N}(0, R)\) giving \(p(y \mid x) = \mathcal{N}(y; H(x), R)\). The observation operator \(H\) is the projection from state space to observation space (see chapter 2).
- Posterior \(p(x \mid y)\). What we infer about the state given the data. For Gaussian prior and Gaussian likelihood with linear \(H\), this is Gaussian in closed form — that's chapter 4. Otherwise we approximate.
The maximum a posteriori (MAP) estimator minimises the negative log-posterior:
This is the 3DVar cost (chapter 5). When the state evolves through dynamics \(M_t\), we add a time index and a forward rollout — that's 4DVar (chapters 6–8).
When dynamics enter — the 4D problem¶
Consider a state \(x_0\) at the start of an assimilation window and observations \(y_t\) at times \(t = 0, 1, \ldots, T\). A dynamical model \(M_t\) propagates \(x_0\) forward:
The 4D variational cost combines the background term, all observation terms across time, and (optionally) a model-error term:
The model-error term is absent in strong-constraint 4DVar (chapter 6, control variable \(x_0\) alone) and present in weak-constraint 4DVar (chapter 7, augmented control vector \((x_0, \eta_1, \ldots, \eta_T)\)).
In the linear-Gaussian limit (linear \(H_t\), linear \(M_t\), Gaussian \(B\), \(R_t\), \(Q_t\)) the posterior is Gaussian in closed form. In all other cases we minimise \(J\) iteratively. The cost itself is the same; the solver is where the methods diverge.
The seven analysis methods¶
vardax implements seven peer methods, each specialising the general problem above:
| Method | Setting | Solver |
|---|---|---|
OptimalInterpolation |
Linear \(H\), Gaussian \(B\), \(R\); \(T = 0\) | Closed-form |
ThreeDVar |
Nonlinear \(H\); \(T = 0\) | optimistix minimiser |
StrongFourDVar |
Multi-time; dynamics treated as exact | optimistix minimiser + diffrax adjoint |
WeakFourDVar |
Multi-time; model error active | Same, augmented control |
IncrementalFourDVar |
Operational fast path of StrongFourDVar |
GN outer + CG inner + CVT |
FourDVarNet |
Learned prior + learned grad modulator | Learned inner iteration |
AmortizedPosterior |
Direct \(q_\phi(x \mid y)\) head | Single forward pass |
They are siblings, not subclasses. Each is appropriate to a different regime. Chapters 4–10 give one method per chapter.
The role of the forward model¶
The dynamics \(M_t\) and the observation operator \(H_t\) are the
physics. vardax does not own them — somax (geophysics), plumax
(atmospheric transport / methane), or any user code can supply them as
long as the operator satisfies pipekit_cycle.ForwardModel and
pipekit_cycle.ObservationOperator respectively.
This means the same analysis class accepts any forward. A
StrongFourDVar doesn't know whether \(M_t\) is a shallow-water solver
from somax, a Lagrangian particle simulator from plumax, or a
neural emulator trained from physics outputs. The interface is fixed;
the implementation is a slot.
This separation is what makes vardax composable with the broader
JAX ecosystem (diffrax, optimistix, lineax, gaussx,
pipekit-cycle) without bespoke per-method wiring.
Computational ingredients¶
The four computations that recur in every DA method:
- Forward model rollout \(x_t = M_t(x_0)\). ODE integration via
diffrax; vardax composes but does not implement. - Tangent-linear and adjoint \(M'_t\), \(M'^\top_t\). Computed via
diffrax.AbstractAdjoint(recursive checkpointing, continuous backsolve, or forward sensitivity — chapter 12). - Observation operator \(H_t(x_t)\) and its tangent linear \(H'_t\).
See chapter 2 (
linearize()contract vialineax). - Inner minimisation \(\arg\min_x J(x)\). Via
optimistix.AbstractMinimiser(Gauss-Newton, BFGS, NonlinearCG, fixed-point iteration). For incremental 4DVar, \(\arg\min\) is done by CG on the linearised quadratic subproblem.
Each computation is owned by an upstream library. vardax composes them into DA algorithms; it does not reimplement them.
Posterior, not point estimate¶
A MAP \(x^*\) is the start of the answer, not the end. Real DA produces a posterior — mean, covariance, and provenance — that downstream consumers (population models, alert systems, decision pipelines) need.
vardax exposes a Posterior(mean, cov, samples, provenance) container
returned by every analysis. Approximations include the Laplace
(LaplaceCovariance) at MAP, the Gauss-Newton Hessian inversion
(GaussNewtonHessian, free for IncrementalFourDVar), the ensemble
covariance (EnsembleCovariance via filterax), or direct sampling
from amortized heads. See chapter 13.
For the closed-form case (OptimalInterpolation), the posterior is
exact: \(P^* = (B^{-1} + H^\top R^{-1} H)^{-1}\). Chapter 4 derives it.
Reading order¶
Chapters 1–3 establish the shared foundation: this chapter (the problem), chapter 2 (the observation model), chapter 3 (the dynamical model and its adjoint).
Chapters 4–10 cover one analysis method each, in increasing order of sophistication:
- Chapter 4 —
OptimalInterpolation(closed form) - Chapter 5 —
ThreeDVar(nonlinear \(H\)) - Chapter 6 —
StrongFourDVar(add dynamics) - Chapter 7 —
WeakFourDVar(allow model error) - Chapter 8 —
IncrementalFourDVar(operational fast path) - Chapter 9 —
FourDVarNet(learned prior + solver) - Chapter 10 —
AmortizedPosterior(direct head)
Chapters 11–14 cover cross-cutting concerns: observation operators (the
averaging-kernel / multi-instrument family), posterior covariance,
adjoint composition via optimistix and diffrax, and the six-step
research-to-operations methodology.
Chapters 15–17 are end-to-end examples on Lorenz, SSH, and methane.
The companion design docs in design/ document the API
contracts and architectural decisions behind these methods.