Strong-constraint 4DVar extends 3DVar from a single snapshot to
a time window: observations $y_t$ at times $t = 0, 1, \dots, T$
are related to a single initial state $u_0$ through known
dynamics (Le Dimet & Talagrand, 1986; Talagrand & Courtier, 1987). The label
strong-constraint means the dynamics are treated as exact: any
$u_0$ deterministically produces the whole trajectory, and model
error is excluded. Systems that need a model-error term use weak-constraint
4DVar instead.
The control variable is the initial state $u_0$ (and optionally the
parameters $\theta$); the trajectory follows from the
dynamical model.
Under the strong constraint the trajectory is a deterministic function of the
initial state, $u_t = M_t(u_0)$, where $M_t$ propagates
$u_0$ from $t_0$ to $t$. The dynamics enter the cost through the
composition $H_t \circ M_t$, that is, the observation operator
applied to the propagated state. The generative model is
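In standard notation, the model and the cost it induces can be sketched as follows. The background mean $u_b$, the covariances $B$ and $R_t$, and the noise $\varepsilon_t$ are assumed symbols that the surrounding text does not fix, and the tags (3) and (4) are attached on the assumption that these are the cost and gradient referred to by those numbers:

$$
u_0 \sim \mathcal{N}(u_b, B), \qquad
y_t = H_t\big(M_t(u_0)\big) + \varepsilon_t, \qquad
\varepsilon_t \sim \mathcal{N}(0, R_t), \qquad t = 0, 1, \dots, T.
$$

The negative log-posterior of $u_0$ is then, up to a constant, the cost

$$
J(u_0) = \tfrac{1}{2}\,(u_0 - u_b)^\top B^{-1} (u_0 - u_b)
       + \tfrac{1}{2} \sum_{t=0}^{T} \big(H_t(M_t(u_0)) - y_t\big)^\top R_t^{-1} \big(H_t(M_t(u_0)) - y_t\big),
\tag{3}
$$

and the chain rule gives its gradient

$$
\nabla J(u_0) = B^{-1}(u_0 - u_b)
              + \sum_{t=0}^{T} (M_t')^\top (H_t')^\top R_t^{-1} \big(H_t(M_t(u_0)) - y_t\big),
\tag{4}
$$

where $M_t'$ is the Jacobian of $M_t$ at $u_0$ and $H_t'$ is the Jacobian of $H_t$ at $u_t = M_t(u_0)$.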
Assembling (4) directly would require the dense Jacobians
$(M_t')^\top$. Instead the gradient is computed by a single backward sweep of
an adjoint variable $\lambda_t$, initialised at the final time and
accumulated as it propagates backward:
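A sketch of that recursion, under assumptions not fixed by the text: $m_t$ denotes a one-step propagator with $u_{t+1} = m_t(u_t)$ (so $M_t = m_{t-1} \circ \dots \circ m_0$), $B$ and $R_t$ are the background and observation-error covariances, $u_b$ is the background state, and the tag (5) is assumed to be the number the text uses for this sweep:

$$
\begin{aligned}
\lambda_T &= (H_T')^\top R_T^{-1} \big(H_T(u_T) - y_T\big), \\
\lambda_t &= (m_t')^\top \lambda_{t+1} + (H_t')^\top R_t^{-1} \big(H_t(u_t) - y_t\big), \qquad t = T-1, \dots, 0, \\
\nabla J(u_0) &= B^{-1}(u_0 - u_b) + \lambda_0 .
\end{aligned}
\tag{5}
$$

Unrolling the recursion gives $\lambda_0 = \sum_{t=0}^{T} (m_0')^\top \cdots (m_{t-1}')^\top (H_t')^\top R_t^{-1} (H_t(u_t) - y_t)$, which is the observation part of the gradient with $(M_t')^\top = (m_0')^\top \cdots (m_{t-1}')^\top$. Each step needs only one transposed-Jacobian-vector product, so no Jacobian is ever stored as a matrix.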
When the dynamics are a continuous flow $\partial_t u = f(u, t, \theta)$, the flow map is an ODE
solve, $M_t(u_0) = \mathrm{ODESolve}(f, u_0, \theta, [t_0, t])$,
and the discrete adjoint sweep (5) becomes a backward ODE.
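One common way to write that backward ODE is sketched below. It assumes observations at discrete times $t_k$ inside the window, covariances $B$ and $R_{t_k}$ and a background $u_b$ as in the usual quadratic cost, and no prior term on $\theta$; none of these are fixed by the text above.

$$
\begin{aligned}
-\frac{d\lambda}{dt} &= \Big(\frac{\partial f}{\partial u}\Big)^{\!\top} \lambda
  \quad \text{between observation times}, \qquad \lambda(T^+) = 0, \\
\lambda(t_k^-) &= \lambda(t_k^+) + (H_{t_k}')^\top R_{t_k}^{-1} \big(H_{t_k}(u(t_k)) - y_{t_k}\big)
  \quad \text{at each observation time } t_k, \\
\nabla_{u_0} J &= B^{-1}(u_0 - u_b) + \lambda(t_0^-), \qquad
\nabla_{\theta} J = \int_{t_0}^{T} \Big(\frac{\partial f}{\partial \theta}\Big)^{\!\top} \lambda \, dt .
\end{aligned}
$$

The adjoint is integrated from $T$ down to $t_0$, picking up a jump at every observation time; the same $\lambda$ also yields the parameter gradient when $\theta$ is part of the control variable.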
In modern code you rarely write the sweep by hand: define the cost
(3), wrap the rollout in a differentiable ODE solve, and call
`jax.grad(J)(u0)`. Reverse-mode automatic differentiation then composes
the per-step adjoints (pullbacks) of the dynamical model automatically.
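A minimal JAX sketch of this workflow, under stated assumptions: the Lorenz-63 dynamics, the fixed-step RK4 rollout (a stand-in for a differentiable ODE solve), the observation operator `observe`, and the names `u_b`, `b_inv`, `r_inv` are all hypothetical choices made for illustration. It also runs a hand-written backward sweep built from `jax.vjp` so the two gradients can be compared.

```python
import jax
import jax.numpy as jnp

DT = 0.01  # assumed step size of the stand-in integrator


def lorenz63(u):
    # Hypothetical dynamics f(u): Lorenz-63 with its usual parameters.
    x, y, z = u[0], u[1], u[2]
    return jnp.array([10.0 * (y - x), x * (28.0 - z) - y, x * y - (8.0 / 3.0) * z])


def rk4_step(u):
    # One fixed-step RK4 update, playing the role of the one-step propagator.
    k1 = lorenz63(u)
    k2 = lorenz63(u + 0.5 * DT * k1)
    k3 = lorenz63(u + 0.5 * DT * k2)
    k4 = lorenz63(u + DT * k3)
    return u + (DT / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


def rollout(u0, n_steps):
    # Trajectory u_0, ..., u_T with shape (n_steps + 1, 3).
    def body(u, _):
        u_next = rk4_step(u)
        return u_next, u_next

    _, tail = jax.lax.scan(body, u0, None, length=n_steps)
    return jnp.concatenate([u0[None, :], tail], axis=0)


def observe(u):
    # Hypothetical observation operator: only the first two components are seen.
    return u[:2]


def cost(u0, u_b, b_inv, ys, r_inv):
    # Background term plus observation misfit summed over the window.
    resid = jax.vmap(observe)(rollout(u0, ys.shape[0] - 1)) - ys
    j_b = 0.5 * (u0 - u_b) @ b_inv @ (u0 - u_b)
    j_o = 0.5 * jnp.sum((resid @ r_inv) * resid)
    return j_b + j_o


def adjoint_gradient(u0, u_b, b_inv, ys, r_inv):
    # Hand-written backward sweep: one transposed-Jacobian product per step.
    traj = rollout(u0, ys.shape[0] - 1)

    def forcing(u, y):
        h, h_vjp = jax.vjp(observe, u)
        return h_vjp(r_inv @ (h - y))[0]

    lam = forcing(traj[-1], ys[-1])
    for t in range(ys.shape[0] - 2, -1, -1):
        _, m_vjp = jax.vjp(rk4_step, traj[t])
        lam = m_vjp(lam)[0] + forcing(traj[t], ys[t])
    return b_inv @ (u0 - u_b) + lam


u_true = jnp.array([1.0, 1.0, 1.0])
ys = jax.vmap(observe)(rollout(u_true, 20))  # noise-free synthetic observations
u_b = jnp.array([1.2, 0.8, 1.1])             # background, also used as first guess
b_inv, r_inv = jnp.eye(3), 100.0 * jnp.eye(2)

g_auto = jax.grad(cost)(u_b, u_b, b_inv, ys, r_inv)
g_manual = adjoint_gradient(u_b, u_b, b_inv, ys, r_inv)
```

The two gradients should agree to floating-point tolerance; in practice only the `jax.grad` line is needed, and the manual sweep is there to show what it replaces.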
When $M_t$ is nonlinear, the cost (3) is in general non-convex and can have
multiple local minima. Mitigations include multi-start optimisation, warm-starting from
the previous cycle, or switching to incremental 4DVar, which solves a sequence
of linearised inner problems (each an OI/BLUE, exactly as in
3DVar's Gauss–Newton loop) and is the more robust choice for
production.
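The incremental scheme can be sketched in JAX as an outer loop that relinearises about the current state and an inner conjugate-gradient solve of the resulting quadratic problem. Everything specific here is an assumption made for illustration: the toy pendulum dynamics in `step`, the angle-only observation, the scalar observation precision `r_inv`, and the iteration counts. There is no line search or preconditioning, which an operational system would need.

```python
import jax
import jax.numpy as jnp
from jax.scipy.sparse.linalg import cg


def step(u, dt=0.05):
    # Hypothetical toy dynamics: one explicit-Euler step of a damped pendulum.
    theta, omega = u[0], u[1]
    return jnp.array([theta + dt * omega,
                      omega - dt * (jnp.sin(theta) + 0.1 * omega)])


def obs_residuals(u0, ys):
    # Stacked residuals H_t(M_t(u0)) - y_t; here H_t observes the angle only.
    def body(u, y):
        return step(u), u[:1] - y

    _, res = jax.lax.scan(body, u0, ys)
    return res.ravel()


def cost(u0, u_b, b_inv, ys, r_inv):
    d = obs_residuals(u0, ys)
    return 0.5 * (u0 - u_b) @ b_inv @ (u0 - u_b) + 0.5 * r_inv * d @ d


def incremental_4dvar(u_b, b_inv, ys, r_inv, n_outer=5, n_inner=20):
    u = u_b
    for _ in range(n_outer):
        resid_fn = lambda v: obs_residuals(v, ys)
        # Outer loop: linearise about the current trajectory.
        d, tangent = jax.linearize(resid_fn, u)   # tangent-linear model G
        _, adjoint = jax.vjp(resid_fn, u)         # adjoint model G^T

        def gn_hessian(du):
            # (B^-1 + G^T R^-1 G) du, applied matrix-free.
            return b_inv @ du + adjoint(r_inv * tangent(du))[0]

        # Inner loop: solve the linearised (quadratic) problem for the increment.
        rhs = -(b_inv @ (u - u_b) + adjoint(r_inv * d)[0])
        du, _ = cg(gn_hessian, rhs, maxiter=n_inner)
        u = u + du
    return u


u_true = jnp.array([1.0, 0.0])
# Residuals against zero observations are just the observed angles.
ys = obs_residuals(u_true, jnp.zeros((41, 1))).reshape(41, 1)
u_b = jnp.array([0.7, 0.3])
b_inv, r_inv = jnp.eye(2), 100.0

u_a = incremental_4dvar(u_b, b_inv, ys, r_inv)
```

Each inner problem is a linear least-squares solve with the same structure as a 3DVar analysis; only the linearisation point changes between outer iterations.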
A Laplace approximation gives a Gaussian posterior whose covariance is the
inverse Gauss–Newton Hessian of (3), evaluated along the trajectory
at the minimiser. Estimating it needs many Krylov iterations (tens of
Hessian–vector products, each a full forward+adjoint pass), so for production
uncertainty quantification, incremental 4DVar or an
ensemble / amortized posterior is cheaper.
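To make the cost concrete, the sketch below builds the Gauss–Newton Hessian both ways for a tiny problem: as a matrix-free Hessian–vector product (one tangent pass plus one adjoint pass, the unit of work a Krylov method repeats) and as a dense matrix that can be inverted directly. The pendulum dynamics, the angle-only observation, the scalar precision `r_inv`, and the use of the true state as a stand-in for the minimiser are all assumptions for illustration; the dense route is only feasible for very small state dimensions.

```python
import jax
import jax.numpy as jnp


def step(u, dt=0.05):
    # Hypothetical toy dynamics: one explicit-Euler step of a damped pendulum.
    theta, omega = u[0], u[1]
    return jnp.array([theta + dt * omega,
                      omega - dt * (jnp.sin(theta) + 0.1 * omega)])


def obs_residuals(u0, ys):
    # Stacked residuals H_t(M_t(u0)) - y_t; here H_t observes the angle only.
    def body(u, y):
        return step(u), u[:1] - y

    _, res = jax.lax.scan(body, u0, ys)
    return res.ravel()


def gn_hvp(u0, ys, b_inv, r_inv, v):
    # Matrix-free product (B^-1 + G^T R^-1 G) v.
    resid_fn = lambda w: obs_residuals(w, ys)
    _, g_v = jax.jvp(resid_fn, (u0,), (v,))   # forward (tangent) pass: G v
    _, adjoint = jax.vjp(resid_fn, u0)
    return b_inv @ v + adjoint(r_inv * g_v)[0]  # backward (adjoint) pass


def laplace_covariance(u0, ys, b_inv, r_inv):
    # Dense Gauss-Newton Hessian and its inverse; small problems only.
    g = jax.jacfwd(lambda w: obs_residuals(w, ys))(u0)
    hess = b_inv + r_inv * g.T @ g
    return jnp.linalg.inv(hess), hess


u_true = jnp.array([1.0, 0.0])
ys = obs_residuals(u_true, jnp.zeros((41, 1))).reshape(41, 1)  # observed angles
b_inv, r_inv = jnp.eye(2), 100.0
u_a = u_true  # stand-in for a previously computed analysis (minimiser)

cov, hess = laplace_covariance(u_a, ys, b_inv, r_inv)
v = jnp.array([1.0, -2.0])
hv = gn_hvp(u_a, ys, b_inv, r_inv, v)  # should match hess @ v
```

For a state of dimension $n$, the dense route costs $n$ tangent passes, whereas a Krylov method pays one tangent and one adjoint pass per iteration and recovers only the leading part of the spectrum.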
Le Dimet, F.-X., & Talagrand, O. (1986). Variational Algorithms for Analysis and Assimilation of Meteorological Observations: Theoretical Aspects. Tellus A, 38(2), 97–110. https://doi.org/10.3402/tellusa.v38i2.11706
Talagrand, O., & Courtier, P. (1987). Variational Assimilation of Meteorological Observations with the Adjoint Vorticity Equation. I: Theory. Quarterly Journal of the Royal Meteorological Society, 113(478), 1311–1328. https://doi.org/10.1002/qj.49711347812
Errico, R. M. (1997). What Is an Adjoint Model? Bulletin of the American Meteorological Society, 78(11), 2577–2591. https://doi.org/10.1175/1520-0477(1997)078<2577:WIAAM>2.0.CO;2