Incremental 4DVar with Control-Variable Transform¶
Operational 4DVar at ECMWF, NCEP, JMA, and UKMO all follow a common
template: Gauss-Newton outer iterations on the full nonlinear cost,
CG (or Lanczos) inner iterations on a linearised quadratic
subproblem, with a control-variable transform that preconditions the
inner solve. IncrementalFourDVar packages this pattern.
It is functionally equivalent to StrongFourDVar (chapter 6) — same
problem, same answer — but the specialised inner solver is dramatically
faster when the prior is structured (Matérn, Kronecker) and the
window is long.
The outer / inner decomposition¶
Strong-constraint 4DVar minimises
Linearise around the current outer iterate \(x_b^{(k)}\):
where \(M'_t = \partial M_t / \partial x|_{x_b^{(k)}}\) from
jax.linearize and similarly \(H'_t\) from obs_op.linearize(...). The
linearised (quadratic) increment cost is
with innovation \(d_t = y_t - H_t(M_t(x_b^{(k)}))\). The Gauss-Newton Hessian:
The inner CG / Lanczos solver finds \(\delta x^*\) such that
Outer update: \(x_b^{(k+1)} = x_b^{(k)} + \delta x^*\). Relinearise.
This pattern converges in a handful of outer iterations (typically 3–5) regardless of nonlinearity strength, because each outer step sees a fresh tangent linearisation.
Control-variable transform (CVT)¶
The Hessian \(\mathbf{H}_\text{GN}\) has condition number set by the prior covariance eigenvalues spread over the correlation length scales of \(B\). For a Matérn \(B\) on a 2D grid, this can be hundreds of orders of magnitude — CG converges slowly without preconditioning.
CVT changes variables:
so \(\delta x = B^{1/2} \chi\). In \(\chi\)-space the prior cost is
and the Hessian becomes
The spectrum of \(\tilde{\mathbf{H}}_\text{GN}\) is bunched around 1 — CG converges in tens of iterations regardless of grid resolution.
\(B^{1/2}\) comes from gaussx.MaternLinearOperator.half():
import gaussx as gx
B_half = gx.MaternLinearOperator(
grid_coords=coords, length_scale=10.0, nu=1.5, sigma=1.0,
).half()
For non-Matérn structured priors, factorise via Cholesky (small grids) or Lanczos (large grids).
Implementation in vardax¶
import gaussx as gx
import lineax as lx
from vardax.models import IncrementalFourDVar
from vardax import IncrementalConfig
model = IncrementalFourDVar(
forward=somax_model,
obs_op=AveragingKernel(A=A, x_a=xa, h=h),
prior_mean=x_b,
prior_cov_op=gx.MaternLinearOperator(coords, length_scale=10.0, sigma=0.1),
obs_cov_op=lx.DiagonalLinearOperator(obs_uncertainty),
config=IncrementalConfig(n_outer=3, n_inner=20, cvt=True),
)
x_star = model(batch)
# Posterior is free — reuses the GN Hessian from the last outer iteration
posterior = model.posterior(batch)
IncrementalConfig:
n_outer— number of Gauss-Newton outer iterations (typical: 3)n_inner— max CG iterations per outer (typical: 20–50)cg_atol,cg_rtol— CG convergence tolerancescvt— apply control-variable transform (defaultTrue)
Algorithm¶
Inputs:
x_b — background mean
B_op — prior covariance (gaussx; supports .half())
R_op — observation-error covariance
forward — pipekit_cycle.ForwardModel
obs_op — pipekit_cycle.ObservationOperator
batch — observations y_t for t = 0, …, T
config — IncrementalConfig
x ← x_b
for k in range(config.n_outer):
# 1. Linearise M_t and H_t at the current outer iterate
M_lin = jax.linearize(lambda s: forward.step(s, dt), x)
H_lin = obs_op.linearize(forward_trajectory(x, batch))
# 2. Compute innovations
d = batch.y - H_lin(forward_trajectory(x, batch))
# 3. Inner solve via lineax.CG (in χ-space if config.cvt)
if config.cvt:
B_half = B_op.half()
# Solve (I + B^{1/2} (H'M')ᵀ R⁻¹ (H'M') B^{1/2}) χ = B^{1/2} (H'M')ᵀ R⁻¹ d
preconditioned_hessian = identity_op + B_half @ sum_t_hmrhm @ B_half
rhs = B_half @ adj_innovation
chi = lineax.linear_solve(
preconditioned_hessian, rhs,
solver=lineax.CG(atol=config.cg_atol, rtol=config.cg_rtol,
max_steps=config.n_inner),
)
dx = B_half @ chi
else:
# Identity preconditioner — slower but works without B^{1/2}
dx = lineax.linear_solve(...)
# 4. Outer update
x = x + dx
return x
Why it's the operational fast path¶
Three multiplicative speedups over StrongFourDVar with NonlinearCG:
- CVT preconditioning — CG converges in \(O(10)\) iterations instead of \(O(\sqrt{\kappa(B)})\) where \(\kappa(B)\) is the prior condition number.
- Linearisation reuse — the tangent linear \(M'\) is computed once
per outer iteration and reused across all \(n_\text{inner}\) inner CG
steps.
NonlinearCGon the full nonlinear cost recomputes the gradient (which requires a full forward + adjoint pass) every step. - Free posterior — the GN Hessian assembled in the last outer
iteration is reused for
LaplaceCovariance/GaussNewtonHessian. No separate Krylov pass needed for posterior diagonal.
For a typical operational problem (\(T = 12\) hours, \(n = 10^7\) grid
points, \(m = 10^5\) observations), IncrementalFourDVar converges in
minutes; StrongFourDVar with NonlinearCG takes hours.
Linear-Gaussian agreement¶
For \(T = 0\) and linear \(H\), IncrementalFourDVar reduces to
OptimalInterpolation. The outer iteration converges in one step (no
relinearisation needed), and the inner CG produces the closed-form
Kalman update:
def test_incremental_recovers_oi():
oi = OptimalInterpolation(linear_H, x_b, B_op, R_op)
inc = IncrementalFourDVar(
forward=static_forward, obs_op=linear_H,
prior_mean=x_b, prior_cov_op=B_op, obs_cov_op=R_op,
config=IncrementalConfig(n_outer=2, n_inner=50, cvt=True),
)
batch = make_linear_gaussian_batch(T=0)
assert jnp.allclose(oi(batch), inc(batch), atol=1e-3)
Part of the Decision D14 invariant.
Strong- vs weak-constraint¶
IncrementalFourDVar is strong-constraint by default. For
weak-constraint incremental 4DVar (the augmented control vector with
\(\eta_t\)), use WeakFourDVar directly — the incremental machinery for
the augmented problem is heavier and not always worth the complexity.
The strong / weak choice is structural; the inner-solver choice
(general vs incremental) is performance.
Posterior covariance — for free¶
The last outer iteration's GN Hessian is an AbstractLinearOperator
that can be inverted via Krylov for posterior diagonal or sampling:
posterior = model.posterior(batch)
# Posterior(mean=x_star, cov=GN_Hessian_inv_op, samples=None, provenance={...})
Compare StrongFourDVar, where computing the posterior requires a
fresh Krylov pass over the same Hessian. For IncrementalFourDVar the
Hessian is already there.
When to switch from IncrementalFourDVar to StrongFourDVar¶
Almost never, for operational work. IncrementalFourDVar is faster
and produces the same answer. Use StrongFourDVar only when:
- The prior is exotic (not Matérn / Kronecker / structured) and
gaussx.MaternLinearOperator.half()doesn't apply. The CVT preconditioner is then unavailable. - You want to swap in a custom optimistix minimiser (e.g.
optx.LevenbergMarquardtfor very nonlinear cases). Incremental uses a fixed GN+CG; the rest of the optimistix zoo isn't compatible with the incremental machinery.
For most users, IncrementalFourDVar is the right default.
See also¶
- Chapter 6 —
StrongFourDVar(equivalent problem, slower solver) - Chapter 12 — adjoint composition (CVT relies on
gaussxfactorisation) - Chapter 13 — posterior covariance (reuses GN Hessian)
- Design doc:
design/decisions.md#d11
References¶
- Courtier, P., Thépaut, J.-N., & Hollingsworth, A. (1994). A strategy for operational implementation of 4D-Var, using an incremental approach. QJRMS 120(519).
- Lorenc, A. C. (1997). Development of an operational variational assimilation scheme. JMSJ 75(1B).
- Bannister, R. N. (2017). A review of operational methods of variational and ensemble-variational data assimilation. QJRMS 143(703).