Skip to article frontmatterSkip to article content
Site not loading correctly?

This may be due to an incorrect BASE_URL configuration. See the MyST Documentation for reference.

Incremental 4DVar with Control-Variable Transform

CSIC
UCM
IGEO

Incremental 4DVar is the operational fast path for strong-constraint 4DVar. Rather than minimising the full nonlinear cost directly, it solves a sequence of linearised quadratic subproblems — a Gauss–Newton outer loop — each preconditioned by a control-variable transform Courtier et al., 1994Lorenc, 1997. It returns the same analysis as strong-constraint 4DVar, far faster.

The pattern is exactly 3DVar’s Gauss–Newton loop generalised to a time window: each inner problem is a linear-Gaussian (BLUE) analysis through the linearised dynamics.

The Outer / Inner Decomposition

Start from the strong-constraint cost. Linearising the dynamics and observations around the current outer iterate u(k)\boldsymbol{u}^{(k)},

Mt(u(k)+δu)Mt(u(k))+Mtδu,Ht(z+δz)Ht(z)+Htδz,M_t(\boldsymbol{u}^{(k)} + \delta\boldsymbol{u}) \approx M_t(\boldsymbol{u}^{(k)}) + M'_t\, \delta\boldsymbol{u}, \qquad H_t(\boldsymbol{z} + \delta\boldsymbol{z}) \approx H_t(\boldsymbol{z}) + H'_t\, \delta\boldsymbol{z},

gives a quadratic increment cost in δu\delta\boldsymbol{u},

δJ(δu)=12δuB12+12t=0TdtHtMtδuRt12,dt=ytHt ⁣(Mt(u(k))),\delta J(\delta\boldsymbol{u}) = \tfrac{1}{2} \| \delta\boldsymbol{u} \|^2_{\mathbf{B}^{-1}} + \tfrac{1}{2} \sum_{t=0}^{T} \| \boldsymbol{d}_t - H'_t M'_t\, \delta\boldsymbol{u} \|^2_{\mathbf{R}_t^{-1}}, \qquad \boldsymbol{d}_t = \boldsymbol{y}_t - H_t\!\big(M_t(\boldsymbol{u}^{(k)})\big),

where dt\boldsymbol{d}_t is the innovation at the current iterate. The minimiser solves the Gauss–Newton normal equations

HGNδu=t=0T(HtMt)Rt1dt,HGN=B1+t=0T(HtMt)Rt1(HtMt),\mathbf{H}_{\text{GN}}\, \delta\boldsymbol{u}^\star = \sum_{t=0}^{T} (H'_t M'_t)^\top \mathbf{R}_t^{-1} \boldsymbol{d}_t, \qquad \mathbf{H}_{\text{GN}} = \mathbf{B}^{-1} + \sum_{t=0}^{T} (H'_t M'_t)^\top \mathbf{R}_t^{-1} (H'_t M'_t),

after which the outer update u(k+1)=u(k)+δu\boldsymbol{u}^{(k+1)} = \boldsymbol{u}^{(k)} + \delta\boldsymbol{u}^\star relinearises and repeats.

Convergence is typically 3–5 outer iterations, largely independent of the strength of the nonlinearity.

Control-Variable Transform (CVT)

The condition number of HGN\mathbf{H}_{\text{GN}} inherits the eigenvalue spread of the prior B\mathbf{B}, which for a Matérn covariance on a 2D grid can span hundreds of orders of magnitude — crippling CG without preconditioning. The CVT changes variables to a whitened control,

χ=B1/2δu,δu=B1/2χ,\boldsymbol{\chi} = \mathbf{B}^{-1/2}\, \delta\boldsymbol{u}, \qquad \delta\boldsymbol{u} = \mathbf{B}^{1/2}\, \boldsymbol{\chi},

so the prior term becomes 12δuB12=12χ2\tfrac{1}{2}\|\delta\boldsymbol{u}\|^2_{\mathbf{B}^{-1}} = \tfrac{1}{2}\|\boldsymbol{\chi}\|^2 and the Hessian becomes

H~GN=I+B1/2(t=0T(HtMt)Rt1(HtMt))B1/2.\widetilde{\mathbf{H}}_{\text{GN}} = \mathbf{I} + \mathbf{B}^{1/2} \left( \sum_{t=0}^{T} (H'_t M'_t)^\top \mathbf{R}_t^{-1} (H'_t M'_t) \right) \mathbf{B}^{1/2}.

The square root B1/2\mathbf{B}^{1/2} comes from a structured covariance operator (e.g. a Matérn .half()); for non-structured priors, factorise via Cholesky (small grids) or Lanczos (large grids).

Implementation

A generic outer/inner loop: linearise the windowed observation map with jax.linearize, then solve the CVT-preconditioned normal equations with CG.

import jax
from jax.scipy.sparse.linalg import cg
from jaxtyping import Array, Float

def incremental_4dvar(
    ub:  Float[Array, "N"],     # background / initial outer iterate
    ys:  Float[Array, "T M"],   # observations over the window
    G,                          # windowed obs map  u0 -> {H_t(M_t(u0))}  : (N,) -> (T, M)
    apply_R_inv,                # (T, M) -> (T, M), block-diagonal R⁻¹ over time
    B_half,                     # χ -> B^{1/2} χ   (structured square root, symmetric)
    *, n_outer: int = 3, n_inner: int = 20,
) -> Float[Array, "N"]:
    u = ub
    for _ in range(n_outer):
        # 1. linearise the windowed observation map at the current iterate
        Gu, G_lin = jax.linearize(G, u)             # Gu = G(u) (T,M);  G_lin(v) = G'(u) v
        G_adj     = jax.linear_transpose(G_lin, u)  # (G')ᵀ
        d         = ys - Gu                         # innovations (T, M)

        # 2. CVT-preconditioned Gauss–Newton solve in χ-space:
        #    (I + B^{1/2} G'ᵀ R⁻¹ G' B^{1/2}) χ = B^{1/2} G'ᵀ R⁻¹ d
        def hessian(chi: Float[Array, "N"]) -> Float[Array, "N"]:
            v = B_half(chi)
            return chi + B_half(G_adj(apply_R_inv(G_lin(v)))[0])
        rhs = B_half(G_adj(apply_R_inv(d))[0])
        chi, _ = cg(hessian, rhs, maxiter=n_inner)

        # 3. outer update:  δu = B^{1/2} χ
        u = u + B_half(chi)
    return u

Typical knobs: n_outer ≈ 3 (Gauss–Newton outer iterations), n_inner ≈ 20–50 (CG steps per outer), plus CG tolerances and a cvt on/off switch.

Why It Is the Operational Fast Path

Three multiplicative speedups over running a generic nonlinear minimiser on the full strong-constraint cost:

CVT preconditioning
Linearisation reuse
Free posterior

CG converges in O(10)O(10) iterations instead of O(κ(B))O(\sqrt{\kappa(\mathbf{B})}) — the prior condition number κ(B)\kappa(\mathbf{B}) no longer drives the cost. See (5).

For a representative operational problem (T=12T = 12 h, Du=107D_u = 10^7 grid points, Dy=105D_y = 10^5 observations), incremental 4DVar converges in minutes where a generic nonlinear solver on the same cost takes hours.

Limiting Case and Scope

Strong vs. weak constraint. Incremental 4DVar is strong-constraint by default. The strong/weak choice is structural (does the control include ηt\boldsymbol{\eta}_t?); the inner-solver choice (generic vs. incremental) is performance. For weak-constraint problems the incremental machinery over the augmented control is heavier and not always worth it.

When to prefer strong-constraint 4DVar instead — almost never operationally, but:

For most uses, incremental 4DVar is the right default Bannister, 2017.

References
  1. Courtier, P., Thépaut, J.-N., & Hollingsworth, A. (1994). A Strategy for Operational Implementation of 4D-Var, Using an Incremental Approach. Quarterly Journal of the Royal Meteorological Society, 120(519), 1367–1387. 10.1002/qj.49712051912
  2. Lorenc, A. C. (1997). Development of an Operational Variational Assimilation Scheme. Journal of the Meteorological Society of Japan, 75(1B), 339–346. 10.2151/jmsj1965.75.1B_339
  3. Bannister, R. N. (2017). A Review of Operational Methods of Variational and Ensemble-Variational Data Assimilation. Quarterly Journal of the Royal Meteorological Society, 143(703), 607–633. 10.1002/qj.2982