
Strong-Constrained 4DVar


Problem Formulation

Joint Distribution

$$p(\boldsymbol{y}_{1:T},\boldsymbol{u}_{0:T},\boldsymbol{\theta}) = p(\boldsymbol{\theta})\,p(\boldsymbol{u}_0)\prod_{t=1}^T p(\boldsymbol{y}_t|\boldsymbol{u}_t,\boldsymbol{\theta})$$

Posterior

$$p(\boldsymbol{u}_0,\boldsymbol{\theta}|\boldsymbol{y}_{1:T}) \propto p(\boldsymbol{\theta})\,p(\boldsymbol{u}_0)\prod_{t=1}^T p(\boldsymbol{y}_t|\boldsymbol{u}_t,\boldsymbol{\theta})$$

Loss Function

We minimize the negative log-posterior:

$$\boldsymbol{J}(\boldsymbol{u}_0,\boldsymbol{\theta}) = -\sum_{t=1}^T\log p(\boldsymbol{y}_t|\boldsymbol{u}_t,\boldsymbol{\theta}) - \log p(\boldsymbol{u}_0|\boldsymbol{\theta}) - \log p(\boldsymbol{\theta})$$

Each of these terms is Gaussian:

$$\begin{aligned}
\text{Data Likelihood}: && p(\boldsymbol{y}_t|\boldsymbol{u}_t,\boldsymbol{\theta}) &= \mathcal{N}(\boldsymbol{y}_t|\boldsymbol{h}(\boldsymbol{u}_t),\mathbf{C}_{\boldsymbol{yy}}) \\
\text{Prior State}: && p(\boldsymbol{u}_0|\boldsymbol{\theta}) &= \mathcal{N}(\boldsymbol{u}_0|\boldsymbol{\mu}_z,\boldsymbol{\Sigma}_{\boldsymbol{zz}}) \\
\text{Prior Parameters}: && p(\boldsymbol{\theta}) &= \mathcal{N}(\boldsymbol{\theta}|\boldsymbol{\mu}_\theta,\boldsymbol{\Sigma}_{\boldsymbol{\theta\theta}}) \\
\end{aligned}$$
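As a minimal sketch, each Gaussian term can be evaluated with jax.scipy.stats; the observation operator obs_op (our h) and the covariance arguments are illustrative assumptions, not part of the original.

import jax.numpy as jnp
from jax.scipy.stats import multivariate_normal

def nll_data(y_t, u_t, obs_op, C_yy):
    # -log N(y_t | h(u_t), C_yy)
    return -multivariate_normal.logpdf(y_t, obs_op(u_t), C_yy)

def nll_prior(x, mu, Sigma):
    # -log N(x | mu, Sigma); reused for both the state and parameter priors
    return -multivariate_normal.logpdf(x, mu, Sigma)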

Algorithm

Data

$$\text{Observations}: \quad \mathcal{D} = \left\{ \boldsymbol{y}_t\right\}_{t=1}^{T}, \qquad \boldsymbol{y}_t \in \mathbb{R}^{D_y}$$
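For concreteness, a minimal sketch of how the observations might be stored (the sizes here are hypothetical):

import jax.numpy as jnp

# hypothetical sizes: T observations, each of dimension Dy
T, Dy = 100, 3
y_obs: Array["T Dy"] = jnp.zeros((T, Dy))  # stacked observations, y_t in R^Dy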

Dynamical Model

$$\partial_t \boldsymbol{u} = \boldsymbol{f}(\boldsymbol{u},t,\boldsymbol{\theta})$$
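For example, a minimal sketch of such an f, here assuming the Lorenz-63 system with θ = (σ, ρ, β); the document does not fix a particular model.

import jax.numpy as jnp

def f(u, t, theta):
    # hypothetical dynamical model: Lorenz-63, theta = (sigma, rho, beta)
    sigma, rho, beta = theta
    x, y, z = u
    return jnp.array([
        sigma * (y - x),
        x * (rho - z) - y,
        x * y - beta * z,
    ])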

Adjoint Model

$$\partial_t \boldsymbol{\lambda} = \boldsymbol{f}^* \left( \boldsymbol{\lambda}, t, \boldsymbol{u}, \boldsymbol{y} \right)$$
from typing import Callable
import jax

def init_f_adjoint(f) -> Callable:
    # jax.vjp returns (f(u), vjp_fn); the adjoint applies vjp_fn to lam,
    # i.e. it computes J_f(u)^T @ lam
    def f_adjoint(u, lam):
        return jax.vjp(f, u)[1](lam)[0]
    return f_adjoint
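A quick (hypothetical) usage example, freezing t and θ so the adjoint is taken with respect to the state, as in the Lorenz-63 sketch above:

import jax.numpy as jnp

# hypothetical parameters for the Lorenz-63 sketch above
theta = jnp.array([10.0, 28.0, 8.0 / 3.0])
# freeze t and theta; f_adjoint(u, lam) then returns J_f(u)^T @ lam
f_adjoint = init_f_adjoint(lambda u: f(u, 0.0, theta))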

Initial Condition

We need an initial condition for the state, u0\boldsymbol{u}_0, and the parameters, θ\boldsymbol{\theta}.

$$\begin{aligned}
\text{Initial State}: && \boldsymbol{u}_0 &\in \mathbb{R}^{D_u} \\
\text{Initial Parameters}: && \boldsymbol{\theta}_0 &\in \mathbb{R}^{D_\theta} \\
\end{aligned}$$
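A minimal sketch, assuming we initialize both from their (hypothetical) prior means:

import jax.numpy as jnp

# hypothetical prior means, matching the Lorenz-63 sketch above
mu_z = jnp.array([1.0, 1.0, 1.0])
mu_theta = jnp.array([10.0, 28.0, 8.0 / 3.0])

# a natural choice is to start at the prior means
u_0 = mu_z
theta_0 = mu_theta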

Integrate Forward Model

We need the list of time steps that match the observations:

$$\mathbf{T}^+ = \left[t_0,t_1,t_2, \ldots, t_T\right]$$
# time step
dt: float = 0.01
# final time of the window (num_time_steps steps of size dt)
t_final: float = num_time_steps * dt
# list of time steps
time_steps: Array["Nt"] = np.arange(0.0, t_final, dt)

We need to pass this through a solver to get our states.

$$\mathbf{U} = \text{ODESolve}\left( \boldsymbol{f},\boldsymbol{u}_0,\boldsymbol{\theta},\mathbf{T}^+\right)$$
state_sol: PyTree = ode_solve(f, u_0, params, time_steps)

The solver returns a matrix containing the model state at every time step.

$$\mathbf{U} = \left[\boldsymbol{u}_1, \boldsymbol{u}_2, \ldots, \boldsymbol{u}_T \right] \in \mathbb{R}^{T\times D_u}$$
# extract state
u_sol: Array["Nt Du"] = state_sol.u
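ode_solve is left unspecified above; here is a minimal forward-Euler sketch (a real implementation would use a higher-order scheme, e.g. from diffrax, and return a richer PyTree rather than a bare array):

import jax
import jax.numpy as jnp

def ode_solve(f, u_0, params, time_steps):
    # forward-Euler integration; returns the state at every time step
    def step(u, ts):
        t, t_next = ts
        u_next = u + (t_next - t) * f(u, t, params)
        return u_next, u_next
    pairs = jnp.stack([time_steps[:-1], time_steps[1:]], axis=-1)
    _, u_traj = jax.lax.scan(step, u_0, pairs)
    return jnp.concatenate([u_0[None], u_traj], axis=0)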

Integrate Backward Adjoint Model

Now we do the opposite: we run the solver in reverse using the adjoint model. First, we initialize the reversed list of time steps.

$$\mathbf{T}^- = \left[ t_T, t_{T-1}, t_{T-2}, \ldots, t_2, t_1, t_0\right]$$
# list of inverse time steps
time_steps_reverse: Array["Nt"] = time_steps[::-1]
$$\begin{aligned}
\partial_t \boldsymbol{\lambda} &= \boldsymbol{f}^* \left( \boldsymbol{\lambda}, t, \boldsymbol{u}, \boldsymbol{y} \right) \\
\boldsymbol{f}^*(\boldsymbol{u}_t,\boldsymbol{\lambda},\boldsymbol{y},\boldsymbol{\theta}) &= \boldsymbol{J_f}^\top(\boldsymbol{u}_t)\,\boldsymbol{\lambda} - \boldsymbol{J_h}^\top(\boldsymbol{u}_t)\,\mathbf{C}_{\boldsymbol{yy}}^{-1} \left(\boldsymbol{h}(\boldsymbol{u}_t) - \boldsymbol{y}_t\right)
\end{aligned}$$

Now, starting from the terminal condition $\boldsymbol{\lambda}_T = \boldsymbol{0}$, we iterate backward through each of these time steps:

$$\boldsymbol{\lambda}_{t-1} = \boldsymbol{\lambda}_t + \Delta t\, \boldsymbol{f}^*(\boldsymbol{u}_t,\boldsymbol{\lambda}_t,\boldsymbol{y}_t,\boldsymbol{\theta})$$
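A minimal sketch of this backward sweep, assuming observations at every time step, a linear observation operator h(u) = u (so J_h = I), and a hypothetical precision matrix C_yy_inv:

import jax.numpy as jnp

def integrate_adjoint(f_adjoint, u_sol, y_obs, time_steps, C_yy_inv):
    # start from the terminal condition lambda_T = 0 and step back to t_0
    lam = jnp.zeros(u_sol.shape[1])
    lams = [lam]
    for t in range(len(time_steps) - 1, 0, -1):
        dt = time_steps[t] - time_steps[t - 1]
        # f* = J_f^T lam - C_yy^{-1} (h(u_t) - y_t), with h(u) = u
        rhs = f_adjoint(u_sol[t], lam) - C_yy_inv @ (u_sol[t] - y_obs[t])
        lam = lam + dt * rhs  # Euler step in reverse time
        lams.append(lam)
    return jnp.stack(lams[::-1])  # [lambda_0, ..., lambda_T]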

Loss Gradients


State

First, we have the gradient of the loss with respect to the initial state:

$$\boldsymbol{\nabla}_{\boldsymbol{u}_0}\boldsymbol{L}(\boldsymbol{u}_0,\boldsymbol{\theta},\boldsymbol{\lambda}_0) = \boldsymbol{\Sigma}_{\boldsymbol{zz}}^{-1} \left(\boldsymbol{u}_0 - \boldsymbol{\mu}_z \right) - \boldsymbol{\lambda}_0$$

So a single gradient step would be, with step size α and preconditioning matrix B:

$$\boldsymbol{u}_0^{k+1} = \boldsymbol{u}_0^{k} - \alpha\, \mathbf{B}\,\boldsymbol{\nabla}_{\boldsymbol{u}_0} \boldsymbol{L}(\boldsymbol{u}_0^{k},\boldsymbol{\theta},\boldsymbol{\lambda}_0)$$
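A minimal sketch of this update, taking B as the identity (names follow the sketches above; Sigma_zz_inv is a hypothetical precision matrix):

def grad_u0(u_0, lam_0, mu_z, Sigma_zz_inv):
    # background term minus the adjoint evaluated at t = 0
    return Sigma_zz_inv @ (u_0 - mu_z) - lam_0

# one gradient step with step size alpha, B = identity
alpha = 1e-2
u_0 = u_0 - alpha * grad_u0(u_0, lams[0], mu_z, Sigma_zz_inv)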

Parameters

Second, we have the gradient of the loss with respect to the parameters:

$$\boldsymbol{\nabla}_{\boldsymbol{\theta}}\boldsymbol{L}(\boldsymbol{u}_0,\boldsymbol{\theta},\boldsymbol{\lambda}_0) = \boldsymbol{\Sigma}_{\boldsymbol{\theta\theta}}^{-1}\left(\boldsymbol{\theta} - \boldsymbol{\mu}_\theta\right) - \sum_{t=1}^T \boldsymbol{J_f}^\top(\boldsymbol{u}_t,\boldsymbol{\theta})\,\boldsymbol{\lambda}_{t}$$

where $\boldsymbol{J_f}(\boldsymbol{u}_t,\boldsymbol{\theta})$ is the Jacobian of $\boldsymbol{f}$ with respect to $\boldsymbol{\theta}$, evaluated along the trajectory.

So a single gradient step would be

$$\boldsymbol{\theta}^{k+1} = \boldsymbol{\theta}^{k} - \alpha\, \mathbf{B}\,\boldsymbol{\nabla}_{\boldsymbol{\theta}} \boldsymbol{L}(\boldsymbol{u}_0^{k},\boldsymbol{\theta}^{k},\boldsymbol{\lambda}_0)$$
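Finally, a minimal sketch of the parameter gradient and one outer iteration tying the pieces together (all names come from the sketches above; Sigma_tt_inv is a hypothetical precision matrix and B is again the identity):

import jax
import jax.numpy as jnp

def grad_theta(theta, u_sol, lams, time_steps, mu_theta, Sigma_tt_inv):
    # prior term plus the accumulated adjoint sensitivities sum_t J_f^T lam_t
    g = Sigma_tt_inv @ (theta - mu_theta)
    for t in range(1, len(time_steps)):
        # vjp of f with respect to theta, evaluated along the trajectory
        _, vjp_fn = jax.vjp(lambda th: f(u_sol[t], time_steps[t], th), theta)
        g = g - vjp_fn(lams[t])[0]
    return g

# one outer iteration of strong-constrained 4DVar
u_sol = ode_solve(f, u_0, theta_0, time_steps)
lams = integrate_adjoint(f_adjoint, u_sol, y_obs, time_steps, C_yy_inv)
u_0 = u_0 - alpha * grad_u0(u_0, lams[0], mu_z, Sigma_zz_inv)
theta_0 = theta_0 - alpha * grad_theta(theta_0, u_sol, lams, time_steps, mu_theta, Sigma_tt_inv)
# note: f_adjoint should be rebuilt whenever theta changes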