Tier IV — End-to-end coupled system

UNEP
IMEO
MARS

Forward model: transport + RTM + multi-instrument fusion, from source parameters all the way to simulated radiances across multiple satellites simultaneously. This is the full operational pipeline.

Source params (Q_{1:K}(t), x₀_{1:K}, t₀_{1:K},  ū, θ_wind, c_bg, α_BC, …)
       ↓  [Tier I/II/III transport]
Concentration field  c(x,t)
       ↓  [RTM / AK operator,  per instrument]
Simulated observations  {y_inst}_{inst ∈ {TROPOMI, EMIT, Tanager, GHGSat, …}}
       ↑
Cross-instrument bias correction  bias_inst

Tier IV is assembly + multi-instrument fusion, not new modelling: it composes transport (any of Tiers I–III) with the RTM stack and joins observations from multiple satellites into a single coherent posterior. The contribution at this tier is the joint multi-instrument inference, the operational predictor, and the cross-instrument calibration.


(1) Simple model — composed forward over multiple instruments

Per-instrument forward

For a single instrument $\text{inst}$:

$$
\mathbf{y}_\text{inst} \;=\; \mathbf{A}_\text{inst}\, \mathrm{col}_z\!\bigl(\text{transport}(Q(t), x_0, \text{met}) + \mathbf{c}_\text{bg}\bigr) \;+\; \text{bias}_\text{inst} \;+\; \boldsymbol{\varepsilon}_\text{inst}
$$

$$
\boldsymbol{\varepsilon}_\text{inst} \sim \mathcal{N}(\mathbf{0}, \mathbf{R}_{\text{retr},\text{inst}} + \mathbf{R}_{\text{repr},\text{inst}} + \mathbf{R}_{\text{align},\text{inst}})
$$
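A minimal NumPy sketch of the per-instrument forward above, under stated assumptions: the transport output is a ready-made toy array `c_plume`, `col_z` is a weighted vertical sum with hypothetical `layer_weights`, and `A_inst` is a simple pixel-averaging matrix standing in for the real AK and footprint operator. None of these names come from the codebase.

```python
import numpy as np

def forward_single_instrument(c_plume, c_bg, layer_weights, A_inst, bias_inst):
    """Noise-free part of y_inst = A_inst col_z(c_plume + c_bg) + bias_inst."""
    # col_z: weighted vertical sum, (n_cells, n_layers) -> (n_cells,)
    column = (c_plume + c_bg) @ layer_weights
    # A_inst: map model-grid columns onto instrument pixels, (n_pixels, n_cells)
    return A_inst @ column + bias_inst

n_cells, n_layers, n_pixels = 6, 4, 3
c_plume = np.zeros((n_cells, n_layers))
c_plume[2, 0] = 50.0                               # enhancement in one near-surface cell
c_bg = np.full((n_cells, n_layers), 1900.0)        # uniform background (toy value)
layer_weights = np.full(n_layers, 1.0 / n_layers)  # equal-weight layers (assumption)
A_inst = np.kron(np.eye(n_pixels), np.full((1, 2), 0.5))  # each pixel averages 2 cells
bias_inst = 2.0

y_clean = forward_single_instrument(c_plume, c_bg, layer_weights, A_inst, bias_inst)

# Noise covariance is the sum of the three error budgets in the text.
R_retr = 4.0 * np.eye(n_pixels)
R_repr = 1.0 * np.eye(n_pixels)
R_align = 0.25 * np.eye(n_pixels)
R_inst = R_retr + R_repr + R_align

rng = np.random.default_rng(0)
y_inst = y_clean + rng.multivariate_normal(np.zeros(n_pixels), R_inst)
```

Only the middle pixel sees the enhancement here, and it is diluted by both the vertical weighting and the footprint average, which is the effect the operator $\mathbf{A}_\text{inst}$ is meant to represent.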

State vector — full enumeration

Single-overpass coupled inference works in a state space far larger than just $(Q, x_0, t_0)$:

$$
\mathbf{x} \;=\; \bigl(\, Q_{1:K}(t),\; x_{0,1:K},\; \bar{u},\, \theta_\text{wind},\; \mathbf{c}_\text{bg},\; \alpha_\text{BC},\; \text{bias}_\text{inst},\; A_\text{surf},\, \text{AOD},\, \dots \bigr)
$$

with trans-dimensional $K = n_\text{sources}$ (basin case). Single-source single-instrument is the sanity-check special case, not the operational target.

Multi-instrument fusion

The joint observation operator is a list-of-forwards keyed on instrument_id, not a single forward:

$$
\mathbf{y} \;=\; [\mathbf{y}_\text{TROPOMI},\, \mathbf{y}_\text{EMIT},\, \mathbf{y}_\text{Tanager},\, \mathbf{y}_\text{GHGSat},\, \dots], \qquad \mathbf{H}(\mathbf{x}) \;=\; \bigl[\, \mathbf{H}_\text{inst}(\mathbf{x}) \;:\; \text{inst} \in \text{instruments} \,\bigr]
$$

Each $\mathbf{H}_\text{inst}$ carries its own AK, footprint, native resolution, observation time, and quality-flag schema (see Veefkind et al., 2012; Green & others, 2022; Carbon Mapper, 2024; GHGSat Inc., 2016).
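The list-of-forwards idea can be sketched as a dictionary of callables keyed on instrument ID. This is an illustrative sketch, not the proposed `fusion` module: `make_linear_forward`, `joint_forward`, and `stack` are hypothetical names, and each toy forward is a plain linear map rather than a real transport + AK chain.

```python
import numpy as np

def make_linear_forward(A, bias):
    """Return a toy H_inst(x) = A x + bias for one instrument."""
    A = np.asarray(A, dtype=float)

    def forward(x):
        return A @ x + bias

    return forward

def joint_forward(x, forwards):
    """H(x) as a dict of per-instrument outputs, keyed on instrument_id."""
    return {inst_id: H(x) for inst_id, H in forwards.items()}

def stack(by_inst, order):
    """Concatenate per-instrument vectors in a fixed instrument order."""
    return np.concatenate([np.atleast_1d(by_inst[i]) for i in order])

# Each instrument keeps its own native resolution: no common grid is imposed.
forwards = {
    "TROPOMI": make_linear_forward(np.full((1, 4), 0.25), bias=0.5),  # one coarse pixel
    "EMIT": make_linear_forward(np.eye(4), bias=-0.1),                # four fine pixels
}

x = np.array([1.0, 2.0, 3.0, 4.0])
sim = joint_forward(x, forwards)
y_stacked = stack(sim, order=["TROPOMI", "EMIT"])
```

Keeping the outputs in a dict until the last moment means per-instrument metadata (mask, observation time, error covariance) can travel under the same key; stacking is only needed when a solver wants a single vector.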

Spatiotemporal alignment: $Q(t)$ as a stochastic process

Different satellites overpass at different times. With a static $Q$, the coupled forward implies the same source state at every overpass, which is wrong for intermittent or leak emissions and wrong over multi-day windows.

Default: $Q(t) \sim \text{Ornstein–Uhlenbeck}$ with a basin-typical correlation timescale (hours to days), or a Gaussian process prior with Matérn-3/2 covariance. The OU form is:

$$
\mathrm{d} Q(t) \;=\; -\theta\, \bigl(Q(t) - \mu_Q\bigr)\, \mathrm{d}t \;+\; \sigma_Q\, \mathrm{d}W(t)
$$

This captures intermittent and burst emissions naturally.
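Because overpass times are irregular, it helps that the OU process has an exact transition density: over a gap $\Delta t$ the mean relaxes toward $\mu_Q$ by a factor $e^{-\theta \Delta t}$ and the variance is $\tfrac{\sigma_Q^2}{2\theta}\bigl(1 - e^{-2\theta \Delta t}\bigr)$. The sketch below samples $Q$ at a hypothetical list of overpass times using that transition; the function names and parameter values are illustrative only. Note that a plain OU process can go negative, so a real implementation might apply it to $\log Q$ instead (an assumption, not something stated above).

```python
import math
import random

def ou_step(q, dt, theta, mu_q, sigma_q, rng):
    """Exact OU transition over a time gap dt (no Euler discretisation error)."""
    decay = math.exp(-theta * dt)
    mean = mu_q + (q - mu_q) * decay
    var = sigma_q**2 / (2.0 * theta) * (1.0 - decay**2)
    return mean + math.sqrt(var) * rng.gauss(0.0, 1.0)

def sample_q_at_overpasses(times, q0, theta, mu_q, sigma_q, seed=0):
    """Sample Q at irregular overpass times, starting from Q(times[0]) = q0."""
    rng = random.Random(seed)
    qs = [q0]
    for t_prev, t_next in zip(times, times[1:]):
        qs.append(ou_step(qs[-1], t_next - t_prev, theta, mu_q, sigma_q, rng))
    return qs

# Hypothetical overpass times in hours; 1/theta = 12 h correlation timescale.
overpass_hours = [0.0, 3.5, 10.0, 27.0]
q_path = sample_q_at_overpasses(
    overpass_hours, q0=500.0, theta=1.0 / 12.0, mu_q=300.0, sigma_q=40.0
)

# With sigma_q = 0 the path is a deterministic relaxation toward mu_q.
q_det = sample_q_at_overpasses([0.0, 12.0], q0=500.0, theta=1.0 / 12.0,
                               mu_q=300.0, sigma_q=0.0)
```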

Build order

Start with the cheapest combination that is still physically coherent, with multi-instrument fusion enabled from day 1:

  1. Tier I + AK + L2 fusion across {TROPOMI, GHGSat, EMIT} for static $Q$. This is the v1 target: Methane Alert and Response System (MARS, UNEP-IMEO) style attribution with multi-satellite cross-validation.
  2. Lagrangian (Tier II) + AK + L2 fusion → handles wind-driven plumes; same fusion harness.
  3. FV (Tier III) + neural RTM + L1 fusion → full L1-radiance inversion with end-to-end gradients.
  4. $Q(t)$ stochastic-process upgrade once multi-day events appear in the catalog.

The point: don’t try to ship the most complex tier first. Each upgrade replaces a single block in the diagram; the multi-instrument fusion harness, likelihood structure, and observational comparison stay the same.


(2) Model-based inference

End-to-end gradient — honest cost

$$
\nabla_{\mathbf{x}} J \;=\; \nabla_{\mathbf{x}}\!\left[\, \sum_\text{inst} \tfrac{1}{2}\lVert \mathbf{H}_\text{inst}(\mathbf{x}) - \mathbf{y}_\text{inst} \rVert^{2}_{\mathbf{R}_\text{inst}} \;+\; \tfrac{1}{2}\lVert \mathbf{x} - \mathbf{x}_b \rVert^{2}_{\mathbf{B}} \right]
$$

JAX autodiff propagates through transport + RTM jointly, with no chain rule by hand. Cost is non-trivial: each gradient call runs transport + RTM for every instrument’s $\mathbf{H}_\text{inst}$. For Tier III + HAPI that’s seconds-to-minutes per call; emulator-based inference (Step 4) is the operational path.

Cost function

Three terms:

$$
J(\mathbf{x}) \;=\; \underbrace{\sum_\text{inst} \tfrac{1}{2}\lVert \mathbf{H}_\text{inst}(\mathbf{x}) - \mathbf{y}_\text{inst} \rVert^{2}_{\mathbf{R}_\text{inst}}}_{\text{observations, per-instrument}} \;+\; \underbrace{\tfrac{1}{2}\lVert \mathbf{x} - \mathbf{x}_b \rVert^{2}_{\mathbf{B}}}_{\text{prior on full state}} \;+\; \underbrace{\tfrac{1}{2}\lVert Q(t) - \mu_Q(t) \rVert^{2}_{\mathbf{K}_Q}}_{Q(t)\text{ stochastic-process prior}}
$$

$\mathbf{B}$ carries the structured priors from §1 (lognormal $Q$, met-tight $\bar{u}$, $\theta_\text{wind}$, GP/OU on $Q(t)$, etc.); $\mathbf{K}_Q$ is the OU/GP kernel. $\mathbf{R}_\text{inst}$ includes representation, retrieval, and temporal-alignment terms.
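The three terms can be written out directly. The sketch below assumes the usual data-assimilation convention $\lVert \mathbf{r} \rVert^2_{\mathbf{C}} = \mathbf{r}^\top \mathbf{C}^{-1} \mathbf{r}$, builds $\mathbf{K}_Q$ from the stationary OU covariance $\tfrac{\sigma_Q^2}{2\theta} e^{-\theta |t_i - t_j|}$, and passes the sampled $Q(t)$ separately from the rest of the state for clarity. The function names and the toy identity forward are illustrative assumptions.

```python
import numpy as np

def sq_norm(r, cov):
    """Covariance-weighted squared norm r^T cov^{-1} r."""
    return float(r @ np.linalg.solve(cov, r))

def ou_kernel(times, theta, sigma_q):
    """Stationary OU covariance matrix evaluated at the given times."""
    lag = np.abs(times[:, None] - times[None, :])
    return sigma_q**2 / (2.0 * theta) * np.exp(-theta * lag)

def cost(x, q, forwards, obs, R, x_b, B, mu_q, K_Q):
    """J = per-instrument misfit + prior on state + stochastic-process prior on Q(t)."""
    j_obs = sum(0.5 * sq_norm(forwards[i](x) - obs[i], R[i]) for i in obs)
    j_prior = 0.5 * sq_norm(x - x_b, B)
    j_q = 0.5 * sq_norm(q - mu_q, K_Q)
    return j_obs + j_prior + j_q

# Toy setup: one instrument that observes the 2-element state directly.
forwards = {"TROPOMI": lambda x: x}
x_b = np.array([1.0, 2.0])
obs = {"TROPOMI": x_b.copy()}
R = {"TROPOMI": 4.0 * np.eye(2)}
B = np.eye(2)
times = np.array([0.0, 6.0, 30.0])      # hypothetical overpass times, hours
mu_q = np.full(3, 300.0)
K_Q = ou_kernel(times, theta=1.0 / 12.0, sigma_q=40.0)

j_at_prior = cost(x_b, mu_q, forwards, obs, R, x_b, B, mu_q, K_Q)
j_shifted = cost(x_b + np.array([1.0, 0.0]), mu_q, forwards, obs, R, x_b, B, mu_q, K_Q)
```

In the toy case the cost is zero at the prior mean, and a unit shift in the first state element costs 0.125 from the observation term plus 0.5 from the prior term.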

Quality-flag handling

Per-instrument quality flags from the RTM stack flow into the coupled forward. Default policy: flagged pixels contribute zero log-likelihood (mask multiplier in $\mathbf{R}^{-1}$).
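For a diagonal $\mathbf{R}$, the mask multiplier amounts to zeroing the inverse-variance weight of each flagged pixel. A small sketch, assuming a boolean `good` array where `True` means the pixel passed its quality checks (the name and the diagonal-covariance simplification are assumptions):

```python
import numpy as np

def masked_neg_log_likelihood(residual, sigma, good):
    """0.5 * sum of squared standardised residuals over unflagged pixels only.

    Equivalent to multiplying the diagonal of R^{-1} by the mask. Using
    np.where on the per-pixel term also keeps NaN fill values in flagged
    pixels from leaking into the sum.
    """
    per_pixel = np.where(good, (residual / sigma) ** 2, 0.0)
    return 0.5 * float(np.sum(per_pixel))

residual = np.array([1.0, np.nan, 2.0])   # flagged pixel carries a NaN fill value
sigma = np.array([1.0, 1.0, 2.0])
good = np.array([True, False, True])

nll = masked_neg_log_likelihood(residual, sigma, good)
```

Masking the per-pixel term rather than multiplying by a 0/1 weight matters in practice, because `0 * NaN` is still `NaN`.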

Posterior covariance

Three paths, mirroring Tier III:

Posterior export to Tier V.A is via the same adapter pattern as Tiers I/II/III.

Trans-dimensional $n_\text{sources}$

$K = n_\text{sources}$ is itself unknown. Three options:

v1: max-K with masking ($K_\text{max} = 10$ per basin tile). Promote to RJMCMC when basin events exceed $K_\text{max}$ regularly.
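Max-K masking keeps array shapes fixed regardless of the true source count, which is what makes it convenient for compiled or batched code. A minimal sketch, assuming 2-D source locations and hypothetical helper names `pad_sources` and `total_emission`:

```python
import numpy as np

K_MAX = 10  # v1 slot count per basin tile, as stated in the text

def pad_sources(q, x0, k_max=K_MAX):
    """Pad K sources into k_max fixed slots and return an activity mask."""
    q = np.asarray(q, dtype=float)
    x0 = np.asarray(x0, dtype=float).reshape(-1, 2)
    k = q.shape[0]
    if k > k_max:
        # This is the saturation case that would motivate moving to RJMCMC.
        raise ValueError(f"{k} sources exceed k_max={k_max}")
    q_pad = np.zeros(k_max)
    x0_pad = np.zeros((k_max, 2))
    q_pad[:k] = q
    x0_pad[:k] = x0
    active = np.arange(k_max) < k
    return q_pad, x0_pad, active

def total_emission(q_pad, active):
    """Sum emission rates over active slots only."""
    return float(np.sum(np.where(active, q_pad, 0.0)))

q_pad, x0_pad, active = pad_sources(
    [120.0, 40.0, 300.0], [[0.0, 0.0], [1.5, 2.0], [4.0, 1.0]]
)
total = total_emission(q_pad, active)
```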


(3) Model emulator — coupled vs. stacked

Two architectural choices:

Stacked emulators (tier-modular)

Compose Tier-N transport emulator + RTM emulator at runtime.

Coupled emulator (single network)

$$
g_\phi : (\text{met fields},\, \text{source params},\, \text{instrument metadata}) \;\longmapsto\; \text{simulated multi-instrument overpass tensor}
$$

Decision rule

Both should exist; the coupled emulator is validated against the stacked composition before deployment.

Training-data budget

“Millions of pairs” naively needs $O(10^{6})$ transport+RTM simulations. For Tier III + HAPI that’s CPU-years on a single machine.

Domain randomization

Sample the joint $(\text{met regime}, \text{source configuration}, \text{scene class}, \text{viewing geometry}, \text{instrument}, n_\text{sources})$ distribution with stratified sampling, not uniform sampling. Naive uniform sampling under-represents the tail regimes that actually drive operational failures.
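One simple form of stratification is equal allocation: every stratum gets the same number of simulations, and each sample carries a weight that undoes the oversampling of rare strata. The sketch below stratifies over met regime only, with made-up regime names and frequencies; a real scheduler would stratify over the full joint listed above.

```python
import random

def allocate_equal(strata_probs, n_total):
    """Equal sample count per stratum, plus the weight p_stratum / q_stratum."""
    n_strata = len(strata_probs)
    n_each = n_total // n_strata
    # Sampling share q = 1 / n_strata, so weight = p / q = p * n_strata.
    return {
        name: {"n": n_each, "weight": p * n_strata}
        for name, p in strata_probs.items()
    }

def draw_training_configs(plan, seed=0):
    """Draw simulation configs; n_sources is uniform within each stratum here."""
    rng = random.Random(seed)
    configs = []
    for regime, spec in plan.items():
        for _ in range(spec["n"]):
            configs.append({
                "met_regime": regime,
                "n_sources": rng.randint(1, 10),
                "weight": spec["weight"],
            })
    return configs

# Hypothetical climatological frequencies of three met regimes.
regime_probs = {"typical": 0.90, "calm": 0.08, "storm": 0.02}
plan = allocate_equal(regime_probs, n_total=300)
configs = draw_training_configs(plan)
n_calm = sum(1 for c in configs if c["met_regime"] == "calm")
mean_weight = sum(c["weight"] for c in configs) / len(configs)
```

Under climatological sampling the "storm" regime would receive about 6 of 300 simulations; with equal allocation it receives 100, and the weights (which average to 1) let a training loss recover the climatological mix if that is wanted.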


(4) Emulator-based inference

Use the coupled (or stacked) emulator in EKI (filterax) or gradient-based inversion. Real-time capable.


(5) Amortized inference (predictor)

$$
f_\theta : \bigl(\, \{(\text{instrument\_id}, \mathbf{y}_\text{inst}, \mathbf{A}_\text{inst}, \text{mask}_\text{inst}, \text{footprint}_\text{inst})\},\; \text{met}_\text{reanalysis},\, \text{transport\_tier\_id} \,\bigr) \;\longmapsto\; p\!\bigl(\, Q_{1:K}(t),\, x_{0,1:K},\, K \,\big|\, \text{observations}, \text{met} \,\bigr)
$$

This is the operational product: a multi-instrument satellite-overpass list goes in, source-parameter posterior comes out.

Multi-instrument list input

Input is a list of per-instrument observation tuples — same pattern as Tier II/III, generalised to a heterogeneous list. Each element keeps native resolution, AK, mask, and footprint. No pre-regridding.

Per-instrument heads, tier-conditioned

Trans-dimensional output

$K = n_\text{sources}$ varies. Default architecture: max-K masked output, predicting $(K, \{Q_k(t), x_{0,k}\}_{k=1}^{K_\text{max}}, \text{activity\_mask})$ jointly. The activity mask is one Bernoulli variable per slot. Promote to an RJMCMC predictor head only if max-K masking shows systematic basin saturation.
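If the per-slot Bernoulli variables are treated as independent (an assumption made here for illustration; the predictor could also model them jointly), the implied distribution over $K$ is a Poisson binomial, which a short dynamic programme computes exactly. The function name and the example probabilities are hypothetical.

```python
def source_count_pmf(activity_probs):
    """P(K = k) for k = 0..len(activity_probs), given independent slot probabilities."""
    pmf = [1.0]  # with zero slots, K = 0 with certainty
    for p in activity_probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)   # this slot inactive
            nxt[k + 1] += mass * p       # this slot active
        pmf = nxt
    return pmf

# Two confident slots, one uncertain slot, the rest effectively off.
activity_probs = [0.95, 0.90, 0.40] + [0.01] * 7
pmf_k = source_count_pmf(activity_probs)
expected_k = sum(k * m for k, m in enumerate(pmf_k))
p_saturated = pmf_k[-1]  # probability that every slot is active
```

Tracking the mass in the last bin over many basin tiles is one possible way to detect the "systematic basin saturation" condition mentioned above.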

Posterior representation

Training data

Simulate millions of $(\text{source config}, \text{multi-instrument overpass})$ pairs spanning the realistic met regime distribution + scene-class distribution + instrument coverage distribution. Active learning over the training schedule (§3) is mandatory at this scale.


(6) Improve


Module layout (proposed)

Table (1): Tier IV proposed module layout (step, concern, target module, status).

| Step | Concern | Module | Status |
|---|---|---|---|
| 1 | Coupled forward (Tier I + AK + multi-inst) | plume_simulation.coupled.gaussian_ak | |
| 1 | Coupled forward (Tier II + AK + multi-inst) | plume_simulation.coupled.lagrangian_ak | |
| 1 | Coupled forward (Tier III + RTM + multi-inst) | plume_simulation.coupled.fv_rtm | |
| 1 | Multi-instrument fusion harness | plume_simulation.coupled.fusion | |
| 1 | Cross-instrument bias model | plume_simulation.coupled.bias | |
| 1 | Quality-flag aggregator | plume_simulation.coupled.quality | |
| 1 | $Q(t)$ stochastic-process model (OU / GP) | plume_simulation.coupled.q_dynamics | |
| 1 | Trans-dimensional source-count handling | plume_simulation.coupled.k_sources | |
| 2 | End-to-end inversion | reuse assimilation/ with composed forward | |
| 2 | Posterior covariance (Laplace / Hessian / EnKF) | reuse Tier III’s posterior modules | |
| 2 | Posterior export → Tier V | plume_simulation.coupled.posterior_export | |
| 3 | Stacked emulator runtime | plume_simulation.coupled.stacked_emulator | |
| 3 | Coupled emulator (end-to-end) | plume_simulation.coupled.emulator | |
| 3 | Active-learning training scheduler | plume_simulation.coupled.active_learning | |
| 5 | Operational predictor (per-instrument, tier-conditioned) | plume_simulation.coupled.predictor | |
| 6 | Joint met + source inversion | plume_simulation.coupled.joint_met | |

The coupled subpackage doesn’t exist yet; this is the proposed shape. It’s the only tier where new top-level modules are still needed once Tiers I–III and RTM are done.


Validation strategy


Open questions

References
  1. Veefkind, J. P., Aben, I., McMullan, K., Förster, H., de Vries, J., Otter, G., Claas, J., Eskes, H. J., de Haan, J. F., Kleipool, Q., & others. (2012). TROPOMI on the ESA Sentinel-5 Precursor: a GMES mission for global observations of the atmospheric composition for climate, air quality and ozone layer applications. Remote Sensing of Environment, 120, 70–83.
  2. Green, R. O., & others. (2022). EMIT: Earth Surface Mineral Dust Source Investigation. https://earth.jpl.nasa.gov/emit/
  3. Carbon Mapper. (2024). Carbon Mapper: airborne and satellite imaging spectroscopy for greenhouse gas monitoring. https://carbonmapper.org/
  4. GHGSat Inc. (2016). GHGSat WAF-P imaging spectrometer constellation. https://www.ghgsat.com/