Tier V — Source population & forecasting

UNEP
IMEO
MARS

Forward model: thinned marked temporal point process (TMTPP) over emission events, with per-event marks drawn from Tier I–IV posteriors and per-event detection thinning by per-satellite POD models.
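As a rough illustration of this forward model, the sketch below simulates a thinned marked process end to end. It is a minimal sketch under stated assumptions, not the `methane_pod` implementation: the constant intensity, the lognormal mark distribution, the logistic-in-log(Q) POD, and every parameter name and value are hypothetical choices made for the example.

```python
import math
import random


def simulate_thinned_events(rate_per_day, horizon_days, mu_log_q, sigma_log_q,
                            q50, slope, seed=0):
    """Simulate a homogeneous marked Poisson process and thin it by a POD curve.

    Returns (all_events, detected_events) as lists of (time, Q) tuples.
    The functional forms here are illustrative assumptions.
    """
    rng = random.Random(seed)
    events, detected = [], []
    t = 0.0
    while True:
        # Exponential inter-arrival times give a homogeneous Poisson process.
        t += rng.expovariate(rate_per_day)
        if t > horizon_days:
            break
        # Mark: emission rate Q drawn from a lognormal population f(Q).
        q = rng.lognormvariate(mu_log_q, sigma_log_q)
        events.append((t, q))
        # Thinning: keep the event with probability P_d(Q).
        p_detect = 1.0 / (1.0 + math.exp(-slope * (math.log(q) - math.log(q50))))
        if rng.random() < p_detect:
            detected.append((t, q))
    return events, detected


# Hypothetical values: Q in kg/h, time in days.
events, detected = simulate_thinned_events(
    rate_per_day=2.0, horizon_days=365.0,
    mu_log_q=math.log(200.0), sigma_log_q=1.0,
    q50=500.0, slope=2.0,
)
print(len(events), len(detected))
```

Only `detected` would be visible to a satellite catalog; the gap between the two lists is the missing mass that the population fit has to recover.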

This tier sits above the per-event physics tiers. Tiers I–IV answer “what’s the emission rate from this plume right now?” — single overpass, single source. Tier V answers a different family of questions:

Inventory and forecasting are co-equal products of Tier V, not just totals. The inverted intensity $\lambda(t)$ directly powers operational forecasts (dispatch windows, occurrence probabilities); see Persistency.

Sub-pages:


TMTPP foundations — the three-term log-likelihood

The full population log-likelihood has three terms (derived in 06b; foundations in Daley & Vere-Jones, 2003; Daley & Vere-Jones, 2008):

$$
\log L \;=\; \underbrace{\sum_{i \in \mathcal{D}} \log p(\text{detected}_i \mid f, \lambda, P_d)}_{\text{mark contribution}} \;+\; \underbrace{\sum_{i \in \mathcal{D}} \log \lambda(t_i)}_{\text{detection-time intensity}} \;-\; \underbrace{\int_{0}^{T} \lambda(t) \left[ \int P_d(Q)\, f(Q)\, \mathrm{d}Q \right] \mathrm{d}t}_{\text{integrated thinned rate}}
$$

The third term is what makes $\lambda$ and $P_d$ jointly identifiable; without it the two trade off.
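To make the three terms concrete, here is a minimal sketch that evaluates them in the simplest possible setting. The assumptions are mine, not the source's: constant intensity `lam`, marks observed exactly (so the mark contribution reduces to $\log[P_d(Q_i) f(Q_i)]$), a lognormal $f(Q)$, and a logistic-in-log(Q) POD. All function names are hypothetical.

```python
import math


def pod(q, q50, slope):
    """Logistic-in-log(Q) probability of detection (an assumed form)."""
    return 1.0 / (1.0 + math.exp(-slope * (math.log(q) - math.log(q50))))


def lognormal_logpdf(q, mu, sigma):
    z = (math.log(q) - mu) / sigma
    return -0.5 * z * z - math.log(q * sigma * math.sqrt(2.0 * math.pi))


def mean_pod(mu, sigma, q50, slope, n=2001, z_max=8.0):
    """Trapezoid estimate of the integral of P_d(Q) f(Q) dQ, in z = (log Q - mu) / sigma."""
    h = 2.0 * z_max / (n - 1)
    total = 0.0
    for k in range(n):
        z = -z_max + k * h
        weight = 0.5 if k in (0, n - 1) else 1.0
        phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        total += weight * pod(math.exp(mu + sigma * z), q50, slope) * phi
    return total * h


def log_likelihood(detections, horizon, lam, mu, sigma, q50, slope):
    """Three-term log-likelihood for constant intensity and exactly observed marks."""
    mark_term = sum(math.log(pod(q, q50, slope)) + lognormal_logpdf(q, mu, sigma)
                    for _, q in detections)
    intensity_term = len(detections) * math.log(lam)
    thinned_rate_term = lam * horizon * mean_pod(mu, sigma, q50, slope)
    return mark_term + intensity_term - thinned_rate_term


# Hypothetical catalog: (time in days, Q in kg/h).
detections = [(12.0, 800.0), (40.5, 1500.0), (77.0, 650.0)]
horizon, mu, sigma, q50, slope = 100.0, math.log(200.0), 1.0, 500.0, 2.0

# With mark and POD parameters held fixed, the maximising constant intensity
# is (number of detections) / (horizon * mean POD).
lam_hat = len(detections) / (horizon * mean_pod(mu, sigma, q50, slope))
print(lam_hat, log_likelihood(detections, horizon, lam_hat, mu, sigma, q50, slope))
```

In this constant-intensity case the intensity and the population-averaged POD enter the third term only as a product, which is a useful sanity check when debugging a fitter.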

Mark contribution and the soft-observation framing

The per-event posterior from Tiers I–IV is a soft observation of the (unknown) true mark $Q_i$. This is the same Bayesian-deconvolution / errors-in-variables structure used in measurement-error regression. The per-event mark contribution is:

$$
p(\text{detected}_i \mid f, \lambda, P_d) \;=\; \int P_d(Q)\, L_i(Q)\, f(Q)\, \mathrm{d}Q
$$

where $L_i(Q) = p(\text{observation}_i \mid Q)$ is the per-event likelihood, not the posterior. In sample-based practice (per-event posterior samples $Q_i^{(s)} \sim p(Q \mid \text{observation}_i)$):

$$
p(\text{detected}_i \mid f, \lambda, P_d) \;\approx\; \frac{1}{S}\, \sum_{s=1}^{S}\, P_d(Q_i^{(s)})\, \frac{f(Q_i^{(s)})}{\pi_\text{per-event}(Q_i^{(s)})}
$$

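with $\pi_\text{per-event}(Q)$ the per-event prior used at Tier I–IV. The ratio $f / \pi_\text{per-event}$ is the importance weight that re-points the per-event posterior at the population mark distribution. The approximation holds up to the per-event evidence $Z_i = \int L_i(Q)\, \pi_\text{per-event}(Q)\, \mathrm{d}Q$, a constant that does not depend on $(f, \lambda, P_d)$ and therefore drops out of the population fit.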
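The sample-based estimator can be sketched in a few lines. This is an illustrative sketch, not the planned `plume_simulation.population.adapter.importance` module: the function names are hypothetical, and the densities are written over $\log Q$ (the Jacobian cancels in the ratio $f / \pi_\text{per-event}$, so the weight is the same as in $Q$). The sum is computed with a log-sum-exp shift for numerical stability, and the POD is assumed strictly positive at every sample.

```python
import math
import random


def normal_logpdf(x, mu, sigma):
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma * math.sqrt(2.0 * math.pi))


def log_mark_contribution(log_q_samples, pod_fn, log_f, log_prior):
    """log of (1/S) * sum_s P_d(Q_s) * f(Q_s) / pi(Q_s), via log-sum-exp.

    All three callables take log Q. Assumes pod_fn(x) > 0 for every sample.
    """
    log_terms = [math.log(pod_fn(x)) + log_f(x) - log_prior(x)
                 for x in log_q_samples]
    peak = max(log_terms)
    mean_shifted = sum(math.exp(t - peak) for t in log_terms) / len(log_terms)
    return peak + math.log(mean_shifted)


# Hypothetical inputs: a per-event posterior in log Q, a broad per-event prior,
# a narrower population mark distribution, and a logistic POD.
rng = random.Random(1)
samples = [rng.gauss(math.log(600.0), 0.4) for _ in range(2000)]


def log_prior(x):
    return normal_logpdf(x, math.log(300.0), 2.0)


def log_f(x):
    return normal_logpdf(x, math.log(200.0), 1.0)


def pod_fn(x):
    return 1.0 / (1.0 + math.exp(-2.0 * (x - math.log(500.0))))


value = log_mark_contribution(samples, pod_fn, log_f, log_prior)
print(value)
```

When the population distribution equals the per-event prior, every weight is 1 and the estimate reduces to the posterior-averaged POD, which makes a convenient unit test.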

This is the central math of cross-tier inference. Currently the prototype in methane_pod.fitting summarises per-event posteriors to point estimates before the population fit, side-stepping the importance correction. Formalising this is the v1 deliverable for 06a_instantaneous.md.


How the cycle adapts at population scale

The six-step cycle still applies, but the objects change:

Table (1): Six-step cycle adaptation: Tier I–IV (per event) vs. Tier V (population).

| Step | Tier I–IV (per event) | Tier V (population) |
| --- | --- | --- |
| 1. Simple model | Forward physics (plume / PDE / RTM) | Generative TMTPP: $\lambda(t)$ + mark $f(Q)$ + POD $P_d(\cdot)$ |
| 2. Model-based inference | MAP / MCMC over source params | NumPyro NUTS over $(\lambda \text{ params}, \text{mark params}, \text{POD params})$. Cheap at $O(10^{4})$ events (minutes); hours-to-days at $O(10^{6})$ events (national catalog) |
| 3. Model emulator | FNO / neural ODE on the PDE | Skip when NUTS fits in budget. Optionally a normalising flow over the population posterior for repeated re-fits or sensitivity studies |
| 4. Emulator-based inference | PDE-free 4D-Var | Variational fit (numpyro.infer.SVI) or flow-based posterior approximation; required at national catalog scale |
| 5. Amortized predictor | Per-overpass $Q$ predictor | $(\text{basin tile}, \text{history window}) \to$ posterior over $(\lambda, f(Q), \text{total mass}, \text{next-event time})$ conditioned on per-event evidence and met-region context |
| 6. Improve | Better physics | Spatial point process (links to Tier III); multi-satellite fusion; varying-coefficient POD (per-(basin, season, scene class) hierarchy); non-Poisson clustering (Hawkes / Cox) |

Tile definition: an H3 hex-resolution-7 cell (~5 km²) for sub-basin work, or a basin polygon for inventory accounting. History window: 30–365 days, hierarchical prior on the cutoff.

Varying-coefficient POD: $P_d$ parameters indexed by $(\text{basin}, \text{season}, \text{scene class})$ with hierarchical shrinkage to the global POD. Captures regional / seasonal detection differences without inflating parameter count.
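One common way to express this kind of shrinkage is a non-centred hierarchy, sketched below. Treat it as an assumption about how the hierarchy could be parameterised rather than the `methane_pod` design: each group's half-detection threshold is the global value plus a standardised offset scaled by a shared `tau`, so `tau = 0` collapses every group onto the global POD. The group keys, offsets, and logistic POD form are hypothetical.

```python
import math


def pod(q, log_q50, slope):
    """Logistic-in-log(Q) POD; log_q50 is the log emission rate detected half the time."""
    return 1.0 / (1.0 + math.exp(-slope * (math.log(q) - log_q50)))


def group_log_q50(global_log_q50, tau, z_group):
    """Non-centred hierarchy: z_group ~ Normal(0, 1) a priori, tau sets the spread."""
    return global_log_q50 + tau * z_group


# Hypothetical standardised offsets keyed by (basin, season, scene class).
z = {
    ("basin_a", "summer", "desert"): -0.8,
    ("basin_a", "winter", "snow"): 1.5,
    ("basin_b", "summer", "vegetated"): 0.3,
}
global_log_q50 = math.log(500.0)  # kg/h, illustrative
slope = 2.0


def pod_table(tau, q=500.0):
    """POD at emission rate q for every group, given the shrinkage scale tau."""
    return {group: pod(q, group_log_q50(global_log_q50, tau, z_g), slope)
            for group, z_g in z.items()}


pooled = pod_table(tau=0.0)   # complete pooling: every group equals the global POD
partial = pod_table(tau=0.5)  # partial pooling: groups spread around the global POD
for group in z:
    print(group, round(pooled[group], 3), round(partial[group], 3))
```

In a full fit, `tau` and the offsets would be inferred jointly with the global parameters; only one scale parameter is added no matter how many groups there are.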


Cross-tier interface — the load-bearing contract

Payload schema

Every per-event posterior consumed by Tier V must carry:

Table (2): Per-event posterior payload: fields, types, notes.

| Field | Type | Notes |
| --- | --- | --- |
| posterior_samples | (S,) array of $Q$ draws | OR posterior_summary for Gaussian shorthand |
| posterior_summary | $(\mu_{\log Q}, \sigma_{\log Q})$ | lognormal quick form when full samples are too heavy |
| per_event_prior_logpdf | callable $Q \to \log \pi(Q)$ | required for the importance correction; without it the population fit is biased |
| instrument_id | str | dispatch into per-instrument POD |
| t_detection | float (UTC seconds) | for $\lambda(t)$ |
| x0_posterior | $(\boldsymbol{\mu}_{xy}, \boldsymbol{\Sigma}_{xy})$ | for spatial Cox-process upgrade |
| quality | dict | confidence flags from the Tier I–IV quality bitmask |
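The payload table maps naturally onto a small typed container. The dataclass below is a hypothetical sketch of that contract, not the planned `plume_simulation.population.adapter.schema` module: the class name, the validation rules, and the example values are assumptions, while the field names follow the table.

```python
import math
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass(frozen=True)
class PerEventPayload:
    """Hypothetical container for one per-event posterior handed to Tier V."""
    instrument_id: str
    t_detection: float  # UTC seconds
    per_event_prior_logpdf: Callable[[float], float]  # Q -> log pi(Q)
    x0_posterior: Tuple[Sequence[float], Sequence[Sequence[float]]]  # (mu_xy, Sigma_xy)
    quality: dict
    posterior_samples: Optional[Sequence[float]] = None  # (S,) draws of Q
    posterior_summary: Optional[Tuple[float, float]] = None  # (mu_logQ, sigma_logQ)

    def __post_init__(self):
        # The table allows samples OR the lognormal summary; require at least one.
        if self.posterior_samples is None and self.posterior_summary is None:
            raise ValueError("need posterior_samples or posterior_summary")
        # Without the prior, the importance correction cannot be applied.
        if not callable(self.per_event_prior_logpdf):
            raise TypeError("per_event_prior_logpdf must be callable")


def flat_log_prior(q):
    """Illustrative per-event prior: log-uniform in Q (unnormalised)."""
    return -math.log(q)


payload = PerEventPayload(
    instrument_id="instrument_a",  # hypothetical identifier
    t_detection=1_700_000_000.0,
    per_event_prior_logpdf=flat_log_prior,
    x0_posterior=((0.0, 0.0), ((1.0, 0.0), (0.0, 1.0))),
    quality={"cloud_free": True},
    posterior_summary=(math.log(600.0), 0.4),
)
print(payload.instrument_id, payload.posterior_summary)
```

Failing at construction time keeps a payload without a recoverable prior from silently reaching the population fit.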

Independence assumption — the v1 caveat

The factorised likelihood above assumes detections at different overpasses are independent. Two overpasses of the same physical leak (e.g. GHGSat then TROPOMI two days later) violate this.


Validation strategy


Module layout — depend on methane_pod, don’t absorb it

plumax depends on the standalone methane_pod package (pinned methane_pod >= 0.1, < 0.2 for v1); the population-scale code is not re-implemented. Rationale:

Table (3): Tier V module layout: concern, target module, status.

| Concern | Module | Status |
| --- | --- | --- |
| Intensity registry $\lambda(t)$ | methane_pod.intensity | library ✓ (13 kernels) |
| POD registry $P_d(\cdot)$ | methane_pod.pod_functions | library ✓ (10 models) |
| Missing-mass MC simulator | methane_pod.paradox | library ✓ |
| NUTS fitter | methane_pod.fitting | library ✓; importance-correction integration ☐ |
| Per-event posterior summariser | plume_simulation.population.adapter.summariser | |
| Per-event prior recall ($\pi_\text{per-event}$ lookup) | plume_simulation.population.adapter.prior_recall | ☐ (required for importance weighting) |
| Importance-weight calculator | plume_simulation.population.adapter.importance | |
| Multi-satellite POD union | plume_simulation.population.adapter.pod_union | |
| Catalog schema (CSV / parquet) | plume_simulation.population.adapter.schema | |
| Real-data CSV ingestion | plume_simulation.population.ingest | ☐ (placeholder in 07_pod_fitting_mcmc.md) |
| Population SBC harness | plume_simulation.population.validation.sbc | |
| Importance-weight ESS diagnostic | plume_simulation.population.validation.iw_ess | |
| Per-event-prior swap-out test | plume_simulation.population.validation.prior_swap | |
| Spatial Cox-process extension (v2) | plume_simulation.population.spatial | |

A plume_simulation.population subpackage doesn’t exist yet; this is the proposed shape.
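For the importance-weight ESS diagnostic, one standard choice is the Kish effective sample size, $(\sum_s w_s)^2 / \sum_s w_s^2$. Whether the proposed `iw_ess` module would use this estimator is an assumption; the function name below is hypothetical. The sketch takes unnormalised log-weights, i.e. $\log f(Q^{(s)}) - \log \pi_\text{per-event}(Q^{(s)})$, and shifts by the maximum before exponentiating to avoid overflow.

```python
import math


def importance_weight_ess(log_weights):
    """Kish effective sample size from unnormalised log-weights.

    Returns a value between 1 (one sample dominates) and len(log_weights)
    (all weights equal). The result is invariant to a constant shift.
    """
    peak = max(log_weights)
    weights = [math.exp(lw - peak) for lw in log_weights]
    return sum(weights) ** 2 / sum(w * w for w in weights)


print(importance_weight_ess([0.0, 0.0, 0.0, 0.0]))   # equal weights
print(importance_weight_ess([0.0, -1000.0, -1000.0]))  # one dominant sample
```

A low ESS for an event signals that its per-event prior and the population mark distribution barely overlap, so the importance-weighted mark contribution for that event rests on very few samples.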


Tier III’s distributed source field $S(\mathbf{x},t)$ is exactly a spatial inhomogeneous Poisson rate at the population level: temporally aggregated, this is the spatial intensity of a Cox process over emission events. The v2 spatial extension of Tier V is the same mathematical object Tier III already inverts at the per-event timescale, just averaged over a longer horizon. The two tiers should share the parameterisation: a Matérn GP prior on $\log S(\mathbf{x},t)$ plays the role of both Tier III’s source-field prior and Tier V.v2’s spatial Cox-process intensity.
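Written out, the correspondence is the standard log-Gaussian Cox process construction. The symbols $\Lambda$, $N(A)$, $m$ and $k_\text{Matérn}$ are notation introduced here for illustration, and $S$ is read as an event rate per unit area and time, as in the paragraph above:

$$
\log S \sim \mathcal{GP}\big(m,\, k_\text{Matérn}\big), \qquad
\Lambda(\mathbf{x}) = \int_{0}^{T} S(\mathbf{x}, t)\, \mathrm{d}t, \qquad
N(A) \mid S \sim \text{Poisson}\!\left( \int_{A} \Lambda(\mathbf{x})\, \mathrm{d}\mathbf{x} \right)
$$

Here $N(A)$ counts emission events (before detection thinning) in a region $A$ over the horizon $[0, T]$. Conditional on the field the counts are Poisson; marginalising over the GP gives the Cox process.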

This isn’t a coincidence — it’s why plumax’s tier structure works: the same mathematical objects appear at different scales.


Status snapshot


Open questions

References
  1. Daley, D. J., & Vere-Jones, D. (2003). An Introduction to the Theory of Point Processes, Volume I: Elementary Theory and Methods (2nd ed.). Springer. 10.1007/b97277
  2. Daley, D. J., & Vere-Jones, D. (2008). An Introduction to the Theory of Point Processes, Volume II: General Theory and Structure (2nd ed.). Springer. 10.1007/978-0-387-49835-5
  3. U.S. Environmental Protection Agency. (2024). Inventory of U.S. Greenhouse Gas Emissions and Sinks: 1990–2022. EPA 430-R-24-004. https://www.epa.gov/ghgemissions/inventory-us-greenhouse-gas-emissions-and-sinks
  4. Scarpelli, T. R., Jacob, D. J., Maasakkers, J. D., Sulprizio, M. P., Sheng, J.-X., Rose, K., Romeo, L., Worden, J. R., & Janssens-Maenhout, G. (2020). A global gridded (0.1° × 0.1°) inventory of methane emissions from oil, gas, and coal exploitation based on national reports to the United Nations Framework Convention on Climate Change. Earth System Science Data, 12(1), 563–575. 10.5194/essd-12-563-2020
  5. Maasakkers, J. D., McDuffie, E. E., Sulprizio, M. P., Chen, C., Schultz, M., Brunelle, L., Thrush, R., Steller, J., Sherry, C., Jacob, D. J., & others. (2023). A gridded inventory of annual 2012–2018 U.S. anthropogenic methane emissions. Environmental Science & Technology, 57(43), 16276–16288. 10.1021/acs.est.3c05138
  6. Jacob, D. J., Varon, D. J., Cusworth, D. H., Dennison, P. E., Frankenberg, C., Gautam, R., Guanter, L., Kelley, J., McKeever, J., Ott, L. E., Poulter, B., & others. (2022). Quantifying methane emissions from the global scale down to point sources using satellite observations of atmospheric methane. Atmospheric Chemistry and Physics, 22(14), 9617–9646. 10.5194/acp-22-9617-2022