Skip to article frontmatterSkip to article content

Beer-Lambert's Law - Simplified

United Nations Environmental Programme

Combined Approximation (Taylor + MacLaurin) - Relative/Normalized Radiance

Introduction: The Operational Standard

The relative combined linearized model represents the most widely used formulation in operational atmospheric methane detection. This model applies two sequential approximations—Taylor expansion of optical depth followed by MacLaurin approximation of the exponential—to the normalized Beer-Lambert law, resulting in a fully linear relationship between observations and methane enhancement.

Historical Context and Adoption

This formulation emerged from the practical needs of satellite and airborne hyperspectral methane detection in the 2010s. Early systems struggled with:

  1. Computational constraints: Processing millions of pixels in real-time

  2. Calibration uncertainties: Absolute radiometric calibration errors of 5-10%

  3. Scene variability: Surface reflectance variations of 2-10× across footprints

The relative combined model addressed all three challenges by providing a fast, calibration-independent, linear detection algorithm with acceptable accuracy for moderate plumes.

Key operational systems using this approach:

Relationship to Model Family

This model sits at the “sweet spot” in the approximation hierarchy:

Exact ←────────── Approximation Level ──────────→ Simplified │ │ Nonlinear Taylor Only Combined MacLaurin Only (Models 1, 1B) (Models 3, 3B) (Models 4, 4B) (Models 2, 2B) │ │ │ │ Most Accurate Good Accuracy Practical Theoretical Slowest Moderate Speed Fast Fastest Iterative Iterative Closed-Form Closed-Form All Plumes τ < 0.3 τ < 0.1 τ < 0.05

Model 4B combines:

  1. Taylor expansion (Model 3B): Linearizes optical depth around background → handles concentration perturbations

  2. MacLaurin approximation (Model 2B): Linearizes exponential transmittance → enables closed-form solution

  3. Normalization: Divides by background radiance → eliminates calibration dependencies

Result: A fully linear, calibration-independent, closed-form detection algorithm.

The Two Approximations: Physical Justification

Approximation 1: Taylor Expansion of Optical Depth

Mathematical form:

τ(VMR)=τ(VMR0+ΔVMR)τ0+dτdVMR0ΔVMR\tau(\text{VMR}) = \tau(\text{VMR}_0 + \Delta\text{VMR}) \approx \tau_0 + \frac{d\tau}{d\text{VMR}}\bigg|_0 \cdot \Delta\text{VMR}

Physical justification:

Optical depth is defined as:

τ=0Lσ(s)nCH4(s)ds\tau = \int_0^L \sigma(s) n_{\text{CH}_4}(s) \, ds

where ss is path coordinate. For well-mixed plumes over a sensor footprint:

τ=σNtotalVMR106LAMF\tau = \sigma \cdot N_{\text{total}} \cdot \text{VMR} \cdot 10^{-6} \cdot L \cdot \text{AMF}

This is exactly linear in VMR (no approximation needed for τ\tau itself). The Taylor expansion becomes exact:

τ(VMR0+ΔVMR)=τ0+dτdVMRΔVMR\tau(\text{VMR}_0 + \Delta\text{VMR}) = \tau_0 + \frac{d\tau}{d\text{VMR}} \cdot \Delta\text{VMR}

because d2τdVMR2=0\frac{d^2\tau}{d\text{VMR}^2} = 0 (second derivative vanishes).

Key insight: This isn’t really an approximation—it’s an exact representation of the linear relationship between concentration and optical depth. The “approximation” comes from assuming:

When it breaks down:

Typical accuracy: 1-5% error for boundary layer plumes, 10-20% for elevated plumes

Approximation 2: MacLaurin Expansion of Exponential

Mathematical form:

exp(Δτ)1Δτ\exp(-\Delta\tau) \approx 1 - \Delta\tau

Physical justification:

The exponential represents the multiplicative attenuation through the atmosphere. For weak absorption:

Full series:

exp(Δτ)=1Δτ+(Δτ)22!(Δτ)33!+\exp(-\Delta\tau) = 1 - \Delta\tau + \frac{(\Delta\tau)^2}{2!} - \frac{(\Delta\tau)^3}{3!} + \cdots

Truncate at first order:

exp(Δτ)1Δτ\exp(-\Delta\tau) \approx 1 - \Delta\tau

Truncation error:

Error=(Δτ)22!(Δτ)33!+(Δτ)22\text{Error} = \frac{(\Delta\tau)^2}{2!} - \frac{(\Delta\tau)^3}{3!} + \cdots \approx \frac{(\Delta\tau)^2}{2}

Relative error:

Errorexp(Δτ)(Δτ)2/21Δτ(Δτ)22\frac{|\text{Error}|}{\exp(-\Delta\tau)} \approx \frac{(\Delta\tau)^2/2}{1 - \Delta\tau} \approx \frac{(\Delta\tau)^2}{2}

(for small Δτ\Delta\tau)

Numerical accuracy table:

Δτ\Delta\tauFractional AbsorptionRelative ErrorAcceptable?
0.011.0%0.005%✓ Excellent
0.054.9%0.13%✓ Very good
0.109.5%0.5%✓ Good (threshold)
0.1513.9%1.1%⚠ Marginal
0.2018.1%2.0%✗ Poor
0.3025.9%4.7%✗ Very poor

Operational threshold: Most systems require Δτ<0.1\Delta\tau < 0.1 (< 10% absorption) for this approximation.

Physical interpretation of validity:

The MacLaurin approximation assumes weak absorption where:

  1. Most photons survive: T=exp(Δτ)1Δτ>0.9T = \exp(-\Delta\tau) \approx 1 - \Delta\tau > 0.9

  2. Linear regime: Fractional change proportional to optical depth

  3. No saturation: Absorption doesn’t significantly deplete the radiation field

When it breaks down:

Typical accuracy: 0.5% for Δτ<0.1\Delta\tau < 0.1, degrades rapidly beyond

Combined Effect: Why Two Approximations Work Together

Synergistic simplification:

Starting from exact normalized radiance:

Lnorm=exp(Δτ(ΔVMR))L_{\text{norm}} = \exp(-\Delta\tau(\Delta\text{VMR}))

Apply both approximations:

Lnormexp(dτdVMRΔVMR)1dτdVMRΔVMRL_{\text{norm}} \approx \exp\left(-\frac{d\tau}{d\text{VMR}} \cdot \Delta\text{VMR}\right) \approx 1 - \frac{d\tau}{d\text{VMR}} \cdot \Delta\text{VMR}

Result: Linear in ΔVMR\Delta\text{VMR}!

Computational advantage:

Speed comparison:

For a 1000×1000 pixel hyperspectral cube:

This 100× speedup enables real-time processing essential for operational systems.

Accuracy vs. Computational Cost Trade-off

The Pareto frontier: Accuracy ↑ │ Exact Nonlinear ● │ ╲ │ ╲ │ ╲ Taylor Only │ ● │ ╲╲ │ ╲╲ │ Combined ●●●● ← Optimal trade-off │ ╲╲╲╲ │ ╲╲╲╲ │ MacLaurin Only ● └────────────────────────────────────────→ Speed

Model 4B occupies the “knee” of the curve: Maximum gain in speed for minimum loss in accuracy.

Quantitative comparison for ΔVMR=500\Delta\text{VMR} = 500 ppm (typical moderate plume):

ModelRetrieval ErrorComputation TimeOperational?
Exact Nonlinear0% (reference)50 ms/pixel✗ Too slow
Taylor Only~2-5%10 ms/pixel⚠ Borderline
Combined (4B)~5-10%0.5 ms/pixel✓ Optimal
MacLaurin Only~15-20%0.5 ms/pixel✗ Too inaccurate

Decision rationale for operational systems:


1. Concentration (Background + Enhancement)

Same as all models:

VMRtotal=VMRbg+ΔVMR\text{VMR}_{\text{total}} = \text{VMR}_{\text{bg}} + \Delta\text{VMR}

No approximation applied to this fundamental relationship.


2. Optical Depth + Jacobian

Optical Depth (Taylor Expansion Applied)

Background optical depth (exact):

τbg=σNtotalVMRbg106LAMF\tau_{\text{bg}} = \sigma \cdot N_{\text{total}} \cdot \text{VMR}_{\text{bg}} \cdot 10^{-6} \cdot L \cdot \text{AMF}

Differential optical depth (Taylor expansion around background):

Since τ\tau is linear in VMR, the Taylor expansion is exact:

Δτ=τ(VMRtotal)τbg=τVMRbgΔVMR\Delta\tau = \tau(\text{VMR}_{\text{total}}) - \tau_{\text{bg}} = \frac{\partial \tau}{\partial \text{VMR}}\bigg|_{\text{bg}} \cdot \Delta\text{VMR}

Explicit form:

Δτ=σNtotalΔVMR106LAMF\boxed{\Delta\tau = \sigma \cdot N_{\text{total}} \cdot \Delta\text{VMR} \cdot 10^{-6} \cdot L \cdot \text{AMF}}

Units: dimensionless

Physical interpretation:

This is where the first “approximation” enters, but it’s really the assumption that:

Δτ=ΔVMRσNtotal106LAMFconstant over footprint\Delta\tau = \Delta\text{VMR} \cdot \underbrace{\sigma \cdot N_{\text{total}} \cdot 10^{-6} \cdot L \cdot \text{AMF}}_{\text{constant over footprint}}

Validity requires:

  1. Spatially uniform enhancement over sensor footprint

  2. Single effective path length LL (not vertically varying)

  3. Constant atmospheric state (TT, PP uniform → σ\sigma, NtotalN_{\text{total}} uniform)

Typical operational conditions where this holds:

Where it breaks down:

Jacobian of Optical Depth

(Δτ)(ΔVMR)=σNtotal106LAMF\frac{\partial (\Delta\tau)}{\partial (\Delta\text{VMR})} = \sigma \cdot N_{\text{total}} \cdot 10^{-6} \cdot L \cdot \text{AMF}

Units: ppm⁻¹

Key property: Constant (independent of VMR, exact for linear τ\tau)

This constant Jacobian is what enables the closed-form solution later.


3. Transmittance + Jacobian

Transmittance (Both Approximations Combined)

Normalized transmittance (definition, exact):

Tnorm=TtotalTbg=exp(τtotal)exp(τbg)=exp(Δτ)T_{\text{norm}} = \frac{T_{\text{total}}}{T_{\text{bg}}} = \frac{\exp(-\tau_{\text{total}})}{\exp(-\tau_{\text{bg}})} = \exp(-\Delta\tau)

Apply MacLaurin approximation:

Tnorm1ΔτT_{\text{norm}} \approx 1 - \Delta\tau

With Taylor-expanded Δτ\Delta\tau:

Tnorm1σNtotalΔVMR106LAMF\boxed{T_{\text{norm}} \approx 1 - \sigma \cdot N_{\text{total}} \cdot \Delta\text{VMR} \cdot 10^{-6} \cdot L \cdot \text{AMF}}

Units: dimensionless

Physical interpretation of combined approximation:

Starting from exact physics:

Tnorm(ΔVMR)=exp(σNtotalΔVMR106LAMF)T_{\text{norm}}(\Delta\text{VMR}) = \exp\left(-\sigma N_{\text{total}} \Delta\text{VMR} \cdot 10^{-6} L \cdot \text{AMF}\right)

The combined approximation replaces:

Why this works for weak absorption:

Plot of exp(x)\exp(-x) vs. 1x1-x:

1.0 ●──────────────── │╲
│ ╲ Exponential │ ╲ ─── 0.9 │ ● Linear approximation │ ╲ ─ ─ ─ │ ╲
│ ╲
0.8 │ ● Good match │ ╲ for x < 0.1 │ ╲
0.7 │ ●
└─────────────→ Δτ 0 0.1 0.2 0.3

For Δτ<0.1\Delta\tau < 0.1: curves nearly identical → approximation excellent.

Saturation effects (what we’re missing):

The exponential exp(Δτ)\exp(-\Delta\tau) captures that:

The MacLaurin approximation assumes we stay far from saturation (Δτ1\Delta\tau \ll 1).

Operational impact:

Jacobian of Transmittance

Tnorm(ΔVMR)=(ΔVMR)[1Δτ]=(Δτ)(ΔVMR)\frac{\partial T_{\text{norm}}}{\partial (\Delta\text{VMR})} = \frac{\partial}{\partial (\Delta\text{VMR})}[1 - \Delta\tau] = -\frac{\partial (\Delta\tau)}{\partial (\Delta\text{VMR})}

Explicit form:

Tnorm(ΔVMR)=σNtotal106LAMF\frac{\partial T_{\text{norm}}}{\partial (\Delta\text{VMR})} = -\sigma \cdot N_{\text{total}} \cdot 10^{-6} \cdot L \cdot \text{AMF}

Units: ppm⁻¹

Key property: Constant - does not vary with VMR or optical depth.

Physical consequence:

The constant Jacobian means the model predicts:

Reality (from exact model):

Operational consequence:

For moderate plumes (Δτ<0.1\Delta\tau < 0.1):

This causes a systematic bias: The combined model will slightly underestimate VMR for moderate-to-strong plumes because it overestimates sensitivity.

Correction factor:

Empirical studies show the bias is approximately:

Bias(Δτ)22α^combined\text{Bias} \approx -\frac{(\Delta\tau)^2}{2} \cdot \hat{\alpha}_{\text{combined}}

For Δτ=0.1\Delta\tau = 0.1: Bias ≈ 0.5% underestimation (negligible). For Δτ=0.2\Delta\tau = 0.2: Bias ≈ 2% underestimation (small).

This is acceptable given other uncertainties (wind speed, path length, etc. often 10-20%).


4. Beer’s Law + Jacobian

Beer’s Law (Normalized, Linearized Form)

Starting from normalized radiance:

Lnorm=TnormL_{\text{norm}} = T_{\text{norm}}

Apply combined approximation:

Lnorm1Δτ=1σNtotalΔVMR106LAMF\boxed{L_{\text{norm}} \approx 1 - \Delta\tau = 1 - \sigma \cdot N_{\text{total}} \cdot \Delta\text{VMR} \cdot 10^{-6} \cdot L \cdot \text{AMF}}

Units: dimensionless

This is the fundamental forward model: Linear relationship between normalized radiance and VMR enhancement.

Physical interpretation:

Lnorm=1backgroundσNtotal106LAMFsensitivity factorΔVMRplume strengthL_{\text{norm}} = \underbrace{1}_{\text{background}} - \underbrace{\sigma N_{\text{total}} \cdot 10^{-6} L \cdot \text{AMF}}_{\text{sensitivity factor}} \cdot \underbrace{\Delta\text{VMR}}_{\text{plume strength}}

Three components:

  1. Unity baseline: Background normalized to 1 (by definition)

  2. Sensitivity factor: Depends on molecular physics (σ\sigma), atmosphere (NtotalN_{\text{total}}), geometry (LL, AMF)

  3. Plume strength: The unknown to estimate

Advantages of this form:

  1. Calibration-free: No F0F_0 or RR appears

  2. Scene-independent: Doesn’t depend on τbg\tau_{\text{bg}} or VMRbg\text{VMR}_{\text{bg}}

  3. Linear: Simple structure enables closed-form inversion

  4. Physically interpretable: Each term has clear meaning

  5. Computationally efficient: Single matrix multiply for forward model

Comparison across model hierarchy:

ModelForward ModelComplexityAccuracy
ExactLnorm=exp(σNΔVMR106LAMF)L_{\text{norm}} = \exp(-\sigma N \Delta\text{VMR} \cdot 10^{-6} L \cdot \text{AMF})NonlinearHighest
Taylor onlyLnorm=exp(Δτ)L_{\text{norm}} = \exp(-\Delta\tau)NonlinearHigh
CombinedLnorm=1σNΔVMR106LAMFL_{\text{norm}} = 1 - \sigma N \Delta\text{VMR} \cdot 10^{-6} L \cdot \text{AMF}LinearGood
MacLaurin onlyLnorm=1σNVMRtotal106LAMFL_{\text{norm}} = 1 - \sigma N \text{VMR}_{\text{total}} \cdot 10^{-6} L \cdot \text{AMF}LinearPoor

The combined model achieves the optimal balance.

Jacobian of Normalized Radiance

Lnorm(ΔVMR)=σNtotal106LAMF\frac{\partial L_{\text{norm}}}{\partial (\Delta\text{VMR})} = -\sigma \cdot N_{\text{total}} \cdot 10^{-6} \cdot L \cdot \text{AMF}

Units: ppm⁻¹

Key insight: The Jacobian is the negative of the sensitivity factor.

Wavelength dependence:

Since σ(λ)\sigma(\lambda) varies strongly with wavelength:

H(λ)=σ(λ)Ntotal106LAMF\mathbf{H}(\lambda) = -\boldsymbol{\sigma}(\lambda) \cdot N_{\text{total}} \cdot 10^{-6} \cdot L \cdot \text{AMF}

The Jacobian is a vector with different values at each wavelength:

This spectral structure is what enables detection: The plume signal has a characteristic spectral shape matching the methane absorption spectrum.

Matched filter exploits this by correlating observations with the expected spectral signature.


5. Observations

Observation Model

ynorm=Lnorm(ΔVMR)+ϵnorm\mathbf{y}_{\text{norm}} = L_{\text{norm}}(\Delta\text{VMR}) + \boldsymbol{\epsilon}_{\text{norm}}

With combined approximation:

ynorm=1Hα+ϵnorm\mathbf{y}_{\text{norm}} = \mathbf{1} - \mathbf{H} \cdot \alpha + \boldsymbol{\epsilon}_{\text{norm}}

where:

Model structure: This is a standard linear Gaussian model:

ynorm=μ+Hα+ϵ\mathbf{y}_{\text{norm}} = \boldsymbol{\mu} + \mathbf{H}\alpha + \boldsymbol{\epsilon}

where μ=1\boldsymbol{\mu} = \mathbf{1} is the known mean.

This enables textbook statistical inference:

All have closed-form analytical solutions.

Constructing Normalized Observations

From absolute radiance:

ynorm=yabsLbg\mathbf{y}_{\text{norm}} = \frac{\mathbf{y}_{\text{abs}}}{\mathbf{L}_{\text{bg}}}

Estimating background Lbg\mathbf{L}_{\text{bg}}:

Method 1: Spatial background (most common)

Method 2: Temporal background

Method 3: Model-based

Operational consideration:

Choice of background estimation has large impact on performance:

Best practice: Use spatially local, spectrally matched background from plume-free regions with median aggregation.

Noise Characteristics

Normalized noise covariance:

If absolute noise is ϵabsN(0,Σabs)\boldsymbol{\epsilon}_{\text{abs}} \sim \mathcal{N}(\mathbf{0}, \mathbf{\Sigma}_{\text{abs}}):

ϵnorm=ϵabsLbgN(0,diag(1Lbg)Σabsdiag(1Lbg))\boldsymbol{\epsilon}_{\text{norm}} = \frac{\boldsymbol{\epsilon}_{\text{abs}}}{\mathbf{L}_{\text{bg}}} \sim \mathcal{N}\left(\mathbf{0}, \text{diag}\left(\frac{1}{\mathbf{L}_{\text{bg}}}\right) \mathbf{\Sigma}_{\text{abs}} \text{diag}\left(\frac{1}{\mathbf{L}_{\text{bg}}}\right)\right)

For diagonal Σabs\mathbf{\Sigma}_{\text{abs}}:

Σnorm,ii=Σabs,iiLbg,i2\Sigma_{\text{norm},ii} = \frac{\Sigma_{\text{abs},ii}}{L^2_{\text{bg},i}}

Physical interpretation:

Typical values:

Operational impact:


6. Measurement Model

Forward Measurement Model (Linear)

ynorm=g(α)+ϵnorm\mathbf{y}_{\text{norm}} = g(\alpha) + \boldsymbol{\epsilon}_{\text{norm}}

where:

g(α)=1Hαg(\alpha) = \mathbf{1} - \mathbf{H}\alpha

Linearity is the key:

g(α1+α2)=g(α1)+g(α2)1g(\alpha_1 + \alpha_2) = g(\alpha_1) + g(\alpha_2) - \mathbf{1}

Actually, better to write:

g(α)1=Hαg(\alpha) - \mathbf{1} = -\mathbf{H}\alpha

So the deviation from background is linear in α\alpha:

d=ynorm1=Hα+ϵnorm\mathbf{d} = \mathbf{y}_{\text{norm}} - \mathbf{1} = -\mathbf{H}\alpha + \boldsymbol{\epsilon}_{\text{norm}}

Inverse Problem: Maximum Likelihood Estimation

Likelihood function:

p(ynormα)=1(2π)n/2Σnorm1/2exp(12(ynormg(α))TΣnorm1(ynormg(α)))p(\mathbf{y}_{\text{norm}} | \alpha) = \frac{1}{(2\pi)^{n/2}|\mathbf{\Sigma}_{\text{norm}}|^{1/2}} \exp\left(-\frac{1}{2}(\mathbf{y}_{\text{norm}} - g(\alpha))^T \mathbf{\Sigma}_{\text{norm}}^{-1} (\mathbf{y}_{\text{norm}} - g(\alpha))\right)

Log-likelihood:

logp(ynormα)=12(d+Hα)TΣnorm1(d+Hα)+const\log p(\mathbf{y}_{\text{norm}} | \alpha) = -\frac{1}{2}(\mathbf{d} + \mathbf{H}\alpha)^T \mathbf{\Sigma}_{\text{norm}}^{-1} (\mathbf{d} + \mathbf{H}\alpha) + \text{const}

Maximize by taking derivative:

logpα=HTΣnorm1(d+Hα)=0\frac{\partial \log p}{\partial \alpha} = -\mathbf{H}^T \mathbf{\Sigma}_{\text{norm}}^{-1} (\mathbf{d} + \mathbf{H}\alpha) = 0

Solve for α\alpha:

HTΣnorm1Hα=HTΣnorm1d\mathbf{H}^T \mathbf{\Sigma}_{\text{norm}}^{-1} \mathbf{H} \cdot \alpha = -\mathbf{H}^T \mathbf{\Sigma}_{\text{norm}}^{-1} \mathbf{d}
α^ML=HTΣnorm1dHTΣnorm1H=HTΣnorm1(1ynorm)HTΣnorm1H\boxed{\hat{\alpha}_{\text{ML}} = -\frac{\mathbf{H}^T \mathbf{\Sigma}_{\text{norm}}^{-1} \mathbf{d}}{\mathbf{H}^T \mathbf{\Sigma}_{\text{norm}}^{-1} \mathbf{H}} = \frac{\mathbf{H}^T \mathbf{\Sigma}_{\text{norm}}^{-1} (\mathbf{1} - \mathbf{y}_{\text{norm}})}{\mathbf{H}^T \mathbf{\Sigma}_{\text{norm}}^{-1} \mathbf{H}}}

Units: ppm

This is a closed-form, single-step solution! No iteration, no convergence issues, no initial guess needed.

Computational cost breakdown:

For nn wavelengths:

  1. Compute innovation: d=ynorm1\mathbf{d} = \mathbf{y}_{\text{norm}} - \mathbf{1}O(n)O(n)

  2. Matrix-vector product: Σnorm1d\mathbf{\Sigma}_{\text{norm}}^{-1}\mathbf{d}O(n2)O(n^2) (or O(n)O(n) if diagonal)

  3. Inner products: HT()\mathbf{H}^T(\cdots)O(n)O(n)

  4. Division: Scalar / scalar → O(1)O(1)

Total: O(n2)O(n^2) general, O(n)O(n) for diagonal covariance

Typical values:

Compare to iterative methods:

This speed enables real-time operations for airborne and satellite missions.

Statistical Properties of the Estimate

Variance:

Var(α^ML)=(HTΣnorm1H)1\text{Var}(\hat{\alpha}_{\text{ML}}) = \left(\mathbf{H}^T \mathbf{\Sigma}_{\text{norm}}^{-1} \mathbf{H}\right)^{-1}

Standard error:

σα^=(HTΣnorm1H)1\sigma_{\hat{\alpha}} = \sqrt{\left(\mathbf{H}^T \mathbf{\Sigma}_{\text{norm}}^{-1} \mathbf{H}\right)^{-1}}

This is the Cramér-Rao lower bound: No unbiased estimator can have lower variance. The matched filter achieves optimal statistical efficiency.

For white noise (Σnorm=σnorm2I\mathbf{\Sigma}_{\text{norm}} = \sigma^2_{\text{norm}} \mathbf{I}):

σα^=σnormHTH=σnormH\sigma_{\hat{\alpha}} = \frac{\sigma_{\text{norm}}}{\sqrt{\mathbf{H}^T\mathbf{H}}} = \frac{\sigma_{\text{norm}}}{\|\mathbf{H}\|}

Physical interpretation:

Typical uncertainty:

Detection threshold:

For 3σ detection: Need α^>3σα^\hat{\alpha} > 3\sigma_{\hat{\alpha}} → Minimum detectable enhancement ~150-600 ppm depending on conditions.


7. Taylor Expanded Measurement Model (Useful for 3DVar)

Exact Representation (No Additional Approximation)

Since the forward model is already linear, the “Taylor expansion” is exact:

g(α)=g(α0)+H(αα0)g(\alpha) = g(\alpha_0) + \mathbf{H}(\alpha - \alpha_0)

This is not an approximation—it’s the exact representation of a linear function.

At background (α0=0\alpha_0 = 0):

Linearized model:

ynorm=1+Hα+ϵnorm\mathbf{y}_{\text{norm}} = \mathbf{1} + \mathbf{H}\alpha + \boldsymbol{\epsilon}_{\text{norm}}

Innovation:

d=ynorm1=Hα+ϵnorm\mathbf{d} = \mathbf{y}_{\text{norm}} - \mathbf{1} = \mathbf{H}\alpha + \boldsymbol{\epsilon}_{\text{norm}}

This is already in the standard linear form for data assimilation.

3DVar Formulation

Cost function with prior constraint:

J(α)=12B(ααb)2+12(dHα)TΣnorm1(dHα)J(\alpha) = \frac{1}{2B}(\alpha - \alpha_b)^2 + \frac{1}{2}(\mathbf{d} - \mathbf{H}\alpha)^T \mathbf{\Sigma}_{\text{norm}}^{-1} (\mathbf{d} - \mathbf{H}\alpha)

where:

Physical interpretation:

Analytical solution:

Taking derivative and setting to zero:

Jα=1B(ααb)HTΣnorm1(dHα)=0\frac{\partial J}{\partial \alpha} = \frac{1}{B}(\alpha - \alpha_b) - \mathbf{H}^T\mathbf{\Sigma}_{\text{norm}}^{-1}(\mathbf{d} - \mathbf{H}\alpha) = 0

Rearranging:

(1B+HTΣnorm1H)α=αbB+HTΣnorm1d\left(\frac{1}{B} + \mathbf{H}^T\mathbf{\Sigma}_{\text{norm}}^{-1}\mathbf{H}\right)\alpha = \frac{\alpha_b}{B} + \mathbf{H}^T\mathbf{\Sigma}_{\text{norm}}^{-1}\mathbf{d}
α^3DVar=BHTΣnorm1d+αbBHTΣnorm1H+1\boxed{\hat{\alpha}_{3\text{DVar}} = \frac{B\mathbf{H}^T\mathbf{\Sigma}_{\text{norm}}^{-1}\mathbf{d} + \alpha_b}{B\mathbf{H}^T\mathbf{\Sigma}_{\text{norm}}^{-1}\mathbf{H} + 1}}

For no prior plume (αb=0\alpha_b = 0):

α^3DVar=BHTΣnorm1dBHTΣnorm1H+1=HTΣnorm1dHTΣnorm1H+B1\hat{\alpha}_{3\text{DVar}} = \frac{B\mathbf{H}^T\mathbf{\Sigma}_{\text{norm}}^{-1}\mathbf{d}}{B\mathbf{H}^T\mathbf{\Sigma}_{\text{norm}}^{-1}\mathbf{H} + 1} = \frac{\mathbf{H}^T\mathbf{\Sigma}_{\text{norm}}^{-1}\mathbf{d}}{\mathbf{H}^T\mathbf{\Sigma}_{\text{norm}}^{-1}\mathbf{H} + B^{-1}}

Posterior variance:

σα,post2=(1B+HTΣnorm1H)1=(B1+HTΣnorm1H)1\sigma^2_{\alpha,\text{post}} = \left(\frac{1}{B} + \mathbf{H}^T\mathbf{\Sigma}_{\text{norm}}^{-1}\mathbf{H}\right)^{-1} = \left(B^{-1} + \mathbf{H}^T\mathbf{\Sigma}_{\text{norm}}^{-1}\mathbf{H}\right)^{-1}

Physical interpretation:

Operational use:


8. Pedagogical Connection to Matched Filter

From Linear Model to Matched Filter

The matched filter is simply the maximum likelihood estimator for the linear-Gaussian model with no prior constraint:

Combined Linear ModelBMatched Filter\boxed{\text{Combined Linear Model} \xrightarrow{B \to \infty} \text{Matched Filter}}

No approximation involved in this step—it’s a direct consequence of the model structure.

The Matched Filter Formula

Define target spectrum:

t=H=σNtotal106LAMF\mathbf{t} = -\mathbf{H} = \boldsymbol{\sigma} \odot N_{\text{total}} \cdot 10^{-6} L \cdot \text{AMF}

Note the sign: We define t\mathbf{t} as the negative of H\mathbf{H} so that t>0\mathbf{t} > 0 (absorption causes positive values in standard convention).

Matched filter estimate:

α^MF=tTΣnorm1dtTΣnorm1t\hat{\alpha}_{\text{MF}} = \frac{\mathbf{t}^T\mathbf{\Sigma}_{\text{norm}}^{-1}\mathbf{d}}{\mathbf{t}^T\mathbf{\Sigma}_{\text{norm}}^{-1}\mathbf{t}}

where d=1ynorm\mathbf{d} = \mathbf{1} - \mathbf{y}_{\text{norm}} (note sign: we want positive d\mathbf{d} for absorption).

For white noise:

α^MF=tTdtTt=tT(1ynorm)t2\hat{\alpha}_{\text{MF}} = \frac{\mathbf{t}^T\mathbf{d}}{\mathbf{t}^T\mathbf{t}} = \frac{\mathbf{t}^T(\mathbf{1} - \mathbf{y}_{\text{norm}})}{\|\mathbf{t}\|^2}

Physical interpretation:

Detection Statistic

Define detection statistic:

δ=tTΣnorm1d\delta = \mathbf{t}^T\mathbf{\Sigma}_{\text{norm}}^{-1}\mathbf{d}

Relationship to estimate:

α^MF=δtTΣnorm1t\hat{\alpha}_{\text{MF}} = \frac{\delta}{\mathbf{t}^T\mathbf{\Sigma}_{\text{norm}}^{-1}\mathbf{t}}

Under null hypothesis (α=0\alpha = 0, no plume):

δN(0,tTΣnormt)\delta \sim \mathcal{N}(0, \mathbf{t}^T\mathbf{\Sigma}_{\text{norm}}\mathbf{t})

Standard deviation:

σδ=tTΣnormt\sigma_{\delta} = \sqrt{\mathbf{t}^T\mathbf{\Sigma}_{\text{norm}}\mathbf{t}}

For white noise:

σδ=σnormt\sigma_{\delta} = \sigma_{\text{norm}} \|\mathbf{t}\|

Detection threshold:

Declare plume if δ>λσδ\delta > \lambda \sigma_{\delta} where:

Operational choice: Typically λ=34\lambda = 3-4 balancing false alarms vs. missed detections.

Why It’s Called “Matched” Filter

The filter is matched to the expected signal in two senses:

  1. Spectral matching: The target t\mathbf{t} has the same spectral shape as the expected plume signature (methane absorption spectrum)

  2. Statistical matching: The weighting by Σnorm1\mathbf{\Sigma}_{\text{norm}}^{-1} whitens the noise, making the filter optimal in the signal-to-noise ratio sense

Matched filter theorem: For detecting a known signal in Gaussian noise, the matched filter maximizes SNR and minimizes probability of error.

Computational Implementation

Pseudocode for operational system:

## Preprocessing (once per scene)
L_bg = median(background_pixels, axis=0)  # #Estimate background
y_norm = y_absolute / L_bg                # Normalize radiance
## Matched filter (per pixel)
d = 1 - y_norm                            # Innovation
numerator = t.T @ Sigma_inv @ d           # Correlation
denominator = t.T @ Sigma_inv @ t         # Normalization
alpha_hat = numerator / denominator        # VMR estimate
## Detection
sigma_delta = sqrt(t.T @ Sigma_norm @ t)  # Detection threshold
delta = numerator                          # Detection statistic
detected = (delta > 3 * sigma_delta)      # 3-sigma detection

Optimizations:

With these optimizations: ~0.1 ms/pixel on GPU, enabling real-time processing.


Summary: Model 4B in Context

The Optimal Operational Choice

Model 4B achieves the best trade-off for operational methane detection:

CriterionModel 4B Performance
AccuracyGood (5-10% error for moderate plumes)
SpeedExcellent (100× faster than exact)
CalibrationNone required (self-calibrating)
RobustnessHigh (immune to multiplicative errors)
ImplementationSimple (closed-form solution)
Validity rangeModerate plumes (Δτ<0.1\Delta\tau < 0.1, ~1000 ppm)

When to Use Model 4B

Recommended for:

Not recommended for:

Hierarchy Position

Most Accurate ←──────────────────→ Fastest Model 1B Model 3B Model 4B Model 2B (Exact) (Taylor only) (Combined) (MacLaurin only) ●──────────────●──────────────●──────────────● │ │ │ │ 0% error 2-5% error 5-10% error 15-20% error 50 ms/px 10 ms/px 0.5 ms/px 0.5 ms/px Iterative Iterative Closed-form Closed-form All τ τ < 0.3 τ < 0.1 τ < 0.05

Model 4B occupies the “sweet spot” where accuracy is good enough and speed is fast enough for practical operations.

Future Directions

Extensions of Model 4B:

The combined linearized model will remain the foundation of operational methane detection for the foreseeable future.