Skip to article frontmatterSkip to article content

Beer-Lambert's Law RTM

United Nations Environmental Programme

Introduction and Physical Motivation

The Absolute Radiance Model

The absolute radiance formulation represents the fundamental forward model for atmospheric methane detection using passive remote sensing in the shortwave infrared (SWIR). This model describes how radiance measured at a satellite or airborne sensor relates to atmospheric methane concentration through the Beer-Lambert law of absorption.

Physical Context

When solar radiation reflects off Earth’s surface and travels through the atmosphere to reach a sensor, it is attenuated by atmospheric absorption. Methane absorbs strongly in specific SWIR bands (notably 1650 nm and 2300 nm). The Beer-Lambert law quantifies this exponential attenuation as a function of:

  1. Molecular properties: Absorption cross-section (temperature and pressure dependent)

  2. Atmospheric state: Total number density (from temperature and pressure via ideal gas law)

  3. Methane concentration: Volume mixing ratio along the optical path

  4. Geometric factors: Path length and viewing geometry (air mass factor)

  5. Scene properties: Surface reflectance and solar illumination

Applications

This model is essential for:

Treatment Relative to Observations

Absolute radiance measurements require:

These requirements make absolute retrievals challenging but provide physically interpretable quantities needed for quantitative emission estimation.


1. Concentration

Total Concentration - Background + Enhancement

The total methane volume mixing ratio is the sum of background and enhancement:

VMRtotal=VMRbg+ΔVMR\text{VMR}_{\text{total}} = \text{VMR}_{\text{bg}} + \Delta\text{VMR}

Components:

Relationship to column-averaged quantities:

For column measurements:

XCH4=0ztopnCH4(z)dz0ztopnair(z)dz\text{XCH}_4 = \frac{\int_0^{z_{\text{top}}} n_{\text{CH}_4}(z) \, dz}{\int_0^{z_{\text{top}}} n_{\text{air}}(z) \, dz}

where nn denotes number density profiles. For boundary layer plumes:

XCH4VMRbg+ΔVMRhplumeHatm\text{XCH}_4 \approx \text{VMR}_{\text{bg}} + \Delta\text{VMR} \cdot \frac{h_{\text{plume}}}{H_{\text{atm}}}

where hplumeh_{\text{plume}} is plume vertical extent and HatmH_{\text{atm}} is atmospheric scale height (~8 km).


2. Optical Depth + Jacobian

Optical Depth

The optical depth (also called optical thickness or extinction) quantifies the total absorption along the atmospheric path. It is dimensionless and represents the natural logarithm of the attenuation factor.

τ(VMR)=σ(λ,T,P)NtotalVMR106LAMF\tau(\text{VMR}) = \sigma(\lambda,T,P) \cdot N_{\text{total}} \cdot \text{VMR} \cdot 10^{-6} \cdot L \cdot \text{AMF}

Notice how we have many control vectors, (λ,T,P)(\lambda,T,P), where we condition our prediction of the optical depth. Physical parameters with full definitions:

Linearity property:

The optical depth is linear in VMR:

τVMR=constant\frac{\partial \tau}{\partial \text{VMR}} = \text{constant}

This fundamental property enables analytical solutions and is exploited in linear inversions.


Enhancement Optical Depth

Additive decomposition:

τtotal=τbg+Δτ\tau_{\text{total}} = \tau_{\text{bg}} + \Delta\tau

where:

Background optical depth:

τbg=σ(λ,T,P)NtotalVMRbg106LAMF\tau_{\text{bg}} = \sigma(\lambda,T,P) \cdot N_{\text{total}} \cdot \text{VMR}_{\text{bg}} \cdot 10^{-6} \cdot L \cdot \text{AMF}

Enhancement optical depth:

Δτ=σ(λ,T,P)NtotalΔVMR106LAMF\Delta\tau = \sigma(\lambda,T,P) \cdot N_{\text{total}} \cdot \Delta\text{VMR} \cdot 10^{-6} \cdot L \cdot \text{AMF}

Jacobian of Optical Depth

The sensitivity of optical depth to VMR changes:

τVMR=σ(λ,T,P)Ntotal106LAMF\frac{\partial \tau}{\partial \text{VMR}} = \sigma(\lambda,T,P) \cdot N_{\text{total}} \cdot 10^{-6} \cdot L \cdot \text{AMF}

Units: ppm⁻¹ (dimensionless per ppm)

Physical interpretation:

Alternative notation for enhancement:

τ(ΔVMR)=σ(λ,T,P)Ntotal106LAMF\frac{\partial \tau}{\partial (\Delta\text{VMR})} = \sigma(\lambda,T,P) \cdot N_{\text{total}} \cdot 10^{-6} \cdot L \cdot \text{AMF}

This is mathematically identical, emphasizing that the Jacobian with respect to enhancement equals the Jacobian with respect to total VMR.

Wavelength dependence:

The Jacobian varies across wavelengths due to σ(λ)\sigma(\lambda):

Typical values at 2300 nm:


3. Transmittance + Jacobian

Transmittance

The transmittance is the fraction of incident radiation that passes through the atmosphere without being absorbed:

T(VMR)=exp(τ(VMR))T(\text{VMR}) = \exp(-\tau(\text{VMR}))

Units: dimensionless (range: 0 to 1, where 1 = no absorption, 0 = complete absorption)

Multiplicative decomposition:

Ttotal=TbgTenhT_{\text{total}} = T_{\text{bg}} \cdot T_{\text{enh}}

where:

Background transmittance:

Tbg=exp(τbg)T_{\text{bg}} = \exp(-\tau_{\text{bg}})

Enhancement transmittance:

Tenh=exp(Δτ)T_{\text{enh}} = \exp(-\Delta\tau)

Multiplicative property:

Using the exponential identity exp(a+b)=exp(a)exp(b)\exp(a + b) = \exp(a) \cdot \exp(b):

Ttotal=exp((τbg+Δτ))=exp(τbg)exp(Δτ)=TbgTenhT_{\text{total}} = \exp(-(\tau_{\text{bg}} + \Delta\tau)) = \exp(-\tau_{\text{bg}}) \cdot \exp(-\Delta\tau) = T_{\text{bg}} \cdot T_{\text{enh}}

Physical interpretation:

Saturation effects:

For large optical depths (τ>1\tau > 1), transmittance approaches zero exponentially:

This saturation limits the dynamic range for quantification at strong absorption lines.

Jacobian of Transmittance

With respect to total VMR:

TVMR=VMR[exp(τ)]=exp(τ)τVMR\frac{\partial T}{\partial \text{VMR}} = \frac{\partial}{\partial \text{VMR}}\left[\exp(-\tau)\right] = -\exp(-\tau) \cdot \frac{\partial \tau}{\partial \text{VMR}}

Explicit form:

TVMR=exp(τ)σNtotal106LAMF\frac{\partial T}{\partial \text{VMR}} = -\exp(-\tau) \cdot \sigma \cdot N_{\text{total}} \cdot 10^{-6} \cdot L \cdot \text{AMF}

Units: ppm⁻¹

At background:

TVMRbg=TbgσNtotal106LAMF\frac{\partial T}{\partial \text{VMR}}\bigg|_{\text{bg}} = -T_{\text{bg}} \cdot \sigma \cdot N_{\text{total}} \cdot 10^{-6} \cdot L \cdot \text{AMF}

With respect to enhancement (more natural for plume detection):

Tenh(ΔVMR)=exp(Δτ)σNtotal106LAMF\frac{\partial T_{\text{enh}}}{\partial (\Delta\text{VMR})} = -\exp(-\Delta\tau) \cdot \sigma \cdot N_{\text{total}} \cdot 10^{-6} \cdot L \cdot \text{AMF}

At background (Δτ=0\Delta\tau = 0):

Tenh(ΔVMR)0=σNtotal106LAMF\frac{\partial T_{\text{enh}}}{\partial (\Delta\text{VMR})}\bigg|_0 = -\sigma \cdot N_{\text{total}} \cdot 10^{-6} \cdot L \cdot \text{AMF}

Key properties:

  1. Nonlinear: Jacobian depends on current transmittance (or optical depth) through the exp(τ)\exp(-\tau) term

  2. Negative sign: Increasing VMR decreases transmittance (absorption effect)

  3. Maximum sensitivity at zero optical depth: TVMR|\frac{\partial T}{\partial \text{VMR}}| is largest when τ=0\tau = 0

  4. Saturation: As τ\tau increases, TVMR|\frac{\partial T}{\partial \text{VMR}}| decreases exponentially (diminishing sensitivity)

Sensitivity vs. optical depth:

TVMR=exp(τ)τVMR\left|\frac{\partial T}{\partial \text{VMR}}\right| = \exp(-\tau) \cdot \left|\frac{\partial \tau}{\partial \text{VMR}}\right|

This motivates using moderate absorption lines (not the strongest) for quantitative retrievals to avoid saturation.


4. Beer’s Law + Jacobian

Beer’s Law (Forward Model)

The at-sensor radiance observed by a hyperspectral instrument follows the Beer-Lambert law:

L(VMR)=F0RπT(VMR)=F0Rπexp(τ(VMR))L(\text{VMR}) = \frac{F_0 R}{\pi} \cdot T(\text{VMR}) = \frac{F_0 R}{\pi} \cdot \exp(-\tau(\text{VMR}))

Units: W/m²/sr/cm⁻¹ (spectral radiance)

Alternative units: W/m²/sr/nm (per wavelength), μW/(cm²·sr·cm⁻¹) (atmospheric science convention)

Decomposed form using multiplicative transmittance:

L(VMRtotal)=F0RπTbgTenhL(\text{VMR}_{\text{total}}) = \frac{F_0 R}{\pi} \cdot T_{\text{bg}} \cdot T_{\text{enh}}
L(VMRtotal)=F0Rπexp(τbg)Lbgexp(Δτ)TenhL(\text{VMR}_{\text{total}}) = \underbrace{\frac{F_0 R}{\pi} \cdot \exp(-\tau_{\text{bg}})}_{L_{\text{bg}}} \cdot \underbrace{\exp(-\Delta\tau)}_{T_{\text{enh}}}
L(VMRtotal)=Lbgexp(Δτ)\boxed{L(\text{VMR}_{\text{total}}) = L_{\text{bg}} \cdot \exp(-\Delta\tau)}

Physical components with full definitions:

Physical interpretation of decomposed form:

The observed radiance is the background radiance attenuated by the plume enhancement factor exp(Δτ)\exp(-\Delta\tau).

Assumptions and approximations:

  1. Single scattering: Neglects multiple scattering by molecules and aerosols

    • Valid for clear atmospheres and moderate optical depths

    • Error: <5% for τ<1\tau < 1 in clean air

  2. Lambertian surface: Isotropic reflection (BRDF approximated as constant)

    • Error depends on surface type and viewing geometry

    • Typical error: 10-30% in absolute radiance

  3. Plane-parallel atmosphere: Horizontal homogeneity

    • Valid for satellite footprints < 10 km

    • Breaks down near cloud edges or strong horizontal gradients

  4. Neglects atmospheric scattering: Rayleigh and aerosol scattering not included

    • Valid in SWIR where scattering is weak

    • More important in visible/near-IR

  5. Path-averaged concentration: Assumes vertically well-mixed plume

    • For boundary layer plumes, reasonable approximation

    • Breaks down for elevated plumes or vertical gradients

Jacobian of Radiance

The sensitivity of radiance to VMR changes:

LVMR=F0RπTVMR=F0Rπ(exp(τ))τVMR\frac{\partial L}{\partial \text{VMR}} = \frac{F_0 R}{\pi} \cdot \frac{\partial T}{\partial \text{VMR}} = \frac{F_0 R}{\pi} \cdot \left(-\exp(-\tau)\right) \cdot \frac{\partial \tau}{\partial \text{VMR}}

Explicit form:

LVMR=F0Rπexp(τ)σNtotal106LAMF\frac{\partial L}{\partial \text{VMR}} = -\frac{F_0 R}{\pi} \cdot \exp(-\tau) \cdot \sigma \cdot N_{\text{total}} \cdot 10^{-6} \cdot L \cdot \text{AMF}

Units: (W/m²/sr/cm⁻¹)/ppm

At background:

LVMRbg=LbgσNtotal106LAMF\frac{\partial L}{\partial \text{VMR}}\bigg|_{\text{bg}} = -L_{\text{bg}} \cdot \sigma \cdot N_{\text{total}} \cdot 10^{-6} \cdot L \cdot \text{AMF}

Alternative form using enhancement:

L(ΔVMR)=Lbgexp(Δτ)σNtotal106LAMF\frac{\partial L}{\partial (\Delta\text{VMR})} = -L_{\text{bg}} \cdot \exp(-\Delta\tau) \cdot \sigma \cdot N_{\text{total}} \cdot 10^{-6} \cdot L \cdot \text{AMF}

At background (Δτ=0\Delta\tau = 0):

L(ΔVMR)0=LbgσNtotal106LAMF\frac{\partial L}{\partial (\Delta\text{VMR})}\bigg|_0 = -L_{\text{bg}} \cdot \sigma \cdot N_{\text{total}} \cdot 10^{-6} \cdot L \cdot \text{AMF}

Key properties:

  1. Nonlinear: Depends on current radiance level (or optical depth) through exponential term

  2. Negative sign: Increasing methane decreases radiance (absorption)

  3. Proportional to background radiance: Brighter scenes have larger absolute sensitivity

  4. Wavelength dependent: Follows absorption cross-section spectrum

  5. Saturation: Sensitivity decreases exponentially as plume strengthens

Physical interpretation:

The Jacobian represents the expected change in observed radiance for a 1 ppm increase in methane concentration:

Signal-to-noise considerations:

For detection, require:

LVMRΔVMR>kσnoise\left|\frac{\partial L}{\partial \text{VMR}} \cdot \Delta\text{VMR}\right| > k \cdot \sigma_{\text{noise}}

where k=35k = 3-5 for 3σ-5σ detection threshold and σnoise\sigma_{\text{noise}} is the instrument noise.

Typical instrument noise:


5. Observations

Observation Model

The measured radiance spectrum at each pixel:

y=L(VMRtotal)+ϵ\mathbf{y} = L(\text{VMR}_{\text{total}}) + \boldsymbol{\epsilon}

Explicit form:

y=F0Rπexp(σNtotalVMRtotal106LAMF)+ϵ\mathbf{y} = \frac{F_0 R}{\pi} \exp\left(-\sigma N_{\text{total}} \text{VMR}_{\text{total}} \cdot 10^{-6} L \cdot \text{AMF}\right) + \boldsymbol{\epsilon}

Using enhancement formulation:

y=Lbgexp(Δτ(α))+ϵ\mathbf{y} = \mathbf{L}_{\text{bg}} \odot \exp(-\boldsymbol{\Delta\tau}(\alpha)) + \boldsymbol{\epsilon}

where α=ΔVMR\alpha = \Delta\text{VMR} is the unknown enhancement parameter.

Components with full definitions:

Noise Characteristics

Covariance matrix Σ\mathbf{\Sigma}:

Σ=E[ϵϵT]\mathbf{\Sigma} = \mathbb{E}[\boldsymbol{\epsilon}\boldsymbol{\epsilon}^T]

Diagonal elements Σii=σi2\Sigma_{ii} = \sigma^2_i:

Off-diagonal elements Σij\Sigma_{ij} (iji \neq j):

Simplified models:

  1. White noise (diagonal, equal variance):

    Σ=σ2I\mathbf{\Sigma} = \sigma^2 \mathbf{I}
    • Simplest assumption

    • Often reasonable for well-calibrated instruments

  2. Diagonal heteroscedastic (wavelength-dependent variance):

    Σ=diag(σ12,σ22,,σn2)\mathbf{\Sigma} = \text{diag}(\sigma^2_1, \sigma^2_2, \ldots, \sigma^2_n)
    • Accounts for wavelength-dependent SNR

    • Common practical choice

  3. Full covariance:

    Σ=full matrix\mathbf{\Sigma} = \text{full matrix}
    • Most accurate but computationally expensive

    • Required for optimal weighted least squares

Noise estimation methods:

Relationship to Raw Measurements

Sensor measurement chain:

  1. Photon flux → Detector

  2. Photoelectrons → Readout

  3. Digital counts (DN) → Calibration

  4. At-sensor radiance (W/m²/sr/nm) → Atmospheric correction

  5. Surface-leaving radiance → (not needed for absolute methane detection)

Radiometric calibration:

L(λ)=Gain(λ)(DN(λ)DNdark(λ))+Offset(λ)L(\lambda) = \text{Gain}(\lambda) \cdot (\text{DN}(\lambda) - \text{DN}_{\text{dark}}(\lambda)) + \text{Offset}(\lambda)

where:

Uncertainty propagation:

Total noise includes:

σtotal2=σradiometric2+σshot2+σsystematic2\sigma^2_{\text{total}} = \sigma^2_{\text{radiometric}} + \sigma^2_{\text{shot}} + \sigma^2_{\text{systematic}}

Typical allocation for well-calibrated instrument:


6. Measurement Model

Forward Measurement Model

The relationship between unknown VMR enhancement and observations:

y=g(α)+ϵ\mathbf{y} = g(\alpha) + \boldsymbol{\epsilon}

where the forward operator is:

g(α)=Lbgexp(Δτ(α))g(\alpha) = \mathbf{L}_{\text{bg}} \odot \exp(-\boldsymbol{\Delta\tau}(\alpha))

with:

Δτ(α)=σNtotalα106LAMF\boldsymbol{\Delta\tau}(\alpha) = \boldsymbol{\sigma} \odot N_{\text{total}} \cdot \alpha \cdot 10^{-6} L \cdot \text{AMF}

Model structure:

Properties of forward operator:

  1. Nonlinear: g(α1+α2)g(α1)+g(α2)g(\alpha_1 + \alpha_2) \neq g(\alpha_1) + g(\alpha_2) due to exponential

  2. Monotonic: gα<0\frac{\partial g}{\partial \alpha} < 0 (increasing VMR decreases radiance)

  3. Bounded: 0<g(α)<Lbg0 < g(\alpha) < \mathbf{L}_{\text{bg}} for α0\alpha \geq 0

  4. Smooth: Infinitely differentiable

  5. Wavelength-coupled: All wavelengths depend on same scalar α\alpha

Inverse Problem (Maximum Likelihood Estimation)

Goal: Estimate α\alpha from observations y\mathbf{y}

Likelihood function (Gaussian noise assumption):

p(yα)=1(2π)n/2Σ1/2exp(12(yg(α))TΣ1(yg(α)))p(\mathbf{y}|\alpha) = \frac{1}{(2\pi)^{n/2}|\mathbf{\Sigma}|^{1/2}} \exp\left(-\frac{1}{2}(\mathbf{y} - g(\alpha))^T \mathbf{\Sigma}^{-1} (\mathbf{y} - g(\alpha))\right)

Maximum likelihood estimate:

α^ML=argmaxαp(yα)=argminα{logp(yα)}\hat{\alpha}_{\text{ML}} = \arg\max_{\alpha} p(\mathbf{y}|\alpha) = \arg\min_{\alpha} \left\{-\log p(\mathbf{y}|\alpha)\right\}

Cost function (negative log-likelihood):

J(α)=12(yg(α))TΣ1(yg(α))+12logΣ+n2log(2π)J(\alpha) = \frac{1}{2}(\mathbf{y} - g(\alpha))^T \mathbf{\Sigma}^{-1} (\mathbf{y} - g(\alpha)) + \frac{1}{2}\log|\mathbf{\Sigma}| + \frac{n}{2}\log(2\pi)

Since last two terms are independent of α\alpha:

α^ML=argminα{(yLbgexp(Δτ(α)))TΣ1(yLbgexp(Δτ(α)))}\boxed{\hat{\alpha}_{\text{ML}} = \arg\min_{\alpha} \left\{(\mathbf{y} - \mathbf{L}_{\text{bg}} \odot \exp(-\boldsymbol{\Delta\tau}(\alpha)))^T \mathbf{\Sigma}^{-1} (\mathbf{y} - \mathbf{L}_{\text{bg}} \odot \exp(-\boldsymbol{\Delta\tau}(\alpha)))\right\}}

Units: dimensionless (cost function), ppm (estimated α^\hat{\alpha})

Properties of cost function:

  1. Non-convex: Multiple local minima possible (though typically one dominant minimum)

  2. Smooth: Differentiable, enabling gradient-based optimization

  3. Weighted least squares: Σ1\mathbf{\Sigma}^{-1} weights wavelengths by inverse noise variance

Optimality conditions:

Taking derivative and setting to zero:

Jα=H(α)TΣ1(yg(α))=0\frac{\partial J}{\partial \alpha} = -\mathbf{H}(\alpha)^T \mathbf{\Sigma}^{-1} (\mathbf{y} - g(\alpha)) = 0

where:

H(α)=gα=Lbgexp(Δτ(α))(σNtotal106LAMF)\mathbf{H}(\alpha) = \frac{\partial g}{\partial \alpha} = -\mathbf{L}_{\text{bg}} \odot \exp(-\boldsymbol{\Delta\tau}(\alpha)) \odot \left(\boldsymbol{\sigma} \odot N_{\text{total}} \cdot 10^{-6} L \cdot \text{AMF}\right)

is the Jacobian matrix (n_wavelengths × 1).

Solution methods:

  1. Gauss-Newton iteration:

    α(k+1)=α(k)+(HTΣ1H)1HTΣ1(yg(α(k)))\alpha^{(k+1)} = \alpha^{(k)} + \left(\mathbf{H}^T \mathbf{\Sigma}^{-1} \mathbf{H}\right)^{-1} \mathbf{H}^T \mathbf{\Sigma}^{-1} (\mathbf{y} - g(\alpha^{(k)}))
  2. Levenberg-Marquardt:

    α(k+1)=α(k)+(HTΣ1H+λI)1HTΣ1(yg(α(k)))\alpha^{(k+1)} = \alpha^{(k)} + \left(\mathbf{H}^T \mathbf{\Sigma}^{-1} \mathbf{H} + \lambda \mathbf{I}\right)^{-1} \mathbf{H}^T \mathbf{\Sigma}^{-1} (\mathbf{y} - g(\alpha^{(k)}))

    where λ\lambda is damping parameter (adjusted each iteration)

  3. Gradient descent:

    α(k+1)=α(k)ηJαα(k)\alpha^{(k+1)} = \alpha^{(k)} - \eta \frac{\partial J}{\partial \alpha}\bigg|_{\alpha^{(k)}}

    where η\eta is learning rate

Initialization:

Convergence criteria:

Typical thresholds: ϵα=1\epsilon_{\alpha} = 1 ppm, ϵJ=106\epsilon_J = 10^{-6}, ϵg=104\epsilon_g = 10^{-4}

Computational cost:


7. Taylor Expanded Measurement Model (Useful for 3DVar)

First-Order Taylor Expansion

Linearize the forward operator around a reference state α0\alpha_0 (typically 0 for background):

g(α)g(α0)+H(α0)(αα0)g(\alpha) \approx g(\alpha_0) + \mathbf{H}(\alpha_0) \cdot (\alpha - \alpha_0)

where H(α0)=gαα0\mathbf{H}(\alpha_0) = \frac{\partial g}{\partial \alpha}\bigg|_{\alpha_0} is the observation operator (Jacobian).

Units:

Linearization at Background (α0=0\alpha_0 = 0)

Forward model at background:

g(0)=Lbgexp(Δτ(0))=Lbgexp(0)=Lbgg(0) = \mathbf{L}_{\text{bg}} \odot \exp(-\boldsymbol{\Delta\tau}(0)) = \mathbf{L}_{\text{bg}} \odot \exp(\mathbf{0}) = \mathbf{L}_{\text{bg}}

Jacobian at background:

H=H(0)=gαα=0\mathbf{H} = \mathbf{H}(0) = \frac{\partial g}{\partial \alpha}\bigg|_{\alpha=0}
H=α[Lbgexp(Δτ(α))]α=0\mathbf{H} = \frac{\partial}{\partial \alpha}\left[\mathbf{L}_{\text{bg}} \odot \exp(-\boldsymbol{\Delta\tau}(\alpha))\right]\bigg|_{\alpha=0}

Using chain rule:

H=Lbgα[exp(Δτ(α))]α=0\mathbf{H} = \mathbf{L}_{\text{bg}} \odot \frac{\partial}{\partial \alpha}\left[\exp(-\boldsymbol{\Delta\tau}(\alpha))\right]\bigg|_{\alpha=0}
H=Lbg[exp(Δτ(0))Δτα]\mathbf{H} = \mathbf{L}_{\text{bg}} \odot \left[-\exp(-\boldsymbol{\Delta\tau}(0)) \cdot \frac{\partial \boldsymbol{\Delta\tau}}{\partial \alpha}\right]

Since Δτ(0)=0\boldsymbol{\Delta\tau}(0) = \mathbf{0} and exp(0)=1\exp(\mathbf{0}) = \mathbf{1}:

H=LbgΔτα\mathbf{H} = -\mathbf{L}_{\text{bg}} \odot \frac{\partial \boldsymbol{\Delta\tau}}{\partial \alpha}
H=Lbg(σNtotal106LAMF)\mathbf{H} = -\mathbf{L}_{\text{bg}} \odot \left(\boldsymbol{\sigma} \odot N_{\text{total}} \cdot 10^{-6} L \cdot \text{AMF}\right)
H=LbgσNtotal106LAMF\boxed{\mathbf{H} = -\mathbf{L}_{\text{bg}} \odot \boldsymbol{\sigma} \odot N_{\text{total}} \cdot 10^{-6} L \cdot \text{AMF}}

Typical magnitudes:

Linearized Observation Model

yLbg+Hα+ϵ\mathbf{y} \approx \mathbf{L}_{\text{bg}} + \mathbf{H} \cdot \alpha + \boldsymbol{\epsilon}

Rearranged form (innovation):

Define the innovation vector (observation minus background):

d=yLbg\mathbf{d} = \mathbf{y} - \mathbf{L}_{\text{bg}}

Linearized model:

dHα+ϵ\mathbf{d} \approx \mathbf{H} \cdot \alpha + \boldsymbol{\epsilon}

This is a linear relationship between innovation and enhancement.

Matrix form:

dn×1=Hn×1α1×1+ϵn×1\underbrace{\mathbf{d}}_{n \times 1} = \underbrace{\mathbf{H}}_{n \times 1} \underbrace{\alpha}_{1 \times 1} + \underbrace{\boldsymbol{\epsilon}}_{n \times 1}

Since α\alpha is scalar, this is equivalent to:

di=Hiα+ϵi,i=1,,nd_i = H_i \cdot \alpha + \epsilon_i, \quad i = 1, \ldots, n

3DVar Formulation

Three-Dimensional Variational Data Assimilation combines observations with prior information to estimate atmospheric state.

Cost function with background constraint:

J(α)=Jb(α)+Jo(α)J(\alpha) = J_b(\alpha) + J_o(\alpha)

where:

Background term:

Jb(α)=12(ααb)TB1(ααb)=12B(ααb)2J_b(\alpha) = \frac{1}{2}(\alpha - \alpha_b)^T B^{-1} (\alpha - \alpha_b) = \frac{1}{2B}(\alpha - \alpha_b)^2

(since α\alpha is scalar, BB is scalar variance)

Observation term:

Jo(α)=12(dHα)TΣ1(dHα)J_o(\alpha) = \frac{1}{2}(\mathbf{d} - \mathbf{H}\alpha)^T \mathbf{\Sigma}^{-1} (\mathbf{d} - \mathbf{H}\alpha)

Total cost function:

J(α)=12B(ααb)2+12(dHα)TΣ1(dHα)J(\alpha) = \frac{1}{2B}(\alpha - \alpha_b)^2 + \frac{1}{2}(\mathbf{d} - \mathbf{H}\alpha)^T \mathbf{\Sigma}^{-1} (\mathbf{d} - \mathbf{H}\alpha)

Units: dimensionless (both terms normalized by respective covariances)

Physical interpretation:

Optimal solution (analytical):

Taking derivative:

Jα=1B(ααb)HTΣ1(dHα)=0\frac{\partial J}{\partial \alpha} = \frac{1}{B}(\alpha - \alpha_b) - \mathbf{H}^T \mathbf{\Sigma}^{-1} (\mathbf{d} - \mathbf{H}\alpha) = 0

Rearranging:

1B(ααb)=HTΣ1(dHα)\frac{1}{B}(\alpha - \alpha_b) = \mathbf{H}^T \mathbf{\Sigma}^{-1} (\mathbf{d} - \mathbf{H}\alpha)
1Bα1Bαb=HTΣ1dHTΣ1Hα\frac{1}{B}\alpha - \frac{1}{B}\alpha_b = \mathbf{H}^T \mathbf{\Sigma}^{-1} \mathbf{d} - \mathbf{H}^T \mathbf{\Sigma}^{-1} \mathbf{H}\alpha
(1B+HTΣ1H)α=HTΣ1d+1Bαb\left(\frac{1}{B} + \mathbf{H}^T \mathbf{\Sigma}^{-1} \mathbf{H}\right)\alpha = \mathbf{H}^T \mathbf{\Sigma}^{-1} \mathbf{d} + \frac{1}{B}\alpha_b

Solution:

α^=BHTΣ1d+αbHTΣ1H+B1\boxed{\hat{\alpha} = \frac{B \mathbf{H}^T \mathbf{\Sigma}^{-1} \mathbf{d} + \alpha_b}{\mathbf{H}^T \mathbf{\Sigma}^{-1} \mathbf{H} + B^{-1}}}

or equivalently:

α^=αb+BHTΣ1(dHαb)BHTΣ1H+1\boxed{\hat{\alpha} = \alpha_b + \frac{B\mathbf{H}^T \mathbf{\Sigma}^{-1}(\mathbf{d} - \mathbf{H}\alpha_b)}{B\mathbf{H}^T \mathbf{\Sigma}^{-1}\mathbf{H} + 1}}

For αb=0\alpha_b = 0 (no prior plume):

α^=BHTΣ1dBHTΣ1H+1=HTΣ1dHTΣ1H+B1\hat{\alpha} = \frac{B \mathbf{H}^T \mathbf{\Sigma}^{-1} \mathbf{d}}{B\mathbf{H}^T \mathbf{\Sigma}^{-1}\mathbf{H} + 1} = \frac{\mathbf{H}^T \mathbf{\Sigma}^{-1} \mathbf{d}}{\mathbf{H}^T \mathbf{\Sigma}^{-1}\mathbf{H} + B^{-1}}

Units check:

Posterior error variance:

The uncertainty in the estimate is:

σα2=(HTΣ1H+B1)1\sigma^2_{\alpha} = \left(\mathbf{H}^T \mathbf{\Sigma}^{-1} \mathbf{H} + B^{-1}\right)^{-1}

Standard deviation:

σα=σα2\sigma_{\alpha} = \sqrt{\sigma^2_{\alpha}}

Typical values: 50-200 ppm for good observations, 200-500 ppm for noisy observations


8. Pedagogical Connection to Matched Filter

From 3DVar to Matched Filter

The matched filter emerges as a special limiting case of 3DVar under specific assumptions about prior knowledge.

Assumption 1: Uninformative prior (infinite prior uncertainty)

Set BB \to \infty, which implies:

Assumption 2: Zero background (no prior plume expectation)

Set αb=0\alpha_b = 0, which means:

Resulting cost function:

J(α)=12(dHα)TΣ1(dHα)J(\alpha) = \frac{1}{2}(\mathbf{d} - \mathbf{H}\alpha)^T \mathbf{\Sigma}^{-1} (\mathbf{d} - \mathbf{H}\alpha)

This is pure weighted least squares without regularization.

3DVar solution with BB \to \infty:

α^=HTΣ1dHTΣ1H\hat{\alpha} = \frac{\mathbf{H}^T \mathbf{\Sigma}^{-1} \mathbf{d}}{\mathbf{H}^T \mathbf{\Sigma}^{-1}\mathbf{H}}

This is the generalized least squares (GLS) or maximum likelihood estimate under Gaussian noise.

Matched Filter Formulation

Define the target spectrum:

The target is simply the Jacobian evaluated at background:

t=H=LbgσNtotal106LAMF\mathbf{t} = \mathbf{H} = -\mathbf{L}_{\text{bg}} \odot \boldsymbol{\sigma} \odot N_{\text{total}} \cdot 10^{-6} L \cdot \text{AMF}

For specific target enhancement ΔVMRtarget\Delta\text{VMR}_{\text{target}} (e.g., 1000 ppm):

tΔVMR=HΔVMRtarget\mathbf{t}_{\Delta\text{VMR}} = \mathbf{H} \cdot \Delta\text{VMR}_{\text{target}}

Define the innovation:

d=yLbg\mathbf{d} = \mathbf{y} - \mathbf{L}_{\text{bg}}

Matched filter estimate:

α^=tTΣ1dtTΣ1t\hat{\alpha} = \frac{\mathbf{t}^T \mathbf{\Sigma}^{-1} \mathbf{d}}{\mathbf{t}^T \mathbf{\Sigma}^{-1} \mathbf{t}}

Detection statistic (unnormalized):

δ=tTΣ1d=tTΣ1(yLbg)\delta = \mathbf{t}^T \mathbf{\Sigma}^{-1} \mathbf{d} = \mathbf{t}^T \mathbf{\Sigma}^{-1} (\mathbf{y} - \mathbf{L}_{\text{bg}})

Relationship between estimate and statistic:

α^=δtTΣ1t\hat{\alpha} = \frac{\delta}{\mathbf{t}^T \mathbf{\Sigma}^{-1} \mathbf{t}}

The denominator normalizes the statistic to give VMR units.

White Noise Simplification

For white (uncorrelated, equal variance) noise:

Σ=σnoise2I\mathbf{\Sigma} = \sigma^2_{\text{noise}} \mathbf{I}

where σnoise2\sigma^2_{\text{noise}} is the noise variance (assumed equal at all wavelengths).

Estimate simplifies to:

α^=tTdtTt=tT(yLbg)t2\hat{\alpha} = \frac{\mathbf{t}^T \mathbf{d}}{\mathbf{t}^T \mathbf{t}} = \frac{\mathbf{t}^T (\mathbf{y} - \mathbf{L}_{\text{bg}})}{\|\mathbf{t}\|^2}

Detection statistic:

δ=1σnoise2tT(yLbg)\delta = \frac{1}{\sigma^2_{\text{noise}}} \mathbf{t}^T (\mathbf{y} - \mathbf{L}_{\text{bg}})

Normalized Matched Filter (Unit Target)

Normalize target to unit L2 norm:

tunit=tt=ttTt\mathbf{t}_{\text{unit}} = \frac{\mathbf{t}}{\|\mathbf{t}\|} = \frac{\mathbf{t}}{\sqrt{\mathbf{t}^T \mathbf{t}}}

Detection statistic (normalized):

δnorm=tunitT(yLbg)\delta_{\text{norm}} = \mathbf{t}_{\text{unit}}^T (\mathbf{y} - \mathbf{L}_{\text{bg}})

For white noise:

δnorm=1ttT(yLbg)\delta_{\text{norm}} = \frac{1}{\|\mathbf{t}\|} \mathbf{t}^T (\mathbf{y} - \mathbf{L}_{\text{bg}})

Threshold for detection:

Declare plume detected if:

δnorm>λσeff\delta_{\text{norm}} > \lambda \cdot \sigma_{\text{eff}}

where:

False alarm probability:

Under null hypothesis (α=0\alpha = 0, no plume):

PFA=P(δnorm>λσeffH0)=1Φ(λ)P_{\text{FA}} = P(\delta_{\text{norm}} > \lambda \sigma_{\text{eff}} | H_0) = 1 - \Phi(\lambda)

where Φ\Phi is the standard normal CDF.

Values:

Missed detection probability:

Under alternative hypothesis (α=αtrue\alpha = \alpha_{\text{true}}, plume present):

PMD=P(δnorm<λσeffH1)=Φ(λμsignalσeff)P_{\text{MD}} = P(\delta_{\text{norm}} < \lambda \sigma_{\text{eff}} | H_1) = \Phi\left(\lambda - \frac{\mu_{\text{signal}}}{\sigma_{\text{eff}}}\right)

where μsignal=tunitTHαtrue=tαtrue\mu_{\text{signal}} = \mathbf{t}_{\text{unit}}^T \mathbf{H} \alpha_{\text{true}} = \|\mathbf{t}\| \alpha_{\text{true}} is the expected signal.

This depends on:

Key Insights and Connections

1. Matched filter = Maximum likelihood under specific assumptions

2. Target = Jacobian at linearization point

3. Innovation = Mean subtraction

4. Covariance weighting = Optimal combination

5. Linear approximation required

6. Computational efficiency

7. Extensions

Connection Summary

Nonlinear ModelTaylor expand at α=0Linearized Model: yLbg+Hα+ϵDefine d=yLbgInnovation Model: dHα+ϵAdd prior3DVar: J(α)=12B(ααb)2+12(dHα)TΣ1(dHα)B,αb=0GLS/MLE: α^=HTΣ1dHTΣ1HIdentify t=HMatched Filter: α^=tTΣ1dtTΣ1tNormalizeNormalized MF: δ=tunitTd\boxed{ \begin{aligned} \text{Nonlinear Model} &\xrightarrow{\text{Taylor expand at } \alpha=0} \text{Linearized Model: } \mathbf{y} \approx \mathbf{L}_{\text{bg}} + \mathbf{H}\alpha + \boldsymbol{\epsilon} \\ &\xrightarrow{\text{Define } \mathbf{d} = \mathbf{y} - \mathbf{L}_{\text{bg}}} \text{Innovation Model: } \mathbf{d} \approx \mathbf{H}\alpha + \boldsymbol{\epsilon} \\ &\xrightarrow{\text{Add prior}} \text{3DVar: } J(\alpha) = \frac{1}{2B}(\alpha - \alpha_b)^2 + \frac{1}{2}(\mathbf{d}-\mathbf{H}\alpha)^T\mathbf{\Sigma}^{-1}(\mathbf{d}-\mathbf{H}\alpha) \\ &\xrightarrow{B \to \infty, \alpha_b = 0} \text{GLS/MLE: } \hat{\alpha} = \frac{\mathbf{H}^T\mathbf{\Sigma}^{-1}\mathbf{d}}{\mathbf{H}^T\mathbf{\Sigma}^{-1}\mathbf{H}} \\ &\xrightarrow{\text{Identify } \mathbf{t} = \mathbf{H}} \text{Matched Filter: } \hat{\alpha} = \frac{\mathbf{t}^T\mathbf{\Sigma}^{-1}\mathbf{d}}{\mathbf{t}^T\mathbf{\Sigma}^{-1}\mathbf{t}} \\ &\xrightarrow{\text{Normalize}} \text{Normalized MF: } \delta = \mathbf{t}_{\text{unit}}^T \mathbf{d} \end{aligned} }

The matched filter is the optimal linear detector when:

For operational methane detection, matched filter serves as: