## Derivation

We start with the Beer-Lambert law, which relates the intensity of light to the properties of the material through which the light is traveling.
The equation is commonly written as:
$$
L(\lambda) = L_0(\lambda) \exp \left(-\alpha \cdot \boldsymbol{\nu}(\lambda)\right)
$$

where:

- $L(\lambda)$ is the at-sensor radiance with CH$_4$ absorption
- $L_0(\lambda)$ is the at-sensor radiance without any CH$_4$ absorption (the background radiance)
- $\alpha$ is the CH$_4$ concentration-path length [ppm·m]
- $\boldsymbol{\nu}(\lambda)$ is the path-integrated unit absorption coefficient, which depends on the wavelength and the atmospheric conditions [(ppm·m)$^{-1}$]
If we assume that the absorption is small, i.e., $\alpha \cdot \boldsymbol{\nu}(\lambda) \ll 1$, then we can use a first-order Taylor expansion of the exponential term to simplify this function.
We use the fact that for small $x$, $\exp(-x) \approx 1 - x$.
Applying this to the Beer-Lambert law, we get:

$$
L(\lambda) \approx L_0(\lambda) - \alpha \cdot \boldsymbol{\nu}(\lambda) \cdot L_0(\lambda)
$$

We can redefine some of the terms to make this equation more interpretable.
Firstly, let's rearrange this equation to isolate the radiance change due to the CH$_4$ concentration enhancement.
$$
\Delta L(\lambda) = L(\lambda) - L_0(\lambda) \approx - \alpha \cdot \boldsymbol{\nu}(\lambda) \cdot L_0(\lambda)
$$

Next, we notice that $L_0(\lambda)$ is unknown.
However, we can approximate it with the background mean:

$$
L_0(\lambda) \approx \mu(\lambda) \hspace{5mm} [\text{W m}^{-2}\,\text{sr}^{-1}\,\text{nm}^{-1}]
$$

Therefore, we can write the radiance change due to the CH$_4$ enhancement as
$$
\Delta L(\lambda) \approx - \alpha \cdot \boldsymbol{\nu}(\lambda) \cdot \mu(\lambda)
$$

and likewise the optical depth term as

$$
\tau(\lambda) = \alpha \cdot \boldsymbol{\nu}(\lambda)
$$

Finally, let's define the target spectrum as
$$
t(\lambda) = - \boldsymbol{\nu}(\lambda) \cdot \mu(\lambda) \hspace{5mm} [\text{W m}^{-2}\,\text{sr}^{-1}\,\text{nm}^{-1}][\text{ppm}\cdot\text{m}]^{-1}
$$

which finally gives us the linear relationship

$$
\Delta L(\lambda) \approx \alpha \cdot t(\lambda)
$$

So, going back to the original measurement model, we can plug in our new definitions to get
$$
L(\lambda) \approx \mu(\lambda) + \alpha \cdot t(\lambda)
$$

In practice, we never observe the true radiance, $L(\lambda)$, directly, even with this simplified linear model.
So, extending the measurement model to include noise, we get:

$$
\begin{aligned}
L(\lambda) &= \mu(\lambda) + \alpha \cdot t(\lambda) \\
y(\lambda) &= L(\lambda) + \epsilon(\lambda)
\end{aligned}
$$

where $\epsilon(\lambda)$ captures the instrument noise, surface reflectance variability, and atmospheric effects.
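As a sanity check, this measurement model can be simulated with synthetic values. A minimal sketch in NumPy (all numbers below are illustrative placeholders, not real instrument data):

```python
import numpy as np

# Sketch: simulate y(λ) = μ(λ) + α·t(λ) + ε(λ) with synthetic placeholder values.
rng = np.random.default_rng(0)

n_bands = 50                                      # number of spectral channels
mu = 1.0 + 0.1 * rng.standard_normal(n_bands)     # background mean radiance μ(λ)
nu = np.abs(rng.standard_normal(n_bands)) * 1e-4  # unit absorption ν(λ), [ppm·m]^-1
t = -nu * mu                                      # target spectrum t(λ) = -ν(λ)·μ(λ)
alpha = 500.0                                     # enhancement α [ppm·m]
noise_std = 0.01                                  # std of ε(λ)

# one noisy observed spectrum
y = mu + alpha * t + noise_std * rng.standard_normal(n_bands)
```

With these scales the absorption term $\alpha \cdot \boldsymbol{\nu}$ stays well below one, consistent with the small-absorption assumption used in the Taylor expansion.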
This is a statistical linear model with a bias term (background mean) and additive noise.
In great Bayesian fashion, we can write the joint distribution of our variables and parameters
$$
p(y, L, \lambda, \theta) = p(y \mid L, \lambda, \theta)\, p(L \mid \lambda, \theta)\, p(\theta)
$$

where:

- $y$ is the observed spectrum
- $L$ is the true spectrum
- $\lambda$ is the wavelength
- $\boldsymbol{\theta}$ are the parameters of the linear model, i.e., the mean, the covariance, and the coefficient
This problem becomes much easier to solve once we simplify it to a linear Gaussian model.
We can write the log-posterior as

$$
\log p(\theta \mid \mathcal{D}, \lambda) =
\sum_{n=1}^{N} \log \mathcal{N}\left(y_n(\lambda) \mid \mu(\lambda) + \alpha \cdot t(\lambda), \Sigma\right)
+ \log p(\theta)
$$

The log-likelihood term can be written explicitly as
$$
\log p(\alpha \mid \mathcal{D}, \boldsymbol{\theta}) =
-\frac{1}{2}\sum_{n=1}^{N} \left\|y_n(\lambda) - \mu(\lambda) - \alpha \cdot t(\lambda)\right\|^2_{\Sigma^{-1}} + \text{const}
$$

We can take the derivative of this log-likelihood with respect to $\alpha$ (per observation, i.e., $N = 1$) and set it to zero to get a closed-form solution for the maximum likelihood estimate of $\alpha$.
$$
\partial_\alpha \log p(\alpha \mid \mathcal{D}, \boldsymbol{\theta}) =
t(\lambda)^\top \Sigma^{-1}
\left[y(\lambda) - \mu(\lambda) - \alpha \cdot t(\lambda)\right] = 0
$$

Rearranging this equation gives us the maximum likelihood estimate of $\alpha$.
$$
\hat{\alpha} =
\frac{t(\lambda)^\top \Sigma^{-1} \left[y(\lambda) - \mu(\lambda)\right]}
{t(\lambda)^\top \Sigma^{-1} t(\lambda)}
$$

So, solving for $\hat{\alpha}$ gives us the concentration-path-length enhancement, $\Delta \text{CH}_4 = \hat{\alpha}$ [ppm·m], with the corresponding radiance perturbation $\Delta L(\lambda) = \hat{\alpha} \cdot t(\lambda)$.
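This closed-form estimator translates directly into code. Below is a minimal NumPy sketch; the function name `matched_filter` and array conventions are my own, and `Sigma` is assumed to be a well-conditioned (e.g., regularized) covariance:

```python
import numpy as np

def matched_filter(y, mu, t, Sigma):
    """Per-pixel MLE: alpha-hat = t^T Σ^{-1} (y - mu) / (t^T Σ^{-1} t)."""
    # solve Σ x = t instead of forming Σ^{-1} explicitly (cheaper, more stable)
    Sinv_t = np.linalg.solve(Sigma, t)
    return float((y - mu) @ Sinv_t / (t @ Sinv_t))

# noise-free sanity check: y = mu + 3·t should recover alpha = 3 exactly
d = 8
rng = np.random.default_rng(1)
mu = 1.0 + rng.random(d)
t = -rng.random(d)
A = rng.standard_normal((d, d))
Sigma = A @ A.T + d * np.eye(d)   # symmetric positive definite
y = mu + 3.0 * t
print(matched_filter(y, mu, t, Sigma))  # → 3.0 (up to floating point)
```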
## Approximations

### Background Mean and Covariance

Let's define the background mean and covariance spectrum.
We need to assume that these can be calculated from the observations, $\mathcal{D} = \left\{y_n\right\}_{n=1}^{N}$.
$$
\begin{aligned}
\text{Background Mean}: && \mu_y(\lambda) &= \mathbb{E}[Y] = \frac{1}{N} \sum_{n=1}^{N} y_n(\lambda) \\
\text{Background Cov}: && \Sigma_y(\lambda,\lambda') &=
\text{Cov}[Y,Y'] =
\frac{1}{N-1} \sum_{n=1}^{N}
\left[y_n(\lambda) - \mu(\lambda)\right]
\left[y_n(\lambda') - \mu(\lambda')\right]^\top
\end{aligned}
$$

This means that the only unknown parameter is the concentration enhancement, $\alpha$.
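These two estimators are straightforward to compute from a data matrix. A minimal sketch, assuming `Y` holds $N$ spectra of $d$ bands each (the helper name is my own):

```python
import numpy as np

def background_stats(Y):
    """Empirical background mean (d,) and covariance (d, d) from Y of shape (N, d)."""
    mu = Y.mean(axis=0)
    R = Y - mu                          # centered residuals
    Sigma = R.T @ R / (Y.shape[0] - 1)  # unbiased sample covariance
    return mu, Sigma
```

For a hyperspectral scene, $N$ is the number of background pixels and $d$ the number of channels; note that $\Sigma$ is singular whenever $N \le d$, which is one motivation for regularizing the covariance.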
### RTM

$$
\begin{aligned}
\text{RTM Model}: && \boldsymbol{\nu}:=\boldsymbol{\nu}(\lambda, \text{AMF}) &= \boldsymbol{f}(\lambda,\text{AMF}, \boldsymbol{\theta}) + \epsilon
\end{aligned}
$$

$$
t = - A \cdot \text{AMF}(\theta) \cdot \mu(\lambda)
$$

### Covariance Decomposition

$$
\Sigma = U \Lambda V^\top
$$

So the inverse will be
$$
\Sigma^{-1} = V \Lambda^{-1} U^\top
$$

If we wish to use the regularized covariance matrix, we can use the decomposition

$$
\left(\Sigma + \lambda I\right)^{-1} = V \left(\Lambda + \lambda I\right)^{-1} U^\top
$$

### Covariance Regularization

We can use a shrinkage estimator to regularize the covariance matrix.
$$
\Sigma_{\text{reg}} = (1 - \gamma)\, \Sigma + \gamma\, \frac{\text{trace}(\Sigma)}{d} I
$$

## Logarithmic Matched Filter

Beginning with the Beer-Lambert law, we can take the natural logarithm of both sides to get
$$
\log L(\lambda) = \log L_0(\lambda) - \alpha \cdot \boldsymbol{\nu}(\lambda)
$$

We can extract the optical depth term as
$$
\begin{aligned}
\tau(\lambda) &= \alpha \cdot \boldsymbol{\nu}(\lambda) \\
&= \log L_0(\lambda) - \log L(\lambda) \\
&= -\log \frac{L(\lambda)}{L_0(\lambda)} \\
&= \log \frac{L_0(\lambda)}{L(\lambda)}
\end{aligned}
$$

The key inversion problem still remains the same, as $L_0(\lambda)$ is still unknown for each pixel.
### Strategy 1: Approximate Optical Depth

We could again approximate the background as $L_0(\lambda) \approx \mu(\lambda)$.
This would give us the optical depth as

$$
\tau(\lambda) \approx \log \left[\frac{\mu_\lambda}{L_\lambda}\right]
$$

### Strategy 2: Log-Radiance Space

When we have a small perturbation in the radiance, we can use a first-order Taylor expansion to approximate the logarithm of the radiance.
So, assume the radiance is the background plus a small perturbation, i.e.,

$$
L_\lambda \approx \mu_\lambda + \Delta L_\lambda
$$

where $\Delta L_\lambda \ll \mu_\lambda$.
Substituting this into the optical depth expression, we get
$$
\begin{aligned}
\tau(\lambda)
&\approx \log \left[\frac{\mu_\lambda}{\mu_\lambda + \Delta L_\lambda}\right] \\
&= \log \left[\frac{1}{1+ \Delta L_\lambda/\mu_\lambda}\right] \\
&= - \log \left[1+ \frac{\Delta L_\lambda}{\mu_\lambda}\right]
\end{aligned}
$$

Using the Taylor expansion $\log(1+x) \approx x$ for $|x| \ll 1$, we can approximate the optical depth as
$$
\tau_\lambda \approx - \frac{\Delta L_\lambda}{\mu_\lambda}
$$

So, substituting the optical depth term $\tau_\lambda = \alpha \cdot \boldsymbol{\nu}_\lambda$ into this equation, we get

$$
\alpha \cdot \boldsymbol{\nu}_\lambda \approx - \frac{\Delta L_\lambda}{\mu_\lambda}
$$

which yields

$$
\Delta L_\lambda \approx - \alpha \cdot \boldsymbol{\nu}_\lambda \cdot \mu_\lambda = \alpha \cdot t_\lambda
$$

This yields the exact same approach as the linear matched filter; equivalently, if we work directly in log-radiance space, the target spectrum is now simply $-\boldsymbol{\nu}(\lambda)$, without the background-mean weighting.
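The log-space variant can be sketched as follows, under the assumptions above (the function name is my own; `nu` is the unit absorption spectrum, and the background statistics are computed on the log-radiances):

```python
import numpy as np

def log_matched_filter(Y, nu):
    """Matched filter applied to log-radiances; the target spectrum is -nu."""
    Z = np.log(Y)                        # (N, d) log-radiances, requires Y > 0
    mu = Z.mean(axis=0)
    R = Z - mu                           # centered log-residuals
    Sigma = R.T @ R / (Z.shape[0] - 1)   # log-space background covariance
    t = -nu                              # log-space target spectrum
    Sinv_t = np.linalg.solve(Sigma, t)
    return R @ Sinv_t / (t @ Sinv_t)     # alpha-hat for every sample, shape (N,)
```

Because the residuals are mean-centered, the per-scene estimates sum to zero by construction; a real plume would then show up as outliers against that zero-mean background.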
## Code Design

1. Pre-train the RTM model to learn $\boldsymbol{f}(\lambda, \text{AMF}, \boldsymbol{\theta})$
2. Calculate the background mean and covariance from the data
3. Calculate the concentration enhancement, $\alpha$, using the closed-form solution (Matched Filter)
### Matched Filter

```python
# initialize obs data and target spectrum
data: Array["Samples Wavelength"] = ...     # load your spectral data here
target_spectrum: Array["Wavelength"] = ...  # load or define your target spectrum here

# define the RTM model
params: PyTree = ...                        # load or define your RTM model parameters here
rtm_model: Callable[[Array, Array], Array] = load_rtm_model(params)  # (data, target_spectrum) -> inferred_conc
```
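Putting steps 2 and 3 together, here is a minimal end-to-end sketch in NumPy. The function name and the shrinkage weight `gamma` are my own choices, and step 1 (the pre-trained RTM that produces the target spectrum) is assumed to be given:

```python
import numpy as np

def matched_filter_pipeline(data, target_spectrum, gamma=0.1):
    """Background stats + shrinkage regularization + closed-form alpha per pixel.

    data:            (N, d) array of observed spectra
    target_spectrum: (d,) array, e.g. from the pre-trained RTM
    returns:         (N,) array of alpha-hat estimates [ppm·m]
    """
    # step 2: background mean and covariance
    mu = data.mean(axis=0)
    R = data - mu
    Sigma = R.T @ R / (data.shape[0] - 1)

    # shrinkage regularization: (1-γ)Σ + γ(trace(Σ)/d)I
    d = Sigma.shape[0]
    Sigma = (1.0 - gamma) * Sigma + gamma * (np.trace(Sigma) / d) * np.eye(d)

    # step 3: closed-form matched filter for every pixel
    Sinv_t = np.linalg.solve(Sigma, target_spectrum)
    return R @ Sinv_t / (target_spectrum @ Sinv_t)
```

Solving the linear system once and reusing it for all pixels keeps the cost at one $d \times d$ factorization plus a matrix-vector product per scene.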