Sensitivity Analysis - Gaussian Approximation
Context

We are given a Gaussian approximation of the inputs:

\begin{aligned}
\text{Uncertain Inputs}: && &&
\boldsymbol{x} &\sim \mathcal{N}(\boldsymbol{\mu_x},\boldsymbol{\Sigma_x}) \\
\text{Model}: && &&
\boldsymbol{x} &= \boldsymbol{\mu_x} + \boldsymbol{\varepsilon_x}, && &&
\boldsymbol{\varepsilon_x} \sim \mathcal{N}(\boldsymbol{0},\boldsymbol{\Sigma_x})
\end{aligned}

We are given the data likelihood
\boldsymbol{y}\sim p(\boldsymbol{y}\mid\boldsymbol{x},\boldsymbol{\theta})

We assume that this predictive density is well approximated by a Gaussian distribution

\begin{aligned}
\boldsymbol{y} &\sim
p(\boldsymbol{y}\mid\boldsymbol{x},\boldsymbol{\theta}) \approx
\mathcal{N}\left(\boldsymbol{y}\mid \boldsymbol{\mu_y}, \boldsymbol{\Sigma_y}\right)
\end{aligned}

where \boldsymbol{h_\mu} and \boldsymbol{h_{\sigma^2}} are the predictive mean and variance functions respectively.
For example, the predictive mean function could be a basis-function model, some other non-linear function, or a neural network.
The predictive variance function could be constant or a simple linear function of the input.
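As a minimal sketch (the specific functions and parameters are illustrative assumptions, not from the text), such a pair of predictive mean and variance functions might look like:

```python
import numpy as np

def h_mu(x, theta):
    """Predictive mean: a simple non-linear function of the input.

    theta = (a, b) are hypothetical parameters."""
    a, b = theta
    return a * np.sin(x) + b * x

def h_sigma2(x, theta):
    """Predictive variance: constant (homoscedastic), independent of x."""
    return 0.1 * np.ones_like(x)

theta = (1.0, 0.5)
x = np.linspace(-2.0, 2.0, 5)
mu_y = h_mu(x, theta)        # predictive mean at each input
var_y = h_sigma2(x, theta)   # constant predictive variance
```

Any pair of functions with the signatures above (input and parameters in, mean or covariance out) fits the definitions that follow.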
\begin{aligned}
\text{Predictive Mean}: && &&
\boldsymbol{\mu_y} &=
\boldsymbol{h_\mu}(\boldsymbol{x};\boldsymbol{\theta}), && &&
\boldsymbol{h_\mu}: \mathbb{R}^{D_x}\times\Theta\rightarrow\mathbb{R}^{D_y}\\
\text{Predictive Variance}: && &&
\boldsymbol{\Sigma_y} &=
\boldsymbol{h_{\sigma^2}}(\boldsymbol{x};\boldsymbol{\theta}), && &&
\boldsymbol{h_{\sigma^2}}: \mathbb{R}^{D_x}\times\Theta\rightarrow\mathbb{R}^{D_y\times D_y}
\end{aligned}

We can estimate these quantities as follows using the law of total expectation and the law of total variance.
\begin{aligned}
\boldsymbol{\mu}_{\boldsymbol{y}}(\boldsymbol{x};\boldsymbol{\theta}) &=
\mathbb{E}_{\boldsymbol{x}} \left[ \boldsymbol{h_\mu}(\boldsymbol{x},\boldsymbol{\theta})\right] \\
\boldsymbol{\Sigma}_{\boldsymbol{y}}(\boldsymbol{x};\boldsymbol{\theta}) &=
\mathbb{E}_{\boldsymbol{x}} \left[ \boldsymbol{h_{\sigma^2}}(\boldsymbol{x},\boldsymbol{\theta})\right] +
\mathbb{E}_{\boldsymbol{x}} \left[ \boldsymbol{h_\mu}^2(\boldsymbol{x},\boldsymbol{\theta})\right] -
\mathbb{E}_{\boldsymbol{x}}^2 \left[ \boldsymbol{h_\mu}(\boldsymbol{x},\boldsymbol{\theta})\right]
\end{aligned}

In integral form, we can write this as:
\begin{aligned}
\boldsymbol{\mu}_{\boldsymbol{y}}(\boldsymbol{x};\boldsymbol{\theta}) &=
\int \boldsymbol{h_\mu}(\boldsymbol{x},\boldsymbol{\theta}) p(\boldsymbol{x})d\boldsymbol{x} \\
\boldsymbol{\Sigma}_{\boldsymbol{y}}(\boldsymbol{x};\boldsymbol{\theta}) &=
\int \boldsymbol{h_{\sigma^2}}(\boldsymbol{x},\boldsymbol{\theta}) p(\boldsymbol{x})d\boldsymbol{x} +
\int \boldsymbol{h_\mu}^2(\boldsymbol{x},\boldsymbol{\theta})p(\boldsymbol{x})d\boldsymbol{x} -
\left[\int \boldsymbol{h_\mu}(\boldsymbol{x},\boldsymbol{\theta}) p(\boldsymbol{x})d\boldsymbol{x}\right]^2
\end{aligned}

Taylor Approximation

Taking a second-order Taylor expansion of the mean function around \boldsymbol{\mu_x} and integrating against the Gaussian input density gives

\begin{aligned}
\mathbb{E}_{\boldsymbol{x}} \left[ \boldsymbol{h_\mu}(\boldsymbol{x},\boldsymbol{\theta})\right]
&\approx
\boldsymbol{h_\mu}(\boldsymbol{\mu_x},\boldsymbol{\theta}) +
\frac{1}{2}\text{Tr}\left[
\partial^2\boldsymbol{h_\mu}(\boldsymbol{\mu_x},\boldsymbol{\theta})
\boldsymbol{\Sigma_x}
\right]
\end{aligned}

Moment-Matching
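Moment matching approximates the predictive density by a Gaussian with the first two moments of the true output distribution. As a minimal 1-D sketch (the sin mean function and all numbers are illustrative assumptions, not from the text), we can estimate the matched mean by Monte Carlo and compare it with the second-order Taylor approximation above:

```python
import numpy as np

def h_mu(x):
    return np.sin(x)      # hypothetical 1-D predictive mean function

def d2_h_mu(x):
    return -np.sin(x)     # its second derivative (the 1-D "Hessian")

mu_x, var_x = 0.5, 0.2**2  # input distribution: x ~ N(mu_x, var_x)

# Monte Carlo estimate of E[h_mu(x)]
rng = np.random.default_rng(0)
xs = rng.normal(mu_x, np.sqrt(var_x), size=200_000)
mc_mean = h_mu(xs).mean()

# Second-order Taylor approximation: h_mu(mu_x) + 0.5 * Tr[d2_h_mu(mu_x) * Sigma_x]
taylor_mean = h_mu(mu_x) + 0.5 * d2_h_mu(mu_x) * var_x

# For sin and a Gaussian input the exact mean is available in closed form:
# E[sin(x)] = sin(mu_x) * exp(-var_x / 2).
exact_mean = np.sin(mu_x) * np.exp(-var_x / 2)
```

As the input variance shrinks all three numbers coincide; the trace term captures the leading-order effect of input uncertainty on the predictive mean.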