
Inverse Problems


Problem Formulation

$$
\mathbf{y} = \boldsymbol{h}(\mathbf{x};\boldsymbol{\theta}) + \boldsymbol{\eta}
$$

where:

  • $\mathbf{y} \in \mathbb{R}^{D_\mathbf{y}}$ is a noisy measurement
  • $\mathbf{x} \in \mathbb{R}^{D_\mathbf{x}}$ is the original signal
  • $\boldsymbol{h}(\,\cdot\,;\boldsymbol{\theta}): \mathbb{R}^{D_\mathbf{x}} \rightarrow \mathbb{R}^{D_\mathbf{y}}$ is a measurement function, parameterized by $\boldsymbol{\theta}$
  • $\boldsymbol{\eta}$ is the measurement noise

Data

$\mathbf{x} \in \mathbb{R}^{D_{\mathbf{x}}}$, for example:
  • 1D Signal: $\mathbf{x} \in \mathbb{R}^{D}$, $\mathbf{x} \in \mathbb{R}^{T}$
  • 2D Image: $\mathbf{x} \in \mathbb{R}^{H \times W}$, $\mathbf{x} \in \mathbb{R}^{u \times v}$
  • 3D Volume: $\mathbf{x} \in \mathbb{R}^{H \times W \times C}$, $\mathbf{x} \in \mathbb{R}^{u \times v \times \text{depth}}$, $\mathbf{x} \in \mathbb{R}^{u \times v \times \text{time}}$
  • 3D Volume Sequence: $\mathbf{x} \in \mathbb{R}^{H \times W \times C \times \text{time}}$, $\mathbf{x} \in \mathbb{R}^{u \times v \times \text{depth} \times \text{time}}$

Forward Direction

The first direction is the forward direction.

$$
\mathbf{y} = \boldsymbol{h}(\mathbf{x};\boldsymbol{\theta})
$$

We can try to find a solution to the forward problem, which is to use the state, $\mathbf{x}$, to help predict the observations, $\mathbf{y}$. This is typically the easier of the two directions. Many times we simply need to approximate $\mathbf{y}$ with some function $\boldsymbol{f}_{\boldsymbol{\theta}}(\mathbf{x})$. We can use point estimates, i.e. $\hat{\mathbf{y}} \approx \mathbf{y}^* = \operatorname*{argmax}_{\mathbf{y}} p(\mathbf{y}|\mathbf{x})$, or we can try to obtain the posterior distribution, i.e. $\hat{p}_{\boldsymbol{\theta}}(\mathbf{y}|\mathbf{x}) \approx p(\mathbf{y}|\mathbf{x})$. We often call this discriminative or transductive machine learning. However, discriminative models are hard to interpret, explain and validate.

Note: This is quite different from the traditional sciences because we don’t “model the real world”; we simply fit a map from the state to the observations.
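
To make the forward, discriminative view concrete, here is a minimal sketch (not from the original notes): it assumes a toy linear forward process and fits a linear surrogate $\boldsymbol{f}_{\boldsymbol{\theta}}$ by least squares; all names and data are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "true" forward process h(x) plus noise (illustrative stand-in only)
D_x, D_y, N = 5, 3, 200
H_true = rng.normal(size=(D_y, D_x))
X = rng.normal(size=(N, D_x))                        # states x_i
Y = X @ H_true.T + 0.1 * rng.normal(size=(N, D_y))   # noisy observations y_i

# Forward surrogate: linear f_theta(x), fitted by least squares (point estimate of y given x)
Theta, *_ = np.linalg.lstsq(X, Y, rcond=None)        # Theta has shape (D_x, D_y)

x_new = rng.normal(size=D_x)
y_pred = x_new @ Theta                               # predicted observation for a new state
```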


Inverse Direction

The other direction is the inverse direction.

$$
\mathbf{x} = \boldsymbol{h}^{-1}(\mathbf{y};\boldsymbol{\theta})
$$

We can also try to find a solution to the inverse problem, whereby we have some observations, $\mathbf{y}$, and we want them to help us predict some state, $\mathbf{x}$. Here we are more interested in the posterior distribution, $p(\mathbf{x}|\mathbf{y})$, which inverts the data-generating process. It is the more difficult problem because we need to make assumptions about our system in order to formulate it. Fortunately, once the problem has been formulated, we can use Bayes' theorem and the standard tricks that come with it to solve the problem. This is often called generative modeling.

Example Problem:

Let's take some hidden system parameters, $\mathbf{x}$, and some observations of the system behaviour, $\mathbf{y}$. Our objective is to determine the posterior $p(\mathbf{x}|\mathbf{y}=\hat{\mathbf{y}})$ in order to estimate the parameters, $\mathbf{x}$, from some measured $\hat{\mathbf{y}}$. We could learn $p(\mathbf{x}|\mathbf{y})$ using synthetic data from a simulation $\mathbf{y} = \boldsymbol{g}(\mathbf{x};\boldsymbol{\epsilon})$ of the forward process.
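
A minimal sketch of this simulate-then-learn idea, assuming a toy linear simulator $\boldsymbol{g}$ and a linear, amortized point estimator of $\mathbf{x}$ from $\mathbf{y}$ (a stand-in for learning the full posterior $p(\mathbf{x}|\mathbf{y})$; everything here is illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical linear forward simulator y = g(x; eps) (illustrative only)
D_x, D_y, N = 4, 6, 500
G = rng.normal(size=(D_y, D_x))

def simulate(x):
    return G @ x + 0.05 * rng.normal(size=D_y)

# 1) Sample hidden parameters from a prior and simulate observations
X_sim = rng.normal(size=(N, D_x))
Y_sim = np.stack([simulate(x) for x in X_sim])

# 2) Fit a linear inverse regressor on the synthetic pairs (x approx. y @ W)
W, *_ = np.linalg.lstsq(Y_sim, X_sim, rcond=None)    # W has shape (D_y, D_x)

# 3) Apply it to a new measured observation y_hat
x_true = rng.normal(size=D_x)
y_hat = simulate(x_true)
x_est = y_hat @ W                                    # point estimate of the hidden parameters
```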


Noise


Ill-Posed

Our problem is ill-posed. This means that the space of possible solutions is very large, and many of them are nonsensical, so we need to constrain our model so that the search space becomes manageable. If we think about our unknowns, first we have an unknown input, $\mathbf{x}$, which we constrain with a regularization term $\mathcal{R}(\mathbf{x})$:

$$
\mathcal{L}(\boldsymbol{\theta},\mathbf{x}) = ||\mathbf{y} - \boldsymbol{f}(\mathbf{x};\boldsymbol{\theta})||_2^2 + \lambda \mathcal{R}(\mathbf{x})
$$

In the case of $\boldsymbol{f}$, we also have unknown parameters, $\boldsymbol{\theta}$:

$$
\mathcal{L}(\boldsymbol{\theta},\mathbf{x}) = ||\mathbf{y} - \boldsymbol{f}(\mathbf{x};\boldsymbol{\theta})||_2^2 + \lambda_1 \mathcal{R}(\mathbf{x}) + \lambda_2 \mathcal{R}(\boldsymbol{\theta})
$$
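
A minimal sketch of jointly minimizing this regularized objective, assuming a linear model $\boldsymbol{f}(\mathbf{x};\boldsymbol{\theta}) = \boldsymbol{\Theta}\mathbf{x}$, quadratic regularizers $\mathcal{R}(\cdot) = ||\cdot||_2^2$, and plain gradient descent with hand-derived gradients (all values illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)

# Linear measurement model f(x; Theta) = Theta @ x with quadratic regularizers (illustrative)
D_x, D_y = 5, 3
Theta = 0.1 * rng.normal(size=(D_y, D_x))   # unknown parameters
x = np.zeros(D_x)                           # unknown input
y = rng.normal(size=D_y)                    # noisy measurement

lam1, lam2, lr = 0.1, 0.01, 0.01

for _ in range(1000):
    r = Theta @ x - y                                    # residual f(x; Theta) - y
    grad_x = 2 * Theta.T @ r + 2 * lam1 * x              # d/dx of ||r||^2 + lam1 ||x||^2
    grad_Theta = 2 * np.outer(r, x) + 2 * lam2 * Theta   # d/dTheta of ||r||^2 + lam2 ||Theta||_F^2
    x -= lr * grad_x
    Theta -= lr * grad_Theta
```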

Solutions

Empirical Minimization

This is the case where we want to learn a solver for deterministic inverse problems. Given a dataset of pairs, $\mathcal{D} = \{\mathbf{y}_i, \mathbf{x}_i\}_{i=1}^N$, we choose the functional form of our model, $\boldsymbol{f}$, e.g. a neural network, parameterized by $\boldsymbol{\theta}$.

$$
\mathcal{L}_{\text{MSE}}(\boldsymbol{\theta}) = \frac{1}{2}||\mathbf{y} - \boldsymbol{f}(\mathbf{x};\boldsymbol{\theta})||_2^2
$$

In order to restrict the space of solutions, we penalize the parameters, $\boldsymbol{\theta}$, of our function, $\boldsymbol{f}$.

$$
\mathcal{L}_{\text{MSE}}(\boldsymbol{\theta}) = \frac{1}{N}\sum_{i=1}^N ||\mathbf{y}_i - \boldsymbol{f}(\mathbf{x}_i;\boldsymbol{\theta})||_2^2 + \lambda \mathcal{R}(\boldsymbol{\theta})
$$

We can run an offline training regime to find the best parameters, $\boldsymbol{\theta}^*$, given our training data, $\mathcal{D}$. Then, at inference time, we simply apply the fitted model to new inputs.
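
As a minimal sketch under assumptions not stated above (a linear $\boldsymbol{f}$ and $\mathcal{R}(\boldsymbol{\theta}) = ||\boldsymbol{\theta}||_2^2$), the regularized MSE objective has the familiar ridge-regression closed form:

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic training pairs (x_i, y_i); a linear f is assumed for illustration
D_x, D_y, N = 6, 2, 300
W_true = rng.normal(size=(D_x, D_y))
X = rng.normal(size=(N, D_x))
Y = X @ W_true + 0.1 * rng.normal(size=(N, D_y))

lam = 1e-2

# Minimize (1/N) sum_i ||y_i - Theta^T x_i||^2 + lam ||Theta||^2
# Closed form: Theta = (X^T X / N + lam I)^{-1} (X^T Y / N)
Theta = np.linalg.solve(X.T @ X / N + lam * np.eye(D_x), X.T @ Y / N)

# Offline training done; apply to a new input at inference time
x_new = rng.normal(size=D_x)
y_pred = x_new @ Theta
```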


Bayesian Inversion

We take the posterior density

$$
p(\boldsymbol{\theta}|\mathbf{x}, \mathbf{y}) \propto p(\mathbf{y}|\mathbf{x},\boldsymbol{\theta})\,p(\boldsymbol{\theta})
$$

Maximum Likelihood Estimation (MLE)

$$
\mathbf{x}^* = \argmax_{\mathbf{x}} p(\mathbf{y}|\mathbf{x})
$$

MAP

$$
\mathbf{x}^* = \argmax_{\mathbf{x}} p(\mathbf{x}|\mathbf{y}) = \argmax_{\mathbf{x}} p(\mathbf{y}|\mathbf{x})\,p(\mathbf{x};\boldsymbol{\theta})
$$

We assume a Gaussian measurement model

$$
p(\mathbf{y}|\mathbf{x}) = \mathcal{N}(\mathbf{y};\boldsymbol{h}(\mathbf{x};\boldsymbol{\theta}), \sigma^2\mathbf{I})
$$
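
To see where the data-fidelity term below comes from, take the negative logarithm of this Gaussian likelihood (a standard derivation, filled in here for completeness):

$$
\begin{aligned}
-\log p(\mathbf{y}|\mathbf{x}) &= \frac{1}{2\sigma^2}||\mathbf{y} - \boldsymbol{h}(\mathbf{x};\boldsymbol{\theta})||_2^2 + \frac{D_\mathbf{y}}{2}\log(2\pi\sigma^2) \\
-\log p(\mathbf{x}|\mathbf{y}) &= \frac{1}{2\sigma^2}||\mathbf{y} - \boldsymbol{h}(\mathbf{x};\boldsymbol{\theta})||_2^2 - \log p(\mathbf{x};\boldsymbol{\theta}) + \text{const}
\end{aligned}
$$

Minimizing the negative log-posterior therefore gives the explicit loss below.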

Loss Function (Explicit)

$$
\mathcal{L}(\boldsymbol{\theta},\mathbf{x}) = \frac{1}{2\sigma^2}||\mathbf{y} - \boldsymbol{h}(\mathbf{x};\boldsymbol{\theta})||_2^2 - \log p(\mathbf{x};\boldsymbol{\theta})
$$

Energy-Based Representation

That is, the posterior can be viewed as an energy-based (Gibbs) distribution whose energy is the loss above:

$$
p(\mathbf{x}|\mathbf{y}) \propto \exp\left(-U(\mathbf{x};\boldsymbol{\theta})\right), \qquad U(\mathbf{x};\boldsymbol{\theta}) = \mathcal{L}(\boldsymbol{\theta},\mathbf{x})
$$

Generic Formulation

$$
\hat{\mathbf{x}} = \argmin_{\mathbf{x}} \lambda_1||\mathbf{y} - \boldsymbol{h}(\mathbf{x})||_2^2 + \lambda_2 \mathcal{R}(\mathbf{x})
$$

Example: Denoising

$$
\mathbf{y} = \mathbf{x} + \boldsymbol{\epsilon}, \qquad \boldsymbol{\epsilon} \sim \mathcal{N}(\mathbf{0}, \sigma^2\mathbf{I})
$$

Probabilistic Priors

In this case, we assume some probabilistic prior

$$
\mathbf{x} \sim P_X
$$

Now, we can formulate this as an optimization problem:

$$
\hat{\mathbf{x}} = \argmin_{\mathbf{x}} \lambda ||\mathbf{y} - \mathbf{x}||_2^2 - \log p_X(\mathbf{x})
$$
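
As a worked special case (added for illustration, not in the original), take $P_X = \mathcal{N}(\mathbf{0}, \mathbf{I})$ so that $-\log p_X(\mathbf{x}) = \frac{1}{2}||\mathbf{x}||_2^2 + \text{const}$; the objective becomes quadratic and the estimate is a simple shrinkage of the measurement:

$$
\hat{\mathbf{x}} = \argmin_{\mathbf{x}} \lambda ||\mathbf{y} - \mathbf{x}||_2^2 + \frac{1}{2}||\mathbf{x}||_2^2 = \frac{2\lambda}{2\lambda+1}\,\mathbf{y}
$$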

Examples: Variational AutoEncoders, AutoRegressive Models, Normalizing Flows

  • Deep Unfolding with Normalizing Flow Priors for Inverse Problems - Wei et al (2021)

Normalizing Flow Prior

Normalizing Direction

$$
\begin{aligned}
\mathbf{z} &= \boldsymbol{f}(\mathbf{x};\boldsymbol{\theta}) \\
&= \boldsymbol{f}_L \circ \boldsymbol{f}_{L-1} \circ \cdots \circ \boldsymbol{f}_1 (\mathbf{x})
\end{aligned}
$$

Generative Direction

$$
\begin{aligned}
\mathbf{x} &= \boldsymbol{g}(\mathbf{z};\boldsymbol{\theta}) \\
&= \boldsymbol{f}^{-1}(\mathbf{z};\boldsymbol{\theta}) \\
&= \boldsymbol{f}_1^{-1} \circ \boldsymbol{f}_{2}^{-1} \circ \cdots \circ \boldsymbol{f}_L^{-1} (\mathbf{z})
\end{aligned}
$$

Density Evaluation

$$
\log p(\mathbf{x};\boldsymbol{\theta}) = \log p(\mathbf{z}) + \log \left| \det \boldsymbol{\nabla}_\mathbf{x} \boldsymbol{f}(\mathbf{x};\boldsymbol{\theta}) \right|
$$
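
A minimal sketch of this density evaluation, using a single elementwise affine layer as a stand-in for the flow $\boldsymbol{f}$ (a real normalizing flow stacks many such invertible layers; everything here is illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
D = 3

# One invertible elementwise affine layer: z = f(x) = (x - b) / s  (illustrative "flow")
s = np.exp(rng.normal(size=D))   # positive scales -> invertible
b = rng.normal(size=D)

def log_standard_normal(z):
    return -0.5 * np.sum(z**2) - 0.5 * z.size * np.log(2 * np.pi)

def log_prob(x):
    z = (x - b) / s
    # log |det df/dx| = sum over dimensions of log(1 / s_d)
    log_det = -np.sum(np.log(s))
    return log_standard_normal(z) + log_det

x = rng.normal(size=D)
print(log_prob(x))
```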

Cost Function (Explicit)

$$
\mathcal{L}(\mathbf{z},\boldsymbol{\theta}) = \frac{1}{2\sigma^2}||\mathbf{y} - \mathbf{A}\boldsymbol{g}(\mathbf{z};\boldsymbol{\theta})||_2^2 - \log p(\mathbf{z})
$$

For a standard Gaussian base density $p(\mathbf{z})$, this is equivalent, up to constants, to

$$
\mathcal{L}(\mathbf{z},\boldsymbol{\theta}) = ||\mathbf{y} - \mathbf{A}\boldsymbol{g}(\mathbf{z};\boldsymbol{\theta})||_2^2 + \lambda ||\mathbf{z}||_2^2
$$

where $\lambda$ is a regularization parameter that trades off the prior against the measurement fit.

Cost Function (Implicit)

$$
\begin{aligned}
\hat{\mathbf{z}} &= \argmin_{\mathbf{z}} ||\mathbf{y} - \mathbf{A}\boldsymbol{g}(\mathbf{z};\boldsymbol{\theta})||_2^2 - \log p(\mathbf{z}) \\
&= \argmin_{\mathbf{z}} ||\mathbf{y} - \mathbf{A}\boldsymbol{g}(\mathbf{z};\boldsymbol{\theta})||_2^2 + \lambda ||\mathbf{z}||_2^2
\end{aligned}
$$
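
A minimal sketch of this latent-space optimization, assuming a known linear forward operator $\mathbf{A}$ and, purely for illustration, an affine decoder $\boldsymbol{g}(\mathbf{z}) = \mathbf{W}\mathbf{z} + \mathbf{b}$ standing in for a pretrained flow; $\hat{\mathbf{z}}$ is found by gradient descent with a step size set from the Lipschitz constant:

```python
import numpy as np

rng = np.random.default_rng(5)

D_x, D_z, D_y = 8, 3, 5

# Stand-ins for a pretrained generator g(z) = W z + b and a linear forward operator A
W = rng.normal(size=(D_x, D_z))
b = rng.normal(size=D_x)
A = rng.normal(size=(D_y, D_x))
y = rng.normal(size=D_y)                        # the measurement

lam = 1.0
M = A @ W                                       # composed linear map acting on z
lr = 1.0 / (2 * np.linalg.norm(M, 2) ** 2 + 2 * lam)   # safe step size

z = np.zeros(D_z)
for _ in range(2000):
    r = M @ z + A @ b - y                       # residual A g(z) - y
    grad = 2 * M.T @ r + 2 * lam * z            # gradient of ||r||^2 + lam ||z||^2
    z -= lr * grad

x_hat = W @ z + b                               # reconstruction pushed through the generator
```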

Dictionary Prior

Examples: PCA (aka EOF, POD, SVD)

$$
\mathbf{x} = \mathbf{D}\boldsymbol{\alpha}
$$

$$
\hat{\boldsymbol{\alpha}} = \argmin_{\boldsymbol{\alpha}} ||\mathbf{y} - \mathbf{D}\boldsymbol{\alpha}||_2^2, \qquad \hat{\mathbf{x}} = \mathbf{D}\hat{\boldsymbol{\alpha}}
$$
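
A minimal sketch, assuming the dictionary $\mathbf{D}$ is built from the leading principal components of some historical signals and the coefficients are found by ordinary least squares (all data synthetic and illustrative):

```python
import numpy as np

rng = np.random.default_rng(6)

# Build a PCA dictionary D from synthetic "historical" signals (illustrative)
D_x, N, k = 20, 200, 5
latent = rng.normal(size=(N, k)) @ rng.normal(size=(k, D_x))
signals = latent + 0.01 * rng.normal(size=(N, D_x))
_, _, Vt = np.linalg.svd(signals - signals.mean(axis=0), full_matrices=False)
D = Vt[:k].T                                   # dictionary: first k principal directions (D_x, k)

# Noisy measurement of a new signal and its least-squares coefficients
x_true = D @ rng.normal(size=k)
y = x_true + 0.05 * rng.normal(size=D_x)
alpha, *_ = np.linalg.lstsq(D, y, rcond=None)  # argmin_alpha ||y - D alpha||^2
x_hat = D @ alpha                              # reconstruction in the dictionary subspace
```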

Norm-Based Prior

$$
\hat{\mathbf{x}} = \argmin_{\mathbf{x}} \lambda_1 ||\mathbf{y} - \mathbf{x}||_2^2 + \lambda_2 ||\nabla \mathbf{x}||_2^2
$$
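
A minimal sketch of this smoothness-regularized denoiser for a 1D signal, using a finite-difference matrix as a stand-in for $\nabla$ and the closed-form solution of the quadratic problem (parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)

# Noisy 1D signal: y = x + eps
n = 100
t = np.linspace(0, 1, n)
x_true = np.sin(2 * np.pi * t)
y = x_true + 0.2 * rng.normal(size=n)

# Finite-difference operator standing in for the gradient, shape (n-1, n)
G = np.diff(np.eye(n), axis=0)

lam1, lam2 = 1.0, 5.0

# Minimize lam1 ||y - x||^2 + lam2 ||G x||^2
# Setting the gradient to zero: (lam1 I + lam2 G^T G) x = lam1 y
x_hat = np.linalg.solve(lam1 * np.eye(n) + lam2 * G.T @ G, lam1 * y)
```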