# Conditional Normalizing Flows
## Model

We want to learn the conditional density \(p(\mathbf{y}|\mathbf{x})\) of targets given inputs, where the targets are assumed to be generated by some parameterized function of the inputs:
\[
\mathbf{y} = \boldsymbol{f}(\mathbf{x};\boldsymbol{\theta})
\]
## Transform

A conditional bijection maps the target to a latent variable, with the input acting as the conditioner:
\[
\mathbf{z} = \boldsymbol{T}_{\boldsymbol{\theta}}(\mathbf{y};\mathbf{x})
\]
We can also add an encoder network that embeds the inputs, \(\mathbf{x}\):
\[
\tilde{\mathbf{x}} = \text{NN}_{\boldsymbol \theta}(\mathbf{x})
\]
So this alters the formulation slightly:
\[
\mathbf{z} = \boldsymbol{T}_{\boldsymbol{\theta}}(\mathbf{y}; \tilde{\mathbf{x}})
\]
or more compactly:
\[
\mathbf{z} = \boldsymbol{T}_{\boldsymbol{\theta}}(\mathbf{y}; \text{NN}_{\boldsymbol \theta}(\mathbf{x}))
\]
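With an invertible \(\boldsymbol{T}_{\boldsymbol{\theta}}\), the conditional likelihood follows from the change-of-variables formula:

\[
\log p(\mathbf{y}|\mathbf{x}) = \log p(\mathbf{z}|\mathbf{x}) + \log \left| \det \frac{\partial \boldsymbol{T}_{\boldsymbol{\theta}}(\mathbf{y};\tilde{\mathbf{x}})}{\partial \mathbf{y}} \right|
\]

To make this concrete, here is a minimal NumPy sketch of a conditional affine transform. The conditioner network, layer sizes, and initialization are illustrative assumptions, not the architecture from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_params(d_in, d_out, hidden=16):
    """Initialize a hypothetical two-layer MLP that outputs 2 * d_out values."""
    return {
        "W1": rng.normal(0.0, 0.1, (d_in, hidden)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(0.0, 0.1, (hidden, 2 * d_out)),
        "b2": np.zeros(2 * d_out),
    }

def conditioner(inputs, params):
    """Hypothetical conditioner NN_theta: maps its inputs to (shift, log_scale)."""
    h = np.tanh(inputs @ params["W1"] + params["b1"])
    out = h @ params["W2"] + params["b2"]
    shift, log_scale = np.split(out, 2, axis=-1)
    return shift, log_scale

def forward(y, x, params):
    """z = T_theta(y; NN_theta(x)) for an affine T, with the log-det Jacobian
    needed by the change-of-variables formula."""
    shift, log_scale = conditioner(x, params)
    z = (y - shift) * np.exp(-log_scale)
    log_det = -log_scale.sum(axis=-1)  # diagonal Jacobian of the affine map
    return z, log_det

# Toy shapes for illustration only.
dim_x, dim_y = 3, 2
x = rng.normal(size=(5, dim_x))
y = rng.normal(size=(5, dim_y))
z, log_det = forward(y, x, init_params(dim_x, dim_y))
```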
## Prior

The base distribution is also conditioned on the inputs, e.g. a Gaussian whose mean and variance are predicted from \(\mathbf{x}\):

\[
p(\mathbf{z}|\mathbf{x}) = \mathcal{N}\left(\mathbf{z}; \boldsymbol{\mu}_{\boldsymbol\theta}(\mathbf{x}), \boldsymbol{\sigma}^2_{\boldsymbol\theta}(\mathbf{x}) \right)
\]
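Continuing the sketch above, the conditional prior's log-density reuses the same hypothetical conditioner pattern, mapping \(\mathbf{x}\) to the mean and log-variance of a diagonal Gaussian:

```python
def conditional_prior_logpdf(z, x, params):
    """log N(z; mu_theta(x), sigma_theta^2(x)) with diagonal covariance."""
    mu, log_var = conditioner(x, params)  # reuse the (shift, log_scale)-style net
    return -0.5 * (
        (z - mu) ** 2 / np.exp(log_var) + log_var + np.log(2 * np.pi)
    ).sum(axis=-1)

log_p = conditional_prior_logpdf(z, x, init_params(dim_x, dim_y))
```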
### Split Prior

In a multi-scale architecture, part of the latent vector is factored out at intermediate layers; the factored-out dimensions are modeled with a Gaussian conditioned on the remaining latents and the inputs:
\[
p(\mathbf{z}_{\ell+1}|\mathbf{z}_\ell, \mathbf{x}) = \mathcal{N}(\mathbf{z}_{\ell+1}; \boldsymbol{\mu}_{\boldsymbol\theta}(\mathbf{z}_\ell, \mathbf{x}), \boldsymbol{\sigma}^2_{\boldsymbol\theta}(\mathbf{z}_\ell,\mathbf{x}))
\]
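A sketch of the corresponding split step, under the same illustrative assumptions (the even split and the conditioner shapes are hypothetical choices): one half of the latent continues through the flow while the other half is scored under the conditional Gaussian.

```python
def split_prior_logpdf(z, x, params):
    """Split z; keep one half, score the other under N(mu_theta(z_keep, x), ...)."""
    z_keep, z_out = np.split(z, 2, axis=-1)
    mu, log_var = conditioner(np.concatenate([z_keep, x], axis=-1), params)
    log_p = -0.5 * (
        (z_out - mu) ** 2 / np.exp(log_var) + log_var + np.log(2 * np.pi)
    ).sum(axis=-1)
    return z_keep, log_p  # z_keep flows on to the next layer

z_keep, log_p = split_prior_logpdf(z, x, init_params(dim_y // 2 + dim_x, dim_y // 2))
```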
## Transformations

The choice of transformation is arguably the most powerful way to incorporate prior knowledge into the model.
### Bijections
#### Coupling

A conditional coupling layer splits the latent into two parts: one part passes through unchanged, while the other is transformed with parameters computed from the unchanged part and the inputs:

\[
\begin{aligned}
\mathbf{z}_{\ell+1}^{A} &= \mathbf{z}_{\ell}^{A} \\
\mathbf{z}_{\ell+1}^{B} &= \boldsymbol{T}\left(\mathbf{z}_{\ell}^{B}; \text{NN}_{\boldsymbol \theta}(\mathbf{z}_{\ell}^{A}, \mathbf{x})\right)
\end{aligned}
\]
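A minimal sketch of a conditional affine coupling layer, continuing the running example above; the even split and the affine form of \(\boldsymbol{T}\) are illustrative choices:

```python
def coupling_forward(z, x, params):
    """Conditional affine coupling: z_A passes through, z_B is transformed
    given (z_A, x); the Jacobian is triangular, so the log-det is cheap."""
    z_a, z_b = np.split(z, 2, axis=-1)
    shift, log_scale = conditioner(np.concatenate([z_a, x], axis=-1), params)
    z_b = z_b * np.exp(log_scale) + shift
    log_det = log_scale.sum(axis=-1)
    return np.concatenate([z_a, z_b], axis=-1), log_det

z_new, ld = coupling_forward(z, x, init_params(dim_y // 2 + dim_x, dim_y // 2))
```

Because \(\mathbf{z}^{A}\) is untouched, the layer is invertible in closed form: recompute the shift and log-scale from \((\mathbf{z}^{A}, \mathbf{x})\) and undo the affine map.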
## Sources

- Winkler et al. (2019). *Learning Likelihoods with Conditional Normalizing Flows*. arXiv:1912.00042.