
An overview of the methods used to find the best parameters.

Inference

In general, there are three families of methods for obtaining the parameters: point estimates, approximate inference, and sampling.


Point Estimates

We use the fact that the posterior distribution is proportional to the joint distribution, i.e. the likelihood multiplied by the prior, and ignore the normalization constant.

$$
p(\boldsymbol{\theta}|\mathbf{y}) \propto p(\mathbf{y}|\boldsymbol{\theta})\,p(\boldsymbol{\theta})
$$

Thus, we obtain an estimate of the parameters given the measurements by minimizing the negative log posterior:

$$
\boldsymbol{L}(\boldsymbol{\theta}) = -\sum_{n=1}^N \log p(\boldsymbol{\theta}|\mathbf{y}_n), \qquad \boldsymbol{\theta}^* = \underset{\boldsymbol{\theta}}{\text{argmin}} \; \boldsymbol{L}(\boldsymbol{\theta})
$$

MLE Estimation

$$
\boldsymbol{L}_\text{MLE}(\boldsymbol{\theta}) = -\sum_{n=1}^N \log p(y_n|\boldsymbol{\theta}), \qquad \boldsymbol{\theta}^* = \underset{\boldsymbol{\theta}}{\text{argmin}} \; \boldsymbol{L}_\text{MLE}(\boldsymbol{\theta})
$$

We place some constraints on the parameters. The mean and shape parameters are completely free; however, the scale parameter is constrained to be positive.

$$
\begin{aligned}
\text{Mean}: && && \mu &\in \mathbb{R} \\
\text{Scale}: && && \sigma &\in \mathbb{R}^+ \\
\text{Shape}: && && \kappa &\in \mathbb{R}
\end{aligned}
$$
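As an illustration, MLE under these constraints can be sketched with scipy. The article does not name the distribution family, so a skew-normal (location $\mu$, scale $\sigma$, shape $\kappa$) is assumed here; positivity of $\sigma$ is enforced by optimizing its logarithm.

```python
import numpy as np
from scipy import stats
from scipy.optimize import minimize

# Hypothetical data: the article does not specify the distribution family,
# so we assume a skew-normal with location mu, scale sigma, and shape kappa.
rng = np.random.default_rng(0)
y = stats.skewnorm.rvs(a=4.0, loc=1.0, scale=2.0, size=500, random_state=rng)

def nll(params, y):
    """Negative log-likelihood; sigma is optimized in log-space so it stays positive."""
    mu, log_sigma, kappa = params
    sigma = np.exp(log_sigma)
    return -np.sum(stats.skewnorm.logpdf(y, a=kappa, loc=mu, scale=sigma))

# Initialize from simple data statistics
x0 = np.array([np.mean(y), np.log(np.std(y)), 0.0])
res = minimize(nll, x0, args=(y,), method="L-BFGS-B")
mu_mle, sigma_mle, kappa_mle = res.x[0], np.exp(res.x[1]), res.x[2]
```

Optimizing $\log\sigma$ is a common way to turn the constrained problem into an unconstrained one that a generic optimizer such as L-BFGS-B can handle directly.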

MAP Estimation

MAP estimation is very similar to MLE estimation, except that we place priors on the parameters.

$$
\boldsymbol{L}_\text{MAP}(\boldsymbol{\theta}) = -\left[\sum_{n=1}^N \log p(y_n|\boldsymbol{\theta}) + \log p(\boldsymbol{\theta})\right], \qquad \boldsymbol{\theta}^* = \underset{\boldsymbol{\theta}}{\text{argmin}} \; \boldsymbol{L}_\text{MAP}(\boldsymbol{\theta})
$$

We place prior distributions on the parameters. The mean and shape parameters get unconstrained Normal priors; however, the LogNormal prior on the scale parameter constrains it to be positive.

$$
\begin{aligned}
\text{Mean}: && && \mu &\sim \text{Normal}(\hat{\mu},\hat{\sigma}) \\
\text{Scale}: && && \sigma &\sim \text{LogNormal}(0.5\hat{\sigma}, 0.25) \\
\text{Shape}: && && \kappa &\sim \text{Normal}(\hat{\kappa}, 0.1)
\end{aligned}
$$

The hyperparameters $\hat{\mu}$ and $\hat{\sigma}$ are estimated directly from the data by computing the empirical mean and standard deviation; the same scale estimate $\hat{\sigma}$ is reused in the LogNormal prior on $\sigma$.
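A minimal MAP sketch, again assuming a skew-normal likelihood, adds the log-priors above to the negative log-likelihood. Taking $\hat{\kappa}$ from the sample skewness is an assumption here; the article does not say how it is estimated.

```python
import numpy as np
from scipy import stats
from scipy.optimize import minimize

# Hypothetical data; skew-normal family assumed as in the MLE sketch.
rng = np.random.default_rng(1)
y = stats.skewnorm.rvs(a=4.0, loc=1.0, scale=2.0, size=500, random_state=rng)

# Empirical hyperparameters for the priors (kappa_hat via sample skewness
# is an assumption, not something the article specifies).
mu_hat, sigma_hat = np.mean(y), np.std(y)
kappa_hat = stats.skew(y)

def neg_log_posterior(params, y):
    mu, log_sigma, kappa = params
    sigma = np.exp(log_sigma)  # keep sigma positive
    nll = -np.sum(stats.skewnorm.logpdf(y, a=kappa, loc=mu, scale=sigma))
    # Log-priors: Normal on mu, LogNormal on sigma, Normal on kappa.
    # scipy's lognorm(s=s, scale=exp(m)) corresponds to LogNormal(m, s).
    # For simplicity the Jacobian of the log-transform of sigma is ignored.
    log_prior = (
        stats.norm.logpdf(mu, loc=mu_hat, scale=sigma_hat)
        + stats.lognorm.logpdf(sigma, s=0.25, scale=np.exp(0.5 * sigma_hat))
        + stats.norm.logpdf(kappa, loc=kappa_hat, scale=0.1)
    )
    return nll - log_prior

x0 = np.array([mu_hat, np.log(sigma_hat), kappa_hat])
res = minimize(neg_log_posterior, x0, args=(y,), method="L-BFGS-B")
mu_map, sigma_map, kappa_map = res.x[0], np.exp(res.x[1]), res.x[2]
```

Compared with the MLE sketch, the only change is the extra log-prior term, which pulls the estimates toward the data-derived hyperparameters.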


Approximate Inference


Laplace Approximation (TODO)


SVI (TODO)


Sampling


MCMC (TODO)
