Information Theory Measures
- Summary
- Information
- Entropy
- Mutual Information
- Total Correlation (Mutual Information)
- Kullback-Leibler Divergence (KLD)
Summary
Information
The information content (or surprisal) of an event \(x\) with probability \(p(x)\) is:

\[
I(x) = -\log p(x)
\]
Rare events carry more information than common ones.
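As a quick numerical check of this (a minimal sketch; the event probabilities are made up for illustration), surprisal in bits uses the base-2 logarithm:

```python
import numpy as np

def surprisal(p):
    """Information content -log2(p), in bits."""
    return -np.log2(p)

# A rare event carries more information than a common one.
print(surprisal(0.5))   # fair coin flip -> 1.0 bit
print(surprisal(1/64))  # rare event -> 6.0 bits
```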
Entropy
The entropy of a random variable \(X\) with PDF \(p(x)\) is the expected information content:

\[
H(X) = -\int p(x) \log p(x) \, dx = \mathbb{E}\left[-\log p(x)\right]
\]

Entropy measures the uncertainty or randomness of a distribution. Among all distributions with a given mean and variance, the Gaussian has maximum entropy. See Gaussian Distribution for the closed-form expression.
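The closed-form differential entropy of a univariate Gaussian is \(\tfrac{1}{2}\log(2\pi e \sigma^2)\); a sketch comparing it against a Monte-Carlo estimate of \(-\mathbb{E}[\log p(x)]\) (sample size and \(\sigma\) chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 2.0

# Closed-form differential entropy of N(0, sigma^2), in nats.
h_closed = 0.5 * np.log(2 * np.pi * np.e * sigma**2)

# Monte-Carlo estimate: H(X) = -E[log p(X)], averaged over samples.
x = rng.normal(0.0, sigma, size=200_000)
log_p = -0.5 * np.log(2 * np.pi * sigma**2) - x**2 / (2 * sigma**2)
h_mc = -log_p.mean()

print(h_closed, h_mc)  # the two values should agree closely
```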
Mutual Information
The mutual information between two random variables \(X\) and \(Y\) measures the amount of information shared between them:

\[
I(X; Y) = \iint p(x, y) \log \frac{p(x, y)}{p(x)\,p(y)} \, dx \, dy
\]

Equivalently, in terms of entropies:

\[
I(X; Y) = H(X) + H(Y) - H(X, Y)
\]
Mutual information is zero if and only if \(X\) and \(Y\) are independent.
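Both expressions are easy to evaluate for a discrete joint distribution; a sketch with a made-up 2×2 joint table, checking that the two formulas agree:

```python
import numpy as np

# Joint distribution p(x, y) over two binary variables (illustrative values).
p_xy = np.array([[0.4, 0.1],
                 [0.1, 0.4]])

p_x = p_xy.sum(axis=1, keepdims=True)  # marginal p(x)
p_y = p_xy.sum(axis=0, keepdims=True)  # marginal p(y)

# Definition: I(X;Y) = sum_{x,y} p(x,y) log[ p(x,y) / (p(x) p(y)) ]
mi = np.sum(p_xy * np.log(p_xy / (p_x * p_y)))

# Equivalent form: I(X;Y) = H(X) + H(Y) - H(X,Y)
entropy = lambda p: -np.sum(p * np.log(p))
mi_alt = entropy(p_x) + entropy(p_y) - entropy(p_xy)

print(mi, mi_alt)  # identical up to floating-point error; positive here
```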
Total Correlation (Mutual Information)
Total correlation (also called multi-information) extends mutual information to measure the statistical dependence among the components of a multivariate source \(\mathbf{x} = (x_1, \ldots, x_D)\):

\[
T(\mathbf{x}) = \sum_{d=1}^{D} H(x_d) - H(\mathbf{x})
\]
where \(H(\mathbf{x})\) is the differential entropy of \(\mathbf{x}\) and \(H(x_d)\) represents the differential entropy of the \(d^\text{th}\) component of \(\mathbf{x}\). This is nicely summarized in equation 1 from (Lyu & Simoncelli, 2008).
Note: In 2 dimensions, the total correlation \(I\) is equivalent to the mutual information.
We can decompose this measure into two parts, representing second-order and higher-order dependencies:

\[
T(\mathbf{x}) = \underbrace{\frac{1}{2} \log \frac{\prod_{d=1}^{D} \Sigma_{dd}}{|\boldsymbol{\Sigma}|}}_{\text{second-order}} + \underbrace{J(\mathbf{x}) - \sum_{d=1}^{D} J(x_d)}_{\text{higher-order}}
\]

where \(\boldsymbol{\Sigma}\) is the covariance of \(\mathbf{x}\) and \(J(\cdot)\) denotes negentropy. This is summarized in equation 2 of (Lyu & Simoncelli, 2008).
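For Gaussian data the negentropies vanish, so total correlation reduces to the second-order term, computable directly from the covariance matrix. A sketch with an arbitrary 3-d covariance, checking the covariance-based formula against the entropy-based definition:

```python
import numpy as np

# Covariance of a 3-d Gaussian (illustrative values). For Gaussian data the
# higher-order (negentropy) terms vanish, so T(x) is purely second order.
Sigma = np.array([[1.0, 0.5, 0.2],
                  [0.5, 1.0, 0.3],
                  [0.2, 0.3, 1.0]])

# Second-order total correlation: 0.5 * log( prod_d Sigma_dd / |Sigma| )
tc = 0.5 * np.log(np.prod(np.diag(Sigma)) / np.linalg.det(Sigma))

# Equivalent check via entropies: sum of marginals minus joint (in nats).
d = Sigma.shape[0]
h_joint = 0.5 * np.log((2 * np.pi * np.e) ** d * np.linalg.det(Sigma))
h_marg = sum(0.5 * np.log(2 * np.pi * np.e * s) for s in np.diag(Sigma))

print(tc, h_marg - h_joint)  # both express T(x) for this Gaussian
```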
Sources:
- Lyu, S. & Simoncelli, E. P. (2008). Nonlinear Extraction of "Independent Components" of Elliptically Symmetric Densities Using Radial Gaussianization.
Kullback-Leibler Divergence (KLD)
The KL divergence measures the difference between two probability distributions \(p\) and \(q\):

\[
D_{\mathrm{KL}}(p \| q) = \int p(x) \log \frac{p(x)}{q(x)} \, dx
\]

It is non-negative, equal to zero if and only if \(p = q\), and asymmetric in its arguments.
See RBIG for how it can be used to estimate the KLD.
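For discrete distributions the integral becomes a sum, which makes the asymmetry easy to see numerically (a sketch; the two distributions are made up for illustration):

```python
import numpy as np

# Two discrete distributions over the same three outcomes (illustrative).
p = np.array([0.5, 0.3, 0.2])
q = np.array([0.4, 0.4, 0.2])

# D_KL(p || q) = sum_x p(x) log[ p(x) / q(x) ]
kl_pq = np.sum(p * np.log(p / q))
kl_qp = np.sum(q * np.log(q / p))

print(kl_pq, kl_qp)  # both positive, and generally not equal
```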