
Entropy

Intuition

Expected uncertainty.

For a uniform distribution over a set of outcomes (states):

H(X) = \log \left( \text{\# of outcomes} \right)
  • Lower bound on the number of bits needed to represent a RV, e.g. a RV with a uniform distribution over 32 outcomes (see the worked example after this list).
  • Lower bound on the average length of the shortest description of X
  • Expected self-information, \mathbb{E}\left[-\log p(X)\right]
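
For example (the uniform RV over 32 outcomes from the list above): every outcome has probability 1/32, so

H(X) = -\sum_{i=1}^{32} \frac{1}{32} \log_2 \frac{1}{32} = \log_2 32 = 5 \text{ bits}

i.e. 5 bits are exactly enough to index the 32 outcomes.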

Formulas

H(\mathbf{X}) = - \int_\mathcal{X} p(\mathbf{x}) \log p(\mathbf{x}) d\mathbf{x}

And we can estimate this empirically by:

H(\mathbf{X}) = -\sum_{i=1}^N p_i \log p_i

where p_i = P(\mathbf{X} = \mathbf{x}_i) is the probability of the i-th outcome.
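
As a quick sanity check of this formula (a minimal numpy sketch; the three-outcome distribution below is just an illustrative example):

import numpy as np

# an example discrete distribution over three outcomes
p = np.array([0.5, 0.25, 0.25])

# H(X) = -sum_i p_i * log2(p_i)  ->  1.5 bits
H = -(p * np.log2(p)).sum()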

Code - Step-by-Step

import numpy as np

# 1. obtain all possible occurrences of the outcomes
# (labels is assumed to be an array of discrete outcomes, e.g. class labels)
values, counts = np.unique(labels, return_counts=True)

# 2. Normalize the occurrences to obtain a probability distribution
# (counts is an integer array, so avoid in-place division)
counts = counts / counts.sum()

# 3. Calculate the entropy using the formula above (log base 2 -> bits)
H = -(counts * np.log2(counts)).sum()
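
As a quick check, with a hypothetical labels = np.array([0, 0, 1, 1]) (two outcomes, each with probability 0.5), the snippet above gives H = 1.0 bit.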

As a general rule-of-thumb, I never try to reinvent the wheel, so I look for existing software to calculate entropy. The simplest option I have found is scipy, which has an entropy function (scipy.stats.entropy). We still need a probability distribution (the counts variable); from there we can just call the entropy function.

Code - Refactored

import numpy as np
from scipy.stats import entropy

# 1. obtain all possible occurrences of the outcomes
values, counts = np.unique(labels, return_counts=True)

# 2. Normalize the occurrences to obtain a probability distribution
counts = counts / counts.sum()

# 3. Calculate the entropy using scipy's entropy function
base = 2
H = entropy(counts, base=base)
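
Note that scipy.stats.entropy normalizes its input to sum to 1, so passing the raw counts directly (skipping step 2) gives the same result; the normalization step is kept here only to mirror the formula above.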

Estimating Entropy

Histogram

import numpy as np
from scipy import stats

# data
data = np.random.normal(10, 10, 1_000)

# construct a histogram (equal-width bins, normalized to a density)
hist_pdf, hist_bins = np.histogram(data, bins=50, density=True)

# calculate the entropy of the binned distribution
# (stats.entropy re-normalizes its input, so equal-width densities are fine here)
H_data = stats.entropy(hist_pdf, base=2)
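
One caveat: the histogram estimate is sensitive to the number of bins (and to the range covered), so it is worth checking that the value is stable across a few reasonable bin counts.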

Kernel Density Estimation
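
A minimal sketch of one common approach, assuming scipy's gaussian_kde and a resubstitution (plug-in) estimate: fit a KDE to the samples, then average -\log_2 of the estimated density at each sample. The variable names are illustrative.

import numpy as np
from scipy import stats

# data
data = np.random.normal(10, 10, 1_000)

# fit a Gaussian kernel density estimate to the samples
kde = stats.gaussian_kde(data)

# resubstitution estimate: H(X) ≈ -(1/N) * sum_i log2( p_hat(x_i) )
H_kde = -np.mean(np.log2(kde(data)))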

KNN Approximation

Single Variable

H(X) = \mathbb{E}_{p(X)} \left( \log \frac{1}{p(X)}\right)

Multivariate

H(X,Y) = \mathbb{E}_{p(X,Y)} \left( \log \frac{1}{p(X,Y)}\right)
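
A minimal sketch of a k-nearest-neighbour estimator in the Kozachenko-Leonenko style, assuming Euclidean distances and natural logarithms (nats); the function name knn_entropy, the default k=3, and the example call at the end are illustrative choices rather than a fixed API.

import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma, gammaln

def knn_entropy(x, k=3):
    # shape (n_samples, n_dims); reshape univariate data to a single column
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    n, d = x.shape

    # distance from each point to its k-th nearest neighbour
    # (query with k + 1 because each point is its own nearest neighbour)
    tree = cKDTree(x)
    r, _ = tree.query(x, k=k + 1)
    r_k = r[:, -1]

    # log-volume of the d-dimensional unit (Euclidean) ball
    log_v_d = (d / 2) * np.log(np.pi) - gammaln(d / 2 + 1)

    # Kozachenko-Leonenko estimate (in nats)
    return digamma(n) - digamma(k) + log_v_d + d * np.mean(np.log(r_k))

# univariate example; divide by np.log(2) to convert nats to bits
H_knn = knn_entropy(np.random.normal(10, 10, 1_000), k=3)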

Relative Entropy (KL-Divergence)

A measure of "distance" between two distributions (not a true metric: it is not symmetric).

D_{KL} (P,Q) = \int_\mathcal{X} p(x) \:\log \frac{p(x)}{q(x)}\;dx
  • aka expected log-likelihood ratio
  • measure of the inefficiency of assuming that the distribution is q when the true distribution is p (see the numerical sketch below)
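
A quick numerical sketch: scipy.stats.entropy computes the KL divergence when a second distribution qk is supplied; the two discrete distributions below are arbitrary examples.

import numpy as np
from scipy.stats import entropy

# two example discrete distributions over the same outcomes
p = np.array([0.5, 0.25, 0.25])    # "true" distribution
q = np.array([1/3, 1/3, 1/3])      # assumed distribution

# D_KL(P, Q) = sum_i p_i * log2(p_i / q_i)
D_pq = entropy(p, q, base=2)

# note the asymmetry: D_KL(Q, P) generally differs from D_KL(P, Q)
D_qp = entropy(q, p, base=2)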