Quickstart¶
Get from zero to results in under five minutes.
Part 1 — Density Estimation & Generative Modeling¶
Fit RBIG to data¶
import numpy as np
from rbig import AnnealedRBIG
# Generate a noisy 2-D sine-wave dataset
rng = np.random.RandomState(42)
n = 2_000
x = np.abs(2 * rng.randn(1, n))         # positive inputs, half-Gaussian
y = np.sin(x) + 0.25 * rng.randn(1, n)  # noisy sine of x
data = np.vstack((x, y)).T  # shape (2000, 2)
# Fit RBIG
model = AnnealedRBIG(n_layers=50, rotation="pca", patience=10, random_state=42)
model.fit(data)
Transform to Gaussian space¶
import matplotlib.pyplot as plt
Z = model.transform(data)
fig, axes = plt.subplots(1, 2, figsize=(10, 4))
axes[0].hexbin(data[:, 0], data[:, 1], gridsize=30, cmap="Blues", mincnt=1)
axes[0].set_title("Original data")
axes[1].hexbin(Z[:, 0], Z[:, 1], gridsize=30, cmap="Purples", mincnt=1)
axes[1].set_title("After RBIG (≈ standard Gaussian)")
plt.tight_layout()
plt.show()
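As a quick sanity check (plain NumPy on the transformed array, not a package feature): if the Gaussianization worked, the per-dimension means and standard deviations of Z should be close to 0 and 1.

print("mean:", Z.mean(axis=0).round(3))  # ≈ [0, 0]
print("std: ", Z.std(axis=0).round(3))   # ≈ [1, 1]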
Generate new samples¶
samples = model.sample(n_samples=1_000, random_state=0)
fig, axes = plt.subplots(1, 2, figsize=(10, 4))
axes[0].hexbin(data[:, 0], data[:, 1], gridsize=30, cmap="Blues", mincnt=1)
axes[0].set_title("Training data")
axes[1].hexbin(samples[:, 0], samples[:, 1], gridsize=30, cmap="Oranges", mincnt=1)
axes[1].set_title("Generated samples")
plt.tight_layout()
plt.show()
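A rough way to judge sample quality, again using only NumPy rather than anything from the package, is to compare low-order moments of the training data and the generated samples:

print("data mean:   ", data.mean(axis=0).round(3))
print("sample mean: ", samples.mean(axis=0).round(3))
print("data cov:\n", np.cov(data.T).round(3))
print("sample cov:\n", np.cov(samples.T).round(3))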
Estimate log-probabilities¶
log_probs = model.score_samples(data) # per-sample log p(x)
mean_ll = model.score(data) # mean log-likelihood
fig, ax = plt.subplots(figsize=(5, 4))
h = ax.scatter(data[:, 0], data[:, 1], s=8, c=log_probs, cmap="Reds")
ax.set_title("Data coloured by log p(x)")
plt.colorbar(h, ax=ax, label="log p(x)")
plt.tight_layout()
plt.show()
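One common use of per-sample log-probabilities is simple anomaly flagging. The sketch below (plain NumPy, not an rbig API) marks the 1% least likely points as candidate outliers:

threshold = np.percentile(log_probs, 1)   # 1st percentile of log p(x)
outliers = data[log_probs < threshold]    # candidate anomalies
print(f"Flagged {len(outliers)} of {len(data)} points as low-likelihood")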
For a full walkthrough with theory, see the RBIG Walk-Through notebook.
Part 2 — Information Theory Measures¶
RBIG estimates information-theoretic quantities using the per-layer TC reduction approach from Laparra et al. (2011, 2020). Each RBIG layer removes statistical dependence; summing these reductions gives the total correlation — no Jacobian estimation needed.
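The bookkeeping behind this is a telescoping sum: if T_k denotes the residual total correlation after layer k, then layer k removes T_{k-1} - T_k, and the per-layer reductions sum to T_0 - T_K. A minimal NumPy sketch with purely illustrative numbers:

import numpy as np
tc = np.array([1.20, 0.45, 0.12, 0.03, 0.01])  # illustrative residual TC after each layer
per_layer = -np.diff(tc)                        # dependence removed by each layer
assert np.isclose(per_layer.sum(), tc[0] - tc[-1])  # the reductions telescope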
Quick estimates (Level 0 — data only)¶
The simplest API: pass data, get an IT estimate.
import numpy as np
from rbig import estimate_tc, estimate_entropy, estimate_mi, estimate_kld
rng = np.random.RandomState(42)
# Two correlated 2-D random vectors
cov = np.eye(4)
cov[0, 2] = cov[2, 0] = 0.8
cov[1, 3] = cov[3, 1] = 0.5
joint = rng.multivariate_normal(np.zeros(4), cov, size=1_000)
X, Y = joint[:, :2], joint[:, 2:]
tc = estimate_tc(joint, random_state=42)
h = estimate_entropy(joint, random_state=42)
mi = estimate_mi(X, Y, random_state=42)
kl = estimate_kld(X, Y, random_state=42)
print(f"TC(X,Y): {tc:.4f} nats")
print(f"H(X,Y): {h:.4f} nats")
print(f"MI(X; Y): {mi:.4f} nats")
print(f"KLD(X || Y): {kl:.4f} nats")
Pre-fitted models (Level 1 — more control)¶
Fit models once, then compute multiple measures without re-fitting.
from rbig import (
AnnealedRBIG,
total_correlation_rbig_reduction,
entropy_rbig_reduction,
mutual_information_rbig_reduction,
)
kwargs = dict(n_layers=50, rotation="pca", random_state=42)
model_x = AnnealedRBIG(**kwargs)
model_y = AnnealedRBIG(**kwargs)
model_x.fit(X)
model_y.fit(Y)
tc = total_correlation_rbig_reduction(model_x)
h = entropy_rbig_reduction(model_x, X)
mi = mutual_information_rbig_reduction(model_x, model_y, X, Y, rbig_kwargs=kwargs)
print(f"TC(X): {tc:.4f} nats")
print(f"H(X): {h:.4f} nats")
print(f"MI(X; Y): {mi:.4f} nats")
Model-level access (Level 2)¶
model = AnnealedRBIG(n_layers=50, rotation="pca", random_state=42)
model.fit(joint)
# Per-layer TC values (recorded during fit)
print(f"Input TC: {model.tc_per_layer_[0]:.4f}")
print(f"Residual TC: {model.tc_per_layer_[-1]:.4f}")
print(f"TC removed: {model.total_correlation_reduction():.4f}")
Change-of-variables approach (also available)¶
The package also supports the standard normalizing-flow density approach via the change-of-variables formula log p(x) = log p_Z(f(x)) + log |det J_f(x)|:
from rbig import entropy_rbig, mutual_information_rbig, kl_divergence_rbig
# Reuses kwargs, X, Y, model_x and model_y from the Level 1 section above
model_xy = AnnealedRBIG(**kwargs)
model_xy.fit(np.hstack([X, Y]))
H_cov = entropy_rbig(model_xy, np.hstack([X, Y]))             # joint entropy H(X, Y)
mi_cov = mutual_information_rbig(model_x, model_y, model_xy)  # MI(X; Y)
kld_cov = kl_divergence_rbig(model_x, Y)                      # KLD(X || Y)
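Both routes target the same quantities, so their estimates should broadly agree; printing them alongside the reduction-based MI from Level 1 makes that easy to eyeball:

print(f"H(X,Y): {H_cov:.4f} nats")
print(f"MI(X; Y): {mi_cov:.4f} nats (reduction-based estimate above: {mi:.4f})")
print(f"KLD(X || Y): {kld_cov:.4f} nats")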
Available IT measures¶
| Measure | RBIG-way (recommended) | Change-of-variables |
|---|---|---|
| Total Correlation | estimate_tc(X) | total_correlation_rbig(X) |
| Entropy | estimate_entropy(X) | entropy_rbig(model, X) |
| Mutual Information | estimate_mi(X, Y) | mutual_information_rbig(m_x, m_y, m_xy) |
| KL-Divergence | estimate_kld(X, Y) | kl_divergence_rbig(model_P, X_Q) |
| Log-Likelihood | — | model.score_samples(X) |
| TC Reduction | model.total_correlation_reduction() | — |
For detailed examples, see the Information Theory and Dependence Detection notebooks.