Rotation Choices in RBIG¶

After each marginal Gaussianization step, RBIG applies a rotation to mix dimensions before the next iteration. Without rotation, marginal Gaussianization alone cannot remove joint dependence — it only fixes the marginals. The rotation exposes new dependence structure for the next round to attack.

The choice of rotation affects convergence speed and quality. This notebook compares the three built-in options and provides the mathematical background.

Mathematical Background¶

A rotation matrix $\mathbf{R}$ is an orthogonal matrix ($\mathbf{R}^\top \mathbf{R} = \mathbf{I}$) that preserves lengths and angles.

Forward / Inverse:

$$\mathbf{z} = \mathbf{R} \cdot \mathbf{x} \qquad\text{and}\qquad \mathbf{x} = \mathbf{R}^\top \cdot \mathbf{z}$$

Why the Jacobian is free: The determinant of an orthogonal matrix is $\pm 1$ (+1 for proper rotations), so $\log|\det(\mathbf{R})| = 0$. This means the rotation step adds zero cost to the log-likelihood computation — only the marginal Gaussianization contributes to the Jacobian.

Available options in rbig:

Rotation	How it works	Data-dependent?
PCA	Eigenvectors of the covariance matrix	Yes
ICA	Maximizes non-Gaussianity of projections	Yes
Random	Haar-distributed orthogonal matrix	No

Colab / fresh environment? Run the cell below to install rbig from GitHub. Skip if already installed.

In [1]:

Copied!

!pip install "rbig[all] @ git+https://github.com/jejjohnson/rbig.git" -q
!pip install "rbig[all] @ git+https://github.com/jejjohnson/rbig.git" -q

In [2]:

Copied!





%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np

from rbig import (
    AnnealedRBIG,
    MarginalGaussianize,
    PCARotation,
    ICARotation,
    RandomRotation,
    RBIGLayer,
)
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np

from rbig import (
    AnnealedRBIG,
    MarginalGaussianize,
    PCARotation,
    ICARotation,
    RandomRotation,
    RBIGLayer,
)

/anaconda/lib/python3.13/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html
  from .autonotebook import tqdm as notebook_tqdm

Data¶

We use the standard 2-D sin-wave dataset.

In [3]:

Copied!





seed = 123
rng = np.random.RandomState(seed=seed)

n_samples = 2_000
x = np.abs(2 * rng.randn(1, n_samples))
y = np.sin(x) + 0.25 * rng.randn(1, n_samples)
data = np.vstack((x, y)).T

fig, ax = plt.subplots(figsize=(5, 5))
ax.hexbin(data[:, 0], data[:, 1], gridsize=30, cmap="Blues", mincnt=1)
ax.set_title("Original data")
ax.set_xticks([])
ax.set_yticks([])
plt.tight_layout()
plt.show()
seed = 123
rng = np.random.RandomState(seed=seed)

n_samples = 2_000
x = np.abs(2 * rng.randn(1, n_samples))
y = np.sin(x) + 0.25 * rng.randn(1, n_samples)
data = np.vstack((x, y)).T

fig, ax = plt.subplots(figsize=(5, 5))
ax.hexbin(data[:, 0], data[:, 1], gridsize=30, cmap="Blues", mincnt=1)
ax.set_title("Original data")
ax.set_xticks([])
ax.set_yticks([])
plt.tight_layout()
plt.show()

No description has been provided for this image

Single Layer Comparison¶

Let's see what one RBIG layer looks like with each rotation.

In [4]:

Copied!





rotations = {
    "PCA": PCARotation(whiten=False),
    "ICA": ICARotation(random_state=seed),
    "Random": RandomRotation(random_state=seed),
}

fig, axes = plt.subplots(1, 3, figsize=(15, 5))

for ax, (name, rot) in zip(axes, rotations.items()):
    layer = RBIGLayer(
        marginal=MarginalGaussianize(),
        rotation=rot,
    )
    layer.fit(data)
    Z = layer.transform(data)
    ax.hexbin(Z[:, 0], Z[:, 1], gridsize=30, cmap="Reds", mincnt=1)
    ax.set_title(f"1 layer — {name}")
    ax.set_xticks([])
    ax.set_yticks([])

plt.suptitle("Single RBIG layer with different rotations", y=1.01)
plt.tight_layout()
plt.show()
rotations = {
    "PCA": PCARotation(whiten=False),
    "ICA": ICARotation(random_state=seed),
    "Random": RandomRotation(random_state=seed),
}

fig, axes = plt.subplots(1, 3, figsize=(15, 5))

for ax, (name, rot) in zip(axes, rotations.items()):
    layer = RBIGLayer(
        marginal=MarginalGaussianize(),
        rotation=rot,
    )
    layer.fit(data)
    Z = layer.transform(data)
    ax.hexbin(Z[:, 0], Z[:, 1], gridsize=30, cmap="Reds", mincnt=1)
    ax.set_title(f"1 layer — {name}")
    ax.set_xticks([])
    ax.set_yticks([])

plt.suptitle("Single RBIG layer with different rotations", y=1.01)
plt.tight_layout()
plt.show()

After just one layer, the outputs already look different:

PCA decorrelates and whitens — the cloud is centred and scaled
ICA looks for statistically independent directions — orientation may differ
Random applies an arbitrary mixing — may or may not reduce dependence in one step

Convergence Comparison¶

We fit AnnealedRBIG with each rotation type and compare:

Number of layers needed to converge
Total correlation (TC) curve
Final reconstruction error

In [5]:

Copied!





n_layers = 80
patience = 10

models = {}
for name in ["pca", "ica", "random"]:
    model = AnnealedRBIG(
        n_layers=n_layers,
        rotation=name,
        patience=patience,
        random_state=seed,
    )
    model.fit(data)
    models[name] = model
    print(f"{name:>6s}: converged in {len(model.layers_):>2d} layers")
n_layers = 80
patience = 10

models = {}
for name in ["pca", "ica", "random"]:
    model = AnnealedRBIG(
        n_layers=n_layers,
        rotation=name,
        patience=patience,
        random_state=seed,
    )
    model.fit(data)
    models[name] = model
    print(f"{name:>6s}: converged in {len(model.layers_):>2d} layers")

   pca: converged in 27 layers

   ica: converged in 80 layers

random: converged in 80 layers

TC convergence curves¶

In [6]:

Copied!





fig, ax = plt.subplots(figsize=(8, 5))

colors = {"pca": "steelblue", "ica": "darkorange", "random": "seagreen"}
for name, model in models.items():
    ax.plot(model.tc_per_layer_, label=f"{name.upper()} ({len(model.layers_)} layers)",
            color=colors[name], linewidth=2)

ax.set_xlabel("Layer index")
ax.set_ylabel("Total Correlation (nats)")
ax.set_title("TC convergence by rotation type")
ax.legend()
plt.tight_layout()
plt.show()
fig, ax = plt.subplots(figsize=(8, 5))

colors = {"pca": "steelblue", "ica": "darkorange", "random": "seagreen"}
for name, model in models.items():
    ax.plot(model.tc_per_layer_, label=f"{name.upper()} ({len(model.layers_)} layers)",
            color=colors[name], linewidth=2)

ax.set_xlabel("Layer index")
ax.set_ylabel("Total Correlation (nats)")
ax.set_title("TC convergence by rotation type")
ax.legend()
plt.tight_layout()
plt.show()

Transformed distributions¶

In [7]:

Copied!





fig, axes = plt.subplots(1, 3, figsize=(15, 5))

for ax, (name, model) in zip(axes, models.items()):
    Z = model.transform(data)
    ax.hexbin(Z[:, 0], Z[:, 1], gridsize=30, cmap="Purples", mincnt=1)
    ax.set_title(f"{name.upper()} — {len(model.layers_)} layers")
    ax.set_xticks([])
    ax.set_yticks([])

plt.suptitle("Converged RBIG output by rotation type", y=1.01)
plt.tight_layout()
plt.show()
fig, axes = plt.subplots(1, 3, figsize=(15, 5))

for ax, (name, model) in zip(axes, models.items()):
    Z = model.transform(data)
    ax.hexbin(Z[:, 0], Z[:, 1], gridsize=30, cmap="Purples", mincnt=1)
    ax.set_title(f"{name.upper()} — {len(model.layers_)} layers")
    ax.set_xticks([])
    ax.set_yticks([])

plt.suptitle("Converged RBIG output by rotation type", y=1.01)
plt.tight_layout()
plt.show()

Reconstruction error¶

In [8]:

Copied!





for name, model in models.items():
    Z = model.transform(data)
    data_rec = model.inverse_transform(Z)
    err = np.abs(data - data_rec).mean()
    print(f"{name:>6s}: mean |reconstruction error| = {err:.2e}")
for name, model in models.items():
    Z = model.transform(data)
    data_rec = model.inverse_transform(Z)
    err = np.abs(data - data_rec).mean()
    print(f"{name:>6s}: mean |reconstruction error| = {err:.2e}")

   pca: mean |reconstruction error| = 2.53e-02
   ica: mean |reconstruction error| = 5.59e-02

random: mean |reconstruction error| = 6.10e-02

Log-Likelihood Comparison¶

In [9]:

Copied!

for name, model in models.items():
    ll = model.score(data)
    print(f"{name:>6s}: mean log-likelihood = {ll:.4f}")
for name, model in models.items():
    ll = model.score(data)
    print(f"{name:>6s}: mean log-likelihood = {ll:.4f}")

   pca: mean log-likelihood = -1.6289

   ica: mean log-likelihood = -2.1271

random: mean log-likelihood = -1.5083

When to Use Which¶

Rotation	Strengths	Best for
PCA	Fast, deterministic, good default	Most use cases — decorrelation is usually what you want between layers
ICA	Finds non-Gaussian structure	Data with clear independent sources (e.g., signal separation)
Random	No data dependence, cheap	Very high-dimensional data where PCA/ICA is expensive; ensemble averaging

In practice, PCA is the recommended default. It tends to converge in fewer layers because the data-dependent orthogonal rotation decorrelates the features, leaving mainly higher-order structure for subsequent layers to handle. If you also want unit-variance components, you can optionally enable whitening via PCARotation(whiten=True), but this is not the library default for rotation="pca".

Random rotations may need more layers but are useful when:

Dimensionality is very high (PCA eigendecomposition is $O(d^3)$)
You want to average over multiple runs for robust IT estimates

ICA can help when the data has a known independent-source structure, but it is slower per layer and may not converge faster overall.

Additional Rotations¶

Beyond the three main options, rbig provides specialized rotation classes for advanced use cases:

Class	Use case
`PicardRotation`	ICA via the Picard algorithm (faster, more robust than FastICA)
`RandomOrthogonalProjection`	Semi-orthogonal dimensionality reduction
`GaussianRandomProjection`	Johnson-Lindenstrauss style projection
`OrthogonalDimensionalityReduction`	Full orthogonal rotation + dimension truncation
`DCTRotation`	2-D Discrete Cosine Transform (image data)
`HartleyRotation`	2-D Discrete Hartley Transform (image data)
`RandomChannelRotation`	1x1 convolution mixing channels (image data)

These can be passed directly to RBIGLayer(rotation=...) for custom pipelines. See the API Reference for full signatures.