rbig.AnnealedRBIG

Bases: TransformerMixin, BaseEstimator

Rotation-Based Iterative Gaussianization (RBIG).

RBIG is a density estimation and data transformation method that iteratively Gaussianizes multivariate data by alternating between:

  1. Marginal Gaussianization: mapping each feature to a Gaussian using its empirical CDF and the probit transform.
  2. Rotation: applying an orthogonal matrix (PCA or ICA) to de-correlate the Gaussianized features.

The process repeats until the total correlation (TC) of the transformed data converges. After fitting, the model represents a normalizing flow whose density is given by the change-of-variables formula:

log p(x) = log p_Z(f(x)) + log|det J_f(x)|

where f is the composition of all fitted layers and p_Z is a standard multivariate Gaussian.
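As a toy illustration (a single linear map standing in for the RBIG layers, not the library's own code), the change-of-variables formula can be checked numerically against a closed-form Gaussian density:

```python
import numpy as np
from scipy import stats

# Toy one-layer "flow": f(x) = x @ W with an invertible matrix W.
# Change of variables: log p(x) = log p_Z(x @ W) + log|det W|.
rng = np.random.default_rng(0)
d = 3
W = rng.standard_normal((d, d))            # almost surely invertible
x = rng.standard_normal((5, d))

z = x @ W                                  # latent representation f(x)
log_pz = stats.norm.logpdf(z).sum(axis=1)  # standard Gaussian base density
_, log_abs_det = np.linalg.slogdet(W)      # log|det J_f|, constant for linear f
log_px = log_pz + log_abs_det

# Analytic cross-check: if z = x @ W is N(0, I), then x is Gaussian
# with covariance (W @ W.T)^{-1}.
cov = np.linalg.inv(W @ W.T)
ref = stats.multivariate_normal(mean=np.zeros(d), cov=cov).logpdf(x)
assert np.allclose(log_px, ref)
```

RBIG composes many nonlinear layers, so `log|det J_f(x)|` varies per sample, but the bookkeeping is the same.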

Parameters

n_layers : int, default=100
    Maximum number of RBIG layers to apply. Early stopping via patience may halt training before this limit.
rotation : str, default="pca"
    Rotation method: "pca" (PCA without whitening — orthogonal), "ica" (Independent Component Analysis), or "random" (Haar-distributed orthogonal rotation).
patience : int, default=10
    Number of consecutive layers showing a TC change smaller than tol before training stops early. (Formerly zero_tolerance, which is still accepted but deprecated.)
tol : float or "auto", default=1e-5
    Convergence threshold for the per-layer change in total correlation: |TC(k) − TC(k−1)| < tol. When set to "auto", the tolerance is chosen adaptively based on the number of training samples using an empirically calibrated lookup table.
random_state : int or None, default=None
    Seed for the random number generator used by stochastic components such as ICA or random rotations.
strategy : list or None, default=None
    Optional per-layer override list. Each entry may be a string (rotation name) or a (rotation_name, marginal_name) pair. Entries cycle if the list is shorter than n_layers.
verbose : bool or int, default=False
    Controls progress bar display. False (or 0) disables all progress bars. True (or 1) shows a progress bar for the fit loop. 2 additionally shows progress bars for transform, inverse_transform, score_samples, and jacobian.
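The cycling behaviour of strategy can be sketched with itertools (a hypothetical helper mirroring the documented behaviour, not the library's internals; the "kde" marginal name is illustrative):

```python
import itertools

# Sketch: a short `strategy` list repeats until n_layers entries exist.
def resolve_strategy(strategy, n_layers):
    return list(itertools.islice(itertools.cycle(strategy), n_layers))

per_layer = resolve_strategy(["pca", ("ica", "kde")], n_layers=5)
# per_layer: ['pca', ('ica', 'kde'), 'pca', ('ica', 'kde'), 'pca']
```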

Attributes

n_features_in_ : int
    Number of features seen during fit.
layers_ : list of RBIGLayer
    Fitted RBIG layers in application order.
tc_per_layer_ : list of float
    Total correlation of the data at each stage. Index 0 is the TC of the input data (before any layers); index k >= 1 is the TC after layer k.
log_det_train_ : np.ndarray of shape (n_samples,)
    Accumulated per-sample log-det-Jacobian over all layers, computed on the training data during fit.
X_transformed_ : np.ndarray of shape (n_samples, n_features)
    Training data after passing through all fitted layers.

Notes

Total correlation is defined as:

TC(X) = ∑ᵢ H(Xᵢ) − H(X)

where H(Xᵢ) is the marginal entropy of the i-th feature and H(X) is the joint entropy. For a fully Gaussianized, independent dataset, TC = 0.
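For intuition, TC has a closed form in the Gaussian case; the sketch below computes it analytically (this is not the library's estimator, which works from entropy estimates):

```python
import numpy as np

# For a multivariate Gaussian, sum_i H(X_i) - H(X) reduces to
# -0.5 * log det(correlation matrix): the (2*pi*e) terms in the
# marginal and joint entropies cancel.
def gaussian_total_correlation(cov):
    std = np.sqrt(np.diag(cov))
    corr = cov / np.outer(std, std)
    _, log_det = np.linalg.slogdet(corr)
    return -0.5 * log_det

rho = 0.8
cov = np.array([[1.0, rho], [rho, 1.0]])
tc = gaussian_total_correlation(cov)   # equals -0.5 * log(1 - rho**2)
```

For independent features the correlation matrix is the identity, its log-determinant is 0, and TC = 0 as stated above.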

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: From ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537–549. https://doi.org/10.1109/TNN.2011.2106511

Examples

>>> import numpy as np
>>> from rbig._src.model import AnnealedRBIG
>>> rng = np.random.default_rng(42)
>>> X = rng.standard_normal((300, 4))
>>> model = AnnealedRBIG(n_layers=20, rotation="pca")
>>> model.fit(X)
AnnealedRBIG(n_layers=20)
>>> Z = model.transform(X)
>>> Z.shape
(300, 4)
>>> model.score(X)  # mean log-likelihood in nats
-5.65...
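The sampling path (`sample` draws standard Gaussian latents and maps them back through the inverse flow) can be illustrated with a toy invertible linear map standing in for the fitted layers:

```python
import numpy as np

# Toy stand-in for the inverse flow: latents Z ~ N(0, I) mapped back
# through the inverse of a fixed linear "layer" f(x) = x @ W.
rng = np.random.default_rng(0)
W = np.array([[2.0, 0.0], [1.0, 1.0]])
Z = rng.standard_normal((20_000, 2))   # latent samples
X_new = Z @ np.linalg.inv(W)           # inverse transform back to data space

# The generated samples should have covariance (W @ W.T)^{-1}.
emp_cov = np.cov(X_new, rowvar=False)
target = np.linalg.inv(W @ W.T)
```

With the fitted model the same two steps are `Z = rng.standard_normal((n, model.n_features_in_))` followed by `model.inverse_transform(Z)`, which is exactly what `sample` does.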

Source code in rbig/_src/model.py
class AnnealedRBIG(TransformerMixin, BaseEstimator):
    """Rotation-Based Iterative Gaussianization (RBIG).

    RBIG is a density estimation and data transformation method that
    iteratively Gaussianizes multivariate data by alternating between:

    1. **Marginal Gaussianization**: mapping each feature to a Gaussian
       using its empirical CDF and the probit transform.
    2. **Rotation**: applying an orthogonal matrix (PCA or ICA) to
       de-correlate the Gaussianized features.

    The process repeats until the total correlation (TC) of the
    transformed data converges.  After fitting, the model represents a
    normalizing flow whose density is given by the change-of-variables
    formula:

        log p(x) = log p_Z(f(x)) + log|det J_f(x)|

    where ``f`` is the composition of all fitted layers and ``p_Z`` is a
    standard multivariate Gaussian.

    Parameters
    ----------
    n_layers : int, default=100
        Maximum number of RBIG layers to apply.  Early stopping via
        ``patience`` may halt training before this limit.
    rotation : str, default="pca"
        Rotation method: ``"pca"`` (PCA without whitening — orthogonal),
        ``"ica"`` (Independent Component Analysis), or ``"random"``
        (Haar-distributed orthogonal rotation).
    patience : int, default=10
        Number of consecutive layers showing a TC change smaller than
        ``tol`` before training stops early.  (Formerly ``zero_tolerance``,
        which is still accepted but deprecated.)
    tol : float or "auto", default=1e-5
        Convergence threshold for the per-layer change in total correlation:
        ``|TC(k) − TC(k−1)| < tol``.  When set to ``"auto"``, the tolerance
        is chosen adaptively based on the number of training samples using
        an empirically calibrated lookup table.
    random_state : int or None, default=None
        Seed for the random number generator used by stochastic components
        such as ICA or random rotations.
    strategy : list or None, default=None
        Optional per-layer override list.  Each entry may be a string
        (rotation name) or a ``(rotation_name, marginal_name)`` pair.
        Entries cycle if the list is shorter than ``n_layers``.
    verbose : bool or int, default=False
        Controls progress bar display.  ``False`` (or ``0``) disables all
        progress bars.  ``True`` (or ``1``) shows a progress bar for the
        ``fit`` loop.  ``2`` additionally shows progress bars for
        ``transform``, ``inverse_transform``, ``score_samples``, and
        ``jacobian``.

    Attributes
    ----------
    n_features_in_ : int
        Number of features seen during ``fit``.
    layers_ : list of RBIGLayer
        Fitted RBIG layers in application order.
    tc_per_layer_ : list of float
        Total correlation of the data at each stage.  Index 0 is the TC
        of the *input* data (before any layers); index *k* >= 1 is the TC
        after layer *k*.
    log_det_train_ : np.ndarray of shape (n_samples,)
        Accumulated per-sample log-det-Jacobian over all layers,
        computed on the training data during ``fit``.
    X_transformed_ : np.ndarray of shape (n_samples, n_features)
        Training data after passing through all fitted layers.

    Notes
    -----
    Total correlation is defined as:

        TC(X) = ∑ᵢ H(Xᵢ) − H(X)

    where H(Xᵢ) is the marginal entropy of the i-th feature and H(X) is
    the joint entropy.  For a fully Gaussianized, independent dataset,
    TC = 0.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    From ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537–549. https://doi.org/10.1109/TNN.2011.2106511

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.model import AnnealedRBIG
    >>> rng = np.random.default_rng(42)
    >>> X = rng.standard_normal((300, 4))
    >>> model = AnnealedRBIG(n_layers=20, rotation="pca")
    >>> model.fit(X)
    AnnealedRBIG(n_layers=20)
    >>> Z = model.transform(X)
    >>> Z.shape
    (300, 4)
    >>> model.score(X)  # mean log-likelihood in nats
    -5.65...
    """

    def __init__(
        self,
        n_layers: int = 100,
        rotation: str = "pca",
        patience: int = 10,
        tol: float | str = 1e-5,
        random_state: int | None = None,
        strategy: list | None = None,
        verbose: bool | int = False,
    ):
        self.n_layers = n_layers
        self.rotation = rotation
        self.patience = patience
        self.tol = tol
        self.random_state = random_state
        self.strategy = strategy
        self.verbose = verbose

    @property
    def zero_tolerance(self):
        """Deprecated alias for ``patience``."""
        import warnings

        warnings.warn(
            "zero_tolerance is deprecated, use patience instead",
            FutureWarning,
            stacklevel=2,
        )
        return self.patience

    @zero_tolerance.setter
    def zero_tolerance(self, value):
        import warnings

        warnings.warn(
            "zero_tolerance is deprecated, use patience instead",
            FutureWarning,
            stacklevel=2,
        )
        self.patience = value

    def fit(self, X: np.ndarray, y=None) -> AnnealedRBIG:
        """Fit the RBIG model by iteratively Gaussianizing X.

        At each layer k the algorithm:

        1. Builds a new :class:`RBIGLayer` with the configured marginal and
           rotation transforms.
        2. Fits the layer on the current working copy ``Xt``.
        3. Accumulates the per-sample log-det-Jacobian:
           ``log_det_train_ += log|det J_k(Xt)|``.
        4. Advances ``Xt`` through the layer: ``Xt = f_k(Xt)``.
        5. Measures residual total correlation: ``TC(Xt) = ∑ᵢ H(Xᵢ) − H(X)``.
        6. Stops early when TC has not changed by more than ``tol`` for
           ``patience`` consecutive layers.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.
        y : ignored
            Not used, present for sklearn pipeline compatibility.

        Returns
        -------
        self : AnnealedRBIG
            The fitted model.
        """
        X = validate_data(self, X)
        n_samples, n_features = X.shape
        if n_samples < 2:
            raise ValueError(
                f"RBIG requires at least 2 samples to estimate marginal CDFs, "
                f"got n_samples = {n_samples}."
            )
        self.n_features_in_ = n_features  # remember input dimensionality
        self.layers_: list[RBIGLayer] = []
        self.tc_per_layer_: list[float] = []

        # Validate and resolve tolerance
        if self.tol == "auto":
            tol = self._get_information_tolerance(n_samples)
        elif isinstance(self.tol, int | float):
            tol = float(self.tol)
        else:
            raise ValueError(f"tol must be a float or 'auto', got {self.tol!r}")
        self.tol_: float = tol  # store resolved tolerance for inspection

        Xt = X.copy()  # working copy; shape (n_samples, n_features)
        self.log_det_train_ = np.zeros(
            n_samples
        )  # accumulated log|det J|; shape (n_samples,)
        zero_count = 0  # consecutive non-improving layer counter

        # Record TC of the *input* data (before any layers).  This is
        # needed by total_correlation_reduction() which uses
        # tc_per_layer_[0] - tc_per_layer_[-1].
        self.tc_per_layer_.append(self._total_correlation(Xt))

        pbar = maybe_tqdm(
            range(self.n_layers),
            verbose=self.verbose,
            level=1,
            desc="Fitting RBIG",
            total=self.n_layers,
        )
        for i in pbar:
            # Build layer i with the appropriate marginal and rotation components
            layer = RBIGLayer(
                marginal=self._make_marginal(layer_index=i),
                rotation=self._make_rotation(layer_index=i),
            )
            layer.fit(Xt)
            # Accumulate log|det J_i(Xt)| before advancing Xt
            self.log_det_train_ += layer.log_det_jacobian(Xt)
            Xt = layer.transform(Xt)  # advance to next representation
            self.layers_.append(layer)

            # Measure residual total correlation: TC = sum_i H(Xi) - H(X)
            tc = self._total_correlation(Xt)
            self.tc_per_layer_.append(tc)

            if hasattr(pbar, "set_postfix"):
                postfix = {"TC": f"{tc:.4g}"}
                if i > 0:
                    delta = abs(self.tc_per_layer_[-2] - tc)
                    postfix["δTC"] = f"{delta:.2e}"
                pbar.set_postfix(postfix)

            if i > 0:
                # Check convergence: how much did TC improve this layer?
                delta = abs(self.tc_per_layer_[-2] - tc)
                if delta < tol:
                    zero_count += 1
                else:
                    zero_count = 0  # reset on any significant improvement

            # Stop early if TC has been flat for patience consecutive layers
            if zero_count >= self.patience:
                if hasattr(pbar, "total"):
                    pbar.total = i + 1
                    pbar.refresh()
                break

        # Store the fully transformed training data for efficient entropy estimation
        self.X_transformed_ = Xt  # shape (n_samples, n_features)
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Map X to the Gaussian latent space through all fitted layers.

        Applies each fitted :class:`RBIGLayer` in order:
        ``Z = fₖ(… f₂(f₁(x)) …)``.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data.

        Returns
        -------
        Z : np.ndarray of shape (n_samples, n_features)
            Data in the approximately Gaussian latent space.
        """
        check_is_fitted(self)
        Xt = validate_data(self, X, reset=False).copy()
        layers_iter = maybe_tqdm(
            self.layers_,
            verbose=self.verbose,
            level=2,
            desc="Transforming",
            total=len(self.layers_),
        )
        for layer in layers_iter:
            Xt = layer.transform(Xt)
        return Xt

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Map latent-space data back to the original input space.

        Applies layers in reverse order:
        ``x = f₁⁻¹(… fₖ₋₁⁻¹(fₖ⁻¹(z)) …)``.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in the latent (approximately Gaussian) space.

        Returns
        -------
        Xr : np.ndarray of shape (n_samples, n_features)
            Data recovered in the original input space.
        """
        check_is_fitted(self)
        Xt = validate_data(self, X, reset=False).copy()
        layers_iter = maybe_tqdm(
            reversed(self.layers_),
            verbose=self.verbose,
            level=2,
            desc="Inverse transforming",
            total=len(self.layers_),
        )
        for layer in layers_iter:
            Xt = layer.inverse_transform(Xt)
        return Xt

    def fit_transform(self, X: np.ndarray, y=None) -> np.ndarray:
        """Fit the model to X and return the latent-space representation.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.

        Returns
        -------
        Z : np.ndarray of shape (n_samples, n_features)
            Transformed data in the latent space.
        """
        return self.fit(X).transform(X)

    def score_samples(self, X: np.ndarray) -> np.ndarray:
        """Per-sample log-likelihood under the fitted density model.

        Uses the change-of-variables formula for normalizing flows:

            log p(x) = log p_Z(f(x)) + log|det J_f(x)|

        where ``p_Z = 𝒩(0, I)`` is the standard Gaussian base density,
        ``f`` is the composition of all fitted layers, and ``J_f(x)`` is
        the Jacobian of ``f`` at ``x``.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data points at which to evaluate the log-likelihood.

        Returns
        -------
        log_prob : np.ndarray of shape (n_samples,)
            Per-sample log-likelihood in nats.

        Notes
        -----
        The log-det-Jacobian is accumulated layer by layer to avoid
        recomputing intermediate representations:

            log|det J_f(x)| = ∑ₖ log|det J_fₖ(xₖ₋₁)|
        """
        check_is_fitted(self)
        X = validate_data(self, X, reset=False)
        Xt = X.copy()  # shape (n_samples, n_features)
        log_det_jac = np.zeros(X.shape[0])  # accumulator; shape (n_samples,)
        layers_iter = maybe_tqdm(
            self.layers_,
            verbose=self.verbose,
            level=2,
            desc="Scoring",
            total=len(self.layers_),
        )
        for layer in layers_iter:
            # Accumulate log|det Jₖ| before advancing through layer k
            log_det_jac += layer.log_det_jacobian(Xt)
            Xt = layer.transform(Xt)  # xₖ = fₖ(xₖ₋₁)
        # log p_Z(z) = sum_i log N(z_i; 0, 1); shape (n_samples,)
        log_pz = np.sum(stats.norm.logpdf(Xt), axis=1)
        # change-of-variables: log p(x) = log p_Z(f(x)) + log|det J_f(x)|
        return log_pz + log_det_jac

    def score(self, X: np.ndarray, y=None) -> float:
        """Mean log-likelihood of samples X under the fitted density.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data points to evaluate.
        y : ignored
            Not used, present for sklearn pipeline compatibility.

        Returns
        -------
        mean_log_prob : float
            Average per-sample log-likelihood in nats.
        """
        return float(np.mean(self.score_samples(X)))

    def entropy(self) -> float:
        """Differential entropy of the fitted distribution in nats.

        Estimated from the training data using:

            H(X) = −𝔼_X[log p(x)]

        The expectation is approximated by the sample mean over the training
        set.  The log-likelihoods are obtained via the efficient cached path
        :meth:`score_samples_raw_` which reuses pre-computed quantities from
        ``fit``.

        Returns
        -------
        h : float
            Estimated entropy in nats.  Unlike discrete entropy,
            differential entropy may be negative.

        Notes
        -----
        This is equivalent to ``-self.score(X_train)`` but avoids the cost
        of re-passing training data through all layers.
        """
        check_is_fitted(self)
        return float(-np.mean(self.score_samples_raw_()))

    def total_correlation_reduction(self) -> float:
        """Total correlation removed by RBIG (RBIG-way TC estimation).

        Uses the per-layer TC reduction approach from Laparra et al. (2011):

            TC(X) ≈ TC₀ − TC_K = ∑ₖ ΔTCₖ

        where TC₀ is the total correlation of the input and TC_K is the
        residual TC after K layers of Gaussianization.  When the model has
        converged, TC_K ≈ 0 and the result equals TC₀.

        Returns
        -------
        tc : float
            Estimated total correlation in nats.
        """
        check_is_fitted(self)
        return float(self.tc_per_layer_[0] - self.tc_per_layer_[-1])

    def entropy_reduction(self, X: np.ndarray) -> float:
        """Differential entropy via RBIG-way TC reduction.

        Uses the identity H(X) = Σ_d H(X_d) − TC(X) where marginal
        entropies are estimated via KDE and TC is obtained from the
        cumulative per-layer TC reduction (Laparra et al. 2011).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data whose entropy is estimated (typically the training data).

        Returns
        -------
        h : float
            Estimated differential entropy in nats.
        """
        check_is_fitted(self)
        from rbig._src.densities import marginal_entropy

        h_marginals = marginal_entropy(X)  # shape (n_features,)
        tc = self.total_correlation_reduction()
        return float(np.sum(h_marginals) - tc)

    def score_samples_raw_(self) -> np.ndarray:
        """Log-likelihood for the stored training data without recomputing layers.

        Reuses ``X_transformed_`` and ``log_det_train_`` cached during
        :meth:`fit`, so the cost is a single Gaussian log-pdf evaluation
        rather than a full forward pass through all layers.

        Returns
        -------
        log_prob : np.ndarray of shape (n_samples,)
            Per-sample log-likelihood of the training data in nats.
        """
        # log p_Z evaluated at the pre-computed transformed training data
        log_pz = np.sum(
            stats.norm.logpdf(self.X_transformed_), axis=1
        )  # shape (n_samples,)
        # add the accumulated log-det-Jacobian stored during fit
        return log_pz + self.log_det_train_

    def sample(self, n_samples: int, random_state: int | None = None) -> np.ndarray:
        """Generate samples from the learned distribution.

        Draws i.i.d. standard Gaussian samples in the latent space and maps
        them back to the data space via the inverse normalizing flow.

        Parameters
        ----------
        n_samples : int
            Number of samples to generate.
        random_state : int or None, optional
            Seed for the random number generator.  If ``None``, a random
            seed is used.

        Returns
        -------
        X_new : np.ndarray of shape (n_samples, n_features_in_)
            Samples in the original data space.
        """
        check_is_fitted(self)
        rng = np.random.default_rng(random_state)
        Z = rng.standard_normal((n_samples, self.n_features_in_))  # latent samples
        return self.inverse_transform(Z)

    def predict_proba(
        self,
        X: np.ndarray,
        domain: str = "input",
    ) -> np.ndarray | tuple[np.ndarray, np.ndarray]:
        """Return probability density estimates for X.

        Uses the change-of-variables formula via the full Jacobian matrix
        to compute the density in the requested domain.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data points to evaluate.
        domain : str, default="input"
            Which domain to return densities in:

            - ``"input"`` — density in the original data space:
              ``p(x) = p_Z(f(x)) · |det J_f(x)|``
            - ``"transform"`` — density in the Gaussian latent space:
              ``p_Z(f(x)) = ∏ᵢ φ(fᵢ(x))``
            - ``"both"`` — returns a tuple ``(p_input, p_transform)``

        Returns
        -------
        proba : np.ndarray of shape (n_samples,) or tuple
            Probability density estimates.  When ``domain="both"``, returns
            ``(p_input, p_transform)``.
        """
        check_is_fitted(self)
        X = validate_data(self, X, reset=False)
        jac, Xt = self.jacobian(X, return_X_transform=True)

        # Work in log-space for numerical stability
        log_p_transform = np.sum(stats.norm.logpdf(Xt), axis=1)

        if domain == "transform":
            p_transform = np.exp(log_p_transform)
            p_transform = np.where(np.isfinite(p_transform), p_transform, 0.0)
            return p_transform

        # Input-domain density via change of variables (log-space)
        _sign, log_abs_det = np.linalg.slogdet(jac)
        log_p_input = log_p_transform + log_abs_det
        p_input = np.exp(log_p_input)
        p_input = np.where(np.isfinite(p_input), p_input, 0.0)

        if domain == "input":
            return p_input
        if domain == "both":
            p_transform = np.exp(log_p_transform)
            p_transform = np.where(np.isfinite(p_transform), p_transform, 0.0)
            return p_input, p_transform
        raise ValueError(
            f"Unknown domain: {domain!r}. Use 'input', 'transform', or 'both'."
        )

    def jacobian(
        self,
        X: np.ndarray,
        return_X_transform: bool = False,
    ) -> np.ndarray | tuple[np.ndarray, np.ndarray]:
        """Compute the full Jacobian matrix of the RBIG transform.

        For each sample, returns the ``(n_features, n_features)`` Jacobian
        matrix ``df/dx`` of the composition of all fitted layers.  Uses the
        seed-dimension approach from the legacy implementation: for each input
        dimension ``idim``, a unit vector is propagated through the chain of
        per-feature marginal derivatives and rotation matrices.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data at which to evaluate the Jacobian.
        return_X_transform : bool, default False
            If True, also return the fully transformed data ``f(X)`` (computed
            as a side-effect of the Jacobian calculation).

        Returns
        -------
        jac : np.ndarray of shape (n_samples, n_features, n_features)
            Full Jacobian matrix per sample.  ``jac[n, i, j]`` is the partial
            derivative ``df_i/dx_j`` for the n-th sample.
        X_transformed : np.ndarray of shape (n_samples, n_features)
            Only returned when ``return_X_transform=True``.  The data after
            passing through all layers.
        """
        check_is_fitted(self)
        n_samples, n_features = X.shape

        # ── Forward pass: collect per-layer derivatives and rotation matrices ──
        derivs_per_layer = []  # each: (n_samples, n_features)
        rotmats_per_layer = []  # each: (n_features, n_features)

        Xt = X.copy()
        layers_iter = maybe_tqdm(
            self.layers_,
            verbose=self.verbose,
            level=2,
            desc="Jacobian (forward)",
            total=len(self.layers_),
        )
        for layer in layers_iter:
            if not hasattr(layer.marginal, "_per_feature_log_deriv"):
                raise NotImplementedError(
                    f"Jacobian computation requires a marginal with "
                    f"_per_feature_log_deriv(); "
                    f"{type(layer.marginal).__name__} does not support this."
                )
            # Per-feature marginal derivatives and transformed data in one pass
            log_d, Xt_marginal = layer.marginal._per_feature_log_deriv(
                Xt, return_transform=True
            )
            derivs_per_layer.append(np.exp(log_d))

            # Rotation matrix in row-vector convention: y = z @ R
            rot = self._extract_rotation_matrix(layer.rotation)
            rotmats_per_layer.append(rot)

            # Advance through rotation only
            Xt = layer.rotation.transform(Xt_marginal)

        # ── Seed-dimension loop: propagate unit vectors through the chain ──
        jac = np.zeros((n_samples, n_features, n_features))

        dims_iter = maybe_tqdm(
            range(n_features),
            verbose=self.verbose,
            level=2,
            desc="Jacobian (dims)",
            total=n_features,
        )
        for idim in dims_iter:
            # Initialize seed: unit vector in dimension idim
            XX = np.zeros((n_samples, n_features))
            XX[:, idim] = 1.0

            for derivs, R in zip(derivs_per_layer, rotmats_per_layer, strict=True):
                # Chain rule: XX_new = diag(derivs) @ XX @ R
                XX = (derivs * XX) @ R

            jac[:, :, idim] = XX

        if return_X_transform:
            return jac, Xt
        return jac

    @staticmethod
    def _extract_rotation_matrix(rotation) -> np.ndarray:
        """Extract the effective rotation matrix in row-vector convention.

        For PCA with whitening the effective matrix is
        ``components_.T / sqrt(explained_variance_)`` so that
        ``y = (x - mu) @ R``.

        Parameters
        ----------
        rotation : BaseTransform
            A fitted rotation object (PCARotation, ICARotation, etc.).

        Returns
        -------
        R : np.ndarray of shape (n_features, n_features)
            Rotation matrix such that ``y = x @ R`` (ignoring mean shift).
        """
        from rbig._src.rotation import ICARotation, PCARotation

        if isinstance(rotation, PCARotation):
            R = rotation.pca_.components_.T.copy()
            if rotation.whiten:
                R /= np.sqrt(rotation.pca_.explained_variance_)[np.newaxis, :]
            return R

        if isinstance(rotation, ICARotation):
            # ICA unmixing: W = components_, transform is x @ W.T
            if hasattr(rotation, "K_") and rotation.K_ is not None:
                # Picard path: y = (x @ K.T) @ W.T
                return rotation.K_.T @ rotation.W_.T
            return rotation.ica_.components_.T.copy()

        # Generic fallback: try to get rotation_matrix_ attribute.
        # These rotations apply X @ rotation_matrix_.T, so transpose
        # to match the y = x @ R convention used by PCA/ICA above.
        if hasattr(rotation, "rotation_matrix_"):
            return rotation.rotation_matrix_.T.copy()

        raise TypeError(
            f"Cannot extract rotation matrix from {type(rotation).__name__}. "
            f"Jacobian computation requires PCARotation, ICARotation, or an "
            f"object with a rotation_matrix_ attribute."
        )

    def _make_rotation(self, layer_index: int = 0):
        """Instantiate the rotation component for a given layer.

        Parameters
        ----------
        layer_index : int, default=0
            Index of the layer being constructed.  Used when cycling through
            a ``strategy`` list.

        Returns
        -------
        rotation : RotationBijector
            An unfitted rotation bijector instance.
        """
        if self.strategy is not None:
            # cycle through the strategy list to select rotation for this layer
            idx = layer_index % len(self.strategy)
            entry = self.strategy[idx]
            rotation_name = entry[0] if isinstance(entry, list | tuple) else entry
            return self._get_component(rotation_name, "rotation", layer_index)
        if self.rotation == "pca":
            return PCARotation(whiten=False)
        elif self.rotation == "ica":
            from rbig._src.rotation import ICARotation

            return ICARotation(random_state=self.random_state)
        elif self.rotation == "random":
            from rbig._src.rotation import RandomRotation

            seed = (self.random_state or 0) + layer_index
            return RandomRotation(random_state=seed)
        else:
            raise ValueError(
                f"Unknown rotation: {self.rotation}. Use 'pca', 'ica', or 'random'."
            )

    def _make_marginal(self, layer_index: int = 0):
        """Instantiate the marginal Gaussianization component for a given layer.

        Parameters
        ----------
        layer_index : int, default=0
            Index of the layer being constructed.  Used when cycling through
            a ``strategy`` list.

        Returns
        -------
        marginal : MarginalBijector
            An unfitted marginal Gaussianizer instance.
        """
        if self.strategy is not None:
            # cycle through the strategy list to select marginal for this layer
            idx = layer_index % len(self.strategy)
            entry = self.strategy[idx]
            marginal_name = (
                entry[1] if isinstance(entry, list | tuple) else "gaussianize"
            )
            return self._get_component(marginal_name, "marginal", layer_index)
        return MarginalGaussianize()

    def _get_component(self, name: str, kind: str, seed: int = 0):
        """Instantiate a rotation or marginal component by name.

        Parameters
        ----------
        name : str
            Component name, e.g. ``"pca"``, ``"ica"``, ``"gaussianize"``.
        kind : str
            Either ``"rotation"`` or ``"marginal"``.
        seed : int, default=0
            Layer index added to ``random_state`` to vary seeds per layer.

        Returns
        -------
        component : Bijector
            An unfitted bijector of the requested kind.
        """
        rng_seed = (self.random_state or 0) + seed
        if kind == "rotation":
            return self._make_rotation_by_name(name, rng_seed)
        return self._make_marginal_by_name(name, rng_seed)

    def _make_rotation_by_name(self, name: str, seed: int):
        """Instantiate a rotation bijector from its string name.

        Parameters
        ----------
        name : str
            One of ``"pca"``, ``"ica"``, or ``"random"``.
        seed : int
            Random seed for stochastic rotations.

        Returns
        -------
        rotation : RotationBijector
            The corresponding unfitted rotation instance.

        Raises
        ------
        ValueError
            If ``name`` is not a recognised rotation type.
        """
        if name == "pca":
            return PCARotation(whiten=False)
        if name == "ica":
            from rbig._src.rotation import ICARotation

            return ICARotation(random_state=seed)
        if name == "random":
            from rbig._src.rotation import RandomRotation

            return RandomRotation(random_state=seed)
        raise ValueError(f"Unknown rotation: {name!r}. Use 'pca', 'ica', or 'random'.")

    def _make_marginal_by_name(self, name: str, seed: int):
        """Instantiate a marginal Gaussianizer from its string name.

        Parameters
        ----------
        name : str
            One of ``"gaussianize"`` / ``"empirical"``, ``"quantile"``,
            ``"kde"``, ``"gmm"``, or ``"spline"``.
        seed : int
            Random seed for stochastic marginal estimators.

        Returns
        -------
        marginal : MarginalBijector
            The corresponding unfitted marginal Gaussianizer instance.

        Raises
        ------
        ValueError
            If ``name`` is not a recognised marginal type.
        """
        if name in ("gaussianize", "empirical", None):
            return MarginalGaussianize()
        if name == "quantile":
            from rbig._src.marginal import QuantileGaussianizer

            return QuantileGaussianizer(random_state=seed)
        if name == "kde":
            from rbig._src.marginal import KDEGaussianizer

            return KDEGaussianizer()
        if name == "gmm":
            from rbig._src.marginal import GMMGaussianizer

            return GMMGaussianizer(random_state=seed)
        if name == "spline":
            from rbig._src.marginal import SplineGaussianizer

            return SplineGaussianizer()
        raise ValueError(
            f"Unknown marginal: {name!r}. Use 'gaussianize', 'quantile', 'kde', 'gmm', or 'spline'."
        )

    @staticmethod
    def _get_information_tolerance(n_samples: int) -> float:
        """Compute a sample-size-adaptive convergence tolerance.

        Interpolates from an empirically calibrated lookup table mapping
        dataset size to an appropriate TC-change threshold.  Larger datasets
        can resolve finer changes in total correlation, so the tolerance
        decreases with sample count.

        Parameters
        ----------
        n_samples : int
            Number of training samples.

        Returns
        -------
        tol : float
            Adaptive tolerance value.
        """
        from scipy.interpolate import interp1d

        xxx = np.logspace(2, 8, 7)
        yyy = [0.1571, 0.0468, 0.0145, 0.0046, 0.0014, 0.0001, 0.00001]
        return float(interp1d(xxx, yyy, fill_value="extrapolate")(n_samples))

    @staticmethod
    def _calculate_negentropy(X: np.ndarray) -> np.ndarray:
        """Negentropy of each marginal: J(xᵢ) = H(Gauss) − H(xᵢ) ≥ 0.

        Negentropy measures how far a distribution is from Gaussian.  It is
        zero if and only if the distribution is Gaussian.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data whose per-feature negentropy is computed.

        Returns
        -------
        neg_entropy : np.ndarray of shape (n_features,)
            Non-negative negentropy for each feature dimension.

        Notes
        -----
        The negentropy is computed as:

            J(xᵢ) = H(𝒩(μᵢ, σᵢ²)) − H(xᵢ)

        where H(𝒩(μ, σ²)) = ½(1 + log(2πσ²)) is the Gaussian entropy with
        the same variance.
        """
        from rbig._src.densities import marginal_entropy

        # Gaussian entropy for a Gaussian with the same variance: 0.5*(1 + log(2*pi*var))
        gauss_h = 0.5 * (1 + np.log(2 * np.pi)) + 0.5 * np.log(np.var(X, axis=0))
        marg_h = marginal_entropy(X)  # empirical marginal entropy per feature
        return gauss_h - marg_h  # shape (n_features,); always >= 0

    @staticmethod
    def _total_correlation(X: np.ndarray) -> float:
        """Total correlation of X: TC(X) = ∑ᵢ H(Xᵢ) − H(X).

        Total correlation (also called multi-information) quantifies the
        statistical dependence among all features jointly.  It equals zero
        when all features are mutually independent.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data whose total correlation is measured.

        Returns
        -------
        tc : float
            Total correlation in nats.  Non-negative by the subadditivity of
            entropy.

        Notes
        -----
        The joint entropy ``H(X)`` is estimated under a Gaussian assumption
        (using the log-determinant of the covariance matrix), while the
        marginal entropies ``H(Xᵢ)`` are estimated empirically.
        """
        from rbig._src.densities import joint_entropy_gaussian, marginal_entropy

        marg_h = marginal_entropy(X)  # per-feature entropy; shape (n_features,)
        joint_h = joint_entropy_gaussian(X)  # Gaussian approximation to joint entropy
        return float(np.sum(marg_h) - joint_h)
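When both the joint and the marginals are taken to be Gaussian, the TC estimator above has a closed form that makes a handy sanity check: for a bivariate Gaussian with correlation ρ it reduces to −½·log(1 − ρ²). A standalone sketch in pure numpy (`gaussian_tc` is a hypothetical helper using analytic Gaussian entropies in place of the library's empirical `marginal_entropy`):

```python
import numpy as np

def gaussian_tc(cov):
    """Total correlation of a multivariate Gaussian in nats:
    sum of marginal entropies minus the joint entropy, in closed form."""
    var = np.diag(cov)
    # H(X_i) = 0.5 * log(2*pi*e*var_i) for each Gaussian marginal
    marg_h = 0.5 * np.log(2 * np.pi * np.e * var)
    # H(X) = 0.5 * log((2*pi*e)^d * det(cov)) for the joint Gaussian
    d = cov.shape[0]
    joint_h = 0.5 * (d * np.log(2 * np.pi * np.e) + np.linalg.slogdet(cov)[1])
    return float(np.sum(marg_h) - joint_h)

rho = 0.8
cov = np.array([[1.0, rho], [rho, 1.0]])
# For a bivariate Gaussian, TC = -0.5 * log(1 - rho^2)
print(gaussian_tc(cov), -0.5 * np.log(1 - rho**2))
```

For an identity covariance the TC is zero, as mutual independence requires.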

zero_tolerance property writable

Deprecated alias for patience.

fit(X, y=None)

Fit the RBIG model by iteratively Gaussianizing X.

At each layer k the algorithm:

  1. Builds a new RBIGLayer with the configured marginal and rotation transforms.
  2. Fits the layer on the current working copy Xt.
  3. Accumulates the per-sample log-det-Jacobian: log_det_train_ += log|det J_k(Xt)|.
  4. Advances Xt through the layer: Xt = f_k(Xt).
  5. Measures residual total correlation: TC(Xt) = ∑ᵢ H(Xᵢ) − H(X).
  6. Stops early when TC has not changed by more than tol for patience consecutive layers.
Parameters

X : np.ndarray of shape (n_samples, n_features)
    Training data.
y : ignored
    Not used, present for sklearn pipeline compatibility.

Returns

self : AnnealedRBIG The fitted model.

Source code in rbig/_src/model.py
def fit(self, X: np.ndarray, y=None) -> AnnealedRBIG:
    """Fit the RBIG model by iteratively Gaussianizing X.

    At each layer k the algorithm:

    1. Builds a new :class:`RBIGLayer` with the configured marginal and
       rotation transforms.
    2. Fits the layer on the current working copy ``Xt``.
    3. Accumulates the per-sample log-det-Jacobian:
       ``log_det_train_ += log|det J_k(Xt)|``.
    4. Advances ``Xt`` through the layer: ``Xt = f_k(Xt)``.
    5. Measures residual total correlation: ``TC(Xt) = ∑ᵢ H(Xᵢ) − H(X)``.
    6. Stops early when TC has not changed by more than ``tol`` for
       ``patience`` consecutive layers.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.
    y : ignored
        Not used, present for sklearn pipeline compatibility.

    Returns
    -------
    self : AnnealedRBIG
        The fitted model.
    """
    X = validate_data(self, X)
    n_samples, n_features = X.shape
    if n_samples < 2:
        raise ValueError(
            f"RBIG requires at least 2 samples to estimate marginal CDFs, "
            f"got n_samples = {n_samples}."
        )
    self.n_features_in_ = n_features  # remember input dimensionality
    self.layers_: list[RBIGLayer] = []
    self.tc_per_layer_: list[float] = []

    # Validate and resolve tolerance
    if self.tol == "auto":
        tol = self._get_information_tolerance(n_samples)
    elif isinstance(self.tol, int | float):
        tol = float(self.tol)
    else:
        raise ValueError(f"tol must be a float or 'auto', got {self.tol!r}")
    self.tol_: float = tol  # store resolved tolerance for inspection

    Xt = X.copy()  # working copy; shape (n_samples, n_features)
    self.log_det_train_ = np.zeros(
        n_samples
    )  # accumulated log|det J|; shape (n_samples,)
    zero_count = 0  # consecutive non-improving layer counter

    # Record TC of the *input* data (before any layers).  This is
    # needed by total_correlation_reduction() which uses
    # tc_per_layer_[0] - tc_per_layer_[-1].
    self.tc_per_layer_.append(self._total_correlation(Xt))

    pbar = maybe_tqdm(
        range(self.n_layers),
        verbose=self.verbose,
        level=1,
        desc="Fitting RBIG",
        total=self.n_layers,
    )
    for i in pbar:
        # Build layer i with the appropriate marginal and rotation components
        layer = RBIGLayer(
            marginal=self._make_marginal(layer_index=i),
            rotation=self._make_rotation(layer_index=i),
        )
        layer.fit(Xt)
        # Accumulate log|det J_i(Xt)| before advancing Xt
        self.log_det_train_ += layer.log_det_jacobian(Xt)
        Xt = layer.transform(Xt)  # advance to next representation
        self.layers_.append(layer)

        # Measure residual total correlation: TC = sum_i H(Xi) - H(X)
        tc = self._total_correlation(Xt)
        self.tc_per_layer_.append(tc)

        if hasattr(pbar, "set_postfix"):
            postfix = {"TC": f"{tc:.4g}"}
            if i > 0:
                delta = abs(self.tc_per_layer_[-2] - tc)
                postfix["δTC"] = f"{delta:.2e}"
            pbar.set_postfix(postfix)

        if i > 0:
            # Check convergence: how much did TC improve this layer?
            delta = abs(self.tc_per_layer_[-2] - tc)
            if delta < tol:
                zero_count += 1
            else:
                zero_count = 0  # reset on any significant improvement

        # Stop early if TC has been flat for patience consecutive layers
        if zero_count >= self.patience:
            if hasattr(pbar, "total"):
                pbar.total = i + 1
                pbar.refresh()
            break

    # Store the fully transformed training data for efficient entropy estimation
    self.X_transformed_ = Xt  # shape (n_samples, n_features)
    return self
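The early-stopping rule in the fit loop can be isolated: the counter increments whenever the TC change falls below `tol`, resets on any significant improvement, and fitting halts once it reaches `patience`. A minimal standalone sketch of that logic (`stopping_layer` and the `tc_trace` values are hypothetical, not produced by the library):

```python
def stopping_layer(tc_trace, tol=1e-5, patience=10):
    """Return the index in tc_trace at which training would stop,
    or None if the patience criterion is never met."""
    zero_count = 0  # consecutive non-improving layer counter
    for i in range(1, len(tc_trace)):
        delta = abs(tc_trace[i - 1] - tc_trace[i])
        if delta < tol:
            zero_count += 1
        else:
            zero_count = 0  # reset on any significant improvement
        if zero_count >= patience:
            return i
    return None

# TC drops quickly, then flattens; patience=3 triggers after three flat deltas
trace = [2.0, 1.0, 0.5, 0.5, 0.5, 0.5, 0.5]
print(stopping_layer(trace, tol=1e-5, patience=3))
```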

transform(X)

Map X to the Gaussian latent space through all fitted layers.

Applies each fitted RBIGLayer in order: Z = fₖ(… f₂(f₁(x)) …).

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data.

Returns

Z : np.ndarray of shape (n_samples, n_features) Data in the approximately Gaussian latent space.

Source code in rbig/_src/model.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Map X to the Gaussian latent space through all fitted layers.

    Applies each fitted :class:`RBIGLayer` in order:
    ``Z = fₖ(… f₂(f₁(x)) …)``.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data.

    Returns
    -------
    Z : np.ndarray of shape (n_samples, n_features)
        Data in the approximately Gaussian latent space.
    """
    check_is_fitted(self)
    Xt = validate_data(self, X, reset=False).copy()
    layers_iter = maybe_tqdm(
        self.layers_,
        verbose=self.verbose,
        level=2,
        desc="Transforming",
        total=len(self.layers_),
    )
    for layer in layers_iter:
        Xt = layer.transform(Xt)
    return Xt

inverse_transform(X)

Map latent-space data back to the original input space.

Applies layers in reverse order: x = f₁⁻¹(… fₖ₋₁⁻¹(fₖ⁻¹(z)) …).

Parameters

X : np.ndarray of shape (n_samples, n_features) Data in the latent (approximately Gaussian) space.

Returns

Xr : np.ndarray of shape (n_samples, n_features) Data recovered in the original input space.

Source code in rbig/_src/model.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Map latent-space data back to the original input space.

    Applies layers in reverse order:
    ``x = f₁⁻¹(… fₖ₋₁⁻¹(fₖ⁻¹(z)) …)``.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in the latent (approximately Gaussian) space.

    Returns
    -------
    Xr : np.ndarray of shape (n_samples, n_features)
        Data recovered in the original input space.
    """
    check_is_fitted(self)
    Xt = validate_data(self, X, reset=False).copy()
    layers_iter = maybe_tqdm(
        reversed(self.layers_),
        verbose=self.verbose,
        level=2,
        desc="Inverse transforming",
        total=len(self.layers_),
    )
    for layer in layers_iter:
        Xt = layer.inverse_transform(Xt)
    return Xt
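The transform/inverse pairing relies on applying the layers in reverse order with each layer inverted. A toy illustration of that composition structure (the `AffineLayer` stand-ins are hypothetical invertible maps, not RBIG layers):

```python
import numpy as np

class AffineLayer:
    """Toy invertible layer: y = (x - shift) @ R with orthogonal R."""
    def __init__(self, R, shift):
        self.R, self.shift = R, shift
    def transform(self, X):
        return (X - self.shift) @ self.R
    def inverse_transform(self, Y):
        # R is orthogonal, so its inverse is R.T
        return Y @ self.R.T + self.shift

rng = np.random.default_rng(0)
layers = []
for _ in range(5):
    Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))  # random orthogonal matrix
    layers.append(AffineLayer(Q, rng.standard_normal(3)))

X = rng.standard_normal((10, 3))
Z = X.copy()
for layer in layers:            # forward: Z = f_k(... f_2(f_1(X)) ...)
    Z = layer.transform(Z)
Xr = Z.copy()
for layer in reversed(layers):  # inverse: undo f_k first, f_1 last
    Xr = layer.inverse_transform(Xr)
print(np.allclose(X, Xr))
```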

fit_transform(X, y=None)

Fit the model to X and return the latent-space representation.

Parameters

X : np.ndarray of shape (n_samples, n_features) Training data.

Returns

Z : np.ndarray of shape (n_samples, n_features) Transformed data in the latent space.

Source code in rbig/_src/model.py
def fit_transform(self, X: np.ndarray, y=None) -> np.ndarray:
    """Fit the model to X and return the latent-space representation.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.

    Returns
    -------
    Z : np.ndarray of shape (n_samples, n_features)
        Transformed data in the latent space.
    """
    return self.fit(X).transform(X)

score_samples(X)

Per-sample log-likelihood under the fitted density model.

Uses the change-of-variables formula for normalizing flows:

log p(x) = log p_Z(f(x)) + log|det J_f(x)|

where p_Z = 𝒩(0, I) is the standard Gaussian base density, f is the composition of all fitted layers, and J_f(x) is the Jacobian of f at x.

Parameters

X : np.ndarray of shape (n_samples, n_features) Data points at which to evaluate the log-likelihood.

Returns

log_prob : np.ndarray of shape (n_samples,) Per-sample log-likelihood in nats.

Notes

The log-det-Jacobian is accumulated layer by layer to avoid recomputing intermediate representations:

log|det J_f(x)| = ∑ₖ log|det J_fₖ(xₖ₋₁)|
Source code in rbig/_src/model.py
def score_samples(self, X: np.ndarray) -> np.ndarray:
    """Per-sample log-likelihood under the fitted density model.

    Uses the change-of-variables formula for normalizing flows:

        log p(x) = log p_Z(f(x)) + log|det J_f(x)|

    where ``p_Z = 𝒩(0, I)`` is the standard Gaussian base density,
    ``f`` is the composition of all fitted layers, and ``J_f(x)`` is
    the Jacobian of ``f`` at ``x``.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data points at which to evaluate the log-likelihood.

    Returns
    -------
    log_prob : np.ndarray of shape (n_samples,)
        Per-sample log-likelihood in nats.

    Notes
    -----
    The log-det-Jacobian is accumulated layer by layer to avoid
    recomputing intermediate representations:

        log|det J_f(x)| = ∑ₖ log|det J_fₖ(xₖ₋₁)|
    """
    check_is_fitted(self)
    X = validate_data(self, X, reset=False)
    Xt = X.copy()  # shape (n_samples, n_features)
    log_det_jac = np.zeros(X.shape[0])  # accumulator; shape (n_samples,)
    layers_iter = maybe_tqdm(
        self.layers_,
        verbose=self.verbose,
        level=2,
        desc="Scoring",
        total=len(self.layers_),
    )
    for layer in layers_iter:
        # Accumulate log|det Jₖ| before advancing through layer k
        log_det_jac += layer.log_det_jacobian(Xt)
        Xt = layer.transform(Xt)  # xₖ = fₖ(xₖ₋₁)
    # log p_Z(z) = sum_i log N(z_i; 0, 1); shape (n_samples,)
    log_pz = np.sum(stats.norm.logpdf(Xt), axis=1)
    # change-of-variables: log p(x) = log p_Z(f(x)) + log|det J_f(x)|
    return log_pz + log_det_jac
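The change-of-variables identity used here can be sanity-checked with a single affine layer f(x) = (x − μ)/σ, whose log|det J| is the constant −log σ and which maps N(μ, σ²) exactly to N(0, 1). A standalone sketch (hypothetical μ, σ; not using the library):

```python
import numpy as np
from scipy import stats

mu, sigma = 2.0, 1.5
x = np.linspace(-3.0, 7.0, 11)

z = (x - mu) / sigma              # f(x): Gaussianize
log_det_jac = -np.log(sigma)      # log|df/dx|, constant for an affine map
log_px = stats.norm.logpdf(z) + log_det_jac

# Must match the N(mu, sigma^2) log-density evaluated directly
print(np.allclose(log_px, stats.norm.logpdf(x, loc=mu, scale=sigma)))
```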

score(X, y=None)

Mean log-likelihood of samples X under the fitted density.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Data points to evaluate.
y : ignored
    Not used, present for sklearn pipeline compatibility.

Returns

mean_log_prob : float Average per-sample log-likelihood in nats.

Source code in rbig/_src/model.py
def score(self, X: np.ndarray, y=None) -> float:
    """Mean log-likelihood of samples X under the fitted density.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data points to evaluate.
    y : ignored
        Not used, present for sklearn pipeline compatibility.

    Returns
    -------
    mean_log_prob : float
        Average per-sample log-likelihood in nats.
    """
    return float(np.mean(self.score_samples(X)))

entropy()

Differential entropy of the fitted distribution in nats.

Estimated from the training data using:

H(X) = −𝔼_X[log p(x)]

The expectation is approximated by the sample mean over the training set. The log-likelihoods are obtained via the efficient cached path score_samples_raw_, which reuses pre-computed quantities from fit.

Returns

h : float
    Estimated differential entropy in nats. Unlike discrete entropy, differential entropy can be negative.

Notes

This is equivalent to -self.score(X_train) but avoids the cost of re-passing training data through all layers.

Source code in rbig/_src/model.py
def entropy(self) -> float:
    """Differential entropy of the fitted distribution in nats.

    Estimated from the training data using:

        H(X) = −𝔼_X[log p(x)]

    The expectation is approximated by the sample mean over the training
    set.  The log-likelihoods are obtained via the efficient cached path
    :meth:`score_samples_raw_` which reuses pre-computed quantities from
    ``fit``.

    Returns
    -------
    h : float
        Estimated differential entropy in nats.  Unlike discrete
        entropy, differential entropy can be negative.

    Notes
    -----
    This is equivalent to ``-self.score(X_train)`` but avoids the cost
    of re-passing training data through all layers.
    """
    check_is_fitted(self)
    return float(-np.mean(self.score_samples_raw_()))
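The Monte Carlo estimator H(X) ≈ −mean(log p(xᵢ)) behind this method can be checked on a case with a known answer: for N(0, 1) the entropy is ½(1 + log 2π) ≈ 1.4189 nats. A standalone sketch, not using the library:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.standard_normal(200_000)

h_mc = -np.mean(stats.norm.logpdf(x))  # Monte Carlo: H = -E[log p(x)]
h_true = float(stats.norm().entropy()) # closed form: 0.5 * (1 + log(2*pi))
print(h_mc, h_true)
```

With 200,000 samples the two agree to roughly two decimal places.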

total_correlation_reduction()

Total correlation removed by RBIG (RBIG-way TC estimation).

Uses the per-layer TC reduction approach from Laparra et al. (2011):

TC(X) = TC₀ − TCₖ = Σₖ ΔTCₖ

where TC₀ is the total correlation of the input and TCₖ is the residual TC after k layers of Gaussianization. When the model has converged, TCₖ ≈ 0 and the result equals TC₀.

Returns

tc : float Estimated total correlation in nats.

Source code in rbig/_src/model.py
def total_correlation_reduction(self) -> float:
    """Total correlation removed by RBIG (RBIG-way TC estimation).

    Uses the per-layer TC reduction approach from Laparra et al. (2011):

        TC(X) = TC₀ − TCₖ = Σₖ ΔTCₖ

    where TC₀ is the total correlation of the input and TCₖ is the
    residual TC after k layers of Gaussianization.  When the model has
    converged, TCₖ ≈ 0 and the result equals TC₀.

    Returns
    -------
    tc : float
        Estimated total correlation in nats.
    """
    check_is_fitted(self)
    return float(self.tc_per_layer_[0] - self.tc_per_layer_[-1])

entropy_reduction(X)

Differential entropy via RBIG-way TC reduction.

Uses the identity H(X) = Σ_d H(X_d) − TC(X) where marginal entropies are estimated via KDE and TC is obtained from the cumulative per-layer TC reduction (Laparra et al. 2011).

Parameters

X : np.ndarray of shape (n_samples, n_features) Data whose entropy is estimated (typically the training data).

Returns

h : float Estimated differential entropy in nats.

Source code in rbig/_src/model.py
def entropy_reduction(self, X: np.ndarray) -> float:
    """Differential entropy via RBIG-way TC reduction.

    Uses the identity H(X) = Σ_d H(X_d) − TC(X) where marginal
    entropies are estimated via KDE and TC is obtained from the
    cumulative per-layer TC reduction (Laparra et al. 2011).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data whose entropy is estimated (typically the training data).

    Returns
    -------
    h : float
        Estimated differential entropy in nats.
    """
    check_is_fitted(self)
    from rbig._src.densities import marginal_entropy

    h_marginals = marginal_entropy(X)  # shape (n_features,)
    tc = self.total_correlation_reduction()
    return float(np.sum(h_marginals) - tc)
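The identity H(X) = Σ_d H(X_d) − TC(X) used here is exact, and for a correlated Gaussian every term has a closed form, so the identity can be verified analytically (a sketch with unit-variance marginals; hypothetical ρ):

```python
import numpy as np

rho = 0.6
cov = np.array([[1.0, rho], [rho, 1.0]])

# Marginal Gaussian entropies and the analytic TC of a bivariate Gaussian
h_marg = 0.5 * np.log(2 * np.pi * np.e * np.diag(cov))
tc = -0.5 * np.log(1 - rho**2)

# Joint entropy via the identity vs. its direct closed form
h_joint = np.sum(h_marg) - tc
h_direct = 0.5 * (2 * np.log(2 * np.pi * np.e) + np.linalg.slogdet(cov)[1])
print(np.isclose(h_joint, h_direct))
```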

score_samples_raw_()

Log-likelihood for the stored training data without recomputing layers.

Reuses X_transformed_ and log_det_train_ cached during fit, so the cost is a single Gaussian log-pdf evaluation rather than a full forward pass through all layers.

Returns

log_prob : np.ndarray of shape (n_samples,) Per-sample log-likelihood of the training data in nats.

Source code in rbig/_src/model.py
def score_samples_raw_(self) -> np.ndarray:
    """Log-likelihood for the stored training data without recomputing layers.

    Reuses ``X_transformed_`` and ``log_det_train_`` cached during
    :meth:`fit`, so the cost is a single Gaussian log-pdf evaluation
    rather than a full forward pass through all layers.

    Returns
    -------
    log_prob : np.ndarray of shape (n_samples,)
        Per-sample log-likelihood of the training data in nats.
    """
    # log p_Z evaluated at the pre-computed transformed training data
    log_pz = np.sum(
        stats.norm.logpdf(self.X_transformed_), axis=1
    )  # shape (n_samples,)
    # add the accumulated log-det-Jacobian stored during fit
    return log_pz + self.log_det_train_

sample(n_samples, random_state=None)

Generate samples from the learned distribution.

Draws i.i.d. standard Gaussian samples in the latent space and maps them back to the data space via the inverse normalizing flow.

Parameters

n_samples : int
    Number of samples to generate.
random_state : int or None, optional
    Seed for the random number generator. If None, a random seed is used.

Returns

X_new : np.ndarray of shape (n_samples, n_features_in_) Samples in the original data space.

Source code in rbig/_src/model.py
def sample(self, n_samples: int, random_state: int | None = None) -> np.ndarray:
    """Generate samples from the learned distribution.

    Draws i.i.d. standard Gaussian samples in the latent space and maps
    them back to the data space via the inverse normalizing flow.

    Parameters
    ----------
    n_samples : int
        Number of samples to generate.
    random_state : int or None, optional
        Seed for the random number generator.  If ``None``, a random
        seed is used.

    Returns
    -------
    X_new : np.ndarray of shape (n_samples, n_features_in_)
        Samples in the original data space.
    """
    check_is_fitted(self)
    rng = np.random.default_rng(random_state)
    Z = rng.standard_normal((n_samples, self.n_features_in_))  # latent samples
    return self.inverse_transform(Z)
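Sampling draws z ~ N(0, I) in the latent space and pushes it through the inverse flow. With the single affine layer f(x) = (x − μ)/σ as a toy flow, the inverse is x = μ + σz, so samples land in N(μ, σ²) (standalone sketch, hypothetical parameters):

```python
import numpy as np

mu, sigma = 2.0, 1.5
rng = np.random.default_rng(42)

Z = rng.standard_normal(100_000)  # latent samples from the base density
X_new = mu + sigma * Z            # inverse of f(x) = (x - mu) / sigma

print(X_new.mean(), X_new.std())  # close to mu and sigma respectively
```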

predict_proba(X, domain='input')

Return probability density estimates for X.

Uses the change-of-variables formula via the full Jacobian matrix to compute the density in the requested domain.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Data points to evaluate.
domain : str, default="input"
    Which domain to return densities in:

- ``"input"`` — density in the original data space:
  ``p(x) = p_Z(f(x)) · |det J_f(x)|``
- ``"transform"`` — density in the Gaussian latent space:
  ``p_Z(f(x)) = ∏ᵢ φ(fᵢ(x))``
- ``"both"`` — returns a tuple ``(p_input, p_transform)``
Returns

proba : np.ndarray of shape (n_samples,) or tuple Probability density estimates. When domain="both", returns (p_input, p_transform).

Source code in rbig/_src/model.py
def predict_proba(
    self,
    X: np.ndarray,
    domain: str = "input",
) -> np.ndarray | tuple[np.ndarray, np.ndarray]:
    """Return probability density estimates for X.

    Uses the change-of-variables formula via the full Jacobian matrix
    to compute the density in the requested domain.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data points to evaluate.
    domain : str, default="input"
        Which domain to return densities in:

        - ``"input"`` — density in the original data space:
          ``p(x) = p_Z(f(x)) · |det J_f(x)|``
        - ``"transform"`` — density in the Gaussian latent space:
          ``p_Z(f(x)) = ∏ᵢ φ(fᵢ(x))``
        - ``"both"`` — returns a tuple ``(p_input, p_transform)``

    Returns
    -------
    proba : np.ndarray of shape (n_samples,) or tuple
        Probability density estimates.  When ``domain="both"``, returns
        ``(p_input, p_transform)``.
    """
    check_is_fitted(self)
    X = validate_data(self, X, reset=False)
    jac, Xt = self.jacobian(X, return_X_transform=True)

    # Work in log-space for numerical stability
    log_p_transform = np.sum(stats.norm.logpdf(Xt), axis=1)

    if domain == "transform":
        p_transform = np.exp(log_p_transform)
        p_transform = np.where(np.isfinite(p_transform), p_transform, 0.0)
        return p_transform

    # Input-domain density via change of variables (log-space)
    _sign, log_abs_det = np.linalg.slogdet(jac)
    log_p_input = log_p_transform + log_abs_det
    p_input = np.exp(log_p_input)
    p_input = np.where(np.isfinite(p_input), p_input, 0.0)

    if domain == "input":
        return p_input
    if domain == "both":
        p_transform = np.exp(log_p_transform)
        p_transform = np.where(np.isfinite(p_transform), p_transform, 0.0)
        return p_input, p_transform
    raise ValueError(
        f"Unknown domain: {domain!r}. Use 'input', 'transform', or 'both'."
    )
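For a purely linear flow z = x @ A, the Jacobian is the same for every sample and the input-domain density p(x) = p_Z(x @ A) · |det A| is exactly a multivariate Gaussian with covariance (A Aᵀ)⁻¹. A quick standalone check of the slogdet path (hypothetical A; jac follows the ``jac[n, i, j] = df_i/dx_j`` layout, so each slice is Aᵀ):

```python
import numpy as np
from scipy import stats

A = np.array([[2.0, 0.5], [0.0, 1.5]])   # hypothetical linear flow
X = np.array([[0.3, -1.2], [1.0, 0.7]])  # two query points

Z = X @ A                                      # forward pass
log_pz = np.sum(stats.norm.logpdf(Z), axis=1)  # factorized base density

# Per-sample Jacobian stack; only |det| enters the density
jac = np.broadcast_to(A.T, (X.shape[0], 2, 2))
_sign, log_abs_det = np.linalg.slogdet(jac)
p_input = np.exp(log_pz + log_abs_det)

# Equivalent closed form: x ~ N(0, (A @ A.T)^{-1})
mvn = stats.multivariate_normal(mean=np.zeros(2), cov=np.linalg.inv(A @ A.T))
print(np.allclose(p_input, mvn.pdf(X)))
```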

jacobian(X, return_X_transform=False)

Compute the full Jacobian matrix of the RBIG transform.

For each sample, returns the (n_features, n_features) Jacobian matrix df/dx of the composition of all fitted layers. Uses the seed-dimension approach from the legacy implementation: for each input dimension idim, a unit vector is propagated through the chain of per-feature marginal derivatives and rotation matrices.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input data at which to evaluate the Jacobian.
return_X_transform : bool, default False
    If True, also return the fully transformed data f(X) (computed as a side-effect of the Jacobian calculation).

Returns

jac : np.ndarray of shape (n_samples, n_features, n_features)
    Full Jacobian matrix per sample. jac[n, i, j] is the partial derivative df_i/dx_j for the n-th sample.
X_transformed : np.ndarray of shape (n_samples, n_features)
    Only returned when return_X_transform=True. The data after passing through all layers.

Source code in rbig/_src/model.py
def jacobian(
    self,
    X: np.ndarray,
    return_X_transform: bool = False,
) -> np.ndarray | tuple[np.ndarray, np.ndarray]:
    """Compute the full Jacobian matrix of the RBIG transform.

    For each sample, returns the ``(n_features, n_features)`` Jacobian
    matrix ``df/dx`` of the composition of all fitted layers.  Uses the
    seed-dimension approach from the legacy implementation: for each input
    dimension ``idim``, a unit vector is propagated through the chain of
    per-feature marginal derivatives and rotation matrices.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data at which to evaluate the Jacobian.
    return_X_transform : bool, default False
        If True, also return the fully transformed data ``f(X)`` (computed
        as a side-effect of the Jacobian calculation).

    Returns
    -------
    jac : np.ndarray of shape (n_samples, n_features, n_features)
        Full Jacobian matrix per sample.  ``jac[n, i, j]`` is the partial
        derivative ``df_i/dx_j`` for the n-th sample.
    X_transformed : np.ndarray of shape (n_samples, n_features)
        Only returned when ``return_X_transform=True``.  The data after
        passing through all layers.
    """
    check_is_fitted(self)
    n_samples, n_features = X.shape

    # ── Forward pass: collect per-layer derivatives and rotation matrices ──
    derivs_per_layer = []  # each: (n_samples, n_features)
    rotmats_per_layer = []  # each: (n_features, n_features)

    Xt = X.copy()
    layers_iter = maybe_tqdm(
        self.layers_,
        verbose=self.verbose,
        level=2,
        desc="Jacobian (forward)",
        total=len(self.layers_),
    )
    for layer in layers_iter:
        if not hasattr(layer.marginal, "_per_feature_log_deriv"):
            raise NotImplementedError(
                f"Jacobian computation requires a marginal with "
                f"_per_feature_log_deriv(); "
                f"{type(layer.marginal).__name__} does not support this."
            )
        # Per-feature marginal derivatives and transformed data in one pass
        log_d, Xt_marginal = layer.marginal._per_feature_log_deriv(
            Xt, return_transform=True
        )
        derivs_per_layer.append(np.exp(log_d))

        # Rotation matrix in row-vector convention: y = z @ R
        rot = self._extract_rotation_matrix(layer.rotation)
        rotmats_per_layer.append(rot)

        # Advance through rotation only
        Xt = layer.rotation.transform(Xt_marginal)

    # ── Seed-dimension loop: propagate unit vectors through the chain ──
    jac = np.zeros((n_samples, n_features, n_features))

    dims_iter = maybe_tqdm(
        range(n_features),
        verbose=self.verbose,
        level=2,
        desc="Jacobian (dims)",
        total=n_features,
    )
    for idim in dims_iter:
        # Initialize seed: unit vector in dimension idim
        XX = np.zeros((n_samples, n_features))
        XX[:, idim] = 1.0

        for derivs, R in zip(derivs_per_layer, rotmats_per_layer, strict=True):
            # Chain rule: XX_new = diag(derivs) @ XX @ R
            XX = (derivs * XX) @ R

        jac[:, :, idim] = XX

    if return_X_transform:
        return jac, Xt
    return jac
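The seed-dimension chain rule above can be illustrated on a hypothetical toy layer (an elementwise tanh followed by a random rotation, standing in for the fitted marginal and rotation steps): propagating a unit vector through `(derivs * e_j) @ R` builds the Jacobian one column at a time, and the result can be cross-checked with finite differences.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features = 3

# Toy "layer": elementwise tanh then an orthogonal rotation (row-vector
# convention y = z @ R, as in the source above).
R, _ = np.linalg.qr(rng.standard_normal((n_features, n_features)))

def f(x):
    return np.tanh(x) @ R

x = rng.standard_normal(n_features)
d = 1.0 - np.tanh(x) ** 2            # per-feature marginal derivative

# Seed-dimension loop: jac[:, j] = d f / d x_j
jac = np.zeros((n_features, n_features))
for j in range(n_features):
    e = np.zeros(n_features)
    e[j] = 1.0
    jac[:, j] = (d * e) @ R          # chain rule: diag(d) then rotation

# Cross-check against central finite differences
eps = 1e-6
fd = np.zeros_like(jac)
for j in range(n_features):
    e = np.zeros(n_features)
    e[j] = eps
    fd[:, j] = (f(x + e) - f(x - e)) / (2 * eps)

assert np.allclose(jac, fd, atol=1e-6)
```

This is a sketch of the mechanism only; the fitted model chains one such `(derivs, R)` pair per layer, exactly as in the loop over `derivs_per_layer` and `rotmats_per_layer`.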

rbig.RBIGLayer dataclass

Single RBIG layer: marginal Gaussianization followed by rotation.

One iteration of the RBIG algorithm applies two successive bijections:

  1. Marginal Gaussianization – maps each feature independently to a standard Gaussian via its empirical CDF and the probit function:

    z = Φ⁻¹(F̂ₙ(x))

where F̂ₙ is the estimated marginal CDF and Φ⁻¹ is the standard normal quantile function.

  2. Rotation/whitening – applies a linear transform R (default: PCA without whitening, i.e. an orthogonal rotation) to de-correlate the Gaussianized features:

    y = R · z

The full single-layer transform is therefore:

y = R · Φ⁻¹(F̂ₙ(x))

Parameters

marginal : MarginalGaussianize, optional
    Marginal Gaussianization transform (fitted per feature). Defaults to a new MarginalGaussianize instance.
rotation : PCARotation, optional
    Rotation transform applied after marginal Gaussianization. Defaults to a new PCARotation instance.

Attributes

marginal : MarginalGaussianize
    Fitted marginal transform.
rotation : PCARotation
    Fitted rotation transform.

Notes

The layer log-det-Jacobian is the sum of the marginal and rotation contributions:

log|det J_layer(x)| = log|det J_marginal(x)| + log|det J_rotation|
                     = ∑ᵢ log|Φ⁻¹′(F̂ₙ(xᵢ)) · f̂ₙ(xᵢ)| + log|det J_rotation|

The rotation term log|det J_rotation| is zero when the rotation is strictly orthogonal (|det R| = 1). The default PCARotation(whiten=False) is orthogonal, so its log-det is always zero. PCARotation(whiten=True) includes per-component scaling by 1/√λ and is not orthogonal (non-zero log-det). In practice the two typically yield nearly identical results, because marginal Gaussianization already produces near-unit-variance features.
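The log-det distinction can be checked numerically with a hypothetical 3-feature covariance (not a fitted PCARotation): the eigenvector matrix alone is orthogonal with log-det zero, while appending the 1/√λ scaling gives log-det −½ Σᵢ log λᵢ.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((500, 3))
eigvals, V = np.linalg.eigh(np.cov(A, rowvar=False))

# Orthogonal rotation (whiten=False analogue): |det V| = 1, log-det 0
_, ld_orth = np.linalg.slogdet(V)
assert np.isclose(ld_orth, 0.0, atol=1e-10)

# Whitening analogue: R = V @ diag(1/sqrt(eigvals)), generally non-zero
R = V @ np.diag(1.0 / np.sqrt(eigvals))
_, ld_whiten = np.linalg.slogdet(R)
assert np.isclose(ld_whiten, -0.5 * np.log(eigvals).sum())
```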

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: From ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537–549. https://doi.org/10.1109/TNN.2011.2106511

Examples

>>> import numpy as np
>>> from rbig._src.model import RBIGLayer
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((500, 3))
>>> layer = RBIGLayer()
>>> layer.fit(X)
RBIGLayer(...)
>>> Z = layer.transform(X)
>>> Z.shape
(500, 3)

Source code in rbig/_src/model.py
@dataclass
class RBIGLayer:
    """Single RBIG layer: marginal Gaussianization followed by rotation.

    One iteration of the RBIG algorithm applies two successive bijections:

    1. **Marginal Gaussianization** – maps each feature independently to a
       standard Gaussian via its empirical CDF and the probit function:

           z = Φ⁻¹(F̂ₙ(x))

       where F̂ₙ is the estimated marginal CDF and Φ⁻¹ is the standard
       normal quantile function.

    2. **Rotation/whitening** – applies a linear transform R (default: PCA
       without whitening, i.e. an orthogonal rotation) to de-correlate the
       Gaussianized features:

           y = R · z

    The full single-layer transform is therefore:

        y = R · Φ⁻¹(F̂ₙ(x))

    Parameters
    ----------
    marginal : MarginalGaussianize, optional
        Marginal Gaussianization transform (fitted per feature).
        Defaults to a new ``MarginalGaussianize`` instance.
    rotation : PCARotation, optional
        Rotation transform applied after marginal Gaussianization.
        Defaults to a new ``PCARotation`` instance.

    Attributes
    ----------
    marginal : MarginalGaussianize
        Fitted marginal transform.
    rotation : PCARotation
        Fitted rotation transform.

    Notes
    -----
    The layer log-det-Jacobian is the sum of the marginal and rotation
    contributions:

        log|det J_layer(x)| = log|det J_marginal(x)| + log|det J_rotation|
                             = ∑ᵢ log|Φ⁻¹′(F̂ₙ(xᵢ)) · f̂ₙ(xᵢ)| + log|det J_rotation|

    The rotation term ``log|det J_rotation|`` is zero when the rotation is
    strictly orthogonal (``|det R| = 1``).  The default
    ``PCARotation(whiten=False)`` is orthogonal, so its log-det is always
    zero.  ``PCARotation(whiten=True)`` includes per-component scaling by
    ``1/√λ`` and is *not* orthogonal (non-zero log-det).  In practice the
    two typically yield nearly identical results, because marginal
    Gaussianization already produces near-unit-variance features.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    From ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537–549. https://doi.org/10.1109/TNN.2011.2106511

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.model import RBIGLayer
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((500, 3))
    >>> layer = RBIGLayer()
    >>> layer.fit(X)
    RBIGLayer(...)
    >>> Z = layer.transform(X)
    >>> Z.shape
    (500, 3)
    """

    marginal: MarginalGaussianize = field(default_factory=MarginalGaussianize)
    rotation: PCARotation = field(default_factory=lambda: PCARotation(whiten=False))

    def fit(self, X: np.ndarray, y=None) -> RBIGLayer:
        """Fit the marginal and rotation transforms to data X.

        First fits the marginal Gaussianizer on X, applies it to obtain the
        intermediate Gaussianized representation, then fits the rotation on
        that intermediate representation.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.
        y : ignored
            Not used, present for sklearn pipeline compatibility.

        Returns
        -------
        self : RBIGLayer
            The fitted layer.
        """
        Xm = self.marginal.fit_transform(
            X
        )  # shape (n_samples, n_features) - Gaussianized
        self.rotation.fit(Xm)  # fit rotation on the Gaussianized data
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply marginal Gaussianization then rotation: y = R · Φ⁻¹(F̂ₙ(x)).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data.

        Returns
        -------
        Y : np.ndarray of shape (n_samples, n_features)
            Transformed data after Gaussianization and rotation.
        """
        Xm = self.marginal.transform(
            X
        )  # marginal Gaussianization, shape (n_samples, n_features)
        return self.rotation.transform(Xm)  # rotation, shape (n_samples, n_features)

    def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Log |det J| for this layer at input X.

        The total log-det-Jacobian is the sum of contributions from the
        marginal step and the rotation step:

            log|det J_layer(x)| = log|det J_marginal(x)| + log|det J_rotation(z)|

        For orthogonal rotations (e.g. ``RandomRotation``,
        ``PCARotation(whiten=False)``), the rotation term is zero.  For
        ``PCARotation(whiten=True)`` the rotation includes a per-component
        rescaling, so its term is generally non-zero.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points.

        Returns
        -------
        ldj : np.ndarray of shape (n_samples,)
            Per-sample log absolute determinant of the layer Jacobian.
        """
        Xm = self.marginal.transform(
            X
        )  # intermediate Gaussianized data, shape (n_samples, n_features)
        # marginal log-det + rotation log-det (non-zero for PCARotation with whiten=True)
        return self.marginal.log_det_jacobian(X) + self.rotation.log_det_jacobian(Xm)

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Invert the layer: apply inverse rotation then inverse marginal.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in the layer's output (latent) space.

        Returns
        -------
        Xr : np.ndarray of shape (n_samples, n_features)
            Data recovered in the original input space.
        """
        Xr = self.rotation.inverse_transform(
            X
        )  # undo rotation, shape (n_samples, n_features)
        return self.marginal.inverse_transform(Xr)  # undo marginal Gaussianization

fit(X, y=None)

Fit the marginal and rotation transforms to data X.

First fits the marginal Gaussianizer on X, applies it to obtain the intermediate Gaussianized representation, then fits the rotation on that intermediate representation.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Training data.
y : ignored
    Not used, present for sklearn pipeline compatibility.

Returns

self : RBIGLayer
    The fitted layer.

Source code in rbig/_src/model.py
def fit(self, X: np.ndarray, y=None) -> RBIGLayer:
    """Fit the marginal and rotation transforms to data X.

    First fits the marginal Gaussianizer on X, applies it to obtain the
    intermediate Gaussianized representation, then fits the rotation on
    that intermediate representation.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.
    y : ignored
        Not used, present for sklearn pipeline compatibility.

    Returns
    -------
    self : RBIGLayer
        The fitted layer.
    """
    Xm = self.marginal.fit_transform(
        X
    )  # shape (n_samples, n_features) - Gaussianized
    self.rotation.fit(Xm)  # fit rotation on the Gaussianized data
    return self

transform(X)

Apply marginal Gaussianization then rotation: y = R · Φ⁻¹(F̂ₙ(x)).

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input data.

Returns

Y : np.ndarray of shape (n_samples, n_features)
    Transformed data after Gaussianization and rotation.

Source code in rbig/_src/model.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply marginal Gaussianization then rotation: y = R · Φ⁻¹(F̂ₙ(x)).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data.

    Returns
    -------
    Y : np.ndarray of shape (n_samples, n_features)
        Transformed data after Gaussianization and rotation.
    """
    Xm = self.marginal.transform(
        X
    )  # marginal Gaussianization, shape (n_samples, n_features)
    return self.rotation.transform(Xm)  # rotation, shape (n_samples, n_features)

log_det_jacobian(X)

Log |det J| for this layer at input X.

The total log-det-Jacobian is the sum of contributions from the marginal step and the rotation step:

log|det J_layer(x)| = log|det J_marginal(x)| + log|det J_rotation(z)|

For orthogonal rotations (e.g. RandomRotation, PCARotation(whiten=False)), the rotation term is zero. For PCARotation(whiten=True) the rotation includes a per-component rescaling, so its term is generally non-zero.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input points.

Returns

ldj : np.ndarray of shape (n_samples,)
    Per-sample log absolute determinant of the layer Jacobian.

Source code in rbig/_src/model.py
def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Log |det J| for this layer at input X.

    The total log-det-Jacobian is the sum of contributions from the
    marginal step and the rotation step:

        log|det J_layer(x)| = log|det J_marginal(x)| + log|det J_rotation(z)|

    For orthogonal rotations (e.g. ``RandomRotation``,
    ``PCARotation(whiten=False)``), the rotation term is zero.  For
    ``PCARotation(whiten=True)`` the rotation includes a per-component
    rescaling, so its term is generally non-zero.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points.

    Returns
    -------
    ldj : np.ndarray of shape (n_samples,)
        Per-sample log absolute determinant of the layer Jacobian.
    """
    Xm = self.marginal.transform(
        X
    )  # intermediate Gaussianized data, shape (n_samples, n_features)
    # marginal log-det + rotation log-det (non-zero for PCARotation with whiten=True)
    return self.marginal.log_det_jacobian(X) + self.rotation.log_det_jacobian(Xm)

inverse_transform(X)

Invert the layer: apply inverse rotation then inverse marginal.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Data in the layer's output (latent) space.

Returns

Xr : np.ndarray of shape (n_samples, n_features)
    Data recovered in the original input space.

Source code in rbig/_src/model.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Invert the layer: apply inverse rotation then inverse marginal.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in the layer's output (latent) space.

    Returns
    -------
    Xr : np.ndarray of shape (n_samples, n_features)
        Data recovered in the original input space.
    """
    Xr = self.rotation.inverse_transform(
        X
    )  # undo rotation, shape (n_samples, n_features)
    return self.marginal.inverse_transform(Xr)  # undo marginal Gaussianization

Marginal Transforms

rbig.MarginalUniformize

Bases: BaseTransform

Transform each marginal to uniform [0, 1] using the empirical CDF.

For each feature dimension i, the empirical CDF is estimated from the training data with a mid-point (Hazen) continuity correction:

u_hat = F_hat_n(x) = (rank(x, X_train) + 0.5) / N

where rank is the number of training samples strictly less than x (left-sided searchsorted) and N is the number of training samples. The +0.5 shift avoids the degenerate values 0 and 1 for in-sample boundary points.

Parameters

bound_correct : bool, default True
    If True, clip the output to [eps, 1 - eps] to prevent exact 0 or 1, which is useful when feeding the result into a probit or logit function.
eps : float, default 1e-6
    Half-width of the clipping margin when bound_correct=True.
pdf_extension : float, default 0.0
    If greater than 0, use a histogram-based CDF instead of the empirical CDF, extending the support by this percentage of the per-feature data range.
pdf_resolution : int, default 1000
    Number of grid points for the interpolated histogram CDF (used only when pdf_extension > 0).

Attributes

support_ : np.ndarray of shape (n_samples, n_features)
    Column-wise sorted training data. Serves as empirical quantile nodes for both the forward transform and piecewise-linear inversion.
n_features_ : int
    Number of feature dimensions seen during fit.

Notes

The mid-point empirical CDF (Hazen plotting position) is:

F_hat_n(x) = (rank + 0.5) / N

The inverse is approximated by piecewise-linear interpolation between the sorted support values and their corresponding uniform probabilities np.linspace(0, 1, N).
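The forward map and its piecewise-linear inverse described in these Notes can be sketched with plain numpy (illustrative only; the fitted transform also handles clipping and the optional histogram CDF):

```python
import numpy as np

rng = np.random.default_rng(0)
x_train = rng.standard_normal(200)
support = np.sort(x_train)                 # empirical quantile nodes

# Forward: mid-point empirical CDF, u = (rank + 0.5) / N
ranks = np.searchsorted(support, x_train, side="left")
u = (ranks + 0.5) / len(support)
assert u.min() > 0.0 and u.max() < 1.0     # +0.5 shift avoids exact 0/1

# Inverse: interpolate the uniform grid back onto the sorted support
grid = np.linspace(0, 1, len(support))
x_back = np.interp(u, grid, support)       # approximate empirical quantiles
assert np.corrcoef(x_back, x_train)[0, 1] > 0.99
```

The round trip is only approximate: the forward rank-based CDF and the linear-interpolation inverse use slightly different node placements, so each reconstructed point lands between adjacent training values.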

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Examples

>>> import numpy as np
>>> from rbig._src.marginal import MarginalUniformize
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((100, 2))
>>> uni = MarginalUniformize().fit(X)
>>> U = uni.transform(X)
>>> U.shape
(100, 2)
>>> bool(U.min() > 0.0) and bool(U.max() < 1.0)
True
>>> Xr = uni.inverse_transform(U)
>>> Xr.shape
(100, 2)

Source code in rbig/_src/marginal.py
class MarginalUniformize(BaseTransform):
    """Transform each marginal to uniform [0, 1] using the empirical CDF.

    For each feature dimension *i*, the empirical CDF is estimated from the
    training data with a mid-point (Hazen) continuity correction::

        u_hat = F_hat_n(x) = (rank(x, X_train) + 0.5) / N

    where *rank* is the number of training samples strictly less than x
    (left-sided ``searchsorted``) and *N* is the number of training samples.  The
    ``+0.5`` shift avoids the degenerate values 0 and 1 for in-sample
    boundary points.

    Parameters
    ----------
    bound_correct : bool, default True
        If True, clip the output to ``[eps, 1 - eps]`` to prevent exact 0
        or 1, which is useful when feeding the result into a probit or
        logit function.
    eps : float, default 1e-6
        Half-width of the clipping margin when ``bound_correct=True``.
    pdf_extension : float, default 0.0
        If greater than 0, use a histogram-based CDF instead of the
        empirical CDF, extending the support by this percentage of the
        per-feature data range.
    pdf_resolution : int, default 1000
        Number of grid points for the interpolated histogram CDF (used
        only when ``pdf_extension > 0``).

    Attributes
    ----------
    support_ : np.ndarray of shape (n_samples, n_features)
        Column-wise sorted training data.  Serves as empirical quantile
        nodes for both the forward transform and piecewise-linear inversion.
    n_features_ : int
        Number of feature dimensions seen during ``fit``.

    Notes
    -----
    The mid-point empirical CDF (Hazen plotting position) is::

        F_hat_n(x) = (rank + 0.5) / N

    The inverse is approximated by piecewise-linear interpolation between
    the sorted support values and their corresponding uniform probabilities
    ``np.linspace(0, 1, N)``.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.marginal import MarginalUniformize
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((100, 2))
    >>> uni = MarginalUniformize().fit(X)
    >>> U = uni.transform(X)
    >>> U.shape
    (100, 2)
    >>> bool(U.min() > 0.0) and bool(U.max() < 1.0)
    True
    >>> Xr = uni.inverse_transform(U)
    >>> Xr.shape
    (100, 2)
    """

    def __init__(
        self,
        bound_correct: bool = True,
        eps: float = 1e-6,
        pdf_extension: float = 0.0,
        pdf_resolution: int = 1000,
    ):
        self.bound_correct = bound_correct
        self.eps = eps
        self.pdf_extension = pdf_extension
        self.pdf_resolution = pdf_resolution

    def fit(self, X: np.ndarray, y=None) -> MarginalUniformize:
        """Fit the transform by storing sorted training values per feature.

        When ``pdf_extension > 0``, a histogram-based CDF pipeline is used
        instead of the default empirical CDF.  This extends the support by
        ``pdf_extension`` percent of the data range and builds an interpolated,
        monotonic CDF on a grid of ``pdf_resolution`` points.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.  Each column is sorted and stored as the empirical
            support (quantile nodes) for that feature.

        Returns
        -------
        self : MarginalUniformize
            Fitted transform instance.
        """
        self.n_features_ = X.shape[1]

        if self.pdf_extension > 0:
            self._fit_histogram_cdf(X)
        else:
            # Sort each column independently to obtain empirical quantile nodes
            self.support_ = np.sort(X, axis=0)
        return self

    def _fit_histogram_cdf(self, X: np.ndarray) -> None:
        """Build per-feature histogram CDF with extended support."""
        n_samples = X.shape[0]
        self.cdf_support_ = []
        self.cdf_values_ = []
        self.pdf_support_ = []
        self.pdf_values_ = []

        for i in range(self.n_features_):
            xi = X[:, i]
            x_min, x_max = xi.min(), xi.max()

            # Handle constant-valued feature: trivial linear CDF
            if x_min == x_max:
                support = np.array([x_min - 1.0, x_min, x_min + 1.0])
                cdf_vals = np.array([0.0, 0.5, 1.0])
                pdf_sup = np.array([x_min - 1.0, x_min, x_min + 1.0])
                pdf_vals = np.array([0.0, 1.0, 0.0])
                self.cdf_support_.append(support)
                self.cdf_values_.append(cdf_vals)
                self.pdf_support_.append(pdf_sup)
                self.pdf_values_.append(pdf_vals)
                continue

            support_ext = (self.pdf_extension / 100) * abs(x_max - x_min)

            # Build histogram bins: sqrt(n) + 1 edges
            n_bin_edges = int(np.sqrt(float(n_samples)) + 1)
            bin_edges = np.linspace(x_min, x_max, n_bin_edges)
            bin_centers = (bin_edges[:-1] + bin_edges[1:]) / 2
            counts, _ = np.histogram(xi, bin_edges)
            bin_size = bin_edges[1] - bin_edges[0]

            # Empirical PDF with zero-padded edges
            pdf_support = np.concatenate(
                [
                    [bin_centers[0] - bin_size],
                    bin_centers,
                    [bin_centers[-1] + bin_size],
                ]
            )
            empirical_pdf = np.concatenate(
                [
                    [0.0],
                    counts / (np.sum(counts) * bin_size),
                    [0.0],
                ]
            )

            # CDF from cumulative counts with extended support
            c_sum = np.cumsum(counts)
            cdf = (1 - 1 / n_samples) * c_sum / n_samples
            incr_bin = bin_size / 2

            new_bin_edges = np.concatenate(
                [
                    [x_min - support_ext],
                    [x_min],
                    bin_centers + incr_bin,
                    [x_max + support_ext + incr_bin],
                ]
            )
            extended_cdf = np.concatenate(
                [
                    [0.0],
                    [1.0 / n_samples],
                    cdf,
                    [1.0],
                ]
            )

            # Interpolate onto fine grid, enforce monotonicity, normalize
            new_support = np.linspace(
                new_bin_edges[0], new_bin_edges[-1], self.pdf_resolution
            )
            learned_cdf = np.interp(new_support, new_bin_edges, extended_cdf)
            uniform_cdf = make_cdf_monotonic(learned_cdf)
            uniform_cdf /= uniform_cdf.max()

            self.cdf_support_.append(new_support)
            self.cdf_values_.append(uniform_cdf)
            self.pdf_support_.append(pdf_support)
            self.pdf_values_.append(empirical_pdf)

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Map each feature to [0, 1] via the mid-point empirical CDF.

        Applies ``u = (rank + 0.5) / N`` to every column independently.
        When ``pdf_extension > 0``, uses interpolation with the stored
        histogram-based CDF grid instead.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data in the original space.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Uniformized data in [0, 1] (or ``[eps, 1 - eps]`` when
            ``bound_correct=True``).
        """
        Xt = np.zeros_like(X, dtype=float)
        if self.pdf_extension > 0:
            for i in range(self.n_features_):
                Xt[:, i] = np.interp(X[:, i], self.cdf_support_[i], self.cdf_values_[i])
        else:
            for i in range(self.n_features_):
                Xt[:, i] = self._uniformize(X[:, i], self.support_[:, i])
        if self.bound_correct:
            # Clip to (eps, 1-eps) to prevent boundary issues downstream
            Xt = np.clip(Xt, self.eps, 1 - self.eps)
        return Xt

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Map uniform [0, 1] values back to the original space.

        Uses piecewise-linear interpolation between the stored sorted support
        values and their corresponding uniform probabilities
        ``np.linspace(0, 1, N)``.  When ``pdf_extension > 0``, uses the
        inverted histogram CDF grid instead.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in the uniform [0, 1] space.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Data approximately recovered in the original input space.
        """
        Xt = np.zeros_like(X, dtype=float)
        if self.pdf_extension > 0:
            for i in range(self.n_features_):
                # Ensure strictly increasing xp for np.interp by
                # dropping duplicate CDF values
                cdf_vals = self.cdf_values_[i]
                cdf_sup = self.cdf_support_[i]
                unique_mask = np.concatenate([[True], np.diff(cdf_vals) > 0])
                Xt[:, i] = np.interp(
                    X[:, i], cdf_vals[unique_mask], cdf_sup[unique_mask]
                )
        else:
            for i in range(self.n_features_):
                # Interpolate: uniform grid [0, 1] -> sorted training values
                Xt[:, i] = np.interp(
                    X[:, i],
                    np.linspace(0, 1, len(self.support_[:, i])),
                    self.support_[:, i],
                )
        return Xt

    @staticmethod
    def _uniformize(x: np.ndarray, support: np.ndarray) -> np.ndarray:
        """Compute the mid-point empirical CDF for a single feature.

        Parameters
        ----------
        x : np.ndarray of shape (n_samples,)
            New data values to evaluate the empirical CDF at.
        support : np.ndarray of shape (n_train,)
            Sorted training values used as the empirical quantile nodes.

        Returns
        -------
        u : np.ndarray of shape (n_samples,)
            Empirical CDF values: ``(rank + 0.5) / n_train``.
        """
        n = len(support)
    # Left-sided searchsorted counts the training points strictly less than x
        ranks = np.searchsorted(support, x, side="left")
        # Mid-point shift (+0.5) avoids exact 0 and 1
        return (ranks + 0.5) / n

fit(X, y=None)

Fit the transform by storing sorted training values per feature.

When pdf_extension > 0, a histogram-based CDF pipeline is used instead of the default empirical CDF. This extends the support by pdf_extension percent of the data range and builds an interpolated, monotonic CDF on a grid of pdf_resolution points.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Training data. Each column is sorted and stored as the empirical support (quantile nodes) for that feature.

Returns

self : MarginalUniformize
    Fitted transform instance.

Source code in rbig/_src/marginal.py
def fit(self, X: np.ndarray, y=None) -> MarginalUniformize:
    """Fit the transform by storing sorted training values per feature.

    When ``pdf_extension > 0``, a histogram-based CDF pipeline is used
    instead of the default empirical CDF.  This extends the support by
    ``pdf_extension`` percent of the data range and builds an interpolated,
    monotonic CDF on a grid of ``pdf_resolution`` points.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.  Each column is sorted and stored as the empirical
        support (quantile nodes) for that feature.

    Returns
    -------
    self : MarginalUniformize
        Fitted transform instance.
    """
    self.n_features_ = X.shape[1]

    if self.pdf_extension > 0:
        self._fit_histogram_cdf(X)
    else:
        # Sort each column independently to obtain empirical quantile nodes
        self.support_ = np.sort(X, axis=0)
    return self
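The histogram-CDF path taken when `pdf_extension > 0` can be sketched in miniature. This is a simplified assumption-laden version: it uses `np.maximum.accumulate` in place of the library's `make_cdf_monotonic` helper, and collapses the edge handling to two padding nodes.

```python
import numpy as np

rng = np.random.default_rng(0)
xi = rng.standard_normal(400)              # one feature column
x_min, x_max = xi.min(), xi.max()

ext = 0.1 * (x_max - x_min)                # 10% support extension
edges = np.linspace(x_min, x_max, int(np.sqrt(xi.size)) + 1)
counts, _ = np.histogram(xi, edges)

# Cumulative counts with extended, zero/one-padded support
cdf_nodes = np.concatenate([[x_min - ext], edges[1:], [x_max + ext]])
cdf_vals = np.concatenate([[0.0], np.cumsum(counts) / counts.sum(), [1.0]])

# Interpolate onto a fine grid, enforce monotonicity, normalize
grid = np.linspace(cdf_nodes[0], cdf_nodes[-1], 1000)
cdf = np.interp(grid, cdf_nodes, cdf_vals)
cdf = np.maximum.accumulate(cdf)           # stand-in for make_cdf_monotonic
cdf /= cdf[-1]

assert cdf[0] == 0.0 and cdf[-1] == 1.0
assert np.all(np.diff(cdf) >= 0)
```

The key property, mirrored from `_fit_histogram_cdf`, is that the stored CDF is monotone on a support wider than the training data, so out-of-sample points near the boundary still map into (0, 1).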

transform(X)

Map each feature to [0, 1] via the mid-point empirical CDF.

Applies u = (rank + 0.5) / N to every column independently. When pdf_extension > 0, uses interpolation with the stored histogram-based CDF grid instead.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input data in the original space.

Returns

Xt : np.ndarray of shape (n_samples, n_features)
    Uniformized data in [0, 1] (or [eps, 1 - eps] when bound_correct=True).

Source code in rbig/_src/marginal.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Map each feature to [0, 1] via the mid-point empirical CDF.

    Applies ``u = (rank + 0.5) / N`` to every column independently.
    When ``pdf_extension > 0``, uses interpolation with the stored
    histogram-based CDF grid instead.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data in the original space.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Uniformized data in [0, 1] (or ``[eps, 1 - eps]`` when
        ``bound_correct=True``).
    """
    Xt = np.zeros_like(X, dtype=float)
    if self.pdf_extension > 0:
        for i in range(self.n_features_):
            Xt[:, i] = np.interp(X[:, i], self.cdf_support_[i], self.cdf_values_[i])
    else:
        for i in range(self.n_features_):
            Xt[:, i] = self._uniformize(X[:, i], self.support_[:, i])
    if self.bound_correct:
        # Clip to (eps, 1-eps) to prevent boundary issues downstream
        Xt = np.clip(Xt, self.eps, 1 - self.eps)
    return Xt

inverse_transform(X)

Map uniform [0, 1] values back to the original space.

Uses piecewise-linear interpolation between the stored sorted support values and their corresponding uniform probabilities np.linspace(0, 1, N). When pdf_extension > 0, uses the inverted histogram CDF grid instead.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Data in the uniform [0, 1] space.

Returns

Xt : np.ndarray of shape (n_samples, n_features)
    Data approximately recovered in the original input space.

Source code in rbig/_src/marginal.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Map uniform [0, 1] values back to the original space.

    Uses piecewise-linear interpolation between the stored sorted support
    values and their corresponding uniform probabilities
    ``np.linspace(0, 1, N)``.  When ``pdf_extension > 0``, uses the
    inverted histogram CDF grid instead.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in the uniform [0, 1] space.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Data approximately recovered in the original input space.
    """
    Xt = np.zeros_like(X, dtype=float)
    if self.pdf_extension > 0:
        for i in range(self.n_features_):
            # Ensure strictly increasing xp for np.interp by
            # dropping duplicate CDF values
            cdf_vals = self.cdf_values_[i]
            cdf_sup = self.cdf_support_[i]
            unique_mask = np.concatenate([[True], np.diff(cdf_vals) > 0])
            Xt[:, i] = np.interp(
                X[:, i], cdf_vals[unique_mask], cdf_sup[unique_mask]
            )
    else:
        for i in range(self.n_features_):
            # Interpolate: uniform grid [0, 1] -> sorted training values
            Xt[:, i] = np.interp(
                X[:, i],
                np.linspace(0, 1, len(self.support_[:, i])),
                self.support_[:, i],
            )
    return Xt
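Because both directions are piecewise-linear interpolations through the same nodes, the map is exact at the stored support points. A minimal sketch (a plain array stands in for the fitted `support_`):

```python
import numpy as np

rng = np.random.default_rng(1)
support = np.sort(rng.standard_normal(500))
probs = np.linspace(0, 1, len(support))

u = np.interp(support, support, probs)  # forward: x -> uniform
x_back = np.interp(u, probs, support)   # inverse: uniform -> x

assert np.allclose(x_back, support)     # exact round trip at the nodes
```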

rbig.MarginalGaussianize

Bases: BaseTransform

Transform each marginal to standard Gaussian using empirical CDF + probit.

Combines a mid-point empirical CDF estimate with the Gaussian probit (quantile) function Phi^{-1} to map each feature to an approximately standard-normal marginal::

z = Phi^{-1}(F_hat_n(x))

where F_hat_n(x) = (rank + 0.5) / N is the mid-point empirical CDF and Phi^{-1} is the inverse standard-normal CDF (probit).

Parameters

bound_correct : bool, default True
    Clip the intermediate uniform value to [eps, 1 - eps] before applying
    the probit to prevent +/-inf outputs at the tails.
eps : float, default 1e-6
    Clipping margin for the uniform intermediate value.

Attributes

support_ : np.ndarray of shape (n_samples, n_features)
    Column-wise sorted training data (empirical quantile nodes).
n_features_ : int
    Number of feature dimensions seen during fit.

Notes

The log-absolute Jacobian determinant needed for density estimation is::

log|dz/dx| = log f_hat_n(x) - log phi(Phi^{-1}(F_hat_n(x)))

where f_hat_n is the empirical density estimated from the spacing of adjacent sorted training values, and phi is the standard-normal PDF. This is computed in :meth:log_det_jacobian.

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Examples

>>> import numpy as np
>>> from rbig._src.marginal import MarginalGaussianize
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((200, 3))
>>> mg = MarginalGaussianize().fit(X)
>>> Z = mg.transform(X)
>>> Z.shape
(200, 3)
>>> abs(float(Z.mean())) < 0.5
True

Source code in rbig/_src/marginal.py
class MarginalGaussianize(BaseTransform):
    """Transform each marginal to standard Gaussian using empirical CDF + probit.

    Combines a mid-point empirical CDF estimate with the Gaussian probit
    (quantile) function Phi^{-1} to map each feature to an approximately
    standard-normal marginal::

        z = Phi^{-1}(F_hat_n(x))

    where ``F_hat_n(x) = (rank + 0.5) / N`` is the mid-point empirical CDF
    and ``Phi^{-1}`` is the inverse standard-normal CDF (probit).

    Parameters
    ----------
    bound_correct : bool, default True
        Clip the intermediate uniform value to ``[eps, 1 - eps]`` before
        applying the probit to prevent +/-inf outputs at the tails.
    eps : float, default 1e-6
        Clipping margin for the uniform intermediate value.

    Attributes
    ----------
    support_ : np.ndarray of shape (n_samples, n_features)
        Column-wise sorted training data (empirical quantile nodes).
    n_features_ : int
        Number of feature dimensions seen during ``fit``.

    Notes
    -----
    The log-absolute Jacobian determinant needed for density estimation is::

        log|dz/dx| = log f_hat_n(x) - log phi(Phi^{-1}(F_hat_n(x)))

    where ``f_hat_n`` is the empirical density estimated from the spacing of
    adjacent sorted training values, and ``phi`` is the standard-normal PDF.
    This is computed in :meth:`log_det_jacobian`.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.marginal import MarginalGaussianize
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((200, 3))
    >>> mg = MarginalGaussianize().fit(X)
    >>> Z = mg.transform(X)
    >>> Z.shape
    (200, 3)
    >>> abs(float(Z.mean())) < 0.5
    True
    """

    def __init__(self, bound_correct: bool = True, eps: float = 1e-6):
        self.bound_correct = bound_correct
        self.eps = eps

    def fit(self, X: np.ndarray, y=None) -> MarginalGaussianize:
        """Fit by storing the column-wise sorted training data.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data used to build the per-feature empirical CDF.

        Returns
        -------
        self : MarginalGaussianize
            Fitted transform instance.
        """
        # Sorted columns serve as empirical quantile nodes
        self.support_ = np.sort(X, axis=0)
        self.n_features_ = X.shape[1]
        self.kdes_ = [
            stats.gaussian_kde(self.support_[:, i].copy())
            for i in range(self.n_features_)
        ]
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Map each feature to N(0, 1) via empirical CDF then probit.

        Applies ``z = Phi^{-1}(F_hat_n(x))`` column by column.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data in the original space.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Gaussianized data; each column has approximately N(0, 1) marginal.
        """
        Xt = np.zeros_like(X, dtype=float)
        for i in range(self.n_features_):
            # Step 1: empirical CDF -> uniform value in (0, 1)
            u = MarginalUniformize._uniformize(X[:, i], self.support_[:, i])
            if self.bound_correct:
                # Clip to avoid Phi^{-1}(0) = -inf or Phi^{-1}(1) = +inf
                u = np.clip(u, self.eps, 1 - self.eps)
            # Step 2: probit transform Phi^{-1}(u) -> standard normal
            Xt[:, i] = ndtri(u)
        return Xt

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Map standard-normal values back to the original space.

        Applies the normal CDF Phi to obtain uniform values, then uses
        piecewise-linear interpolation through the sorted support to recover
        approximate original-space values.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in the Gaussianized (standard-normal) space.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Data approximately recovered in the original input space.
        """
        Xt = np.zeros_like(X, dtype=float)
        for i in range(self.n_features_):
            # Invert probit: z -> Phi(z) in (0, 1)
            u = stats.norm.cdf(X[:, i])
            # Invert empirical CDF via linear interpolation
            Xt[:, i] = np.interp(
                u, np.linspace(0, 1, len(self.support_[:, i])), self.support_[:, i]
            )
        return Xt

    def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Log |det J| for marginal Gaussianization.

        For g(x) = Phi^{-1}(F_n(x)):
            log|dg/dx| = log f_n(x_i) - log phi(g(x_i))

        where f_n is estimated from a Gaussian KDE fitted to the training
        data for each feature.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data at which to evaluate the log-det-Jacobian.

        Returns
        -------
        log_jac : np.ndarray of shape (n_samples,)
            Per-sample sum of per-feature log-derivatives.

        Notes
        -----
        The empirical density is approximated via a Gaussian KDE (one per
        feature) fitted during :meth:`fit`.  Bandwidth is selected
        automatically using Scott's rule (the default in
        :func:`scipy.stats.gaussian_kde`).  The KDE objects are cached
        in ``self.kdes_`` so that ``log_det_jacobian`` and repeated calls
        to ``_per_feature_log_deriv`` do not re-fit the KDEs.
        """
        return np.sum(self._per_feature_log_deriv(X), axis=1)

    def _per_feature_log_deriv(
        self, X: np.ndarray, return_transform: bool = False
    ) -> np.ndarray | tuple[np.ndarray, np.ndarray]:
        """Per-feature log |dz_i/dx_i| via cached KDE density estimates.

        Uses the per-feature Gaussian KDEs stored in ``self.kdes_`` (fitted
        during :meth:`fit` with Scott's rule bandwidth) to evaluate the
        marginal density f_n(x_i) at each query point.  The log-derivative
        is then ``log f_n(x_i) - log phi(z_i)`` where ``z_i`` is the
        Gaussianized value.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data at which to evaluate the per-feature log-derivatives.
        return_transform : bool, default False
            If True, also return the Gaussianized output to avoid recomputing.

        Returns
        -------
        log_derivs : np.ndarray of shape (n_samples, n_features)
            Per-feature log |dz_i/dx_i| for each sample.
        Xt : np.ndarray of shape (n_samples, n_features)
            Only returned when ``return_transform=True``.
        """
        Xt = self.transform(X)  # Gaussianized output, shape (N, D)
        log_derivs = np.zeros_like(X, dtype=float)
        for i in range(self.n_features_):
            # KDE-based density estimate for feature i
            # After pickle with readonly memmap, the KDE's internal arrays
            # may be read-only.  Re-create the KDE from a writable copy of
            # the dataset if necessary.
            kde = self.kdes_[i]
            if not kde.dataset.flags.writeable:
                kde = stats.gaussian_kde(kde.dataset.copy())
                self.kdes_[i] = kde
            xi = np.ascontiguousarray(X[:, i])
            log_f_i = np.log(np.maximum(kde(xi), 1e-300))
            # Log standard-normal PDF at Gaussianized value: log phi(z_i)
            log_phi_gi = stats.norm.logpdf(Xt[:, i])
            # Chain rule: log|dz/dx| = log f(x) - log phi(z)
            log_derivs[:, i] = log_f_i - log_phi_gi
        if return_transform:
            return log_derivs, Xt
        return log_derivs
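The two-step map the class implements (mid-point ECDF, then probit) can be reproduced with NumPy/SciPy alone. A sketch on a deliberately non-Gaussian marginal, assuming only `scipy.special.ndtri` and `scipy.stats` (not the class API itself):

```python
import numpy as np
from scipy.special import ndtri
from scipy.stats import kstest

rng = np.random.default_rng(2)
x = rng.exponential(size=2000)  # heavily skewed, non-Gaussian marginal

# Step 1: mid-point empirical CDF -> uniform values in (0, 1)
ranks = np.argsort(np.argsort(x))
u = (ranks + 0.5) / len(x)

# Step 2: probit transform -> approximately standard normal
z = ndtri(np.clip(u, 1e-6, 1 - 1e-6))

assert abs(z.mean()) < 0.05 and abs(z.std() - 1.0) < 0.05
assert kstest(z, "norm").pvalue > 0.01  # close to N(0, 1)
```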

fit(X, y=None)

Fit by storing the column-wise sorted training data.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Training data used to build the per-feature empirical CDF.

Returns

self : MarginalGaussianize
    Fitted transform instance.

Source code in rbig/_src/marginal.py
def fit(self, X: np.ndarray, y=None) -> MarginalGaussianize:
    """Fit by storing the column-wise sorted training data.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data used to build the per-feature empirical CDF.

    Returns
    -------
    self : MarginalGaussianize
        Fitted transform instance.
    """
    # Sorted columns serve as empirical quantile nodes
    self.support_ = np.sort(X, axis=0)
    self.n_features_ = X.shape[1]
    self.kdes_ = [
        stats.gaussian_kde(self.support_[:, i].copy())
        for i in range(self.n_features_)
    ]
    return self

transform(X)

Map each feature to N(0, 1) via empirical CDF then probit.

Applies z = Phi^{-1}(F_hat_n(x)) column by column.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input data in the original space.

Returns

Xt : np.ndarray of shape (n_samples, n_features)
    Gaussianized data; each column has approximately N(0, 1) marginal.

Source code in rbig/_src/marginal.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Map each feature to N(0, 1) via empirical CDF then probit.

    Applies ``z = Phi^{-1}(F_hat_n(x))`` column by column.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data in the original space.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Gaussianized data; each column has approximately N(0, 1) marginal.
    """
    Xt = np.zeros_like(X, dtype=float)
    for i in range(self.n_features_):
        # Step 1: empirical CDF -> uniform value in (0, 1)
        u = MarginalUniformize._uniformize(X[:, i], self.support_[:, i])
        if self.bound_correct:
            # Clip to avoid Phi^{-1}(0) = -inf or Phi^{-1}(1) = +inf
            u = np.clip(u, self.eps, 1 - self.eps)
        # Step 2: probit transform Phi^{-1}(u) -> standard normal
        Xt[:, i] = ndtri(u)
    return Xt

inverse_transform(X)

Map standard-normal values back to the original space.

Applies the normal CDF Phi to obtain uniform values, then uses piecewise-linear interpolation through the sorted support to recover approximate original-space values.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Data in the Gaussianized (standard-normal) space.

Returns

Xt : np.ndarray of shape (n_samples, n_features)
    Data approximately recovered in the original input space.

Source code in rbig/_src/marginal.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Map standard-normal values back to the original space.

    Applies the normal CDF Phi to obtain uniform values, then uses
    piecewise-linear interpolation through the sorted support to recover
    approximate original-space values.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in the Gaussianized (standard-normal) space.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Data approximately recovered in the original input space.
    """
    Xt = np.zeros_like(X, dtype=float)
    for i in range(self.n_features_):
        # Invert probit: z -> Phi(z) in (0, 1)
        u = stats.norm.cdf(X[:, i])
        # Invert empirical CDF via linear interpolation
        Xt[:, i] = np.interp(
            u, np.linspace(0, 1, len(self.support_[:, i])), self.support_[:, i]
        )
    return Xt

log_det_jacobian(X)

Log |det J| for marginal Gaussianization.

For g(x) = Phi^{-1}(F_n(x)):

    log|dg/dx| = log f_n(x_i) - log phi(g(x_i))

where f_n is estimated from a Gaussian KDE fitted to the training data for each feature.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input data at which to evaluate the log-det-Jacobian.

Returns

log_jac : np.ndarray of shape (n_samples,)
    Per-sample sum of per-feature log-derivatives.

Notes

The empirical density is approximated via a Gaussian KDE (one per feature) fitted during :meth:fit. Bandwidth is selected automatically using Scott's rule (the default in :func:scipy.stats.gaussian_kde). The KDE objects are cached in self.kdes_ so that log_det_jacobian and repeated calls to _per_feature_log_deriv do not re-fit the KDEs.

Source code in rbig/_src/marginal.py
def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Log |det J| for marginal Gaussianization.

    For g(x) = Phi^{-1}(F_n(x)):
        log|dg/dx| = log f_n(x_i) - log phi(g(x_i))

    where f_n is estimated from a Gaussian KDE fitted to the training
    data for each feature.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data at which to evaluate the log-det-Jacobian.

    Returns
    -------
    log_jac : np.ndarray of shape (n_samples,)
        Per-sample sum of per-feature log-derivatives.

    Notes
    -----
    The empirical density is approximated via a Gaussian KDE (one per
    feature) fitted during :meth:`fit`.  Bandwidth is selected
    automatically using Scott's rule (the default in
    :func:`scipy.stats.gaussian_kde`).  The KDE objects are cached
    in ``self.kdes_`` so that ``log_det_jacobian`` and repeated calls
    to ``_per_feature_log_deriv`` do not re-fit the KDEs.
    """
    return np.sum(self._per_feature_log_deriv(X), axis=1)
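The chain-rule identity `log|dz/dx| = log f(x) - log phi(z)` can be checked numerically on a distribution whose CDF is known in closed form. This sketch uses a logistic marginal purely as an analytically tractable stand-in for the KDE estimate:

```python
import numpy as np
from scipy import stats
from scipy.special import ndtri

# z = Phi^{-1}(F(x)) for x ~ Logistic, where F is known exactly.
x = np.linspace(-3, 3, 7)
z = ndtri(stats.logistic.cdf(x))

# Analytic log-derivative: log f(x) - log phi(z)
analytic = stats.logistic.logpdf(x) - stats.norm.logpdf(z)

# Finite-difference log-derivative of the composed map
h = 1e-6
z_plus = ndtri(stats.logistic.cdf(x + h))
numeric = np.log((z_plus - z) / h)

assert np.allclose(analytic, numeric, atol=1e-4)
```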

rbig.MarginalKDEGaussianize

Bases: BaseTransform

Transform each marginal to Gaussian using a KDE-estimated CDF.

A kernel density estimate (KDE) with a Gaussian kernel is fitted to each feature dimension. The cumulative integral of the KDE serves as a smooth approximation to the marginal CDF, which is then composed with the probit function Phi^{-1} to Gaussianize each dimension::

z = Phi^{-1}(F_KDE(x))

where F_KDE(x) = integral_{-inf}^{x} f_KDE(t) dt and f_KDE is the Gaussian-kernel density estimate.

Parameters

bw_method : str, float, or None, default None
    Bandwidth selection method passed to :class:scipy.stats.gaussian_kde.
    None uses Scott's rule; 'silverman' uses Silverman's rule; a scalar
    sets the bandwidth factor directly.
eps : float, default 1e-6
    Clipping margin to prevent Phi^{-1}(0) = -inf or Phi^{-1}(1) = +inf.

Attributes

kdes_ : list of scipy.stats.gaussian_kde
    One fitted KDE object per feature dimension.
n_features_ : int
    Number of feature dimensions seen during fit.

Notes

The inverse transform inverts the KDE CDF numerically via Brent's method (:func:scipy.optimize.brentq), searching in [-100, 100]. Samples for which no root is found in this interval are set to 0.0.

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Examples

>>> import numpy as np
>>> from rbig._src.marginal import MarginalKDEGaussianize
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((50, 2))
>>> kde_g = MarginalKDEGaussianize().fit(X)
>>> Z = kde_g.transform(X)
>>> Z.shape
(50, 2)

Source code in rbig/_src/marginal.py
class MarginalKDEGaussianize(BaseTransform):
    """Transform each marginal to Gaussian using a KDE-estimated CDF.

    A kernel density estimate (KDE) with a Gaussian kernel is fitted to each
    feature dimension.  The cumulative integral of the KDE serves as a smooth
    approximation to the marginal CDF, which is then composed with the probit
    function Phi^{-1} to Gaussianize each dimension::

        z = Phi^{-1}(F_KDE(x))

    where ``F_KDE(x) = integral_{-inf}^{x} f_KDE(t) dt`` and ``f_KDE`` is
    the Gaussian-kernel density estimate.

    Parameters
    ----------
    bw_method : str, float, or None, default None
        Bandwidth selection method passed to
        :class:`scipy.stats.gaussian_kde`.  ``None`` uses Scott's rule;
        ``'silverman'`` uses Silverman's rule; a scalar sets the bandwidth
        factor directly.
    eps : float, default 1e-6
        Clipping margin to prevent ``Phi^{-1}(0) = -inf`` or
        ``Phi^{-1}(1) = +inf``.

    Attributes
    ----------
    kdes_ : list of scipy.stats.gaussian_kde
        One fitted KDE object per feature dimension.
    n_features_ : int
        Number of feature dimensions seen during ``fit``.

    Notes
    -----
    The inverse transform inverts the KDE CDF numerically via Brent's method
    (:func:`scipy.optimize.brentq`) searching in [-100, 100].  Samples for
    which no root is found in this interval are set to 0.0.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.marginal import MarginalKDEGaussianize
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((50, 2))
    >>> kde_g = MarginalKDEGaussianize().fit(X)
    >>> Z = kde_g.transform(X)
    >>> Z.shape
    (50, 2)
    """

    def __init__(self, bw_method: str | float | None = None, eps: float = 1e-6):
        self.bw_method = bw_method
        self.eps = eps

    def fit(self, X: np.ndarray, y=None) -> MarginalKDEGaussianize:
        """Fit a Gaussian KDE to each feature dimension.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.

        Returns
        -------
        self : MarginalKDEGaussianize
            Fitted transform instance.
        """
        self.kdes_ = []
        self.n_features_ = X.shape[1]
        for i in range(self.n_features_):
            # Fit an independent Gaussian KDE per feature
            self.kdes_.append(stats.gaussian_kde(X[:, i], bw_method=self.bw_method))
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Map each feature to N(0, 1) via KDE CDF then probit.

        Computes ``z = Phi^{-1}(F_KDE(x))`` per feature using numerical
        integration of the fitted KDE.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data in the original space.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Gaussianized data.
        """
        Xt = np.zeros_like(X, dtype=float)
        for i in range(self.n_features_):
            # Integrate KDE from -inf to each sample value to get CDF
            u = np.array(
                [self.kdes_[i].integrate_box_1d(-np.inf, xi) for xi in X[:, i]]
            )
            # Clip to avoid +/-inf from the probit function
            u = np.clip(u, self.eps, 1 - self.eps)
            Xt[:, i] = ndtri(u)
        return Xt

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Map standard-normal values back to the original space.

        Numerically inverts the KDE CDF via Brent's root-finding method.
        For each sample *j* and feature *i*, solves::

            F_KDE(x) = Phi(z_j)

        searching on the interval [-100, 100].

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in the Gaussianized space.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Data approximately recovered in the original input space.
            Samples that fail root-finding are set to 0.0.
        """
        from scipy.optimize import brentq

        Xt = np.zeros_like(X, dtype=float)
        for i in range(self.n_features_):
            for j, xj in enumerate(X[:, i]):
                # Map z -> u in (0, 1) via normal CDF
                u = stats.norm.cdf(xj)
                try:
                    # Numerically invert F_KDE(x) = u via root-finding
                    Xt[j, i] = brentq(
                        lambda x, u=u, i=i: (
                            self.kdes_[i].integrate_box_1d(-np.inf, x) - u
                        ),
                        -100,
                        100,
                    )
                except ValueError:
                    # Root not found in [-100, 100]; fall back to zero
                    Xt[j, i] = 0.0
        return Xt
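The forward/inverse pair can be exercised in isolation with SciPy; a sketch using only `scipy.stats.gaussian_kde` and `scipy.optimize.brentq`, mirroring the integrate-then-invert path above:

```python
import numpy as np
from scipy import stats
from scipy.optimize import brentq

rng = np.random.default_rng(3)
x = rng.standard_normal(300)
kde = stats.gaussian_kde(x)  # Scott's rule bandwidth by default

# Smooth CDF value at a point via the KDE's 1-D box integral
u = kde.integrate_box_1d(-np.inf, 0.5)

# Invert the KDE CDF numerically, as inverse_transform does above
x_back = brentq(lambda t: kde.integrate_box_1d(-np.inf, t) - u, -100, 100)

assert abs(x_back - 0.5) < 1e-8  # recovers the original point
```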

fit(X, y=None)

Fit a Gaussian KDE to each feature dimension.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Training data.

Returns

self : MarginalKDEGaussianize
    Fitted transform instance.

Source code in rbig/_src/marginal.py
def fit(self, X: np.ndarray, y=None) -> MarginalKDEGaussianize:
    """Fit a Gaussian KDE to each feature dimension.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.

    Returns
    -------
    self : MarginalKDEGaussianize
        Fitted transform instance.
    """
    self.kdes_ = []
    self.n_features_ = X.shape[1]
    for i in range(self.n_features_):
        # Fit an independent Gaussian KDE per feature
        self.kdes_.append(stats.gaussian_kde(X[:, i], bw_method=self.bw_method))
    return self

transform(X)

Map each feature to N(0, 1) via KDE CDF then probit.

Computes z = Phi^{-1}(F_KDE(x)) per feature using numerical integration of the fitted KDE.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input data in the original space.

Returns

Xt : np.ndarray of shape (n_samples, n_features)
    Gaussianized data.

Source code in rbig/_src/marginal.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Map each feature to N(0, 1) via KDE CDF then probit.

    Computes ``z = Phi^{-1}(F_KDE(x))`` per feature using numerical
    integration of the fitted KDE.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data in the original space.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Gaussianized data.
    """
    Xt = np.zeros_like(X, dtype=float)
    for i in range(self.n_features_):
        # Integrate KDE from -inf to each sample value to get CDF
        u = np.array(
            [self.kdes_[i].integrate_box_1d(-np.inf, xi) for xi in X[:, i]]
        )
        # Clip to avoid +/-inf from the probit function
        u = np.clip(u, self.eps, 1 - self.eps)
        Xt[:, i] = ndtri(u)
    return Xt

inverse_transform(X)

Map standard-normal values back to the original space.

Numerically inverts the KDE CDF via Brent's root-finding method. For each sample j and feature i, solves::

F_KDE(x) = Phi(z_j)

searching on the interval [-100, 100].

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Data in the Gaussianized space.

Returns

Xt : np.ndarray of shape (n_samples, n_features)
    Data approximately recovered in the original input space.
    Samples that fail root-finding are set to 0.0.

Source code in rbig/_src/marginal.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Map standard-normal values back to the original space.

    Numerically inverts the KDE CDF via Brent's root-finding method.
    For each sample *j* and feature *i*, solves::

        F_KDE(x) = Phi(z_j)

    searching on the interval [-100, 100].

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in the Gaussianized space.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Data approximately recovered in the original input space.
        Samples that fail root-finding are set to 0.0.
    """
    from scipy.optimize import brentq

    Xt = np.zeros_like(X, dtype=float)
    for i in range(self.n_features_):
        for j, xj in enumerate(X[:, i]):
            # Map z -> u in (0, 1) via normal CDF
            u = stats.norm.cdf(xj)
            try:
                # Numerically invert F_KDE(x) = u via root-finding
                Xt[j, i] = brentq(
                    lambda x, u=u, i=i: (
                        self.kdes_[i].integrate_box_1d(-np.inf, x) - u
                    ),
                    -100,
                    100,
                )
            except ValueError:
                # Root not found in [-100, 100]; fall back to zero
                Xt[j, i] = 0.0
    return Xt

rbig.SplineGaussianizer

Bases: Bijector

Gaussianize each marginal using monotone PCHIP spline interpolation.

Estimates the marginal CDF from empirical quantiles and fits a shape-preserving (monotone) cubic Hermite spline (PCHIP) from original-space quantile values to the corresponding Gaussian quantiles. The forward transform is::

z = S(x)

where S is the fitted :class:scipy.interpolate.PchipInterpolator mapping data values to standard-normal quantiles. Because PCHIP preserves monotonicity, the mapping is guaranteed to be invertible.

Parameters

n_quantiles : int, default 200
    Number of quantile nodes used to fit the splines. Capped at n_samples
    when fewer training samples are available.
eps : float, default 1e-6
    Clipping margin applied to the Gaussian quantile grid to keep the
    spline endpoints finite (avoids +/-inf at boundary quantiles).

Attributes

splines_ : list of scipy.interpolate.PchipInterpolator
    Forward splines (x -> z) per feature, mapping original-space values
    to standard-normal quantiles.
inv_splines_ : list of scipy.interpolate.PchipInterpolator
    Inverse splines (z -> x) per feature, mapping Gaussian quantiles back
    to original-space values.
n_features_ : int
    Number of feature dimensions seen during fit.

Notes

The log-det-Jacobian uses the analytic first derivative of the spline::

log|dz/dx| = log|S'(x)|

where S' is the first derivative of the PCHIP forward spline, evaluated via spline(x, 1) (the derivative-order argument).

Duplicate x-values (arising from discrete or constant features) are removed before fitting to ensure strict monotonicity.

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Examples

>>> import numpy as np
>>> from rbig._src.marginal import SplineGaussianizer
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((300, 3))
>>> sg = SplineGaussianizer(n_quantiles=100).fit(X)
>>> Z = sg.transform(X)
>>> Z.shape
(300, 3)
>>> ldj = sg.get_log_det_jacobian(X)
>>> ldj.shape
(300,)
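The quantile-node construction described above can be sketched directly with SciPy; this is an illustrative reimplementation of the fitting idea, not the class itself:

```python
import numpy as np
from scipy.interpolate import PchipInterpolator
from scipy.special import ndtri

rng = np.random.default_rng(4)
x = rng.exponential(size=1000)

# Pair empirical quantiles of the data with Gaussian quantiles
probs = np.linspace(0, 1, 100)
x_q = np.quantile(x, probs)
g_q = ndtri(np.clip(probs, 1e-6, 1 - 1e-6))  # clip away from +/-inf

# Drop duplicate x values so the node sequence is strictly increasing
x_q, idx = np.unique(x_q, return_index=True)
g_q = g_q[idx]

fwd = PchipInterpolator(x_q, g_q)  # x -> z
inv = PchipInterpolator(g_q, x_q)  # z -> x

assert np.all(np.diff(fwd(np.sort(x))) >= -1e-12)   # monotone map
assert np.allclose(inv(fwd(x_q)), x_q, atol=1e-8)   # invertible at the nodes
# Analytic first derivative, as used for log|det J|: spline(x, 1)
assert np.all(fwd(x_q, 1) >= 0)
```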

Source code in rbig/_src/marginal.py
class SplineGaussianizer(Bijector):
    """Gaussianize each marginal using monotone PCHIP spline interpolation.

    Estimates the marginal CDF from empirical quantiles and fits a
    shape-preserving (monotone) cubic Hermite spline (PCHIP) from
    original-space quantile values to the corresponding Gaussian quantiles.
    The forward transform is::

        z = S(x)

    where ``S`` is the fitted :class:`scipy.interpolate.PchipInterpolator`
    mapping data values to standard-normal quantiles.  Because PCHIP
    preserves monotonicity, the mapping is guaranteed to be invertible.

    Parameters
    ----------
    n_quantiles : int, default 200
        Number of quantile nodes used to fit the splines.  Capped at
        ``n_samples`` when fewer training samples are available.
    eps : float, default 1e-6
        Clipping margin applied to the Gaussian quantile grid to keep
        the spline endpoints finite (avoids +/-inf at boundary quantiles).

    Attributes
    ----------
    splines_ : list of scipy.interpolate.PchipInterpolator
        Forward splines (x -> z) per feature, mapping original-space
        values to standard-normal quantiles.
    inv_splines_ : list of scipy.interpolate.PchipInterpolator
        Inverse splines (z -> x) per feature, mapping Gaussian quantiles
        back to original-space values.
    n_features_ : int
        Number of feature dimensions seen during ``fit``.

    Notes
    -----
    The log-det-Jacobian uses the analytic first derivative of the spline::

        log|dz/dx| = log|S'(x)|

    where ``S'`` is the first derivative of the PCHIP forward spline,
    evaluated via ``spline(x, 1)`` (the derivative-order argument).

    Duplicate x-values (arising from discrete or constant features) are
    removed before fitting to ensure strict monotonicity.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.marginal import SplineGaussianizer
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((300, 3))
    >>> sg = SplineGaussianizer(n_quantiles=100).fit(X)
    >>> Z = sg.transform(X)
    >>> Z.shape
    (300, 3)
    >>> ldj = sg.get_log_det_jacobian(X)
    >>> ldj.shape
    (300,)
    """

    def __init__(self, n_quantiles: int = 200, eps: float = 1e-6):
        self.n_quantiles = n_quantiles
        self.eps = eps

    def fit(self, X: np.ndarray, y=None) -> SplineGaussianizer:
        """Fit forward and inverse PCHIP splines for each feature.

        For each dimension, ``n_quantiles`` evenly-spaced probability levels
        are mapped to their empirical quantile values in the data, and the
        corresponding Gaussian quantile values ``Phi^{-1}(p)`` are computed.
        PCHIP interpolants are then fitted in both directions.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.

        Returns
        -------
        self : SplineGaussianizer
            Fitted bijector instance.
        """
        from scipy.interpolate import PchipInterpolator

        self.splines_ = []
        self.inv_splines_ = []
        self.n_features_ = X.shape[1]
        # Use at most n_samples quantile nodes
        n_q = min(self.n_quantiles, X.shape[0])
        # Probability grid: n_q evenly-spaced points in [0, 1]
        quantiles = np.linspace(0, 1, n_q)
        # Corresponding Gaussian quantiles Phi^{-1}(p), clipped away from +/-inf
        g_q = ndtri(np.clip(quantiles, self.eps, 1 - self.eps))
        for i in range(self.n_features_):
            xi_sorted = np.sort(X[:, i])
            # Empirical quantile values at each probability level
            x_q = np.quantile(xi_sorted, quantiles)
            # Remove duplicate x values so PchipInterpolator gets a strictly
            # increasing sequence (duplicates arise with discrete/tied data).
            x_q_u, idx = np.unique(x_q, return_index=True)
            g_q_u = g_q[idx]
            self.splines_.append(PchipInterpolator(x_q_u, g_q_u))
            self.inv_splines_.append(PchipInterpolator(g_q_u, x_q_u))
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the forward spline map: x -> z = S(x).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data in the original space.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Gaussianized data with approximately standard-normal marginals.
        """
        Xt = np.zeros_like(X, dtype=float)
        for i in range(self.n_features_):
            # Evaluate the forward PCHIP spline at the input values
            Xt[:, i] = self.splines_[i](X[:, i])
        return Xt

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the inverse spline map: z -> x = S^{-1}(z).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in the Gaussianized space.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Data approximately recovered in the original input space.
        """
        Xt = np.zeros_like(X, dtype=float)
        for i in range(self.n_features_):
            # Evaluate the inverse PCHIP spline at the Gaussian values
            Xt[:, i] = self.inv_splines_[i](X[:, i])
        return Xt

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Compute log |det J| using the analytic spline first derivative.

        Because the Jacobian is diagonal::

            log|det J| = sum_i log|S'(x_i)|

        where ``S'`` is the first derivative of the PCHIP forward spline,
        evaluated via ``spline(x, 1)`` (the derivative order argument).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points at which to evaluate the log-det-Jacobian.

        Returns
        -------
        log_det : np.ndarray of shape (n_samples,)
            Per-sample log absolute determinant.
        """
        log_det = np.zeros(X.shape[0])
        for i in range(self.n_features_):
            deriv = self.splines_[i](X[:, i], 1)  # first derivative
            log_det += np.log(np.maximum(np.abs(deriv), 1e-300))
        return log_det

fit(X, y=None)

Fit forward and inverse PCHIP splines for each feature.

For each dimension, n_quantiles evenly-spaced probability levels are mapped to their empirical quantile values in the data, and the corresponding Gaussian quantile values Phi^{-1}(p) are computed. PCHIP interpolants are then fitted in both directions.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Training data.

Returns

self : SplineGaussianizer
    Fitted bijector instance.

Source code in rbig/_src/marginal.py
def fit(self, X: np.ndarray, y=None) -> SplineGaussianizer:
    """Fit forward and inverse PCHIP splines for each feature.

    For each dimension, ``n_quantiles`` evenly-spaced probability levels
    are mapped to their empirical quantile values in the data, and the
    corresponding Gaussian quantile values ``Phi^{-1}(p)`` are computed.
    PCHIP interpolants are then fitted in both directions.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.

    Returns
    -------
    self : SplineGaussianizer
        Fitted bijector instance.
    """
    from scipy.interpolate import PchipInterpolator

    self.splines_ = []
    self.inv_splines_ = []
    self.n_features_ = X.shape[1]
    # Use at most n_samples quantile nodes
    n_q = min(self.n_quantiles, X.shape[0])
    # Probability grid: n_q evenly-spaced points in [0, 1]
    quantiles = np.linspace(0, 1, n_q)
    # Corresponding Gaussian quantiles Phi^{-1}(p), clipped away from +/-inf
    g_q = ndtri(np.clip(quantiles, self.eps, 1 - self.eps))
    for i in range(self.n_features_):
        xi_sorted = np.sort(X[:, i])
        # Empirical quantile values at each probability level
        x_q = np.quantile(xi_sorted, quantiles)
        # Remove duplicate x values so PchipInterpolator gets a strictly
        # increasing sequence (duplicates arise with discrete/tied data).
        x_q_u, idx = np.unique(x_q, return_index=True)
        g_q_u = g_q[idx]
        self.splines_.append(PchipInterpolator(x_q_u, g_q_u))
        self.inv_splines_.append(PchipInterpolator(g_q_u, x_q_u))
    return self
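For readers who want to see the construction outside the class, here is a standalone sketch using SciPy directly (the helper name `fit_marginal_splines` is illustrative, not part of the rbig API). It builds the forward and inverse PCHIP splines for a single feature, including the duplicate-knot removal needed for tied or discrete data:

```python
import numpy as np
from scipy.interpolate import PchipInterpolator
from scipy.special import ndtri

def fit_marginal_splines(x, n_quantiles=100, eps=1e-6):
    """Forward (x -> z) and inverse (z -> x) PCHIP splines for one feature."""
    n_q = min(n_quantiles, x.shape[0])
    probs = np.linspace(0, 1, n_q)
    # Gaussian quantiles Phi^{-1}(p), clipped away from +/-inf
    g_q = ndtri(np.clip(probs, eps, 1 - eps))
    # Empirical quantiles of the data at the same probability levels
    x_q = np.quantile(x, probs)
    # Drop duplicate knots so the interpolator sees a strictly increasing grid
    x_u, idx = np.unique(x_q, return_index=True)
    return PchipInterpolator(x_u, g_q[idx]), PchipInterpolator(g_q[idx], x_u)

rng = np.random.default_rng(0)
x = np.round(rng.standard_normal(500), 1)  # rounding creates ties deliberately
fwd, inv = fit_marginal_splines(x)
z = fwd(x)        # finite despite tied values, thanks to duplicate removal
x_back = inv(z)   # approximate round-trip through the inverse spline
print(np.all(np.isfinite(z)), np.max(np.abs(x_back - x)))
```

The inverse spline is fitted on the same knot pairs in reverse, so the round-trip is exact at the knots and only approximate between them.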

transform(X)

Apply the forward spline map: x -> z = S(x).

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input data in the original space.

Returns

Xt : np.ndarray of shape (n_samples, n_features)
    Gaussianized data with approximately standard-normal marginals.

Source code in rbig/_src/marginal.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the forward spline map: x -> z = S(x).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data in the original space.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Gaussianized data with approximately standard-normal marginals.
    """
    Xt = np.zeros_like(X, dtype=float)
    for i in range(self.n_features_):
        # Evaluate the forward PCHIP spline at the input values
        Xt[:, i] = self.splines_[i](X[:, i])
    return Xt

inverse_transform(X)

Apply the inverse spline map: z -> x = S^{-1}(z).

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Data in the Gaussianized space.

Returns

Xt : np.ndarray of shape (n_samples, n_features)
    Data approximately recovered in the original input space.

Source code in rbig/_src/marginal.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the inverse spline map: z -> x = S^{-1}(z).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in the Gaussianized space.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Data approximately recovered in the original input space.
    """
    Xt = np.zeros_like(X, dtype=float)
    for i in range(self.n_features_):
        # Evaluate the inverse PCHIP spline at the Gaussian values
        Xt[:, i] = self.inv_splines_[i](X[:, i])
    return Xt

get_log_det_jacobian(X)

Compute log |det J| using the analytic spline first derivative.

Because the Jacobian is diagonal::

log|det J| = sum_i log|S'(x_i)|

where S' is the first derivative of the PCHIP forward spline, evaluated via spline(x, 1) (the derivative order argument).

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input points at which to evaluate the log-det-Jacobian.

Returns

log_det : np.ndarray of shape (n_samples,)
    Per-sample log absolute determinant.

Source code in rbig/_src/marginal.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Compute log |det J| using the analytic spline first derivative.

    Because the Jacobian is diagonal::

        log|det J| = sum_i log|S'(x_i)|

    where ``S'`` is the first derivative of the PCHIP forward spline,
    evaluated via ``spline(x, 1)`` (the derivative order argument).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points at which to evaluate the log-det-Jacobian.

    Returns
    -------
    log_det : np.ndarray of shape (n_samples,)
        Per-sample log absolute determinant.
    """
    log_det = np.zeros(X.shape[0])
    for i in range(self.n_features_):
        deriv = self.splines_[i](X[:, i], 1)  # first derivative
        log_det += np.log(np.maximum(np.abs(deriv), 1e-300))
    return log_det
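As a sanity check, the analytic derivative returned by `spline(x, 1)` can be compared against a central finite difference; the two should agree closely since PCHIP interpolants are continuously differentiable. A standalone sketch using SciPy directly (not the rbig API):

```python
import numpy as np
from scipy.interpolate import PchipInterpolator
from scipy.special import ndtri

rng = np.random.default_rng(1)
x = np.sort(rng.standard_normal(200))
probs = np.linspace(0, 1, 50)
g = ndtri(np.clip(probs, 1e-6, 1 - 1e-6))
knots, idx = np.unique(np.quantile(x, probs), return_index=True)
spline = PchipInterpolator(knots, g[idx])

pts = np.linspace(-1.5, 1.5, 7)
analytic = spline(pts, 1)  # derivative-order argument: first derivative
h = 1e-6
numeric = (spline(pts + h) - spline(pts - h)) / (2 * h)  # central difference
# per-point log|dz/dx| as used by get_log_det_jacobian, floored to stay finite
log_det = np.log(np.maximum(np.abs(analytic), 1e-300))
print(np.max(np.abs(analytic - numeric)))
```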

rbig.KDEGaussianizer

Bases: Bijector

Gaussianize each marginal using a KDE-estimated CDF and probit.

Fits a Gaussian kernel density estimate (KDE) to each feature dimension, then maps samples to standard-normal values via::

z = Phi^{-1}(F_KDE(x))

where F_KDE(x) = integral_{-inf}^{x} f_KDE(t) dt is the smooth KDE-based CDF and Phi^{-1} is the Gaussian probit (inverse CDF).

Parameters

bw_method : str, float, or None, default None
    Bandwidth selection passed to :class:`scipy.stats.gaussian_kde`.
    ``None`` uses Scott's rule; ``'silverman'`` uses Silverman's rule;
    a scalar sets the smoothing factor directly.
eps : float, default 1e-6
    Clipping margin applied to the CDF value before the probit to
    prevent ``Phi^{-1}(0) = -inf`` or ``Phi^{-1}(1) = +inf``.

Attributes

kdes_ : list of scipy.stats.gaussian_kde
    One fitted KDE per feature dimension.
n_features_ : int
    Number of feature dimensions seen during ``fit``.

Notes

The log-det-Jacobian uses the analytic KDE density::

log|dz/dx| = log f_KDE(x) - log phi(z)

where phi is the standard-normal PDF evaluated at the Gaussianized value z = Phi^{-1}(F_KDE(x)).

The inverse transform uses Brent's root-finding algorithm to numerically invert F_KDE(x) = Phi(z) on the interval [-100, 100].

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Examples

>>> import numpy as np
>>> from rbig._src.marginal import KDEGaussianizer
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((100, 2))
>>> kde = KDEGaussianizer().fit(X)
>>> Z = kde.transform(X)
>>> Z.shape
(100, 2)
>>> ldj = kde.get_log_det_jacobian(X)
>>> ldj.shape
(100,)

Source code in rbig/_src/marginal.py
class KDEGaussianizer(Bijector):
    """Gaussianize each marginal using a KDE-estimated CDF and probit.

    Fits a Gaussian kernel density estimate (KDE) to each feature dimension,
    then maps samples to standard-normal values via::

        z = Phi^{-1}(F_KDE(x))

    where ``F_KDE(x) = integral_{-inf}^{x} f_KDE(t) dt`` is the smooth
    KDE-based CDF and ``Phi^{-1}`` is the Gaussian probit (inverse CDF).

    Parameters
    ----------
    bw_method : str, float, or None, default None
        Bandwidth selection passed to :class:`scipy.stats.gaussian_kde`.
        ``None`` uses Scott's rule; ``'silverman'`` uses Silverman's rule;
        a scalar sets the smoothing factor directly.
    eps : float, default 1e-6
        Clipping margin applied to the CDF value before the probit to
        prevent ``Phi^{-1}(0) = -inf`` or ``Phi^{-1}(1) = +inf``.

    Attributes
    ----------
    kdes_ : list of scipy.stats.gaussian_kde
        One fitted KDE per feature dimension.
    n_features_ : int
        Number of feature dimensions seen during ``fit``.

    Notes
    -----
    The log-det-Jacobian uses the analytic KDE density::

        log|dz/dx| = log f_KDE(x) - log phi(z)

    where ``phi`` is the standard-normal PDF evaluated at the Gaussianized
    value ``z = Phi^{-1}(F_KDE(x))``.

    The inverse transform uses Brent's root-finding algorithm to numerically
    invert ``F_KDE(x) = Phi(z)`` on the interval [-100, 100].

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.marginal import KDEGaussianizer
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((100, 2))
    >>> kde = KDEGaussianizer().fit(X)
    >>> Z = kde.transform(X)
    >>> Z.shape
    (100, 2)
    >>> ldj = kde.get_log_det_jacobian(X)
    >>> ldj.shape
    (100,)
    """

    def __init__(self, bw_method: str | float | None = None, eps: float = 1e-6):
        self.bw_method = bw_method
        self.eps = eps

    def fit(self, X: np.ndarray, y=None) -> KDEGaussianizer:
        """Fit a Gaussian KDE to each feature dimension.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.

        Returns
        -------
        self : KDEGaussianizer
            Fitted bijector instance.
        """
        self.kdes_ = []
        self.n_features_ = X.shape[1]
        for i in range(self.n_features_):
            # Independent KDE per feature
            self.kdes_.append(stats.gaussian_kde(X[:, i], bw_method=self.bw_method))
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Map each feature to N(0, 1) via KDE CDF then probit.

        Computes ``z = Phi^{-1}(F_KDE(x))`` for each feature independently.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data in the original space.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Gaussianized data with approximately standard-normal marginals.
        """
        Xt = np.zeros_like(X, dtype=float)
        for i in range(self.n_features_):
            # Numerical integration of KDE from -inf to each sample value
            u = np.array(
                [self.kdes_[i].integrate_box_1d(-np.inf, xi) for xi in X[:, i]]
            )
            # Clip CDF values away from boundaries before probit
            u = np.clip(u, self.eps, 1 - self.eps)
            Xt[:, i] = ndtri(u)
        return Xt

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Map standard-normal values back to the original space.

        Numerically inverts ``F_KDE(x) = Phi(z)`` using Brent's method
        on the interval [-100, 100] per sample and feature.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in the Gaussianized (standard-normal) space.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Data approximately recovered in the original input space.
            Samples for which root-finding fails default to 0.0.
        """
        from scipy.optimize import brentq

        Xt = np.zeros_like(X, dtype=float)
        for i in range(self.n_features_):
            for j, xj in enumerate(X[:, i]):
                # Map z -> u = Phi(z) in (0, 1)
                u = stats.norm.cdf(xj)
                try:
                    # Find x such that F_KDE(x) = u
                    Xt[j, i] = brentq(
                        lambda x, u=u, i=i: (
                            self.kdes_[i].integrate_box_1d(-np.inf, x) - u
                        ),
                        -100,
                        100,
                    )
                except ValueError:
                    # Root not bracketed in [-100, 100]; use zero as fallback
                    Xt[j, i] = 0.0
        return Xt

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Compute log |det J| using the analytic KDE density.

        Because the Jacobian is diagonal (each feature transformed
        independently)::

            log|det J| = sum_i log|dz_i/dx_i|
                       = sum_i [log f_KDE(x_i) - log phi(z_i)]

        where ``phi`` is the standard-normal PDF evaluated at
        ``z_i = Phi^{-1}(F_KDE(x_i))``.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points at which to evaluate the log-det-Jacobian.

        Returns
        -------
        log_det : np.ndarray of shape (n_samples,)
            Per-sample log absolute determinant.
        """
        log_det = np.zeros(X.shape[0])
        for i in range(self.n_features_):
            # Evaluate KDE density (used as the empirical marginal PDF)
            pdf = self.kdes_[i](X[:, i])
            # Compute KDE CDF via numerical integration
            u = np.array(
                [self.kdes_[i].integrate_box_1d(-np.inf, xi) for xi in X[:, i]]
            )
            u = np.clip(u, self.eps, 1 - self.eps)
            g = ndtri(u)  # Gaussianized value z = Phi^{-1}(u)
            log_phi = stats.norm.logpdf(g)  # log phi(z)
            # log|dz/dx| = log f_KDE(x) - log phi(z)
            log_det += np.log(np.maximum(pdf, 1e-300)) - log_phi
        return log_det

fit(X, y=None)

Fit a Gaussian KDE to each feature dimension.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Training data.

Returns

self : KDEGaussianizer
    Fitted bijector instance.

Source code in rbig/_src/marginal.py
def fit(self, X: np.ndarray, y=None) -> KDEGaussianizer:
    """Fit a Gaussian KDE to each feature dimension.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.

    Returns
    -------
    self : KDEGaussianizer
        Fitted bijector instance.
    """
    self.kdes_ = []
    self.n_features_ = X.shape[1]
    for i in range(self.n_features_):
        # Independent KDE per feature
        self.kdes_.append(stats.gaussian_kde(X[:, i], bw_method=self.bw_method))
    return self

transform(X)

Map each feature to N(0, 1) via KDE CDF then probit.

Computes z = Phi^{-1}(F_KDE(x)) for each feature independently.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input data in the original space.

Returns

Xt : np.ndarray of shape (n_samples, n_features)
    Gaussianized data with approximately standard-normal marginals.

Source code in rbig/_src/marginal.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Map each feature to N(0, 1) via KDE CDF then probit.

    Computes ``z = Phi^{-1}(F_KDE(x))`` for each feature independently.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data in the original space.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Gaussianized data with approximately standard-normal marginals.
    """
    Xt = np.zeros_like(X, dtype=float)
    for i in range(self.n_features_):
        # Numerical integration of KDE from -inf to each sample value
        u = np.array(
            [self.kdes_[i].integrate_box_1d(-np.inf, xi) for xi in X[:, i]]
        )
        # Clip CDF values away from boundaries before probit
        u = np.clip(u, self.eps, 1 - self.eps)
        Xt[:, i] = ndtri(u)
    return Xt

inverse_transform(X)

Map standard-normal values back to the original space.

Numerically inverts F_KDE(x) = Phi(z) using Brent's method on the interval [-100, 100] per sample and feature.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Data in the Gaussianized (standard-normal) space.

Returns

Xt : np.ndarray of shape (n_samples, n_features)
    Data approximately recovered in the original input space.
    Samples for which root-finding fails default to 0.0.

Source code in rbig/_src/marginal.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Map standard-normal values back to the original space.

    Numerically inverts ``F_KDE(x) = Phi(z)`` using Brent's method
    on the interval [-100, 100] per sample and feature.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in the Gaussianized (standard-normal) space.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Data approximately recovered in the original input space.
        Samples for which root-finding fails default to 0.0.
    """
    from scipy.optimize import brentq

    Xt = np.zeros_like(X, dtype=float)
    for i in range(self.n_features_):
        for j, xj in enumerate(X[:, i]):
            # Map z -> u = Phi(z) in (0, 1)
            u = stats.norm.cdf(xj)
            try:
                # Find x such that F_KDE(x) = u
                Xt[j, i] = brentq(
                    lambda x, u=u, i=i: (
                        self.kdes_[i].integrate_box_1d(-np.inf, x) - u
                    ),
                    -100,
                    100,
                )
            except ValueError:
                # Root not bracketed in [-100, 100]; use zero as fallback
                Xt[j, i] = 0.0
    return Xt
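The numerical inversion can be reproduced with SciPy alone. A minimal standalone sketch (not the rbig API): build a 1-D KDE, then solve `F_KDE(x) = Phi(z)` for a single Gaussian value `z` with `brentq`:

```python
import numpy as np
from scipy import stats
from scipy.optimize import brentq

rng = np.random.default_rng(0)
kde = stats.gaussian_kde(rng.standard_normal(300))

def kde_cdf(x):
    # F_KDE(x) = integral of the KDE density from -inf to x
    return kde.integrate_box_1d(-np.inf, x)

z = 0.7                # a value in the Gaussianized space
u = stats.norm.cdf(z)  # target CDF level u = Phi(z)
# Find x with F_KDE(x) = u; the bracket [-100, 100] mirrors the method above
x_star = brentq(lambda x: kde_cdf(x) - u, -100, 100)
print(x_star, kde_cdf(x_star))
```

If the root is not bracketed by [-100, 100], `brentq` raises `ValueError`; the method above catches that and falls back to 0.0.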

get_log_det_jacobian(X)

Compute log |det J| using the analytic KDE density.

Because the Jacobian is diagonal (each feature transformed independently)::

log|det J| = sum_i log|dz_i/dx_i|
           = sum_i [log f_KDE(x_i) - log phi(z_i)]

where phi is the standard-normal PDF evaluated at z_i = Phi^{-1}(F_KDE(x_i)).

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input points at which to evaluate the log-det-Jacobian.

Returns

log_det : np.ndarray of shape (n_samples,)
    Per-sample log absolute determinant.

Source code in rbig/_src/marginal.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Compute log |det J| using the analytic KDE density.

    Because the Jacobian is diagonal (each feature transformed
    independently)::

        log|det J| = sum_i log|dz_i/dx_i|
                   = sum_i [log f_KDE(x_i) - log phi(z_i)]

    where ``phi`` is the standard-normal PDF evaluated at
    ``z_i = Phi^{-1}(F_KDE(x_i))``.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points at which to evaluate the log-det-Jacobian.

    Returns
    -------
    log_det : np.ndarray of shape (n_samples,)
        Per-sample log absolute determinant.
    """
    log_det = np.zeros(X.shape[0])
    for i in range(self.n_features_):
        # Evaluate KDE density (used as the empirical marginal PDF)
        pdf = self.kdes_[i](X[:, i])
        # Compute KDE CDF via numerical integration
        u = np.array(
            [self.kdes_[i].integrate_box_1d(-np.inf, xi) for xi in X[:, i]]
        )
        u = np.clip(u, self.eps, 1 - self.eps)
        g = ndtri(u)  # Gaussianized value z = Phi^{-1}(u)
        log_phi = stats.norm.logpdf(g)  # log phi(z)
        # log|dz/dx| = log f_KDE(x) - log phi(z)
        log_det += np.log(np.maximum(pdf, 1e-300)) - log_phi
    return log_det
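A useful consequence, shown here as a standalone numerical sketch (SciPy only, not the rbig API): combining this log-det-Jacobian with the standard-normal base density recovers exactly the KDE log-density, because the `log phi(z)` terms cancel in the change-of-variables formula:

```python
import numpy as np
from scipy import stats
from scipy.special import ndtri

rng = np.random.default_rng(0)
kde = stats.gaussian_kde(rng.standard_normal(300))

x = np.linspace(-2, 2, 9)
u = np.clip([kde.integrate_box_1d(-np.inf, xi) for xi in x], 1e-6, 1 - 1e-6)
z = ndtri(u)                                     # Gaussianized values
log_det = np.log(kde(x)) - stats.norm.logpdf(z)  # log|dz/dx|
flow_logp = stats.norm.logpdf(z) + log_det       # log p(x) via change of variables
# flow_logp equals log f_KDE(x) identically, up to floating-point rounding
print(np.max(np.abs(flow_logp - np.log(kde(x)))))
```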

rbig.GMMGaussianizer

Bases: Bijector

Gaussianize each marginal using a Gaussian Mixture Model (GMM) CDF.

Fits a univariate GMM with n_components Gaussian components to each feature dimension, then maps samples to standard-normal values via the analytic GMM CDF::

F_GMM(x) = sum_k w_k * Phi((x - mu_k) / sigma_k)

followed by the probit function::

z = Phi^{-1}(F_GMM(x))

where Phi is the standard-normal CDF, w_k are mixture weights, and mu_k, sigma_k are the component means and standard deviations.

Parameters

n_components : int, default 5
    Number of mixture components. Capped at
    ``max(1, min(n_components, n_samples // 5, n_samples))`` during
    ``fit`` to avoid over-fitting on small data sets.
random_state : int or None, default 0
    Seed for reproducible GMM initialisation.

Attributes

gmms_ : list of sklearn.mixture.GaussianMixture
    One fitted 1-D GMM per feature dimension.
n_features_ : int
    Number of feature dimensions seen during ``fit``.

Notes

The log-det-Jacobian uses the analytic GMM density::

log|dz/dx| = log f_GMM(x) - log phi(z)

where f_GMM(x) = sum_k w_k * phi((x - mu_k) / sigma_k) / sigma_k is the GMM PDF and phi is the standard-normal PDF.

The inverse transform numerically inverts the GMM CDF via Brent's method on [-50, 50]; samples outside this range default to 0.0.

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Examples

>>> import numpy as np
>>> from rbig._src.marginal import GMMGaussianizer
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((200, 2))
>>> gmm = GMMGaussianizer(n_components=3).fit(X)
>>> Z = gmm.transform(X)
>>> Z.shape
(200, 2)
>>> ldj = gmm.get_log_det_jacobian(X)
>>> ldj.shape
(200,)

Source code in rbig/_src/marginal.py
class GMMGaussianizer(Bijector):
    """Gaussianize each marginal using a Gaussian Mixture Model (GMM) CDF.

    Fits a univariate GMM with ``n_components`` Gaussian components to each
    feature dimension, then maps samples to standard-normal values via the
    analytic GMM CDF::

        F_GMM(x) = sum_k w_k * Phi((x - mu_k) / sigma_k)

    followed by the probit function::

        z = Phi^{-1}(F_GMM(x))

    where ``Phi`` is the standard-normal CDF, ``w_k`` are mixture weights,
    and ``mu_k``, ``sigma_k`` are the component means and standard deviations.

    Parameters
    ----------
    n_components : int, default 5
        Number of mixture components.  Capped at
        ``max(1, min(n_components, n_samples // 5, n_samples))`` during
        ``fit`` to avoid over-fitting on small data sets.
    random_state : int or None, default 0
        Seed for reproducible GMM initialisation.

    Attributes
    ----------
    gmms_ : list of sklearn.mixture.GaussianMixture
        One fitted 1-D GMM per feature dimension.
    n_features_ : int
        Number of feature dimensions seen during ``fit``.

    Notes
    -----
    The log-det-Jacobian uses the analytic GMM density::

        log|dz/dx| = log f_GMM(x) - log phi(z)

    where ``f_GMM(x) = sum_k w_k * phi((x - mu_k) / sigma_k) / sigma_k``
    is the GMM PDF and ``phi`` is the standard-normal PDF.

    The inverse transform numerically inverts the GMM CDF via Brent's
    method on [-50, 50]; samples outside this range default to 0.0.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.marginal import GMMGaussianizer
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((200, 2))
    >>> gmm = GMMGaussianizer(n_components=3).fit(X)
    >>> Z = gmm.transform(X)
    >>> Z.shape
    (200, 2)
    >>> ldj = gmm.get_log_det_jacobian(X)
    >>> ldj.shape
    (200,)
    """

    def __init__(self, n_components: int = 5, random_state: int | None = 0):
        self.n_components = n_components
        self.random_state = random_state

    def fit(self, X: np.ndarray, y=None) -> GMMGaussianizer:
        """Fit a univariate GMM to each feature dimension.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.

        Returns
        -------
        self : GMMGaussianizer
            Fitted bijector instance.

        Notes
        -----
        The number of mixture components is capped at
        ``max(1, min(n_components, n_samples // 5, n_samples))`` to avoid
        over-fitting when ``n_samples`` is small.
        """
        from sklearn.mixture import GaussianMixture

        self.gmms_ = []
        self.n_features_ = X.shape[1]
        n_samples = X.shape[0]
        # Cap n_components to avoid GMMs with more components than data points
        n_components = max(1, min(self.n_components, n_samples // 5, n_samples))
        for i in range(self.n_features_):
            gmm = GaussianMixture(
                n_components=n_components,
                random_state=self.random_state,
            )
            # Reshape to (n_samples, 1) as required by sklearn GMM API
            gmm.fit(X[:, i : i + 1])
            self.gmms_.append(gmm)
        return self

    def _cdf(self, gmm, x: np.ndarray) -> np.ndarray:
        """Compute the GMM CDF at points x (1-D).

        Evaluates the mixture CDF::

            F_GMM(x) = sum_k w_k * Phi((x - mu_k) / sigma_k)

        Parameters
        ----------
        gmm : sklearn.mixture.GaussianMixture
            Fitted 1-D GMM.
        x : np.ndarray of shape (n_samples,)
            Query points.

        Returns
        -------
        cdf : np.ndarray of shape (n_samples,)
            GMM CDF values in [0, 1].
        """
        weights = gmm.weights_  # mixture weights, shape (K,)
        means = gmm.means_.ravel()  # component means, shape (K,)
        stds = np.sqrt(gmm.covariances_.ravel())  # component stds, shape (K,)
        cdf = np.zeros_like(x, dtype=float)
        for w, mu, sigma in zip(weights, means, stds, strict=False):
            # Weighted sum of normal CDFs: w_k * Phi((x - mu_k) / sigma_k)
            cdf += w * stats.norm.cdf(x, loc=mu, scale=sigma)
        return cdf

    def _pdf(self, gmm, x: np.ndarray) -> np.ndarray:
        """Compute the GMM PDF at points x (1-D).

        Evaluates the mixture density::

            f_GMM(x) = sum_k w_k * phi((x - mu_k) / sigma_k) / sigma_k

        Parameters
        ----------
        gmm : sklearn.mixture.GaussianMixture
            Fitted 1-D GMM.
        x : np.ndarray of shape (n_samples,)
            Query points.

        Returns
        -------
        pdf : np.ndarray of shape (n_samples,)
            GMM PDF values (>= 0).
        """
        weights = gmm.weights_
        means = gmm.means_.ravel()
        stds = np.sqrt(gmm.covariances_.ravel())
        pdf = np.zeros_like(x, dtype=float)
        for w, mu, sigma in zip(weights, means, stds, strict=False):
            # Weighted sum of normal PDFs: w_k * phi((x - mu_k) / sigma_k) / sigma_k
            pdf += w * stats.norm.pdf(x, loc=mu, scale=sigma)
        return pdf

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Map each feature to N(0, 1) via GMM CDF then probit.

        Applies ``z = Phi^{-1}(F_GMM(x))`` to each feature independently.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data in the original space.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Gaussianized data with approximately standard-normal marginals.
        """
        Xt = np.zeros_like(X, dtype=float)
        for i in range(self.n_features_):
            # Evaluate analytic GMM CDF
            u = self._cdf(self.gmms_[i], X[:, i])
            # Clip away from boundaries before probit
            u = np.clip(u, 1e-6, 1 - 1e-6)
            Xt[:, i] = ndtri(u)
        return Xt

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Map standard-normal values back to the original space.

        Numerically inverts ``F_GMM(x) = Phi(z)`` per sample via Brent's
        method on the interval [-50, 50].

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in the Gaussianized (standard-normal) space.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Data approximately recovered in the original input space.
            Samples for which root-finding fails default to 0.0.
        """
        from scipy.optimize import brentq

        Xt = np.zeros_like(X, dtype=float)
        for i in range(self.n_features_):
            for j, xj in enumerate(X[:, i]):
                # Map z -> u = Phi(z) in (0, 1)
                u = stats.norm.cdf(xj)
                try:
                    # Numerically solve F_GMM(x) = u for x
                    Xt[j, i] = brentq(
                        lambda x, u=u, i=i: (
                            self._cdf(self.gmms_[i], np.array([x]))[0] - u
                        ),
                        -50,
                        50,
                    )
                except ValueError:
                    # Root not found in [-50, 50]; fall back to zero
                    Xt[j, i] = 0.0
        return Xt

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Compute log |det J| using the analytic GMM density.

        Because the Jacobian is diagonal (each feature transformed
        independently)::

            log|det J| = sum_i log|dz_i/dx_i|
                       = sum_i [log f_GMM(x_i) - log phi(z_i)]

        where ``z_i = Phi^{-1}(F_GMM(x_i))`` and ``phi`` is the
        standard-normal PDF.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points at which to evaluate the log-det-Jacobian.

        Returns
        -------
        log_det : np.ndarray of shape (n_samples,)
            Per-sample log absolute determinant.
        """
        log_det = np.zeros(X.shape[0])
        for i in range(self.n_features_):
            # Evaluate GMM CDF and clip to avoid probit boundary issues
            u = self._cdf(self.gmms_[i], X[:, i])
            u = np.clip(u, 1e-6, 1 - 1e-6)
            g = ndtri(u)  # z = Phi^{-1}(F_GMM(x))
            pdf = self._pdf(self.gmms_[i], X[:, i])  # f_GMM(x)
            log_phi = stats.norm.logpdf(g)  # log phi(z)
            # log|dz/dx| = log f_GMM(x) - log phi(z)
            log_det += np.log(np.maximum(pdf, 1e-300)) - log_phi
        return log_det
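The CDF-then-probit pipeline above can be sketched with sklearn's `GaussianMixture` directly. This is a self-contained illustration of the same math, not the `GMMGaussianizer` API itself:

```python
import numpy as np
from scipy.special import ndtri
from scipy.stats import norm
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Bimodal 1D data the GMM can fit almost exactly
x = np.concatenate([rng.normal(-2.0, 0.5, 250), rng.normal(1.0, 1.0, 250)])

gmm = GaussianMixture(n_components=2, random_state=0).fit(x[:, None])
w = gmm.weights_
mu = gmm.means_.ravel()
sigma = np.sqrt(gmm.covariances_.ravel())

# Analytic mixture CDF: F_GMM(x) = sum_k w_k * Phi((x - mu_k) / sigma_k)
u = np.sum(w * norm.cdf((x[:, None] - mu) / sigma), axis=1)
u = np.clip(u, 1e-6, 1 - 1e-6)  # keep the probit finite at the boundaries
z = ndtri(u)                     # z = Phi^{-1}(F_GMM(x)), approximately N(0, 1)
```

Because the fitted mixture closely matches the data-generating distribution, `u` is approximately uniform and `z` approximately standard normal.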

fit(X, y=None)

Fit a univariate GMM to each feature dimension.

Parameters

X : np.ndarray of shape (n_samples, n_features) Training data.

Returns

self : GMMGaussianizer Fitted bijector instance.

Notes

The number of mixture components is capped at max(1, min(n_components, n_samples // 5, n_samples)) to avoid over-fitting when n_samples is small.

Source code in rbig/_src/marginal.py
def fit(self, X: np.ndarray, y=None) -> GMMGaussianizer:
    """Fit a univariate GMM to each feature dimension.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.

    Returns
    -------
    self : GMMGaussianizer
        Fitted bijector instance.

    Notes
    -----
    The number of mixture components is capped at
    ``max(1, min(n_components, n_samples // 5, n_samples))`` to avoid
    over-fitting when ``n_samples`` is small.
    """
    from sklearn.mixture import GaussianMixture

    self.gmms_ = []
    self.n_features_ = X.shape[1]
    n_samples = X.shape[0]
    # Cap n_components to avoid GMMs with more components than data points
    n_components = max(1, min(self.n_components, n_samples // 5, n_samples))
    for i in range(self.n_features_):
        gmm = GaussianMixture(
            n_components=n_components,
            random_state=self.random_state,
        )
        # Slice as a (n_samples, 1) column, as required by the sklearn GMM API
        gmm.fit(X[:, i : i + 1])
        self.gmms_.append(gmm)
    return self
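The cap described in the Notes can be checked directly with hypothetical numbers:

```python
# With 12 samples and a requested 10 components, the cap
# max(1, min(n_components, n_samples // 5, n_samples)) keeps only 2
n_components, n_samples = 10, 12
capped = max(1, min(n_components, n_samples // 5, n_samples))
```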

transform(X)

Map each feature to N(0, 1) via GMM CDF then probit.

Applies z = Phi^{-1}(F_GMM(x)) to each feature independently.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data in the original space.

Returns

Xt : np.ndarray of shape (n_samples, n_features) Gaussianized data with approximately standard-normal marginals.

Source code in rbig/_src/marginal.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Map each feature to N(0, 1) via GMM CDF then probit.

    Applies ``z = Phi^{-1}(F_GMM(x))`` to each feature independently.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data in the original space.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Gaussianized data with approximately standard-normal marginals.
    """
    Xt = np.zeros_like(X, dtype=float)
    for i in range(self.n_features_):
        # Evaluate analytic GMM CDF
        u = self._cdf(self.gmms_[i], X[:, i])
        # Clip away from boundaries before probit
        u = np.clip(u, 1e-6, 1 - 1e-6)
        Xt[:, i] = ndtri(u)
    return Xt

inverse_transform(X)

Map standard-normal values back to the original space.

Numerically inverts F_GMM(x) = Phi(z) per sample via Brent's method on the interval [-50, 50].

Parameters

X : np.ndarray of shape (n_samples, n_features) Data in the Gaussianized (standard-normal) space.

Returns

Xt : np.ndarray of shape (n_samples, n_features) Data approximately recovered in the original input space. Samples for which root-finding fails default to 0.0.

Source code in rbig/_src/marginal.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Map standard-normal values back to the original space.

    Numerically inverts ``F_GMM(x) = Phi(z)`` per sample via Brent's
    method on the interval [-50, 50].

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in the Gaussianized (standard-normal) space.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Data approximately recovered in the original input space.
        Samples for which root-finding fails default to 0.0.
    """
    from scipy.optimize import brentq

    Xt = np.zeros_like(X, dtype=float)
    for i in range(self.n_features_):
        for j, xj in enumerate(X[:, i]):
            # Map z -> u = Phi(z) in (0, 1)
            u = stats.norm.cdf(xj)
            try:
                # Numerically solve F_GMM(x) = u for x
                Xt[j, i] = brentq(
                    lambda x, u=u, i=i: (
                        self._cdf(self.gmms_[i], np.array([x]))[0] - u
                    ),
                    -50,
                    50,
                )
            except ValueError:
                # Root not found in [-50, 50]; fall back to zero
                Xt[j, i] = 0.0
    return Xt

get_log_det_jacobian(X)

Compute log |det J| using the analytic GMM density.

Because the Jacobian is diagonal (each feature transformed independently)::

log|det J| = sum_i log|dz_i/dx_i|
           = sum_i [log f_GMM(x_i) - log phi(z_i)]

where z_i = Phi^{-1}(F_GMM(x_i)) and phi is the standard-normal PDF.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input points at which to evaluate the log-det-Jacobian.

Returns

log_det : np.ndarray of shape (n_samples,) Per-sample log absolute determinant.

Source code in rbig/_src/marginal.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Compute log |det J| using the analytic GMM density.

    Because the Jacobian is diagonal (each feature transformed
    independently)::

        log|det J| = sum_i log|dz_i/dx_i|
                   = sum_i [log f_GMM(x_i) - log phi(z_i)]

    where ``z_i = Phi^{-1}(F_GMM(x_i))`` and ``phi`` is the
    standard-normal PDF.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points at which to evaluate the log-det-Jacobian.

    Returns
    -------
    log_det : np.ndarray of shape (n_samples,)
        Per-sample log absolute determinant.
    """
    log_det = np.zeros(X.shape[0])
    for i in range(self.n_features_):
        # Evaluate GMM CDF and clip to avoid probit boundary issues
        u = self._cdf(self.gmms_[i], X[:, i])
        u = np.clip(u, 1e-6, 1 - 1e-6)
        g = ndtri(u)  # z = Phi^{-1}(F_GMM(x))
        pdf = self._pdf(self.gmms_[i], X[:, i])  # f_GMM(x)
        log_phi = stats.norm.logpdf(g)  # log phi(z)
        # log|dz/dx| = log f_GMM(x) - log phi(z)
        log_det += np.log(np.maximum(pdf, 1e-300)) - log_phi
    return log_det
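The identity `log|dz/dx| = log f_GMM(x) - log phi(z)` can be verified numerically on a fixed mixture (hypothetical parameters) by comparing against a central finite difference:

```python
import numpy as np
from scipy.special import ndtri
from scipy.stats import norm

# A fixed two-component mixture standing in for a fitted GMM marginal
w = np.array([0.4, 0.6])
mu = np.array([-1.0, 2.0])
sigma = np.array([0.8, 1.2])

def F(x):  # mixture CDF
    return np.sum(w * norm.cdf((x[:, None] - mu) / sigma), axis=1)

def f(x):  # mixture PDF
    return np.sum(w / sigma * norm.pdf((x[:, None] - mu) / sigma), axis=1)

x = np.linspace(-3.0, 4.0, 9)
z = ndtri(np.clip(F(x), 1e-6, 1 - 1e-6))

# Analytic log-derivative: log|dz/dx| = log f(x) - log phi(z)
analytic = np.log(f(x)) - norm.logpdf(z)

# Central finite difference on z(x) as an independent check
eps = 1e-6
numeric = np.log((ndtri(F(x + eps)) - ndtri(F(x - eps))) / (2 * eps))
```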

rbig.QuantileGaussianizer

Bases: Bijector

Gaussianize each marginal using sklearn's QuantileTransformer.

Wraps :class:sklearn.preprocessing.QuantileTransformer configured with output_distribution='normal' to map each feature to an approximately standard-normal distribution. The quantile transform is a step-function CDF estimate that is particularly robust to outliers.

Parameters

n_quantiles : int, default 1000 Number of quantile nodes used to define the piecewise-linear mapping. Capped at n_samples during fit to avoid requesting more quantiles than there are training points. random_state : int or None, default 0 Seed for reproducible subsampling inside QuantileTransformer.

Attributes

qt_ : sklearn.preprocessing.QuantileTransformer Fitted quantile transformer with output_distribution='normal'. n_features_ : int Number of feature dimensions seen during fit.

Notes

The log-absolute-Jacobian is estimated via central finite differences::

dz_i/dx_i ~= (z_i(x + eps*e_i) - z_i(x - eps*e_i)) / (2*eps)

with eps = 1e-5. This approximation may be inaccurate near discontinuities of the piecewise quantile function.

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Examples

>>> import numpy as np
>>> from rbig._src.marginal import QuantileGaussianizer
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((200, 3))
>>> qg = QuantileGaussianizer().fit(X)
>>> Z = qg.transform(X)
>>> Z.shape
(200, 3)
>>> Xr = qg.inverse_transform(Z)
>>> Xr.shape
(200, 3)

Source code in rbig/_src/marginal.py
class QuantileGaussianizer(Bijector):
    """Gaussianize each marginal using sklearn's QuantileTransformer.

    Wraps :class:`sklearn.preprocessing.QuantileTransformer` configured with
    ``output_distribution='normal'`` to map each feature to an approximately
    standard-normal distribution.  The quantile transform is a step-function
    CDF estimate that is particularly robust to outliers.

    Parameters
    ----------
    n_quantiles : int, default 1000
        Number of quantile nodes used to define the piecewise-linear mapping.
        Capped at ``n_samples`` during ``fit`` to avoid requesting more
        quantiles than there are training points.
    random_state : int or None, default 0
        Seed for reproducible subsampling inside ``QuantileTransformer``.

    Attributes
    ----------
    qt_ : sklearn.preprocessing.QuantileTransformer
        Fitted quantile transformer with ``output_distribution='normal'``.
    n_features_ : int
        Number of feature dimensions seen during ``fit``.

    Notes
    -----
    The log-absolute-Jacobian is estimated via central finite differences::

        dz_i/dx_i ~= (z_i(x + eps*e_i) - z_i(x - eps*e_i)) / (2*eps)

    with ``eps = 1e-5``.  This approximation may be inaccurate near
    discontinuities of the piecewise quantile function.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.marginal import QuantileGaussianizer
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((200, 3))
    >>> qg = QuantileGaussianizer().fit(X)
    >>> Z = qg.transform(X)
    >>> Z.shape
    (200, 3)
    >>> Xr = qg.inverse_transform(Z)
    >>> Xr.shape
    (200, 3)
    """

    def __init__(self, n_quantiles: int = 1000, random_state: int | None = 0):
        self.n_quantiles = n_quantiles
        self.random_state = random_state

    def fit(self, X: np.ndarray, y=None) -> QuantileGaussianizer:
        """Fit the quantile transformer to the training data.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.

        Returns
        -------
        self : QuantileGaussianizer
            Fitted bijector instance.
        """
        from sklearn.preprocessing import QuantileTransformer

        # Cap quantile count so it cannot exceed the available samples
        n_quantiles = min(self.n_quantiles, X.shape[0])
        self.qt_ = QuantileTransformer(
            n_quantiles=n_quantiles,
            output_distribution="normal",
            random_state=self.random_state,
        )
        self.qt_.fit(X)
        self.n_features_ = X.shape[1]
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the quantile transform: x -> z approximately N(0, 1).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data.

        Returns
        -------
        Z : np.ndarray of shape (n_samples, n_features)
            Gaussianized data.
        """
        return self.qt_.transform(X)

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Invert the quantile transform: z -> x.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in the Gaussianized (standard-normal) space.

        Returns
        -------
        Xr : np.ndarray of shape (n_samples, n_features)
            Data approximately recovered in the original input space.
        """
        return self.qt_.inverse_transform(X)

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Estimate log |det J| by finite differences on the quantile transform.

        Uses a small perturbation ``eps = 1e-5`` in each dimension::

            dz_i/dx_i ~= (z_i(x + eps*e_i) - z_i(x - eps*e_i)) / (2*eps)

        and sums the log-absolute-derivatives::

            log|det J| = sum_i log|dz_i/dx_i|

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points at which to evaluate the log-det-Jacobian.

        Returns
        -------
        log_det : np.ndarray of shape (n_samples,)
            Per-sample log absolute determinant approximation.

        Notes
        -----
        The quantile transform is piecewise-linear; the finite-difference
        derivative equals the local slope and is exact within each segment.
        """
        eps = 1e-5
        log_det = np.zeros(X.shape[0])
        for i in range(X.shape[1]):
            dummy_plus = X.copy()
            dummy_plus[:, i] = X[:, i] + eps
            dummy_minus = X.copy()
            dummy_minus[:, i] = X[:, i] - eps
            y_plus = self.qt_.transform(dummy_plus)[:, i]
            y_minus = self.qt_.transform(dummy_minus)[:, i]
            # Central-difference derivative for dimension i
            deriv = (y_plus - y_minus) / (2 * eps)
            log_det += np.log(np.maximum(np.abs(deriv), 1e-300))
        return log_det

fit(X, y=None)

Fit the quantile transformer to the training data.

Parameters

X : np.ndarray of shape (n_samples, n_features) Training data.

Returns

self : QuantileGaussianizer Fitted bijector instance.

Source code in rbig/_src/marginal.py
def fit(self, X: np.ndarray, y=None) -> QuantileGaussianizer:
    """Fit the quantile transformer to the training data.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.

    Returns
    -------
    self : QuantileGaussianizer
        Fitted bijector instance.
    """
    from sklearn.preprocessing import QuantileTransformer

    # Cap quantile count so it cannot exceed the available samples
    n_quantiles = min(self.n_quantiles, X.shape[0])
    self.qt_ = QuantileTransformer(
        n_quantiles=n_quantiles,
        output_distribution="normal",
        random_state=self.random_state,
    )
    self.qt_.fit(X)
    self.n_features_ = X.shape[1]
    return self

transform(X)

Apply the quantile transform: x -> z approximately N(0, 1).

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data.

Returns

Z : np.ndarray of shape (n_samples, n_features) Gaussianized data.

Source code in rbig/_src/marginal.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the quantile transform: x -> z approximately N(0, 1).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data.

    Returns
    -------
    Z : np.ndarray of shape (n_samples, n_features)
        Gaussianized data.
    """
    return self.qt_.transform(X)

inverse_transform(X)

Invert the quantile transform: z -> x.

Parameters

X : np.ndarray of shape (n_samples, n_features) Data in the Gaussianized (standard-normal) space.

Returns

Xr : np.ndarray of shape (n_samples, n_features) Data approximately recovered in the original input space.

Source code in rbig/_src/marginal.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Invert the quantile transform: z -> x.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in the Gaussianized (standard-normal) space.

    Returns
    -------
    Xr : np.ndarray of shape (n_samples, n_features)
        Data approximately recovered in the original input space.
    """
    return self.qt_.inverse_transform(X)

get_log_det_jacobian(X)

Estimate log |det J| by finite differences on the quantile transform.

Uses a small perturbation eps = 1e-5 in each dimension::

dz_i/dx_i ~= (z_i(x + eps*e_i) - z_i(x - eps*e_i)) / (2*eps)

and sums the log-absolute-derivatives::

log|det J| = sum_i log|dz_i/dx_i|

Parameters

X : np.ndarray of shape (n_samples, n_features) Input points at which to evaluate the log-det-Jacobian.

Returns

log_det : np.ndarray of shape (n_samples,) Per-sample log absolute determinant approximation.

Notes

The quantile transform is piecewise-linear; the finite-difference derivative equals the local slope and is exact within each segment.

Source code in rbig/_src/marginal.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Estimate log |det J| by finite differences on the quantile transform.

    Uses a small perturbation ``eps = 1e-5`` in each dimension::

        dz_i/dx_i ~= (z_i(x + eps*e_i) - z_i(x - eps*e_i)) / (2*eps)

    and sums the log-absolute-derivatives::

        log|det J| = sum_i log|dz_i/dx_i|

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points at which to evaluate the log-det-Jacobian.

    Returns
    -------
    log_det : np.ndarray of shape (n_samples,)
        Per-sample log absolute determinant approximation.

    Notes
    -----
    The quantile transform is piecewise-linear; the finite-difference
    derivative equals the local slope and is exact within each segment.
    """
    eps = 1e-5
    log_det = np.zeros(X.shape[0])
    for i in range(X.shape[1]):
        dummy_plus = X.copy()
        dummy_plus[:, i] = X[:, i] + eps
        dummy_minus = X.copy()
        dummy_minus[:, i] = X[:, i] - eps
        y_plus = self.qt_.transform(dummy_plus)[:, i]
        y_minus = self.qt_.transform(dummy_minus)[:, i]
        # Central-difference derivative for dimension i
        deriv = (y_plus - y_minus) / (2 * eps)
        log_det += np.log(np.maximum(np.abs(deriv), 1e-300))
    return log_det
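The same finite-difference estimate can be reproduced with sklearn's `QuantileTransformer` directly, independent of the `QuantileGaussianizer` wrapper:

```python
import numpy as np
from sklearn.preprocessing import QuantileTransformer

rng = np.random.default_rng(0)
X = rng.exponential(size=(500, 2))  # skewed data

qt = QuantileTransformer(
    n_quantiles=min(1000, X.shape[0]),  # cap at n_samples, as in fit
    output_distribution="normal",
    random_state=0,
).fit(X)

eps = 1e-5
log_det = np.zeros(X.shape[0])
for i in range(X.shape[1]):
    Xp, Xm = X.copy(), X.copy()
    Xp[:, i] += eps
    Xm[:, i] -= eps
    # Central-difference slope of the piecewise-linear quantile map
    deriv = (qt.transform(Xp)[:, i] - qt.transform(Xm)[:, i]) / (2 * eps)
    log_det += np.log(np.maximum(np.abs(deriv), 1e-300))
```

The `1e-300` floor keeps the log finite at the extremes of the training range, where the clipped quantile map can have zero slope.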

Rotations

rbig.PCARotation

Bases: BaseTransform

PCA-based rotation with optional whitening (decorrelation + rescaling).

Fits a standard PCA (via scikit-learn's :class:~sklearn.decomposition.PCA) and uses it as a linear rotation transform. When whiten=True (default), each principal component is additionally rescaled by the reciprocal square root of its eigenvalue so the output has unit variance per component::

z = Lambda^{-1/2} V^T (x - mu)

where V in R^{D x K} is the matrix of leading eigenvectors (principal axes), Lambda in R^{K x K} is the diagonal eigenvalue matrix, and mu is the sample mean. When whiten=False, the rescaling is omitted and the transform is a pure rotation::

z = V^T (x - mu)

Parameters

n_components : int or None, default None Number of principal components to retain. If None, all D components are kept. whiten : bool, default True If True, divide each component by sqrt(lambda_i) to decorrelate and normalise variance. If False, only rotate (and center).

Attributes

pca_ : sklearn.decomposition.PCA Fitted PCA object containing eigenvectors, eigenvalues, and the sample mean.

Notes

The log-absolute-Jacobian determinant for the whitening transform is::

log|det J| = -1/2 * sum_i log(lambda_i)

because each whitening factor Lambda^{-1/2} contributes -1/2 * log(lambda_i) per component. For a pure rotation (whiten=False), the determinant is 1 and the log is 0.

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Examples

>>> import numpy as np
>>> from rbig._src.rotation import PCARotation
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((200, 4))
>>> pca_rot = PCARotation(whiten=True).fit(X)
>>> Z = pca_rot.transform(X)
>>> Z.shape
(200, 4)
>>> ldj = pca_rot.log_det_jacobian(X)
>>> ldj.shape
(200,)

Source code in rbig/_src/rotation.py
class PCARotation(BaseTransform):
    """PCA-based rotation with optional whitening (decorrelation + rescaling).

    Fits a standard PCA (via scikit-learn's :class:`~sklearn.decomposition.PCA`)
    and uses it as a linear rotation transform.  When ``whiten=True`` (default),
    each principal component is additionally rescaled by the reciprocal square
    root of its eigenvalue so the output has unit variance per component::

        z = Lambda^{-1/2} V^T (x - mu)

    where ``V`` in R^{D x K} is the matrix of leading eigenvectors (principal
    axes), ``Lambda`` in R^{K x K} is the diagonal eigenvalue matrix, and
    ``mu`` is the sample mean.  When ``whiten=False``, the rescaling is
    omitted and the transform is a pure rotation::

        z = V^T (x - mu)

    Parameters
    ----------
    n_components : int or None, default None
        Number of principal components to retain.  If ``None``, all D
        components are kept.
    whiten : bool, default True
        If True, divide each component by sqrt(lambda_i) to decorrelate
        *and* normalise variance.  If False, only rotate (and center).

    Attributes
    ----------
    pca_ : sklearn.decomposition.PCA
        Fitted PCA object containing eigenvectors, eigenvalues, and the
        sample mean.

    Notes
    -----
    The log-absolute-Jacobian determinant for the whitening transform is::

        log|det J| = -1/2 * sum_i log(lambda_i)

    because each whitening factor Lambda^{-1/2} contributes
    ``-1/2 * log(lambda_i)`` per component.  For a pure rotation
    (``whiten=False``), the determinant is 1 and the log is 0.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.rotation import PCARotation
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((200, 4))
    >>> pca_rot = PCARotation(whiten=True).fit(X)
    >>> Z = pca_rot.transform(X)
    >>> Z.shape
    (200, 4)
    >>> ldj = pca_rot.log_det_jacobian(X)
    >>> ldj.shape
    (200,)
    """

    def __init__(self, n_components: int | None = None, whiten: bool = True):
        self.n_components = n_components
        self.whiten = whiten

    def fit(self, X: np.ndarray, y=None) -> PCARotation:
        """Fit PCA to the training data.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.

        Returns
        -------
        self : PCARotation
            Fitted transform instance.
        """
        # Stores eigenvectors, eigenvalues, and mean in pca_
        self.pca_ = PCA(n_components=self.n_components, whiten=self.whiten)
        self.pca_.fit(X)
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply PCA rotation (and optional whitening) to X.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data.

        Returns
        -------
        Z : np.ndarray of shape (n_samples, n_components)
            Rotated (and optionally whitened) data.
        """
        return self.pca_.transform(X)  # (N, D) -> (N, K)

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Invert the PCA rotation (and optional whitening).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_components)
            Data in the PCA / whitened space.

        Returns
        -------
        Xr : np.ndarray of shape (n_samples, n_features)
            Data approximately recovered in the original input space.
        """
        return self.pca_.inverse_transform(X)  # (N, K) -> (N, D)

    def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Log absolute Jacobian determinant (constant for linear transforms).

        For a whitening PCA the Jacobian determinant is constant::

            log|det J| = -1/2 * sum_i log(lambda_i)

        where ``lambda_i`` are the PCA eigenvalues (``explained_variance_``).
        For a plain rotation (``whiten=False``), ``|det J| = 1`` and the
        log is 0.

        .. note::
            This method is only valid when the transform is square (i.e.
            ``n_components`` is ``None`` or equals the number of input
            features).  A dimensionality-reducing PCA (``n_components`` <
            ``n_features``) is not bijective and its Jacobian determinant is
            undefined.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points (used only to determine the number of samples).

        Returns
        -------
        ldj : np.ndarray of shape (n_samples,)
            Constant per-sample log absolute Jacobian determinant.
        """
        if self.whiten:
            # Whitening scales each eigendirection by lambda_i^{-1/2}
            # -> log|det| = -1/2 * sum log(lambda_i)
            log_det = -0.5 * np.sum(np.log(self.pca_.explained_variance_))
        else:
            # Pure rotation: |det Q| = 1 -> log|det| = 0
            log_det = 0.0
        return np.full(X.shape[0], log_det)

fit(X, y=None)

Fit PCA to the training data.

Parameters

X : np.ndarray of shape (n_samples, n_features) Training data.

Returns

self : PCARotation Fitted transform instance.

Source code in rbig/_src/rotation.py
def fit(self, X: np.ndarray, y=None) -> PCARotation:
    """Fit PCA to the training data.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.

    Returns
    -------
    self : PCARotation
        Fitted transform instance.
    """
    # Stores eigenvectors, eigenvalues, and mean in pca_
    self.pca_ = PCA(n_components=self.n_components, whiten=self.whiten)
    self.pca_.fit(X)
    return self

transform(X)

Apply PCA rotation (and optional whitening) to X.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data.

Returns

Z : np.ndarray of shape (n_samples, n_components) Rotated (and optionally whitened) data.

Source code in rbig/_src/rotation.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply PCA rotation (and optional whitening) to X.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data.

    Returns
    -------
    Z : np.ndarray of shape (n_samples, n_components)
        Rotated (and optionally whitened) data.
    """
    return self.pca_.transform(X)  # (N, D) -> (N, K)

inverse_transform(X)

Invert the PCA rotation (and optional whitening).

Parameters

X : np.ndarray of shape (n_samples, n_components) Data in the PCA / whitened space.

Returns

Xr : np.ndarray of shape (n_samples, n_features) Data approximately recovered in the original input space.

Source code in rbig/_src/rotation.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Invert the PCA rotation (and optional whitening).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_components)
        Data in the PCA / whitened space.

    Returns
    -------
    Xr : np.ndarray of shape (n_samples, n_features)
        Data approximately recovered in the original input space.
    """
    return self.pca_.inverse_transform(X)  # (N, K) -> (N, D)

log_det_jacobian(X)

Log absolute Jacobian determinant (constant for linear transforms).

For a whitening PCA the Jacobian determinant is constant::

log|det J| = -1/2 * sum_i log(lambda_i)

where lambda_i are the PCA eigenvalues (explained_variance_). For a plain rotation (whiten=False), |det J| = 1 and the log is 0.

.. note:: This method is only valid when the transform is square (i.e. n_components is None or equals the number of input features). A dimensionality-reducing PCA (n_components < n_features) is not bijective and its Jacobian determinant is undefined.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input points (used only to determine the number of samples).

Returns

ldj : np.ndarray of shape (n_samples,) Constant per-sample log absolute Jacobian determinant.

Source code in rbig/_src/rotation.py
def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Log absolute Jacobian determinant (constant for linear transforms).

    For a whitening PCA the Jacobian determinant is constant::

        log|det J| = -1/2 * sum_i log(lambda_i)

    where ``lambda_i`` are the PCA eigenvalues (``explained_variance_``).
    For a plain rotation (``whiten=False``), ``|det J| = 1`` and the
    log is 0.

    .. note::
        This method is only valid when the transform is square (i.e.
        ``n_components`` is ``None`` or equals the number of input
        features).  A dimensionality-reducing PCA (``n_components`` <
        ``n_features``) is not bijective and its Jacobian determinant is
        undefined.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points (used only to determine the number of samples).

    Returns
    -------
    ldj : np.ndarray of shape (n_samples,)
        Constant per-sample log absolute Jacobian determinant.
    """
    if self.whiten:
        # Whitening scales each eigendirection by lambda_i^{-1/2}
        # -> log|det| = -1/2 * sum log(lambda_i)
        log_det = -0.5 * np.sum(np.log(self.pca_.explained_variance_))
    else:
        # Pure rotation: |det Q| = 1 -> log|det| = 0
        log_det = 0.0
    return np.full(X.shape[0], log_det)
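The closed form can be cross-checked against `numpy.linalg.slogdet` of the effective linear map, using sklearn's `PCA` directly (a sketch, not the `PCARotation` API):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Correlated data so the eigenvalues differ
X = rng.standard_normal((300, 4)) @ rng.standard_normal((4, 4))

pca = PCA(whiten=True).fit(X)  # square transform: all 4 components kept
# Whitening map: z = diag(lambda^{-1/2}) V^T (x - mu)
J = pca.components_ / np.sqrt(pca.explained_variance_)[:, None]
sign, logdet = np.linalg.slogdet(J)

# |det V^T| = 1 for an orthogonal matrix, so only the scaling contributes
closed_form = -0.5 * np.sum(np.log(pca.explained_variance_))
```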

rbig.ICARotation

Bases: BaseTransform

ICA-based rotation using the Picard algorithm or FastICA fallback.

Fits an Independent Component Analysis (ICA) model that learns a linear unmixing matrix. When the optional picard package is available, it is used for faster and more accurate convergence::

s = W K x

where K in R^{K x D} is a pre-whitening matrix and W in R^{K x K} is the ICA unmixing matrix. The combined transform is W K.

If picard is not installed, :class:sklearn.decomposition.FastICA is used as a drop-in replacement.

Parameters

n_components : int or None, default None Number of independent components. If None, all D components are estimated (square unmixing matrix). random_state : int or None, default None Seed for reproducible ICA initialisation.

Attributes

K_ : np.ndarray of shape (n_components, n_features) or None Pre-whitening matrix from the Picard solver. None when using the FastICA fallback. W_ : np.ndarray of shape (n_components, n_components) or None ICA unmixing matrix from the Picard solver. None when using FastICA. ica_ : sklearn.decomposition.FastICA or None Fitted FastICA object used when Picard is unavailable. n_features_in_ : int Number of input features (set only when using Picard).

Notes

The log-absolute-Jacobian determinant is::

log|det J| = log|det(W K)|

for the Picard path, or log|det(components_)| for FastICA. The Jacobian is constant (independent of x) for any linear transform.

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Examples

>>> import numpy as np
>>> from rbig._src.rotation import ICARotation
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((200, 3))
>>> ica = ICARotation(random_state=0).fit(X)
>>> S = ica.transform(X)
>>> S.shape
(200, 3)
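As a sketch of the FastICA fallback path (not the Picard path), the constant Jacobian of the learned linear map can be checked with sklearn directly:

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
S = rng.laplace(size=(1000, 3))   # non-Gaussian sources
A = rng.standard_normal((3, 3))   # mixing matrix
X = S @ A.T

ica = FastICA(random_state=0).fit(X)
# transform applies the constant linear map s = components_ @ (x - mean_),
# so the Jacobian is components_ and its log|det| is independent of x
sign, logdet = np.linalg.slogdet(ica.components_)
```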

Source code in rbig/_src/rotation.py
class ICARotation(BaseTransform):
    """ICA-based rotation using the Picard algorithm or FastICA fallback.

    Fits an Independent Component Analysis (ICA) model that learns a linear
    unmixing matrix.  When the optional ``picard`` package is available, it
    is used for faster and more accurate convergence::

        s = W K x

    where ``K`` in R^{K x D} is a pre-whitening matrix and ``W`` in
    R^{K x K} is the ICA unmixing matrix.  The combined transform is
    ``W K``.

    If ``picard`` is not installed, :class:`sklearn.decomposition.FastICA`
    is used as a drop-in replacement.

    Parameters
    ----------
    n_components : int or None, default None
        Number of independent components.  If ``None``, all D components
        are estimated (square unmixing matrix).
    random_state : int or None, default None
        Seed for reproducible ICA initialisation.
    orthogonal : bool, default True
        If True, apply only the orthogonal rotation ``W_`` and skip the
        whitening ``K_``, so the layer has a zero log-det-Jacobian.
        Requires a square unmixing matrix.

    Attributes
    ----------
    K_ : np.ndarray of shape (n_components, n_features) or None
        Pre-whitening matrix from the Picard solver.  ``None`` when using
        the FastICA fallback.
    W_ : np.ndarray of shape (n_components, n_components) or None
        ICA unmixing matrix from the Picard solver.  ``None`` when using
        FastICA.
    ica_ : sklearn.decomposition.FastICA or None
        Fitted FastICA object used when Picard is unavailable.
    n_features_in_ : int
        Number of input features (set only when using Picard).

    Notes
    -----
    The log-absolute-Jacobian determinant is::

        log|det J| = log|det(W K)|

    for the Picard path, or ``log|det(components_)|`` for FastICA.  The
    Jacobian is constant (independent of x) for any linear transform.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.rotation import ICARotation
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((200, 3))
    >>> ica = ICARotation(random_state=0).fit(X)
    >>> S = ica.transform(X)
    >>> S.shape
    (200, 3)
    """

    def __init__(
        self,
        n_components: int | None = None,
        random_state: int | None = None,
        orthogonal: bool = True,
    ):
        self.n_components = n_components
        self.random_state = random_state
        self.orthogonal = orthogonal

    def fit(self, X: np.ndarray, y=None) -> ICARotation:
        """Fit the ICA model (Picard if available, otherwise FastICA).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.

        Returns
        -------
        self : ICARotation
            Fitted transform instance.
        """
        try:
            from picard import picard

            n = X.shape[1] if self.n_components is None else self.n_components
            # Picard expects data as (n_features, n_samples), so transpose X
            K, W, _ = picard(
                X.T,
                n_components=n,
                random_state=self.random_state,
                max_iter=500,
                tol=1e-5,
            )
            self.K_ = K  # whitening matrix, shape (K, D)
            self.W_ = W  # unmixing matrix, shape (K, K)
            self.n_features_in_ = X.shape[1]
            if (
                self.orthogonal
                and self.n_components is not None
                and self.n_components != X.shape[1]
            ):
                raise ValueError(
                    "orthogonal=True requires n_components=None or "
                    "n_components=n_features"
                )
        except ImportError:
            from sklearn.decomposition import FastICA

            # Fall back to FastICA when picard is not installed
            self.ica_ = FastICA(
                n_components=self.n_components,
                random_state=self.random_state,
                max_iter=500,
            )
            self.ica_.fit(X)
            self.K_ = None  # signals that FastICA path is active
            self.W_ = None  # FastICA fallback does not use W_
            if (
                self.orthogonal
                and self.n_components is not None
                and self.n_components != X.shape[1]
            ):
                raise ValueError(
                    "orthogonal=True requires n_components=None or "
                    "n_components=n_features"
                ) from None
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the ICA unmixing to X.

        When ``orthogonal=True`` (default), applies only the orthogonal
        rotation W_ (skips whitening K_), giving ``s = W x``.  When
        ``orthogonal=False``, applies the full unmixing ``s = W K x``.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data in the original (mixed) space.

        Returns
        -------
        S : np.ndarray of shape (n_samples, n_components)
            Estimated independent components.
        """
        if self.K_ is None:
            # FastICA path: uses sklearn's built-in transform
            return self.ica_.transform(X)
        if self.orthogonal:
            # Orthogonal mode: apply only W_ (rotation without whitening)
            return X @ self.W_.T  # (N, D) @ (D, D) -> (N, D)
        # Full unmixing: whiten then rotate
        Xw = X @ self.K_.T  # (N, D) @ (D, K) -> (N, K)  whitening step
        return Xw @ self.W_.T  # (N, K) @ (K, K) -> (N, K)  unmixing step

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Invert the ICA unmixing.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_components)
            Independent-component representation.

        Returns
        -------
        Xr : np.ndarray of shape (n_samples, n_features)
            Data approximately recovered in the original mixed space.
        """
        if self.K_ is None:
            return self.ica_.inverse_transform(X)
        if self.orthogonal:
            # W_ is orthogonal, so W_^{-1} = W_^T
            return X @ self.W_  # (N, D) @ (D, D) -> (N, D)
        # Invert unmixing W then whitening K using pseudo-inverses
        Xw = X @ np.linalg.pinv(self.W_).T  # (N, K) -> (N, K)
        return Xw @ np.linalg.pinv(self.K_).T  # (N, K) -> (N, D)

    def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Log absolute Jacobian determinant (constant for linear transforms).

        Computes ``log|det(W K)|`` (Picard path) or
        ``log|det(components_)|`` (FastICA path).  The result is replicated
        for every sample since the Jacobian of a linear transform is constant.

        .. note::
            This method is only valid when the unmixing matrix is square (i.e.
            ``n_components`` is ``None`` or equals the number of input
            features).  A non-square unmixing matrix is not bijective and its
            Jacobian determinant is undefined.  A :exc:`ValueError` is raised
            in that case.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points (used only to determine the number of samples).

        Returns
        -------
        ldj : np.ndarray of shape (n_samples,)
            Constant per-sample log absolute Jacobian determinant.

        Raises
        ------
        ValueError
            If the unmixing matrix is not square (``n_components != n_features``).
        """
        if self.K_ is None:
            W = self.ica_.components_  # shape (K, D) or (D, D)
            if W.shape[0] != W.shape[1]:
                raise ValueError(
                    "ICARotation.log_det_jacobian is only defined for square "
                    "unmixing matrices. Got components_ with shape "
                    f"{W.shape}. Ensure that `n_components` is None or "
                    "equals the number of features."
                )
            log_det = np.log(np.abs(np.linalg.det(W)))
        elif self.orthogonal:
            # W_ is orthogonal → |det W_| = 1 → log|det| = 0
            log_det = 0.0
        else:
            # Combined unmixing matrix: W @ K, shape (K, D)
            WK = self.W_ @ self.K_
            if WK.shape[0] != WK.shape[1]:
                raise ValueError(
                    "ICARotation.log_det_jacobian is only defined for square "
                    "unmixing matrices. Got W @ K with shape "
                    f"{WK.shape}. Ensure that `n_components` is None or "
                    "equals the number of features."
                )
            log_det = np.log(np.abs(np.linalg.det(WK)))
        return np.full(X.shape[0], log_det)

fit(X, y=None)

Fit the ICA model (Picard if available, otherwise FastICA).

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Training data.

Returns

self : ICARotation
    Fitted transform instance.

Source code in rbig/_src/rotation.py
def fit(self, X: np.ndarray, y=None) -> ICARotation:
    """Fit the ICA model (Picard if available, otherwise FastICA).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.

    Returns
    -------
    self : ICARotation
        Fitted transform instance.
    """
    try:
        from picard import picard

        n = X.shape[1] if self.n_components is None else self.n_components
        # Picard expects data as (n_features, n_samples), so transpose X
        K, W, _ = picard(
            X.T,
            n_components=n,
            random_state=self.random_state,
            max_iter=500,
            tol=1e-5,
        )
        self.K_ = K  # whitening matrix, shape (K, D)
        self.W_ = W  # unmixing matrix, shape (K, K)
        self.n_features_in_ = X.shape[1]
        if (
            self.orthogonal
            and self.n_components is not None
            and self.n_components != X.shape[1]
        ):
            raise ValueError(
                "orthogonal=True requires n_components=None or "
                "n_components=n_features"
            )
    except ImportError:
        from sklearn.decomposition import FastICA

        # Fall back to FastICA when picard is not installed
        self.ica_ = FastICA(
            n_components=self.n_components,
            random_state=self.random_state,
            max_iter=500,
        )
        self.ica_.fit(X)
        self.K_ = None  # signals that FastICA path is active
        self.W_ = None  # FastICA fallback does not use W_
        if (
            self.orthogonal
            and self.n_components is not None
            and self.n_components != X.shape[1]
        ):
            raise ValueError(
                "orthogonal=True requires n_components=None or "
                "n_components=n_features"
            ) from None
    return self

transform(X)

Apply the ICA unmixing to X.

When orthogonal=True (default), applies only the orthogonal rotation W_ (skips whitening K_), giving s = W x. When orthogonal=False, applies the full unmixing s = W K x.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input data in the original (mixed) space.

Returns

S : np.ndarray of shape (n_samples, n_components)
    Estimated independent components.

Source code in rbig/_src/rotation.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the ICA unmixing to X.

    When ``orthogonal=True`` (default), applies only the orthogonal
    rotation W_ (skips whitening K_), giving ``s = W x``.  When
    ``orthogonal=False``, applies the full unmixing ``s = W K x``.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data in the original (mixed) space.

    Returns
    -------
    S : np.ndarray of shape (n_samples, n_components)
        Estimated independent components.
    """
    if self.K_ is None:
        # FastICA path: uses sklearn's built-in transform
        return self.ica_.transform(X)
    if self.orthogonal:
        # Orthogonal mode: apply only W_ (rotation without whitening)
        return X @ self.W_.T  # (N, D) @ (D, D) -> (N, D)
    # Full unmixing: whiten then rotate
    Xw = X @ self.K_.T  # (N, D) @ (D, K) -> (N, K)  whitening step
    return Xw @ self.W_.T  # (N, K) @ (K, K) -> (N, K)  unmixing step

inverse_transform(X)

Invert the ICA unmixing.

Parameters

X : np.ndarray of shape (n_samples, n_components)
    Independent-component representation.

Returns

Xr : np.ndarray of shape (n_samples, n_features)
    Data approximately recovered in the original mixed space.

Source code in rbig/_src/rotation.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Invert the ICA unmixing.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_components)
        Independent-component representation.

    Returns
    -------
    Xr : np.ndarray of shape (n_samples, n_features)
        Data approximately recovered in the original mixed space.
    """
    if self.K_ is None:
        return self.ica_.inverse_transform(X)
    if self.orthogonal:
        # W_ is orthogonal, so W_^{-1} = W_^T
        return X @ self.W_  # (N, D) @ (D, D) -> (N, D)
    # Invert unmixing W then whitening K using pseudo-inverses
    Xw = X @ np.linalg.pinv(self.W_).T  # (N, K) -> (N, K)
    return Xw @ np.linalg.pinv(self.K_).T  # (N, K) -> (N, D)

log_det_jacobian(X)

Log absolute Jacobian determinant (constant for linear transforms).

Computes log|det(W K)| (Picard path) or log|det(components_)| (FastICA path). The result is replicated for every sample since the Jacobian of a linear transform is constant.

Note: this method is only valid when the unmixing matrix is square (i.e. n_components is None or equals the number of input features). A non-square unmixing matrix is not bijective, so its Jacobian determinant is undefined; a ValueError is raised in that case.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input points (used only to determine the number of samples).

Returns

ldj : np.ndarray of shape (n_samples,)
    Constant per-sample log absolute Jacobian determinant.

Raises

ValueError
    If the unmixing matrix is not square (n_components != n_features).

Source code in rbig/_src/rotation.py
def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Log absolute Jacobian determinant (constant for linear transforms).

    Computes ``log|det(W K)|`` (Picard path) or
    ``log|det(components_)|`` (FastICA path).  The result is replicated
    for every sample since the Jacobian of a linear transform is constant.

    .. note::
        This method is only valid when the unmixing matrix is square (i.e.
        ``n_components`` is ``None`` or equals the number of input
        features).  A non-square unmixing matrix is not bijective and its
        Jacobian determinant is undefined.  A :exc:`ValueError` is raised
        in that case.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points (used only to determine the number of samples).

    Returns
    -------
    ldj : np.ndarray of shape (n_samples,)
        Constant per-sample log absolute Jacobian determinant.

    Raises
    ------
    ValueError
        If the unmixing matrix is not square (``n_components != n_features``).
    """
    if self.K_ is None:
        W = self.ica_.components_  # shape (K, D) or (D, D)
        if W.shape[0] != W.shape[1]:
            raise ValueError(
                "ICARotation.log_det_jacobian is only defined for square "
                "unmixing matrices. Got components_ with shape "
                f"{W.shape}. Ensure that `n_components` is None or "
                "equals the number of features."
            )
        log_det = np.log(np.abs(np.linalg.det(W)))
    elif self.orthogonal:
        # W_ is orthogonal → |det W_| = 1 → log|det| = 0
        log_det = 0.0
    else:
        # Combined unmixing matrix: W @ K, shape (K, D)
        WK = self.W_ @ self.K_
        if WK.shape[0] != WK.shape[1]:
            raise ValueError(
                "ICARotation.log_det_jacobian is only defined for square "
                "unmixing matrices. Got W @ K with shape "
                f"{WK.shape}. Ensure that `n_components` is None or "
                "equals the number of features."
            )
        log_det = np.log(np.abs(np.linalg.det(WK)))
    return np.full(X.shape[0], log_det)

rbig.RandomRotation

Bases: RotationBijector

Random orthogonal rotation drawn from the Haar measure via QR.

Generates a uniformly random orthogonal matrix Q in R^{D x D} by QR decomposing a matrix of i.i.d. standard-normal entries and applying a sign correction to ensure the result is Haar-uniform:

A ~ N(0, 1)^{D x D},  A = Q R,  Q <- Q * diag(sign(diag(R)))

The sign correction guarantees that Q is sampled uniformly from the orthogonal group O(D) (the Haar measure).
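The sampling recipe above can be reproduced with numpy alone; the assertions check that the sign-corrected Q is orthogonal, so its log-det-Jacobian contribution is zero:

```python
import numpy as np

rng = np.random.default_rng(42)
D = 5
# QR-decompose a matrix of i.i.d. standard-normal entries.
A = rng.standard_normal((D, D))
Q, R = np.linalg.qr(A)
# Sign correction: multiply each column of Q by sign(R[j, j]).
# Without it, numpy's QR biases the column signs and Q is not Haar-uniform.
Q *= np.sign(np.diag(R))

# Q is orthogonal: Q Q^T = I, so |det Q| = 1 and log|det J| = 0.
assert np.allclose(Q @ Q.T, np.eye(D))
assert np.isclose(abs(np.linalg.det(Q)), 1.0)
```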

Parameters

random_state : int or None, default=None
    Seed for reproducible rotation matrix generation.

Attributes

rotation_matrix_ : np.ndarray of shape (n_features, n_features)
    The sampled orthogonal rotation matrix Q.

Notes

Because Q is orthogonal, |det Q| = 1 and:

log|det J| = log|det Q| = 0

This is the default implementation inherited from rbig._src.base.RotationBijector.

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Examples

>>> import numpy as np
>>> from rbig._src.rotation import RandomRotation
>>> rng_data = np.random.default_rng(0)
>>> X = rng_data.standard_normal((100, 4))
>>> rot = RandomRotation(random_state=42).fit(X)
>>> Z = rot.transform(X)
>>> Z.shape
(100, 4)
>>> Xr = rot.inverse_transform(Z)
>>> np.allclose(X, Xr)
True

Source code in rbig/_src/rotation.py
class RandomRotation(RotationBijector):
    """Random orthogonal rotation drawn from the Haar measure via QR.

    Generates a uniformly random orthogonal matrix Q in R^{D x D} by QR
    decomposing a matrix of i.i.d. standard-normal entries and applying a
    sign correction to ensure the result is Haar-uniform::

        A ~ N(0, 1)^{D x D},  A = Q R,  Q <- Q * diag(sign(diag(R)))

    The sign correction guarantees that Q is sampled uniformly from the
    orthogonal group O(D) (the Haar measure).

    Parameters
    ----------
    random_state : int or None, default None
        Seed for reproducible rotation matrix generation.

    Attributes
    ----------
    rotation_matrix_ : np.ndarray of shape (n_features, n_features)
        The sampled orthogonal rotation matrix Q.

    Notes
    -----
    Because Q is orthogonal, ``|det Q| = 1`` and::

        log|det J| = log|det Q| = 0

    This is the default implementation inherited from
    :class:`~rbig._src.base.RotationBijector`.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.rotation import RandomRotation
    >>> rng_data = np.random.default_rng(0)
    >>> X = rng_data.standard_normal((100, 4))
    >>> rot = RandomRotation(random_state=42).fit(X)
    >>> Z = rot.transform(X)
    >>> Z.shape
    (100, 4)
    >>> Xr = rot.inverse_transform(Z)
    >>> np.allclose(X, Xr)
    True
    """

    def __init__(self, random_state: int | None = None):
        self.random_state = random_state

    def fit(self, X: np.ndarray, y=None) -> RandomRotation:
        """Sample a Haar-uniform orthogonal rotation matrix of size D x D.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data (used only to infer the dimensionality D).

        Returns
        -------
        self : RandomRotation
            Fitted transform instance with ``rotation_matrix_`` set.
        """
        rng = np.random.default_rng(self.random_state)
        n_features = X.shape[1]
        # Draw a random D x D Gaussian matrix
        A = rng.standard_normal((n_features, n_features))
        Q, R = np.linalg.qr(A)
        # Sign correction: multiply columns of Q by sign(diag(R)) for Haar measure
        Q *= np.sign(np.diag(R))  # ensures uniform distribution on O(D)
        self.rotation_matrix_ = Q  # shape (D, D)
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Rotate X by the sampled orthogonal matrix: z = Q x.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data.

        Returns
        -------
        Z : np.ndarray of shape (n_samples, n_features)
            Rotated data.
        """
        return X @ self.rotation_matrix_.T  # (N, D) @ (D, D) -> (N, D)

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Invert the rotation: x = Q^T z = Q^{-1} z.

        Because Q is orthogonal, its inverse equals its transpose.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Rotated data.

        Returns
        -------
        Xr : np.ndarray of shape (n_samples, n_features)
            Data recovered in the original space.
        """
        return X @ self.rotation_matrix_  # Q^{-1} = Q^T -> (N, D) @ (D, D)

fit(X, y=None)

Sample a Haar-uniform orthogonal rotation matrix of size D x D.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Training data (used only to infer the dimensionality D).

Returns

self : RandomRotation
    Fitted transform instance with rotation_matrix_ set.

Source code in rbig/_src/rotation.py
def fit(self, X: np.ndarray, y=None) -> RandomRotation:
    """Sample a Haar-uniform orthogonal rotation matrix of size D x D.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data (used only to infer the dimensionality D).

    Returns
    -------
    self : RandomRotation
        Fitted transform instance with ``rotation_matrix_`` set.
    """
    rng = np.random.default_rng(self.random_state)
    n_features = X.shape[1]
    # Draw a random D x D Gaussian matrix
    A = rng.standard_normal((n_features, n_features))
    Q, R = np.linalg.qr(A)
    # Sign correction: multiply columns of Q by sign(diag(R)) for Haar measure
    Q *= np.sign(np.diag(R))  # ensures uniform distribution on O(D)
    self.rotation_matrix_ = Q  # shape (D, D)
    return self

transform(X)

Rotate X by the sampled orthogonal matrix: z = Q x.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input data.

Returns

Z : np.ndarray of shape (n_samples, n_features)
    Rotated data.

Source code in rbig/_src/rotation.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Rotate X by the sampled orthogonal matrix: z = Q x.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data.

    Returns
    -------
    Z : np.ndarray of shape (n_samples, n_features)
        Rotated data.
    """
    return X @ self.rotation_matrix_.T  # (N, D) @ (D, D) -> (N, D)

inverse_transform(X)

Invert the rotation: x = Q^T z = Q^{-1} z.

Because Q is orthogonal, its inverse equals its transpose.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Rotated data.

Returns

Xr : np.ndarray of shape (n_samples, n_features)
    Data recovered in the original space.

Source code in rbig/_src/rotation.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Invert the rotation: x = Q^T z = Q^{-1} z.

    Because Q is orthogonal, its inverse equals its transpose.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Rotated data.

    Returns
    -------
    Xr : np.ndarray of shape (n_samples, n_features)
        Data recovered in the original space.
    """
    return X @ self.rotation_matrix_  # Q^{-1} = Q^T -> (N, D) @ (D, D)

rbig.PicardRotation

Bases: RotationBijector

ICA rotation via the Picard algorithm with a FastICA fallback.

Fits an ICA model that learns maximally statistically-independent sources. When the optional picard package is available, it solves:

K, W = picard(X^T)
s = W K x

where K in R^{K x D} is the pre-whitening matrix and W in R^{K x K} is the Picard unmixing matrix. The log-det-Jacobian is:

log|det J| = log|det(W K)|

If picard is not installed (or incompatible), sklearn.decomposition.FastICA is used as a fallback.

Parameters

n_components : int or None, default=None
    Number of independent components K. If None, K = D.
extended : bool, default=False
    If True, use the extended Picard algorithm, which can handle both super- and sub-Gaussian sources (passed directly to picard).
random_state : int or None, default=None
    Seed for reproducible initialisation.
max_iter : int, default=500
    Maximum number of ICA iterations.
tol : float, default=1e-5
    Convergence tolerance for the ICA algorithm.
orthogonal : bool, default=True
    If True, apply only the orthogonal rotation W_ and skip the whitening K_, so the layer has a zero log-det-Jacobian. Requires a square unmixing matrix.

Attributes

K_ : np.ndarray of shape (n_components, n_features) or None
    Pre-whitening matrix (Picard path). None when using FastICA.
W_ : np.ndarray of shape (n_components, n_components) or None
    Unmixing matrix (Picard path). None when using FastICA.
use_picard_ : bool
    True if the Picard solver was used; False if FastICA was used.
ica_ : sklearn.decomposition.FastICA or None
    Fitted FastICA model (FastICA path only).

Notes

The log-det-Jacobian is:

log|det J| = log|det(W K)|

for the Picard path, or log|det(components_)| for the FastICA path. The Jacobian is constant because the transform is linear.

get_log_det_jacobian raises ValueError if the unmixing matrix is not square (i.e. n_components != n_features).
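For the non-orthogonal path, the inverse undoes W and then K via pseudo-inverses, mirroring `inverse_transform`. A numpy-only sketch with random square stand-ins for W and K:

```python
import numpy as np

rng = np.random.default_rng(3)
N, D = 6, 3
# Stand-ins for the unmixing and whitening matrices.
W = rng.standard_normal((D, D))
K = rng.standard_normal((D, D))
X = rng.standard_normal((N, D))

# Forward: whiten with K, then unmix with W (as in transform with orthogonal=False).
S = (X @ K.T) @ W.T
# Inverse: undo W, then K, with pseudo-inverses.  The round trip is exact here
# because both stand-ins are square and (with probability one) invertible.
Xr = (S @ np.linalg.pinv(W).T) @ np.linalg.pinv(K).T
assert np.allclose(X, Xr)
```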

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Ablin, P., Cardoso, J.-F., & Gramfort, A. (2018). Faster Independent Component Analysis by Preconditioning with Hessian Approximations. IEEE Transactions on Signal Processing, 66(15), 4040-4049.

Examples

>>> import numpy as np
>>> from rbig._src.rotation import PicardRotation
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((200, 3))
>>> pic = PicardRotation(random_state=0).fit(X)
>>> S = pic.transform(X)
>>> S.shape
(200, 3)

Source code in rbig/_src/rotation.py
class PicardRotation(RotationBijector):
    """ICA rotation via the Picard algorithm with a FastICA fallback.

    Fits an ICA model that learns maximally statistically-independent
    sources.  When the optional ``picard`` package is available, it solves::

        K, W = picard(X^T)
        s = W K x

    where ``K`` in R^{K x D} is the pre-whitening matrix and ``W`` in
    R^{K x K} is the Picard unmixing matrix.  The log-det-Jacobian is::

        log|det J| = log|det(W K)|

    If ``picard`` is not installed (or incompatible),
    :class:`sklearn.decomposition.FastICA` is used as a fallback.

    Parameters
    ----------
    n_components : int or None, default None
        Number of independent components K.  If ``None``, K = D.
    extended : bool, default False
        If True, use the extended Picard algorithm that can handle both
        super- and sub-Gaussian sources (passed directly to ``picard``).
    random_state : int or None, default None
        Seed for reproducible initialisation.
    max_iter : int, default 500
        Maximum number of ICA iterations.
    tol : float, default 1e-5
        Convergence tolerance for the ICA algorithm.
    orthogonal : bool, default True
        If True, apply only the orthogonal rotation ``W_`` and skip the
        whitening ``K_``, so the layer has a zero log-det-Jacobian.
        Requires a square unmixing matrix.

    Attributes
    ----------
    K_ : np.ndarray of shape (n_components, n_features) or None
        Pre-whitening matrix (Picard path).  ``None`` when using FastICA.
    W_ : np.ndarray of shape (n_components, n_components) or None
        Unmixing matrix (Picard path).  ``None`` when using FastICA.
    use_picard_ : bool
        True if the Picard solver was used; False if FastICA was used.
    ica_ : sklearn.decomposition.FastICA or None
        Fitted FastICA model (FastICA path only).

    Notes
    -----
    The log-det-Jacobian is::

        log|det J| = log|det(W K)|

    for the Picard path, or ``log|det(components_)|`` for the FastICA path.
    The Jacobian is constant because the transform is linear.

    :meth:`get_log_det_jacobian` raises :class:`ValueError` if the unmixing
    matrix is not square (i.e. ``n_components != n_features``).

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Ablin, P., Cardoso, J.-F., & Gramfort, A. (2018). Faster Independent
    Component Analysis by Preconditioning with Hessian Approximations.
    *IEEE Transactions on Signal Processing*, 66(15), 4040-4049.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.rotation import PicardRotation
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((200, 3))
    >>> pic = PicardRotation(random_state=0).fit(X)
    >>> S = pic.transform(X)
    >>> S.shape
    (200, 3)
    """

    def __init__(
        self,
        n_components: int | None = None,
        extended: bool = False,
        random_state: int | None = None,
        max_iter: int = 500,
        tol: float = 1e-5,
        orthogonal: bool = True,
    ):
        self.n_components = n_components
        self.extended = extended
        self.random_state = random_state
        self.max_iter = max_iter
        self.tol = tol
        self.orthogonal = orthogonal

    def fit(self, X: np.ndarray, y=None) -> PicardRotation:
        """Fit ICA (Picard if available, otherwise FastICA).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.

        Returns
        -------
        self : PicardRotation
            Fitted transform instance.
        """
        try:
            from picard import picard

            n = X.shape[1] if self.n_components is None else self.n_components
            # Picard expects (n_features, n_samples); returns K (whitening) and W (unmixing)
            K, W, _ = picard(
                X.T,  # (D, N)
                n_components=n,
                random_state=self.random_state,
                max_iter=self.max_iter,
                tol=self.tol,
                extended=self.extended,
            )
            self.K_ = K  # pre-whitening matrix, shape (K, D)
            self.W_ = W  # ICA unmixing matrix, shape (K, K)
            self.use_picard_ = True
            if (
                self.orthogonal
                and self.n_components is not None
                and self.n_components != X.shape[1]
            ):
                raise ValueError(
                    "orthogonal=True requires n_components=None or "
                    "n_components=n_features"
                )
        except (ImportError, TypeError):
            from sklearn.decomposition import FastICA

            # FastICA fallback when picard is unavailable or incompatible
            self.ica_ = FastICA(
                n_components=self.n_components,
                random_state=self.random_state,
                max_iter=self.max_iter,
            )
            self.ica_.fit(X)
            self.K_ = None
            self.W_ = None  # FastICA fallback does not use W_
            self.use_picard_ = False
            if (
                self.orthogonal
                and self.n_components is not None
                and self.n_components != X.shape[1]
            ):
                raise ValueError(
                    "orthogonal=True requires n_components=None or "
                    "n_components=n_features"
                ) from None
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the ICA unmixing.

        When ``orthogonal=True`` (default), applies only the orthogonal
        rotation W_ (skips whitening K_).  When ``orthogonal=False``,
        applies the full unmixing ``s = W K x``.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data in the original (mixed) space.

        Returns
        -------
        S : np.ndarray of shape (n_samples, n_components)
            Estimated independent components.
        """
        if not self.use_picard_:
            return self.ica_.transform(X)
        if self.orthogonal:
            return X @ self.W_.T  # (N, D) @ (D, D) -> (N, D)
        return (X @ self.K_.T) @ self.W_.T

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Invert the ICA unmixing.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_components)
            Independent-component representation.

        Returns
        -------
        Xr : np.ndarray of shape (n_samples, n_features)
            Data approximately recovered in the original mixed space.
        """
        if not self.use_picard_:
            return self.ica_.inverse_transform(X)
        if self.orthogonal:
            return X @ self.W_  # W_ orthogonal → W_^{-1} = W_^T
        return (X @ np.linalg.pinv(self.W_).T) @ np.linalg.pinv(self.K_).T

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Log absolute Jacobian determinant (constant for linear transforms).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points (used only to determine n_samples).

        Returns
        -------
        ldj : np.ndarray of shape (n_samples,)
            Constant per-sample log absolute Jacobian determinant.

        Raises
        ------
        ValueError
            If the unmixing matrix is not square
            (``n_components != n_features``).
        """
        if not self.use_picard_:
            W = self.ica_.components_
            if W.shape[0] != W.shape[1]:
                raise ValueError(
                    "PicardRotation.get_log_det_jacobian is only defined for square "
                    "unmixing matrices when using the FastICA fallback. Got "
                    f"components_ with shape {W.shape}. Ensure that `n_components` "
                    "is None or equals the number of features."
                )
            log_det = np.log(np.abs(np.linalg.det(W)))
        elif self.orthogonal:
            # W_ is orthogonal → |det W_| = 1 → log|det| = 0
            log_det = 0.0
        else:
            WK = self.W_ @ self.K_
            if WK.shape[0] != WK.shape[1]:
                raise ValueError(
                    "PicardRotation.get_log_det_jacobian is only defined for square "
                    "unmixing matrices when using the Picard solver. Got "
                    f"W @ K with shape {WK.shape}. Ensure that `n_components` "
                    "is None or equals the number of features."
                )
            log_det = np.log(np.abs(np.linalg.det(WK)))
        return np.full(X.shape[0], log_det)

fit(X, y=None)

Fit ICA (Picard if available, otherwise FastICA).

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Training data.

Returns

self : PicardRotation
    Fitted transform instance.

Source code in rbig/_src/rotation.py
def fit(self, X: np.ndarray, y=None) -> PicardRotation:
    """Fit ICA (Picard if available, otherwise FastICA).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.

    Returns
    -------
    self : PicardRotation
        Fitted transform instance.
    """
    try:
        from picard import picard

        n = X.shape[1] if self.n_components is None else self.n_components
        # Picard expects (n_features, n_samples); returns K (whitening) and W (unmixing)
        K, W, _ = picard(
            X.T,  # (D, N)
            n_components=n,
            random_state=self.random_state,
            max_iter=self.max_iter,
            tol=self.tol,
            extended=self.extended,
        )
        self.K_ = K  # pre-whitening matrix, shape (K, D)
        self.W_ = W  # ICA unmixing matrix, shape (K, K)
        self.use_picard_ = True
        if (
            self.orthogonal
            and self.n_components is not None
            and self.n_components != X.shape[1]
        ):
            raise ValueError(
                "orthogonal=True requires n_components=None or "
                "n_components=n_features"
            )
    except (ImportError, TypeError):
        from sklearn.decomposition import FastICA

        # FastICA fallback when picard is unavailable or incompatible
        self.ica_ = FastICA(
            n_components=self.n_components,
            random_state=self.random_state,
            max_iter=self.max_iter,
        )
        self.ica_.fit(X)
        self.K_ = None
        self.use_picard_ = False
        if (
            self.orthogonal
            and self.n_components is not None
            and self.n_components != X.shape[1]
        ):
            raise ValueError(
                "orthogonal=True requires n_components=None or "
                "n_components=n_features"
            ) from None
    return self

transform(X)

Apply the ICA unmixing.

When orthogonal=True (default), applies only the orthogonal rotation W_ (skips whitening K_). When orthogonal=False, applies the full unmixing s = W K x.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input data in the original (mixed) space.

Returns

S : np.ndarray of shape (n_samples, n_components)
    Estimated independent components.

Source code in rbig/_src/rotation.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the ICA unmixing.

    When ``orthogonal=True`` (default), applies only the orthogonal
    rotation W_ (skips whitening K_).  When ``orthogonal=False``,
    applies the full unmixing ``s = W K x``.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data in the original (mixed) space.

    Returns
    -------
    S : np.ndarray of shape (n_samples, n_components)
        Estimated independent components.
    """
    if not self.use_picard_:
        return self.ica_.transform(X)
    if self.orthogonal:
        return X @ self.W_.T  # (N, D) @ (D, D) -> (N, D)
    return (X @ self.K_.T) @ self.W_.T

inverse_transform(X)

Invert the ICA unmixing.

Parameters

X : np.ndarray of shape (n_samples, n_components)
    Independent-component representation.

Returns

Xr : np.ndarray of shape (n_samples, n_features)
    Data approximately recovered in the original mixed space.

Source code in rbig/_src/rotation.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Invert the ICA unmixing.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_components)
        Independent-component representation.

    Returns
    -------
    Xr : np.ndarray of shape (n_samples, n_features)
        Data approximately recovered in the original mixed space.
    """
    if not self.use_picard_:
        return self.ica_.inverse_transform(X)
    if self.orthogonal:
        return X @ self.W_  # W_ orthogonal → W_^{-1} = W_^T
    return (X @ np.linalg.pinv(self.W_).T) @ np.linalg.pinv(self.K_).T

get_log_det_jacobian(X)

Log absolute Jacobian determinant (constant for linear transforms).

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input points (used only to determine n_samples).

Returns

ldj : np.ndarray of shape (n_samples,)
    Constant per-sample log absolute Jacobian determinant.

Raises

ValueError
    If the unmixing matrix is not square (n_components != n_features).

Source code in rbig/_src/rotation.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Log absolute Jacobian determinant (constant for linear transforms).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points (used only to determine n_samples).

    Returns
    -------
    ldj : np.ndarray of shape (n_samples,)
        Constant per-sample log absolute Jacobian determinant.

    Raises
    ------
    ValueError
        If the unmixing matrix is not square
        (``n_components != n_features``).
    """
    if not self.use_picard_:
        W = self.ica_.components_
        if W.shape[0] != W.shape[1]:
            raise ValueError(
                "PicardRotation.get_log_det_jacobian is only defined for square "
                "unmixing matrices when using the FastICA fallback. Got "
                f"components_ with shape {W.shape}. Ensure that `n_components` "
                "is None or equals the number of features."
            )
        log_det = np.log(np.abs(np.linalg.det(W)))
    elif self.orthogonal:
        # W_ is orthogonal → |det W_| = 1 → log|det| = 0
        log_det = 0.0
    else:
        WK = self.W_ @ self.K_
        if WK.shape[0] != WK.shape[1]:
            raise ValueError(
                "PicardRotation.get_log_det_jacobian is only defined for square "
                "unmixing matrices when using the Picard solver. Got "
                f"W @ K with shape {WK.shape}. Ensure that `n_components` "
                "is None or equals the number of features."
            )
        log_det = np.log(np.abs(np.linalg.det(WK)))
    return np.full(X.shape[0], log_det)
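The two branches above can be checked with a minimal NumPy-only sketch (standalone matrices standing in for `W_` and `K_`, not a fitted `PicardRotation`): an orthogonal unmixing matrix contributes zero log-determinant, while a composed unmixing `W @ K` contributes the log absolute determinant of the product.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 4

# Orthogonal case: QR of a Gaussian matrix gives an orthogonal W,
# so log|det W| is (numerically) zero.
W, _ = np.linalg.qr(rng.standard_normal((D, D)))
log_det_orth = np.log(np.abs(np.linalg.det(W)))

# General case: compose W with a non-orthogonal "whitening" matrix K;
# slogdet is the numerically stable way to get log|det(W @ K)|.
K = np.diag(rng.uniform(0.5, 2.0, size=D))  # illustrative stand-in for K_
sign, log_det_full = np.linalg.slogdet(W @ K)

print(abs(log_det_orth) < 1e-8)
print(np.isclose(log_det_full, np.log(np.abs(np.linalg.det(W @ K)))))
```

Using `np.linalg.slogdet` avoids overflow/underflow in the determinant itself, which matters for high-dimensional unmixing matrices.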

rbig.RandomOrthogonalProjection

Bases: RotationBijector

Semi-orthogonal random projection from D to K dimensions via QR.

Generates a semi-orthogonal matrix P in R^{D x K} (K <= D) whose columns are orthonormal, obtained by taking the first K columns of a QR decomposition of a random Gaussian matrix::

A ~ N(0, 1)^{D x K},  A = Q R,  P = Q[:, :K]

The forward transform projects D-dimensional input to K dimensions::

z = X P   where P in R^{D x K}

Parameters

n_components : int or None, default None
    Output dimensionality K. If None, K = D (square case).
random_state : int or None, default None
    Seed for reproducible matrix generation.

Attributes

projection_matrix_ : np.ndarray of shape (n_features, n_components)
    Semi-orthogonal projection matrix P with orthonormal columns.
input_dim_ : int
    Input dimensionality D.
output_dim_ : int
    Output dimensionality K.

Notes

When K = D the matrix is fully orthogonal and log|det J| = 0. When K < D the transform is not invertible and both :meth:inverse_transform and :meth:get_log_det_jacobian raise :class:NotImplementedError.

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Examples

>>> import numpy as np
>>> from rbig._src.rotation import RandomOrthogonalProjection
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((100, 4))
>>> proj = RandomOrthogonalProjection(random_state=0).fit(X)
>>> Z = proj.transform(X)
>>> Z.shape
(100, 4)

Source code in rbig/_src/rotation.py
class RandomOrthogonalProjection(RotationBijector):
    """Semi-orthogonal random projection from D to K dimensions via QR.

    Generates a semi-orthogonal matrix P in R^{D x K} (K <= D) whose
    columns are orthonormal, obtained by taking the first K columns of a
    QR decomposition of a random Gaussian matrix::

        A ~ N(0, 1)^{D x K},  A = Q R,  P = Q[:, :K]

    The forward transform projects D-dimensional input to K dimensions::

        z = X P   where P in R^{D x K}

    Parameters
    ----------
    n_components : int or None, default None
        Output dimensionality K.  If ``None``, K = D (square case).
    random_state : int or None, default None
        Seed for reproducible matrix generation.

    Attributes
    ----------
    projection_matrix_ : np.ndarray of shape (n_features, n_components)
        Semi-orthogonal projection matrix P with orthonormal columns.
    input_dim_ : int
        Input dimensionality D.
    output_dim_ : int
        Output dimensionality K.

    Notes
    -----
    When K = D the matrix is fully orthogonal and ``log|det J| = 0``.
    When K < D the transform is not invertible and both
    :meth:`inverse_transform` and :meth:`get_log_det_jacobian` raise
    :class:`NotImplementedError`.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.rotation import RandomOrthogonalProjection
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((100, 4))
    >>> proj = RandomOrthogonalProjection(random_state=0).fit(X)
    >>> Z = proj.transform(X)
    >>> Z.shape
    (100, 4)
    """

    def __init__(
        self,
        n_components: int | None = None,
        random_state: int | None = None,
    ):
        self.n_components = n_components
        self.random_state = random_state

    def fit(self, X: np.ndarray, y=None) -> RandomOrthogonalProjection:
        """Build the semi-orthogonal projection matrix P.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data (used only to infer D = n_features).

        Returns
        -------
        self : RandomOrthogonalProjection
            Fitted transform instance.
        """
        rng = np.random.default_rng(self.random_state)
        D = X.shape[1]
        K = self.n_components if self.n_components is not None else D
        # Random Gaussian seed matrix; QR gives orthonormal columns
        A = rng.standard_normal((D, K))
        Q, _ = np.linalg.qr(A)
        self.projection_matrix_ = Q[:, :K]  # (D, K)  semi-orthogonal basis
        self.input_dim_ = D
        self.output_dim_ = K
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Project X from D to K dimensions: z = X P.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data.

        Returns
        -------
        Z : np.ndarray of shape (n_samples, n_components)
            Projected data.
        """
        return X @ self.projection_matrix_  # (N, D) @ (D, K) -> (N, K)

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Invert the projection (only valid for square case K = D).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_components)
            Projected data.

        Returns
        -------
        Xr : np.ndarray of shape (n_samples, n_features)
            Data recovered in the original space (exact only when K = D).

        Raises
        ------
        NotImplementedError
            If ``n_components < n_features`` (projection is not invertible).
        """
        if self.output_dim_ < self.input_dim_:
            raise NotImplementedError(
                "RandomOrthogonalProjection with n_components < input dimension "
                "is not bijective; inverse_transform is undefined."
            )
        return X @ self.projection_matrix_.T  # exact inverse only when square (N, D)

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Return zeros for the square (bijective) case.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points (used only to determine n_samples).

        Returns
        -------
        ldj : np.ndarray of shape (n_samples,)
            Zeros, because ``|det P| = 1`` for a square orthogonal matrix.

        Raises
        ------
        NotImplementedError
            If ``n_components < n_features`` (Jacobian determinant undefined).
        """
        if self.output_dim_ < self.input_dim_:
            raise NotImplementedError(
                "RandomOrthogonalProjection with n_components < input dimension "
                "does not have a well-defined Jacobian determinant."
            )
        # For a square orthogonal matrix, |det(J)| = 1, so log|det(J)| = 0.
        return np.zeros(X.shape[0])

fit(X, y=None)

Build the semi-orthogonal projection matrix P.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Training data (used only to infer D = n_features).

Returns

self : RandomOrthogonalProjection
    Fitted transform instance.

Source code in rbig/_src/rotation.py
def fit(self, X: np.ndarray, y=None) -> RandomOrthogonalProjection:
    """Build the semi-orthogonal projection matrix P.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data (used only to infer D = n_features).

    Returns
    -------
    self : RandomOrthogonalProjection
        Fitted transform instance.
    """
    rng = np.random.default_rng(self.random_state)
    D = X.shape[1]
    K = self.n_components if self.n_components is not None else D
    # Random Gaussian seed matrix; QR gives orthonormal columns
    A = rng.standard_normal((D, K))
    Q, _ = np.linalg.qr(A)
    self.projection_matrix_ = Q[:, :K]  # (D, K)  semi-orthogonal basis
    self.input_dim_ = D
    self.output_dim_ = K
    return self

transform(X)

Project X from D to K dimensions: z = X P.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input data.

Returns

Z : np.ndarray of shape (n_samples, n_components)
    Projected data.

Source code in rbig/_src/rotation.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Project X from D to K dimensions: z = X P.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data.

    Returns
    -------
    Z : np.ndarray of shape (n_samples, n_components)
        Projected data.
    """
    return X @ self.projection_matrix_  # (N, D) @ (D, K) -> (N, K)

inverse_transform(X)

Invert the projection (only valid for square case K = D).

Parameters

X : np.ndarray of shape (n_samples, n_components)
    Projected data.

Returns

Xr : np.ndarray of shape (n_samples, n_features)
    Data recovered in the original space (exact only when K = D).

Raises

NotImplementedError
    If n_components < n_features (projection is not invertible).

Source code in rbig/_src/rotation.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Invert the projection (only valid for square case K = D).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_components)
        Projected data.

    Returns
    -------
    Xr : np.ndarray of shape (n_samples, n_features)
        Data recovered in the original space (exact only when K = D).

    Raises
    ------
    NotImplementedError
        If ``n_components < n_features`` (projection is not invertible).
    """
    if self.output_dim_ < self.input_dim_:
        raise NotImplementedError(
            "RandomOrthogonalProjection with n_components < input dimension "
            "is not bijective; inverse_transform is undefined."
        )
    return X @ self.projection_matrix_.T  # exact inverse only when square (N, D)

get_log_det_jacobian(X)

Return zeros for the square (bijective) case.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input points (used only to determine n_samples).

Returns

ldj : np.ndarray of shape (n_samples,)
    Zeros, because |det P| = 1 for a square orthogonal matrix.

Raises

NotImplementedError
    If n_components < n_features (Jacobian determinant undefined).

Source code in rbig/_src/rotation.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Return zeros for the square (bijective) case.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points (used only to determine n_samples).

    Returns
    -------
    ldj : np.ndarray of shape (n_samples,)
        Zeros, because ``|det P| = 1`` for a square orthogonal matrix.

    Raises
    ------
    NotImplementedError
        If ``n_components < n_features`` (Jacobian determinant undefined).
    """
    if self.output_dim_ < self.input_dim_:
        raise NotImplementedError(
            "RandomOrthogonalProjection with n_components < input dimension "
            "does not have a well-defined Jacobian determinant."
        )
    # For a square orthogonal matrix, |det(J)| = 1, so log|det(J)| = 0.
    return np.zeros(X.shape[0])
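The QR construction used by `fit` can be sketched with plain NumPy (standalone arrays, not the class itself): the rectangular case yields orthonormal columns, and the square case yields an exact round-trip.

```python
import numpy as np

rng = np.random.default_rng(0)
D, K = 6, 3

# First K columns of the QR of a Gaussian matrix: orthonormal columns.
A = rng.standard_normal((D, K))
Q, _ = np.linalg.qr(A)   # reduced QR: Q is already (D, K)
P = Q[:, :K]

# Semi-orthogonality: P^T P = I_K (but P P^T != I_D when K < D).
print(np.allclose(P.T @ P, np.eye(K)))

# Square case (K = D): the projection is a rotation, so transposing
# it back recovers X exactly, matching inverse_transform.
Pd, _ = np.linalg.qr(rng.standard_normal((D, D)))
X = rng.standard_normal((10, D))
print(np.allclose(X @ Pd @ Pd.T, X))
```

This is why `inverse_transform` can simply use `projection_matrix_.T` in the square case, and why no exact inverse exists when K < D.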

rbig.GaussianRandomProjection

Bases: RotationBijector

Johnson-Lindenstrauss style random projection with Gaussian entries.

Constructs a random projection matrix M in R^{D x K} whose entries are drawn i.i.d. from N(0, 1/K)::

M_ij ~ N(0, 1/K)

The 1/K normalisation approximately preserves pairwise Euclidean distances (Johnson-Lindenstrauss lemma)::

(1 - eps)||x - y||^2 <= ||Mx - My||^2 <= (1 + eps)||x - y||^2

with high probability when K = O(eps^{-2} log n).

Parameters

n_components : int or None, default None
    Output dimensionality K. If None, K = D (square case).
random_state : int or None, default None
    Seed for reproducible matrix generation.

Attributes

matrix_ : np.ndarray of shape (n_features, n_components)
    The random projection matrix with entries ~ N(0, 1/K).

Notes

Unlike :class:RandomOrthogonalProjection, the columns of this matrix are not orthogonal, so |det M| != 1 in general. :meth:get_log_det_jacobian returns zeros as an approximation. For density estimation where accuracy matters, prefer :class:RandomOrthogonalProjection or :class:RandomRotation.

The inverse uses the Moore-Penrose pseudoinverse computed by :func:numpy.linalg.pinv.

References

Johnson, W. B., & Lindenstrauss, J. (1984). Extensions of Lipschitz mappings into a Hilbert space. Contemporary Mathematics, 26, 189-206.

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Examples

>>> import numpy as np
>>> from rbig._src.rotation import GaussianRandomProjection
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((100, 4))
>>> grp = GaussianRandomProjection(random_state=0).fit(X)
>>> Z = grp.transform(X)
>>> Z.shape
(100, 4)

Source code in rbig/_src/rotation.py
class GaussianRandomProjection(RotationBijector):
    """Johnson-Lindenstrauss style random projection with Gaussian entries.

    Constructs a random projection matrix M in R^{D x K} whose entries are
    drawn i.i.d. from N(0, 1/K)::

        M_ij ~ N(0, 1/K)

    The 1/K normalisation approximately preserves pairwise Euclidean
    distances (Johnson-Lindenstrauss lemma)::

        (1 - eps)||x - y||^2 <= ||Mx - My||^2 <= (1 + eps)||x - y||^2

    with high probability when K = O(eps^{-2} log n).

    Parameters
    ----------
    n_components : int or None, default None
        Output dimensionality K.  If ``None``, K = D (square case).
    random_state : int or None, default None
        Seed for reproducible matrix generation.

    Attributes
    ----------
    matrix_ : np.ndarray of shape (n_features, n_components)
        The random projection matrix with entries ~ N(0, 1/K).

    Notes
    -----
    Unlike :class:`RandomOrthogonalProjection`, the columns of this matrix
    are *not* orthogonal, so ``|det M| != 1`` in general.
    :meth:`get_log_det_jacobian` returns zeros as an approximation.
    For density estimation where accuracy matters, prefer
    :class:`RandomOrthogonalProjection` or :class:`RandomRotation`.

    The inverse uses the Moore-Penrose pseudoinverse computed by
    :func:`numpy.linalg.pinv`.

    References
    ----------
    Johnson, W. B., & Lindenstrauss, J. (1984). Extensions of Lipschitz
    mappings into a Hilbert space. *Contemporary Mathematics*, 26, 189-206.

    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.rotation import GaussianRandomProjection
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((100, 4))
    >>> grp = GaussianRandomProjection(random_state=0).fit(X)
    >>> Z = grp.transform(X)
    >>> Z.shape
    (100, 4)
    """

    def __init__(
        self,
        n_components: int | None = None,
        random_state: int | None = None,
    ):
        self.n_components = n_components
        self.random_state = random_state

    def fit(self, X: np.ndarray, y=None) -> GaussianRandomProjection:
        """Build the Gaussian random projection matrix.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data (used only to infer D = n_features).

        Returns
        -------
        self : GaussianRandomProjection
            Fitted transform instance.
        """
        rng = np.random.default_rng(self.random_state)
        D = X.shape[1]
        K = self.n_components if self.n_components is not None else D
        # Entries drawn from N(0, 1), then scaled by 1/sqrt(K) for distance preservation
        self.matrix_ = rng.standard_normal((D, K)) / np.sqrt(K)  # (D, K)
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Project X: z = X M.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data.

        Returns
        -------
        Z : np.ndarray of shape (n_samples, n_components)
            Projected data.
        """
        return X @ self.matrix_  # (N, D) @ (D, K) -> (N, K)

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Approximate inverse via the Moore-Penrose pseudoinverse.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_components)
            Projected data.

        Returns
        -------
        Xr : np.ndarray of shape (n_samples, n_features)
            Approximately recovered original data.
        """
        # Pseudoinverse: M^+ has shape (K, D), so X @ M^+ gives (N, D)
        return X @ np.linalg.pinv(self.matrix_)  # (N, K) @ (K, D) -> (N, D)

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Return zeros (approximation; Gaussian projections are not isometric).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points (used only to determine n_samples).

        Returns
        -------
        ldj : np.ndarray of shape (n_samples,)
            Zeros (approximate; the true log-det is generally non-zero).
        """
        return np.zeros(X.shape[0])

fit(X, y=None)

Build the Gaussian random projection matrix.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Training data (used only to infer D = n_features).

Returns

self : GaussianRandomProjection
    Fitted transform instance.

Source code in rbig/_src/rotation.py
def fit(self, X: np.ndarray, y=None) -> GaussianRandomProjection:
    """Build the Gaussian random projection matrix.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data (used only to infer D = n_features).

    Returns
    -------
    self : GaussianRandomProjection
        Fitted transform instance.
    """
    rng = np.random.default_rng(self.random_state)
    D = X.shape[1]
    K = self.n_components if self.n_components is not None else D
    # Entries drawn from N(0, 1), then scaled by 1/sqrt(K) for distance preservation
    self.matrix_ = rng.standard_normal((D, K)) / np.sqrt(K)  # (D, K)
    return self

transform(X)

Project X: z = X M.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input data.

Returns

Z : np.ndarray of shape (n_samples, n_components)
    Projected data.

Source code in rbig/_src/rotation.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Project X: z = X M.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data.

    Returns
    -------
    Z : np.ndarray of shape (n_samples, n_components)
        Projected data.
    """
    return X @ self.matrix_  # (N, D) @ (D, K) -> (N, K)

inverse_transform(X)

Approximate inverse via the Moore-Penrose pseudoinverse.

Parameters

X : np.ndarray of shape (n_samples, n_components)
    Projected data.

Returns

Xr : np.ndarray of shape (n_samples, n_features)
    Approximately recovered original data.

Source code in rbig/_src/rotation.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Approximate inverse via the Moore-Penrose pseudoinverse.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_components)
        Projected data.

    Returns
    -------
    Xr : np.ndarray of shape (n_samples, n_features)
        Approximately recovered original data.
    """
    # Pseudoinverse: M^+ has shape (K, D), so X @ M^+ gives (N, D)
    return X @ np.linalg.pinv(self.matrix_)  # (N, K) @ (K, D) -> (N, D)

get_log_det_jacobian(X)

Return zeros (approximation; Gaussian projections are not isometric).

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input points (used only to determine n_samples).

Returns

ldj : np.ndarray of shape (n_samples,)
    Zeros (approximate; the true log-det is generally non-zero).

Source code in rbig/_src/rotation.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Return zeros (approximation; Gaussian projections are not isometric).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points (used only to determine n_samples).

    Returns
    -------
    ldj : np.ndarray of shape (n_samples,)
        Zeros (approximate; the true log-det is generally non-zero).
    """
    return np.zeros(X.shape[0])
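The distance-preservation property behind this projection can be illustrated with a NumPy-only sketch (a standalone matrix drawn the same way as `matrix_`, with illustrative choices of D, K, and sample count): pairwise squared-distance ratios cluster around 1, but the map is not an exact isometry, which is why the zero log-det above is only an approximation.

```python
import numpy as np

rng = np.random.default_rng(0)
D, K = 500, 400   # K large enough that the JL distortion is small
n = 5

# Projection matrix with entries ~ N(0, 1/K), as in the fitted matrix_.
M = rng.standard_normal((D, K)) / np.sqrt(K)

X = rng.standard_normal((n, D))
Z = X @ M

# Ratios of projected to original pairwise squared distances.
ratios = []
for i in range(n):
    for j in range(i + 1, n):
        d_orig = np.sum((X[i] - X[j]) ** 2)
        d_proj = np.sum((Z[i] - Z[j]) ** 2)
        ratios.append(d_proj / d_orig)

print(all(0.7 < r < 1.3 for r in ratios))
```

Each ratio concentrates around 1 with standard deviation on the order of sqrt(2/K), which is the quantitative content of the Johnson-Lindenstrauss bound quoted above.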

rbig.OrthogonalDimensionalityReduction

Bases: RotationBijector

Full orthogonal rotation followed by optional dimension truncation.

Applies a D x D orthogonal rotation Q (drawn from the Haar measure via QR) and then retains only the first K <= D components::

z = (X Q^T)[:, :K]

The rotation is sampled fresh at fit time from a square standard-normal matrix processed through QR with sign correction.

Parameters

n_components : int or None, default None
    Number of output dimensions K. If None, K = D (no truncation).
random_state : int or None, default None
    Seed for reproducible rotation matrix generation.

Attributes

rotation_matrix_ : np.ndarray of shape (n_features, n_features)
    Full D x D orthogonal rotation matrix Q.
n_components_ : int
    Number of retained output dimensions K.
input_dim_ : int
    Input dimensionality D.

Notes

When K = D the transform is a bijection and::

log|det J| = log|det Q| = 0

When K < D the transform is not invertible; both :meth:inverse_transform and :meth:get_log_det_jacobian raise :class:NotImplementedError.

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Examples

>>> import numpy as np
>>> from rbig._src.rotation import OrthogonalDimensionalityReduction
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((100, 4))
>>> odr = OrthogonalDimensionalityReduction(random_state=0).fit(X)
>>> Z = odr.transform(X)
>>> Z.shape
(100, 4)
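The rotate-then-truncate step can be sketched with plain NumPy (standalone arrays, not the class): a Haar-style rotation preserves per-sample norms, and truncation simply keeps the first K components.

```python
import numpy as np

rng = np.random.default_rng(0)
D, K = 4, 2

# Haar-style rotation via QR of a square Gaussian matrix; multiplying
# columns by sign(diag(R)) keeps the distribution uniform over rotations.
A = rng.standard_normal((D, D))
Q, R = np.linalg.qr(A)
Q = Q * np.sign(np.diag(R))

X = rng.standard_normal((100, D))
Z_full = X @ Q.T        # square case: pure rotation, norms preserved
Z = Z_full[:, :K]       # truncate to the first K components

print(np.allclose(np.linalg.norm(Z_full, axis=1), np.linalg.norm(X, axis=1)))
print(Z.shape)
```

The norm check makes the Notes above concrete: before truncation the transform is a bijection with log|det J| = 0; after truncation the dropped coordinates make both the inverse and the Jacobian determinant undefined.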

Source code in rbig/_src/rotation.py
class OrthogonalDimensionalityReduction(RotationBijector):
    """Full orthogonal rotation followed by optional dimension truncation.

    Applies a D x D orthogonal rotation Q (drawn from the Haar measure via
    QR) and then retains only the first K <= D components::

        z = (X Q^T)[:, :K]

    The rotation is sampled fresh at ``fit`` time from a square
    standard-normal matrix processed through QR with sign correction.

    Parameters
    ----------
    n_components : int or None, default None
        Number of output dimensions K.  If ``None``, K = D (no truncation).
    random_state : int or None, default None
        Seed for reproducible rotation matrix generation.

    Attributes
    ----------
    rotation_matrix_ : np.ndarray of shape (n_features, n_features)
        Full D x D orthogonal rotation matrix Q.
    n_components_ : int
        Number of retained output dimensions K.
    input_dim_ : int
        Input dimensionality D.

    Notes
    -----
    When K = D the transform is a bijection and::

        log|det J| = log|det Q| = 0

    When K < D the transform is not invertible; both
    :meth:`inverse_transform` and :meth:`get_log_det_jacobian` raise
    :class:`NotImplementedError`.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.rotation import OrthogonalDimensionalityReduction
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((100, 4))
    >>> odr = OrthogonalDimensionalityReduction(random_state=0).fit(X)
    >>> Z = odr.transform(X)
    >>> Z.shape
    (100, 4)
    """

    def __init__(
        self,
        n_components: int | None = None,
        random_state: int | None = None,
    ):
        self.n_components = n_components
        self.random_state = random_state

    def fit(self, X: np.ndarray, y=None) -> OrthogonalDimensionalityReduction:
        """Sample a Haar-uniform D x D rotation matrix.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data (used only to infer D = n_features).

        Returns
        -------
        self : OrthogonalDimensionalityReduction
            Fitted transform instance.
        """
        rng = np.random.default_rng(self.random_state)
        D = X.shape[1]
        K = self.n_components if self.n_components is not None else D
        # QR of a random Gaussian matrix gives a Haar-uniform orthogonal matrix
        A = rng.standard_normal((D, D))
        Q, _ = np.linalg.qr(A)
        self.rotation_matrix_ = Q  # (D, D)
        self.n_components_ = K
        self.input_dim_ = D
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Rotate then truncate: z = (X Q^T)[:, :K].

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data.

        Returns
        -------
        Z : np.ndarray of shape (n_samples, n_components)
            Rotated and (optionally) truncated data.
        """
        Xr = X @ self.rotation_matrix_.T  # (N, D) @ (D, D) -> (N, D)  full rotation
        return Xr[:, : self.n_components_]  # (N, K)  keep first K components

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Invert the rotation (only valid for square case K = D).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_components)
            Rotated data.

        Returns
        -------
        Xr : np.ndarray of shape (n_samples, n_features)
            Data recovered in the original space.

        Raises
        ------
        NotImplementedError
            If ``n_components < n_features`` (not invertible).
        """
        if self.n_components_ < self.input_dim_:
            raise NotImplementedError(
                "OrthogonalDimensionalityReduction with n_components < input dimension "
                "is not bijective; inverse_transform is undefined."
            )
        return X @ self.rotation_matrix_  # (N, D) @ (D, D) -> (N, D)

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Return zeros for the square (bijective) case.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points (used only to determine n_samples).

        Returns
        -------
        ldj : np.ndarray of shape (n_samples,)
            Zeros, because ``|det Q| = 1`` for any orthogonal Q.

        Raises
        ------
        NotImplementedError
            If ``n_components < n_features`` (Jacobian determinant undefined).
        """
        if self.n_components_ < self.input_dim_:
            raise NotImplementedError(
                "OrthogonalDimensionalityReduction with n_components < input dimension "
                "does not have a well-defined Jacobian determinant."
            )
        return np.zeros(X.shape[0])

fit(X, y=None)

Sample a Haar-uniform D x D rotation matrix.

Parameters

X : np.ndarray of shape (n_samples, n_features) Training data (used only to infer D = n_features).

Returns

self : OrthogonalDimensionalityReduction Fitted transform instance.

Source code in rbig/_src/rotation.py
def fit(self, X: np.ndarray, y=None) -> OrthogonalDimensionalityReduction:
    """Sample a Haar-uniform D x D rotation matrix.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data (used only to infer D = n_features).

    Returns
    -------
    self : OrthogonalDimensionalityReduction
        Fitted transform instance.
    """
    rng = np.random.default_rng(self.random_state)
    D = X.shape[1]
    K = self.n_components if self.n_components is not None else D
    # QR of a random Gaussian matrix gives a Haar-uniform orthogonal matrix
    A = rng.standard_normal((D, D))
    Q, _ = np.linalg.qr(A)
    self.rotation_matrix_ = Q  # (D, D)
    self.n_components_ = K
    self.input_dim_ = D
    return self
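The `fit` shown above draws Q with a plain `np.linalg.qr` call, while the docstring mentions a sign correction. A minimal self-contained sketch of the standard sign-corrected recipe for an exactly Haar-uniform orthogonal matrix (multiply each column of Q by the sign of the matching diagonal entry of R; this is a common construction, not necessarily what this library does):

```python
import numpy as np

rng = np.random.default_rng(0)
D = 4
A = rng.standard_normal((D, D))   # square standard-normal matrix
Q, R = np.linalg.qr(A)            # QR factorisation
Q = Q * np.sign(np.diag(R))       # sign correction -> exactly Haar-uniform Q

# Q stays orthogonal after the column sign flips, so |det Q| = 1
assert np.allclose(Q @ Q.T, np.eye(D))
```

Without the sign correction, the distribution of Q depends on the QR implementation's sign conventions.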

transform(X)

Rotate then truncate: z = (X Q^T)[:, :K].

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data.

Returns

Z : np.ndarray of shape (n_samples, n_components) Rotated and (optionally) truncated data.

Source code in rbig/_src/rotation.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Rotate then truncate: z = (X Q^T)[:, :K].

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data.

    Returns
    -------
    Z : np.ndarray of shape (n_samples, n_components)
        Rotated and (optionally) truncated data.
    """
    Xr = X @ self.rotation_matrix_.T  # (N, D) @ (D, D) -> (N, D)  full rotation
    return Xr[:, : self.n_components_]  # (N, K)  keep first K components

inverse_transform(X)

Invert the rotation (only valid for square case K = D).

Parameters

X : np.ndarray of shape (n_samples, n_components) Rotated data.

Returns

Xr : np.ndarray of shape (n_samples, n_features) Data recovered in the original space.

Raises

NotImplementedError If n_components < n_features (not invertible).

Source code in rbig/_src/rotation.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Invert the rotation (only valid for square case K = D).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_components)
        Rotated data.

    Returns
    -------
    Xr : np.ndarray of shape (n_samples, n_features)
        Data recovered in the original space.

    Raises
    ------
    NotImplementedError
        If ``n_components < n_features`` (not invertible).
    """
    if self.n_components_ < self.input_dim_:
        raise NotImplementedError(
            "OrthogonalDimensionalityReduction with n_components < input dimension "
            "is not bijective; inverse_transform is undefined."
        )
    return X @ self.rotation_matrix_  # (N, D) @ (D, D) -> (N, D)

get_log_det_jacobian(X)

Return zeros for the square (bijective) case.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input points (used only to determine n_samples).

Returns

ldj : np.ndarray of shape (n_samples,) Zeros, because |det Q| = 1 for any orthogonal Q.

Raises

NotImplementedError If n_components < n_features (Jacobian determinant undefined).

Source code in rbig/_src/rotation.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Return zeros for the square (bijective) case.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points (used only to determine n_samples).

    Returns
    -------
    ldj : np.ndarray of shape (n_samples,)
        Zeros, because ``|det Q| = 1`` for any orthogonal Q.

    Raises
    ------
    NotImplementedError
        If ``n_components < n_features`` (Jacobian determinant undefined).
    """
    if self.n_components_ < self.input_dim_:
        raise NotImplementedError(
            "OrthogonalDimensionalityReduction with n_components < input dimension "
            "does not have a well-defined Jacobian determinant."
        )
    return np.zeros(X.shape[0])
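Because Q is orthogonal, the rotation in `transform` preserves Euclidean norms, and the truncation to the first K columns can only discard energy. A self-contained sketch mirroring the code above (re-implemented inline for illustration, without importing rbig):

```python
import numpy as np

rng = np.random.default_rng(0)
D, K = 4, 2
X = rng.standard_normal((100, D))

# Sample an orthogonal Q as in fit() above, rotate, then keep the first K columns
Q, _ = np.linalg.qr(rng.standard_normal((D, D)))
full = X @ Q.T                  # (100, 4): full rotation, norm-preserving
Z = full[:, :K]                 # (100, 2): truncated, not invertible

assert np.allclose(np.linalg.norm(full, axis=1), np.linalg.norm(X, axis=1))
assert np.all(np.linalg.norm(Z, axis=1) <= np.linalg.norm(full, axis=1) + 1e-12)
```

This is why the K < D case raises NotImplementedError in `inverse_transform` and `get_log_det_jacobian`: the discarded components cannot be recovered.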

Parametric Transforms

rbig.BoxCoxTransform

Bases: BaseTransform

Box-Cox power transform fitted independently to each feature.

The Box-Cox family of transforms is parameterised by λ (one per feature):

λ ≠ 0 :  y = (x^λ − 1) / λ
λ → 0 :  y = log(x)             (continuity limit)

λ values are estimated via maximum likelihood (scipy's boxcox). Features with non-positive values are left unchanged (λ = 0 applied as identity rather than log, since log requires positive inputs).

The inverse transform is:

λ ≠ 0 :  x = (λy + 1)^{1/λ}
λ = 0 :  x = exp(y)

The log-det of the Jacobian is:

λ ≠ 0 :  ∑ᵢ (λ − 1) log xᵢ
λ = 0 :  ∑ᵢ (−log xᵢ)          (from d(log x)/dx = 1/x ⟹ log|dy/dx| = −log xᵢ)

Note: The current implementation uses −xᵢ (not −log xᵢ) for the λ = 0 branch, matching the original code behaviour. This is an approximation that differs from the exact analytical log-det.

Parameters

method : str, optional (default='mle') Fitting method; currently scipy's MLE is always used regardless of this value.

Attributes

lambdas_ : np.ndarray of shape (n_features,) Fitted λ values after calling fit.

Examples

>>> import numpy as np
>>> from rbig._src.parametric import BoxCoxTransform
>>> rng = np.random.default_rng(1)
>>> X = rng.exponential(scale=2.0, size=(200, 3))  # strictly positive
>>> tr = BoxCoxTransform().fit(X)
>>> Y = tr.transform(X)
>>> X_rec = tr.inverse_transform(Y)
>>> np.allclose(X, X_rec, atol=1e-6)
True

Source code in rbig/_src/parametric.py
class BoxCoxTransform(BaseTransform):
    """Box-Cox power transform fitted independently to each feature.

    The Box-Cox family of transforms is parameterised by λ (one per feature):

        λ ≠ 0 :  y = (x^λ − 1) / λ
        λ → 0 :  y = log(x)             (continuity limit)

    λ values are estimated via maximum likelihood (scipy's ``boxcox``).
    Features with non-positive values are left unchanged (λ = 0 applied as
    identity rather than log, since log requires positive inputs).

    The inverse transform is:

        λ ≠ 0 :  x = (λy + 1)^{1/λ}
        λ = 0 :  x = exp(y)

    The log-det of the Jacobian is:

        λ ≠ 0 :  ∑ᵢ (λ − 1) log xᵢ
        λ = 0 :  ∑ᵢ (−log xᵢ)          (from d(log x)/dx = 1/x ⟹ log|dy/dx| = −log xᵢ)

    .. note::
        The current implementation uses ``−xᵢ`` (not ``−log xᵢ``) for the
        λ = 0 branch, matching the original code behaviour.  This is an
        approximation that differs from the exact analytical log-det.

    Parameters
    ----------
    method : str, optional (default='mle')
        Fitting method; currently scipy's MLE is always used regardless of
        this value.

    Attributes
    ----------
    lambdas_ : np.ndarray of shape (n_features,)
        Fitted λ values after calling ``fit``.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.parametric import BoxCoxTransform
    >>> rng = np.random.default_rng(1)
    >>> X = rng.exponential(scale=2.0, size=(200, 3))  # strictly positive
    >>> tr = BoxCoxTransform().fit(X)
    >>> Y = tr.transform(X)
    >>> X_rec = tr.inverse_transform(Y)
    >>> np.allclose(X, X_rec, atol=1e-6)
    True
    """

    def __init__(self, method: str = "mle"):
        self.method = method

    def fit(self, X: np.ndarray, y=None) -> BoxCoxTransform:
        """Estimate one Box-Cox λ per feature via MLE.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.  Features that contain non-positive values are
            assigned λ = 0 (no transform applied during ``transform``).

        Returns
        -------
        self : BoxCoxTransform
            Fitted instance with ``lambdas_`` attribute set.
        """
        self.lambdas_ = np.zeros(X.shape[1])  # λ per feature, default 0
        for i in range(X.shape[1]):
            xi = X[:, i]
            if np.all(xi > 0):
                _, lam = stats.boxcox(xi)  # MLE for λ
            else:
                lam = 0.0  # non-positive data: no power transform
            self.lambdas_[i] = lam
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the fitted Box-Cox transform to X.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data.  Features with λ = 0 and non-positive values are
            passed through unchanged.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Box-Cox transformed data.
        """
        Xt = np.zeros_like(X, dtype=float)
        for i in range(X.shape[1]):
            xi = X[:, i]
            lam = self.lambdas_[i]
            if np.all(xi > 0):
                Xt[:, i] = stats.boxcox(xi, lmbda=lam)  # y = (x^lam - 1)/lam or log(x)
            else:
                Xt[:, i] = xi  # pass-through for non-positive
        return Xt

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Invert the Box-Cox transform.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Box-Cox transformed data.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Recovered original-scale data.  Uses:

            * λ = 0 : x = exp(y)
            * λ ≠ 0 : x = (λy + 1)^{1/λ}   (clamped to 0 for stability)
        """
        Xt = np.zeros_like(X, dtype=float)
        for i in range(X.shape[1]):
            lam = self.lambdas_[i]
            if np.abs(lam) < 1e-10:
                Xt[:, i] = np.exp(X[:, i])  # x = exp(y)
            else:
                # x = (λy + 1)^{1/λ}, clamp argument to ≥ 0
                Xt[:, i] = np.power(np.maximum(lam * X[:, i] + 1, 0), 1 / lam)
        return Xt

    def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Compute per-sample log |det J| of the forward Box-Cox transform.

        The Jacobian is diagonal; for each feature:

            λ ≠ 0 :  d/dx[(x^λ−1)/λ] = x^{λ−1}  ⟹  log = (λ−1) log x
            λ = 0 :  d/dx[log x] = 1/x           ⟹  exact log = −log x

        .. note::
            The λ = 0 branch accumulates ``−xᵢ`` rather than the exact
            ``−log xᵢ``.  This preserves the original implementation
            behaviour.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data (pre-transform, original scale).

        Returns
        -------
        log_det : np.ndarray of shape (n_samples,)
            Sum of per-feature log Jacobian contributions.
        """
        log_jac = np.zeros(X.shape[0])
        for i in range(X.shape[1]):
            xi = X[:, i]
            lam = self.lambdas_[i]
            if np.abs(lam) < 1e-10:
                log_jac += -xi  # original behaviour: -x instead of the exact -log x (see note above)
            else:
                # (lam-1) log xi from x^{lam-1} Jacobian
                log_jac += (lam - 1) * np.log(np.maximum(xi, 1e-300))
        return log_jac

fit(X, y=None)

Estimate one Box-Cox λ per feature via MLE.

Parameters

X : np.ndarray of shape (n_samples, n_features) Training data. Features that contain non-positive values are assigned λ = 0 (no transform applied during transform).

Returns

self : BoxCoxTransform Fitted instance with lambdas_ attribute set.

Source code in rbig/_src/parametric.py
def fit(self, X: np.ndarray, y=None) -> BoxCoxTransform:
    """Estimate one Box-Cox λ per feature via MLE.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.  Features that contain non-positive values are
        assigned λ = 0 (no transform applied during ``transform``).

    Returns
    -------
    self : BoxCoxTransform
        Fitted instance with ``lambdas_`` attribute set.
    """
    self.lambdas_ = np.zeros(X.shape[1])  # λ per feature, default 0
    for i in range(X.shape[1]):
        xi = X[:, i]
        if np.all(xi > 0):
            _, lam = stats.boxcox(xi)  # MLE for λ
        else:
            lam = 0.0  # non-positive data: no power transform
        self.lambdas_[i] = lam
    return self
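`stats.boxcox` called with one argument returns both the transformed data and the MLE λ, while `lmbda=` fixes λ and returns only the transform. A quick self-contained illustration of the fitting step above:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.exponential(scale=2.0, size=1000)   # strictly positive feature

y, lam = stats.boxcox(x)                    # MLE lambda and transformed data
y_again = stats.boxcox(x, lmbda=lam)        # same transform with lambda fixed
assert np.allclose(y, y_again)
```

For non-positive features the code above skips this call entirely and records λ = 0.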

transform(X)

Apply the fitted Box-Cox transform to X.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data. Features with λ = 0 and non-positive values are passed through unchanged.

Returns

Xt : np.ndarray of shape (n_samples, n_features) Box-Cox transformed data.

Source code in rbig/_src/parametric.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the fitted Box-Cox transform to X.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data.  Features with λ = 0 and non-positive values are
        passed through unchanged.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Box-Cox transformed data.
    """
    Xt = np.zeros_like(X, dtype=float)
    for i in range(X.shape[1]):
        xi = X[:, i]
        lam = self.lambdas_[i]
        if np.all(xi > 0):
            Xt[:, i] = stats.boxcox(xi, lmbda=lam)  # y = (x^lam - 1)/lam or log(x)
        else:
            Xt[:, i] = xi  # pass-through for non-positive
    return Xt

inverse_transform(X)

Invert the Box-Cox transform.

Parameters

X : np.ndarray of shape (n_samples, n_features) Box-Cox transformed data.

Returns

Xt : np.ndarray of shape (n_samples, n_features) Recovered original-scale data. Uses:

* λ = 0 : x = exp(y)
* λ ≠ 0 : x = (λy + 1)^{1/λ}   (clamped to 0 for stability)
Source code in rbig/_src/parametric.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Invert the Box-Cox transform.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Box-Cox transformed data.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Recovered original-scale data.  Uses:

        * λ = 0 : x = exp(y)
        * λ ≠ 0 : x = (λy + 1)^{1/λ}   (clamped to 0 for stability)
    """
    Xt = np.zeros_like(X, dtype=float)
    for i in range(X.shape[1]):
        lam = self.lambdas_[i]
        if np.abs(lam) < 1e-10:
            Xt[:, i] = np.exp(X[:, i])  # x = exp(y)
        else:
            # x = (λy + 1)^{1/λ}, clamp argument to ≥ 0
            Xt[:, i] = np.power(np.maximum(lam * X[:, i] + 1, 0), 1 / lam)
    return Xt

log_det_jacobian(X)

Compute per-sample log |det J| of the forward Box-Cox transform.

The Jacobian is diagonal; for each feature:

λ ≠ 0 :  d/dx[(x^λ−1)/λ] = x^{λ−1}  ⟹  log = (λ−1) log x
λ = 0 :  d/dx[log x] = 1/x           ⟹  exact log = −log x

Note: The λ = 0 branch accumulates −xᵢ rather than the exact −log xᵢ. This preserves the original implementation behaviour.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data (pre-transform, original scale).

Returns

log_det : np.ndarray of shape (n_samples,) Sum of per-feature log Jacobian contributions.

Source code in rbig/_src/parametric.py
def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Compute per-sample log |det J| of the forward Box-Cox transform.

    The Jacobian is diagonal; for each feature:

        λ ≠ 0 :  d/dx[(x^λ−1)/λ] = x^{λ−1}  ⟹  log = (λ−1) log x
        λ = 0 :  d/dx[log x] = 1/x           ⟹  exact log = −log x

    .. note::
        The λ = 0 branch accumulates ``−xᵢ`` rather than the exact
        ``−log xᵢ``.  This preserves the original implementation
        behaviour.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data (pre-transform, original scale).

    Returns
    -------
    log_det : np.ndarray of shape (n_samples,)
        Sum of per-feature log Jacobian contributions.
    """
    log_jac = np.zeros(X.shape[0])
    for i in range(X.shape[1]):
        xi = X[:, i]
        lam = self.lambdas_[i]
        if np.abs(lam) < 1e-10:
            log_jac += -xi  # original behaviour: -x instead of the exact -log x (see note above)
        else:
            # (lam-1) log xi from x^{lam-1} Jacobian
            log_jac += (lam - 1) * np.log(np.maximum(xi, 1e-300))
    return log_jac
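For strictly positive data the λ ≠ 0 branch of the log-det is exact and can be checked against a central finite difference of the forward map. A self-contained sketch using only scipy's `boxcox` (the same formulas the class above is documented to use):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.exponential(scale=2.0, size=500)     # strictly positive feature
y, lam = stats.boxcox(x)                     # MLE lambda

# Analytical per-sample log|dy/dx| = (lam - 1) * log(x)
analytic = (lam - 1) * np.log(x)

# Central finite difference of the forward map, with a relative step
h = 1e-6 * x
dy = (stats.boxcox(x + h, lmbda=lam) - stats.boxcox(x - h, lmbda=lam)) / (2 * h)
numeric = np.log(np.abs(dy))

assert np.allclose(analytic, numeric, atol=1e-4)
```

The λ = 0 branch would fail the same check, since the code accumulates −x rather than the exact −log x.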

rbig.LogitTransform

Bases: BaseTransform

Logit transform: bijectively maps the unit hypercube (0,1)ᵈ to ℝᵈ.

Each feature is transformed independently by the logit (log-odds) function:

Forward  : y = log(x / (1 − x))
Inverse  : x = σ(y) = 1 / (1 + e^{−y})        (sigmoid)
Log-det  : ∑ᵢ [−log xᵢ − log(1 − xᵢ)]

The transform is useful as a pre-processing step when data lives in (0, 1), e.g. probabilities or proportions.

Examples

>>> import numpy as np
>>> from rbig._src.parametric import LogitTransform
>>> rng = np.random.default_rng(0)
>>> X = rng.uniform(0.05, 0.95, size=(100, 3))  # data in (0, 1)
>>> tr = LogitTransform().fit(X)
>>> Y = tr.transform(X)  # data now in ℝ
>>> X_rec = tr.inverse_transform(Y)
>>> np.allclose(X, X_rec, atol=1e-10)
True

Source code in rbig/_src/parametric.py
class LogitTransform(BaseTransform):
    """Logit transform: bijectively maps the unit hypercube (0,1)ᵈ to ℝᵈ.

    Each feature is transformed independently by the logit (log-odds) function:

        Forward  : y = log(x / (1 − x))
        Inverse  : x = σ(y) = 1 / (1 + e^{−y})        (sigmoid)
        Log-det  : ∑ᵢ [−log xᵢ − log(1 − xᵢ)]

    The transform is useful as a pre-processing step when data lives in (0, 1),
    e.g. probabilities or proportions.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.parametric import LogitTransform
    >>> rng = np.random.default_rng(0)
    >>> X = rng.uniform(0.05, 0.95, size=(100, 3))  # data in (0, 1)
    >>> tr = LogitTransform().fit(X)
    >>> Y = tr.transform(X)  # data now in ℝ
    >>> X_rec = tr.inverse_transform(Y)
    >>> np.allclose(X, X_rec, atol=1e-10)
    True
    """

    def fit(self, X: np.ndarray, y=None) -> LogitTransform:
        """No-op fit (stateless transform).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Ignored.

        Returns
        -------
        self : LogitTransform
        """
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the logit map y = log(x / (1 − x)).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data in (0, 1).

        Returns
        -------
        Y : np.ndarray of shape (n_samples, n_features)
            Log-odds transformed data in ℝ.
        """
        return np.log(X / (1 - X))  # y = logit(x) = log(x/(1-x))

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the inverse sigmoid (logistic) map x = 1 / (1 + e^{−y}).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in ℝ.

        Returns
        -------
        Z : np.ndarray of shape (n_samples, n_features)
            Recovered data in (0, 1).
        """
        return 1 / (1 + np.exp(-X))  # x = sigmoid(y) = 1/(1+e^{-y})

    def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Compute per-sample log |det J| of the forward logit transform.

        The Jacobian of logit is diagonal with entries
        d(logit xᵢ)/dxᵢ = 1/xᵢ + 1/(1−xᵢ), so:

            log |det J| = ∑ᵢ [−log xᵢ − log(1 − xᵢ)]

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data in (0, 1) (pre-transform).

        Returns
        -------
        log_det : np.ndarray of shape (n_samples,)
            Log absolute determinant of the Jacobian for each sample.
        """
        # Diagonal Jacobian: sum_i (-log xi - log(1-xi))
        return np.sum(-np.log(X) - np.log(1 - X), axis=1)

fit(X, y=None)

No-op fit (stateless transform).

Parameters

X : np.ndarray of shape (n_samples, n_features) Ignored.

Returns

self : LogitTransform

Source code in rbig/_src/parametric.py
def fit(self, X: np.ndarray, y=None) -> LogitTransform:
    """No-op fit (stateless transform).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Ignored.

    Returns
    -------
    self : LogitTransform
    """
    return self

transform(X)

Apply the logit map y = log(x / (1 − x)).

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data in (0, 1).

Returns

Y : np.ndarray of shape (n_samples, n_features) Log-odds transformed data in ℝ.

Source code in rbig/_src/parametric.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the logit map y = log(x / (1 − x)).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data in (0, 1).

    Returns
    -------
    Y : np.ndarray of shape (n_samples, n_features)
        Log-odds transformed data in ℝ.
    """
    return np.log(X / (1 - X))  # y = logit(x) = log(x/(1-x))

inverse_transform(X)

Apply the inverse sigmoid (logistic) map x = 1 / (1 + e^{−y}).

Parameters

X : np.ndarray of shape (n_samples, n_features) Data in ℝ.

Returns

Z : np.ndarray of shape (n_samples, n_features) Recovered data in (0, 1).

Source code in rbig/_src/parametric.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the inverse sigmoid (logistic) map x = 1 / (1 + e^{−y}).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in ℝ.

    Returns
    -------
    Z : np.ndarray of shape (n_samples, n_features)
        Recovered data in (0, 1).
    """
    return 1 / (1 + np.exp(-X))  # x = sigmoid(y) = 1/(1+e^{-y})

log_det_jacobian(X)

Compute per-sample log |det J| of the forward logit transform.

The Jacobian of logit is diagonal with entries d(logit xᵢ)/dxᵢ = 1/xᵢ + 1/(1−xᵢ), so:

log |det J| = ∑ᵢ [−log xᵢ − log(1 − xᵢ)]
Parameters

X : np.ndarray of shape (n_samples, n_features) Input data in (0, 1) (pre-transform).

Returns

log_det : np.ndarray of shape (n_samples,) Log absolute determinant of the Jacobian for each sample.

Source code in rbig/_src/parametric.py
def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Compute per-sample log |det J| of the forward logit transform.

    The Jacobian of logit is diagonal with entries
    d(logit xᵢ)/dxᵢ = 1/xᵢ + 1/(1−xᵢ), so:

        log |det J| = ∑ᵢ [−log xᵢ − log(1 − xᵢ)]

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data in (0, 1) (pre-transform).

    Returns
    -------
    log_det : np.ndarray of shape (n_samples,)
        Log absolute determinant of the Jacobian for each sample.
    """
    # Diagonal Jacobian: sum_i (-log xi - log(1-xi))
    return np.sum(-np.log(X) - np.log(1 - X), axis=1)
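Because the Jacobian is diagonal, the log-determinant is a per-feature sum. A quick self-contained check of the closed form against the elementwise derivative d logit(x)/dx = 1/(x(1−x)), plus the sigmoid round trip:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(0.05, 0.95, size=(100, 3))

# Closed form from the docstring: sum_i [-log x_i - log(1 - x_i)]
log_det = np.sum(-np.log(X) - np.log(1 - X), axis=1)

# Same quantity via the elementwise derivative of logit
deriv = 1.0 / (X * (1 - X))
assert np.allclose(log_det, np.sum(np.log(deriv), axis=1))

# Round trip: sigmoid(logit(x)) = x
Y = np.log(X / (1 - X))
X_rec = 1 / (1 + np.exp(-Y))
assert np.allclose(X, X_rec, atol=1e-12)
```

Values at or near 0 or 1 would make both the forward map and the log-det diverge, which is why the data is assumed to lie strictly inside (0, 1).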

rbig.QuantileTransform

Bases: BaseTransform

Quantile transform that maps each feature to a target distribution.

Wraps sklearn.preprocessing.QuantileTransformer to provide a uniform interface compatible with RBIG pipelines. By default, features are mapped to a standard Gaussian distribution, which is a common pre-processing step for Gaussianisation.

Parameters

n_quantiles : int, optional (default=1000) Number of quantiles used to build the empirical CDF. Capped at n_samples during fit.
output_distribution : str, optional (default='normal') Target distribution for the transform. Accepted values are 'normal' (standard Gaussian) and 'uniform'.

Attributes

qt_ : sklearn.preprocessing.QuantileTransformer Fitted sklearn transformer, available after calling fit.

Examples

>>> import numpy as np
>>> from rbig._src.parametric import QuantileTransform
>>> rng = np.random.default_rng(42)
>>> X = rng.exponential(scale=1.0, size=(500, 2))
>>> tr = QuantileTransform(n_quantiles=200).fit(X)
>>> Y = tr.transform(X)  # approximately standard Gaussian
>>> Y.shape
(500, 2)
>>> # Marginal means should be near zero, stds near 1
>>> np.allclose(Y.mean(axis=0), 0, atol=0.1)
True

Source code in rbig/_src/parametric.py
class QuantileTransform(BaseTransform):
    """Quantile transform that maps each feature to a target distribution.

    Wraps ``sklearn.preprocessing.QuantileTransformer`` to provide a uniform
    interface compatible with RBIG pipelines.  By default, features are
    mapped to a standard Gaussian distribution, which is a common
    pre-processing step for Gaussianisation.

    Parameters
    ----------
    n_quantiles : int, optional (default=1000)
        Number of quantiles used to build the empirical CDF.  Capped at
        ``n_samples`` during ``fit``.
    output_distribution : str, optional (default='normal')
        Target distribution for the transform.  Accepted values are
        ``'normal'`` (standard Gaussian) and ``'uniform'``.

    Attributes
    ----------
    qt_ : sklearn.preprocessing.QuantileTransformer
        Fitted sklearn transformer, available after calling ``fit``.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.parametric import QuantileTransform
    >>> rng = np.random.default_rng(42)
    >>> X = rng.exponential(scale=1.0, size=(500, 2))
    >>> tr = QuantileTransform(n_quantiles=200).fit(X)
    >>> Y = tr.transform(X)  # approximately standard Gaussian
    >>> Y.shape
    (500, 2)
    >>> # Marginal means should be near zero, stds near 1
    >>> np.allclose(Y.mean(axis=0), 0, atol=0.1)
    True
    """

    def __init__(self, n_quantiles: int = 1000, output_distribution: str = "normal"):
        self.n_quantiles = n_quantiles
        self.output_distribution = output_distribution

    def fit(self, X: np.ndarray, y=None) -> QuantileTransform:
        """Fit the quantile transform to the training data.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.

        Returns
        -------
        self : QuantileTransform
            Fitted instance with ``qt_`` attribute set.
        """
        from sklearn.preprocessing import QuantileTransformer

        # Cap n_quantiles at the number of available training samples
        n_quantiles = min(self.n_quantiles, X.shape[0])
        self.qt_ = QuantileTransformer(
            n_quantiles=n_quantiles,
            output_distribution=self.output_distribution,
            random_state=0,
        )
        self.qt_.fit(X)
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the fitted quantile transform.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data.

        Returns
        -------
        Y : np.ndarray of shape (n_samples, n_features)
            Data mapped to the target distribution.
        """
        return self.qt_.transform(X)

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Invert the quantile transform.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in the target distribution space.

        Returns
        -------
        Z : np.ndarray of shape (n_samples, n_features)
            Recovered data in the original distribution space.
        """
        return self.qt_.inverse_transform(X)

fit(X, y=None)

Fit the quantile transform to the training data.

Parameters

X : np.ndarray of shape (n_samples, n_features) Training data.

Returns

self : QuantileTransform Fitted instance with qt_ attribute set.

Source code in rbig/_src/parametric.py
def fit(self, X: np.ndarray, y=None) -> QuantileTransform:
    """Fit the quantile transform to the training data.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.

    Returns
    -------
    self : QuantileTransform
        Fitted instance with ``qt_`` attribute set.
    """
    from sklearn.preprocessing import QuantileTransformer

    # Cap n_quantiles at the number of available training samples
    n_quantiles = min(self.n_quantiles, X.shape[0])
    self.qt_ = QuantileTransformer(
        n_quantiles=n_quantiles,
        output_distribution=self.output_distribution,
        random_state=0,
    )
    self.qt_.fit(X)
    return self

transform(X)

Apply the fitted quantile transform.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data.

Returns

Y : np.ndarray of shape (n_samples, n_features) Data mapped to the target distribution.

Source code in rbig/_src/parametric.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the fitted quantile transform.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data.

    Returns
    -------
    Y : np.ndarray of shape (n_samples, n_features)
        Data mapped to the target distribution.
    """
    return self.qt_.transform(X)

inverse_transform(X)

Invert the quantile transform.

Parameters

X : np.ndarray of shape (n_samples, n_features) Data in the target distribution space.

Returns

Z : np.ndarray of shape (n_samples, n_features) Recovered data in the original distribution space.

Source code in rbig/_src/parametric.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Invert the quantile transform.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in the target distribution space.

    Returns
    -------
    Z : np.ndarray of shape (n_samples, n_features)
        Recovered data in the original distribution space.
    """
    return self.qt_.inverse_transform(X)
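The wrapper above delegates entirely to scikit-learn's `QuantileTransformer`, so its round-trip behaviour can be sketched with sklearn alone, without assuming the exact rbig import path:

```python
import numpy as np
from sklearn.preprocessing import QuantileTransformer

rng = np.random.default_rng(0)
X = rng.exponential(scale=2.0, size=(500, 3))  # skewed, non-Gaussian data

# Same construction as in fit() above: cap n_quantiles at n_samples
qt = QuantileTransformer(
    n_quantiles=min(1000, X.shape[0]),
    output_distribution="normal",
    random_state=0,
)
Z = qt.fit_transform(X)        # marginals mapped toward N(0, 1)
Xr = qt.inverse_transform(Z)   # approximate round-trip

# Reconstruction is accurate away from the extreme tails, where the
# normal output distribution is clipped
print(np.median(np.abs(X - Xr)))
```

Note that the round-trip is only approximate: quantiles are interpolated, and extreme samples are clipped where the probit saturates.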

Base Classes

rbig.BaseTransform

Bases: TransformerMixin, BaseEstimator, ABC

Abstract base class for all RBIG transforms.

Defines the common interface shared by every learnable data transformation in this library: fitting to data, forward mapping, and its inverse. Subclasses that support density estimation should also implement log_det_jacobian.

Notes

The change-of-variables formula for a normalizing flow relates the density of the input x to a base density p_Z via a bijection f:

log p(x) = log p_Z(f(x)) + log|det J_f(x)|

where J_f(x) is the Jacobian of f evaluated at x.

Source code in rbig/_src/base.py
class BaseTransform(TransformerMixin, BaseEstimator, ABC):
    """Abstract base class for all RBIG transforms.

    Defines the common interface shared by every learnable data transformation
    in this library: fitting to data, forward mapping, and its inverse.
    Subclasses that support density estimation should also implement
    ``log_det_jacobian``.

    Notes
    -----
    The change-of-variables formula for a normalizing flow relates the density
    of the input ``x`` to a base density ``p_Z`` via a bijection ``f``:

        log p(x) = log p_Z(f(x)) + log|det J_f(x)|

    where ``J_f(x)`` is the Jacobian of ``f`` evaluated at ``x``.
    """

    @abstractmethod
    def fit(self, X: np.ndarray, y=None) -> "BaseTransform":
        """Fit the transform to data X.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data used to estimate any internal parameters.
        y : ignored
            Not used, present for sklearn pipeline compatibility.

        Returns
        -------
        self : BaseTransform
            The fitted transform instance.
        """
        ...

    @abstractmethod
    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the fitted forward transform to data X.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data to transform.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Transformed data.
        """
        ...

    @abstractmethod
    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the fitted inverse transform to data X.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in the transformed (latent) space.

        Returns
        -------
        Xr : np.ndarray of shape (n_samples, n_features)
            Data recovered in the original input space.
        """
        ...

    def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Log absolute determinant of the Jacobian evaluated at X.

        For a transform f, this returns ``log|det J_f(x)|`` per sample,
        which is the volume-correction term required in the change-of-variables
        formula for density estimation.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points at which to evaluate the log-det-Jacobian.

        Returns
        -------
        ldj : np.ndarray of shape (n_samples,)
            Per-sample log absolute determinant of the Jacobian.

        Raises
        ------
        NotImplementedError
            If the subclass does not implement this method.
        """
        raise NotImplementedError

fit(X, y=None) abstractmethod

Fit the transform to data X.

Parameters

X : np.ndarray of shape (n_samples, n_features) Training data used to estimate any internal parameters.

y : ignored Not used, present for sklearn pipeline compatibility.


Returns

self : BaseTransform The fitted transform instance.

Source code in rbig/_src/base.py
@abstractmethod
def fit(self, X: np.ndarray, y=None) -> "BaseTransform":
    """Fit the transform to data X.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data used to estimate any internal parameters.
    y : ignored
        Not used, present for sklearn pipeline compatibility.

    Returns
    -------
    self : BaseTransform
        The fitted transform instance.
    """
    ...

transform(X) abstractmethod

Apply the fitted forward transform to data X.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data to transform.

Returns

Xt : np.ndarray of shape (n_samples, n_features) Transformed data.

Source code in rbig/_src/base.py
@abstractmethod
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the fitted forward transform to data X.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data to transform.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Transformed data.
    """
    ...

inverse_transform(X) abstractmethod

Apply the fitted inverse transform to data X.

Parameters

X : np.ndarray of shape (n_samples, n_features) Data in the transformed (latent) space.

Returns

Xr : np.ndarray of shape (n_samples, n_features) Data recovered in the original input space.

Source code in rbig/_src/base.py
@abstractmethod
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the fitted inverse transform to data X.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in the transformed (latent) space.

    Returns
    -------
    Xr : np.ndarray of shape (n_samples, n_features)
        Data recovered in the original input space.
    """
    ...

log_det_jacobian(X)

Log absolute determinant of the Jacobian evaluated at X.

For a transform f, this returns log|det J_f(x)| per sample, which is the volume-correction term required in the change-of-variables formula for density estimation.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input points at which to evaluate the log-det-Jacobian.

Returns

ldj : np.ndarray of shape (n_samples,) Per-sample log absolute determinant of the Jacobian.

Raises

NotImplementedError If the subclass does not implement this method.

Source code in rbig/_src/base.py
def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Log absolute determinant of the Jacobian evaluated at X.

    For a transform f, this returns ``log|det J_f(x)|`` per sample,
    which is the volume-correction term required in the change-of-variables
    formula for density estimation.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points at which to evaluate the log-det-Jacobian.

    Returns
    -------
    ldj : np.ndarray of shape (n_samples,)
        Per-sample log absolute determinant of the Jacobian.

    Raises
    ------
    NotImplementedError
        If the subclass does not implement this method.
    """
    raise NotImplementedError
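As a concrete illustration of this interface, here is a minimal, self-contained standardization transform that mirrors the `BaseTransform` contract without importing rbig (the class name `StandardizeTransform` is hypothetical and not part of the library):

```python
import numpy as np

class StandardizeTransform:
    """Per-feature affine map z = (x - mu) / sigma, mirroring BaseTransform."""

    def fit(self, X, y=None):
        self.mu_ = X.mean(axis=0)
        self.sigma_ = X.std(axis=0)
        return self

    def transform(self, X):
        return (X - self.mu_) / self.sigma_

    def inverse_transform(self, X):
        return X * self.sigma_ + self.mu_

    def log_det_jacobian(self, X):
        # Diagonal Jacobian with entries 1/sigma_i, identical for every sample
        ldj = -np.sum(np.log(self.sigma_))
        return np.full(X.shape[0], ldj)

rng = np.random.default_rng(0)
X = rng.normal(loc=3.0, scale=2.0, size=(100, 2))
t = StandardizeTransform().fit(X)
Xr = t.inverse_transform(t.transform(X))
print(np.allclose(X, Xr))  # affine maps round-trip exactly (to float precision)
```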

rbig.Bijector

Bases: TransformerMixin, BaseEstimator, ABC

Abstract base class for invertible transformations (bijectors).

A bijector implements a differentiable, invertible map f : ℝᵈ → ℝᵈ and provides the log absolute determinant of its Jacobian. These are the building blocks of normalizing flows.

The density of a random variable X = f⁻¹(Z) where Z ~ p_Z is:

log p(x) = log p_Z(f(x)) + log|det J_f(x)|

Notes

Concrete subclasses must implement fit, transform, inverse_transform, and get_log_det_jacobian. log_det_jacobian is provided as a convenience alias for get_log_det_jacobian, for compatibility with RBIGLayer.

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: From ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537–549. https://doi.org/10.1109/TNN.2011.2106511

Source code in rbig/_src/base.py
class Bijector(TransformerMixin, BaseEstimator, ABC):
    """Abstract base class for invertible transformations (bijectors).

    A bijector implements a differentiable, invertible map ``f : ℝᵈ → ℝᵈ``
    and provides the log absolute determinant of its Jacobian.  These are
    the building blocks of normalizing flows.

    The density of a random variable ``X = f⁻¹(Z)`` where ``Z ~ p_Z`` is:

        log p(x) = log p_Z(f(x)) + log|det J_f(x)|

    Notes
    -----
    Concrete subclasses must implement :meth:`fit`, :meth:`transform`,
    :meth:`inverse_transform`, and :meth:`get_log_det_jacobian`.
    ``log_det_jacobian`` is provided as a convenience alias for the last
    method, for compatibility with ``RBIGLayer``.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    From ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537–549. https://doi.org/10.1109/TNN.2011.2106511
    """

    @abstractmethod
    def fit(self, X: np.ndarray, y=None) -> "Bijector":
        """Fit the bijector to data X.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.
        y : ignored
            Not used, present for sklearn pipeline compatibility.

        Returns
        -------
        self : Bijector
            The fitted bijector.
        """
        ...

    @abstractmethod
    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the forward bijection f(x).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data in the original space.

        Returns
        -------
        Z : np.ndarray of shape (n_samples, n_features)
            Data mapped to the latent space.
        """
        ...

    @abstractmethod
    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the inverse bijection f⁻¹(z).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in the latent space.

        Returns
        -------
        Xr : np.ndarray of shape (n_samples, n_features)
            Data recovered in the original space.
        """
        ...

    @abstractmethod
    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Compute log|det J_f(x)| per sample.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points at which to evaluate the log-det-Jacobian.

        Returns
        -------
        ldj : np.ndarray of shape (n_samples,)
            Per-sample log absolute determinant of the forward Jacobian J_f.
        """
        ...

    def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Alias for get_log_det_jacobian for compatibility with RBIGLayer.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points.

        Returns
        -------
        ldj : np.ndarray of shape (n_samples,)
            Per-sample log|det J_f(x)|.
        """
        return self.get_log_det_jacobian(X)

fit(X, y=None) abstractmethod

Fit the bijector to data X.

Parameters

X : np.ndarray of shape (n_samples, n_features) Training data.

y : ignored Not used, present for sklearn pipeline compatibility.

Returns

self : Bijector The fitted bijector.

Source code in rbig/_src/base.py
@abstractmethod
def fit(self, X: np.ndarray, y=None) -> "Bijector":
    """Fit the bijector to data X.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.
    y : ignored
        Not used, present for sklearn pipeline compatibility.

    Returns
    -------
    self : Bijector
        The fitted bijector.
    """
    ...

transform(X) abstractmethod

Apply the forward bijection f(x).

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data in the original space.

Returns

Z : np.ndarray of shape (n_samples, n_features) Data mapped to the latent space.

Source code in rbig/_src/base.py
@abstractmethod
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the forward bijection f(x).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data in the original space.

    Returns
    -------
    Z : np.ndarray of shape (n_samples, n_features)
        Data mapped to the latent space.
    """
    ...

inverse_transform(X) abstractmethod

Apply the inverse bijection f⁻¹(z).

Parameters

X : np.ndarray of shape (n_samples, n_features) Data in the latent space.

Returns

Xr : np.ndarray of shape (n_samples, n_features) Data recovered in the original space.

Source code in rbig/_src/base.py
@abstractmethod
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the inverse bijection f⁻¹(z).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in the latent space.

    Returns
    -------
    Xr : np.ndarray of shape (n_samples, n_features)
        Data recovered in the original space.
    """
    ...

get_log_det_jacobian(X) abstractmethod

Compute log|det J_f(x)| per sample.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input points at which to evaluate the log-det-Jacobian.

Returns

ldj : np.ndarray of shape (n_samples,) Per-sample log absolute determinant of the forward Jacobian J_f.

Source code in rbig/_src/base.py
@abstractmethod
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Compute log|det J_f(x)| per sample.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points at which to evaluate the log-det-Jacobian.

    Returns
    -------
    ldj : np.ndarray of shape (n_samples,)
        Per-sample log absolute determinant of the forward Jacobian J_f.
    """
    ...

log_det_jacobian(X)

Alias for get_log_det_jacobian for compatibility with RBIGLayer.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input points.

Returns

ldj : np.ndarray of shape (n_samples,) Per-sample log|det J_f(x)|.

Source code in rbig/_src/base.py
def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Alias for get_log_det_jacobian for compatibility with RBIGLayer.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points.

    Returns
    -------
    ldj : np.ndarray of shape (n_samples,)
        Per-sample log|det J_f(x)|.
    """
    return self.get_log_det_jacobian(X)
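The change-of-variables identity above can be checked numerically for the simplest possible bijection, f(x) = x/s, whose log-det-Jacobian is the constant -log s (a sketch using SciPy only; nothing here is rbig API):

```python
import numpy as np
from scipy.stats import norm

# The bijection f(x) = x / s maps X = s * Z back to a standard normal Z.
s = 2.0
rng = np.random.default_rng(0)
x = rng.normal(scale=s, size=1000)

# Change of variables: log p(x) = log p_Z(f(x)) + log|det J_f(x)|, J_f = 1/s
log_p = norm.logpdf(x / s) + np.log(1.0 / s)

# Compare against the analytic N(0, s^2) log-density
log_p_true = norm.logpdf(x, scale=s)
print(np.allclose(log_p, log_p_true))  # True
```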

rbig.MarginalBijector

Bases: Bijector

Abstract bijector for independent, per-dimension (marginal) transforms.

Each feature dimension is transformed by a separate invertible function. Because the transform is applied independently to each coordinate, the Jacobian is diagonal and its log-determinant is the sum of per-dimension log-derivatives:

log|det J_f(x)| = ∑ᵢ log|f′(xᵢ)|

Subclasses implement concrete marginal mappings such as empirical CDF Gaussianization, quantile transform, or kernel density estimation.

Notes

In RBIG, the marginal step maps each dimension to a standard Gaussian via

z = Φ⁻¹(F̂ₙ(x))

where F̂ₙ is the estimated marginal CDF and Φ⁻¹ is the standard normal quantile function (probit).

Source code in rbig/_src/base.py
class MarginalBijector(Bijector):
    """Abstract bijector for independent, per-dimension (marginal) transforms.

    Each feature dimension is transformed by a separate invertible function.
    Because the transform is applied independently to each coordinate, the
    Jacobian is diagonal and its log-determinant is the sum of per-dimension
    log-derivatives:

        log|det J_f(x)| = ∑ᵢ log|f′(xᵢ)|

    Subclasses implement concrete marginal mappings such as empirical CDF
    Gaussianization, quantile transform, or kernel density estimation.

    Notes
    -----
    In RBIG, the marginal step maps each dimension to a standard Gaussian via

        z = Φ⁻¹(F̂ₙ(x))

    where F̂ₙ is the estimated marginal CDF and Φ⁻¹ is the standard normal
    quantile function (probit).
    """

rbig.RotationBijector

Bases: Bijector

Abstract bijector for orthogonal rotation transforms.

Rotation matrices Q satisfy QᵀQ = I and |det Q| = 1, so the log-absolute-determinant of the Jacobian is exactly zero:

log|det J_Q(x)| = log|det Q| = log 1 = 0

This default implementation of get_log_det_jacobian returns a zero vector of length n_samples, which concrete subclasses (e.g. PCA, ICA, random orthogonal) can inherit without override.

Notes

In RBIG, the rotation step de-correlates the marginally Gaussianized data, driving the joint distribution closer to a standard multivariate Gaussian with each iteration.

Source code in rbig/_src/base.py
class RotationBijector(Bijector):
    """Abstract bijector for orthogonal rotation transforms.

    Rotation matrices Q satisfy QᵀQ = I and |det Q| = 1, so the
    log-absolute-determinant of the Jacobian is exactly zero:

        log|det J_Q(x)| = log|det Q| = log 1 = 0

    This default implementation of ``get_log_det_jacobian`` returns a
    zero vector of length ``n_samples``, which concrete subclasses (e.g.
    PCA, ICA, random orthogonal) can inherit without override.

    Notes
    -----
    In RBIG, the rotation step de-correlates the marginally Gaussianized
    data, driving the joint distribution closer to a standard multivariate
    Gaussian with each iteration.
    """

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Return zeros because |det Q| = 1 for any orthogonal matrix Q.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points (used only to determine n_samples).

        Returns
        -------
        ldj : np.ndarray of shape (n_samples,)
            Array of zeros; rotations do not change volume.
        """
        # Orthogonal matrices preserve volume: log|det Q| = 0 for all x
        return np.zeros(X.shape[0])

get_log_det_jacobian(X)

Return zeros because |det Q| = 1 for any orthogonal matrix Q.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input points (used only to determine n_samples).

Returns

ldj : np.ndarray of shape (n_samples,) Array of zeros; rotations do not change volume.

Source code in rbig/_src/base.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Return zeros because |det Q| = 1 for any orthogonal matrix Q.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points (used only to determine n_samples).

    Returns
    -------
    ldj : np.ndarray of shape (n_samples,)
        Array of zeros; rotations do not change volume.
    """
    # Orthogonal matrices preserve volume: log|det Q| = 0 for all x
    return np.zeros(X.shape[0])
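The zero-Jacobian property is easy to verify numerically for an orthogonal matrix obtained from a QR decomposition (a quick sketch; this is not the rotation code rbig itself uses, and QR of a Gaussian matrix is only Haar-distributed after correcting R's diagonal signs):

```python
import numpy as np

rng = np.random.default_rng(0)
# Orthogonal matrix from the QR decomposition of a Gaussian matrix
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))

print(np.allclose(Q.T @ Q, np.eye(5)))  # Q^T Q = I
sign, logabsdet = np.linalg.slogdet(Q)
print(np.isclose(logabsdet, 0.0))       # log|det Q| = 0: rotations preserve volume
```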

rbig.CompositeBijector

Bases: Bijector

A bijector that chains a sequence of bijectors in order.

Applies bijectors f₁, f₂, …, fₖ in sequence so that the composite map is g = fₖ ∘ … ∘ f₂ ∘ f₁. The log-det-Jacobian of the composition follows the chain rule:

log|det J_g(x)| = ∑ₖ log|det J_fₖ(xₖ₋₁)|

where xₖ₋₁ = fₖ₋₁ ∘ … ∘ f₁(x) is the input to the k-th bijector.

Parameters

bijectors : list of Bijector Ordered list of bijectors to chain. They are applied left-to-right during transform and right-to-left during inverse_transform.

Attributes

bijectors : list of Bijector The constituent bijectors in application order.

Notes

Fitting is done sequentially: each bijector is fitted to the output of the previous one, so that the full model is trained in a single fit call.

Examples

>>> import numpy as np
>>> from rbig._src.base import CompositeBijector
>>> from rbig._src.marginal import MarginalGaussianize
>>> from rbig._src.rotation import PCARotation
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((200, 4))
>>> cb = CompositeBijector([MarginalGaussianize(), PCARotation()])
>>> cb.fit(X)  # doctest: +ELLIPSIS
<rbig._src.base.CompositeBijector ...>
>>> Z = cb.transform(X)
>>> Z.shape
(200, 4)

Source code in rbig/_src/base.py
class CompositeBijector(Bijector):
    """A bijector that chains a sequence of bijectors in order.

    Applies bijectors ``f₁, f₂, …, fₖ`` in sequence so that the composite
    map is ``g = fₖ ∘ … ∘ f₂ ∘ f₁``.  The log-det-Jacobian of the
    composition follows the chain rule:

        log|det J_g(x)| = ∑ₖ log|det J_fₖ(xₖ₋₁)|

    where ``xₖ₋₁ = fₖ₋₁ ∘ … ∘ f₁(x)`` is the input to the k-th bijector.

    Parameters
    ----------
    bijectors : list of Bijector
        Ordered list of bijectors to chain.  They are applied left-to-right
        during ``transform`` and right-to-left during ``inverse_transform``.

    Attributes
    ----------
    bijectors : list of Bijector
        The constituent bijectors in application order.

    Notes
    -----
    Fitting is done sequentially: each bijector is fitted to the output of
    the previous one, so that the full model is trained in a single
    ``fit`` call.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.base import CompositeBijector
    >>> from rbig._src.marginal import MarginalGaussianize
    >>> from rbig._src.rotation import PCARotation
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((200, 4))
    >>> cb = CompositeBijector([MarginalGaussianize(), PCARotation()])
    >>> cb.fit(X)  # doctest: +ELLIPSIS
    <rbig._src.base.CompositeBijector ...>
    >>> Z = cb.transform(X)
    >>> Z.shape
    (200, 4)
    """

    def __init__(self, bijectors: list):
        self.bijectors = bijectors

    def fit(self, X: np.ndarray, y=None) -> "CompositeBijector":
        """Fit each bijector sequentially on the output of the previous one.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.
        y : ignored
            Not used, present for sklearn pipeline compatibility.

        Returns
        -------
        self : CompositeBijector
            The fitted composite bijector.
        """
        Xt = X.copy()  # working copy; shape (n_samples, n_features)
        for b in self.bijectors:
            # fit bijector b on current Xt, then advance Xt to b's output
            Xt = b.fit_transform(Xt)
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply all bijectors left-to-right: g(x) = fₖ(… f₁(x) …).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data.

        Returns
        -------
        Z : np.ndarray of shape (n_samples, n_features)
            Data after passing through every bijector in sequence.
        """
        Xt = X.copy()  # shape (n_samples, n_features)
        for b in self.bijectors:
            Xt = b.transform(Xt)
        return Xt

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Invert the composite map: g⁻¹(z) = f₁⁻¹(… fₖ⁻¹(z) …).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in the latent space.

        Returns
        -------
        Xr : np.ndarray of shape (n_samples, n_features)
            Data recovered in the original input space.
        """
        Xt = X.copy()  # shape (n_samples, n_features)
        # reverse order to undo the forward composition
        for b in reversed(self.bijectors):
            Xt = b.inverse_transform(Xt)
        return Xt

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Sum log|det Jₖ| over all bijectors (chain rule).

        Uses the chain rule for Jacobian determinants:

            log|det J_g(x)| = ∑ₖ log|det J_fₖ(xₖ₋₁)|

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points.

        Returns
        -------
        log_det : np.ndarray of shape (n_samples,)
            Per-sample sum of log-det-Jacobians across all bijectors.
        """
        Xt = X.copy()  # shape (n_samples, n_features)
        log_det = np.zeros(X.shape[0])  # accumulator, shape (n_samples,)
        for b in self.bijectors:
            # add log|det Jₖ| at the *current* intermediate input Xt
            log_det += b.get_log_det_jacobian(Xt)
            # advance Xt to the output of bijector b for the next iteration
            Xt = b.transform(Xt)
        return log_det

fit(X, y=None)

Fit each bijector sequentially on the output of the previous one.

Parameters

X : np.ndarray of shape (n_samples, n_features) Training data.

y : ignored Not used, present for sklearn pipeline compatibility.

Returns

self : CompositeBijector The fitted composite bijector.

Source code in rbig/_src/base.py
def fit(self, X: np.ndarray, y=None) -> "CompositeBijector":
    """Fit each bijector sequentially on the output of the previous one.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.
    y : ignored
        Not used, present for sklearn pipeline compatibility.

    Returns
    -------
    self : CompositeBijector
        The fitted composite bijector.
    """
    Xt = X.copy()  # working copy; shape (n_samples, n_features)
    for b in self.bijectors:
        # fit bijector b on current Xt, then advance Xt to b's output
        Xt = b.fit_transform(Xt)
    return self

transform(X)

Apply all bijectors left-to-right: g(x) = fₖ(… f₁(x) …).

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data.

Returns

Z : np.ndarray of shape (n_samples, n_features) Data after passing through every bijector in sequence.

Source code in rbig/_src/base.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply all bijectors left-to-right: g(x) = fₖ(… f₁(x) …).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data.

    Returns
    -------
    Z : np.ndarray of shape (n_samples, n_features)
        Data after passing through every bijector in sequence.
    """
    Xt = X.copy()  # shape (n_samples, n_features)
    for b in self.bijectors:
        Xt = b.transform(Xt)
    return Xt

inverse_transform(X)

Invert the composite map: g⁻¹(z) = f₁⁻¹(… fₖ⁻¹(z) …).

Parameters

X : np.ndarray of shape (n_samples, n_features) Data in the latent space.

Returns

Xr : np.ndarray of shape (n_samples, n_features) Data recovered in the original input space.

Source code in rbig/_src/base.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Invert the composite map: g⁻¹(z) = f₁⁻¹(… fₖ⁻¹(z) …).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in the latent space.

    Returns
    -------
    Xr : np.ndarray of shape (n_samples, n_features)
        Data recovered in the original input space.
    """
    Xt = X.copy()  # shape (n_samples, n_features)
    # reverse order to undo the forward composition
    for b in reversed(self.bijectors):
        Xt = b.inverse_transform(Xt)
    return Xt

get_log_det_jacobian(X)

Sum log|det Jₖ| over all bijectors (chain rule).

Uses the chain rule for Jacobian determinants:

log|det J_g(x)| = ∑ₖ log|det J_fₖ(xₖ₋₁)|

Parameters

X : np.ndarray of shape (n_samples, n_features) Input points.

Returns

log_det : np.ndarray of shape (n_samples,) Per-sample sum of log-det-Jacobians across all bijectors.

Source code in rbig/_src/base.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Sum log|det Jₖ| over all bijectors (chain rule).

    Uses the chain rule for Jacobian determinants:

        log|det J_g(x)| = ∑ₖ log|det J_fₖ(xₖ₋₁)|

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points.

    Returns
    -------
    log_det : np.ndarray of shape (n_samples,)
        Per-sample sum of log-det-Jacobians across all bijectors.
    """
    Xt = X.copy()  # shape (n_samples, n_features)
    log_det = np.zeros(X.shape[0])  # accumulator, shape (n_samples,)
    for b in self.bijectors:
        # add log|det Jₖ| at the *current* intermediate input Xt
        log_det += b.get_log_det_jacobian(Xt)
        # advance Xt to the output of bijector b for the next iteration
        Xt = b.transform(Xt)
    return log_det
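For constant-Jacobian (linear) bijectors, the chain rule above reduces to a sum of matrix log-determinants, which can be checked directly:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))

# Composite linear map g(x) = B @ (A @ x); both Jacobians are constant here.
_, ld_A = np.linalg.slogdet(A)
_, ld_B = np.linalg.slogdet(B)
_, ld_BA = np.linalg.slogdet(B @ A)

# Chain rule: log|det J_g| = log|det J_B| + log|det J_A|
print(np.isclose(ld_BA, ld_A + ld_B))  # True
```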

Information Theory

rbig.total_correlation_rbig(X)

Estimate Total Correlation (multivariate mutual information) of X.

Total Correlation is defined as:

TC(X) = ∑ᵢ H(Xᵢ) − H(X)

where the marginal entropies H(Xᵢ) are estimated via KDE (using marginal_entropy) and the joint entropy H(X) is estimated by fitting a multivariate Gaussian to the data (joint_entropy_gaussian).

Parameters

X : np.ndarray of shape (n_samples, n_features) Data matrix.

Returns

tc : float Estimated total correlation in nats. Values close to zero indicate approximate statistical independence among the dimensions.

Notes

See rbig._src.densities.total_correlation for identical logic. This function is kept in metrics for API convenience.

Examples

>>> import numpy as np
>>> from rbig._src.metrics import total_correlation_rbig
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((300, 4))  # independent Gaussians
>>> tc = total_correlation_rbig(X)
>>> tc >= -0.5  # should be near 0
True

Source code in rbig/_src/metrics.py
def total_correlation_rbig(X: np.ndarray) -> float:
    """Estimate Total Correlation (multivariate mutual information) of X.

    Total Correlation is defined as:

        TC(X) = ∑ᵢ H(Xᵢ) − H(X)

    where the marginal entropies H(Xᵢ) are estimated via KDE (using
    ``marginal_entropy``) and the joint entropy H(X) is estimated by fitting
    a multivariate Gaussian to the data (``joint_entropy_gaussian``).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data matrix.

    Returns
    -------
    tc : float
        Estimated total correlation in nats.  Values close to zero indicate
        approximate statistical independence among the dimensions.

    Notes
    -----
    See :func:`rbig._src.densities.total_correlation` for identical logic.
    This function is kept in ``metrics`` for API convenience.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.metrics import total_correlation_rbig
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((300, 4))  # independent Gaussians
    >>> tc = total_correlation_rbig(X)
    >>> tc >= -0.5  # should be near 0
    True
    """
    from rbig._src.densities import joint_entropy_gaussian, marginal_entropy

    marg_h = marginal_entropy(X)  # ∑ᵢ H(Xᵢ), shape (n_features,)
    joint_h = joint_entropy_gaussian(X)  # H(X) via Gaussian approximation
    return float(np.sum(marg_h) - joint_h)
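For a multivariate Gaussian the total correlation has the closed form TC = -(1/2) log det R, with R the correlation matrix, which gives a quick sanity check for any TC estimator (a sketch using NumPy only, independent of the KDE-based estimator above):

```python
import numpy as np

rng = np.random.default_rng(0)
rho = 0.8
cov = np.array([[1.0, rho], [rho, 1.0]])
X = rng.multivariate_normal(np.zeros(2), cov, size=5000)

# For a Gaussian, TC(X) = -(1/2) log det R, with R the correlation matrix
R = np.corrcoef(X, rowvar=False)
tc_hat = -0.5 * np.linalg.slogdet(R)[1]

# Analytic value for rho = 0.8: -(1/2) log(1 - rho^2)
tc_true = -0.5 * np.log(1 - rho**2)
print(tc_hat, tc_true)  # both close to 0.51 nats
```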

rbig.mutual_information_rbig(model_X, model_Y, model_XY)

Estimate mutual information between X and Y via RBIG models.

Uses the identity:

MI(X; Y) = H(X) + H(Y) − H(X, Y)

where each entropy is estimated from a separately fitted RBIG model.

Parameters

model_X : AnnealedRBIG RBIG model fitted on samples from the marginal distribution of X.

model_Y : AnnealedRBIG RBIG model fitted on samples from the marginal distribution of Y.

model_XY : AnnealedRBIG RBIG model fitted on joint samples [X, Y] (i.e. columns concatenated).

Returns

mi : float
    Estimated mutual information MI(X; Y) in nats.  Non-negative for well-calibrated models; small negative values may appear due to numerical imprecision.

Notes

Each model.entropy() call returns the differential entropy estimated from the RBIG-transformed representation.

Examples

Assumes pre-fitted models; see AnnealedRBIG for fitting details.

>>> mi = mutual_information_rbig(model_X, model_Y, model_XY)
>>> mi >= 0  # MI is non-negative
True

Source code in rbig/_src/metrics.py
def mutual_information_rbig(
    model_X: AnnealedRBIG,
    model_Y: AnnealedRBIG,
    model_XY: AnnealedRBIG,
) -> float:
    """Estimate mutual information between X and Y via RBIG models.

    Uses the identity:

        MI(X; Y) = H(X) + H(Y) − H(X, Y)

    where each entropy is estimated from a separately fitted RBIG model.

    Parameters
    ----------
    model_X : AnnealedRBIG
        RBIG model fitted on samples from the marginal distribution of X.
    model_Y : AnnealedRBIG
        RBIG model fitted on samples from the marginal distribution of Y.
    model_XY : AnnealedRBIG
        RBIG model fitted on joint samples [X, Y] (i.e. columns concatenated).

    Returns
    -------
    mi : float
        Estimated mutual information MI(X; Y) in nats.  Non-negative for
        well-calibrated models; small negative values may appear due to
        numerical imprecision.

    Notes
    -----
    Each ``model.entropy()`` call returns the differential entropy estimated
    from the RBIG-transformed representation.

    Examples
    --------
    >>> # Assumes pre-fitted models; see AnnealedRBIG for fitting details.
    >>> mi = mutual_information_rbig(model_X, model_Y, model_XY)
    >>> mi >= 0  # MI is non-negative
    True
    """
    hx = model_X.entropy()  # H(X)
    hy = model_Y.entropy()  # H(Y)
    hxy = model_XY.entropy()  # H(X, Y)
    return float(hx + hy - hxy)  # MI(X;Y) = H(X) + H(Y) - H(X,Y)
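The entropy identity above can be sanity-checked without fitting any RBIG models: for a correlated Gaussian pair, plug-in entropies computed from the true log-densities (here SciPy's `norm` and `multivariate_normal`, standing in for the fitted models' `score_samples`/`entropy`) recover the analytic MI = −½ log(1 − ρ²). A minimal sketch under those assumptions:

```python
import numpy as np
from scipy.stats import multivariate_normal, norm

rng = np.random.default_rng(0)
rho = 0.8
cov = np.array([[1.0, rho], [rho, 1.0]])
XY = rng.multivariate_normal([0.0, 0.0], cov, size=200_000)

# Plug-in entropies H ≈ -(1/N) Σ log p(x), mirroring what model.entropy() estimates
hx = -np.mean(norm.logpdf(XY[:, 0]))
hy = -np.mean(norm.logpdf(XY[:, 1]))
hxy = -np.mean(multivariate_normal(cov=cov).logpdf(XY))

mi = hx + hy - hxy                    # MI = H(X) + H(Y) - H(X, Y)
mi_true = -0.5 * np.log(1 - rho**2)   # analytic Gaussian MI ≈ 0.511 nats
```

With RBIG, the fitted models play the role of the exact densities used here.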

rbig.kl_divergence_rbig(model_P, X_Q)

Estimate a divergence between distributions P and Q via a fitted RBIG model.

As implemented, this function returns:

−𝔼_Q[log p(x)] − H(P)

where 𝔼_Q is the expectation over samples X_Q from Q, log p is the log-density of model_P, and H(P) is the entropy of P estimated from the training data. Expanding H(P) = −𝔼_P[log p(x)]:

result = 𝔼_P[log p(x)] − 𝔼_Q[log p(x)]

Note: This quantity is not the standard KL divergence KL(P ‖ Q) = 𝔼_P[log p(x)/q(x)], because Q's log-density log q is never evaluated. The result is a measure of how differently P's log-density scores the P-samples versus the Q-samples. It equals zero when P and Q assign identical average log-probability under P's model.

Parameters

model_P : AnnealedRBIG
    RBIG model fitted on samples from distribution P.  Must expose score_samples(X) and entropy().
X_Q : np.ndarray of shape (n_samples, n_features)
    Samples drawn from distribution Q against which P is compared.

Returns

divergence : float
    Estimated 𝔼_P[log p(x)] − 𝔼_Q[log p(x)] in nats.

Examples

When P == Q the divergence should be near zero.

>>> kl = kl_divergence_rbig(model_P, X_from_P)
>>> kl >= -0.1  # small negative values possible due to approximation
True

Source code in rbig/_src/metrics.py
def kl_divergence_rbig(
    model_P: AnnealedRBIG,
    X_Q: np.ndarray,
) -> float:
    """Estimate a divergence between distributions P and Q via a fitted RBIG model.

    As implemented, this function returns:

        −𝔼_Q[log p(x)] − H(P)

    where ``𝔼_Q`` is the expectation over samples ``X_Q`` from Q, ``log p``
    is the log-density of ``model_P``, and ``H(P)`` is the entropy of P
    estimated from the training data.  Expanding ``H(P) = −𝔼_P[log p(x)]``:

        result = 𝔼_P[log p(x)] − 𝔼_Q[log p(x)]

    .. note::
        This quantity is **not** the standard KL divergence
        ``KL(P ‖ Q) = 𝔼_P[log p(x)/q(x)]``, because Q's log-density
        ``log q`` is never evaluated.  The result is a measure of how
        differently P's log-density scores the P-samples versus the
        Q-samples.  It equals zero when P and Q assign identical average
        log-probability under P's model.

    Parameters
    ----------
    model_P : AnnealedRBIG
        RBIG model fitted on samples from distribution P.  Must expose
        ``score_samples(X)`` and ``entropy()``.
    X_Q : np.ndarray of shape (n_samples, n_features)
        Samples drawn from distribution Q against which P is compared.

    Returns
    -------
    divergence : float
        Estimated ``𝔼_P[log p(x)] − 𝔼_Q[log p(x)]`` in nats.

    Examples
    --------
    >>> # When P == Q the divergence should be near zero.
    >>> kl = kl_divergence_rbig(model_P, X_from_P)
    >>> kl >= -0.1  # small negative values possible due to approximation
    True
    """
    # Evaluate log p(x) for samples drawn from Q
    log_pq = model_P.score_samples(X_Q)  # shape (n_samples,)
    hp = model_P.entropy()  # H(P) estimated by RBIG
    return float(-np.mean(log_pq) - hp)  # -E_Q[log p] - H(P)
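The P == Q base case can be illustrated without an RBIG model by substituting an exact Gaussian density: with `norm.logpdf` standing in for `model_P.score_samples` and the analytic entropy ½ log(2πe) standing in for `model_P.entropy()`, the quantity `-E_Q[log p] - H(P)` is near zero up to Monte Carlo error. A sketch under those stand-ins:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
X_Q = rng.standard_normal(100_000)      # here Q == P == N(0, 1)

log_pq = norm.logpdf(X_Q)               # stand-in for model_P.score_samples(X_Q)
h_p = 0.5 * np.log(2 * np.pi * np.e)    # analytic H(P), stand-in for model_P.entropy()

div = -np.mean(log_pq) - h_p            # -E_Q[log p] - H(P), ≈ 0 when P == Q
```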

rbig.entropy_rbig(model, X)

Estimate differential entropy of X using a fitted RBIG model.

Approximates the entropy via the plug-in estimator:

H(X) = −𝔼[log p(x)] ≈ −(1/N) ∑ᵢ log p(xᵢ)

where log p(xᵢ) is provided by model.score_samples.

Parameters

model : AnnealedRBIG
    RBIG model fitted on data from the same distribution as X.  Must expose a score_samples(X) method returning per-sample log probabilities.
X : np.ndarray of shape (n_samples, n_features)
    Evaluation data used to compute the empirical expectation.

Returns

entropy : float
    Estimated differential entropy in nats.

Examples

Assumes a pre-fitted AnnealedRBIG model.

>>> h = entropy_rbig(fitted_model, X_test)
>>> h > 0  # entropy is typically positive for continuous distributions
True

Source code in rbig/_src/metrics.py
def entropy_rbig(model: AnnealedRBIG, X: np.ndarray) -> float:
    """Estimate differential entropy of X using a fitted RBIG model.

    Approximates the entropy via the plug-in estimator:

        H(X) = −𝔼[log p(x)] ≈ −(1/N) ∑ᵢ log p(xᵢ)

    where log p(xᵢ) is provided by ``model.score_samples``.

    Parameters
    ----------
    model : AnnealedRBIG
        RBIG model fitted on data from the same distribution as X.  Must
        expose a ``score_samples(X)`` method returning per-sample log
        probabilities.
    X : np.ndarray of shape (n_samples, n_features)
        Evaluation data used to compute the empirical expectation.

    Returns
    -------
    entropy : float
        Estimated differential entropy in nats.

    Examples
    --------
    >>> # Assumes a pre-fitted AnnealedRBIG model.
    >>> h = entropy_rbig(fitted_model, X_test)
    >>> h > 0  # entropy is typically positive for continuous distributions
    True
    """
    log_probs = model.score_samples(X)  # log p(xᵢ) for each sample, shape (N,)
    return float(-np.mean(log_probs))  # H ~= -(1/N) sum log p(xi)

rbig.entropy_marginal(X)

Per-dimension marginal entropy using the Vasicek spacing estimator.

Applies entropy_univariate independently to each column of X.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Data matrix.

Returns

entropies : np.ndarray of shape (n_features,)
    Vasicek entropy estimate (nats) for each feature dimension.

Examples

>>> import numpy as np
>>> from rbig._src.metrics import entropy_marginal
>>> rng = np.random.default_rng(9)
>>> X = rng.standard_normal((800, 3))
>>> h = entropy_marginal(X)
>>> h.shape
(3,)

Source code in rbig/_src/metrics.py
def entropy_marginal(X: np.ndarray) -> np.ndarray:
    """Per-dimension marginal entropy using the Vasicek spacing estimator.

    Applies :func:`entropy_univariate` independently to each column of X.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data matrix.

    Returns
    -------
    entropies : np.ndarray of shape (n_features,)
        Vasicek entropy estimate (nats) for each feature dimension.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.metrics import entropy_marginal
    >>> rng = np.random.default_rng(9)
    >>> X = rng.standard_normal((800, 3))
    >>> h = entropy_marginal(X)
    >>> h.shape
    (3,)
    """
    n_features = X.shape[1]
    # Apply 1-D Vasicek estimator to each column independently
    return np.array([entropy_univariate(X[:, i]) for i in range(n_features)])

rbig.negentropy(X)

Compute per-dimension negentropy J(Xᵢ) = H_Gauss(Xᵢ) − H(Xᵢ).

Negentropy measures non-Gaussianity for each marginal:

J(Xᵢ) = H_Gauss(Xᵢ) − H(Xᵢ) ≥ 0

where H_Gauss(Xᵢ) = ½(1 + log 2π) + ½ log Var(Xᵢ) is the Gaussian entropy matched to the observed variance and H(Xᵢ) is estimated via KDE.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Data matrix.

Returns

neg_entropy : np.ndarray of shape (n_features,)
    Non-negative negentropy for each dimension.  A value of 0 indicates that the marginal is Gaussian; larger values indicate more non-Gaussianity.

Notes

Negentropy is guaranteed non-negative by the maximum-entropy principle: among all distributions with a given variance, the Gaussian has the highest entropy.

Examples

>>> import numpy as np
>>> from rbig._src.metrics import negentropy
>>> rng = np.random.default_rng(3)
>>> X_gauss = rng.standard_normal((500, 2))
>>> J_gauss = negentropy(X_gauss)
>>> np.all(J_gauss >= -0.05)  # nearly zero for Gaussian data
True

Source code in rbig/_src/metrics.py
def negentropy(X: np.ndarray) -> np.ndarray:
    """Compute per-dimension negentropy J(Xᵢ) = H_Gauss(Xᵢ) − H(Xᵢ).

    Negentropy measures non-Gaussianity for each marginal:

        J(Xᵢ) = H_Gauss(Xᵢ) − H(Xᵢ) ≥ 0

    where H_Gauss(Xᵢ) = ½(1 + log 2π) + ½ log Var(Xᵢ) is the Gaussian
    entropy matched to the observed variance and H(Xᵢ) is estimated via KDE.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data matrix.

    Returns
    -------
    neg_entropy : np.ndarray of shape (n_features,)
        Non-negative negentropy for each dimension.  A value of 0 indicates
        that the marginal is Gaussian; larger values indicate more
        non-Gaussianity.

    Notes
    -----
    Negentropy is guaranteed non-negative by the maximum-entropy principle:
    among all distributions with a given variance, the Gaussian has the
    highest entropy.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.metrics import negentropy
    >>> rng = np.random.default_rng(3)
    >>> X_gauss = rng.standard_normal((500, 2))
    >>> J_gauss = negentropy(X_gauss)
    >>> np.all(J_gauss >= -0.05)  # nearly zero for Gaussian data
    True
    """
    _n, _d = X.shape
    # Gaussian entropy matched to empirical variance: H_Gauss = ½(1+log 2π) + ½ log σ²
    gauss_h = 0.5 * (1 + np.log(2 * np.pi)) + 0.5 * np.log(np.var(X, axis=0))
    from rbig._src.densities import marginal_entropy

    marg_h = marginal_entropy(X)  # KDE-based entropy, shape (n_features,)
    return gauss_h - marg_h  # J(Xi) = H_Gauss - H >= 0
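As an analytic worked example (no KDE involved): for a uniform variable on [a, b], H = log(b − a) and the variance-matched Gaussian has σ² = (b − a)²/12, so the negentropy is a constant independent of the interval width:

```python
import numpy as np

# J_uniform = ½ log(2πe σ²) − log(b − a)  with σ² = (b − a)²/12
#           = ½ log(2πe / 12)             (the width b − a cancels)
J_uniform = 0.5 * np.log(2 * np.pi * np.e / 12)  # ≈ 0.1765 nats
```

This gives a useful reference scale when interpreting negentropy values returned for real data.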

rbig.negative_log_likelihood(model, X)

Average negative log-likelihood of X under the RBIG model.

Computes:

NLL = −(1/N) ∑ᵢ log p(xᵢ)

This is equivalent to entropy_rbig but is exposed separately to make its role as a loss / evaluation metric explicit.

Parameters

model : AnnealedRBIG
    Fitted RBIG model.  Must expose score_samples(X).
X : np.ndarray of shape (n_samples, n_features)
    Evaluation data.

Returns

nll : float
    Average negative log-likelihood in nats.

Examples

>>> nll = negative_log_likelihood(fitted_model, X_test)
>>> nll > 0  # NLL is positive for well-calibrated models
True

Source code in rbig/_src/metrics.py
def negative_log_likelihood(model: AnnealedRBIG, X: np.ndarray) -> float:
    """Average negative log-likelihood of X under the RBIG model.

    Computes:

        NLL = −(1/N) ∑ᵢ log p(xᵢ)

    This is equivalent to :func:`entropy_rbig` but is exposed separately to
    make its role as a loss / evaluation metric explicit.

    Parameters
    ----------
    model : AnnealedRBIG
        Fitted RBIG model.  Must expose ``score_samples(X)``.
    X : np.ndarray of shape (n_samples, n_features)
        Evaluation data.

    Returns
    -------
    nll : float
        Average negative log-likelihood in nats.

    Examples
    --------
    >>> nll = negative_log_likelihood(fitted_model, X_test)
    >>> nll > 0  # NLL is positive for well-calibrated models
    True
    """
    log_probs = model.score_samples(X)  # log p(xᵢ), shape (N,)
    return float(-np.mean(log_probs))  # NLL = -(1/N) sum log p(xi)

Parametric Distributions

rbig.gaussian(n_samples=1000, loc=0.0, scale=1.0, random_state=None)

Sample from a univariate Gaussian (normal) distribution.

Parameters

n_samples : int, optional (default=1000)
    Number of samples to draw.
loc : float, optional (default=0.0)
    Mean μ of the distribution.
scale : float, optional (default=1.0)
    Standard deviation σ > 0 of the distribution.
random_state : int or None, optional (default=None)
    Seed for the random number generator.  Pass an integer for reproducible results.

Returns

samples : np.ndarray of shape (n_samples,)
    Independent draws from 𝒩(loc, scale²).

Examples

>>> from rbig._src.parametric import gaussian
>>> x = gaussian(n_samples=500, loc=2.0, scale=0.5, random_state=0)
>>> x.shape
(500,)
>>> import numpy as np
>>> np.isclose(x.mean(), 2.0, atol=0.1)
True

Source code in rbig/_src/parametric.py
def gaussian(
    n_samples: int = 1000,
    loc: float = 0.0,
    scale: float = 1.0,
    random_state: int | None = None,
) -> np.ndarray:
    """Sample from a univariate Gaussian (normal) distribution.

    Parameters
    ----------
    n_samples : int, optional (default=1000)
        Number of samples to draw.
    loc : float, optional (default=0.0)
        Mean μ of the distribution.
    scale : float, optional (default=1.0)
        Standard deviation σ > 0 of the distribution.
    random_state : int or None, optional (default=None)
        Seed for the random number generator.  Pass an integer for
        reproducible results.

    Returns
    -------
    samples : np.ndarray of shape (n_samples,)
        Independent draws from 𝒩(loc, scale²).

    Examples
    --------
    >>> from rbig._src.parametric import gaussian
    >>> x = gaussian(n_samples=500, loc=2.0, scale=0.5, random_state=0)
    >>> x.shape
    (500,)
    >>> import numpy as np
    >>> np.isclose(x.mean(), 2.0, atol=0.1)
    True
    """
    rng = np.random.default_rng(random_state)
    return rng.normal(loc=loc, scale=scale, size=n_samples)

rbig.multivariate_gaussian(n_samples=1000, mean=None, cov=None, d=2, random_state=None)

Sample from a multivariate Gaussian distribution.

Parameters

n_samples : int, optional (default=1000)
    Number of samples to draw.
mean : np.ndarray of shape (d,) or None, optional
    Mean vector μ.  Defaults to the zero vector of length d.
cov : np.ndarray of shape (d, d) or None, optional
    Covariance matrix Σ.  Must be symmetric positive semi-definite.  Defaults to the d × d identity matrix.
d : int, optional (default=2)
    Dimensionality used when mean is None.  Ignored when mean is provided (its length determines the dimension).
random_state : int or None, optional (default=None)
    Seed for the random number generator.

Returns

samples : np.ndarray of shape (n_samples, d)
    Independent draws from 𝒩(mean, cov).

Examples

>>> import numpy as np
>>> from rbig._src.parametric import multivariate_gaussian
>>> cov = np.array([[1.0, 0.8], [0.8, 1.0]])
>>> X = multivariate_gaussian(n_samples=200, cov=cov, d=2, random_state=7)
>>> X.shape
(200, 2)
>>> np.isclose(np.corrcoef(X.T)[0, 1], 0.8, atol=0.1)
True

Source code in rbig/_src/parametric.py
def multivariate_gaussian(
    n_samples: int = 1000,
    mean: np.ndarray | None = None,
    cov: np.ndarray | None = None,
    d: int = 2,
    random_state: int | None = None,
) -> np.ndarray:
    """Sample from a multivariate Gaussian distribution.

    Parameters
    ----------
    n_samples : int, optional (default=1000)
        Number of samples to draw.
    mean : np.ndarray of shape (d,) or None, optional
        Mean vector μ.  Defaults to the zero vector of length ``d``.
    cov : np.ndarray of shape (d, d) or None, optional
        Covariance matrix Σ.  Must be symmetric positive semi-definite.
        Defaults to the d × d identity matrix.
    d : int, optional (default=2)
        Dimensionality used when ``mean`` is ``None``.  Ignored when
        ``mean`` is provided (its length determines the dimension).
    random_state : int or None, optional (default=None)
        Seed for the random number generator.

    Returns
    -------
    samples : np.ndarray of shape (n_samples, d)
        Independent draws from 𝒩(mean, cov).

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.parametric import multivariate_gaussian
    >>> cov = np.array([[1.0, 0.8], [0.8, 1.0]])
    >>> X = multivariate_gaussian(n_samples=200, cov=cov, d=2, random_state=7)
    >>> X.shape
    (200, 2)
    >>> np.isclose(np.corrcoef(X.T)[0, 1], 0.8, atol=0.1)
    True
    """
    rng = np.random.default_rng(random_state)
    if mean is None:
        mean = np.zeros(d)  # default: zero mean
    if cov is None:
        cov = np.eye(len(mean))  # default: identity covariance
    return rng.multivariate_normal(mean, cov, size=n_samples)

rbig.total_correlation_gaussian(cov)

Analytic Total Correlation of a multivariate Gaussian.

For a Gaussian with covariance Σ, the TC reduces to a function of the correlation matrix R = D^{-½} Σ D^{-½} (where D = diag(Σ)):

TC = ∑ᵢ H(Xᵢ) − H(X) = −½ log|R|

Equivalently, it measures how far the distribution is from being a product of its marginals.

Parameters

cov : np.ndarray of shape (d, d)
    Covariance matrix Σ.  Coerced to at least 2-D.

Returns

tc : float
    Total correlation in nats.  Returns +inf if Σ is singular.

Notes

The computation uses:

TC = (∑ᵢ ½ log(2πe σᵢ²)) − ½(d(1 + log 2π) + log|Σ|)
   = ½ ∑ᵢ log σᵢ² − ½ log|Σ|
   = −½ log|corr(Σ)|

Examples

>>> import numpy as np
>>> from rbig._src.parametric import total_correlation_gaussian
>>> # Identity covariance → all marginals independent → TC = 0
>>> tc = total_correlation_gaussian(np.eye(3))
>>> np.isclose(tc, 0.0)
True
>>> # Correlated covariance → TC > 0
>>> cov = np.array([[1.0, 0.9], [0.9, 1.0]])
>>> total_correlation_gaussian(cov) > 0
True

Source code in rbig/_src/parametric.py
def total_correlation_gaussian(cov: np.ndarray) -> float:
    """Analytic Total Correlation of a multivariate Gaussian.

    For a Gaussian with covariance Σ, the TC reduces to a function of the
    correlation matrix R = D^{-½} Σ D^{-½} (where D = diag(Σ)):

        TC = ∑ᵢ H(Xᵢ) − H(X) = −½ log|R|

    Equivalently, it measures how far the distribution is from being a
    product of its marginals.

    Parameters
    ----------
    cov : np.ndarray of shape (d, d)
        Covariance matrix Σ.  Coerced to at least 2-D.

    Returns
    -------
    tc : float
        Total correlation in nats.  Returns ``+inf`` if Σ is singular.

    Notes
    -----
    The computation uses:

        TC = (∑ᵢ ½ log(2πe σᵢ²)) − ½(d(1 + log 2π) + log|Σ|)
           = ½ ∑ᵢ log σᵢ² − ½ log|Σ|
           = −½ log|corr(Σ)|

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.parametric import total_correlation_gaussian
    >>> # Identity covariance → all marginals independent → TC = 0
    >>> tc = total_correlation_gaussian(np.eye(3))
    >>> np.isclose(tc, 0.0)
    True
    >>> # Correlated covariance → TC > 0
    >>> cov = np.array([[1.0, 0.9], [0.9, 1.0]])
    >>> total_correlation_gaussian(cov) > 0
    True
    """
    cov = np.atleast_2d(cov)
    d = cov.shape[0]
    marginal_vars = np.diag(cov)  # σᵢ²
    # ∑ᵢ H(Xᵢ) = ∑ᵢ ½ log(2πe σᵢ²)
    sum_marg_h = 0.5 * np.sum(np.log(2 * np.pi * np.e * marginal_vars))
    sign, log_det = np.linalg.slogdet(cov)  # log|Σ|
    if sign <= 0:
        return np.inf  # singular Σ
    # H(X) = ½ (d(1 + log 2π) + log|Σ|)
    joint_h = 0.5 * (d * (1 + np.log(2 * np.pi)) + log_det)
    return float(sum_marg_h - joint_h)  # TC = sum H(Xi) - H(X)
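The equivalence between the entropy form and the correlation-determinant form in the Notes can be verified directly in NumPy for an arbitrary valid covariance:

```python
import numpy as np

cov = np.array([[2.0, 1.2],
                [1.2, 1.5]])
d = cov.shape[0]

# Entropy form: ∑ᵢ H(Xᵢ) − H(X)
sum_marg = 0.5 * np.sum(np.log(2 * np.pi * np.e * np.diag(cov)))
joint = 0.5 * (d * (1 + np.log(2 * np.pi)) + np.linalg.slogdet(cov)[1])
tc_entropy = sum_marg - joint

# Correlation-determinant form: −½ log|R| with R = D^{-½} Σ D^{-½}
std = np.sqrt(np.diag(cov))
R = cov / np.outer(std, std)
tc_corr = -0.5 * np.linalg.slogdet(R)[1]
```

Both forms agree because the (1 + log 2π) terms cancel, leaving only the variance and determinant contributions.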

rbig.mutual_information_gaussian(cov_X, cov_Y, cov_XY)

Analytic mutual information between two jointly Gaussian variables.

Uses the entropy identity:

MI(X; Y) = H(X) + H(Y) − H(X, Y)

where each entropy is computed analytically from the corresponding covariance matrix via entropy_gaussian.

Parameters

cov_X : np.ndarray of shape (d_X, d_X)
    Marginal covariance of X.
cov_Y : np.ndarray of shape (d_Y, d_Y)
    Marginal covariance of Y.
cov_XY : np.ndarray of shape (d_X + d_Y, d_X + d_Y)
    Joint covariance matrix of the concatenated variable [X, Y].

Returns

mi : float
    Mutual information in nats.  Non-negative for valid covariance matrices; small negative values indicate numerical imprecision.

Notes

For Gaussians the MI can also be expressed as:

MI(X; Y) = ½ log(|Σ_{XX}| · |Σ_{YY}| / |Σ_{XY}|)

Examples

>>> import numpy as np
>>> from rbig._src.parametric import mutual_information_gaussian
>>> # Block-diagonal joint covariance → MI = 0
>>> cov_X = np.eye(2)
>>> cov_Y = np.eye(2)
>>> cov_XY = np.block([[cov_X, np.zeros((2, 2))], [np.zeros((2, 2)), cov_Y]])
>>> mi = mutual_information_gaussian(cov_X, cov_Y, cov_XY)
>>> np.isclose(mi, 0.0, atol=1e-10)
True

Source code in rbig/_src/parametric.py
def mutual_information_gaussian(
    cov_X: np.ndarray,
    cov_Y: np.ndarray,
    cov_XY: np.ndarray,
) -> float:
    """Analytic mutual information between two jointly Gaussian variables.

    Uses the entropy identity:

        MI(X; Y) = H(X) + H(Y) − H(X, Y)

    where each entropy is computed analytically from the corresponding
    covariance matrix via :func:`entropy_gaussian`.

    Parameters
    ----------
    cov_X : np.ndarray of shape (d_X, d_X)
        Marginal covariance of X.
    cov_Y : np.ndarray of shape (d_Y, d_Y)
        Marginal covariance of Y.
    cov_XY : np.ndarray of shape (d_X + d_Y, d_X + d_Y)
        Joint covariance matrix of the concatenated variable [X, Y].

    Returns
    -------
    mi : float
        Mutual information in nats.  Non-negative for valid covariance
        matrices; small negative values indicate numerical imprecision.

    Notes
    -----
    For Gaussians the MI can also be expressed as:

        MI(X; Y) = ½ log(|Σ_{XX}| · |Σ_{YY}| / |Σ_{XY}|)

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.parametric import mutual_information_gaussian
    >>> # Block-diagonal joint covariance → MI = 0
    >>> cov_X = np.eye(2)
    >>> cov_Y = np.eye(2)
    >>> cov_XY = np.block([[cov_X, np.zeros((2, 2))], [np.zeros((2, 2)), cov_Y]])
    >>> mi = mutual_information_gaussian(cov_X, cov_Y, cov_XY)
    >>> np.isclose(mi, 0.0, atol=1e-10)
    True
    """
    hx = entropy_gaussian(cov_X)  # H(X)
    hy = entropy_gaussian(cov_Y)  # H(Y)
    hxy = entropy_gaussian(cov_XY)  # H(X, Y)
    return float(hx + hy - hxy)  # MI(X;Y) = H(X) + H(Y) - H(X,Y)
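The agreement between the entropy identity and the determinant form ½ log(|Σ_XX|·|Σ_YY|/|Σ_XY|) can be checked numerically (using the same Gaussian entropy formula the function relies on):

```python
import numpy as np

def h_gauss(cov: np.ndarray) -> float:
    """Analytic Gaussian entropy ½ (d(1 + log 2π) + log|Σ|)."""
    cov = np.atleast_2d(cov)
    d = cov.shape[0]
    return 0.5 * (d * (1 + np.log(2 * np.pi)) + np.linalg.slogdet(cov)[1])

rho = 0.6
cov_XY = np.array([[1.0, rho],
                   [rho, 1.0]])
cov_X = cov_XY[:1, :1]
cov_Y = cov_XY[1:, 1:]

mi_entropy = h_gauss(cov_X) + h_gauss(cov_Y) - h_gauss(cov_XY)
# Determinant form: ½ log(|Σ_XX|·|Σ_YY| / |Σ_XY|) = −½ log(1 − ρ²) here
mi_det = 0.5 * np.log(
    np.linalg.det(cov_X) * np.linalg.det(cov_Y) / np.linalg.det(cov_XY)
)
```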

rbig.kl_divergence_gaussian(mu0, cov0, mu1, cov1)

Analytic KL divergence KL(P₀ ‖ P₁) between two multivariate Gaussians.

Both distributions are assumed to be multivariate Gaussian:

P₀ = 𝒩(μ₀, Σ₀)   and   P₁ = 𝒩(μ₁, Σ₁)

The closed-form KL divergence is:

KL(P₀ ‖ P₁) = ½ [ tr(Σ₁⁻¹Σ₀) + (μ₁ − μ₀)ᵀ Σ₁⁻¹ (μ₁ − μ₀)
                   − d + log(|Σ₁| / |Σ₀|) ]

Parameters

mu0 : np.ndarray of shape (d,)
    Mean of the source distribution P₀.
cov0 : np.ndarray of shape (d, d)
    Covariance Σ₀ of the source distribution P₀.
mu1 : np.ndarray of shape (d,)
    Mean of the target distribution P₁.
cov1 : np.ndarray of shape (d, d)
    Covariance Σ₁ of the target distribution P₁.

Returns

kl : float
    KL divergence KL(P₀ ‖ P₁) in nats.  Always non-negative for valid covariance matrices.

Notes

The matrix inverse Σ₁⁻¹ is computed via np.linalg.inv; for large d a Cholesky-based approach would be more numerically stable.

Examples

>>> import numpy as np
>>> from rbig._src.parametric import kl_divergence_gaussian
>>> # KL(P ‖ P) = 0 for identical distributions
>>> mu = np.array([1.0, 2.0])
>>> cov = np.array([[2.0, 0.5], [0.5, 1.5]])
>>> kl = kl_divergence_gaussian(mu, cov, mu, cov)
>>> np.isclose(kl, 0.0, atol=1e-10)
True

Source code in rbig/_src/parametric.py
def kl_divergence_gaussian(
    mu0: np.ndarray,
    cov0: np.ndarray,
    mu1: np.ndarray,
    cov1: np.ndarray,
) -> float:
    """Analytic KL divergence KL(P₀ ‖ P₁) between two multivariate Gaussians.

    Both distributions are assumed to be multivariate Gaussian:

        P₀ = 𝒩(μ₀, Σ₀)   and   P₁ = 𝒩(μ₁, Σ₁)

    The closed-form KL divergence is:

        KL(P₀ ‖ P₁) = ½ [ tr(Σ₁⁻¹Σ₀) + (μ₁ − μ₀)ᵀ Σ₁⁻¹ (μ₁ − μ₀)
                           − d + log(|Σ₁| / |Σ₀|) ]

    Parameters
    ----------
    mu0 : np.ndarray of shape (d,)
        Mean of the *source* distribution P₀.
    cov0 : np.ndarray of shape (d, d)
        Covariance Σ₀ of the source distribution P₀.
    mu1 : np.ndarray of shape (d,)
        Mean of the *target* distribution P₁.
    cov1 : np.ndarray of shape (d, d)
        Covariance Σ₁ of the target distribution P₁.

    Returns
    -------
    kl : float
        KL divergence KL(P₀ ‖ P₁) in nats.  Always non-negative for valid
        covariance matrices.

    Notes
    -----
    The matrix inverse Σ₁⁻¹ is computed via ``np.linalg.inv``; for large d
    a Cholesky-based approach would be more numerically stable.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.parametric import kl_divergence_gaussian
    >>> # KL(P ‖ P) = 0 for identical distributions
    >>> mu = np.array([1.0, 2.0])
    >>> cov = np.array([[2.0, 0.5], [0.5, 1.5]])
    >>> kl = kl_divergence_gaussian(mu, cov, mu, cov)
    >>> np.isclose(kl, 0.0, atol=1e-10)
    True
    """
    mu0 = np.atleast_1d(mu0)
    mu1 = np.atleast_1d(mu1)
    cov0 = np.atleast_2d(cov0)
    cov1 = np.atleast_2d(cov1)
    d = len(mu0)
    cov1_inv = np.linalg.inv(cov1)  # Σ₁⁻¹, shape (d, d)
    diff = mu1 - mu0  # mu1 - mu0, shape (d,)
    _sign0, log_det0 = np.linalg.slogdet(cov0)  # log|Σ₀|
    _sign1, log_det1 = np.linalg.slogdet(cov1)  # log|Σ₁|
    trace_term = np.trace(cov1_inv @ cov0)  # tr(Σ₁⁻¹Σ₀)
    quad_term = diff @ cov1_inv @ diff  # (mu1-mu0)^T Sigma1^-1 (mu1-mu0)
    log_det_term = log_det1 - log_det0  # log(|Sigma1|/|Sigma0|)
    # KL = 0.5 [tr(Sigma1^-1 Sigma0) + quad - d + log(|Sigma1|/|Sigma0|)]
    return float(0.5 * (trace_term + quad_term - d + log_det_term))
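The matrix formula can be cross-checked against the well-known 1-D closed form KL(N(0,1) ‖ N(μ,σ²)) = log σ + (1 + μ²)/(2σ²) − ½. The sketch below re-implements the same formula as the function above, self-contained for illustration:

```python
import numpy as np

def kl_mvn(mu0, cov0, mu1, cov1) -> float:
    """½ [tr(Σ₁⁻¹Σ₀) + (μ₁−μ₀)ᵀΣ₁⁻¹(μ₁−μ₀) − d + log(|Σ₁|/|Σ₀|)]."""
    mu0, mu1 = np.atleast_1d(mu0), np.atleast_1d(mu1)
    cov0, cov1 = np.atleast_2d(cov0), np.atleast_2d(cov1)
    d = mu0.size
    inv1 = np.linalg.inv(cov1)
    diff = mu1 - mu0
    return float(0.5 * (np.trace(inv1 @ cov0) + diff @ inv1 @ diff - d
                        + np.linalg.slogdet(cov1)[1] - np.linalg.slogdet(cov0)[1]))

mu, sigma = 1.5, 2.0
kl_matrix = kl_mvn(0.0, 1.0, mu, sigma**2)
kl_closed = np.log(sigma) + (1 + mu**2) / (2 * sigma**2) - 0.5
```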

rbig.entropy_gaussian(cov)

Analytic differential entropy of a multivariate Gaussian.

Computes the closed-form entropy of 𝒩(μ, Σ):

H(X) = ½ log|2πeΣ| = ½ (d(1 + log 2π) + log|Σ|)

where d is the dimensionality and |·| denotes the matrix determinant. The mean μ does not affect the entropy.

Parameters

cov : np.ndarray of shape (d, d) or (1,) for scalar
    Covariance matrix Σ (or variance for d=1).  Coerced to at least 2-D via np.atleast_2d.

Returns

entropy : float
    Differential entropy in nats.  Returns -inf if Σ is singular or not positive definite.

Notes

np.linalg.slogdet is used for numerically stable log-determinant computation.

Examples

>>> import numpy as np
>>> from rbig._src.parametric import entropy_gaussian
>>> # 2-D standard Gaussian: H = 0.5 * 2 * (1 + log 2π) ≈ 2.838 nats
>>> h = entropy_gaussian(np.eye(2))
>>> np.isclose(h, 0.5 * 2 * (1 + np.log(2 * np.pi)))
True

Source code in rbig/_src/parametric.py
def entropy_gaussian(cov: np.ndarray) -> float:
    """Analytic differential entropy of a multivariate Gaussian.

    Computes the closed-form entropy of 𝒩(μ, Σ):

        H(X) = ½ log|2πeΣ| = ½ (d(1 + log 2π) + log|Σ|)

    where d is the dimensionality and |·| denotes the matrix determinant.
    The mean μ does not affect the entropy.

    Parameters
    ----------
    cov : np.ndarray of shape (d, d) or (1,) for scalar
        Covariance matrix Σ (or variance for d=1).  Coerced to at least 2-D
        via ``np.atleast_2d``.

    Returns
    -------
    entropy : float
        Differential entropy in nats.  Returns ``-inf`` if Σ is singular or
        not positive definite.

    Notes
    -----
    ``np.linalg.slogdet`` is used for numerically stable log-determinant
    computation.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.parametric import entropy_gaussian
    >>> # 2-D standard Gaussian: H = 0.5 * 2 * (1 + log 2π) ≈ 2.838 nats
    >>> h = entropy_gaussian(np.eye(2))
    >>> np.isclose(h, 0.5 * 2 * (1 + np.log(2 * np.pi)))
    True
    """
    cov = np.atleast_2d(cov)
    d = cov.shape[0]
    sign, log_det = np.linalg.slogdet(cov)  # stable log|Σ|
    if sign <= 0:
        return -np.inf  # singular covariance
    # H = ½ (d(1 + log 2π) + log|Σ|)
    return 0.5 * (d * (1 + np.log(2 * np.pi)) + log_det)

Image Processing

rbig.ImageRBIG

RBIG orchestrator for image data.

Alternates between marginal Gaussianisation and an orthonormal spatial rotation for n_layers steps, progressively pushing the joint distribution of image pixels towards a multivariate Gaussian.

Each layer applies:

  1. MarginalGaussianize — maps every feature dimension to a standard normal marginal distribution.
  2. An orthonormal rotation selected by strategy — decorrelates features without changing the differential entropy.

Parameters

n_layers : int, default 10
    Number of (marginal + rotation) layer pairs to apply.
C : int, default 1
    Number of image channels passed to the rotation layers.
H : int, default 8
    Image height in pixels passed to the rotation layers.
W : int, default 8
    Image width in pixels passed to the rotation layers.
strategy : str, default "dct"
    Rotation strategy.  One of:

* ``"dct"`` — Type-II orthonormal DCT (:class:`DCTRotation`).
* ``"hartley"`` — Discrete Hartley Transform
  (:class:`HartleyRotation`).
* ``"random_channel"`` — Random orthogonal channel mixing
  (:class:`RandomChannelRotation`).

Any unknown string falls back to ``"dct"``.

random_state : int or None, default None
    Base seed for rotation layers that use randomness (random_channel).  Layer i uses seed random_state + i.
verbose : bool or int, default=False
    Controls progress bar display.  False (or 0) disables all progress bars.  True (or 1) shows a progress bar for the fit loop.  2 additionally shows progress bars for transform and inverse_transform.

Attributes

layers_ : list of tuple (MarginalGaussianize, ImageBijector)
    Fitted (marginal, rotation) pairs in application order.
X_transformed_ : np.ndarray, shape (N, C*H*W)
    Final transformed representation after the last layer.

Notes

The composed forward transform for a single sample x is

    z = (R_L ∘ G_L ∘ ⋯ ∘ R_1 ∘ G_1)(x)

where G_ℓ is marginal Gaussianisation and R_ℓ is an orthonormal rotation at layer ℓ. Because each rotation is orthonormal, the total log-determinant is determined entirely by the marginal transforms.

Examples

>>> import numpy as np
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((50, 64))  # 50 images, C=1, H=8, W=8
>>> model = ImageRBIG(n_layers=3, C=1, H=8, W=8, strategy="dct", random_state=0)
>>> model.fit(X)  # doctest: +ELLIPSIS
ImageRBIG(...)
>>> Xt = model.transform(X)
>>> Xt.shape
(50, 64)
>>> Xr = model.inverse_transform(Xt)
>>> Xr.shape
(50, 64)
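The claim that the rotations contribute nothing to the log-determinant can be checked for the "dct" strategy: the orthonormal type-II DCT (here built with scipy.fft.dct, which a DCTRotation layer of this kind presumably wraps) is an orthogonal matrix, so its log-determinant is zero. A quick sketch:

```python
import numpy as np
from scipy.fft import dct

n = 8
# Columns of M are the orthonormal type-II DCT of the standard basis vectors,
# i.e. M is the DCT transform matrix itself.
M = dct(np.eye(n), type=2, norm="ortho", axis=0)
sign, logdet = np.linalg.slogdet(M)
# Orthonormal ⇒ MᵀM = I and log|det M| = 0 (volume-preserving)
```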

Source code in rbig/_src/image.py
class ImageRBIG:
    """RBIG orchestrator for image data.

    Alternates between marginal Gaussianisation and an orthonormal spatial
    rotation for ``n_layers`` steps, progressively pushing the joint
    distribution of image pixels towards a multivariate Gaussian.

    Each layer applies:

    1. :class:`~rbig._src.marginal.MarginalGaussianize` — maps every feature
       dimension to a standard normal marginal distribution.
    2. An orthonormal rotation selected by ``strategy`` — decorrelates
       features without changing the differential entropy.

    Parameters
    ----------
    n_layers : int, default 10
        Number of (marginal + rotation) layer pairs to apply.
    C : int, default 1
        Number of image channels passed to the rotation layers.
    H : int, default 8
        Image height in pixels passed to the rotation layers.
    W : int, default 8
        Image width in pixels passed to the rotation layers.
    strategy : str, default ``"dct"``
        Rotation strategy.  One of:

        * ``"dct"`` — Type-II orthonormal DCT (:class:`DCTRotation`).
        * ``"hartley"`` — Discrete Hartley Transform
          (:class:`HartleyRotation`).
        * ``"random_channel"`` — Random orthogonal channel mixing
          (:class:`RandomChannelRotation`).

        Any unknown string falls back to ``"dct"``.
    random_state : int or None, default None
        Base seed for rotation layers that use randomness (``random_channel``).
        Layer ``i`` uses seed ``random_state + i``.
    verbose : bool or int, default=False
        Controls progress bar display.  ``False`` (or ``0``) disables all
        progress bars.  ``True`` (or ``1``) shows a progress bar for the
        ``fit`` loop.  ``2`` additionally shows progress bars for
        ``transform`` and ``inverse_transform``.

    Attributes
    ----------
    layers_ : list of tuple (MarginalGaussianize, ImageBijector)
        Fitted (marginal, rotation) pairs in application order.
    X_transformed_ : np.ndarray, shape ``(N, C*H*W)``
        Final transformed representation after the last layer.

    Notes
    -----
    The composed forward transform for a single sample :math:`\\mathbf{x}` is

    .. math::

        \\mathbf{z} = (R_L \\circ G_L \\circ \\cdots \\circ R_1 \\circ G_1)(\\mathbf{x})

    where :math:`G_\\ell` is marginal Gaussianisation and :math:`R_\\ell` is an
    orthonormal rotation at layer :math:`\\ell`.  Because each rotation is
    orthonormal, the total log-determinant is determined entirely by the
    marginal transforms.

    Examples
    --------
    >>> import numpy as np
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((50, 64))  # 50 images, C=1, H=8, W=8
    >>> model = ImageRBIG(n_layers=3, C=1, H=8, W=8, strategy="dct", random_state=0)
    >>> model.fit(X)  # doctest: +ELLIPSIS
    ImageRBIG(...)
    >>> Xt = model.transform(X)
    >>> Xt.shape
    (50, 64)
    >>> Xr = model.inverse_transform(Xt)
    >>> Xr.shape
    (50, 64)
    """

    def __init__(
        self,
        n_layers: int = 10,
        C: int = 1,
        H: int = 8,
        W: int = 8,
        strategy: str = "dct",
        random_state: int | None = None,
        verbose: bool | int = False,
    ):
        self.n_layers = n_layers
        self.C = C
        self.H = H
        self.W = W
        self.strategy = strategy
        self.random_state = random_state
        self.verbose = verbose

    def fit(self, X: np.ndarray, y=None) -> ImageRBIG:
        """Fit all (marginal, rotation) layer pairs sequentially.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, C*H*W)``
            Training image batch in flattened format.

        Returns
        -------
        self : ImageRBIG
        """
        from rbig._src._progress import maybe_tqdm
        from rbig._src.marginal import MarginalGaussianize

        self.layers_ = []
        Xt = X.copy()  # working copy updated layer by layer
        pbar = maybe_tqdm(
            range(self.n_layers),
            verbose=self.verbose,
            level=1,
            desc="Fitting ImageRBIG",
            total=self.n_layers,
        )
        for i in pbar:
            # Step 1: marginal Gaussianisation
            marginal = MarginalGaussianize()
            Xt_m = marginal.fit_transform(Xt)
            # Step 2: orthonormal spatial rotation
            rotation = self._make_rotation(seed=i)
            rotation.fit(Xt_m)
            self.layers_.append((marginal, rotation))
            Xt = rotation.transform(Xt_m)  # update for the next iteration
        self.X_transformed_ = Xt  # final representation after all layers
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply all fitted layers in forward order.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, C*H*W)``
            Image batch to transform.

        Returns
        -------
        Xt : np.ndarray, shape ``(N, C*H*W)``
            Gaussianised representation.
        """
        from rbig._src._progress import maybe_tqdm

        Xt = X.copy()
        layers_iter = maybe_tqdm(
            self.layers_,
            verbose=self.verbose,
            level=2,
            desc="Transforming",
            total=len(self.layers_),
        )
        for marginal, rotation in layers_iter:
            Xt = rotation.transform(marginal.transform(Xt))
        return Xt

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Apply all fitted layers in reverse order.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, C*H*W)``
            Gaussianised representation to invert.

        Returns
        -------
        Xr : np.ndarray, shape ``(N, C*H*W)``
            Reconstructed image batch in the original domain.
        """
        from rbig._src._progress import maybe_tqdm

        Xt = X.copy()
        # Reverse the layer list; apply inverse of each (rotation first, then marginal)
        layers_iter = maybe_tqdm(
            reversed(self.layers_),
            verbose=self.verbose,
            level=2,
            desc="Inverse transforming",
            total=len(self.layers_),
        )
        for marginal, rotation in layers_iter:
            Xt = marginal.inverse_transform(rotation.inverse_transform(Xt))
        return Xt

    def _make_rotation(self, seed: int = 0):
        """Instantiate the rotation layer for the given layer index.

        Parameters
        ----------
        seed : int, default 0
            Layer index; combined with ``random_state`` for reproducibility.

        Returns
        -------
        rotation : ImageBijector
            An unfitted rotation bijector of the type specified by
            ``self.strategy``.
        """
        # Combine base seed with layer index so each layer gets a unique seed
        rng_seed = (self.random_state or 0) + seed
        if self.strategy == "dct":
            return DCTRotation(C=self.C, H=self.H, W=self.W)
        elif self.strategy == "hartley":
            return HartleyRotation(C=self.C, H=self.H, W=self.W)
        elif self.strategy == "random_channel":
            return RandomChannelRotation(
                C=self.C, H=self.H, W=self.W, random_state=rng_seed
            )
        else:
            # Unknown strategy: fall back to DCT
            return DCTRotation(C=self.C, H=self.H, W=self.W)

fit(X, y=None)

Fit all (marginal, rotation) layer pairs sequentially.

Parameters

X : np.ndarray, shape (N, C*H*W)
    Training image batch in flattened format.

Returns

self : ImageRBIG

Source code in rbig/_src/image.py
def fit(self, X: np.ndarray, y=None) -> ImageRBIG:
    """Fit all (marginal, rotation) layer pairs sequentially.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, C*H*W)``
        Training image batch in flattened format.

    Returns
    -------
    self : ImageRBIG
    """
    from rbig._src._progress import maybe_tqdm
    from rbig._src.marginal import MarginalGaussianize

    self.layers_ = []
    Xt = X.copy()  # working copy updated layer by layer
    pbar = maybe_tqdm(
        range(self.n_layers),
        verbose=self.verbose,
        level=1,
        desc="Fitting ImageRBIG",
        total=self.n_layers,
    )
    for i in pbar:
        # Step 1: marginal Gaussianisation
        marginal = MarginalGaussianize()
        Xt_m = marginal.fit_transform(Xt)
        # Step 2: orthonormal spatial rotation
        rotation = self._make_rotation(seed=i)
        rotation.fit(Xt_m)
        self.layers_.append((marginal, rotation))
        Xt = rotation.transform(Xt_m)  # update for the next iteration
    self.X_transformed_ = Xt  # final representation after all layers
    return self

transform(X)

Apply all fitted layers in forward order.

Parameters

X : np.ndarray, shape (N, C*H*W)
    Image batch to transform.

Returns

Xt : np.ndarray, shape (N, C*H*W)
    Gaussianised representation.

Source code in rbig/_src/image.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply all fitted layers in forward order.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, C*H*W)``
        Image batch to transform.

    Returns
    -------
    Xt : np.ndarray, shape ``(N, C*H*W)``
        Gaussianised representation.
    """
    from rbig._src._progress import maybe_tqdm

    Xt = X.copy()
    layers_iter = maybe_tqdm(
        self.layers_,
        verbose=self.verbose,
        level=2,
        desc="Transforming",
        total=len(self.layers_),
    )
    for marginal, rotation in layers_iter:
        Xt = rotation.transform(marginal.transform(Xt))
    return Xt

inverse_transform(X)

Apply all fitted layers in reverse order.

Parameters

X : np.ndarray, shape (N, C*H*W)
    Gaussianised representation to invert.

Returns

Xr : np.ndarray, shape (N, C*H*W)
    Reconstructed image batch in the original domain.

Source code in rbig/_src/image.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Apply all fitted layers in reverse order.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, C*H*W)``
        Gaussianised representation to invert.

    Returns
    -------
    Xr : np.ndarray, shape ``(N, C*H*W)``
        Reconstructed image batch in the original domain.
    """
    from rbig._src._progress import maybe_tqdm

    Xt = X.copy()
    # Reverse the layer list; apply inverse of each (rotation first, then marginal)
    layers_iter = maybe_tqdm(
        reversed(self.layers_),
        verbose=self.verbose,
        level=2,
        desc="Inverse transforming",
        total=len(self.layers_),
    )
    for marginal, rotation in layers_iter:
        Xt = marginal.inverse_transform(rotation.inverse_transform(Xt))
    return Xt

rbig.ImageBijector

Bases: Bijector

Abstract base class for bijective image transforms.

Manages the conversion between the flattened representation (N, C·H·W) expected by RBIG and the 4-D tensor (N, C, H, W) used internally by spatial transforms.

Subclasses must implement fit, transform, and inverse_transform. The default get_log_det_jacobian returns zeros, which is correct for all orthonormal transforms defined in this module (|det J| = 1).

Attributes

C_ : int
    Number of channels (set during fit).
H_ : int
    Image height in pixels (set during fit).
W_ : int
    Image width in pixels (set during fit).

Notes

The two helper methods implement:

    _to_tensor: (N, C·H·W) → (N, C, H, W)
    _to_flat:   (N, C, H, W) → (N, C·H·W)
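Both helpers are thin wrappers around numpy reshapes, so the conversion is lossless. A standalone numpy sketch of the round trip (shapes as illustration):

```python
import numpy as np

N, C, H, W = 5, 1, 8, 8
X_flat = np.arange(N * C * H * W, dtype=float).reshape(N, C * H * W)

# _to_tensor: (N, C*H*W) -> (N, C, H, W)
X_tensor = X_flat.reshape(N, C, H, W)

# _to_flat: (N, C, H, W) -> (N, C*H*W)
X_back = X_tensor.reshape(N, -1)

# Reshaping only reinterprets the memory layout; no values change.
assert np.array_equal(X_flat, X_back)
```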
Source code in rbig/_src/image.py
class ImageBijector(Bijector):
    """Abstract base class for bijective image transforms.

    Manages the conversion between the flattened representation
    ``(N, C·H·W)`` expected by RBIG and the 4-D tensor ``(N, C, H, W)``
    used internally by spatial transforms.

    Subclasses must implement :meth:`fit`, :meth:`transform`, and
    :meth:`inverse_transform`.  The default :meth:`get_log_det_jacobian`
    returns zeros, which is correct for all orthonormal transforms defined
    in this module (``|det J| = 1``).

    Attributes
    ----------
    C_ : int
        Number of channels (set during :meth:`fit`).
    H_ : int
        Image height in pixels (set during :meth:`fit`).
    W_ : int
        Image width in pixels (set during :meth:`fit`).

    Notes
    -----
    The two helper methods implement:

    .. math::

        \\text{_to_tensor}: (N, C \\cdot H \\cdot W)
            \\longrightarrow (N, C, H, W)

        \\text{_to_flat}: (N, C, H, W)
            \\longrightarrow (N, C \\cdot H \\cdot W)
    """

    def _to_tensor(self, X: np.ndarray) -> np.ndarray:
        """Reshape flat ``(N, C*H*W)`` array to tensor ``(N, C, H, W)``.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, C*H*W)``
            Flattened image batch.

        Returns
        -------
        tensor : np.ndarray, shape ``(N, C, H, W)``
        """
        N = X.shape[0]
        C, H, W = self.C_, self.H_, self.W_
        return X.reshape(N, C, H, W)

    def _to_flat(self, X: np.ndarray) -> np.ndarray:
        """Reshape tensor ``(N, C, H, W)`` to flat ``(N, C*H*W)``.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, C, H, W)``
            Image tensor batch.

        Returns
        -------
        flat : np.ndarray, shape ``(N, C*H*W)``
        """
        N = X.shape[0]
        return X.reshape(N, -1)

    def fit(self, X: np.ndarray, y=None) -> ImageBijector:
        raise NotImplementedError

    def transform(self, X: np.ndarray) -> np.ndarray:
        raise NotImplementedError

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        raise NotImplementedError

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Return per-sample log |det J| = 0 (orthonormal transform).

        Parameters
        ----------
        X : np.ndarray, shape ``(N, D)``

        Returns
        -------
        log_det : np.ndarray, shape ``(N,)``
            All-zero array because the Jacobian determinant is ±1 for every
            orthonormal linear map.
        """
        return np.zeros(X.shape[0])

get_log_det_jacobian(X)

Return per-sample log |det J| = 0 (orthonormal transform).

Parameters

X : np.ndarray, shape (N, D)

Returns

log_det : np.ndarray, shape (N,)
    All-zero array because the Jacobian determinant is ±1 for every orthonormal linear map.

Source code in rbig/_src/image.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Return per-sample log |det J| = 0 (orthonormal transform).

    Parameters
    ----------
    X : np.ndarray, shape ``(N, D)``

    Returns
    -------
    log_det : np.ndarray, shape ``(N,)``
        All-zero array because the Jacobian determinant is ±1 for every
        orthonormal linear map.
    """
    return np.zeros(X.shape[0])

rbig.DCTRotation

Bases: ImageBijector

Type-II orthonormal 2-D Discrete Cosine Transform rotation.

Applies the 2-D DCT-II with orthonormal normalisation (norm="ortho") to each spatial channel. Because the ortho-normalised DCT is an orthogonal matrix, log|det J| = 0 for all inputs.

Parameters

C : int, default 1
    Number of image channels.
H : int, default 8
    Image height in pixels.
W : int, default 8
    Image width in pixels.

Attributes

C_ : int
    Fitted number of channels.
H_ : int
    Fitted image height.
W_ : int
    Fitted image width.

Notes

For an orthonormal DCT matrix D acting on the vectorised image x:

    y = D x,    log|det J| = log|det D| = 0

because D is orthogonal (det = ±1).
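Orthogonality of the ortho-normalised DCT can be verified directly: Parseval's relation holds (the L2 norm is preserved), and idctn inverts dctn exactly. A small check using scipy:

```python
import numpy as np
from scipy.fft import dctn, idctn

rng = np.random.default_rng(0)
img = rng.standard_normal((8, 8))

coeffs = dctn(img, norm="ortho")

# Orthogonality implies Parseval's relation: the L2 norm is preserved,
# which is exactly the statement |det D| = 1, i.e. log|det J| = 0.
assert np.isclose(np.linalg.norm(img), np.linalg.norm(coeffs))

# idctn with norm="ortho" is the exact inverse of dctn with norm="ortho".
assert np.allclose(img, idctn(coeffs, norm="ortho"))
```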

Examples

>>> import numpy as np
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((5, 64))  # N=5, C=1, H=8, W=8
>>> layer = DCTRotation(C=1, H=8, W=8)
>>> layer.fit(X)  # doctest: +ELLIPSIS
DCTRotation(...)
>>> Xt = layer.transform(X)
>>> Xr = layer.inverse_transform(Xt)
>>> np.allclose(X, Xr, atol=1e-10)
True

Source code in rbig/_src/image.py
class DCTRotation(ImageBijector):
    """Type-II orthonormal 2-D Discrete Cosine Transform rotation.

    Applies the 2-D DCT-II with orthonormal normalisation (``norm="ortho"``)
    to each spatial channel.  Because the ortho-normalised DCT is an
    orthogonal matrix, ``log|det J| = 0`` for all inputs.

    Parameters
    ----------
    C : int, default 1
        Number of image channels.
    H : int, default 8
        Image height in pixels.
    W : int, default 8
        Image width in pixels.

    Attributes
    ----------
    C_ : int
        Fitted number of channels.
    H_ : int
        Fitted image height.
    W_ : int
        Fitted image width.

    Notes
    -----
    For an orthonormal DCT matrix :math:`\\mathbf{D}` acting on the
    vectorised image :math:`\\mathbf{x}`:

    .. math::

        \\mathbf{y} = \\mathbf{D}\\,\\mathbf{x},
        \\quad
        \\log |\\det J| = \\log |\\det \\mathbf{D}| = 0

    because :math:`\\mathbf{D}` is orthogonal (``det = ±1``).

    Examples
    --------
    >>> import numpy as np
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((5, 64))  # N=5, C=1, H=8, W=8
    >>> layer = DCTRotation(C=1, H=8, W=8)
    >>> layer.fit(X)  # doctest: +ELLIPSIS
    DCTRotation(...)
    >>> Xt = layer.transform(X)
    >>> Xr = layer.inverse_transform(Xt)
    >>> np.allclose(X, Xr, atol=1e-10)
    True
    """

    def __init__(self, C: int = 1, H: int = 8, W: int = 8):
        self.C = C
        self.H = H
        self.W = W

    def fit(self, X: np.ndarray, y=None) -> DCTRotation:
        """Store spatial dimensions; no data-dependent fitting required.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, C*H*W)``

        Returns
        -------
        self : DCTRotation
        """
        self.C_ = self.C
        self.H_ = self.H
        self.W_ = self.W
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply orthonormal 2-D DCT-II to every image channel.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, C*H*W)``
            Flattened image batch.

        Returns
        -------
        Xt : np.ndarray, shape ``(N, C*H*W)``
            Orthonormal DCT-II coefficients.
        """
        from scipy.fft import dctn

        N = X.shape[0]
        imgs = self._to_tensor(X)  # (N, C, H, W)
        result = np.zeros_like(imgs)
        for n in range(N):
            for c in range(self.C_):
                # norm="ortho" yields the orthonormal variant of the DCT-II
                result[n, c] = dctn(imgs[n, c], norm="ortho")
        return self._to_flat(result)

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Apply orthonormal 2-D inverse DCT (DCT-III scaled).

        Parameters
        ----------
        X : np.ndarray, shape ``(N, C*H*W)``
            DCT coefficient batch from :meth:`transform`.

        Returns
        -------
        Xr : np.ndarray, shape ``(N, C*H*W)``
            Reconstructed image batch.
        """
        from scipy.fft import idctn

        N = X.shape[0]
        imgs = X.reshape(N, self.C_, self.H_, self.W_)  # (N, C, H, W)
        result = np.zeros_like(imgs)
        for n in range(N):
            for c in range(self.C_):
                # idctn with norm="ortho" is the exact inverse of dctn with norm="ortho"
                result[n, c] = idctn(imgs[n, c], norm="ortho")
        return self._to_flat(result)

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Return zeros: orthonormal DCT has ``log|det J| = 0``.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, D)``

        Returns
        -------
        log_det : np.ndarray, shape ``(N,)``
        """
        return np.zeros(X.shape[0])

fit(X, y=None)

Store spatial dimensions; no data-dependent fitting required.

Parameters

X : np.ndarray, shape (N, C*H*W)

Returns

self : DCTRotation

Source code in rbig/_src/image.py
def fit(self, X: np.ndarray, y=None) -> DCTRotation:
    """Store spatial dimensions; no data-dependent fitting required.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, C*H*W)``

    Returns
    -------
    self : DCTRotation
    """
    self.C_ = self.C
    self.H_ = self.H
    self.W_ = self.W
    return self

transform(X)

Apply orthonormal 2-D DCT-II to every image channel.

Parameters

X : np.ndarray, shape (N, C*H*W)
    Flattened image batch.

Returns

Xt : np.ndarray, shape (N, C*H*W)
    Orthonormal DCT-II coefficients.

Source code in rbig/_src/image.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply orthonormal 2-D DCT-II to every image channel.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, C*H*W)``
        Flattened image batch.

    Returns
    -------
    Xt : np.ndarray, shape ``(N, C*H*W)``
        Orthonormal DCT-II coefficients.
    """
    from scipy.fft import dctn

    N = X.shape[0]
    imgs = self._to_tensor(X)  # (N, C, H, W)
    result = np.zeros_like(imgs)
    for n in range(N):
        for c in range(self.C_):
            # norm="ortho" yields the orthonormal variant of the DCT-II
            result[n, c] = dctn(imgs[n, c], norm="ortho")
    return self._to_flat(result)

inverse_transform(X)

Apply orthonormal 2-D inverse DCT (DCT-III scaled).

Parameters

X : np.ndarray, shape (N, C*H*W)
    DCT coefficient batch from transform.

Returns

Xr : np.ndarray, shape (N, C*H*W)
    Reconstructed image batch.

Source code in rbig/_src/image.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Apply orthonormal 2-D inverse DCT (DCT-III scaled).

    Parameters
    ----------
    X : np.ndarray, shape ``(N, C*H*W)``
        DCT coefficient batch from :meth:`transform`.

    Returns
    -------
    Xr : np.ndarray, shape ``(N, C*H*W)``
        Reconstructed image batch.
    """
    from scipy.fft import idctn

    N = X.shape[0]
    imgs = X.reshape(N, self.C_, self.H_, self.W_)  # (N, C, H, W)
    result = np.zeros_like(imgs)
    for n in range(N):
        for c in range(self.C_):
            # idctn with norm="ortho" is the exact inverse of dctn with norm="ortho"
            result[n, c] = idctn(imgs[n, c], norm="ortho")
    return self._to_flat(result)

get_log_det_jacobian(X)

Return zeros: orthonormal DCT has log|det J| = 0.

Parameters

X : np.ndarray, shape (N, D)

Returns

log_det : np.ndarray, shape (N,)

Source code in rbig/_src/image.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Return zeros: orthonormal DCT has ``log|det J| = 0``.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, D)``

    Returns
    -------
    log_det : np.ndarray, shape ``(N,)``
    """
    return np.zeros(X.shape[0])

rbig.HartleyRotation

Bases: ImageBijector

Discrete Hartley Transform — real-to-real orthonormal rotation.

The 2-D Discrete Hartley Transform (DHT) is defined as

    H(x) = Re(FFT(x)) − Im(FFT(x))

and is normalised by 1/√(H·W) to make it orthonormal. Because the normalised DHT is its own inverse, the same operation is applied in both transform and inverse_transform.

Since the transform is orthonormal, log|det J| = 0 for all inputs.

Parameters

C : int, default 1
    Number of image channels.
H : int, default 8
    Image height in pixels.
W : int, default 8
    Image width in pixels.

Attributes

C_ : int
    Fitted number of channels.
H_ : int
    Fitted image height.
W_ : int
    Fitted image width.

Notes

The normalised DHT satisfies

    H(H(x)) = x

making it a self-inverse bijection. The scaling factor is 1/√(H·W).
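The self-inverse property is easy to confirm numerically. A standalone sketch of the normalised DHT built from scipy's fft2 (the helper name is illustrative, not the library's):

```python
import numpy as np
from scipy.fft import fft2

def dht2_ortho(x):
    """Normalised 2-D DHT: Re(FFT2) - Im(FFT2), scaled by 1/sqrt(H*W)."""
    F = fft2(x)
    return (F.real - F.imag) / np.sqrt(x.shape[0] * x.shape[1])

rng = np.random.default_rng(0)
img = rng.standard_normal((8, 8))

h = dht2_ortho(img)
# Orthonormal: the L2 norm is preserved, so log|det J| = 0.
assert np.isclose(np.linalg.norm(img), np.linalg.norm(h))
# Self-inverse: applying the same transform twice recovers the input.
assert np.allclose(dht2_ortho(h), img)
```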

Examples

>>> import numpy as np
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((5, 64))  # N=5, C=1, H=8, W=8
>>> layer = HartleyRotation(C=1, H=8, W=8)
>>> layer.fit(X)  # doctest: +ELLIPSIS
HartleyRotation(...)
>>> Xt = layer.transform(X)
>>> Xr = layer.inverse_transform(Xt)
>>> np.allclose(X, Xr, atol=1e-10)
True

Source code in rbig/_src/image.py
class HartleyRotation(ImageBijector):
    """Discrete Hartley Transform — real-to-real orthonormal rotation.

    The 2-D Discrete Hartley Transform (DHT) is defined as

    .. math::

        H(\\mathbf{x}) = \\operatorname{Re}\\bigl(\\text{FFT}(\\mathbf{x})\\bigr)
                        - \\operatorname{Im}\\bigl(\\text{FFT}(\\mathbf{x})\\bigr)

    and is normalised by ``1/√(H·W)`` to make it orthonormal (unitary).
    Because the DHT is its own inverse (self-inverse), the same operation is
    applied in both :meth:`transform` and :meth:`inverse_transform`.

    Since the transform is orthonormal ``log|det J| = 0`` for all inputs.

    Parameters
    ----------
    C : int, default 1
        Number of image channels.
    H : int, default 8
        Image height in pixels.
    W : int, default 8
        Image width in pixels.

    Attributes
    ----------
    C_ : int
        Fitted number of channels.
    H_ : int
        Fitted image height.
    W_ : int
        Fitted image width.

    Notes
    -----
    The normalised DHT satisfies

    .. math::

        H(H(\\mathbf{x})) = \\mathbf{x}

    making it a self-inverse bijection.  The scaling factor is
    :math:`1 / \\sqrt{H \\cdot W}`.

    Examples
    --------
    >>> import numpy as np
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((5, 64))  # N=5, C=1, H=8, W=8
    >>> layer = HartleyRotation(C=1, H=8, W=8)
    >>> layer.fit(X)  # doctest: +ELLIPSIS
    HartleyRotation(...)
    >>> Xt = layer.transform(X)
    >>> Xr = layer.inverse_transform(Xt)
    >>> np.allclose(X, Xr, atol=1e-10)
    True
    """

    def __init__(self, C: int = 1, H: int = 8, W: int = 8):
        self.C = C
        self.H = H
        self.W = W

    def fit(self, X: np.ndarray, y=None) -> HartleyRotation:
        """Store spatial dimensions; no data-dependent fitting required.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, C*H*W)``

        Returns
        -------
        self : HartleyRotation
        """
        self.C_ = self.C
        self.H_ = self.H
        self.W_ = self.W
        return self

    def _dht2(self, x: np.ndarray) -> np.ndarray:
        """Compute un-normalised 2-D Discrete Hartley Transform.

        .. math::

            H_{m,n} = \\operatorname{Re}(F_{m,n}) - \\operatorname{Im}(F_{m,n})

        where :math:`F = \\text{FFT2}(x)`.

        Parameters
        ----------
        x : np.ndarray, shape ``(H, W)``
            Single real-valued image channel.

        Returns
        -------
        h : np.ndarray, shape ``(H, W)``
            Un-normalised DHT coefficients.
        """
        from scipy.fft import fft2

        X_fft = fft2(x)  # complex FFT2, shape (H, W)
        # DHT = Re(FFT) - Im(FFT)
        return X_fft.real - X_fft.imag

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply normalised 2-D DHT to every image channel.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, C*H*W)``
            Flattened image batch.

        Returns
        -------
        Xt : np.ndarray, shape ``(N, C*H*W)``
            DHT coefficients scaled by ``1/√(H*W)``.
        """
        N = X.shape[0]
        imgs = self._to_tensor(X)  # (N, C, H, W)
        result = np.zeros_like(imgs)
        for n in range(N):
            for c in range(self.C_):
                # Normalise by 1/√(H·W) to make the transform orthonormal
                result[n, c] = self._dht2(imgs[n, c]) / np.sqrt(self.H_ * self.W_)
        return self._to_flat(result)

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the inverse DHT (identical to the forward transform).

        The normalised DHT satisfies ``H(H(x)) = x``, so the forward and
        inverse transforms are the same function.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, C*H*W)``

        Returns
        -------
        Xr : np.ndarray, shape ``(N, C*H*W)``
        """
        # DHT is self-inverse (up to scale factor already applied in transform)
        return self.transform(X)

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Return zeros: normalised DHT has ``|det J| = 1``.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, D)``

        Returns
        -------
        log_det : np.ndarray, shape ``(N,)``
        """
        return np.zeros(X.shape[0])

fit(X, y=None)

Store spatial dimensions; no data-dependent fitting required.

Parameters

X : np.ndarray, shape (N, C*H*W)

Returns

self : HartleyRotation

Source code in rbig/_src/image.py
def fit(self, X: np.ndarray, y=None) -> HartleyRotation:
    """Store spatial dimensions; no data-dependent fitting required.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, C*H*W)``

    Returns
    -------
    self : HartleyRotation
    """
    self.C_ = self.C
    self.H_ = self.H
    self.W_ = self.W
    return self

transform(X)

Apply normalised 2-D DHT to every image channel.

Parameters

X : np.ndarray, shape (N, C*H*W)
    Flattened image batch.

Returns

Xt : np.ndarray, shape (N, C*H*W)
    DHT coefficients scaled by 1/√(H*W).

Source code in rbig/_src/image.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply normalised 2-D DHT to every image channel.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, C*H*W)``
        Flattened image batch.

    Returns
    -------
    Xt : np.ndarray, shape ``(N, C*H*W)``
        DHT coefficients scaled by ``1/√(H*W)``.
    """
    N = X.shape[0]
    imgs = self._to_tensor(X)  # (N, C, H, W)
    result = np.zeros_like(imgs)
    for n in range(N):
        for c in range(self.C_):
            # Normalise by 1/√(H·W) to make the transform orthonormal
            result[n, c] = self._dht2(imgs[n, c]) / np.sqrt(self.H_ * self.W_)
    return self._to_flat(result)

inverse_transform(X)

Apply the inverse DHT (identical to the forward transform).

The normalised DHT satisfies H(H(x)) = x, so the forward and inverse transforms are the same function.

Parameters

X : np.ndarray, shape (N, C*H*W)

Returns

Xr : np.ndarray, shape (N, C*H*W)

Source code in rbig/_src/image.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the inverse DHT (identical to the forward transform).

    The normalised DHT satisfies ``H(H(x)) = x``, so the forward and
    inverse transforms are the same function.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, C*H*W)``

    Returns
    -------
    Xr : np.ndarray, shape ``(N, C*H*W)``
    """
    # DHT is self-inverse (up to scale factor already applied in transform)
    return self.transform(X)

get_log_det_jacobian(X)

Return zeros: normalised DHT has |det J| = 1.

Parameters

X : np.ndarray, shape (N, D)

Returns

log_det : np.ndarray, shape (N,)

Source code in rbig/_src/image.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Return zeros: normalised DHT has ``|det J| = 1``.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, D)``

    Returns
    -------
    log_det : np.ndarray, shape ``(N,)``
    """
    return np.zeros(X.shape[0])

rbig.WaveletTransform

Bases: BaseTransform

Multi-level 2-D wavelet decomposition for image data.

Wraps PyWavelets wavedec2 / waverec2 to provide the standard fit / transform / inverse_transform interface expected by RBIG pipeline components.

The forward transform maps each (H, W) image to a flat coefficient vector of length H * W (for periodization boundary mode the coefficient array has the same number of elements as the input image).

Requires PyWavelets (pip install PyWavelets).

Parameters

wavelet : str, default "haar"
    Wavelet name accepted by pywt.Wavelet, e.g. "haar", "db2", "sym4".
level : int, default 1
    Decomposition depth. Higher levels produce coarser approximation sub-bands.
mode : str, default "periodization"
    Signal extension mode passed to PyWavelets. "periodization" ensures the output coefficient array has the same total size as the input.

Attributes

original_shape_ : tuple of int
    Shape (N, H, W) of the training data passed to fit.
coeff_slices_ : list
    PyWavelets slicing metadata needed to pack/unpack the coefficient array. Set during fit.
coeff_shape_ : tuple of int
    Shape of the 2-D coefficient array produced by pywt.coeffs_to_array.

Notes

The mapping from image to coefficients is

    wavedec2: (N, H·W) → (N, H·W)

For level=1 and wavelet="haar" the four sub-bands are: approximation (LL), horizontal detail (LH), vertical detail (HL), and diagonal detail (HH).
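The level-1 Haar split can be sketched in plain numpy, without PyWavelets: each 2×2 block is mapped to one average and three difference coefficients. The sub-band naming below follows the common averaging/differencing convention and may not match PyWavelets' exact ordering:

```python
import numpy as np

def haar2_level1(img):
    """One level of the orthonormal 2-D Haar transform on 2x2 blocks."""
    a = img[0::2, 0::2]  # top-left of each 2x2 block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    LL = (a + b + c + d) / 2  # approximation
    LH = (a + b - c - d) / 2  # row-difference detail
    HL = (a - b + c - d) / 2  # column-difference detail
    HH = (a - b - c + d) / 2  # diagonal detail
    return LL, LH, HL, HH

rng = np.random.default_rng(0)
img = rng.standard_normal((8, 8))
bands = haar2_level1(img)

# The four sub-bands together hold exactly H*W coefficients, and because
# the transform is orthonormal the total energy is preserved.
assert sum(b.size for b in bands) == img.size
assert np.isclose(np.linalg.norm(img) ** 2,
                  sum(np.linalg.norm(b) ** 2 for b in bands))
```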

Examples

>>> import numpy as np
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((50, 8, 8))  # 50 grayscale 8×8 images
>>> wt = WaveletTransform(wavelet="haar", level=1)
>>> wt.fit(X)  # doctest: +ELLIPSIS
WaveletTransform(...)
>>> Xt = wt.transform(X)
>>> Xt.shape
(50, 64)
>>> Xr = wt.inverse_transform(Xt)
>>> Xr.shape
(50, 8, 8)

Source code in rbig/_src/image.py
class WaveletTransform(BaseTransform):
    """Multi-level 2-D wavelet decomposition for image data.

    Wraps PyWavelets ``wavedec2`` / ``waverec2`` to provide the standard
    ``fit`` / ``transform`` / ``inverse_transform`` interface expected by
    RBIG pipeline components.

    The forward transform maps each ``(H, W)`` image to a flat coefficient
    vector of length ``H * W`` (for periodization boundary mode the coefficient
    array has the same number of elements as the input image).

    Requires PyWavelets (``pip install PyWavelets``).

    Parameters
    ----------
    wavelet : str, default ``"haar"``
        Wavelet name accepted by :func:`pywt.Wavelet`, e.g. ``"haar"``,
        ``"db2"``, ``"sym4"``.
    level : int, default 1
        Decomposition depth.  Higher levels produce coarser approximation
        sub-bands.
    mode : str, default ``"periodization"``
        Signal extension mode passed to PyWavelets.  ``"periodization"``
        ensures the output coefficient array has the same total size as the
        input.

    Attributes
    ----------
    original_shape_ : tuple of int
        Shape ``(N, H, W)`` of the training data passed to :meth:`fit`.
    coeff_slices_ : list
        PyWavelets slicing metadata needed to pack/unpack the coefficient
        array.  Set during :meth:`fit`.
    coeff_shape_ : tuple of int
        Shape of the 2-D coefficient array produced by
        :func:`pywt.coeffs_to_array`.

    Notes
    -----
    The mapping from image to coefficients is

    .. math::

        (N,\\, H,\\, W) \\xrightarrow{\\text{wavedec2}}
        (N,\\, H \\cdot W)

    For ``level=1`` and ``wavelet="haar"`` the four sub-bands are:
    approximation (LL), horizontal detail (LH), vertical detail (HL), and
    diagonal detail (HH).

    Examples
    --------
    >>> import numpy as np
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((50, 8, 8))  # 50 grayscale 8×8 images
    >>> wt = WaveletTransform(wavelet="haar", level=1)
    >>> wt.fit(X)  # doctest: +ELLIPSIS
    WaveletTransform(...)
    >>> Xt = wt.transform(X)
    >>> Xt.shape
    (50, 64)
    >>> Xr = wt.inverse_transform(Xt)
    >>> Xr.shape
    (50, 8, 8)
    """

    def __init__(
        self, wavelet: str = "haar", level: int = 1, mode: str = "periodization"
    ):
        self.wavelet = wavelet
        self.level = level
        self.mode = mode

    def fit(self, X: np.ndarray, y=None) -> WaveletTransform:
        """Compute and store coefficient layout from the first sample.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, H, W)``
            Training images.  Only the first sample is used to determine the
            coefficient array shape; the data values are not retained.

        Returns
        -------
        self : WaveletTransform
        """
        import pywt

        self.pywt_ = pywt
        self.original_shape_ = X.shape  # store (N, H, W) for reference
        test = X[0]  # single image of shape (H, W)
        coeffs = pywt.wavedec2(test, self.wavelet, level=self.level, mode=self.mode)
        # coeffs_to_array packs all sub-bands into one 2-D array
        arr, self.coeff_slices_ = pywt.coeffs_to_array(coeffs)
        self.coeff_shape_ = arr.shape  # e.g. (H, W) for periodization
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Decompose images into flattened wavelet coefficients.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, H, W)``
            Images to transform.

        Returns
        -------
        Xt : np.ndarray, shape ``(N, H*W)``
            Flattened coefficient vectors, one row per image.
        """
        import pywt

        result = []
        for xi in X:  # xi shape: (H, W)
            coeffs = pywt.wavedec2(xi, self.wavelet, level=self.level, mode=self.mode)
            arr, _ = pywt.coeffs_to_array(coeffs)  # arr shape: coeff_shape_
            result.append(arr.ravel())  # flatten to 1-D coefficient vector
        return np.array(result)  # (N, H*W)

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Reconstruct images from flattened wavelet coefficients.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, H*W)``
            Flattened coefficient vectors produced by :meth:`transform`.

        Returns
        -------
        Xr : np.ndarray, shape ``(N, H, W)``
            Reconstructed images.
        """
        import pywt

        result = []
        for xi in X:  # xi shape: (H*W,)
            arr = xi.reshape(self.coeff_shape_)  # restore 2-D coefficient array
            coeffs = pywt.array_to_coeffs(
                arr, self.coeff_slices_, output_format="wavedec2"
            )
            img = pywt.waverec2(coeffs, self.wavelet, mode=self.mode)  # (H, W)
            result.append(img)
        return np.array(result)  # (N, H, W)

fit(X, y=None)

Compute and store coefficient layout from the first sample.

Parameters

X : np.ndarray, shape (N, H, W)
    Training images. Only the first sample is used to determine the coefficient array shape; the data values are not retained.

Returns

self : WaveletTransform

Source code in rbig/_src/image.py
def fit(self, X: np.ndarray, y=None) -> WaveletTransform:
    """Compute and store coefficient layout from the first sample.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, H, W)``
        Training images.  Only the first sample is used to determine the
        coefficient array shape; the data values are not retained.

    Returns
    -------
    self : WaveletTransform
    """
    import pywt

    self.pywt_ = pywt
    self.original_shape_ = X.shape  # store (N, H, W) for reference
    test = X[0]  # single image of shape (H, W)
    coeffs = pywt.wavedec2(test, self.wavelet, level=self.level, mode=self.mode)
    # coeffs_to_array packs all sub-bands into one 2-D array
    arr, self.coeff_slices_ = pywt.coeffs_to_array(coeffs)
    self.coeff_shape_ = arr.shape  # e.g. (H, W) for periodization
    return self

transform(X)

Decompose images into flattened wavelet coefficients.

Parameters

X : np.ndarray, shape (N, H, W)
    Images to transform.

Returns

Xt : np.ndarray, shape (N, H*W)
    Flattened coefficient vectors, one row per image.

Source code in rbig/_src/image.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Decompose images into flattened wavelet coefficients.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, H, W)``
        Images to transform.

    Returns
    -------
    Xt : np.ndarray, shape ``(N, H*W)``
        Flattened coefficient vectors, one row per image.
    """
    import pywt

    result = []
    for xi in X:  # xi shape: (H, W)
        coeffs = pywt.wavedec2(xi, self.wavelet, level=self.level, mode=self.mode)
        arr, _ = pywt.coeffs_to_array(coeffs)  # arr shape: coeff_shape_
        result.append(arr.ravel())  # flatten to 1-D coefficient vector
    return np.array(result)  # (N, H*W)

inverse_transform(X)

Reconstruct images from flattened wavelet coefficients.

Parameters

X : np.ndarray, shape (N, H*W)
    Flattened coefficient vectors produced by transform.

Returns

Xr : np.ndarray, shape (N, H, W)
    Reconstructed images.

Source code in rbig/_src/image.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Reconstruct images from flattened wavelet coefficients.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, H*W)``
        Flattened coefficient vectors produced by :meth:`transform`.

    Returns
    -------
    Xr : np.ndarray, shape ``(N, H, W)``
        Reconstructed images.
    """
    import pywt

    result = []
    for xi in X:  # xi shape: (H*W,)
        arr = xi.reshape(self.coeff_shape_)  # restore 2-D coefficient array
        coeffs = pywt.array_to_coeffs(
            arr, self.coeff_slices_, output_format="wavedec2"
        )
        img = pywt.waverec2(coeffs, self.wavelet, mode=self.mode)  # (H, W)
        result.append(img)
    return np.array(result)  # (N, H, W)

Xarray Integration

rbig.XarrayRBIG

RBIG model with an xarray-aware interface.

Wraps an AnnealedRBIG (or compatible class) so that it can be fitted and applied directly to xarray.DataArray / xarray.Dataset objects with spatiotemporal dimensions. The underlying model operates on a 2-D (samples, features) matrix obtained via xr_st_to_matrix.
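Conceptually, a (time, lat, lon) cube becomes a (samples, features) matrix by flattening the non-sample dimensions, and the original shape is enough to undo the flattening afterwards. A minimal NumPy sketch of this convention (the exact metadata handling inside xr_st_to_matrix is internal, and treating time as the sample axis is an assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
cube = rng.standard_normal((30, 4, 5))  # (time, lat, lon)

# Treat time as the sample axis; flatten the spatial dims into features.
matrix = cube.reshape(cube.shape[0], -1)  # (samples, features) = (30, 20)

# Keeping the original shape is enough to invert the flattening, which is
# the role played by the stored metadata after a transform.
restored = matrix.reshape(cube.shape)

assert matrix.shape == (30, 20)
assert np.array_equal(restored, cube)
```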

Parameters

n_layers : int, default 100
    Maximum number of RBIG layers.
strategy : list or None, default None
    Rotation strategy list passed to the underlying RBIG model. If None, the default rotation of the model class is used.
tol : float, default 1e-5
    Convergence tolerance for early stopping.
random_state : int or None, default None
    Random seed for reproducibility.
rbig_class : class or None, default None
    RBIG model class to instantiate. Defaults to AnnealedRBIG when None.
rbig_kwargs : dict or None, default None
    Additional keyword arguments forwarded to rbig_class.
verbose : bool or int, default False
    Controls progress bar display. Passed through to the underlying RBIG model.

Attributes

model_ : AnnealedRBIG
    The fitted underlying RBIG model.
meta_ : dict
    xarray metadata captured during fit, used to reconstruct output arrays.

Examples

>>> import numpy as np
>>> import xarray as xr
>>> rng = np.random.default_rng(0)
>>> da = xr.DataArray(
...     rng.standard_normal((30, 4, 5)),
...     dims=["time", "lat", "lon"],
... )
>>> xrbig = XarrayRBIG(n_layers=10, random_state=0)
>>> # info = xrbig.fit(da)
>>> # da_t = xrbig.transform(da)
Source code in rbig/_src/xarray_st.py
class XarrayRBIG:
    """RBIG model with an xarray-aware interface.

    Wraps an :class:`~rbig._src.model.AnnealedRBIG` (or compatible class) so
    that it can be fitted and applied directly to :class:`xarray.DataArray` /
    :class:`xarray.Dataset` objects with spatiotemporal dimensions.  The
    underlying model operates on a 2-D ``(samples, features)`` matrix obtained
    via :func:`xr_st_to_matrix`.

    Parameters
    ----------
    n_layers : int, default 100
        Maximum number of RBIG layers.
    strategy : list or None, default None
        Rotation strategy list passed to the underlying RBIG model.  If
        ``None``, the default rotation of the model class is used.
    tol : float, default 1e-5
        Convergence tolerance for early stopping.
    random_state : int or None, default None
        Random seed for reproducibility.
    rbig_class : class or None, default None
        RBIG model class to instantiate.  Defaults to
        :class:`~rbig._src.model.AnnealedRBIG` when ``None``.
    rbig_kwargs : dict or None, default None
        Additional keyword arguments forwarded to ``rbig_class``.
    verbose : bool or int, default=False
        Controls progress bar display.  Passed through to the underlying
        RBIG model.

    Attributes
    ----------
    model_ : AnnealedRBIG
        The fitted underlying RBIG model.
    meta_ : dict
        xarray metadata captured during :meth:`fit`, used to reconstruct
        output arrays.

    Examples
    --------
    >>> import numpy as np
    >>> import xarray as xr
    >>> rng = np.random.default_rng(0)
    >>> da = xr.DataArray(
    ...     rng.standard_normal((30, 4, 5)),
    ...     dims=["time", "lat", "lon"],
    ... )
    >>> xrbig = XarrayRBIG(n_layers=10, random_state=0)
    >>> # info = xrbig.fit(da)
    >>> # da_t = xrbig.transform(da)
    """

    def __init__(
        self,
        n_layers: int = 100,
        strategy: list | None = None,
        tol: float = 1e-5,
        random_state: int | None = None,
        rbig_class=None,
        rbig_kwargs: dict | None = None,
        verbose: bool | int = False,
    ):
        self.n_layers = n_layers
        self.strategy = strategy
        self.tol = tol
        self.random_state = random_state
        self.rbig_class = rbig_class
        self.rbig_kwargs = rbig_kwargs or {}
        self.verbose = verbose

    def fit(self, X) -> dict:
        """Fit the RBIG model to xarray data and return an information summary.

        Parameters
        ----------
        X : xr.DataArray or xr.Dataset
            Input spatiotemporal data.  Internally converted to a 2-D matrix
            via :func:`xr_st_to_matrix`.

        Returns
        -------
        info : dict
            Dictionary of RBIG information metrics (e.g. total correlation,
            entropy estimates) as returned by
            :func:`~rbig._src.metrics.information_summary`.
        """
        from rbig._src.metrics import information_summary
        from rbig._src.model import AnnealedRBIG

        rbig_cls = self.rbig_class if self.rbig_class is not None else AnnealedRBIG
        kwargs = {
            "n_layers": self.n_layers,
            "tol": self.tol,
            "random_state": self.random_state,
        }
        if self.strategy is not None:
            kwargs["strategy"] = self.strategy
        kwargs.update(self.rbig_kwargs)
        kwargs["verbose"] = self.verbose

        # Convert xarray → (n_samples, n_features) matrix and store metadata
        matrix, self.meta_ = xr_st_to_matrix(X)
        self.model_ = rbig_cls(**kwargs)
        self.model_.fit(matrix)
        return information_summary(self.model_, matrix)

    def transform(self, X):
        """Gaussianise samples and return an xarray object.

        Applies the fitted RBIG transform to ``X``, then reconstructs the
        original xarray structure.  Original coordinates and DataArray name
        are re-attached when possible.

        Parameters
        ----------
        X : xr.DataArray or xr.Dataset
            Data to transform.  Must have the same structure as the data
            passed to :meth:`fit`.

        Returns
        -------
        out : xr.DataArray or xr.Dataset
            Gaussianised data with the same shape and dimension names as ``X``.
        """
        matrix, _ = xr_st_to_matrix(X)
        Xt = self.model_.transform(matrix)
        out = matrix_to_xr_st(Xt, self.meta_)
        # Re-attach original xarray coordinates and name when available
        if hasattr(X, "assign_coords") and hasattr(X, "coords"):
            try:
                out = out.assign_coords(X.coords)
            except Exception:
                pass
        if hasattr(X, "name") and hasattr(out, "name") and X.name is not None:
            try:
                out.name = X.name
            except Exception:
                pass
        return out

    def score_samples(self, X):
        """Compute per-sample log-probability log p(x).

        Parameters
        ----------
        X : xr.DataArray or xr.Dataset
            Input data.

        Returns
        -------
        log_prob : np.ndarray, shape ``(n_samples,)``
            Log-probability of each sample under the fitted RBIG model.
        """
        matrix, _ = xr_st_to_matrix(X)
        return self.model_.score_samples(matrix)

    def mutual_information(self, X, Y) -> float:
        """Estimate mutual information between two xarray variables via RBIG.

        Fits independent RBIG models to ``X``, ``Y``, and their concatenation
        ``[X, Y]``, then computes:

        .. math::

            \\mathrm{MI}(X;\\,Y)
            = H(X) + H(Y) - H(X,\\,Y)

        where each differential entropy :math:`H` is estimated from the RBIG
        log-determinant accumulation.

        Parameters
        ----------
        X : xr.DataArray or xr.Dataset
            First variable.
        Y : xr.DataArray or xr.Dataset
            Second variable.  Must have the same number of samples as ``X``
            after flattening.

        Returns
        -------
        mi : float
            Estimated mutual information in nats.

        Notes
        -----
        All three RBIG models share the same ``n_layers``, ``tol``, and
        ``random_state`` settings as the parent :class:`XarrayRBIG` instance.

        Examples
        --------
        >>> import numpy as np
        >>> import xarray as xr
        >>> rng = np.random.default_rng(0)
        >>> x = xr.DataArray(rng.standard_normal((50, 1)), dims=["time", "f"])
        >>> y = xr.DataArray(rng.standard_normal((50, 1)), dims=["time", "f"])
        >>> xrbig = XarrayRBIG(n_layers=5, random_state=0)
        >>> # mi = xrbig.mutual_information(x, y)
        """
        from rbig._src.metrics import entropy_rbig
        from rbig._src.model import AnnealedRBIG

        rbig_cls = self.rbig_class if self.rbig_class is not None else AnnealedRBIG
        kwargs = {
            "n_layers": self.n_layers,
            "tol": self.tol,
            "random_state": self.random_state,
        }
        kwargs.update(self.rbig_kwargs)

        # Flatten both variables to 2-D matrices
        X_mat, _ = xr_st_to_matrix(X)
        Y_mat, _ = xr_st_to_matrix(Y)
        XY_mat = np.hstack([X_mat, Y_mat])  # joint representation (n, dx + dy)

        # Fit three separate RBIG models for H(X), H(Y), H(X,Y)
        mx = rbig_cls(**kwargs).fit(X_mat)
        my = rbig_cls(**kwargs).fit(Y_mat)
        mxy = rbig_cls(**kwargs).fit(XY_mat)

        hx = entropy_rbig(mx, X_mat)  # H(X)
        hy = entropy_rbig(my, Y_mat)  # H(Y)
        hxy = entropy_rbig(mxy, XY_mat)  # H(X, Y)
        # MI(X;Y) = H(X) + H(Y) - H(X,Y)
        return float(hx + hy - hxy)

fit(X)

Fit the RBIG model to xarray data and return an information summary.

Parameters

X : xr.DataArray or xr.Dataset
    Input spatiotemporal data. Internally converted to a 2-D matrix via xr_st_to_matrix.

Returns

info : dict
    Dictionary of RBIG information metrics (e.g. total correlation, entropy estimates) as returned by information_summary.

Source code in rbig/_src/xarray_st.py
def fit(self, X) -> dict:
    """Fit the RBIG model to xarray data and return an information summary.

    Parameters
    ----------
    X : xr.DataArray or xr.Dataset
        Input spatiotemporal data.  Internally converted to a 2-D matrix
        via :func:`xr_st_to_matrix`.

    Returns
    -------
    info : dict
        Dictionary of RBIG information metrics (e.g. total correlation,
        entropy estimates) as returned by
        :func:`~rbig._src.metrics.information_summary`.
    """
    from rbig._src.metrics import information_summary
    from rbig._src.model import AnnealedRBIG

    rbig_cls = self.rbig_class if self.rbig_class is not None else AnnealedRBIG
    kwargs = {
        "n_layers": self.n_layers,
        "tol": self.tol,
        "random_state": self.random_state,
    }
    if self.strategy is not None:
        kwargs["strategy"] = self.strategy
    kwargs.update(self.rbig_kwargs)
    kwargs["verbose"] = self.verbose

    # Convert xarray → (n_samples, n_features) matrix and store metadata
    matrix, self.meta_ = xr_st_to_matrix(X)
    self.model_ = rbig_cls(**kwargs)
    self.model_.fit(matrix)
    return information_summary(self.model_, matrix)

transform(X)

Gaussianise samples and return an xarray object.

Applies the fitted RBIG transform to X, then reconstructs the original xarray structure. Original coordinates and DataArray name are re-attached when possible.

Parameters

X : xr.DataArray or xr.Dataset
    Data to transform. Must have the same structure as the data passed to fit.

Returns

out : xr.DataArray or xr.Dataset
    Gaussianised data with the same shape and dimension names as X.

Source code in rbig/_src/xarray_st.py
def transform(self, X):
    """Gaussianise samples and return an xarray object.

    Applies the fitted RBIG transform to ``X``, then reconstructs the
    original xarray structure.  Original coordinates and DataArray name
    are re-attached when possible.

    Parameters
    ----------
    X : xr.DataArray or xr.Dataset
        Data to transform.  Must have the same structure as the data
        passed to :meth:`fit`.

    Returns
    -------
    out : xr.DataArray or xr.Dataset
        Gaussianised data with the same shape and dimension names as ``X``.
    """
    matrix, _ = xr_st_to_matrix(X)
    Xt = self.model_.transform(matrix)
    out = matrix_to_xr_st(Xt, self.meta_)
    # Re-attach original xarray coordinates and name when available
    if hasattr(X, "assign_coords") and hasattr(X, "coords"):
        try:
            out = out.assign_coords(X.coords)
        except Exception:
            pass
    if hasattr(X, "name") and hasattr(out, "name") and X.name is not None:
        try:
            out.name = X.name
        except Exception:
            pass
    return out

score_samples(X)

Compute per-sample log-probability log p(x).

Parameters

X : xr.DataArray or xr.Dataset
    Input data.

Returns

log_prob : np.ndarray, shape (n_samples,)
    Log-probability of each sample under the fitted RBIG model.
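This is the change-of-variables formula from the model overview, log p(x) = log p_Z(f(x)) + log|det J_f(x)|. A minimal NumPy sketch of that bookkeeping, assuming the Gaussianised samples z = f(x) and the accumulated per-sample log-determinants are already available (which is what the fitted layers provide):

```python
import numpy as np

def log_prob_from_flow(z, log_det):
    """log p(x) = log N(z; 0, I) + log|det J_f(x)|, per sample.

    z       : (N, D) Gaussianised samples, z = f(x)
    log_det : (N,)   accumulated log|det J| over all fitted layers
    """
    d = z.shape[1]
    # Standard multivariate normal log-density of each row of z.
    log_pz = -0.5 * (z**2).sum(axis=1) - 0.5 * d * np.log(2 * np.pi)
    return log_pz + log_det

# Sanity check: z = 0 with an identity flow (log_det = 0) gives the mode
# density of a standard 2-D Gaussian, i.e. log(1 / (2*pi)).
lp = log_prob_from_flow(np.zeros((1, 2)), np.zeros(1))
assert np.isclose(lp[0], -np.log(2 * np.pi))
```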

Source code in rbig/_src/xarray_st.py
def score_samples(self, X):
    """Compute per-sample log-probability log p(x).

    Parameters
    ----------
    X : xr.DataArray or xr.Dataset
        Input data.

    Returns
    -------
    log_prob : np.ndarray, shape ``(n_samples,)``
        Log-probability of each sample under the fitted RBIG model.
    """
    matrix, _ = xr_st_to_matrix(X)
    return self.model_.score_samples(matrix)

mutual_information(X, Y)

Estimate mutual information between two xarray variables via RBIG.

Fits independent RBIG models to X, Y, and their concatenation [X, Y], then computes

    MI(X; Y) = H(X) + H(Y) − H(X, Y)

where each differential entropy H is estimated from the RBIG log-determinant accumulation.

Parameters

X : xr.DataArray or xr.Dataset
    First variable.
Y : xr.DataArray or xr.Dataset
    Second variable. Must have the same number of samples as X after flattening.

Returns

mi : float
    Estimated mutual information in nats.

Notes

All three RBIG models share the same n_layers, tol, and random_state settings as the parent XarrayRBIG instance.
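As a sanity check of the identity itself (independent of RBIG), the Gaussian closed form can be recovered from analytic entropies: for unit-variance jointly Gaussian X and Y with correlation ρ, MI = −½ ln(1 − ρ²). A NumPy sketch using the analytic Gaussian entropy formula rather than the RBIG estimators:

```python
import numpy as np

def gaussian_entropy(cov):
    """Differential entropy (nats) of N(0, cov): 0.5 * ln((2*pi*e)^d |cov|)."""
    cov = np.atleast_2d(cov)
    d = cov.shape[0]
    return 0.5 * (d * np.log(2 * np.pi * np.e) + np.log(np.linalg.det(cov)))

rho = 0.8
joint = np.array([[1.0, rho], [rho, 1.0]])

hx = gaussian_entropy([[1.0]])  # H(X)
hy = gaussian_entropy([[1.0]])  # H(Y)
hxy = gaussian_entropy(joint)   # H(X, Y)
mi = hx + hy - hxy              # MI(X; Y) = H(X) + H(Y) - H(X, Y)

# Closed form for a correlated Gaussian pair: -0.5 * ln(1 - rho**2).
assert np.isclose(mi, -0.5 * np.log(1 - rho**2))
```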

Examples

>>> import numpy as np
>>> import xarray as xr
>>> rng = np.random.default_rng(0)
>>> x = xr.DataArray(rng.standard_normal((50, 1)), dims=["time", "f"])
>>> y = xr.DataArray(rng.standard_normal((50, 1)), dims=["time", "f"])
>>> xrbig = XarrayRBIG(n_layers=5, random_state=0)
>>> # mi = xrbig.mutual_information(x, y)

Source code in rbig/_src/xarray_st.py
def mutual_information(self, X, Y) -> float:
    """Estimate mutual information between two xarray variables via RBIG.

    Fits independent RBIG models to ``X``, ``Y``, and their concatenation
    ``[X, Y]``, then computes:

    .. math::

        \\mathrm{MI}(X;\\,Y)
        = H(X) + H(Y) - H(X,\\,Y)

    where each differential entropy :math:`H` is estimated from the RBIG
    log-determinant accumulation.

    Parameters
    ----------
    X : xr.DataArray or xr.Dataset
        First variable.
    Y : xr.DataArray or xr.Dataset
        Second variable.  Must have the same number of samples as ``X``
        after flattening.

    Returns
    -------
    mi : float
        Estimated mutual information in nats.

    Notes
    -----
    All three RBIG models share the same ``n_layers``, ``tol``, and
    ``random_state`` settings as the parent :class:`XarrayRBIG` instance.

    Examples
    --------
    >>> import numpy as np
    >>> import xarray as xr
    >>> rng = np.random.default_rng(0)
    >>> x = xr.DataArray(rng.standard_normal((50, 1)), dims=["time", "f"])
    >>> y = xr.DataArray(rng.standard_normal((50, 1)), dims=["time", "f"])
    >>> xrbig = XarrayRBIG(n_layers=5, random_state=0)
    >>> # mi = xrbig.mutual_information(x, y)
    """
    from rbig._src.metrics import entropy_rbig
    from rbig._src.model import AnnealedRBIG

    rbig_cls = self.rbig_class if self.rbig_class is not None else AnnealedRBIG
    kwargs = {
        "n_layers": self.n_layers,
        "tol": self.tol,
        "random_state": self.random_state,
    }
    kwargs.update(self.rbig_kwargs)

    # Flatten both variables to 2-D matrices
    X_mat, _ = xr_st_to_matrix(X)
    Y_mat, _ = xr_st_to_matrix(Y)
    XY_mat = np.hstack([X_mat, Y_mat])  # joint representation (n, dx + dy)

    # Fit three separate RBIG models for H(X), H(Y), H(X,Y)
    mx = rbig_cls(**kwargs).fit(X_mat)
    my = rbig_cls(**kwargs).fit(Y_mat)
    mxy = rbig_cls(**kwargs).fit(XY_mat)

    hx = entropy_rbig(mx, X_mat)  # H(X)
    hy = entropy_rbig(my, Y_mat)  # H(Y)
    hxy = entropy_rbig(mxy, XY_mat)  # H(X, Y)
    # MI(X;Y) = H(X) + H(Y) - H(X,Y)
    return float(hx + hy - hxy)