
API Reference

RBIG: Rotation-Based Iterative Gaussianization.

AnnealedRBIG

Rotation-Based Iterative Gaussianization (RBIG).

RBIG is a density estimation and data transformation method that iteratively Gaussianizes multivariate data by alternating between:

  1. Marginal Gaussianization: mapping each feature to a Gaussian using its empirical CDF and the probit transform.
  2. Rotation: applying an orthogonal matrix (PCA or ICA) to de-correlate the Gaussianized features.
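
These two steps can be sketched with plain NumPy/SciPy. This is a minimal illustration, not the library's `RBIGLayer` implementation; the empirical-CDF handling and whitening details are simplified here:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
X = rng.exponential(size=(500, 3))  # strongly non-Gaussian data

# Step 1: marginal Gaussianization via empirical CDF + probit.
# Ranks give an empirical CDF in (0, 1); the probit (inverse Gaussian
# CDF) then makes each marginal standard normal.
ranks = stats.rankdata(X, axis=0)
u = ranks / (X.shape[0] + 1)        # keep u strictly inside (0, 1)
G = stats.norm.ppf(u)               # marginally ~ N(0, 1)

# Step 2: rotate by an orthogonal matrix (here: eigenvectors of the
# sample covariance, i.e. an unwhitened PCA rotation).
_, eigvecs = np.linalg.eigh(np.cov(G, rowvar=False))
Z = G @ eigvecs                     # de-correlated features

off_diag = np.corrcoef(Z, rowvar=False) - np.eye(3)
print(np.abs(off_diag).max())       # near machine precision
```

One such layer removes linear correlation; iterating the pair also removes higher-order dependence, which is what drives TC toward zero.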

The process repeats until the total correlation (TC) of the transformed data converges. After fitting, the model represents a normalizing flow whose density is given by the change-of-variables formula:

log p(x) = log p_Z(f(x)) + log|det J_f(x)|

where f is the composition of all fitted layers and p_Z is a standard multivariate Gaussian.
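
As a concrete check of the formula, consider a toy one-layer flow f(x) = (x − μ)/σ, whose log-det-Jacobian is −log σ; the change-of-variables density then recovers the N(μ, σ²) density exactly:

```python
import numpy as np
from scipy import stats

mu, sigma = 2.0, 0.5
x = np.linspace(-1.0, 5.0, 7)

log_pz = stats.norm.logpdf((x - mu) / sigma)  # log p_Z(f(x))
log_det = -np.log(sigma)                      # log|det J_f(x)| = log(1/sigma)
log_p = log_pz + log_det

# Agrees with the closed-form N(mu, sigma^2) log-density:
print(np.allclose(log_p, stats.norm.logpdf(x, loc=mu, scale=sigma)))  # True
```

RBIG applies the same identity with f being the full stack of fitted layers, accumulating the per-layer log-det-Jacobians.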

Parameters

n_layers : int, default=100
    Maximum number of RBIG layers to apply. Early stopping via zero_tolerance may halt training before this limit.
rotation : str, default="pca"
    Rotation method: "pca" (PCA with whitening), "ica" (Independent Component Analysis), or "random" (Haar-distributed orthogonal rotation).
zero_tolerance : int, default=60
    Number of consecutive layers showing a TC change smaller than tol before training stops early.
tol : float or "auto", default=1e-5
    Convergence threshold for the per-layer change in total correlation: |TC(k) − TC(k−1)| < tol. When set to "auto", the tolerance is chosen adaptively based on the number of training samples using an empirically calibrated lookup table.
random_state : int or None, default=None
    Seed for the random number generator used by stochastic components such as ICA or random rotations.
strategy : list or None, default=None
    Optional per-layer override list. Each entry may be a string (rotation name) or a (rotation_name, marginal_name) pair. Entries cycle if the list is shorter than n_layers.
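
The "auto" tolerance can be previewed directly; the sketch below reuses the lookup table from `_get_information_tolerance` in the source listing:

```python
import numpy as np
from scipy.interpolate import interp1d

# Sample counts 1e2 .. 1e8 mapped to empirically calibrated TC-change
# thresholds (table copied from _get_information_tolerance below).
xxx = np.logspace(2, 8, 7)
yyy = [0.1571, 0.0468, 0.0145, 0.0046, 0.0014, 0.0001, 0.00001]
tol_of_n = interp1d(xxx, yyy, fill_value="extrapolate")

print(float(tol_of_n(300)))  # tolerance a 300-sample dataset would get
```

Larger datasets resolve finer TC changes, so the threshold shrinks with sample count.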

Attributes

n_features_in_ : int
    Number of features seen during fit.
layers_ : list of RBIGLayer
    Fitted RBIG layers in application order.
tc_per_layer_ : list of float
    Total correlation of the data after each fitted layer.
log_det_train_ : np.ndarray of shape (n_samples,)
    Accumulated per-sample log-det-Jacobian over all layers, computed on the training data during fit.
X_transformed_ : np.ndarray of shape (n_samples, n_features)
    Training data after passing through all fitted layers.

Notes

Total correlation is defined as:

TC(X) = ∑ᵢ H(Xᵢ) − H(X)

where H(Xᵢ) is the marginal entropy of the i-th feature and H(X) is the joint entropy. For a fully Gaussianized, independent dataset, TC = 0.
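
For a multivariate Gaussian, TC has a closed form, TC = −½ log det(corr), which makes the definition easy to sanity-check. This is a sketch of that special case; RBIG itself estimates TC from entropies of the data:

```python
import numpy as np

def gaussian_tc(cov: np.ndarray) -> float:
    """TC of a multivariate Gaussian: 0.5 * (sum_i log var_i - log det cov)."""
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * (np.sum(np.log(np.diag(cov))) - logdet)

cov = np.array([[1.0, 0.8],
                [0.8, 1.0]])
print(round(gaussian_tc(cov), 4))  # 0.5108: dependent features, TC > 0
print(gaussian_tc(np.eye(2)))      # 0.0: independent features
```

As the layers strip away dependence, tc_per_layer_ should decay toward this zero baseline.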

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: From ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537–549. https://doi.org/10.1109/TNN.2011.2106511

Examples

>>> import numpy as np
>>> from rbig._src.model import AnnealedRBIG
>>> rng = np.random.default_rng(42)
>>> X = rng.standard_normal((300, 4))
>>> model = AnnealedRBIG(n_layers=20, rotation="pca")
>>> model.fit(X)
<rbig._src.model.AnnealedRBIG object at ...>
>>> Z = model.transform(X)
>>> Z.shape
(300, 4)
>>> model.score(X)  # mean log-likelihood in nats
-5.65...

Source code in rbig/_src/model.py
class AnnealedRBIG:
    """Rotation-Based Iterative Gaussianization (RBIG).

    RBIG is a density estimation and data transformation method that
    iteratively Gaussianizes multivariate data by alternating between:

    1. **Marginal Gaussianization**: mapping each feature to a Gaussian
       using its empirical CDF and the probit transform.
    2. **Rotation**: applying an orthogonal matrix (PCA or ICA) to
       de-correlate the Gaussianized features.

    The process repeats until the total correlation (TC) of the
    transformed data converges.  After fitting, the model represents a
    normalizing flow whose density is given by the change-of-variables
    formula:

        log p(x) = log p_Z(f(x)) + log|det J_f(x)|

    where ``f`` is the composition of all fitted layers and ``p_Z`` is a
    standard multivariate Gaussian.

    Parameters
    ----------
    n_layers : int, default=100
        Maximum number of RBIG layers to apply.  Early stopping via
        ``zero_tolerance`` may halt training before this limit.
    rotation : str, default="pca"
        Rotation method: ``"pca"`` (PCA with whitening), ``"ica"``
        (Independent Component Analysis), or ``"random"`` (Haar-distributed
        orthogonal rotation).
    zero_tolerance : int, default=60
        Number of consecutive layers showing a TC change smaller than
        ``tol`` before training stops early.
    tol : float or "auto", default=1e-5
        Convergence threshold for the per-layer change in total correlation:
        ``|TC(k) − TC(k−1)| < tol``.  When set to ``"auto"``, the tolerance
        is chosen adaptively based on the number of training samples using
        an empirically calibrated lookup table.
    random_state : int or None, default=None
        Seed for the random number generator used by stochastic components
        such as ICA or random rotations.
    strategy : list or None, default=None
        Optional per-layer override list.  Each entry may be a string
        (rotation name) or a ``(rotation_name, marginal_name)`` pair.
        Entries cycle if the list is shorter than ``n_layers``.

    Attributes
    ----------
    n_features_in_ : int
        Number of features seen during ``fit``.
    layers_ : list of RBIGLayer
        Fitted RBIG layers in application order.
    tc_per_layer_ : list of float
        Total correlation of the data after each fitted layer.
    log_det_train_ : np.ndarray of shape (n_samples,)
        Accumulated per-sample log-det-Jacobian over all layers,
        computed on the training data during ``fit``.
    X_transformed_ : np.ndarray of shape (n_samples, n_features)
        Training data after passing through all fitted layers.

    Notes
    -----
    Total correlation is defined as:

        TC(X) = ∑ᵢ H(Xᵢ) − H(X)

    where H(Xᵢ) is the marginal entropy of the i-th feature and H(X) is
    the joint entropy.  For a fully Gaussianized, independent dataset,
    TC = 0.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    From ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537–549. https://doi.org/10.1109/TNN.2011.2106511

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.model import AnnealedRBIG
    >>> rng = np.random.default_rng(42)
    >>> X = rng.standard_normal((300, 4))
    >>> model = AnnealedRBIG(n_layers=20, rotation="pca")
    >>> model.fit(X)
    <rbig._src.model.AnnealedRBIG object at ...>
    >>> Z = model.transform(X)
    >>> Z.shape
    (300, 4)
    >>> model.score(X)  # mean log-likelihood in nats
    -5.65...
    """

    def __init__(
        self,
        n_layers: int = 100,
        rotation: str = "pca",
        zero_tolerance: int = 60,
        tol: float | str = 1e-5,
        random_state: int | None = None,
        strategy: list | None = None,
    ):
        self.n_layers = n_layers
        self.rotation = rotation
        self.zero_tolerance = zero_tolerance
        self.tol = tol
        self.random_state = random_state
        self.strategy = strategy

    def fit(self, X: np.ndarray) -> AnnealedRBIG:
        """Fit the RBIG model by iteratively Gaussianizing X.

        At each layer k the algorithm:

        1. Builds a new :class:`RBIGLayer` with the configured marginal and
           rotation transforms.
        2. Fits the layer on the current working copy ``Xt``.
        3. Accumulates the per-sample log-det-Jacobian:
           ``log_det_train_ += log|det J_k(Xt)|``.
        4. Advances ``Xt`` through the layer: ``Xt = f_k(Xt)``.
        5. Measures residual total correlation: ``TC(Xt) = ∑ᵢ H(Xᵢ) − H(X)``.
        6. Stops early when TC has not changed by more than ``tol`` for
           ``zero_tolerance`` consecutive layers.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.

        Returns
        -------
        self : AnnealedRBIG
            The fitted model.
        """
        n_samples, n_features = X.shape
        self.n_features_in_ = n_features  # remember input dimensionality
        self.layers_: list[RBIGLayer] = []
        self.tc_per_layer_: list[float] = []

        # Validate and resolve tolerance
        if self.tol == "auto":
            tol = self._get_information_tolerance(n_samples)
        elif isinstance(self.tol, int | float):
            tol = float(self.tol)
        else:
            raise ValueError(f"tol must be a float or 'auto', got {self.tol!r}")
        self.tol_: float = tol  # store resolved tolerance for inspection

        Xt = X.copy()  # working copy; shape (n_samples, n_features)
        self.log_det_train_ = np.zeros(
            n_samples
        )  # accumulated log|det J|; shape (n_samples,)
        zero_count = 0  # consecutive non-improving layer counter

        for i in range(self.n_layers):
            # Build layer i with the appropriate marginal and rotation components
            layer = RBIGLayer(
                marginal=self._make_marginal(layer_index=i),
                rotation=self._make_rotation(layer_index=i),
            )
            layer.fit(Xt)
            # Accumulate log|det J_i(Xt)| before advancing Xt
            self.log_det_train_ += layer.log_det_jacobian(Xt)
            Xt = layer.transform(Xt)  # advance to next representation
            self.layers_.append(layer)

            # Measure residual total correlation: TC = sum_i H(Xi) - H(X)
            tc = self._total_correlation(Xt)
            self.tc_per_layer_.append(tc)

            if i > 0:
                # Check convergence: how much did TC improve this layer?
                delta = abs(self.tc_per_layer_[-2] - tc)
                if delta < tol:
                    zero_count += 1
                else:
                    zero_count = 0  # reset on any significant improvement

            # Stop early if TC has been flat for zero_tolerance consecutive layers
            if zero_count >= self.zero_tolerance:
                break

        # Store the fully transformed training data for efficient entropy estimation
        self.X_transformed_ = Xt  # shape (n_samples, n_features)
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Map X to the Gaussian latent space through all fitted layers.

        Applies each fitted :class:`RBIGLayer` in order:
        ``Z = fₖ(… f₂(f₁(x)) …)``.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data.

        Returns
        -------
        Z : np.ndarray of shape (n_samples, n_features)
            Data in the approximately Gaussian latent space.
        """
        Xt = X.copy()  # shape (n_samples, n_features)
        for layer in self.layers_:
            Xt = layer.transform(Xt)
        return Xt

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Map latent-space data back to the original input space.

        Applies layers in reverse order:
        ``x = f₁⁻¹(… fₖ₋₁⁻¹(fₖ⁻¹(z)) …)``.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in the latent (approximately Gaussian) space.

        Returns
        -------
        Xr : np.ndarray of shape (n_samples, n_features)
            Data recovered in the original input space.
        """
        Xt = X.copy()  # shape (n_samples, n_features)
        for layer in reversed(self.layers_):
            Xt = layer.inverse_transform(Xt)
        return Xt

    def fit_transform(self, X: np.ndarray) -> np.ndarray:
        """Fit the model to X and return the latent-space representation.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.

        Returns
        -------
        Z : np.ndarray of shape (n_samples, n_features)
            Transformed data in the latent space.
        """
        return self.fit(X).transform(X)

    def score_samples(self, X: np.ndarray) -> np.ndarray:
        """Per-sample log-likelihood under the fitted density model.

        Uses the change-of-variables formula for normalizing flows:

            log p(x) = log p_Z(f(x)) + log|det J_f(x)|

        where ``p_Z = 𝒩(0, I)`` is the standard Gaussian base density,
        ``f`` is the composition of all fitted layers, and ``J_f(x)`` is
        the Jacobian of ``f`` at ``x``.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data points at which to evaluate the log-likelihood.

        Returns
        -------
        log_prob : np.ndarray of shape (n_samples,)
            Per-sample log-likelihood in nats.

        Notes
        -----
        The log-det-Jacobian is accumulated layer by layer to avoid
        recomputing intermediate representations:

            log|det J_f(x)| = ∑ₖ log|det J_fₖ(xₖ₋₁)|
        """
        Xt = X.copy()  # shape (n_samples, n_features)
        log_det_jac = np.zeros(X.shape[0])  # accumulator; shape (n_samples,)
        for layer in self.layers_:
            # Accumulate log|det Jₖ| before advancing through layer k
            log_det_jac += layer.log_det_jacobian(Xt)
            Xt = layer.transform(Xt)  # xₖ = fₖ(xₖ₋₁)
        # log p_Z(z) = sum_i log N(z_i; 0, 1); shape (n_samples,)
        log_pz = np.sum(stats.norm.logpdf(Xt), axis=1)
        # change-of-variables: log p(x) = log p_Z(f(x)) + log|det J_f(x)|
        return log_pz + log_det_jac

    def score(self, X: np.ndarray) -> float:
        """Mean log-likelihood of samples X under the fitted density.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data points to evaluate.

        Returns
        -------
        mean_log_prob : float
            Average per-sample log-likelihood in nats.
        """
        return float(np.mean(self.score_samples(X)))

    def entropy(self) -> float:
        """Differential entropy of the fitted distribution in nats.

        Estimated from the training data using:

            H(X) = −𝔼_X[log p(x)]

        The expectation is approximated by the sample mean over the training
        set.  The log-likelihoods are obtained via the efficient cached path
        :meth:`score_samples_raw_` which reuses pre-computed quantities from
        ``fit``.

        Returns
        -------
        h : float
            Estimated entropy in nats.  Unlike discrete entropy,
            differential entropy may be negative.

        Notes
        -----
        This is equivalent to ``-self.score(X_train)`` but avoids the cost
        of re-passing training data through all layers.
        """
        return float(-np.mean(self.score_samples_raw_()))

    def score_samples_raw_(self) -> np.ndarray:
        """Log-likelihood for the stored training data without recomputing layers.

        Reuses ``X_transformed_`` and ``log_det_train_`` cached during
        :meth:`fit`, so the cost is a single Gaussian log-pdf evaluation
        rather than a full forward pass through all layers.

        Returns
        -------
        log_prob : np.ndarray of shape (n_samples,)
            Per-sample log-likelihood of the training data in nats.
        """
        # log p_Z evaluated at the pre-computed transformed training data
        log_pz = np.sum(
            stats.norm.logpdf(self.X_transformed_), axis=1
        )  # shape (n_samples,)
        # add the accumulated log-det-Jacobian stored during fit
        return log_pz + self.log_det_train_

    def sample(self, n_samples: int, random_state: int | None = None) -> np.ndarray:
        """Generate samples from the learned distribution.

        Draws i.i.d. standard Gaussian samples in the latent space and maps
        them back to the data space via the inverse normalizing flow.

        Parameters
        ----------
        n_samples : int
            Number of samples to generate.
        random_state : int or None, optional
            Seed for the random number generator.  If ``None``, a random
            seed is used.

        Returns
        -------
        X_new : np.ndarray of shape (n_samples, n_features_in_)
            Samples in the original data space.
        """
        rng = np.random.default_rng(random_state)
        Z = rng.standard_normal((n_samples, self.n_features_in_))  # latent samples
        return self.inverse_transform(Z)

    def predict_proba(
        self,
        X: np.ndarray,
        domain: str = "input",
    ) -> np.ndarray | tuple[np.ndarray, np.ndarray]:
        """Return probability density estimates for X.

        Uses the change-of-variables formula via the full Jacobian matrix
        to compute the density in the requested domain.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data points to evaluate.
        domain : str, default="input"
            Which domain to return densities in:

            - ``"input"`` — density in the original data space:
              ``p(x) = p_Z(f(x)) · |det J_f(x)|``
            - ``"transform"`` — density in the Gaussian latent space:
              ``p_Z(f(x)) = ∏ᵢ φ(fᵢ(x))``
            - ``"both"`` — returns a tuple ``(p_input, p_transform)``

        Returns
        -------
        proba : np.ndarray of shape (n_samples,) or tuple
            Probability density estimates.  When ``domain="both"``, returns
            ``(p_input, p_transform)``.
        """
        jac, Xt = self.jacobian(X, return_X_transform=True)

        # Work in log-space for numerical stability
        log_p_transform = np.sum(stats.norm.logpdf(Xt), axis=1)

        if domain == "transform":
            p_transform = np.exp(log_p_transform)
            p_transform = np.where(np.isfinite(p_transform), p_transform, 0.0)
            return p_transform

        # Input-domain density via change of variables (log-space)
        _sign, log_abs_det = np.linalg.slogdet(jac)
        log_p_input = log_p_transform + log_abs_det
        p_input = np.exp(log_p_input)
        p_input = np.where(np.isfinite(p_input), p_input, 0.0)

        if domain == "input":
            return p_input
        if domain == "both":
            p_transform = np.exp(log_p_transform)
            p_transform = np.where(np.isfinite(p_transform), p_transform, 0.0)
            return p_input, p_transform
        raise ValueError(
            f"Unknown domain: {domain!r}. Use 'input', 'transform', or 'both'."
        )

    def jacobian(
        self,
        X: np.ndarray,
        return_X_transform: bool = False,
    ) -> np.ndarray | tuple[np.ndarray, np.ndarray]:
        """Compute the full Jacobian matrix of the RBIG transform.

        For each sample, returns the ``(n_features, n_features)`` Jacobian
        matrix ``df/dx`` of the composition of all fitted layers.  Uses the
        seed-dimension approach from the legacy implementation: for each input
        dimension ``idim``, a unit vector is propagated through the chain of
        per-feature marginal derivatives and rotation matrices.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data at which to evaluate the Jacobian.
        return_X_transform : bool, default False
            If True, also return the fully transformed data ``f(X)`` (computed
            as a side-effect of the Jacobian calculation).

        Returns
        -------
        jac : np.ndarray of shape (n_samples, n_features, n_features)
            Full Jacobian matrix per sample.  ``jac[n, i, j]`` is the partial
            derivative ``df_i/dx_j`` for the n-th sample.
        X_transformed : np.ndarray of shape (n_samples, n_features)
            Only returned when ``return_X_transform=True``.  The data after
            passing through all layers.
        """
        n_samples, n_features = X.shape

        # ── Forward pass: collect per-layer derivatives and rotation matrices ──
        derivs_per_layer = []  # each: (n_samples, n_features)
        rotmats_per_layer = []  # each: (n_features, n_features)

        Xt = X.copy()
        for layer in self.layers_:
            if not hasattr(layer.marginal, "_per_feature_log_deriv"):
                raise NotImplementedError(
                    f"Jacobian computation requires a marginal with "
                    f"_per_feature_log_deriv(); "
                    f"{type(layer.marginal).__name__} does not support this."
                )
            # Per-feature marginal derivatives and transformed data in one pass
            log_d, Xt_marginal = layer.marginal._per_feature_log_deriv(
                Xt, return_transform=True
            )
            derivs_per_layer.append(np.exp(log_d))

            # Rotation matrix in row-vector convention: y = z @ R
            rot = self._extract_rotation_matrix(layer.rotation)
            rotmats_per_layer.append(rot)

            # Advance through rotation only
            Xt = layer.rotation.transform(Xt_marginal)

        # ── Seed-dimension loop: propagate unit vectors through the chain ──
        jac = np.zeros((n_samples, n_features, n_features))

        for idim in range(n_features):
            # Initialize seed: unit vector in dimension idim
            XX = np.zeros((n_samples, n_features))
            XX[:, idim] = 1.0

            for derivs, R in zip(derivs_per_layer, rotmats_per_layer, strict=True):
                # Chain rule: XX_new = diag(derivs) @ XX @ R
                XX = (derivs * XX) @ R

            jac[:, :, idim] = XX

        if return_X_transform:
            return jac, Xt
        return jac

    @staticmethod
    def _extract_rotation_matrix(rotation) -> np.ndarray:
        """Extract the effective rotation matrix in row-vector convention.

        For PCA with whitening the effective matrix is
        ``components_.T / sqrt(explained_variance_)`` so that
        ``y = (x - mu) @ R``.

        Parameters
        ----------
        rotation : BaseTransform
            A fitted rotation object (PCARotation, ICARotation, etc.).

        Returns
        -------
        R : np.ndarray of shape (n_features, n_features)
            Rotation matrix such that ``y = x @ R`` (ignoring mean shift).
        """
        from rbig._src.rotation import ICARotation, PCARotation

        if isinstance(rotation, PCARotation):
            R = rotation.pca_.components_.T.copy()
            if rotation.whiten:
                R /= np.sqrt(rotation.pca_.explained_variance_)[np.newaxis, :]
            return R

        if isinstance(rotation, ICARotation):
            # ICA unmixing: W = components_, transform is x @ W.T
            if hasattr(rotation, "K_") and rotation.K_ is not None:
                # Picard path: y = (x @ K.T) @ W.T
                return rotation.K_.T @ rotation.W_.T
            return rotation.ica_.components_.T.copy()

        # Generic fallback: try to get rotation_matrix_ attribute.
        # These rotations apply X @ rotation_matrix_.T, so transpose
        # to match the y = x @ R convention used by PCA/ICA above.
        if hasattr(rotation, "rotation_matrix_"):
            return rotation.rotation_matrix_.T.copy()

        raise TypeError(
            f"Cannot extract rotation matrix from {type(rotation).__name__}. "
            f"Jacobian computation requires PCARotation, ICARotation, or an "
            f"object with a rotation_matrix_ attribute."
        )

    def _make_rotation(self, layer_index: int = 0):
        """Instantiate the rotation component for a given layer.

        Parameters
        ----------
        layer_index : int, default=0
            Index of the layer being constructed.  Used when cycling through
            a ``strategy`` list.

        Returns
        -------
        rotation : RotationBijector
            An unfitted rotation bijector instance.
        """
        if self.strategy is not None:
            # cycle through the strategy list to select rotation for this layer
            idx = layer_index % len(self.strategy)
            entry = self.strategy[idx]
            rotation_name = entry[0] if isinstance(entry, list | tuple) else entry
            return self._get_component(rotation_name, "rotation", layer_index)
        if self.rotation == "pca":
            return PCARotation(whiten=True)
        elif self.rotation == "ica":
            from rbig._src.rotation import ICARotation

            return ICARotation(random_state=self.random_state)
        elif self.rotation == "random":
            from rbig._src.rotation import RandomRotation

            seed = (self.random_state or 0) + layer_index
            return RandomRotation(random_state=seed)
        else:
            raise ValueError(
                f"Unknown rotation: {self.rotation}. Use 'pca', 'ica', or 'random'."
            )

    def _make_marginal(self, layer_index: int = 0):
        """Instantiate the marginal Gaussianization component for a given layer.

        Parameters
        ----------
        layer_index : int, default=0
            Index of the layer being constructed.  Used when cycling through
            a ``strategy`` list.

        Returns
        -------
        marginal : MarginalBijector
            An unfitted marginal Gaussianizer instance.
        """
        if self.strategy is not None:
            # cycle through the strategy list to select marginal for this layer
            idx = layer_index % len(self.strategy)
            entry = self.strategy[idx]
            marginal_name = (
                entry[1] if isinstance(entry, list | tuple) else "gaussianize"
            )
            return self._get_component(marginal_name, "marginal", layer_index)
        return MarginalGaussianize()

    def _get_component(self, name: str, kind: str, seed: int = 0):
        """Instantiate a rotation or marginal component by name.

        Parameters
        ----------
        name : str
            Component name, e.g. ``"pca"``, ``"ica"``, ``"gaussianize"``.
        kind : str
            Either ``"rotation"`` or ``"marginal"``.
        seed : int, default=0
            Layer index added to ``random_state`` to vary seeds per layer.

        Returns
        -------
        component : Bijector
            An unfitted bijector of the requested kind.
        """
        rng_seed = (self.random_state or 0) + seed
        if kind == "rotation":
            return self._make_rotation_by_name(name, rng_seed)
        return self._make_marginal_by_name(name, rng_seed)

    def _make_rotation_by_name(self, name: str, seed: int):
        """Instantiate a rotation bijector from its string name.

        Parameters
        ----------
        name : str
            One of ``"pca"``, ``"ica"``, or ``"random"``.
        seed : int
            Random seed for stochastic rotations.

        Returns
        -------
        rotation : RotationBijector
            The corresponding unfitted rotation instance.

        Raises
        ------
        ValueError
            If ``name`` is not a recognised rotation type.
        """
        if name == "pca":
            return PCARotation(whiten=True)
        if name == "ica":
            from rbig._src.rotation import ICARotation

            return ICARotation(random_state=seed)
        if name == "random":
            from rbig._src.rotation import RandomRotation

            return RandomRotation(random_state=seed)
        raise ValueError(f"Unknown rotation: {name!r}. Use 'pca', 'ica', or 'random'.")

    def _make_marginal_by_name(self, name: str, seed: int):
        """Instantiate a marginal Gaussianizer from its string name.

        Parameters
        ----------
        name : str
            One of ``"gaussianize"`` / ``"empirical"``, ``"quantile"``,
            ``"kde"``, ``"gmm"``, or ``"spline"``.
        seed : int
            Random seed for stochastic marginal estimators.

        Returns
        -------
        marginal : MarginalBijector
            The corresponding unfitted marginal Gaussianizer instance.

        Raises
        ------
        ValueError
            If ``name`` is not a recognised marginal type.
        """
        if name in ("gaussianize", "empirical", None):
            return MarginalGaussianize()
        if name == "quantile":
            from rbig._src.marginal import QuantileGaussianizer

            return QuantileGaussianizer(random_state=seed)
        if name == "kde":
            from rbig._src.marginal import KDEGaussianizer

            return KDEGaussianizer()
        if name == "gmm":
            from rbig._src.marginal import GMMGaussianizer

            return GMMGaussianizer(random_state=seed)
        if name == "spline":
            from rbig._src.marginal import SplineGaussianizer

            return SplineGaussianizer()
        raise ValueError(
            f"Unknown marginal: {name!r}. Use 'gaussianize', 'quantile', 'kde', 'gmm', or 'spline'."
        )

    @staticmethod
    def _get_information_tolerance(n_samples: int) -> float:
        """Compute a sample-size-adaptive convergence tolerance.

        Interpolates from an empirically calibrated lookup table mapping
        dataset size to an appropriate TC-change threshold.  Larger datasets
        can resolve finer changes in total correlation, so the tolerance
        decreases with sample count.

        Parameters
        ----------
        n_samples : int
            Number of training samples.

        Returns
        -------
        tol : float
            Adaptive tolerance value.
        """
        from scipy.interpolate import interp1d

        # Empirically calibrated lookup table: sample size -> TC-change threshold
        sample_sizes = np.logspace(2, 8, 7)  # 1e2 ... 1e8
        tolerances = [0.1571, 0.0468, 0.0145, 0.0046, 0.0014, 0.0001, 0.00001]
        return float(
            interp1d(sample_sizes, tolerances, fill_value="extrapolate")(n_samples)
        )

    @staticmethod
    def _calculate_negentropy(X: np.ndarray) -> np.ndarray:
        """Negentropy of each marginal: J(xᵢ) = H(Gauss) − H(xᵢ) ≥ 0.

        Negentropy measures how far a distribution is from Gaussian.  It is
        zero if and only if the distribution is Gaussian.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data whose per-feature negentropy is computed.

        Returns
        -------
        neg_entropy : np.ndarray of shape (n_features,)
            Non-negative negentropy for each feature dimension.

        Notes
        -----
        The negentropy is computed as:

            J(xᵢ) = H(𝒩(μᵢ, σᵢ²)) − H(xᵢ)

        where H(𝒩(μ, σ²)) = ½(1 + log(2πσ²)) is the Gaussian entropy with
        the same variance.
        """
        from rbig._src.densities import marginal_entropy

        # Gaussian entropy for a Gaussian with the same variance: 0.5*(1 + log(2*pi*var))
        gauss_h = 0.5 * (1 + np.log(2 * np.pi)) + 0.5 * np.log(np.var(X, axis=0))
        marg_h = marginal_entropy(X)  # empirical marginal entropy per feature
        return gauss_h - marg_h  # shape (n_features,); non-negative up to estimation error

    @staticmethod
    def _total_correlation(X: np.ndarray) -> float:
        """Total correlation of X: TC(X) = ∑ᵢ H(Xᵢ) − H(X).

        Total correlation (also called multi-information) quantifies the
        statistical dependence among all features jointly.  It equals zero
        when all features are mutually independent.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data whose total correlation is measured.

        Returns
        -------
        tc : float
            Total correlation in nats.  Non-negative by the subadditivity of
            entropy.

        Notes
        -----
        The joint entropy ``H(X)`` is estimated under a Gaussian assumption
        (using the log-determinant of the covariance matrix), while the
        marginal entropies ``H(Xᵢ)`` are estimated empirically.
        """
        from rbig._src.densities import joint_entropy_gaussian, marginal_entropy

        marg_h = marginal_entropy(X)  # per-feature entropy; shape (n_features,)
        joint_h = joint_entropy_gaussian(X)  # Gaussian approximation to joint entropy
        return float(np.sum(marg_h) - joint_h)
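Both entropy-based helpers can be sanity-checked on synthetic data with closed-form targets. The sketch below substitutes `scipy.stats.differential_entropy` for the library's `marginal_entropy` (an assumption; the internal estimator may differ). A uniform marginal on (−1, 1) has negentropy ≈ 0.177 nats, and a bivariate Gaussian with correlation ρ has total correlation −½ log(1 − ρ²):

```python
import numpy as np
from scipy.stats import differential_entropy

rng = np.random.default_rng(0)

# ── Negentropy: Gaussian column ~ 0, uniform column strictly positive ──
X = np.column_stack([
    rng.standard_normal(50_000),
    rng.uniform(-1.0, 1.0, 50_000),
])
gauss_h = 0.5 * (1 + np.log(2 * np.pi)) + 0.5 * np.log(np.var(X, axis=0))
marg_h = differential_entropy(X, axis=0)  # stand-in for rbig's marginal_entropy
negentropy = gauss_h - marg_h

# ── Total correlation: analytic value for a bivariate Gaussian is -0.5*log(1-rho^2) ──
rho = 0.8
cov = np.array([[1.0, rho], [rho, 1.0]])
Y = rng.multivariate_normal([0.0, 0.0], cov, size=100_000)
d = Y.shape[1]
sigma_hat = np.cov(Y, rowvar=False)
# Gaussian joint entropy: 0.5 * log((2*pi*e)^d * det(Sigma))
joint_h = 0.5 * (d * (1.0 + np.log(2.0 * np.pi)) + np.log(np.linalg.det(sigma_hat)))
tc = float(np.sum(differential_entropy(Y, axis=0)) - joint_h)
tc_analytic = float(-0.5 * np.log(1.0 - rho**2))  # ~0.511 nats
```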

entropy()

Differential entropy of the fitted distribution in nats.

Estimated from the training data using:

H(X) = −𝔼_X[log p(x)]

The expectation is approximated by the sample mean over the training set. The log-likelihoods are obtained via the efficient cached path score_samples_raw_, which reuses pre-computed quantities from fit.

Returns

h : float Estimated entropy in nats. Unlike discrete entropy, differential entropy may be negative.

Notes

This is equivalent to -self.score(X_train) but avoids the cost of re-passing training data through all layers.

Source code in rbig/_src/model.py
def entropy(self) -> float:
    """Differential entropy of the fitted distribution in nats.

    Estimated from the training data using:

        H(X) = −𝔼_X[log p(x)]

    The expectation is approximated by the sample mean over the training
    set.  The log-likelihoods are obtained via the efficient cached path
    :meth:`score_samples_raw_` which reuses pre-computed quantities from
    ``fit``.

    Returns
    -------
    h : float
        Estimated entropy in nats.  Unlike discrete entropy, differential
        entropy may be negative.

    Notes
    -----
    This is equivalent to ``-self.score(X_train)`` but avoids the cost
    of re-passing training data through all layers.
    """
    return float(-np.mean(self.score_samples_raw_()))
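The Monte-Carlo identity H(X) = −𝔼[log p(x)] behind this method can be illustrated for a standard Gaussian, whose entropy has the closed form ½ log(2πe) ≈ 1.4189 nats (a self-contained check, independent of the library):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
z = rng.standard_normal(200_000)

h_mc = float(-np.mean(stats.norm.logpdf(z)))  # Monte-Carlo estimate of H
h_exact = 0.5 * np.log(2.0 * np.pi * np.e)    # analytic Gaussian entropy
```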

fit(X)

Fit the RBIG model by iteratively Gaussianizing X.

At each layer k the algorithm:

  1. Builds a new RBIGLayer with the configured marginal and rotation transforms.
  2. Fits the layer on the current working copy Xt.
  3. Accumulates the per-sample log-det-Jacobian: log_det_train_ += log|det J_k(Xt)|.
  4. Advances Xt through the layer: Xt = f_k(Xt).
  5. Measures residual total correlation: TC(Xt) = ∑ᵢ H(Xt,ᵢ) − H(Xt).
  6. Stops early when TC has not changed by more than tol for zero_tolerance consecutive layers.
Parameters

X : np.ndarray of shape (n_samples, n_features) Training data.

Returns

self : AnnealedRBIG The fitted model.

Source code in rbig/_src/model.py
def fit(self, X: np.ndarray) -> AnnealedRBIG:
    """Fit the RBIG model by iteratively Gaussianizing X.

    At each layer k the algorithm:

    1. Builds a new :class:`RBIGLayer` with the configured marginal and
       rotation transforms.
    2. Fits the layer on the current working copy ``Xt``.
    3. Accumulates the per-sample log-det-Jacobian:
       ``log_det_train_ += log|det J_k(Xt)|``.
    4. Advances ``Xt`` through the layer: ``Xt = f_k(Xt)``.
    5. Measures residual total correlation: ``TC(Xt) = ∑ᵢ H(Xt,ᵢ) − H(Xt)``.
    6. Stops early when TC has not changed by more than ``tol`` for
       ``zero_tolerance`` consecutive layers.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.

    Returns
    -------
    self : AnnealedRBIG
        The fitted model.
    """
    n_samples, n_features = X.shape
    self.n_features_in_ = n_features  # remember input dimensionality
    self.layers_: list[RBIGLayer] = []
    self.tc_per_layer_: list[float] = []

    # Validate and resolve tolerance
    if self.tol == "auto":
        tol = self._get_information_tolerance(n_samples)
    elif isinstance(self.tol, int | float):
        tol = float(self.tol)
    else:
        raise ValueError(f"tol must be a float or 'auto', got {self.tol!r}")
    self.tol_: float = tol  # store resolved tolerance for inspection

    Xt = X.copy()  # working copy; shape (n_samples, n_features)
    self.log_det_train_ = np.zeros(
        n_samples
    )  # accumulated log|det J|; shape (n_samples,)
    zero_count = 0  # consecutive non-improving layer counter

    for i in range(self.n_layers):
        # Build layer i with the appropriate marginal and rotation components
        layer = RBIGLayer(
            marginal=self._make_marginal(layer_index=i),
            rotation=self._make_rotation(layer_index=i),
        )
        layer.fit(Xt)
        # Accumulate log|det J_i(Xt)| before advancing Xt
        self.log_det_train_ += layer.log_det_jacobian(Xt)
        Xt = layer.transform(Xt)  # advance to next representation
        self.layers_.append(layer)

        # Measure residual total correlation: TC = sum_i H(Xi) - H(X)
        tc = self._total_correlation(Xt)
        self.tc_per_layer_.append(tc)

        if i > 0:
            # Check convergence: how much did TC improve this layer?
            delta = abs(self.tc_per_layer_[-2] - tc)
            if delta < tol:
                zero_count += 1
            else:
                zero_count = 0  # reset on any significant improvement

        # Stop early if TC has been flat for zero_tolerance consecutive layers
        if zero_count >= self.zero_tolerance:
            break

    # Store the fully transformed training data for efficient entropy estimation
    self.X_transformed_ = Xt  # shape (n_samples, n_features)
    return self
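The early-stopping rule (steps 5–6) can be isolated as a small sketch over a hypothetical per-layer TC trace (the trace values below are illustrative, not from any real fit):

```python
# Hypothetical per-layer total-correlation trace (illustration only)
tc_trace = [1.2, 0.6, 0.3, 0.2999, 0.2999, 0.2999]
tol, zero_tolerance = 1e-3, 2

zero_count, stopped_at = 0, None
for k in range(1, len(tc_trace)):
    delta = abs(tc_trace[k - 1] - tc_trace[k])
    zero_count = zero_count + 1 if delta < tol else 0  # reset on any real improvement
    if zero_count >= zero_tolerance:
        stopped_at = k  # two consecutive flat layers -> stop early
        break
```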

fit_transform(X)

Fit the model to X and return the latent-space representation.

Parameters

X : np.ndarray of shape (n_samples, n_features) Training data.

Returns

Z : np.ndarray of shape (n_samples, n_features) Transformed data in the latent space.

Source code in rbig/_src/model.py
def fit_transform(self, X: np.ndarray) -> np.ndarray:
    """Fit the model to X and return the latent-space representation.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.

    Returns
    -------
    Z : np.ndarray of shape (n_samples, n_features)
        Transformed data in the latent space.
    """
    return self.fit(X).transform(X)

inverse_transform(X)

Map latent-space data back to the original input space.

Applies layers in reverse order: x = f₁⁻¹(… fₖ₋₁⁻¹(fₖ⁻¹(z)) …).

Parameters

X : np.ndarray of shape (n_samples, n_features) Data in the latent (approximately Gaussian) space.

Returns

Xr : np.ndarray of shape (n_samples, n_features) Data recovered in the original input space.

Source code in rbig/_src/model.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Map latent-space data back to the original input space.

    Applies layers in reverse order:
    ``x = f₁⁻¹(… fₖ₋₁⁻¹(fₖ⁻¹(z)) …)``.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in the latent (approximately Gaussian) space.

    Returns
    -------
    Xr : np.ndarray of shape (n_samples, n_features)
        Data recovered in the original input space.
    """
    Xt = X.copy()  # shape (n_samples, n_features)
    for layer in reversed(self.layers_):
        Xt = layer.inverse_transform(Xt)
    return Xt

jacobian(X, return_X_transform=False)

Compute the full Jacobian matrix of the RBIG transform.

For each sample, returns the (n_features, n_features) Jacobian matrix df/dx of the composition of all fitted layers. Uses the seed-dimension approach from the legacy implementation: for each input dimension idim, a unit vector is propagated through the chain of per-feature marginal derivatives and rotation matrices.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data at which to evaluate the Jacobian. return_X_transform : bool, default False If True, also return the fully transformed data f(X) (computed as a side-effect of the Jacobian calculation).

Returns

jac : np.ndarray of shape (n_samples, n_features, n_features) Full Jacobian matrix per sample. jac[n, i, j] is the partial derivative df_i/dx_j for the n-th sample. X_transformed : np.ndarray of shape (n_samples, n_features) Only returned when return_X_transform=True. The data after passing through all layers.

Source code in rbig/_src/model.py
def jacobian(
    self,
    X: np.ndarray,
    return_X_transform: bool = False,
) -> np.ndarray | tuple[np.ndarray, np.ndarray]:
    """Compute the full Jacobian matrix of the RBIG transform.

    For each sample, returns the ``(n_features, n_features)`` Jacobian
    matrix ``df/dx`` of the composition of all fitted layers.  Uses the
    seed-dimension approach from the legacy implementation: for each input
    dimension ``idim``, a unit vector is propagated through the chain of
    per-feature marginal derivatives and rotation matrices.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data at which to evaluate the Jacobian.
    return_X_transform : bool, default False
        If True, also return the fully transformed data ``f(X)`` (computed
        as a side-effect of the Jacobian calculation).

    Returns
    -------
    jac : np.ndarray of shape (n_samples, n_features, n_features)
        Full Jacobian matrix per sample.  ``jac[n, i, j]`` is the partial
        derivative ``df_i/dx_j`` for the n-th sample.
    X_transformed : np.ndarray of shape (n_samples, n_features)
        Only returned when ``return_X_transform=True``.  The data after
        passing through all layers.
    """
    n_samples, n_features = X.shape

    # ── Forward pass: collect per-layer derivatives and rotation matrices ──
    derivs_per_layer = []  # each: (n_samples, n_features)
    rotmats_per_layer = []  # each: (n_features, n_features)

    Xt = X.copy()
    for layer in self.layers_:
        if not hasattr(layer.marginal, "_per_feature_log_deriv"):
            raise NotImplementedError(
                f"Jacobian computation requires a marginal with "
                f"_per_feature_log_deriv(); "
                f"{type(layer.marginal).__name__} does not support this."
            )
        # Per-feature marginal derivatives and transformed data in one pass
        log_d, Xt_marginal = layer.marginal._per_feature_log_deriv(
            Xt, return_transform=True
        )
        derivs_per_layer.append(np.exp(log_d))

        # Rotation matrix in row-vector convention: y = z @ R
        rot = self._extract_rotation_matrix(layer.rotation)
        rotmats_per_layer.append(rot)

        # Advance through rotation only
        Xt = layer.rotation.transform(Xt_marginal)

    # ── Seed-dimension loop: propagate unit vectors through the chain ──
    jac = np.zeros((n_samples, n_features, n_features))

    for idim in range(n_features):
        # Initialize seed: unit vector in dimension idim
        XX = np.zeros((n_samples, n_features))
        XX[:, idim] = 1.0

        for derivs, R in zip(derivs_per_layer, rotmats_per_layer, strict=True):
            # Chain rule: XX_new = diag(derivs) @ XX @ R
            XX = (derivs * XX) @ R

        jac[:, :, idim] = XX

    if return_X_transform:
        return jac, Xt
    return jac
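The seed-dimension recursion is the chain rule in row-vector convention: each layer contributes a Jacobian factor Rᵀ·diag(derivs), so pushing the unit vector e_idim through all layers yields column idim of the full Jacobian. A self-contained toy check with two synthetic layers (random per-feature derivatives and random orthogonal rotations, not the library's internals):

```python
import numpy as np
from scipy.stats import ortho_group

rng = np.random.default_rng(0)
n, d = 4, 3
derivs_per_layer = [rng.uniform(0.5, 2.0, (n, d)) for _ in range(2)]
rotmats_per_layer = [ortho_group.rvs(d, random_state=s) for s in (1, 2)]

# Seed-dimension loop, exactly as in jacobian(): XX_new = (derivs * XX) @ R
jac = np.zeros((n, d, d))
for idim in range(d):
    XX = np.zeros((n, d))
    XX[:, idim] = 1.0
    for derivs, R in zip(derivs_per_layer, rotmats_per_layer):
        XX = (derivs * XX) @ R
    jac[:, :, idim] = XX

# Reference: explicit per-sample matrix product J = R2.T @ D2 @ R1.T @ D1
max_err = 0.0
for s in range(n):
    J = np.eye(d)
    for derivs, R in zip(derivs_per_layer, rotmats_per_layer):
        J = R.T @ np.diag(derivs[s]) @ J
    max_err = max(max_err, float(np.max(np.abs(jac[s] - J))))
```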

predict_proba(X, domain='input')

Return probability density estimates for X.

Uses the change-of-variables formula via the full Jacobian matrix to compute the density in the requested domain.

Parameters

X : np.ndarray of shape (n_samples, n_features) Data points to evaluate. domain : str, default="input" Which domain to return densities in:

- ``"input"`` — density in the original data space:
  ``p(x) = p_Z(f(x)) · |det J_f(x)|``
- ``"transform"`` — density in the Gaussian latent space:
  ``p_Z(f(x)) = ∏ᵢ φ(fᵢ(x))``
- ``"both"`` — returns a tuple ``(p_input, p_transform)``
Returns

proba : np.ndarray of shape (n_samples,) or tuple Probability density estimates. When domain="both", returns (p_input, p_transform).

Source code in rbig/_src/model.py
def predict_proba(
    self,
    X: np.ndarray,
    domain: str = "input",
) -> np.ndarray | tuple[np.ndarray, np.ndarray]:
    """Return probability density estimates for X.

    Uses the change-of-variables formula via the full Jacobian matrix
    to compute the density in the requested domain.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data points to evaluate.
    domain : str, default="input"
        Which domain to return densities in:

        - ``"input"`` — density in the original data space:
          ``p(x) = p_Z(f(x)) · |det J_f(x)|``
        - ``"transform"`` — density in the Gaussian latent space:
          ``p_Z(f(x)) = ∏ᵢ φ(fᵢ(x))``
        - ``"both"`` — returns a tuple ``(p_input, p_transform)``

    Returns
    -------
    proba : np.ndarray of shape (n_samples,) or tuple
        Probability density estimates.  When ``domain="both"``, returns
        ``(p_input, p_transform)``.
    """
    jac, Xt = self.jacobian(X, return_X_transform=True)

    # Work in log-space for numerical stability
    log_p_transform = np.sum(stats.norm.logpdf(Xt), axis=1)

    if domain == "transform":
        p_transform = np.exp(log_p_transform)
        p_transform = np.where(np.isfinite(p_transform), p_transform, 0.0)
        return p_transform

    # Input-domain density via change of variables (log-space)
    _sign, log_abs_det = np.linalg.slogdet(jac)
    log_p_input = log_p_transform + log_abs_det
    p_input = np.exp(log_p_input)
    p_input = np.where(np.isfinite(p_input), p_input, 0.0)

    if domain == "input":
        return p_input
    if domain == "both":
        p_transform = np.exp(log_p_transform)
        p_transform = np.where(np.isfinite(p_transform), p_transform, 0.0)
        return p_input, p_transform
    raise ValueError(
        f"Unknown domain: {domain!r}. Use 'input', 'transform', or 'both'."
    )

sample(n_samples, random_state=None)

Generate samples from the learned distribution.

Draws i.i.d. standard Gaussian samples in the latent space and maps them back to the data space via the inverse normalizing flow.

Parameters

n_samples : int Number of samples to generate. random_state : int or None, optional Seed for the random number generator. If None, a random seed is used.

Returns

X_new : np.ndarray of shape (n_samples, n_features_in_) Samples in the original data space.

Source code in rbig/_src/model.py
def sample(self, n_samples: int, random_state: int | None = None) -> np.ndarray:
    """Generate samples from the learned distribution.

    Draws i.i.d. standard Gaussian samples in the latent space and maps
    them back to the data space via the inverse normalizing flow.

    Parameters
    ----------
    n_samples : int
        Number of samples to generate.
    random_state : int or None, optional
        Seed for the random number generator.  If ``None``, a random
        seed is used.

    Returns
    -------
    X_new : np.ndarray of shape (n_samples, n_features_in_)
        Samples in the original data space.
    """
    rng = np.random.default_rng(random_state)
    Z = rng.standard_normal((n_samples, self.n_features_in_))  # latent samples
    return self.inverse_transform(Z)
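This latent-then-invert recipe is ordinary inverse-flow sampling. For a single affine layer whose inverse is f⁻¹(z) = σz + μ, the procedure reproduces 𝒩(μ, σ²) exactly (a toy check, independent of the library):

```python
import numpy as np

rng = np.random.default_rng(42)
mu, sigma = 5.0, 2.0

Z = rng.standard_normal(100_000)  # latent draws from the base Gaussian
X_new = sigma * Z + mu            # inverse of the flow f(x) = (x - mu) / sigma

m, s = float(X_new.mean()), float(X_new.std())
```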

score(X)

Mean log-likelihood of samples X under the fitted density.

Parameters

X : np.ndarray of shape (n_samples, n_features) Data points to evaluate.

Returns

mean_log_prob : float Average per-sample log-likelihood in nats.

Source code in rbig/_src/model.py
def score(self, X: np.ndarray) -> float:
    """Mean log-likelihood of samples X under the fitted density.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data points to evaluate.

    Returns
    -------
    mean_log_prob : float
        Average per-sample log-likelihood in nats.
    """
    return float(np.mean(self.score_samples(X)))

score_samples(X)

Per-sample log-likelihood under the fitted density model.

Uses the change-of-variables formula for normalizing flows:

log p(x) = log p_Z(f(x)) + log|det J_f(x)|

where p_Z = 𝒩(0, I) is the standard Gaussian base density, f is the composition of all fitted layers, and J_f(x) is the Jacobian of f at x.

Parameters

X : np.ndarray of shape (n_samples, n_features) Data points at which to evaluate the log-likelihood.

Returns

log_prob : np.ndarray of shape (n_samples,) Per-sample log-likelihood in nats.

Notes

The log-det-Jacobian is accumulated layer by layer to avoid recomputing intermediate representations:

log|det J_f(x)| = ∑ₖ log|det J_fₖ(xₖ₋₁)|
Source code in rbig/_src/model.py
def score_samples(self, X: np.ndarray) -> np.ndarray:
    """Per-sample log-likelihood under the fitted density model.

    Uses the change-of-variables formula for normalizing flows:

        log p(x) = log p_Z(f(x)) + log|det J_f(x)|

    where ``p_Z = 𝒩(0, I)`` is the standard Gaussian base density,
    ``f`` is the composition of all fitted layers, and ``J_f(x)`` is
    the Jacobian of ``f`` at ``x``.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data points at which to evaluate the log-likelihood.

    Returns
    -------
    log_prob : np.ndarray of shape (n_samples,)
        Per-sample log-likelihood in nats.

    Notes
    -----
    The log-det-Jacobian is accumulated layer by layer to avoid
    recomputing intermediate representations:

        log|det J_f(x)| = ∑ₖ log|det J_fₖ(xₖ₋₁)|
    """
    Xt = X.copy()  # shape (n_samples, n_features)
    log_det_jac = np.zeros(X.shape[0])  # accumulator; shape (n_samples,)
    for layer in self.layers_:
        # Accumulate log|det Jₖ| before advancing through layer k
        log_det_jac += layer.log_det_jacobian(Xt)
        Xt = layer.transform(Xt)  # xₖ = fₖ(xₖ₋₁)
    # log p_Z(z) = sum_i log N(z_i; 0, 1); shape (n_samples,)
    log_pz = np.sum(stats.norm.logpdf(Xt), axis=1)
    # change-of-variables: log p(x) = log p_Z(f(x)) + log|det J_f(x)|
    return log_pz + log_det_jac
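In one dimension with a single affine "layer" the formula reduces to a textbook identity: Gaussianizing 𝒩(μ, σ²) with f(x) = (x − μ)/σ has log|det J| = −log σ, and the two sides of the change of variables agree exactly (a toy check, assuming nothing from the library):

```python
import numpy as np
from scipy import stats

mu, sigma = 2.0, 0.5
x = np.linspace(0.0, 4.0, 9)

z = (x - mu) / sigma                           # one-layer affine "flow"
log_det_jac = np.full_like(x, -np.log(sigma))  # log|det J| per sample
log_p = stats.norm.logpdf(z) + log_det_jac     # change of variables

ref = stats.norm(mu, sigma).logpdf(x)          # direct density for comparison
```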

score_samples_raw_()

Log-likelihood for the stored training data without recomputing layers.

Reuses X_transformed_ and log_det_train_ cached during fit, so the cost is a single Gaussian log-pdf evaluation rather than a full forward pass through all layers.

Returns

log_prob : np.ndarray of shape (n_samples,) Per-sample log-likelihood of the training data in nats.

Source code in rbig/_src/model.py
def score_samples_raw_(self) -> np.ndarray:
    """Log-likelihood for the stored training data without recomputing layers.

    Reuses ``X_transformed_`` and ``log_det_train_`` cached during
    :meth:`fit`, so the cost is a single Gaussian log-pdf evaluation
    rather than a full forward pass through all layers.

    Returns
    -------
    log_prob : np.ndarray of shape (n_samples,)
        Per-sample log-likelihood of the training data in nats.
    """
    # log p_Z evaluated at the pre-computed transformed training data
    log_pz = np.sum(
        stats.norm.logpdf(self.X_transformed_), axis=1
    )  # shape (n_samples,)
    # add the accumulated log-det-Jacobian stored during fit
    return log_pz + self.log_det_train_

transform(X)

Map X to the Gaussian latent space through all fitted layers.

Applies each fitted RBIGLayer in order: Z = fₖ(… f₂(f₁(x)) …).

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data.

Returns

Z : np.ndarray of shape (n_samples, n_features) Data in the approximately Gaussian latent space.

Source code in rbig/_src/model.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Map X to the Gaussian latent space through all fitted layers.

    Applies each fitted :class:`RBIGLayer` in order:
    ``Z = fₖ(… f₂(f₁(x)) …)``.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data.

    Returns
    -------
    Z : np.ndarray of shape (n_samples, n_features)
        Data in the approximately Gaussian latent space.
    """
    Xt = X.copy()  # shape (n_samples, n_features)
    for layer in self.layers_:
        Xt = layer.transform(Xt)
    return Xt

BaseITMeasure

Bases: ABC

Abstract base class for information-theoretic (IT) measures.

Subclasses implement specific measures such as mutual information, total correlation, or entropy from data, potentially conditioning on an optional second variable Y.

Notes

Common IT quantities include:

  • Entropy: H(X) = −𝔼_X[log p(x)]
  • Mutual information: I(X; Y) = H(X) + H(Y) − H(X, Y)
  • Total correlation: TC(X) = ∑ᵢ H(Xᵢ) − H(X)
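For jointly Gaussian variables these quantities have closed forms, which makes the identities easy to verify. For unit-variance marginals with correlation ρ, the mutual-information identity gives I(X; Y) = −½ log(1 − ρ²) (a standard result, computed here analytically rather than with the library):

```python
import numpy as np

rho = 0.6
h_x = h_y = 0.5 * np.log(2.0 * np.pi * np.e)  # entropy of a unit-variance Gaussian
cov = np.array([[1.0, rho], [rho, 1.0]])
h_xy = 0.5 * (2.0 * np.log(2.0 * np.pi * np.e) + np.log(np.linalg.det(cov)))

mi = h_x + h_y - h_xy  # I(X; Y) = H(X) + H(Y) - H(X, Y)
```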
Source code in rbig/_src/base.py
class BaseITMeasure(ABC):
    """Abstract base class for information-theoretic (IT) measures.

    Subclasses implement specific measures such as mutual information,
    total correlation, or entropy from data, potentially conditioning on
    an optional second variable ``Y``.

    Notes
    -----
    Common IT quantities include:

    * **Entropy**: H(X) = −𝔼_X[log p(x)]
    * **Mutual information**: I(X; Y) = H(X) + H(Y) − H(X, Y)
    * **Total correlation**: TC(X) = ∑ᵢ H(Xᵢ) − H(X)
    """

    @abstractmethod
    def fit(self, X: np.ndarray, Y: np.ndarray | None = None) -> "BaseITMeasure":
        """Fit the IT measure to data.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Primary data array.
        Y : np.ndarray of shape (n_samples, n_targets) or None, optional
            Secondary data array, used for joint or conditional measures.

        Returns
        -------
        self : BaseITMeasure
            The fitted measure instance.
        """
        ...

    @abstractmethod
    def score(self) -> float:
        """Return the scalar value of the fitted IT measure.

        Returns
        -------
        value : float
            The computed information-theoretic quantity (e.g., entropy,
            mutual information, or total correlation), in nats.
        """
        ...

fit(X, Y=None) abstractmethod

Fit the IT measure to data.

Parameters

X : np.ndarray of shape (n_samples, n_features) Primary data array. Y : np.ndarray of shape (n_samples, n_targets) or None, optional Secondary data array, used for joint or conditional measures.

Returns

self : BaseITMeasure The fitted measure instance.

Source code in rbig/_src/base.py
@abstractmethod
def fit(self, X: np.ndarray, Y: np.ndarray | None = None) -> "BaseITMeasure":
    """Fit the IT measure to data.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Primary data array.
    Y : np.ndarray of shape (n_samples, n_targets) or None, optional
        Secondary data array, used for joint or conditional measures.

    Returns
    -------
    self : BaseITMeasure
        The fitted measure instance.
    """
    ...

score() abstractmethod

Return the scalar value of the fitted IT measure.

Returns

value : float The computed information-theoretic quantity (e.g., entropy, mutual information, or total correlation), in nats.

Source code in rbig/_src/base.py
@abstractmethod
def score(self) -> float:
    """Return the scalar value of the fitted IT measure.

    Returns
    -------
    value : float
        The computed information-theoretic quantity (e.g., entropy,
        mutual information, or total correlation), in nats.
    """
    ...

BaseTransform

Bases: ABC

Abstract base class for all RBIG transforms.

Defines the common interface shared by every learnable data transformation in this library: fitting to data, forward mapping, and its inverse. Subclasses that support density estimation should also implement log_det_jacobian.

Notes

The change-of-variables formula for a normalizing flow relates the density of the input x to a base density p_Z via a bijection f:

log p(x) = log p_Z(f(x)) + log|det J_f(x)|

where J_f(x) is the Jacobian of f evaluated at x.

Source code in rbig/_src/base.py
class BaseTransform(ABC):
    """Abstract base class for all RBIG transforms.

    Defines the common interface shared by every learnable data transformation
    in this library: fitting to data, forward mapping, and its inverse.
    Subclasses that support density estimation should also implement
    ``log_det_jacobian``.

    Notes
    -----
    The change-of-variables formula for a normalizing flow relates the density
    of the input ``x`` to a base density ``p_Z`` via a bijection ``f``:

        log p(x) = log p_Z(f(x)) + log|det J_f(x)|

    where ``J_f(x)`` is the Jacobian of ``f`` evaluated at ``x``.
    """

    @abstractmethod
    def fit(self, X: np.ndarray) -> "BaseTransform":
        """Fit the transform to data X.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data used to estimate any internal parameters.

        Returns
        -------
        self : BaseTransform
            The fitted transform instance.
        """
        ...

    @abstractmethod
    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the fitted forward transform to data X.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data to transform.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Transformed data.
        """
        ...

    @abstractmethod
    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the fitted inverse transform to data X.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in the transformed (latent) space.

        Returns
        -------
        Xr : np.ndarray of shape (n_samples, n_features)
            Data recovered in the original input space.
        """
        ...

    def fit_transform(self, X: np.ndarray) -> np.ndarray:
        """Fit to data X, then return the transformed result.

        Equivalent to calling ``self.fit(X).transform(X)`` but may be
        overridden for efficiency when a single pass suffices.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data to fit and transform.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Transformed data.
        """
        return self.fit(X).transform(X)

    def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Log absolute determinant of the Jacobian evaluated at X.

        For a transform f, this returns ``log|det J_f(x)|`` per sample,
        which is the volume-correction term required in the change-of-variables
        formula for density estimation.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points at which to evaluate the log-det-Jacobian.

        Returns
        -------
        ldj : np.ndarray of shape (n_samples,)
            Per-sample log absolute determinant of the Jacobian.

        Raises
        ------
        NotImplementedError
            If the subclass does not implement this method.
        """
        raise NotImplementedError
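A minimal concrete transform satisfying this interface might look as follows (an illustrative standardization bijector written against numpy only; not a class from the library):

```python
import numpy as np

class Standardize:
    """Affine bijector f(x) = (x - mean) / scale with the BaseTransform interface."""

    def fit(self, X: np.ndarray) -> "Standardize":
        self.mean_ = X.mean(axis=0)
        self.scale_ = X.std(axis=0)
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        return (X - self.mean_) / self.scale_

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        return X * self.scale_ + self.mean_

    def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        # Diagonal Jacobian diag(1/scale): log|det| = -sum(log scale), same for all samples
        return np.full(X.shape[0], -np.log(self.scale_).sum())

rng = np.random.default_rng(0)
X = rng.normal(3.0, 2.0, size=(100, 2))
t = Standardize().fit(X)
roundtrip = t.inverse_transform(t.transform(X))
```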

fit(X) abstractmethod

Fit the transform to data X.

Parameters

X : np.ndarray of shape (n_samples, n_features) Training data used to estimate any internal parameters.

Returns

self : BaseTransform The fitted transform instance.

Source code in rbig/_src/base.py
@abstractmethod
def fit(self, X: np.ndarray) -> "BaseTransform":
    """Fit the transform to data X.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data used to estimate any internal parameters.

    Returns
    -------
    self : BaseTransform
        The fitted transform instance.
    """
    ...

fit_transform(X)

Fit to data X, then return the transformed result.

Equivalent to calling self.fit(X).transform(X) but may be overridden for efficiency when a single pass suffices.

Parameters

X : np.ndarray of shape (n_samples, n_features) Data to fit and transform.

Returns

Xt : np.ndarray of shape (n_samples, n_features) Transformed data.

Source code in rbig/_src/base.py
def fit_transform(self, X: np.ndarray) -> np.ndarray:
    """Fit to data X, then return the transformed result.

    Equivalent to calling ``self.fit(X).transform(X)`` but may be
    overridden for efficiency when a single pass suffices.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data to fit and transform.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Transformed data.
    """
    return self.fit(X).transform(X)

inverse_transform(X) abstractmethod

Apply the fitted inverse transform to data X.

Parameters

X : np.ndarray of shape (n_samples, n_features) Data in the transformed (latent) space.

Returns

Xr : np.ndarray of shape (n_samples, n_features) Data recovered in the original input space.

Source code in rbig/_src/base.py
@abstractmethod
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the fitted inverse transform to data X.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in the transformed (latent) space.

    Returns
    -------
    Xr : np.ndarray of shape (n_samples, n_features)
        Data recovered in the original input space.
    """
    ...

log_det_jacobian(X)

Log absolute determinant of the Jacobian evaluated at X.

For a transform f, this returns log|det J_f(x)| per sample, which is the volume-correction term required in the change-of-variables formula for density estimation.
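As a concrete illustration, the change-of-variables formula can be evaluated with a hand-rolled one-layer transform. This sketch uses a hypothetical per-feature standardization map rather than any rbig class:

```python
import numpy as np
from scipy import stats

# Hypothetical one-layer transform: per-feature standardization
# z = (x - mu) / sigma.  Its Jacobian is diagonal with entries 1/sigma_j,
# so log|det J_f| = -sum_j log(sigma_j), the same for every sample.
rng = np.random.default_rng(0)
X = rng.normal(loc=3.0, scale=2.0, size=(500, 2))
mu, sigma = X.mean(axis=0), X.std(axis=0)

Z = (X - mu) / sigma                              # forward transform f(X)
ldj = np.full(X.shape[0], -np.log(sigma).sum())   # per-sample log|det J_f(x)|

# Change of variables: log p(x) = log p_Z(f(x)) + log|det J_f(x)|
log_p = stats.norm.logpdf(Z).sum(axis=1) + ldj

# Cross-check against the density of N(mu, sigma) evaluated directly:
direct = stats.norm.logpdf(X, loc=mu, scale=sigma).sum(axis=1)
assert np.allclose(log_p, direct)
```

The two log-densities agree exactly because standardization maps N(mu, sigma) data to a standard Gaussian, which is the idealized behaviour RBIG approaches layer by layer.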

Parameters

X : np.ndarray of shape (n_samples, n_features) Input points at which to evaluate the log-det-Jacobian.

Returns

ldj : np.ndarray of shape (n_samples,) Per-sample log absolute determinant of the Jacobian.

Raises

NotImplementedError If the subclass does not implement this method.

Source code in rbig/_src/base.py
def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Log absolute determinant of the Jacobian evaluated at X.

    For a transform f, this returns ``log|det J_f(x)|`` per sample,
    which is the volume-correction term required in the change-of-variables
    formula for density estimation.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points at which to evaluate the log-det-Jacobian.

    Returns
    -------
    ldj : np.ndarray of shape (n_samples,)
        Per-sample log absolute determinant of the Jacobian.

    Raises
    ------
    NotImplementedError
        If the subclass does not implement this method.
    """
    raise NotImplementedError

transform(X) abstractmethod

Apply the fitted forward transform to data X.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data to transform.

Returns

Xt : np.ndarray of shape (n_samples, n_features) Transformed data.

Source code in rbig/_src/base.py
@abstractmethod
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the fitted forward transform to data X.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data to transform.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Transformed data.
    """
    ...

Bijector

Bases: ABC

Abstract base class for invertible transformations (bijectors).

A bijector implements a differentiable, invertible map f : ℝᵈ → ℝᵈ and provides the log absolute determinant of its Jacobian. These are the building blocks of normalizing flows.

The density of a random variable X = f⁻¹(Z) where Z ~ p_Z is:

log p(x) = log p_Z(f(x)) + log|det J_f(x)|

Notes

Concrete subclasses must implement :meth:fit, :meth:transform, :meth:inverse_transform, and :meth:get_log_det_jacobian. log_det_jacobian is provided as a convenience alias for the last method, for compatibility with RBIGLayer.

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: From ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537–549. https://doi.org/10.1109/TNN.2011.2106511
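For orientation, a minimal concrete bijector might look as follows. This is a self-contained sketch: `ScaleBijector` is hypothetical and simply mirrors the four required methods rather than importing the rbig base class.

```python
import numpy as np

class ScaleBijector:
    """Minimal sketch of the Bijector interface: f(x) = x * s, element-wise."""

    def fit(self, X: np.ndarray) -> "ScaleBijector":
        # Scale each feature to unit standard deviation.
        self.scale_ = 1.0 / X.std(axis=0)
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        return X * self.scale_

    def inverse_transform(self, Z: np.ndarray) -> np.ndarray:
        return Z / self.scale_

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        # Diagonal Jacobian with constant entries s_j, so
        # log|det J| = sum_j log(s_j), identical for every sample.
        return np.full(X.shape[0], np.log(self.scale_).sum())

rng = np.random.default_rng(0)
X = rng.normal(scale=[1.0, 4.0], size=(100, 2))
b = ScaleBijector().fit(X)
assert np.allclose(b.inverse_transform(b.transform(X)), X)
```

Because the map is linear with a constant diagonal Jacobian, the log-det is sample-independent; data-dependent bijectors (such as marginal Gaussianization) return a genuinely per-sample array instead.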

Source code in rbig/_src/base.py
class Bijector(ABC):
    """Abstract base class for invertible transformations (bijectors).

    A bijector implements a differentiable, invertible map ``f : ℝᵈ → ℝᵈ``
    and provides the log absolute determinant of its Jacobian.  These are
    the building blocks of normalizing flows.

    The density of a random variable ``X = f⁻¹(Z)`` where ``Z ~ p_Z`` is:

        log p(x) = log p_Z(f(x)) + log|det J_f(x)|

    Notes
    -----
    Concrete subclasses must implement :meth:`fit`, :meth:`transform`,
    :meth:`inverse_transform`, and :meth:`get_log_det_jacobian`.
    ``log_det_jacobian`` is provided as a convenience alias for the last
    method, for compatibility with ``RBIGLayer``.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    From ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537–549. https://doi.org/10.1109/TNN.2011.2106511
    """

    @abstractmethod
    def fit(self, X: np.ndarray) -> "Bijector":
        """Fit the bijector to data X.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.

        Returns
        -------
        self : Bijector
            The fitted bijector.
        """
        ...

    @abstractmethod
    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the forward bijection f(x).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data in the original space.

        Returns
        -------
        Z : np.ndarray of shape (n_samples, n_features)
            Data mapped to the latent space.
        """
        ...

    @abstractmethod
    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the inverse bijection f⁻¹(z).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in the latent space.

        Returns
        -------
        Xr : np.ndarray of shape (n_samples, n_features)
            Data recovered in the original space.
        """
        ...

    @abstractmethod
    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Compute log|det J_f(x)| per sample.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points at which to evaluate the log-det-Jacobian.

        Returns
        -------
        ldj : np.ndarray of shape (n_samples,)
            Per-sample log absolute determinant of the forward Jacobian J_f.
        """
        ...

    def fit_transform(self, X: np.ndarray) -> np.ndarray:
        """Fit the bijector to X, then return f(X).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data to fit and transform.

        Returns
        -------
        Z : np.ndarray of shape (n_samples, n_features)
            Transformed data.
        """
        return self.fit(X).transform(X)

    def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Alias for get_log_det_jacobian for compatibility with RBIGLayer.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points.

        Returns
        -------
        ldj : np.ndarray of shape (n_samples,)
            Per-sample log|det J_f(x)|.
        """
        return self.get_log_det_jacobian(X)

fit(X) abstractmethod

Fit the bijector to data X.

Parameters

X : np.ndarray of shape (n_samples, n_features) Training data.

Returns

self : Bijector The fitted bijector.

Source code in rbig/_src/base.py
@abstractmethod
def fit(self, X: np.ndarray) -> "Bijector":
    """Fit the bijector to data X.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.

    Returns
    -------
    self : Bijector
        The fitted bijector.
    """
    ...

fit_transform(X)

Fit the bijector to X, then return f(X).

Parameters

X : np.ndarray of shape (n_samples, n_features) Data to fit and transform.

Returns

Z : np.ndarray of shape (n_samples, n_features) Transformed data.

Source code in rbig/_src/base.py
def fit_transform(self, X: np.ndarray) -> np.ndarray:
    """Fit the bijector to X, then return f(X).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data to fit and transform.

    Returns
    -------
    Z : np.ndarray of shape (n_samples, n_features)
        Transformed data.
    """
    return self.fit(X).transform(X)

get_log_det_jacobian(X) abstractmethod

Compute log|det J_f(x)| per sample.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input points at which to evaluate the log-det-Jacobian.

Returns

ldj : np.ndarray of shape (n_samples,) Per-sample log absolute determinant of the forward Jacobian J_f.

Source code in rbig/_src/base.py
@abstractmethod
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Compute log|det J_f(x)| per sample.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points at which to evaluate the log-det-Jacobian.

    Returns
    -------
    ldj : np.ndarray of shape (n_samples,)
        Per-sample log absolute determinant of the forward Jacobian J_f.
    """
    ...

inverse_transform(X) abstractmethod

Apply the inverse bijection f⁻¹(z).

Parameters

X : np.ndarray of shape (n_samples, n_features) Data in the latent space.

Returns

Xr : np.ndarray of shape (n_samples, n_features) Data recovered in the original space.

Source code in rbig/_src/base.py
@abstractmethod
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the inverse bijection f⁻¹(z).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in the latent space.

    Returns
    -------
    Xr : np.ndarray of shape (n_samples, n_features)
        Data recovered in the original space.
    """
    ...

log_det_jacobian(X)

Alias for get_log_det_jacobian for compatibility with RBIGLayer.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input points.

Returns

ldj : np.ndarray of shape (n_samples,) Per-sample log|det J_f(x)|.

Source code in rbig/_src/base.py
def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Alias for get_log_det_jacobian for compatibility with RBIGLayer.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points.

    Returns
    -------
    ldj : np.ndarray of shape (n_samples,)
        Per-sample log|det J_f(x)|.
    """
    return self.get_log_det_jacobian(X)

transform(X) abstractmethod

Apply the forward bijection f(x).

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data in the original space.

Returns

Z : np.ndarray of shape (n_samples, n_features) Data mapped to the latent space.

Source code in rbig/_src/base.py
@abstractmethod
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the forward bijection f(x).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data in the original space.

    Returns
    -------
    Z : np.ndarray of shape (n_samples, n_features)
        Data mapped to the latent space.
    """
    ...

BoxCoxTransform

Bases: BaseTransform

Box-Cox power transform fitted independently to each feature.

The Box-Cox family of transforms is parameterised by λ (one per feature):

λ ≠ 0 :  y = (x^λ − 1) / λ
λ → 0 :  y = log(x)             (continuity limit)

λ values are estimated via maximum likelihood (scipy's boxcox). Features containing non-positive values are assigned λ = 0 and passed through unchanged during transform, since the log branch requires positive inputs.

The inverse transform is:

λ ≠ 0 :  x = (λy + 1)^{1/λ}
λ = 0 :  x = exp(y)

The log-det of the Jacobian is:

λ ≠ 0 :  ∑ᵢ (λ − 1) log xᵢ
λ = 0 :  ∑ᵢ (−log xᵢ)          (from d(log x)/dx = 1/x ⟹ log|dy/dx| = −log xᵢ)

.. note:: The current implementation uses −xᵢ (not −log xᵢ) for the λ = 0 branch, matching the original code behaviour. This is an approximation that differs from the exact analytical log-det.

Parameters

method : str, optional (default='mle') Fitting-method placeholder; scipy's MLE is currently always used regardless of this value.

Attributes

lambdas_ : np.ndarray of shape (n_features,) Fitted λ values after calling fit.

Examples

>>> import numpy as np
>>> from rbig._src.parametric import BoxCoxTransform
>>> rng = np.random.default_rng(1)
>>> X = rng.exponential(scale=2.0, size=(200, 3))  # strictly positive
>>> tr = BoxCoxTransform().fit(X)
>>> Y = tr.transform(X)
>>> X_rec = tr.inverse_transform(Y)
>>> np.allclose(X, X_rec, atol=1e-6)
True
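The analytic log-Jacobian of the λ ≠ 0 branch, (λ − 1)·log x per feature, can be verified numerically. The sketch below calls scipy's `boxcox` directly rather than the class above, so it runs without fitting a full transform:

```python
import numpy as np
from scipy import stats

# Compare the analytic log-derivative (λ − 1) * log(x) against a central
# finite difference of the Box-Cox map y = (x^λ − 1) / λ.
rng = np.random.default_rng(2)
x = rng.exponential(scale=2.0, size=1000) + 0.1   # keep x safely positive
_, lam = stats.boxcox(x)                          # MLE for λ

eps = 1e-5
dy_dx = (stats.boxcox(x + eps, lmbda=lam)
         - stats.boxcox(x - eps, lmbda=lam)) / (2 * eps)
analytic = (lam - 1) * np.log(x)                  # log of dy/dx = x^{λ−1}
assert np.allclose(np.log(dy_dx), analytic, atol=1e-5)
```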

Source code in rbig/_src/parametric.py
class BoxCoxTransform(BaseTransform):
    """Box-Cox power transform fitted independently to each feature.

    The Box-Cox family of transforms is parameterised by λ (one per feature):

        λ ≠ 0 :  y = (x^λ − 1) / λ
        λ → 0 :  y = log(x)             (continuity limit)

    λ values are estimated via maximum likelihood (scipy's ``boxcox``).
    Features containing non-positive values are assigned λ = 0 and passed
    through unchanged during ``transform``, since the log branch requires
    positive inputs.

    The inverse transform is:

        λ ≠ 0 :  x = (λy + 1)^{1/λ}
        λ = 0 :  x = exp(y)

    The log-det of the Jacobian is:

        λ ≠ 0 :  ∑ᵢ (λ − 1) log xᵢ
        λ = 0 :  ∑ᵢ (−log xᵢ)          (from d(log x)/dx = 1/x ⟹ log|dy/dx| = −log xᵢ)

    .. note::
        The current implementation uses ``−xᵢ`` (not ``−log xᵢ``) for the
        λ = 0 branch, matching the original code behaviour.  This is an
        approximation that differs from the exact analytical log-det.

    Parameters
    ----------
    method : str, optional (default='mle')
        Fitting-method placeholder; scipy's MLE is currently always used
        regardless of this value.

    Attributes
    ----------
    lambdas_ : np.ndarray of shape (n_features,)
        Fitted λ values after calling ``fit``.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.parametric import BoxCoxTransform
    >>> rng = np.random.default_rng(1)
    >>> X = rng.exponential(scale=2.0, size=(200, 3))  # strictly positive
    >>> tr = BoxCoxTransform().fit(X)
    >>> Y = tr.transform(X)
    >>> X_rec = tr.inverse_transform(Y)
    >>> np.allclose(X, X_rec, atol=1e-6)
    True
    """

    def __init__(self, method: str = "mle"):
        self.method = method

    def fit(self, X: np.ndarray) -> BoxCoxTransform:
        """Estimate one Box-Cox λ per feature via MLE.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.  Features that contain non-positive values are
            assigned λ = 0 (no transform applied during ``transform``).

        Returns
        -------
        self : BoxCoxTransform
            Fitted instance with ``lambdas_`` attribute set.
        """
        self.lambdas_ = np.zeros(X.shape[1])  # λ per feature, default 0
        for i in range(X.shape[1]):
            xi = X[:, i]
            if np.all(xi > 0):
                _, lam = stats.boxcox(xi)  # MLE for λ
            else:
                lam = 0.0  # non-positive data: no power transform
            self.lambdas_[i] = lam
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the fitted Box-Cox transform to X.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data.  Features with λ = 0 and non-positive values are
            passed through unchanged.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Box-Cox transformed data.
        """
        Xt = np.zeros_like(X, dtype=float)
        for i in range(X.shape[1]):
            xi = X[:, i]
            lam = self.lambdas_[i]
            if np.all(xi > 0):
                Xt[:, i] = stats.boxcox(xi, lmbda=lam)  # y = (x^lam - 1)/lam or log(x)
            else:
                Xt[:, i] = xi  # pass-through for non-positive
        return Xt

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Invert the Box-Cox transform.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Box-Cox transformed data.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Recovered original-scale data.  Uses:

            * λ = 0 : x = exp(y)
            * λ ≠ 0 : x = (λy + 1)^{1/λ}   (clamped to 0 for stability)
        """
        Xt = np.zeros_like(X, dtype=float)
        for i in range(X.shape[1]):
            lam = self.lambdas_[i]
            if np.abs(lam) < 1e-10:
                Xt[:, i] = np.exp(X[:, i])  # x = exp(y)
            else:
                # x = (λy + 1)^{1/λ}, clamp argument to ≥ 0
                Xt[:, i] = np.power(np.maximum(lam * X[:, i] + 1, 0), 1 / lam)
        return Xt

    def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Compute per-sample log |det J| of the forward Box-Cox transform.

        The Jacobian is diagonal; for each feature:

            λ ≠ 0 :  d/dx[(x^λ−1)/λ] = x^{λ−1}  ⟹  log = (λ−1) log x
            λ = 0 :  d/dx[log x] = 1/x           ⟹  exact log = −log x

        .. note::
            The λ = 0 branch accumulates ``−xᵢ`` rather than the exact
            ``−log xᵢ``.  This preserves the original implementation
            behaviour.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data (pre-transform, original scale).

        Returns
        -------
        log_det : np.ndarray of shape (n_samples,)
            Sum of per-feature log Jacobian contributions.
        """
        log_jac = np.zeros(X.shape[0])
        for i in range(X.shape[1]):
            xi = X[:, i]
            lam = self.lambdas_[i]
            if np.abs(lam) < 1e-10:
                log_jac += -xi  # log(1/x) ~= -x (lam->0 limit)
            else:
                # (lam-1) log xi from x^{lam-1} Jacobian
                log_jac += (lam - 1) * np.log(np.maximum(xi, 1e-300))
        return log_jac

fit(X)

Estimate one Box-Cox λ per feature via MLE.

Parameters

X : np.ndarray of shape (n_samples, n_features) Training data. Features that contain non-positive values are assigned λ = 0 (no transform applied during transform).

Returns

self : BoxCoxTransform Fitted instance with lambdas_ attribute set.

Source code in rbig/_src/parametric.py
def fit(self, X: np.ndarray) -> BoxCoxTransform:
    """Estimate one Box-Cox λ per feature via MLE.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.  Features that contain non-positive values are
        assigned λ = 0 (no transform applied during ``transform``).

    Returns
    -------
    self : BoxCoxTransform
        Fitted instance with ``lambdas_`` attribute set.
    """
    self.lambdas_ = np.zeros(X.shape[1])  # λ per feature, default 0
    for i in range(X.shape[1]):
        xi = X[:, i]
        if np.all(xi > 0):
            _, lam = stats.boxcox(xi)  # MLE for λ
        else:
            lam = 0.0  # non-positive data: no power transform
        self.lambdas_[i] = lam
    return self

inverse_transform(X)

Invert the Box-Cox transform.

Parameters

X : np.ndarray of shape (n_samples, n_features) Box-Cox transformed data.

Returns

Xt : np.ndarray of shape (n_samples, n_features) Recovered original-scale data. Uses:

* λ = 0 : x = exp(y)
* λ ≠ 0 : x = (λy + 1)^{1/λ}   (clamped to 0 for stability)

Source code in rbig/_src/parametric.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Invert the Box-Cox transform.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Box-Cox transformed data.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Recovered original-scale data.  Uses:

        * λ = 0 : x = exp(y)
        * λ ≠ 0 : x = (λy + 1)^{1/λ}   (clamped to 0 for stability)
    """
    Xt = np.zeros_like(X, dtype=float)
    for i in range(X.shape[1]):
        lam = self.lambdas_[i]
        if np.abs(lam) < 1e-10:
            Xt[:, i] = np.exp(X[:, i])  # x = exp(y)
        else:
            # x = (λy + 1)^{1/λ}, clamp argument to ≥ 0
            Xt[:, i] = np.power(np.maximum(lam * X[:, i] + 1, 0), 1 / lam)
    return Xt

log_det_jacobian(X)

Compute per-sample log |det J| of the forward Box-Cox transform.

The Jacobian is diagonal; for each feature:

λ ≠ 0 :  d/dx[(x^λ−1)/λ] = x^{λ−1}  ⟹  log = (λ−1) log x
λ = 0 :  d/dx[log x] = 1/x           ⟹  exact log = −log x

.. note:: The λ = 0 branch accumulates −xᵢ rather than the exact −log xᵢ. This preserves the original implementation behaviour.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data (pre-transform, original scale).

Returns

log_det : np.ndarray of shape (n_samples,) Sum of per-feature log Jacobian contributions.

Source code in rbig/_src/parametric.py
def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Compute per-sample log |det J| of the forward Box-Cox transform.

    The Jacobian is diagonal; for each feature:

        λ ≠ 0 :  d/dx[(x^λ−1)/λ] = x^{λ−1}  ⟹  log = (λ−1) log x
        λ = 0 :  d/dx[log x] = 1/x           ⟹  exact log = −log x

    .. note::
        The λ = 0 branch accumulates ``−xᵢ`` rather than the exact
        ``−log xᵢ``.  This preserves the original implementation
        behaviour.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data (pre-transform, original scale).

    Returns
    -------
    log_det : np.ndarray of shape (n_samples,)
        Sum of per-feature log Jacobian contributions.
    """
    log_jac = np.zeros(X.shape[0])
    for i in range(X.shape[1]):
        xi = X[:, i]
        lam = self.lambdas_[i]
        if np.abs(lam) < 1e-10:
            log_jac += -xi  # log(1/x) ~= -x (lam->0 limit)
        else:
            # (lam-1) log xi from x^{lam-1} Jacobian
            log_jac += (lam - 1) * np.log(np.maximum(xi, 1e-300))
    return log_jac

transform(X)

Apply the fitted Box-Cox transform to X.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data. Features with λ = 0 and non-positive values are passed through unchanged.

Returns

Xt : np.ndarray of shape (n_samples, n_features) Box-Cox transformed data.

Source code in rbig/_src/parametric.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the fitted Box-Cox transform to X.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data.  Features with λ = 0 and non-positive values are
        passed through unchanged.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Box-Cox transformed data.
    """
    Xt = np.zeros_like(X, dtype=float)
    for i in range(X.shape[1]):
        xi = X[:, i]
        lam = self.lambdas_[i]
        if np.all(xi > 0):
            Xt[:, i] = stats.boxcox(xi, lmbda=lam)  # y = (x^lam - 1)/lam or log(x)
        else:
            Xt[:, i] = xi  # pass-through for non-positive
    return Xt

CompositeBijector

Bases: Bijector

A bijector that chains a sequence of bijectors in order.

Applies bijectors f₁, f₂, …, fₖ in sequence so that the composite map is g = fₖ ∘ … ∘ f₂ ∘ f₁. The log-det-Jacobian of the composition follows the chain rule:

log|det J_g(x)| = ∑ₖ log|det J_fₖ(xₖ₋₁)|

where xₖ₋₁ = fₖ₋₁ ∘ … ∘ f₁(x) is the input to the k-th bijector.

Parameters

bijectors : list of Bijector Ordered list of bijectors to chain. They are applied left-to-right during transform and right-to-left during inverse_transform.

Attributes

bijectors : list of Bijector The constituent bijectors in application order.

Notes

Fitting is done sequentially: each bijector is fitted to the output of the previous one, so that the full model is trained in a single fit call.

Examples

>>> import numpy as np
>>> from rbig._src.base import CompositeBijector
>>> from rbig._src.marginal import MarginalGaussianize
>>> from rbig._src.rotation import PCARotation
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((200, 4))
>>> cb = CompositeBijector([MarginalGaussianize(), PCARotation()])
>>> cb.fit(X)  # doctest: +ELLIPSIS
<rbig._src.base.CompositeBijector ...>
>>> Z = cb.transform(X)
>>> Z.shape
(200, 4)
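The chain rule used by `get_log_det_jacobian` can also be checked numerically with two simple elementwise maps. This is a standalone sketch that mimics the accumulation loop without using the rbig classes:

```python
import numpy as np

# Two elementwise bijections:
#   f1(x) = 2x       log|det J1|    = d * log 2          (constant)
#   f2(y) = y**3     log|det J2(y)| = sum_j log(3 y_j^2)
# Composite g = f2 ∘ f1, so g(x) = 8 x^3 and, directly,
#   log|det Jg(x)| = sum_j log(24 x_j^2).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
d = X.shape[1]

ldj1 = np.full(X.shape[0], d * np.log(2.0))
Y = 2.0 * X                                  # intermediate output of f1
ldj2 = np.log(3.0 * Y**2).sum(axis=1)        # J2 evaluated at Y, not X

chained = ldj1 + ldj2
direct = np.log(24.0 * X**2).sum(axis=1)
assert np.allclose(chained, direct)
```

Note that the second term is evaluated at the intermediate output `Y`, exactly as the loop in `get_log_det_jacobian` advances `Xt` before each subsequent layer.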

Source code in rbig/_src/base.py
class CompositeBijector(Bijector):
    """A bijector that chains a sequence of bijectors in order.

    Applies bijectors ``f₁, f₂, …, fₖ`` in sequence so that the composite
    map is ``g = fₖ ∘ … ∘ f₂ ∘ f₁``.  The log-det-Jacobian of the
    composition follows the chain rule:

        log|det J_g(x)| = ∑ₖ log|det J_fₖ(xₖ₋₁)|

    where ``xₖ₋₁ = fₖ₋₁ ∘ … ∘ f₁(x)`` is the input to the k-th bijector.

    Parameters
    ----------
    bijectors : list of Bijector
        Ordered list of bijectors to chain.  They are applied left-to-right
        during ``transform`` and right-to-left during ``inverse_transform``.

    Attributes
    ----------
    bijectors : list of Bijector
        The constituent bijectors in application order.

    Notes
    -----
    Fitting is done sequentially: each bijector is fitted to the output of
    the previous one, so that the full model is trained in a single
    ``fit`` call.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.base import CompositeBijector
    >>> from rbig._src.marginal import MarginalGaussianize
    >>> from rbig._src.rotation import PCARotation
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((200, 4))
    >>> cb = CompositeBijector([MarginalGaussianize(), PCARotation()])
    >>> cb.fit(X)  # doctest: +ELLIPSIS
    <rbig._src.base.CompositeBijector ...>
    >>> Z = cb.transform(X)
    >>> Z.shape
    (200, 4)
    """

    def __init__(self, bijectors: list):
        self.bijectors = bijectors

    def fit(self, X: np.ndarray) -> "CompositeBijector":
        """Fit each bijector sequentially on the output of the previous one.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.

        Returns
        -------
        self : CompositeBijector
            The fitted composite bijector.
        """
        Xt = X.copy()  # working copy; shape (n_samples, n_features)
        for b in self.bijectors:
            # fit bijector b on current Xt, then advance Xt to b's output
            Xt = b.fit_transform(Xt)
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply all bijectors left-to-right: g(x) = fₖ(… f₁(x) …).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data.

        Returns
        -------
        Z : np.ndarray of shape (n_samples, n_features)
            Data after passing through every bijector in sequence.
        """
        Xt = X.copy()  # shape (n_samples, n_features)
        for b in self.bijectors:
            Xt = b.transform(Xt)
        return Xt

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Invert the composite map: g⁻¹(z) = f₁⁻¹(… fₖ⁻¹(z) …).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in the latent space.

        Returns
        -------
        Xr : np.ndarray of shape (n_samples, n_features)
            Data recovered in the original input space.
        """
        Xt = X.copy()  # shape (n_samples, n_features)
        # reverse order to undo the forward composition
        for b in reversed(self.bijectors):
            Xt = b.inverse_transform(Xt)
        return Xt

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Sum log|det Jₖ| over all bijectors (chain rule).

        Uses the chain rule for Jacobian determinants:

            log|det J_g(x)| = ∑ₖ log|det J_fₖ(xₖ₋₁)|

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points.

        Returns
        -------
        log_det : np.ndarray of shape (n_samples,)
            Per-sample sum of log-det-Jacobians across all bijectors.
        """
        Xt = X.copy()  # shape (n_samples, n_features)
        log_det = np.zeros(X.shape[0])  # accumulator, shape (n_samples,)
        for b in self.bijectors:
            # add log|det Jₖ| at the *current* intermediate input Xt
            log_det += b.get_log_det_jacobian(Xt)
            # advance Xt to the output of bijector b for the next iteration
            Xt = b.transform(Xt)
        return log_det

fit(X)

Fit each bijector sequentially on the output of the previous one.

Parameters

X : np.ndarray of shape (n_samples, n_features) Training data.

Returns

self : CompositeBijector The fitted composite bijector.

Source code in rbig/_src/base.py
def fit(self, X: np.ndarray) -> "CompositeBijector":
    """Fit each bijector sequentially on the output of the previous one.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.

    Returns
    -------
    self : CompositeBijector
        The fitted composite bijector.
    """
    Xt = X.copy()  # working copy; shape (n_samples, n_features)
    for b in self.bijectors:
        # fit bijector b on current Xt, then advance Xt to b's output
        Xt = b.fit_transform(Xt)
    return self

get_log_det_jacobian(X)

Sum log|det Jₖ| over all bijectors (chain rule).

Uses the chain rule for Jacobian determinants:

log|det J_g(x)| = ∑ₖ log|det J_fₖ(xₖ₋₁)|

Parameters

X : np.ndarray of shape (n_samples, n_features) Input points.

Returns

log_det : np.ndarray of shape (n_samples,) Per-sample sum of log-det-Jacobians across all bijectors.

Source code in rbig/_src/base.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Sum log|det Jₖ| over all bijectors (chain rule).

    Uses the chain rule for Jacobian determinants:

        log|det J_g(x)| = ∑ₖ log|det J_fₖ(xₖ₋₁)|

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points.

    Returns
    -------
    log_det : np.ndarray of shape (n_samples,)
        Per-sample sum of log-det-Jacobians across all bijectors.
    """
    Xt = X.copy()  # shape (n_samples, n_features)
    log_det = np.zeros(X.shape[0])  # accumulator, shape (n_samples,)
    for b in self.bijectors:
        # add log|det Jₖ| at the *current* intermediate input Xt
        log_det += b.get_log_det_jacobian(Xt)
        # advance Xt to the output of bijector b for the next iteration
        Xt = b.transform(Xt)
    return log_det

inverse_transform(X)

Invert the composite map: g⁻¹(z) = f₁⁻¹(… fₖ⁻¹(z) …).

Parameters

X : np.ndarray of shape (n_samples, n_features) Data in the latent space.

Returns

Xr : np.ndarray of shape (n_samples, n_features) Data recovered in the original input space.

Source code in rbig/_src/base.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Invert the composite map: g⁻¹(z) = f₁⁻¹(… fₖ⁻¹(z) …).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in the latent space.

    Returns
    -------
    Xr : np.ndarray of shape (n_samples, n_features)
        Data recovered in the original input space.
    """
    Xt = X.copy()  # shape (n_samples, n_features)
    # reverse order to undo the forward composition
    for b in reversed(self.bijectors):
        Xt = b.inverse_transform(Xt)
    return Xt

transform(X)

Apply all bijectors left-to-right: g(x) = fₖ(… f₁(x) …).

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data.

Returns

Z : np.ndarray of shape (n_samples, n_features) Data after passing through every bijector in sequence.

Source code in rbig/_src/base.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply all bijectors left-to-right: g(x) = fₖ(… f₁(x) …).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data.

    Returns
    -------
    Z : np.ndarray of shape (n_samples, n_features)
        Data after passing through every bijector in sequence.
    """
    Xt = X.copy()  # shape (n_samples, n_features)
    for b in self.bijectors:
        Xt = b.transform(Xt)
    return Xt

Cube

Bases: Bijector

Elementwise cube bijector (y = x³) with analytic log-det Jacobian.

Implements the strictly monotone map:

Forward  : y = x³
Inverse  : x = y^{1/3}  (real cube root, handles negative values)
Log-det  : ∑ᵢ log(3xᵢ²)

Because the derivative vanishes at x = 0, a small constant (1e-300) is added inside the log to avoid −∞.

Examples

>>> import numpy as np
>>> from rbig._src.densities import Cube
>>> X = np.array([[-2.0, 1.0], [0.5, -0.5]])
>>> bij = Cube().fit(X)
>>> Y = bij.transform(X)  # y = x³
>>> X_rec = bij.inverse_transform(Y)
>>> np.allclose(X, X_rec, atol=1e-10)
True
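The analytic log-det ∑ᵢ log(3xᵢ²) can be checked against a central finite difference of y = x³. This standalone sketch needs no rbig import:

```python
import numpy as np

# d/dx (x^3) = 3 x^2, so per element log|dy/dx| = log(3 x^2).
x = np.array([-2.0, -0.5, 0.3, 1.7])
eps = 1e-6
dy_dx = ((x + eps) ** 3 - (x - eps) ** 3) / (2 * eps)
assert np.allclose(np.log(np.abs(dy_dx)), np.log(3 * x**2), atol=1e-8)
```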

Source code in rbig/_src/densities.py
class Cube(Bijector):
    """Elementwise cube bijector (y = x³) with analytic log-det Jacobian.

    Implements the strictly monotone map:

        Forward  : y = x³
        Inverse  : x = y^{1/3}  (real cube root, handles negative values)
        Log-det  : ∑ᵢ log(3xᵢ²)

    Because the derivative vanishes at x = 0, a small constant (1e-300) is
    added inside the log to avoid ``−∞``.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.densities import Cube
    >>> X = np.array([[-2.0, 1.0], [0.5, -0.5]])
    >>> bij = Cube().fit(X)
    >>> Y = bij.transform(X)  # y = x³
    >>> X_rec = bij.inverse_transform(Y)
    >>> np.allclose(X, X_rec, atol=1e-10)
    True
    """

    def fit(self, X: np.ndarray) -> Cube:
        """No-op fit (stateless bijector).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Ignored.

        Returns
        -------
        self : Cube
        """
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the forward cube map y = x³.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data in ℝ.

        Returns
        -------
        Y : np.ndarray of shape (n_samples, n_features)
            Transformed data, element-wise cubed.
        """
        return X**3  # y = x³, element-wise

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the inverse cube-root map x = y^{1/3}.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Cubed data in ℝ (may include negative values).

        Returns
        -------
        Z : np.ndarray of shape (n_samples, n_features)
            Real cube root of each element.
        """
        return np.cbrt(X)  # real cube root, handles negative values

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Compute per-sample log |det J| of the forward transform.

        The Jacobian of x³ is diagonal with entries 3xᵢ², so:

            log |det J| = ∑ᵢ log(3xᵢ²)

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data (pre-transform).

        Returns
        -------
        log_det : np.ndarray of shape (n_samples,)
            Log absolute determinant of the Jacobian for each sample.
        """
        # 1e-300 guards against log(0) when x = 0
        return np.sum(np.log(3 * X**2 + 1e-300), axis=1)
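Two properties of Cube are worth checking numerically: np.cbrt recovers negative inputs (unlike a naive y**(1/3)), and the 1e-300 guard keeps the log-det finite at x = 0, where that entry contributes log(1e-300) ≈ −690.8 instead of −∞. A quick standalone check:

```python
import numpy as np

X = np.array([[-2.0, 0.0, 0.5]])
Y = X**3                           # forward: element-wise cube
assert np.allclose(np.cbrt(Y), X)  # real cube root handles negatives

# log-det guard: the x = 0 entry contributes log(1e-300), not -inf
ldj = np.sum(np.log(3 * X**2 + 1e-300), axis=1)
assert np.isfinite(ldj).all()
```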

fit(X)

No-op fit (stateless bijector).

Parameters

X : np.ndarray of shape (n_samples, n_features) Ignored.

Returns

self : Cube

Source code in rbig/_src/densities.py
def fit(self, X: np.ndarray) -> Cube:
    """No-op fit (stateless bijector).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Ignored.

    Returns
    -------
    self : Cube
    """
    return self

get_log_det_jacobian(X)

Compute per-sample log |det J| of the forward transform.

The Jacobian of x³ is diagonal with entries 3xᵢ², so:

log |det J| = ∑ᵢ log(3xᵢ²)
Parameters

X : np.ndarray of shape (n_samples, n_features) Input data (pre-transform).

Returns

log_det : np.ndarray of shape (n_samples,) Log absolute determinant of the Jacobian for each sample.

Source code in rbig/_src/densities.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Compute per-sample log |det J| of the forward transform.

    The Jacobian of x³ is diagonal with entries 3xᵢ², so:

        log |det J| = ∑ᵢ log(3xᵢ²)

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data (pre-transform).

    Returns
    -------
    log_det : np.ndarray of shape (n_samples,)
        Log absolute determinant of the Jacobian for each sample.
    """
    # 1e-300 guards against log(0) when x = 0
    return np.sum(np.log(3 * X**2 + 1e-300), axis=1)

inverse_transform(X)

Apply the inverse cube-root map x = y^{1/3}.

Parameters

X : np.ndarray of shape (n_samples, n_features) Cubed data in ℝ (may include negative values).

Returns

Z : np.ndarray of shape (n_samples, n_features) Real cube root of each element.

Source code in rbig/_src/densities.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the inverse cube-root map x = y^{1/3}.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Cubed data in ℝ (may include negative values).

    Returns
    -------
    Z : np.ndarray of shape (n_samples, n_features)
        Real cube root of each element.
    """
    return np.cbrt(X)  # real cube root, handles negative values

transform(X)

Apply the forward cube map y = x³.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data in ℝ.

Returns

Y : np.ndarray of shape (n_samples, n_features) Transformed data, element-wise cubed.

Source code in rbig/_src/densities.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the forward cube map y = x³.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data in ℝ.

    Returns
    -------
    Y : np.ndarray of shape (n_samples, n_features)
        Transformed data, element-wise cubed.
    """
    return X**3  # y = x³, element-wise

DCTRotation

Bases: ImageBijector

Type-II orthonormal 2-D Discrete Cosine Transform rotation.

Applies the 2-D DCT-II with orthonormal normalisation (norm="ortho") to each spatial channel. Because the ortho-normalised DCT is an orthogonal matrix, log|det J| = 0 for all inputs.

Parameters

C : int, default 1
    Number of image channels.
H : int, default 8
    Image height in pixels.
W : int, default 8
    Image width in pixels.

Attributes

C_ : int
    Fitted number of channels.
H_ : int
    Fitted image height.
W_ : int
    Fitted image width.

Notes

For an orthonormal DCT matrix D acting on the vectorised image x:

y = D x,   log|det J| = log|det D| = 0

because D is orthogonal (det = ±1).

Examples

import numpy as np
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 64))  # N=5, C=1, H=8, W=8
layer = DCTRotation(C=1, H=8, W=8)
layer.fit(X)  # doctest: +ELLIPSIS
DCTRotation(...)
Xt = layer.transform(X)
Xr = layer.inverse_transform(Xt)
np.allclose(X, Xr, atol=1e-10)
True

Source code in rbig/_src/image.py
class DCTRotation(ImageBijector):
    """Type-II orthonormal 2-D Discrete Cosine Transform rotation.

    Applies the 2-D DCT-II with orthonormal normalisation (``norm="ortho"``)
    to each spatial channel.  Because the ortho-normalised DCT is an
    orthogonal matrix, ``log|det J| = 0`` for all inputs.

    Parameters
    ----------
    C : int, default 1
        Number of image channels.
    H : int, default 8
        Image height in pixels.
    W : int, default 8
        Image width in pixels.

    Attributes
    ----------
    C_ : int
        Fitted number of channels.
    H_ : int
        Fitted image height.
    W_ : int
        Fitted image width.

    Notes
    -----
    For an orthonormal DCT matrix :math:`\\mathbf{D}` acting on the
    vectorised image :math:`\\mathbf{x}`:

    .. math::

        \\mathbf{y} = \\mathbf{D}\\,\\mathbf{x},
        \\quad
        \\log |\\det J| = \\log |\\det \\mathbf{D}| = 0

    because :math:`\\mathbf{D}` is orthogonal (``det = ±1``).

    Examples
    --------
    >>> import numpy as np
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((5, 64))  # N=5, C=1, H=8, W=8
    >>> layer = DCTRotation(C=1, H=8, W=8)
    >>> layer.fit(X)  # doctest: +ELLIPSIS
    DCTRotation(...)
    >>> Xt = layer.transform(X)
    >>> Xr = layer.inverse_transform(Xt)
    >>> np.allclose(X, Xr, atol=1e-10)
    True
    """

    def __init__(self, C: int = 1, H: int = 8, W: int = 8):
        self.C = C
        self.H = H
        self.W = W

    def fit(self, X: np.ndarray) -> DCTRotation:
        """Store spatial dimensions; no data-dependent fitting required.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, C*H*W)``

        Returns
        -------
        self : DCTRotation
        """
        self.C_ = self.C
        self.H_ = self.H
        self.W_ = self.W
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply orthonormal 2-D DCT-II to every image channel.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, C*H*W)``
            Flattened image batch.

        Returns
        -------
        Xt : np.ndarray, shape ``(N, C*H*W)``
            Orthonormal DCT-II coefficients.
        """
        from scipy.fft import dctn

        N = X.shape[0]
        imgs = self._to_tensor(X)  # (N, C, H, W)
        result = np.zeros_like(imgs)
        for n in range(N):
            for c in range(self.C_):
                # norm="ortho" yields the orthonormal variant of the DCT-II
                result[n, c] = dctn(imgs[n, c], norm="ortho")
        return self._to_flat(result)

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Apply orthonormal 2-D inverse DCT (DCT-III scaled).

        Parameters
        ----------
        X : np.ndarray, shape ``(N, C*H*W)``
            DCT coefficient batch from :meth:`transform`.

        Returns
        -------
        Xr : np.ndarray, shape ``(N, C*H*W)``
            Reconstructed image batch.
        """
        from scipy.fft import idctn

        N = X.shape[0]
        imgs = X.reshape(N, self.C_, self.H_, self.W_)  # (N, C, H, W)
        result = np.zeros_like(imgs)
        for n in range(N):
            for c in range(self.C_):
                # idctn with norm="ortho" is the exact inverse of dctn with norm="ortho"
                result[n, c] = idctn(imgs[n, c], norm="ortho")
        return self._to_flat(result)

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Return zeros: orthonormal DCT has ``log|det J| = 0``.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, D)``

        Returns
        -------
        log_det : np.ndarray, shape ``(N,)``
        """
        return np.zeros(X.shape[0])
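The log|det J| = 0 claim follows from orthogonality: an orthogonal map preserves the Euclidean norm. A short check using scipy.fft directly (outside the class) illustrates both the norm preservation and the exact inverse:

```python
import numpy as np
from scipy.fft import dctn, idctn

rng = np.random.default_rng(0)
img = rng.standard_normal((8, 8))

coeffs = dctn(img, norm="ortho")   # orthonormal 2-D DCT-II
rec = idctn(coeffs, norm="ortho")  # exact inverse (orthonormal DCT-III)

# Orthogonal maps preserve the Frobenius norm, hence log|det J| = 0
assert np.allclose(np.linalg.norm(coeffs), np.linalg.norm(img))
assert np.allclose(rec, img, atol=1e-10)
```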

fit(X)

Store spatial dimensions; no data-dependent fitting required.

Parameters

X : np.ndarray, shape (N, C*H*W)

Returns

self : DCTRotation

Source code in rbig/_src/image.py
def fit(self, X: np.ndarray) -> DCTRotation:
    """Store spatial dimensions; no data-dependent fitting required.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, C*H*W)``

    Returns
    -------
    self : DCTRotation
    """
    self.C_ = self.C
    self.H_ = self.H
    self.W_ = self.W
    return self

get_log_det_jacobian(X)

Return zeros: orthonormal DCT has log|det J| = 0.

Parameters

X : np.ndarray, shape (N, D)

Returns

log_det : np.ndarray, shape (N,)

Source code in rbig/_src/image.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Return zeros: orthonormal DCT has ``log|det J| = 0``.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, D)``

    Returns
    -------
    log_det : np.ndarray, shape ``(N,)``
    """
    return np.zeros(X.shape[0])

inverse_transform(X)

Apply the inverse 2-D DCT (orthonormal DCT-III).

Parameters

X : np.ndarray, shape (N, C*H*W) DCT coefficient batch from :meth:transform.

Returns

Xr : np.ndarray, shape (N, C*H*W) Reconstructed image batch.

Source code in rbig/_src/image.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Apply orthonormal 2-D inverse DCT (DCT-III scaled).

    Parameters
    ----------
    X : np.ndarray, shape ``(N, C*H*W)``
        DCT coefficient batch from :meth:`transform`.

    Returns
    -------
    Xr : np.ndarray, shape ``(N, C*H*W)``
        Reconstructed image batch.
    """
    from scipy.fft import idctn

    N = X.shape[0]
    imgs = X.reshape(N, self.C_, self.H_, self.W_)  # (N, C, H, W)
    result = np.zeros_like(imgs)
    for n in range(N):
        for c in range(self.C_):
            # idctn with norm="ortho" is the exact inverse of dctn with norm="ortho"
            result[n, c] = idctn(imgs[n, c], norm="ortho")
    return self._to_flat(result)

transform(X)

Apply orthonormal 2-D DCT-II to every image channel.

Parameters

X : np.ndarray, shape (N, C*H*W) Flattened image batch.

Returns

Xt : np.ndarray, shape (N, C*H*W) Orthonormal DCT-II coefficients.

Source code in rbig/_src/image.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply orthonormal 2-D DCT-II to every image channel.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, C*H*W)``
        Flattened image batch.

    Returns
    -------
    Xt : np.ndarray, shape ``(N, C*H*W)``
        Orthonormal DCT-II coefficients.
    """
    from scipy.fft import dctn

    N = X.shape[0]
    imgs = self._to_tensor(X)  # (N, C, H, W)
    result = np.zeros_like(imgs)
    for n in range(N):
        for c in range(self.C_):
            # norm="ortho" yields the orthonormal variant of the DCT-II
            result[n, c] = dctn(imgs[n, c], norm="ortho")
    return self._to_flat(result)

Exp

Bases: Bijector

Elementwise exponential bijector with analytic log-det Jacobian.

Implements the map from ℝ to ℝ₊:

Forward  : y = exp(x)
Inverse  : x = log(y)   (y clipped to (ε, ∞) for stability)
Log-det  : ∑ᵢ xᵢ        (since d(exp xᵢ)/dxᵢ = exp xᵢ, whose log is xᵢ)

Examples

import numpy as np
from rbig._src.densities import Exp
X = np.array([[-1.0, 0.0], [1.0, 2.0]])
bij = Exp().fit(X)
Y = bij.transform(X)
np.allclose(bij.inverse_transform(Y), X, atol=1e-10)
True

Source code in rbig/_src/densities.py
class Exp(Bijector):
    """Elementwise exponential bijector with analytic log-det Jacobian.

    Implements the map from ℝ to ℝ₊:

        Forward  : y = exp(x)
        Inverse  : x = log(y)   (y clipped to (ε, ∞) for stability)
        Log-det  : ∑ᵢ xᵢ        (since d(exp xᵢ)/dxᵢ = exp xᵢ, whose log is xᵢ)

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.densities import Exp
    >>> X = np.array([[-1.0, 0.0], [1.0, 2.0]])
    >>> bij = Exp().fit(X)
    >>> Y = bij.transform(X)
    >>> np.allclose(bij.inverse_transform(Y), X, atol=1e-10)
    True
    """

    def fit(self, X: np.ndarray) -> Exp:
        """No-op fit (stateless bijector).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Ignored.

        Returns
        -------
        self : Exp
        """
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the forward exponential map y = exp(x).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data in ℝ.

        Returns
        -------
        Y : np.ndarray of shape (n_samples, n_features)
            Transformed data in ℝ₊.
        """
        return np.exp(X)  # y = exp(x), element-wise

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the inverse log map x = log(y).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in ℝ₊.  Values below 1e-300 are clipped for stability.

        Returns
        -------
        Z : np.ndarray of shape (n_samples, n_features)
            Reconstructed data in ℝ.
        """
        return np.log(np.maximum(X, 1e-300))  # clip to prevent log(0)

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Compute per-sample log |det J| of the forward transform.

        The Jacobian of exp is diagonal with entries exp(xᵢ), so:

            log |det J| = ∑ᵢ xᵢ

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data (pre-transform).

        Returns
        -------
        log_det : np.ndarray of shape (n_samples,)
            Log absolute determinant of the Jacobian for each sample.
        """
        return np.sum(X, axis=1)  # log |det J| = ∑ᵢ xᵢ
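The change-of-variables formula from the overview can be checked analytically for Exp: if X ~ N(0, 1), then Y = exp(X) is standard lognormal, and log p_Y(y) = log φ(x) − ∑ᵢ xᵢ. A standalone check against scipy.stats.lognorm:

```python
import numpy as np
from scipy import stats

x = np.array([[-1.0], [0.0], [1.5]])
y = np.exp(x)                  # forward Exp map
ldj = np.sum(x, axis=1)        # log|det J| = sum_i x_i

# Change of variables: log p_Y(y) = log phi(x) - log|det J|
log_py = stats.norm.logpdf(x).ravel() - ldj
assert np.allclose(log_py, stats.lognorm.logpdf(y.ravel(), s=1.0))
```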

fit(X)

No-op fit (stateless bijector).

Parameters

X : np.ndarray of shape (n_samples, n_features) Ignored.

Returns

self : Exp

Source code in rbig/_src/densities.py
def fit(self, X: np.ndarray) -> Exp:
    """No-op fit (stateless bijector).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Ignored.

    Returns
    -------
    self : Exp
    """
    return self

get_log_det_jacobian(X)

Compute per-sample log |det J| of the forward transform.

The Jacobian of exp is diagonal with entries exp(xᵢ), so:

log |det J| = ∑ᵢ xᵢ
Parameters

X : np.ndarray of shape (n_samples, n_features) Input data (pre-transform).

Returns

log_det : np.ndarray of shape (n_samples,) Log absolute determinant of the Jacobian for each sample.

Source code in rbig/_src/densities.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Compute per-sample log |det J| of the forward transform.

    The Jacobian of exp is diagonal with entries exp(xᵢ), so:

        log |det J| = ∑ᵢ xᵢ

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data (pre-transform).

    Returns
    -------
    log_det : np.ndarray of shape (n_samples,)
        Log absolute determinant of the Jacobian for each sample.
    """
    return np.sum(X, axis=1)  # log |det J| = ∑ᵢ xᵢ

inverse_transform(X)

Apply the inverse log map x = log(y).

Parameters

X : np.ndarray of shape (n_samples, n_features) Data in ℝ₊. Values below 1e-300 are clipped for stability.

Returns

Z : np.ndarray of shape (n_samples, n_features) Reconstructed data in ℝ.

Source code in rbig/_src/densities.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the inverse log map x = log(y).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in ℝ₊.  Values below 1e-300 are clipped for stability.

    Returns
    -------
    Z : np.ndarray of shape (n_samples, n_features)
        Reconstructed data in ℝ.
    """
    return np.log(np.maximum(X, 1e-300))  # clip to prevent log(0)

transform(X)

Apply the forward exponential map y = exp(x).

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data in ℝ.

Returns

Y : np.ndarray of shape (n_samples, n_features) Transformed data in ℝ₊.

Source code in rbig/_src/densities.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the forward exponential map y = exp(x).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data in ℝ.

    Returns
    -------
    Y : np.ndarray of shape (n_samples, n_features)
        Transformed data in ℝ₊.
    """
    return np.exp(X)  # y = exp(x), element-wise

GMMGaussianizer

Bases: Bijector

Gaussianize each marginal using a Gaussian Mixture Model (GMM) CDF.

Fits a univariate GMM with n_components Gaussian components to each feature dimension, then maps samples to standard-normal values via the analytic GMM CDF::

F_GMM(x) = sum_k w_k * Phi((x - mu_k) / sigma_k)

followed by the probit function::

z = Phi^{-1}(F_GMM(x))

where Phi is the standard-normal CDF, w_k are mixture weights, and mu_k, sigma_k are the component means and standard deviations.

Parameters

n_components : int, default 5
    Number of mixture components. Capped at max(1, min(n_components, n_samples // 5, n_samples)) during fit to avoid over-fitting on small data sets.
random_state : int or None, default 0
    Seed for reproducible GMM initialisation.

Attributes

gmms_ : list of sklearn.mixture.GaussianMixture
    One fitted 1-D GMM per feature dimension.
n_features_ : int
    Number of feature dimensions seen during fit.

Notes

The log-det-Jacobian uses the analytic GMM density::

log|dz/dx| = log f_GMM(x) - log phi(z)

where f_GMM(x) = sum_k w_k * phi((x - mu_k) / sigma_k) / sigma_k is the GMM PDF and phi is the standard-normal PDF.

The inverse transform numerically inverts the GMM CDF via Brent's method on [-50, 50]; samples whose root is not bracketed by this interval fall back to 0.0.

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Examples

import numpy as np
from rbig._src.marginal import GMMGaussianizer
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 2))
gmm = GMMGaussianizer(n_components=3).fit(X)
Z = gmm.transform(X)
Z.shape
(200, 2)
ldj = gmm.get_log_det_jacobian(X)
ldj.shape
(200,)

Source code in rbig/_src/marginal.py
class GMMGaussianizer(Bijector):
    """Gaussianize each marginal using a Gaussian Mixture Model (GMM) CDF.

    Fits a univariate GMM with ``n_components`` Gaussian components to each
    feature dimension, then maps samples to standard-normal values via the
    analytic GMM CDF::

        F_GMM(x) = sum_k w_k * Phi((x - mu_k) / sigma_k)

    followed by the probit function::

        z = Phi^{-1}(F_GMM(x))

    where ``Phi`` is the standard-normal CDF, ``w_k`` are mixture weights,
    and ``mu_k``, ``sigma_k`` are the component means and standard deviations.

    Parameters
    ----------
    n_components : int, default 5
        Number of mixture components.  Capped at
        ``max(1, min(n_components, n_samples // 5, n_samples))`` during
        ``fit`` to avoid over-fitting on small data sets.
    random_state : int or None, default 0
        Seed for reproducible GMM initialisation.

    Attributes
    ----------
    gmms_ : list of sklearn.mixture.GaussianMixture
        One fitted 1-D GMM per feature dimension.
    n_features_ : int
        Number of feature dimensions seen during ``fit``.

    Notes
    -----
    The log-det-Jacobian uses the analytic GMM density::

        log|dz/dx| = log f_GMM(x) - log phi(z)

    where ``f_GMM(x) = sum_k w_k * phi((x - mu_k) / sigma_k) / sigma_k``
    is the GMM PDF and ``phi`` is the standard-normal PDF.

    The inverse transform numerically inverts the GMM CDF via Brent's
    method on [-50, 50]; samples whose root is not bracketed by this
    interval fall back to 0.0.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.marginal import GMMGaussianizer
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((200, 2))
    >>> gmm = GMMGaussianizer(n_components=3).fit(X)
    >>> Z = gmm.transform(X)
    >>> Z.shape
    (200, 2)
    >>> ldj = gmm.get_log_det_jacobian(X)
    >>> ldj.shape
    (200,)
    """

    def __init__(self, n_components: int = 5, random_state: int | None = 0):
        self.n_components = n_components
        self.random_state = random_state

    def fit(self, X: np.ndarray) -> GMMGaussianizer:
        """Fit a univariate GMM to each feature dimension.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.

        Returns
        -------
        self : GMMGaussianizer
            Fitted bijector instance.

        Notes
        -----
        The number of mixture components is capped at
        ``max(1, min(n_components, n_samples // 5, n_samples))`` to avoid
        over-fitting when ``n_samples`` is small.
        """
        from sklearn.mixture import GaussianMixture

        self.gmms_ = []
        self.n_features_ = X.shape[1]
        n_samples = X.shape[0]
        # Cap n_components to avoid GMMs with more components than data points
        n_components = max(1, min(self.n_components, n_samples // 5, n_samples))
        for i in range(self.n_features_):
            gmm = GaussianMixture(
                n_components=n_components,
                random_state=self.random_state,
            )
            # Reshape to (n_samples, 1) as required by sklearn GMM API
            gmm.fit(X[:, i : i + 1])
            self.gmms_.append(gmm)
        return self

    def _cdf(self, gmm, x: np.ndarray) -> np.ndarray:
        """Compute the GMM CDF at points x (1-D).

        Evaluates the mixture CDF::

            F_GMM(x) = sum_k w_k * Phi((x - mu_k) / sigma_k)

        Parameters
        ----------
        gmm : sklearn.mixture.GaussianMixture
            Fitted 1-D GMM.
        x : np.ndarray of shape (n_samples,)
            Query points.

        Returns
        -------
        cdf : np.ndarray of shape (n_samples,)
            GMM CDF values in [0, 1].
        """
        weights = gmm.weights_  # mixture weights, shape (K,)
        means = gmm.means_.ravel()  # component means, shape (K,)
        stds = np.sqrt(gmm.covariances_.ravel())  # component stds, shape (K,)
        cdf = np.zeros_like(x, dtype=float)
        for w, mu, sigma in zip(weights, means, stds, strict=False):
            # Weighted sum of normal CDFs: w_k * Phi((x - mu_k) / sigma_k)
            cdf += w * stats.norm.cdf(x, loc=mu, scale=sigma)
        return cdf

    def _pdf(self, gmm, x: np.ndarray) -> np.ndarray:
        """Compute the GMM PDF at points x (1-D).

        Evaluates the mixture density::

            f_GMM(x) = sum_k w_k * phi((x - mu_k) / sigma_k) / sigma_k

        Parameters
        ----------
        gmm : sklearn.mixture.GaussianMixture
            Fitted 1-D GMM.
        x : np.ndarray of shape (n_samples,)
            Query points.

        Returns
        -------
        pdf : np.ndarray of shape (n_samples,)
            GMM PDF values (>= 0).
        """
        weights = gmm.weights_
        means = gmm.means_.ravel()
        stds = np.sqrt(gmm.covariances_.ravel())
        pdf = np.zeros_like(x, dtype=float)
        for w, mu, sigma in zip(weights, means, stds, strict=False):
            # Weighted sum of normal PDFs: w_k * phi((x - mu_k) / sigma_k) / sigma_k
            pdf += w * stats.norm.pdf(x, loc=mu, scale=sigma)
        return pdf

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Map each feature to N(0, 1) via GMM CDF then probit.

        Applies ``z = Phi^{-1}(F_GMM(x))`` to each feature independently.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data in the original space.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Gaussianized data with approximately standard-normal marginals.
        """
        Xt = np.zeros_like(X, dtype=float)
        for i in range(self.n_features_):
            # Evaluate analytic GMM CDF
            u = self._cdf(self.gmms_[i], X[:, i])
            # Clip away from boundaries before probit
            u = np.clip(u, 1e-6, 1 - 1e-6)
            Xt[:, i] = ndtri(u)
        return Xt

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Map standard-normal values back to the original space.

        Numerically inverts ``F_GMM(x) = Phi(z)`` per sample via Brent's
        method on the interval [-50, 50].

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in the Gaussianized (standard-normal) space.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Data approximately recovered in the original input space.
            Samples for which root-finding fails default to 0.0.
        """
        from scipy.optimize import brentq

        Xt = np.zeros_like(X, dtype=float)
        for i in range(self.n_features_):
            for j, xj in enumerate(X[:, i]):
                # Map z -> u = Phi(z) in (0, 1)
                u = stats.norm.cdf(xj)
                try:
                    # Numerically solve F_GMM(x) = u for x
                    Xt[j, i] = brentq(
                        lambda x, u=u, i=i: (
                            self._cdf(self.gmms_[i], np.array([x]))[0] - u
                        ),
                        -50,
                        50,
                    )
                except ValueError:
                    # Root not found in [-50, 50]; fall back to zero
                    Xt[j, i] = 0.0
        return Xt

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Compute log |det J| using the analytic GMM density.

        Because the Jacobian is diagonal (each feature transformed
        independently)::

            log|det J| = sum_i log|dz_i/dx_i|
                       = sum_i [log f_GMM(x_i) - log phi(z_i)]

        where ``z_i = Phi^{-1}(F_GMM(x_i))`` and ``phi`` is the
        standard-normal PDF.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points at which to evaluate the log-det-Jacobian.

        Returns
        -------
        log_det : np.ndarray of shape (n_samples,)
            Per-sample log absolute determinant.
        """
        log_det = np.zeros(X.shape[0])
        for i in range(self.n_features_):
            # Evaluate GMM CDF and clip to avoid probit boundary issues
            u = self._cdf(self.gmms_[i], X[:, i])
            u = np.clip(u, 1e-6, 1 - 1e-6)
            g = ndtri(u)  # z = Phi^{-1}(F_GMM(x))
            pdf = self._pdf(self.gmms_[i], X[:, i])  # f_GMM(x)
            log_phi = stats.norm.logpdf(g)  # log phi(z)
            # log|dz/dx| = log f_GMM(x) - log phi(z)
            log_det += np.log(np.maximum(pdf, 1e-300)) - log_phi
        return log_det
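The Brent-based inversion used by inverse_transform can be illustrated in isolation. The mixture weights, means, and standard deviations below are made up for the example, not taken from a fitted GaussianMixture:

```python
import numpy as np
from scipy import stats
from scipy.optimize import brentq

# Illustrative two-component 1-D mixture parameters
w = np.array([0.3, 0.7])
mu = np.array([-2.0, 1.0])
sd = np.array([0.5, 1.5])

def mix_cdf(x):
    # F_GMM(x) = sum_k w_k * Phi((x - mu_k) / sigma_k)
    return float(np.sum(w * stats.norm.cdf(x, loc=mu, scale=sd)))

u = 0.42
# Solve F_GMM(x) = u on [-50, 50], as inverse_transform does per sample
x_star = brentq(lambda x: mix_cdf(x) - u, -50, 50)
assert abs(mix_cdf(x_star) - u) < 1e-8
```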

fit(X)

Fit a univariate GMM to each feature dimension.

Parameters

X : np.ndarray of shape (n_samples, n_features) Training data.

Returns

self : GMMGaussianizer Fitted bijector instance.

Notes

The number of mixture components is capped at max(1, min(n_components, n_samples // 5, n_samples)) to avoid over-fitting when n_samples is small.

Source code in rbig/_src/marginal.py
def fit(self, X: np.ndarray) -> GMMGaussianizer:
    """Fit a univariate GMM to each feature dimension.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.

    Returns
    -------
    self : GMMGaussianizer
        Fitted bijector instance.

    Notes
    -----
    The number of mixture components is capped at
    ``max(1, min(n_components, n_samples // 5, n_samples))`` to avoid
    over-fitting when ``n_samples`` is small.
    """
    from sklearn.mixture import GaussianMixture

    self.gmms_ = []
    self.n_features_ = X.shape[1]
    n_samples = X.shape[0]
    # Cap n_components to avoid GMMs with more components than data points
    n_components = max(1, min(self.n_components, n_samples // 5, n_samples))
    for i in range(self.n_features_):
        gmm = GaussianMixture(
            n_components=n_components,
            random_state=self.random_state,
        )
        # Reshape to (n_samples, 1) as required by sklearn GMM API
        gmm.fit(X[:, i : i + 1])
        self.gmms_.append(gmm)
    return self

get_log_det_jacobian(X)

Compute log |det J| using the analytic GMM density.

Because the Jacobian is diagonal (each feature transformed independently)::

log|det J| = sum_i log|dz_i/dx_i|
           = sum_i [log f_GMM(x_i) - log phi(z_i)]

where z_i = Phi^{-1}(F_GMM(x_i)) and phi is the standard-normal PDF.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input points at which to evaluate the log-det-Jacobian.

Returns

log_det : np.ndarray of shape (n_samples,) Per-sample log absolute determinant.

Source code in rbig/_src/marginal.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Compute log |det J| using the analytic GMM density.

    Because the Jacobian is diagonal (each feature transformed
    independently)::

        log|det J| = sum_i log|dz_i/dx_i|
                   = sum_i [log f_GMM(x_i) - log phi(z_i)]

    where ``z_i = Phi^{-1}(F_GMM(x_i))`` and ``phi`` is the
    standard-normal PDF.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points at which to evaluate the log-det-Jacobian.

    Returns
    -------
    log_det : np.ndarray of shape (n_samples,)
        Per-sample log absolute determinant.
    """
    log_det = np.zeros(X.shape[0])
    for i in range(self.n_features_):
        # Evaluate GMM CDF and clip to avoid probit boundary issues
        u = self._cdf(self.gmms_[i], X[:, i])
        u = np.clip(u, 1e-6, 1 - 1e-6)
        g = ndtri(u)  # z = Phi^{-1}(F_GMM(x))
        pdf = self._pdf(self.gmms_[i], X[:, i])  # f_GMM(x)
        log_phi = stats.norm.logpdf(g)  # log phi(z)
        # log|dz/dx| = log f_GMM(x) - log phi(z)
        log_det += np.log(np.maximum(pdf, 1e-300)) - log_phi
    return log_det
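As a sanity check of the identity `log|dz/dx| = log f(x) - log phi(z)`, consider a single-component "GMM" (an ordinary normal), for which everything is closed-form: the transform reduces to `z = (x - mu) / sigma`, so the log-derivative is exactly `log(1/sigma)`. A minimal sketch, assuming only numpy and scipy:

```python
import numpy as np
from scipy import stats
from scipy.special import ndtri

# Single-component "GMM" N(2, 0.5^2): z = Phi^{-1}(F(x)) = (x - 2) / 0.5,
# so log|dz/dx| = log f(x) - log phi(z) should equal log(1/0.5) everywhere.
mu, sigma = 2.0, 0.5
x = np.linspace(0.0, 4.0, 9)
u = np.clip(stats.norm.cdf(x, loc=mu, scale=sigma), 1e-12, 1 - 1e-12)
z = ndtri(u)
log_det = stats.norm.logpdf(x, loc=mu, scale=sigma) - stats.norm.logpdf(z)
print(np.allclose(log_det, np.log(1.0 / sigma)))  # True
```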

inverse_transform(X)

Map standard-normal values back to the original space.

Numerically inverts F_GMM(x) = Phi(z) per sample via Brent's method on the interval [-50, 50].

Parameters

X : np.ndarray of shape (n_samples, n_features) Data in the Gaussianized (standard-normal) space.

Returns

Xt : np.ndarray of shape (n_samples, n_features) Data approximately recovered in the original input space. Samples for which root-finding fails default to 0.0.

Source code in rbig/_src/marginal.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Map standard-normal values back to the original space.

    Numerically inverts ``F_GMM(x) = Phi(z)`` per sample via Brent's
    method on the interval [-50, 50].

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in the Gaussianized (standard-normal) space.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Data approximately recovered in the original input space.
        Samples for which root-finding fails default to 0.0.
    """
    from scipy.optimize import brentq

    Xt = np.zeros_like(X, dtype=float)
    for i in range(self.n_features_):
        for j, xj in enumerate(X[:, i]):
            # Map z -> u = Phi(z) in (0, 1)
            u = stats.norm.cdf(xj)
            try:
                # Numerically solve F_GMM(x) = u for x
                Xt[j, i] = brentq(
                    lambda x, u=u, i=i: (
                        self._cdf(self.gmms_[i], np.array([x]))[0] - u
                    ),
                    -50,
                    50,
                )
            except ValueError:
                # Root not found in [-50, 50]; fall back to zero
                Xt[j, i] = 0.0
    return Xt
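The per-sample root-find above can be illustrated on its own with a plain normal CDF standing in for the GMM CDF (the parameter values here are arbitrary):

```python
import numpy as np
from scipy import stats
from scipy.optimize import brentq

# Invert F(x) = u for a normal CDF on [-50, 50], mirroring the per-sample
# Brent root-finds in inverse_transform (mu, sigma, u are arbitrary).
mu, sigma, u = 1.0, 2.0, 0.8
x_root = brentq(lambda x: stats.norm.cdf(x, loc=mu, scale=sigma) - u, -50, 50)
# Cross-check against the closed-form quantile function
print(np.isclose(x_root, stats.norm.ppf(u, loc=mu, scale=sigma)))  # True
```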

transform(X)

Map each feature to N(0, 1) via GMM CDF then probit.

Applies z = Phi^{-1}(F_GMM(x)) to each feature independently.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data in the original space.

Returns

Xt : np.ndarray of shape (n_samples, n_features) Gaussianized data with approximately standard-normal marginals.

Source code in rbig/_src/marginal.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Map each feature to N(0, 1) via GMM CDF then probit.

    Applies ``z = Phi^{-1}(F_GMM(x))`` to each feature independently.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data in the original space.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Gaussianized data with approximately standard-normal marginals.
    """
    Xt = np.zeros_like(X, dtype=float)
    for i in range(self.n_features_):
        # Evaluate analytic GMM CDF
        u = self._cdf(self.gmms_[i], X[:, i])
        # Clip away from boundaries before probit
        u = np.clip(u, 1e-6, 1 - 1e-6)
        Xt[:, i] = ndtri(u)
    return Xt
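The CDF-then-probit recipe is not specific to GMMs. A sketch with a known marginal (Exp(1)) substituted for the fitted GMM shows the Gaussianizing effect directly:

```python
import numpy as np
from scipy import stats
from scipy.special import ndtri

# Push Exp(1) samples through z = Phi^{-1}(F(x)) using the true exponential
# CDF in place of a fitted GMM; the result is approximately N(0, 1).
rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=200_000)
u = np.clip(stats.expon.cdf(x), 1e-6, 1 - 1e-6)
z = ndtri(u)
print(z.mean(), z.std())  # both close to the standard-normal values 0 and 1
```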

GaussianRandomProjection

Bases: RotationBijector

Johnson-Lindenstrauss style random projection with Gaussian entries.

Constructs a random projection matrix M in R^{D x K} whose entries are drawn i.i.d. from N(0, 1/K)::

M_ij ~ N(0, 1/K)

The 1/K normalisation approximately preserves pairwise Euclidean distances (Johnson-Lindenstrauss lemma)::

(1 - eps)||x - y||^2 <= ||Mx - My||^2 <= (1 + eps)||x - y||^2

with high probability when K = O(eps^{-2} log n).

Parameters

n_components : int or None, default None Output dimensionality K. If None, K = D (square case). random_state : int or None, default None Seed for reproducible matrix generation.

Attributes

matrix_ : np.ndarray of shape (n_features, n_components) The random projection matrix with entries ~ N(0, 1/K).

Notes

Unlike :class:RandomOrthogonalProjection, the columns of this matrix are not orthogonal, so |det M| != 1 in general. :meth:get_log_det_jacobian returns zeros as an approximation. For density estimation where accuracy matters, prefer :class:RandomOrthogonalProjection or :class:RandomRotation.

The inverse uses the Moore-Penrose pseudoinverse computed by :func:numpy.linalg.pinv.

References

Johnson, W. B., & Lindenstrauss, J. (1984). Extensions of Lipschitz mappings into a Hilbert space. Contemporary Mathematics, 26, 189-206.

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Examples

>>> import numpy as np
>>> from rbig._src.rotation import GaussianRandomProjection
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((100, 4))
>>> grp = GaussianRandomProjection(random_state=0).fit(X)
>>> Z = grp.transform(X)
>>> Z.shape
(100, 4)
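The 1/K variance scaling can also be checked numerically: with M_ij ~ N(0, 1/K), squared distances are preserved in expectation and concentrate around their true values as K grows. A standalone sketch (dimensions chosen arbitrarily for illustration):

```python
import numpy as np

# With M_ij ~ N(0, 1/K), E||Mv||^2 = ||v||^2, so pairwise squared distances
# are preserved in expectation; K is large here to make concentration visible.
rng = np.random.default_rng(0)
D, K = 50, 2000
x, y = rng.standard_normal(D), rng.standard_normal(D)
M = rng.standard_normal((D, K)) / np.sqrt(K)
ratio = np.linalg.norm((x - y) @ M) ** 2 / np.linalg.norm(x - y) ** 2
print(ratio)  # close to 1; relative deviation is of order sqrt(2/K)
```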

Source code in rbig/_src/rotation.py
class GaussianRandomProjection(RotationBijector):
    """Johnson-Lindenstrauss style random projection with Gaussian entries.

    Constructs a random projection matrix M in R^{D x K} whose entries are
    drawn i.i.d. from N(0, 1/K)::

        M_ij ~ N(0, 1/K)

    The 1/K normalisation approximately preserves pairwise Euclidean
    distances (Johnson-Lindenstrauss lemma)::

        (1 - eps)||x - y||^2 <= ||Mx - My||^2 <= (1 + eps)||x - y||^2

    with high probability when K = O(eps^{-2} log n).

    Parameters
    ----------
    n_components : int or None, default None
        Output dimensionality K.  If ``None``, K = D (square case).
    random_state : int or None, default None
        Seed for reproducible matrix generation.

    Attributes
    ----------
    matrix_ : np.ndarray of shape (n_features, n_components)
        The random projection matrix with entries ~ N(0, 1/K).

    Notes
    -----
    Unlike :class:`RandomOrthogonalProjection`, the columns of this matrix
    are *not* orthogonal, so ``|det M| != 1`` in general.
    :meth:`get_log_det_jacobian` returns zeros as an approximation.
    For density estimation where accuracy matters, prefer
    :class:`RandomOrthogonalProjection` or :class:`RandomRotation`.

    The inverse uses the Moore-Penrose pseudoinverse computed by
    :func:`numpy.linalg.pinv`.

    References
    ----------
    Johnson, W. B., & Lindenstrauss, J. (1984). Extensions of Lipschitz
    mappings into a Hilbert space. *Contemporary Mathematics*, 26, 189-206.

    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.rotation import GaussianRandomProjection
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((100, 4))
    >>> grp = GaussianRandomProjection(random_state=0).fit(X)
    >>> Z = grp.transform(X)
    >>> Z.shape
    (100, 4)
    """

    def __init__(
        self,
        n_components: int | None = None,
        random_state: int | None = None,
    ):
        self.n_components = n_components
        self.random_state = random_state

    def fit(self, X: np.ndarray) -> GaussianRandomProjection:
        """Build the Gaussian random projection matrix.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data (used only to infer D = n_features).

        Returns
        -------
        self : GaussianRandomProjection
            Fitted transform instance.
        """
        rng = np.random.default_rng(self.random_state)
        D = X.shape[1]
        K = self.n_components if self.n_components is not None else D
        # Entries drawn from N(0, 1), then scaled by 1/sqrt(K) for distance preservation
        self.matrix_ = rng.standard_normal((D, K)) / np.sqrt(K)  # (D, K)
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Project X: z = X M.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data.

        Returns
        -------
        Z : np.ndarray of shape (n_samples, n_components)
            Projected data.
        """
        return X @ self.matrix_  # (N, D) @ (D, K) -> (N, K)

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Approximate inverse via the Moore-Penrose pseudoinverse.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_components)
            Projected data.

        Returns
        -------
        Xr : np.ndarray of shape (n_samples, n_features)
            Approximately recovered original data.
        """
        # Pseudoinverse: M^+ = (M^T M)^{-1} M^T; here we use pinv(M).T
        return X @ np.linalg.pinv(self.matrix_).T  # (N, K) @ (K, D) -> (N, D)

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Return zeros (approximation; Gaussian projections are not isometric).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points (used only to determine n_samples).

        Returns
        -------
        ldj : np.ndarray of shape (n_samples,)
            Zeros (approximate; the true log-det is generally non-zero).
        """
        return np.zeros(X.shape[0])

fit(X)

Build the Gaussian random projection matrix.

Parameters

X : np.ndarray of shape (n_samples, n_features) Training data (used only to infer D = n_features).

Returns

self : GaussianRandomProjection Fitted transform instance.

Source code in rbig/_src/rotation.py
def fit(self, X: np.ndarray) -> GaussianRandomProjection:
    """Build the Gaussian random projection matrix.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data (used only to infer D = n_features).

    Returns
    -------
    self : GaussianRandomProjection
        Fitted transform instance.
    """
    rng = np.random.default_rng(self.random_state)
    D = X.shape[1]
    K = self.n_components if self.n_components is not None else D
    # Entries drawn from N(0, 1), then scaled by 1/sqrt(K) for distance preservation
    self.matrix_ = rng.standard_normal((D, K)) / np.sqrt(K)  # (D, K)
    return self

get_log_det_jacobian(X)

Return zeros (approximation; Gaussian projections are not isometric).

Parameters

X : np.ndarray of shape (n_samples, n_features) Input points (used only to determine n_samples).

Returns

ldj : np.ndarray of shape (n_samples,) Zeros (approximate; the true log-det is generally non-zero).

Source code in rbig/_src/rotation.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Return zeros (approximation; Gaussian projections are not isometric).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points (used only to determine n_samples).

    Returns
    -------
    ldj : np.ndarray of shape (n_samples,)
        Zeros (approximate; the true log-det is generally non-zero).
    """
    return np.zeros(X.shape[0])

inverse_transform(X)

Approximate inverse via the Moore-Penrose pseudoinverse.

Parameters

X : np.ndarray of shape (n_samples, n_components) Projected data.

Returns

Xr : np.ndarray of shape (n_samples, n_features) Approximately recovered original data.

Source code in rbig/_src/rotation.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Approximate inverse via the Moore-Penrose pseudoinverse.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_components)
        Projected data.

    Returns
    -------
    Xr : np.ndarray of shape (n_samples, n_features)
        Approximately recovered original data.
    """
    # Pseudoinverse: M^+ = (M^T M)^{-1} M^T; here we use pinv(M).T
    return X @ np.linalg.pinv(self.matrix_).T  # (N, K) @ (K, D) -> (N, D)

transform(X)

Project X: z = X M.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data.

Returns

Z : np.ndarray of shape (n_samples, n_components) Projected data.

Source code in rbig/_src/rotation.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Project X: z = X M.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data.

    Returns
    -------
    Z : np.ndarray of shape (n_samples, n_components)
        Projected data.
    """
    return X @ self.matrix_  # (N, D) @ (D, K) -> (N, K)

HartleyRotation

Bases: ImageBijector

Discrete Hartley Transform — real-to-real orthonormal rotation.

The 2-D Discrete Hartley Transform (DHT) is defined as

.. math::

H(\mathbf{x}) = \operatorname{Re}\bigl(\text{FFT}(\mathbf{x})\bigr)
                - \operatorname{Im}\bigl(\text{FFT}(\mathbf{x})\bigr)

and is normalised by 1/√(H·W) to make it orthonormal (unitary). Because the DHT is its own inverse (self-inverse), the same operation is applied in both :meth:transform and :meth:inverse_transform.

Since the transform is orthonormal, log|det J| = 0 for all inputs.

Parameters

C : int, default 1 Number of image channels. H : int, default 8 Image height in pixels. W : int, default 8 Image width in pixels.

Attributes

C_ : int Fitted number of channels. H_ : int Fitted image height. W_ : int Fitted image width.

Notes

The normalised DHT satisfies

.. math::

H(H(\mathbf{x})) = \mathbf{x}

making it a self-inverse bijection. The scaling factor is :math:1 / \sqrt{H \cdot W}.

Examples

>>> import numpy as np
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((5, 64))  # N=5, C=1, H=8, W=8
>>> layer = HartleyRotation(C=1, H=8, W=8)
>>> layer.fit(X)  # doctest: +ELLIPSIS
HartleyRotation(...)
>>> Xt = layer.transform(X)
>>> Xr = layer.inverse_transform(Xt)
>>> np.allclose(X, Xr, atol=1e-10)
True
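The Re(FFT) - Im(FFT) construction used below is equivalent to the direct cas-kernel definition of the 2-D DHT (with cas t = cos t + sin t); a brute-force check on a small array makes the identity concrete:

```python
import numpy as np
from scipy.fft import fft2

# For F[a,b] = sum_{m,n} x[m,n] exp(-2*pi*i*(a*m/H + b*n/W)), we have
# Re(F) - Im(F) = sum_{m,n} x[m,n] * cas(2*pi*(a*m/H + b*n/W)).
rng = np.random.default_rng(0)
Hh, Ww = 4, 4
x = rng.standard_normal((Hh, Ww))
via_fft = fft2(x).real - fft2(x).imag

m, n = np.meshgrid(np.arange(Hh), np.arange(Ww), indexing="ij")
direct = np.zeros((Hh, Ww))
for a in range(Hh):
    for b in range(Ww):
        t = 2 * np.pi * (a * m / Hh + b * n / Ww)
        direct[a, b] = np.sum(x * (np.cos(t) + np.sin(t)))

print(np.allclose(via_fft, direct))  # True
```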

Source code in rbig/_src/image.py
class HartleyRotation(ImageBijector):
    """Discrete Hartley Transform — real-to-real orthonormal rotation.

    The 2-D Discrete Hartley Transform (DHT) is defined as

    .. math::

        H(\\mathbf{x}) = \\operatorname{Re}\\bigl(\\text{FFT}(\\mathbf{x})\\bigr)
                        - \\operatorname{Im}\\bigl(\\text{FFT}(\\mathbf{x})\\bigr)

    and is normalised by ``1/√(H·W)`` to make it orthonormal (unitary).
    Because the DHT is its own inverse (self-inverse), the same operation is
    applied in both :meth:`transform` and :meth:`inverse_transform`.

    Since the transform is orthonormal, ``log|det J| = 0`` for all inputs.

    Parameters
    ----------
    C : int, default 1
        Number of image channels.
    H : int, default 8
        Image height in pixels.
    W : int, default 8
        Image width in pixels.

    Attributes
    ----------
    C_ : int
        Fitted number of channels.
    H_ : int
        Fitted image height.
    W_ : int
        Fitted image width.

    Notes
    -----
    The normalised DHT satisfies

    .. math::

        H(H(\\mathbf{x})) = \\mathbf{x}

    making it a self-inverse bijection.  The scaling factor is
    :math:`1 / \\sqrt{H \\cdot W}`.

    Examples
    --------
    >>> import numpy as np
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((5, 64))  # N=5, C=1, H=8, W=8
    >>> layer = HartleyRotation(C=1, H=8, W=8)
    >>> layer.fit(X)  # doctest: +ELLIPSIS
    HartleyRotation(...)
    >>> Xt = layer.transform(X)
    >>> Xr = layer.inverse_transform(Xt)
    >>> np.allclose(X, Xr, atol=1e-10)
    True
    """

    def __init__(self, C: int = 1, H: int = 8, W: int = 8):
        self.C = C
        self.H = H
        self.W = W

    def fit(self, X: np.ndarray) -> HartleyRotation:
        """Store spatial dimensions; no data-dependent fitting required.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, C*H*W)``

        Returns
        -------
        self : HartleyRotation
        """
        self.C_ = self.C
        self.H_ = self.H
        self.W_ = self.W
        return self

    def _dht2(self, x: np.ndarray) -> np.ndarray:
        """Compute un-normalised 2-D Discrete Hartley Transform.

        .. math::

            H_{m,n} = \\operatorname{Re}(F_{m,n}) - \\operatorname{Im}(F_{m,n})

        where :math:`F = \\text{FFT2}(x)`.

        Parameters
        ----------
        x : np.ndarray, shape ``(H, W)``
            Single real-valued image channel.

        Returns
        -------
        h : np.ndarray, shape ``(H, W)``
            Un-normalised DHT coefficients.
        """
        from scipy.fft import fft2

        X_fft = fft2(x)  # complex FFT2, shape (H, W)
        # DHT = Re(FFT) - Im(FFT)
        return X_fft.real - X_fft.imag

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply normalised 2-D DHT to every image channel.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, C*H*W)``
            Flattened image batch.

        Returns
        -------
        Xt : np.ndarray, shape ``(N, C*H*W)``
            DHT coefficients scaled by ``1/√(H*W)``.
        """
        N = X.shape[0]
        imgs = self._to_tensor(X)  # (N, C, H, W)
        result = np.zeros_like(imgs)
        for n in range(N):
            for c in range(self.C_):
                # Normalise by 1/√(H·W) to make the transform orthonormal
                result[n, c] = self._dht2(imgs[n, c]) / np.sqrt(self.H_ * self.W_)
        return self._to_flat(result)

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the inverse DHT (identical to the forward transform).

        The normalised DHT satisfies ``H(H(x)) = x``, so the forward and
        inverse transforms are the same function.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, C*H*W)``

        Returns
        -------
        Xr : np.ndarray, shape ``(N, C*H*W)``
        """
        # DHT is self-inverse (up to scale factor already applied in transform)
        return self.transform(X)

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Return zeros: normalised DHT has ``|det J| = 1``.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, D)``

        Returns
        -------
        log_det : np.ndarray, shape ``(N,)``
        """
        return np.zeros(X.shape[0])

fit(X)

Store spatial dimensions; no data-dependent fitting required.

Parameters

X : np.ndarray, shape (N, C*H*W)

Returns

self : HartleyRotation

Source code in rbig/_src/image.py
def fit(self, X: np.ndarray) -> HartleyRotation:
    """Store spatial dimensions; no data-dependent fitting required.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, C*H*W)``

    Returns
    -------
    self : HartleyRotation
    """
    self.C_ = self.C
    self.H_ = self.H
    self.W_ = self.W
    return self

get_log_det_jacobian(X)

Return zeros: normalised DHT has |det J| = 1.

Parameters

X : np.ndarray, shape (N, D)

Returns

log_det : np.ndarray, shape (N,)

Source code in rbig/_src/image.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Return zeros: normalised DHT has ``|det J| = 1``.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, D)``

    Returns
    -------
    log_det : np.ndarray, shape ``(N,)``
    """
    return np.zeros(X.shape[0])

inverse_transform(X)

Apply the inverse DHT (identical to the forward transform).

The normalised DHT satisfies H(H(x)) = x, so the forward and inverse transforms are the same function.

Parameters

X : np.ndarray, shape (N, C*H*W)

Returns

Xr : np.ndarray, shape (N, C*H*W)

Source code in rbig/_src/image.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the inverse DHT (identical to the forward transform).

    The normalised DHT satisfies ``H(H(x)) = x``, so the forward and
    inverse transforms are the same function.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, C*H*W)``

    Returns
    -------
    Xr : np.ndarray, shape ``(N, C*H*W)``
    """
    # DHT is self-inverse (up to scale factor already applied in transform)
    return self.transform(X)

transform(X)

Apply normalised 2-D DHT to every image channel.

Parameters

X : np.ndarray, shape (N, C*H*W) Flattened image batch.

Returns

Xt : np.ndarray, shape (N, C*H*W) DHT coefficients scaled by 1/√(H*W).

Source code in rbig/_src/image.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply normalised 2-D DHT to every image channel.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, C*H*W)``
        Flattened image batch.

    Returns
    -------
    Xt : np.ndarray, shape ``(N, C*H*W)``
        DHT coefficients scaled by ``1/√(H*W)``.
    """
    N = X.shape[0]
    imgs = self._to_tensor(X)  # (N, C, H, W)
    result = np.zeros_like(imgs)
    for n in range(N):
        for c in range(self.C_):
            # Normalise by 1/√(H·W) to make the transform orthonormal
            result[n, c] = self._dht2(imgs[n, c]) / np.sqrt(self.H_ * self.W_)
    return self._to_flat(result)

ICARotation

Bases: BaseTransform

ICA-based rotation using the Picard algorithm or FastICA fallback.

Fits an Independent Component Analysis (ICA) model that learns a linear unmixing matrix. When the optional picard package is available, it is used for faster and more accurate convergence::

s = W K x

where K in R^{K x D} is a pre-whitening matrix and W in R^{K x K} is the ICA unmixing matrix. The combined transform is W K.

If picard is not installed, :class:sklearn.decomposition.FastICA is used as a drop-in replacement.

Parameters

n_components : int or None, default None Number of independent components. If None, all D components are estimated (square unmixing matrix). random_state : int or None, default None Seed for reproducible ICA initialisation.

Attributes

K_ : np.ndarray of shape (n_components, n_features) or None Pre-whitening matrix from the Picard solver. None when using the FastICA fallback. W_ : np.ndarray of shape (n_components, n_components) or None ICA unmixing matrix from the Picard solver. None when using FastICA. ica_ : sklearn.decomposition.FastICA or None Fitted FastICA object used when Picard is unavailable. n_features_in_ : int Number of input features (set only when using Picard).

Notes

The log-absolute-Jacobian determinant is::

log|det J| = log|det(W K)|

for the Picard path, or log|det(components_)| for FastICA. The Jacobian is constant (independent of x) for any linear transform.

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Examples

>>> import numpy as np
>>> from rbig._src.rotation import ICARotation
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((200, 3))
>>> ica = ICARotation(random_state=0).fit(X)
>>> S = ica.transform(X)
>>> S.shape
(200, 3)
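The constant-Jacobian property stated in the Notes holds for any square linear map; a sketch with an arbitrary 3x3 matrix standing in for the combined unmixing W K:

```python
import numpy as np

# For s = A x with square A, log|det J| = log|det A| for every sample; A is
# an arbitrary stand-in for the combined unmixing matrix W @ K.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
X = rng.standard_normal((5, 3))
S = X @ A.T
ldj = np.full(X.shape[0], np.log(np.abs(np.linalg.det(A))))
X_back = S @ np.linalg.inv(A).T  # invert the linear map
print(np.allclose(X, X_back))  # True
print(ldj.shape)  # (5,)
```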

Source code in rbig/_src/rotation.py
class ICARotation(BaseTransform):
    """ICA-based rotation using the Picard algorithm or FastICA fallback.

    Fits an Independent Component Analysis (ICA) model that learns a linear
    unmixing matrix.  When the optional ``picard`` package is available, it
    is used for faster and more accurate convergence::

        s = W K x

    where ``K`` in R^{K x D} is a pre-whitening matrix and ``W`` in
    R^{K x K} is the ICA unmixing matrix.  The combined transform is
    ``W K``.

    If ``picard`` is not installed, :class:`sklearn.decomposition.FastICA`
    is used as a drop-in replacement.

    Parameters
    ----------
    n_components : int or None, default None
        Number of independent components.  If ``None``, all D components
        are estimated (square unmixing matrix).
    random_state : int or None, default None
        Seed for reproducible ICA initialisation.

    Attributes
    ----------
    K_ : np.ndarray of shape (n_components, n_features) or None
        Pre-whitening matrix from the Picard solver.  ``None`` when using
        the FastICA fallback.
    W_ : np.ndarray of shape (n_components, n_components) or None
        ICA unmixing matrix from the Picard solver.  ``None`` when using
        FastICA.
    ica_ : sklearn.decomposition.FastICA or None
        Fitted FastICA object used when Picard is unavailable.
    n_features_in_ : int
        Number of input features (set only when using Picard).

    Notes
    -----
    The log-absolute-Jacobian determinant is::

        log|det J| = log|det(W K)|

    for the Picard path, or ``log|det(components_)|`` for FastICA.  The
    Jacobian is constant (independent of x) for any linear transform.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.rotation import ICARotation
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((200, 3))
    >>> ica = ICARotation(random_state=0).fit(X)
    >>> S = ica.transform(X)
    >>> S.shape
    (200, 3)
    """

    def __init__(
        self, n_components: int | None = None, random_state: int | None = None
    ):
        self.n_components = n_components
        self.random_state = random_state

    def fit(self, X: np.ndarray) -> ICARotation:
        """Fit the ICA model (Picard if available, otherwise FastICA).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.

        Returns
        -------
        self : ICARotation
            Fitted transform instance.
        """
        try:
            from picard import picard

            n = X.shape[1] if self.n_components is None else self.n_components
            # Picard expects data as (n_features, n_samples), so transpose X
            K, W, _ = picard(
                X.T,
                n_components=n,
                random_state=self.random_state,
                max_iter=500,
                tol=1e-5,
            )
            self.K_ = K  # whitening matrix, shape (K, D)
            self.W_ = W  # unmixing matrix, shape (K, K)
            self.n_features_in_ = X.shape[1]
        except ImportError:
            from sklearn.decomposition import FastICA

            # Fall back to FastICA when picard is not installed
            self.ica_ = FastICA(
                n_components=self.n_components,
                random_state=self.random_state,
                max_iter=500,
            )
            self.ica_.fit(X)
            self.K_ = None  # signals that FastICA path is active
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the ICA unmixing to X: s = W K x.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data in the original (mixed) space.

        Returns
        -------
        S : np.ndarray of shape (n_samples, n_components)
            Estimated independent components.
        """
        if self.K_ is None:
            # FastICA path: uses sklearn's built-in transform
            return self.ica_.transform(X)
        # Picard path: first whiten X with K, then unmix with W
        Xw = X @ self.K_.T  # (N, D) @ (D, K) -> (N, K)  whitening step
        return Xw @ self.W_.T  # (N, K) @ (K, K) -> (N, K)  unmixing step

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Invert the ICA unmixing: x = (W K)^{-1} s.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_components)
            Independent-component representation.

        Returns
        -------
        Xr : np.ndarray of shape (n_samples, n_features)
            Data approximately recovered in the original mixed space.
        """
        if self.K_ is None:
            return self.ica_.inverse_transform(X)
        # Invert unmixing W then whitening K using pseudo-inverses
        Xw = X @ np.linalg.pinv(self.W_).T  # (N, K) -> (N, K)
        return Xw @ np.linalg.pinv(self.K_).T  # (N, K) -> (N, D)

    def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Log absolute Jacobian determinant (constant for linear transforms).

        Computes ``log|det(W K)|`` (Picard path) or
        ``log|det(components_)|`` (FastICA path).  The result is replicated
        for every sample since the Jacobian of a linear transform is constant.

        .. note::
            This method is only valid when the unmixing matrix is square (i.e.
            ``n_components`` is ``None`` or equals the number of input
            features).  A non-square unmixing matrix is not bijective and its
            Jacobian determinant is undefined.  A :exc:`ValueError` is raised
            in that case.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points (used only to determine the number of samples).

        Returns
        -------
        ldj : np.ndarray of shape (n_samples,)
            Constant per-sample log absolute Jacobian determinant.

        Raises
        ------
        ValueError
            If the unmixing matrix is not square (``n_components != n_features``).
        """
        if self.K_ is None:
            W = self.ica_.components_  # shape (K, D) or (D, D)
            if W.shape[0] != W.shape[1]:
                raise ValueError(
                    "ICARotation.log_det_jacobian is only defined for square "
                    "unmixing matrices. Got components_ with shape "
                    f"{W.shape}. Ensure that `n_components` is None or "
                    "equals the number of features."
                )
            log_det = np.log(np.abs(np.linalg.det(W)))
        else:
            # Combined unmixing matrix: W @ K, shape (K, D)
            WK = self.W_ @ self.K_
            if WK.shape[0] != WK.shape[1]:
                raise ValueError(
                    "ICARotation.log_det_jacobian is only defined for square "
                    "unmixing matrices. Got W @ K with shape "
                    f"{WK.shape}. Ensure that `n_components` is None or "
                    "equals the number of features."
                )
            log_det = np.log(np.abs(np.linalg.det(WK)))
        return np.full(X.shape[0], log_det)

fit(X)

Fit the ICA model (Picard if available, otherwise FastICA).

Parameters

X : np.ndarray of shape (n_samples, n_features) Training data.

Returns

self : ICARotation Fitted transform instance.

Source code in rbig/_src/rotation.py
def fit(self, X: np.ndarray) -> ICARotation:
    """Fit the ICA model (Picard if available, otherwise FastICA).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.

    Returns
    -------
    self : ICARotation
        Fitted transform instance.
    """
    try:
        from picard import picard

        n = X.shape[1] if self.n_components is None else self.n_components
        # Picard expects data as (n_features, n_samples), so transpose X
        K, W, _ = picard(
            X.T,
            n_components=n,
            random_state=self.random_state,
            max_iter=500,
            tol=1e-5,
        )
        self.K_ = K  # whitening matrix, shape (K, D)
        self.W_ = W  # unmixing matrix, shape (K, K)
        self.n_features_in_ = X.shape[1]
    except ImportError:
        from sklearn.decomposition import FastICA

        # Fall back to FastICA when picard is not installed
        self.ica_ = FastICA(
            n_components=self.n_components,
            random_state=self.random_state,
            max_iter=500,
        )
        self.ica_.fit(X)
        self.K_ = None  # signals that FastICA path is active
    return self

inverse_transform(X)

Invert the ICA unmixing: x = (W K)^{-1} s.

Parameters

X : np.ndarray of shape (n_samples, n_components) Independent-component representation.

Returns

Xr : np.ndarray of shape (n_samples, n_features) Data approximately recovered in the original mixed space.

Source code in rbig/_src/rotation.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Invert the ICA unmixing: x = (W K)^{-1} s.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_components)
        Independent-component representation.

    Returns
    -------
    Xr : np.ndarray of shape (n_samples, n_features)
        Data approximately recovered in the original mixed space.
    """
    if self.K_ is None:
        return self.ica_.inverse_transform(X)
    # Invert unmixing W then whitening K using pseudo-inverses
    Xw = X @ np.linalg.pinv(self.W_).T  # (N, K) -> (N, K)
    return Xw @ np.linalg.pinv(self.K_).T  # (N, K) -> (N, D)
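The pseudo-inverse round trip the method performs can be sketched with NumPy alone (stand-in `K` and `W` matrices, not the fitted attributes):

```python
import numpy as np

rng = np.random.default_rng(1)
K = rng.standard_normal((3, 3))   # stand-in whitening matrix
W = rng.standard_normal((3, 3))   # stand-in unmixing matrix
X = rng.standard_normal((10, 3))

S = (X @ K.T) @ W.T               # forward: s = W K x, row-vector convention

# Inverse: undo the unmixing, then the whitening, via pseudo-inverses
Xw = S @ np.linalg.pinv(W).T
Xr = Xw @ np.linalg.pinv(K).T

assert np.allclose(Xr, X)         # exact recovery when both matrices are square
```

For square, full-rank matrices the pseudo-inverse equals the ordinary inverse, so the round trip is exact; with a reduced `n_components` the recovery is only a least-squares approximation.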

log_det_jacobian(X)

Log absolute Jacobian determinant (constant for linear transforms).

Computes log|det(W K)| (Picard path) or log|det(components_)| (FastICA path). The result is replicated for every sample since the Jacobian of a linear transform is constant.

.. note:: This method is only valid when the unmixing matrix is square (i.e. n_components is None or equals the number of input features). A non-square unmixing matrix is not bijective and its Jacobian determinant is undefined. A :exc:ValueError is raised in that case.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input points (used only to determine the number of samples).

Returns

ldj : np.ndarray of shape (n_samples,) Constant per-sample log absolute Jacobian determinant.

Raises

ValueError If the unmixing matrix is not square (n_components != n_features).

Source code in rbig/_src/rotation.py
def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Log absolute Jacobian determinant (constant for linear transforms).

    Computes ``log|det(W K)|`` (Picard path) or
    ``log|det(components_)|`` (FastICA path).  The result is replicated
    for every sample since the Jacobian of a linear transform is constant.

    .. note::
        This method is only valid when the unmixing matrix is square (i.e.
        ``n_components`` is ``None`` or equals the number of input
        features).  A non-square unmixing matrix is not bijective and its
        Jacobian determinant is undefined.  A :exc:`ValueError` is raised
        in that case.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points (used only to determine the number of samples).

    Returns
    -------
    ldj : np.ndarray of shape (n_samples,)
        Constant per-sample log absolute Jacobian determinant.

    Raises
    ------
    ValueError
        If the unmixing matrix is not square (``n_components != n_features``).
    """
    if self.K_ is None:
        W = self.ica_.components_  # shape (K, D) or (D, D)
        if W.shape[0] != W.shape[1]:
            raise ValueError(
                "ICARotation.log_det_jacobian is only defined for square "
                "unmixing matrices. Got components_ with shape "
                f"{W.shape}. Ensure that `n_components` is None or "
                "equals the number of features."
            )
        log_det = np.log(np.abs(np.linalg.det(W)))
    else:
        # Combined unmixing matrix: W @ K, shape (K, D)
        WK = self.W_ @ self.K_
        if WK.shape[0] != WK.shape[1]:
            raise ValueError(
                "ICARotation.log_det_jacobian is only defined for square "
                "unmixing matrices. Got W @ K with shape "
                f"{WK.shape}. Ensure that `n_components` is None or "
                "equals the number of features."
            )
        log_det = np.log(np.abs(np.linalg.det(WK)))
    return np.full(X.shape[0], log_det)
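A note on the determinant computation: `np.linalg.slogdet` returns the same log-absolute-determinant as the `log(abs(det(...)))` pattern above, but without intermediate overflow or underflow for large or ill-scaled matrices. A quick check with stand-in matrices:

```python
import numpy as np

rng = np.random.default_rng(2)
W = rng.standard_normal((4, 4))   # stand-in unmixing matrix
K = rng.standard_normal((4, 4))   # stand-in whitening matrix
WK = W @ K

# slogdet returns (sign, log|det|); the log term matches log(abs(det))
sign, logabsdet = np.linalg.slogdet(WK)
assert np.isclose(logabsdet, np.log(np.abs(np.linalg.det(WK))))
```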

transform(X)

Apply the ICA unmixing to X: s = W K x.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data in the original (mixed) space.

Returns

S : np.ndarray of shape (n_samples, n_components) Estimated independent components.

Source code in rbig/_src/rotation.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the ICA unmixing to X: s = W K x.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data in the original (mixed) space.

    Returns
    -------
    S : np.ndarray of shape (n_samples, n_components)
        Estimated independent components.
    """
    if self.K_ is None:
        # FastICA path: uses sklearn's built-in transform
        return self.ica_.transform(X)
    # Picard path: first whiten X with K, then unmix with W
    Xw = X @ self.K_.T  # (N, D) @ (D, K) -> (N, K)  whitening step
    return Xw @ self.W_.T  # (N, K) @ (K, K) -> (N, K)  unmixing step
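The FastICA fallback path reduces to a single affine map. A self-contained sketch (scikit-learn only, not `ICARotation` itself) showing that `transform` is `(X - mean_) @ components_.T` and that the resulting log-det-Jacobian is one constant replicated per sample:

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
# Mix two independent, non-Gaussian (uniform) sources
S = rng.uniform(-1, 1, size=(500, 2))
A = np.array([[1.0, 0.5], [0.3, 1.0]])
X = S @ A.T

ica = FastICA(n_components=2, random_state=0, max_iter=500)
ica.fit(X)

# FastICA.transform centers the data, then applies the unmixing matrix
S_est = ica.transform(X)
manual = (X - ica.mean_) @ ica.components_.T
assert np.allclose(S_est, manual)

# The Jacobian of a linear map is constant, so the per-sample
# log|det J| is one number broadcast to every row
log_det = np.log(np.abs(np.linalg.det(ica.components_)))
ldj = np.full(X.shape[0], log_det)
```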

ImageBijector

Bases: Bijector

Abstract base class for bijective image transforms.

Manages the conversion between the flattened representation (N, C·H·W) expected by RBIG and the 4-D tensor (N, C, H, W) used internally by spatial transforms.

Subclasses must implement :meth:fit, :meth:transform, and :meth:inverse_transform. The default :meth:get_log_det_jacobian returns zeros, which is correct for all orthonormal transforms defined in this module (|det J| = 1).

Attributes

C_ : int Number of channels (set during :meth:fit). H_ : int Image height in pixels (set during :meth:fit). W_ : int Image width in pixels (set during :meth:fit).

Notes

The two helper methods implement:

.. math::

\text{_to_tensor}: (N, C \cdot H \cdot W)
    \longrightarrow (N, C, H, W)

\text{_to_flat}: (N, C, H, W)
    \longrightarrow (N, C \cdot H \cdot W)
Source code in rbig/_src/image.py
class ImageBijector(Bijector):
    """Abstract base class for bijective image transforms.

    Manages the conversion between the flattened representation
    ``(N, C·H·W)`` expected by RBIG and the 4-D tensor ``(N, C, H, W)``
    used internally by spatial transforms.

    Subclasses must implement :meth:`fit`, :meth:`transform`, and
    :meth:`inverse_transform`.  The default :meth:`get_log_det_jacobian`
    returns zeros, which is correct for all orthonormal transforms defined
    in this module (``|det J| = 1``).

    Attributes
    ----------
    C_ : int
        Number of channels (set during :meth:`fit`).
    H_ : int
        Image height in pixels (set during :meth:`fit`).
    W_ : int
        Image width in pixels (set during :meth:`fit`).

    Notes
    -----
    The two helper methods implement:

    .. math::

        \\text{_to_tensor}: (N, C \\cdot H \\cdot W)
            \\longrightarrow (N, C, H, W)

        \\text{_to_flat}: (N, C, H, W)
            \\longrightarrow (N, C \\cdot H \\cdot W)
    """

    def _to_tensor(self, X: np.ndarray) -> np.ndarray:
        """Reshape flat ``(N, C*H*W)`` array to tensor ``(N, C, H, W)``.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, C*H*W)``
            Flattened image batch.

        Returns
        -------
        tensor : np.ndarray, shape ``(N, C, H, W)``
        """
        N = X.shape[0]
        C, H, W = self.C_, self.H_, self.W_
        return X.reshape(N, C, H, W)

    def _to_flat(self, X: np.ndarray) -> np.ndarray:
        """Reshape tensor ``(N, C, H, W)`` to flat ``(N, C*H*W)``.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, C, H, W)``
            Image tensor batch.

        Returns
        -------
        flat : np.ndarray, shape ``(N, C*H*W)``
        """
        N = X.shape[0]
        return X.reshape(N, -1)

    def fit(self, X: np.ndarray) -> ImageBijector:
        raise NotImplementedError

    def transform(self, X: np.ndarray) -> np.ndarray:
        raise NotImplementedError

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        raise NotImplementedError

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Return per-sample log |det J| = 0 (orthonormal transform).

        Parameters
        ----------
        X : np.ndarray, shape ``(N, D)``

        Returns
        -------
        log_det : np.ndarray, shape ``(N,)``
            All-zero array because the Jacobian determinant is ±1 for every
            orthonormal linear map.
        """
        return np.zeros(X.shape[0])
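The `_to_tensor` / `_to_flat` helpers are plain `reshape` calls and are exact inverses of one another, since `reshape` preserves row-major element order. A minimal NumPy sketch of the round trip:

```python
import numpy as np

N, C, H, W = 5, 3, 8, 8
flat = np.arange(N * C * H * W, dtype=float).reshape(N, -1)

tensor = flat.reshape(N, C, H, W)   # _to_tensor: (N, C*H*W) -> (N, C, H, W)
back = tensor.reshape(N, -1)        # _to_flat:   (N, C, H, W) -> (N, C*H*W)

assert np.array_equal(back, flat)   # lossless round trip
```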

get_log_det_jacobian(X)

Return per-sample log |det J| = 0 (orthonormal transform).

Parameters

X : np.ndarray, shape (N, D)

Returns

log_det : np.ndarray, shape (N,) All-zero array because the Jacobian determinant is ±1 for every orthonormal linear map.

Source code in rbig/_src/image.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Return per-sample log |det J| = 0 (orthonormal transform).

    Parameters
    ----------
    X : np.ndarray, shape ``(N, D)``

    Returns
    -------
    log_det : np.ndarray, shape ``(N,)``
        All-zero array because the Jacobian determinant is ±1 for every
        orthonormal linear map.
    """
    return np.zeros(X.shape[0])

ImageRBIG

RBIG orchestrator for image data.

Alternates between marginal Gaussianisation and an orthonormal spatial rotation for n_layers steps, progressively pushing the joint distribution of image pixels towards a multivariate Gaussian.

Each layer applies:

  1. :class:~rbig._src.marginal.MarginalGaussianize — maps every feature dimension to a standard normal marginal distribution.
  2. An orthonormal rotation selected by strategy — decorrelates features without changing the differential entropy.

Parameters

n_layers : int, default 10 Number of (marginal + rotation) layer pairs to apply. C : int, default 1 Number of image channels passed to the rotation layers. H : int, default 8 Image height in pixels passed to the rotation layers. W : int, default 8 Image width in pixels passed to the rotation layers. strategy : str, default "dct" Rotation strategy. One of:

* ``"dct"`` — Type-II orthonormal DCT (:class:`DCTRotation`).
* ``"hartley"`` — Discrete Hartley Transform
  (:class:`HartleyRotation`).
* ``"random_channel"`` — Random orthogonal channel mixing
  (:class:`RandomChannelRotation`).

Any unknown string falls back to ``"dct"``.

random_state : int or None, default None Base seed for rotation layers that use randomness (random_channel). Layer i uses seed random_state + i.

Attributes

layers_ : list of tuple (MarginalGaussianize, ImageBijector) Fitted (marginal, rotation) pairs in application order. X_transformed_ : np.ndarray, shape (N, C*H*W) Final transformed representation after the last layer.

Notes

The composed forward transform for a single sample :math:\mathbf{x} is

.. math::

\mathbf{z} = (R_L \circ G_L \circ \cdots \circ R_1 \circ G_1)(\mathbf{x})

where :math:G_\ell is marginal Gaussianisation and :math:R_\ell is an orthonormal rotation at layer :math:\ell. Because each rotation is orthonormal, the total log-determinant is determined entirely by the marginal transforms.
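The zero-log-determinant property asserted here can be checked directly: every orthogonal matrix satisfies |det Q| = 1, so log|det Q| = 0. A small sketch using `scipy.stats.ortho_group` as a stand-in for the module's rotation layers:

```python
import numpy as np
from scipy.stats import ortho_group

# Draw a Haar-distributed 8x8 orthogonal matrix
Q = ortho_group.rvs(dim=8, random_state=0)

# Orthogonality: Q @ Q.T = I, hence |det Q| = 1 and log|det Q| = 0
assert np.allclose(Q @ Q.T, np.eye(8))
assert np.isclose(abs(np.linalg.det(Q)), 1.0)
```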

Examples

>>> import numpy as np
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((50, 64))  # 50 images, C=1, H=8, W=8
>>> model = ImageRBIG(n_layers=3, C=1, H=8, W=8, strategy="dct", random_state=0)
>>> model.fit(X)  # doctest: +ELLIPSIS
ImageRBIG(...)
>>> Xt = model.transform(X)
>>> Xt.shape
(50, 64)
>>> Xr = model.inverse_transform(Xt)
>>> Xr.shape
(50, 64)

Source code in rbig/_src/image.py
class ImageRBIG:
    """RBIG orchestrator for image data.

    Alternates between marginal Gaussianisation and an orthonormal spatial
    rotation for ``n_layers`` steps, progressively pushing the joint
    distribution of image pixels towards a multivariate Gaussian.

    Each layer applies:

    1. :class:`~rbig._src.marginal.MarginalGaussianize` — maps every feature
       dimension to a standard normal marginal distribution.
    2. An orthonormal rotation selected by ``strategy`` — decorrelates
       features without changing the differential entropy.

    Parameters
    ----------
    n_layers : int, default 10
        Number of (marginal + rotation) layer pairs to apply.
    C : int, default 1
        Number of image channels passed to the rotation layers.
    H : int, default 8
        Image height in pixels passed to the rotation layers.
    W : int, default 8
        Image width in pixels passed to the rotation layers.
    strategy : str, default ``"dct"``
        Rotation strategy.  One of:

        * ``"dct"`` — Type-II orthonormal DCT (:class:`DCTRotation`).
        * ``"hartley"`` — Discrete Hartley Transform
          (:class:`HartleyRotation`).
        * ``"random_channel"`` — Random orthogonal channel mixing
          (:class:`RandomChannelRotation`).

        Any unknown string falls back to ``"dct"``.
    random_state : int or None, default None
        Base seed for rotation layers that use randomness (``random_channel``).
        Layer ``i`` uses seed ``random_state + i``.

    Attributes
    ----------
    layers_ : list of tuple (MarginalGaussianize, ImageBijector)
        Fitted (marginal, rotation) pairs in application order.
    X_transformed_ : np.ndarray, shape ``(N, C*H*W)``
        Final transformed representation after the last layer.

    Notes
    -----
    The composed forward transform for a single sample :math:`\\mathbf{x}` is

    .. math::

        \\mathbf{z} = (R_L \\circ G_L \\circ \\cdots \\circ R_1 \\circ G_1)(\\mathbf{x})

    where :math:`G_\\ell` is marginal Gaussianisation and :math:`R_\\ell` is an
    orthonormal rotation at layer :math:`\\ell`.  Because each rotation is
    orthonormal, the total log-determinant is determined entirely by the
    marginal transforms.

    Examples
    --------
    >>> import numpy as np
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((50, 64))  # 50 images, C=1, H=8, W=8
    >>> model = ImageRBIG(n_layers=3, C=1, H=8, W=8, strategy="dct", random_state=0)
    >>> model.fit(X)  # doctest: +ELLIPSIS
    ImageRBIG(...)
    >>> Xt = model.transform(X)
    >>> Xt.shape
    (50, 64)
    >>> Xr = model.inverse_transform(Xt)
    >>> Xr.shape
    (50, 64)
    """

    def __init__(
        self,
        n_layers: int = 10,
        C: int = 1,
        H: int = 8,
        W: int = 8,
        strategy: str = "dct",
        random_state: int | None = None,
    ):
        self.n_layers = n_layers
        self.C = C
        self.H = H
        self.W = W
        self.strategy = strategy
        self.random_state = random_state

    def fit(self, X: np.ndarray) -> ImageRBIG:
        """Fit all (marginal, rotation) layer pairs sequentially.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, C*H*W)``
            Training image batch in flattened format.

        Returns
        -------
        self : ImageRBIG
        """
        from rbig._src.marginal import MarginalGaussianize

        self.layers_ = []
        Xt = X.copy()  # working copy updated layer by layer
        for i in range(self.n_layers):
            # Step 1: marginal Gaussianisation
            marginal = MarginalGaussianize()
            Xt_m = marginal.fit_transform(Xt)
            # Step 2: orthonormal spatial rotation
            rotation = self._make_rotation(seed=i)
            rotation.fit(Xt_m)
            self.layers_.append((marginal, rotation))
            Xt = rotation.transform(Xt_m)  # update for the next iteration
        self.X_transformed_ = Xt  # final representation after all layers
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply all fitted layers in forward order.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, C*H*W)``
            Image batch to transform.

        Returns
        -------
        Xt : np.ndarray, shape ``(N, C*H*W)``
            Gaussianised representation.
        """
        Xt = X.copy()
        for marginal, rotation in self.layers_:
            Xt = rotation.transform(marginal.transform(Xt))
        return Xt

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Apply all fitted layers in reverse order.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, C*H*W)``
            Gaussianised representation to invert.

        Returns
        -------
        Xr : np.ndarray, shape ``(N, C*H*W)``
            Reconstructed image batch in the original domain.
        """
        Xt = X.copy()
        # Reverse the layer list; apply inverse of each (rotation first, then marginal)
        for marginal, rotation in reversed(self.layers_):
            Xt = marginal.inverse_transform(rotation.inverse_transform(Xt))
        return Xt

    def _make_rotation(self, seed: int = 0):
        """Instantiate the rotation layer for the given layer index.

        Parameters
        ----------
        seed : int, default 0
            Layer index; combined with ``random_state`` for reproducibility.

        Returns
        -------
        rotation : ImageBijector
            An unfitted rotation bijector of the type specified by
            ``self.strategy``.
        """
        # Combine base seed with layer index so each layer gets a unique seed
        rng_seed = (self.random_state or 0) + seed
        if self.strategy == "dct":
            return DCTRotation(C=self.C, H=self.H, W=self.W)
        elif self.strategy == "hartley":
            return HartleyRotation(C=self.C, H=self.H, W=self.W)
        elif self.strategy == "random_channel":
            return RandomChannelRotation(
                C=self.C, H=self.H, W=self.W, random_state=rng_seed
            )
        else:
            # Unknown strategy: fall back to DCT
            return DCTRotation(C=self.C, H=self.H, W=self.W)

fit(X)

Fit all (marginal, rotation) layer pairs sequentially.

Parameters

X : np.ndarray, shape (N, C*H*W) Training image batch in flattened format.

Returns

self : ImageRBIG

Source code in rbig/_src/image.py
def fit(self, X: np.ndarray) -> ImageRBIG:
    """Fit all (marginal, rotation) layer pairs sequentially.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, C*H*W)``
        Training image batch in flattened format.

    Returns
    -------
    self : ImageRBIG
    """
    from rbig._src.marginal import MarginalGaussianize

    self.layers_ = []
    Xt = X.copy()  # working copy updated layer by layer
    for i in range(self.n_layers):
        # Step 1: marginal Gaussianisation
        marginal = MarginalGaussianize()
        Xt_m = marginal.fit_transform(Xt)
        # Step 2: orthonormal spatial rotation
        rotation = self._make_rotation(seed=i)
        rotation.fit(Xt_m)
        self.layers_.append((marginal, rotation))
        Xt = rotation.transform(Xt_m)  # update for the next iteration
    self.X_transformed_ = Xt  # final representation after all layers
    return self

inverse_transform(X)

Apply all fitted layers in reverse order.

Parameters

X : np.ndarray, shape (N, C*H*W) Gaussianised representation to invert.

Returns

Xr : np.ndarray, shape (N, C*H*W) Reconstructed image batch in the original domain.

Source code in rbig/_src/image.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Apply all fitted layers in reverse order.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, C*H*W)``
        Gaussianised representation to invert.

    Returns
    -------
    Xr : np.ndarray, shape ``(N, C*H*W)``
        Reconstructed image batch in the original domain.
    """
    Xt = X.copy()
    # Reverse the layer list; apply inverse of each (rotation first, then marginal)
    for marginal, rotation in reversed(self.layers_):
        Xt = marginal.inverse_transform(rotation.inverse_transform(Xt))
    return Xt

transform(X)

Apply all fitted layers in forward order.

Parameters

X : np.ndarray, shape (N, C*H*W) Image batch to transform.

Returns

Xt : np.ndarray, shape (N, C*H*W) Gaussianised representation.

Source code in rbig/_src/image.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply all fitted layers in forward order.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, C*H*W)``
        Image batch to transform.

    Returns
    -------
    Xt : np.ndarray, shape ``(N, C*H*W)``
        Gaussianised representation.
    """
    Xt = X.copy()
    for marginal, rotation in self.layers_:
        Xt = rotation.transform(marginal.transform(Xt))
    return Xt

KDEGaussianizer

Bases: Bijector

Gaussianize each marginal using a KDE-estimated CDF and probit.

Fits a Gaussian kernel density estimate (KDE) to each feature dimension, then maps samples to standard-normal values via::

z = Phi^{-1}(F_KDE(x))

where F_KDE(x) = integral_{-inf}^{x} f_KDE(t) dt is the smooth KDE-based CDF and Phi^{-1} is the Gaussian probit (inverse CDF).

Parameters

bw_method : str, float, or None, default None Bandwidth selection passed to :class:scipy.stats.gaussian_kde. None uses Scott's rule; 'silverman' uses Silverman's rule; a scalar sets the smoothing factor directly. eps : float, default 1e-6 Clipping margin applied to the CDF value before the probit to prevent Phi^{-1}(0) = -inf or Phi^{-1}(1) = +inf.

Attributes

kdes_ : list of scipy.stats.gaussian_kde One fitted KDE per feature dimension. n_features_ : int Number of feature dimensions seen during fit.

Notes

The log-det-Jacobian uses the analytic KDE density::

log|dz/dx| = log f_KDE(x) - log phi(z)

where phi is the standard-normal PDF evaluated at the Gaussianized value z = Phi^{-1}(F_KDE(x)).

The inverse transform uses Brent's root-finding algorithm to numerically invert F_KDE(x) = Phi(z) on the interval [-100, 100].

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Examples

>>> import numpy as np
>>> from rbig._src.marginal import KDEGaussianizer
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((100, 2))
>>> kde = KDEGaussianizer().fit(X)
>>> Z = kde.transform(X)
>>> Z.shape
(100, 2)
>>> ldj = kde.get_log_det_jacobian(X)
>>> ldj.shape
(100,)

Source code in rbig/_src/marginal.py
class KDEGaussianizer(Bijector):
    """Gaussianize each marginal using a KDE-estimated CDF and probit.

    Fits a Gaussian kernel density estimate (KDE) to each feature dimension,
    then maps samples to standard-normal values via::

        z = Phi^{-1}(F_KDE(x))

    where ``F_KDE(x) = integral_{-inf}^{x} f_KDE(t) dt`` is the smooth
    KDE-based CDF and ``Phi^{-1}`` is the Gaussian probit (inverse CDF).

    Parameters
    ----------
    bw_method : str, float, or None, default None
        Bandwidth selection passed to :class:`scipy.stats.gaussian_kde`.
        ``None`` uses Scott's rule; ``'silverman'`` uses Silverman's rule;
        a scalar sets the smoothing factor directly.
    eps : float, default 1e-6
        Clipping margin applied to the CDF value before the probit to
        prevent ``Phi^{-1}(0) = -inf`` or ``Phi^{-1}(1) = +inf``.

    Attributes
    ----------
    kdes_ : list of scipy.stats.gaussian_kde
        One fitted KDE per feature dimension.
    n_features_ : int
        Number of feature dimensions seen during ``fit``.

    Notes
    -----
    The log-det-Jacobian uses the analytic KDE density::

        log|dz/dx| = log f_KDE(x) - log phi(z)

    where ``phi`` is the standard-normal PDF evaluated at the Gaussianized
    value ``z = Phi^{-1}(F_KDE(x))``.

    The inverse transform uses Brent's root-finding algorithm to numerically
    invert ``F_KDE(x) = Phi(z)`` on the interval [-100, 100].

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.marginal import KDEGaussianizer
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((100, 2))
    >>> kde = KDEGaussianizer().fit(X)
    >>> Z = kde.transform(X)
    >>> Z.shape
    (100, 2)
    >>> ldj = kde.get_log_det_jacobian(X)
    >>> ldj.shape
    (100,)
    """

    def __init__(self, bw_method: str | float | None = None, eps: float = 1e-6):
        self.bw_method = bw_method
        self.eps = eps

    def fit(self, X: np.ndarray) -> KDEGaussianizer:
        """Fit a Gaussian KDE to each feature dimension.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.

        Returns
        -------
        self : KDEGaussianizer
            Fitted bijector instance.
        """
        self.kdes_ = []
        self.n_features_ = X.shape[1]
        for i in range(self.n_features_):
            # Independent KDE per feature
            self.kdes_.append(stats.gaussian_kde(X[:, i], bw_method=self.bw_method))
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Map each feature to N(0, 1) via KDE CDF then probit.

        Computes ``z = Phi^{-1}(F_KDE(x))`` for each feature independently.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data in the original space.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Gaussianized data with approximately standard-normal marginals.
        """
        Xt = np.zeros_like(X, dtype=float)
        for i in range(self.n_features_):
            # Numerical integration of KDE from -inf to each sample value
            u = np.array(
                [self.kdes_[i].integrate_box_1d(-np.inf, xi) for xi in X[:, i]]
            )
            # Clip CDF values away from boundaries before probit
            u = np.clip(u, self.eps, 1 - self.eps)
            Xt[:, i] = ndtri(u)
        return Xt

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Map standard-normal values back to the original space.

        Numerically inverts ``F_KDE(x) = Phi(z)`` using Brent's method
        on the interval [-100, 100] per sample and feature.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in the Gaussianized (standard-normal) space.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Data approximately recovered in the original input space.
            Samples for which root-finding fails default to 0.0.
        """
        from scipy.optimize import brentq

        Xt = np.zeros_like(X, dtype=float)
        for i in range(self.n_features_):
            for j, xj in enumerate(X[:, i]):
                # Map z -> u = Phi(z) in (0, 1)
                u = stats.norm.cdf(xj)
                try:
                    # Find x such that F_KDE(x) = u
                    Xt[j, i] = brentq(
                        lambda x, u=u, i=i: (
                            self.kdes_[i].integrate_box_1d(-np.inf, x) - u
                        ),
                        -100,
                        100,
                    )
                except ValueError:
                    # Root not bracketed in [-100, 100]; use zero as fallback
                    Xt[j, i] = 0.0
        return Xt

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Compute log |det J| using the analytic KDE density.

        Because the Jacobian is diagonal (each feature transformed
        independently)::

            log|det J| = sum_i log|dz_i/dx_i|
                       = sum_i [log f_KDE(x_i) - log phi(z_i)]

        where ``phi`` is the standard-normal PDF evaluated at
        ``z_i = Phi^{-1}(F_KDE(x_i))``.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points at which to evaluate the log-det-Jacobian.

        Returns
        -------
        log_det : np.ndarray of shape (n_samples,)
            Per-sample log absolute determinant.
        """
        log_det = np.zeros(X.shape[0])
        for i in range(self.n_features_):
            # Evaluate KDE density (used as the empirical marginal PDF)
            pdf = self.kdes_[i](X[:, i])
            # Compute KDE CDF via numerical integration
            u = np.array(
                [self.kdes_[i].integrate_box_1d(-np.inf, xi) for xi in X[:, i]]
            )
            u = np.clip(u, self.eps, 1 - self.eps)
            g = ndtri(u)  # Gaussianized value z = Phi^{-1}(u)
            log_phi = stats.norm.logpdf(g)  # log phi(z)
            # log|dz/dx| = log f_KDE(x) - log phi(z)
            log_det += np.log(np.maximum(pdf, 1e-300)) - log_phi
        return log_det
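The KDE-CDF-plus-probit pipeline can be reproduced with SciPy alone (no `KDEGaussianizer` import), following the same steps as `transform`: integrate the KDE for the CDF, clip, then apply the probit. Applied to skewed data, the output marginal is approximately standard normal:

```python
import numpy as np
from scipy import stats
from scipy.special import ndtri

rng = np.random.default_rng(0)
x = rng.exponential(size=300)           # heavily skewed marginal
kde = stats.gaussian_kde(x)

# F_KDE(x) via exact integration of the Gaussian-kernel mixture
u = np.array([kde.integrate_box_1d(-np.inf, xi) for xi in x])
u = np.clip(u, 1e-6, 1 - 1e-6)          # keep the probit finite
z = ndtri(u)                            # z = Phi^{-1}(F_KDE(x))

# z should be roughly standard normal despite the skewed input
assert abs(z.mean()) < 0.3
assert abs(z.std() - 1.0) < 0.3
```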

fit(X)

Fit a Gaussian KDE to each feature dimension.

Parameters

X : np.ndarray of shape (n_samples, n_features) Training data.

Returns

self : KDEGaussianizer Fitted bijector instance.

Source code in rbig/_src/marginal.py
def fit(self, X: np.ndarray) -> KDEGaussianizer:
    """Fit a Gaussian KDE to each feature dimension.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.

    Returns
    -------
    self : KDEGaussianizer
        Fitted bijector instance.
    """
    self.kdes_ = []
    self.n_features_ = X.shape[1]
    for i in range(self.n_features_):
        # Independent KDE per feature
        self.kdes_.append(stats.gaussian_kde(X[:, i], bw_method=self.bw_method))
    return self

get_log_det_jacobian(X)

Compute log |det J| using the analytic KDE density.

Because the Jacobian is diagonal (each feature transformed independently)::

log|det J| = sum_i log|dz_i/dx_i|
           = sum_i [log f_KDE(x_i) - log phi(z_i)]

where phi is the standard-normal PDF evaluated at z_i = Phi^{-1}(F_KDE(x_i)).

Parameters

X : np.ndarray of shape (n_samples, n_features) Input points at which to evaluate the log-det-Jacobian.

Returns

log_det : np.ndarray of shape (n_samples,) Per-sample log absolute determinant.

Source code in rbig/_src/marginal.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Compute log |det J| using the analytic KDE density.

    Because the Jacobian is diagonal (each feature transformed
    independently)::

        log|det J| = sum_i log|dz_i/dx_i|
                   = sum_i [log f_KDE(x_i) - log phi(z_i)]

    where ``phi`` is the standard-normal PDF evaluated at
    ``z_i = Phi^{-1}(F_KDE(x_i))``.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points at which to evaluate the log-det-Jacobian.

    Returns
    -------
    log_det : np.ndarray of shape (n_samples,)
        Per-sample log absolute determinant.
    """
    log_det = np.zeros(X.shape[0])
    for i in range(self.n_features_):
        # Evaluate KDE density (used as the empirical marginal PDF)
        pdf = self.kdes_[i](X[:, i])
        # Compute KDE CDF via numerical integration
        u = np.array(
            [self.kdes_[i].integrate_box_1d(-np.inf, xi) for xi in X[:, i]]
        )
        u = np.clip(u, self.eps, 1 - self.eps)
        g = ndtri(u)  # Gaussianized value z = Phi^{-1}(u)
        log_phi = stats.norm.logpdf(g)  # log phi(z)
        # log|dz/dx| = log f_KDE(x) - log phi(z)
        log_det += np.log(np.maximum(pdf, 1e-300)) - log_phi
    return log_det
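The analytic formula `log|dz/dx| = log f_KDE(x) - log phi(z)` can be verified against a finite-difference derivative of the forward map, again using SciPy directly rather than the class:

```python
import numpy as np
from scipy import stats
from scipy.special import ndtri

rng = np.random.default_rng(1)
x = rng.standard_normal(200)
kde = stats.gaussian_kde(x)

def fwd(v):
    """Forward map z = Phi^{-1}(F_KDE(v)) for scalar-like input."""
    u = np.array([kde.integrate_box_1d(-np.inf, vi) for vi in np.atleast_1d(v)])
    return ndtri(np.clip(u, 1e-6, 1 - 1e-6))

x0, h = 0.3, 1e-4
numeric = (fwd(x0 + h)[0] - fwd(x0 - h)[0]) / (2 * h)   # central difference
z0 = fwd(x0)[0]
analytic = np.log(kde(x0)[0]) - stats.norm.logpdf(z0)    # log f_KDE - log phi

assert abs(np.log(abs(numeric)) - analytic) < 1e-3
```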

inverse_transform(X)

Map standard-normal values back to the original space.

Numerically inverts F_KDE(x) = Phi(z) using Brent's method on the interval [-100, 100] per sample and feature.

Parameters

X : np.ndarray of shape (n_samples, n_features) Data in the Gaussianized (standard-normal) space.

Returns

Xt : np.ndarray of shape (n_samples, n_features) Data approximately recovered in the original input space. Samples for which root-finding fails default to 0.0.

Source code in rbig/_src/marginal.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Map standard-normal values back to the original space.

    Numerically inverts ``F_KDE(x) = Phi(z)`` using Brent's method
    on the interval [-100, 100] per sample and feature.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in the Gaussianized (standard-normal) space.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Data approximately recovered in the original input space.
        Samples for which root-finding fails default to 0.0.
    """
    from scipy.optimize import brentq

    Xt = np.zeros_like(X, dtype=float)
    for i in range(self.n_features_):
        for j, xj in enumerate(X[:, i]):
            # Map z -> u = Phi(z) in (0, 1)
            u = stats.norm.cdf(xj)
            try:
                # Find x such that F_KDE(x) = u
                Xt[j, i] = brentq(
                    lambda x, u=u, i=i: (
                        self.kdes_[i].integrate_box_1d(-np.inf, x) - u
                    ),
                    -100,
                    100,
                )
            except ValueError:
                # Root not bracketed in [-100, 100]; use zero as fallback
                Xt[j, i] = 0.0
    return Xt
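A minimal stand-alone check of the inversion strategy (scipy only; not the library's code): `brentq` recovers an `x` whose KDE CDF matches the target uniform value to high precision:

```python
import numpy as np
from scipy import stats
from scipy.optimize import brentq

rng = np.random.default_rng(1)
kde = stats.gaussian_kde(rng.standard_normal(300))

# Target uniform value coming from a Gaussianized z = 0.7
u = stats.norm.cdf(0.7)
# Invert F_KDE(x) = u by root-finding on [-100, 100], as inverse_transform does
x = brentq(lambda t: kde.integrate_box_1d(-np.inf, t) - u, -100, 100)
assert abs(kde.integrate_box_1d(-np.inf, x) - u) < 1e-8
```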

transform(X)

Map each feature to N(0, 1) via KDE CDF then probit.

Computes z = Phi^{-1}(F_KDE(x)) for each feature independently.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data in the original space.

Returns

Xt : np.ndarray of shape (n_samples, n_features) Gaussianized data with approximately standard-normal marginals.

Source code in rbig/_src/marginal.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Map each feature to N(0, 1) via KDE CDF then probit.

    Computes ``z = Phi^{-1}(F_KDE(x))`` for each feature independently.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data in the original space.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Gaussianized data with approximately standard-normal marginals.
    """
    Xt = np.zeros_like(X, dtype=float)
    for i in range(self.n_features_):
        # Numerical integration of KDE from -inf to each sample value
        u = np.array(
            [self.kdes_[i].integrate_box_1d(-np.inf, xi) for xi in X[:, i]]
        )
        # Clip CDF values away from boundaries before probit
        u = np.clip(u, self.eps, 1 - self.eps)
        Xt[:, i] = ndtri(u)
    return Xt

LogitTransform

Bases: BaseTransform

Logit transform: bijectively maps the unit hypercube (0,1)ᵈ to ℝᵈ.

Each feature is transformed independently by the logit (log-odds) function:

Forward  : y = log(x / (1 − x))
Inverse  : x = σ(y) = 1 / (1 + e^{−y})        (sigmoid)
Log-det  : ∑ᵢ [−log xᵢ − log(1 − xᵢ)]

The transform is useful as a pre-processing step when data lives in (0, 1), e.g. probabilities or proportions.

Examples

>>> import numpy as np
>>> from rbig._src.parametric import LogitTransform
>>> rng = np.random.default_rng(0)
>>> X = rng.uniform(0.05, 0.95, size=(100, 3))  # data in (0, 1)
>>> tr = LogitTransform().fit(X)
>>> Y = tr.transform(X)  # data now in ℝ
>>> X_rec = tr.inverse_transform(Y)
>>> np.allclose(X, X_rec, atol=1e-10)
True

Source code in rbig/_src/parametric.py
class LogitTransform(BaseTransform):
    """Logit transform: bijectively maps the unit hypercube (0,1)ᵈ to ℝᵈ.

    Each feature is transformed independently by the logit (log-odds) function:

        Forward  : y = log(x / (1 − x))
        Inverse  : x = σ(y) = 1 / (1 + e^{−y})        (sigmoid)
        Log-det  : ∑ᵢ [−log xᵢ − log(1 − xᵢ)]

    The transform is useful as a pre-processing step when data lives in (0, 1),
    e.g. probabilities or proportions.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.parametric import LogitTransform
    >>> rng = np.random.default_rng(0)
    >>> X = rng.uniform(0.05, 0.95, size=(100, 3))  # data in (0, 1)
    >>> tr = LogitTransform().fit(X)
    >>> Y = tr.transform(X)  # data now in ℝ
    >>> X_rec = tr.inverse_transform(Y)
    >>> np.allclose(X, X_rec, atol=1e-10)
    True
    """

    def fit(self, X: np.ndarray) -> LogitTransform:
        """No-op fit (stateless transform).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Ignored.

        Returns
        -------
        self : LogitTransform
        """
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the logit map y = log(x / (1 − x)).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data in (0, 1).

        Returns
        -------
        Y : np.ndarray of shape (n_samples, n_features)
            Log-odds transformed data in ℝ.
        """
        return np.log(X / (1 - X))  # y = logit(x) = log(x/(1-x))

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the inverse sigmoid (logistic) map x = 1 / (1 + e^{−y}).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in ℝ.

        Returns
        -------
        Z : np.ndarray of shape (n_samples, n_features)
            Recovered data in (0, 1).
        """
        return 1 / (1 + np.exp(-X))  # x = sigmoid(y) = 1/(1+e^{-y})

    def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Compute per-sample log |det J| of the forward logit transform.

        The Jacobian of logit is diagonal with entries
        d(logit xᵢ)/dxᵢ = 1/xᵢ + 1/(1−xᵢ), so:

            log |det J| = ∑ᵢ [−log xᵢ − log(1 − xᵢ)]

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data in (0, 1) (pre-transform).

        Returns
        -------
        log_det : np.ndarray of shape (n_samples,)
            Log absolute determinant of the Jacobian for each sample.
        """
        # Diagonal Jacobian: sum_i (-log xi - log(1-xi))
        return np.sum(-np.log(X) - np.log(1 - X), axis=1)
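As a quick sanity check of the log-det formula, independent of the class (plain numpy; `logit` is a local helper): the analytic per-dimension log-derivative `-log x - log(1 - x)` agrees with a finite-difference derivative of the logit map:

```python
import numpy as np

def logit(x):
    return np.log(x / (1 - x))

x0 = 0.3
# Analytic: d logit/dx = 1/(x(1-x)), so log-derivative = -log x - log(1-x)
analytic = -np.log(x0) - np.log(1 - x0)
# Central finite-difference derivative of the logit map at x0
h = 1e-7
fd = (logit(x0 + h) - logit(x0 - h)) / (2 * h)
assert abs(np.log(fd) - analytic) < 1e-6
```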

fit(X)

No-op fit (stateless transform).

Parameters

X : np.ndarray of shape (n_samples, n_features) Ignored.

Returns

self : LogitTransform

Source code in rbig/_src/parametric.py
def fit(self, X: np.ndarray) -> LogitTransform:
    """No-op fit (stateless transform).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Ignored.

    Returns
    -------
    self : LogitTransform
    """
    return self

inverse_transform(X)

Apply the inverse sigmoid (logistic) map x = 1 / (1 + e^{−y}).

Parameters

X : np.ndarray of shape (n_samples, n_features) Data in ℝ.

Returns

Z : np.ndarray of shape (n_samples, n_features) Recovered data in (0, 1).

Source code in rbig/_src/parametric.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the inverse sigmoid (logistic) map x = 1 / (1 + e^{−y}).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in ℝ.

    Returns
    -------
    Z : np.ndarray of shape (n_samples, n_features)
        Recovered data in (0, 1).
    """
    return 1 / (1 + np.exp(-X))  # x = sigmoid(y) = 1/(1+e^{-y})

log_det_jacobian(X)

Compute per-sample log |det J| of the forward logit transform.

The Jacobian of logit is diagonal with entries d(logit xᵢ)/dxᵢ = 1/xᵢ + 1/(1−xᵢ), so:

log |det J| = ∑ᵢ [−log xᵢ − log(1 − xᵢ)]

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data in (0, 1) (pre-transform).

Returns

log_det : np.ndarray of shape (n_samples,) Log absolute determinant of the Jacobian for each sample.

Source code in rbig/_src/parametric.py
def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Compute per-sample log |det J| of the forward logit transform.

    The Jacobian of logit is diagonal with entries
    d(logit xᵢ)/dxᵢ = 1/xᵢ + 1/(1−xᵢ), so:

        log |det J| = ∑ᵢ [−log xᵢ − log(1 − xᵢ)]

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data in (0, 1) (pre-transform).

    Returns
    -------
    log_det : np.ndarray of shape (n_samples,)
        Log absolute determinant of the Jacobian for each sample.
    """
    # Diagonal Jacobian: sum_i (-log xi - log(1-xi))
    return np.sum(-np.log(X) - np.log(1 - X), axis=1)

transform(X)

Apply the logit map y = log(x / (1 − x)).

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data in (0, 1).

Returns

Y : np.ndarray of shape (n_samples, n_features) Log-odds transformed data in ℝ.

Source code in rbig/_src/parametric.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the logit map y = log(x / (1 − x)).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data in (0, 1).

    Returns
    -------
    Y : np.ndarray of shape (n_samples, n_features)
        Log-odds transformed data in ℝ.
    """
    return np.log(X / (1 - X))  # y = logit(x) = log(x/(1-x))

MarginalBijector

Bases: Bijector

Abstract bijector for independent, per-dimension (marginal) transforms.

Each feature dimension is transformed by a separate invertible function. Because the transform is applied independently to each coordinate, the Jacobian is diagonal and its log-determinant is the sum of per-dimension log-derivatives:

log|det J_f(x)| = ∑ᵢ log|f′(xᵢ)|

Subclasses implement concrete marginal mappings such as empirical CDF Gaussianization, quantile transform, or kernel density estimation.

Notes

In RBIG, the marginal step maps each dimension to a standard Gaussian via

z = Φ⁻¹(F̂ₙ(x))

where F̂ₙ is the estimated marginal CDF and Φ⁻¹ is the standard normal quantile function (probit).

Source code in rbig/_src/base.py
class MarginalBijector(Bijector):
    """Abstract bijector for independent, per-dimension (marginal) transforms.

    Each feature dimension is transformed by a separate invertible function.
    Because the transform is applied independently to each coordinate, the
    Jacobian is diagonal and its log-determinant is the sum of per-dimension
    log-derivatives:

        log|det J_f(x)| = ∑ᵢ log|f′(xᵢ)|

    Subclasses implement concrete marginal mappings such as empirical CDF
    Gaussianization, quantile transform, or kernel density estimation.

    Notes
    -----
    In RBIG, the marginal step maps each dimension to a standard Gaussian via

        z = Φ⁻¹(F̂ₙ(x))

    where F̂ₙ is the estimated marginal CDF and Φ⁻¹ is the standard normal
    quantile function (probit).
    """

MarginalGaussianize

Bases: BaseTransform

Transform each marginal to standard Gaussian using empirical CDF + probit.

Combines a mid-point empirical CDF estimate with the Gaussian probit (quantile) function Phi^{-1} to map each feature to an approximately standard-normal marginal:

z = Phi^{-1}(F_hat_n(x))

where F_hat_n(x) = (rank + 0.5) / N is the mid-point empirical CDF and Phi^{-1} is the inverse standard-normal CDF (probit).

Parameters

bound_correct : bool, default True
    Clip the intermediate uniform value to [eps, 1 - eps] before applying the probit to prevent +/-inf outputs at the tails.
eps : float, default 1e-6
    Clipping margin for the uniform intermediate value.

Attributes

support_ : np.ndarray of shape (n_samples, n_features)
    Column-wise sorted training data (empirical quantile nodes).
n_features_ : int
    Number of feature dimensions seen during fit.

Notes

The log-absolute Jacobian determinant needed for density estimation is:

log|dz/dx| = log f_hat_n(x) - log phi(Phi^{-1}(F_hat_n(x)))

where f_hat_n is the empirical density estimated from the spacing of adjacent sorted training values, and phi is the standard-normal PDF. This is computed in log_det_jacobian.

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Examples

>>> import numpy as np
>>> from rbig._src.marginal import MarginalGaussianize
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((200, 3))
>>> mg = MarginalGaussianize().fit(X)
>>> Z = mg.transform(X)
>>> Z.shape
(200, 3)
>>> abs(float(Z.mean())) < 0.5
True

Source code in rbig/_src/marginal.py
class MarginalGaussianize(BaseTransform):
    """Transform each marginal to standard Gaussian using empirical CDF + probit.

    Combines a mid-point empirical CDF estimate with the Gaussian probit
    (quantile) function Phi^{-1} to map each feature to an approximately
    standard-normal marginal::

        z = Phi^{-1}(F_hat_n(x))

    where ``F_hat_n(x) = (rank + 0.5) / N`` is the mid-point empirical CDF
    and ``Phi^{-1}`` is the inverse standard-normal CDF (probit).

    Parameters
    ----------
    bound_correct : bool, default True
        Clip the intermediate uniform value to ``[eps, 1 - eps]`` before
        applying the probit to prevent +/-inf outputs at the tails.
    eps : float, default 1e-6
        Clipping margin for the uniform intermediate value.

    Attributes
    ----------
    support_ : np.ndarray of shape (n_samples, n_features)
        Column-wise sorted training data (empirical quantile nodes).
    n_features_ : int
        Number of feature dimensions seen during ``fit``.

    Notes
    -----
    The log-absolute Jacobian determinant needed for density estimation is::

        log|dz/dx| = log f_hat_n(x) - log phi(Phi^{-1}(F_hat_n(x)))

    where ``f_hat_n`` is the empirical density estimated from the spacing of
    adjacent sorted training values, and ``phi`` is the standard-normal PDF.
    This is computed in :meth:`log_det_jacobian`.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.marginal import MarginalGaussianize
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((200, 3))
    >>> mg = MarginalGaussianize().fit(X)
    >>> Z = mg.transform(X)
    >>> Z.shape
    (200, 3)
    >>> abs(float(Z.mean())) < 0.5
    True
    """

    def __init__(self, bound_correct: bool = True, eps: float = 1e-6):
        self.bound_correct = bound_correct
        self.eps = eps

    def fit(self, X: np.ndarray) -> MarginalGaussianize:
        """Fit by storing the column-wise sorted training data.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data used to build the per-feature empirical CDF.

        Returns
        -------
        self : MarginalGaussianize
            Fitted transform instance.
        """
        # Sorted columns serve as empirical quantile nodes
        self.support_ = np.sort(X, axis=0)
        self.n_features_ = X.shape[1]
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Map each feature to N(0, 1) via empirical CDF then probit.

        Applies ``z = Phi^{-1}(F_hat_n(x))`` column by column.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data in the original space.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Gaussianized data; each column has approximately N(0, 1) marginal.
        """
        Xt = np.zeros_like(X, dtype=float)
        for i in range(self.n_features_):
            # Step 1: empirical CDF -> uniform value in (0, 1)
            u = MarginalUniformize._uniformize(X[:, i], self.support_[:, i])
            if self.bound_correct:
                # Clip to avoid Phi^{-1}(0) = -inf or Phi^{-1}(1) = +inf
                u = np.clip(u, self.eps, 1 - self.eps)
            # Step 2: probit transform Phi^{-1}(u) -> standard normal
            Xt[:, i] = ndtri(u)
        return Xt

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Map standard-normal values back to the original space.

        Applies the normal CDF Phi to obtain uniform values, then uses
        piecewise-linear interpolation through the sorted support to recover
        approximate original-space values.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in the Gaussianized (standard-normal) space.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Data approximately recovered in the original input space.
        """
        Xt = np.zeros_like(X, dtype=float)
        for i in range(self.n_features_):
            # Invert probit: z -> Phi(z) in (0, 1)
            u = stats.norm.cdf(X[:, i])
            # Invert empirical CDF via linear interpolation
            Xt[:, i] = np.interp(
                u, np.linspace(0, 1, len(self.support_[:, i])), self.support_[:, i]
            )
        return Xt

    def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Log |det J| for marginal Gaussianization.

        For g(x) = Phi^{-1}(F_n(x)):
            log|dg/dx| = log f_n(x_i) - log phi(g(x_i))

        where f_n is estimated from the spacing of the empirical support.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data at which to evaluate the log-det-Jacobian.

        Returns
        -------
        log_jac : np.ndarray of shape (n_samples,)
            Per-sample sum of per-feature log-derivatives.

        Notes
        -----
        The empirical density is approximated as::

            f_hat_n(x) ~= 1 / (N * spacing)

        where *spacing* is the gap between adjacent sorted training values
        at the location of x.  Tied values produce zero spacings; these are
        replaced by the minimum positive spacing to keep the log-density
        finite.
        """
        return np.sum(self._per_feature_log_deriv(X), axis=1)

    def _per_feature_log_deriv(
        self, X: np.ndarray, return_transform: bool = False
    ) -> np.ndarray | tuple[np.ndarray, np.ndarray]:
        """Per-feature log |dz_i/dx_i| for marginal Gaussianization.

        Returns the un-summed per-feature log-derivatives needed for both
        ``log_det_jacobian`` (which sums them) and the full Jacobian
        computation in ``AnnealedRBIG.jacobian``.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data at which to evaluate the per-feature log-derivatives.
        return_transform : bool, default False
            If True, also return the Gaussianized output to avoid recomputing.

        Returns
        -------
        log_derivs : np.ndarray of shape (n_samples, n_features)
            Per-feature log |dz_i/dx_i| for each sample.
        Xt : np.ndarray of shape (n_samples, n_features)
            Only returned when ``return_transform=True``.
        """
        Xt = self.transform(X)  # Gaussianized output, shape (N, D)
        n = self.support_.shape[0]  # number of training samples
        log_derivs = np.zeros_like(X, dtype=float)
        for i in range(self.n_features_):
            support_i = self.support_[:, i]  # sorted training values, shape (n,)
            spacings = np.diff(support_i)  # gaps between consecutive support points
            pos_sp = spacings[spacings > 0]
            min_sp = pos_sp.min() if len(pos_sp) > 0 else 1e-10
            safe_sp = np.where(spacings > 0, spacings, min_sp)
            # Pad to n elements: first element uses the first spacing
            sp_full = np.concatenate([[safe_sp[0]], safe_sp])
            ranks = np.clip(np.searchsorted(support_i, X[:, i], side="left"), 0, n - 1)
            local_spacing = sp_full[ranks]  # spacing at each sample's location
            # Empirical log-density: log f_hat_n(x) ~= -log(N * spacing)
            log_f_i = -np.log(n * local_spacing)
            # Log standard-normal PDF at Gaussianized value: log phi(z_i)
            log_phi_gi = stats.norm.logpdf(Xt[:, i])
            # Chain rule: log|dz/dx| = log f_hat_n(x) - log phi(z)
            log_derivs[:, i] = log_f_i - log_phi_gi
        if return_transform:
            return log_derivs, Xt
        return log_derivs
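The empirical-CDF-plus-probit pipeline can be reproduced in a few lines without the class (numpy/scipy only; note scipy's `rankdata` is 1-based, so `(rank - 0.5) / N` equals the mid-point CDF `(rank + 0.5) / N` stated for 0-based ranks). The Gaussianized column is close to N(0, 1) whatever the input distribution:

```python
import numpy as np
from scipy.special import ndtri
from scipy.stats import rankdata

rng = np.random.default_rng(0)
x = rng.exponential(size=1000)  # deliberately non-Gaussian input
# Mid-point empirical CDF: u in (0, 1), never exactly 0 or 1
u = (rankdata(x) - 0.5) / len(x)
z = ndtri(u)  # probit: z = Phi^{-1}(u)
# The marginal is approximately standard normal regardless of the input law
assert abs(z.mean()) < 0.05
assert abs(z.std() - 1.0) < 0.05
```

Because `z` depends only on the ranks of `x`, the output marginal is the same fixed set of normal quantiles for any continuous input.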

fit(X)

Fit by storing the column-wise sorted training data.

Parameters

X : np.ndarray of shape (n_samples, n_features) Training data used to build the per-feature empirical CDF.

Returns

self : MarginalGaussianize Fitted transform instance.

Source code in rbig/_src/marginal.py
def fit(self, X: np.ndarray) -> MarginalGaussianize:
    """Fit by storing the column-wise sorted training data.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data used to build the per-feature empirical CDF.

    Returns
    -------
    self : MarginalGaussianize
        Fitted transform instance.
    """
    # Sorted columns serve as empirical quantile nodes
    self.support_ = np.sort(X, axis=0)
    self.n_features_ = X.shape[1]
    return self

inverse_transform(X)

Map standard-normal values back to the original space.

Applies the normal CDF Phi to obtain uniform values, then uses piecewise-linear interpolation through the sorted support to recover approximate original-space values.

Parameters

X : np.ndarray of shape (n_samples, n_features) Data in the Gaussianized (standard-normal) space.

Returns

Xt : np.ndarray of shape (n_samples, n_features) Data approximately recovered in the original input space.

Source code in rbig/_src/marginal.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Map standard-normal values back to the original space.

    Applies the normal CDF Phi to obtain uniform values, then uses
    piecewise-linear interpolation through the sorted support to recover
    approximate original-space values.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in the Gaussianized (standard-normal) space.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Data approximately recovered in the original input space.
    """
    Xt = np.zeros_like(X, dtype=float)
    for i in range(self.n_features_):
        # Invert probit: z -> Phi(z) in (0, 1)
        u = stats.norm.cdf(X[:, i])
        # Invert empirical CDF via linear interpolation
        Xt[:, i] = np.interp(
            u, np.linspace(0, 1, len(self.support_[:, i])), self.support_[:, i]
        )
    return Xt

log_det_jacobian(X)

Log |det J| for marginal Gaussianization.

For g(x) = Phi^{-1}(F_n(x)):

    log|dg/dx| = log f_n(x) - log phi(g(x))

where f_n is estimated from the spacing of the empirical support.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data at which to evaluate the log-det-Jacobian.

Returns

log_jac : np.ndarray of shape (n_samples,) Per-sample sum of per-feature log-derivatives.

Notes

The empirical density is approximated as:

f_hat_n(x) ~= 1 / (N * spacing)

where spacing is the gap between adjacent sorted training values at the location of x. Tied values produce zero spacings; these are replaced by the minimum positive spacing to keep the log-density finite.

Source code in rbig/_src/marginal.py
def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Log |det J| for marginal Gaussianization.

    For g(x) = Phi^{-1}(F_n(x)):
        log|dg/dx| = log f_n(x_i) - log phi(g(x_i))

    where f_n is estimated from the spacing of the empirical support.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data at which to evaluate the log-det-Jacobian.

    Returns
    -------
    log_jac : np.ndarray of shape (n_samples,)
        Per-sample sum of per-feature log-derivatives.

    Notes
    -----
    The empirical density is approximated as::

        f_hat_n(x) ~= 1 / (N * spacing)

    where *spacing* is the gap between adjacent sorted training values
    at the location of x.  Tied values produce zero spacings; these are
    replaced by the minimum positive spacing to keep the log-density
    finite.
    """
    return np.sum(self._per_feature_log_deriv(X), axis=1)
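To get a feel for the spacing-based density estimate in the Notes, here is a stand-alone experiment (numpy/scipy only, not library code). Individual estimates 1/(N·spacing) are very noisy, since N·f(x)·spacing is approximately Exp(1)-distributed, but their median near the mode stays within a small constant factor of the true density:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = np.sort(rng.standard_normal(5000))
n = len(x)
spacings = np.diff(x)
# Spacing-based density estimate: f_hat(x_k) ~= 1 / (N * spacing_k)
f_hat = 1.0 / (n * spacings)
mid = 0.5 * (x[:-1] + x[1:])
# Compare against the true N(0, 1) density near the mode
mask = np.abs(mid) < 0.1
ratio = np.median(f_hat[mask] / stats.norm.pdf(mid[mask]))
assert 0.5 < ratio < 2.0
```

The pointwise noise is why tied values (zero spacings) need the minimum-positive-spacing replacement described above: a single zero spacing would send the log-density to +inf.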

transform(X)

Map each feature to N(0, 1) via empirical CDF then probit.

Applies z = Phi^{-1}(F_hat_n(x)) column by column.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data in the original space.

Returns

Xt : np.ndarray of shape (n_samples, n_features) Gaussianized data; each column has approximately N(0, 1) marginal.

Source code in rbig/_src/marginal.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Map each feature to N(0, 1) via empirical CDF then probit.

    Applies ``z = Phi^{-1}(F_hat_n(x))`` column by column.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data in the original space.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Gaussianized data; each column has approximately N(0, 1) marginal.
    """
    Xt = np.zeros_like(X, dtype=float)
    for i in range(self.n_features_):
        # Step 1: empirical CDF -> uniform value in (0, 1)
        u = MarginalUniformize._uniformize(X[:, i], self.support_[:, i])
        if self.bound_correct:
            # Clip to avoid Phi^{-1}(0) = -inf or Phi^{-1}(1) = +inf
            u = np.clip(u, self.eps, 1 - self.eps)
        # Step 2: probit transform Phi^{-1}(u) -> standard normal
        Xt[:, i] = ndtri(u)
    return Xt

MarginalKDEGaussianize

Bases: BaseTransform

Transform each marginal to Gaussian using a KDE-estimated CDF.

A kernel density estimate (KDE) with a Gaussian kernel is fitted to each feature dimension. The cumulative integral of the KDE serves as a smooth approximation to the marginal CDF, which is then composed with the probit function Phi^{-1} to Gaussianize each dimension:

z = Phi^{-1}(F_KDE(x))

where F_KDE(x) = integral_{-inf}^{x} f_KDE(t) dt and f_KDE is the Gaussian-kernel density estimate.

Parameters

bw_method : str, float, or None, default None
    Bandwidth selection method passed to scipy.stats.gaussian_kde. None uses Scott's rule; 'silverman' uses Silverman's rule; a scalar sets the bandwidth factor directly.
eps : float, default 1e-6
    Clipping margin to prevent Phi^{-1}(0) = -inf or Phi^{-1}(1) = +inf.

Attributes

kdes_ : list of scipy.stats.gaussian_kde
    One fitted KDE object per feature dimension.
n_features_ : int
    Number of feature dimensions seen during fit.

Notes

The inverse transform inverts the KDE CDF numerically via Brent's method (scipy.optimize.brentq), searching in [-100, 100]. Samples whose root is not bracketed in this interval fall back to 0.0.

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Examples

>>> import numpy as np
>>> from rbig._src.marginal import MarginalKDEGaussianize
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((50, 2))
>>> kde_g = MarginalKDEGaussianize().fit(X)
>>> Z = kde_g.transform(X)
>>> Z.shape
(50, 2)

Source code in rbig/_src/marginal.py
class MarginalKDEGaussianize(BaseTransform):
    """Transform each marginal to Gaussian using a KDE-estimated CDF.

    A kernel density estimate (KDE) with a Gaussian kernel is fitted to each
    feature dimension.  The cumulative integral of the KDE serves as a smooth
    approximation to the marginal CDF, which is then composed with the probit
    function Phi^{-1} to Gaussianize each dimension::

        z = Phi^{-1}(F_KDE(x))

    where ``F_KDE(x) = integral_{-inf}^{x} f_KDE(t) dt`` and ``f_KDE`` is
    the Gaussian-kernel density estimate.

    Parameters
    ----------
    bw_method : str, float, or None, default None
        Bandwidth selection method passed to
        :class:`scipy.stats.gaussian_kde`.  ``None`` uses Scott's rule;
        ``'silverman'`` uses Silverman's rule; a scalar sets the bandwidth
        factor directly.
    eps : float, default 1e-6
        Clipping margin to prevent ``Phi^{-1}(0) = -inf`` or
        ``Phi^{-1}(1) = +inf``.

    Attributes
    ----------
    kdes_ : list of scipy.stats.gaussian_kde
        One fitted KDE object per feature dimension.
    n_features_ : int
        Number of feature dimensions seen during ``fit``.

    Notes
    -----
    The inverse transform inverts the KDE CDF numerically via Brent's method
    (:func:`scipy.optimize.brentq`) searching in [-100, 100].  Samples
    outside this range default to 0.0.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.marginal import MarginalKDEGaussianize
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((50, 2))
    >>> kde_g = MarginalKDEGaussianize().fit(X)
    >>> Z = kde_g.transform(X)
    >>> Z.shape
    (50, 2)
    """

    def __init__(self, bw_method: str | float | None = None, eps: float = 1e-6):
        self.bw_method = bw_method
        self.eps = eps

    def fit(self, X: np.ndarray) -> MarginalKDEGaussianize:
        """Fit a Gaussian KDE to each feature dimension.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.

        Returns
        -------
        self : MarginalKDEGaussianize
            Fitted transform instance.
        """
        self.kdes_ = []
        self.n_features_ = X.shape[1]
        for i in range(self.n_features_):
            # Fit an independent Gaussian KDE per feature
            self.kdes_.append(stats.gaussian_kde(X[:, i], bw_method=self.bw_method))
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Map each feature to N(0, 1) via KDE CDF then probit.

        Computes ``z = Phi^{-1}(F_KDE(x))`` per feature using numerical
        integration of the fitted KDE.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data in the original space.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Gaussianized data.
        """
        Xt = np.zeros_like(X, dtype=float)
        for i in range(self.n_features_):
            # Integrate KDE from -inf to each sample value to get CDF
            u = np.array(
                [self.kdes_[i].integrate_box_1d(-np.inf, xi) for xi in X[:, i]]
            )
            # Clip to avoid +/-inf from the probit function
            u = np.clip(u, self.eps, 1 - self.eps)
            Xt[:, i] = ndtri(u)
        return Xt

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Map standard-normal values back to the original space.

        Numerically inverts the KDE CDF via Brent's root-finding method.
        For each sample *j* and feature *i*, solves::

            F_KDE(x) = Phi(z_j)

        searching on the interval [-100, 100].

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in the Gaussianized space.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Data approximately recovered in the original input space.
            Samples that fail root-finding are set to 0.0.
        """
        from scipy.optimize import brentq

        Xt = np.zeros_like(X, dtype=float)
        for i in range(self.n_features_):
            for j, xj in enumerate(X[:, i]):
                # Map z -> u in (0, 1) via normal CDF
                u = stats.norm.cdf(xj)
                try:
                    # Numerically invert F_KDE(x) = u via root-finding
                    Xt[j, i] = brentq(
                        lambda x, u=u, i=i: (
                            self.kdes_[i].integrate_box_1d(-np.inf, x) - u
                        ),
                        -100,
                        100,
                    )
                except ValueError:
                    # Root not found in [-100, 100]; fall back to zero
                    Xt[j, i] = 0.0
        return Xt
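The forward/inverse pair of this class can be emulated stand-alone (scipy only; `forward` and `backward` are ad-hoc helpers, not the class methods) to confirm that brentq inversion of the KDE CDF round-trips to the original value:

```python
import numpy as np
from scipy import stats
from scipy.special import ndtri
from scipy.optimize import brentq

rng = np.random.default_rng(0)
kde = stats.gaussian_kde(rng.exponential(size=300))

def forward(x):
    # z = Phi^{-1}(F_KDE(x)), clipped away from 0 and 1
    u = np.clip(kde.integrate_box_1d(-np.inf, x), 1e-6, 1 - 1e-6)
    return ndtri(u)

def backward(z):
    # Invert F_KDE(x) = Phi(z) by root-finding on [-100, 100]
    return brentq(
        lambda t: kde.integrate_box_1d(-np.inf, t) - stats.norm.cdf(z), -100, 100
    )

x0 = 1.2
assert abs(backward(forward(x0)) - x0) < 1e-6
```

The round trip is accurate away from the clipped tails; values whose CDF lands in the clipped region [0, eps] or [1 - eps, 1] recover only approximately.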

fit(X)

Fit a Gaussian KDE to each feature dimension.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Training data.

Returns

self : MarginalKDEGaussianize
    Fitted transform instance.

Source code in rbig/_src/marginal.py
def fit(self, X: np.ndarray) -> MarginalKDEGaussianize:
    """Fit a Gaussian KDE to each feature dimension.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.

    Returns
    -------
    self : MarginalKDEGaussianize
        Fitted transform instance.
    """
    self.kdes_ = []
    self.n_features_ = X.shape[1]
    for i in range(self.n_features_):
        # Fit an independent Gaussian KDE per feature
        self.kdes_.append(stats.gaussian_kde(X[:, i], bw_method=self.bw_method))
    return self

inverse_transform(X)

Map standard-normal values back to the original space.

Numerically inverts the KDE CDF via Brent's root-finding method. For each sample j and feature i, solves::

F_KDE(x) = Phi(z_j)

searching on the interval [-100, 100].

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Data in the Gaussianized space.

Returns

Xt : np.ndarray of shape (n_samples, n_features)
    Data approximately recovered in the original input space. Samples that fail root-finding are set to 0.0.

Source code in rbig/_src/marginal.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Map standard-normal values back to the original space.

    Numerically inverts the KDE CDF via Brent's root-finding method.
    For each sample *j* and feature *i*, solves::

        F_KDE(x) = Phi(z_j)

    searching on the interval [-100, 100].

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in the Gaussianized space.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Data approximately recovered in the original input space.
        Samples that fail root-finding are set to 0.0.
    """
    from scipy.optimize import brentq

    Xt = np.zeros_like(X, dtype=float)
    for i in range(self.n_features_):
        for j, xj in enumerate(X[:, i]):
            # Map z -> u in (0, 1) via normal CDF
            u = stats.norm.cdf(xj)
            try:
                # Numerically invert F_KDE(x) = u via root-finding
                Xt[j, i] = brentq(
                    lambda x, u=u, i=i: (
                        self.kdes_[i].integrate_box_1d(-np.inf, x) - u
                    ),
                    -100,
                    100,
                )
            except ValueError:
                # Root not found in [-100, 100]; fall back to zero
                Xt[j, i] = 0.0
    return Xt

transform(X)

Map each feature to N(0, 1) via KDE CDF then probit.

Computes z = Phi^{-1}(F_KDE(x)) per feature using numerical integration of the fitted KDE.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input data in the original space.

Returns

Xt : np.ndarray of shape (n_samples, n_features)
    Gaussianized data.

Source code in rbig/_src/marginal.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Map each feature to N(0, 1) via KDE CDF then probit.

    Computes ``z = Phi^{-1}(F_KDE(x))`` per feature using numerical
    integration of the fitted KDE.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data in the original space.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Gaussianized data.
    """
    Xt = np.zeros_like(X, dtype=float)
    for i in range(self.n_features_):
        # Integrate KDE from -inf to each sample value to get CDF
        u = np.array(
            [self.kdes_[i].integrate_box_1d(-np.inf, xi) for xi in X[:, i]]
        )
        # Clip to avoid +/-inf from the probit function
        u = np.clip(u, self.eps, 1 - self.eps)
        Xt[:, i] = ndtri(u)
    return Xt
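For intuition, the forward map `z = Phi^{-1}(F_KDE(x))` can also be reproduced with SciPy directly. A minimal sketch (not the library API), here Gaussianizing a skewed exponential sample:

```python
import numpy as np
from scipy import stats
from scipy.special import ndtri

rng = np.random.default_rng(0)
x = rng.exponential(size=300)  # a skewed marginal
kde = stats.gaussian_kde(x)

# u = F_KDE(x): KDE CDF obtained by numerical integration
u = np.array([kde.integrate_box_1d(-np.inf, xi) for xi in x])
# Clip away exact 0/1 so the probit stays finite
u = np.clip(u, 1e-10, 1 - 1e-10)
z = ndtri(u)  # probit: z = Phi^{-1}(u)
```

The resulting `z` is approximately standard normal even though `x` is strongly skewed, which is exactly the per-feature step RBIG alternates with rotations.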

MarginalUniformize

Bases: BaseTransform

Transform each marginal to uniform [0, 1] using the empirical CDF.

For each feature dimension i, the empirical CDF is estimated from the training data with a mid-point (Hazen) continuity correction::

u_hat = F_hat_n(x) = (rank(x, X_train) + 0.5) / N

where rank is the number of training samples strictly less than x (left-sided searchsorted) and N is the number of training samples. The +0.5 shift avoids the degenerate values 0 and 1 for in-sample boundary points.

Parameters

bound_correct : bool, default True
    If True, clip the output to [eps, 1 - eps] to prevent exact 0 or 1, which is useful when feeding the result into a probit or logit function.
eps : float, default 1e-6
    Half-width of the clipping margin when bound_correct=True.
pdf_extension : float, default 0.0
    When greater than 0, use a histogram-based CDF whose support is extended by pdf_extension percent of the data range instead of the default empirical CDF.
pdf_resolution : int, default 1000
    Number of grid points for the interpolated histogram-based CDF.

Attributes

support_ : np.ndarray of shape (n_samples, n_features)
    Column-wise sorted training data. Serves as empirical quantile nodes for both the forward transform and piecewise-linear inversion.
n_features_ : int
    Number of feature dimensions seen during fit.

Notes

The mid-point empirical CDF (Hazen plotting position) is::

F_hat_n(x) = (rank + 0.5) / N

The inverse is approximated by piecewise-linear interpolation between the sorted support values and their corresponding uniform probabilities np.linspace(0, 1, N).
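Both directions reduce to a few NumPy calls. A standalone sketch of the forward mid-point ECDF and its interpolated inverse (illustrative only, not the library API):

```python
import numpy as np

rng = np.random.default_rng(0)
support = np.sort(rng.standard_normal(100))  # sorted training values (quantile nodes)
x_new = rng.standard_normal(10)

# Forward: Hazen mid-point ECDF, u = (rank + 0.5) / N
ranks = np.searchsorted(support, x_new, side="left")
u = (ranks + 0.5) / len(support)

# Inverse: piecewise-linear interpolation from [0, 1] back to the support
x_back = np.interp(u, np.linspace(0, 1, len(support)), support)
```

Note the forward map is a step function of rank while the inverse is piecewise linear, so the round trip is only approximate between quantile nodes.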

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Examples

>>> import numpy as np
>>> from rbig._src.marginal import MarginalUniformize
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((100, 2))
>>> uni = MarginalUniformize().fit(X)
>>> U = uni.transform(X)
>>> U.shape
(100, 2)
>>> bool(U.min() > 0.0) and bool(U.max() < 1.0)
True
>>> Xr = uni.inverse_transform(U)
>>> Xr.shape
(100, 2)

Source code in rbig/_src/marginal.py
class MarginalUniformize(BaseTransform):
    """Transform each marginal to uniform [0, 1] using the empirical CDF.

    For each feature dimension *i*, the empirical CDF is estimated from the
    training data with a mid-point (Hazen) continuity correction::

        u_hat = F_hat_n(x) = (rank(x, X_train) + 0.5) / N

    where *rank* is the number of training samples strictly less than x
    (left-sided
    ``searchsorted``) and *N* is the number of training samples.  The
    ``+0.5`` shift avoids the degenerate values 0 and 1 for in-sample
    boundary points.

    Parameters
    ----------
    bound_correct : bool, default True
        If True, clip the output to ``[eps, 1 - eps]`` to prevent exact 0
        or 1, which is useful when feeding the result into a probit or
        logit function.
    eps : float, default 1e-6
        Half-width of the clipping margin when ``bound_correct=True``.

    Attributes
    ----------
    support_ : np.ndarray of shape (n_samples, n_features)
        Column-wise sorted training data.  Serves as empirical quantile
        nodes for both the forward transform and piecewise-linear inversion.
    n_features_ : int
        Number of feature dimensions seen during ``fit``.

    Notes
    -----
    The mid-point empirical CDF (Hazen plotting position) is::

        F_hat_n(x) = (rank + 0.5) / N

    The inverse is approximated by piecewise-linear interpolation between
    the sorted support values and their corresponding uniform probabilities
    ``np.linspace(0, 1, N)``.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.marginal import MarginalUniformize
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((100, 2))
    >>> uni = MarginalUniformize().fit(X)
    >>> U = uni.transform(X)
    >>> U.shape
    (100, 2)
    >>> bool(U.min() > 0.0) and bool(U.max() < 1.0)
    True
    >>> Xr = uni.inverse_transform(U)
    >>> Xr.shape
    (100, 2)
    """

    def __init__(
        self,
        bound_correct: bool = True,
        eps: float = 1e-6,
        pdf_extension: float = 0.0,
        pdf_resolution: int = 1000,
    ):
        self.bound_correct = bound_correct
        self.eps = eps
        self.pdf_extension = pdf_extension
        self.pdf_resolution = pdf_resolution

    def fit(self, X: np.ndarray) -> MarginalUniformize:
        """Fit the transform by storing sorted training values per feature.

        When ``pdf_extension > 0``, a histogram-based CDF pipeline is used
        instead of the default empirical CDF.  This extends the support by
        ``pdf_extension`` percent of the data range and builds an interpolated,
        monotonic CDF on a grid of ``pdf_resolution`` points.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.  Each column is sorted and stored as the empirical
            support (quantile nodes) for that feature.

        Returns
        -------
        self : MarginalUniformize
            Fitted transform instance.
        """
        self.n_features_ = X.shape[1]

        if self.pdf_extension > 0:
            self._fit_histogram_cdf(X)
        else:
            # Sort each column independently to obtain empirical quantile nodes
            self.support_ = np.sort(X, axis=0)
        return self

    def _fit_histogram_cdf(self, X: np.ndarray) -> None:
        """Build per-feature histogram CDF with extended support."""
        n_samples = X.shape[0]
        self.cdf_support_ = []
        self.cdf_values_ = []
        self.pdf_support_ = []
        self.pdf_values_ = []

        for i in range(self.n_features_):
            xi = X[:, i]
            x_min, x_max = xi.min(), xi.max()

            # Handle constant-valued feature: trivial linear CDF
            if x_min == x_max:
                support = np.array([x_min - 1.0, x_min, x_min + 1.0])
                cdf_vals = np.array([0.0, 0.5, 1.0])
                pdf_sup = np.array([x_min - 1.0, x_min, x_min + 1.0])
                pdf_vals = np.array([0.0, 1.0, 0.0])
                self.cdf_support_.append(support)
                self.cdf_values_.append(cdf_vals)
                self.pdf_support_.append(pdf_sup)
                self.pdf_values_.append(pdf_vals)
                continue

            support_ext = (self.pdf_extension / 100) * abs(x_max - x_min)

            # Build histogram bins: sqrt(n) + 1 edges
            n_bin_edges = int(np.sqrt(float(n_samples)) + 1)
            bin_edges = np.linspace(x_min, x_max, n_bin_edges)
            bin_centers = (bin_edges[:-1] + bin_edges[1:]) / 2
            counts, _ = np.histogram(xi, bin_edges)
            bin_size = bin_edges[1] - bin_edges[0]

            # Empirical PDF with zero-padded edges
            pdf_support = np.concatenate(
                [
                    [bin_centers[0] - bin_size],
                    bin_centers,
                    [bin_centers[-1] + bin_size],
                ]
            )
            empirical_pdf = np.concatenate(
                [
                    [0.0],
                    counts / (np.sum(counts) * bin_size),
                    [0.0],
                ]
            )

            # CDF from cumulative counts with extended support
            c_sum = np.cumsum(counts)
            cdf = (1 - 1 / n_samples) * c_sum / n_samples
            incr_bin = bin_size / 2

            new_bin_edges = np.concatenate(
                [
                    [x_min - support_ext],
                    [x_min],
                    bin_centers + incr_bin,
                    [x_max + support_ext + incr_bin],
                ]
            )
            extended_cdf = np.concatenate(
                [
                    [0.0],
                    [1.0 / n_samples],
                    cdf,
                    [1.0],
                ]
            )

            # Interpolate onto fine grid, enforce monotonicity, normalize
            new_support = np.linspace(
                new_bin_edges[0], new_bin_edges[-1], self.pdf_resolution
            )
            learned_cdf = np.interp(new_support, new_bin_edges, extended_cdf)
            uniform_cdf = make_cdf_monotonic(learned_cdf)
            uniform_cdf /= uniform_cdf.max()

            self.cdf_support_.append(new_support)
            self.cdf_values_.append(uniform_cdf)
            self.pdf_support_.append(pdf_support)
            self.pdf_values_.append(empirical_pdf)

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Map each feature to [0, 1] via the mid-point empirical CDF.

        Applies ``u = (rank + 0.5) / N`` to every column independently.
        When ``pdf_extension > 0``, uses interpolation with the stored
        histogram-based CDF grid instead.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data in the original space.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Uniformized data in [0, 1] (or ``[eps, 1 - eps]`` when
            ``bound_correct=True``).
        """
        Xt = np.zeros_like(X, dtype=float)
        if self.pdf_extension > 0:
            for i in range(self.n_features_):
                Xt[:, i] = np.interp(X[:, i], self.cdf_support_[i], self.cdf_values_[i])
        else:
            for i in range(self.n_features_):
                Xt[:, i] = self._uniformize(X[:, i], self.support_[:, i])
        if self.bound_correct:
            # Clip to (eps, 1-eps) to prevent boundary issues downstream
            Xt = np.clip(Xt, self.eps, 1 - self.eps)
        return Xt

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Map uniform [0, 1] values back to the original space.

        Uses piecewise-linear interpolation between the stored sorted support
        values and their corresponding uniform probabilities
        ``np.linspace(0, 1, N)``.  When ``pdf_extension > 0``, uses the
        inverted histogram CDF grid instead.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in the uniform [0, 1] space.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Data approximately recovered in the original input space.
        """
        Xt = np.zeros_like(X, dtype=float)
        if self.pdf_extension > 0:
            for i in range(self.n_features_):
                # Ensure strictly increasing xp for np.interp by
                # dropping duplicate CDF values
                cdf_vals = self.cdf_values_[i]
                cdf_sup = self.cdf_support_[i]
                unique_mask = np.concatenate([[True], np.diff(cdf_vals) > 0])
                Xt[:, i] = np.interp(
                    X[:, i], cdf_vals[unique_mask], cdf_sup[unique_mask]
                )
        else:
            for i in range(self.n_features_):
                # Interpolate: uniform grid [0, 1] -> sorted training values
                Xt[:, i] = np.interp(
                    X[:, i],
                    np.linspace(0, 1, len(self.support_[:, i])),
                    self.support_[:, i],
                )
        return Xt

    @staticmethod
    def _uniformize(x: np.ndarray, support: np.ndarray) -> np.ndarray:
        """Compute the mid-point empirical CDF for a single feature.

        Parameters
        ----------
        x : np.ndarray of shape (n_samples,)
            New data values to evaluate the empirical CDF at.
        support : np.ndarray of shape (n_train,)
            Sorted training values used as the empirical quantile nodes.

        Returns
        -------
        u : np.ndarray of shape (n_samples,)
            Empirical CDF values: ``(rank + 0.5) / n_train``.
        """
        n = len(support)
        # Left-sided searchsorted counts the training points strictly less than x
        ranks = np.searchsorted(support, x, side="left")
        # Mid-point shift (+0.5) avoids exact 0 and 1
        return (ranks + 0.5) / n

fit(X)

Fit the transform by storing sorted training values per feature.

When pdf_extension > 0, a histogram-based CDF pipeline is used instead of the default empirical CDF. This extends the support by pdf_extension percent of the data range and builds an interpolated, monotonic CDF on a grid of pdf_resolution points.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Training data. Each column is sorted and stored as the empirical support (quantile nodes) for that feature.

Returns

self : MarginalUniformize
    Fitted transform instance.

Source code in rbig/_src/marginal.py
def fit(self, X: np.ndarray) -> MarginalUniformize:
    """Fit the transform by storing sorted training values per feature.

    When ``pdf_extension > 0``, a histogram-based CDF pipeline is used
    instead of the default empirical CDF.  This extends the support by
    ``pdf_extension`` percent of the data range and builds an interpolated,
    monotonic CDF on a grid of ``pdf_resolution`` points.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.  Each column is sorted and stored as the empirical
        support (quantile nodes) for that feature.

    Returns
    -------
    self : MarginalUniformize
        Fitted transform instance.
    """
    self.n_features_ = X.shape[1]

    if self.pdf_extension > 0:
        self._fit_histogram_cdf(X)
    else:
        # Sort each column independently to obtain empirical quantile nodes
        self.support_ = np.sort(X, axis=0)
    return self

inverse_transform(X)

Map uniform [0, 1] values back to the original space.

Uses piecewise-linear interpolation between the stored sorted support values and their corresponding uniform probabilities np.linspace(0, 1, N). When pdf_extension > 0, uses the inverted histogram CDF grid instead.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Data in the uniform [0, 1] space.

Returns

Xt : np.ndarray of shape (n_samples, n_features)
    Data approximately recovered in the original input space.

Source code in rbig/_src/marginal.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Map uniform [0, 1] values back to the original space.

    Uses piecewise-linear interpolation between the stored sorted support
    values and their corresponding uniform probabilities
    ``np.linspace(0, 1, N)``.  When ``pdf_extension > 0``, uses the
    inverted histogram CDF grid instead.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in the uniform [0, 1] space.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Data approximately recovered in the original input space.
    """
    Xt = np.zeros_like(X, dtype=float)
    if self.pdf_extension > 0:
        for i in range(self.n_features_):
            # Ensure strictly increasing xp for np.interp by
            # dropping duplicate CDF values
            cdf_vals = self.cdf_values_[i]
            cdf_sup = self.cdf_support_[i]
            unique_mask = np.concatenate([[True], np.diff(cdf_vals) > 0])
            Xt[:, i] = np.interp(
                X[:, i], cdf_vals[unique_mask], cdf_sup[unique_mask]
            )
    else:
        for i in range(self.n_features_):
            # Interpolate: uniform grid [0, 1] -> sorted training values
            Xt[:, i] = np.interp(
                X[:, i],
                np.linspace(0, 1, len(self.support_[:, i])),
                self.support_[:, i],
            )
    return Xt

transform(X)

Map each feature to [0, 1] via the mid-point empirical CDF.

Applies u = (rank + 0.5) / N to every column independently. When pdf_extension > 0, uses interpolation with the stored histogram-based CDF grid instead.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input data in the original space.

Returns

Xt : np.ndarray of shape (n_samples, n_features)
    Uniformized data in [0, 1] (or [eps, 1 - eps] when bound_correct=True).

Source code in rbig/_src/marginal.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Map each feature to [0, 1] via the mid-point empirical CDF.

    Applies ``u = (rank + 0.5) / N`` to every column independently.
    When ``pdf_extension > 0``, uses interpolation with the stored
    histogram-based CDF grid instead.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data in the original space.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Uniformized data in [0, 1] (or ``[eps, 1 - eps]`` when
        ``bound_correct=True``).
    """
    Xt = np.zeros_like(X, dtype=float)
    if self.pdf_extension > 0:
        for i in range(self.n_features_):
            Xt[:, i] = np.interp(X[:, i], self.cdf_support_[i], self.cdf_values_[i])
    else:
        for i in range(self.n_features_):
            Xt[:, i] = self._uniformize(X[:, i], self.support_[:, i])
    if self.bound_correct:
        # Clip to (eps, 1-eps) to prevent boundary issues downstream
        Xt = np.clip(Xt, self.eps, 1 - self.eps)
    return Xt

OrthogonalDimensionalityReduction

Bases: RotationBijector

Full orthogonal rotation followed by optional dimension truncation.

Applies a D x D orthogonal rotation Q (drawn from the Haar measure via QR) and then retains only the first K <= D components::

z = (X Q^T)[:, :K]

The rotation is sampled fresh at fit time from a square standard-normal matrix processed through QR with sign correction.
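The sampling step described above can be sketched with NumPy directly. This is an illustrative construction (one common way to obtain a Haar-uniform orthogonal matrix), not the library's internals:

```python
import numpy as np

rng = np.random.default_rng(0)
D = 4
A = rng.standard_normal((D, D))  # square standard-normal matrix
Q, R = np.linalg.qr(A)
# Sign correction: flip columns so diag(R) > 0, which makes Q exactly
# Haar-distributed over the orthogonal group
Q = Q * np.sign(np.diag(R))
```

Without the sign step, plain QR biases the distribution of Q; with it, Q is uniform under the Haar measure, and `|det Q| = 1` as required for a volume-preserving rotation.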

Parameters

n_components : int or None, default None
    Number of output dimensions K. If None, K = D (no truncation).
random_state : int or None, default None
    Seed for reproducible rotation matrix generation.

Attributes

rotation_matrix_ : np.ndarray of shape (n_features, n_features)
    Full D x D orthogonal rotation matrix Q.
n_components_ : int
    Number of retained output dimensions K.
input_dim_ : int
    Input dimensionality D.

Notes

When K = D the transform is a bijection and::

log|det J| = log|det Q| = 0

When K < D the transform is not invertible; both inverse_transform and get_log_det_jacobian raise NotImplementedError.

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Examples

>>> import numpy as np
>>> from rbig._src.rotation import OrthogonalDimensionalityReduction
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((100, 4))
>>> odr = OrthogonalDimensionalityReduction(random_state=0).fit(X)
>>> Z = odr.transform(X)
>>> Z.shape
(100, 4)

Source code in rbig/_src/rotation.py
class OrthogonalDimensionalityReduction(RotationBijector):
    """Full orthogonal rotation followed by optional dimension truncation.

    Applies a D x D orthogonal rotation Q (drawn from the Haar measure via
    QR) and then retains only the first K <= D components::

        z = (X Q^T)[:, :K]

    The rotation is sampled fresh at ``fit`` time from a square
    standard-normal matrix processed through QR with sign correction.

    Parameters
    ----------
    n_components : int or None, default None
        Number of output dimensions K.  If ``None``, K = D (no truncation).
    random_state : int or None, default None
        Seed for reproducible rotation matrix generation.

    Attributes
    ----------
    rotation_matrix_ : np.ndarray of shape (n_features, n_features)
        Full D x D orthogonal rotation matrix Q.
    n_components_ : int
        Number of retained output dimensions K.
    input_dim_ : int
        Input dimensionality D.

    Notes
    -----
    When K = D the transform is a bijection and::

        log|det J| = log|det Q| = 0

    When K < D the transform is not invertible; both
    :meth:`inverse_transform` and :meth:`get_log_det_jacobian` raise
    :class:`NotImplementedError`.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.rotation import OrthogonalDimensionalityReduction
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((100, 4))
    >>> odr = OrthogonalDimensionalityReduction(random_state=0).fit(X)
    >>> Z = odr.transform(X)
    >>> Z.shape
    (100, 4)
    """

    def __init__(
        self,
        n_components: int | None = None,
        random_state: int | None = None,
    ):
        self.n_components = n_components
        self.random_state = random_state

    def fit(self, X: np.ndarray) -> OrthogonalDimensionalityReduction:
        """Sample a Haar-uniform D x D rotation matrix.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data (used only to infer D = n_features).

        Returns
        -------
        self : OrthogonalDimensionalityReduction
            Fitted transform instance.
        """
        rng = np.random.default_rng(self.random_state)
        D = X.shape[1]
        K = self.n_components if self.n_components is not None else D
        # QR of a random Gaussian matrix gives a Haar-uniform orthogonal matrix
        A = rng.standard_normal((D, D))
        Q, _ = np.linalg.qr(A)
        self.rotation_matrix_ = Q  # (D, D)
        self.n_components_ = K
        self.input_dim_ = D
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Rotate then truncate: z = (X Q^T)[:, :K].

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data.

        Returns
        -------
        Z : np.ndarray of shape (n_samples, n_components)
            Rotated and (optionally) truncated data.
        """
        Xr = X @ self.rotation_matrix_.T  # (N, D) @ (D, D) -> (N, D)  full rotation
        return Xr[:, : self.n_components_]  # (N, K)  keep first K components

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Invert the rotation (only valid for square case K = D).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_components)
            Rotated data.

        Returns
        -------
        Xr : np.ndarray of shape (n_samples, n_features)
            Data recovered in the original space.

        Raises
        ------
        NotImplementedError
            If ``n_components < n_features`` (not invertible).
        """
        if self.n_components_ < self.input_dim_:
            raise NotImplementedError(
                "OrthogonalDimensionalityReduction with n_components < input dimension "
                "is not bijective; inverse_transform is undefined."
            )
        return X @ self.rotation_matrix_  # (N, D) @ (D, D) -> (N, D)

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Return zeros for the square (bijective) case.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points (used only to determine n_samples).

        Returns
        -------
        ldj : np.ndarray of shape (n_samples,)
            Zeros, because ``|det Q| = 1`` for any orthogonal Q.

        Raises
        ------
        NotImplementedError
            If ``n_components < n_features`` (Jacobian determinant undefined).
        """
        if self.n_components_ < self.input_dim_:
            raise NotImplementedError(
                "OrthogonalDimensionalityReduction with n_components < input dimension "
                "does not have a well-defined Jacobian determinant."
            )
        return np.zeros(X.shape[0])
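In the square case the inverse is just a right-multiplication by Q, as `inverse_transform` does. A quick standalone round-trip check (illustrative sketch, not the class itself):

```python
import numpy as np

rng = np.random.default_rng(0)
D, N = 4, 100
Q, _ = np.linalg.qr(rng.standard_normal((D, D)))  # orthogonal rotation

X = rng.standard_normal((N, D))
Z = X @ Q.T     # forward: z = x Q^T (K = D, no truncation)
X_back = Z @ Q  # inverse: right-multiply by Q
```

Because Q is orthogonal, `Q.T @ Q = I`, so the forward and inverse compose to the identity and the log-det-Jacobian contribution is zero.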

fit(X)

Sample a Haar-uniform D x D rotation matrix.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Training data (used only to infer D = n_features).

Returns

self : OrthogonalDimensionalityReduction
    Fitted transform instance.

Source code in rbig/_src/rotation.py
def fit(self, X: np.ndarray) -> OrthogonalDimensionalityReduction:
    """Sample a Haar-uniform D x D rotation matrix.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data (used only to infer D = n_features).

    Returns
    -------
    self : OrthogonalDimensionalityReduction
        Fitted transform instance.
    """
    rng = np.random.default_rng(self.random_state)
    D = X.shape[1]
    K = self.n_components if self.n_components is not None else D
    # QR of a random Gaussian matrix gives a Haar-uniform orthogonal matrix
    A = rng.standard_normal((D, D))
    Q, _ = np.linalg.qr(A)
    self.rotation_matrix_ = Q  # (D, D)
    self.n_components_ = K
    self.input_dim_ = D
    return self

get_log_det_jacobian(X)

Return zeros for the square (bijective) case.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input points (used only to determine n_samples).

Returns

ldj : np.ndarray of shape (n_samples,)
    Zeros, because |det Q| = 1 for any orthogonal Q.

Raises

NotImplementedError
    If n_components < n_features (Jacobian determinant undefined).

Source code in rbig/_src/rotation.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Return zeros for the square (bijective) case.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points (used only to determine n_samples).

    Returns
    -------
    ldj : np.ndarray of shape (n_samples,)
        Zeros, because ``|det Q| = 1`` for any orthogonal Q.

    Raises
    ------
    NotImplementedError
        If ``n_components < n_features`` (Jacobian determinant undefined).
    """
    if self.n_components_ < self.input_dim_:
        raise NotImplementedError(
            "OrthogonalDimensionalityReduction with n_components < input dimension "
            "does not have a well-defined Jacobian determinant."
        )
    return np.zeros(X.shape[0])

inverse_transform(X)

Invert the rotation (only valid for square case K = D).

Parameters

X : np.ndarray of shape (n_samples, n_components)
    Rotated data.

Returns

Xr : np.ndarray of shape (n_samples, n_features)
    Data recovered in the original space.

Raises

NotImplementedError
    If n_components < n_features (not invertible).

Source code in rbig/_src/rotation.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Invert the rotation (only valid for square case K = D).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_components)
        Rotated data.

    Returns
    -------
    Xr : np.ndarray of shape (n_samples, n_features)
        Data recovered in the original space.

    Raises
    ------
    NotImplementedError
        If ``n_components < n_features`` (not invertible).
    """
    if self.n_components_ < self.input_dim_:
        raise NotImplementedError(
            "OrthogonalDimensionalityReduction with n_components < input dimension "
            "is not bijective; inverse_transform is undefined."
        )
    return X @ self.rotation_matrix_  # (N, D) @ (D, D) -> (N, D)

transform(X)

Rotate then truncate: z = (X Q^T)[:, :K].

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input data.

Returns

Z : np.ndarray of shape (n_samples, n_components)
    Rotated and (optionally) truncated data.

Source code in rbig/_src/rotation.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Rotate then truncate: z = (X Q^T)[:, :K].

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data.

    Returns
    -------
    Z : np.ndarray of shape (n_samples, n_components)
        Rotated and (optionally) truncated data.
    """
    Xr = X @ self.rotation_matrix_.T  # (N, D) @ (D, D) -> (N, D)  full rotation
    return Xr[:, : self.n_components_]  # (N, K)  keep first K components

OrthogonalWaveletLayer

Bases: ImageBijector

Single-level 2-D DWT "squeeze" bijector.

Applies a single-level 2-D discrete wavelet transform to each spatial channel, splitting each (H, W) feature map into four (H/2, W/2) sub-bands and concatenating them along the channel axis.

The forward transform is

.. math::

(N,\, C,\, H,\, W)
\xrightarrow{\text{DWT}}
(N,\, 4C,\, H/2,\, W/2)

Sub-band channel ordering (repeated C times):

  • 4c + 0 — LL (approximation, cA)
  • 4c + 1 — LH (horizontal detail, cH)
  • 4c + 2 — HL (vertical detail, cV)
  • 4c + 3 — HH (diagonal detail, cD)

The Jacobian determinant is 1 for orthonormal wavelets (e.g. Haar), so log|det J| = 0.

Requires PyWavelets (pip install PyWavelets).

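The Haar case of this squeeze can be written directly in numpy (a minimal single-channel sketch, assuming even H and W; sign conventions may differ from `pywt.dwt2`). Because the four analysis filters are orthonormal, the coefficients carry exactly the input's energy, which is why `log|det J| = 0`:

```python
import numpy as np

def haar_squeeze(img):
    """Single-level orthonormal Haar DWT of one (H, W) channel.

    Returns the four (H/2, W/2) sub-bands (cA, cH, cV, cD).
    """
    a = img[0::2, 0::2]  # top-left of each 2x2 block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    cA = (a + b + c + d) / 2  # LL: approximation
    cH = (a + b - c - d) / 2  # detail
    cV = (a - b + c - d) / 2  # detail
    cD = (a - b - c + d) / 2  # detail
    return cA, cH, cV, cD

rng = np.random.default_rng(0)
img = rng.standard_normal((8, 8))
bands = haar_squeeze(img)

# (H, W) -> four (H/2, W/2) sub-bands: same total number of coefficients
assert all(band.shape == (4, 4) for band in bands)

# Orthonormality: total energy is preserved, hence |det J| = 1
assert np.isclose(sum((band**2).sum() for band in bands), (img**2).sum())
```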

Parameters

wavelet : str, default "haar"
    Wavelet name accepted by :func:pywt.dwt2.
C : int, default 1
    Number of input channels.
H : int, default 8
    Input image height. Must be even.
W : int, default 8
    Input image width. Must be even.

Attributes

C_ : int
    Fitted number of channels.
H_ : int
    Fitted image height.
W_ : int
    Fitted image width.

Examples

>>> import numpy as np
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((10, 1 * 8 * 8))  # N=10, C=1, H=8, W=8
>>> layer = OrthogonalWaveletLayer(wavelet="haar", C=1, H=8, W=8)
>>> layer.fit(X)  # doctest: +ELLIPSIS
OrthogonalWaveletLayer(...)
>>> Xt = layer.transform(X)
>>> Xt.shape  # N=10, 4*C=4, H/2=4, W/2=4 → flat 4*4*4=64
(10, 64)
>>> Xr = layer.inverse_transform(Xt)
>>> Xr.shape
(10, 64)

Source code in rbig/_src/image.py
class OrthogonalWaveletLayer(ImageBijector):
    """Single-level 2-D DWT "squeeze" bijector.

    Applies a single-level 2-D discrete wavelet transform to each spatial
    channel, splitting each ``(H, W)`` feature map into four ``(H/2, W/2)``
    sub-bands and concatenating them along the channel axis.

    The forward transform is

    .. math::

        (N,\\, C,\\, H,\\, W)
        \\xrightarrow{\\text{DWT}}
        (N,\\, 4C,\\, H/2,\\, W/2)

    Sub-band channel ordering (repeated ``C`` times):

    * ``4c + 0`` — LL (approximation, cA)
    * ``4c + 1`` — LH (horizontal detail, cH)
    * ``4c + 2`` — HL (vertical detail, cV)
    * ``4c + 3`` — HH (diagonal detail, cD)

    The Jacobian determinant is 1 for orthonormal wavelets (e.g. Haar),
    so ``log|det J| = 0``.

    Requires PyWavelets (``pip install PyWavelets``).

    Parameters
    ----------
    wavelet : str, default ``"haar"``
        Wavelet name accepted by :func:`pywt.dwt2`.
    C : int, default 1
        Number of input channels.
    H : int, default 8
        Input image height.  Must be even.
    W : int, default 8
        Input image width.  Must be even.

    Attributes
    ----------
    C_ : int
        Fitted number of channels.
    H_ : int
        Fitted image height.
    W_ : int
        Fitted image width.

    Examples
    --------
    >>> import numpy as np
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((10, 1 * 8 * 8))  # N=10, C=1, H=8, W=8
    >>> layer = OrthogonalWaveletLayer(wavelet="haar", C=1, H=8, W=8)
    >>> layer.fit(X)  # doctest: +ELLIPSIS
    OrthogonalWaveletLayer(...)
    >>> Xt = layer.transform(X)
    >>> Xt.shape  # N=10, 4*C=4, H/2=4, W/2=4 → flat 4*4*4=64
    (10, 64)
    >>> Xr = layer.inverse_transform(Xt)
    >>> Xr.shape
    (10, 64)
    """

    def __init__(self, wavelet: str = "haar", C: int = 1, H: int = 8, W: int = 8):
        self.wavelet = wavelet
        self.C = C
        self.H = H
        self.W = W

    def fit(self, X: np.ndarray) -> OrthogonalWaveletLayer:
        """Store spatial dimensions; no data-dependent fitting required.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, C*H*W)``

        Returns
        -------
        self : OrthogonalWaveletLayer
        """
        self.C_ = self.C
        self.H_ = self.H
        self.W_ = self.W
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply single-level 2-D DWT and flatten output.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, C*H*W)``
            Flattened image batch.

        Returns
        -------
        Xt : np.ndarray, shape ``(N, 4*C * H//2 * W//2)``
            Flattened wavelet coefficient batch.
        """
        import pywt

        N = X.shape[0]
        imgs = self._to_tensor(X)  # (N, C, H, W)
        result = []
        for n in range(N):
            channels = []
            for c in range(self.C_):
                # 2-D DWT on single (H, W) channel
                coeffs = pywt.dwt2(imgs[n, c], self.wavelet)
                cA, (cH, cV, cD) = coeffs  # each sub-band: (H/2, W/2)
                # Stack sub-bands: LL, LH, HL, HH
                channels.extend([cA, cH, cV, cD])
            result.append(np.stack(channels, axis=0))  # (4C, H/2, W/2)
        out = np.array(result)  # (N, 4C, H/2, W/2)
        return out.reshape(N, -1)

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Reconstruct images via single-level 2-D inverse DWT.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, 4*C * H//2 * W//2)``
            Flattened coefficient batch from :meth:`transform`.

        Returns
        -------
        Xr : np.ndarray, shape ``(N, C*H*W)``
            Reconstructed flattened image batch.
        """
        import pywt

        N = X.shape[0]
        H2, W2 = self.H_ // 2, self.W_ // 2
        # Unpack into (N, 4C, H/2, W/2) tensor
        imgs = X.reshape(N, 4 * self.C_, H2, W2)
        result = []
        for n in range(N):
            channels = []
            for c in range(self.C_):
                # Extract the four sub-bands for channel c
                cA = imgs[n, 4 * c]  # LL approximation
                cH = imgs[n, 4 * c + 1]  # LH horizontal detail
                cV = imgs[n, 4 * c + 2]  # HL vertical detail
                cD = imgs[n, 4 * c + 3]  # HH diagonal detail
                reconstructed = pywt.idwt2((cA, (cH, cV, cD)), self.wavelet)
                channels.append(reconstructed)
            result.append(np.stack(channels, axis=0))
        out = np.array(result)  # (N, C, H, W)
        return out.reshape(N, -1)

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Return zeros: orthonormal DWT has ``|det J| = 1``.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, D)``

        Returns
        -------
        log_det : np.ndarray, shape ``(N,)``
        """
        return np.zeros(X.shape[0])

fit(X)

Store spatial dimensions; no data-dependent fitting required.

Parameters

X : np.ndarray, shape (N, C*H*W)

Returns

self : OrthogonalWaveletLayer

Source code in rbig/_src/image.py
def fit(self, X: np.ndarray) -> OrthogonalWaveletLayer:
    """Store spatial dimensions; no data-dependent fitting required.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, C*H*W)``

    Returns
    -------
    self : OrthogonalWaveletLayer
    """
    self.C_ = self.C
    self.H_ = self.H
    self.W_ = self.W
    return self

get_log_det_jacobian(X)

Return zeros: orthonormal DWT has |det J| = 1.

Parameters

X : np.ndarray, shape (N, D)

Returns

log_det : np.ndarray, shape (N,)

Source code in rbig/_src/image.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Return zeros: orthonormal DWT has ``|det J| = 1``.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, D)``

    Returns
    -------
    log_det : np.ndarray, shape ``(N,)``
    """
    return np.zeros(X.shape[0])

inverse_transform(X)

Reconstruct images via single-level 2-D inverse DWT.

Parameters

X : np.ndarray, shape (N, 4*C * H//2 * W//2) Flattened coefficient batch from :meth:transform.

Returns

Xr : np.ndarray, shape (N, C*H*W) Reconstructed flattened image batch.

Source code in rbig/_src/image.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Reconstruct images via single-level 2-D inverse DWT.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, 4*C * H//2 * W//2)``
        Flattened coefficient batch from :meth:`transform`.

    Returns
    -------
    Xr : np.ndarray, shape ``(N, C*H*W)``
        Reconstructed flattened image batch.
    """
    import pywt

    N = X.shape[0]
    H2, W2 = self.H_ // 2, self.W_ // 2
    # Unpack into (N, 4C, H/2, W/2) tensor
    imgs = X.reshape(N, 4 * self.C_, H2, W2)
    result = []
    for n in range(N):
        channels = []
        for c in range(self.C_):
            # Extract the four sub-bands for channel c
            cA = imgs[n, 4 * c]  # LL approximation
            cH = imgs[n, 4 * c + 1]  # LH horizontal detail
            cV = imgs[n, 4 * c + 2]  # HL vertical detail
            cD = imgs[n, 4 * c + 3]  # HH diagonal detail
            reconstructed = pywt.idwt2((cA, (cH, cV, cD)), self.wavelet)
            channels.append(reconstructed)
        result.append(np.stack(channels, axis=0))
    out = np.array(result)  # (N, C, H, W)
    return out.reshape(N, -1)

transform(X)

Apply single-level 2-D DWT and flatten output.

Parameters

X : np.ndarray, shape (N, C*H*W) Flattened image batch.

Returns

Xt : np.ndarray, shape (N, 4*C * H//2 * W//2) Flattened wavelet coefficient batch.

Source code in rbig/_src/image.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply single-level 2-D DWT and flatten output.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, C*H*W)``
        Flattened image batch.

    Returns
    -------
    Xt : np.ndarray, shape ``(N, 4*C * H//2 * W//2)``
        Flattened wavelet coefficient batch.
    """
    import pywt

    N = X.shape[0]
    imgs = self._to_tensor(X)  # (N, C, H, W)
    result = []
    for n in range(N):
        channels = []
        for c in range(self.C_):
            # 2-D DWT on single (H, W) channel
            coeffs = pywt.dwt2(imgs[n, c], self.wavelet)
            cA, (cH, cV, cD) = coeffs  # each sub-band: (H/2, W/2)
            # Stack sub-bands: LL, LH, HL, HH
            channels.extend([cA, cH, cV, cD])
        result.append(np.stack(channels, axis=0))  # (4C, H/2, W/2)
    out = np.array(result)  # (N, 4C, H/2, W/2)
    return out.reshape(N, -1)

PCARotation

Bases: BaseTransform

PCA-based rotation with optional whitening (decorrelation + rescaling).

Fits a standard PCA (via scikit-learn's :class:~sklearn.decomposition.PCA) and uses it as a linear rotation transform. When whiten=True (default), each principal component is additionally rescaled by the reciprocal square root of its eigenvalue so the output has unit variance per component::

z = Lambda^{-1/2} V^T (x - mu)

where V in R^{D x K} is the matrix of leading eigenvectors (principal axes), Lambda in R^{K x K} is the diagonal eigenvalue matrix, and mu is the sample mean. When whiten=False, the rescaling is omitted and the transform is a pure rotation::

z = V^T (x - mu)

Parameters

n_components : int or None, default None
    Number of principal components to retain. If None, all D components are kept.
whiten : bool, default True
    If True, divide each component by sqrt(lambda_i) to decorrelate and normalise variance. If False, only rotate (and center).

Attributes

pca_ : sklearn.decomposition.PCA Fitted PCA object containing eigenvectors, eigenvalues, and the sample mean.

Notes

The log-absolute-Jacobian determinant for the whitening transform is::

log|det J| = -1/2 * sum_i log(lambda_i)

because each whitening factor Lambda^{-1/2} contributes -1/2 * log(lambda_i) per component. For a pure rotation (whiten=False), the determinant is 1 and the log is 0.
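The Notes formula can be checked numerically with plain numpy (a sketch using an explicit eigendecomposition rather than the fitted pca_ object): the whitening map is linear, z = Lambda^{-1/2} V^T (x - mu), so its log-det is that of the matrix Lambda^{-1/2} V^T.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 4)) @ rng.standard_normal((4, 4))  # correlated data

# Eigendecomposition of the sample covariance: C = V diag(lam) V^T
C = np.cov(X, rowvar=False)
lam, V = np.linalg.eigh(C)

# Whitening matrix A = Lambda^{-1/2} V^T; the transform is z = A (x - mu)
A = np.diag(lam ** -0.5) @ V.T

# log|det A| matches -1/2 * sum_i log(lambda_i), since |det V^T| = 1
sign, logdet = np.linalg.slogdet(A)
assert np.isclose(logdet, -0.5 * np.sum(np.log(lam)))
```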

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Examples

>>> import numpy as np
>>> from rbig._src.rotation import PCARotation
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((200, 4))
>>> pca_rot = PCARotation(whiten=True).fit(X)
>>> Z = pca_rot.transform(X)
>>> Z.shape
(200, 4)
>>> ldj = pca_rot.log_det_jacobian(X)
>>> ldj.shape
(200,)

Source code in rbig/_src/rotation.py
class PCARotation(BaseTransform):
    """PCA-based rotation with optional whitening (decorrelation + rescaling).

    Fits a standard PCA (via scikit-learn's :class:`~sklearn.decomposition.PCA`)
    and uses it as a linear rotation transform.  When ``whiten=True`` (default),
    each principal component is additionally rescaled by the reciprocal square
    root of its eigenvalue so the output has unit variance per component::

        z = Lambda^{-1/2} V^T (x - mu)

    where ``V`` in R^{D x K} is the matrix of leading eigenvectors (principal
    axes), ``Lambda`` in R^{K x K} is the diagonal eigenvalue matrix, and
    ``mu`` is the sample mean.  When ``whiten=False``, the rescaling is
    omitted and the transform is a pure rotation::

        z = V^T (x - mu)

    Parameters
    ----------
    n_components : int or None, default None
        Number of principal components to retain.  If ``None``, all D
        components are kept.
    whiten : bool, default True
        If True, divide each component by sqrt(lambda_i) to decorrelate
        *and* normalise variance.  If False, only rotate (and center).

    Attributes
    ----------
    pca_ : sklearn.decomposition.PCA
        Fitted PCA object containing eigenvectors, eigenvalues, and the
        sample mean.

    Notes
    -----
    The log-absolute-Jacobian determinant for the whitening transform is::

        log|det J| = -1/2 * sum_i log(lambda_i)

    because each whitening factor Lambda^{-1/2} contributes
    ``-1/2 * log(lambda_i)`` per component.  For a pure rotation
    (``whiten=False``), the determinant is 1 and the log is 0.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.rotation import PCARotation
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((200, 4))
    >>> pca_rot = PCARotation(whiten=True).fit(X)
    >>> Z = pca_rot.transform(X)
    >>> Z.shape
    (200, 4)
    >>> ldj = pca_rot.log_det_jacobian(X)
    >>> ldj.shape
    (200,)
    """

    def __init__(self, n_components: int | None = None, whiten: bool = True):
        self.n_components = n_components
        self.whiten = whiten

    def fit(self, X: np.ndarray) -> PCARotation:
        """Fit PCA to the training data.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.

        Returns
        -------
        self : PCARotation
            Fitted transform instance.
        """
        # Stores eigenvectors, eigenvalues, and mean in pca_
        self.pca_ = PCA(n_components=self.n_components, whiten=self.whiten)
        self.pca_.fit(X)
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply PCA rotation (and optional whitening) to X.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data.

        Returns
        -------
        Z : np.ndarray of shape (n_samples, n_components)
            Rotated (and optionally whitened) data.
        """
        return self.pca_.transform(X)  # (N, D) -> (N, K)

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Invert the PCA rotation (and optional whitening).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_components)
            Data in the PCA / whitened space.

        Returns
        -------
        Xr : np.ndarray of shape (n_samples, n_features)
            Data approximately recovered in the original input space.
        """
        return self.pca_.inverse_transform(X)  # (N, K) -> (N, D)

    def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Log absolute Jacobian determinant (constant for linear transforms).

        For a whitening PCA the Jacobian determinant is constant::

            log|det J| = -1/2 * sum_i log(lambda_i)

        where ``lambda_i`` are the PCA eigenvalues (``explained_variance_``).
        For a plain rotation (``whiten=False``), ``|det J| = 1`` and the
        log is 0.

        .. note::
            This method is only valid when the transform is square (i.e.
            ``n_components`` is ``None`` or equals the number of input
            features).  A dimensionality-reducing PCA (``n_components`` <
            ``n_features``) is not bijective and its Jacobian determinant is
            undefined.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points (used only to determine the number of samples).

        Returns
        -------
        ldj : np.ndarray of shape (n_samples,)
            Constant per-sample log absolute Jacobian determinant.
        """
        if self.whiten:
            # Whitening scales each eigendirection by lambda_i^{-1/2}
            # -> log|det| = -1/2 * sum log(lambda_i)
            log_det = -0.5 * np.sum(np.log(self.pca_.explained_variance_))
        else:
            # Pure rotation: |det Q| = 1 -> log|det| = 0
            log_det = 0.0
        return np.full(X.shape[0], log_det)

fit(X)

Fit PCA to the training data.

Parameters

X : np.ndarray of shape (n_samples, n_features) Training data.

Returns

self : PCARotation Fitted transform instance.

Source code in rbig/_src/rotation.py
def fit(self, X: np.ndarray) -> PCARotation:
    """Fit PCA to the training data.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.

    Returns
    -------
    self : PCARotation
        Fitted transform instance.
    """
    # Stores eigenvectors, eigenvalues, and mean in pca_
    self.pca_ = PCA(n_components=self.n_components, whiten=self.whiten)
    self.pca_.fit(X)
    return self

inverse_transform(X)

Invert the PCA rotation (and optional whitening).

Parameters

X : np.ndarray of shape (n_samples, n_components) Data in the PCA / whitened space.

Returns

Xr : np.ndarray of shape (n_samples, n_features) Data approximately recovered in the original input space.

Source code in rbig/_src/rotation.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Invert the PCA rotation (and optional whitening).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_components)
        Data in the PCA / whitened space.

    Returns
    -------
    Xr : np.ndarray of shape (n_samples, n_features)
        Data approximately recovered in the original input space.
    """
    return self.pca_.inverse_transform(X)  # (N, K) -> (N, D)

log_det_jacobian(X)

Log absolute Jacobian determinant (constant for linear transforms).

For a whitening PCA the Jacobian determinant is constant::

log|det J| = -1/2 * sum_i log(lambda_i)

where lambda_i are the PCA eigenvalues (explained_variance_). For a plain rotation (whiten=False), |det J| = 1 and the log is 0.

.. note::
    This method is only valid when the transform is square (i.e. n_components is None or equals the number of input features). A dimensionality-reducing PCA (n_components < n_features) is not bijective and its Jacobian determinant is undefined.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input points (used only to determine the number of samples).

Returns

ldj : np.ndarray of shape (n_samples,) Constant per-sample log absolute Jacobian determinant.

Source code in rbig/_src/rotation.py
def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Log absolute Jacobian determinant (constant for linear transforms).

    For a whitening PCA the Jacobian determinant is constant::

        log|det J| = -1/2 * sum_i log(lambda_i)

    where ``lambda_i`` are the PCA eigenvalues (``explained_variance_``).
    For a plain rotation (``whiten=False``), ``|det J| = 1`` and the
    log is 0.

    .. note::
        This method is only valid when the transform is square (i.e.
        ``n_components`` is ``None`` or equals the number of input
        features).  A dimensionality-reducing PCA (``n_components`` <
        ``n_features``) is not bijective and its Jacobian determinant is
        undefined.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points (used only to determine the number of samples).

    Returns
    -------
    ldj : np.ndarray of shape (n_samples,)
        Constant per-sample log absolute Jacobian determinant.
    """
    if self.whiten:
        # Whitening scales each eigendirection by lambda_i^{-1/2}
        # -> log|det| = -1/2 * sum log(lambda_i)
        log_det = -0.5 * np.sum(np.log(self.pca_.explained_variance_))
    else:
        # Pure rotation: |det Q| = 1 -> log|det| = 0
        log_det = 0.0
    return np.full(X.shape[0], log_det)

transform(X)

Apply PCA rotation (and optional whitening) to X.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data.

Returns

Z : np.ndarray of shape (n_samples, n_components) Rotated (and optionally whitened) data.

Source code in rbig/_src/rotation.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply PCA rotation (and optional whitening) to X.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data.

    Returns
    -------
    Z : np.ndarray of shape (n_samples, n_components)
        Rotated (and optionally whitened) data.
    """
    return self.pca_.transform(X)  # (N, D) -> (N, K)

PicardRotation

Bases: RotationBijector

ICA rotation via the Picard algorithm with a FastICA fallback.

Fits an ICA model that learns maximally statistically-independent sources. When the optional picard package is available, it solves::

K, W = picard(X^T)
s = W K x

where K in R^{K x D} is the pre-whitening matrix and W in R^{K x K} is the Picard unmixing matrix. The log-det-Jacobian is::

log|det J| = log|det(W K)|

If picard is not installed (or incompatible), :class:~sklearn.decomposition.FastICA is used as a fallback.

Parameters

n_components : int or None, default None
    Number of independent components K. If None, K = D.
extended : bool, default False
    If True, use the extended Picard algorithm that can handle both super- and sub-Gaussian sources (passed directly to picard).
random_state : int or None, default None
    Seed for reproducible initialisation.
max_iter : int, default 500
    Maximum number of ICA iterations.
tol : float, default 1e-5
    Convergence tolerance for the ICA algorithm.

Attributes

K_ : np.ndarray of shape (n_components, n_features) or None
    Pre-whitening matrix (Picard path). None when using FastICA.
W_ : np.ndarray of shape (n_components, n_components) or None
    Unmixing matrix (Picard path). None when using FastICA.
use_picard_ : bool
    True if the Picard solver was used; False if FastICA was used.
ica_ : sklearn.decomposition.FastICA or None
    Fitted FastICA model (FastICA path only).

Notes

The log-det-Jacobian is::

log|det J| = log|det(W K)|

for the Picard path, or log|det(components_)| for the FastICA path. The Jacobian is constant because the transform is linear.

:meth:get_log_det_jacobian raises :class:ValueError if the unmixing matrix is not square (i.e. n_components != n_features).
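Because the transform is linear, the constant log-det can be checked with numpy alone (a sketch with made-up matrices standing in for the fitted K_ and W_):

```python
import numpy as np

rng = np.random.default_rng(0)
D = 3

# Stand-ins for the fitted matrices: K (pre-whitening) and W (orthogonal unmixing)
K = rng.standard_normal((D, D))
W, _ = np.linalg.qr(rng.standard_normal((D, D)))

X = rng.standard_normal((100, D))
S = (X @ K.T) @ W.T  # s = W K x, applied row-wise

# Constant per-sample log|det J| = log|det(W K)|
sign, log_det = np.linalg.slogdet(W @ K)
ldj = np.full(X.shape[0], log_det)
assert ldj.shape == (100,)

# The orthogonal factor contributes nothing: |det W| = 1, so log|det(W K)| = log|det K|
assert np.isclose(log_det, np.linalg.slogdet(K)[1])
```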

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Ablin, P., Cardoso, J.-F., & Gramfort, A. (2018). Faster Independent Component Analysis by Preconditioning with Hessian Approximations. IEEE Transactions on Signal Processing, 66(15), 4040-4049.

Examples

>>> import numpy as np
>>> from rbig._src.rotation import PicardRotation
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((200, 3))
>>> pic = PicardRotation(random_state=0).fit(X)
>>> S = pic.transform(X)
>>> S.shape
(200, 3)

Source code in rbig/_src/rotation.py
class PicardRotation(RotationBijector):
    """ICA rotation via the Picard algorithm with a FastICA fallback.

    Fits an ICA model that learns maximally statistically-independent
    sources.  When the optional ``picard`` package is available, it solves::

        K, W = picard(X^T)
        s = W K x

    where ``K`` in R^{K x D} is the pre-whitening matrix and ``W`` in
    R^{K x K} is the Picard unmixing matrix.  The log-det-Jacobian is::

        log|det J| = log|det(W K)|

    If ``picard`` is not installed (or incompatible),
    :class:`~sklearn.decomposition.FastICA` is used as a fallback.

    Parameters
    ----------
    n_components : int or None, default None
        Number of independent components K.  If ``None``, K = D.
    extended : bool, default False
        If True, use the extended Picard algorithm that can handle both
        super- and sub-Gaussian sources (passed directly to ``picard``).
    random_state : int or None, default None
        Seed for reproducible initialisation.
    max_iter : int, default 500
        Maximum number of ICA iterations.
    tol : float, default 1e-5
        Convergence tolerance for the ICA algorithm.

    Attributes
    ----------
    K_ : np.ndarray of shape (n_components, n_features) or None
        Pre-whitening matrix (Picard path).  ``None`` when using FastICA.
    W_ : np.ndarray of shape (n_components, n_components) or None
        Unmixing matrix (Picard path).  ``None`` when using FastICA.
    use_picard_ : bool
        True if the Picard solver was used; False if FastICA was used.
    ica_ : sklearn.decomposition.FastICA or None
        Fitted FastICA model (FastICA path only).

    Notes
    -----
    The log-det-Jacobian is::

        log|det J| = log|det(W K)|

    for the Picard path, or ``log|det(components_)|`` for the FastICA path.
    The Jacobian is constant because the transform is linear.

    :meth:`get_log_det_jacobian` raises :class:`ValueError` if the unmixing
    matrix is not square (i.e. ``n_components != n_features``).

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Ablin, P., Cardoso, J.-F., & Gramfort, A. (2018). Faster Independent
    Component Analysis by Preconditioning with Hessian Approximations.
    *IEEE Transactions on Signal Processing*, 66(15), 4040-4049.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.rotation import PicardRotation
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((200, 3))
    >>> pic = PicardRotation(random_state=0).fit(X)
    >>> S = pic.transform(X)
    >>> S.shape
    (200, 3)
    """

    def __init__(
        self,
        n_components: int | None = None,
        extended: bool = False,
        random_state: int | None = None,
        max_iter: int = 500,
        tol: float = 1e-5,
    ):
        self.n_components = n_components
        self.extended = extended
        self.random_state = random_state
        self.max_iter = max_iter
        self.tol = tol

    def fit(self, X: np.ndarray) -> PicardRotation:
        """Fit ICA (Picard if available, otherwise FastICA).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.

        Returns
        -------
        self : PicardRotation
            Fitted transform instance.
        """
        try:
            from picard import picard

            n = X.shape[1] if self.n_components is None else self.n_components
            # Picard expects (n_features, n_samples); returns K (whitening) and W (unmixing)
            K, W, _ = picard(
                X.T,  # (D, N)
                n_components=n,
                random_state=self.random_state,
                max_iter=self.max_iter,
                tol=self.tol,
                extended=self.extended,
            )
            self.K_ = K  # pre-whitening matrix, shape (K, D)
            self.W_ = W  # ICA unmixing matrix, shape (K, K)
            self.use_picard_ = True
        except (ImportError, TypeError):
            from sklearn.decomposition import FastICA

            # FastICA fallback when picard is unavailable or incompatible
            self.ica_ = FastICA(
                n_components=self.n_components,
                random_state=self.random_state,
                max_iter=self.max_iter,
            )
            self.ica_.fit(X)
            self.K_ = None
            self.use_picard_ = False
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the ICA unmixing: s = W K x.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data in the original (mixed) space.

        Returns
        -------
        S : np.ndarray of shape (n_samples, n_components)
            Estimated independent components.
        """
        if not self.use_picard_:
            return self.ica_.transform(X)
        # Picard path: whiten then unmix
        # (N, D) @ (D, K) -> (N, K) whitening; then (N, K) @ (K, K) -> (N, K) unmixing
        return (X @ self.K_.T) @ self.W_.T

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Invert the ICA unmixing: x = (W K)^{-1} s.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_components)
            Independent-component representation.

        Returns
        -------
        Xr : np.ndarray of shape (n_samples, n_features)
            Data approximately recovered in the original mixed space.
        """
        if not self.use_picard_:
            return self.ica_.inverse_transform(X)
        # Invert unmixing W then whitening K, each via pseudo-inverse
        return (X @ np.linalg.pinv(self.W_).T) @ np.linalg.pinv(self.K_).T

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Log absolute Jacobian determinant: log|det(W K)| (constant).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points (used only to determine n_samples).

        Returns
        -------
        ldj : np.ndarray of shape (n_samples,)
            Constant per-sample log absolute Jacobian determinant.

        Raises
        ------
        ValueError
            If the unmixing matrix is not square
            (``n_components != n_features``).
        """
        if not self.use_picard_:
            W = self.ica_.components_
            if W.shape[0] != W.shape[1]:
                raise ValueError(
                    "PicardRotation.get_log_det_jacobian is only defined for square "
                    "unmixing matrices when using the FastICA fallback. Got "
                    f"components_ with shape {W.shape}. Ensure that `n_components` "
                    "is None or equals the number of features."
                )
            log_det = np.log(np.abs(np.linalg.det(W)))
        else:
            WK = self.W_ @ self.K_  # combined unmixing matrix, shape (K, D)
            if WK.shape[0] != WK.shape[1]:
                raise ValueError(
                    "PicardRotation.get_log_det_jacobian is only defined for square "
                    "unmixing matrices when using the Picard solver. Got "
                    f"W @ K with shape {WK.shape}. Ensure that `n_components` "
                    "is None or equals the number of features."
                )
            log_det = np.log(np.abs(np.linalg.det(WK)))
        return np.full(X.shape[0], log_det)

fit(X)

Fit ICA (Picard if available, otherwise FastICA).

Parameters

X : np.ndarray of shape (n_samples, n_features) Training data.

Returns

self : PicardRotation Fitted transform instance.

Source code in rbig/_src/rotation.py
def fit(self, X: np.ndarray) -> PicardRotation:
    """Fit ICA (Picard if available, otherwise FastICA).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.

    Returns
    -------
    self : PicardRotation
        Fitted transform instance.
    """
    try:
        from picard import picard

        n = X.shape[1] if self.n_components is None else self.n_components
        # Picard expects (n_features, n_samples); returns K (whitening) and W (unmixing)
        K, W, _ = picard(
            X.T,  # (D, N)
            n_components=n,
            random_state=self.random_state,
            max_iter=self.max_iter,
            tol=self.tol,
            extended=self.extended,
        )
        self.K_ = K  # pre-whitening matrix, shape (K, D)
        self.W_ = W  # ICA unmixing matrix, shape (K, K)
        self.use_picard_ = True
    except (ImportError, TypeError):
        from sklearn.decomposition import FastICA

        # FastICA fallback when picard is unavailable or incompatible
        self.ica_ = FastICA(
            n_components=self.n_components,
            random_state=self.random_state,
            max_iter=self.max_iter,
        )
        self.ica_.fit(X)
        self.K_ = None
        self.use_picard_ = False
    return self

get_log_det_jacobian(X)

Log absolute Jacobian determinant: log|det(W K)| (constant).

Parameters

X : np.ndarray of shape (n_samples, n_features) Input points (used only to determine n_samples).

Returns

ldj : np.ndarray of shape (n_samples,) Constant per-sample log absolute Jacobian determinant.

Raises

ValueError If the unmixing matrix is not square (n_components != n_features).

Source code in rbig/_src/rotation.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Log absolute Jacobian determinant: log|det(W K)| (constant).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points (used only to determine n_samples).

    Returns
    -------
    ldj : np.ndarray of shape (n_samples,)
        Constant per-sample log absolute Jacobian determinant.

    Raises
    ------
    ValueError
        If the unmixing matrix is not square
        (``n_components != n_features``).
    """
    if not self.use_picard_:
        W = self.ica_.components_
        if W.shape[0] != W.shape[1]:
            raise ValueError(
                "PicardRotation.get_log_det_jacobian is only defined for square "
                "unmixing matrices when using the FastICA fallback. Got "
                f"components_ with shape {W.shape}. Ensure that `n_components` "
                "is None or equals the number of features."
            )
        log_det = np.log(np.abs(np.linalg.det(W)))
    else:
        WK = self.W_ @ self.K_  # combined unmixing matrix, shape (K, D)
        if WK.shape[0] != WK.shape[1]:
            raise ValueError(
                "PicardRotation.get_log_det_jacobian is only defined for square "
                "unmixing matrices when using the Picard solver. Got "
                f"W @ K with shape {WK.shape}. Ensure that `n_components` "
                "is None or equals the number of features."
            )
        log_det = np.log(np.abs(np.linalg.det(WK)))
    return np.full(X.shape[0], log_det)
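For a purely linear map the per-sample log-det is a single constant, as the method above returns. The sketch below illustrates this with stand-in matrices (`W`, `K` here are random placeholders, not fitted ICA results) and uses `np.linalg.slogdet`, which is numerically safer than `log(abs(det(...)))` when the determinant would overflow or underflow:

```python
import numpy as np

# Stand-in square matrices playing the roles of the pre-whitening (K)
# and unmixing (W) matrices; the combined linear map is s = WK @ x.
rng = np.random.default_rng(0)
D = 4
K = rng.standard_normal((D, D))
W = rng.standard_normal((D, D))
WK = W @ K

# slogdet returns (sign, log|det|) and avoids overflow/underflow
sign, logabsdet = np.linalg.slogdet(WK)

# The log-det-Jacobian of a linear map is the same constant for every sample
X = rng.standard_normal((10, D))
ldj = np.full(X.shape[0], logabsdet)
```

For well-conditioned matrices this agrees with `np.log(np.abs(np.linalg.det(WK)))` to machine precision.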

inverse_transform(X)

Invert the ICA unmixing: x = (W K)^{-1} s.

Parameters

X : np.ndarray of shape (n_samples, n_components) Independent-component representation.

Returns

Xr : np.ndarray of shape (n_samples, n_features) Data approximately recovered in the original mixed space.

Source code in rbig/_src/rotation.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Invert the ICA unmixing: x = (W K)^{-1} s.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_components)
        Independent-component representation.

    Returns
    -------
    Xr : np.ndarray of shape (n_samples, n_features)
        Data approximately recovered in the original mixed space.
    """
    if not self.use_picard_:
        return self.ica_.inverse_transform(X)
    # Invert unmixing W then whitening K, each via pseudo-inverse
    return (X @ np.linalg.pinv(self.W_).T) @ np.linalg.pinv(self.K_).T

transform(X)

Apply the ICA unmixing: s = W K x.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data in the original (mixed) space.

Returns

S : np.ndarray of shape (n_samples, n_components) Estimated independent components.

Source code in rbig/_src/rotation.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the ICA unmixing: s = W K x.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data in the original (mixed) space.

    Returns
    -------
    S : np.ndarray of shape (n_samples, n_components)
        Estimated independent components.
    """
    if not self.use_picard_:
        return self.ica_.transform(X)
    # Picard path: whiten then unmix
    # (N, D) @ (D, K) -> (N, K) whitening; then (N, K) @ (K, K) -> (N, K) unmixing
    return (X @ self.K_.T) @ self.W_.T
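The forward/inverse pair above composes cleanly when `W @ K` is square. A minimal roundtrip sketch with stand-in matrices (random placeholders, not fitted Picard outputs) shows that the pseudo-inverse-based inversion recovers the input:

```python
import numpy as np

# Stand-in square whitening (K) and unmixing (W) matrices
rng = np.random.default_rng(1)
D = 3
K = rng.standard_normal((D, D))
W = rng.standard_normal((D, D))
X = rng.standard_normal((50, D))

# Forward: whiten, then unmix — s = (W K) x, applied row-wise
S = (X @ K.T) @ W.T
# Inverse: undo unmixing, then undo whitening, each via pseudo-inverse
Xr = (S @ np.linalg.pinv(W).T) @ np.linalg.pinv(K).T
```

When `W K` is square and invertible, `pinv` coincides with the exact inverse, so `Xr` matches `X` up to floating-point error; for rectangular (dimension-reducing) maps the pseudo-inverse gives only a least-squares reconstruction.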

QuantileGaussianizer

Bases: Bijector

Gaussianize each marginal using sklearn's QuantileTransformer.

Wraps sklearn.preprocessing.QuantileTransformer configured with output_distribution='normal' to map each feature to an approximately standard-normal distribution. The quantile transform is a step-function CDF estimate that is particularly robust to outliers.

Parameters

n_quantiles : int, default 1000 Number of quantile nodes used to define the piecewise-linear mapping. Capped at n_samples during fit to avoid requesting more quantiles than there are training points. random_state : int or None, default 0 Seed for reproducible subsampling inside QuantileTransformer.

Attributes

qt_ : sklearn.preprocessing.QuantileTransformer Fitted quantile transformer with output_distribution='normal'. n_features_ : int Number of feature dimensions seen during fit.

Notes

The log-absolute-Jacobian is estimated via central finite differences::

dz_i/dx_i ~= (z_i(x + eps*e_i) - z_i(x - eps*e_i)) / (2*eps)

with eps = 1e-5. This approximation may be inaccurate near discontinuities of the piecewise quantile function.
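The central-difference estimate can be sketched directly on a fitted sklearn `QuantileTransformer` (a stand-in for the fitted `qt_` attribute), perturbing one feature at a time:

```python
import numpy as np
from sklearn.preprocessing import QuantileTransformer

# Fit a normal-output quantile transform, as QuantileGaussianizer does internally
rng = np.random.default_rng(0)
X = rng.standard_normal((300, 2))
qt = QuantileTransformer(
    n_quantiles=300, output_distribution="normal", random_state=0
).fit(X)

# Central difference for feature 0: dz/dx ~= (z(x+eps) - z(x-eps)) / (2*eps)
eps = 1e-5
Xp, Xm = X.copy(), X.copy()
Xp[:, 0] += eps
Xm[:, 0] -= eps
deriv = (qt.transform(Xp)[:, 0] - qt.transform(Xm)[:, 0]) / (2 * eps)
```

Because the quantile transform is monotone non-decreasing, the estimated slopes are never negative; they can be exactly zero outside the training range, which is why the implementation floors `|deriv|` at `1e-300` before taking logs.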

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Examples

>>> import numpy as np
>>> from rbig._src.marginal import QuantileGaussianizer
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((200, 3))
>>> qg = QuantileGaussianizer().fit(X)
>>> Z = qg.transform(X)
>>> Z.shape
(200, 3)
>>> Xr = qg.inverse_transform(Z)
>>> Xr.shape
(200, 3)

Source code in rbig/_src/marginal.py
class QuantileGaussianizer(Bijector):
    """Gaussianize each marginal using sklearn's QuantileTransformer.

    Wraps :class:`sklearn.preprocessing.QuantileTransformer` configured with
    ``output_distribution='normal'`` to map each feature to an approximately
    standard-normal distribution.  The quantile transform is a step-function
    CDF estimate that is particularly robust to outliers.

    Parameters
    ----------
    n_quantiles : int, default 1000
        Number of quantile nodes used to define the piecewise-linear mapping.
        Capped at ``n_samples`` during ``fit`` to avoid requesting more
        quantiles than there are training points.
    random_state : int or None, default 0
        Seed for reproducible subsampling inside ``QuantileTransformer``.

    Attributes
    ----------
    qt_ : sklearn.preprocessing.QuantileTransformer
        Fitted quantile transformer with ``output_distribution='normal'``.
    n_features_ : int
        Number of feature dimensions seen during ``fit``.

    Notes
    -----
    The log-absolute-Jacobian is estimated via central finite differences::

        dz_i/dx_i ~= (z_i(x + eps*e_i) - z_i(x - eps*e_i)) / (2*eps)

    with ``eps = 1e-5``.  This approximation may be inaccurate near
    discontinuities of the piecewise quantile function.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.marginal import QuantileGaussianizer
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((200, 3))
    >>> qg = QuantileGaussianizer().fit(X)
    >>> Z = qg.transform(X)
    >>> Z.shape
    (200, 3)
    >>> Xr = qg.inverse_transform(Z)
    >>> Xr.shape
    (200, 3)
    """

    def __init__(self, n_quantiles: int = 1000, random_state: int | None = 0):
        self.n_quantiles = n_quantiles
        self.random_state = random_state

    def fit(self, X: np.ndarray) -> QuantileGaussianizer:
        """Fit the quantile transformer to the training data.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.

        Returns
        -------
        self : QuantileGaussianizer
            Fitted bijector instance.
        """
        from sklearn.preprocessing import QuantileTransformer

        # Cap quantile count so it cannot exceed the available samples
        n_quantiles = min(self.n_quantiles, X.shape[0])
        self.qt_ = QuantileTransformer(
            n_quantiles=n_quantiles,
            output_distribution="normal",
            random_state=self.random_state,
        )
        self.qt_.fit(X)
        self.n_features_ = X.shape[1]
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the quantile transform: x -> z approximately N(0, 1).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data.

        Returns
        -------
        Z : np.ndarray of shape (n_samples, n_features)
            Gaussianized data.
        """
        return self.qt_.transform(X)

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Invert the quantile transform: z -> x.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in the Gaussianized (standard-normal) space.

        Returns
        -------
        Xr : np.ndarray of shape (n_samples, n_features)
            Data approximately recovered in the original input space.
        """
        return self.qt_.inverse_transform(X)

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Estimate log |det J| by finite differences on the quantile transform.

        Uses a small perturbation ``eps = 1e-5`` in each dimension::

            dz_i/dx_i ~= (z_i(x + eps*e_i) - z_i(x - eps*e_i)) / (2*eps)

        and sums the log-absolute-derivatives::

            log|det J| = sum_i log|dz_i/dx_i|

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points at which to evaluate the log-det-Jacobian.

        Returns
        -------
        log_det : np.ndarray of shape (n_samples,)
            Per-sample log absolute determinant approximation.

        Notes
        -----
        The quantile transform is piecewise-linear; the finite-difference
        derivative equals the local slope and is exact within each segment.
        """
        eps = 1e-5
        log_det = np.zeros(X.shape[0])
        for i in range(X.shape[1]):
            dummy_plus = X.copy()
            dummy_plus[:, i] = X[:, i] + eps
            dummy_minus = X.copy()
            dummy_minus[:, i] = X[:, i] - eps
            y_plus = self.qt_.transform(dummy_plus)[:, i]
            y_minus = self.qt_.transform(dummy_minus)[:, i]
            # Central-difference derivative for dimension i
            deriv = (y_plus - y_minus) / (2 * eps)
            log_det += np.log(np.maximum(np.abs(deriv), 1e-300))
        return log_det

fit(X)

Fit the quantile transformer to the training data.

Parameters

X : np.ndarray of shape (n_samples, n_features) Training data.

Returns

self : QuantileGaussianizer Fitted bijector instance.

Source code in rbig/_src/marginal.py
def fit(self, X: np.ndarray) -> QuantileGaussianizer:
    """Fit the quantile transformer to the training data.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.

    Returns
    -------
    self : QuantileGaussianizer
        Fitted bijector instance.
    """
    from sklearn.preprocessing import QuantileTransformer

    # Cap quantile count so it cannot exceed the available samples
    n_quantiles = min(self.n_quantiles, X.shape[0])
    self.qt_ = QuantileTransformer(
        n_quantiles=n_quantiles,
        output_distribution="normal",
        random_state=self.random_state,
    )
    self.qt_.fit(X)
    self.n_features_ = X.shape[1]
    return self

get_log_det_jacobian(X)

Estimate log |det J| by finite differences on the quantile transform.

Uses a small perturbation eps = 1e-5 in each dimension::

dz_i/dx_i ~= (z_i(x + eps*e_i) - z_i(x - eps*e_i)) / (2*eps)

and sums the log-absolute-derivatives::

log|det J| = sum_i log|dz_i/dx_i|

Parameters

X : np.ndarray of shape (n_samples, n_features) Input points at which to evaluate the log-det-Jacobian.

Returns

log_det : np.ndarray of shape (n_samples,) Per-sample log absolute determinant approximation.

Notes

The quantile transform is piecewise-linear; the finite-difference derivative equals the local slope and is exact within each segment.

Source code in rbig/_src/marginal.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Estimate log |det J| by finite differences on the quantile transform.

    Uses a small perturbation ``eps = 1e-5`` in each dimension::

        dz_i/dx_i ~= (z_i(x + eps*e_i) - z_i(x - eps*e_i)) / (2*eps)

    and sums the log-absolute-derivatives::

        log|det J| = sum_i log|dz_i/dx_i|

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points at which to evaluate the log-det-Jacobian.

    Returns
    -------
    log_det : np.ndarray of shape (n_samples,)
        Per-sample log absolute determinant approximation.

    Notes
    -----
    The quantile transform is piecewise-linear; the finite-difference
    derivative equals the local slope and is exact within each segment.
    """
    eps = 1e-5
    log_det = np.zeros(X.shape[0])
    for i in range(X.shape[1]):
        dummy_plus = X.copy()
        dummy_plus[:, i] = X[:, i] + eps
        dummy_minus = X.copy()
        dummy_minus[:, i] = X[:, i] - eps
        y_plus = self.qt_.transform(dummy_plus)[:, i]
        y_minus = self.qt_.transform(dummy_minus)[:, i]
        # Central-difference derivative for dimension i
        deriv = (y_plus - y_minus) / (2 * eps)
        log_det += np.log(np.maximum(np.abs(deriv), 1e-300))
    return log_det

inverse_transform(X)

Invert the quantile transform: z -> x.

Parameters

X : np.ndarray of shape (n_samples, n_features) Data in the Gaussianized (standard-normal) space.

Returns

Xr : np.ndarray of shape (n_samples, n_features) Data approximately recovered in the original input space.

Source code in rbig/_src/marginal.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Invert the quantile transform: z -> x.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in the Gaussianized (standard-normal) space.

    Returns
    -------
    Xr : np.ndarray of shape (n_samples, n_features)
        Data approximately recovered in the original input space.
    """
    return self.qt_.inverse_transform(X)

transform(X)

Apply the quantile transform: x -> z approximately N(0, 1).

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data.

Returns

Z : np.ndarray of shape (n_samples, n_features) Gaussianized data.

Source code in rbig/_src/marginal.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the quantile transform: x -> z approximately N(0, 1).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data.

    Returns
    -------
    Z : np.ndarray of shape (n_samples, n_features)
        Gaussianized data.
    """
    return self.qt_.transform(X)

QuantileTransform

Bases: BaseTransform

Quantile transform that maps each feature to a target distribution.

Wraps sklearn.preprocessing.QuantileTransformer to provide a uniform interface compatible with RBIG pipelines. By default, features are mapped to a standard Gaussian distribution, which is a common pre-processing step for Gaussianisation.

Parameters

n_quantiles : int, optional (default=1000) Number of quantiles used to build the empirical CDF. Capped at n_samples during fit. output_distribution : str, optional (default='normal') Target distribution for the transform. Accepted values are 'normal' (standard Gaussian) and 'uniform'.

Attributes

qt_ : sklearn.preprocessing.QuantileTransformer Fitted sklearn transformer, available after calling fit.

Examples

>>> import numpy as np
>>> from rbig._src.parametric import QuantileTransform
>>> rng = np.random.default_rng(42)
>>> X = rng.exponential(scale=1.0, size=(500, 2))
>>> tr = QuantileTransform(n_quantiles=200).fit(X)
>>> Y = tr.transform(X)  # approximately standard Gaussian
>>> Y.shape
(500, 2)
>>> # Marginal means should be near zero, stds near 1
>>> np.allclose(Y.mean(axis=0), 0, atol=0.1)
True

Source code in rbig/_src/parametric.py
class QuantileTransform(BaseTransform):
    """Quantile transform that maps each feature to a target distribution.

    Wraps ``sklearn.preprocessing.QuantileTransformer`` to provide a uniform
    interface compatible with RBIG pipelines.  By default, features are
    mapped to a standard Gaussian distribution, which is a common
    pre-processing step for Gaussianisation.

    Parameters
    ----------
    n_quantiles : int, optional (default=1000)
        Number of quantiles used to build the empirical CDF.  Capped at
        ``n_samples`` during ``fit``.
    output_distribution : str, optional (default='normal')
        Target distribution for the transform.  Accepted values are
        ``'normal'`` (standard Gaussian) and ``'uniform'``.

    Attributes
    ----------
    qt_ : sklearn.preprocessing.QuantileTransformer
        Fitted sklearn transformer, available after calling ``fit``.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.parametric import QuantileTransform
    >>> rng = np.random.default_rng(42)
    >>> X = rng.exponential(scale=1.0, size=(500, 2))
    >>> tr = QuantileTransform(n_quantiles=200).fit(X)
    >>> Y = tr.transform(X)  # approximately standard Gaussian
    >>> Y.shape
    (500, 2)
    >>> # Marginal means should be near zero, stds near 1
    >>> np.allclose(Y.mean(axis=0), 0, atol=0.1)
    True
    """

    def __init__(self, n_quantiles: int = 1000, output_distribution: str = "normal"):
        self.n_quantiles = n_quantiles
        self.output_distribution = output_distribution

    def fit(self, X: np.ndarray) -> QuantileTransform:
        """Fit the quantile transform to the training data.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.

        Returns
        -------
        self : QuantileTransform
            Fitted instance with ``qt_`` attribute set.
        """
        from sklearn.preprocessing import QuantileTransformer

        # Cap n_quantiles at the number of available training samples
        n_quantiles = min(self.n_quantiles, X.shape[0])
        self.qt_ = QuantileTransformer(
            n_quantiles=n_quantiles,
            output_distribution=self.output_distribution,
            random_state=0,
        )
        self.qt_.fit(X)
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the fitted quantile transform.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data.

        Returns
        -------
        Y : np.ndarray of shape (n_samples, n_features)
            Data mapped to the target distribution.
        """
        return self.qt_.transform(X)

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Invert the quantile transform.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in the target distribution space.

        Returns
        -------
        Z : np.ndarray of shape (n_samples, n_features)
            Recovered data in the original distribution space.
        """
        return self.qt_.inverse_transform(X)
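The `'uniform'` target mentioned in the parameters can be sketched the same way, here calling sklearn's `QuantileTransformer` directly with the same capping logic as `fit` (this is an illustration of the wrapped behaviour, not the wrapper itself):

```python
import numpy as np
from sklearn.preprocessing import QuantileTransformer

rng = np.random.default_rng(0)
X = rng.exponential(scale=1.0, size=(500, 2))

# Cap n_quantiles at n_samples, then map each marginal to Uniform(0, 1)
qt = QuantileTransformer(
    n_quantiles=min(1000, X.shape[0]),
    output_distribution="uniform",
    random_state=0,
).fit(X)
U = qt.transform(X)
```

On the training data `U` is essentially the per-feature empirical ranks rescaled to [0, 1], so each marginal is close to uniform with mean near 0.5.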

fit(X)

Fit the quantile transform to the training data.

Parameters

X : np.ndarray of shape (n_samples, n_features) Training data.

Returns

self : QuantileTransform Fitted instance with qt_ attribute set.

Source code in rbig/_src/parametric.py
def fit(self, X: np.ndarray) -> QuantileTransform:
    """Fit the quantile transform to the training data.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.

    Returns
    -------
    self : QuantileTransform
        Fitted instance with ``qt_`` attribute set.
    """
    from sklearn.preprocessing import QuantileTransformer

    # Cap n_quantiles at the number of available training samples
    n_quantiles = min(self.n_quantiles, X.shape[0])
    self.qt_ = QuantileTransformer(
        n_quantiles=n_quantiles,
        output_distribution=self.output_distribution,
        random_state=0,
    )
    self.qt_.fit(X)
    return self

inverse_transform(X)

Invert the quantile transform.

Parameters

X : np.ndarray of shape (n_samples, n_features) Data in the target distribution space.

Returns

Z : np.ndarray of shape (n_samples, n_features) Recovered data in the original distribution space.

Source code in rbig/_src/parametric.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Invert the quantile transform.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in the target distribution space.

    Returns
    -------
    Z : np.ndarray of shape (n_samples, n_features)
        Recovered data in the original distribution space.
    """
    return self.qt_.inverse_transform(X)

transform(X)

Apply the fitted quantile transform.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data.

Returns

Y : np.ndarray of shape (n_samples, n_features) Data mapped to the target distribution.

Source code in rbig/_src/parametric.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the fitted quantile transform.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data.

    Returns
    -------
    Y : np.ndarray of shape (n_samples, n_features)
        Data mapped to the target distribution.
    """
    return self.qt_.transform(X)

RBIGLayer dataclass

Single RBIG layer: marginal Gaussianization followed by rotation.

One iteration of the RBIG algorithm applies two successive bijections:

1. Marginal Gaussianization – maps each feature independently to a standard Gaussian via its empirical CDF and the probit function:

       z = Φ⁻¹(F̂ₙ(x))

   where F̂ₙ is the estimated marginal CDF and Φ⁻¹ is the standard normal quantile function.

2. Rotation/whitening – applies a linear transform R (default: PCA whitening) to de-correlate the Gaussianized features:

       y = R · z

The full single-layer transform is therefore:

    y = R · Φ⁻¹(F̂ₙ(x))

Parameters

marginal : MarginalGaussianize, optional Marginal Gaussianization transform (fitted per feature). Defaults to a new MarginalGaussianize instance. rotation : PCARotation, optional Rotation transform applied after marginal Gaussianization. Defaults to a new PCARotation instance.

Attributes

marginal : MarginalGaussianize Fitted marginal transform. rotation : PCARotation Fitted rotation transform.

Notes

The layer log-det-Jacobian is the sum of the marginal and rotation contributions:

log|det J_layer(x)| = log|det J_marginal(x)| + log|det J_rotation|
                     = ∑ᵢ log|Φ⁻¹′(F̂ₙ(xᵢ)) · f̂ₙ(xᵢ)| + log|det J_rotation|

The rotation term log|det J_rotation| is zero only when the rotation is strictly orthogonal (|det R| = 1). The default PCARotation(whiten=True) includes per-component scaling and therefore has a generally non-zero log_det_jacobian. More general rotations such as ICA are also not guaranteed to be orthogonal.
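The non-zero rotation term for whitened PCA can be checked numerically. Assuming sklearn's `PCA(whiten=True)` as a stand-in for `PCARotation`, the linear part of the transform is `diag(1/√λᵢ) · V` with orthonormal rows `V`, so its log-det reduces to `-½ Σᵢ log λᵢ`:

```python
import numpy as np
from sklearn.decomposition import PCA

# Anisotropic data so the eigenvalues differ from 1 and the log-det is non-zero
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 3)) @ np.diag([1.0, 2.0, 3.0])
pca = PCA(whiten=True).fit(X)

# Linear map of z = (x - mean) @ components_.T / sqrt(explained_variance_)
M = pca.components_ / np.sqrt(pca.explained_variance_)[:, None]
_, logdet = np.linalg.slogdet(M)

# Since components_ has orthonormal rows, |det V| = 1 and the log-det
# is just the scaling term
formula = -0.5 * np.log(pca.explained_variance_).sum()
```

An unwhitened orthogonal rotation would give `logdet == 0`; the whitening scales break orthogonality and contribute this constant to every layer's log-det-Jacobian.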

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: From ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537–549. https://doi.org/10.1109/TNN.2011.2106511

Examples

>>> import numpy as np
>>> from rbig._src.model import RBIGLayer
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((500, 3))
>>> layer = RBIGLayer()
>>> layer.fit(X)
RBIGLayer(...)
>>> Z = layer.transform(X)
>>> Z.shape
(500, 3)

Source code in rbig/_src/model.py
@dataclass
class RBIGLayer:
    """Single RBIG layer: marginal Gaussianization followed by rotation.

    One iteration of the RBIG algorithm applies two successive bijections:

    1. **Marginal Gaussianization** – maps each feature independently to a
       standard Gaussian via its empirical CDF and the probit function:

           z = Φ⁻¹(F̂ₙ(x))

       where F̂ₙ is the estimated marginal CDF and Φ⁻¹ is the standard
       normal quantile function.

    2. **Rotation/whitening** – applies a linear transform R (default: PCA
       whitening) to de-correlate the Gaussianized features:

           y = R · z

    The full single-layer transform is therefore:

        y = R · Φ⁻¹(F̂ₙ(x))

    Parameters
    ----------
    marginal : MarginalGaussianize, optional
        Marginal Gaussianization transform (fitted per feature).
        Defaults to a new ``MarginalGaussianize`` instance.
    rotation : PCARotation, optional
        Rotation transform applied after marginal Gaussianization.
        Defaults to a new ``PCARotation`` instance.

    Attributes
    ----------
    marginal : MarginalGaussianize
        Fitted marginal transform.
    rotation : PCARotation
        Fitted rotation transform.

    Notes
    -----
    The layer log-det-Jacobian is the sum of the marginal and rotation
    contributions:

        log|det J_layer(x)| = log|det J_marginal(x)| + log|det J_rotation|
                             = ∑ᵢ log|Φ⁻¹′(F̂ₙ(xᵢ)) · f̂ₙ(xᵢ)| + log|det J_rotation|

    The rotation term ``log|det J_rotation|`` is zero only when the rotation is
    strictly orthogonal (``|det R| = 1``).  The default
    ``PCARotation(whiten=True)`` includes per-component scaling and therefore
    has a generally non-zero ``log_det_jacobian``.  More general rotations such
    as ICA are also not guaranteed to be orthogonal.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    From ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537–549. https://doi.org/10.1109/TNN.2011.2106511

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.model import RBIGLayer
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((500, 3))
    >>> layer = RBIGLayer()
    >>> layer.fit(X)
    RBIGLayer(...)
    >>> Z = layer.transform(X)
    >>> Z.shape
    (500, 3)
    """

    marginal: MarginalGaussianize = field(default_factory=MarginalGaussianize)
    rotation: PCARotation = field(default_factory=PCARotation)

    def fit(self, X: np.ndarray) -> RBIGLayer:
        """Fit the marginal and rotation transforms to data X.

        First fits the marginal Gaussianizer on X, applies it to obtain the
        intermediate Gaussianized representation, then fits the rotation on
        that intermediate representation.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.

        Returns
        -------
        self : RBIGLayer
            The fitted layer.
        """
        Xm = self.marginal.fit_transform(
            X
        )  # shape (n_samples, n_features) - Gaussianized
        self.rotation.fit(Xm)  # fit rotation on the Gaussianized data
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply marginal Gaussianization then rotation: y = R · Φ⁻¹(F̂ₙ(x)).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data.

        Returns
        -------
        Y : np.ndarray of shape (n_samples, n_features)
            Transformed data after Gaussianization and rotation.
        """
        Xm = self.marginal.transform(
            X
        )  # marginal Gaussianization, shape (n_samples, n_features)
        return self.rotation.transform(Xm)  # rotation, shape (n_samples, n_features)

    def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Log |det J| for this layer at input X.

        The total log-det-Jacobian is the sum of contributions from the
        marginal step and the rotation step:

            log|det J_layer(x)| = log|det J_marginal(x)| + log|det J_rotation(z)|

        For strictly orthogonal rotations (e.g. ``RandomRotation``), the
        rotation term is zero.  For ``PCARotation(whiten=True)`` the rotation
        includes a per-component rescaling, so its term is generally non-zero.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points.

        Returns
        -------
        ldj : np.ndarray of shape (n_samples,)
            Per-sample log absolute determinant of the layer Jacobian.
        """
        Xm = self.marginal.transform(
            X
        )  # intermediate Gaussianized data, shape (n_samples, n_features)
        # marginal log-det + rotation log-det (non-zero for PCARotation with whiten=True)
        return self.marginal.log_det_jacobian(X) + self.rotation.log_det_jacobian(Xm)

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Invert the layer: apply inverse rotation then inverse marginal.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in the layer's output (latent) space.

        Returns
        -------
        Xr : np.ndarray of shape (n_samples, n_features)
            Data recovered in the original input space.
        """
        Xr = self.rotation.inverse_transform(
            X
        )  # undo rotation, shape (n_samples, n_features)
        return self.marginal.inverse_transform(Xr)  # undo marginal Gaussianization
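The single-layer pipeline y = R · Φ⁻¹(F̂ₙ(x)) can be sketched from scratch with an empirical-rank CDF and sklearn PCA as stand-ins for `MarginalGaussianize` and `PCARotation` (whose implementations are not reproduced here):

```python
import numpy as np
from scipy.stats import norm
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.exponential(size=(400, 2))  # skewed, non-Gaussian marginals

# Marginal Gaussianization: probit of a shifted empirical CDF, per feature.
# argsort-of-argsort gives 0-based ranks for tie-free columns.
ranks = X.argsort(axis=0).argsort(axis=0) + 1            # 1..n per column
Z = norm.ppf(ranks / (X.shape[0] + 1))                   # z = Φ⁻¹(F̂ₙ(x))

# Rotation/whitening: PCA fitted on the Gaussianized representation
Y = PCA(whiten=True).fit_transform(Z)
```

The rank-based CDF is evaluated at i/(n+1) rather than i/n so the probit never receives 0 or 1; after the layer, `Y` is centered, unit-variance, and de-correlated.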

fit(X)

Fit the marginal and rotation transforms to data X.

First fits the marginal Gaussianizer on X, applies it to obtain the intermediate Gaussianized representation, then fits the rotation on that intermediate representation.

Parameters

X : np.ndarray of shape (n_samples, n_features) Training data.

Returns

self : RBIGLayer The fitted layer.

Source code in rbig/_src/model.py
def fit(self, X: np.ndarray) -> RBIGLayer:
    """Fit the marginal and rotation transforms to data X.

    First fits the marginal Gaussianizer on X, applies it to obtain the
    intermediate Gaussianized representation, then fits the rotation on
    that intermediate representation.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.

    Returns
    -------
    self : RBIGLayer
        The fitted layer.
    """
    Xm = self.marginal.fit_transform(
        X
    )  # shape (n_samples, n_features) - Gaussianized
    self.rotation.fit(Xm)  # fit rotation on the Gaussianized data
    return self

inverse_transform(X)

Invert the layer: apply inverse rotation then inverse marginal.

Parameters

X : np.ndarray of shape (n_samples, n_features) Data in the layer's output (latent) space.

Returns

Xr : np.ndarray of shape (n_samples, n_features) Data recovered in the original input space.

Source code in rbig/_src/model.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Invert the layer: apply inverse rotation then inverse marginal.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in the layer's output (latent) space.

    Returns
    -------
    Xr : np.ndarray of shape (n_samples, n_features)
        Data recovered in the original input space.
    """
    Xr = self.rotation.inverse_transform(
        X
    )  # undo rotation, shape (n_samples, n_features)
    return self.marginal.inverse_transform(Xr)  # undo marginal Gaussianization

log_det_jacobian(X)

Log |det J| for this layer at input X.

The total log-det-Jacobian is the sum of contributions from the marginal step and the rotation step:

log|det J_layer(x)| = log|det J_marginal(x)| + log|det J_rotation(z)|

For strictly orthogonal rotations (e.g. RandomRotation), the rotation term is zero. For PCARotation(whiten=True) the rotation includes a per-component rescaling, so its term is generally non-zero.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input points.

Returns

ldj : np.ndarray of shape (n_samples,) Per-sample log absolute determinant of the layer Jacobian.

Source code in rbig/_src/model.py
def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Log |det J| for this layer at input X.

    The total log-det-Jacobian is the sum of contributions from the
    marginal step and the rotation step:

        log|det J_layer(x)| = log|det J_marginal(x)| + log|det J_rotation(z)|

    For strictly orthogonal rotations (e.g. ``RandomRotation``), the
    rotation term is zero.  For ``PCARotation(whiten=True)`` the rotation
    includes a per-component rescaling, so its term is generally non-zero.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points.

    Returns
    -------
    ldj : np.ndarray of shape (n_samples,)
        Per-sample log absolute determinant of the layer Jacobian.
    """
    Xm = self.marginal.transform(
        X
    )  # intermediate Gaussianized data, shape (n_samples, n_features)
    # marginal log-det + rotation log-det (non-zero for PCARotation with whiten=True)
    return self.marginal.log_det_jacobian(X) + self.rotation.log_det_jacobian(Xm)
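The additive decomposition of the layer log-det-Jacobian can be illustrated with a toy bijection built from plain numpy (not the library classes): an elementwise scaling stands in for the marginal step, and a QR-sampled orthogonal matrix stands in for the rotation. The total log-det equals the sum of the two contributions, and the rotation contributes zero.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 3))

# Toy "marginal" step: elementwise scaling x -> s * x (diagonal Jacobian).
s = np.array([2.0, 0.5, 3.0])
Xm = X * s
ldj_marginal = np.full(X.shape[0], np.log(np.abs(s)).sum())

# Toy "rotation" step: orthogonal Q via QR, so log|det Q| = 0.
A = rng.standard_normal((3, 3))
Q, _ = np.linalg.qr(A)
Y = Xm @ Q.T
ldj_rotation = np.zeros(X.shape[0])

# Layer log-det is the sum of the two contributions.
ldj_layer = ldj_marginal + ldj_rotation

# Cross-check against the determinant of the full linear map J = Q @ diag(s).
J = Q @ np.diag(s)
assert np.allclose(ldj_layer, np.log(np.abs(np.linalg.det(J))))
```

For a nonlinear marginal step the per-sample log-dets differ, but the sum rule shown here is unchanged.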

transform(X)

Apply marginal Gaussianization then rotation: y = R · Φ⁻¹(F̂ₙ(x)).

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data.

Returns

Y : np.ndarray of shape (n_samples, n_features) Transformed data after Gaussianization and rotation.

Source code in rbig/_src/model.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply marginal Gaussianization then rotation: y = R · Φ⁻¹(F̂ₙ(x)).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data.

    Returns
    -------
    Y : np.ndarray of shape (n_samples, n_features)
        Transformed data after Gaussianization and rotation.
    """
    Xm = self.marginal.transform(
        X
    )  # marginal Gaussianization, shape (n_samples, n_features)
    return self.rotation.transform(Xm)  # rotation, shape (n_samples, n_features)
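The forward map y = R · Φ⁻¹(F̂ₙ(x)) can be sketched end-to-end with numpy/scipy. The rank-based empirical CDF and the PCA-eigenbasis rotation below are simplified stand-ins for the library's marginal and rotation components, not the exact estimators used by `RBIGLayer`.

```python
import numpy as np
from scipy.stats import norm, rankdata

rng = np.random.default_rng(0)
X = rng.exponential(size=(1000, 2))  # non-Gaussian marginals

# Empirical CDF via ranks, shrunk away from {0, 1} so the probit is finite.
n = X.shape[0]
U = rankdata(X, axis=0) / (n + 1)    # F_hat(x) in (0, 1)
Xg = norm.ppf(U)                     # probit: marginal Gaussianization

# Rotation: eigenbasis of the Gaussianized data's covariance (PCA-style).
C = np.cov(Xg, rowvar=False)
_, R = np.linalg.eigh(C)
Y = Xg @ R                           # rotate each sample into the eigenbasis
```

After one such step, each column of `Y` has near-zero mean and near-unit variance, and the rotation has removed the linear correlation between the columns.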

RandomChannelRotation

Bases: ImageBijector

Orthogonal 1×1 convolution that mixes information across channels.

Generates a random orthogonal matrix Q ∈ ℝ^{C×C} via QR decomposition and applies it as a 1×1 convolution, independently at every spatial location. This is equivalent to multiplying the (H*W, C) spatial-pixel matrix by Q^T on the right:

.. math::

(N,\, H{\cdot}W,\, C)\;
\overset{\times\,\mathbf{Q}^\top}{\longrightarrow}\;
(N,\, H{\cdot}W,\, C)

Because Q is orthogonal, log|det J| = 0.

Parameters

C : int, default 1
    Number of image channels.
H : int, default 8
    Image height in pixels.
W : int, default 8
    Image width in pixels.
random_state : int or None, default None
    Seed for the random number generator used to initialise Q.

Attributes

rotation_matrix_ : np.ndarray, shape (C, C) The fitted orthogonal rotation matrix Q.

Notes

The forward pass reshapes the tensor as

.. math::

(N, C, H, W)
\xrightarrow{\text{transpose}} (N \cdot H \cdot W, C)
\xrightarrow{\times\,\mathbf{Q}^\top} (N \cdot H \cdot W, C)
\xrightarrow{\text{reshape}} (N, C, H, W)

The inverse uses :math:`\mathbf{Q}^{-1} = \mathbf{Q}^\top`, so

.. math::

(N \cdot H \cdot W, C)
\xrightarrow{\times\,\mathbf{Q}} (N \cdot H \cdot W, C)

Examples

>>> import numpy as np
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((8, 3 * 8 * 8))  # N=8, C=3, H=8, W=8
>>> layer = RandomChannelRotation(C=3, H=8, W=8, random_state=42)
>>> layer.fit(X)  # doctest: +ELLIPSIS
RandomChannelRotation(...)
>>> Xt = layer.transform(X)
>>> Xr = layer.inverse_transform(Xt)
>>> np.allclose(X, Xr, atol=1e-10)
True
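The claimed equivalence between the 1×1 convolution and right-multiplying the (N·H·W, C) pixel matrix by Qᵀ can be checked directly. This sketch mirrors the reshape logic of `transform` using only numpy:

```python
import numpy as np

rng = np.random.default_rng(42)
N, C, H, W = 2, 3, 4, 4
imgs = rng.standard_normal((N, C, H, W))

# Orthogonal Q via QR of a Gaussian matrix (mirroring fit()).
A = rng.standard_normal((C, C))
Q, _ = np.linalg.qr(A)

# Path 1: apply Q to the channel vector at every spatial location.
per_pixel = np.einsum("ij,njhw->nihw", Q, imgs)

# Path 2: the reshape trick used by transform() — right-multiply by Q^T.
spatial = imgs.transpose(0, 2, 3, 1).reshape(-1, C)   # (N*H*W, C)
rotated = spatial @ Q.T
reshaped = rotated.reshape(N, H, W, C).transpose(0, 3, 1, 2)

assert np.allclose(per_pixel, reshaped)
# Orthogonality means zero log-det: |det Q| = 1.
assert np.isclose(abs(np.linalg.det(Q)), 1.0)
```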

Source code in rbig/_src/image.py
class RandomChannelRotation(ImageBijector):
    """Orthogonal 1×1 convolution that mixes information across channels.

    Generates a random orthogonal matrix ``Q ∈ ℝ^{C×C}`` via QR
    decomposition and applies it as a 1×1 convolution, independently at every
    spatial location.  This is equivalent to multiplying the ``(H*W, C)``
    spatial-pixel matrix by ``Q^T`` on the right:

    .. math::

        (N,\\, H{\\cdot}W,\\, C)\\;
        \\overset{\\times\\,\\mathbf{Q}^\\top}{\\longrightarrow}\\;
        (N,\\, H{\\cdot}W,\\, C)

    Because ``Q`` is orthogonal, ``log|det J| = 0``.

    Parameters
    ----------
    C : int, default 1
        Number of image channels.
    H : int, default 8
        Image height in pixels.
    W : int, default 8
        Image width in pixels.
    random_state : int or None, default None
        Seed for the random number generator used to initialise ``Q``.

    Attributes
    ----------
    rotation_matrix_ : np.ndarray, shape ``(C, C)``
        The fitted orthogonal rotation matrix ``Q``.

    Notes
    -----
    The forward pass reshapes the tensor as

    .. math::

        (N, C, H, W)
        \\xrightarrow{\\text{transpose}} (N \\cdot H \\cdot W, C)
        \\xrightarrow{\\times\\,\\mathbf{Q}^\\top} (N \\cdot H \\cdot W, C)
        \\xrightarrow{\\text{reshape}} (N, C, H, W)

    The inverse uses :math:`\\mathbf{Q}^{-1} = \\mathbf{Q}^\\top`, so

    .. math::

        (N \\cdot H \\cdot W, C)
        \\xrightarrow{\\times\\,\\mathbf{Q}} (N \\cdot H \\cdot W, C)

    Examples
    --------
    >>> import numpy as np
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((8, 3 * 8 * 8))  # N=8, C=3, H=8, W=8
    >>> layer = RandomChannelRotation(C=3, H=8, W=8, random_state=42)
    >>> layer.fit(X)  # doctest: +ELLIPSIS
    RandomChannelRotation(...)
    >>> Xt = layer.transform(X)
    >>> Xr = layer.inverse_transform(Xt)
    >>> np.allclose(X, Xr, atol=1e-10)
    True
    """

    def __init__(
        self,
        C: int = 1,
        H: int = 8,
        W: int = 8,
        random_state: int | None = None,
    ):
        self.C = C
        self.H = H
        self.W = W
        self.random_state = random_state

    def fit(self, X: np.ndarray) -> RandomChannelRotation:
        """Draw a random orthogonal rotation matrix via QR decomposition.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, C*H*W)``
            Training data (used only to confirm shape; values are ignored).

        Returns
        -------
        self : RandomChannelRotation
        """
        rng = np.random.default_rng(self.random_state)
        self.C_ = self.C
        self.H_ = self.H
        self.W_ = self.W
        # Sample a random Gaussian matrix and orthogonalise via QR
        A = rng.standard_normal((self.C_, self.C_))
        Q, _ = np.linalg.qr(A)
        self.rotation_matrix_ = Q  # (C, C) orthogonal matrix
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Mix channels with the random orthogonal matrix.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, C*H*W)``
            Flattened image batch.

        Returns
        -------
        Xt : np.ndarray, shape ``(N, C*H*W)``
            Channel-rotated flattened image batch.
        """
        N = X.shape[0]
        imgs = self._to_tensor(X)  # (N, C, H, W)
        # Rearrange to (N*H*W, C) for batch matrix multiply
        spatial = imgs.transpose(0, 2, 3, 1).reshape(-1, self.C_)
        # Apply Q^T: (N*H*W, C) @ (C, C)^T → (N*H*W, C)
        rotated = spatial @ self.rotation_matrix_.T
        # Restore tensor shape (N, C, H, W) then flatten
        result = rotated.reshape(N, self.H_, self.W_, self.C_).transpose(0, 3, 1, 2)
        return self._to_flat(result)

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Invert the channel rotation using Q^T (since Q^{-1} = Q^T).

        Parameters
        ----------
        X : np.ndarray, shape ``(N, C*H*W)``
            Rotated flattened image batch from :meth:`transform`.

        Returns
        -------
        Xr : np.ndarray, shape ``(N, C*H*W)``
            Reconstructed flattened image batch.
        """
        N = X.shape[0]
        imgs = X.reshape(N, self.C_, self.H_, self.W_)  # (N, C, H, W)
        # Rearrange to (N*H*W, C)
        spatial = imgs.transpose(0, 2, 3, 1).reshape(-1, self.C_)
    # Forward right-multiplied by Q^T, so undo it by right-multiplying by Q (Q^{-1} = Q^T)
        rotated = spatial @ self.rotation_matrix_  # (N*H*W, C) @ (C, C)
        result = rotated.reshape(N, self.H_, self.W_, self.C_).transpose(0, 3, 1, 2)
        return self._to_flat(result)

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Return zeros: orthogonal rotation has ``log|det J| = 0``.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, D)``

        Returns
        -------
        log_det : np.ndarray, shape ``(N,)``
        """
        return np.zeros(X.shape[0])

fit(X)

Draw a random orthogonal rotation matrix via QR decomposition.

Parameters

X : np.ndarray, shape (N, C*H*W) Training data (used only to confirm shape; values are ignored).

Returns

self : RandomChannelRotation

Source code in rbig/_src/image.py
def fit(self, X: np.ndarray) -> RandomChannelRotation:
    """Draw a random orthogonal rotation matrix via QR decomposition.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, C*H*W)``
        Training data (used only to confirm shape; values are ignored).

    Returns
    -------
    self : RandomChannelRotation
    """
    rng = np.random.default_rng(self.random_state)
    self.C_ = self.C
    self.H_ = self.H
    self.W_ = self.W
    # Sample a random Gaussian matrix and orthogonalise via QR
    A = rng.standard_normal((self.C_, self.C_))
    Q, _ = np.linalg.qr(A)
    self.rotation_matrix_ = Q  # (C, C) orthogonal matrix
    return self

get_log_det_jacobian(X)

Return zeros: orthogonal rotation has log|det J| = 0.

Parameters

X : np.ndarray, shape (N, D)

Returns

log_det : np.ndarray, shape (N,)

Source code in rbig/_src/image.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Return zeros: orthogonal rotation has ``log|det J| = 0``.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, D)``

    Returns
    -------
    log_det : np.ndarray, shape ``(N,)``
    """
    return np.zeros(X.shape[0])

inverse_transform(X)

Invert the channel rotation using Q^T (since Q^{-1} = Q^T).

Parameters

X : np.ndarray, shape (N, C*H*W) Rotated flattened image batch from :meth:`transform`.

Returns

Xr : np.ndarray, shape (N, C*H*W) Reconstructed flattened image batch.

Source code in rbig/_src/image.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Invert the channel rotation using Q^T (since Q^{-1} = Q^T).

    Parameters
    ----------
    X : np.ndarray, shape ``(N, C*H*W)``
        Rotated flattened image batch from :meth:`transform`.

    Returns
    -------
    Xr : np.ndarray, shape ``(N, C*H*W)``
        Reconstructed flattened image batch.
    """
    N = X.shape[0]
    imgs = X.reshape(N, self.C_, self.H_, self.W_)  # (N, C, H, W)
    # Rearrange to (N*H*W, C)
    spatial = imgs.transpose(0, 2, 3, 1).reshape(-1, self.C_)
    # Forward right-multiplied by Q^T, so undo it by right-multiplying by Q (Q^{-1} = Q^T)
    rotated = spatial @ self.rotation_matrix_  # (N*H*W, C) @ (C, C)
    result = rotated.reshape(N, self.H_, self.W_, self.C_).transpose(0, 3, 1, 2)
    return self._to_flat(result)

transform(X)

Mix channels with the random orthogonal matrix.

Parameters

X : np.ndarray, shape (N, C*H*W) Flattened image batch.

Returns

Xt : np.ndarray, shape (N, C*H*W) Channel-rotated flattened image batch.

Source code in rbig/_src/image.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Mix channels with the random orthogonal matrix.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, C*H*W)``
        Flattened image batch.

    Returns
    -------
    Xt : np.ndarray, shape ``(N, C*H*W)``
        Channel-rotated flattened image batch.
    """
    N = X.shape[0]
    imgs = self._to_tensor(X)  # (N, C, H, W)
    # Rearrange to (N*H*W, C) for batch matrix multiply
    spatial = imgs.transpose(0, 2, 3, 1).reshape(-1, self.C_)
    # Apply Q^T: (N*H*W, C) @ (C, C)^T → (N*H*W, C)
    rotated = spatial @ self.rotation_matrix_.T
    # Restore tensor shape (N, C, H, W) then flatten
    result = rotated.reshape(N, self.H_, self.W_, self.C_).transpose(0, 3, 1, 2)
    return self._to_flat(result)

RandomOrthogonalProjection

Bases: RotationBijector

Semi-orthogonal random projection from D to K dimensions via QR.

Generates a semi-orthogonal matrix P in R^{D x K} (K <= D) whose columns are orthonormal, obtained by taking the first K columns of a QR decomposition of a random Gaussian matrix::

A ~ N(0, 1)^{D x K},  A = Q R,  P = Q[:, :K]

The forward transform projects D-dimensional input to K dimensions::

z = X P   where P in R^{D x K}

Parameters

n_components : int or None, default None
    Output dimensionality K. If None, K = D (square case).
random_state : int or None, default None
    Seed for reproducible matrix generation.

Attributes

projection_matrix_ : np.ndarray of shape (n_features, n_components)
    Semi-orthogonal projection matrix P with orthonormal columns.
input_dim_ : int
    Input dimensionality D.
output_dim_ : int
    Output dimensionality K.

Notes

When K = D the matrix is fully orthogonal and log|det J| = 0. When K < D the transform is not invertible and both :meth:`inverse_transform` and :meth:`get_log_det_jacobian` raise :class:`NotImplementedError`.

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Examples

>>> import numpy as np
>>> from rbig._src.rotation import RandomOrthogonalProjection
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((100, 4))
>>> proj = RandomOrthogonalProjection(random_state=0).fit(X)
>>> Z = proj.transform(X)
>>> Z.shape
(100, 4)
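The semi-orthogonal structure behind the K < D case can be verified with a standalone numpy sketch that mirrors the QR construction in `fit`: the columns of P are orthonormal (PᵀP = I_K), but P Pᵀ ≠ I_D, which is why the reduced projection has no inverse.

```python
import numpy as np

rng = np.random.default_rng(0)
D, K = 5, 3

# Mirror fit(): QR of a Gaussian (D, K) matrix yields orthonormal columns.
A = rng.standard_normal((D, K))
Q, _ = np.linalg.qr(A)
P = Q[:, :K]                      # semi-orthogonal (D, K)

# Columns are orthonormal: P^T P = I_K ...
assert np.allclose(P.T @ P, np.eye(K))
# ... but P P^T != I_D when K < D, so the projection cannot be inverted.
assert not np.allclose(P @ P.T, np.eye(D))

# X @ P @ P.T only recovers the component of X lying in span(P).
X = rng.standard_normal((4, D))
X_back = (X @ P) @ P.T
assert not np.allclose(X, X_back)
```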

Source code in rbig/_src/rotation.py
class RandomOrthogonalProjection(RotationBijector):
    """Semi-orthogonal random projection from D to K dimensions via QR.

    Generates a semi-orthogonal matrix P in R^{D x K} (K <= D) whose
    columns are orthonormal, obtained by taking the first K columns of a
    QR decomposition of a random Gaussian matrix::

        A ~ N(0, 1)^{D x K},  A = Q R,  P = Q[:, :K]

    The forward transform projects D-dimensional input to K dimensions::

        z = X P   where P in R^{D x K}

    Parameters
    ----------
    n_components : int or None, default None
        Output dimensionality K.  If ``None``, K = D (square case).
    random_state : int or None, default None
        Seed for reproducible matrix generation.

    Attributes
    ----------
    projection_matrix_ : np.ndarray of shape (n_features, n_components)
        Semi-orthogonal projection matrix P with orthonormal columns.
    input_dim_ : int
        Input dimensionality D.
    output_dim_ : int
        Output dimensionality K.

    Notes
    -----
    When K = D the matrix is fully orthogonal and ``log|det J| = 0``.
    When K < D the transform is not invertible and both
    :meth:`inverse_transform` and :meth:`get_log_det_jacobian` raise
    :class:`NotImplementedError`.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.rotation import RandomOrthogonalProjection
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((100, 4))
    >>> proj = RandomOrthogonalProjection(random_state=0).fit(X)
    >>> Z = proj.transform(X)
    >>> Z.shape
    (100, 4)
    """

    def __init__(
        self,
        n_components: int | None = None,
        random_state: int | None = None,
    ):
        self.n_components = n_components
        self.random_state = random_state

    def fit(self, X: np.ndarray) -> RandomOrthogonalProjection:
        """Build the semi-orthogonal projection matrix P.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data (used only to infer D = n_features).

        Returns
        -------
        self : RandomOrthogonalProjection
            Fitted transform instance.
        """
        rng = np.random.default_rng(self.random_state)
        D = X.shape[1]
        K = self.n_components if self.n_components is not None else D
        # Random Gaussian seed matrix; QR gives orthonormal columns
        A = rng.standard_normal((D, K))
        Q, _ = np.linalg.qr(A)
        self.projection_matrix_ = Q[:, :K]  # (D, K)  semi-orthogonal basis
        self.input_dim_ = D
        self.output_dim_ = K
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Project X from D to K dimensions: z = X P.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data.

        Returns
        -------
        Z : np.ndarray of shape (n_samples, n_components)
            Projected data.
        """
        return X @ self.projection_matrix_  # (N, D) @ (D, K) -> (N, K)

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Invert the projection (only valid for square case K = D).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_components)
            Projected data.

        Returns
        -------
        Xr : np.ndarray of shape (n_samples, n_features)
            Data recovered in the original space (exact only when K = D).

        Raises
        ------
        NotImplementedError
            If ``n_components < n_features`` (projection is not invertible).
        """
        if self.output_dim_ < self.input_dim_:
            raise NotImplementedError(
                "RandomOrthogonalProjection with n_components < input dimension "
                "is not bijective; inverse_transform is undefined."
            )
        return X @ self.projection_matrix_.T  # exact inverse only when square (N, D)

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Return zeros for the square (bijective) case.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points (used only to determine n_samples).

        Returns
        -------
        ldj : np.ndarray of shape (n_samples,)
            Zeros, because ``|det P| = 1`` for a square orthogonal matrix.

        Raises
        ------
        NotImplementedError
            If ``n_components < n_features`` (Jacobian determinant undefined).
        """
        if self.output_dim_ < self.input_dim_:
            raise NotImplementedError(
                "RandomOrthogonalProjection with n_components < input dimension "
                "does not have a well-defined Jacobian determinant."
            )
        # For a square orthogonal matrix, |det(J)| = 1, so log|det(J)| = 0.
        return np.zeros(X.shape[0])

fit(X)

Build the semi-orthogonal projection matrix P.

Parameters

X : np.ndarray of shape (n_samples, n_features) Training data (used only to infer D = n_features).

Returns

self : RandomOrthogonalProjection Fitted transform instance.

Source code in rbig/_src/rotation.py
def fit(self, X: np.ndarray) -> RandomOrthogonalProjection:
    """Build the semi-orthogonal projection matrix P.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data (used only to infer D = n_features).

    Returns
    -------
    self : RandomOrthogonalProjection
        Fitted transform instance.
    """
    rng = np.random.default_rng(self.random_state)
    D = X.shape[1]
    K = self.n_components if self.n_components is not None else D
    # Random Gaussian seed matrix; QR gives orthonormal columns
    A = rng.standard_normal((D, K))
    Q, _ = np.linalg.qr(A)
    self.projection_matrix_ = Q[:, :K]  # (D, K)  semi-orthogonal basis
    self.input_dim_ = D
    self.output_dim_ = K
    return self

get_log_det_jacobian(X)

Return zeros for the square (bijective) case.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input points (used only to determine n_samples).

Returns

ldj : np.ndarray of shape (n_samples,) Zeros, because |det P| = 1 for a square orthogonal matrix.

Raises

NotImplementedError If n_components < n_features (Jacobian determinant undefined).

Source code in rbig/_src/rotation.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Return zeros for the square (bijective) case.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points (used only to determine n_samples).

    Returns
    -------
    ldj : np.ndarray of shape (n_samples,)
        Zeros, because ``|det P| = 1`` for a square orthogonal matrix.

    Raises
    ------
    NotImplementedError
        If ``n_components < n_features`` (Jacobian determinant undefined).
    """
    if self.output_dim_ < self.input_dim_:
        raise NotImplementedError(
            "RandomOrthogonalProjection with n_components < input dimension "
            "does not have a well-defined Jacobian determinant."
        )
    # For a square orthogonal matrix, |det(J)| = 1, so log|det(J)| = 0.
    return np.zeros(X.shape[0])

inverse_transform(X)

Invert the projection (only valid for square case K = D).

Parameters

X : np.ndarray of shape (n_samples, n_components) Projected data.

Returns

Xr : np.ndarray of shape (n_samples, n_features) Data recovered in the original space (exact only when K = D).

Raises

NotImplementedError If n_components < n_features (projection is not invertible).

Source code in rbig/_src/rotation.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Invert the projection (only valid for square case K = D).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_components)
        Projected data.

    Returns
    -------
    Xr : np.ndarray of shape (n_samples, n_features)
        Data recovered in the original space (exact only when K = D).

    Raises
    ------
    NotImplementedError
        If ``n_components < n_features`` (projection is not invertible).
    """
    if self.output_dim_ < self.input_dim_:
        raise NotImplementedError(
            "RandomOrthogonalProjection with n_components < input dimension "
            "is not bijective; inverse_transform is undefined."
        )
    return X @ self.projection_matrix_.T  # exact inverse only when square (N, D)

transform(X)

Project X from D to K dimensions: z = X P.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data.

Returns

Z : np.ndarray of shape (n_samples, n_components) Projected data.

Source code in rbig/_src/rotation.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Project X from D to K dimensions: z = X P.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data.

    Returns
    -------
    Z : np.ndarray of shape (n_samples, n_components)
        Projected data.
    """
    return X @ self.projection_matrix_  # (N, D) @ (D, K) -> (N, K)

RandomRotation

Bases: RotationBijector

Random orthogonal rotation drawn from the Haar measure via QR.

Generates a uniformly random orthogonal matrix Q in R^{D x D} by QR decomposing a matrix of i.i.d. standard-normal entries and applying a sign correction to ensure the result is Haar-uniform::

A ~ N(0, 1)^{D x D},  A = Q R,  Q <- Q * diag(sign(diag(R)))

The sign correction guarantees that Q is sampled uniformly from the orthogonal group O(D) (the Haar measure).

Parameters

random_state : int or None, default None Seed for reproducible rotation matrix generation.

Attributes

rotation_matrix_ : np.ndarray of shape (n_features, n_features) The sampled orthogonal rotation matrix Q.

Notes

Because Q is orthogonal, |det Q| = 1 and::

log|det J| = log|det Q| = 0

This is the default implementation inherited from :class:`~rbig._src.base.RotationBijector`.

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Examples

>>> import numpy as np
>>> from rbig._src.rotation import RandomRotation
>>> rng_data = np.random.default_rng(0)
>>> X = rng_data.standard_normal((100, 4))
>>> rot = RandomRotation(random_state=42).fit(X)
>>> Z = rot.transform(X)
>>> Z.shape
(100, 4)
>>> Xr = rot.inverse_transform(Z)
>>> np.allclose(X, Xr)
True
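The sign-corrected QR sampler described above can be reproduced in a few lines of numpy. The raw QR factorization is not sign-unique, and the LAPACK convention biases the column signs away from the Haar distribution; multiplying each column of Q by sign(R[j, j]) removes that bias (the Haar uniformity itself is a distributional property and is not checked here, only orthogonality).

```python
import numpy as np

rng = np.random.default_rng(42)
D = 4
A = rng.standard_normal((D, D))
Q, R = np.linalg.qr(A)

# Sign correction: flip column j of Q by sign(R[j, j]) to undo the
# sign convention imposed by the QR routine.
Q = Q * np.sign(np.diag(R))

# Q remains orthogonal: Q^T Q = I and |det Q| = 1, so log|det J| = 0.
assert np.allclose(Q.T @ Q, np.eye(D))
assert np.isclose(abs(np.linalg.det(Q)), 1.0)
```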

Source code in rbig/_src/rotation.py
class RandomRotation(RotationBijector):
    """Random orthogonal rotation drawn from the Haar measure via QR.

    Generates a uniformly random orthogonal matrix Q in R^{D x D} by QR
    decomposing a matrix of i.i.d. standard-normal entries and applying a
    sign correction to ensure the result is Haar-uniform::

        A ~ N(0, 1)^{D x D},  A = Q R,  Q <- Q * diag(sign(diag(R)))

    The sign correction guarantees that Q is sampled uniformly from the
    orthogonal group O(D) (the Haar measure).

    Parameters
    ----------
    random_state : int or None, default None
        Seed for reproducible rotation matrix generation.

    Attributes
    ----------
    rotation_matrix_ : np.ndarray of shape (n_features, n_features)
        The sampled orthogonal rotation matrix Q.

    Notes
    -----
    Because Q is orthogonal, ``|det Q| = 1`` and::

        log|det J| = log|det Q| = 0

    This is the default implementation inherited from
    :class:`~rbig._src.base.RotationBijector`.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.rotation import RandomRotation
    >>> rng_data = np.random.default_rng(0)
    >>> X = rng_data.standard_normal((100, 4))
    >>> rot = RandomRotation(random_state=42).fit(X)
    >>> Z = rot.transform(X)
    >>> Z.shape
    (100, 4)
    >>> Xr = rot.inverse_transform(Z)
    >>> np.allclose(X, Xr)
    True
    """

    def __init__(self, random_state: int | None = None):
        self.random_state = random_state

    def fit(self, X: np.ndarray) -> RandomRotation:
        """Sample a Haar-uniform orthogonal rotation matrix of size D x D.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data (used only to infer the dimensionality D).

        Returns
        -------
        self : RandomRotation
            Fitted transform instance with ``rotation_matrix_`` set.
        """
        rng = np.random.default_rng(self.random_state)
        n_features = X.shape[1]
        # Draw a random D x D Gaussian matrix
        A = rng.standard_normal((n_features, n_features))
        Q, R = np.linalg.qr(A)
        # Sign correction: multiply columns of Q by sign(diag(R)) for Haar measure
        Q *= np.sign(np.diag(R))  # ensures uniform distribution on O(D)
        self.rotation_matrix_ = Q  # shape (D, D)
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Rotate X by the sampled orthogonal matrix: z = Q x.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data.

        Returns
        -------
        Z : np.ndarray of shape (n_samples, n_features)
            Rotated data.
        """
        return X @ self.rotation_matrix_.T  # (N, D) @ (D, D) -> (N, D)

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Invert the rotation: x = Q^T z = Q^{-1} z.

        Because Q is orthogonal, its inverse equals its transpose.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Rotated data.

        Returns
        -------
        Xr : np.ndarray of shape (n_samples, n_features)
            Data recovered in the original space.
        """
        return X @ self.rotation_matrix_  # Q^{-1} = Q^T -> (N, D) @ (D, D)

fit(X)

Sample a Haar-uniform orthogonal rotation matrix of size D x D.

Parameters

X : np.ndarray of shape (n_samples, n_features) Training data (used only to infer the dimensionality D).

Returns

self : RandomRotation Fitted transform instance with rotation_matrix_ set.

Source code in rbig/_src/rotation.py
def fit(self, X: np.ndarray) -> RandomRotation:
    """Sample a Haar-uniform orthogonal rotation matrix of size D x D.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data (used only to infer the dimensionality D).

    Returns
    -------
    self : RandomRotation
        Fitted transform instance with ``rotation_matrix_`` set.
    """
    rng = np.random.default_rng(self.random_state)
    n_features = X.shape[1]
    # Draw a random D x D Gaussian matrix
    A = rng.standard_normal((n_features, n_features))
    Q, R = np.linalg.qr(A)
    # Sign correction: multiply columns of Q by sign(diag(R)) for Haar measure
    Q *= np.sign(np.diag(R))  # ensures uniform distribution on O(D)
    self.rotation_matrix_ = Q  # shape (D, D)
    return self

inverse_transform(X)

Invert the rotation: x = Q^T z = Q^{-1} z.

Because Q is orthogonal, its inverse equals its transpose.

Parameters

X : np.ndarray of shape (n_samples, n_features) Rotated data.

Returns

Xr : np.ndarray of shape (n_samples, n_features) Data recovered in the original space.

Source code in rbig/_src/rotation.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Invert the rotation: x = Q^T z = Q^{-1} z.

    Because Q is orthogonal, its inverse equals its transpose.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Rotated data.

    Returns
    -------
    Xr : np.ndarray of shape (n_samples, n_features)
        Data recovered in the original space.
    """
    return X @ self.rotation_matrix_  # Q^{-1} = Q^T -> (N, D) @ (D, D)

transform(X)

Rotate X by the sampled orthogonal matrix: z = Q x.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data.

Returns

Z : np.ndarray of shape (n_samples, n_features) Rotated data.

Source code in rbig/_src/rotation.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Rotate X by the sampled orthogonal matrix: z = Q x.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data.

    Returns
    -------
    Z : np.ndarray of shape (n_samples, n_features)
        Rotated data.
    """
    return X @ self.rotation_matrix_.T  # (N, D) @ (D, D) -> (N, D)

RotationBijector

Bases: Bijector

Abstract bijector for orthogonal rotation transforms.

Rotation matrices Q satisfy QᵀQ = I and |det Q| = 1, so the log-absolute-determinant of the Jacobian is exactly zero:

log|det J_Q(x)| = log|det Q| = log 1 = 0

This default implementation of get_log_det_jacobian returns a zero vector of length n_samples, which concrete subclasses (e.g. PCA, ICA, random orthogonal) can inherit without override.

Notes

In RBIG, the rotation step de-correlates the marginally Gaussianized data, driving the joint distribution closer to a standard multivariate Gaussian with each iteration.

Source code in rbig/_src/base.py
class RotationBijector(Bijector):
    """Abstract bijector for orthogonal rotation transforms.

    Rotation matrices Q satisfy QᵀQ = I and |det Q| = 1, so the
    log-absolute-determinant of the Jacobian is exactly zero:

        log|det J_Q(x)| = log|det Q| = log 1 = 0

    This default implementation of ``get_log_det_jacobian`` returns a
    zero vector of length ``n_samples``, which concrete subclasses (e.g.
    PCA, ICA, random orthogonal) can inherit without override.

    Notes
    -----
    In RBIG, the rotation step de-correlates the marginally Gaussianized
    data, driving the joint distribution closer to a standard multivariate
    Gaussian with each iteration.
    """

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Return zeros because |det Q| = 1 for any orthogonal matrix Q.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points (used only to determine n_samples).

        Returns
        -------
        ldj : np.ndarray of shape (n_samples,)
            Array of zeros; rotations do not change volume.
        """
        # Orthogonal matrices preserve volume: log|det Q| = 0 for all x
        return np.zeros(X.shape[0])

get_log_det_jacobian(X)

Return zeros because |det Q| = 1 for any orthogonal matrix Q.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input points (used only to determine n_samples).

Returns

ldj : np.ndarray of shape (n_samples,) Array of zeros; rotations do not change volume.

Source code in rbig/_src/base.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Return zeros because |det Q| = 1 for any orthogonal matrix Q.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points (used only to determine n_samples).

    Returns
    -------
    ldj : np.ndarray of shape (n_samples,)
        Array of zeros; rotations do not change volume.
    """
    # Orthogonal matrices preserve volume: log|det Q| = 0 for all x
    return np.zeros(X.shape[0])

SplineGaussianizer

Bases: Bijector

Gaussianize each marginal using monotone PCHIP spline interpolation.

Estimates the marginal CDF from empirical quantiles and fits a shape-preserving (monotone) cubic Hermite spline (PCHIP) from original-space quantile values to the corresponding Gaussian quantiles. The forward transform is::

z = S(x)

where S is the fitted :class:scipy.interpolate.PchipInterpolator mapping data values to standard-normal quantiles. Because PCHIP preserves monotonicity, the mapping is guaranteed to be invertible.

Parameters

n_quantiles : int, default 200
    Number of quantile nodes used to fit the splines. Capped at n_samples when fewer training samples are available.
eps : float, default 1e-6
    Clipping margin applied to the Gaussian quantile grid to keep the spline endpoints finite (avoids +/-inf at boundary quantiles).

Attributes

splines_ : list of scipy.interpolate.PchipInterpolator
    Forward splines (x -> z) per feature, mapping original-space values to standard-normal quantiles.
inv_splines_ : list of scipy.interpolate.PchipInterpolator
    Inverse splines (z -> x) per feature, mapping Gaussian quantiles back to original-space values.
n_features_ : int
    Number of feature dimensions seen during fit.

Notes

The log-det-Jacobian uses the analytic first derivative of the spline::

log|dz/dx| = log|S'(x)|

where S' is the first derivative of the PCHIP forward spline, evaluated via spline(x, 1) (the derivative-order argument).

Duplicate x-values (arising from discrete or constant features) are removed before fitting to ensure strict monotonicity.
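The dedup step can be sketched in isolation. The snippet below (a standalone sketch mirroring `fit`, not the library's own code) builds a heavily tied feature, shows that its quantile grid contains duplicates, and confirms that `np.unique` yields strictly increasing nodes that `PchipInterpolator` accepts:

```python
import numpy as np
from scipy.interpolate import PchipInterpolator
from scipy.special import ndtri

# A discrete feature: heavy ties make many quantile nodes coincide
x = np.repeat([0.0, 1.0, 2.0], 50)
p = np.linspace(0, 1, 20)
x_q = np.quantile(x, p)                    # contains duplicate values
g_q = ndtri(np.clip(p, 1e-6, 1 - 1e-6))    # Gaussian quantiles, kept finite

# Keep the first occurrence of each x value -> strictly increasing nodes
x_u, idx = np.unique(x_q, return_index=True)
assert np.all(np.diff(x_u) > 0)
spline = PchipInterpolator(x_u, g_q[idx])  # fits without error
assert np.isfinite(spline(1.5))
```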

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Examples

>>> import numpy as np
>>> from rbig._src.marginal import SplineGaussianizer
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((300, 3))
>>> sg = SplineGaussianizer(n_quantiles=100).fit(X)
>>> Z = sg.transform(X)
>>> Z.shape
(300, 3)
>>> ldj = sg.get_log_det_jacobian(X)
>>> ldj.shape
(300,)

Source code in rbig/_src/marginal.py
class SplineGaussianizer(Bijector):
    """Gaussianize each marginal using monotone PCHIP spline interpolation.

    Estimates the marginal CDF from empirical quantiles and fits a
    shape-preserving (monotone) cubic Hermite spline (PCHIP) from
    original-space quantile values to the corresponding Gaussian quantiles.
    The forward transform is::

        z = S(x)

    where ``S`` is the fitted :class:`scipy.interpolate.PchipInterpolator`
    mapping data values to standard-normal quantiles.  Because PCHIP
    preserves monotonicity, the mapping is guaranteed to be invertible.

    Parameters
    ----------
    n_quantiles : int, default 200
        Number of quantile nodes used to fit the splines.  Capped at
        ``n_samples`` when fewer training samples are available.
    eps : float, default 1e-6
        Clipping margin applied to the Gaussian quantile grid to keep
        the spline endpoints finite (avoids +/-inf at boundary quantiles).

    Attributes
    ----------
    splines_ : list of scipy.interpolate.PchipInterpolator
        Forward splines (x -> z) per feature, mapping original-space
        values to standard-normal quantiles.
    inv_splines_ : list of scipy.interpolate.PchipInterpolator
        Inverse splines (z -> x) per feature, mapping Gaussian quantiles
        back to original-space values.
    n_features_ : int
        Number of feature dimensions seen during ``fit``.

    Notes
    -----
    The log-det-Jacobian uses the analytic first derivative of the spline::

        log|dz/dx| = log|S'(x)|

    where ``S'`` is the first derivative of the PCHIP forward spline,
    evaluated via ``spline(x, 1)`` (the derivative-order argument).

    Duplicate x-values (arising from discrete or constant features) are
    removed before fitting to ensure strict monotonicity.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.marginal import SplineGaussianizer
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((300, 3))
    >>> sg = SplineGaussianizer(n_quantiles=100).fit(X)
    >>> Z = sg.transform(X)
    >>> Z.shape
    (300, 3)
    >>> ldj = sg.get_log_det_jacobian(X)
    >>> ldj.shape
    (300,)
    """

    def __init__(self, n_quantiles: int = 200, eps: float = 1e-6):
        self.n_quantiles = n_quantiles
        self.eps = eps

    def fit(self, X: np.ndarray) -> SplineGaussianizer:
        """Fit forward and inverse PCHIP splines for each feature.

        For each dimension, ``n_quantiles`` evenly-spaced probability levels
        are mapped to their empirical quantile values in the data, and the
        corresponding Gaussian quantile values ``Phi^{-1}(p)`` are computed.
        PCHIP interpolants are then fitted in both directions.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.

        Returns
        -------
        self : SplineGaussianizer
            Fitted bijector instance.
        """
        from scipy.interpolate import PchipInterpolator

        self.splines_ = []
        self.inv_splines_ = []
        self.n_features_ = X.shape[1]
        # Use at most n_samples quantile nodes
        n_q = min(self.n_quantiles, X.shape[0])
        # Probability grid: n_q evenly-spaced points in [0, 1]
        quantiles = np.linspace(0, 1, n_q)
        # Corresponding Gaussian quantiles Phi^{-1}(p), clipped away from +/-inf
        g_q = ndtri(np.clip(quantiles, self.eps, 1 - self.eps))
        for i in range(self.n_features_):
            xi_sorted = np.sort(X[:, i])
            # Empirical quantile values at each probability level
            x_q = np.quantile(xi_sorted, quantiles)
            # Remove duplicate x values so PchipInterpolator gets a strictly
            # increasing sequence (duplicates arise with discrete/tied data).
            x_q_u, idx = np.unique(x_q, return_index=True)
            g_q_u = g_q[idx]
            self.splines_.append(PchipInterpolator(x_q_u, g_q_u))
            self.inv_splines_.append(PchipInterpolator(g_q_u, x_q_u))
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the forward spline map: x -> z = S(x).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data in the original space.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Gaussianized data with approximately standard-normal marginals.
        """
        Xt = np.zeros_like(X, dtype=float)
        for i in range(self.n_features_):
            # Evaluate the forward PCHIP spline at the input values
            Xt[:, i] = self.splines_[i](X[:, i])
        return Xt

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the inverse spline map: z -> x = S^{-1}(z).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in the Gaussianized space.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Data approximately recovered in the original input space.
        """
        Xt = np.zeros_like(X, dtype=float)
        for i in range(self.n_features_):
            # Evaluate the inverse PCHIP spline at the Gaussian values
            Xt[:, i] = self.inv_splines_[i](X[:, i])
        return Xt

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Compute log |det J| using the analytic spline first derivative.

        Because the Jacobian is diagonal::

            log|det J| = sum_i log|S'(x_i)|

        where ``S'`` is the first derivative of the PCHIP forward spline,
        evaluated via ``spline(x, 1)`` (the derivative order argument).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points at which to evaluate the log-det-Jacobian.

        Returns
        -------
        log_det : np.ndarray of shape (n_samples,)
            Per-sample log absolute determinant.
        """
        log_det = np.zeros(X.shape[0])
        for i in range(self.n_features_):
            deriv = self.splines_[i](X[:, i], 1)  # first derivative
            log_det += np.log(np.maximum(np.abs(deriv), 1e-300))
        return log_det

fit(X)

Fit forward and inverse PCHIP splines for each feature.

For each dimension, n_quantiles evenly-spaced probability levels are mapped to their empirical quantile values in the data, and the corresponding Gaussian quantile values Phi^{-1}(p) are computed. PCHIP interpolants are then fitted in both directions.

Parameters

X : np.ndarray of shape (n_samples, n_features) Training data.

Returns

self : SplineGaussianizer Fitted bijector instance.

Source code in rbig/_src/marginal.py
def fit(self, X: np.ndarray) -> SplineGaussianizer:
    """Fit forward and inverse PCHIP splines for each feature.

    For each dimension, ``n_quantiles`` evenly-spaced probability levels
    are mapped to their empirical quantile values in the data, and the
    corresponding Gaussian quantile values ``Phi^{-1}(p)`` are computed.
    PCHIP interpolants are then fitted in both directions.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.

    Returns
    -------
    self : SplineGaussianizer
        Fitted bijector instance.
    """
    from scipy.interpolate import PchipInterpolator

    self.splines_ = []
    self.inv_splines_ = []
    self.n_features_ = X.shape[1]
    # Use at most n_samples quantile nodes
    n_q = min(self.n_quantiles, X.shape[0])
    # Probability grid: n_q evenly-spaced points in [0, 1]
    quantiles = np.linspace(0, 1, n_q)
    # Corresponding Gaussian quantiles Phi^{-1}(p), clipped away from +/-inf
    g_q = ndtri(np.clip(quantiles, self.eps, 1 - self.eps))
    for i in range(self.n_features_):
        xi_sorted = np.sort(X[:, i])
        # Empirical quantile values at each probability level
        x_q = np.quantile(xi_sorted, quantiles)
        # Remove duplicate x values so PchipInterpolator gets a strictly
        # increasing sequence (duplicates arise with discrete/tied data).
        x_q_u, idx = np.unique(x_q, return_index=True)
        g_q_u = g_q[idx]
        self.splines_.append(PchipInterpolator(x_q_u, g_q_u))
        self.inv_splines_.append(PchipInterpolator(g_q_u, x_q_u))
    return self

get_log_det_jacobian(X)

Compute log |det J| using the analytic spline first derivative.

Because the Jacobian is diagonal::

log|det J| = sum_i log|S'(x_i)|

where S' is the first derivative of the PCHIP forward spline, evaluated via spline(x, 1) (the derivative order argument).

Parameters

X : np.ndarray of shape (n_samples, n_features) Input points at which to evaluate the log-det-Jacobian.

Returns

log_det : np.ndarray of shape (n_samples,) Per-sample log absolute determinant.

Source code in rbig/_src/marginal.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Compute log |det J| using the analytic spline first derivative.

    Because the Jacobian is diagonal::

        log|det J| = sum_i log|S'(x_i)|

    where ``S'`` is the first derivative of the PCHIP forward spline,
    evaluated via ``spline(x, 1)`` (the derivative order argument).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points at which to evaluate the log-det-Jacobian.

    Returns
    -------
    log_det : np.ndarray of shape (n_samples,)
        Per-sample log absolute determinant.
    """
    log_det = np.zeros(X.shape[0])
    for i in range(self.n_features_):
        deriv = self.splines_[i](X[:, i], 1)  # first derivative
        log_det += np.log(np.maximum(np.abs(deriv), 1e-300))
    return log_det

inverse_transform(X)

Apply the inverse spline map: z -> x = S^{-1}(z).

Parameters

X : np.ndarray of shape (n_samples, n_features) Data in the Gaussianized space.

Returns

Xt : np.ndarray of shape (n_samples, n_features) Data approximately recovered in the original input space.

Source code in rbig/_src/marginal.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the inverse spline map: z -> x = S^{-1}(z).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in the Gaussianized space.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Data approximately recovered in the original input space.
    """
    Xt = np.zeros_like(X, dtype=float)
    for i in range(self.n_features_):
        # Evaluate the inverse PCHIP spline at the Gaussian values
        Xt[:, i] = self.inv_splines_[i](X[:, i])
    return Xt

transform(X)

Apply the forward spline map: x -> z = S(x).

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data in the original space.

Returns

Xt : np.ndarray of shape (n_samples, n_features) Gaussianized data with approximately standard-normal marginals.

Source code in rbig/_src/marginal.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the forward spline map: x -> z = S(x).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data in the original space.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Gaussianized data with approximately standard-normal marginals.
    """
    Xt = np.zeros_like(X, dtype=float)
    for i in range(self.n_features_):
        # Evaluate the forward PCHIP spline at the input values
        Xt[:, i] = self.splines_[i](X[:, i])
    return Xt

Tanh

Bases: Bijector

Elementwise hyperbolic-tangent bijector with analytic log-det Jacobian.

Implements the smooth, bounded map:

Forward  : y = tanh(x)
Inverse  : x = arctanh(y)   (y clipped to (−1+ε, 1−ε) for stability)
Log-det  : ∑ᵢ log(1 − tanh²(xᵢ)) = ∑ᵢ log sech²(xᵢ)

Because tanh saturates near ±1, the inverse is numerically clipped to avoid divergence.
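The analytic log-det formula can be sanity-checked against finite differences. The snippet below is a standalone sketch (not the library's implementation) comparing exp(log|det J|) with the product of numerically estimated diagonal Jacobian entries:

```python
import numpy as np

def tanh_log_det(X):
    # Analytic log|det J| of elementwise tanh: sum_i log(1 - tanh^2(x_i))
    return np.sum(np.log(1.0 - np.tanh(X) ** 2), axis=1)

X = np.array([[0.0, 0.5], [-1.0, 2.0]])

# Finite-difference check: the diagonal Jacobian entries are sech^2(x_i)
h = 1e-6
fd_diag = (np.tanh(X + h) - np.tanh(X - h)) / (2 * h)
assert np.allclose(np.exp(tanh_log_det(X)), np.prod(fd_diag, axis=1), atol=1e-6)
```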

Examples

>>> import numpy as np
>>> from rbig._src.densities import Tanh
>>> X = np.array([[0.0, 1.0], [-1.0, 0.5]])
>>> bij = Tanh().fit(X)
>>> Y = bij.transform(X)  # y = tanh(x)
>>> X_rec = bij.inverse_transform(Y)  # x = arctanh(y)
>>> np.allclose(X, X_rec, atol=1e-5)
True

Source code in rbig/_src/densities.py
class Tanh(Bijector):
    """Elementwise hyperbolic-tangent bijector with analytic log-det Jacobian.

    Implements the smooth, bounded map:

        Forward  : y = tanh(x)
        Inverse  : x = arctanh(y)   (y clipped to (−1+ε, 1−ε) for stability)
        Log-det  : ∑ᵢ log(1 − tanh²(xᵢ)) = ∑ᵢ log sech²(xᵢ)

    Because tanh saturates near ±1, the inverse is numerically clipped to
    avoid divergence.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.densities import Tanh
    >>> X = np.array([[0.0, 1.0], [-1.0, 0.5]])
    >>> bij = Tanh().fit(X)
    >>> Y = bij.transform(X)  # y = tanh(x)
    >>> X_rec = bij.inverse_transform(Y)  # x = arctanh(y)
    >>> np.allclose(X, X_rec, atol=1e-5)
    True
    """

    def fit(self, X: np.ndarray) -> Tanh:
        """No-op fit (stateless bijector).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Ignored.

        Returns
        -------
        self : Tanh
        """
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the forward tanh map.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data in ℝ.

        Returns
        -------
        Y : np.ndarray of shape (n_samples, n_features)
            Transformed data in (−1, 1).
        """
        return np.tanh(X)  # y = tanh(x), element-wise

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the inverse arctanh map with clipping for stability.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in (−1, 1).  Values outside this range are clipped to
            (−1 + 1e-6, 1 − 1e-6).

        Returns
        -------
        Z : np.ndarray of shape (n_samples, n_features)
            Reconstructed data in ℝ.
        """
        # Clip to open interval to prevent arctanh(±1) = ±∞
        return np.arctanh(np.clip(X, -1 + 1e-6, 1 - 1e-6))

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Compute per-sample log |det J| of the forward transform.

        The Jacobian of tanh is diagonal with entries
        d(tanh(xᵢ))/dxᵢ = 1 − tanh²(xᵢ) = sech²(xᵢ), so:

            log |det J| = ∑ᵢ log(1 − tanh²(xᵢ))

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data (pre-transform).

        Returns
        -------
        log_det : np.ndarray of shape (n_samples,)
            Log absolute determinant of the Jacobian for each sample.
        """
        # 1 - tanh^2(xi) is the diagonal Jacobian; sum over features
        return np.sum(np.log(1 - np.tanh(X) ** 2 + 1e-300), axis=1)

fit(X)

No-op fit (stateless bijector).

Parameters

X : np.ndarray of shape (n_samples, n_features) Ignored.

Returns

self : Tanh

Source code in rbig/_src/densities.py
def fit(self, X: np.ndarray) -> Tanh:
    """No-op fit (stateless bijector).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Ignored.

    Returns
    -------
    self : Tanh
    """
    return self

get_log_det_jacobian(X)

Compute per-sample log |det J| of the forward transform.

The Jacobian of tanh is diagonal with entries d(tanh(xᵢ))/dxᵢ = 1 − tanh²(xᵢ) = sech²(xᵢ), so:

log |det J| = ∑ᵢ log(1 − tanh²(xᵢ))

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data (pre-transform).

Returns

log_det : np.ndarray of shape (n_samples,) Log absolute determinant of the Jacobian for each sample.

Source code in rbig/_src/densities.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Compute per-sample log |det J| of the forward transform.

    The Jacobian of tanh is diagonal with entries
    d(tanh(xᵢ))/dxᵢ = 1 − tanh²(xᵢ) = sech²(xᵢ), so:

        log |det J| = ∑ᵢ log(1 − tanh²(xᵢ))

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data (pre-transform).

    Returns
    -------
    log_det : np.ndarray of shape (n_samples,)
        Log absolute determinant of the Jacobian for each sample.
    """
    # 1 - tanh^2(xi) is the diagonal Jacobian; sum over features
    return np.sum(np.log(1 - np.tanh(X) ** 2 + 1e-300), axis=1)

inverse_transform(X)

Apply the inverse arctanh map with clipping for stability.

Parameters

X : np.ndarray of shape (n_samples, n_features) Data in (−1, 1). Values outside this range are clipped to (−1 + 1e-6, 1 − 1e-6).

Returns

Z : np.ndarray of shape (n_samples, n_features) Reconstructed data in ℝ.

Source code in rbig/_src/densities.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the inverse arctanh map with clipping for stability.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in (−1, 1).  Values outside this range are clipped to
        (−1 + 1e-6, 1 − 1e-6).

    Returns
    -------
    Z : np.ndarray of shape (n_samples, n_features)
        Reconstructed data in ℝ.
    """
    # Clip to open interval to prevent arctanh(±1) = ±∞
    return np.arctanh(np.clip(X, -1 + 1e-6, 1 - 1e-6))

transform(X)

Apply the forward tanh map.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data in ℝ.

Returns

Y : np.ndarray of shape (n_samples, n_features) Transformed data in (−1, 1).

Source code in rbig/_src/densities.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the forward tanh map.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data in ℝ.

    Returns
    -------
    Y : np.ndarray of shape (n_samples, n_features)
        Transformed data in (−1, 1).
    """
    return np.tanh(X)  # y = tanh(x), element-wise

WaveletTransform

Bases: BaseTransform

Multi-level 2-D wavelet decomposition for image data.

Wraps PyWavelets wavedec2 / waverec2 to provide the standard fit / transform / inverse_transform interface expected by RBIG pipeline components.

The forward transform maps each (H, W) image to a flat coefficient vector of length H * W (for periodization boundary mode the coefficient array has the same number of elements as the input image).

Requires PyWavelets (pip install PyWavelets).

Parameters

wavelet : str, default "haar"
    Wavelet name accepted by :func:pywt.Wavelet, e.g. "haar", "db2", "sym4".
level : int, default 1
    Decomposition depth. Higher levels produce coarser approximation sub-bands.
mode : str, default "periodization"
    Signal extension mode passed to PyWavelets. "periodization" ensures the output coefficient array has the same total size as the input.

Attributes

original_shape_ : tuple of int
    Shape (N, H, W) of the training data passed to :meth:fit.
coeff_slices_ : list
    PyWavelets slicing metadata needed to pack/unpack the coefficient array. Set during :meth:fit.
coeff_shape_ : tuple of int
    Shape of the 2-D coefficient array produced by :func:pywt.coeffs_to_array.

Notes

The mapping from image to coefficients is

.. math::

(N,\, H,\, W) \xrightarrow{\text{wavedec2}} (N,\, H \cdot W)

For level=1 and wavelet="haar" the four sub-bands are: approximation (LL), horizontal detail (LH), vertical detail (HL), and diagonal detail (HH).
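For intuition, the level-1 Haar split can be hand-rolled in a few lines of numpy. This is a sketch only: sub-band naming conventions vary between libraries, and the class itself delegates to pywt.wavedec2 rather than this code.

```python
import numpy as np

def haar_level1(img):
    """Hand-rolled level-1 orthonormal 2-D Haar split (even-sized images)."""
    a = img[0::2, 0::2]
    b = img[0::2, 1::2]
    c = img[1::2, 0::2]
    d = img[1::2, 1::2]
    ll = (a + b + c + d) / 2  # approximation (LL)
    lh = (a + b - c - d) / 2  # horizontal detail
    hl = (a - b + c - d) / 2  # vertical detail
    hh = (a - b - c + d) / 2  # diagonal detail
    return ll, lh, hl, hh

img = np.arange(16, dtype=float).reshape(4, 4)
bands = haar_level1(img)
assert all(band.shape == (2, 2) for band in bands)
# Coefficient count equals the input size, as with mode="periodization"
assert sum(band.size for band in bands) == img.size
```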

Examples

>>> import numpy as np
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((50, 8, 8))  # 50 grayscale 8×8 images
>>> wt = WaveletTransform(wavelet="haar", level=1)
>>> wt.fit(X)  # doctest: +ELLIPSIS
WaveletTransform(...)
>>> Xt = wt.transform(X)
>>> Xt.shape
(50, 64)
>>> Xr = wt.inverse_transform(Xt)
>>> Xr.shape
(50, 8, 8)

Source code in rbig/_src/image.py
class WaveletTransform(BaseTransform):
    """Multi-level 2-D wavelet decomposition for image data.

    Wraps PyWavelets ``wavedec2`` / ``waverec2`` to provide the standard
    ``fit`` / ``transform`` / ``inverse_transform`` interface expected by
    RBIG pipeline components.

    The forward transform maps each ``(H, W)`` image to a flat coefficient
    vector of length ``H * W`` (for periodization boundary mode the coefficient
    array has the same number of elements as the input image).

    Requires PyWavelets (``pip install PyWavelets``).

    Parameters
    ----------
    wavelet : str, default ``"haar"``
        Wavelet name accepted by :func:`pywt.Wavelet`, e.g. ``"haar"``,
        ``"db2"``, ``"sym4"``.
    level : int, default 1
        Decomposition depth.  Higher levels produce coarser approximation
        sub-bands.
    mode : str, default ``"periodization"``
        Signal extension mode passed to PyWavelets.  ``"periodization"``
        ensures the output coefficient array has the same total size as the
        input.

    Attributes
    ----------
    original_shape_ : tuple of int
        Shape ``(N, H, W)`` of the training data passed to :meth:`fit`.
    coeff_slices_ : list
        PyWavelets slicing metadata needed to pack/unpack the coefficient
        array.  Set during :meth:`fit`.
    coeff_shape_ : tuple of int
        Shape of the 2-D coefficient array produced by
        :func:`pywt.coeffs_to_array`.

    Notes
    -----
    The mapping from image to coefficients is

    .. math::

        (N,\\, H,\\, W) \\xrightarrow{\\text{wavedec2}}
        (N,\\, H \\cdot W)

    For ``level=1`` and ``wavelet="haar"`` the four sub-bands are:
    approximation (LL), horizontal detail (LH), vertical detail (HL), and
    diagonal detail (HH).

    Examples
    --------
    >>> import numpy as np
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((50, 8, 8))  # 50 grayscale 8×8 images
    >>> wt = WaveletTransform(wavelet="haar", level=1)
    >>> wt.fit(X)  # doctest: +ELLIPSIS
    WaveletTransform(...)
    >>> Xt = wt.transform(X)
    >>> Xt.shape
    (50, 64)
    >>> Xr = wt.inverse_transform(Xt)
    >>> Xr.shape
    (50, 8, 8)
    """

    def __init__(
        self, wavelet: str = "haar", level: int = 1, mode: str = "periodization"
    ):
        self.wavelet = wavelet
        self.level = level
        self.mode = mode

    def fit(self, X: np.ndarray) -> WaveletTransform:
        """Compute and store coefficient layout from the first sample.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, H, W)``
            Training images.  Only the first sample is used to determine the
            coefficient array shape; the data values are not retained.

        Returns
        -------
        self : WaveletTransform
        """
        import pywt

        self.pywt_ = pywt
        self.original_shape_ = X.shape  # store (N, H, W) for reference
        test = X[0]  # single image of shape (H, W)
        coeffs = pywt.wavedec2(test, self.wavelet, level=self.level, mode=self.mode)
        self.coeff_slices_ = None
        # coeffs_to_array packs all sub-bands into one 2-D array
        arr, self.coeff_slices_ = pywt.coeffs_to_array(coeffs)
        self.coeff_shape_ = arr.shape  # e.g. (H, W) for periodization
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Decompose images into flattened wavelet coefficients.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, H, W)``
            Images to transform.

        Returns
        -------
        Xt : np.ndarray, shape ``(N, H*W)``
            Flattened coefficient vectors, one row per image.
        """
        import pywt

        result = []
        for xi in X:  # xi shape: (H, W)
            coeffs = pywt.wavedec2(xi, self.wavelet, level=self.level, mode=self.mode)
            arr, _ = pywt.coeffs_to_array(coeffs)  # arr shape: coeff_shape_
            result.append(arr.ravel())  # flatten to 1-D coefficient vector
        return np.array(result)  # (N, H*W)

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Reconstruct images from flattened wavelet coefficients.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, H*W)``
            Flattened coefficient vectors produced by :meth:`transform`.

        Returns
        -------
        Xr : np.ndarray, shape ``(N, H, W)``
            Reconstructed images.
        """
        import pywt

        result = []
        for xi in X:  # xi shape: (H*W,)
            arr = xi.reshape(self.coeff_shape_)  # restore 2-D coefficient array
            coeffs = pywt.array_to_coeffs(
                arr, self.coeff_slices_, output_format="wavedec2"
            )
            img = pywt.waverec2(coeffs, self.wavelet, mode=self.mode)  # (H, W)
            result.append(img)
        return np.array(result)  # (N, H, W)

fit(X)

Compute and store coefficient layout from the first sample.

Parameters

X : np.ndarray, shape (N, H, W) Training images. Only the first sample is used to determine the coefficient array shape; the data values are not retained.

Returns

self : WaveletTransform

Source code in rbig/_src/image.py
def fit(self, X: np.ndarray) -> WaveletTransform:
    """Compute and store coefficient layout from the first sample.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, H, W)``
        Training images.  Only the first sample is used to determine the
        coefficient array shape; the data values are not retained.

    Returns
    -------
    self : WaveletTransform
    """
    import pywt

    self.pywt_ = pywt
    self.original_shape_ = X.shape  # store (N, H, W) for reference
    test = X[0]  # single image of shape (H, W)
    coeffs = pywt.wavedec2(test, self.wavelet, level=self.level, mode=self.mode)
    self.coeff_slices_ = None
    # coeffs_to_array packs all sub-bands into one 2-D array
    arr, self.coeff_slices_ = pywt.coeffs_to_array(coeffs)
    self.coeff_shape_ = arr.shape  # e.g. (H, W) for periodization
    return self

inverse_transform(X)

Reconstruct images from flattened wavelet coefficients.

Parameters

X : np.ndarray, shape (N, H*W) Flattened coefficient vectors produced by :meth:transform.

Returns

Xr : np.ndarray, shape (N, H, W) Reconstructed images.

Source code in rbig/_src/image.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Reconstruct images from flattened wavelet coefficients.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, H*W)``
        Flattened coefficient vectors produced by :meth:`transform`.

    Returns
    -------
    Xr : np.ndarray, shape ``(N, H, W)``
        Reconstructed images.
    """
    import pywt

    result = []
    for xi in X:  # xi shape: (H*W,)
        arr = xi.reshape(self.coeff_shape_)  # restore 2-D coefficient array
        coeffs = pywt.array_to_coeffs(
            arr, self.coeff_slices_, output_format="wavedec2"
        )
        img = pywt.waverec2(coeffs, self.wavelet, mode=self.mode)  # (H, W)
        result.append(img)
    return np.array(result)  # (N, H, W)

transform(X)

Decompose images into flattened wavelet coefficients.

Parameters

X : np.ndarray, shape (N, H, W) Images to transform.

Returns

Xt : np.ndarray, shape (N, H*W) Flattened coefficient vectors, one row per image.

Source code in rbig/_src/image.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Decompose images into flattened wavelet coefficients.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, H, W)``
        Images to transform.

    Returns
    -------
    Xt : np.ndarray, shape ``(N, H*W)``
        Flattened coefficient vectors, one row per image.
    """
    import pywt

    result = []
    for xi in X:  # xi shape: (H, W)
        coeffs = pywt.wavedec2(xi, self.wavelet, level=self.level, mode=self.mode)
        arr, _ = pywt.coeffs_to_array(coeffs)  # arr shape: coeff_shape_
        result.append(arr.ravel())  # flatten to 1-D coefficient vector
    return np.array(result)  # (N, H*W)

XarrayRBIG

RBIG model with an xarray-aware interface.

Wraps an AnnealedRBIG model (or compatible class) so that it can be fitted and applied directly to xarray.DataArray / xarray.Dataset objects with spatiotemporal dimensions. The underlying model operates on a 2-D (samples, features) matrix obtained via xr_st_to_matrix.

Parameters

n_layers : int, default 100
    Maximum number of RBIG layers.
strategy : list or None, default None
    Rotation strategy list passed to the underlying RBIG model. If None, the default rotation of the model class is used.
tol : float, default 1e-5
    Convergence tolerance for early stopping.
random_state : int or None, default None
    Random seed for reproducibility.
rbig_class : class or None, default None
    RBIG model class to instantiate. Defaults to AnnealedRBIG when None.
rbig_kwargs : dict or None, default None
    Additional keyword arguments forwarded to rbig_class.

Attributes

model_ : AnnealedRBIG
    The fitted underlying RBIG model.
meta_ : dict
    xarray metadata captured during fit, used to reconstruct output arrays.

Examples

>>> import numpy as np
>>> import xarray as xr
>>> rng = np.random.default_rng(0)
>>> da = xr.DataArray(
...     rng.standard_normal((30, 4, 5)),
...     dims=["time", "lat", "lon"],
... )
>>> xrbig = XarrayRBIG(n_layers=10, random_state=0)
>>> info = xrbig.fit(da)
>>> da_t = xrbig.transform(da)

Source code in rbig/_src/xarray_st.py
class XarrayRBIG:
    """RBIG model with an xarray-aware interface.

    Wraps an :class:`~rbig._src.model.AnnealedRBIG` (or compatible class) so
    that it can be fitted and applied directly to :class:`xarray.DataArray` /
    :class:`xarray.Dataset` objects with spatiotemporal dimensions.  The
    underlying model operates on a 2-D ``(samples, features)`` matrix obtained
    via :func:`xr_st_to_matrix`.

    Parameters
    ----------
    n_layers : int, default 100
        Maximum number of RBIG layers.
    strategy : list or None, default None
        Rotation strategy list passed to the underlying RBIG model.  If
        ``None``, the default rotation of the model class is used.
    tol : float, default 1e-5
        Convergence tolerance for early stopping.
    random_state : int or None, default None
        Random seed for reproducibility.
    rbig_class : class or None, default None
        RBIG model class to instantiate.  Defaults to
        :class:`~rbig._src.model.AnnealedRBIG` when ``None``.
    rbig_kwargs : dict or None, default None
        Additional keyword arguments forwarded to ``rbig_class``.

    Attributes
    ----------
    model_ : AnnealedRBIG
        The fitted underlying RBIG model.
    meta_ : dict
        xarray metadata captured during :meth:`fit`, used to reconstruct
        output arrays.

    Examples
    --------
    >>> import numpy as np
    >>> import xarray as xr
    >>> rng = np.random.default_rng(0)
    >>> da = xr.DataArray(
    ...     rng.standard_normal((30, 4, 5)),
    ...     dims=["time", "lat", "lon"],
    ... )
    >>> xrbig = XarrayRBIG(n_layers=10, random_state=0)
    >>> # info = xrbig.fit(da)
    >>> # da_t = xrbig.transform(da)
    """

    def __init__(
        self,
        n_layers: int = 100,
        strategy: list | None = None,
        tol: float = 1e-5,
        random_state: int | None = None,
        rbig_class=None,
        rbig_kwargs: dict | None = None,
    ):
        self.n_layers = n_layers
        self.strategy = strategy
        self.tol = tol
        self.random_state = random_state
        self.rbig_class = rbig_class
        self.rbig_kwargs = rbig_kwargs or {}

    def fit(self, X) -> dict:
        """Fit the RBIG model to xarray data and return an information summary.

        Parameters
        ----------
        X : xr.DataArray or xr.Dataset
            Input spatiotemporal data.  Internally converted to a 2-D matrix
            via :func:`xr_st_to_matrix`.

        Returns
        -------
        info : dict
            Dictionary of RBIG information metrics (e.g. total correlation,
            entropy estimates) as returned by
            :func:`~rbig._src.metrics.information_summary`.
        """
        from rbig._src.metrics import information_summary
        from rbig._src.model import AnnealedRBIG

        rbig_cls = self.rbig_class if self.rbig_class is not None else AnnealedRBIG
        kwargs = {
            "n_layers": self.n_layers,
            "tol": self.tol,
            "random_state": self.random_state,
        }
        if self.strategy is not None:
            kwargs["strategy"] = self.strategy
        kwargs.update(self.rbig_kwargs)

        # Convert xarray → (n_samples, n_features) matrix and store metadata
        matrix, self.meta_ = xr_st_to_matrix(X)
        self.model_ = rbig_cls(**kwargs)
        self.model_.fit(matrix)
        return information_summary(self.model_, matrix)

    def transform(self, X):
        """Gaussianise samples and return an xarray object.

        Applies the fitted RBIG transform to ``X``, then reconstructs the
        original xarray structure.  Original coordinates and DataArray name
        are re-attached when possible.

        Parameters
        ----------
        X : xr.DataArray or xr.Dataset
            Data to transform.  Must have the same structure as the data
            passed to :meth:`fit`.

        Returns
        -------
        out : xr.DataArray or xr.Dataset
            Gaussianised data with the same shape and dimension names as ``X``.
        """
        matrix, _ = xr_st_to_matrix(X)
        Xt = self.model_.transform(matrix)
        out = matrix_to_xr_st(Xt, self.meta_)
        # Re-attach original xarray coordinates and name when available
        if hasattr(X, "assign_coords") and hasattr(X, "coords"):
            try:
                out = out.assign_coords(X.coords)
            except Exception:
                pass
        if hasattr(X, "name") and hasattr(out, "name") and X.name is not None:
            try:
                out.name = X.name
            except Exception:
                pass
        return out

    def score_samples(self, X):
        """Compute per-sample log-probability log p(x).

        Parameters
        ----------
        X : xr.DataArray or xr.Dataset
            Input data.

        Returns
        -------
        log_prob : np.ndarray, shape ``(n_samples,)``
            Log-probability of each sample under the fitted RBIG model.
        """
        matrix, _ = xr_st_to_matrix(X)
        return self.model_.score_samples(matrix)

    def mutual_information(self, X, Y) -> float:
        """Estimate mutual information between two xarray variables via RBIG.

        Fits independent RBIG models to ``X``, ``Y``, and their concatenation
        ``[X, Y]``, then computes:

        .. math::

            \\mathrm{MI}(X;\\,Y)
            = H(X) + H(Y) - H(X,\\,Y)

        where each differential entropy :math:`H` is estimated from the RBIG
        log-determinant accumulation.

        Parameters
        ----------
        X : xr.DataArray or xr.Dataset
            First variable.
        Y : xr.DataArray or xr.Dataset
            Second variable.  Must have the same number of samples as ``X``
            after flattening.

        Returns
        -------
        mi : float
            Estimated mutual information in nats.

        Notes
        -----
        All three RBIG models share the same ``n_layers``, ``tol``, and
        ``random_state`` settings as the parent :class:`XarrayRBIG` instance.

        Examples
        --------
        >>> import numpy as np
        >>> import xarray as xr
        >>> rng = np.random.default_rng(0)
        >>> x = xr.DataArray(rng.standard_normal((50, 1)), dims=["time", "f"])
        >>> y = xr.DataArray(rng.standard_normal((50, 1)), dims=["time", "f"])
        >>> xrbig = XarrayRBIG(n_layers=5, random_state=0)
        >>> # mi = xrbig.mutual_information(x, y)
        """
        from rbig._src.metrics import entropy_rbig
        from rbig._src.model import AnnealedRBIG

        rbig_cls = self.rbig_class if self.rbig_class is not None else AnnealedRBIG
        kwargs = {
            "n_layers": self.n_layers,
            "tol": self.tol,
            "random_state": self.random_state,
        }
        kwargs.update(self.rbig_kwargs)

        # Flatten both variables to 2-D matrices
        X_mat, _ = xr_st_to_matrix(X)
        Y_mat, _ = xr_st_to_matrix(Y)
        XY_mat = np.hstack([X_mat, Y_mat])  # joint representation (n, dx + dy)

        # Fit three separate RBIG models for H(X), H(Y), H(X,Y)
        mx = rbig_cls(**kwargs).fit(X_mat)
        my = rbig_cls(**kwargs).fit(Y_mat)
        mxy = rbig_cls(**kwargs).fit(XY_mat)

        hx = entropy_rbig(mx, X_mat)  # H(X)
        hy = entropy_rbig(my, Y_mat)  # H(Y)
        hxy = entropy_rbig(mxy, XY_mat)  # H(X, Y)
        # MI(X;Y) = H(X) + H(Y) - H(X,Y)
        return float(hx + hy - hxy)

fit(X)

Fit the RBIG model to xarray data and return an information summary.

Parameters

X : xr.DataArray or xr.Dataset Input spatiotemporal data. Internally converted to a 2-D matrix via xr_st_to_matrix.

Returns

info : dict Dictionary of RBIG information metrics (e.g. total correlation, entropy estimates) as returned by information_summary.

Source code in rbig/_src/xarray_st.py
def fit(self, X) -> dict:
    """Fit the RBIG model to xarray data and return an information summary.

    Parameters
    ----------
    X : xr.DataArray or xr.Dataset
        Input spatiotemporal data.  Internally converted to a 2-D matrix
        via :func:`xr_st_to_matrix`.

    Returns
    -------
    info : dict
        Dictionary of RBIG information metrics (e.g. total correlation,
        entropy estimates) as returned by
        :func:`~rbig._src.metrics.information_summary`.
    """
    from rbig._src.metrics import information_summary
    from rbig._src.model import AnnealedRBIG

    rbig_cls = self.rbig_class if self.rbig_class is not None else AnnealedRBIG
    kwargs = {
        "n_layers": self.n_layers,
        "tol": self.tol,
        "random_state": self.random_state,
    }
    if self.strategy is not None:
        kwargs["strategy"] = self.strategy
    kwargs.update(self.rbig_kwargs)

    # Convert xarray → (n_samples, n_features) matrix and store metadata
    matrix, self.meta_ = xr_st_to_matrix(X)
    self.model_ = rbig_cls(**kwargs)
    self.model_.fit(matrix)
    return information_summary(self.model_, matrix)

mutual_information(X, Y)

Estimate mutual information between two xarray variables via RBIG.

Fits independent RBIG models to X, Y, and their concatenation [X, Y], then computes:

MI(X; Y) = H(X) + H(Y) − H(X, Y)

where each differential entropy H is estimated from the RBIG log-determinant accumulation.

Parameters

X : xr.DataArray or xr.Dataset
    First variable.
Y : xr.DataArray or xr.Dataset
    Second variable. Must have the same number of samples as X after flattening.

Returns

mi : float Estimated mutual information in nats.

Notes

All three RBIG models share the same n_layers, tol, and random_state settings as the parent XarrayRBIG instance.

Examples

>>> import numpy as np
>>> import xarray as xr
>>> rng = np.random.default_rng(0)
>>> x = xr.DataArray(rng.standard_normal((50, 1)), dims=["time", "f"])
>>> y = xr.DataArray(rng.standard_normal((50, 1)), dims=["time", "f"])
>>> xrbig = XarrayRBIG(n_layers=5, random_state=0)
>>> mi = xrbig.mutual_information(x, y)

Source code in rbig/_src/xarray_st.py
def mutual_information(self, X, Y) -> float:
    """Estimate mutual information between two xarray variables via RBIG.

    Fits independent RBIG models to ``X``, ``Y``, and their concatenation
    ``[X, Y]``, then computes:

    .. math::

        \\mathrm{MI}(X;\\,Y)
        = H(X) + H(Y) - H(X,\\,Y)

    where each differential entropy :math:`H` is estimated from the RBIG
    log-determinant accumulation.

    Parameters
    ----------
    X : xr.DataArray or xr.Dataset
        First variable.
    Y : xr.DataArray or xr.Dataset
        Second variable.  Must have the same number of samples as ``X``
        after flattening.

    Returns
    -------
    mi : float
        Estimated mutual information in nats.

    Notes
    -----
    All three RBIG models share the same ``n_layers``, ``tol``, and
    ``random_state`` settings as the parent :class:`XarrayRBIG` instance.

    Examples
    --------
    >>> import numpy as np
    >>> import xarray as xr
    >>> rng = np.random.default_rng(0)
    >>> x = xr.DataArray(rng.standard_normal((50, 1)), dims=["time", "f"])
    >>> y = xr.DataArray(rng.standard_normal((50, 1)), dims=["time", "f"])
    >>> xrbig = XarrayRBIG(n_layers=5, random_state=0)
    >>> # mi = xrbig.mutual_information(x, y)
    """
    from rbig._src.metrics import entropy_rbig
    from rbig._src.model import AnnealedRBIG

    rbig_cls = self.rbig_class if self.rbig_class is not None else AnnealedRBIG
    kwargs = {
        "n_layers": self.n_layers,
        "tol": self.tol,
        "random_state": self.random_state,
    }
    kwargs.update(self.rbig_kwargs)

    # Flatten both variables to 2-D matrices
    X_mat, _ = xr_st_to_matrix(X)
    Y_mat, _ = xr_st_to_matrix(Y)
    XY_mat = np.hstack([X_mat, Y_mat])  # joint representation (n, dx + dy)

    # Fit three separate RBIG models for H(X), H(Y), H(X,Y)
    mx = rbig_cls(**kwargs).fit(X_mat)
    my = rbig_cls(**kwargs).fit(Y_mat)
    mxy = rbig_cls(**kwargs).fit(XY_mat)

    hx = entropy_rbig(mx, X_mat)  # H(X)
    hy = entropy_rbig(my, Y_mat)  # H(Y)
    hxy = entropy_rbig(mxy, XY_mat)  # H(X, Y)
    # MI(X;Y) = H(X) + H(Y) - H(X,Y)
    return float(hx + hy - hxy)
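As a sanity check on the entropy decomposition above: for jointly Gaussian variables, MI(X; Y) has the closed form −½ log(1 − ρ²), which H(X) + H(Y) − H(X, Y) reproduces exactly. A minimal numpy sketch, independent of RBIG (the `gaussian_entropy` helper here is illustrative, not part of the library):

```python
import numpy as np

def gaussian_entropy(cov: np.ndarray) -> float:
    """Closed-form differential entropy of N(mu, cov) in nats."""
    cov = np.atleast_2d(cov)
    d = cov.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * (d * (1 + np.log(2 * np.pi)) + logdet)

rho = 0.8
cov_xy = np.array([[1.0, rho], [rho, 1.0]])  # joint covariance of (X, Y)

# MI(X; Y) = H(X) + H(Y) - H(X, Y)
mi = gaussian_entropy(np.eye(1)) + gaussian_entropy(np.eye(1)) - gaussian_entropy(cov_xy)

# For a bivariate Gaussian this equals -0.5 * log(1 - rho^2)
print(np.isclose(mi, -0.5 * np.log(1 - rho**2)))  # True
```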

score_samples(X)

Compute per-sample log-probability log p(x).

Parameters

X : xr.DataArray or xr.Dataset Input data.

Returns

log_prob : np.ndarray, shape (n_samples,) Log-probability of each sample under the fitted RBIG model.

Source code in rbig/_src/xarray_st.py
def score_samples(self, X):
    """Compute per-sample log-probability log p(x).

    Parameters
    ----------
    X : xr.DataArray or xr.Dataset
        Input data.

    Returns
    -------
    log_prob : np.ndarray, shape ``(n_samples,)``
        Log-probability of each sample under the fitted RBIG model.
    """
    matrix, _ = xr_st_to_matrix(X)
    return self.model_.score_samples(matrix)
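score_samples relies on the change-of-variables formula log p(x) = log p_Z(f(x)) + log|det J_f(x)| stated at the top of this page. A minimal 1-D illustration with an affine map z = (x − μ)/σ, a toy stand-in for the RBIG flow rather than library code:

```python
import numpy as np

mu, sigma = 2.0, 3.0
x = np.array([2.0, 5.0, -1.0])

# Forward map and its log-abs-Jacobian: z = (x - mu) / sigma, dz/dx = 1/sigma
z = (x - mu) / sigma
log_det = -np.log(sigma)

# log p(x) = log N(z; 0, 1) + log|dz/dx|
log_p = (-0.5 * z**2 - 0.5 * np.log(2 * np.pi)) + log_det

# Agrees with the N(mu, sigma^2) log-density evaluated directly
direct = -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma) - 0.5 * np.log(2 * np.pi)
print(np.allclose(log_p, direct))  # True
```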

transform(X)

Gaussianise samples and return an xarray object.

Applies the fitted RBIG transform to X, then reconstructs the original xarray structure. Original coordinates and DataArray name are re-attached when possible.

Parameters

X : xr.DataArray or xr.Dataset Data to transform. Must have the same structure as the data passed to fit.

Returns

out : xr.DataArray or xr.Dataset Gaussianised data with the same shape and dimension names as X.

Source code in rbig/_src/xarray_st.py
def transform(self, X):
    """Gaussianise samples and return an xarray object.

    Applies the fitted RBIG transform to ``X``, then reconstructs the
    original xarray structure.  Original coordinates and DataArray name
    are re-attached when possible.

    Parameters
    ----------
    X : xr.DataArray or xr.Dataset
        Data to transform.  Must have the same structure as the data
        passed to :meth:`fit`.

    Returns
    -------
    out : xr.DataArray or xr.Dataset
        Gaussianised data with the same shape and dimension names as ``X``.
    """
    matrix, _ = xr_st_to_matrix(X)
    Xt = self.model_.transform(matrix)
    out = matrix_to_xr_st(Xt, self.meta_)
    # Re-attach original xarray coordinates and name when available
    if hasattr(X, "assign_coords") and hasattr(X, "coords"):
        try:
            out = out.assign_coords(X.coords)
        except Exception:
            pass
    if hasattr(X, "name") and hasattr(out, "name") and X.name is not None:
        try:
            out.name = X.name
        except Exception:
            pass
    return out

beta(n_samples=1000, a=2.0, b=2.0, random_state=None)

Sample from a Beta distribution on the interval (0, 1).

Parameters

n_samples : int, optional (default=1000)
    Number of samples to draw.
a : float, optional (default=2.0)
    First shape parameter α > 0.
b : float, optional (default=2.0)
    Second shape parameter β > 0.
random_state : int or None, optional (default=None)
    Seed for the random number generator.

Returns

samples : np.ndarray of shape (n_samples,) Independent draws from Beta(a, b). Values are in (0, 1).

Examples

>>> from rbig._src.parametric import beta
>>> x = beta(n_samples=400, a=2.0, b=5.0, random_state=4)
>>> x.shape
(400,)
>>> import numpy as np
>>> np.all((x > 0) & (x < 1))
True

Source code in rbig/_src/parametric.py
def beta(
    n_samples: int = 1000,
    a: float = 2.0,
    b: float = 2.0,
    random_state: int | None = None,
) -> np.ndarray:
    """Sample from a Beta distribution on the interval (0, 1).

    Parameters
    ----------
    n_samples : int, optional (default=1000)
        Number of samples to draw.
    a : float, optional (default=2.0)
        First shape parameter α > 0.
    b : float, optional (default=2.0)
        Second shape parameter β > 0.
    random_state : int or None, optional (default=None)
        Seed for the random number generator.

    Returns
    -------
    samples : np.ndarray of shape (n_samples,)
        Independent draws from Beta(a, b).  Values are in (0, 1).

    Examples
    --------
    >>> from rbig._src.parametric import beta
    >>> x = beta(n_samples=400, a=2.0, b=5.0, random_state=4)
    >>> x.shape
    (400,)
    >>> import numpy as np
    >>> np.all((x > 0) & (x < 1))
    True
    """
    rng = np.random.default_rng(random_state)
    return rng.beta(a=a, b=b, size=n_samples)

bin_estimation(n_samples, rule='sturge')

Estimate the number of histogram bins for a given sample size.

Parameters

n_samples : int
    Number of data samples.
rule : str, optional (default="sturge")
    Bin estimation rule. One of:

- ``"sturge"`` : ``ceil(1 + log2(n))``
- ``"sqrt"``   : ``ceil(sqrt(n))``
- ``"rice"``   : ``ceil(2 * n^(1/3))``

Returns

n_bins : int Estimated number of bins (minimum 1).

Raises

ValueError If rule is not one of the recognised strings or n_samples < 1.

Source code in rbig/_src/densities.py
def bin_estimation(n_samples: int, rule: str = "sturge") -> int:
    """Estimate the number of histogram bins for a given sample size.

    Parameters
    ----------
    n_samples : int
        Number of data samples.
    rule : str, optional (default="sturge")
        Bin estimation rule. One of:

        - ``"sturge"`` : ``ceil(1 + log2(n))``
        - ``"sqrt"``   : ``ceil(sqrt(n))``
        - ``"rice"``   : ``ceil(2 * n^(1/3))``

    Returns
    -------
    n_bins : int
        Estimated number of bins (minimum 1).

    Raises
    ------
    ValueError
        If *rule* is not one of the recognised strings or *n_samples* < 1.
    """
    if n_samples < 1:
        raise ValueError(f"n_samples must be >= 1, got {n_samples}")
    if rule == "sturge":
        n_bins = 1 + np.log2(n_samples)
    elif rule == "sqrt":
        n_bins = np.sqrt(n_samples)
    elif rule == "rice":
        n_bins = 2 * n_samples ** (1 / 3)
    else:
        raise ValueError(
            f"Unknown bin estimation rule {rule!r}. "
            f"Choose from 'sturge', 'sqrt', 'rice'."
        )
    return max(1, int(np.ceil(n_bins)))
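The three rules can be checked with a quick self-contained sketch that re-implements the formulas above rather than importing rbig:

```python
import numpy as np

def bins(n: int, rule: str) -> int:
    # Mirrors the rules documented above
    if rule == "sturge":
        raw = 1 + np.log2(n)
    elif rule == "sqrt":
        raw = np.sqrt(n)
    elif rule == "rice":
        raw = 2 * n ** (1 / 3)
    else:
        raise ValueError(rule)
    return max(1, int(np.ceil(raw)))

# For n = 1000: sturge -> 11, sqrt -> 32, rice -> 20
for rule in ("sturge", "sqrt", "rice"):
    print(rule, bins(1000, rule))
```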

check_density(model, X, n_grid=1000)

Verify that the model density approximately integrates to one.

Uses importance sampling with a standard-Gaussian proposal q = 𝒩(0, I):

𝔼_q[p(x)/q(x)] = ∫ p(x) dx ≈ 1

An estimate close to 1 confirms that the model is a properly normalised density; values far from 1 suggest numerical issues in the model.

Parameters

model : fitted RBIG model
    Must expose a score_samples(Z) method that returns log p(z) for each row of Z.
X : np.ndarray of shape (n_samples, n_features)
    Reference data used only to infer the feature dimensionality.
n_grid : int, optional (default=1000)
    Number of Monte-Carlo samples drawn from the proposal q.

Returns

estimate : float Importance-weighted average 𝔼_q[p(x)/q(x)]. Should be close to 1 for a well-calibrated model.

Notes

Log-ratios log(p/q) are clipped to [−500, 500] before exponentiation to prevent numerical overflow or underflow.

Examples

>>> import numpy as np
>>> from rbig._src.densities import check_density
>>> # A mock model that returns N(0, I) log-density
>>> class GaussModel:
...     def score_samples(self, Z):
...         return -0.5 * np.sum(Z**2, axis=1) - Z.shape[1] * 0.5 * np.log(2 * np.pi)
>>> X_ref = np.zeros((10, 2))
>>> est = check_density(GaussModel(), X_ref, n_grid=5000)
>>> np.isclose(est, 1.0, atol=0.15)
True

Source code in rbig/_src/densities.py
def check_density(model, X: np.ndarray, n_grid: int = 1000) -> float:
    """Verify that the model density approximately integrates to one.

    Uses importance sampling with a standard-Gaussian proposal q = 𝒩(0, I):

        𝔼_q[p(x)/q(x)] = ∫ p(x) dx ≈ 1

    An estimate close to 1 confirms that the model is a properly normalised
    density; values far from 1 suggest numerical issues in the model.

    Parameters
    ----------
    model : fitted RBIG model
        Must expose a ``score_samples(Z)`` method that returns
        log p(z) for each row of Z.
    X : np.ndarray of shape (n_samples, n_features)
        Reference data used only to infer the feature dimensionality.
    n_grid : int, optional (default=1000)
        Number of Monte-Carlo samples drawn from the proposal q.

    Returns
    -------
    estimate : float
        Importance-weighted average 𝔼_q[p(x)/q(x)].  Should be close to 1
        for a well-calibrated model.

    Notes
    -----
    Log-ratios log(p/q) are clipped to [−500, 500] before exponentiation to
    prevent numerical overflow or underflow.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.densities import check_density
    >>> # A mock model that returns N(0,I) log-density
    >>> class GaussModel:
    ...     def score_samples(self, Z):
    ...         return -0.5 * np.sum(Z**2, axis=1) - Z.shape[1] * 0.5 * np.log(
    ...             2 * np.pi
    ...         )
    >>> X_ref = np.zeros((10, 2))
    >>> est = check_density(GaussModel(), X_ref, n_grid=5000)
    >>> np.isclose(est, 1.0, atol=0.15)
    True
    """
    # Sample from the proposal q = N(0, I_d)
    Z = stats.norm.rvs(size=(n_grid, X.shape[1]))  # shape (n_grid, d)
    log_p = model.score_samples(Z)  # log p(z), shape (n_grid,)
    log_q = np.sum(stats.norm.logpdf(Z), axis=1)  # log q(z) = ∑ᵢ log N(zᵢ)
    log_ratio = log_p - log_q  # log(p/q), shape (n_grid,)
    log_ratio = np.clip(log_ratio, -500, 500)  # numerical safety
    return float(np.mean(np.exp(log_ratio)))  # E_q[p/q] ~= integral p dx

dirichlet(n_samples=1000, alpha=None, random_state=None)

Sample from a Dirichlet distribution.

The Dirichlet is a multivariate generalisation of the Beta distribution. Each sample is a probability vector (non-negative, sums to 1).

Parameters

n_samples : int, optional (default=1000)
    Number of samples to draw.
alpha : np.ndarray of shape (k,) or None, optional
    Concentration parameter vector α. All entries must be > 0. Defaults to np.ones(3) (uniform Dirichlet on 3-simplex).
random_state : int or None, optional (default=None)
    Seed for the random number generator.

Returns

samples : np.ndarray of shape (n_samples, k) Independent draws from Dir(alpha). Each row sums to 1 and all entries are non-negative.

Examples

>>> import numpy as np
>>> from rbig._src.parametric import dirichlet
>>> x = dirichlet(n_samples=200, alpha=np.array([1.0, 2.0, 3.0]), random_state=8)
>>> x.shape
(200, 3)
>>> np.allclose(x.sum(axis=1), 1.0)
True

Source code in rbig/_src/parametric.py
def dirichlet(
    n_samples: int = 1000,
    alpha: np.ndarray | None = None,
    random_state: int | None = None,
) -> np.ndarray:
    """Sample from a Dirichlet distribution.

    The Dirichlet is a multivariate generalisation of the Beta distribution.
    Each sample is a probability vector (non-negative, sums to 1).

    Parameters
    ----------
    n_samples : int, optional (default=1000)
        Number of samples to draw.
    alpha : np.ndarray of shape (k,) or None, optional
        Concentration parameter vector α.  All entries must be > 0.
        Defaults to ``np.ones(3)`` (uniform Dirichlet on 3-simplex).
    random_state : int or None, optional (default=None)
        Seed for the random number generator.

    Returns
    -------
    samples : np.ndarray of shape (n_samples, k)
        Independent draws from Dir(alpha).  Each row sums to 1 and all
        entries are non-negative.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.parametric import dirichlet
    >>> x = dirichlet(n_samples=200, alpha=np.array([1.0, 2.0, 3.0]), random_state=8)
    >>> x.shape
    (200, 3)
    >>> np.allclose(x.sum(axis=1), 1.0)
    True
    """
    rng = np.random.default_rng(random_state)
    if alpha is None:
        alpha = np.ones(3)  # default: uniform on 3-simplex
    return rng.dirichlet(alpha=alpha, size=n_samples)

entropy_gaussian(cov)

Analytic differential entropy of a multivariate Gaussian.

Computes the closed-form entropy of 𝒩(μ, Σ):

H(X) = ½ log|2πeΣ| = ½ (d(1 + log 2π) + log|Σ|)

where d is the dimensionality and |·| denotes the matrix determinant. The mean μ does not affect the entropy.

Parameters

cov : np.ndarray of shape (d, d) or (1,) for scalar Covariance matrix Σ (or variance for d=1). Coerced to at least 2-D via np.atleast_2d.

Returns

entropy : float Differential entropy in nats. Returns -inf if Σ is singular or not positive definite.

Notes

np.linalg.slogdet is used for numerically stable log-determinant computation.

Examples

>>> import numpy as np
>>> from rbig._src.parametric import entropy_gaussian
>>> # 2-D standard Gaussian: H = 0.5 * 2 * (1 + log 2π) ≈ 2.838 nats
>>> h = entropy_gaussian(np.eye(2))
>>> np.isclose(h, 0.5 * 2 * (1 + np.log(2 * np.pi)))
True

Source code in rbig/_src/parametric.py
def entropy_gaussian(cov: np.ndarray) -> float:
    """Analytic differential entropy of a multivariate Gaussian.

    Computes the closed-form entropy of 𝒩(μ, Σ):

        H(X) = ½ log|2πeΣ| = ½ (d(1 + log 2π) + log|Σ|)

    where d is the dimensionality and |·| denotes the matrix determinant.
    The mean μ does not affect the entropy.

    Parameters
    ----------
    cov : np.ndarray of shape (d, d) or (1,) for scalar
        Covariance matrix Σ (or variance for d=1).  Coerced to at least 2-D
        via ``np.atleast_2d``.

    Returns
    -------
    entropy : float
        Differential entropy in nats.  Returns ``-inf`` if Σ is singular or
        not positive definite.

    Notes
    -----
    ``np.linalg.slogdet`` is used for numerically stable log-determinant
    computation.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.parametric import entropy_gaussian
    >>> # 2-D standard Gaussian: H = 0.5 * 2 * (1 + log 2π) ≈ 2.838 nats
    >>> h = entropy_gaussian(np.eye(2))
    >>> np.isclose(h, 0.5 * 2 * (1 + np.log(2 * np.pi)))
    True
    """
    cov = np.atleast_2d(cov)
    d = cov.shape[0]
    sign, log_det = np.linalg.slogdet(cov)  # stable log|Σ|
    if sign <= 0:
        return -np.inf  # singular covariance
    # H = ½ (d(1 + log 2π) + log|Σ|)
    return 0.5 * (d * (1 + np.log(2 * np.pi)) + log_det)

entropy_histogram(X, rule='sturge', correction=True, base=np.e)

Per-dimension differential entropy via histogram binning.

For each feature, computes the discrete entropy of the histogram counts and adds the log of the bin width to convert to a differential entropy estimate. Optionally applies the Miller-Maddow finite-sample bias correction.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Data matrix.
rule : str, optional (default="sturge")
    Bin count rule passed to bin_estimation.
correction : bool, optional (default=True)
    If True, apply the Miller-Maddow correction: + 0.5 * (nonzero_bins - 1) / n_samples.
base : float, optional (default=np.e)
    Logarithm base for entropy. Use np.e for nats, 2 for bits.

Returns

H : np.ndarray of shape (n_features,) Estimated differential entropy per feature.

Source code in rbig/_src/densities.py
def entropy_histogram(
    X: np.ndarray,
    rule: str = "sturge",
    correction: bool = True,
    base: float = np.e,
) -> np.ndarray:
    """Per-dimension differential entropy via histogram binning.

    For each feature, computes the discrete entropy of the histogram counts
    and adds the log of the bin width to convert to a differential entropy
    estimate.  Optionally applies the Miller-Maddow finite-sample bias
    correction.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data matrix.
    rule : str, optional (default="sturge")
        Bin count rule passed to :func:`bin_estimation`.
    correction : bool, optional (default=True)
        If True, apply the Miller-Maddow correction:
        ``+ 0.5 * (nonzero_bins - 1) / n_samples``.
    base : float, optional (default=np.e)
        Logarithm base for entropy.  Use ``np.e`` for nats, ``2`` for bits.

    Returns
    -------
    H : np.ndarray of shape (n_features,)
        Estimated differential entropy per feature.
    """
    n_samples, n_features = X.shape
    n_bins = bin_estimation(n_samples, rule=rule)
    H = np.zeros(n_features)

    for i in range(n_features):
        col = X[:, i]
        counts, bin_edges = np.histogram(col, bins=n_bins, range=(col.min(), col.max()))
        bin_width = bin_edges[1] - bin_edges[0]

        # Discrete entropy of the histogram counts
        H[i] = stats.entropy(counts, base=base)

        # Convert to differential entropy by adding log(bin_width)
        if base == np.e:
            H[i] += np.log(bin_width)
        else:
            H[i] += np.log(bin_width) / np.log(base)

        # Miller-Maddow bias correction (scaled to match entropy units)
        if correction:
            nonzero_bins = np.sum(counts > 0)
            mm = 0.5 * (nonzero_bins - 1) / n_samples
            if base != np.e:
                mm /= np.log(base)
            H[i] += mm

    return H
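Since this function has no inline example, here is a hedged standalone sketch of the same idea (discrete histogram entropy plus log bin width, without the Miller-Maddow term) on standard-normal data, whose true differential entropy is ½ log(2πe) ≈ 1.419 nats. This re-implements the estimate with plain numpy rather than calling rbig:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(2000)

# Sturges' rule: ceil(1 + log2(n)) bins
n_bins = int(np.ceil(1 + np.log2(x.size)))
counts, edges = np.histogram(x, bins=n_bins)
width = edges[1] - edges[0]

# Discrete entropy of bin probabilities, then + log(bin width)
p = counts / counts.sum()
p = p[p > 0]
h = -np.sum(p * np.log(p)) + np.log(width)

print(round(h, 3))  # close to 0.5 * log(2 * pi * e) ≈ 1.419
```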

entropy_marginal(X)

Per-dimension marginal entropy using the Vasicek spacing estimator.

Applies entropy_univariate independently to each column of X.

Parameters

X : np.ndarray of shape (n_samples, n_features) Data matrix.

Returns

entropies : np.ndarray of shape (n_features,) Vasicek entropy estimate (nats) for each feature dimension.

Examples

>>> import numpy as np
>>> from rbig._src.metrics import entropy_marginal
>>> rng = np.random.default_rng(9)
>>> X = rng.standard_normal((800, 3))
>>> h = entropy_marginal(X)
>>> h.shape
(3,)

Source code in rbig/_src/metrics.py
def entropy_marginal(X: np.ndarray) -> np.ndarray:
    """Per-dimension marginal entropy using the Vasicek spacing estimator.

    Applies :func:`entropy_univariate` independently to each column of X.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data matrix.

    Returns
    -------
    entropies : np.ndarray of shape (n_features,)
        Vasicek entropy estimate (nats) for each feature dimension.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.metrics import entropy_marginal
    >>> rng = np.random.default_rng(9)
    >>> X = rng.standard_normal((800, 3))
    >>> h = entropy_marginal(X)
    >>> h.shape
    (3,)
    """
    n_features = X.shape[1]
    # Apply 1-D Vasicek estimator to each column independently
    return np.array([entropy_univariate(X[:, i]) for i in range(n_features)])

entropy_normal_approx(X)

Entropy via Gaussian approximation H(X) ≈ ½ log|2πeΣ|.

Fits a multivariate Gaussian with the empirical covariance of X and returns the analytic Gaussian entropy. This is a fast, closed-form approximation that is exact only when X is truly Gaussian.

Parameters

X : np.ndarray of shape (n_samples, n_features) Data matrix.

Returns

entropy : float
    Gaussian approximation to the differential entropy in nats.

Examples

>>> import numpy as np
>>> from rbig._src.metrics import entropy_normal_approx
>>> rng = np.random.default_rng(7)
>>> X = rng.standard_normal((500, 2))
>>> h = entropy_normal_approx(X)
>>> np.isfinite(h)
True

Source code in rbig/_src/metrics.py
def entropy_normal_approx(X: np.ndarray) -> float:
    """Entropy via Gaussian approximation H(X) ≈ ½ log|2πeΣ|.

    Fits a multivariate Gaussian with the empirical covariance of X and
    returns the analytic Gaussian entropy.  This is a fast, closed-form
    approximation that is exact only when X is truly Gaussian.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data matrix.

    Returns
    -------
    entropy : float
        Gaussian approximation to the differential entropy in nats.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.metrics import entropy_normal_approx
    >>> rng = np.random.default_rng(7)
    >>> X = rng.standard_normal((500, 2))
    >>> h = entropy_normal_approx(X)
    >>> np.isfinite(h)
    True
    """
    from rbig._src.densities import joint_entropy_gaussian

    return joint_entropy_gaussian(X)

entropy_quantile_spacing(x, n_quantiles=100)

Estimate univariate differential entropy via quantile spacings.

Divides the empirical distribution into n_quantiles equal-probability intervals and uses the spacing between adjacent quantile values as a proxy for the local probability mass:

Ĥ ≈ mean_k log(Δq_k · n_quantiles)

where Δq_k = q_{k+1} − q_k are the differences between successive quantile values.

Parameters

x : np.ndarray of shape (n_samples,)
    1-D array of observations.
n_quantiles : int, optional (default=100)
    Number of equal-probability quantile intervals.

Returns

entropy : float
    Estimated differential entropy in nats. Returns 0.0 if all quantile spacings are zero (i.e. the data is constant).

Notes

The endpoints (0 and 1) are excluded to avoid quantile boundary effects. A small constant (1e-300) guards against log(0) for repeated quantile values.

Examples

>>> import numpy as np
>>> from rbig._src.metrics import entropy_quantile_spacing
>>> rng = np.random.default_rng(4)
>>> x = rng.standard_normal(2000)
>>> h = entropy_quantile_spacing(x, n_quantiles=200)
>>> # Gaussian entropy ≈ 1.419 nats
>>> np.isclose(h, 0.5 * (1 + np.log(2 * np.pi)), atol=0.15)
True

Source code in rbig/_src/metrics.py
def entropy_quantile_spacing(x: np.ndarray, n_quantiles: int = 100) -> float:
    """Estimate univariate differential entropy via quantile spacings.

    Divides the empirical distribution into ``n_quantiles`` equal-probability
    intervals and uses the spacing between adjacent quantile values as a proxy
    for the local probability mass:

        Ĥ ≈ mean_k log(Δq_k · n_quantiles)

    where Δq_k = q_{k+1} − q_k are the differences between successive
    quantile values.

    Parameters
    ----------
    x : np.ndarray of shape (n_samples,)
        1-D array of observations.
    n_quantiles : int, optional (default=100)
        Number of equal-probability quantile intervals.

    Returns
    -------
    entropy : float
        Estimated differential entropy in nats.  Returns ``0.0`` if all
        quantile spacings are zero (i.e. the data is constant).

    Notes
    -----
    The endpoints (0 and 1) are excluded to avoid quantile boundary effects.
    A small constant (1e-300) guards against log(0) for repeated quantile
    values.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.metrics import entropy_quantile_spacing
    >>> rng = np.random.default_rng(4)
    >>> x = rng.standard_normal(2000)
    >>> h = entropy_quantile_spacing(x, n_quantiles=200)
    >>> # Gaussian entropy ≈ 1.419 nats
    >>> np.isclose(h, 0.5 * (1 + np.log(2 * np.pi)), atol=0.15)
    True
    """
    # Evenly spaced probability levels, excluding 0 and 1
    quantiles = np.linspace(0, 1, n_quantiles + 2)[1:-1]  # length n_quantiles
    q = np.quantile(x, quantiles)  # empirical quantile values
    diffs = np.diff(q)  # Dq_k = q_{k+1} - q_k
    diffs = diffs[diffs > 0]  # keep positive spacings only
    if len(diffs) == 0:
        return 0.0  # constant data
    # Ĥ ≈ mean log(Δqₖ · n_quantiles)
    return float(np.mean(np.log(diffs * n_quantiles + 1e-300)))
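The body above is compact enough to sanity-check standalone. The sketch below re-implements the same quantile-spacing rule in plain NumPy (the helper name `quantile_spacing_entropy` is illustrative, not part of the library) and evaluates it on uniform data, whose true differential entropy is exactly 0 nats:

```python
import numpy as np

def quantile_spacing_entropy(x, n_quantiles=100):
    """Quantile-spacing entropy: Ĥ ≈ mean_k log(Δq_k · n_quantiles)."""
    # Equal-probability levels, excluding the 0 and 1 endpoints
    levels = np.linspace(0, 1, n_quantiles + 2)[1:-1]
    q = np.quantile(x, levels)      # empirical quantile values
    diffs = np.diff(q)              # Δq_k = q_{k+1} − q_k
    diffs = diffs[diffs > 0]        # drop tied quantiles
    if len(diffs) == 0:
        return 0.0                  # constant data
    return float(np.mean(np.log(diffs * n_quantiles + 1e-300)))

rng = np.random.default_rng(0)
x = rng.uniform(size=5000)          # H(U(0, 1)) = 0 nats
h = quantile_spacing_entropy(x, n_quantiles=100)
```

Roughly speaking, a larger `n_quantiles` trades bias for variance: finer intervals track the density more closely, but each spacing is estimated from fewer samples.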

entropy_rbig(model, X)

Estimate differential entropy of X using a fitted RBIG model.

Approximates the entropy via the plug-in estimator:

H(X) = −𝔼[log p(x)] ≈ −(1/N) ∑ᵢ log p(xᵢ)

where log p(xᵢ) is provided by model.score_samples.

Parameters

model : AnnealedRBIG
    RBIG model fitted on data from the same distribution as X. Must expose a score_samples(X) method returning per-sample log probabilities.
X : np.ndarray of shape (n_samples, n_features)
    Evaluation data used to compute the empirical expectation.

Returns

entropy : float
    Estimated differential entropy in nats.

Examples

>>> # Assumes a pre-fitted AnnealedRBIG model.
>>> h = entropy_rbig(fitted_model, X_test)
>>> h > 0  # entropy is typically positive for continuous distributions
True

Source code in rbig/_src/metrics.py
def entropy_rbig(model: AnnealedRBIG, X: np.ndarray) -> float:
    """Estimate differential entropy of X using a fitted RBIG model.

    Approximates the entropy via the plug-in estimator:

        H(X) = −𝔼[log p(x)] ≈ −(1/N) ∑ᵢ log p(xᵢ)

    where log p(xᵢ) is provided by ``model.score_samples``.

    Parameters
    ----------
    model : AnnealedRBIG
        RBIG model fitted on data from the same distribution as X.  Must
        expose a ``score_samples(X)`` method returning per-sample log
        probabilities.
    X : np.ndarray of shape (n_samples, n_features)
        Evaluation data used to compute the empirical expectation.

    Returns
    -------
    entropy : float
        Estimated differential entropy in nats.

    Examples
    --------
    >>> # Assumes a pre-fitted AnnealedRBIG model.
    >>> h = entropy_rbig(fitted_model, X_test)
    >>> h > 0  # entropy is typically positive for continuous distributions
    True
    """
    log_probs = model.score_samples(X)  # log p(xᵢ) for each sample, shape (N,)
    return float(-np.mean(log_probs))  # H ~= -(1/N) sum log p(xi)
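Because the plug-in estimator needs only per-sample log-densities, it can be verified without a fitted RBIG model by substituting an analytic log-density for `model.score_samples`. A minimal sketch, assuming a standard normal target, which should recover H = ½(1 + log 2π) ≈ 1.419 nats:

```python
import numpy as np

# Analytic log-density of N(0, 1), standing in for model.score_samples
def log_pdf_std_normal(x):
    return -0.5 * (x ** 2 + np.log(2 * np.pi))

rng = np.random.default_rng(0)
x = rng.standard_normal(5000)

# Plug-in estimator: H(X) ≈ −(1/N) Σᵢ log p(xᵢ)
h = float(-np.mean(log_pdf_std_normal(x)))
h_true = 0.5 * (1 + np.log(2 * np.pi))   # exact entropy of N(0, 1)
```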

entropy_reduction(X_before, X_after)

Compute the reduction in Total Correlation between two representations.

Measures how much statistical dependence is removed by a transformation:

ΔTC = TC(X_before) − TC(X_after)

A positive value indicates that the transformation has reduced the total correlation (i.e. made the dimensions more independent).

Parameters

X_before : np.ndarray of shape (n_samples, n_features)
    Data matrix before the transformation.
X_after : np.ndarray of shape (n_samples, n_features)
    Data matrix after the transformation. Must have the same number of samples and features as X_before.

Returns

delta_tc : float
    Reduction in total correlation (nats). Positive values indicate increased statistical independence after the transformation.

Notes

Both TC values are estimated using total_correlation, which internally uses KDE for marginal entropies and a Gaussian approximation for the joint entropy.

Examples

>>> import numpy as np
>>> from rbig._src.densities import entropy_reduction
>>> rng = np.random.default_rng(1)
>>> # Correlated before, independent after whitening
>>> A = np.array([[1.0, 0.9], [0.9, 1.0]])
>>> X = rng.multivariate_normal([0, 0], A, size=500)
>>> X_white = (X - X.mean(0)) / X.std(0)
>>> entropy_reduction(X, X_white) >= 0
True

Source code in rbig/_src/densities.py
def entropy_reduction(X_before: np.ndarray, X_after: np.ndarray) -> float:
    """Compute the reduction in Total Correlation between two representations.

    Measures how much statistical dependence is removed by a transformation:

        ΔTC = TC(X_before) − TC(X_after)

    A positive value indicates that the transformation has reduced the total
    correlation (i.e. made the dimensions more independent).

    Parameters
    ----------
    X_before : np.ndarray of shape (n_samples, n_features)
        Data matrix before the transformation.
    X_after : np.ndarray of shape (n_samples, n_features)
        Data matrix after the transformation.  Must have the same number of
        samples and features as ``X_before``.

    Returns
    -------
    delta_tc : float
        Reduction in total correlation (nats).  Positive values indicate
        increased statistical independence after the transformation.

    Notes
    -----
    Both TC values are estimated using :func:`total_correlation`, which
    internally uses KDE for marginal entropies and a Gaussian approximation
    for the joint entropy.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.densities import entropy_reduction
    >>> rng = np.random.default_rng(1)
    >>> # Correlated before, independent after whitening
    >>> A = np.array([[1.0, 0.9], [0.9, 1.0]])
    >>> X = rng.multivariate_normal([0, 0], A, size=500)
    >>> X_white = (X - X.mean(0)) / X.std(0)
    >>> entropy_reduction(X, X_white) >= 0
    True
    """
    tc_before = total_correlation(X_before)  # TC of original representation
    tc_after = total_correlation(X_after)  # TC of transformed representation
    return tc_before - tc_after  # DeltaTC = TC_before - TC_after

entropy_univariate(x)

Univariate differential entropy via the Vasicek spacing estimator.

The Vasicek (1976) estimator uses order-statistic spacings:

Ĥ = (1/N) ∑ᵢ log[ (N / (2m)) · (x_{(i+m)} − x_{(i−m)}) ]

where N is the sample size, m = ⌊√(N/2)⌋ is the window half-width, and x_{(i)} denotes the i-th order statistic. In practice the spacings are formed as x_sorted[2*m:] − x_sorted[:N−2*m], pairing order statistics 2m indices apart.

Parameters

x : np.ndarray of shape (n_samples,)
    1-D array of observations.

Returns

entropy : float
    Estimated differential entropy in nats.

Notes

A small constant (1e-300) is added inside the log to prevent −∞ for duplicate values. The window width m = ⌊√(N/2)⌋ follows a commonly used heuristic that balances bias and variance.

References

Vasicek, O. (1976). A test for normality based on sample entropy. Journal of the Royal Statistical Society B, 38(1), 54–59.

Examples

>>> import numpy as np
>>> from rbig._src.metrics import entropy_univariate
>>> rng = np.random.default_rng(5)
>>> x = rng.standard_normal(1000)
>>> h = entropy_univariate(x)
>>> # Gaussian entropy is 0.5*(1 + log 2π) ≈ 1.419 nats
>>> np.isclose(h, 0.5 * (1 + np.log(2 * np.pi)), atol=0.1)
True

Source code in rbig/_src/metrics.py
def entropy_univariate(x: np.ndarray) -> float:
    """Univariate differential entropy via the Vasicek spacing estimator.

    The Vasicek (1976) estimator uses order-statistic spacings:

        Ĥ = (1/N) ∑ᵢ log[ (N / (2m)) · (x_{(i+m)} − x_{(i−m)}) ]

    where N is the sample size, m = ⌊√(N/2)⌋ is the window half-width, and
    x_{(i)} denotes the i-th order statistic.  In practice the spacings are
    formed as ``x_sorted[2*m:] − x_sorted[:N−2*m]``.

    Parameters
    ----------
    x : np.ndarray of shape (n_samples,)
        1-D array of observations.

    Returns
    -------
    entropy : float
        Estimated differential entropy in nats.

    Notes
    -----
    A small constant (1e-300) is added inside the log to prevent ``−∞`` for
    duplicate values.  The window width m = ⌊√(N/2)⌋ follows a commonly used
    heuristic that balances bias and variance.

    References
    ----------
    Vasicek, O. (1976). A test for normality based on sample entropy.
    *Journal of the Royal Statistical Society B*, 38(1), 54–59.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.metrics import entropy_univariate
    >>> rng = np.random.default_rng(5)
    >>> x = rng.standard_normal(1000)
    >>> h = entropy_univariate(x)
    >>> # Gaussian entropy is 0.5*(1 + log 2π) ≈ 1.419 nats
    >>> np.isclose(h, 0.5 * (1 + np.log(2 * np.pi)), atol=0.1)
    True
    """
    n = len(x)
    m = max(1, int(np.floor(np.sqrt(n / 2))))  # window half-width
    x_sorted = np.sort(x)  # x_{(1)}, …, x_{(N)}
    # 2m-wide spacing: x_{(i+m)} - x_{(i-m)}, length N-2m.  An m-wide spacing
    # scaled by N/(2m) would be biased low by log 2 relative to the formula.
    diffs = x_sorted[2 * m :] - x_sorted[: n - 2 * m]
    # Ĥ = mean of log[(N/2m) · spacing]
    h = np.mean(np.log(n / (2 * m) * diffs + 1e-300))
    return float(h)
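As an extra consistency check (not part of the library), the estimator can be run standalone on uniform data, whose differential entropy is exactly 0 nats. The sketch below forms the 2m-wide spacings x_{(i+m)} − x_{(i−m)} from the formula above:

```python
import numpy as np

def vasicek(x):
    """Vasicek estimator with 2m-wide order-statistic spacings."""
    n = len(x)
    m = max(1, int(np.floor(np.sqrt(n / 2))))   # window half-width heuristic
    xs = np.sort(x)
    spacings = xs[2 * m:] - xs[:n - 2 * m]      # x_(i+m) − x_(i−m)
    return float(np.mean(np.log(n / (2 * m) * spacings + 1e-300)))

rng = np.random.default_rng(0)
h = vasicek(rng.uniform(size=5000))             # H(U(0, 1)) = 0 nats
```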

exponential(n_samples=1000, scale=1.0, random_state=None)

Sample from an exponential distribution with given scale (mean).

Parameters

n_samples : int, optional (default=1000)
    Number of samples to draw.
scale : float, optional (default=1.0)
    Scale parameter β = 1/λ (mean of the distribution). Must be > 0.
random_state : int or None, optional (default=None)
    Seed for the random number generator.

Returns

samples : np.ndarray of shape (n_samples,)
    Independent draws from Exp(1/scale). All values are non-negative.

Examples

>>> from rbig._src.parametric import exponential
>>> x = exponential(n_samples=500, scale=2.0, random_state=1)
>>> x.shape
(500,)
>>> import numpy as np
>>> np.all(x >= 0)
True

Source code in rbig/_src/parametric.py
def exponential(
    n_samples: int = 1000,
    scale: float = 1.0,
    random_state: int | None = None,
) -> np.ndarray:
    """Sample from an exponential distribution with given scale (mean).

    Parameters
    ----------
    n_samples : int, optional (default=1000)
        Number of samples to draw.
    scale : float, optional (default=1.0)
        Scale parameter β = 1/λ (mean of the distribution).  Must be > 0.
    random_state : int or None, optional (default=None)
        Seed for the random number generator.

    Returns
    -------
    samples : np.ndarray of shape (n_samples,)
        Independent draws from Exp(1/scale).  All values are non-negative.

    Examples
    --------
    >>> from rbig._src.parametric import exponential
    >>> x = exponential(n_samples=500, scale=2.0, random_state=1)
    >>> x.shape
    (500,)
    >>> import numpy as np
    >>> np.all(x >= 0)
    True
    """
    rng = np.random.default_rng(random_state)
    return rng.exponential(scale=scale, size=n_samples)

extract_patches(image, patch_size=8, step=1)

Extract overlapping square patches from a 2-D or 3-D image.

Uses a sliding-window approach with configurable stride. For a grayscale image of shape (H, W) the number of patches along each axis is ⌊(H - patch_size) / step⌋ + 1.

Parameters

image : np.ndarray, shape (H, W) or (H, W, C)
    Input image. Grayscale images are 2-D; colour/multi-band images carry a trailing channel axis C.
patch_size : int, default 8
    Side length (in pixels) of each square patch. Both height and width of every extracted patch equal patch_size.
step : int, default 1
    Stride between consecutive patch origins along each spatial axis. step=1 gives maximally overlapping patches; step=patch_size gives non-overlapping (tiling) patches.

Returns

patches : np.ndarray
    Shape (n_patches, patch_size, patch_size) for 2-D input, or (n_patches, patch_size, patch_size, C) for 3-D input, where n_patches = n_row_patches * n_col_patches and n_row_patches = ⌊(H - patch_size) / step⌋ + 1.

Raises

ValueError
    If image.ndim is not 2 or 3.

Notes

The patch at row i, column j covers pixels image[i·s : i·s + p, j·s : j·s + p], where s = step and p = patch_size.

Examples

>>> import numpy as np
>>> img = np.random.default_rng(0).standard_normal((32, 32))
>>> patches = extract_patches(img, patch_size=8, step=8)
>>> patches.shape  # 4×4 non-overlapping grid
(16, 8, 8)

>>> img_rgb = np.random.default_rng(1).standard_normal((32, 32, 3))
>>> p = extract_patches(img_rgb, patch_size=8, step=8)
>>> p.shape
(16, 8, 8, 3)

Source code in rbig/_src/image.py
def extract_patches(
    image: np.ndarray,
    patch_size: int = 8,
    step: int = 1,
) -> np.ndarray:
    """Extract overlapping square patches from a 2-D or 3-D image.

    Uses a sliding-window approach with configurable stride.  For a grayscale
    image of shape ``(H, W)`` the number of patches along each axis is
    ``⌊(H - patch_size) / step⌋ + 1``.

    Parameters
    ----------
    image : np.ndarray, shape ``(H, W)`` or ``(H, W, C)``
        Input image.  Grayscale images are 2-D; colour/multi-band images carry
        a trailing channel axis ``C``.
    patch_size : int, default 8
        Side length (in pixels) of each square patch.  Both height and width
        of every extracted patch equal ``patch_size``.
    step : int, default 1
        Stride between consecutive patch origins along each spatial axis.
        ``step=1`` gives maximally overlapping patches; ``step=patch_size``
        gives non-overlapping (tiling) patches.

    Returns
    -------
    patches : np.ndarray
        - shape ``(n_patches, patch_size, patch_size)`` for 2-D input
        - shape ``(n_patches, patch_size, patch_size, C)`` for 3-D input

        where ``n_patches = n_row_patches * n_col_patches`` and
        ``n_row_patches = ⌊(H - patch_size) / step⌋ + 1``.

    Raises
    ------
    ValueError
        If ``image.ndim`` is not 2 or 3.

    Notes
    -----
    The patch at row ``i``, column ``j`` covers pixels

    .. math::

        \\text{patch}_{i,j} = \\text{image}[i \\cdot s : i \\cdot s + p,\\;
                                              j \\cdot s : j \\cdot s + p]

    where :math:`s` = ``step`` and :math:`p` = ``patch_size``.

    Examples
    --------
    >>> import numpy as np
    >>> img = np.random.default_rng(0).standard_normal((32, 32))
    >>> patches = extract_patches(img, patch_size=8, step=8)
    >>> patches.shape  # 4×4 non-overlapping grid
    (16, 8, 8)

    >>> img_rgb = np.random.default_rng(1).standard_normal((32, 32, 3))
    >>> p = extract_patches(img_rgb, patch_size=8, step=8)
    >>> p.shape
    (16, 8, 8, 3)
    """
    if image.ndim == 2:
        H, W = image.shape
        patches = []
        for i in range(0, H - patch_size + 1, step):
            for j in range(0, W - patch_size + 1, step):
                # Slice a (patch_size, patch_size) window from the image
                patches.append(image[i : i + patch_size, j : j + patch_size])
        return np.array(patches)
    elif image.ndim == 3:
        H, W, _C = image.shape
        patches = []
        for i in range(0, H - patch_size + 1, step):
            for j in range(0, W - patch_size + 1, step):
                # Slice a (patch_size, patch_size, C) window, keeping all channels
                patches.append(image[i : i + patch_size, j : j + patch_size, :])
        return np.array(patches)
    else:
        raise ValueError(f"Image must be 2D or 3D, got {image.ndim}D")
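For large images the double Python loop can be costly. A vectorized alternative for the 2-D case is sketched below using `numpy.lib.stride_tricks.sliding_window_view` (NumPy ≥ 1.20); the function name `extract_patches_2d_fast` is illustrative, not part of the library:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def extract_patches_2d_fast(image, patch_size=8, step=1):
    """Vectorized equivalent of the 2-D branch above."""
    # All (patch_size, patch_size) windows with origin stride 1 ...
    windows = sliding_window_view(image, (patch_size, patch_size))
    # ... then keep only every `step`-th origin along each axis
    windows = windows[::step, ::step]
    return windows.reshape(-1, patch_size, patch_size)

img = np.random.default_rng(0).standard_normal((32, 32))
# ⌊(32 − 8) / 8⌋ + 1 = 4 patch origins per axis → 16 patches total
patches = extract_patches_2d_fast(img, patch_size=8, step=8)
```

Note that the final `reshape` materializes a copy, since the strided view is not contiguous.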

gamma(n_samples=1000, shape=2.0, scale=1.0, random_state=None)

Sample from a Gamma distribution.

Parameters

n_samples : int, optional (default=1000)
    Number of samples to draw.
shape : float, optional (default=2.0)
    Shape parameter k > 0 (also called α).
scale : float, optional (default=1.0)
    Scale parameter θ > 0. The mean of the distribution is k · θ.
random_state : int or None, optional (default=None)
    Seed for the random number generator.

Returns

samples : np.ndarray of shape (n_samples,)
    Independent draws from Gamma(shape, scale). All values are positive.

Examples

>>> from rbig._src.parametric import gamma
>>> x = gamma(n_samples=500, shape=3.0, scale=2.0, random_state=0)
>>> x.shape
(500,)
>>> import numpy as np
>>> np.all(x > 0)
True

Source code in rbig/_src/parametric.py
def gamma(
    n_samples: int = 1000,
    shape: float = 2.0,
    scale: float = 1.0,
    random_state: int | None = None,
) -> np.ndarray:
    """Sample from a Gamma distribution.

    Parameters
    ----------
    n_samples : int, optional (default=1000)
        Number of samples to draw.
    shape : float, optional (default=2.0)
        Shape parameter k > 0 (also called α).
    scale : float, optional (default=1.0)
        Scale parameter θ > 0.  The mean of the distribution is k · θ.
    random_state : int or None, optional (default=None)
        Seed for the random number generator.

    Returns
    -------
    samples : np.ndarray of shape (n_samples,)
        Independent draws from Gamma(shape, scale).  All values are positive.

    Examples
    --------
    >>> from rbig._src.parametric import gamma
    >>> x = gamma(n_samples=500, shape=3.0, scale=2.0, random_state=0)
    >>> x.shape
    (500,)
    >>> import numpy as np
    >>> np.all(x > 0)
    True
    """
    rng = np.random.default_rng(random_state)
    return rng.gamma(shape=shape, scale=scale, size=n_samples)

gaussian(n_samples=1000, loc=0.0, scale=1.0, random_state=None)

Sample from a univariate Gaussian (normal) distribution.

Parameters

n_samples : int, optional (default=1000)
    Number of samples to draw.
loc : float, optional (default=0.0)
    Mean μ of the distribution.
scale : float, optional (default=1.0)
    Standard deviation σ > 0 of the distribution.
random_state : int or None, optional (default=None)
    Seed for the random number generator. Pass an integer for reproducible results.

Returns

samples : np.ndarray of shape (n_samples,)
    Independent draws from 𝒩(loc, scale²).

Examples

>>> from rbig._src.parametric import gaussian
>>> x = gaussian(n_samples=500, loc=2.0, scale=0.5, random_state=0)
>>> x.shape
(500,)
>>> import numpy as np
>>> np.isclose(x.mean(), 2.0, atol=0.1)
True

Source code in rbig/_src/parametric.py
def gaussian(
    n_samples: int = 1000,
    loc: float = 0.0,
    scale: float = 1.0,
    random_state: int | None = None,
) -> np.ndarray:
    """Sample from a univariate Gaussian (normal) distribution.

    Parameters
    ----------
    n_samples : int, optional (default=1000)
        Number of samples to draw.
    loc : float, optional (default=0.0)
        Mean μ of the distribution.
    scale : float, optional (default=1.0)
        Standard deviation σ > 0 of the distribution.
    random_state : int or None, optional (default=None)
        Seed for the random number generator.  Pass an integer for
        reproducible results.

    Returns
    -------
    samples : np.ndarray of shape (n_samples,)
        Independent draws from 𝒩(loc, scale²).

    Examples
    --------
    >>> from rbig._src.parametric import gaussian
    >>> x = gaussian(n_samples=500, loc=2.0, scale=0.5, random_state=0)
    >>> x.shape
    (500,)
    >>> import numpy as np
    >>> np.isclose(x.mean(), 2.0, atol=0.1)
    True
    """
    rng = np.random.default_rng(random_state)
    return rng.normal(loc=loc, scale=scale, size=n_samples)

gaussian_entropy(n_features, cov=None)

Analytic entropy of a multivariate Gaussian distribution.

Computes the differential entropy of 𝒩(μ, Σ):

H = ½ log|2πeΣ| = ½ (d(1 + log 2π) + log|Σ|)

where d = n_features and |·| denotes the determinant.

Parameters

n_features : int
    Dimensionality d of the distribution.
cov : np.ndarray of shape (n_features, n_features) or None
    Covariance matrix Σ. If None, the identity matrix is assumed (i.e. Σ = I_d), and the entropy reduces to ½ · d · (1 + log 2π).

Returns

entropy : float
    Differential entropy in nats. Returns -inf if the covariance matrix is singular or not positive definite.

Notes

np.linalg.slogdet is used for numerically stable log-determinant computation.

Examples

>>> import numpy as np
>>> from rbig._src.densities import gaussian_entropy
>>> # Identity covariance: H = 0.5 * d * (1 + log 2π)
>>> h = gaussian_entropy(2, cov=np.eye(2))
>>> np.isclose(h, 0.5 * 2 * (1 + np.log(2 * np.pi)))
True
>>> # None ⟹ identity covariance assumed
>>> gaussian_entropy(2) == gaussian_entropy(2, cov=np.eye(2))
True

Source code in rbig/_src/densities.py
def gaussian_entropy(n_features: int, cov: np.ndarray | None = None) -> float:
    """Analytic entropy of a multivariate Gaussian distribution.

    Computes the differential entropy of 𝒩(μ, Σ):

        H = ½ log|2πeΣ| = ½ (d(1 + log 2π) + log|Σ|)

    where d = ``n_features`` and |·| denotes the determinant.

    Parameters
    ----------
    n_features : int
        Dimensionality d of the distribution.
    cov : np.ndarray of shape (n_features, n_features) or None
        Covariance matrix Σ.  If ``None``, the identity matrix is assumed
        (i.e. Σ = I_d), and the entropy reduces to
        ``½ · d · (1 + log 2π)``.

    Returns
    -------
    entropy : float
        Differential entropy in nats.  Returns ``-inf`` if the covariance
        matrix is singular or not positive definite.

    Notes
    -----
    ``np.linalg.slogdet`` is used for numerically stable log-determinant
    computation.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.densities import gaussian_entropy
    >>> # Identity covariance: H = 0.5 * d * (1 + log 2π)
    >>> h = gaussian_entropy(2, cov=np.eye(2))
    >>> np.isclose(h, 0.5 * 2 * (1 + np.log(2 * np.pi)))
    True
    >>> # None ⟹ identity covariance assumed
    >>> gaussian_entropy(2) == gaussian_entropy(2, cov=np.eye(2))
    True
    """
    if cov is None:
        # Σ = I_d  ⟹  log|Σ| = 0
        return 0.5 * n_features * (1 + np.log(2 * np.pi))
    sign, log_det = np.linalg.slogdet(cov)  # stable log|Σ|
    if sign <= 0:
        return -np.inf  # singular covariance
    # H = ½ (d(1 + log 2π) + log|Σ|)
    return 0.5 * (n_features * (1 + np.log(2 * np.pi)) + log_det)
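For a diagonal covariance the determinant factorizes, so the joint entropy equals the sum of the univariate Gaussian entropies ½ log(2πe σᵢ²). A quick standalone check that the slogdet route used above agrees with the factorized form:

```python
import numpy as np

d = 3
sigmas2 = np.array([1.0, 4.0, 0.25])   # per-dimension variances
cov = np.diag(sigmas2)

# Route 1: H = ½ (d(1 + log 2π) + log|Σ|) via a stable log-determinant
sign, log_det = np.linalg.slogdet(cov)
h_joint = 0.5 * (d * (1 + np.log(2 * np.pi)) + log_det)

# Route 2: sum of univariate entropies ½ log(2πe σᵢ²)
h_sum = float(np.sum(0.5 * np.log(2 * np.pi * np.e * sigmas2)))
```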

generate_batches(n_samples, batch_size)

Yield (start, end) index pairs that partition a range into batches.

Parameters

n_samples : int
    Total number of samples.
batch_size : int
    Maximum number of samples per batch.

Yields

start : int
    Start index of the batch (inclusive).
end : int
    End index of the batch (exclusive).

Raises

ValueError
    If n_samples < 0 or batch_size < 1.

Source code in rbig/_src/densities.py
def generate_batches(n_samples: int, batch_size: int) -> Iterator[tuple[int, int]]:
    """Yield (start, end) index pairs that partition a range into batches.

    Parameters
    ----------
    n_samples : int
        Total number of samples.
    batch_size : int
        Maximum number of samples per batch.

    Yields
    ------
    start : int
        Start index of the batch (inclusive).
    end : int
        End index of the batch (exclusive).

    Raises
    ------
    ValueError
        If *n_samples* < 0 or *batch_size* < 1.
    """
    if n_samples < 0:
        raise ValueError(f"n_samples must be >= 0, got {n_samples}")
    if batch_size < 1:
        raise ValueError(f"batch_size must be >= 1, got {batch_size}")
    for start in range(0, n_samples, batch_size):
        yield start, min(start + batch_size, n_samples)
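A standalone sketch of the same partition logic, showing that the half-open intervals tile [0, n_samples) and the final batch absorbs any remainder:

```python
def batches(n_samples, batch_size):
    """Yield (start, end) pairs covering [0, n_samples) in order."""
    for start in range(0, n_samples, batch_size):
        # The final batch is clipped to n_samples, so it may be shorter
        yield start, min(start + batch_size, n_samples)

# 10 samples in batches of 4: the last batch holds the 2 leftover samples
pairs = list(batches(10, 4))
```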

information_reduction(X_before, X_after)

Compute reduction in Total Correlation between two representations.

Measures how much statistical dependence is removed by a transformation:

ΔTC = TC(X_before) − TC(X_after)

where TC is computed using KDE-based marginal entropies and a Gaussian approximation for the joint entropy.

Parameters

X_before : np.ndarray of shape (n_samples, n_features)
    Data matrix prior to the transformation.
X_after : np.ndarray of shape (n_samples, n_features)
    Data matrix after the transformation.

Returns

delta_tc : float
    Reduction in total correlation (nats). A positive value indicates that the transformation increased statistical independence.

Examples

>>> import numpy as np
>>> from rbig._src.metrics import information_reduction
>>> rng = np.random.default_rng(2)
>>> cov = np.array([[1.0, 0.8], [0.8, 1.0]])
>>> X = rng.multivariate_normal([0, 0], cov, size=400)
>>> X_white = (X - X.mean(0)) / X.std(0)
>>> information_reduction(X, X_white) >= 0
True

Source code in rbig/_src/metrics.py
def information_reduction(X_before: np.ndarray, X_after: np.ndarray) -> float:
    """Compute reduction in Total Correlation between two representations.

    Measures how much statistical dependence is removed by a transformation:

        ΔTC = TC(X_before) − TC(X_after)

    where TC is computed using KDE-based marginal entropies and a Gaussian
    approximation for the joint entropy.

    Parameters
    ----------
    X_before : np.ndarray of shape (n_samples, n_features)
        Data matrix prior to the transformation.
    X_after : np.ndarray of shape (n_samples, n_features)
        Data matrix after the transformation.

    Returns
    -------
    delta_tc : float
        Reduction in total correlation (nats).  A positive value indicates
        that the transformation increased statistical independence.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.metrics import information_reduction
    >>> rng = np.random.default_rng(2)
    >>> cov = np.array([[1.0, 0.8], [0.8, 1.0]])
    >>> X = rng.multivariate_normal([0, 0], cov, size=400)
    >>> X_white = (X - X.mean(0)) / X.std(0)
    >>> information_reduction(X, X_white) >= 0
    True
    """
    from rbig._src.densities import (
        joint_entropy_gaussian,
        marginal_entropy as _marginal_entropy,
    )

    def _tc(X):
        # TC(X) = sum H(Xi) - H(X)
        return float(np.sum(_marginal_entropy(X)) - joint_entropy_gaussian(X))

    return _tc(X_before) - _tc(X_after)  # DeltaTC = TC_before - TC_after
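For jointly Gaussian data, TC has the closed form −½ log|R|, where R is the correlation matrix (since |Σ| = |R| · Πᵢ σᵢ²). The standalone sketch below uses that closed form (not the library's KDE-based estimator) to show that an exact PCA rotation removes essentially all of the total correlation of a correlated Gaussian:

```python
import numpy as np

def gaussian_tc(X):
    """TC of a Gaussian fit: −½ log|R|, with R the correlation matrix."""
    R = np.corrcoef(X.T)
    return -0.5 * np.linalg.slogdet(R)[1]

rng = np.random.default_rng(0)
cov = np.array([[1.0, 0.8], [0.8, 1.0]])
X = rng.multivariate_normal([0.0, 0.0], cov, size=4000)

# Rotate onto the eigenvectors of the empirical covariance (PCA rotation)
_, vecs = np.linalg.eigh(np.cov(X.T))
X_rot = X @ vecs                       # empirically decorrelated

delta_tc = gaussian_tc(X) - gaussian_tc(X_rot)
```

For ρ = 0.8 the closed form gives TC ≈ −½ log(1 − 0.8²) ≈ 0.51 nats before rotation, and ≈ 0 after.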

information_summary(model, X)

Compute a summary of information-theoretic quantities from a RBIG model.

Estimates three quantities from the fitted model and the data:

  • entropy H(X) = −𝔼[log p(x)]
  • total_correlation TC = ∑ᵢ H(Xᵢ) − H(X)
  • neg_log_likelihood NLL = −(1/N) ∑ log p(xᵢ)

Parameters

model : AnnealedRBIG
    RBIG model fitted on data from the same distribution as X. Must expose score_samples(X).
X : np.ndarray of shape (n_samples, n_features)
    Evaluation data.

Returns

summary : dict
    Dictionary with the following keys:
    'entropy' : float
        Differential entropy H(X) in nats.
    'total_correlation' : float
        Total correlation TC = ∑ᵢ H(Xᵢ) − H(X) in nats.
    'neg_log_likelihood' : float
        Average negative log-likelihood in nats.

Notes

Marginal entropies H(Xᵢ) are estimated via KDE (see rbig._src.densities.marginal_entropy); joint entropy H(X) is obtained from the RBIG model.

Examples

>>> summary = information_summary(fitted_model, X)
>>> set(summary.keys()) == {"entropy", "total_correlation", "neg_log_likelihood"}
True

Source code in rbig/_src/metrics.py
def information_summary(model: AnnealedRBIG, X: np.ndarray) -> dict:
    """Compute a summary of information-theoretic quantities from a RBIG model.

    Estimates three quantities from the fitted model and the data:

    * **entropy** H(X) = −𝔼[log p(x)]
    * **total_correlation** TC = ∑ᵢ H(Xᵢ) − H(X)
    * **neg_log_likelihood** NLL = −(1/N) ∑ log p(xᵢ)

    Parameters
    ----------
    model : AnnealedRBIG
        RBIG model fitted on data from the same distribution as X.  Must
        expose ``score_samples(X)``.
    X : np.ndarray of shape (n_samples, n_features)
        Evaluation data.

    Returns
    -------
    summary : dict
        Dictionary with the following keys:

        ``'entropy'`` : float
            Differential entropy H(X) in nats.
        ``'total_correlation'`` : float
            Total correlation TC = ∑ᵢ H(Xᵢ) − H(X) in nats.
        ``'neg_log_likelihood'`` : float
            Average negative log-likelihood in nats.

    Notes
    -----
    Marginal entropies H(Xᵢ) are estimated via KDE (see
    :func:`rbig._src.densities.marginal_entropy`); joint entropy H(X) is
    obtained from the RBIG model.

    Examples
    --------
    >>> summary = information_summary(fitted_model, X)
    >>> set(summary.keys()) == {"entropy", "total_correlation", "neg_log_likelihood"}
    True
    """
    from rbig._src.densities import marginal_entropy as _marginal_entropy

    h = entropy_rbig(model, X)  # H(X) via RBIG model
    marginal_h = _marginal_entropy(X)  # H(Xᵢ) via KDE, shape (n_features,)
    tc = float(np.sum(marginal_h) - h)  # TC = sum H(Xi) - H(X)
    return {
        "entropy": h,
        "total_correlation": tc,
        "neg_log_likelihood": negative_log_likelihood(model, X),
    }

joint_entropy_gaussian(X)

Entropy of a multivariate Gaussian fitted to the covariance of X.

Treats the data as if it were drawn from a multivariate Gaussian with the empirical covariance Σ = cov(X) and computes the analytic entropy:

H(X) = ½ log|2πeΣ| = ½ (d(1 + log 2π) + log|Σ|)

where d is the number of features and |·| denotes the matrix determinant.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Data matrix used to estimate the covariance.

Returns

entropy : float
    Differential entropy (nats) of the fitted Gaussian.  Returns -inf if the covariance matrix is singular or not positive definite.

Notes

The log-determinant is computed via np.linalg.slogdet for numerical stability. When the matrix is singular (sign ≤ 0), -inf is returned.

Examples

>>> import numpy as np
>>> from rbig._src.densities import joint_entropy_gaussian
>>> rng = np.random.default_rng(42)
>>> X = rng.standard_normal((1000, 2))
>>> h = joint_entropy_gaussian(X)
>>> # For N(0, I₂): H = ½(2(1 + log 2π)) ≈ 2.838 nats
>>> np.isfinite(h)
True

Source code in rbig/_src/densities.py
def joint_entropy_gaussian(X: np.ndarray) -> float:
    """Entropy of a multivariate Gaussian fitted to the covariance of X.

    Treats the data as if it were drawn from a multivariate Gaussian with the
    empirical covariance Σ = cov(X) and computes the analytic entropy:

        H(X) = ½ log|2πeΣ| = ½ (d(1 + log 2π) + log|Σ|)

    where d is the number of features and |·| denotes the matrix determinant.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data matrix used to estimate the covariance.

    Returns
    -------
    entropy : float
        Differential entropy (nats) of the fitted Gaussian.  Returns ``-inf``
        if the covariance matrix is singular or not positive definite.

    Notes
    -----
    The log-determinant is computed via ``np.linalg.slogdet`` for numerical
    stability.  When the matrix is singular (sign ≤ 0), ``-inf`` is returned.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.densities import joint_entropy_gaussian
    >>> rng = np.random.default_rng(42)
    >>> X = rng.standard_normal((1000, 2))
    >>> h = joint_entropy_gaussian(X)
    >>> # For N(0, I₂): H = ½(2(1+log 2π)) ≈ 2.838 nats
    >>> np.isfinite(h)
    True
    """
    _n, d = X.shape
    cov = np.cov(X.T)  # empirical covariance, shape (d, d)
    if d == 1:
        cov = np.array([[cov]])  # ensure 2-D for slogdet
    sign, log_det = np.linalg.slogdet(cov)  # stable log|Σ|
    if sign <= 0:
        log_det = -np.inf  # singular / not PD covariance
    # H = ½ (d(1 + log 2π) + log|Σ|)
    return 0.5 * (d * (1 + np.log(2 * np.pi)) + log_det)

kl_divergence_gaussian(mu0, cov0, mu1, cov1)

Analytic KL divergence KL(P₀ ‖ P₁) between two multivariate Gaussians.

Both distributions are assumed to be multivariate Gaussian:

P₀ = 𝒩(μ₀, Σ₀)   and   P₁ = 𝒩(μ₁, Σ₁)

The closed-form KL divergence is:

KL(P₀ ‖ P₁) = ½ [ tr(Σ₁⁻¹Σ₀) + (μ₁ − μ₀)ᵀ Σ₁⁻¹ (μ₁ − μ₀)
                   − d + log(|Σ₁| / |Σ₀|) ]

Parameters

mu0 : np.ndarray of shape (d,)
    Mean of the source distribution P₀.
cov0 : np.ndarray of shape (d, d)
    Covariance Σ₀ of the source distribution P₀.
mu1 : np.ndarray of shape (d,)
    Mean of the target distribution P₁.
cov1 : np.ndarray of shape (d, d)
    Covariance Σ₁ of the target distribution P₁.

Returns

kl : float
    KL divergence KL(P₀ ‖ P₁) in nats.  Always non-negative for valid covariance matrices.

Notes

The matrix inverse Σ₁⁻¹ is computed via np.linalg.inv; for large d a Cholesky-based approach would be more numerically stable.

Examples

>>> import numpy as np
>>> from rbig._src.parametric import kl_divergence_gaussian
>>> # KL(P ‖ P) = 0 for identical distributions
>>> mu = np.array([1.0, 2.0])
>>> cov = np.array([[2.0, 0.5], [0.5, 1.5]])
>>> kl = kl_divergence_gaussian(mu, cov, mu, cov)
>>> np.isclose(kl, 0.0, atol=1e-10)
True

Source code in rbig/_src/parametric.py
def kl_divergence_gaussian(
    mu0: np.ndarray,
    cov0: np.ndarray,
    mu1: np.ndarray,
    cov1: np.ndarray,
) -> float:
    """Analytic KL divergence KL(P₀ ‖ P₁) between two multivariate Gaussians.

    Both distributions are assumed to be multivariate Gaussian:

        P₀ = 𝒩(μ₀, Σ₀)   and   P₁ = 𝒩(μ₁, Σ₁)

    The closed-form KL divergence is:

        KL(P₀ ‖ P₁) = ½ [ tr(Σ₁⁻¹Σ₀) + (μ₁ − μ₀)ᵀ Σ₁⁻¹ (μ₁ − μ₀)
                           − d + log(|Σ₁| / |Σ₀|) ]

    Parameters
    ----------
    mu0 : np.ndarray of shape (d,)
        Mean of the *source* distribution P₀.
    cov0 : np.ndarray of shape (d, d)
        Covariance Σ₀ of the source distribution P₀.
    mu1 : np.ndarray of shape (d,)
        Mean of the *target* distribution P₁.
    cov1 : np.ndarray of shape (d, d)
        Covariance Σ₁ of the target distribution P₁.

    Returns
    -------
    kl : float
        KL divergence KL(P₀ ‖ P₁) in nats.  Always non-negative for valid
        covariance matrices.

    Notes
    -----
    The matrix inverse Σ₁⁻¹ is computed via ``np.linalg.inv``; for large d
    a Cholesky-based approach would be more numerically stable.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.parametric import kl_divergence_gaussian
    >>> # KL(P ‖ P) = 0 for identical distributions
    >>> mu = np.array([1.0, 2.0])
    >>> cov = np.array([[2.0, 0.5], [0.5, 1.5]])
    >>> kl = kl_divergence_gaussian(mu, cov, mu, cov)
    >>> np.isclose(kl, 0.0, atol=1e-10)
    True
    """
    mu0 = np.atleast_1d(mu0)
    mu1 = np.atleast_1d(mu1)
    cov0 = np.atleast_2d(cov0)
    cov1 = np.atleast_2d(cov1)
    d = len(mu0)
    cov1_inv = np.linalg.inv(cov1)  # Σ₁⁻¹, shape (d, d)
    diff = mu1 - mu0  # mu1 - mu0, shape (d,)
    _sign0, log_det0 = np.linalg.slogdet(cov0)  # log|Σ₀|
    _sign1, log_det1 = np.linalg.slogdet(cov1)  # log|Σ₁|
    trace_term = np.trace(cov1_inv @ cov0)  # tr(Σ₁⁻¹Σ₀)
    quad_term = diff @ cov1_inv @ diff  # (mu1-mu0)^T Sigma1^-1 (mu1-mu0)
    log_det_term = log_det1 - log_det0  # log(|Sigma1|/|Sigma0|)
    # KL = 0.5 [tr(Sigma1^-1 Sigma0) + quad - d + log(|Sigma1|/|Sigma0|)]
    return float(0.5 * (trace_term + quad_term - d + log_det_term))
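The closed form can be cross-checked against a case with a known answer; a NumPy sketch of the same computation (the function name is illustrative), verified on the 1-D identity KL(𝒩(0, s²) ‖ 𝒩(0, 1)) = ½(s² − 1 − log s²):

```python
import numpy as np

def kl_gaussian_sketch(mu0, cov0, mu1, cov1):
    # Closed-form KL(N(mu0, cov0) || N(mu1, cov1)), mirroring the formula above
    d = len(mu0)
    cov1_inv = np.linalg.inv(cov1)
    diff = mu1 - mu0
    _, ld0 = np.linalg.slogdet(cov0)
    _, ld1 = np.linalg.slogdet(cov1)
    return 0.5 * (np.trace(cov1_inv @ cov0) + diff @ cov1_inv @ diff - d + ld1 - ld0)

# 1-D sanity check against the known closed form
s2 = 2.0
kl = kl_gaussian_sketch(np.zeros(1), np.array([[s2]]), np.zeros(1), np.eye(1))
expected = 0.5 * (s2 - 1 - np.log(s2))
```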

kl_divergence_rbig(model_P, X_Q)

Estimate a divergence between distributions P and Q via a fitted RBIG model.

As implemented, this function returns:

−𝔼_Q[log p(x)] − H(P)

where 𝔼_Q is the expectation over samples X_Q from Q, log p is the log-density of model_P, and H(P) is the entropy of P estimated from the training data. Expanding H(P) = −𝔼_P[log p(x)]:

result = 𝔼_P[log p(x)] − 𝔼_Q[log p(x)]

.. note::
    This quantity is **not** the standard KL divergence KL(P ‖ Q) = 𝔼_P[log p(x)/q(x)], because Q's log-density log q is never evaluated.  The result is a measure of how differently P's log-density scores the P-samples versus the Q-samples.  It equals zero when P and Q assign identical average log-probability under P's model.

Parameters

model_P : AnnealedRBIG
    RBIG model fitted on samples from distribution P.  Must expose score_samples(X) and entropy().
X_Q : np.ndarray of shape (n_samples, n_features)
    Samples drawn from distribution Q against which P is compared.

Returns

divergence : float
    Estimated 𝔼_P[log p(x)] − 𝔼_Q[log p(x)] in nats.

Examples

>>> # When P == Q the divergence should be near zero.
>>> kl = kl_divergence_rbig(model_P, X_from_P)
>>> kl >= -0.1  # small negative values possible due to approximation
True

Source code in rbig/_src/metrics.py
def kl_divergence_rbig(
    model_P: AnnealedRBIG,
    X_Q: np.ndarray,
) -> float:
    """Estimate a divergence between distributions P and Q via a fitted RBIG model.

    As implemented, this function returns:

        −𝔼_Q[log p(x)] − H(P)

    where ``𝔼_Q`` is the expectation over samples ``X_Q`` from Q, ``log p``
    is the log-density of ``model_P``, and ``H(P)`` is the entropy of P
    estimated from the training data.  Expanding ``H(P) = −𝔼_P[log p(x)]``:

        result = 𝔼_P[log p(x)] − 𝔼_Q[log p(x)]

    .. note::
        This quantity is **not** the standard KL divergence
        ``KL(P ‖ Q) = 𝔼_P[log p(x)/q(x)]``, because Q's log-density
        ``log q`` is never evaluated.  The result is a measure of how
        differently P's log-density scores the P-samples versus the
        Q-samples.  It equals zero when P and Q assign identical average
        log-probability under P's model.

    Parameters
    ----------
    model_P : AnnealedRBIG
        RBIG model fitted on samples from distribution P.  Must expose
        ``score_samples(X)`` and ``entropy()``.
    X_Q : np.ndarray of shape (n_samples, n_features)
        Samples drawn from distribution Q against which P is compared.

    Returns
    -------
    divergence : float
        Estimated ``𝔼_P[log p(x)] − 𝔼_Q[log p(x)]`` in nats.

    Examples
    --------
    >>> # When P == Q the divergence should be near zero.
    >>> kl = kl_divergence_rbig(model_P, X_from_P)
    >>> kl >= -0.1  # small negative values possible due to approximation
    True
    """
    # Evaluate log p(x) for samples drawn from Q
    log_pq = model_P.score_samples(X_Q)  # shape (n_samples,)
    hp = model_P.entropy()  # H(P) estimated by RBIG
    return float(-np.mean(log_pq) - hp)  # -E_Q[log p] - H(P)
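The behaviour of this quantity can be illustrated with a plain Gaussian density standing in for the fitted model (`log_p` below is a hypothetical stand-in for `model_P.score_samples`, not the RBIG log-density): it is near zero when Q matches P, and grows with the mismatch.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_p(x):
    # Log-density of the "model" P = N(0, 1), standing in for model_P.score_samples
    return -0.5 * (x**2 + np.log(2 * np.pi))

x_p = rng.standard_normal(50_000)         # training samples from P
x_q = rng.standard_normal(50_000) + 2.0   # samples from a mean-shifted Q

h_p = -np.mean(log_p(x_p))  # H(P) estimated from the training data

# -E_Q[log p] - H(P): ~0 when Q == P, ~mu^2/2 = 2 for the shift mu = 2
div_pp = -np.mean(log_p(rng.standard_normal(50_000))) - h_p
div_pq = -np.mean(log_p(x_q)) - h_p
```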

laplace(n_samples=1000, loc=0.0, scale=1.0, random_state=None)

Sample from a Laplace (double-exponential) distribution.

The Laplace distribution has heavier tails than a Gaussian and is commonly used as a sparsity-inducing prior.

Parameters

n_samples : int, optional (default=1000)
    Number of samples to draw.
loc : float, optional (default=0.0)
    Location parameter μ (mean and median of the distribution).
scale : float, optional (default=1.0)
    Scale parameter b > 0 (the variance is 2b²).
random_state : int or None, optional (default=None)
    Seed for the random number generator.

Returns

samples : np.ndarray of shape (n_samples,)
    Independent draws from Laplace(loc, scale).

Examples

>>> from rbig._src.parametric import laplace
>>> x = laplace(n_samples=400, loc=0.0, scale=1.0, random_state=5)
>>> x.shape
(400,)

Source code in rbig/_src/parametric.py
def laplace(
    n_samples: int = 1000,
    loc: float = 0.0,
    scale: float = 1.0,
    random_state: int | None = None,
) -> np.ndarray:
    """Sample from a Laplace (double-exponential) distribution.

    The Laplace distribution has heavier tails than a Gaussian and is
    commonly used as a sparsity-inducing prior.

    Parameters
    ----------
    n_samples : int, optional (default=1000)
        Number of samples to draw.
    loc : float, optional (default=0.0)
        Location parameter μ (mean and median of the distribution).
    scale : float, optional (default=1.0)
        Scale parameter b > 0 (the variance is 2b²).
    random_state : int or None, optional (default=None)
        Seed for the random number generator.

    Returns
    -------
    samples : np.ndarray of shape (n_samples,)
        Independent draws from Laplace(loc, scale).

    Examples
    --------
    >>> from rbig._src.parametric import laplace
    >>> x = laplace(n_samples=400, loc=0.0, scale=1.0, random_state=5)
    >>> x.shape
    (400,)
    """
    rng = np.random.default_rng(random_state)
    return rng.laplace(loc=loc, scale=scale, size=n_samples)
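The scale parameterisation can be sanity-checked directly against the variance identity Var[Laplace(loc, b)] = 2b² (a sketch using NumPy's own `laplace` sampler, which this function wraps):

```python
import numpy as np

rng = np.random.default_rng(5)
b = 1.5
x = rng.laplace(loc=0.0, scale=b, size=200_000)

# Sample variance should be close to 2 * b**2 = 4.5
sample_var = x.var()
```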

lognormal(n_samples=1000, mean=0.0, sigma=1.0, random_state=None)

Sample from a log-normal distribution.

If Y ~ 𝒩(mean, sigma²) then X = exp(Y) ~ LogNormal(mean, sigma²).

Parameters

n_samples : int, optional (default=1000)
    Number of samples to draw.
mean : float, optional (default=0.0)
    Mean μ of the underlying normal distribution (i.e. log-scale mean).
sigma : float, optional (default=1.0)
    Standard deviation σ > 0 of the underlying normal distribution (i.e. log-scale standard deviation).
random_state : int or None, optional (default=None)
    Seed for the random number generator.

Returns

samples : np.ndarray of shape (n_samples,)
    Independent draws from LogNormal(mean, sigma²).  All values are strictly positive.

Examples

>>> from rbig._src.parametric import lognormal
>>> x = lognormal(n_samples=500, mean=0.0, sigma=0.5, random_state=6)
>>> x.shape
(500,)
>>> import numpy as np
>>> np.all(x > 0)
True

Source code in rbig/_src/parametric.py
def lognormal(
    n_samples: int = 1000,
    mean: float = 0.0,
    sigma: float = 1.0,
    random_state: int | None = None,
) -> np.ndarray:
    """Sample from a log-normal distribution.

    If Y ~ 𝒩(mean, sigma²) then X = exp(Y) ~ LogNormal(mean, sigma²).

    Parameters
    ----------
    n_samples : int, optional (default=1000)
        Number of samples to draw.
    mean : float, optional (default=0.0)
        Mean μ of the underlying normal distribution (i.e. log-scale mean).
    sigma : float, optional (default=1.0)
        Standard deviation σ > 0 of the underlying normal distribution
        (i.e. log-scale standard deviation).
    random_state : int or None, optional (default=None)
        Seed for the random number generator.

    Returns
    -------
    samples : np.ndarray of shape (n_samples,)
        Independent draws from LogNormal(mean, sigma²).  All values are
        strictly positive.

    Examples
    --------
    >>> from rbig._src.parametric import lognormal
    >>> x = lognormal(n_samples=500, mean=0.0, sigma=0.5, random_state=6)
    >>> x.shape
    (500,)
    >>> import numpy as np
    >>> np.all(x > 0)
    True
    """
    rng = np.random.default_rng(random_state)
    return rng.lognormal(mean=mean, sigma=sigma, size=n_samples)

make_cdf_monotonic(cdf)

Enforce monotonicity on an empirical CDF via running maximum.

Replaces each value with the cumulative maximum so that the result is non-decreasing.

Parameters

cdf : np.ndarray
    1-D or 2-D array of CDF values.  For 2-D input, monotonicity is enforced independently along each column (axis 0).

Returns

cdf_monotonic : np.ndarray
    Array of same shape as cdf with non-decreasing values along axis 0.

Source code in rbig/_src/marginal.py
def make_cdf_monotonic(cdf: np.ndarray) -> np.ndarray:
    """Enforce monotonicity on an empirical CDF via running maximum.

    Replaces each value with the cumulative maximum so that the result is
    non-decreasing.

    Parameters
    ----------
    cdf : np.ndarray
        1-D or 2-D array of CDF values.  For 2-D input, monotonicity is
        enforced independently along each column (axis 0).

    Returns
    -------
    cdf_monotonic : np.ndarray
        Array of same shape as *cdf* with non-decreasing values along axis 0.
    """
    return np.maximum.accumulate(cdf, axis=0)
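A quick demonstration of the running maximum on a CDF with a small non-monotonic dip (this is exactly the `np.maximum.accumulate` call used above):

```python
import numpy as np

# Noisy empirical CDF: the value at index 2 dips below its predecessor
cdf = np.array([0.1, 0.4, 0.35, 0.7, 1.0])
cdf_mono = np.maximum.accumulate(cdf, axis=0)
# The dip is replaced by the running maximum: [0.1, 0.4, 0.4, 0.7, 1.0]
```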

marginal_entropy(X, correction=True)

Estimate the marginal (per-dimension) differential entropy using KDE.

For each dimension i, kernel density estimation (KDE) is used to obtain a smooth density estimate f̂(xᵢ), and the entropy is approximated as

H(Xᵢ) = −𝔼[log f̂(xᵢ)] ≈ −(1/N) ∑ⱼ log f̂(xᵢⱼ)

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Data matrix.  Each column is treated as an independent marginal.
correction : bool, optional (default=True)
    Placeholder flag for an optional finite-sample bias correction.  Currently unused in the computation.

Returns

entropies : np.ndarray of shape (n_features,)
    Estimated differential entropy (nats) for each feature dimension.

Notes

The KDE bandwidth is chosen by Scott's rule (scipy default). A small constant (1e-300) is added inside the log to prevent −∞ values when the density estimate is numerically zero.

Examples

>>> import numpy as np
>>> from rbig._src.densities import marginal_entropy
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((500, 3))
>>> h = marginal_entropy(X)
>>> h.shape
(3,)
>>> # Gaussian entropy is 0.5*(1 + ln 2π) ≈ 1.419 nats
>>> np.allclose(h, 0.5 * (1 + np.log(2 * np.pi)), atol=0.15)
True

Source code in rbig/_src/densities.py
def marginal_entropy(X: np.ndarray, correction: bool = True) -> np.ndarray:
    """Estimate the marginal (per-dimension) differential entropy using KDE.

    For each dimension i, kernel density estimation (KDE) is used to obtain
    a smooth density estimate f̂(xᵢ), and the entropy is approximated as

        H(Xᵢ) = −𝔼[log f̂(xᵢ)] ≈ −(1/N) ∑ⱼ log f̂(xᵢⱼ)

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data matrix.  Each column is treated as an independent marginal.
    correction : bool, optional (default=True)
        Placeholder flag for an optional finite-sample bias correction.
        Currently unused in the computation.

    Returns
    -------
    entropies : np.ndarray of shape (n_features,)
        Estimated differential entropy (nats) for each feature dimension.

    Notes
    -----
    The KDE bandwidth is chosen by Scott's rule (scipy default).  A small
    constant (1e-300) is added inside the log to prevent ``−∞`` values when
    the density estimate is numerically zero.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.densities import marginal_entropy
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((500, 3))
    >>> h = marginal_entropy(X)
    >>> h.shape
    (3,)
    >>> # Gaussian entropy is 0.5*(1 + ln 2π) ≈ 1.419 nats
    >>> np.allclose(h, 0.5 * (1 + np.log(2 * np.pi)), atol=0.15)
    True
    """
    _n_samples, n_features = X.shape
    entropies = np.zeros(n_features)
    for i in range(n_features):
        kde = stats.gaussian_kde(X[:, i])  # fit KDE for column i
        log_density = np.log(kde(X[:, i]) + 1e-300)  # evaluate log f̂(xᵢ)
        entropies[i] = -np.mean(log_density)  # H(Xi) ~= -E[log f_hat]
    return entropies
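A minimal single-column sketch of the same resubstitution estimator, using the same `scipy.stats.gaussian_kde` (Scott's-rule bandwidth) as the implementation above, checked against the analytic entropy of a standard normal:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.standard_normal(2_000)

# Resubstitution estimate: H ~= -(1/N) * sum_j log f_hat(x_j)
kde = stats.gaussian_kde(x)
h = -np.mean(np.log(kde(x) + 1e-300))

# True entropy of N(0, 1): 0.5 * (1 + log 2*pi) ~ 1.419 nats
```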

matrix_to_patches(matrix, patch_shape)

Reshape a 2-D feature matrix back into a stack of image patches.

This is the inverse of :func:patches_to_matrix.

Parameters

matrix : np.ndarray, shape (n_patches, n_features)
    Flattened patch matrix where n_features = prod(patch_shape).
patch_shape : tuple of int
    Target shape for each individual patch, e.g. (h, w) or (h, w, C).  The product of all elements must equal matrix.shape[1].

Returns

patches : np.ndarray, shape (n_patches, *patch_shape)
    Stack of patches with the original spatial structure restored.

Notes

The reshape satisfies

.. math::

    \mathbf{P}[i] = \operatorname{unvec}(\mathbf{M}[i, :],\, h, w)

Examples

>>> import numpy as np
>>> rng = np.random.default_rng(0)
>>> mat = rng.standard_normal((100, 64))
>>> patches = matrix_to_patches(mat, patch_shape=(8, 8))
>>> patches.shape
(100, 8, 8)

Source code in rbig/_src/image.py
def matrix_to_patches(
    matrix: np.ndarray,
    patch_shape: tuple[int, ...],
) -> np.ndarray:
    """Reshape a 2-D feature matrix back into a stack of image patches.

    This is the inverse of :func:`patches_to_matrix`.

    Parameters
    ----------
    matrix : np.ndarray, shape ``(n_patches, n_features)``
        Flattened patch matrix where ``n_features = prod(patch_shape)``.
    patch_shape : tuple of int
        Target shape for each individual patch, e.g. ``(h, w)`` or
        ``(h, w, C)``.  The product of all elements must equal
        ``matrix.shape[1]``.

    Returns
    -------
    patches : np.ndarray, shape ``(n_patches, *patch_shape)``
        Stack of patches with the original spatial structure restored.

    Notes
    -----
    The reshape satisfies

    .. math::

        \\mathbf{P}[i] = \\operatorname{unvec}(\\mathbf{M}[i, :],\\, h, w)

    Examples
    --------
    >>> import numpy as np
    >>> rng = np.random.default_rng(0)
    >>> mat = rng.standard_normal((100, 64))
    >>> patches = matrix_to_patches(mat, patch_shape=(8, 8))
    >>> patches.shape
    (100, 8, 8)
    """
    # Restore spatial structure: (n_patches, n_features) → (n_patches, *patch_shape)
    return matrix.reshape(matrix.shape[0], *patch_shape)
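Because both directions are plain reshapes, the round trip with `patches_to_matrix` is exact; a NumPy-only sketch of the two reshapes:

```python
import numpy as np

rng = np.random.default_rng(0)
patches = rng.standard_normal((100, 8, 8))

# Flatten (the patches_to_matrix direction): (100, 8, 8) -> (100, 64)
matrix = patches.reshape(patches.shape[0], -1)

# Restore (the matrix_to_patches direction): (100, 64) -> (100, 8, 8)
restored = matrix.reshape(matrix.shape[0], 8, 8)
```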

matrix_to_xr_image(matrix, coords, spatial_shape, name='data')

Reconstruct an xarray image DataArray from a 2-D pixel matrix.

This is the inverse of :func:xr_image_to_matrix.

Parameters

matrix : np.ndarray, shape (n_pixels, n_features)
    Pixel matrix as produced (or transformed) by a RBIG model.  n_pixels must equal prod(spatial_shape).
coords : dict
    Coordinate dictionary returned by :func:`xr_image_to_matrix`, mapping dimension names to their coordinate arrays.
spatial_shape : tuple of int
    Target spatial shape of the output array, e.g. (H, W) or (H, W, C).  The matrix is reshaped to this shape.
name : str, default "data"
    Name assigned to the reconstructed DataArray.

Returns

da : xr.DataArray, shape spatial_shape
    Reconstructed image DataArray with the original dimension names and coordinate arrays.

Notes

The reshape inverts the flattening of :func:xr_image_to_matrix:

.. math::

    (n_x \cdot n_y,\; n_f)
    \longrightarrow
    (n_x,\; n_y,\; [n_f])

Examples

>>> import numpy as np
>>> import xarray as xr
>>> rng = np.random.default_rng(0)
>>> img = xr.DataArray(
...     rng.standard_normal((8, 8)),
...     dims=["x", "y"],
...     coords={"x": np.arange(8), "y": np.arange(8)},
... )
>>> matrix, coords = xr_image_to_matrix(img)
>>> img_back = matrix_to_xr_image(matrix, coords, spatial_shape=(8, 8))
>>> img_back.shape
(8, 8)
>>> img_back.name
'data'

Source code in rbig/_src/xarray_image.py
def matrix_to_xr_image(
    matrix: np.ndarray,
    coords: dict,
    spatial_shape: tuple[int, ...],
    name: str = "data",
) -> xr.DataArray:
    """Reconstruct an xarray image DataArray from a 2-D pixel matrix.

    This is the inverse of :func:`xr_image_to_matrix`.

    Parameters
    ----------
    matrix : np.ndarray, shape ``(n_pixels, n_features)``
        Pixel matrix as produced (or transformed) by a RBIG model.
        ``n_pixels`` must equal ``prod(spatial_shape)``.
    coords : dict
        Coordinate dictionary returned by :func:`xr_image_to_matrix`,
        mapping dimension names to their coordinate arrays.
    spatial_shape : tuple of int
        Target spatial shape of the output array, e.g. ``(H, W)`` or
        ``(H, W, C)``.  The matrix is reshaped to this shape.
    name : str, default ``"data"``
        Name assigned to the reconstructed DataArray.

    Returns
    -------
    da : xr.DataArray, shape ``spatial_shape``
        Reconstructed image DataArray with the original dimension names and
        coordinate arrays.

    Notes
    -----
    The reshape inverts the flattening of :func:`xr_image_to_matrix`:

    .. math::

        (n_x \\cdot n_y,\\; n_f)
        \\longrightarrow
        (n_x,\\; n_y,\\; [n_f])

    Examples
    --------
    >>> import numpy as np
    >>> import xarray as xr
    >>> rng = np.random.default_rng(0)
    >>> img = xr.DataArray(
    ...     rng.standard_normal((8, 8)),
    ...     dims=["x", "y"],
    ...     coords={"x": np.arange(8), "y": np.arange(8)},
    ... )
    >>> matrix, coords = xr_image_to_matrix(img)
    >>> img_back = matrix_to_xr_image(matrix, coords, spatial_shape=(8, 8))
    >>> img_back.shape
    (8, 8)
    >>> img_back.name
    'data'
    """
    import xarray as xr

    arr = matrix.reshape(spatial_shape)  # (H, W) or (H, W, C)
    dims = list(coords.keys())
    xr_coords = {k: v for k, v in coords.items()}
    return xr.DataArray(arr, coords=xr_coords, dims=dims, name=name)

matrix_to_xr_st(matrix, meta, time_dim='time')

Reconstruct an xarray object from a 2-D sample matrix.

This is the inverse of :func:xr_st_to_matrix.

Parameters

matrix : np.ndarray, shape (n_samples, n_features)
    2-D array as produced (or transformed) by RBIG.
meta : dict
    Metadata dict returned by :func:`xr_st_to_matrix`.  Must contain the "type" key ("DataArray" or "Dataset") plus the corresponding shape and dimension information.
time_dim : str, default "time"
    Name of the time dimension (used for labelling, not reshaping).

Returns

out : xr.DataArray or xr.Dataset
    Reconstructed xarray object with the original shape and dimension names.  Coordinates are **not** re-attached (use :meth:`xr.DataArray.assign_coords` if needed).

Notes

The reshape inverts the stacking performed by :func:xr_st_to_matrix:

.. math::

    (n_{\text{time}} \times n_{\text{space}},\; n_{\text{features}})
    \longrightarrow
    (n_{\text{time}},\, n_{\text{lat}},\, n_{\text{lon}},\, \ldots)

Examples

>>> import numpy as np
>>> import xarray as xr
>>> rng = np.random.default_rng(0)
>>> da = xr.DataArray(
...     rng.standard_normal((10, 4, 5)),
...     dims=["time", "lat", "lon"],
... )
>>> matrix, meta = xr_st_to_matrix(da)
>>> da_back = matrix_to_xr_st(matrix, meta)
>>> da_back.shape
(10, 4, 5)

Source code in rbig/_src/xarray_st.py
def matrix_to_xr_st(
    matrix: np.ndarray,
    meta: dict,
    time_dim: str = "time",
) -> xr.Dataset | xr.DataArray:
    """Reconstruct an xarray object from a 2-D sample matrix.

    This is the inverse of :func:`xr_st_to_matrix`.

    Parameters
    ----------
    matrix : np.ndarray, shape ``(n_samples, n_features)``
        2-D array as produced (or transformed) by RBIG.
    meta : dict
        Metadata dict returned by :func:`xr_st_to_matrix`.  Must contain the
        ``"type"`` key (``"DataArray"`` or ``"Dataset"``) plus the
        corresponding shape and dimension information.
    time_dim : str, default ``"time"``
        Name of the time dimension (used for labelling, not reshaping).

    Returns
    -------
    out : xr.DataArray or xr.Dataset
        Reconstructed xarray object with the original shape and dimension
        names.  Coordinates are **not** re-attached (use
        :meth:`xr.DataArray.assign_coords` if needed).

    Notes
    -----
    The reshape inverts the stacking performed by :func:`xr_st_to_matrix`:

    .. math::

        (n_{\\text{time}} \\times n_{\\text{space}},\\; n_{\\text{features}})
        \\longrightarrow
        (n_{\\text{time}},\\, n_{\\text{lat}},\\, n_{\\text{lon}},\\, \\ldots)

    Examples
    --------
    >>> import numpy as np
    >>> import xarray as xr
    >>> rng = np.random.default_rng(0)
    >>> da = xr.DataArray(
    ...     rng.standard_normal((10, 4, 5)),
    ...     dims=["time", "lat", "lon"],
    ... )
    >>> matrix, meta = xr_st_to_matrix(da)
    >>> da_back = matrix_to_xr_st(matrix, meta)
    >>> da_back.shape
    (10, 4, 5)
    """
    import xarray as xr

    if meta["type"] == "DataArray":
        # Restore the original multi-dimensional shape
        arr = matrix.reshape(meta["shape"])
        return xr.DataArray(arr, dims=meta["dims"], name="data")
    else:
        original_shape = meta["shape"]
        ds_dict = {}
        for i, var in enumerate(meta["feature_vars"]):
            # Each column of `matrix` corresponds to one data variable
            ds_dict[var] = xr.DataArray(
                matrix[:, i].reshape(original_shape), dims=meta["dims"]
            )
        return xr.Dataset(ds_dict)

multivariate_gaussian(n_samples=1000, mean=None, cov=None, d=2, random_state=None)

Sample from a multivariate Gaussian distribution.

Parameters

n_samples : int, optional (default=1000)
    Number of samples to draw.
mean : np.ndarray of shape (d,) or None, optional
    Mean vector μ.  Defaults to the zero vector of length d.
cov : np.ndarray of shape (d, d) or None, optional
    Covariance matrix Σ.  Must be symmetric positive semi-definite.  Defaults to the identity matrix I_d.
d : int, optional (default=2)
    Dimensionality used when mean is None.  Ignored when mean is provided (its length determines the dimension).
random_state : int or None, optional (default=None)
    Seed for the random number generator.

Returns

samples : np.ndarray of shape (n_samples, d)
    Independent draws from 𝒩(mean, cov).

Examples

>>> import numpy as np
>>> from rbig._src.parametric import multivariate_gaussian
>>> cov = np.array([[1.0, 0.8], [0.8, 1.0]])
>>> X = multivariate_gaussian(n_samples=200, cov=cov, d=2, random_state=7)
>>> X.shape
(200, 2)
>>> np.isclose(np.corrcoef(X.T)[0, 1], 0.8, atol=0.1)
True

Source code in rbig/_src/parametric.py
def multivariate_gaussian(
    n_samples: int = 1000,
    mean: np.ndarray | None = None,
    cov: np.ndarray | None = None,
    d: int = 2,
    random_state: int | None = None,
) -> np.ndarray:
    """Sample from a multivariate Gaussian distribution.

    Parameters
    ----------
    n_samples : int, optional (default=1000)
        Number of samples to draw.
    mean : np.ndarray of shape (d,) or None, optional
        Mean vector μ.  Defaults to the zero vector of length ``d``.
    cov : np.ndarray of shape (d, d) or None, optional
        Covariance matrix Σ.  Must be symmetric positive semi-definite.
        Defaults to the identity matrix I_d.
    d : int, optional (default=2)
        Dimensionality used when ``mean`` is ``None``.  Ignored when
        ``mean`` is provided (its length determines the dimension).
    random_state : int or None, optional (default=None)
        Seed for the random number generator.

    Returns
    -------
    samples : np.ndarray of shape (n_samples, d)
        Independent draws from 𝒩(mean, cov).

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.parametric import multivariate_gaussian
    >>> cov = np.array([[1.0, 0.8], [0.8, 1.0]])
    >>> X = multivariate_gaussian(n_samples=200, cov=cov, d=2, random_state=7)
    >>> X.shape
    (200, 2)
    >>> np.isclose(np.corrcoef(X.T)[0, 1], 0.8, atol=0.1)
    True
    """
    rng = np.random.default_rng(random_state)
    if mean is None:
        mean = np.zeros(d)  # default: zero mean
    if cov is None:
        cov = np.eye(len(mean))  # default: identity covariance
    return rng.multivariate_normal(mean, cov, size=n_samples)

mutual_information_gaussian(cov_X, cov_Y, cov_XY)

Analytic mutual information between two jointly Gaussian variables.

Uses the entropy identity:

MI(X; Y) = H(X) + H(Y) − H(X, Y)

where each entropy is computed analytically from the corresponding covariance matrix via :func:entropy_gaussian.

Parameters

cov_X : np.ndarray of shape (d_X, d_X)
    Marginal covariance of X.
cov_Y : np.ndarray of shape (d_Y, d_Y)
    Marginal covariance of Y.
cov_XY : np.ndarray of shape (d_X + d_Y, d_X + d_Y)
    Joint covariance matrix of the concatenated variable [X, Y].

Returns

mi : float
    Mutual information in nats.  Non-negative for valid covariance matrices; small negative values indicate numerical imprecision.

Notes

For Gaussians the MI can also be expressed as:

MI(X; Y) = ½ log(|Σ_{XX}| · |Σ_{YY}| / |Σ_{XY}|)

Examples

>>> import numpy as np
>>> from rbig._src.parametric import mutual_information_gaussian
>>> # Block-diagonal joint covariance → MI = 0
>>> cov_X = np.eye(2)
>>> cov_Y = np.eye(2)
>>> cov_XY = np.block([[cov_X, np.zeros((2, 2))], [np.zeros((2, 2)), cov_Y]])
>>> mi = mutual_information_gaussian(cov_X, cov_Y, cov_XY)
>>> np.isclose(mi, 0.0, atol=1e-10)
True

Source code in rbig/_src/parametric.py
def mutual_information_gaussian(
    cov_X: np.ndarray,
    cov_Y: np.ndarray,
    cov_XY: np.ndarray,
) -> float:
    """Analytic mutual information between two jointly Gaussian variables.

    Uses the entropy identity:

        MI(X; Y) = H(X) + H(Y) − H(X, Y)

    where each entropy is computed analytically from the corresponding
    covariance matrix via :func:`entropy_gaussian`.

    Parameters
    ----------
    cov_X : np.ndarray of shape (d_X, d_X)
        Marginal covariance of X.
    cov_Y : np.ndarray of shape (d_Y, d_Y)
        Marginal covariance of Y.
    cov_XY : np.ndarray of shape (d_X + d_Y, d_X + d_Y)
        Joint covariance matrix of the concatenated variable [X, Y].

    Returns
    -------
    mi : float
        Mutual information in nats.  Non-negative for valid covariance
        matrices; small negative values indicate numerical imprecision.

    Notes
    -----
    For Gaussians the MI can also be expressed as:

        MI(X; Y) = ½ log(|Σ_{XX}| · |Σ_{YY}| / |Σ_{XY}|)

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.parametric import mutual_information_gaussian
    >>> # Block-diagonal joint covariance → MI = 0
    >>> cov_X = np.eye(2)
    >>> cov_Y = np.eye(2)
    >>> cov_XY = np.block([[cov_X, np.zeros((2, 2))], [np.zeros((2, 2)), cov_Y]])
    >>> mi = mutual_information_gaussian(cov_X, cov_Y, cov_XY)
    >>> np.isclose(mi, 0.0, atol=1e-10)
    True
    """
    hx = entropy_gaussian(cov_X)  # H(X)
    hy = entropy_gaussian(cov_Y)  # H(Y)
    hxy = entropy_gaussian(cov_XY)  # H(X, Y)
    return float(hx + hy - hxy)  # MI(X;Y) = H(X) + H(Y) - H(X,Y)
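The entropy identity can be verified against the known closed form for a pair of correlated scalars, MI = −½ log(1 − ρ²) (a NumPy-only sketch; the entropy helper is illustrative, mirroring `entropy_gaussian`):

```python
import numpy as np

def entropy_gaussian_sketch(cov):
    # Analytic Gaussian entropy: 0.5 * (d*(1 + log 2*pi) + log|Sigma|)
    cov = np.atleast_2d(cov)
    d = cov.shape[0]
    _, log_det = np.linalg.slogdet(cov)
    return 0.5 * (d * (1 + np.log(2 * np.pi)) + log_det)

rho = 0.6
cov_XY = np.array([[1.0, rho], [rho, 1.0]])

# MI(X; Y) = H(X) + H(Y) - H(X, Y); both marginals are unit-variance
mi = 2 * entropy_gaussian_sketch(np.eye(1)) - entropy_gaussian_sketch(cov_XY)
# Closed form for correlated scalars: -0.5 * log(1 - rho**2)
```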

mutual_information_rbig(model_X, model_Y, model_XY)

Estimate mutual information between X and Y via RBIG models.

Uses the identity:

MI(X; Y) = H(X) + H(Y) − H(X, Y)

where each entropy is estimated from a separately fitted RBIG model.

Parameters

model_X : AnnealedRBIG
    RBIG model fitted on samples from the marginal distribution of X.
model_Y : AnnealedRBIG
    RBIG model fitted on samples from the marginal distribution of Y.
model_XY : AnnealedRBIG
    RBIG model fitted on joint samples [X, Y] (i.e. columns concatenated).

Returns

mi : float
    Estimated mutual information MI(X; Y) in nats. Non-negative for well-calibrated models; small negative values may appear due to numerical imprecision.

Notes

Each model.entropy() call returns the differential entropy estimated from the RBIG-transformed representation.

Examples

Assumes pre-fitted models; see AnnealedRBIG for fitting details.

>>> mi = mutual_information_rbig(model_X, model_Y, model_XY)
>>> mi >= 0  # MI is non-negative
True

Source code in rbig/_src/metrics.py
def mutual_information_rbig(
    model_X: AnnealedRBIG,
    model_Y: AnnealedRBIG,
    model_XY: AnnealedRBIG,
) -> float:
    """Estimate mutual information between X and Y via RBIG models.

    Uses the identity:

        MI(X; Y) = H(X) + H(Y) − H(X, Y)

    where each entropy is estimated from a separately fitted RBIG model.

    Parameters
    ----------
    model_X : AnnealedRBIG
        RBIG model fitted on samples from the marginal distribution of X.
    model_Y : AnnealedRBIG
        RBIG model fitted on samples from the marginal distribution of Y.
    model_XY : AnnealedRBIG
        RBIG model fitted on joint samples [X, Y] (i.e. columns concatenated).

    Returns
    -------
    mi : float
        Estimated mutual information MI(X; Y) in nats.  Non-negative for
        well-calibrated models; small negative values may appear due to
        numerical imprecision.

    Notes
    -----
    Each ``model.entropy()`` call returns the differential entropy estimated
    from the RBIG-transformed representation.

    Examples
    --------
    >>> # Assumes pre-fitted models; see AnnealedRBIG for fitting details.
    >>> mi = mutual_information_rbig(model_X, model_Y, model_XY)
    >>> mi >= 0  # MI is non-negative
    True
    """
    hx = model_X.entropy()  # H(X)
    hy = model_Y.entropy()  # H(Y)
    hxy = model_XY.entropy()  # H(X, Y)
    return float(hx + hy - hxy)  # MI(X;Y) = H(X) + H(Y) - H(X,Y)

negative_log_likelihood(model, X)

Average negative log-likelihood of X under the RBIG model.

Computes:

NLL = −(1/N) ∑ᵢ log p(xᵢ)

This is equivalent to :func:entropy_rbig but is exposed separately to make its role as a loss / evaluation metric explicit.

Parameters

model : AnnealedRBIG
    Fitted RBIG model. Must expose score_samples(X).
X : np.ndarray of shape (n_samples, n_features)
    Evaluation data.

Returns

nll : float
    Average negative log-likelihood in nats.

Examples

>>> nll = negative_log_likelihood(fitted_model, X_test)
>>> nll > 0  # NLL is positive for well-calibrated models
True

Source code in rbig/_src/metrics.py
def negative_log_likelihood(model: AnnealedRBIG, X: np.ndarray) -> float:
    """Average negative log-likelihood of X under the RBIG model.

    Computes:

        NLL = −(1/N) ∑ᵢ log p(xᵢ)

    This is equivalent to :func:`entropy_rbig` but is exposed separately to
    make its role as a loss / evaluation metric explicit.

    Parameters
    ----------
    model : AnnealedRBIG
        Fitted RBIG model.  Must expose ``score_samples(X)``.
    X : np.ndarray of shape (n_samples, n_features)
        Evaluation data.

    Returns
    -------
    nll : float
        Average negative log-likelihood in nats.

    Examples
    --------
    >>> nll = negative_log_likelihood(fitted_model, X_test)
    >>> nll > 0  # NLL is positive for well-calibrated models
    True
    """
    log_probs = model.score_samples(X)  # log p(xᵢ), shape (N,)
    return float(-np.mean(log_probs))  # NLL = -(1/N) sum log p(xi)
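The relationship between NLL and entropy can be illustrated without a fitted RBIG model by substituting the exact log-density of a known distribution for `score_samples` (a hypothetical stand-in; the real metric calls `model.score_samples`). Under the true model, the NLL is a Monte-Carlo estimate of the differential entropy:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
X = rng.standard_normal(10_000)

# Stand-in for model.score_samples: exact log-density of the true model
log_probs = stats.norm.logpdf(X)

# NLL = −(1/N) ∑ log p(xᵢ); for the true model it estimates H(X)
nll = float(-np.mean(log_probs))
h_true = 0.5 * (1 + np.log(2 * np.pi))  # entropy of N(0,1) in nats

assert abs(nll - h_true) < 0.05
```

For a mismatched model the NLL would instead estimate the cross-entropy, which exceeds H(X) by the KL divergence between data and model.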

negentropy(X)

Compute per-dimension negentropy J(Xᵢ) = H_Gauss(Xᵢ) − H(Xᵢ).

Negentropy measures non-Gaussianity for each marginal:

J(Xᵢ) = H_Gauss(Xᵢ) − H(Xᵢ) ≥ 0

where H_Gauss(Xᵢ) = ½(1 + log 2π) + ½ log Var(Xᵢ) is the Gaussian entropy matched to the observed variance and H(Xᵢ) is estimated via KDE.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Data matrix.

Returns

neg_entropy : np.ndarray of shape (n_features,)
    Non-negative negentropy for each dimension. A value of 0 indicates that the marginal is Gaussian; larger values indicate more non-Gaussianity.

Notes

Negentropy is guaranteed non-negative by the maximum-entropy principle: among all distributions with a given variance, the Gaussian has the highest entropy.

Examples

>>> import numpy as np
>>> from rbig._src.metrics import negentropy
>>> rng = np.random.default_rng(3)
>>> X_gauss = rng.standard_normal((500, 2))
>>> J_gauss = negentropy(X_gauss)
>>> np.all(J_gauss >= -0.05)  # nearly zero for Gaussian data
True

Source code in rbig/_src/metrics.py
def negentropy(X: np.ndarray) -> np.ndarray:
    """Compute per-dimension negentropy J(Xᵢ) = H_Gauss(Xᵢ) − H(Xᵢ).

    Negentropy measures non-Gaussianity for each marginal:

        J(Xᵢ) = H_Gauss(Xᵢ) − H(Xᵢ) ≥ 0

    where H_Gauss(Xᵢ) = ½(1 + log 2π) + ½ log Var(Xᵢ) is the Gaussian
    entropy matched to the observed variance and H(Xᵢ) is estimated via KDE.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data matrix.

    Returns
    -------
    neg_entropy : np.ndarray of shape (n_features,)
        Non-negative negentropy for each dimension.  A value of 0 indicates
        that the marginal is Gaussian; larger values indicate more
        non-Gaussianity.

    Notes
    -----
    Negentropy is guaranteed non-negative by the maximum-entropy principle:
    among all distributions with a given variance, the Gaussian has the
    highest entropy.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.metrics import negentropy
    >>> rng = np.random.default_rng(3)
    >>> X_gauss = rng.standard_normal((500, 2))
    >>> J_gauss = negentropy(X_gauss)
    >>> np.all(J_gauss >= -0.05)  # nearly zero for Gaussian data
    True
    """
    _n, _d = X.shape
    # Gaussian entropy matched to empirical variance: H_Gauss = ½(1+log 2π) + ½ log σ²
    gauss_h = 0.5 * (1 + np.log(2 * np.pi)) + 0.5 * np.log(np.var(X, axis=0))
    from rbig._src.densities import marginal_entropy

    marg_h = marginal_entropy(X)  # KDE-based entropy, shape (n_features,)
    return gauss_h - marg_h  # J(Xi) = H_Gauss - H >= 0
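For distributions with closed-form entropies the negentropy can be computed exactly, which gives a useful reference point for the KDE-based estimate. A minimal standalone sketch for Uniform(0, 1), whose differential entropy is log(1 − 0) = 0 nats, not using the package's estimator:

```python
import numpy as np

# Uniform(0, 1): Var = 1/12, differential entropy H = log(1 − 0) = 0 nats
var_u = 1.0 / 12.0
h_uniform = 0.0
h_gauss_matched = 0.5 * (1 + np.log(2 * np.pi)) + 0.5 * np.log(var_u)

# J = H_Gauss − H = ½ log(2πe/12) ≈ 0.176 nats: mildly non-Gaussian
J_uniform = h_gauss_matched - h_uniform

assert J_uniform > 0
assert np.isclose(J_uniform, 0.5 * np.log(2 * np.pi * np.e / 12.0))
```

Estimates from `negentropy` on uniform samples should converge toward this analytic value as the sample size grows.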

negentropy_kde(X, rule='sqrt')

Per-dimension negentropy via KDE on histogram bins.

Estimates the KL divergence between the empirical distribution (smoothed by KDE) and a standard normal for each feature:

J(Xᵢ) = KL(p_kde ‖ N(0,1)) = Δ ∑_k p(k) log2(p(k) / g(k))

where p(k) is the normalised KDE density evaluated at histogram bin centres and g(k) = N(0,1).pdf(bin_centres).

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Data matrix. Should be approximately zero-mean, unit-variance per feature for meaningful comparison against N(0,1).
rule : str, optional (default="sqrt")
    Bin count rule passed to :func:`~rbig._src.densities.bin_estimation`.

Returns

neg : np.ndarray of shape (n_features,)
    Non-negative negentropy (bits) per feature. Near zero for Gaussian data.

Source code in rbig/_src/metrics.py
def negentropy_kde(X: np.ndarray, rule: str = "sqrt") -> np.ndarray:
    """Per-dimension negentropy via KDE on histogram bins.

    Estimates the KL divergence between the empirical distribution (smoothed
    by KDE) and a standard normal for each feature:

        J(Xᵢ) = KL(p_kde ‖ N(0,1)) = Δ ∑_k p(k) log2(p(k) / g(k))

    where p(k) is the normalised KDE density evaluated at histogram bin
    centres and g(k) = N(0,1).pdf(bin_centres).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data matrix.  Should be approximately zero-mean, unit-variance per
        feature for meaningful comparison against N(0,1).
    rule : str, optional (default="sqrt")
        Bin count rule passed to :func:`~rbig._src.densities.bin_estimation`.

    Returns
    -------
    neg : np.ndarray of shape (n_features,)
        Non-negative negentropy (bits) per feature.  Near zero for Gaussian
        data.
    """
    from scipy import stats as sp_stats

    from rbig._src.densities import bin_estimation

    n_samples, n_features = X.shape
    n_bins = bin_estimation(n_samples, rule=rule)
    neg = np.zeros(n_features)

    for i in range(n_features):
        col = X[:, i]

        # Guard against constant/near-constant features where KDE would fail
        if np.std(col) < 1e-12:
            neg[i] = 0.0
            continue

        _counts, bin_edges = np.histogram(
            col, bins=n_bins, range=(col.min(), col.max())
        )
        bin_centers = (bin_edges[:-1] + bin_edges[1:]) / 2
        delta = bin_centers[1] - bin_centers[0] if len(bin_centers) > 1 else 1.0

        # Standard normal reference density
        pg = sp_stats.norm.pdf(bin_centers, 0, 1)

        # KDE-smoothed density
        kde_model = sp_stats.gaussian_kde(col)
        hx = kde_model.pdf(bin_centers)
        px = hx / (hx.sum() * delta)

        # KL divergence at positive indices
        idx = (px > 0) & (pg > 0)
        neg[i] = delta * np.sum(px[idx] * np.log2(px[idx] / pg[idx]))

    return neg

patches_to_matrix(patches)

Flatten image patches into a 2-D feature matrix.

Each patch is ravelled along its spatial (and optionally channel) axes so that a collection of patches becomes a conventional (samples, features) matrix suitable for statistical modelling.

Parameters

patches : np.ndarray, shape (n_patches, h, w) or (n_patches, h, w, C)
    Stack of image patches. The first axis indexes patches; all remaining axes are flattened into the feature axis.

Returns

matrix : np.ndarray, shape (n_patches, h*w) or (n_patches, h*w*C)
    Row-major flattening of every patch: n_features = h * w (grayscale) or h * w * C (colour).

Notes

The reshape is equivalent to

.. math::

\mathbf{M}[i, :] = \operatorname{vec}(\mathbf{P}[i])

where :math:\operatorname{vec} flattens the patch entries in row-major (C) order.

Examples

>>> import numpy as np
>>> patches = np.random.default_rng(0).standard_normal((100, 8, 8))
>>> mat = patches_to_matrix(patches)
>>> mat.shape
(100, 64)

Source code in rbig/_src/image.py
def patches_to_matrix(
    patches: np.ndarray,
) -> np.ndarray:
    """Flatten image patches into a 2-D feature matrix.

    Each patch is ravelled along its spatial (and optionally channel) axes so
    that a collection of patches becomes a conventional ``(samples, features)``
    matrix suitable for statistical modelling.

    Parameters
    ----------
    patches : np.ndarray, shape ``(n_patches, h, w)`` or \
``(n_patches, h, w, C)``
        Stack of image patches.  The first axis indexes patches; all remaining
        axes are flattened into the feature axis.

    Returns
    -------
    matrix : np.ndarray, shape ``(n_patches, h*w)`` or \
``(n_patches, h*w*C)``
        Row-major flattening of every patch:
        ``n_features = h * w`` (grayscale) or ``h * w * C`` (colour).

    Notes
    -----
    The reshape is equivalent to

    .. math::

        \\mathbf{M}[i, :] = \\operatorname{vec}(\\mathbf{P}[i])

    where :math:`\\operatorname{vec}` flattens the patch entries in row-major (C) order.

    Examples
    --------
    >>> import numpy as np
    >>> patches = np.random.default_rng(0).standard_normal((100, 8, 8))
    >>> mat = patches_to_matrix(patches)
    >>> mat.shape
    (100, 64)
    """
    # Collapse spatial (and channel) dims: (n_patches, h, w[, C]) → (n_patches, h*w[*C])
    return patches.reshape(patches.shape[0], -1)
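Because the flattening is a plain reshape, it is exactly invertible. A quick NumPy sketch of the roundtrip (using `reshape` directly rather than the library function):

```python
import numpy as np

rng = np.random.default_rng(0)
patches = rng.standard_normal((100, 8, 8, 3))  # 100 colour patches

# Forward: collapse spatial and channel axes into one feature axis
matrix = patches.reshape(patches.shape[0], -1)
assert matrix.shape == (100, 8 * 8 * 3)

# Inverse: the original patch stack is recovered by reshaping back
recovered = matrix.reshape(patches.shape)
assert np.array_equal(recovered, patches)
```

This losslessness is what allows image patches to be modelled in matrix form and then mapped back for visualisation.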

student_t(n_samples=1000, df=5.0, random_state=None)

Sample from a Student-t distribution with df degrees of freedom.

The Student-t distribution has heavier tails than a Gaussian; as df → ∞ it converges to 𝒩(0, 1).

Parameters

n_samples : int, optional (default=1000)
    Number of samples to draw.
df : float, optional (default=5.0)
    Degrees of freedom ν > 0. Smaller values produce heavier tails.
random_state : int or None, optional (default=None)
    Seed for the random number generator.

Returns

samples : np.ndarray of shape (n_samples,)
    Independent draws from t(df).

Examples

>>> from rbig._src.parametric import student_t
>>> x = student_t(n_samples=300, df=3.0, random_state=2)
>>> x.shape
(300,)

Source code in rbig/_src/parametric.py
def student_t(
    n_samples: int = 1000,
    df: float = 5.0,
    random_state: int | None = None,
) -> np.ndarray:
    """Sample from a Student-t distribution with ``df`` degrees of freedom.

    The Student-t distribution has heavier tails than a Gaussian; as
    ``df → ∞`` it converges to 𝒩(0, 1).

    Parameters
    ----------
    n_samples : int, optional (default=1000)
        Number of samples to draw.
    df : float, optional (default=5.0)
        Degrees of freedom ν > 0.  Smaller values produce heavier tails.
    random_state : int or None, optional (default=None)
        Seed for the random number generator.

    Returns
    -------
    samples : np.ndarray of shape (n_samples,)
        Independent draws from t(df).

    Examples
    --------
    >>> from rbig._src.parametric import student_t
    >>> x = student_t(n_samples=300, df=3.0, random_state=2)
    >>> x.shape
    (300,)
    """
    rng = np.random.default_rng(random_state)
    return rng.standard_t(df=df, size=n_samples)
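The heavy tails show up directly in the second moment: for ν > 2 the variance is ν/(ν − 2) rather than 1. A standalone NumPy check using the same generator call as the implementation:

```python
import numpy as np

rng = np.random.default_rng(42)
df = 5.0
x = rng.standard_t(df=df, size=200_000)

# For ν > 2 the variance is ν / (ν − 2); for df=5 that is 5/3 ≈ 1.667,
# heavier-tailed than the unit-variance Gaussian limit (df → ∞)
expected_var = df / (df - 2)
assert abs(x.var() - expected_var) < 0.1
```

For ν ≤ 2 the variance does not exist at all, which is why very small `df` values can destabilise moment-based statistics.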

total_correlation(X)

Estimate the Total Correlation (multivariate mutual information) of X.

Total Correlation (TC) measures the statistical dependence among all dimensions simultaneously:

TC(X) = ∑ᵢ H(Xᵢ) − H(X)

Marginal entropies H(Xᵢ) are estimated via KDE (see marginal_entropy), while the joint entropy H(X) is approximated by fitting a multivariate Gaussian to the data (see joint_entropy_gaussian).

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Data matrix.

Returns

tc : float
    Estimated total correlation in nats. A value near zero indicates near-independence across dimensions.

Notes

Because the joint term uses a Gaussian approximation, TC may be slightly biased for non-Gaussian data. For a fully non-parametric estimate use total_correlation_rbig from rbig._src.metrics.

Examples

>>> import numpy as np
>>> from rbig._src.densities import total_correlation
>>> rng = np.random.default_rng(0)
>>> X_indep = rng.standard_normal((500, 3))  # independent columns → TC ≈ 0
>>> tc = total_correlation(X_indep)
>>> tc >= -0.5  # should be close to 0
True

Source code in rbig/_src/densities.py
def total_correlation(X: np.ndarray) -> float:
    """Estimate the Total Correlation (multivariate mutual information) of X.

    Total Correlation (TC) measures the statistical dependence among all
    dimensions simultaneously:

        TC(X) = ∑ᵢ H(Xᵢ) − H(X)

    Marginal entropies H(Xᵢ) are estimated via KDE (see `marginal_entropy`),
    while the joint entropy H(X) is approximated by fitting a multivariate
    Gaussian to the data (see `joint_entropy_gaussian`).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data matrix.

    Returns
    -------
    tc : float
        Estimated total correlation in nats.  A value near zero indicates
        near-independence across dimensions.

    Notes
    -----
    Because the joint term uses a Gaussian approximation, TC may be slightly
    biased for non-Gaussian data.  For a fully non-parametric estimate use
    ``total_correlation_rbig`` from ``rbig._src.metrics``.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.densities import total_correlation
    >>> rng = np.random.default_rng(0)
    >>> X_indep = rng.standard_normal((500, 3))  # independent columns → TC ≈ 0
    >>> tc = total_correlation(X_indep)
    >>> tc >= -0.5  # should be close to 0
    True
    """
    marg_h = marginal_entropy(X)  # ∑ᵢ H(Xᵢ) via KDE, shape (n_features,)
    joint_h = joint_entropy_gaussian(X)  # H(X) via Gaussian approximation
    return float(np.sum(marg_h) - joint_h)

total_correlation_gaussian(cov)

Analytic Total Correlation of a multivariate Gaussian.

For a Gaussian with covariance Σ, the TC reduces to a function of the correlation matrix R = D^{-½} Σ D^{-½} (where D = diag(Σ)):

TC = ∑ᵢ H(Xᵢ) − H(X) = −½ log|R|

Equivalently, it measures how far the distribution is from being a product of its marginals.

Parameters

cov : np.ndarray of shape (d, d)
    Covariance matrix Σ. Coerced to at least 2-D.

Returns

tc : float
    Total correlation in nats. Returns +inf if Σ is singular.

Notes

The computation uses:

TC = (∑ᵢ ½ log(2πe σᵢ²)) − ½(d(1 + log 2π) + log|Σ|)
   = ½ ∑ᵢ log σᵢ² − ½ log|Σ|
   = −½ log|corr(Σ)|

Examples

>>> import numpy as np
>>> from rbig._src.parametric import total_correlation_gaussian
>>> # Identity covariance → all marginals independent → TC = 0
>>> tc = total_correlation_gaussian(np.eye(3))
>>> np.isclose(tc, 0.0)
True
>>> # Correlated covariance → TC > 0
>>> cov = np.array([[1.0, 0.9], [0.9, 1.0]])
>>> total_correlation_gaussian(cov) > 0
True

Source code in rbig/_src/parametric.py
def total_correlation_gaussian(cov: np.ndarray) -> float:
    """Analytic Total Correlation of a multivariate Gaussian.

    For a Gaussian with covariance Σ, the TC reduces to a function of the
    correlation matrix R = D^{-½} Σ D^{-½} (where D = diag(Σ)):

        TC = ∑ᵢ H(Xᵢ) − H(X) = −½ log|R|

    Equivalently, it measures how far the distribution is from being a
    product of its marginals.

    Parameters
    ----------
    cov : np.ndarray of shape (d, d)
        Covariance matrix Σ.  Coerced to at least 2-D.

    Returns
    -------
    tc : float
        Total correlation in nats.  Returns ``+inf`` if Σ is singular.

    Notes
    -----
    The computation uses:

        TC = (∑ᵢ ½ log(2πe σᵢ²)) − ½(d(1 + log 2π) + log|Σ|)
           = ½ ∑ᵢ log σᵢ² − ½ log|Σ|
           = −½ log|corr(Σ)|

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.parametric import total_correlation_gaussian
    >>> # Identity covariance → all marginals independent → TC = 0
    >>> tc = total_correlation_gaussian(np.eye(3))
    >>> np.isclose(tc, 0.0)
    True
    >>> # Correlated covariance → TC > 0
    >>> cov = np.array([[1.0, 0.9], [0.9, 1.0]])
    >>> total_correlation_gaussian(cov) > 0
    True
    """
    cov = np.atleast_2d(cov)
    d = cov.shape[0]
    marginal_vars = np.diag(cov)  # σᵢ²
    # ∑ᵢ H(Xᵢ) = ∑ᵢ ½ log(2πe σᵢ²)
    sum_marg_h = 0.5 * np.sum(np.log(2 * np.pi * np.e * marginal_vars))
    sign, log_det = np.linalg.slogdet(cov)  # log|Σ|
    if sign <= 0:
        return np.inf  # singular Σ
    # H(X) = ½ (d(1 + log 2π) + log|Σ|)
    joint_h = 0.5 * (d * (1 + np.log(2 * np.pi)) + log_det)
    return float(sum_marg_h - joint_h)  # TC = sum H(Xi) - H(X)
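The three forms listed in the Notes can be verified numerically against each other with NumPy alone (a standalone sketch, not calling `total_correlation_gaussian`):

```python
import numpy as np

# Correlated 3-D Gaussian covariance with unequal variances
cov = np.array([
    [2.0, 0.8, 0.3],
    [0.8, 1.0, 0.5],
    [0.3, 0.5, 1.5],
])

# Route 1: TC = ∑ᵢ ½ log(2πe σᵢ²) − ½ (d (1 + log 2π) + log|Σ|)
d = cov.shape[0]
sum_marg = 0.5 * np.sum(np.log(2 * np.pi * np.e * np.diag(cov)))
joint = 0.5 * (d * (1 + np.log(2 * np.pi)) + np.linalg.slogdet(cov)[1])
tc_entropies = sum_marg - joint

# Route 2: TC = −½ log|R| with R the correlation matrix D^{-½} Σ D^{-½}
std = np.sqrt(np.diag(cov))
R = cov / np.outer(std, std)
tc_corr = -0.5 * np.linalg.slogdet(R)[1]

assert np.isclose(tc_entropies, tc_corr)
assert tc_entropies > 0
```

The agreement is exact (up to floating point) because |R| = |Σ| / ∏ᵢ σᵢ², so the 2πe terms cancel between the marginal and joint entropies.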

total_correlation_rbig(X)

Estimate Total Correlation (multivariate mutual information) of X.

Total Correlation is defined as:

TC(X) = ∑ᵢ H(Xᵢ) − H(X)

where the marginal entropies H(Xᵢ) are estimated via KDE (using marginal_entropy) and the joint entropy H(X) is estimated by fitting a multivariate Gaussian to the data (joint_entropy_gaussian).

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Data matrix.

Returns

tc : float
    Estimated total correlation in nats. Values close to zero indicate approximate statistical independence among the dimensions.

Notes

See :func:rbig._src.densities.total_correlation for identical logic. This function is kept in metrics for API convenience.

Examples

>>> import numpy as np
>>> from rbig._src.metrics import total_correlation_rbig
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((300, 4))  # independent Gaussians
>>> tc = total_correlation_rbig(X)
>>> tc >= -0.5  # should be near 0
True

Source code in rbig/_src/metrics.py
def total_correlation_rbig(X: np.ndarray) -> float:
    """Estimate Total Correlation (multivariate mutual information) of X.

    Total Correlation is defined as:

        TC(X) = ∑ᵢ H(Xᵢ) − H(X)

    where the marginal entropies H(Xᵢ) are estimated via KDE (using
    ``marginal_entropy``) and the joint entropy H(X) is estimated by fitting
    a multivariate Gaussian to the data (``joint_entropy_gaussian``).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data matrix.

    Returns
    -------
    tc : float
        Estimated total correlation in nats.  Values close to zero indicate
        approximate statistical independence among the dimensions.

    Notes
    -----
    See :func:`rbig._src.densities.total_correlation` for identical logic.
    This function is kept in ``metrics`` for API convenience.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.metrics import total_correlation_rbig
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((300, 4))  # independent Gaussians
    >>> tc = total_correlation_rbig(X)
    >>> tc >= -0.5  # should be near 0
    True
    """
    from rbig._src.densities import joint_entropy_gaussian, marginal_entropy

    marg_h = marginal_entropy(X)  # ∑ᵢ H(Xᵢ), shape (n_features,)
    joint_h = joint_entropy_gaussian(X)  # H(X) via Gaussian approximation
    return float(np.sum(marg_h) - joint_h)

uniform(n_samples=1000, low=0.0, high=1.0, random_state=None)

Sample from a continuous uniform distribution on [low, high).

Parameters

n_samples : int, optional (default=1000)
    Number of samples to draw.
low : float, optional (default=0.0)
    Lower bound (inclusive) of the interval.
high : float, optional (default=1.0)
    Upper bound (exclusive) of the interval. Must satisfy high > low.
random_state : int or None, optional (default=None)
    Seed for the random number generator.

Returns

samples : np.ndarray of shape (n_samples,)
    Independent draws from Uniform(low, high).

Examples

>>> from rbig._src.parametric import uniform
>>> x = uniform(n_samples=300, low=-1.0, high=1.0, random_state=3)
>>> x.shape
(300,)
>>> import numpy as np
>>> np.all((x >= -1.0) & (x < 1.0))
True

Source code in rbig/_src/parametric.py
def uniform(
    n_samples: int = 1000,
    low: float = 0.0,
    high: float = 1.0,
    random_state: int | None = None,
) -> np.ndarray:
    """Sample from a continuous uniform distribution on [low, high).

    Parameters
    ----------
    n_samples : int, optional (default=1000)
        Number of samples to draw.
    low : float, optional (default=0.0)
        Lower bound (inclusive) of the interval.
    high : float, optional (default=1.0)
        Upper bound (exclusive) of the interval.  Must satisfy ``high > low``.
    random_state : int or None, optional (default=None)
        Seed for the random number generator.

    Returns
    -------
    samples : np.ndarray of shape (n_samples,)
        Independent draws from Uniform(low, high).

    Examples
    --------
    >>> from rbig._src.parametric import uniform
    >>> x = uniform(n_samples=300, low=-1.0, high=1.0, random_state=3)
    >>> x.shape
    (300,)
    >>> import numpy as np
    >>> np.all((x >= -1.0) & (x < 1.0))
    True
    """
    rng = np.random.default_rng(random_state)
    return rng.uniform(low=low, high=high, size=n_samples)

von_mises(n_samples=1000, mu=0.0, kappa=1.0, random_state=None)

Sample from a von Mises distribution on the circle.

The von Mises distribution is the circular analogue of the Gaussian and is parameterised by a mean angle μ and a concentration κ. As κ → 0 the distribution approaches a uniform distribution on [−π, π); as κ → ∞ it concentrates around μ.

Parameters

n_samples : int, optional (default=1000)
    Number of samples to draw.
mu : float, optional (default=0.0)
    Mean direction in radians.
kappa : float, optional (default=1.0)
    Concentration parameter κ ≥ 0. Larger values produce more peaked distributions.
random_state : int or None, optional (default=None)
    Seed for the random number generator.

Returns

samples : np.ndarray of shape (n_samples,)
    Independent draws from VonMises(mu, kappa) in radians, with values in [−π, π).

Examples

>>> import numpy as np
>>> from rbig._src.parametric import von_mises
>>> x = von_mises(n_samples=500, mu=0.0, kappa=2.0, random_state=11)
>>> x.shape
(500,)
>>> np.all((x >= -np.pi) & (x < np.pi))
True

Source code in rbig/_src/parametric.py
def von_mises(
    n_samples: int = 1000,
    mu: float = 0.0,
    kappa: float = 1.0,
    random_state: int | None = None,
) -> np.ndarray:
    """Sample from a von Mises distribution on the circle.

    The von Mises distribution is the circular analogue of the Gaussian and
    is parameterised by a mean angle μ and a concentration κ.  As κ → 0 the
    distribution approaches a uniform distribution on [−π, π); as κ → ∞ it
    concentrates around μ.

    Parameters
    ----------
    n_samples : int, optional (default=1000)
        Number of samples to draw.
    mu : float, optional (default=0.0)
        Mean direction in radians.
    kappa : float, optional (default=1.0)
        Concentration parameter κ ≥ 0.  Larger values produce more peaked
        distributions.
    random_state : int or None, optional (default=None)
        Seed for the random number generator.

    Returns
    -------
    samples : np.ndarray of shape (n_samples,)
        Independent draws from VonMises(mu, kappa) in radians,
        with values in [−π, π).

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.parametric import von_mises
    >>> x = von_mises(n_samples=500, mu=0.0, kappa=2.0, random_state=11)
    >>> x.shape
    (500,)
    >>> np.all((x >= -np.pi) & (x < np.pi))
    True
    """
    rng = np.random.default_rng(random_state)
    return rng.vonmises(mu=mu, kappa=kappa, size=n_samples)
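One distributional property worth checking empirically is the mean resultant length R = |E[e^{iθ}]|, which for VonMises(μ, κ) equals I₁(κ)/I₀(κ) (ratio of modified Bessel functions). A standalone sketch using NumPy and `scipy.special`:

```python
import numpy as np
from scipy.special import i0, i1

rng = np.random.default_rng(11)
kappa = 2.0
x = rng.vonmises(mu=0.0, kappa=kappa, size=100_000)

# Mean resultant length of the sample: |E[e^{iθ}]|
R_sample = np.abs(np.mean(np.exp(1j * x)))

# Theoretical value I₁(κ)/I₀(κ): → 0 as κ → 0 (uniform), → 1 as κ → ∞
R_theory = i1(kappa) / i0(kappa)

assert abs(R_sample - R_theory) < 0.02
```

R interpolates between the uniform (κ → 0, R → 0) and point-mass (κ → ∞, R → 1) limits described above.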

wishart(n_samples=1000, df=5, scale=None, d=3, random_state=None)

Sample from a Wishart distribution.

The Wishart distribution is a matrix-variate generalisation of the chi-squared distribution and is the conjugate prior of the inverse covariance (precision) matrix of a multivariate Gaussian.

Parameters

n_samples : int, optional (default=1000)
    Number of samples to draw.
df : int, optional (default=5)
    Degrees of freedom ν. Must satisfy ν ≥ d.
scale : np.ndarray of shape (d, d) or None, optional
    Positive-definite scale matrix V. Defaults to the d × d identity matrix I_d (using d to infer the dimension when None).
d : int, optional (default=3)
    Dimension of the matrices; used only when scale is None.
random_state : int or None, optional (default=None)
    Seed for the random number generator (converted to a 32-bit integer for scipy compatibility).

Returns

samples : np.ndarray of shape (n_samples, d, d)
    Independent positive semi-definite matrices drawn from W(df, scale).

Examples

>>> import numpy as np
>>> from rbig._src.parametric import wishart
>>> W = wishart(n_samples=50, df=5, d=3, random_state=10)
>>> W.shape
(50, 3, 3)
>>> # Each sample should be symmetric
>>> np.allclose(W[0], W[0].T)
True

Source code in rbig/_src/parametric.py
def wishart(
    n_samples: int = 1000,
    df: int = 5,
    scale: np.ndarray | None = None,
    d: int = 3,
    random_state: int | None = None,
) -> np.ndarray:
    """Sample from a Wishart distribution.

    The Wishart distribution is a matrix-variate generalisation of the
    chi-squared distribution and is the conjugate prior of the inverse
    covariance (precision) matrix of a multivariate Gaussian.

    Parameters
    ----------
    n_samples : int, optional (default=1000)
        Number of samples to draw.
    df : int, optional (default=5)
        Degrees of freedom ν.  Must satisfy ν ≥ d.
    scale : np.ndarray of shape (d, d) or None, optional
        Positive-definite scale matrix V.  Defaults to the d × d identity
        matrix I_d (using ``d`` to infer the dimension when ``None``).
    d : int, optional (default=3)
        Dimension of the matrices; used only when ``scale`` is ``None``.
    random_state : int or None, optional (default=None)
        Seed for the random number generator (converted to a 32-bit integer
        for scipy compatibility).

    Returns
    -------
    samples : np.ndarray of shape (n_samples, d, d)
        Independent positive semi-definite matrices drawn from W(df, scale).

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.parametric import wishart
    >>> W = wishart(n_samples=50, df=5, d=3, random_state=10)
    >>> W.shape
    (50, 3, 3)
    >>> # Each sample should be symmetric
    >>> np.allclose(W[0], W[0].T)
    True
    """
    from scipy.stats import wishart as wishart_dist

    if scale is None:
        scale = np.eye(d)  # default: identity
    rng = np.random.default_rng(random_state)
    seed = int(rng.integers(0, 2**31))  # scipy needs int seed
    return wishart_dist.rvs(df=df, scale=scale, size=n_samples, random_state=seed)
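A basic moment check: the Wishart mean is E[W] = ν · V, which can be verified directly against `scipy.stats.wishart` (the same sampler this function wraps):

```python
import numpy as np
from scipy.stats import wishart as wishart_dist

df, d = 5, 3
scale = np.eye(d)
W = wishart_dist.rvs(df=df, scale=scale, size=2000, random_state=0)

# E[W] = ν · V, so with V = I the sample mean should approach df · I
mean_W = W.mean(axis=0)
assert np.allclose(mean_W, df * scale, atol=0.5)

# Every draw is symmetric positive semi-definite
assert np.allclose(W[0], W[0].T)
assert np.all(np.linalg.eigvalsh(W[0]) >= -1e-10)
```

The per-entry variance is 2ν V_ii² on the diagonal, so the tolerance above shrinks as O(1/√n_samples).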

xr_apply_rbig(da, model, feature_dim='variable')

Apply a fitted RBIG model to an xarray DataArray.

Reshapes the DataArray into a 2-D matrix, runs the fitted model's transform, and returns the result with the original shape, dimension names, coordinates, and DataArray name preserved.

Parameters

da : xr.DataArray
    Input DataArray. If da.ndim > 2, all axes except the last are merged into the sample axis; otherwise the array is used as-is.
model : AnnealedRBIG
    A fitted RBIG model instance exposing a transform(X) method where X has shape (n_samples, n_features).
feature_dim : str, default "variable"
    Name of the feature dimension (informational; not used for reshaping, but available for extension to multi-feature DataArrays).

Returns

transformed : xr.DataArray Transformed DataArray with:

* the same shape as ``da``
* the same ``dims``, ``coords``, and ``name``
* values replaced by the RBIG-transformed values

Notes

The pipeline for a 3-D input (d0, d1, n_features) is:

.. math::

(d_0,\; d_1,\; n_f)
\xrightarrow{\text{reshape}}
(d_0 \cdot d_1,\; n_f)
\xrightarrow{\text{RBIG.transform}}
(d_0 \cdot d_1,\; n_f)
\xrightarrow{\text{reshape}}
(d_0,\; d_1,\; n_f)

For 2-D input the data is passed directly without reshaping.

Examples

>>> import numpy as np
>>> import xarray as xr
>>> rng = np.random.default_rng(0)
>>> da = xr.DataArray(
...     rng.standard_normal((20, 3)),
...     dims=["sample", "feature"],
...     coords={"sample": np.arange(20), "feature": ["r", "g", "b"]},
...     name="pixels",
... )
>>> # model = AnnealedRBIG(n_layers=5).fit(da.values)
>>> # transformed = xr_apply_rbig(da, model)
>>> # transformed.shape == da.shape

Source code in rbig/_src/xarray_image.py
def xr_apply_rbig(
    da: xr.DataArray,
    model: AnnealedRBIG,
    feature_dim: str = "variable",
) -> xr.DataArray:
    """Apply a fitted RBIG model to an xarray DataArray.

    Reshapes the DataArray into a 2-D matrix, runs the fitted ``model``'s
    ``transform``, and returns the result with the original shape, dimension
    names, coordinates, and DataArray name preserved.

    Parameters
    ----------
    da : xr.DataArray
        Input DataArray.  If ``da.ndim > 2``, all axes except the last are
        merged into the sample axis; otherwise the array is used as-is.
    model : AnnealedRBIG
        A fitted RBIG model instance exposing a ``transform(X)`` method
        where ``X`` has shape ``(n_samples, n_features)``.
    feature_dim : str, default ``"variable"``
        Name of the feature dimension (informational; not used for reshaping,
        but available for extension to multi-feature DataArrays).

    Returns
    -------
    transformed : xr.DataArray
        Transformed DataArray with:

        * the same shape as ``da``
        * the same ``dims``, ``coords``, and ``name``
        * values replaced by the RBIG-transformed values

    Notes
    -----
    The pipeline for a 3-D input ``(d0, d1, n_features)`` is:

    .. math::

        (d_0,\\; d_1,\\; n_f)
        \\xrightarrow{\\text{reshape}}
        (d_0 \\cdot d_1,\\; n_f)
        \\xrightarrow{\\text{RBIG.transform}}
        (d_0 \\cdot d_1,\\; n_f)
        \\xrightarrow{\\text{reshape}}
        (d_0,\\; d_1,\\; n_f)

    For 2-D input the data is passed directly without reshaping.

    Examples
    --------
    >>> import numpy as np
    >>> import xarray as xr
    >>> rng = np.random.default_rng(0)
    >>> da = xr.DataArray(
    ...     rng.standard_normal((20, 3)),
    ...     dims=["sample", "feature"],
    ...     coords={"sample": np.arange(20), "feature": ["r", "g", "b"]},
    ...     name="pixels",
    ... )
    >>> # model = AnnealedRBIG(n_layers=5).fit(da.values)
    >>> # transformed = xr_apply_rbig(da, model)
    >>> # transformed.shape == da.shape
    """
    import xarray as xr

    X = da.values
    original_shape = X.shape
    # Collapse leading spatial axes into the sample axis if ndim > 2
    if X.ndim > 2:
        X_2d = X.reshape(-1, X.shape[-1])  # (n_pixels, n_features)
    else:
        X_2d = X  # already (n_samples, n_features)

    Xt = model.transform(X_2d)  # (n_samples, n_features) → transformed
    Xt = Xt.reshape(original_shape)  # restore original spatial structure

    return xr.DataArray(Xt, coords=da.coords, dims=da.dims, name=da.name)
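The reshape round-trip performed on ``da.values`` can be sketched with plain numpy. The ``StandardizeModel`` stand-in below is hypothetical and not part of the library; any object exposing a ``transform(X)`` method on a ``(n_samples, n_features)`` array plays the same role as a fitted ``AnnealedRBIG``.

```python
import numpy as np

# Hypothetical stand-in for a fitted model: per-feature standardization.
class StandardizeModel:
    def transform(self, X):
        return (X - X.mean(axis=0)) / X.std(axis=0)

rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8, 3))    # (d0, d1, n_features)

# Same round-trip xr_apply_rbig performs on ``da.values``:
X2d = image.reshape(-1, image.shape[-1])  # (64, 3) sample matrix
Xt = StandardizeModel().transform(X2d)    # per-sample transform
restored = Xt.reshape(image.shape)        # back to (8, 8, 3)

assert restored.shape == image.shape
```

Because only the leading axes are collapsed and then restored, the spatial structure of the DataArray is untouched; only the values change.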

xr_image_to_matrix(da, spatial_dims=('x', 'y'))

Flatten an xarray image DataArray into a 2-D pixel matrix.

Reshapes a 2-D (or 2-D + feature) DataArray into a (n_pixels, n_features) matrix where n_pixels = ∏ sizes(spatial_dims). For a purely 2-D grayscale image, n_features = 1; for an image with a trailing feature axis the last dimension is preserved.

Parameters

da : xr.DataArray, shape (x, y) or (x, y, n_features)
    Source image DataArray. The first two axes are assumed to be the spatial dimensions; any additional trailing axis is treated as the feature axis.
spatial_dims : tuple of str, default ("x", "y")
    Names of the two spatial dimensions in da. Currently this argument is ignored by the implementation; the function assumes that the spatial dimensions are the first two axes of da.

Returns

matrix : np.ndarray, shape (n_pixels, n_features)
    Row-major flattening of the spatial axes:

.. math::

    (n_x,\; n_y,\; [n_f])
    \longrightarrow
    (n_x \cdot n_y,\; n_f)

where ``n_f = 1`` for grayscale and ``n_f = da.shape[-1]`` for
multi-feature images.

coords : dict
    Mapping from dimension name to coordinate array. All dimensions of da are included. Pass this dict to :func:`matrix_to_xr_image` to reconstruct the DataArray.

Notes

The operation is equivalent to:

.. math::

    \mathbf{M}[i, :] = \operatorname{vec}(\text{image pixel } i)

where pixels are indexed in C (row-major) order.

Examples

>>> import numpy as np
>>> import xarray as xr
>>> rng = np.random.default_rng(0)
>>> img = xr.DataArray(
...     rng.standard_normal((16, 16)),
...     dims=["x", "y"],
...     coords={"x": np.arange(16), "y": np.arange(16)},
... )
>>> matrix, coords = xr_image_to_matrix(img)
>>> matrix.shape  # 16*16 pixels, 1 grayscale feature
(256, 1)
>>> set(coords.keys()) == {"x", "y"}
True

Source code in rbig/_src/xarray_image.py
def xr_image_to_matrix(
    da: xr.DataArray,
    spatial_dims: tuple[str, str] = ("x", "y"),
) -> tuple[np.ndarray, dict]:
    """Flatten an xarray image DataArray into a 2-D pixel matrix.

    Reshapes a 2-D (or 2-D + feature) DataArray into a ``(n_pixels, n_features)``
    matrix where ``n_pixels = ∏ sizes(spatial_dims)``.  For a purely 2-D
    grayscale image, ``n_features = 1``; for an image with a trailing feature
    axis the last dimension is preserved.

    Parameters
    ----------
    da : xr.DataArray, shape ``(x, y)`` or ``(x, y, n_features)``
        Source image DataArray. The first two axes are assumed to be the
        spatial dimensions; any additional trailing axis is treated as the
        feature axis.
    spatial_dims : tuple of str, default ``("x", "y")``
        Names of the two spatial dimensions in ``da``. Currently this
        argument is ignored by the implementation; the function assumes that
        the spatial dimensions are the first two axes of ``da``.

    Returns
    -------
    matrix : np.ndarray, shape ``(n_pixels, n_features)``
        Row-major flattening of the spatial axes:

        .. math::

            (n_x,\\; n_y,\\; [n_f])
            \\longrightarrow
            (n_x \\cdot n_y,\\; n_f)

        where ``n_f = 1`` for grayscale and ``n_f = da.shape[-1]`` for
        multi-feature images.
    coords : dict
        Mapping from dimension name to coordinate array.  All dimensions of
        ``da`` are included.  Pass this dict to :func:`matrix_to_xr_image` to
        reconstruct the DataArray.

    Notes
    -----
    The operation is equivalent to:

    .. math::

        \\mathbf{M}[i, :] = \\operatorname{vec}(\\text{image pixel } i)

    where pixels are indexed in C (row-major) order.

    Examples
    --------
    >>> import numpy as np
    >>> import xarray as xr
    >>> rng = np.random.default_rng(0)
    >>> img = xr.DataArray(
    ...     rng.standard_normal((16, 16)),
    ...     dims=["x", "y"],
    ...     coords={"x": np.arange(16), "y": np.arange(16)},
    ... )
    >>> matrix, coords = xr_image_to_matrix(img)
    >>> matrix.shape  # 16*16 pixels, 1 grayscale feature
    (256, 1)
    >>> set(coords.keys()) == {"x", "y"}
    True
    """
    # Store per-dimension coordinates for exact reconstruction
    coords = {dim: da[dim].values for dim in da.dims}
    # Flatten spatial axes; keep trailing feature axis if present
    matrix = (
        da.values.reshape(-1, 1)  # (H*W, 1) for grayscale
        if da.ndim == 2
        else da.values.reshape(-1, da.shape[-1])  # (H*W, C) for multi-feature
    )
    return matrix, coords
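The two reshape rules in the body above, and the C-order pixel indexing described in the Notes, can be checked with plain numpy (no xarray needed). This is a minimal sketch of the same logic, not a call into the library:

```python
import numpy as np

rng = np.random.default_rng(0)
gray = rng.standard_normal((16, 16))     # (x, y) grayscale image
rgb = rng.standard_normal((16, 16, 3))   # (x, y, channel) image

# The reshape rules xr_image_to_matrix applies to ``da.values``:
m_gray = gray.reshape(-1, 1)             # (256, 1): one grayscale feature
m_rgb = rgb.reshape(-1, rgb.shape[-1])   # (256, 3): trailing axis preserved

# Row-major order: row i of the matrix is pixel i in C order,
# so pixel (x=1, y=1) lands at row 1*16 + 1 = 17.
assert np.array_equal(m_rgb[17], rgb[1, 1])
assert m_gray.shape == (256, 1) and m_rgb.shape == (256, 3)
```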

xr_rbig_fit_transform(ds, model, feature_vars=None, time_dim='time', spatial_dims=('lat', 'lon'))

Fit an RBIG model on xarray spatiotemporal data and return the transform.

Converts ds to a 2-D matrix, calls model.fit_transform, and reconstructs the result as an xarray object matching the original structure.

Parameters

ds : xr.Dataset or xr.DataArray
    Input spatiotemporal data.
model : AnnealedRBIG
    An unfitted RBIG model instance exposing fit_transform(X).
feature_vars : list of str or None, default None
    Variable names to use as features (Dataset only).
time_dim : str, default "time"
    Name of the time dimension.
spatial_dims : tuple of str, default ("lat", "lon")
    Names of the spatial dimensions.

Returns

matrix : np.ndarray, shape (n_time * n_space, n_features)
    The 2-D matrix used for fitting (useful for diagnostics).
transformed : xr.DataArray or xr.Dataset
    Gaussianised data reconstructed in the original xarray shape.

Notes

The pipeline is:

.. math::

    \text{ds}
    \xrightarrow{\text{xr\_st\_to\_matrix}}
    \mathbf{X}
    \xrightarrow{\text{RBIG fit\_transform}}
    \mathbf{X}_t
    \xrightarrow{\text{matrix\_to\_xr\_st}}
    \text{transformed}

Examples

>>> import numpy as np
>>> import xarray as xr
>>> rng = np.random.default_rng(0)
>>> da = xr.DataArray(
...     rng.standard_normal((20, 4, 5)),
...     dims=["time", "lat", "lon"],
... )
>>> # model = AnnealedRBIG(n_layers=5)
>>> # matrix, da_t = xr_rbig_fit_transform(da, model)

Source code in rbig/_src/xarray_st.py
def xr_rbig_fit_transform(
    ds: xr.Dataset | xr.DataArray,
    model: AnnealedRBIG,
    feature_vars: list[str] | None = None,
    time_dim: str = "time",
    spatial_dims: tuple[str, ...] = ("lat", "lon"),
) -> tuple[np.ndarray, xr.Dataset | xr.DataArray]:
    """Fit an RBIG model on xarray spatiotemporal data and return the transform.

    Converts ``ds`` to a 2-D matrix, calls ``model.fit_transform``, and
    reconstructs the result as an xarray object matching the original
    structure.

    Parameters
    ----------
    ds : xr.Dataset or xr.DataArray
        Input spatiotemporal data.
    model : AnnealedRBIG
        An unfitted RBIG model instance exposing ``fit_transform(X)``.
    feature_vars : list of str or None, default None
        Variable names to use as features (Dataset only).
    time_dim : str, default ``"time"``
        Name of the time dimension.
    spatial_dims : tuple of str, default ``("lat", "lon")``
        Names of the spatial dimensions.

    Returns
    -------
    matrix : np.ndarray, shape ``(n_time * n_space, n_features)``
        The 2-D matrix used for fitting (useful for diagnostics).
    transformed : xr.DataArray or xr.Dataset
        Gaussianised data reconstructed in the original xarray shape.

    Notes
    -----
    The pipeline is:

    .. math::

        \\text{ds}
        \\xrightarrow{\\text{xr\\_st\\_to\\_matrix}}
        \\mathbf{X}
        \\xrightarrow{\\text{RBIG fit\\_transform}}
        \\mathbf{X}_t
        \\xrightarrow{\\text{matrix\\_to\\_xr\\_st}}
        \\text{transformed}

    Examples
    --------
    >>> import numpy as np
    >>> import xarray as xr
    >>> rng = np.random.default_rng(0)
    >>> da = xr.DataArray(
    ...     rng.standard_normal((20, 4, 5)),
    ...     dims=["time", "lat", "lon"],
    ... )
    >>> # model = AnnealedRBIG(n_layers=5)
    >>> # matrix, da_t = xr_rbig_fit_transform(da, model)
    """
    matrix, meta = xr_st_to_matrix(ds, feature_vars, time_dim, spatial_dims)
    Xt = model.fit_transform(matrix)
    reconstructed = matrix_to_xr_st(Xt, meta, time_dim)
    return matrix, reconstructed
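The three pipeline arrows can be traced with plain numpy. ``ZScoreModel`` below is a hypothetical stand-in for an unfitted ``AnnealedRBIG`` (anything exposing ``fit_transform(X)`` works), and the final reshape stands in for ``matrix_to_xr_st``:

```python
import numpy as np

# Hypothetical stand-in for AnnealedRBIG: z-scores each feature column.
class ZScoreModel:
    def fit_transform(self, X):
        self.mean_, self.std_ = X.mean(axis=0), X.std(axis=0)
        return (X - self.mean_) / self.std_

rng = np.random.default_rng(0)
cube = rng.standard_normal((20, 4, 5))   # (time, lat, lon)

# Step 1: flatten to (n_time * n_space, n_features), as xr_st_to_matrix does
matrix = cube.reshape(20 * 4 * 5, -1)    # (400, 1)
# Step 2: fit the model and transform the sample matrix
Xt = ZScoreModel().fit_transform(matrix)
# Step 3: restore the spatiotemporal shape, as matrix_to_xr_st does
restored = Xt.reshape(cube.shape)

assert matrix.shape == (400, 1) and restored.shape == (20, 4, 5)
```

Returning both ``matrix`` and the reconstructed object mirrors the real function: the raw matrix is handy for diagnostics (e.g. inspecting per-layer total correlation) while the xarray object keeps the original structure.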

xr_st_to_matrix(ds, feature_vars=None, time_dim='time', spatial_dims=('lat', 'lon'))

Flatten an xarray spatiotemporal object into a 2-D sample matrix.

Each observation is formed by stacking all spatial grid points for a single time step, giving n_samples = n_time * n_space rows. For a :class:`xarray.Dataset`, each data variable becomes one feature column.

Parameters

ds : xr.Dataset or xr.DataArray
    Source spatiotemporal data. The object should have at least the time_dim and any of the spatial_dims as named dimensions.
feature_vars : list of str or None, default None
    Variable names to include as feature columns (Dataset only). If None, all data variables are used. Ignored for DataArray input.
time_dim : str, default "time"
    Name of the time dimension in ds.
spatial_dims : tuple of str, default ("lat", "lon")
    Names of the spatial dimensions. All present spatial dimensions are included in the flattening.

Returns

matrix : np.ndarray, shape (n_time * n_space, n_features)
    2-D array suitable for RBIG. For a DataArray, n_features = 1 (or the size of any non-spatial, non-time trailing axis). For a Dataset with n_vars variables, n_features = n_vars.
meta : dict
    Metadata required by :func:`matrix_to_xr_st` to reconstruct the original xarray object. Keys depend on the input type:

* ``"type"`` — ``"DataArray"`` or ``"Dataset"``
* ``"shape"`` — original numpy array shape
* ``"dims"`` — original dimension names
* ``"coords"`` — coordinate arrays (DataArray only)
* ``"feature_vars"`` — list of variable names (Dataset only)

Notes

The stacking operation is

.. math::

    (n_{\text{time}},\, n_{\text{lat}},\, n_{\text{lon}},\, \ldots)
    \longrightarrow
    (n_{\text{time}} \times n_{\text{space}},\; n_{\text{features}})

where :math:`n_{\text{space}} = \prod_d n_d` over all spatial dimensions d present in ds.

Examples

>>> import numpy as np
>>> import xarray as xr
>>> rng = np.random.default_rng(0)
>>> da = xr.DataArray(
...     rng.standard_normal((10, 4, 5)),
...     dims=["time", "lat", "lon"],
... )
>>> matrix, meta = xr_st_to_matrix(da)
>>> matrix.shape  # 10*20 time-space samples, 1 feature
(200, 1)
>>> meta["type"]
'DataArray'

Source code in rbig/_src/xarray_st.py
def xr_st_to_matrix(
    ds: xr.Dataset | xr.DataArray,
    feature_vars: list[str] | None = None,
    time_dim: str = "time",
    spatial_dims: tuple[str, ...] = ("lat", "lon"),
) -> tuple[np.ndarray, dict]:
    """Flatten an xarray spatiotemporal object into a 2-D sample matrix.

    Each observation is formed by stacking all spatial grid points for a
    single time step, giving ``n_samples = n_time * n_space`` rows.  For a
    :class:`xarray.Dataset`, each data variable becomes one feature column.

    Parameters
    ----------
    ds : xr.Dataset or xr.DataArray
        Source spatiotemporal data.  The object should have at least the
        ``time_dim`` and any of the ``spatial_dims`` as named dimensions.
    feature_vars : list of str or None, default None
        Variable names to include as feature columns (Dataset only).  If
        ``None``, all data variables are used.  Ignored for DataArray input.
    time_dim : str, default ``"time"``
        Name of the time dimension in ``ds``.
    spatial_dims : tuple of str, default ``("lat", "lon")``
        Names of the spatial dimensions.  All present spatial dimensions are
        included in the flattening.

    Returns
    -------
    matrix : np.ndarray, shape ``(n_time * n_space, n_features)``
        2-D array suitable for RBIG.  For a DataArray, ``n_features = 1``
        (or the size of any non-spatial, non-time trailing axis).  For a
        Dataset with ``n_vars`` variables, ``n_features = n_vars``.
    meta : dict
        Metadata required by :func:`matrix_to_xr_st` to reconstruct the
        original xarray object.  Keys depend on the input type:

        * ``"type"`` — ``"DataArray"`` or ``"Dataset"``
        * ``"shape"`` — original numpy array shape
        * ``"dims"`` — original dimension names
        * ``"coords"`` — coordinate arrays (DataArray only)
        * ``"feature_vars"`` — list of variable names (Dataset only)

    Notes
    -----
    The stacking operation is

    .. math::

        (n_{\\text{time}},\\, n_{\\text{lat}},\\, n_{\\text{lon}},\\, \\ldots)
        \\longrightarrow
        (n_{\\text{time}} \\times n_{\\text{space}},\\; n_{\\text{features}})

    where :math:`n_{\\text{space}} = \\prod_d n_d` over all spatial dimensions
    ``d`` present in ``ds``.

    Examples
    --------
    >>> import numpy as np
    >>> import xarray as xr
    >>> rng = np.random.default_rng(0)
    >>> da = xr.DataArray(
    ...     rng.standard_normal((10, 4, 5)),
    ...     dims=["time", "lat", "lon"],
    ... )
    >>> matrix, meta = xr_st_to_matrix(da)
    >>> matrix.shape  # 10*20 time-space samples, 1 feature
    (200, 1)
    >>> meta["type"]
    'DataArray'
    """
    import xarray as xr

    if isinstance(ds, xr.DataArray):
        arr = ds.values
        n_time = arr.shape[0] if time_dim in ds.dims else 1
        n_spatial = int(np.prod([ds.sizes[d] for d in spatial_dims if d in ds.dims]))
        # Collapse time and space into the sample axis
        matrix = (
            arr.reshape(n_time * n_spatial, -1) if arr.ndim > 1 else arr.reshape(-1, 1)
        )
        meta = {
            "type": "DataArray",
            "shape": arr.shape,
            "dims": ds.dims,
            "coords": {k: v.values for k, v in ds.coords.items()},
        }
    else:
        if feature_vars is None:
            feature_vars = list(ds.data_vars)
        arrays = [ds[v].values for v in feature_vars]
        n_time = ds.sizes.get(time_dim, 1)
        n_spatial = int(np.prod([ds.sizes[d] for d in spatial_dims if d in ds.sizes]))
        # Each variable becomes one column; rows index (time * space)
        matrix = np.stack([a.reshape(n_time * n_spatial) for a in arrays], axis=-1)
        meta = {
            "type": "Dataset",
            "feature_vars": feature_vars,
            "shape": arrays[0].shape,
            "dims": list(ds.dims),
        }

    return matrix, meta
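The Dataset branch's stacking rule can be reproduced with plain numpy. The ``temp``/``precip`` names below are illustrative placeholders for two data variables sharing a ``(time, lat, lon)`` grid:

```python
import numpy as np

rng = np.random.default_rng(0)
n_time, n_lat, n_lon = 10, 4, 5
# Two hypothetical Dataset variables on the same (time, lat, lon) grid
temp = rng.standard_normal((n_time, n_lat, n_lon))
precip = rng.standard_normal((n_time, n_lat, n_lon))

# Dataset branch: each variable flattens to one column of length time*space
n_space = n_lat * n_lon
matrix = np.stack(
    [temp.reshape(n_time * n_space), precip.reshape(n_time * n_space)],
    axis=-1,
)
assert matrix.shape == (n_time * n_space, 2)  # (200, 2)

# Row k of column 0 is temp at time k // n_space, flat space index k % n_space
assert matrix[0, 0] == temp[0, 0, 0]
assert matrix[n_space, 1] == precip[1, 0, 0]
```

Each row of ``matrix`` is one (time, space) sample with one feature per variable, which is exactly the ``(n_samples, n_features)`` layout RBIG expects.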