rbig.AnnealedRBIG

Bases: TransformerMixin, BaseEstimator

Rotation-Based Iterative Gaussianization (RBIG).

RBIG is a density estimation and data transformation method that iteratively Gaussianizes multivariate data by alternating between:

  1. Marginal Gaussianization: mapping each feature to a Gaussian using its empirical CDF and the probit transform.
  2. Rotation: applying an orthogonal matrix (PCA or ICA) to de-correlate the Gaussianized features.

The process repeats until the total correlation (TC) of the transformed data converges. After fitting, the model represents a normalizing flow whose density is given by the change-of-variables formula:

log p(x) = log p_Z(f(x)) + log|det J_f(x)|

where f is the composition of all fitted layers and p_Z is a standard multivariate Gaussian.
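As a toy illustration (a single linear map standing in for the RBIG layers, not the library's own code), the change-of-variables formula can be checked numerically against a closed-form Gaussian density:

```python
import numpy as np
from scipy import stats

# Toy one-layer "flow": f(x) = x @ W with an invertible matrix W.
# Change of variables: log p(x) = log p_Z(x @ W) + log|det W|.
rng = np.random.default_rng(0)
d = 3
W = rng.standard_normal((d, d))            # almost surely invertible
x = rng.standard_normal((5, d))

z = x @ W                                  # latent representation f(x)
log_pz = stats.norm.logpdf(z).sum(axis=1)  # standard Gaussian base density
_, log_abs_det = np.linalg.slogdet(W)      # log|det J_f|, constant for linear f
log_px = log_pz + log_abs_det

# Analytic cross-check: if z = x @ W is N(0, I), then x is Gaussian
# with covariance (W @ W.T)^{-1}.
cov = np.linalg.inv(W @ W.T)
ref = stats.multivariate_normal(mean=np.zeros(d), cov=cov).logpdf(x)
assert np.allclose(log_px, ref)
```

RBIG composes many nonlinear layers, so `log|det J_f(x)|` varies per sample, but the bookkeeping is the same.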

Parameters

n_layers : int, default=100
    Maximum number of RBIG layers to apply. Early stopping via patience may halt training before this limit.
rotation : str, default="pca"
    Rotation method: "pca" (PCA without whitening — orthogonal), "ica" (Independent Component Analysis), or "random" (Haar-distributed orthogonal rotation).
patience : int, default=10
    Number of consecutive layers showing a TC change smaller than tol before training stops early. (Formerly zero_tolerance, which is still accepted but deprecated.)
tol : float or "auto", default=1e-5
    Convergence threshold for the per-layer change in total correlation: |TC(k) − TC(k−1)| < tol. When set to "auto", the tolerance is chosen adaptively based on the number of training samples using an empirically calibrated lookup table.
random_state : int or None, default=None
    Seed for the random number generator used by stochastic components such as ICA or random rotations.
strategy : list or None, default=None
    Optional per-layer override list. Each entry may be a string (rotation name) or a (rotation_name, marginal_name) pair. Entries cycle if the list is shorter than n_layers.
verbose : bool or int, default=False
    Controls progress bar display. False (or 0) disables all progress bars. True (or 1) shows a progress bar for the fit loop. 2 additionally shows progress bars for transform, inverse_transform, score_samples, and jacobian.
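The cycling behaviour of strategy can be sketched with itertools (a hypothetical helper mirroring the documented behaviour, not the library's internals; the "kde" marginal name is illustrative):

```python
import itertools

# Sketch: a short `strategy` list repeats until n_layers entries exist.
def resolve_strategy(strategy, n_layers):
    return list(itertools.islice(itertools.cycle(strategy), n_layers))

per_layer = resolve_strategy(["pca", ("ica", "kde")], n_layers=5)
# per_layer: ['pca', ('ica', 'kde'), 'pca', ('ica', 'kde'), 'pca']
```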

Attributes

n_features_in_ : int
    Number of features seen during fit.
layers_ : list of RBIGLayer
    Fitted RBIG layers in application order.
tc_per_layer_ : list of float
    Total correlation of the data at each stage. Index 0 is the TC of the input data (before any layers); index k >= 1 is the TC after layer k.
log_det_train_ : np.ndarray of shape (n_samples,)
    Accumulated per-sample log-det-Jacobian over all layers, computed on the training data during fit.
X_transformed_ : np.ndarray of shape (n_samples, n_features)
    Training data after passing through all fitted layers.

Notes

Total correlation is defined as:

TC(X) = ∑ᵢ H(Xᵢ) − H(X)

where H(Xᵢ) is the marginal entropy of the i-th feature and H(X) is the joint entropy. For a fully Gaussianized, independent dataset, TC = 0.
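For intuition, TC has a closed form in the Gaussian case; the sketch below computes it analytically (this is not the library's estimator, which works from entropy estimates):

```python
import numpy as np

# For a multivariate Gaussian, sum_i H(X_i) - H(X) reduces to
# -0.5 * log det(correlation matrix): the (2*pi*e) terms in the
# marginal and joint entropies cancel.
def gaussian_total_correlation(cov):
    std = np.sqrt(np.diag(cov))
    corr = cov / np.outer(std, std)
    _, log_det = np.linalg.slogdet(corr)
    return -0.5 * log_det

rho = 0.8
cov = np.array([[1.0, rho], [rho, 1.0]])
tc = gaussian_total_correlation(cov)   # equals -0.5 * log(1 - rho**2)
```

For independent features the correlation matrix is the identity, its log-determinant is 0, and TC = 0 as stated above.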

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: From ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537–549. https://doi.org/10.1109/TNN.2011.2106511

Examples

>>> import numpy as np
>>> from rbig._src.model import AnnealedRBIG
>>> rng = np.random.default_rng(42)
>>> X = rng.standard_normal((300, 4))
>>> model = AnnealedRBIG(n_layers=20, rotation="pca")
>>> model.fit(X)
AnnealedRBIG(n_layers=20)
>>> Z = model.transform(X)
>>> Z.shape
(300, 4)
>>> model.score(X)  # mean log-likelihood in nats
-5.65...
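The sampling path (`sample` draws standard Gaussian latents and maps them back through the inverse flow) can be illustrated with a toy invertible linear map standing in for the fitted layers:

```python
import numpy as np

# Toy stand-in for the inverse flow: latents Z ~ N(0, I) mapped back
# through the inverse of a fixed linear "layer" f(x) = x @ W.
rng = np.random.default_rng(0)
W = np.array([[2.0, 0.0], [1.0, 1.0]])
Z = rng.standard_normal((20_000, 2))   # latent samples
X_new = Z @ np.linalg.inv(W)           # inverse transform back to data space

# The generated samples should have covariance (W @ W.T)^{-1}.
emp_cov = np.cov(X_new, rowvar=False)
target = np.linalg.inv(W @ W.T)
```

With the fitted model the same two steps are `Z = rng.standard_normal((n, model.n_features_in_))` followed by `model.inverse_transform(Z)`, which is exactly what `sample` does.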

Source code in rbig/_src/model.py
class AnnealedRBIG(TransformerMixin, BaseEstimator):
    """Rotation-Based Iterative Gaussianization (RBIG).

    RBIG is a density estimation and data transformation method that
    iteratively Gaussianizes multivariate data by alternating between:

    1. **Marginal Gaussianization**: mapping each feature to a Gaussian
       using its empirical CDF and the probit transform.
    2. **Rotation**: applying an orthogonal matrix (PCA or ICA) to
       de-correlate the Gaussianized features.

    The process repeats until the total correlation (TC) of the
    transformed data converges.  After fitting, the model represents a
    normalizing flow whose density is given by the change-of-variables
    formula:

        log p(x) = log p_Z(f(x)) + log|det J_f(x)|

    where ``f`` is the composition of all fitted layers and ``p_Z`` is a
    standard multivariate Gaussian.

    Parameters
    ----------
    n_layers : int, default=100
        Maximum number of RBIG layers to apply.  Early stopping via
        ``patience`` may halt training before this limit.
    rotation : str, default="pca"
        Rotation method: ``"pca"`` (PCA without whitening — orthogonal),
        ``"ica"`` (Independent Component Analysis), or ``"random"``
        (Haar-distributed orthogonal rotation).
    patience : int, default=10
        Number of consecutive layers showing a TC change smaller than
        ``tol`` before training stops early.  (Formerly ``zero_tolerance``,
        which is still accepted but deprecated.)
    tol : float or "auto", default=1e-5
        Convergence threshold for the per-layer change in total correlation:
        ``|TC(k) − TC(k−1)| < tol``.  When set to ``"auto"``, the tolerance
        is chosen adaptively based on the number of training samples using
        an empirically calibrated lookup table.
    random_state : int or None, default=None
        Seed for the random number generator used by stochastic components
        such as ICA or random rotations.
    strategy : list or None, default=None
        Optional per-layer override list.  Each entry may be a string
        (rotation name) or a ``(rotation_name, marginal_name)`` pair.
        Entries cycle if the list is shorter than ``n_layers``.
    verbose : bool or int, default=False
        Controls progress bar display.  ``False`` (or ``0``) disables all
        progress bars.  ``True`` (or ``1``) shows a progress bar for the
        ``fit`` loop.  ``2`` additionally shows progress bars for
        ``transform``, ``inverse_transform``, ``score_samples``, and
        ``jacobian``.

    Attributes
    ----------
    n_features_in_ : int
        Number of features seen during ``fit``.
    layers_ : list of RBIGLayer
        Fitted RBIG layers in application order.
    tc_per_layer_ : list of float
        Total correlation of the data at each stage.  Index 0 is the TC
        of the *input* data (before any layers); index *k* >= 1 is the TC
        after layer *k*.
    log_det_train_ : np.ndarray of shape (n_samples,)
        Accumulated per-sample log-det-Jacobian over all layers,
        computed on the training data during ``fit``.
    X_transformed_ : np.ndarray of shape (n_samples, n_features)
        Training data after passing through all fitted layers.

    Notes
    -----
    Total correlation is defined as:

        TC(X) = ∑ᵢ H(Xᵢ) − H(X)

    where H(Xᵢ) is the marginal entropy of the i-th feature and H(X) is
    the joint entropy.  For a fully Gaussianized, independent dataset,
    TC = 0.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    From ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537–549. https://doi.org/10.1109/TNN.2011.2106511

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.model import AnnealedRBIG
    >>> rng = np.random.default_rng(42)
    >>> X = rng.standard_normal((300, 4))
    >>> model = AnnealedRBIG(n_layers=20, rotation="pca")
    >>> model.fit(X)
    AnnealedRBIG(n_layers=20)
    >>> Z = model.transform(X)
    >>> Z.shape
    (300, 4)
    >>> model.score(X)  # mean log-likelihood in nats
    -5.65...
    """

    def __init__(
        self,
        n_layers: int = 100,
        rotation: str = "pca",
        patience: int = 10,
        tol: float | str = 1e-5,
        random_state: int | None = None,
        strategy: list | None = None,
        verbose: bool | int = False,
    ):
        self.n_layers = n_layers
        self.rotation = rotation
        self.patience = patience
        self.tol = tol
        self.random_state = random_state
        self.strategy = strategy
        self.verbose = verbose

    @property
    def zero_tolerance(self):
        """Deprecated alias for ``patience``."""
        import warnings

        warnings.warn(
            "zero_tolerance is deprecated, use patience instead",
            FutureWarning,
            stacklevel=2,
        )
        return self.patience

    @zero_tolerance.setter
    def zero_tolerance(self, value):
        import warnings

        warnings.warn(
            "zero_tolerance is deprecated, use patience instead",
            FutureWarning,
            stacklevel=2,
        )
        self.patience = value

    def fit(self, X: np.ndarray, y=None) -> AnnealedRBIG:
        """Fit the RBIG model by iteratively Gaussianizing X.

        At each layer k the algorithm:

        1. Builds a new :class:`RBIGLayer` with the configured marginal and
           rotation transforms.
        2. Fits the layer on the current working copy ``Xt``.
        3. Accumulates the per-sample log-det-Jacobian:
           ``log_det_train_ += log|det J_k(Xt)|``.
        4. Advances ``Xt`` through the layer: ``Xt = f_k(Xt)``.
        5. Measures residual total correlation: ``TC(Xt) = ∑ᵢ H(Xᵢ) − H(X)``.
        6. Stops early when TC has not changed by more than ``tol`` for
           ``patience`` consecutive layers.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.
        y : ignored
            Not used, present for sklearn pipeline compatibility.

        Returns
        -------
        self : AnnealedRBIG
            The fitted model.
        """
        X = validate_data(self, X)
        n_samples, n_features = X.shape
        if n_samples < 2:
            raise ValueError(
                f"RBIG requires at least 2 samples to estimate marginal CDFs, "
                f"got n_samples = {n_samples}."
            )
        self.n_features_in_ = n_features  # remember input dimensionality
        self.layers_: list[RBIGLayer] = []
        self.tc_per_layer_: list[float] = []

        # Validate and resolve tolerance
        if self.tol == "auto":
            tol = self._get_information_tolerance(n_samples)
        elif isinstance(self.tol, int | float):
            tol = float(self.tol)
        else:
            raise ValueError(f"tol must be a float or 'auto', got {self.tol!r}")
        self.tol_: float = tol  # store resolved tolerance for inspection

        Xt = X.copy()  # working copy; shape (n_samples, n_features)
        self.log_det_train_ = np.zeros(
            n_samples
        )  # accumulated log|det J|; shape (n_samples,)
        zero_count = 0  # consecutive non-improving layer counter

        # Record TC of the *input* data (before any layers).  This is
        # needed by total_correlation_reduction() which uses
        # tc_per_layer_[0] - tc_per_layer_[-1].
        self.tc_per_layer_.append(self._total_correlation(Xt))

        pbar = maybe_tqdm(
            range(self.n_layers),
            verbose=self.verbose,
            level=1,
            desc="Fitting RBIG",
            total=self.n_layers,
        )
        for i in pbar:
            # Build layer i with the appropriate marginal and rotation components
            layer = RBIGLayer(
                marginal=self._make_marginal(layer_index=i),
                rotation=self._make_rotation(layer_index=i),
            )
            layer.fit(Xt)
            # Accumulate log|det J_i(Xt)| before advancing Xt
            self.log_det_train_ += layer.log_det_jacobian(Xt)
            Xt = layer.transform(Xt)  # advance to next representation
            self.layers_.append(layer)

            # Measure residual total correlation: TC = sum_i H(Xi) - H(X)
            tc = self._total_correlation(Xt)
            self.tc_per_layer_.append(tc)

            if hasattr(pbar, "set_postfix"):
                postfix = {"TC": f"{tc:.4g}"}
                if i > 0:
                    delta = abs(self.tc_per_layer_[-2] - tc)
                    postfix["δTC"] = f"{delta:.2e}"
                pbar.set_postfix(postfix)

            if i > 0:
                # Check convergence: how much did TC improve this layer?
                delta = abs(self.tc_per_layer_[-2] - tc)
                if delta < tol:
                    zero_count += 1
                else:
                    zero_count = 0  # reset on any significant improvement

            # Stop early if TC has been flat for patience consecutive layers
            if zero_count >= self.patience:
                if hasattr(pbar, "total"):
                    pbar.total = i + 1
                    pbar.refresh()
                break

        # Store the fully transformed training data for efficient entropy estimation
        self.X_transformed_ = Xt  # shape (n_samples, n_features)
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Map X to the Gaussian latent space through all fitted layers.

        Applies each fitted :class:`RBIGLayer` in order:
        ``Z = fₖ(… f₂(f₁(x)) …)``.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data.

        Returns
        -------
        Z : np.ndarray of shape (n_samples, n_features)
            Data in the approximately Gaussian latent space.
        """
        check_is_fitted(self)
        Xt = validate_data(self, X, reset=False).copy()
        layers_iter = maybe_tqdm(
            self.layers_,
            verbose=self.verbose,
            level=2,
            desc="Transforming",
            total=len(self.layers_),
        )
        for layer in layers_iter:
            Xt = layer.transform(Xt)
        return Xt

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Map latent-space data back to the original input space.

        Applies layers in reverse order:
        ``x = f₁⁻¹(… fₖ₋₁⁻¹(fₖ⁻¹(z)) …)``.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in the latent (approximately Gaussian) space.

        Returns
        -------
        Xr : np.ndarray of shape (n_samples, n_features)
            Data recovered in the original input space.
        """
        check_is_fitted(self)
        Xt = validate_data(self, X, reset=False).copy()
        layers_iter = maybe_tqdm(
            reversed(self.layers_),
            verbose=self.verbose,
            level=2,
            desc="Inverse transforming",
            total=len(self.layers_),
        )
        for layer in layers_iter:
            Xt = layer.inverse_transform(Xt)
        return Xt

    def fit_transform(self, X: np.ndarray, y=None) -> np.ndarray:
        """Fit the model to X and return the latent-space representation.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.

        Returns
        -------
        Z : np.ndarray of shape (n_samples, n_features)
            Transformed data in the latent space.
        """
        return self.fit(X).transform(X)

    def score_samples(self, X: np.ndarray) -> np.ndarray:
        """Per-sample log-likelihood under the fitted density model.

        Uses the change-of-variables formula for normalizing flows:

            log p(x) = log p_Z(f(x)) + log|det J_f(x)|

        where ``p_Z = 𝒩(0, I)`` is the standard Gaussian base density,
        ``f`` is the composition of all fitted layers, and ``J_f(x)`` is
        the Jacobian of ``f`` at ``x``.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data points at which to evaluate the log-likelihood.

        Returns
        -------
        log_prob : np.ndarray of shape (n_samples,)
            Per-sample log-likelihood in nats.

        Notes
        -----
        The log-det-Jacobian is accumulated layer by layer to avoid
        recomputing intermediate representations:

            log|det J_f(x)| = ∑ₖ log|det J_fₖ(xₖ₋₁)|
        """
        check_is_fitted(self)
        X = validate_data(self, X, reset=False)
        Xt = X.copy()  # shape (n_samples, n_features)
        log_det_jac = np.zeros(X.shape[0])  # accumulator; shape (n_samples,)
        layers_iter = maybe_tqdm(
            self.layers_,
            verbose=self.verbose,
            level=2,
            desc="Scoring",
            total=len(self.layers_),
        )
        for layer in layers_iter:
            # Accumulate log|det Jₖ| before advancing through layer k
            log_det_jac += layer.log_det_jacobian(Xt)
            Xt = layer.transform(Xt)  # xₖ = fₖ(xₖ₋₁)
        # log p_Z(z) = sum_i log N(z_i; 0, 1); shape (n_samples,)
        log_pz = np.sum(stats.norm.logpdf(Xt), axis=1)
        # change-of-variables: log p(x) = log p_Z(f(x)) + log|det J_f(x)|
        return log_pz + log_det_jac

    def score(self, X: np.ndarray, y=None) -> float:
        """Mean log-likelihood of samples X under the fitted density.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data points to evaluate.
        y : ignored
            Not used, present for sklearn pipeline compatibility.

        Returns
        -------
        mean_log_prob : float
            Average per-sample log-likelihood in nats.
        """
        return float(np.mean(self.score_samples(X)))

    def entropy(self) -> float:
        """Differential entropy of the fitted distribution in nats.

        Estimated from the training data using:

            H(X) = −𝔼_X[log p(x)]

        The expectation is approximated by the sample mean over the training
        set.  The log-likelihoods are obtained via the efficient cached path
        :meth:`score_samples_raw_` which reuses pre-computed quantities from
        ``fit``.

        Returns
        -------
        h : float
            Estimated entropy in nats.  Unlike discrete entropy,
            differential entropy may be negative.

        Notes
        -----
        This is equivalent to ``-self.score(X_train)`` but avoids the cost
        of re-passing training data through all layers.
        """
        check_is_fitted(self)
        return float(-np.mean(self.score_samples_raw_()))

    def total_correlation_reduction(self) -> float:
        """Total correlation removed by RBIG (RBIG-way TC estimation).

        Uses the per-layer TC reduction approach from Laparra et al. (2011):

            TC(X) ≈ TC₀ − TC_K = ∑ₖ ΔTCₖ

        where TC₀ is the total correlation of the input and TC_K is the
        residual TC after K layers of Gaussianization.  When the model has
        converged, TC_K ≈ 0 and the result equals TC₀.

        Returns
        -------
        tc : float
            Estimated total correlation in nats.
        """
        check_is_fitted(self)
        return float(self.tc_per_layer_[0] - self.tc_per_layer_[-1])

    def entropy_reduction(self, X: np.ndarray) -> float:
        """Differential entropy via RBIG-way TC reduction.

        Uses the identity H(X) = Σ_d H(X_d) − TC(X) where marginal
        entropies are estimated via KDE and TC is obtained from the
        cumulative per-layer TC reduction (Laparra et al. 2011).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data whose entropy is estimated (typically the training data).

        Returns
        -------
        h : float
            Estimated differential entropy in nats.
        """
        check_is_fitted(self)
        from rbig._src.densities import marginal_entropy

        h_marginals = marginal_entropy(X)  # shape (n_features,)
        tc = self.total_correlation_reduction()
        return float(np.sum(h_marginals) - tc)

    def score_samples_raw_(self) -> np.ndarray:
        """Log-likelihood for the stored training data without recomputing layers.

        Reuses ``X_transformed_`` and ``log_det_train_`` cached during
        :meth:`fit`, so the cost is a single Gaussian log-pdf evaluation
        rather than a full forward pass through all layers.

        Returns
        -------
        log_prob : np.ndarray of shape (n_samples,)
            Per-sample log-likelihood of the training data in nats.
        """
        # log p_Z evaluated at the pre-computed transformed training data
        log_pz = np.sum(
            stats.norm.logpdf(self.X_transformed_), axis=1
        )  # shape (n_samples,)
        # add the accumulated log-det-Jacobian stored during fit
        return log_pz + self.log_det_train_

    def sample(self, n_samples: int, random_state: int | None = None) -> np.ndarray:
        """Generate samples from the learned distribution.

        Draws i.i.d. standard Gaussian samples in the latent space and maps
        them back to the data space via the inverse normalizing flow.

        Parameters
        ----------
        n_samples : int
            Number of samples to generate.
        random_state : int or None, optional
            Seed for the random number generator.  If ``None``, a random
            seed is used.

        Returns
        -------
        X_new : np.ndarray of shape (n_samples, n_features_in_)
            Samples in the original data space.
        """
        check_is_fitted(self)
        rng = np.random.default_rng(random_state)
        Z = rng.standard_normal((n_samples, self.n_features_in_))  # latent samples
        return self.inverse_transform(Z)

    def predict_proba(
        self,
        X: np.ndarray,
        domain: str = "input",
    ) -> np.ndarray | tuple[np.ndarray, np.ndarray]:
        """Return probability density estimates for X.

        Uses the change-of-variables formula via the full Jacobian matrix
        to compute the density in the requested domain.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data points to evaluate.
        domain : str, default="input"
            Which domain to return densities in:

            - ``"input"`` — density in the original data space:
              ``p(x) = p_Z(f(x)) · |det J_f(x)|``
            - ``"transform"`` — density in the Gaussian latent space:
              ``p_Z(f(x)) = ∏ᵢ φ(fᵢ(x))``
            - ``"both"`` — returns a tuple ``(p_input, p_transform)``

        Returns
        -------
        proba : np.ndarray of shape (n_samples,) or tuple
            Probability density estimates.  When ``domain="both"``, returns
            ``(p_input, p_transform)``.
        """
        check_is_fitted(self)
        X = validate_data(self, X, reset=False)
        jac, Xt = self.jacobian(X, return_X_transform=True)

        # Work in log-space for numerical stability
        log_p_transform = np.sum(stats.norm.logpdf(Xt), axis=1)

        if domain == "transform":
            p_transform = np.exp(log_p_transform)
            p_transform = np.where(np.isfinite(p_transform), p_transform, 0.0)
            return p_transform

        # Input-domain density via change of variables (log-space)
        _sign, log_abs_det = np.linalg.slogdet(jac)
        log_p_input = log_p_transform + log_abs_det
        p_input = np.exp(log_p_input)
        p_input = np.where(np.isfinite(p_input), p_input, 0.0)

        if domain == "input":
            return p_input
        if domain == "both":
            p_transform = np.exp(log_p_transform)
            p_transform = np.where(np.isfinite(p_transform), p_transform, 0.0)
            return p_input, p_transform
        raise ValueError(
            f"Unknown domain: {domain!r}. Use 'input', 'transform', or 'both'."
        )

    def jacobian(
        self,
        X: np.ndarray,
        return_X_transform: bool = False,
    ) -> np.ndarray | tuple[np.ndarray, np.ndarray]:
        """Compute the full Jacobian matrix of the RBIG transform.

        For each sample, returns the ``(n_features, n_features)`` Jacobian
        matrix ``df/dx`` of the composition of all fitted layers.  Uses the
        seed-dimension approach from the legacy implementation: for each input
        dimension ``idim``, a unit vector is propagated through the chain of
        per-feature marginal derivatives and rotation matrices.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data at which to evaluate the Jacobian.
        return_X_transform : bool, default False
            If True, also return the fully transformed data ``f(X)`` (computed
            as a side-effect of the Jacobian calculation).

        Returns
        -------
        jac : np.ndarray of shape (n_samples, n_features, n_features)
            Full Jacobian matrix per sample.  ``jac[n, i, j]`` is the partial
            derivative ``df_i/dx_j`` for the n-th sample.
        X_transformed : np.ndarray of shape (n_samples, n_features)
            Only returned when ``return_X_transform=True``.  The data after
            passing through all layers.
        """
        check_is_fitted(self)
        n_samples, n_features = X.shape

        # ── Forward pass: collect per-layer derivatives and rotation matrices ──
        derivs_per_layer = []  # each: (n_samples, n_features)
        rotmats_per_layer = []  # each: (n_features, n_features)

        Xt = X.copy()
        layers_iter = maybe_tqdm(
            self.layers_,
            verbose=self.verbose,
            level=2,
            desc="Jacobian (forward)",
            total=len(self.layers_),
        )
        for layer in layers_iter:
            if not hasattr(layer.marginal, "_per_feature_log_deriv"):
                raise NotImplementedError(
                    f"Jacobian computation requires a marginal with "
                    f"_per_feature_log_deriv(); "
                    f"{type(layer.marginal).__name__} does not support this."
                )
            # Per-feature marginal derivatives and transformed data in one pass
            log_d, Xt_marginal = layer.marginal._per_feature_log_deriv(
                Xt, return_transform=True
            )
            derivs_per_layer.append(np.exp(log_d))

            # Rotation matrix in row-vector convention: y = z @ R
            rot = self._extract_rotation_matrix(layer.rotation)
            rotmats_per_layer.append(rot)

            # Advance through rotation only
            Xt = layer.rotation.transform(Xt_marginal)

        # ── Seed-dimension loop: propagate unit vectors through the chain ──
        jac = np.zeros((n_samples, n_features, n_features))

        dims_iter = maybe_tqdm(
            range(n_features),
            verbose=self.verbose,
            level=2,
            desc="Jacobian (dims)",
            total=n_features,
        )
        for idim in dims_iter:
            # Initialize seed: unit vector in dimension idim
            XX = np.zeros((n_samples, n_features))
            XX[:, idim] = 1.0

            for derivs, R in zip(derivs_per_layer, rotmats_per_layer, strict=True):
                # Chain rule: XX_new = diag(derivs) @ XX @ R
                XX = (derivs * XX) @ R

            jac[:, :, idim] = XX

        if return_X_transform:
            return jac, Xt
        return jac

    @staticmethod
    def _extract_rotation_matrix(rotation) -> np.ndarray:
        """Extract the effective rotation matrix in row-vector convention.

        For PCA with whitening the effective matrix is
        ``components_.T / sqrt(explained_variance_)`` so that
        ``y = (x - mu) @ R``.

        Parameters
        ----------
        rotation : BaseTransform
            A fitted rotation object (PCARotation, ICARotation, etc.).

        Returns
        -------
        R : np.ndarray of shape (n_features, n_features)
            Rotation matrix such that ``y = x @ R`` (ignoring mean shift).
        """
        from rbig._src.rotation import ICARotation, PCARotation

        if isinstance(rotation, PCARotation):
            R = rotation.pca_.components_.T.copy()
            if rotation.whiten:
                R /= np.sqrt(rotation.pca_.explained_variance_)[np.newaxis, :]
            return R

        if isinstance(rotation, ICARotation):
            # ICA unmixing: W = components_, transform is x @ W.T
            if hasattr(rotation, "K_") and rotation.K_ is not None:
                # Picard path: y = (x @ K.T) @ W.T
                return rotation.K_.T @ rotation.W_.T
            return rotation.ica_.components_.T.copy()

        # Generic fallback: try to get rotation_matrix_ attribute.
        # These rotations apply X @ rotation_matrix_.T, so transpose
        # to match the y = x @ R convention used by PCA/ICA above.
        if hasattr(rotation, "rotation_matrix_"):
            return rotation.rotation_matrix_.T.copy()

        raise TypeError(
            f"Cannot extract rotation matrix from {type(rotation).__name__}. "
            f"Jacobian computation requires PCARotation, ICARotation, or an "
            f"object with a rotation_matrix_ attribute."
        )

    def _make_rotation(self, layer_index: int = 0):
        """Instantiate the rotation component for a given layer.

        Parameters
        ----------
        layer_index : int, default=0
            Index of the layer being constructed.  Used when cycling through
            a ``strategy`` list.

        Returns
        -------
        rotation : RotationBijector
            An unfitted rotation bijector instance.
        """
        if self.strategy is not None:
            # cycle through the strategy list to select rotation for this layer
            idx = layer_index % len(self.strategy)
            entry = self.strategy[idx]
            rotation_name = entry[0] if isinstance(entry, list | tuple) else entry
            return self._get_component(rotation_name, "rotation", layer_index)
        if self.rotation == "pca":
            return PCARotation(whiten=False)
        elif self.rotation == "ica":
            from rbig._src.rotation import ICARotation

            return ICARotation(random_state=self.random_state)
        elif self.rotation == "random":
            from rbig._src.rotation import RandomRotation

            seed = (self.random_state or 0) + layer_index
            return RandomRotation(random_state=seed)
        else:
            raise ValueError(
                f"Unknown rotation: {self.rotation}. Use 'pca', 'ica', or 'random'."
            )

    def _make_marginal(self, layer_index: int = 0):
        """Instantiate the marginal Gaussianization component for a given layer.

        Parameters
        ----------
        layer_index : int, default=0
            Index of the layer being constructed.  Used when cycling through
            a ``strategy`` list.

        Returns
        -------
        marginal : MarginalBijector
            An unfitted marginal Gaussianizer instance.
        """
        if self.strategy is not None:
            # cycle through the strategy list to select marginal for this layer
            idx = layer_index % len(self.strategy)
            entry = self.strategy[idx]
            marginal_name = (
                entry[1] if isinstance(entry, list | tuple) else "gaussianize"
            )
            return self._get_component(marginal_name, "marginal", layer_index)
        return MarginalGaussianize()

    def _get_component(self, name: str, kind: str, seed: int = 0):
        """Instantiate a rotation or marginal component by name.

        Parameters
        ----------
        name : str
            Component name, e.g. ``"pca"``, ``"ica"``, ``"gaussianize"``.
        kind : str
            Either ``"rotation"`` or ``"marginal"``.
        seed : int, default=0
            Layer index added to ``random_state`` to vary seeds per layer.

        Returns
        -------
        component : Bijector
            An unfitted bijector of the requested kind.
        """
        rng_seed = (self.random_state or 0) + seed
        if kind == "rotation":
            return self._make_rotation_by_name(name, rng_seed)
        return self._make_marginal_by_name(name, rng_seed)

    def _make_rotation_by_name(self, name: str, seed: int):
        """Instantiate a rotation bijector from its string name.

        Parameters
        ----------
        name : str
            One of ``"pca"``, ``"ica"``, or ``"random"``.
        seed : int
            Random seed for stochastic rotations.

        Returns
        -------
        rotation : RotationBijector
            The corresponding unfitted rotation instance.

        Raises
        ------
        ValueError
            If ``name`` is not a recognised rotation type.
        """
        if name == "pca":
            return PCARotation(whiten=False)
        if name == "ica":
            from rbig._src.rotation import ICARotation

            return ICARotation(random_state=seed)
        if name == "random":
            from rbig._src.rotation import RandomRotation

            return RandomRotation(random_state=seed)
        raise ValueError(f"Unknown rotation: {name!r}. Use 'pca', 'ica', or 'random'.")

    def _make_marginal_by_name(self, name: str, seed: int):
        """Instantiate a marginal Gaussianizer from its string name.

        Parameters
        ----------
        name : str
            One of ``"gaussianize"`` / ``"empirical"``, ``"quantile"``,
            ``"kde"``, ``"gmm"``, or ``"spline"``.
        seed : int
            Random seed for stochastic marginal estimators.

        Returns
        -------
        marginal : MarginalBijector
            The corresponding unfitted marginal Gaussianizer instance.

        Raises
        ------
        ValueError
            If ``name`` is not a recognised marginal type.
        """
        if name in ("gaussianize", "empirical", None):
            return MarginalGaussianize()
        if name == "quantile":
            from rbig._src.marginal import QuantileGaussianizer

            return QuantileGaussianizer(random_state=seed)
        if name == "kde":
            from rbig._src.marginal import KDEGaussianizer

            return KDEGaussianizer()
        if name == "gmm":
            from rbig._src.marginal import GMMGaussianizer

            return GMMGaussianizer(random_state=seed)
        if name == "spline":
            from rbig._src.marginal import SplineGaussianizer

            return SplineGaussianizer()
        raise ValueError(
            f"Unknown marginal: {name!r}. Use 'gaussianize', 'quantile', 'kde', 'gmm', or 'spline'."
        )

    @staticmethod
    def _get_information_tolerance(n_samples: int) -> float:
        """Compute a sample-size-adaptive convergence tolerance.

        Interpolates from an empirically calibrated lookup table mapping
        dataset size to an appropriate TC-change threshold.  Larger datasets
        can resolve finer changes in total correlation, so the tolerance
        decreases with sample count.

        Parameters
        ----------
        n_samples : int
            Number of training samples.

        Returns
        -------
        tol : float
            Adaptive tolerance value.
        """
        from scipy.interpolate import interp1d

        xxx = np.logspace(2, 8, 7)
        yyy = [0.1571, 0.0468, 0.0145, 0.0046, 0.0014, 0.0001, 0.00001]
        return float(interp1d(xxx, yyy, fill_value="extrapolate")(n_samples))

    @staticmethod
    def _calculate_negentropy(X: np.ndarray) -> np.ndarray:
        """Negentropy of each marginal: J(xᵢ) = H(Gauss) − H(xᵢ) ≥ 0.

        Negentropy measures how far a distribution is from Gaussian.  It is
        zero if and only if the distribution is Gaussian.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data whose per-feature negentropy is computed.

        Returns
        -------
        neg_entropy : np.ndarray of shape (n_features,)
            Non-negative negentropy for each feature dimension.

        Notes
        -----
        The negentropy is computed as:

            J(xᵢ) = H(𝒩(μᵢ, σᵢ²)) − H(xᵢ)

        where H(𝒩(μ, σ²)) = ½(1 + log(2πσ²)) is the Gaussian entropy with
        the same variance.
        """
        from rbig._src.densities import marginal_entropy

        # Gaussian entropy for a Gaussian with the same variance: 0.5*(1 + log(2*pi*var))
        gauss_h = 0.5 * (1 + np.log(2 * np.pi)) + 0.5 * np.log(np.var(X, axis=0))
        marg_h = marginal_entropy(X)  # empirical marginal entropy per feature
        return gauss_h - marg_h  # shape (n_features,); always >= 0

    @staticmethod
    def _total_correlation(X: np.ndarray) -> float:
        """Total correlation of X: TC(X) = ∑ᵢ H(Xᵢ) − H(X).

        Total correlation (also called multi-information) quantifies the
        statistical dependence among all features jointly.  It equals zero
        when all features are mutually independent.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data whose total correlation is measured.

        Returns
        -------
        tc : float
            Total correlation in nats.  Non-negative by the subadditivity of
            entropy.

        Notes
        -----
        The joint entropy ``H(X)`` is estimated under a Gaussian assumption
        (using the log-determinant of the covariance matrix), while the
        marginal entropies ``H(Xᵢ)`` are estimated empirically.
        """
        from rbig._src.densities import joint_entropy_gaussian, marginal_entropy

        marg_h = marginal_entropy(X)  # per-feature entropy; shape (n_features,)
        joint_h = joint_entropy_gaussian(X)  # Gaussian approximation to joint entropy
        return float(np.sum(marg_h) - joint_h)
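When both the joint and the marginals are taken to be Gaussian, the TC estimator above has a closed form that makes a handy sanity check: for a bivariate Gaussian with correlation ρ it reduces to −½·log(1 − ρ²). A standalone sketch in pure numpy (`gaussian_tc` is a hypothetical helper using analytic Gaussian entropies in place of the library's empirical `marginal_entropy`):

```python
import numpy as np

def gaussian_tc(cov):
    """Total correlation of a multivariate Gaussian in nats:
    sum of marginal entropies minus the joint entropy, in closed form."""
    var = np.diag(cov)
    # H(X_i) = 0.5 * log(2*pi*e*var_i) for each Gaussian marginal
    marg_h = 0.5 * np.log(2 * np.pi * np.e * var)
    # H(X) = 0.5 * log((2*pi*e)^d * det(cov)) for the joint Gaussian
    d = cov.shape[0]
    joint_h = 0.5 * (d * np.log(2 * np.pi * np.e) + np.linalg.slogdet(cov)[1])
    return float(np.sum(marg_h) - joint_h)

rho = 0.8
cov = np.array([[1.0, rho], [rho, 1.0]])
# For a bivariate Gaussian, TC = -0.5 * log(1 - rho^2)
print(gaussian_tc(cov), -0.5 * np.log(1 - rho**2))
```

For an identity covariance the TC is zero, as mutual independence requires.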

zero_tolerance property writable

Deprecated alias for patience.

fit(X, y=None)

Fit the RBIG model by iteratively Gaussianizing X.

At each layer k the algorithm:

  1. Builds a new RBIGLayer with the configured marginal and rotation transforms.
  2. Fits the layer on the current working copy Xt.
  3. Accumulates the per-sample log-det-Jacobian: log_det_train_ += log|det J_k(Xt)|.
  4. Advances Xt through the layer: Xt = f_k(Xt).
  5. Measures residual total correlation: TC(Xt) = ∑ᵢ H(Xᵢ) − H(X).
  6. Stops early when TC has not changed by more than tol for patience consecutive layers.
Parameters

X : np.ndarray of shape (n_samples, n_features)
    Training data.
y : ignored
    Not used, present for sklearn pipeline compatibility.

Returns

self : AnnealedRBIG The fitted model.

Source code in rbig/_src/model.py
def fit(self, X: np.ndarray, y=None) -> AnnealedRBIG:
    """Fit the RBIG model by iteratively Gaussianizing X.

    At each layer k the algorithm:

    1. Builds a new :class:`RBIGLayer` with the configured marginal and
       rotation transforms.
    2. Fits the layer on the current working copy ``Xt``.
    3. Accumulates the per-sample log-det-Jacobian:
       ``log_det_train_ += log|det J_k(Xt)|``.
    4. Advances ``Xt`` through the layer: ``Xt = f_k(Xt)``.
    5. Measures residual total correlation: ``TC(Xt) = ∑ᵢ H(Xᵢ) − H(X)``.
    6. Stops early when TC has not changed by more than ``tol`` for
       ``patience`` consecutive layers.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.
    y : ignored
        Not used, present for sklearn pipeline compatibility.

    Returns
    -------
    self : AnnealedRBIG
        The fitted model.
    """
    X = validate_data(self, X)
    n_samples, n_features = X.shape
    if n_samples < 2:
        raise ValueError(
            f"RBIG requires at least 2 samples to estimate marginal CDFs, "
            f"got n_samples = {n_samples}."
        )
    self.n_features_in_ = n_features  # remember input dimensionality
    self.layers_: list[RBIGLayer] = []
    self.tc_per_layer_: list[float] = []

    # Validate and resolve tolerance
    if self.tol == "auto":
        tol = self._get_information_tolerance(n_samples)
    elif isinstance(self.tol, int | float):
        tol = float(self.tol)
    else:
        raise ValueError(f"tol must be a float or 'auto', got {self.tol!r}")
    self.tol_: float = tol  # store resolved tolerance for inspection

    Xt = X.copy()  # working copy; shape (n_samples, n_features)
    self.log_det_train_ = np.zeros(
        n_samples
    )  # accumulated log|det J|; shape (n_samples,)
    zero_count = 0  # consecutive non-improving layer counter

    # Record TC of the *input* data (before any layers).  This is
    # needed by total_correlation_reduction() which uses
    # tc_per_layer_[0] - tc_per_layer_[-1].
    self.tc_per_layer_.append(self._total_correlation(Xt))

    pbar = maybe_tqdm(
        range(self.n_layers),
        verbose=self.verbose,
        level=1,
        desc="Fitting RBIG",
        total=self.n_layers,
    )
    for i in pbar:
        # Build layer i with the appropriate marginal and rotation components
        layer = RBIGLayer(
            marginal=self._make_marginal(layer_index=i),
            rotation=self._make_rotation(layer_index=i),
        )
        layer.fit(Xt)
        # Accumulate log|det J_i(Xt)| before advancing Xt
        self.log_det_train_ += layer.log_det_jacobian(Xt)
        Xt = layer.transform(Xt)  # advance to next representation
        self.layers_.append(layer)

        # Measure residual total correlation: TC = sum_i H(Xi) - H(X)
        tc = self._total_correlation(Xt)
        self.tc_per_layer_.append(tc)

        if hasattr(pbar, "set_postfix"):
            postfix = {"TC": f"{tc:.4g}"}
            if i > 0:
                delta = abs(self.tc_per_layer_[-2] - tc)
                postfix["δTC"] = f"{delta:.2e}"
            pbar.set_postfix(postfix)

        if i > 0:
            # Check convergence: how much did TC improve this layer?
            delta = abs(self.tc_per_layer_[-2] - tc)
            if delta < tol:
                zero_count += 1
            else:
                zero_count = 0  # reset on any significant improvement

        # Stop early if TC has been flat for patience consecutive layers
        if zero_count >= self.patience:
            if hasattr(pbar, "total"):
                pbar.total = i + 1
                pbar.refresh()
            break

    # Store the fully transformed training data for efficient entropy estimation
    self.X_transformed_ = Xt  # shape (n_samples, n_features)
    return self
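The early-stopping rule in the fit loop can be isolated: the counter increments whenever the TC change falls below `tol`, resets on any significant improvement, and fitting halts once it reaches `patience`. A minimal standalone sketch of that logic (`stopping_layer` and the `tc_trace` values are hypothetical, not produced by the library):

```python
def stopping_layer(tc_trace, tol=1e-5, patience=10):
    """Return the index in tc_trace at which training would stop,
    or None if the patience criterion is never met."""
    zero_count = 0  # consecutive non-improving layer counter
    for i in range(1, len(tc_trace)):
        delta = abs(tc_trace[i - 1] - tc_trace[i])
        if delta < tol:
            zero_count += 1
        else:
            zero_count = 0  # reset on any significant improvement
        if zero_count >= patience:
            return i
    return None

# TC drops quickly, then flattens; patience=3 triggers after three flat deltas
trace = [2.0, 1.0, 0.5, 0.5, 0.5, 0.5, 0.5]
print(stopping_layer(trace, tol=1e-5, patience=3))
```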

transform(X)

Map X to the Gaussian latent space through all fitted layers.

Applies each fitted RBIGLayer in order: Z = fₖ(… f₂(f₁(x)) …).

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data.

Returns

Z : np.ndarray of shape (n_samples, n_features) Data in the approximately Gaussian latent space.

Source code in rbig/_src/model.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Map X to the Gaussian latent space through all fitted layers.

    Applies each fitted :class:`RBIGLayer` in order:
    ``Z = fₖ(… f₂(f₁(x)) …)``.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data.

    Returns
    -------
    Z : np.ndarray of shape (n_samples, n_features)
        Data in the approximately Gaussian latent space.
    """
    check_is_fitted(self)
    Xt = validate_data(self, X, reset=False).copy()
    layers_iter = maybe_tqdm(
        self.layers_,
        verbose=self.verbose,
        level=2,
        desc="Transforming",
        total=len(self.layers_),
    )
    for layer in layers_iter:
        Xt = layer.transform(Xt)
    return Xt

inverse_transform(X)

Map latent-space data back to the original input space.

Applies layers in reverse order: x = f₁⁻¹(… fₖ₋₁⁻¹(fₖ⁻¹(z)) …).

Parameters

X : np.ndarray of shape (n_samples, n_features) Data in the latent (approximately Gaussian) space.

Returns

Xr : np.ndarray of shape (n_samples, n_features) Data recovered in the original input space.

Source code in rbig/_src/model.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Map latent-space data back to the original input space.

    Applies layers in reverse order:
    ``x = f₁⁻¹(… fₖ₋₁⁻¹(fₖ⁻¹(z)) …)``.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in the latent (approximately Gaussian) space.

    Returns
    -------
    Xr : np.ndarray of shape (n_samples, n_features)
        Data recovered in the original input space.
    """
    check_is_fitted(self)
    Xt = validate_data(self, X, reset=False).copy()
    layers_iter = maybe_tqdm(
        reversed(self.layers_),
        verbose=self.verbose,
        level=2,
        desc="Inverse transforming",
        total=len(self.layers_),
    )
    for layer in layers_iter:
        Xt = layer.inverse_transform(Xt)
    return Xt
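The transform/inverse pairing relies on applying the layers in reverse order with each layer inverted. A toy illustration of that composition structure (the `AffineLayer` stand-ins are hypothetical invertible maps, not RBIG layers):

```python
import numpy as np

class AffineLayer:
    """Toy invertible layer: y = (x - shift) @ R with orthogonal R."""
    def __init__(self, R, shift):
        self.R, self.shift = R, shift
    def transform(self, X):
        return (X - self.shift) @ self.R
    def inverse_transform(self, Y):
        # R is orthogonal, so its inverse is R.T
        return Y @ self.R.T + self.shift

rng = np.random.default_rng(0)
layers = []
for _ in range(5):
    Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))  # random orthogonal matrix
    layers.append(AffineLayer(Q, rng.standard_normal(3)))

X = rng.standard_normal((10, 3))
Z = X.copy()
for layer in layers:            # forward: Z = f_k(... f_2(f_1(X)) ...)
    Z = layer.transform(Z)
Xr = Z.copy()
for layer in reversed(layers):  # inverse: undo f_k first, f_1 last
    Xr = layer.inverse_transform(Xr)
print(np.allclose(X, Xr))
```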

fit_transform(X, y=None)

Fit the model to X and return the latent-space representation.

Parameters

X : np.ndarray of shape (n_samples, n_features) Training data.

Returns

Z : np.ndarray of shape (n_samples, n_features) Transformed data in the latent space.

Source code in rbig/_src/model.py
def fit_transform(self, X: np.ndarray, y=None) -> np.ndarray:
    """Fit the model to X and return the latent-space representation.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.

    Returns
    -------
    Z : np.ndarray of shape (n_samples, n_features)
        Transformed data in the latent space.
    """
    return self.fit(X).transform(X)

score_samples(X)

Per-sample log-likelihood under the fitted density model.

Uses the change-of-variables formula for normalizing flows:

log p(x) = log p_Z(f(x)) + log|det J_f(x)|

where p_Z = 𝒩(0, I) is the standard Gaussian base density, f is the composition of all fitted layers, and J_f(x) is the Jacobian of f at x.

Parameters

X : np.ndarray of shape (n_samples, n_features) Data points at which to evaluate the log-likelihood.

Returns

log_prob : np.ndarray of shape (n_samples,) Per-sample log-likelihood in nats.

Notes

The log-det-Jacobian is accumulated layer by layer to avoid recomputing intermediate representations:

log|det J_f(x)| = ∑ₖ log|det J_fₖ(xₖ₋₁)|
Source code in rbig/_src/model.py
def score_samples(self, X: np.ndarray) -> np.ndarray:
    """Per-sample log-likelihood under the fitted density model.

    Uses the change-of-variables formula for normalizing flows:

        log p(x) = log p_Z(f(x)) + log|det J_f(x)|

    where ``p_Z = 𝒩(0, I)`` is the standard Gaussian base density,
    ``f`` is the composition of all fitted layers, and ``J_f(x)`` is
    the Jacobian of ``f`` at ``x``.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data points at which to evaluate the log-likelihood.

    Returns
    -------
    log_prob : np.ndarray of shape (n_samples,)
        Per-sample log-likelihood in nats.

    Notes
    -----
    The log-det-Jacobian is accumulated layer by layer to avoid
    recomputing intermediate representations:

        log|det J_f(x)| = ∑ₖ log|det J_fₖ(xₖ₋₁)|
    """
    check_is_fitted(self)
    X = validate_data(self, X, reset=False)
    Xt = X.copy()  # shape (n_samples, n_features)
    log_det_jac = np.zeros(X.shape[0])  # accumulator; shape (n_samples,)
    layers_iter = maybe_tqdm(
        self.layers_,
        verbose=self.verbose,
        level=2,
        desc="Scoring",
        total=len(self.layers_),
    )
    for layer in layers_iter:
        # Accumulate log|det Jₖ| before advancing through layer k
        log_det_jac += layer.log_det_jacobian(Xt)
        Xt = layer.transform(Xt)  # xₖ = fₖ(xₖ₋₁)
    # log p_Z(z) = sum_i log N(z_i; 0, 1); shape (n_samples,)
    log_pz = np.sum(stats.norm.logpdf(Xt), axis=1)
    # change-of-variables: log p(x) = log p_Z(f(x)) + log|det J_f(x)|
    return log_pz + log_det_jac
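The change-of-variables identity used here can be sanity-checked with a single affine layer f(x) = (x − μ)/σ, whose log|det J| is the constant −log σ and which maps N(μ, σ²) exactly to N(0, 1). A standalone sketch (hypothetical μ, σ; not using the library):

```python
import numpy as np
from scipy import stats

mu, sigma = 2.0, 1.5
x = np.linspace(-3.0, 7.0, 11)

z = (x - mu) / sigma              # f(x): Gaussianize
log_det_jac = -np.log(sigma)      # log|df/dx|, constant for an affine map
log_px = stats.norm.logpdf(z) + log_det_jac

# Must match the N(mu, sigma^2) log-density evaluated directly
print(np.allclose(log_px, stats.norm.logpdf(x, loc=mu, scale=sigma)))
```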

score(X, y=None)

Mean log-likelihood of samples X under the fitted density.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Data points to evaluate.
y : ignored
    Not used, present for sklearn pipeline compatibility.

Returns

mean_log_prob : float Average per-sample log-likelihood in nats.

Source code in rbig/_src/model.py
def score(self, X: np.ndarray, y=None) -> float:
    """Mean log-likelihood of samples X under the fitted density.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data points to evaluate.
    y : ignored
        Not used, present for sklearn pipeline compatibility.

    Returns
    -------
    mean_log_prob : float
        Average per-sample log-likelihood in nats.
    """
    return float(np.mean(self.score_samples(X)))

entropy()

Differential entropy of the fitted distribution in nats.

Estimated from the training data using:

H(X) = −𝔼_X[log p(x)]

The expectation is approximated by the sample mean over the training set. The log-likelihoods are obtained via the efficient cached path score_samples_raw_, which reuses pre-computed quantities from fit.

Returns

h : float
    Estimated differential entropy in nats. Unlike discrete entropy, differential entropy can be negative.

Notes

This is equivalent to -self.score(X_train) but avoids the cost of re-passing training data through all layers.

Source code in rbig/_src/model.py
def entropy(self) -> float:
    """Differential entropy of the fitted distribution in nats.

    Estimated from the training data using:

        H(X) = −𝔼_X[log p(x)]

    The expectation is approximated by the sample mean over the training
    set.  The log-likelihoods are obtained via the efficient cached path
    :meth:`score_samples_raw_` which reuses pre-computed quantities from
    ``fit``.

    Returns
    -------
    h : float
        Estimated differential entropy in nats.  Unlike discrete
        entropy, differential entropy can be negative.

    Notes
    -----
    This is equivalent to ``-self.score(X_train)`` but avoids the cost
    of re-passing training data through all layers.
    """
    check_is_fitted(self)
    return float(-np.mean(self.score_samples_raw_()))
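The Monte Carlo estimator H(X) ≈ −mean(log p(xᵢ)) behind this method can be checked on a case with a known answer: for N(0, 1) the entropy is ½(1 + log 2π) ≈ 1.4189 nats. A standalone sketch, not using the library:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.standard_normal(200_000)

h_mc = -np.mean(stats.norm.logpdf(x))  # Monte Carlo: H = -E[log p(x)]
h_true = float(stats.norm().entropy()) # closed form: 0.5 * (1 + log(2*pi))
print(h_mc, h_true)
```

With 200,000 samples the two agree to roughly two decimal places.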

total_correlation_reduction()

Total correlation removed by RBIG (RBIG-way TC estimation).

Uses the per-layer TC reduction approach from Laparra et al. (2011):

TC(X) = TC₀ − TCₖ = Σₖ ΔTCₖ

where TC₀ is the total correlation of the input and TCₖ is the residual TC after k layers of Gaussianization. When the model has converged, TCₖ ≈ 0 and the result equals TC₀.

Returns

tc : float Estimated total correlation in nats.

Source code in rbig/_src/model.py
def total_correlation_reduction(self) -> float:
    """Total correlation removed by RBIG (RBIG-way TC estimation).

    Uses the per-layer TC reduction approach from Laparra et al. (2011):

        TC(X) = TC₀ − TCₖ = Σₖ ΔTCₖ

    where TC₀ is the total correlation of the input and TCₖ is the
    residual TC after k layers of Gaussianization.  When the model has
    converged, TCₖ ≈ 0 and the result equals TC₀.

    Returns
    -------
    tc : float
        Estimated total correlation in nats.
    """
    check_is_fitted(self)
    return float(self.tc_per_layer_[0] - self.tc_per_layer_[-1])

entropy_reduction(X)

Differential entropy via RBIG-way TC reduction.

Uses the identity H(X) = Σ_d H(X_d) − TC(X) where marginal entropies are estimated via KDE and TC is obtained from the cumulative per-layer TC reduction (Laparra et al. 2011).

Parameters

X : np.ndarray of shape (n_samples, n_features) Data whose entropy is estimated (typically the training data).

Returns

h : float Estimated differential entropy in nats.

Source code in rbig/_src/model.py
def entropy_reduction(self, X: np.ndarray) -> float:
    """Differential entropy via RBIG-way TC reduction.

    Uses the identity H(X) = Σ_d H(X_d) − TC(X) where marginal
    entropies are estimated via KDE and TC is obtained from the
    cumulative per-layer TC reduction (Laparra et al. 2011).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data whose entropy is estimated (typically the training data).

    Returns
    -------
    h : float
        Estimated differential entropy in nats.
    """
    check_is_fitted(self)
    from rbig._src.densities import marginal_entropy

    h_marginals = marginal_entropy(X)  # shape (n_features,)
    tc = self.total_correlation_reduction()
    return float(np.sum(h_marginals) - tc)
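The identity H(X) = Σ_d H(X_d) − TC(X) used here is exact, and for a correlated Gaussian every term has a closed form, so the identity can be verified analytically (a sketch with unit-variance marginals; hypothetical ρ):

```python
import numpy as np

rho = 0.6
cov = np.array([[1.0, rho], [rho, 1.0]])

# Marginal Gaussian entropies and the analytic TC of a bivariate Gaussian
h_marg = 0.5 * np.log(2 * np.pi * np.e * np.diag(cov))
tc = -0.5 * np.log(1 - rho**2)

# Joint entropy via the identity vs. its direct closed form
h_joint = np.sum(h_marg) - tc
h_direct = 0.5 * (2 * np.log(2 * np.pi * np.e) + np.linalg.slogdet(cov)[1])
print(np.isclose(h_joint, h_direct))
```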

score_samples_raw_()

Log-likelihood for the stored training data without recomputing layers.

Reuses X_transformed_ and log_det_train_ cached during fit, so the cost is a single Gaussian log-pdf evaluation rather than a full forward pass through all layers.

Returns

log_prob : np.ndarray of shape (n_samples,) Per-sample log-likelihood of the training data in nats.

Source code in rbig/_src/model.py
def score_samples_raw_(self) -> np.ndarray:
    """Log-likelihood for the stored training data without recomputing layers.

    Reuses ``X_transformed_`` and ``log_det_train_`` cached during
    :meth:`fit`, so the cost is a single Gaussian log-pdf evaluation
    rather than a full forward pass through all layers.

    Returns
    -------
    log_prob : np.ndarray of shape (n_samples,)
        Per-sample log-likelihood of the training data in nats.
    """
    # log p_Z evaluated at the pre-computed transformed training data
    log_pz = np.sum(
        stats.norm.logpdf(self.X_transformed_), axis=1
    )  # shape (n_samples,)
    # add the accumulated log-det-Jacobian stored during fit
    return log_pz + self.log_det_train_

sample(n_samples, random_state=None)

Generate samples from the learned distribution.

Draws i.i.d. standard Gaussian samples in the latent space and maps them back to the data space via the inverse normalizing flow.

Parameters

n_samples : int
    Number of samples to generate.
random_state : int or None, optional
    Seed for the random number generator. If None, a random seed is used.

Returns

X_new : np.ndarray of shape (n_samples, n_features_in_) Samples in the original data space.

Source code in rbig/_src/model.py
def sample(self, n_samples: int, random_state: int | None = None) -> np.ndarray:
    """Generate samples from the learned distribution.

    Draws i.i.d. standard Gaussian samples in the latent space and maps
    them back to the data space via the inverse normalizing flow.

    Parameters
    ----------
    n_samples : int
        Number of samples to generate.
    random_state : int or None, optional
        Seed for the random number generator.  If ``None``, a random
        seed is used.

    Returns
    -------
    X_new : np.ndarray of shape (n_samples, n_features_in_)
        Samples in the original data space.
    """
    check_is_fitted(self)
    rng = np.random.default_rng(random_state)
    Z = rng.standard_normal((n_samples, self.n_features_in_))  # latent samples
    return self.inverse_transform(Z)
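Sampling draws z ~ N(0, I) in the latent space and pushes it through the inverse flow. With the single affine layer f(x) = (x − μ)/σ as a toy flow, the inverse is x = μ + σz, so samples land in N(μ, σ²) (standalone sketch, hypothetical parameters):

```python
import numpy as np

mu, sigma = 2.0, 1.5
rng = np.random.default_rng(42)

Z = rng.standard_normal(100_000)  # latent samples from the base density
X_new = mu + sigma * Z            # inverse of f(x) = (x - mu) / sigma

print(X_new.mean(), X_new.std())  # close to mu and sigma respectively
```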

predict_proba(X, domain='input')

Return probability density estimates for X.

Uses the change-of-variables formula via the full Jacobian matrix to compute the density in the requested domain.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Data points to evaluate.
domain : str, default="input"
    Which domain to return densities in:

- ``"input"`` — density in the original data space:
  ``p(x) = p_Z(f(x)) · |det J_f(x)|``
- ``"transform"`` — density in the Gaussian latent space:
  ``p_Z(f(x)) = ∏ᵢ φ(fᵢ(x))``
- ``"both"`` — returns a tuple ``(p_input, p_transform)``
Returns

proba : np.ndarray of shape (n_samples,) or tuple Probability density estimates. When domain="both", returns (p_input, p_transform).

Source code in rbig/_src/model.py
def predict_proba(
    self,
    X: np.ndarray,
    domain: str = "input",
) -> np.ndarray | tuple[np.ndarray, np.ndarray]:
    """Return probability density estimates for X.

    Uses the change-of-variables formula via the full Jacobian matrix
    to compute the density in the requested domain.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data points to evaluate.
    domain : str, default="input"
        Which domain to return densities in:

        - ``"input"`` — density in the original data space:
          ``p(x) = p_Z(f(x)) · |det J_f(x)|``
        - ``"transform"`` — density in the Gaussian latent space:
          ``p_Z(f(x)) = ∏ᵢ φ(fᵢ(x))``
        - ``"both"`` — returns a tuple ``(p_input, p_transform)``

    Returns
    -------
    proba : np.ndarray of shape (n_samples,) or tuple
        Probability density estimates.  When ``domain="both"``, returns
        ``(p_input, p_transform)``.
    """
    check_is_fitted(self)
    X = validate_data(self, X, reset=False)
    jac, Xt = self.jacobian(X, return_X_transform=True)

    # Work in log-space for numerical stability
    log_p_transform = np.sum(stats.norm.logpdf(Xt), axis=1)

    if domain == "transform":
        p_transform = np.exp(log_p_transform)
        p_transform = np.where(np.isfinite(p_transform), p_transform, 0.0)
        return p_transform

    # Input-domain density via change of variables (log-space)
    _sign, log_abs_det = np.linalg.slogdet(jac)
    log_p_input = log_p_transform + log_abs_det
    p_input = np.exp(log_p_input)
    p_input = np.where(np.isfinite(p_input), p_input, 0.0)

    if domain == "input":
        return p_input
    if domain == "both":
        p_transform = np.exp(log_p_transform)
        p_transform = np.where(np.isfinite(p_transform), p_transform, 0.0)
        return p_input, p_transform
    raise ValueError(
        f"Unknown domain: {domain!r}. Use 'input', 'transform', or 'both'."
    )
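For a purely linear flow z = x @ A, the Jacobian is the same for every sample and the input-domain density p(x) = p_Z(x @ A) · |det A| is exactly a multivariate Gaussian with covariance (A Aᵀ)⁻¹. A quick standalone check of the slogdet path (hypothetical A; jac follows the ``jac[n, i, j] = df_i/dx_j`` layout, so each slice is Aᵀ):

```python
import numpy as np
from scipy import stats

A = np.array([[2.0, 0.5], [0.0, 1.5]])   # hypothetical linear flow
X = np.array([[0.3, -1.2], [1.0, 0.7]])  # two query points

Z = X @ A                                      # forward pass
log_pz = np.sum(stats.norm.logpdf(Z), axis=1)  # factorized base density

# Per-sample Jacobian stack; only |det| enters the density
jac = np.broadcast_to(A.T, (X.shape[0], 2, 2))
_sign, log_abs_det = np.linalg.slogdet(jac)
p_input = np.exp(log_pz + log_abs_det)

# Equivalent closed form: x ~ N(0, (A @ A.T)^{-1})
mvn = stats.multivariate_normal(mean=np.zeros(2), cov=np.linalg.inv(A @ A.T))
print(np.allclose(p_input, mvn.pdf(X)))
```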

jacobian(X, return_X_transform=False)

Compute the full Jacobian matrix of the RBIG transform.

For each sample, returns the (n_features, n_features) Jacobian matrix df/dx of the composition of all fitted layers. Uses the seed-dimension approach from the legacy implementation: for each input dimension idim, a unit vector is propagated through the chain of per-feature marginal derivatives and rotation matrices.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input data at which to evaluate the Jacobian.
return_X_transform : bool, default False
    If True, also return the fully transformed data f(X) (computed as a side-effect of the Jacobian calculation).

Returns

jac : np.ndarray of shape (n_samples, n_features, n_features)
    Full Jacobian matrix per sample. jac[n, i, j] is the partial derivative df_i/dx_j for the n-th sample.
X_transformed : np.ndarray of shape (n_samples, n_features)
    Only returned when return_X_transform=True. The data after passing through all layers.

Source code in rbig/_src/model.py
def jacobian(
    self,
    X: np.ndarray,
    return_X_transform: bool = False,
) -> np.ndarray | tuple[np.ndarray, np.ndarray]:
    """Compute the full Jacobian matrix of the RBIG transform.

    For each sample, returns the ``(n_features, n_features)`` Jacobian
    matrix ``df/dx`` of the composition of all fitted layers.  Uses the
    seed-dimension approach from the legacy implementation: for each input
    dimension ``idim``, a unit vector is propagated through the chain of
    per-feature marginal derivatives and rotation matrices.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data at which to evaluate the Jacobian.
    return_X_transform : bool, default False
        If True, also return the fully transformed data ``f(X)`` (computed
        as a side-effect of the Jacobian calculation).

    Returns
    -------
    jac : np.ndarray of shape (n_samples, n_features, n_features)
        Full Jacobian matrix per sample.  ``jac[n, i, j]`` is the partial
        derivative ``df_i/dx_j`` for the n-th sample.
    X_transformed : np.ndarray of shape (n_samples, n_features)
        Only returned when ``return_X_transform=True``.  The data after
        passing through all layers.
    """
    check_is_fitted(self)
    n_samples, n_features = X.shape

    # ── Forward pass: collect per-layer derivatives and rotation matrices ──
    derivs_per_layer = []  # each: (n_samples, n_features)
    rotmats_per_layer = []  # each: (n_features, n_features)

    Xt = X.copy()
    layers_iter = maybe_tqdm(
        self.layers_,
        verbose=self.verbose,
        level=2,
        desc="Jacobian (forward)",
        total=len(self.layers_),
    )
    for layer in layers_iter:
        if not hasattr(layer.marginal, "_per_feature_log_deriv"):
            raise NotImplementedError(
                f"Jacobian computation requires a marginal with "
                f"_per_feature_log_deriv(); "
                f"{type(layer.marginal).__name__} does not support this."
            )
        # Per-feature marginal derivatives and transformed data in one pass
        log_d, Xt_marginal = layer.marginal._per_feature_log_deriv(
            Xt, return_transform=True
        )
        derivs_per_layer.append(np.exp(log_d))

        # Rotation matrix in row-vector convention: y = z @ R
        rot = self._extract_rotation_matrix(layer.rotation)
        rotmats_per_layer.append(rot)

        # Advance through rotation only
        Xt = layer.rotation.transform(Xt_marginal)

    # ── Seed-dimension loop: propagate unit vectors through the chain ──
    jac = np.zeros((n_samples, n_features, n_features))

    dims_iter = maybe_tqdm(
        range(n_features),
        verbose=self.verbose,
        level=2,
        desc="Jacobian (dims)",
        total=n_features,
    )
    for idim in dims_iter:
        # Initialize seed: unit vector in dimension idim
        XX = np.zeros((n_samples, n_features))
        XX[:, idim] = 1.0

        for derivs, R in zip(derivs_per_layer, rotmats_per_layer, strict=True):
            # Chain rule: XX_new = diag(derivs) @ XX @ R
            XX = (derivs * XX) @ R

        jac[:, :, idim] = XX

    if return_X_transform:
        return jac, Xt
    return jac
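The seed-dimension chain rule above can be illustrated on a hypothetical toy layer (an elementwise tanh followed by a random rotation, standing in for the fitted marginal and rotation steps): propagating a unit vector through `(derivs * e_j) @ R` builds the Jacobian one column at a time, and the result can be cross-checked with finite differences.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features = 3

# Toy "layer": elementwise tanh then an orthogonal rotation (row-vector
# convention y = z @ R, as in the source above).
R, _ = np.linalg.qr(rng.standard_normal((n_features, n_features)))

def f(x):
    return np.tanh(x) @ R

x = rng.standard_normal(n_features)
d = 1.0 - np.tanh(x) ** 2            # per-feature marginal derivative

# Seed-dimension loop: jac[:, j] = d f / d x_j
jac = np.zeros((n_features, n_features))
for j in range(n_features):
    e = np.zeros(n_features)
    e[j] = 1.0
    jac[:, j] = (d * e) @ R          # chain rule: diag(d) then rotation

# Cross-check against central finite differences
eps = 1e-6
fd = np.zeros_like(jac)
for j in range(n_features):
    e = np.zeros(n_features)
    e[j] = eps
    fd[:, j] = (f(x + e) - f(x - e)) / (2 * eps)

assert np.allclose(jac, fd, atol=1e-6)
```

This is a sketch of the mechanism only; the fitted model chains one such `(derivs, R)` pair per layer, exactly as in the loop over `derivs_per_layer` and `rotmats_per_layer`.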

rbig.RBIGLayer dataclass

Single RBIG layer: marginal Gaussianization followed by rotation.

One iteration of the RBIG algorithm applies two successive bijections:

  1. Marginal Gaussianization – maps each feature independently to a standard Gaussian via its empirical CDF and the probit function:

    z = Φ⁻¹(F̂ₙ(x))

where F̂ₙ is the estimated marginal CDF and Φ⁻¹ is the standard normal quantile function.

  2. Rotation/whitening – applies a linear transform R (default: PCA without whitening, i.e. an orthogonal rotation) to de-correlate the Gaussianized features:

    y = R · z

The full single-layer transform is therefore:

y = R · Φ⁻¹(F̂ₙ(x))

Parameters

marginal : MarginalGaussianize, optional
    Marginal Gaussianization transform (fitted per feature). Defaults to a new MarginalGaussianize instance.
rotation : PCARotation, optional
    Rotation transform applied after marginal Gaussianization. Defaults to a new PCARotation instance.

Attributes

marginal : MarginalGaussianize
    Fitted marginal transform.
rotation : PCARotation
    Fitted rotation transform.

Notes

The layer log-det-Jacobian is the sum of the marginal and rotation contributions:

log|det J_layer(x)| = log|det J_marginal(x)| + log|det J_rotation|
                     = ∑ᵢ log|Φ⁻¹′(F̂ₙ(xᵢ)) · f̂ₙ(xᵢ)| + log|det J_rotation|

The rotation term log|det J_rotation| is zero when the rotation is strictly orthogonal (|det R| = 1). The default PCARotation(whiten=False) is orthogonal, so its log-det is always zero. PCARotation(whiten=True) includes per-component scaling by 1/√λ and is not orthogonal (non-zero log-det). In practice the two typically yield nearly identical results, because marginal Gaussianization already produces near-unit-variance features.
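The log-det distinction can be checked numerically with a hypothetical 3-feature covariance (not a fitted PCARotation): the eigenvector matrix alone is orthogonal with log-det zero, while appending the 1/√λ scaling gives log-det −½ Σᵢ log λᵢ.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((500, 3))
eigvals, V = np.linalg.eigh(np.cov(A, rowvar=False))

# Orthogonal rotation (whiten=False analogue): |det V| = 1, log-det 0
_, ld_orth = np.linalg.slogdet(V)
assert np.isclose(ld_orth, 0.0, atol=1e-10)

# Whitening analogue: R = V @ diag(1/sqrt(eigvals)), generally non-zero
R = V @ np.diag(1.0 / np.sqrt(eigvals))
_, ld_whiten = np.linalg.slogdet(R)
assert np.isclose(ld_whiten, -0.5 * np.log(eigvals).sum())
```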

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: From ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537–549. https://doi.org/10.1109/TNN.2011.2106511

Examples

>>> import numpy as np
>>> from rbig._src.model import RBIGLayer
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((500, 3))
>>> layer = RBIGLayer()
>>> layer.fit(X)
RBIGLayer(...)
>>> Z = layer.transform(X)
>>> Z.shape
(500, 3)

Source code in rbig/_src/model.py
@dataclass
class RBIGLayer:
    """Single RBIG layer: marginal Gaussianization followed by rotation.

    One iteration of the RBIG algorithm applies two successive bijections:

    1. **Marginal Gaussianization** – maps each feature independently to a
       standard Gaussian via its empirical CDF and the probit function:

           z = Φ⁻¹(F̂ₙ(x))

       where F̂ₙ is the estimated marginal CDF and Φ⁻¹ is the standard
       normal quantile function.

    2. **Rotation/whitening** – applies a linear transform R (default: PCA
       without whitening, i.e. an orthogonal rotation) to de-correlate the
       Gaussianized features:

           y = R · z

    The full single-layer transform is therefore:

        y = R · Φ⁻¹(F̂ₙ(x))

    Parameters
    ----------
    marginal : MarginalGaussianize, optional
        Marginal Gaussianization transform (fitted per feature).
        Defaults to a new ``MarginalGaussianize`` instance.
    rotation : PCARotation, optional
        Rotation transform applied after marginal Gaussianization.
        Defaults to a new ``PCARotation`` instance.

    Attributes
    ----------
    marginal : MarginalGaussianize
        Fitted marginal transform.
    rotation : PCARotation
        Fitted rotation transform.

    Notes
    -----
    The layer log-det-Jacobian is the sum of the marginal and rotation
    contributions:

        log|det J_layer(x)| = log|det J_marginal(x)| + log|det J_rotation|
                             = ∑ᵢ log|Φ⁻¹′(F̂ₙ(xᵢ)) · f̂ₙ(xᵢ)| + log|det J_rotation|

    The rotation term ``log|det J_rotation|`` is zero when the rotation is
    strictly orthogonal (``|det R| = 1``).  The default
    ``PCARotation(whiten=False)`` is orthogonal, so its log-det is always
    zero.  ``PCARotation(whiten=True)`` includes per-component scaling by
    ``1/√λ`` and is *not* orthogonal (non-zero log-det).  In practice the
    two typically yield nearly identical results, because marginal
    Gaussianization already produces near-unit-variance features.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    From ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537–549. https://doi.org/10.1109/TNN.2011.2106511

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.model import RBIGLayer
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((500, 3))
    >>> layer = RBIGLayer()
    >>> layer.fit(X)
    RBIGLayer(...)
    >>> Z = layer.transform(X)
    >>> Z.shape
    (500, 3)
    """

    marginal: MarginalGaussianize = field(default_factory=MarginalGaussianize)
    rotation: PCARotation = field(default_factory=lambda: PCARotation(whiten=False))

    def fit(self, X: np.ndarray, y=None) -> RBIGLayer:
        """Fit the marginal and rotation transforms to data X.

        First fits the marginal Gaussianizer on X, applies it to obtain the
        intermediate Gaussianized representation, then fits the rotation on
        that intermediate representation.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.
        y : ignored
            Not used, present for sklearn pipeline compatibility.

        Returns
        -------
        self : RBIGLayer
            The fitted layer.
        """
        Xm = self.marginal.fit_transform(
            X
        )  # shape (n_samples, n_features) - Gaussianized
        self.rotation.fit(Xm)  # fit rotation on the Gaussianized data
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply marginal Gaussianization then rotation: y = R · Φ⁻¹(F̂ₙ(x)).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data.

        Returns
        -------
        Y : np.ndarray of shape (n_samples, n_features)
            Transformed data after Gaussianization and rotation.
        """
        Xm = self.marginal.transform(
            X
        )  # marginal Gaussianization, shape (n_samples, n_features)
        return self.rotation.transform(Xm)  # rotation, shape (n_samples, n_features)

    def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Log |det J| for this layer at input X.

        The total log-det-Jacobian is the sum of contributions from the
        marginal step and the rotation step:

            log|det J_layer(x)| = log|det J_marginal(x)| + log|det J_rotation(z)|

        For orthogonal rotations (e.g. ``RandomRotation``,
        ``PCARotation(whiten=False)``), the rotation term is zero.  For
        ``PCARotation(whiten=True)`` the rotation includes a per-component
        rescaling, so its term is generally non-zero.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points.

        Returns
        -------
        ldj : np.ndarray of shape (n_samples,)
            Per-sample log absolute determinant of the layer Jacobian.
        """
        Xm = self.marginal.transform(
            X
        )  # intermediate Gaussianized data, shape (n_samples, n_features)
        # marginal log-det + rotation log-det (non-zero for PCARotation with whiten=True)
        return self.marginal.log_det_jacobian(X) + self.rotation.log_det_jacobian(Xm)

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Invert the layer: apply inverse rotation then inverse marginal.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in the layer's output (latent) space.

        Returns
        -------
        Xr : np.ndarray of shape (n_samples, n_features)
            Data recovered in the original input space.
        """
        Xr = self.rotation.inverse_transform(
            X
        )  # undo rotation, shape (n_samples, n_features)
        return self.marginal.inverse_transform(Xr)  # undo marginal Gaussianization

fit(X, y=None)

Fit the marginal and rotation transforms to data X.

First fits the marginal Gaussianizer on X, applies it to obtain the intermediate Gaussianized representation, then fits the rotation on that intermediate representation.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Training data.
y : ignored
    Not used, present for sklearn pipeline compatibility.

Returns

self : RBIGLayer
    The fitted layer.

Source code in rbig/_src/model.py
def fit(self, X: np.ndarray, y=None) -> RBIGLayer:
    """Fit the marginal and rotation transforms to data X.

    First fits the marginal Gaussianizer on X, applies it to obtain the
    intermediate Gaussianized representation, then fits the rotation on
    that intermediate representation.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.
    y : ignored
        Not used, present for sklearn pipeline compatibility.

    Returns
    -------
    self : RBIGLayer
        The fitted layer.
    """
    Xm = self.marginal.fit_transform(
        X
    )  # shape (n_samples, n_features) - Gaussianized
    self.rotation.fit(Xm)  # fit rotation on the Gaussianized data
    return self

transform(X)

Apply marginal Gaussianization then rotation: y = R · Φ⁻¹(F̂ₙ(x)).

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input data.

Returns

Y : np.ndarray of shape (n_samples, n_features)
    Transformed data after Gaussianization and rotation.

Source code in rbig/_src/model.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply marginal Gaussianization then rotation: y = R · Φ⁻¹(F̂ₙ(x)).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data.

    Returns
    -------
    Y : np.ndarray of shape (n_samples, n_features)
        Transformed data after Gaussianization and rotation.
    """
    Xm = self.marginal.transform(
        X
    )  # marginal Gaussianization, shape (n_samples, n_features)
    return self.rotation.transform(Xm)  # rotation, shape (n_samples, n_features)

log_det_jacobian(X)

Log |det J| for this layer at input X.

The total log-det-Jacobian is the sum of contributions from the marginal step and the rotation step:

log|det J_layer(x)| = log|det J_marginal(x)| + log|det J_rotation(z)|

For orthogonal rotations (e.g. RandomRotation, PCARotation(whiten=False)), the rotation term is zero. For PCARotation(whiten=True) the rotation includes a per-component rescaling, so its term is generally non-zero.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input points.

Returns

ldj : np.ndarray of shape (n_samples,)
    Per-sample log absolute determinant of the layer Jacobian.

Source code in rbig/_src/model.py
def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Log |det J| for this layer at input X.

    The total log-det-Jacobian is the sum of contributions from the
    marginal step and the rotation step:

        log|det J_layer(x)| = log|det J_marginal(x)| + log|det J_rotation(z)|

    For orthogonal rotations (e.g. ``RandomRotation``,
    ``PCARotation(whiten=False)``), the rotation term is zero.  For
    ``PCARotation(whiten=True)`` the rotation includes a per-component
    rescaling, so its term is generally non-zero.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points.

    Returns
    -------
    ldj : np.ndarray of shape (n_samples,)
        Per-sample log absolute determinant of the layer Jacobian.
    """
    Xm = self.marginal.transform(
        X
    )  # intermediate Gaussianized data, shape (n_samples, n_features)
    # marginal log-det + rotation log-det (non-zero for PCARotation with whiten=True)
    return self.marginal.log_det_jacobian(X) + self.rotation.log_det_jacobian(Xm)

inverse_transform(X)

Invert the layer: apply inverse rotation then inverse marginal.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Data in the layer's output (latent) space.

Returns

Xr : np.ndarray of shape (n_samples, n_features)
    Data recovered in the original input space.

Source code in rbig/_src/model.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Invert the layer: apply inverse rotation then inverse marginal.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in the layer's output (latent) space.

    Returns
    -------
    Xr : np.ndarray of shape (n_samples, n_features)
        Data recovered in the original input space.
    """
    Xr = self.rotation.inverse_transform(
        X
    )  # undo rotation, shape (n_samples, n_features)
    return self.marginal.inverse_transform(Xr)  # undo marginal Gaussianization

Marginal Transforms

rbig.MarginalUniformize

Bases: BaseTransform

Transform each marginal to uniform [0, 1] using the empirical CDF.

For each feature dimension i, the empirical CDF is estimated from the training data with a mid-point (Hazen) continuity correction:

u_hat = F_hat_n(x) = (rank(x, X_train) + 0.5) / N

where rank is the number of training samples strictly less than x (left-sided searchsorted) and N is the number of training samples. The +0.5 shift avoids the degenerate values 0 and 1 for in-sample boundary points.

Parameters

bound_correct : bool, default True
    If True, clip the output to [eps, 1 - eps] to prevent exact 0 or 1, which is useful when feeding the result into a probit or logit function.
eps : float, default 1e-6
    Half-width of the clipping margin when bound_correct=True.
pdf_extension : float, default 0.0
    If greater than 0, use a histogram-based CDF instead of the empirical CDF, extending the support by this percentage of the per-feature data range.
pdf_resolution : int, default 1000
    Number of grid points for the interpolated histogram CDF (used only when pdf_extension > 0).

Attributes

support_ : np.ndarray of shape (n_samples, n_features)
    Column-wise sorted training data. Serves as empirical quantile nodes for both the forward transform and piecewise-linear inversion.
n_features_ : int
    Number of feature dimensions seen during fit.

Notes

The mid-point empirical CDF (Hazen plotting position) is:

F_hat_n(x) = (rank + 0.5) / N

The inverse is approximated by piecewise-linear interpolation between the sorted support values and their corresponding uniform probabilities np.linspace(0, 1, N).
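The forward map and its piecewise-linear inverse described in these Notes can be sketched with plain numpy (illustrative only; the fitted transform also handles clipping and the optional histogram CDF):

```python
import numpy as np

rng = np.random.default_rng(0)
x_train = rng.standard_normal(200)
support = np.sort(x_train)                 # empirical quantile nodes

# Forward: mid-point empirical CDF, u = (rank + 0.5) / N
ranks = np.searchsorted(support, x_train, side="left")
u = (ranks + 0.5) / len(support)
assert u.min() > 0.0 and u.max() < 1.0     # +0.5 shift avoids exact 0/1

# Inverse: interpolate the uniform grid back onto the sorted support
grid = np.linspace(0, 1, len(support))
x_back = np.interp(u, grid, support)       # approximate empirical quantiles
assert np.corrcoef(x_back, x_train)[0, 1] > 0.99
```

The round trip is only approximate: the forward rank-based CDF and the linear-interpolation inverse use slightly different node placements, so each reconstructed point lands between adjacent training values.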

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Examples

>>> import numpy as np
>>> from rbig._src.marginal import MarginalUniformize
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((100, 2))
>>> uni = MarginalUniformize().fit(X)
>>> U = uni.transform(X)
>>> U.shape
(100, 2)
>>> bool(U.min() > 0.0) and bool(U.max() < 1.0)
True
>>> Xr = uni.inverse_transform(U)
>>> Xr.shape
(100, 2)

Source code in rbig/_src/marginal.py
class MarginalUniformize(BaseTransform):
    """Transform each marginal to uniform [0, 1] using the empirical CDF.

    For each feature dimension *i*, the empirical CDF is estimated from the
    training data with a mid-point (Hazen) continuity correction::

        u_hat = F_hat_n(x) = (rank(x, X_train) + 0.5) / N

    where *rank* is the number of training samples strictly less than x
    (left-sided ``searchsorted``) and *N* is the number of training samples.  The
    ``+0.5`` shift avoids the degenerate values 0 and 1 for in-sample
    boundary points.

    Parameters
    ----------
    bound_correct : bool, default True
        If True, clip the output to ``[eps, 1 - eps]`` to prevent exact 0
        or 1, which is useful when feeding the result into a probit or
        logit function.
    eps : float, default 1e-6
        Half-width of the clipping margin when ``bound_correct=True``.
    pdf_extension : float, default 0.0
        If greater than 0, use a histogram-based CDF instead of the
        empirical CDF, extending the support by this percentage of the
        per-feature data range.
    pdf_resolution : int, default 1000
        Number of grid points for the interpolated histogram CDF (used
        only when ``pdf_extension > 0``).

    Attributes
    ----------
    support_ : np.ndarray of shape (n_samples, n_features)
        Column-wise sorted training data.  Serves as empirical quantile
        nodes for both the forward transform and piecewise-linear inversion.
    n_features_ : int
        Number of feature dimensions seen during ``fit``.

    Notes
    -----
    The mid-point empirical CDF (Hazen plotting position) is::

        F_hat_n(x) = (rank + 0.5) / N

    The inverse is approximated by piecewise-linear interpolation between
    the sorted support values and their corresponding uniform probabilities
    ``np.linspace(0, 1, N)``.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.marginal import MarginalUniformize
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((100, 2))
    >>> uni = MarginalUniformize().fit(X)
    >>> U = uni.transform(X)
    >>> U.shape
    (100, 2)
    >>> bool(U.min() > 0.0) and bool(U.max() < 1.0)
    True
    >>> Xr = uni.inverse_transform(U)
    >>> Xr.shape
    (100, 2)
    """

    def __init__(
        self,
        bound_correct: bool = True,
        eps: float = 1e-6,
        pdf_extension: float = 0.0,
        pdf_resolution: int = 1000,
    ):
        self.bound_correct = bound_correct
        self.eps = eps
        self.pdf_extension = pdf_extension
        self.pdf_resolution = pdf_resolution

    def fit(self, X: np.ndarray, y=None) -> MarginalUniformize:
        """Fit the transform by storing sorted training values per feature.

        When ``pdf_extension > 0``, a histogram-based CDF pipeline is used
        instead of the default empirical CDF.  This extends the support by
        ``pdf_extension`` percent of the data range and builds an interpolated,
        monotonic CDF on a grid of ``pdf_resolution`` points.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.  Each column is sorted and stored as the empirical
            support (quantile nodes) for that feature.

        Returns
        -------
        self : MarginalUniformize
            Fitted transform instance.
        """
        self.n_features_ = X.shape[1]

        if self.pdf_extension > 0:
            self._fit_histogram_cdf(X)
        else:
            # Sort each column independently to obtain empirical quantile nodes
            self.support_ = np.sort(X, axis=0)
        return self

    def _fit_histogram_cdf(self, X: np.ndarray) -> None:
        """Build per-feature histogram CDF with extended support."""
        n_samples = X.shape[0]
        self.cdf_support_ = []
        self.cdf_values_ = []
        self.pdf_support_ = []
        self.pdf_values_ = []

        for i in range(self.n_features_):
            xi = X[:, i]
            x_min, x_max = xi.min(), xi.max()

            # Handle constant-valued feature: trivial linear CDF
            if x_min == x_max:
                support = np.array([x_min - 1.0, x_min, x_min + 1.0])
                cdf_vals = np.array([0.0, 0.5, 1.0])
                pdf_sup = np.array([x_min - 1.0, x_min, x_min + 1.0])
                pdf_vals = np.array([0.0, 1.0, 0.0])
                self.cdf_support_.append(support)
                self.cdf_values_.append(cdf_vals)
                self.pdf_support_.append(pdf_sup)
                self.pdf_values_.append(pdf_vals)
                continue

            support_ext = (self.pdf_extension / 100) * abs(x_max - x_min)

            # Build histogram bins: sqrt(n) + 1 edges
            n_bin_edges = int(np.sqrt(float(n_samples)) + 1)
            bin_edges = np.linspace(x_min, x_max, n_bin_edges)
            bin_centers = (bin_edges[:-1] + bin_edges[1:]) / 2
            counts, _ = np.histogram(xi, bin_edges)
            bin_size = bin_edges[1] - bin_edges[0]

            # Empirical PDF with zero-padded edges
            pdf_support = np.concatenate(
                [
                    [bin_centers[0] - bin_size],
                    bin_centers,
                    [bin_centers[-1] + bin_size],
                ]
            )
            empirical_pdf = np.concatenate(
                [
                    [0.0],
                    counts / (np.sum(counts) * bin_size),
                    [0.0],
                ]
            )

            # CDF from cumulative counts with extended support
            c_sum = np.cumsum(counts)
            cdf = (1 - 1 / n_samples) * c_sum / n_samples
            incr_bin = bin_size / 2

            new_bin_edges = np.concatenate(
                [
                    [x_min - support_ext],
                    [x_min],
                    bin_centers + incr_bin,
                    [x_max + support_ext + incr_bin],
                ]
            )
            extended_cdf = np.concatenate(
                [
                    [0.0],
                    [1.0 / n_samples],
                    cdf,
                    [1.0],
                ]
            )

            # Interpolate onto fine grid, enforce monotonicity, normalize
            new_support = np.linspace(
                new_bin_edges[0], new_bin_edges[-1], self.pdf_resolution
            )
            learned_cdf = np.interp(new_support, new_bin_edges, extended_cdf)
            uniform_cdf = make_cdf_monotonic(learned_cdf)
            uniform_cdf /= uniform_cdf.max()

            self.cdf_support_.append(new_support)
            self.cdf_values_.append(uniform_cdf)
            self.pdf_support_.append(pdf_support)
            self.pdf_values_.append(empirical_pdf)

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Map each feature to [0, 1] via the mid-point empirical CDF.

        Applies ``u = (rank + 0.5) / N`` to every column independently.
        When ``pdf_extension > 0``, uses interpolation with the stored
        histogram-based CDF grid instead.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data in the original space.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Uniformized data in [0, 1] (or ``[eps, 1 - eps]`` when
            ``bound_correct=True``).
        """
        Xt = np.zeros_like(X, dtype=float)
        if self.pdf_extension > 0:
            for i in range(self.n_features_):
                Xt[:, i] = np.interp(X[:, i], self.cdf_support_[i], self.cdf_values_[i])
        else:
            for i in range(self.n_features_):
                Xt[:, i] = self._uniformize(X[:, i], self.support_[:, i])
        if self.bound_correct:
            # Clip to (eps, 1-eps) to prevent boundary issues downstream
            Xt = np.clip(Xt, self.eps, 1 - self.eps)
        return Xt

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Map uniform [0, 1] values back to the original space.

        Uses piecewise-linear interpolation between the stored sorted support
        values and their corresponding uniform probabilities
        ``np.linspace(0, 1, N)``.  When ``pdf_extension > 0``, uses the
        inverted histogram CDF grid instead.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in the uniform [0, 1] space.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Data approximately recovered in the original input space.
        """
        Xt = np.zeros_like(X, dtype=float)
        if self.pdf_extension > 0:
            for i in range(self.n_features_):
                # Ensure strictly increasing xp for np.interp by
                # dropping duplicate CDF values
                cdf_vals = self.cdf_values_[i]
                cdf_sup = self.cdf_support_[i]
                unique_mask = np.concatenate([[True], np.diff(cdf_vals) > 0])
                Xt[:, i] = np.interp(
                    X[:, i], cdf_vals[unique_mask], cdf_sup[unique_mask]
                )
        else:
            for i in range(self.n_features_):
                # Interpolate: uniform grid [0, 1] -> sorted training values
                Xt[:, i] = np.interp(
                    X[:, i],
                    np.linspace(0, 1, len(self.support_[:, i])),
                    self.support_[:, i],
                )
        return Xt

    @staticmethod
    def _uniformize(x: np.ndarray, support: np.ndarray) -> np.ndarray:
        """Compute the mid-point empirical CDF for a single feature.

        Parameters
        ----------
        x : np.ndarray of shape (n_samples,)
            New data values to evaluate the empirical CDF at.
        support : np.ndarray of shape (n_train,)
            Sorted training values used as the empirical quantile nodes.

        Returns
        -------
        u : np.ndarray of shape (n_samples,)
            Empirical CDF values: ``(rank + 0.5) / n_train``.
        """
        n = len(support)
    # Left-sided searchsorted counts the training points strictly less than x
        ranks = np.searchsorted(support, x, side="left")
        # Mid-point shift (+0.5) avoids exact 0 and 1
        return (ranks + 0.5) / n

fit(X, y=None)

Fit the transform by storing sorted training values per feature.

When pdf_extension > 0, a histogram-based CDF pipeline is used instead of the default empirical CDF. This extends the support by pdf_extension percent of the data range and builds an interpolated, monotonic CDF on a grid of pdf_resolution points.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Training data. Each column is sorted and stored as the empirical support (quantile nodes) for that feature.

Returns

self : MarginalUniformize
    Fitted transform instance.

Source code in rbig/_src/marginal.py
def fit(self, X: np.ndarray, y=None) -> MarginalUniformize:
    """Fit the transform by storing sorted training values per feature.

    When ``pdf_extension > 0``, a histogram-based CDF pipeline is used
    instead of the default empirical CDF.  This extends the support by
    ``pdf_extension`` percent of the data range and builds an interpolated,
    monotonic CDF on a grid of ``pdf_resolution`` points.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.  Each column is sorted and stored as the empirical
        support (quantile nodes) for that feature.

    Returns
    -------
    self : MarginalUniformize
        Fitted transform instance.
    """
    self.n_features_ = X.shape[1]

    if self.pdf_extension > 0:
        self._fit_histogram_cdf(X)
    else:
        # Sort each column independently to obtain empirical quantile nodes
        self.support_ = np.sort(X, axis=0)
    return self
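The histogram-CDF path taken when `pdf_extension > 0` can be sketched in miniature. This is a simplified assumption-laden version: it uses `np.maximum.accumulate` in place of the library's `make_cdf_monotonic` helper, and collapses the edge handling to two padding nodes.

```python
import numpy as np

rng = np.random.default_rng(0)
xi = rng.standard_normal(400)              # one feature column
x_min, x_max = xi.min(), xi.max()

ext = 0.1 * (x_max - x_min)                # 10% support extension
edges = np.linspace(x_min, x_max, int(np.sqrt(xi.size)) + 1)
counts, _ = np.histogram(xi, edges)

# Cumulative counts with extended, zero/one-padded support
cdf_nodes = np.concatenate([[x_min - ext], edges[1:], [x_max + ext]])
cdf_vals = np.concatenate([[0.0], np.cumsum(counts) / counts.sum(), [1.0]])

# Interpolate onto a fine grid, enforce monotonicity, normalize
grid = np.linspace(cdf_nodes[0], cdf_nodes[-1], 1000)
cdf = np.interp(grid, cdf_nodes, cdf_vals)
cdf = np.maximum.accumulate(cdf)           # stand-in for make_cdf_monotonic
cdf /= cdf[-1]

assert cdf[0] == 0.0 and cdf[-1] == 1.0
assert np.all(np.diff(cdf) >= 0)
```

The key property, mirrored from `_fit_histogram_cdf`, is that the stored CDF is monotone on a support wider than the training data, so out-of-sample points near the boundary still map into (0, 1).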

transform(X)

Map each feature to [0, 1] via the mid-point empirical CDF.

Applies u = (rank + 0.5) / N to every column independently. When pdf_extension > 0, uses interpolation with the stored histogram-based CDF grid instead.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input data in the original space.

Returns

Xt : np.ndarray of shape (n_samples, n_features)
    Uniformized data in [0, 1] (or [eps, 1 - eps] when bound_correct=True).

Source code in rbig/_src/marginal.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Map each feature to [0, 1] via the mid-point empirical CDF.

    Applies ``u = (rank + 0.5) / N`` to every column independently.
    When ``pdf_extension > 0``, uses interpolation with the stored
    histogram-based CDF grid instead.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data in the original space.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Uniformized data in [0, 1] (or ``[eps, 1 - eps]`` when
        ``bound_correct=True``).
    """
    Xt = np.zeros_like(X, dtype=float)
    if self.pdf_extension > 0:
        for i in range(self.n_features_):
            Xt[:, i] = np.interp(X[:, i], self.cdf_support_[i], self.cdf_values_[i])
    else:
        for i in range(self.n_features_):
            Xt[:, i] = self._uniformize(X[:, i], self.support_[:, i])
    if self.bound_correct:
        # Clip to (eps, 1-eps) to prevent boundary issues downstream
        Xt = np.clip(Xt, self.eps, 1 - self.eps)
    return Xt

inverse_transform(X)

Map uniform [0, 1] values back to the original space.

Uses piecewise-linear interpolation between the stored sorted support values and their corresponding uniform probabilities np.linspace(0, 1, N). When pdf_extension > 0, uses the inverted histogram CDF grid instead.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Data in the uniform [0, 1] space.

Returns

Xt : np.ndarray of shape (n_samples, n_features)
    Data approximately recovered in the original input space.

Source code in rbig/_src/marginal.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Map uniform [0, 1] values back to the original space.

    Uses piecewise-linear interpolation between the stored sorted support
    values and their corresponding uniform probabilities
    ``np.linspace(0, 1, N)``.  When ``pdf_extension > 0``, uses the
    inverted histogram CDF grid instead.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in the uniform [0, 1] space.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Data approximately recovered in the original input space.
    """
    Xt = np.zeros_like(X, dtype=float)
    if self.pdf_extension > 0:
        for i in range(self.n_features_):
            # Ensure strictly increasing xp for np.interp by
            # dropping duplicate CDF values
            cdf_vals = self.cdf_values_[i]
            cdf_sup = self.cdf_support_[i]
            unique_mask = np.concatenate([[True], np.diff(cdf_vals) > 0])
            Xt[:, i] = np.interp(
                X[:, i], cdf_vals[unique_mask], cdf_sup[unique_mask]
            )
    else:
        for i in range(self.n_features_):
            # Interpolate: uniform grid [0, 1] -> sorted training values
            Xt[:, i] = np.interp(
                X[:, i],
                np.linspace(0, 1, len(self.support_[:, i])),
                self.support_[:, i],
            )
    return Xt
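Because both directions are piecewise-linear interpolations through the same nodes, the map is exact at the stored support points. A minimal sketch (a plain array stands in for the fitted `support_`):

```python
import numpy as np

rng = np.random.default_rng(1)
support = np.sort(rng.standard_normal(500))
probs = np.linspace(0, 1, len(support))

u = np.interp(support, support, probs)  # forward: x -> uniform
x_back = np.interp(u, probs, support)   # inverse: uniform -> x

assert np.allclose(x_back, support)     # exact round trip at the nodes
```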

rbig.MarginalGaussianize

Bases: BaseTransform

Transform each marginal to standard Gaussian using empirical CDF + probit.

Combines a mid-point empirical CDF estimate with the Gaussian probit (quantile) function Phi^{-1} to map each feature to an approximately standard-normal marginal::

z = Phi^{-1}(F_hat_n(x))

where F_hat_n(x) = (rank + 0.5) / N is the mid-point empirical CDF and Phi^{-1} is the inverse standard-normal CDF (probit).

Parameters

bound_correct : bool, default True
    Clip the intermediate uniform value to [eps, 1 - eps] before applying
    the probit to prevent +/-inf outputs at the tails.
eps : float, default 1e-6
    Clipping margin for the uniform intermediate value.

Attributes

support_ : np.ndarray of shape (n_samples, n_features)
    Column-wise sorted training data (empirical quantile nodes).
n_features_ : int
    Number of feature dimensions seen during fit.

Notes

The log-absolute Jacobian determinant needed for density estimation is::

log|dz/dx| = log f_hat_n(x) - log phi(Phi^{-1}(F_hat_n(x)))

where f_hat_n is the empirical density estimated from the spacing of adjacent sorted training values, and phi is the standard-normal PDF. This is computed in :meth:log_det_jacobian.

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Examples

>>> import numpy as np
>>> from rbig._src.marginal import MarginalGaussianize
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((200, 3))
>>> mg = MarginalGaussianize().fit(X)
>>> Z = mg.transform(X)
>>> Z.shape
(200, 3)
>>> abs(float(Z.mean())) < 0.5
True

Source code in rbig/_src/marginal.py
class MarginalGaussianize(BaseTransform):
    """Transform each marginal to standard Gaussian using empirical CDF + probit.

    Combines a mid-point empirical CDF estimate with the Gaussian probit
    (quantile) function Phi^{-1} to map each feature to an approximately
    standard-normal marginal::

        z = Phi^{-1}(F_hat_n(x))

    where ``F_hat_n(x) = (rank + 0.5) / N`` is the mid-point empirical CDF
    and ``Phi^{-1}`` is the inverse standard-normal CDF (probit).

    Parameters
    ----------
    bound_correct : bool, default True
        Clip the intermediate uniform value to ``[eps, 1 - eps]`` before
        applying the probit to prevent +/-inf outputs at the tails.
    eps : float, default 1e-6
        Clipping margin for the uniform intermediate value.

    Attributes
    ----------
    support_ : np.ndarray of shape (n_samples, n_features)
        Column-wise sorted training data (empirical quantile nodes).
    n_features_ : int
        Number of feature dimensions seen during ``fit``.

    Notes
    -----
    The log-absolute Jacobian determinant needed for density estimation is::

        log|dz/dx| = log f_hat_n(x) - log phi(Phi^{-1}(F_hat_n(x)))

    where ``f_hat_n`` is the empirical density estimated from the spacing of
    adjacent sorted training values, and ``phi`` is the standard-normal PDF.
    This is computed in :meth:`log_det_jacobian`.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.marginal import MarginalGaussianize
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((200, 3))
    >>> mg = MarginalGaussianize().fit(X)
    >>> Z = mg.transform(X)
    >>> Z.shape
    (200, 3)
    >>> abs(float(Z.mean())) < 0.5
    True
    """

    def __init__(self, bound_correct: bool = True, eps: float = 1e-6):
        self.bound_correct = bound_correct
        self.eps = eps

    def fit(self, X: np.ndarray, y=None) -> MarginalGaussianize:
        """Fit by storing the column-wise sorted training data.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data used to build the per-feature empirical CDF.

        Returns
        -------
        self : MarginalGaussianize
            Fitted transform instance.
        """
        # Sorted columns serve as empirical quantile nodes
        self.support_ = np.sort(X, axis=0)
        self.n_features_ = X.shape[1]
        self.kdes_ = [
            stats.gaussian_kde(self.support_[:, i].copy())
            for i in range(self.n_features_)
        ]
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Map each feature to N(0, 1) via empirical CDF then probit.

        Applies ``z = Phi^{-1}(F_hat_n(x))`` column by column.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data in the original space.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Gaussianized data; each column has approximately N(0, 1) marginal.
        """
        Xt = np.zeros_like(X, dtype=float)
        for i in range(self.n_features_):
            # Step 1: empirical CDF -> uniform value in (0, 1)
            u = MarginalUniformize._uniformize(X[:, i], self.support_[:, i])
            if self.bound_correct:
                # Clip to avoid Phi^{-1}(0) = -inf or Phi^{-1}(1) = +inf
                u = np.clip(u, self.eps, 1 - self.eps)
            # Step 2: probit transform Phi^{-1}(u) -> standard normal
            Xt[:, i] = ndtri(u)
        return Xt

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Map standard-normal values back to the original space.

        Applies the normal CDF Phi to obtain uniform values, then uses
        piecewise-linear interpolation through the sorted support to recover
        approximate original-space values.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in the Gaussianized (standard-normal) space.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Data approximately recovered in the original input space.
        """
        Xt = np.zeros_like(X, dtype=float)
        for i in range(self.n_features_):
            # Invert probit: z -> Phi(z) in (0, 1)
            u = stats.norm.cdf(X[:, i])
            # Invert empirical CDF via linear interpolation
            Xt[:, i] = np.interp(
                u, np.linspace(0, 1, len(self.support_[:, i])), self.support_[:, i]
            )
        return Xt

    def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Log |det J| for marginal Gaussianization.

        For g(x) = Phi^{-1}(F_n(x)):
            log|dg/dx| = log f_n(x_i) - log phi(g(x_i))

        where f_n is estimated from a Gaussian KDE fitted to the training
        data for each feature.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data at which to evaluate the log-det-Jacobian.

        Returns
        -------
        log_jac : np.ndarray of shape (n_samples,)
            Per-sample sum of per-feature log-derivatives.

        Notes
        -----
        The empirical density is approximated via a Gaussian KDE (one per
        feature) fitted during :meth:`fit`.  Bandwidth is selected
        automatically using Scott's rule (the default in
        :func:`scipy.stats.gaussian_kde`).  The KDE objects are cached
        in ``self.kdes_`` so that ``log_det_jacobian`` and repeated calls
        to ``_per_feature_log_deriv`` do not re-fit the KDEs.
        """
        return np.sum(self._per_feature_log_deriv(X), axis=1)

    def _per_feature_log_deriv(
        self, X: np.ndarray, return_transform: bool = False
    ) -> np.ndarray | tuple[np.ndarray, np.ndarray]:
        """Per-feature log |dz_i/dx_i| via cached KDE density estimates.

        Uses the per-feature Gaussian KDEs stored in ``self.kdes_`` (fitted
        during :meth:`fit` with Scott's rule bandwidth) to evaluate the
        marginal density f_n(x_i) at each query point.  The log-derivative
        is then ``log f_n(x_i) - log phi(z_i)`` where ``z_i`` is the
        Gaussianized value.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data at which to evaluate the per-feature log-derivatives.
        return_transform : bool, default False
            If True, also return the Gaussianized output to avoid recomputing.

        Returns
        -------
        log_derivs : np.ndarray of shape (n_samples, n_features)
            Per-feature log |dz_i/dx_i| for each sample.
        Xt : np.ndarray of shape (n_samples, n_features)
            Only returned when ``return_transform=True``.
        """
        Xt = self.transform(X)  # Gaussianized output, shape (N, D)
        log_derivs = np.zeros_like(X, dtype=float)
        for i in range(self.n_features_):
            # KDE-based density estimate for feature i
            # After pickle with readonly memmap, the KDE's internal arrays
            # may be read-only.  Re-create the KDE from a writable copy of
            # the dataset if necessary.
            kde = self.kdes_[i]
            if not kde.dataset.flags.writeable:
                kde = stats.gaussian_kde(kde.dataset.copy())
                self.kdes_[i] = kde
            xi = np.ascontiguousarray(X[:, i])
            log_f_i = np.log(np.maximum(kde(xi), 1e-300))
            # Log standard-normal PDF at Gaussianized value: log phi(z_i)
            log_phi_gi = stats.norm.logpdf(Xt[:, i])
            # Chain rule: log|dz/dx| = log f(x) - log phi(z)
            log_derivs[:, i] = log_f_i - log_phi_gi
        if return_transform:
            return log_derivs, Xt
        return log_derivs
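The two-step map the class implements (mid-point ECDF, then probit) can be reproduced with NumPy/SciPy alone. A sketch on a deliberately non-Gaussian marginal, assuming only `scipy.special.ndtri` and `scipy.stats` (not the class API itself):

```python
import numpy as np
from scipy.special import ndtri
from scipy.stats import kstest

rng = np.random.default_rng(2)
x = rng.exponential(size=2000)  # heavily skewed, non-Gaussian marginal

# Step 1: mid-point empirical CDF -> uniform values in (0, 1)
ranks = np.argsort(np.argsort(x))
u = (ranks + 0.5) / len(x)

# Step 2: probit transform -> approximately standard normal
z = ndtri(np.clip(u, 1e-6, 1 - 1e-6))

assert abs(z.mean()) < 0.05 and abs(z.std() - 1.0) < 0.05
assert kstest(z, "norm").pvalue > 0.01  # close to N(0, 1)
```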

fit(X, y=None)

Fit by storing the column-wise sorted training data.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Training data used to build the per-feature empirical CDF.

Returns

self : MarginalGaussianize
    Fitted transform instance.

Source code in rbig/_src/marginal.py
def fit(self, X: np.ndarray, y=None) -> MarginalGaussianize:
    """Fit by storing the column-wise sorted training data.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data used to build the per-feature empirical CDF.

    Returns
    -------
    self : MarginalGaussianize
        Fitted transform instance.
    """
    # Sorted columns serve as empirical quantile nodes
    self.support_ = np.sort(X, axis=0)
    self.n_features_ = X.shape[1]
    self.kdes_ = [
        stats.gaussian_kde(self.support_[:, i].copy())
        for i in range(self.n_features_)
    ]
    return self

transform(X)

Map each feature to N(0, 1) via empirical CDF then probit.

Applies z = Phi^{-1}(F_hat_n(x)) column by column.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input data in the original space.

Returns

Xt : np.ndarray of shape (n_samples, n_features)
    Gaussianized data; each column has approximately N(0, 1) marginal.

Source code in rbig/_src/marginal.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Map each feature to N(0, 1) via empirical CDF then probit.

    Applies ``z = Phi^{-1}(F_hat_n(x))`` column by column.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data in the original space.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Gaussianized data; each column has approximately N(0, 1) marginal.
    """
    Xt = np.zeros_like(X, dtype=float)
    for i in range(self.n_features_):
        # Step 1: empirical CDF -> uniform value in (0, 1)
        u = MarginalUniformize._uniformize(X[:, i], self.support_[:, i])
        if self.bound_correct:
            # Clip to avoid Phi^{-1}(0) = -inf or Phi^{-1}(1) = +inf
            u = np.clip(u, self.eps, 1 - self.eps)
        # Step 2: probit transform Phi^{-1}(u) -> standard normal
        Xt[:, i] = ndtri(u)
    return Xt

inverse_transform(X)

Map standard-normal values back to the original space.

Applies the normal CDF Phi to obtain uniform values, then uses piecewise-linear interpolation through the sorted support to recover approximate original-space values.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Data in the Gaussianized (standard-normal) space.

Returns

Xt : np.ndarray of shape (n_samples, n_features)
    Data approximately recovered in the original input space.

Source code in rbig/_src/marginal.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Map standard-normal values back to the original space.

    Applies the normal CDF Phi to obtain uniform values, then uses
    piecewise-linear interpolation through the sorted support to recover
    approximate original-space values.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in the Gaussianized (standard-normal) space.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Data approximately recovered in the original input space.
    """
    Xt = np.zeros_like(X, dtype=float)
    for i in range(self.n_features_):
        # Invert probit: z -> Phi(z) in (0, 1)
        u = stats.norm.cdf(X[:, i])
        # Invert empirical CDF via linear interpolation
        Xt[:, i] = np.interp(
            u, np.linspace(0, 1, len(self.support_[:, i])), self.support_[:, i]
        )
    return Xt

log_det_jacobian(X)

Log |det J| for marginal Gaussianization.

For g(x) = Phi^{-1}(F_n(x)):

    log|dg/dx| = log f_n(x_i) - log phi(g(x_i))

where f_n is estimated from a Gaussian KDE fitted to the training data for each feature.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input data at which to evaluate the log-det-Jacobian.

Returns

log_jac : np.ndarray of shape (n_samples,)
    Per-sample sum of per-feature log-derivatives.

Notes

The empirical density is approximated via a Gaussian KDE (one per feature) fitted during :meth:fit. Bandwidth is selected automatically using Scott's rule (the default in :func:scipy.stats.gaussian_kde). The KDE objects are cached in self.kdes_ so that log_det_jacobian and repeated calls to _per_feature_log_deriv do not re-fit the KDEs.

Source code in rbig/_src/marginal.py
def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Log |det J| for marginal Gaussianization.

    For g(x) = Phi^{-1}(F_n(x)):
        log|dg/dx| = log f_n(x_i) - log phi(g(x_i))

    where f_n is estimated from a Gaussian KDE fitted to the training
    data for each feature.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data at which to evaluate the log-det-Jacobian.

    Returns
    -------
    log_jac : np.ndarray of shape (n_samples,)
        Per-sample sum of per-feature log-derivatives.

    Notes
    -----
    The empirical density is approximated via a Gaussian KDE (one per
    feature) fitted during :meth:`fit`.  Bandwidth is selected
    automatically using Scott's rule (the default in
    :func:`scipy.stats.gaussian_kde`).  The KDE objects are cached
    in ``self.kdes_`` so that ``log_det_jacobian`` and repeated calls
    to ``_per_feature_log_deriv`` do not re-fit the KDEs.
    """
    return np.sum(self._per_feature_log_deriv(X), axis=1)
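The chain-rule identity `log|dz/dx| = log f(x) - log phi(z)` can be checked numerically on a distribution whose CDF is known in closed form. This sketch uses a logistic marginal purely as an analytically tractable stand-in for the KDE estimate:

```python
import numpy as np
from scipy import stats
from scipy.special import ndtri

# z = Phi^{-1}(F(x)) for x ~ Logistic, where F is known exactly.
x = np.linspace(-3, 3, 7)
z = ndtri(stats.logistic.cdf(x))

# Analytic log-derivative: log f(x) - log phi(z)
analytic = stats.logistic.logpdf(x) - stats.norm.logpdf(z)

# Finite-difference log-derivative of the composed map
h = 1e-6
z_plus = ndtri(stats.logistic.cdf(x + h))
numeric = np.log((z_plus - z) / h)

assert np.allclose(analytic, numeric, atol=1e-4)
```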

rbig.MarginalKDEGaussianize

Bases: BaseTransform

Transform each marginal to Gaussian using a KDE-estimated CDF.

A kernel density estimate (KDE) with a Gaussian kernel is fitted to each feature dimension. The cumulative integral of the KDE serves as a smooth approximation to the marginal CDF, which is then composed with the probit function Phi^{-1} to Gaussianize each dimension::

z = Phi^{-1}(F_KDE(x))

where F_KDE(x) = integral_{-inf}^{x} f_KDE(t) dt and f_KDE is the Gaussian-kernel density estimate.

Parameters

bw_method : str, float, or None, default None
    Bandwidth selection method passed to :class:scipy.stats.gaussian_kde.
    None uses Scott's rule; 'silverman' uses Silverman's rule; a scalar
    sets the bandwidth factor directly.
eps : float, default 1e-6
    Clipping margin to prevent Phi^{-1}(0) = -inf or Phi^{-1}(1) = +inf.

Attributes

kdes_ : list of scipy.stats.gaussian_kde
    One fitted KDE object per feature dimension.
n_features_ : int
    Number of feature dimensions seen during fit.

Notes

The inverse transform inverts the KDE CDF numerically via Brent's method (:func:scipy.optimize.brentq), searching in [-100, 100]. Samples for which no root is found in this interval are set to 0.0.

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Examples

>>> import numpy as np
>>> from rbig._src.marginal import MarginalKDEGaussianize
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((50, 2))
>>> kde_g = MarginalKDEGaussianize().fit(X)
>>> Z = kde_g.transform(X)
>>> Z.shape
(50, 2)

Source code in rbig/_src/marginal.py
class MarginalKDEGaussianize(BaseTransform):
    """Transform each marginal to Gaussian using a KDE-estimated CDF.

    A kernel density estimate (KDE) with a Gaussian kernel is fitted to each
    feature dimension.  The cumulative integral of the KDE serves as a smooth
    approximation to the marginal CDF, which is then composed with the probit
    function Phi^{-1} to Gaussianize each dimension::

        z = Phi^{-1}(F_KDE(x))

    where ``F_KDE(x) = integral_{-inf}^{x} f_KDE(t) dt`` and ``f_KDE`` is
    the Gaussian-kernel density estimate.

    Parameters
    ----------
    bw_method : str, float, or None, default None
        Bandwidth selection method passed to
        :class:`scipy.stats.gaussian_kde`.  ``None`` uses Scott's rule;
        ``'silverman'`` uses Silverman's rule; a scalar sets the bandwidth
        factor directly.
    eps : float, default 1e-6
        Clipping margin to prevent ``Phi^{-1}(0) = -inf`` or
        ``Phi^{-1}(1) = +inf``.

    Attributes
    ----------
    kdes_ : list of scipy.stats.gaussian_kde
        One fitted KDE object per feature dimension.
    n_features_ : int
        Number of feature dimensions seen during ``fit``.

    Notes
    -----
    The inverse transform inverts the KDE CDF numerically via Brent's method
    (:func:`scipy.optimize.brentq`) searching in [-100, 100].  Samples for
    which no root is found in this interval are set to 0.0.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.marginal import MarginalKDEGaussianize
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((50, 2))
    >>> kde_g = MarginalKDEGaussianize().fit(X)
    >>> Z = kde_g.transform(X)
    >>> Z.shape
    (50, 2)
    """

    def __init__(self, bw_method: str | float | None = None, eps: float = 1e-6):
        self.bw_method = bw_method
        self.eps = eps

    def fit(self, X: np.ndarray, y=None) -> MarginalKDEGaussianize:
        """Fit a Gaussian KDE to each feature dimension.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.

        Returns
        -------
        self : MarginalKDEGaussianize
            Fitted transform instance.
        """
        self.kdes_ = []
        self.n_features_ = X.shape[1]
        for i in range(self.n_features_):
            # Fit an independent Gaussian KDE per feature
            self.kdes_.append(stats.gaussian_kde(X[:, i], bw_method=self.bw_method))
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Map each feature to N(0, 1) via KDE CDF then probit.

        Computes ``z = Phi^{-1}(F_KDE(x))`` per feature using numerical
        integration of the fitted KDE.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data in the original space.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Gaussianized data.
        """
        Xt = np.zeros_like(X, dtype=float)
        for i in range(self.n_features_):
            # Integrate KDE from -inf to each sample value to get CDF
            u = np.array(
                [self.kdes_[i].integrate_box_1d(-np.inf, xi) for xi in X[:, i]]
            )
            # Clip to avoid +/-inf from the probit function
            u = np.clip(u, self.eps, 1 - self.eps)
            Xt[:, i] = ndtri(u)
        return Xt

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Map standard-normal values back to the original space.

        Numerically inverts the KDE CDF via Brent's root-finding method.
        For each sample *j* and feature *i*, solves::

            F_KDE(x) = Phi(z_j)

        searching on the interval [-100, 100].

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in the Gaussianized space.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Data approximately recovered in the original input space.
            Samples that fail root-finding are set to 0.0.
        """
        from scipy.optimize import brentq

        Xt = np.zeros_like(X, dtype=float)
        for i in range(self.n_features_):
            for j, xj in enumerate(X[:, i]):
                # Map z -> u in (0, 1) via normal CDF
                u = stats.norm.cdf(xj)
                try:
                    # Numerically invert F_KDE(x) = u via root-finding
                    Xt[j, i] = brentq(
                        lambda x, u=u, i=i: (
                            self.kdes_[i].integrate_box_1d(-np.inf, x) - u
                        ),
                        -100,
                        100,
                    )
                except ValueError:
                    # Root not found in [-100, 100]; fall back to zero
                    Xt[j, i] = 0.0
        return Xt
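The forward/inverse pair can be exercised in isolation with SciPy; a sketch using only `scipy.stats.gaussian_kde` and `scipy.optimize.brentq`, mirroring the integrate-then-invert path above:

```python
import numpy as np
from scipy import stats
from scipy.optimize import brentq

rng = np.random.default_rng(3)
x = rng.standard_normal(300)
kde = stats.gaussian_kde(x)  # Scott's rule bandwidth by default

# Smooth CDF value at a point via the KDE's 1-D box integral
u = kde.integrate_box_1d(-np.inf, 0.5)

# Invert the KDE CDF numerically, as inverse_transform does above
x_back = brentq(lambda t: kde.integrate_box_1d(-np.inf, t) - u, -100, 100)

assert abs(x_back - 0.5) < 1e-8  # recovers the original point
```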

fit(X, y=None)

Fit a Gaussian KDE to each feature dimension.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Training data.

Returns

self : MarginalKDEGaussianize
    Fitted transform instance.

Source code in rbig/_src/marginal.py
def fit(self, X: np.ndarray, y=None) -> MarginalKDEGaussianize:
    """Fit a Gaussian KDE to each feature dimension.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.

    Returns
    -------
    self : MarginalKDEGaussianize
        Fitted transform instance.
    """
    self.kdes_ = []
    self.n_features_ = X.shape[1]
    for i in range(self.n_features_):
        # Fit an independent Gaussian KDE per feature
        self.kdes_.append(stats.gaussian_kde(X[:, i], bw_method=self.bw_method))
    return self

transform(X)

Map each feature to N(0, 1) via KDE CDF then probit.

Computes z = Phi^{-1}(F_KDE(x)) per feature using numerical integration of the fitted KDE.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input data in the original space.

Returns

Xt : np.ndarray of shape (n_samples, n_features)
    Gaussianized data.

Source code in rbig/_src/marginal.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Map each feature to N(0, 1) via KDE CDF then probit.

    Computes ``z = Phi^{-1}(F_KDE(x))`` per feature using numerical
    integration of the fitted KDE.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data in the original space.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Gaussianized data.
    """
    Xt = np.zeros_like(X, dtype=float)
    for i in range(self.n_features_):
        # Integrate KDE from -inf to each sample value to get CDF
        u = np.array(
            [self.kdes_[i].integrate_box_1d(-np.inf, xi) for xi in X[:, i]]
        )
        # Clip to avoid +/-inf from the probit function
        u = np.clip(u, self.eps, 1 - self.eps)
        Xt[:, i] = ndtri(u)
    return Xt

inverse_transform(X)

Map standard-normal values back to the original space.

Numerically inverts the KDE CDF via Brent's root-finding method. For each sample j and feature i, solves::

F_KDE(x) = Phi(z_j)

searching on the interval [-100, 100].

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Data in the Gaussianized space.

Returns

Xt : np.ndarray of shape (n_samples, n_features)
    Data approximately recovered in the original input space.
    Samples that fail root-finding are set to 0.0.

Source code in rbig/_src/marginal.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Map standard-normal values back to the original space.

    Numerically inverts the KDE CDF via Brent's root-finding method.
    For each sample *j* and feature *i*, solves::

        F_KDE(x) = Phi(z_j)

    searching on the interval [-100, 100].

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in the Gaussianized space.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Data approximately recovered in the original input space.
        Samples that fail root-finding are set to 0.0.
    """
    from scipy.optimize import brentq

    Xt = np.zeros_like(X, dtype=float)
    for i in range(self.n_features_):
        for j, xj in enumerate(X[:, i]):
            # Map z -> u in (0, 1) via normal CDF
            u = stats.norm.cdf(xj)
            try:
                # Numerically invert F_KDE(x) = u via root-finding
                Xt[j, i] = brentq(
                    lambda x, u=u, i=i: (
                        self.kdes_[i].integrate_box_1d(-np.inf, x) - u
                    ),
                    -100,
                    100,
                )
            except ValueError:
                # Root not found in [-100, 100]; fall back to zero
                Xt[j, i] = 0.0
    return Xt

rbig.SplineGaussianizer

Bases: Bijector

Gaussianize each marginal using monotone PCHIP spline interpolation.

Estimates the marginal CDF from empirical quantiles and fits a shape-preserving (monotone) cubic Hermite spline (PCHIP) from original-space quantile values to the corresponding Gaussian quantiles. The forward transform is::

z = S(x)

where S is the fitted :class:scipy.interpolate.PchipInterpolator mapping data values to standard-normal quantiles. Because PCHIP preserves monotonicity, the mapping is guaranteed to be invertible.

Parameters

n_quantiles : int, default 200
    Number of quantile nodes used to fit the splines. Capped at n_samples
    when fewer training samples are available.
eps : float, default 1e-6
    Clipping margin applied to the Gaussian quantile grid to keep the
    spline endpoints finite (avoids +/-inf at boundary quantiles).

Attributes

splines_ : list of scipy.interpolate.PchipInterpolator
    Forward splines (x -> z) per feature, mapping original-space values
    to standard-normal quantiles.
inv_splines_ : list of scipy.interpolate.PchipInterpolator
    Inverse splines (z -> x) per feature, mapping Gaussian quantiles back
    to original-space values.
n_features_ : int
    Number of feature dimensions seen during fit.

Notes

The log-det-Jacobian uses the analytic first derivative of the spline::

log|dz/dx| = log|S'(x)|

where S' is the first derivative of the PCHIP forward spline, evaluated via spline(x, 1) (the derivative-order argument).

Duplicate x-values (arising from discrete or constant features) are removed before fitting to ensure strict monotonicity.

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Examples

>>> import numpy as np
>>> from rbig._src.marginal import SplineGaussianizer
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((300, 3))
>>> sg = SplineGaussianizer(n_quantiles=100).fit(X)
>>> Z = sg.transform(X)
>>> Z.shape
(300, 3)
>>> ldj = sg.get_log_det_jacobian(X)
>>> ldj.shape
(300,)
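The quantile-node construction described above can be sketched directly with SciPy; this is an illustrative reimplementation of the fitting idea, not the class itself:

```python
import numpy as np
from scipy.interpolate import PchipInterpolator
from scipy.special import ndtri

rng = np.random.default_rng(4)
x = rng.exponential(size=1000)

# Pair empirical quantiles of the data with Gaussian quantiles
probs = np.linspace(0, 1, 100)
x_q = np.quantile(x, probs)
g_q = ndtri(np.clip(probs, 1e-6, 1 - 1e-6))  # clip away from +/-inf

# Drop duplicate x values so the node sequence is strictly increasing
x_q, idx = np.unique(x_q, return_index=True)
g_q = g_q[idx]

fwd = PchipInterpolator(x_q, g_q)  # x -> z
inv = PchipInterpolator(g_q, x_q)  # z -> x

assert np.all(np.diff(fwd(np.sort(x))) >= -1e-12)   # monotone map
assert np.allclose(inv(fwd(x_q)), x_q, atol=1e-8)   # invertible at the nodes
# Analytic first derivative, as used for log|det J|: spline(x, 1)
assert np.all(fwd(x_q, 1) >= 0)
```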

Source code in rbig/_src/marginal.py
class SplineGaussianizer(Bijector):
    """Gaussianize each marginal using monotone PCHIP spline interpolation.

    Estimates the marginal CDF from empirical quantiles and fits a
    shape-preserving (monotone) cubic Hermite spline (PCHIP) from
    original-space quantile values to the corresponding Gaussian quantiles.
    The forward transform is::

        z = S(x)

    where ``S`` is the fitted :class:`scipy.interpolate.PchipInterpolator`
    mapping data values to standard-normal quantiles.  Because PCHIP
    preserves monotonicity, the mapping is guaranteed to be invertible.

    Parameters
    ----------
    n_quantiles : int, default 200
        Number of quantile nodes used to fit the splines.  Capped at
        ``n_samples`` when fewer training samples are available.
    eps : float, default 1e-6
        Clipping margin applied to the Gaussian quantile grid to keep
        the spline endpoints finite (avoids +/-inf at boundary quantiles).

    Attributes
    ----------
    splines_ : list of scipy.interpolate.PchipInterpolator
        Forward splines (x -> z) per feature, mapping original-space
        values to standard-normal quantiles.
    inv_splines_ : list of scipy.interpolate.PchipInterpolator
        Inverse splines (z -> x) per feature, mapping Gaussian quantiles
        back to original-space values.
    n_features_ : int
        Number of feature dimensions seen during ``fit``.

    Notes
    -----
    The log-det-Jacobian uses the analytic first derivative of the spline::

        log|dz/dx| = log|S'(x)|

    where ``S'`` is the first derivative of the PCHIP forward spline,
    evaluated via ``spline(x, 1)`` (the derivative-order argument).

    Duplicate x-values (arising from discrete or constant features) are
    removed before fitting to ensure strict monotonicity.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.marginal import SplineGaussianizer
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((300, 3))
    >>> sg = SplineGaussianizer(n_quantiles=100).fit(X)
    >>> Z = sg.transform(X)
    >>> Z.shape
    (300, 3)
    >>> ldj = sg.get_log_det_jacobian(X)
    >>> ldj.shape
    (300,)
    """

    def __init__(self, n_quantiles: int = 200, eps: float = 1e-6):
        self.n_quantiles = n_quantiles
        self.eps = eps

    def fit(self, X: np.ndarray, y=None) -> SplineGaussianizer:
        """Fit forward and inverse PCHIP splines for each feature.

        For each dimension, ``n_quantiles`` evenly-spaced probability levels
        are mapped to their empirical quantile values in the data, and the
        corresponding Gaussian quantile values ``Phi^{-1}(p)`` are computed.
        PCHIP interpolants are then fitted in both directions.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.

        Returns
        -------
        self : SplineGaussianizer
            Fitted bijector instance.
        """
        from scipy.interpolate import PchipInterpolator

        self.splines_ = []
        self.inv_splines_ = []
        self.n_features_ = X.shape[1]
        # Use at most n_samples quantile nodes
        n_q = min(self.n_quantiles, X.shape[0])
        # Probability grid: n_q evenly-spaced points in [0, 1]
        quantiles = np.linspace(0, 1, n_q)
        # Corresponding Gaussian quantiles Phi^{-1}(p), clipped away from +/-inf
        g_q = ndtri(np.clip(quantiles, self.eps, 1 - self.eps))
        for i in range(self.n_features_):
            xi_sorted = np.sort(X[:, i])
            # Empirical quantile values at each probability level
            x_q = np.quantile(xi_sorted, quantiles)
            # Remove duplicate x values so PchipInterpolator gets a strictly
            # increasing sequence (duplicates arise with discrete/tied data).
            x_q_u, idx = np.unique(x_q, return_index=True)
            g_q_u = g_q[idx]
            self.splines_.append(PchipInterpolator(x_q_u, g_q_u))
            self.inv_splines_.append(PchipInterpolator(g_q_u, x_q_u))
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the forward spline map: x -> z = S(x).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data in the original space.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Gaussianized data with approximately standard-normal marginals.
        """
        Xt = np.zeros_like(X, dtype=float)
        for i in range(self.n_features_):
            # Evaluate the forward PCHIP spline at the input values
            Xt[:, i] = self.splines_[i](X[:, i])
        return Xt

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the inverse spline map: z -> x = S^{-1}(z).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in the Gaussianized space.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Data approximately recovered in the original input space.
        """
        Xt = np.zeros_like(X, dtype=float)
        for i in range(self.n_features_):
            # Evaluate the inverse PCHIP spline at the Gaussian values
            Xt[:, i] = self.inv_splines_[i](X[:, i])
        return Xt

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Compute log |det J| using the analytic spline first derivative.

        Because the Jacobian is diagonal::

            log|det J| = sum_i log|S'(x_i)|

        where ``S'`` is the first derivative of the PCHIP forward spline,
        evaluated via ``spline(x, 1)`` (the derivative order argument).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points at which to evaluate the log-det-Jacobian.

        Returns
        -------
        log_det : np.ndarray of shape (n_samples,)
            Per-sample log absolute determinant.
        """
        log_det = np.zeros(X.shape[0])
        for i in range(self.n_features_):
            deriv = self.splines_[i](X[:, i], 1)  # first derivative
            log_det += np.log(np.maximum(np.abs(deriv), 1e-300))
        return log_det

fit(X, y=None)

Fit forward and inverse PCHIP splines for each feature.

For each dimension, n_quantiles evenly-spaced probability levels are mapped to their empirical quantile values in the data, and the corresponding Gaussian quantile values Phi^{-1}(p) are computed. PCHIP interpolants are then fitted in both directions.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Training data.

Returns

self : SplineGaussianizer
    Fitted bijector instance.

Source code in rbig/_src/marginal.py
def fit(self, X: np.ndarray, y=None) -> SplineGaussianizer:
    """Fit forward and inverse PCHIP splines for each feature.

    For each dimension, ``n_quantiles`` evenly-spaced probability levels
    are mapped to their empirical quantile values in the data, and the
    corresponding Gaussian quantile values ``Phi^{-1}(p)`` are computed.
    PCHIP interpolants are then fitted in both directions.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.

    Returns
    -------
    self : SplineGaussianizer
        Fitted bijector instance.
    """
    from scipy.interpolate import PchipInterpolator

    self.splines_ = []
    self.inv_splines_ = []
    self.n_features_ = X.shape[1]
    # Use at most n_samples quantile nodes
    n_q = min(self.n_quantiles, X.shape[0])
    # Probability grid: n_q evenly-spaced points in [0, 1]
    quantiles = np.linspace(0, 1, n_q)
    # Corresponding Gaussian quantiles Phi^{-1}(p), clipped away from +/-inf
    g_q = ndtri(np.clip(quantiles, self.eps, 1 - self.eps))
    for i in range(self.n_features_):
        xi_sorted = np.sort(X[:, i])
        # Empirical quantile values at each probability level
        x_q = np.quantile(xi_sorted, quantiles)
        # Remove duplicate x values so PchipInterpolator gets a strictly
        # increasing sequence (duplicates arise with discrete/tied data).
        x_q_u, idx = np.unique(x_q, return_index=True)
        g_q_u = g_q[idx]
        self.splines_.append(PchipInterpolator(x_q_u, g_q_u))
        self.inv_splines_.append(PchipInterpolator(g_q_u, x_q_u))
    return self
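For readers who want to see the construction outside the class, here is a standalone sketch using SciPy directly (the helper name `fit_marginal_splines` is illustrative, not part of the rbig API). It builds the forward and inverse PCHIP splines for a single feature, including the duplicate-knot removal needed for tied or discrete data:

```python
import numpy as np
from scipy.interpolate import PchipInterpolator
from scipy.special import ndtri

def fit_marginal_splines(x, n_quantiles=100, eps=1e-6):
    """Forward (x -> z) and inverse (z -> x) PCHIP splines for one feature."""
    n_q = min(n_quantiles, x.shape[0])
    probs = np.linspace(0, 1, n_q)
    # Gaussian quantiles Phi^{-1}(p), clipped away from +/-inf
    g_q = ndtri(np.clip(probs, eps, 1 - eps))
    # Empirical quantiles of the data at the same probability levels
    x_q = np.quantile(x, probs)
    # Drop duplicate knots so the interpolator sees a strictly increasing grid
    x_u, idx = np.unique(x_q, return_index=True)
    return PchipInterpolator(x_u, g_q[idx]), PchipInterpolator(g_q[idx], x_u)

rng = np.random.default_rng(0)
x = np.round(rng.standard_normal(500), 1)  # rounding creates ties deliberately
fwd, inv = fit_marginal_splines(x)
z = fwd(x)        # finite despite tied values, thanks to duplicate removal
x_back = inv(z)   # approximate round-trip through the inverse spline
print(np.all(np.isfinite(z)), np.max(np.abs(x_back - x)))
```

The inverse spline is fitted on the same knot pairs in reverse, so the round-trip is exact at the knots and only approximate between them.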

transform(X)

Apply the forward spline map: x -> z = S(x).

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input data in the original space.

Returns

Xt : np.ndarray of shape (n_samples, n_features)
    Gaussianized data with approximately standard-normal marginals.

Source code in rbig/_src/marginal.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the forward spline map: x -> z = S(x).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data in the original space.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Gaussianized data with approximately standard-normal marginals.
    """
    Xt = np.zeros_like(X, dtype=float)
    for i in range(self.n_features_):
        # Evaluate the forward PCHIP spline at the input values
        Xt[:, i] = self.splines_[i](X[:, i])
    return Xt

inverse_transform(X)

Apply the inverse spline map: z -> x = S^{-1}(z).

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Data in the Gaussianized space.

Returns

Xt : np.ndarray of shape (n_samples, n_features)
    Data approximately recovered in the original input space.

Source code in rbig/_src/marginal.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the inverse spline map: z -> x = S^{-1}(z).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in the Gaussianized space.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Data approximately recovered in the original input space.
    """
    Xt = np.zeros_like(X, dtype=float)
    for i in range(self.n_features_):
        # Evaluate the inverse PCHIP spline at the Gaussian values
        Xt[:, i] = self.inv_splines_[i](X[:, i])
    return Xt

get_log_det_jacobian(X)

Compute log |det J| using the analytic spline first derivative.

Because the Jacobian is diagonal::

log|det J| = sum_i log|S'(x_i)|

where S' is the first derivative of the PCHIP forward spline, evaluated via spline(x, 1) (the derivative order argument).

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input points at which to evaluate the log-det-Jacobian.

Returns

log_det : np.ndarray of shape (n_samples,)
    Per-sample log absolute determinant.

Source code in rbig/_src/marginal.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Compute log |det J| using the analytic spline first derivative.

    Because the Jacobian is diagonal::

        log|det J| = sum_i log|S'(x_i)|

    where ``S'`` is the first derivative of the PCHIP forward spline,
    evaluated via ``spline(x, 1)`` (the derivative order argument).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points at which to evaluate the log-det-Jacobian.

    Returns
    -------
    log_det : np.ndarray of shape (n_samples,)
        Per-sample log absolute determinant.
    """
    log_det = np.zeros(X.shape[0])
    for i in range(self.n_features_):
        deriv = self.splines_[i](X[:, i], 1)  # first derivative
        log_det += np.log(np.maximum(np.abs(deriv), 1e-300))
    return log_det
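As a sanity check, the analytic derivative returned by `spline(x, 1)` can be compared against a central finite difference; the two should agree closely since PCHIP interpolants are continuously differentiable. A standalone sketch using SciPy directly (not the rbig API):

```python
import numpy as np
from scipy.interpolate import PchipInterpolator
from scipy.special import ndtri

rng = np.random.default_rng(1)
x = np.sort(rng.standard_normal(200))
probs = np.linspace(0, 1, 50)
g = ndtri(np.clip(probs, 1e-6, 1 - 1e-6))
knots, idx = np.unique(np.quantile(x, probs), return_index=True)
spline = PchipInterpolator(knots, g[idx])

pts = np.linspace(-1.5, 1.5, 7)
analytic = spline(pts, 1)  # derivative-order argument: first derivative
h = 1e-6
numeric = (spline(pts + h) - spline(pts - h)) / (2 * h)  # central difference
# per-point log|dz/dx| as used by get_log_det_jacobian, floored to stay finite
log_det = np.log(np.maximum(np.abs(analytic), 1e-300))
print(np.max(np.abs(analytic - numeric)))
```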

rbig.KDEGaussianizer

Bases: Bijector

Gaussianize each marginal using a KDE-estimated CDF and probit.

Fits a Gaussian kernel density estimate (KDE) to each feature dimension, then maps samples to standard-normal values via::

z = Phi^{-1}(F_KDE(x))

where F_KDE(x) = integral_{-inf}^{x} f_KDE(t) dt is the smooth KDE-based CDF and Phi^{-1} is the Gaussian probit (inverse CDF).

Parameters

bw_method : str, float, or None, default None
    Bandwidth selection passed to :class:`scipy.stats.gaussian_kde`.
    ``None`` uses Scott's rule; ``'silverman'`` uses Silverman's rule;
    a scalar sets the smoothing factor directly.
eps : float, default 1e-6
    Clipping margin applied to the CDF value before the probit to
    prevent ``Phi^{-1}(0) = -inf`` or ``Phi^{-1}(1) = +inf``.

Attributes

kdes_ : list of scipy.stats.gaussian_kde
    One fitted KDE per feature dimension.
n_features_ : int
    Number of feature dimensions seen during ``fit``.

Notes

The log-det-Jacobian uses the analytic KDE density::

log|dz/dx| = log f_KDE(x) - log phi(z)

where phi is the standard-normal PDF evaluated at the Gaussianized value z = Phi^{-1}(F_KDE(x)).

The inverse transform uses Brent's root-finding algorithm to numerically invert F_KDE(x) = Phi(z) on the interval [-100, 100].

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Examples

>>> import numpy as np
>>> from rbig._src.marginal import KDEGaussianizer
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((100, 2))
>>> kde = KDEGaussianizer().fit(X)
>>> Z = kde.transform(X)
>>> Z.shape
(100, 2)
>>> ldj = kde.get_log_det_jacobian(X)
>>> ldj.shape
(100,)

Source code in rbig/_src/marginal.py
class KDEGaussianizer(Bijector):
    """Gaussianize each marginal using a KDE-estimated CDF and probit.

    Fits a Gaussian kernel density estimate (KDE) to each feature dimension,
    then maps samples to standard-normal values via::

        z = Phi^{-1}(F_KDE(x))

    where ``F_KDE(x) = integral_{-inf}^{x} f_KDE(t) dt`` is the smooth
    KDE-based CDF and ``Phi^{-1}`` is the Gaussian probit (inverse CDF).

    Parameters
    ----------
    bw_method : str, float, or None, default None
        Bandwidth selection passed to :class:`scipy.stats.gaussian_kde`.
        ``None`` uses Scott's rule; ``'silverman'`` uses Silverman's rule;
        a scalar sets the smoothing factor directly.
    eps : float, default 1e-6
        Clipping margin applied to the CDF value before the probit to
        prevent ``Phi^{-1}(0) = -inf`` or ``Phi^{-1}(1) = +inf``.

    Attributes
    ----------
    kdes_ : list of scipy.stats.gaussian_kde
        One fitted KDE per feature dimension.
    n_features_ : int
        Number of feature dimensions seen during ``fit``.

    Notes
    -----
    The log-det-Jacobian uses the analytic KDE density::

        log|dz/dx| = log f_KDE(x) - log phi(z)

    where ``phi`` is the standard-normal PDF evaluated at the Gaussianized
    value ``z = Phi^{-1}(F_KDE(x))``.

    The inverse transform uses Brent's root-finding algorithm to numerically
    invert ``F_KDE(x) = Phi(z)`` on the interval [-100, 100].

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.marginal import KDEGaussianizer
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((100, 2))
    >>> kde = KDEGaussianizer().fit(X)
    >>> Z = kde.transform(X)
    >>> Z.shape
    (100, 2)
    >>> ldj = kde.get_log_det_jacobian(X)
    >>> ldj.shape
    (100,)
    """

    def __init__(self, bw_method: str | float | None = None, eps: float = 1e-6):
        self.bw_method = bw_method
        self.eps = eps

    def fit(self, X: np.ndarray, y=None) -> KDEGaussianizer:
        """Fit a Gaussian KDE to each feature dimension.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.

        Returns
        -------
        self : KDEGaussianizer
            Fitted bijector instance.
        """
        self.kdes_ = []
        self.n_features_ = X.shape[1]
        for i in range(self.n_features_):
            # Independent KDE per feature
            self.kdes_.append(stats.gaussian_kde(X[:, i], bw_method=self.bw_method))
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Map each feature to N(0, 1) via KDE CDF then probit.

        Computes ``z = Phi^{-1}(F_KDE(x))`` for each feature independently.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data in the original space.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Gaussianized data with approximately standard-normal marginals.
        """
        Xt = np.zeros_like(X, dtype=float)
        for i in range(self.n_features_):
            # Numerical integration of KDE from -inf to each sample value
            u = np.array(
                [self.kdes_[i].integrate_box_1d(-np.inf, xi) for xi in X[:, i]]
            )
            # Clip CDF values away from boundaries before probit
            u = np.clip(u, self.eps, 1 - self.eps)
            Xt[:, i] = ndtri(u)
        return Xt

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Map standard-normal values back to the original space.

        Numerically inverts ``F_KDE(x) = Phi(z)`` using Brent's method
        on the interval [-100, 100] per sample and feature.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in the Gaussianized (standard-normal) space.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Data approximately recovered in the original input space.
            Samples for which root-finding fails default to 0.0.
        """
        from scipy.optimize import brentq

        Xt = np.zeros_like(X, dtype=float)
        for i in range(self.n_features_):
            for j, xj in enumerate(X[:, i]):
                # Map z -> u = Phi(z) in (0, 1)
                u = stats.norm.cdf(xj)
                try:
                    # Find x such that F_KDE(x) = u
                    Xt[j, i] = brentq(
                        lambda x, u=u, i=i: (
                            self.kdes_[i].integrate_box_1d(-np.inf, x) - u
                        ),
                        -100,
                        100,
                    )
                except ValueError:
                    # Root not bracketed in [-100, 100]; use zero as fallback
                    Xt[j, i] = 0.0
        return Xt

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Compute log |det J| using the analytic KDE density.

        Because the Jacobian is diagonal (each feature transformed
        independently)::

            log|det J| = sum_i log|dz_i/dx_i|
                       = sum_i [log f_KDE(x_i) - log phi(z_i)]

        where ``phi`` is the standard-normal PDF evaluated at
        ``z_i = Phi^{-1}(F_KDE(x_i))``.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points at which to evaluate the log-det-Jacobian.

        Returns
        -------
        log_det : np.ndarray of shape (n_samples,)
            Per-sample log absolute determinant.
        """
        log_det = np.zeros(X.shape[0])
        for i in range(self.n_features_):
            # Evaluate KDE density (used as the empirical marginal PDF)
            pdf = self.kdes_[i](X[:, i])
            # Compute KDE CDF via numerical integration
            u = np.array(
                [self.kdes_[i].integrate_box_1d(-np.inf, xi) for xi in X[:, i]]
            )
            u = np.clip(u, self.eps, 1 - self.eps)
            g = ndtri(u)  # Gaussianized value z = Phi^{-1}(u)
            log_phi = stats.norm.logpdf(g)  # log phi(z)
            # log|dz/dx| = log f_KDE(x) - log phi(z)
            log_det += np.log(np.maximum(pdf, 1e-300)) - log_phi
        return log_det

fit(X, y=None)

Fit a Gaussian KDE to each feature dimension.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Training data.

Returns

self : KDEGaussianizer
    Fitted bijector instance.

Source code in rbig/_src/marginal.py
def fit(self, X: np.ndarray, y=None) -> KDEGaussianizer:
    """Fit a Gaussian KDE to each feature dimension.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.

    Returns
    -------
    self : KDEGaussianizer
        Fitted bijector instance.
    """
    self.kdes_ = []
    self.n_features_ = X.shape[1]
    for i in range(self.n_features_):
        # Independent KDE per feature
        self.kdes_.append(stats.gaussian_kde(X[:, i], bw_method=self.bw_method))
    return self

transform(X)

Map each feature to N(0, 1) via KDE CDF then probit.

Computes z = Phi^{-1}(F_KDE(x)) for each feature independently.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input data in the original space.

Returns

Xt : np.ndarray of shape (n_samples, n_features)
    Gaussianized data with approximately standard-normal marginals.

Source code in rbig/_src/marginal.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Map each feature to N(0, 1) via KDE CDF then probit.

    Computes ``z = Phi^{-1}(F_KDE(x))`` for each feature independently.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data in the original space.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Gaussianized data with approximately standard-normal marginals.
    """
    Xt = np.zeros_like(X, dtype=float)
    for i in range(self.n_features_):
        # Numerical integration of KDE from -inf to each sample value
        u = np.array(
            [self.kdes_[i].integrate_box_1d(-np.inf, xi) for xi in X[:, i]]
        )
        # Clip CDF values away from boundaries before probit
        u = np.clip(u, self.eps, 1 - self.eps)
        Xt[:, i] = ndtri(u)
    return Xt

inverse_transform(X)

Map standard-normal values back to the original space.

Numerically inverts F_KDE(x) = Phi(z) using Brent's method on the interval [-100, 100] per sample and feature.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Data in the Gaussianized (standard-normal) space.

Returns

Xt : np.ndarray of shape (n_samples, n_features)
    Data approximately recovered in the original input space.
    Samples for which root-finding fails default to 0.0.

Source code in rbig/_src/marginal.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Map standard-normal values back to the original space.

    Numerically inverts ``F_KDE(x) = Phi(z)`` using Brent's method
    on the interval [-100, 100] per sample and feature.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in the Gaussianized (standard-normal) space.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Data approximately recovered in the original input space.
        Samples for which root-finding fails default to 0.0.
    """
    from scipy.optimize import brentq

    Xt = np.zeros_like(X, dtype=float)
    for i in range(self.n_features_):
        for j, xj in enumerate(X[:, i]):
            # Map z -> u = Phi(z) in (0, 1)
            u = stats.norm.cdf(xj)
            try:
                # Find x such that F_KDE(x) = u
                Xt[j, i] = brentq(
                    lambda x, u=u, i=i: (
                        self.kdes_[i].integrate_box_1d(-np.inf, x) - u
                    ),
                    -100,
                    100,
                )
            except ValueError:
                # Root not bracketed in [-100, 100]; use zero as fallback
                Xt[j, i] = 0.0
    return Xt
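The numerical inversion can be reproduced with SciPy alone. A minimal standalone sketch (not the rbig API): build a 1-D KDE, then solve `F_KDE(x) = Phi(z)` for a single Gaussian value `z` with `brentq`:

```python
import numpy as np
from scipy import stats
from scipy.optimize import brentq

rng = np.random.default_rng(0)
kde = stats.gaussian_kde(rng.standard_normal(300))

def kde_cdf(x):
    # F_KDE(x) = integral of the KDE density from -inf to x
    return kde.integrate_box_1d(-np.inf, x)

z = 0.7                # a value in the Gaussianized space
u = stats.norm.cdf(z)  # target CDF level u = Phi(z)
# Find x with F_KDE(x) = u; the bracket [-100, 100] mirrors the method above
x_star = brentq(lambda x: kde_cdf(x) - u, -100, 100)
print(x_star, kde_cdf(x_star))
```

If the root is not bracketed by [-100, 100], `brentq` raises `ValueError`; the method above catches that and falls back to 0.0.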

get_log_det_jacobian(X)

Compute log |det J| using the analytic KDE density.

Because the Jacobian is diagonal (each feature transformed independently)::

log|det J| = sum_i log|dz_i/dx_i|
           = sum_i [log f_KDE(x_i) - log phi(z_i)]

where phi is the standard-normal PDF evaluated at z_i = Phi^{-1}(F_KDE(x_i)).

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input points at which to evaluate the log-det-Jacobian.

Returns

log_det : np.ndarray of shape (n_samples,)
    Per-sample log absolute determinant.

Source code in rbig/_src/marginal.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Compute log |det J| using the analytic KDE density.

    Because the Jacobian is diagonal (each feature transformed
    independently)::

        log|det J| = sum_i log|dz_i/dx_i|
                   = sum_i [log f_KDE(x_i) - log phi(z_i)]

    where ``phi`` is the standard-normal PDF evaluated at
    ``z_i = Phi^{-1}(F_KDE(x_i))``.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points at which to evaluate the log-det-Jacobian.

    Returns
    -------
    log_det : np.ndarray of shape (n_samples,)
        Per-sample log absolute determinant.
    """
    log_det = np.zeros(X.shape[0])
    for i in range(self.n_features_):
        # Evaluate KDE density (used as the empirical marginal PDF)
        pdf = self.kdes_[i](X[:, i])
        # Compute KDE CDF via numerical integration
        u = np.array(
            [self.kdes_[i].integrate_box_1d(-np.inf, xi) for xi in X[:, i]]
        )
        u = np.clip(u, self.eps, 1 - self.eps)
        g = ndtri(u)  # Gaussianized value z = Phi^{-1}(u)
        log_phi = stats.norm.logpdf(g)  # log phi(z)
        # log|dz/dx| = log f_KDE(x) - log phi(z)
        log_det += np.log(np.maximum(pdf, 1e-300)) - log_phi
    return log_det
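A useful consequence, shown here as a standalone numerical sketch (SciPy only, not the rbig API): combining this log-det-Jacobian with the standard-normal base density recovers exactly the KDE log-density, because the `log phi(z)` terms cancel in the change-of-variables formula:

```python
import numpy as np
from scipy import stats
from scipy.special import ndtri

rng = np.random.default_rng(0)
kde = stats.gaussian_kde(rng.standard_normal(300))

x = np.linspace(-2, 2, 9)
u = np.clip([kde.integrate_box_1d(-np.inf, xi) for xi in x], 1e-6, 1 - 1e-6)
z = ndtri(u)                                     # Gaussianized values
log_det = np.log(kde(x)) - stats.norm.logpdf(z)  # log|dz/dx|
flow_logp = stats.norm.logpdf(z) + log_det       # log p(x) via change of variables
# flow_logp equals log f_KDE(x) identically, up to floating-point rounding
print(np.max(np.abs(flow_logp - np.log(kde(x)))))
```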

rbig.GMMGaussianizer

Bases: Bijector

Gaussianize each marginal using a Gaussian Mixture Model (GMM) CDF.

Fits a univariate GMM with n_components Gaussian components to each feature dimension, then maps samples to standard-normal values via the analytic GMM CDF::

F_GMM(x) = sum_k w_k * Phi((x - mu_k) / sigma_k)

followed by the probit function::

z = Phi^{-1}(F_GMM(x))

where Phi is the standard-normal CDF, w_k are mixture weights, and mu_k, sigma_k are the component means and standard deviations.

Parameters

n_components : int, default 5
    Number of mixture components. Capped at
    ``max(1, min(n_components, n_samples // 5, n_samples))`` during
    ``fit`` to avoid over-fitting on small data sets.
random_state : int or None, default 0
    Seed for reproducible GMM initialisation.

Attributes

gmms_ : list of sklearn.mixture.GaussianMixture
    One fitted 1-D GMM per feature dimension.
n_features_ : int
    Number of feature dimensions seen during ``fit``.

Notes

The log-det-Jacobian uses the analytic GMM density::

log|dz/dx| = log f_GMM(x) - log phi(z)

where f_GMM(x) = sum_k w_k * phi((x - mu_k) / sigma_k) / sigma_k is the GMM PDF and phi is the standard-normal PDF.

The inverse transform numerically inverts the GMM CDF via Brent's method on [-50, 50]; samples outside this range default to 0.0.

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Examples

>>> import numpy as np
>>> from rbig._src.marginal import GMMGaussianizer
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((200, 2))
>>> gmm = GMMGaussianizer(n_components=3).fit(X)
>>> Z = gmm.transform(X)
>>> Z.shape
(200, 2)
>>> ldj = gmm.get_log_det_jacobian(X)
>>> ldj.shape
(200,)

Source code in rbig/_src/marginal.py
class GMMGaussianizer(Bijector):
    """Gaussianize each marginal using a Gaussian Mixture Model (GMM) CDF.

    Fits a univariate GMM with ``n_components`` Gaussian components to each
    feature dimension, then maps samples to standard-normal values via the
    analytic GMM CDF::

        F_GMM(x) = sum_k w_k * Phi((x - mu_k) / sigma_k)

    followed by the probit function::

        z = Phi^{-1}(F_GMM(x))

    where ``Phi`` is the standard-normal CDF, ``w_k`` are mixture weights,
    and ``mu_k``, ``sigma_k`` are the component means and standard deviations.

    Parameters
    ----------
    n_components : int, default 5
        Number of mixture components.  Capped at
        ``max(1, min(n_components, n_samples // 5, n_samples))`` during
        ``fit`` to avoid over-fitting on small data sets.
    random_state : int or None, default 0
        Seed for reproducible GMM initialisation.

    Attributes
    ----------
    gmms_ : list of sklearn.mixture.GaussianMixture
        One fitted 1-D GMM per feature dimension.
    n_features_ : int
        Number of feature dimensions seen during ``fit``.

    Notes
    -----
    The log-det-Jacobian uses the analytic GMM density::

        log|dz/dx| = log f_GMM(x) - log phi(z)

    where ``f_GMM(x) = sum_k w_k * phi((x - mu_k) / sigma_k) / sigma_k``
    is the GMM PDF and ``phi`` is the standard-normal PDF.

    The inverse transform numerically inverts the GMM CDF via Brent's
    method on [-50, 50]; samples outside this range default to 0.0.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.marginal import GMMGaussianizer
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((200, 2))
    >>> gmm = GMMGaussianizer(n_components=3).fit(X)
    >>> Z = gmm.transform(X)
    >>> Z.shape
    (200, 2)
    >>> ldj = gmm.get_log_det_jacobian(X)
    >>> ldj.shape
    (200,)
    """

    def __init__(self, n_components: int = 5, random_state: int | None = 0):
        self.n_components = n_components
        self.random_state = random_state

    def fit(self, X: np.ndarray, y=None) -> GMMGaussianizer:
        """Fit a univariate GMM to each feature dimension.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.

        Returns
        -------
        self : GMMGaussianizer
            Fitted bijector instance.

        Notes
        -----
        The number of mixture components is capped at
        ``max(1, min(n_components, n_samples // 5, n_samples))`` to avoid
        over-fitting when ``n_samples`` is small.
        """
        from sklearn.mixture import GaussianMixture

        self.gmms_ = []
        self.n_features_ = X.shape[1]
        n_samples = X.shape[0]
        # Cap n_components to avoid GMMs with more components than data points
        n_components = max(1, min(self.n_components, n_samples // 5, n_samples))
        for i in range(self.n_features_):
            gmm = GaussianMixture(
                n_components=n_components,
                random_state=self.random_state,
            )
            # Reshape to (n_samples, 1) as required by sklearn GMM API
            gmm.fit(X[:, i : i + 1])
            self.gmms_.append(gmm)
        return self

    def _cdf(self, gmm, x: np.ndarray) -> np.ndarray:
        """Compute the GMM CDF at points x (1-D).

        Evaluates the mixture CDF::

            F_GMM(x) = sum_k w_k * Phi((x - mu_k) / sigma_k)

        Parameters
        ----------
        gmm : sklearn.mixture.GaussianMixture
            Fitted 1-D GMM.
        x : np.ndarray of shape (n_samples,)
            Query points.

        Returns
        -------
        cdf : np.ndarray of shape (n_samples,)
            GMM CDF values in [0, 1].
        """
        weights = gmm.weights_  # mixture weights, shape (K,)
        means = gmm.means_.ravel()  # component means, shape (K,)
        stds = np.sqrt(gmm.covariances_.ravel())  # component stds, shape (K,)
        cdf = np.zeros_like(x, dtype=float)
        for w, mu, sigma in zip(weights, means, stds, strict=False):
            # Weighted sum of normal CDFs: w_k * Phi((x - mu_k) / sigma_k)
            cdf += w * stats.norm.cdf(x, loc=mu, scale=sigma)
        return cdf

    def _pdf(self, gmm, x: np.ndarray) -> np.ndarray:
        """Compute the GMM PDF at points x (1-D).

        Evaluates the mixture density::

            f_GMM(x) = sum_k w_k * phi((x - mu_k) / sigma_k) / sigma_k

        Parameters
        ----------
        gmm : sklearn.mixture.GaussianMixture
            Fitted 1-D GMM.
        x : np.ndarray of shape (n_samples,)
            Query points.

        Returns
        -------
        pdf : np.ndarray of shape (n_samples,)
            GMM PDF values (>= 0).
        """
        weights = gmm.weights_
        means = gmm.means_.ravel()
        stds = np.sqrt(gmm.covariances_.ravel())
        pdf = np.zeros_like(x, dtype=float)
        for w, mu, sigma in zip(weights, means, stds, strict=False):
            # Weighted sum of normal PDFs: w_k * phi((x - mu_k) / sigma_k) / sigma_k
            pdf += w * stats.norm.pdf(x, loc=mu, scale=sigma)
        return pdf

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Map each feature to N(0, 1) via GMM CDF then probit.

        Applies ``z = Phi^{-1}(F_GMM(x))`` to each feature independently.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data in the original space.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Gaussianized data with approximately standard-normal marginals.
        """
        Xt = np.zeros_like(X, dtype=float)
        for i in range(self.n_features_):
            # Evaluate analytic GMM CDF
            u = self._cdf(self.gmms_[i], X[:, i])
            # Clip away from boundaries before probit
            u = np.clip(u, 1e-6, 1 - 1e-6)
            Xt[:, i] = ndtri(u)
        return Xt

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Map standard-normal values back to the original space.

        Numerically inverts ``F_GMM(x) = Phi(z)`` per sample via Brent's
        method on the interval [-50, 50].

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in the Gaussianized (standard-normal) space.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Data approximately recovered in the original input space.
            Samples for which root-finding fails default to 0.0.
        """
        from scipy.optimize import brentq

        Xt = np.zeros_like(X, dtype=float)
        for i in range(self.n_features_):
            for j, xj in enumerate(X[:, i]):
                # Map z -> u = Phi(z) in (0, 1)
                u = stats.norm.cdf(xj)
                try:
                    # Numerically solve F_GMM(x) = u for x
                    Xt[j, i] = brentq(
                        lambda x, u=u, i=i: (
                            self._cdf(self.gmms_[i], np.array([x]))[0] - u
                        ),
                        -50,
                        50,
                    )
                except ValueError:
                    # Root not found in [-50, 50]; fall back to zero
                    Xt[j, i] = 0.0
        return Xt

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Compute log |det J| using the analytic GMM density.

        Because the Jacobian is diagonal (each feature transformed
        independently)::

            log|det J| = sum_i log|dz_i/dx_i|
                       = sum_i [log f_GMM(x_i) - log phi(z_i)]

        where ``z_i = Phi^{-1}(F_GMM(x_i))`` and ``phi`` is the
        standard-normal PDF.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points at which to evaluate the log-det-Jacobian.

        Returns
        -------
        log_det : np.ndarray of shape (n_samples,)
            Per-sample log absolute determinant.
        """
        log_det = np.zeros(X.shape[0])
        for i in range(self.n_features_):
            # Evaluate GMM CDF and clip to avoid probit boundary issues
            u = self._cdf(self.gmms_[i], X[:, i])
            u = np.clip(u, 1e-6, 1 - 1e-6)
            g = ndtri(u)  # z = Phi^{-1}(F_GMM(x))
            pdf = self._pdf(self.gmms_[i], X[:, i])  # f_GMM(x)
            log_phi = stats.norm.logpdf(g)  # log phi(z)
            # log|dz/dx| = log f_GMM(x) - log phi(z)
            log_det += np.log(np.maximum(pdf, 1e-300)) - log_phi
        return log_det
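The CDF-then-probit pipeline above can be sketched with sklearn's `GaussianMixture` directly. This is a self-contained illustration of the same math, not the `GMMGaussianizer` API itself:

```python
import numpy as np
from scipy.special import ndtri
from scipy.stats import norm
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Bimodal 1D data the GMM can fit almost exactly
x = np.concatenate([rng.normal(-2.0, 0.5, 250), rng.normal(1.0, 1.0, 250)])

gmm = GaussianMixture(n_components=2, random_state=0).fit(x[:, None])
w = gmm.weights_
mu = gmm.means_.ravel()
sigma = np.sqrt(gmm.covariances_.ravel())

# Analytic mixture CDF: F_GMM(x) = sum_k w_k * Phi((x - mu_k) / sigma_k)
u = np.sum(w * norm.cdf((x[:, None] - mu) / sigma), axis=1)
u = np.clip(u, 1e-6, 1 - 1e-6)  # keep the probit finite at the boundaries
z = ndtri(u)                     # z = Phi^{-1}(F_GMM(x)), approximately N(0, 1)
```

Because the fitted mixture closely matches the data-generating distribution, `u` is approximately uniform and `z` approximately standard normal.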

fit(X, y=None)

Fit a univariate GMM to each feature dimension.

Parameters

X : np.ndarray of shape (n_samples, n_features) Training data.

Returns

self : GMMGaussianizer Fitted bijector instance.

Notes

The number of mixture components is capped at max(1, min(n_components, n_samples // 5, n_samples)) to avoid over-fitting when n_samples is small.

Source code in rbig/_src/marginal.py
def fit(self, X: np.ndarray, y=None) -> GMMGaussianizer:
    """Fit a univariate GMM to each feature dimension.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.

    Returns
    -------
    self : GMMGaussianizer
        Fitted bijector instance.

    Notes
    -----
    The number of mixture components is capped at
    ``max(1, min(n_components, n_samples // 5, n_samples))`` to avoid
    over-fitting when ``n_samples`` is small.
    """
    from sklearn.mixture import GaussianMixture

    self.gmms_ = []
    self.n_features_ = X.shape[1]
    n_samples = X.shape[0]
    # Cap n_components to avoid GMMs with more components than data points
    n_components = max(1, min(self.n_components, n_samples // 5, n_samples))
    for i in range(self.n_features_):
        gmm = GaussianMixture(
            n_components=n_components,
            random_state=self.random_state,
        )
        # Slice as a (n_samples, 1) column, as required by the sklearn GMM API
        gmm.fit(X[:, i : i + 1])
        self.gmms_.append(gmm)
    return self
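The cap described in the Notes can be checked directly with hypothetical numbers:

```python
# With 12 samples and a requested 10 components, the cap
# max(1, min(n_components, n_samples // 5, n_samples)) keeps only 2
n_components, n_samples = 10, 12
capped = max(1, min(n_components, n_samples // 5, n_samples))
```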

transform(X)

Map each feature to N(0, 1) via GMM CDF then probit.

Applies z = Phi^{-1}(F_GMM(x)) to each feature independently.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data in the original space.

Returns

Xt : np.ndarray of shape (n_samples, n_features) Gaussianized data with approximately standard-normal marginals.

Source code in rbig/_src/marginal.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Map each feature to N(0, 1) via GMM CDF then probit.

    Applies ``z = Phi^{-1}(F_GMM(x))`` to each feature independently.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data in the original space.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Gaussianized data with approximately standard-normal marginals.
    """
    Xt = np.zeros_like(X, dtype=float)
    for i in range(self.n_features_):
        # Evaluate analytic GMM CDF
        u = self._cdf(self.gmms_[i], X[:, i])
        # Clip away from boundaries before probit
        u = np.clip(u, 1e-6, 1 - 1e-6)
        Xt[:, i] = ndtri(u)
    return Xt

inverse_transform(X)

Map standard-normal values back to the original space.

Numerically inverts F_GMM(x) = Phi(z) per sample via Brent's method on the interval [-50, 50].

Parameters

X : np.ndarray of shape (n_samples, n_features) Data in the Gaussianized (standard-normal) space.

Returns

Xt : np.ndarray of shape (n_samples, n_features) Data approximately recovered in the original input space. Samples for which root-finding fails default to 0.0.

Source code in rbig/_src/marginal.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Map standard-normal values back to the original space.

    Numerically inverts ``F_GMM(x) = Phi(z)`` per sample via Brent's
    method on the interval [-50, 50].

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in the Gaussianized (standard-normal) space.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Data approximately recovered in the original input space.
        Samples for which root-finding fails default to 0.0.
    """
    from scipy.optimize import brentq

    Xt = np.zeros_like(X, dtype=float)
    for i in range(self.n_features_):
        for j, xj in enumerate(X[:, i]):
            # Map z -> u = Phi(z) in (0, 1)
            u = stats.norm.cdf(xj)
            try:
                # Numerically solve F_GMM(x) = u for x
                Xt[j, i] = brentq(
                    lambda x, u=u, i=i: (
                        self._cdf(self.gmms_[i], np.array([x]))[0] - u
                    ),
                    -50,
                    50,
                )
            except ValueError:
                # Root not found in [-50, 50]; fall back to zero
                Xt[j, i] = 0.0
    return Xt

get_log_det_jacobian(X)

Compute log |det J| using the analytic GMM density.

Because the Jacobian is diagonal (each feature transformed independently)::

log|det J| = sum_i log|dz_i/dx_i|
           = sum_i [log f_GMM(x_i) - log phi(z_i)]

where z_i = Phi^{-1}(F_GMM(x_i)) and phi is the standard-normal PDF.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input points at which to evaluate the log-det-Jacobian.

Returns

log_det : np.ndarray of shape (n_samples,) Per-sample log absolute determinant.

Source code in rbig/_src/marginal.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Compute log |det J| using the analytic GMM density.

    Because the Jacobian is diagonal (each feature transformed
    independently)::

        log|det J| = sum_i log|dz_i/dx_i|
                   = sum_i [log f_GMM(x_i) - log phi(z_i)]

    where ``z_i = Phi^{-1}(F_GMM(x_i))`` and ``phi`` is the
    standard-normal PDF.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points at which to evaluate the log-det-Jacobian.

    Returns
    -------
    log_det : np.ndarray of shape (n_samples,)
        Per-sample log absolute determinant.
    """
    log_det = np.zeros(X.shape[0])
    for i in range(self.n_features_):
        # Evaluate GMM CDF and clip to avoid probit boundary issues
        u = self._cdf(self.gmms_[i], X[:, i])
        u = np.clip(u, 1e-6, 1 - 1e-6)
        g = ndtri(u)  # z = Phi^{-1}(F_GMM(x))
        pdf = self._pdf(self.gmms_[i], X[:, i])  # f_GMM(x)
        log_phi = stats.norm.logpdf(g)  # log phi(z)
        # log|dz/dx| = log f_GMM(x) - log phi(z)
        log_det += np.log(np.maximum(pdf, 1e-300)) - log_phi
    return log_det
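The identity `log|dz/dx| = log f_GMM(x) - log phi(z)` can be verified numerically on a fixed mixture (hypothetical parameters) by comparing against a central finite difference:

```python
import numpy as np
from scipy.special import ndtri
from scipy.stats import norm

# A fixed two-component mixture standing in for a fitted GMM marginal
w = np.array([0.4, 0.6])
mu = np.array([-1.0, 2.0])
sigma = np.array([0.8, 1.2])

def F(x):  # mixture CDF
    return np.sum(w * norm.cdf((x[:, None] - mu) / sigma), axis=1)

def f(x):  # mixture PDF
    return np.sum(w / sigma * norm.pdf((x[:, None] - mu) / sigma), axis=1)

x = np.linspace(-3.0, 4.0, 9)
z = ndtri(np.clip(F(x), 1e-6, 1 - 1e-6))

# Analytic log-derivative: log|dz/dx| = log f(x) - log phi(z)
analytic = np.log(f(x)) - norm.logpdf(z)

# Central finite difference on z(x) as an independent check
eps = 1e-6
numeric = np.log((ndtri(F(x + eps)) - ndtri(F(x - eps))) / (2 * eps))
```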

rbig.QuantileGaussianizer

Bases: Bijector

Gaussianize each marginal using sklearn's QuantileTransformer.

Wraps :class:sklearn.preprocessing.QuantileTransformer configured with output_distribution='normal' to map each feature to an approximately standard-normal distribution. The quantile transform is a step-function CDF estimate that is particularly robust to outliers.

Parameters

n_quantiles : int, default 1000 Number of quantile nodes used to define the piecewise-linear mapping. Capped at n_samples during fit to avoid requesting more quantiles than there are training points. random_state : int or None, default 0 Seed for reproducible subsampling inside QuantileTransformer.

Attributes

qt_ : sklearn.preprocessing.QuantileTransformer Fitted quantile transformer with output_distribution='normal'. n_features_ : int Number of feature dimensions seen during fit.

Notes

The log-absolute-Jacobian is estimated via central finite differences::

dz_i/dx_i ~= (z_i(x + eps*e_i) - z_i(x - eps*e_i)) / (2*eps)

with eps = 1e-5. This approximation may be inaccurate near discontinuities of the piecewise quantile function.

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Examples

>>> import numpy as np
>>> from rbig._src.marginal import QuantileGaussianizer
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((200, 3))
>>> qg = QuantileGaussianizer().fit(X)
>>> Z = qg.transform(X)
>>> Z.shape
(200, 3)
>>> Xr = qg.inverse_transform(Z)
>>> Xr.shape
(200, 3)

Source code in rbig/_src/marginal.py
class QuantileGaussianizer(Bijector):
    """Gaussianize each marginal using sklearn's QuantileTransformer.

    Wraps :class:`sklearn.preprocessing.QuantileTransformer` configured with
    ``output_distribution='normal'`` to map each feature to an approximately
    standard-normal distribution.  The quantile transform is a step-function
    CDF estimate that is particularly robust to outliers.

    Parameters
    ----------
    n_quantiles : int, default 1000
        Number of quantile nodes used to define the piecewise-linear mapping.
        Capped at ``n_samples`` during ``fit`` to avoid requesting more
        quantiles than there are training points.
    random_state : int or None, default 0
        Seed for reproducible subsampling inside ``QuantileTransformer``.

    Attributes
    ----------
    qt_ : sklearn.preprocessing.QuantileTransformer
        Fitted quantile transformer with ``output_distribution='normal'``.
    n_features_ : int
        Number of feature dimensions seen during ``fit``.

    Notes
    -----
    The log-absolute-Jacobian is estimated via central finite differences::

        dz_i/dx_i ~= (z_i(x + eps*e_i) - z_i(x - eps*e_i)) / (2*eps)

    with ``eps = 1e-5``.  This approximation may be inaccurate near
    discontinuities of the piecewise quantile function.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.marginal import QuantileGaussianizer
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((200, 3))
    >>> qg = QuantileGaussianizer().fit(X)
    >>> Z = qg.transform(X)
    >>> Z.shape
    (200, 3)
    >>> Xr = qg.inverse_transform(Z)
    >>> Xr.shape
    (200, 3)
    """

    def __init__(self, n_quantiles: int = 1000, random_state: int | None = 0):
        self.n_quantiles = n_quantiles
        self.random_state = random_state

    def fit(self, X: np.ndarray, y=None) -> QuantileGaussianizer:
        """Fit the quantile transformer to the training data.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.

        Returns
        -------
        self : QuantileGaussianizer
            Fitted bijector instance.
        """
        from sklearn.preprocessing import QuantileTransformer

        # Cap quantile count so it cannot exceed the available samples
        n_quantiles = min(self.n_quantiles, X.shape[0])
        self.qt_ = QuantileTransformer(
            n_quantiles=n_quantiles,
            output_distribution="normal",
            random_state=self.random_state,
        )
        self.qt_.fit(X)
        self.n_features_ = X.shape[1]
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the quantile transform: x -> z approximately N(0, 1).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data.

        Returns
        -------
        Z : np.ndarray of shape (n_samples, n_features)
            Gaussianized data.
        """
        return self.qt_.transform(X)

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Invert the quantile transform: z -> x.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in the Gaussianized (standard-normal) space.

        Returns
        -------
        Xr : np.ndarray of shape (n_samples, n_features)
            Data approximately recovered in the original input space.
        """
        return self.qt_.inverse_transform(X)

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Estimate log |det J| by finite differences on the quantile transform.

        Uses a small perturbation ``eps = 1e-5`` in each dimension::

            dz_i/dx_i ~= (z_i(x + eps*e_i) - z_i(x - eps*e_i)) / (2*eps)

        and sums the log-absolute-derivatives::

            log|det J| = sum_i log|dz_i/dx_i|

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points at which to evaluate the log-det-Jacobian.

        Returns
        -------
        log_det : np.ndarray of shape (n_samples,)
            Per-sample log absolute determinant approximation.

        Notes
        -----
        The quantile transform is piecewise-linear; the finite-difference
        derivative equals the local slope and is exact within each segment.
        """
        eps = 1e-5
        log_det = np.zeros(X.shape[0])
        for i in range(X.shape[1]):
            dummy_plus = X.copy()
            dummy_plus[:, i] = X[:, i] + eps
            dummy_minus = X.copy()
            dummy_minus[:, i] = X[:, i] - eps
            y_plus = self.qt_.transform(dummy_plus)[:, i]
            y_minus = self.qt_.transform(dummy_minus)[:, i]
            # Central-difference derivative for dimension i
            deriv = (y_plus - y_minus) / (2 * eps)
            log_det += np.log(np.maximum(np.abs(deriv), 1e-300))
        return log_det

fit(X, y=None)

Fit the quantile transformer to the training data.

Parameters

X : np.ndarray of shape (n_samples, n_features) Training data.

Returns

self : QuantileGaussianizer Fitted bijector instance.

Source code in rbig/_src/marginal.py
def fit(self, X: np.ndarray, y=None) -> QuantileGaussianizer:
    """Fit the quantile transformer to the training data.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.

    Returns
    -------
    self : QuantileGaussianizer
        Fitted bijector instance.
    """
    from sklearn.preprocessing import QuantileTransformer

    # Cap quantile count so it cannot exceed the available samples
    n_quantiles = min(self.n_quantiles, X.shape[0])
    self.qt_ = QuantileTransformer(
        n_quantiles=n_quantiles,
        output_distribution="normal",
        random_state=self.random_state,
    )
    self.qt_.fit(X)
    self.n_features_ = X.shape[1]
    return self

transform(X)

Apply the quantile transform: x -> z approximately N(0, 1).

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data.

Returns

Z : np.ndarray of shape (n_samples, n_features) Gaussianized data.

Source code in rbig/_src/marginal.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the quantile transform: x -> z approximately N(0, 1).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data.

    Returns
    -------
    Z : np.ndarray of shape (n_samples, n_features)
        Gaussianized data.
    """
    return self.qt_.transform(X)

inverse_transform(X)

Invert the quantile transform: z -> x.

Parameters

X : np.ndarray of shape (n_samples, n_features) Data in the Gaussianized (standard-normal) space.

Returns

Xr : np.ndarray of shape (n_samples, n_features) Data approximately recovered in the original input space.

Source code in rbig/_src/marginal.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Invert the quantile transform: z -> x.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in the Gaussianized (standard-normal) space.

    Returns
    -------
    Xr : np.ndarray of shape (n_samples, n_features)
        Data approximately recovered in the original input space.
    """
    return self.qt_.inverse_transform(X)

get_log_det_jacobian(X)

Estimate log |det J| by finite differences on the quantile transform.

Uses a small perturbation eps = 1e-5 in each dimension::

dz_i/dx_i ~= (z_i(x + eps*e_i) - z_i(x - eps*e_i)) / (2*eps)

and sums the log-absolute-derivatives::

log|det J| = sum_i log|dz_i/dx_i|

Parameters

X : np.ndarray of shape (n_samples, n_features) Input points at which to evaluate the log-det-Jacobian.

Returns

log_det : np.ndarray of shape (n_samples,) Per-sample log absolute determinant approximation.

Notes

The quantile transform is piecewise-linear; the finite-difference derivative equals the local slope and is exact within each segment.

Source code in rbig/_src/marginal.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Estimate log |det J| by finite differences on the quantile transform.

    Uses a small perturbation ``eps = 1e-5`` in each dimension::

        dz_i/dx_i ~= (z_i(x + eps*e_i) - z_i(x - eps*e_i)) / (2*eps)

    and sums the log-absolute-derivatives::

        log|det J| = sum_i log|dz_i/dx_i|

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points at which to evaluate the log-det-Jacobian.

    Returns
    -------
    log_det : np.ndarray of shape (n_samples,)
        Per-sample log absolute determinant approximation.

    Notes
    -----
    The quantile transform is piecewise-linear; the finite-difference
    derivative equals the local slope and is exact within each segment.
    """
    eps = 1e-5
    log_det = np.zeros(X.shape[0])
    for i in range(X.shape[1]):
        dummy_plus = X.copy()
        dummy_plus[:, i] = X[:, i] + eps
        dummy_minus = X.copy()
        dummy_minus[:, i] = X[:, i] - eps
        y_plus = self.qt_.transform(dummy_plus)[:, i]
        y_minus = self.qt_.transform(dummy_minus)[:, i]
        # Central-difference derivative for dimension i
        deriv = (y_plus - y_minus) / (2 * eps)
        log_det += np.log(np.maximum(np.abs(deriv), 1e-300))
    return log_det
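The same finite-difference estimate can be reproduced with sklearn's `QuantileTransformer` directly, independent of the `QuantileGaussianizer` wrapper:

```python
import numpy as np
from sklearn.preprocessing import QuantileTransformer

rng = np.random.default_rng(0)
X = rng.exponential(size=(500, 2))  # skewed data

qt = QuantileTransformer(
    n_quantiles=min(1000, X.shape[0]),  # cap at n_samples, as in fit
    output_distribution="normal",
    random_state=0,
).fit(X)

eps = 1e-5
log_det = np.zeros(X.shape[0])
for i in range(X.shape[1]):
    Xp, Xm = X.copy(), X.copy()
    Xp[:, i] += eps
    Xm[:, i] -= eps
    # Central-difference slope of the piecewise-linear quantile map
    deriv = (qt.transform(Xp)[:, i] - qt.transform(Xm)[:, i]) / (2 * eps)
    log_det += np.log(np.maximum(np.abs(deriv), 1e-300))
```

The `1e-300` floor keeps the log finite at the extremes of the training range, where the clipped quantile map can have zero slope.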

Rotations

rbig.PCARotation

Bases: BaseTransform

PCA-based rotation with optional whitening (decorrelation + rescaling).

Fits a standard PCA (via scikit-learn's :class:~sklearn.decomposition.PCA) and uses it as a linear rotation transform. When whiten=True (default), each principal component is additionally rescaled by the reciprocal square root of its eigenvalue so the output has unit variance per component::

z = Lambda^{-1/2} V^T (x - mu)

where V in R^{D x K} is the matrix of leading eigenvectors (principal axes), Lambda in R^{K x K} is the diagonal eigenvalue matrix, and mu is the sample mean. When whiten=False, the rescaling is omitted and the transform is a pure rotation::

z = V^T (x - mu)

Parameters

n_components : int or None, default None Number of principal components to retain. If None, all D components are kept. whiten : bool, default True If True, divide each component by sqrt(lambda_i) to decorrelate and normalise variance. If False, only rotate (and center).

Attributes

pca_ : sklearn.decomposition.PCA Fitted PCA object containing eigenvectors, eigenvalues, and the sample mean.

Notes

The log-absolute-Jacobian determinant for the whitening transform is::

log|det J| = -1/2 * sum_i log(lambda_i)

because each whitening factor Lambda^{-1/2} contributes -1/2 * log(lambda_i) per component. For a pure rotation (whiten=False), the determinant is 1 and the log is 0.

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Examples

>>> import numpy as np
>>> from rbig._src.rotation import PCARotation
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((200, 4))
>>> pca_rot = PCARotation(whiten=True).fit(X)
>>> Z = pca_rot.transform(X)
>>> Z.shape
(200, 4)
>>> ldj = pca_rot.log_det_jacobian(X)
>>> ldj.shape
(200,)

Source code in rbig/_src/rotation.py
class PCARotation(BaseTransform):
    """PCA-based rotation with optional whitening (decorrelation + rescaling).

    Fits a standard PCA (via scikit-learn's :class:`~sklearn.decomposition.PCA`)
    and uses it as a linear rotation transform.  When ``whiten=True`` (default),
    each principal component is additionally rescaled by the reciprocal square
    root of its eigenvalue so the output has unit variance per component::

        z = Lambda^{-1/2} V^T (x - mu)

    where ``V`` in R^{D x K} is the matrix of leading eigenvectors (principal
    axes), ``Lambda`` in R^{K x K} is the diagonal eigenvalue matrix, and
    ``mu`` is the sample mean.  When ``whiten=False``, the rescaling is
    omitted and the transform is a pure rotation::

        z = V^T (x - mu)

    Parameters
    ----------
    n_components : int or None, default None
        Number of principal components to retain.  If ``None``, all D
        components are kept.
    whiten : bool, default True
        If True, divide each component by sqrt(lambda_i) to decorrelate
        *and* normalise variance.  If False, only rotate (and center).

    Attributes
    ----------
    pca_ : sklearn.decomposition.PCA
        Fitted PCA object containing eigenvectors, eigenvalues, and the
        sample mean.

    Notes
    -----
    The log-absolute-Jacobian determinant for the whitening transform is::

        log|det J| = -1/2 * sum_i log(lambda_i)

    because each whitening factor Lambda^{-1/2} contributes
    ``-1/2 * log(lambda_i)`` per component.  For a pure rotation
    (``whiten=False``), the determinant is 1 and the log is 0.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.rotation import PCARotation
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((200, 4))
    >>> pca_rot = PCARotation(whiten=True).fit(X)
    >>> Z = pca_rot.transform(X)
    >>> Z.shape
    (200, 4)
    >>> ldj = pca_rot.log_det_jacobian(X)
    >>> ldj.shape
    (200,)
    """

    def __init__(self, n_components: int | None = None, whiten: bool = True):
        self.n_components = n_components
        self.whiten = whiten

    def fit(self, X: np.ndarray, y=None) -> PCARotation:
        """Fit PCA to the training data.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.

        Returns
        -------
        self : PCARotation
            Fitted transform instance.
        """
        # Stores eigenvectors, eigenvalues, and mean in pca_
        self.pca_ = PCA(n_components=self.n_components, whiten=self.whiten)
        self.pca_.fit(X)
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply PCA rotation (and optional whitening) to X.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data.

        Returns
        -------
        Z : np.ndarray of shape (n_samples, n_components)
            Rotated (and optionally whitened) data.
        """
        return self.pca_.transform(X)  # (N, D) -> (N, K)

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Invert the PCA rotation (and optional whitening).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_components)
            Data in the PCA / whitened space.

        Returns
        -------
        Xr : np.ndarray of shape (n_samples, n_features)
            Data approximately recovered in the original input space.
        """
        return self.pca_.inverse_transform(X)  # (N, K) -> (N, D)

    def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Log absolute Jacobian determinant (constant for linear transforms).

        For a whitening PCA the Jacobian determinant is constant::

            log|det J| = -1/2 * sum_i log(lambda_i)

        where ``lambda_i`` are the PCA eigenvalues (``explained_variance_``).
        For a plain rotation (``whiten=False``), ``|det J| = 1`` and the
        log is 0.

        .. note::
            This method is only valid when the transform is square (i.e.
            ``n_components`` is ``None`` or equals the number of input
            features).  A dimensionality-reducing PCA (``n_components`` <
            ``n_features``) is not bijective and its Jacobian determinant is
            undefined.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points (used only to determine the number of samples).

        Returns
        -------
        ldj : np.ndarray of shape (n_samples,)
            Constant per-sample log absolute Jacobian determinant.
        """
        if self.whiten:
            # Whitening scales each eigendirection by lambda_i^{-1/2}
            # -> log|det| = -1/2 * sum log(lambda_i)
            log_det = -0.5 * np.sum(np.log(self.pca_.explained_variance_))
        else:
            # Pure rotation: |det Q| = 1 -> log|det| = 0
            log_det = 0.0
        return np.full(X.shape[0], log_det)

fit(X, y=None)

Fit PCA to the training data.

Parameters

X : np.ndarray of shape (n_samples, n_features) Training data.

Returns

self : PCARotation Fitted transform instance.

Source code in rbig/_src/rotation.py
def fit(self, X: np.ndarray, y=None) -> PCARotation:
    """Fit PCA to the training data.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.

    Returns
    -------
    self : PCARotation
        Fitted transform instance.
    """
    # Stores eigenvectors, eigenvalues, and mean in pca_
    self.pca_ = PCA(n_components=self.n_components, whiten=self.whiten)
    self.pca_.fit(X)
    return self

transform(X)

Apply PCA rotation (and optional whitening) to X.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data.

Returns

Z : np.ndarray of shape (n_samples, n_components) Rotated (and optionally whitened) data.

Source code in rbig/_src/rotation.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply PCA rotation (and optional whitening) to X.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data.

    Returns
    -------
    Z : np.ndarray of shape (n_samples, n_components)
        Rotated (and optionally whitened) data.
    """
    return self.pca_.transform(X)  # (N, D) -> (N, K)

inverse_transform(X)

Invert the PCA rotation (and optional whitening).

Parameters

X : np.ndarray of shape (n_samples, n_components) Data in the PCA / whitened space.

Returns

Xr : np.ndarray of shape (n_samples, n_features) Data approximately recovered in the original input space.

Source code in rbig/_src/rotation.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Invert the PCA rotation (and optional whitening).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_components)
        Data in the PCA / whitened space.

    Returns
    -------
    Xr : np.ndarray of shape (n_samples, n_features)
        Data approximately recovered in the original input space.
    """
    return self.pca_.inverse_transform(X)  # (N, K) -> (N, D)

log_det_jacobian(X)

Log absolute Jacobian determinant (constant for linear transforms).

For a whitening PCA the Jacobian determinant is constant::

log|det J| = -1/2 * sum_i log(lambda_i)

where lambda_i are the PCA eigenvalues (explained_variance_). For a plain rotation (whiten=False), |det J| = 1 and the log is 0.

.. note:: This method is only valid when the transform is square (i.e. n_components is None or equals the number of input features). A dimensionality-reducing PCA (n_components < n_features) is not bijective and its Jacobian determinant is undefined.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input points (used only to determine the number of samples).

Returns

ldj : np.ndarray of shape (n_samples,) Constant per-sample log absolute Jacobian determinant.

Source code in rbig/_src/rotation.py
def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Log absolute Jacobian determinant (constant for linear transforms).

    For a whitening PCA the Jacobian determinant is constant::

        log|det J| = -1/2 * sum_i log(lambda_i)

    where ``lambda_i`` are the PCA eigenvalues (``explained_variance_``).
    For a plain rotation (``whiten=False``), ``|det J| = 1`` and the
    log is 0.

    .. note::
        This method is only valid when the transform is square (i.e.
        ``n_components`` is ``None`` or equals the number of input
        features).  A dimensionality-reducing PCA (``n_components`` <
        ``n_features``) is not bijective and its Jacobian determinant is
        undefined.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points (used only to determine the number of samples).

    Returns
    -------
    ldj : np.ndarray of shape (n_samples,)
        Constant per-sample log absolute Jacobian determinant.
    """
    if self.whiten:
        # Whitening scales each eigendirection by lambda_i^{-1/2}
        # -> log|det| = -1/2 * sum log(lambda_i)
        log_det = -0.5 * np.sum(np.log(self.pca_.explained_variance_))
    else:
        # Pure rotation: |det Q| = 1 -> log|det| = 0
        log_det = 0.0
    return np.full(X.shape[0], log_det)
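The closed form can be cross-checked against `numpy.linalg.slogdet` of the effective linear map, using sklearn's `PCA` directly (a sketch, not the `PCARotation` API):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Correlated data so the eigenvalues differ
X = rng.standard_normal((300, 4)) @ rng.standard_normal((4, 4))

pca = PCA(whiten=True).fit(X)  # square transform: all 4 components kept
# Whitening map: z = diag(lambda^{-1/2}) V^T (x - mu)
J = pca.components_ / np.sqrt(pca.explained_variance_)[:, None]
sign, logdet = np.linalg.slogdet(J)

# |det V^T| = 1 for an orthogonal matrix, so only the scaling contributes
closed_form = -0.5 * np.sum(np.log(pca.explained_variance_))
```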

rbig.ICARotation

Bases: BaseTransform

ICA-based rotation using the Picard algorithm or FastICA fallback.

Fits an Independent Component Analysis (ICA) model that learns a linear unmixing matrix. When the optional picard package is available, it is used for faster and more accurate convergence::

s = W K x

where K in R^{K x D} is a pre-whitening matrix and W in R^{K x K} is the ICA unmixing matrix. The combined transform is W K.

If picard is not installed, :class:sklearn.decomposition.FastICA is used as a drop-in replacement.

Parameters

n_components : int or None, default None Number of independent components. If None, all D components are estimated (square unmixing matrix). random_state : int or None, default None Seed for reproducible ICA initialisation.

Attributes

K_ : np.ndarray of shape (n_components, n_features) or None Pre-whitening matrix from the Picard solver. None when using the FastICA fallback. W_ : np.ndarray of shape (n_components, n_components) or None ICA unmixing matrix from the Picard solver. None when using FastICA. ica_ : sklearn.decomposition.FastICA or None Fitted FastICA object used when Picard is unavailable. n_features_in_ : int Number of input features (set only when using Picard).

Notes

The log-absolute-Jacobian determinant is::

log|det J| = log|det(W K)|

for the Picard path, or log|det(components_)| for FastICA. The Jacobian is constant (independent of x) for any linear transform.

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Examples

>>> import numpy as np
>>> from rbig._src.rotation import ICARotation
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((200, 3))
>>> ica = ICARotation(random_state=0).fit(X)
>>> S = ica.transform(X)
>>> S.shape
(200, 3)
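As a sketch of the FastICA fallback path (not the Picard path), the constant Jacobian of the learned linear map can be checked with sklearn directly:

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
S = rng.laplace(size=(1000, 3))   # non-Gaussian sources
A = rng.standard_normal((3, 3))   # mixing matrix
X = S @ A.T

ica = FastICA(random_state=0).fit(X)
# transform applies the constant linear map s = components_ @ (x - mean_),
# so the Jacobian is components_ and its log|det| is independent of x
sign, logdet = np.linalg.slogdet(ica.components_)
```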

Source code in rbig/_src/rotation.py
class ICARotation(BaseTransform):
    """ICA-based rotation using the Picard algorithm or FastICA fallback.

    Fits an Independent Component Analysis (ICA) model that learns a linear
    unmixing matrix.  When the optional ``picard`` package is available, it
    is used for faster and more accurate convergence::

        s = W K x

    where ``K`` in R^{K x D} is a pre-whitening matrix and ``W`` in
    R^{K x K} is the ICA unmixing matrix.  The combined transform is
    ``W K``.

    If ``picard`` is not installed, :class:`sklearn.decomposition.FastICA`
    is used as a drop-in replacement.

    Parameters
    ----------
    n_components : int or None, default None
        Number of independent components.  If ``None``, all D components
        are estimated (square unmixing matrix).
    random_state : int or None, default None
        Seed for reproducible ICA initialisation.
    orthogonal : bool, default True
        If True, apply only the orthogonal rotation ``W_`` and skip the
        whitening ``K_``, so the layer has a zero log-det-Jacobian.
        Requires a square unmixing matrix.

    Attributes
    ----------
    K_ : np.ndarray of shape (n_components, n_features) or None
        Pre-whitening matrix from the Picard solver.  ``None`` when using
        the FastICA fallback.
    W_ : np.ndarray of shape (n_components, n_components) or None
        ICA unmixing matrix from the Picard solver.  ``None`` when using
        FastICA.
    ica_ : sklearn.decomposition.FastICA or None
        Fitted FastICA object used when Picard is unavailable.
    n_features_in_ : int
        Number of input features (set only when using Picard).

    Notes
    -----
    The log-absolute-Jacobian determinant is::

        log|det J| = log|det(W K)|

    for the Picard path, or ``log|det(components_)|`` for FastICA.  The
    Jacobian is constant (independent of x) for any linear transform.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.rotation import ICARotation
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((200, 3))
    >>> ica = ICARotation(random_state=0).fit(X)
    >>> S = ica.transform(X)
    >>> S.shape
    (200, 3)
    """

    def __init__(
        self,
        n_components: int | None = None,
        random_state: int | None = None,
        orthogonal: bool = True,
    ):
        self.n_components = n_components
        self.random_state = random_state
        self.orthogonal = orthogonal

    def fit(self, X: np.ndarray, y=None) -> ICARotation:
        """Fit the ICA model (Picard if available, otherwise FastICA).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.

        Returns
        -------
        self : ICARotation
            Fitted transform instance.
        """
        try:
            from picard import picard

            n = X.shape[1] if self.n_components is None else self.n_components
            # Picard expects data as (n_features, n_samples), so transpose X
            K, W, _ = picard(
                X.T,
                n_components=n,
                random_state=self.random_state,
                max_iter=500,
                tol=1e-5,
            )
            self.K_ = K  # whitening matrix, shape (K, D)
            self.W_ = W  # unmixing matrix, shape (K, K)
            self.n_features_in_ = X.shape[1]
            if (
                self.orthogonal
                and self.n_components is not None
                and self.n_components != X.shape[1]
            ):
                raise ValueError(
                    "orthogonal=True requires n_components=None or "
                    "n_components=n_features"
                )
        except ImportError:
            from sklearn.decomposition import FastICA

            # Fall back to FastICA when picard is not installed
            self.ica_ = FastICA(
                n_components=self.n_components,
                random_state=self.random_state,
                max_iter=500,
            )
            self.ica_.fit(X)
            self.K_ = None  # signals that FastICA path is active
            self.W_ = None  # FastICA fallback does not use W_
            if (
                self.orthogonal
                and self.n_components is not None
                and self.n_components != X.shape[1]
            ):
                raise ValueError(
                    "orthogonal=True requires n_components=None or "
                    "n_components=n_features"
                ) from None
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the ICA unmixing to X.

        When ``orthogonal=True`` (default), applies only the orthogonal
        rotation W_ (skips whitening K_), giving ``s = W x``.  When
        ``orthogonal=False``, applies the full unmixing ``s = W K x``.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data in the original (mixed) space.

        Returns
        -------
        S : np.ndarray of shape (n_samples, n_components)
            Estimated independent components.
        """
        if self.K_ is None:
            # FastICA path: uses sklearn's built-in transform
            return self.ica_.transform(X)
        if self.orthogonal:
            # Orthogonal mode: apply only W_ (rotation without whitening)
            return X @ self.W_.T  # (N, D) @ (D, D) -> (N, D)
        # Full unmixing: whiten then rotate
        Xw = X @ self.K_.T  # (N, D) @ (D, K) -> (N, K)  whitening step
        return Xw @ self.W_.T  # (N, K) @ (K, K) -> (N, K)  unmixing step

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Invert the ICA unmixing.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_components)
            Independent-component representation.

        Returns
        -------
        Xr : np.ndarray of shape (n_samples, n_features)
            Data approximately recovered in the original mixed space.
        """
        if self.K_ is None:
            return self.ica_.inverse_transform(X)
        if self.orthogonal:
            # W_ is orthogonal, so W_^{-1} = W_^T
            return X @ self.W_  # (N, D) @ (D, D) -> (N, D)
        # Invert unmixing W then whitening K using pseudo-inverses
        Xw = X @ np.linalg.pinv(self.W_).T  # (N, K) -> (N, K)
        return Xw @ np.linalg.pinv(self.K_).T  # (N, K) -> (N, D)

    def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Log absolute Jacobian determinant (constant for linear transforms).

        Computes ``log|det(W K)|`` (Picard path) or
        ``log|det(components_)|`` (FastICA path).  The result is replicated
        for every sample since the Jacobian of a linear transform is constant.

        .. note::
            This method is only valid when the unmixing matrix is square (i.e.
            ``n_components`` is ``None`` or equals the number of input
            features).  A non-square unmixing matrix is not bijective and its
            Jacobian determinant is undefined.  A :exc:`ValueError` is raised
            in that case.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points (used only to determine the number of samples).

        Returns
        -------
        ldj : np.ndarray of shape (n_samples,)
            Constant per-sample log absolute Jacobian determinant.

        Raises
        ------
        ValueError
            If the unmixing matrix is not square (``n_components != n_features``).
        """
        if self.K_ is None:
            W = self.ica_.components_  # shape (K, D) or (D, D)
            if W.shape[0] != W.shape[1]:
                raise ValueError(
                    "ICARotation.log_det_jacobian is only defined for square "
                    "unmixing matrices. Got components_ with shape "
                    f"{W.shape}. Ensure that `n_components` is None or "
                    "equals the number of features."
                )
            log_det = np.log(np.abs(np.linalg.det(W)))
        elif self.orthogonal:
            # W_ is orthogonal → |det W_| = 1 → log|det| = 0
            log_det = 0.0
        else:
            # Combined unmixing matrix: W @ K, shape (K, D)
            WK = self.W_ @ self.K_
            if WK.shape[0] != WK.shape[1]:
                raise ValueError(
                    "ICARotation.log_det_jacobian is only defined for square "
                    "unmixing matrices. Got W @ K with shape "
                    f"{WK.shape}. Ensure that `n_components` is None or "
                    "equals the number of features."
                )
            log_det = np.log(np.abs(np.linalg.det(WK)))
        return np.full(X.shape[0], log_det)

fit(X, y=None)

Fit the ICA model (Picard if available, otherwise FastICA).

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Training data.

Returns

self : ICARotation
    Fitted transform instance.

Source code in rbig/_src/rotation.py
def fit(self, X: np.ndarray, y=None) -> ICARotation:
    """Fit the ICA model (Picard if available, otherwise FastICA).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.

    Returns
    -------
    self : ICARotation
        Fitted transform instance.
    """
    try:
        from picard import picard

        n = X.shape[1] if self.n_components is None else self.n_components
        # Picard expects data as (n_features, n_samples), so transpose X
        K, W, _ = picard(
            X.T,
            n_components=n,
            random_state=self.random_state,
            max_iter=500,
            tol=1e-5,
        )
        self.K_ = K  # whitening matrix, shape (K, D)
        self.W_ = W  # unmixing matrix, shape (K, K)
        self.n_features_in_ = X.shape[1]
        if (
            self.orthogonal
            and self.n_components is not None
            and self.n_components != X.shape[1]
        ):
            raise ValueError(
                "orthogonal=True requires n_components=None or "
                "n_components=n_features"
            )
    except ImportError:
        from sklearn.decomposition import FastICA

        # Fall back to FastICA when picard is not installed
        self.ica_ = FastICA(
            n_components=self.n_components,
            random_state=self.random_state,
            max_iter=500,
        )
        self.ica_.fit(X)
        self.K_ = None  # signals that FastICA path is active
        self.W_ = None  # FastICA fallback does not use W_
        if (
            self.orthogonal
            and self.n_components is not None
            and self.n_components != X.shape[1]
        ):
            raise ValueError(
                "orthogonal=True requires n_components=None or "
                "n_components=n_features"
            ) from None
    return self

transform(X)

Apply the ICA unmixing to X.

When orthogonal=True (default), applies only the orthogonal rotation W_ (skips whitening K_), giving s = W x. When orthogonal=False, applies the full unmixing s = W K x.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input data in the original (mixed) space.

Returns

S : np.ndarray of shape (n_samples, n_components)
    Estimated independent components.

Source code in rbig/_src/rotation.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the ICA unmixing to X.

    When ``orthogonal=True`` (default), applies only the orthogonal
    rotation W_ (skips whitening K_), giving ``s = W x``.  When
    ``orthogonal=False``, applies the full unmixing ``s = W K x``.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data in the original (mixed) space.

    Returns
    -------
    S : np.ndarray of shape (n_samples, n_components)
        Estimated independent components.
    """
    if self.K_ is None:
        # FastICA path: uses sklearn's built-in transform
        return self.ica_.transform(X)
    if self.orthogonal:
        # Orthogonal mode: apply only W_ (rotation without whitening)
        return X @ self.W_.T  # (N, D) @ (D, D) -> (N, D)
    # Full unmixing: whiten then rotate
    Xw = X @ self.K_.T  # (N, D) @ (D, K) -> (N, K)  whitening step
    return Xw @ self.W_.T  # (N, K) @ (K, K) -> (N, K)  unmixing step

inverse_transform(X)

Invert the ICA unmixing.

Parameters

X : np.ndarray of shape (n_samples, n_components)
    Independent-component representation.

Returns

Xr : np.ndarray of shape (n_samples, n_features)
    Data approximately recovered in the original mixed space.

Source code in rbig/_src/rotation.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Invert the ICA unmixing.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_components)
        Independent-component representation.

    Returns
    -------
    Xr : np.ndarray of shape (n_samples, n_features)
        Data approximately recovered in the original mixed space.
    """
    if self.K_ is None:
        return self.ica_.inverse_transform(X)
    if self.orthogonal:
        # W_ is orthogonal, so W_^{-1} = W_^T
        return X @ self.W_  # (N, D) @ (D, D) -> (N, D)
    # Invert unmixing W then whitening K using pseudo-inverses
    Xw = X @ np.linalg.pinv(self.W_).T  # (N, K) -> (N, K)
    return Xw @ np.linalg.pinv(self.K_).T  # (N, K) -> (N, D)

log_det_jacobian(X)

Log absolute Jacobian determinant (constant for linear transforms).

Computes log|det(W K)| (Picard path) or log|det(components_)| (FastICA path). The result is replicated for every sample since the Jacobian of a linear transform is constant.

Note: this method is only valid when the unmixing matrix is square (i.e. n_components is None or equals the number of input features). A non-square unmixing matrix is not bijective, so its Jacobian determinant is undefined; a ValueError is raised in that case.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input points (used only to determine the number of samples).

Returns

ldj : np.ndarray of shape (n_samples,)
    Constant per-sample log absolute Jacobian determinant.

Raises

ValueError
    If the unmixing matrix is not square (n_components != n_features).

Source code in rbig/_src/rotation.py
def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Log absolute Jacobian determinant (constant for linear transforms).

    Computes ``log|det(W K)|`` (Picard path) or
    ``log|det(components_)|`` (FastICA path).  The result is replicated
    for every sample since the Jacobian of a linear transform is constant.

    .. note::
        This method is only valid when the unmixing matrix is square (i.e.
        ``n_components`` is ``None`` or equals the number of input
        features).  A non-square unmixing matrix is not bijective and its
        Jacobian determinant is undefined.  A :exc:`ValueError` is raised
        in that case.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points (used only to determine the number of samples).

    Returns
    -------
    ldj : np.ndarray of shape (n_samples,)
        Constant per-sample log absolute Jacobian determinant.

    Raises
    ------
    ValueError
        If the unmixing matrix is not square (``n_components != n_features``).
    """
    if self.K_ is None:
        W = self.ica_.components_  # shape (K, D) or (D, D)
        if W.shape[0] != W.shape[1]:
            raise ValueError(
                "ICARotation.log_det_jacobian is only defined for square "
                "unmixing matrices. Got components_ with shape "
                f"{W.shape}. Ensure that `n_components` is None or "
                "equals the number of features."
            )
        log_det = np.log(np.abs(np.linalg.det(W)))
    elif self.orthogonal:
        # W_ is orthogonal → |det W_| = 1 → log|det| = 0
        log_det = 0.0
    else:
        # Combined unmixing matrix: W @ K, shape (K, D)
        WK = self.W_ @ self.K_
        if WK.shape[0] != WK.shape[1]:
            raise ValueError(
                "ICARotation.log_det_jacobian is only defined for square "
                "unmixing matrices. Got W @ K with shape "
                f"{WK.shape}. Ensure that `n_components` is None or "
                "equals the number of features."
            )
        log_det = np.log(np.abs(np.linalg.det(WK)))
    return np.full(X.shape[0], log_det)

rbig.RandomRotation

Bases: RotationBijector

Random orthogonal rotation drawn from the Haar measure via QR.

Generates a uniformly random orthogonal matrix Q in R^{D x D} by QR decomposing a matrix of i.i.d. standard-normal entries and applying a sign correction to ensure the result is Haar-uniform:

A ~ N(0, 1)^{D x D},  A = Q R,  Q <- Q * diag(sign(diag(R)))

The sign correction guarantees that Q is sampled uniformly from the orthogonal group O(D) (the Haar measure).
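The sampling recipe above can be reproduced with numpy alone; the assertions check that the sign-corrected Q is orthogonal, so its log-det-Jacobian contribution is zero:

```python
import numpy as np

rng = np.random.default_rng(42)
D = 5
# QR-decompose a matrix of i.i.d. standard-normal entries.
A = rng.standard_normal((D, D))
Q, R = np.linalg.qr(A)
# Sign correction: multiply each column of Q by sign(R[j, j]).
# Without it, numpy's QR biases the column signs and Q is not Haar-uniform.
Q *= np.sign(np.diag(R))

# Q is orthogonal: Q Q^T = I, so |det Q| = 1 and log|det J| = 0.
assert np.allclose(Q @ Q.T, np.eye(D))
assert np.isclose(abs(np.linalg.det(Q)), 1.0)
```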

Parameters

random_state : int or None, default=None
    Seed for reproducible rotation matrix generation.

Attributes

rotation_matrix_ : np.ndarray of shape (n_features, n_features)
    The sampled orthogonal rotation matrix Q.

Notes

Because Q is orthogonal, |det Q| = 1 and:

log|det J| = log|det Q| = 0

This is the default implementation inherited from rbig._src.base.RotationBijector.

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Examples

>>> import numpy as np
>>> from rbig._src.rotation import RandomRotation
>>> rng_data = np.random.default_rng(0)
>>> X = rng_data.standard_normal((100, 4))
>>> rot = RandomRotation(random_state=42).fit(X)
>>> Z = rot.transform(X)
>>> Z.shape
(100, 4)
>>> Xr = rot.inverse_transform(Z)
>>> np.allclose(X, Xr)
True

Source code in rbig/_src/rotation.py
class RandomRotation(RotationBijector):
    """Random orthogonal rotation drawn from the Haar measure via QR.

    Generates a uniformly random orthogonal matrix Q in R^{D x D} by QR
    decomposing a matrix of i.i.d. standard-normal entries and applying a
    sign correction to ensure the result is Haar-uniform::

        A ~ N(0, 1)^{D x D},  A = Q R,  Q <- Q * diag(sign(diag(R)))

    The sign correction guarantees that Q is sampled uniformly from the
    orthogonal group O(D) (the Haar measure).

    Parameters
    ----------
    random_state : int or None, default None
        Seed for reproducible rotation matrix generation.

    Attributes
    ----------
    rotation_matrix_ : np.ndarray of shape (n_features, n_features)
        The sampled orthogonal rotation matrix Q.

    Notes
    -----
    Because Q is orthogonal, ``|det Q| = 1`` and::

        log|det J| = log|det Q| = 0

    This is the default implementation inherited from
    :class:`~rbig._src.base.RotationBijector`.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.rotation import RandomRotation
    >>> rng_data = np.random.default_rng(0)
    >>> X = rng_data.standard_normal((100, 4))
    >>> rot = RandomRotation(random_state=42).fit(X)
    >>> Z = rot.transform(X)
    >>> Z.shape
    (100, 4)
    >>> Xr = rot.inverse_transform(Z)
    >>> np.allclose(X, Xr)
    True
    """

    def __init__(self, random_state: int | None = None):
        self.random_state = random_state

    def fit(self, X: np.ndarray, y=None) -> RandomRotation:
        """Sample a Haar-uniform orthogonal rotation matrix of size D x D.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data (used only to infer the dimensionality D).

        Returns
        -------
        self : RandomRotation
            Fitted transform instance with ``rotation_matrix_`` set.
        """
        rng = np.random.default_rng(self.random_state)
        n_features = X.shape[1]
        # Draw a random D x D Gaussian matrix
        A = rng.standard_normal((n_features, n_features))
        Q, R = np.linalg.qr(A)
        # Sign correction: multiply columns of Q by sign(diag(R)) for Haar measure
        Q *= np.sign(np.diag(R))  # ensures uniform distribution on O(D)
        self.rotation_matrix_ = Q  # shape (D, D)
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Rotate X by the sampled orthogonal matrix: z = Q x.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data.

        Returns
        -------
        Z : np.ndarray of shape (n_samples, n_features)
            Rotated data.
        """
        return X @ self.rotation_matrix_.T  # (N, D) @ (D, D) -> (N, D)

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Invert the rotation: x = Q^T z = Q^{-1} z.

        Because Q is orthogonal, its inverse equals its transpose.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Rotated data.

        Returns
        -------
        Xr : np.ndarray of shape (n_samples, n_features)
            Data recovered in the original space.
        """
        return X @ self.rotation_matrix_  # Q^{-1} = Q^T -> (N, D) @ (D, D)

fit(X, y=None)

Sample a Haar-uniform orthogonal rotation matrix of size D x D.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Training data (used only to infer the dimensionality D).

Returns

self : RandomRotation
    Fitted transform instance with rotation_matrix_ set.

Source code in rbig/_src/rotation.py
def fit(self, X: np.ndarray, y=None) -> RandomRotation:
    """Sample a Haar-uniform orthogonal rotation matrix of size D x D.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data (used only to infer the dimensionality D).

    Returns
    -------
    self : RandomRotation
        Fitted transform instance with ``rotation_matrix_`` set.
    """
    rng = np.random.default_rng(self.random_state)
    n_features = X.shape[1]
    # Draw a random D x D Gaussian matrix
    A = rng.standard_normal((n_features, n_features))
    Q, R = np.linalg.qr(A)
    # Sign correction: multiply columns of Q by sign(diag(R)) for Haar measure
    Q *= np.sign(np.diag(R))  # ensures uniform distribution on O(D)
    self.rotation_matrix_ = Q  # shape (D, D)
    return self

transform(X)

Rotate X by the sampled orthogonal matrix: z = Q x.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input data.

Returns

Z : np.ndarray of shape (n_samples, n_features)
    Rotated data.

Source code in rbig/_src/rotation.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Rotate X by the sampled orthogonal matrix: z = Q x.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data.

    Returns
    -------
    Z : np.ndarray of shape (n_samples, n_features)
        Rotated data.
    """
    return X @ self.rotation_matrix_.T  # (N, D) @ (D, D) -> (N, D)

inverse_transform(X)

Invert the rotation: x = Q^T z = Q^{-1} z.

Because Q is orthogonal, its inverse equals its transpose.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Rotated data.

Returns

Xr : np.ndarray of shape (n_samples, n_features)
    Data recovered in the original space.

Source code in rbig/_src/rotation.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Invert the rotation: x = Q^T z = Q^{-1} z.

    Because Q is orthogonal, its inverse equals its transpose.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Rotated data.

    Returns
    -------
    Xr : np.ndarray of shape (n_samples, n_features)
        Data recovered in the original space.
    """
    return X @ self.rotation_matrix_  # Q^{-1} = Q^T -> (N, D) @ (D, D)

rbig.PicardRotation

Bases: RotationBijector

ICA rotation via the Picard algorithm with a FastICA fallback.

Fits an ICA model that learns maximally statistically-independent sources. When the optional picard package is available, it solves:

K, W = picard(X^T)
s = W K x

where K in R^{K x D} is the pre-whitening matrix and W in R^{K x K} is the Picard unmixing matrix. The log-det-Jacobian is:

log|det J| = log|det(W K)|

If picard is not installed (or incompatible), sklearn.decomposition.FastICA is used as a fallback.

Parameters

n_components : int or None, default=None
    Number of independent components K. If None, K = D.
extended : bool, default=False
    If True, use the extended Picard algorithm, which can handle both super- and sub-Gaussian sources (passed directly to picard).
random_state : int or None, default=None
    Seed for reproducible initialisation.
max_iter : int, default=500
    Maximum number of ICA iterations.
tol : float, default=1e-5
    Convergence tolerance for the ICA algorithm.
orthogonal : bool, default=True
    If True, apply only the orthogonal rotation W_ and skip the whitening K_, so the layer has a zero log-det-Jacobian. Requires a square unmixing matrix.

Attributes

K_ : np.ndarray of shape (n_components, n_features) or None
    Pre-whitening matrix (Picard path). None when using FastICA.
W_ : np.ndarray of shape (n_components, n_components) or None
    Unmixing matrix (Picard path). None when using FastICA.
use_picard_ : bool
    True if the Picard solver was used; False if FastICA was used.
ica_ : sklearn.decomposition.FastICA or None
    Fitted FastICA model (FastICA path only).

Notes

The log-det-Jacobian is:

log|det J| = log|det(W K)|

for the Picard path, or log|det(components_)| for the FastICA path. The Jacobian is constant because the transform is linear.

get_log_det_jacobian raises ValueError if the unmixing matrix is not square (i.e. n_components != n_features).
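For the non-orthogonal path, the inverse undoes W and then K via pseudo-inverses, mirroring `inverse_transform`. A numpy-only sketch with random square stand-ins for W and K:

```python
import numpy as np

rng = np.random.default_rng(3)
N, D = 6, 3
# Stand-ins for the unmixing and whitening matrices.
W = rng.standard_normal((D, D))
K = rng.standard_normal((D, D))
X = rng.standard_normal((N, D))

# Forward: whiten with K, then unmix with W (as in transform with orthogonal=False).
S = (X @ K.T) @ W.T
# Inverse: undo W, then K, with pseudo-inverses.  The round trip is exact here
# because both stand-ins are square and (with probability one) invertible.
Xr = (S @ np.linalg.pinv(W).T) @ np.linalg.pinv(K).T
assert np.allclose(X, Xr)
```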

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Ablin, P., Cardoso, J.-F., & Gramfort, A. (2018). Faster Independent Component Analysis by Preconditioning with Hessian Approximations. IEEE Transactions on Signal Processing, 66(15), 4040-4049.

Examples

>>> import numpy as np
>>> from rbig._src.rotation import PicardRotation
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((200, 3))
>>> pic = PicardRotation(random_state=0).fit(X)
>>> S = pic.transform(X)
>>> S.shape
(200, 3)

Source code in rbig/_src/rotation.py
class PicardRotation(RotationBijector):
    """ICA rotation via the Picard algorithm with a FastICA fallback.

    Fits an ICA model that learns maximally statistically-independent
    sources.  When the optional ``picard`` package is available, it solves::

        K, W = picard(X^T)
        s = W K x

    where ``K`` in R^{K x D} is the pre-whitening matrix and ``W`` in
    R^{K x K} is the Picard unmixing matrix.  The log-det-Jacobian is::

        log|det J| = log|det(W K)|

    If ``picard`` is not installed (or incompatible),
    :class:`sklearn.decomposition.FastICA` is used as a fallback.

    Parameters
    ----------
    n_components : int or None, default None
        Number of independent components K.  If ``None``, K = D.
    extended : bool, default False
        If True, use the extended Picard algorithm that can handle both
        super- and sub-Gaussian sources (passed directly to ``picard``).
    random_state : int or None, default None
        Seed for reproducible initialisation.
    max_iter : int, default 500
        Maximum number of ICA iterations.
    tol : float, default 1e-5
        Convergence tolerance for the ICA algorithm.
    orthogonal : bool, default True
        If True, apply only the orthogonal rotation ``W_`` and skip the
        whitening ``K_``, so the layer has a zero log-det-Jacobian.
        Requires a square unmixing matrix.

    Attributes
    ----------
    K_ : np.ndarray of shape (n_components, n_features) or None
        Pre-whitening matrix (Picard path).  ``None`` when using FastICA.
    W_ : np.ndarray of shape (n_components, n_components) or None
        Unmixing matrix (Picard path).  ``None`` when using FastICA.
    use_picard_ : bool
        True if the Picard solver was used; False if FastICA was used.
    ica_ : sklearn.decomposition.FastICA or None
        Fitted FastICA model (FastICA path only).

    Notes
    -----
    The log-det-Jacobian is::

        log|det J| = log|det(W K)|

    for the Picard path, or ``log|det(components_)|`` for the FastICA path.
    The Jacobian is constant because the transform is linear.

    :meth:`get_log_det_jacobian` raises :class:`ValueError` if the unmixing
    matrix is not square (i.e. ``n_components != n_features``).

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Ablin, P., Cardoso, J.-F., & Gramfort, A. (2018). Faster Independent
    Component Analysis by Preconditioning with Hessian Approximations.
    *IEEE Transactions on Signal Processing*, 66(15), 4040-4049.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.rotation import PicardRotation
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((200, 3))
    >>> pic = PicardRotation(random_state=0).fit(X)
    >>> S = pic.transform(X)
    >>> S.shape
    (200, 3)
    """

    def __init__(
        self,
        n_components: int | None = None,
        extended: bool = False,
        random_state: int | None = None,
        max_iter: int = 500,
        tol: float = 1e-5,
        orthogonal: bool = True,
    ):
        self.n_components = n_components
        self.extended = extended
        self.random_state = random_state
        self.max_iter = max_iter
        self.tol = tol
        self.orthogonal = orthogonal

    def fit(self, X: np.ndarray, y=None) -> PicardRotation:
        """Fit ICA (Picard if available, otherwise FastICA).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.

        Returns
        -------
        self : PicardRotation
            Fitted transform instance.
        """
        try:
            from picard import picard

            n = X.shape[1] if self.n_components is None else self.n_components
            # Picard expects (n_features, n_samples); returns K (whitening) and W (unmixing)
            K, W, _ = picard(
                X.T,  # (D, N)
                n_components=n,
                random_state=self.random_state,
                max_iter=self.max_iter,
                tol=self.tol,
                extended=self.extended,
            )
            self.K_ = K  # pre-whitening matrix, shape (K, D)
            self.W_ = W  # ICA unmixing matrix, shape (K, K)
            self.use_picard_ = True
            if (
                self.orthogonal
                and self.n_components is not None
                and self.n_components != X.shape[1]
            ):
                raise ValueError(
                    "orthogonal=True requires n_components=None or "
                    "n_components=n_features"
                )
        except (ImportError, TypeError):
            from sklearn.decomposition import FastICA

            # FastICA fallback when picard is unavailable or incompatible
            self.ica_ = FastICA(
                n_components=self.n_components,
                random_state=self.random_state,
                max_iter=self.max_iter,
            )
            self.ica_.fit(X)
            self.K_ = None
            self.W_ = None  # FastICA fallback does not use W_
            self.use_picard_ = False
            if (
                self.orthogonal
                and self.n_components is not None
                and self.n_components != X.shape[1]
            ):
                raise ValueError(
                    "orthogonal=True requires n_components=None or "
                    "n_components=n_features"
                ) from None
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the ICA unmixing.

        When ``orthogonal=True`` (default), applies only the orthogonal
        rotation W_ (skips whitening K_).  When ``orthogonal=False``,
        applies the full unmixing ``s = W K x``.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data in the original (mixed) space.

        Returns
        -------
        S : np.ndarray of shape (n_samples, n_components)
            Estimated independent components.
        """
        if not self.use_picard_:
            return self.ica_.transform(X)
        if self.orthogonal:
            return X @ self.W_.T  # (N, D) @ (D, D) -> (N, D)
        return (X @ self.K_.T) @ self.W_.T

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Invert the ICA unmixing.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_components)
            Independent-component representation.

        Returns
        -------
        Xr : np.ndarray of shape (n_samples, n_features)
            Data approximately recovered in the original mixed space.
        """
        if not self.use_picard_:
            return self.ica_.inverse_transform(X)
        if self.orthogonal:
            return X @ self.W_  # W_ orthogonal → W_^{-1} = W_^T
        return (X @ np.linalg.pinv(self.W_).T) @ np.linalg.pinv(self.K_).T

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Log absolute Jacobian determinant (constant for linear transforms).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points (used only to determine n_samples).

        Returns
        -------
        ldj : np.ndarray of shape (n_samples,)
            Constant per-sample log absolute Jacobian determinant.

        Raises
        ------
        ValueError
            If the unmixing matrix is not square
            (``n_components != n_features``).
        """
        if not self.use_picard_:
            W = self.ica_.components_
            if W.shape[0] != W.shape[1]:
                raise ValueError(
                    "PicardRotation.get_log_det_jacobian is only defined for square "
                    "unmixing matrices when using the FastICA fallback. Got "
                    f"components_ with shape {W.shape}. Ensure that `n_components` "
                    "is None or equals the number of features."
                )
            log_det = np.log(np.abs(np.linalg.det(W)))
        elif self.orthogonal:
            # W_ is orthogonal → |det W_| = 1 → log|det| = 0
            log_det = 0.0
        else:
            WK = self.W_ @ self.K_
            if WK.shape[0] != WK.shape[1]:
                raise ValueError(
                    "PicardRotation.get_log_det_jacobian is only defined for square "
                    "unmixing matrices when using the Picard solver. Got "
                    f"W @ K with shape {WK.shape}. Ensure that `n_components` "
                    "is None or equals the number of features."
                )
            log_det = np.log(np.abs(np.linalg.det(WK)))
        return np.full(X.shape[0], log_det)

fit(X, y=None)

Fit ICA (Picard if available, otherwise FastICA).

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Training data.

Returns

self : PicardRotation
    Fitted transform instance.

Source code in rbig/_src/rotation.py
def fit(self, X: np.ndarray, y=None) -> PicardRotation:
    """Fit ICA (Picard if available, otherwise FastICA).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.

    Returns
    -------
    self : PicardRotation
        Fitted transform instance.
    """
    try:
        from picard import picard

        n = X.shape[1] if self.n_components is None else self.n_components
        # Picard expects (n_features, n_samples); returns K (whitening) and W (unmixing)
        K, W, _ = picard(
            X.T,  # (D, N)
            n_components=n,
            random_state=self.random_state,
            max_iter=self.max_iter,
            tol=self.tol,
            extended=self.extended,
        )
        self.K_ = K  # pre-whitening matrix, shape (K, D)
        self.W_ = W  # ICA unmixing matrix, shape (K, K)
        self.use_picard_ = True
        if (
            self.orthogonal
            and self.n_components is not None
            and self.n_components != X.shape[1]
        ):
            raise ValueError(
                "orthogonal=True requires n_components=None or "
                "n_components=n_features"
            )
    except (ImportError, TypeError):
        from sklearn.decomposition import FastICA

        # FastICA fallback when picard is unavailable or incompatible
        self.ica_ = FastICA(
            n_components=self.n_components,
            random_state=self.random_state,
            max_iter=self.max_iter,
        )
        self.ica_.fit(X)
        self.K_ = None
        self.use_picard_ = False
        if (
            self.orthogonal
            and self.n_components is not None
            and self.n_components != X.shape[1]
        ):
            raise ValueError(
                "orthogonal=True requires n_components=None or "
                "n_components=n_features"
            ) from None
    return self

transform(X)

Apply the ICA unmixing.

When orthogonal=True (default), applies only the orthogonal rotation W_ (skips whitening K_). When orthogonal=False, applies the full unmixing s = W K x.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input data in the original (mixed) space.

Returns

S : np.ndarray of shape (n_samples, n_components)
    Estimated independent components.

Source code in rbig/_src/rotation.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the ICA unmixing.

    When ``orthogonal=True`` (default), applies only the orthogonal
    rotation W_ (skips whitening K_).  When ``orthogonal=False``,
    applies the full unmixing ``s = W K x``.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data in the original (mixed) space.

    Returns
    -------
    S : np.ndarray of shape (n_samples, n_components)
        Estimated independent components.
    """
    if not self.use_picard_:
        return self.ica_.transform(X)
    if self.orthogonal:
        return X @ self.W_.T  # (N, D) @ (D, D) -> (N, D)
    return (X @ self.K_.T) @ self.W_.T

inverse_transform(X)

Invert the ICA unmixing.

Parameters

X : np.ndarray of shape (n_samples, n_components)
    Independent-component representation.

Returns

Xr : np.ndarray of shape (n_samples, n_features)
    Data approximately recovered in the original mixed space.

Source code in rbig/_src/rotation.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Invert the ICA unmixing.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_components)
        Independent-component representation.

    Returns
    -------
    Xr : np.ndarray of shape (n_samples, n_features)
        Data approximately recovered in the original mixed space.
    """
    if not self.use_picard_:
        return self.ica_.inverse_transform(X)
    if self.orthogonal:
        return X @ self.W_  # W_ orthogonal → W_^{-1} = W_^T
    return (X @ np.linalg.pinv(self.W_).T) @ np.linalg.pinv(self.K_).T

get_log_det_jacobian(X)

Log absolute Jacobian determinant (constant for linear transforms).

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input points (used only to determine n_samples).

Returns

ldj : np.ndarray of shape (n_samples,)
    Constant per-sample log absolute Jacobian determinant.

Raises

ValueError
    If the unmixing matrix is not square (n_components != n_features).

Source code in rbig/_src/rotation.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Log absolute Jacobian determinant (constant for linear transforms).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points (used only to determine n_samples).

    Returns
    -------
    ldj : np.ndarray of shape (n_samples,)
        Constant per-sample log absolute Jacobian determinant.

    Raises
    ------
    ValueError
        If the unmixing matrix is not square
        (``n_components != n_features``).
    """
    if not self.use_picard_:
        W = self.ica_.components_
        if W.shape[0] != W.shape[1]:
            raise ValueError(
                "PicardRotation.get_log_det_jacobian is only defined for square "
                "unmixing matrices when using the FastICA fallback. Got "
                f"components_ with shape {W.shape}. Ensure that `n_components` "
                "is None or equals the number of features."
            )
        log_det = np.log(np.abs(np.linalg.det(W)))
    elif self.orthogonal:
        # W_ is orthogonal → |det W_| = 1 → log|det| = 0
        log_det = 0.0
    else:
        WK = self.W_ @ self.K_
        if WK.shape[0] != WK.shape[1]:
            raise ValueError(
                "PicardRotation.get_log_det_jacobian is only defined for square "
                "unmixing matrices when using the Picard solver. Got "
                f"W @ K with shape {WK.shape}. Ensure that `n_components` "
                "is None or equals the number of features."
            )
        log_det = np.log(np.abs(np.linalg.det(WK)))
    return np.full(X.shape[0], log_det)
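The two branches above can be checked with a minimal NumPy-only sketch (standalone matrices standing in for `W_` and `K_`, not a fitted `PicardRotation`): an orthogonal unmixing matrix contributes zero log-determinant, while a composed unmixing `W @ K` contributes the log absolute determinant of the product.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 4

# Orthogonal case: QR of a Gaussian matrix gives an orthogonal W,
# so log|det W| is (numerically) zero.
W, _ = np.linalg.qr(rng.standard_normal((D, D)))
log_det_orth = np.log(np.abs(np.linalg.det(W)))

# General case: compose W with a non-orthogonal "whitening" matrix K;
# slogdet is the numerically stable way to get log|det(W @ K)|.
K = np.diag(rng.uniform(0.5, 2.0, size=D))  # illustrative stand-in for K_
sign, log_det_full = np.linalg.slogdet(W @ K)

print(abs(log_det_orth) < 1e-8)
print(np.isclose(log_det_full, np.log(np.abs(np.linalg.det(W @ K)))))
```

Using `np.linalg.slogdet` avoids overflow/underflow in the determinant itself, which matters for high-dimensional unmixing matrices.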

rbig.RandomOrthogonalProjection

Bases: RotationBijector

Semi-orthogonal random projection from D to K dimensions via QR.

Generates a semi-orthogonal matrix P in R^{D x K} (K <= D) whose columns are orthonormal, obtained by taking the first K columns of a QR decomposition of a random Gaussian matrix::

A ~ N(0, 1)^{D x K},  A = Q R,  P = Q[:, :K]

The forward transform projects D-dimensional input to K dimensions::

z = X P   where P in R^{D x K}

Parameters

n_components : int or None, default None
    Output dimensionality K. If None, K = D (square case).
random_state : int or None, default None
    Seed for reproducible matrix generation.

Attributes

projection_matrix_ : np.ndarray of shape (n_features, n_components)
    Semi-orthogonal projection matrix P with orthonormal columns.
input_dim_ : int
    Input dimensionality D.
output_dim_ : int
    Output dimensionality K.

Notes

When K = D the matrix is fully orthogonal and log|det J| = 0. When K < D the transform is not invertible and both :meth:inverse_transform and :meth:get_log_det_jacobian raise :class:NotImplementedError.

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Examples

>>> import numpy as np
>>> from rbig._src.rotation import RandomOrthogonalProjection
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((100, 4))
>>> proj = RandomOrthogonalProjection(random_state=0).fit(X)
>>> Z = proj.transform(X)
>>> Z.shape
(100, 4)

Source code in rbig/_src/rotation.py
class RandomOrthogonalProjection(RotationBijector):
    """Semi-orthogonal random projection from D to K dimensions via QR.

    Generates a semi-orthogonal matrix P in R^{D x K} (K <= D) whose
    columns are orthonormal, obtained by taking the first K columns of a
    QR decomposition of a random Gaussian matrix::

        A ~ N(0, 1)^{D x K},  A = Q R,  P = Q[:, :K]

    The forward transform projects D-dimensional input to K dimensions::

        z = X P   where P in R^{D x K}

    Parameters
    ----------
    n_components : int or None, default None
        Output dimensionality K.  If ``None``, K = D (square case).
    random_state : int or None, default None
        Seed for reproducible matrix generation.

    Attributes
    ----------
    projection_matrix_ : np.ndarray of shape (n_features, n_components)
        Semi-orthogonal projection matrix P with orthonormal columns.
    input_dim_ : int
        Input dimensionality D.
    output_dim_ : int
        Output dimensionality K.

    Notes
    -----
    When K = D the matrix is fully orthogonal and ``log|det J| = 0``.
    When K < D the transform is not invertible and both
    :meth:`inverse_transform` and :meth:`get_log_det_jacobian` raise
    :class:`NotImplementedError`.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.rotation import RandomOrthogonalProjection
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((100, 4))
    >>> proj = RandomOrthogonalProjection(random_state=0).fit(X)
    >>> Z = proj.transform(X)
    >>> Z.shape
    (100, 4)
    """

    def __init__(
        self,
        n_components: int | None = None,
        random_state: int | None = None,
    ):
        self.n_components = n_components
        self.random_state = random_state

    def fit(self, X: np.ndarray, y=None) -> RandomOrthogonalProjection:
        """Build the semi-orthogonal projection matrix P.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data (used only to infer D = n_features).

        Returns
        -------
        self : RandomOrthogonalProjection
            Fitted transform instance.
        """
        rng = np.random.default_rng(self.random_state)
        D = X.shape[1]
        K = self.n_components if self.n_components is not None else D
        # Random Gaussian seed matrix; QR gives orthonormal columns
        A = rng.standard_normal((D, K))
        Q, _ = np.linalg.qr(A)
        self.projection_matrix_ = Q[:, :K]  # (D, K)  semi-orthogonal basis
        self.input_dim_ = D
        self.output_dim_ = K
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Project X from D to K dimensions: z = X P.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data.

        Returns
        -------
        Z : np.ndarray of shape (n_samples, n_components)
            Projected data.
        """
        return X @ self.projection_matrix_  # (N, D) @ (D, K) -> (N, K)

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Invert the projection (only valid for square case K = D).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_components)
            Projected data.

        Returns
        -------
        Xr : np.ndarray of shape (n_samples, n_features)
            Data recovered in the original space (exact only when K = D).

        Raises
        ------
        NotImplementedError
            If ``n_components < n_features`` (projection is not invertible).
        """
        if self.output_dim_ < self.input_dim_:
            raise NotImplementedError(
                "RandomOrthogonalProjection with n_components < input dimension "
                "is not bijective; inverse_transform is undefined."
            )
        return X @ self.projection_matrix_.T  # exact inverse only when square (N, D)

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Return zeros for the square (bijective) case.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points (used only to determine n_samples).

        Returns
        -------
        ldj : np.ndarray of shape (n_samples,)
            Zeros, because ``|det P| = 1`` for a square orthogonal matrix.

        Raises
        ------
        NotImplementedError
            If ``n_components < n_features`` (Jacobian determinant undefined).
        """
        if self.output_dim_ < self.input_dim_:
            raise NotImplementedError(
                "RandomOrthogonalProjection with n_components < input dimension "
                "does not have a well-defined Jacobian determinant."
            )
        # For a square orthogonal matrix, |det(J)| = 1, so log|det(J)| = 0.
        return np.zeros(X.shape[0])

fit(X, y=None)

Build the semi-orthogonal projection matrix P.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Training data (used only to infer D = n_features).

Returns

self : RandomOrthogonalProjection
    Fitted transform instance.

Source code in rbig/_src/rotation.py
def fit(self, X: np.ndarray, y=None) -> RandomOrthogonalProjection:
    """Build the semi-orthogonal projection matrix P.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data (used only to infer D = n_features).

    Returns
    -------
    self : RandomOrthogonalProjection
        Fitted transform instance.
    """
    rng = np.random.default_rng(self.random_state)
    D = X.shape[1]
    K = self.n_components if self.n_components is not None else D
    # Random Gaussian seed matrix; QR gives orthonormal columns
    A = rng.standard_normal((D, K))
    Q, _ = np.linalg.qr(A)
    self.projection_matrix_ = Q[:, :K]  # (D, K)  semi-orthogonal basis
    self.input_dim_ = D
    self.output_dim_ = K
    return self

transform(X)

Project X from D to K dimensions: z = X P.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input data.

Returns

Z : np.ndarray of shape (n_samples, n_components)
    Projected data.

Source code in rbig/_src/rotation.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Project X from D to K dimensions: z = X P.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data.

    Returns
    -------
    Z : np.ndarray of shape (n_samples, n_components)
        Projected data.
    """
    return X @ self.projection_matrix_  # (N, D) @ (D, K) -> (N, K)

inverse_transform(X)

Invert the projection (only valid for square case K = D).

Parameters

X : np.ndarray of shape (n_samples, n_components)
    Projected data.

Returns

Xr : np.ndarray of shape (n_samples, n_features)
    Data recovered in the original space (exact only when K = D).

Raises

NotImplementedError
    If n_components < n_features (projection is not invertible).

Source code in rbig/_src/rotation.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Invert the projection (only valid for square case K = D).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_components)
        Projected data.

    Returns
    -------
    Xr : np.ndarray of shape (n_samples, n_features)
        Data recovered in the original space (exact only when K = D).

    Raises
    ------
    NotImplementedError
        If ``n_components < n_features`` (projection is not invertible).
    """
    if self.output_dim_ < self.input_dim_:
        raise NotImplementedError(
            "RandomOrthogonalProjection with n_components < input dimension "
            "is not bijective; inverse_transform is undefined."
        )
    return X @ self.projection_matrix_.T  # exact inverse only when square (N, D)

get_log_det_jacobian(X)

Return zeros for the square (bijective) case.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input points (used only to determine n_samples).

Returns

ldj : np.ndarray of shape (n_samples,)
    Zeros, because |det P| = 1 for a square orthogonal matrix.

Raises

NotImplementedError
    If n_components < n_features (Jacobian determinant undefined).

Source code in rbig/_src/rotation.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Return zeros for the square (bijective) case.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points (used only to determine n_samples).

    Returns
    -------
    ldj : np.ndarray of shape (n_samples,)
        Zeros, because ``|det P| = 1`` for a square orthogonal matrix.

    Raises
    ------
    NotImplementedError
        If ``n_components < n_features`` (Jacobian determinant undefined).
    """
    if self.output_dim_ < self.input_dim_:
        raise NotImplementedError(
            "RandomOrthogonalProjection with n_components < input dimension "
            "does not have a well-defined Jacobian determinant."
        )
    # For a square orthogonal matrix, |det(J)| = 1, so log|det(J)| = 0.
    return np.zeros(X.shape[0])
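The QR construction used by `fit` can be sketched with plain NumPy (standalone arrays, not the class itself): the rectangular case yields orthonormal columns, and the square case yields an exact round-trip.

```python
import numpy as np

rng = np.random.default_rng(0)
D, K = 6, 3

# First K columns of the QR of a Gaussian matrix: orthonormal columns.
A = rng.standard_normal((D, K))
Q, _ = np.linalg.qr(A)   # reduced QR: Q is already (D, K)
P = Q[:, :K]

# Semi-orthogonality: P^T P = I_K (but P P^T != I_D when K < D).
print(np.allclose(P.T @ P, np.eye(K)))

# Square case (K = D): the projection is a rotation, so transposing
# it back recovers X exactly, matching inverse_transform.
Pd, _ = np.linalg.qr(rng.standard_normal((D, D)))
X = rng.standard_normal((10, D))
print(np.allclose(X @ Pd @ Pd.T, X))
```

This is why `inverse_transform` can simply use `projection_matrix_.T` in the square case, and why no exact inverse exists when K < D.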

rbig.GaussianRandomProjection

Bases: RotationBijector

Johnson-Lindenstrauss style random projection with Gaussian entries.

Constructs a random projection matrix M in R^{D x K} whose entries are drawn i.i.d. from N(0, 1/K)::

M_ij ~ N(0, 1/K)

The 1/K normalisation approximately preserves pairwise Euclidean distances (Johnson-Lindenstrauss lemma)::

(1 - eps)||x - y||^2 <= ||Mx - My||^2 <= (1 + eps)||x - y||^2

with high probability when K = O(eps^{-2} log n).

Parameters

n_components : int or None, default None
    Output dimensionality K. If None, K = D (square case).
random_state : int or None, default None
    Seed for reproducible matrix generation.

Attributes

matrix_ : np.ndarray of shape (n_features, n_components)
    The random projection matrix with entries ~ N(0, 1/K).

Notes

Unlike :class:RandomOrthogonalProjection, the columns of this matrix are not orthogonal, so |det M| != 1 in general. :meth:get_log_det_jacobian returns zeros as an approximation. For density estimation where accuracy matters, prefer :class:RandomOrthogonalProjection or :class:RandomRotation.

The inverse uses the Moore-Penrose pseudoinverse computed by :func:numpy.linalg.pinv.

References

Johnson, W. B., & Lindenstrauss, J. (1984). Extensions of Lipschitz mappings into a Hilbert space. Contemporary Mathematics, 26, 189-206.

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Examples

>>> import numpy as np
>>> from rbig._src.rotation import GaussianRandomProjection
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((100, 4))
>>> grp = GaussianRandomProjection(random_state=0).fit(X)
>>> Z = grp.transform(X)
>>> Z.shape
(100, 4)

Source code in rbig/_src/rotation.py
class GaussianRandomProjection(RotationBijector):
    """Johnson-Lindenstrauss style random projection with Gaussian entries.

    Constructs a random projection matrix M in R^{D x K} whose entries are
    drawn i.i.d. from N(0, 1/K)::

        M_ij ~ N(0, 1/K)

    The 1/K normalisation approximately preserves pairwise Euclidean
    distances (Johnson-Lindenstrauss lemma)::

        (1 - eps)||x - y||^2 <= ||Mx - My||^2 <= (1 + eps)||x - y||^2

    with high probability when K = O(eps^{-2} log n).

    Parameters
    ----------
    n_components : int or None, default None
        Output dimensionality K.  If ``None``, K = D (square case).
    random_state : int or None, default None
        Seed for reproducible matrix generation.

    Attributes
    ----------
    matrix_ : np.ndarray of shape (n_features, n_components)
        The random projection matrix with entries ~ N(0, 1/K).

    Notes
    -----
    Unlike :class:`RandomOrthogonalProjection`, the columns of this matrix
    are *not* orthogonal, so ``|det M| != 1`` in general.
    :meth:`get_log_det_jacobian` returns zeros as an approximation.
    For density estimation where accuracy matters, prefer
    :class:`RandomOrthogonalProjection` or :class:`RandomRotation`.

    The inverse uses the Moore-Penrose pseudoinverse computed by
    :func:`numpy.linalg.pinv`.

    References
    ----------
    Johnson, W. B., & Lindenstrauss, J. (1984). Extensions of Lipschitz
    mappings into a Hilbert space. *Contemporary Mathematics*, 26, 189-206.

    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.rotation import GaussianRandomProjection
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((100, 4))
    >>> grp = GaussianRandomProjection(random_state=0).fit(X)
    >>> Z = grp.transform(X)
    >>> Z.shape
    (100, 4)
    """

    def __init__(
        self,
        n_components: int | None = None,
        random_state: int | None = None,
    ):
        self.n_components = n_components
        self.random_state = random_state

    def fit(self, X: np.ndarray, y=None) -> GaussianRandomProjection:
        """Build the Gaussian random projection matrix.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data (used only to infer D = n_features).

        Returns
        -------
        self : GaussianRandomProjection
            Fitted transform instance.
        """
        rng = np.random.default_rng(self.random_state)
        D = X.shape[1]
        K = self.n_components if self.n_components is not None else D
        # Entries drawn from N(0, 1), then scaled by 1/sqrt(K) for distance preservation
        self.matrix_ = rng.standard_normal((D, K)) / np.sqrt(K)  # (D, K)
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Project X: z = X M.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data.

        Returns
        -------
        Z : np.ndarray of shape (n_samples, n_components)
            Projected data.
        """
        return X @ self.matrix_  # (N, D) @ (D, K) -> (N, K)

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Approximate inverse via the Moore-Penrose pseudoinverse.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_components)
            Projected data.

        Returns
        -------
        Xr : np.ndarray of shape (n_samples, n_features)
            Approximately recovered original data.
        """
        # Pseudoinverse: M^+ has shape (K, D), so X @ M^+ gives (N, D)
        return X @ np.linalg.pinv(self.matrix_)  # (N, K) @ (K, D) -> (N, D)

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Return zeros (approximation; Gaussian projections are not isometric).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points (used only to determine n_samples).

        Returns
        -------
        ldj : np.ndarray of shape (n_samples,)
            Zeros (approximate; the true log-det is generally non-zero).
        """
        return np.zeros(X.shape[0])

fit(X, y=None)

Build the Gaussian random projection matrix.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Training data (used only to infer D = n_features).

Returns

self : GaussianRandomProjection
    Fitted transform instance.

Source code in rbig/_src/rotation.py
def fit(self, X: np.ndarray, y=None) -> GaussianRandomProjection:
    """Build the Gaussian random projection matrix.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data (used only to infer D = n_features).

    Returns
    -------
    self : GaussianRandomProjection
        Fitted transform instance.
    """
    rng = np.random.default_rng(self.random_state)
    D = X.shape[1]
    K = self.n_components if self.n_components is not None else D
    # Entries drawn from N(0, 1), then scaled by 1/sqrt(K) for distance preservation
    self.matrix_ = rng.standard_normal((D, K)) / np.sqrt(K)  # (D, K)
    return self

transform(X)

Project X: z = X M.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input data.

Returns

Z : np.ndarray of shape (n_samples, n_components)
    Projected data.

Source code in rbig/_src/rotation.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Project X: z = X M.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data.

    Returns
    -------
    Z : np.ndarray of shape (n_samples, n_components)
        Projected data.
    """
    return X @ self.matrix_  # (N, D) @ (D, K) -> (N, K)

inverse_transform(X)

Approximate inverse via the Moore-Penrose pseudoinverse.

Parameters

X : np.ndarray of shape (n_samples, n_components)
    Projected data.

Returns

Xr : np.ndarray of shape (n_samples, n_features)
    Approximately recovered original data.

Source code in rbig/_src/rotation.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Approximate inverse via the Moore-Penrose pseudoinverse.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_components)
        Projected data.

    Returns
    -------
    Xr : np.ndarray of shape (n_samples, n_features)
        Approximately recovered original data.
    """
    # Pseudoinverse: M^+ has shape (K, D), so X @ M^+ gives (N, D)
    return X @ np.linalg.pinv(self.matrix_)  # (N, K) @ (K, D) -> (N, D)

get_log_det_jacobian(X)

Return zeros (approximation; Gaussian projections are not isometric).

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Input points (used only to determine n_samples).

Returns

ldj : np.ndarray of shape (n_samples,)
    Zeros (approximate; the true log-det is generally non-zero).

Source code in rbig/_src/rotation.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Return zeros (approximation; Gaussian projections are not isometric).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points (used only to determine n_samples).

    Returns
    -------
    ldj : np.ndarray of shape (n_samples,)
        Zeros (approximate; the true log-det is generally non-zero).
    """
    return np.zeros(X.shape[0])
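The distance-preservation property behind this projection can be illustrated with a NumPy-only sketch (a standalone matrix drawn the same way as `matrix_`, with illustrative choices of D, K, and sample count): pairwise squared-distance ratios cluster around 1, but the map is not an exact isometry, which is why the zero log-det above is only an approximation.

```python
import numpy as np

rng = np.random.default_rng(0)
D, K = 500, 400   # K large enough that the JL distortion is small
n = 5

# Projection matrix with entries ~ N(0, 1/K), as in the fitted matrix_.
M = rng.standard_normal((D, K)) / np.sqrt(K)

X = rng.standard_normal((n, D))
Z = X @ M

# Ratios of projected to original pairwise squared distances.
ratios = []
for i in range(n):
    for j in range(i + 1, n):
        d_orig = np.sum((X[i] - X[j]) ** 2)
        d_proj = np.sum((Z[i] - Z[j]) ** 2)
        ratios.append(d_proj / d_orig)

print(all(0.7 < r < 1.3 for r in ratios))
```

Each ratio concentrates around 1 with standard deviation on the order of sqrt(2/K), which is the quantitative content of the Johnson-Lindenstrauss bound quoted above.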

rbig.OrthogonalDimensionalityReduction

Bases: RotationBijector

Full orthogonal rotation followed by optional dimension truncation.

Applies a D x D orthogonal rotation Q (drawn from the Haar measure via QR) and then retains only the first K <= D components::

z = (X Q^T)[:, :K]

The rotation is sampled fresh at fit time from a square standard-normal matrix processed through QR with sign correction.

Parameters

n_components : int or None, default None
    Number of output dimensions K. If None, K = D (no truncation).
random_state : int or None, default None
    Seed for reproducible rotation matrix generation.

Attributes

rotation_matrix_ : np.ndarray of shape (n_features, n_features)
    Full D x D orthogonal rotation matrix Q.
n_components_ : int
    Number of retained output dimensions K.
input_dim_ : int
    Input dimensionality D.

Notes

When K = D the transform is a bijection and::

log|det J| = log|det Q| = 0

When K < D the transform is not invertible; both :meth:inverse_transform and :meth:get_log_det_jacobian raise :class:NotImplementedError.

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: from ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537-549.

Examples

>>> import numpy as np
>>> from rbig._src.rotation import OrthogonalDimensionalityReduction
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((100, 4))
>>> odr = OrthogonalDimensionalityReduction(random_state=0).fit(X)
>>> Z = odr.transform(X)
>>> Z.shape
(100, 4)
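The rotate-then-truncate step can be sketched with plain NumPy (standalone arrays, not the class): a Haar-style rotation preserves per-sample norms, and truncation simply keeps the first K components.

```python
import numpy as np

rng = np.random.default_rng(0)
D, K = 4, 2

# Haar-style rotation via QR of a square Gaussian matrix; multiplying
# columns by sign(diag(R)) keeps the distribution uniform over rotations.
A = rng.standard_normal((D, D))
Q, R = np.linalg.qr(A)
Q = Q * np.sign(np.diag(R))

X = rng.standard_normal((100, D))
Z_full = X @ Q.T        # square case: pure rotation, norms preserved
Z = Z_full[:, :K]       # truncate to the first K components

print(np.allclose(np.linalg.norm(Z_full, axis=1), np.linalg.norm(X, axis=1)))
print(Z.shape)
```

The norm check makes the Notes above concrete: before truncation the transform is a bijection with log|det J| = 0; after truncation the dropped coordinates make both the inverse and the Jacobian determinant undefined.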

Source code in rbig/_src/rotation.py
class OrthogonalDimensionalityReduction(RotationBijector):
    """Full orthogonal rotation followed by optional dimension truncation.

    Applies a D x D orthogonal rotation Q (drawn from the Haar measure via
    QR) and then retains only the first K <= D components::

        z = (X Q^T)[:, :K]

    The rotation is sampled fresh at ``fit`` time from a square
    standard-normal matrix processed through QR with sign correction.

    Parameters
    ----------
    n_components : int or None, default None
        Number of output dimensions K.  If ``None``, K = D (no truncation).
    random_state : int or None, default None
        Seed for reproducible rotation matrix generation.

    Attributes
    ----------
    rotation_matrix_ : np.ndarray of shape (n_features, n_features)
        Full D x D orthogonal rotation matrix Q.
    n_components_ : int
        Number of retained output dimensions K.
    input_dim_ : int
        Input dimensionality D.

    Notes
    -----
    When K = D the transform is a bijection and::

        log|det J| = log|det Q| = 0

    When K < D the transform is not invertible; both
    :meth:`inverse_transform` and :meth:`get_log_det_jacobian` raise
    :class:`NotImplementedError`.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    from ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537-549.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.rotation import OrthogonalDimensionalityReduction
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((100, 4))
    >>> odr = OrthogonalDimensionalityReduction(random_state=0).fit(X)
    >>> Z = odr.transform(X)
    >>> Z.shape
    (100, 4)
    """

    def __init__(
        self,
        n_components: int | None = None,
        random_state: int | None = None,
    ):
        self.n_components = n_components
        self.random_state = random_state

    def fit(self, X: np.ndarray, y=None) -> OrthogonalDimensionalityReduction:
        """Sample a Haar-uniform D x D rotation matrix.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data (used only to infer D = n_features).

        Returns
        -------
        self : OrthogonalDimensionalityReduction
            Fitted transform instance.
        """
        rng = np.random.default_rng(self.random_state)
        D = X.shape[1]
        K = self.n_components if self.n_components is not None else D
        # QR of a random Gaussian matrix gives a Haar-uniform orthogonal matrix
        A = rng.standard_normal((D, D))
        Q, _ = np.linalg.qr(A)
        self.rotation_matrix_ = Q  # (D, D)
        self.n_components_ = K
        self.input_dim_ = D
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Rotate then truncate: z = (X Q^T)[:, :K].

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data.

        Returns
        -------
        Z : np.ndarray of shape (n_samples, n_components)
            Rotated and (optionally) truncated data.
        """
        Xr = X @ self.rotation_matrix_.T  # (N, D) @ (D, D) -> (N, D)  full rotation
        return Xr[:, : self.n_components_]  # (N, K)  keep first K components

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Invert the rotation (only valid for square case K = D).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_components)
            Rotated data.

        Returns
        -------
        Xr : np.ndarray of shape (n_samples, n_features)
            Data recovered in the original space.

        Raises
        ------
        NotImplementedError
            If ``n_components < n_features`` (not invertible).
        """
        if self.n_components_ < self.input_dim_:
            raise NotImplementedError(
                "OrthogonalDimensionalityReduction with n_components < input dimension "
                "is not bijective; inverse_transform is undefined."
            )
        return X @ self.rotation_matrix_  # (N, D) @ (D, D) -> (N, D)

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Return zeros for the square (bijective) case.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points (used only to determine n_samples).

        Returns
        -------
        ldj : np.ndarray of shape (n_samples,)
            Zeros, because ``|det Q| = 1`` for any orthogonal Q.

        Raises
        ------
        NotImplementedError
            If ``n_components < n_features`` (Jacobian determinant undefined).
        """
        if self.n_components_ < self.input_dim_:
            raise NotImplementedError(
                "OrthogonalDimensionalityReduction with n_components < input dimension "
                "does not have a well-defined Jacobian determinant."
            )
        return np.zeros(X.shape[0])

fit(X, y=None)

Sample a Haar-uniform D x D rotation matrix.

Parameters

X : np.ndarray of shape (n_samples, n_features) Training data (used only to infer D = n_features).

Returns

self : OrthogonalDimensionalityReduction Fitted transform instance.

Source code in rbig/_src/rotation.py
def fit(self, X: np.ndarray, y=None) -> OrthogonalDimensionalityReduction:
    """Sample a Haar-uniform D x D rotation matrix.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data (used only to infer D = n_features).

    Returns
    -------
    self : OrthogonalDimensionalityReduction
        Fitted transform instance.
    """
    rng = np.random.default_rng(self.random_state)
    D = X.shape[1]
    K = self.n_components if self.n_components is not None else D
    # QR of a random Gaussian matrix gives a Haar-uniform orthogonal matrix
    A = rng.standard_normal((D, D))
    Q, _ = np.linalg.qr(A)
    self.rotation_matrix_ = Q  # (D, D)
    self.n_components_ = K
    self.input_dim_ = D
    return self
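The `fit` shown above draws Q with a plain `np.linalg.qr` call, while the docstring mentions a sign correction. A minimal self-contained sketch of the standard sign-corrected recipe for an exactly Haar-uniform orthogonal matrix (multiply each column of Q by the sign of the matching diagonal entry of R; this is a common construction, not necessarily what this library does):

```python
import numpy as np

rng = np.random.default_rng(0)
D = 4
A = rng.standard_normal((D, D))   # square standard-normal matrix
Q, R = np.linalg.qr(A)            # QR factorisation
Q = Q * np.sign(np.diag(R))       # sign correction -> exactly Haar-uniform Q

# Q stays orthogonal after the column sign flips, so |det Q| = 1
assert np.allclose(Q @ Q.T, np.eye(D))
```

Without the sign correction, the distribution of Q depends on the QR implementation's sign conventions.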

transform(X)

Rotate then truncate: z = (X Q^T)[:, :K].

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data.

Returns

Z : np.ndarray of shape (n_samples, n_components) Rotated and (optionally) truncated data.

Source code in rbig/_src/rotation.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Rotate then truncate: z = (X Q^T)[:, :K].

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data.

    Returns
    -------
    Z : np.ndarray of shape (n_samples, n_components)
        Rotated and (optionally) truncated data.
    """
    Xr = X @ self.rotation_matrix_.T  # (N, D) @ (D, D) -> (N, D)  full rotation
    return Xr[:, : self.n_components_]  # (N, K)  keep first K components

inverse_transform(X)

Invert the rotation (only valid for square case K = D).

Parameters

X : np.ndarray of shape (n_samples, n_components) Rotated data.

Returns

Xr : np.ndarray of shape (n_samples, n_features) Data recovered in the original space.

Raises

NotImplementedError If n_components < n_features (not invertible).

Source code in rbig/_src/rotation.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Invert the rotation (only valid for square case K = D).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_components)
        Rotated data.

    Returns
    -------
    Xr : np.ndarray of shape (n_samples, n_features)
        Data recovered in the original space.

    Raises
    ------
    NotImplementedError
        If ``n_components < n_features`` (not invertible).
    """
    if self.n_components_ < self.input_dim_:
        raise NotImplementedError(
            "OrthogonalDimensionalityReduction with n_components < input dimension "
            "is not bijective; inverse_transform is undefined."
        )
    return X @ self.rotation_matrix_  # (N, D) @ (D, D) -> (N, D)

get_log_det_jacobian(X)

Return zeros for the square (bijective) case.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input points (used only to determine n_samples).

Returns

ldj : np.ndarray of shape (n_samples,) Zeros, because |det Q| = 1 for any orthogonal Q.

Raises

NotImplementedError If n_components < n_features (Jacobian determinant undefined).

Source code in rbig/_src/rotation.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Return zeros for the square (bijective) case.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points (used only to determine n_samples).

    Returns
    -------
    ldj : np.ndarray of shape (n_samples,)
        Zeros, because ``|det Q| = 1`` for any orthogonal Q.

    Raises
    ------
    NotImplementedError
        If ``n_components < n_features`` (Jacobian determinant undefined).
    """
    if self.n_components_ < self.input_dim_:
        raise NotImplementedError(
            "OrthogonalDimensionalityReduction with n_components < input dimension "
            "does not have a well-defined Jacobian determinant."
        )
    return np.zeros(X.shape[0])
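Because Q is orthogonal, the rotation in `transform` preserves Euclidean norms, and the truncation to the first K columns can only discard energy. A self-contained sketch mirroring the code above (re-implemented inline for illustration, without importing rbig):

```python
import numpy as np

rng = np.random.default_rng(0)
D, K = 4, 2
X = rng.standard_normal((100, D))

# Sample an orthogonal Q as in fit() above, rotate, then keep the first K columns
Q, _ = np.linalg.qr(rng.standard_normal((D, D)))
full = X @ Q.T                  # (100, 4): full rotation, norm-preserving
Z = full[:, :K]                 # (100, 2): truncated, not invertible

assert np.allclose(np.linalg.norm(full, axis=1), np.linalg.norm(X, axis=1))
assert np.all(np.linalg.norm(Z, axis=1) <= np.linalg.norm(full, axis=1) + 1e-12)
```

This is why the K < D case raises NotImplementedError in `inverse_transform` and `get_log_det_jacobian`: the discarded components cannot be recovered.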

Parametric Transforms

rbig.BoxCoxTransform

Bases: BaseTransform

Box-Cox power transform fitted independently to each feature.

The Box-Cox family of transforms is parameterised by λ (one per feature):

λ ≠ 0 :  y = (x^λ − 1) / λ
λ → 0 :  y = log(x)             (continuity limit)

λ values are estimated via maximum likelihood (scipy's boxcox). Features with non-positive values are left unchanged (λ = 0 applied as identity rather than log, since log requires positive inputs).

The inverse transform is:

λ ≠ 0 :  x = (λy + 1)^{1/λ}
λ = 0 :  x = exp(y)

The log-det of the Jacobian is:

λ ≠ 0 :  ∑ᵢ (λ − 1) log xᵢ
λ = 0 :  ∑ᵢ (−log xᵢ)          (from d(log x)/dx = 1/x ⟹ log|dy/dx| = −log xᵢ)

Note: The current implementation uses −xᵢ (not −log xᵢ) for the λ = 0 branch, matching the original code behaviour. This is an approximation that differs from the exact analytical log-det.

Parameters

method : str, optional (default='mle') Fitting method; currently scipy's MLE is always used regardless of this value.

Attributes

lambdas_ : np.ndarray of shape (n_features,) Fitted λ values after calling fit.

Examples

>>> import numpy as np
>>> from rbig._src.parametric import BoxCoxTransform
>>> rng = np.random.default_rng(1)
>>> X = rng.exponential(scale=2.0, size=(200, 3))  # strictly positive
>>> tr = BoxCoxTransform().fit(X)
>>> Y = tr.transform(X)
>>> X_rec = tr.inverse_transform(Y)
>>> np.allclose(X, X_rec, atol=1e-6)
True

Source code in rbig/_src/parametric.py
class BoxCoxTransform(BaseTransform):
    """Box-Cox power transform fitted independently to each feature.

    The Box-Cox family of transforms is parameterised by λ (one per feature):

        λ ≠ 0 :  y = (x^λ − 1) / λ
        λ → 0 :  y = log(x)             (continuity limit)

    λ values are estimated via maximum likelihood (scipy's ``boxcox``).
    Features with non-positive values are left unchanged (λ = 0 applied as
    identity rather than log, since log requires positive inputs).

    The inverse transform is:

        λ ≠ 0 :  x = (λy + 1)^{1/λ}
        λ = 0 :  x = exp(y)

    The log-det of the Jacobian is:

        λ ≠ 0 :  ∑ᵢ (λ − 1) log xᵢ
        λ = 0 :  ∑ᵢ (−log xᵢ)          (from d(log x)/dx = 1/x ⟹ log|dy/dx| = −log xᵢ)

    .. note::
        The current implementation uses ``−xᵢ`` (not ``−log xᵢ``) for the
        λ = 0 branch, matching the original code behaviour.  This is an
        approximation that differs from the exact analytical log-det.

    Parameters
    ----------
    method : str, optional (default='mle')
        Fitting method; currently scipy's MLE is always used regardless of
        this value.

    Attributes
    ----------
    lambdas_ : np.ndarray of shape (n_features,)
        Fitted λ values after calling ``fit``.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.parametric import BoxCoxTransform
    >>> rng = np.random.default_rng(1)
    >>> X = rng.exponential(scale=2.0, size=(200, 3))  # strictly positive
    >>> tr = BoxCoxTransform().fit(X)
    >>> Y = tr.transform(X)
    >>> X_rec = tr.inverse_transform(Y)
    >>> np.allclose(X, X_rec, atol=1e-6)
    True
    """

    def __init__(self, method: str = "mle"):
        self.method = method

    def fit(self, X: np.ndarray, y=None) -> BoxCoxTransform:
        """Estimate one Box-Cox λ per feature via MLE.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.  Features that contain non-positive values are
            assigned λ = 0 (no transform applied during ``transform``).

        Returns
        -------
        self : BoxCoxTransform
            Fitted instance with ``lambdas_`` attribute set.
        """
        self.lambdas_ = np.zeros(X.shape[1])  # λ per feature, default 0
        for i in range(X.shape[1]):
            xi = X[:, i]
            if np.all(xi > 0):
                _, lam = stats.boxcox(xi)  # MLE for λ
            else:
                lam = 0.0  # non-positive data: no power transform
            self.lambdas_[i] = lam
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the fitted Box-Cox transform to X.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data.  Features with λ = 0 and non-positive values are
            passed through unchanged.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Box-Cox transformed data.
        """
        Xt = np.zeros_like(X, dtype=float)
        for i in range(X.shape[1]):
            xi = X[:, i]
            lam = self.lambdas_[i]
            if np.all(xi > 0):
                Xt[:, i] = stats.boxcox(xi, lmbda=lam)  # y = (x^lam - 1)/lam or log(x)
            else:
                Xt[:, i] = xi  # pass-through for non-positive
        return Xt

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Invert the Box-Cox transform.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Box-Cox transformed data.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Recovered original-scale data.  Uses:

            * λ = 0 : x = exp(y)
            * λ ≠ 0 : x = (λy + 1)^{1/λ}   (clamped to 0 for stability)
        """
        Xt = np.zeros_like(X, dtype=float)
        for i in range(X.shape[1]):
            lam = self.lambdas_[i]
            if np.abs(lam) < 1e-10:
                Xt[:, i] = np.exp(X[:, i])  # x = exp(y)
            else:
                # x = (λy + 1)^{1/λ}, clamp argument to ≥ 0
                Xt[:, i] = np.power(np.maximum(lam * X[:, i] + 1, 0), 1 / lam)
        return Xt

    def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Compute per-sample log |det J| of the forward Box-Cox transform.

        The Jacobian is diagonal; for each feature:

            λ ≠ 0 :  d/dx[(x^λ−1)/λ] = x^{λ−1}  ⟹  log = (λ−1) log x
            λ = 0 :  d/dx[log x] = 1/x           ⟹  exact log = −log x

        .. note::
            The λ = 0 branch accumulates ``−xᵢ`` rather than the exact
            ``−log xᵢ``.  This preserves the original implementation
            behaviour.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data (pre-transform, original scale).

        Returns
        -------
        log_det : np.ndarray of shape (n_samples,)
            Sum of per-feature log Jacobian contributions.
        """
        log_jac = np.zeros(X.shape[0])
        for i in range(X.shape[1]):
            xi = X[:, i]
            lam = self.lambdas_[i]
            if np.abs(lam) < 1e-10:
                log_jac += -xi  # original behaviour: -x instead of the exact -log x (see note above)
            else:
                # (lam-1) log xi from x^{lam-1} Jacobian
                log_jac += (lam - 1) * np.log(np.maximum(xi, 1e-300))
        return log_jac

fit(X, y=None)

Estimate one Box-Cox λ per feature via MLE.

Parameters

X : np.ndarray of shape (n_samples, n_features) Training data. Features that contain non-positive values are assigned λ = 0 (no transform applied during transform).

Returns

self : BoxCoxTransform Fitted instance with lambdas_ attribute set.

Source code in rbig/_src/parametric.py
def fit(self, X: np.ndarray, y=None) -> BoxCoxTransform:
    """Estimate one Box-Cox λ per feature via MLE.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.  Features that contain non-positive values are
        assigned λ = 0 (no transform applied during ``transform``).

    Returns
    -------
    self : BoxCoxTransform
        Fitted instance with ``lambdas_`` attribute set.
    """
    self.lambdas_ = np.zeros(X.shape[1])  # λ per feature, default 0
    for i in range(X.shape[1]):
        xi = X[:, i]
        if np.all(xi > 0):
            _, lam = stats.boxcox(xi)  # MLE for λ
        else:
            lam = 0.0  # non-positive data: no power transform
        self.lambdas_[i] = lam
    return self
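`stats.boxcox` called with one argument returns both the transformed data and the MLE λ, while `lmbda=` fixes λ and returns only the transform. A quick self-contained illustration of the fitting step above:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.exponential(scale=2.0, size=1000)   # strictly positive feature

y, lam = stats.boxcox(x)                    # MLE lambda and transformed data
y_again = stats.boxcox(x, lmbda=lam)        # same transform with lambda fixed
assert np.allclose(y, y_again)
```

For non-positive features the code above skips this call entirely and records λ = 0.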

transform(X)

Apply the fitted Box-Cox transform to X.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data. Features with λ = 0 and non-positive values are passed through unchanged.

Returns

Xt : np.ndarray of shape (n_samples, n_features) Box-Cox transformed data.

Source code in rbig/_src/parametric.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the fitted Box-Cox transform to X.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data.  Features with λ = 0 and non-positive values are
        passed through unchanged.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Box-Cox transformed data.
    """
    Xt = np.zeros_like(X, dtype=float)
    for i in range(X.shape[1]):
        xi = X[:, i]
        lam = self.lambdas_[i]
        if np.all(xi > 0):
            Xt[:, i] = stats.boxcox(xi, lmbda=lam)  # y = (x^lam - 1)/lam or log(x)
        else:
            Xt[:, i] = xi  # pass-through for non-positive
    return Xt

inverse_transform(X)

Invert the Box-Cox transform.

Parameters

X : np.ndarray of shape (n_samples, n_features) Box-Cox transformed data.

Returns

Xt : np.ndarray of shape (n_samples, n_features) Recovered original-scale data. Uses:

* λ = 0 : x = exp(y)
* λ ≠ 0 : x = (λy + 1)^{1/λ}   (clamped to 0 for stability)
Source code in rbig/_src/parametric.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Invert the Box-Cox transform.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Box-Cox transformed data.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Recovered original-scale data.  Uses:

        * λ = 0 : x = exp(y)
        * λ ≠ 0 : x = (λy + 1)^{1/λ}   (clamped to 0 for stability)
    """
    Xt = np.zeros_like(X, dtype=float)
    for i in range(X.shape[1]):
        lam = self.lambdas_[i]
        if np.abs(lam) < 1e-10:
            Xt[:, i] = np.exp(X[:, i])  # x = exp(y)
        else:
            # x = (λy + 1)^{1/λ}, clamp argument to ≥ 0
            Xt[:, i] = np.power(np.maximum(lam * X[:, i] + 1, 0), 1 / lam)
    return Xt

log_det_jacobian(X)

Compute per-sample log |det J| of the forward Box-Cox transform.

The Jacobian is diagonal; for each feature:

λ ≠ 0 :  d/dx[(x^λ−1)/λ] = x^{λ−1}  ⟹  log = (λ−1) log x
λ = 0 :  d/dx[log x] = 1/x           ⟹  exact log = −log x

Note: The λ = 0 branch accumulates −xᵢ rather than the exact −log xᵢ. This preserves the original implementation behaviour.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data (pre-transform, original scale).

Returns

log_det : np.ndarray of shape (n_samples,) Sum of per-feature log Jacobian contributions.

Source code in rbig/_src/parametric.py
def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Compute per-sample log |det J| of the forward Box-Cox transform.

    The Jacobian is diagonal; for each feature:

        λ ≠ 0 :  d/dx[(x^λ−1)/λ] = x^{λ−1}  ⟹  log = (λ−1) log x
        λ = 0 :  d/dx[log x] = 1/x           ⟹  exact log = −log x

    .. note::
        The λ = 0 branch accumulates ``−xᵢ`` rather than the exact
        ``−log xᵢ``.  This preserves the original implementation
        behaviour.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data (pre-transform, original scale).

    Returns
    -------
    log_det : np.ndarray of shape (n_samples,)
        Sum of per-feature log Jacobian contributions.
    """
    log_jac = np.zeros(X.shape[0])
    for i in range(X.shape[1]):
        xi = X[:, i]
        lam = self.lambdas_[i]
        if np.abs(lam) < 1e-10:
            log_jac += -xi  # original behaviour: -x instead of the exact -log x (see note above)
        else:
            # (lam-1) log xi from x^{lam-1} Jacobian
            log_jac += (lam - 1) * np.log(np.maximum(xi, 1e-300))
    return log_jac
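For strictly positive data the λ ≠ 0 branch of the log-det is exact and can be checked against a central finite difference of the forward map. A self-contained sketch using only scipy's `boxcox` (the same formulas the class above is documented to use):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.exponential(scale=2.0, size=500)     # strictly positive feature
y, lam = stats.boxcox(x)                     # MLE lambda

# Analytical per-sample log|dy/dx| = (lam - 1) * log(x)
analytic = (lam - 1) * np.log(x)

# Central finite difference of the forward map, with a relative step
h = 1e-6 * x
dy = (stats.boxcox(x + h, lmbda=lam) - stats.boxcox(x - h, lmbda=lam)) / (2 * h)
numeric = np.log(np.abs(dy))

assert np.allclose(analytic, numeric, atol=1e-4)
```

The λ = 0 branch would fail the same check, since the code accumulates −x rather than the exact −log x.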

rbig.LogitTransform

Bases: BaseTransform

Logit transform: bijectively maps the unit hypercube (0,1)ᵈ to ℝᵈ.

Each feature is transformed independently by the logit (log-odds) function:

Forward  : y = log(x / (1 − x))
Inverse  : x = σ(y) = 1 / (1 + e^{−y})        (sigmoid)
Log-det  : ∑ᵢ [−log xᵢ − log(1 − xᵢ)]

The transform is useful as a pre-processing step when data lives in (0, 1), e.g. probabilities or proportions.

Examples

>>> import numpy as np
>>> from rbig._src.parametric import LogitTransform
>>> rng = np.random.default_rng(0)
>>> X = rng.uniform(0.05, 0.95, size=(100, 3))  # data in (0, 1)
>>> tr = LogitTransform().fit(X)
>>> Y = tr.transform(X)  # data now in ℝ
>>> X_rec = tr.inverse_transform(Y)
>>> np.allclose(X, X_rec, atol=1e-10)
True

Source code in rbig/_src/parametric.py
class LogitTransform(BaseTransform):
    """Logit transform: bijectively maps the unit hypercube (0,1)ᵈ to ℝᵈ.

    Each feature is transformed independently by the logit (log-odds) function:

        Forward  : y = log(x / (1 − x))
        Inverse  : x = σ(y) = 1 / (1 + e^{−y})        (sigmoid)
        Log-det  : ∑ᵢ [−log xᵢ − log(1 − xᵢ)]

    The transform is useful as a pre-processing step when data lives in (0, 1),
    e.g. probabilities or proportions.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.parametric import LogitTransform
    >>> rng = np.random.default_rng(0)
    >>> X = rng.uniform(0.05, 0.95, size=(100, 3))  # data in (0, 1)
    >>> tr = LogitTransform().fit(X)
    >>> Y = tr.transform(X)  # data now in ℝ
    >>> X_rec = tr.inverse_transform(Y)
    >>> np.allclose(X, X_rec, atol=1e-10)
    True
    """

    def fit(self, X: np.ndarray, y=None) -> LogitTransform:
        """No-op fit (stateless transform).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Ignored.

        Returns
        -------
        self : LogitTransform
        """
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the logit map y = log(x / (1 − x)).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data in (0, 1).

        Returns
        -------
        Y : np.ndarray of shape (n_samples, n_features)
            Log-odds transformed data in ℝ.
        """
        return np.log(X / (1 - X))  # y = logit(x) = log(x/(1-x))

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the inverse sigmoid (logistic) map x = 1 / (1 + e^{−y}).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in ℝ.

        Returns
        -------
        Z : np.ndarray of shape (n_samples, n_features)
            Recovered data in (0, 1).
        """
        return 1 / (1 + np.exp(-X))  # x = sigmoid(y) = 1/(1+e^{-y})

    def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Compute per-sample log |det J| of the forward logit transform.

        The Jacobian of logit is diagonal with entries
        d(logit xᵢ)/dxᵢ = 1/xᵢ + 1/(1−xᵢ), so:

            log |det J| = ∑ᵢ [−log xᵢ − log(1 − xᵢ)]

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data in (0, 1) (pre-transform).

        Returns
        -------
        log_det : np.ndarray of shape (n_samples,)
            Log absolute determinant of the Jacobian for each sample.
        """
        # Diagonal Jacobian: sum_i (-log xi - log(1-xi))
        return np.sum(-np.log(X) - np.log(1 - X), axis=1)

fit(X, y=None)

No-op fit (stateless transform).

Parameters

X : np.ndarray of shape (n_samples, n_features) Ignored.

Returns

self : LogitTransform

Source code in rbig/_src/parametric.py
def fit(self, X: np.ndarray, y=None) -> LogitTransform:
    """No-op fit (stateless transform).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Ignored.

    Returns
    -------
    self : LogitTransform
    """
    return self

transform(X)

Apply the logit map y = log(x / (1 − x)).

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data in (0, 1).

Returns

Y : np.ndarray of shape (n_samples, n_features) Log-odds transformed data in ℝ.

Source code in rbig/_src/parametric.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the logit map y = log(x / (1 − x)).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data in (0, 1).

    Returns
    -------
    Y : np.ndarray of shape (n_samples, n_features)
        Log-odds transformed data in ℝ.
    """
    return np.log(X / (1 - X))  # y = logit(x) = log(x/(1-x))

inverse_transform(X)

Apply the inverse sigmoid (logistic) map x = 1 / (1 + e^{−y}).

Parameters

X : np.ndarray of shape (n_samples, n_features) Data in ℝ.

Returns

Z : np.ndarray of shape (n_samples, n_features) Recovered data in (0, 1).

Source code in rbig/_src/parametric.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the inverse sigmoid (logistic) map x = 1 / (1 + e^{−y}).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in ℝ.

    Returns
    -------
    Z : np.ndarray of shape (n_samples, n_features)
        Recovered data in (0, 1).
    """
    return 1 / (1 + np.exp(-X))  # x = sigmoid(y) = 1/(1+e^{-y})

log_det_jacobian(X)

Compute per-sample log |det J| of the forward logit transform.

The Jacobian of logit is diagonal with entries d(logit xᵢ)/dxᵢ = 1/xᵢ + 1/(1−xᵢ), so:

log |det J| = ∑ᵢ [−log xᵢ − log(1 − xᵢ)]
Parameters

X : np.ndarray of shape (n_samples, n_features) Input data in (0, 1) (pre-transform).

Returns

log_det : np.ndarray of shape (n_samples,) Log absolute determinant of the Jacobian for each sample.

Source code in rbig/_src/parametric.py
def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Compute per-sample log |det J| of the forward logit transform.

    The Jacobian of logit is diagonal with entries
    d(logit xᵢ)/dxᵢ = 1/xᵢ + 1/(1−xᵢ), so:

        log |det J| = ∑ᵢ [−log xᵢ − log(1 − xᵢ)]

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data in (0, 1) (pre-transform).

    Returns
    -------
    log_det : np.ndarray of shape (n_samples,)
        Log absolute determinant of the Jacobian for each sample.
    """
    # Diagonal Jacobian: sum_i (-log xi - log(1-xi))
    return np.sum(-np.log(X) - np.log(1 - X), axis=1)
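Because the Jacobian is diagonal, the log-determinant is a per-feature sum. A quick self-contained check of the closed form against the elementwise derivative d logit(x)/dx = 1/(x(1−x)), plus the sigmoid round trip:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(0.05, 0.95, size=(100, 3))

# Closed form from the docstring: sum_i [-log x_i - log(1 - x_i)]
log_det = np.sum(-np.log(X) - np.log(1 - X), axis=1)

# Same quantity via the elementwise derivative of logit
deriv = 1.0 / (X * (1 - X))
assert np.allclose(log_det, np.sum(np.log(deriv), axis=1))

# Round trip: sigmoid(logit(x)) = x
Y = np.log(X / (1 - X))
X_rec = 1 / (1 + np.exp(-Y))
assert np.allclose(X, X_rec, atol=1e-12)
```

Values at or near 0 or 1 would make both the forward map and the log-det diverge, which is why the data is assumed to lie strictly inside (0, 1).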

rbig.QuantileTransform

Bases: BaseTransform

Quantile transform that maps each feature to a target distribution.

Wraps sklearn.preprocessing.QuantileTransformer to provide a uniform interface compatible with RBIG pipelines. By default, features are mapped to a standard Gaussian distribution, which is a common pre-processing step for Gaussianisation.

Parameters

n_quantiles : int, optional (default=1000) Number of quantiles used to build the empirical CDF. Capped at n_samples during fit.
output_distribution : str, optional (default='normal') Target distribution for the transform. Accepted values are 'normal' (standard Gaussian) and 'uniform'.

Attributes

qt_ : sklearn.preprocessing.QuantileTransformer Fitted sklearn transformer, available after calling fit.

Examples

>>> import numpy as np
>>> from rbig._src.parametric import QuantileTransform
>>> rng = np.random.default_rng(42)
>>> X = rng.exponential(scale=1.0, size=(500, 2))
>>> tr = QuantileTransform(n_quantiles=200).fit(X)
>>> Y = tr.transform(X)  # approximately standard Gaussian
>>> Y.shape
(500, 2)
>>> # Marginal means should be near zero, stds near 1
>>> np.allclose(Y.mean(axis=0), 0, atol=0.1)
True

Source code in rbig/_src/parametric.py
class QuantileTransform(BaseTransform):
    """Quantile transform that maps each feature to a target distribution.

    Wraps ``sklearn.preprocessing.QuantileTransformer`` to provide a uniform
    interface compatible with RBIG pipelines.  By default, features are
    mapped to a standard Gaussian distribution, which is a common
    pre-processing step for Gaussianisation.

    Parameters
    ----------
    n_quantiles : int, optional (default=1000)
        Number of quantiles used to build the empirical CDF.  Capped at
        ``n_samples`` during ``fit``.
    output_distribution : str, optional (default='normal')
        Target distribution for the transform.  Accepted values are
        ``'normal'`` (standard Gaussian) and ``'uniform'``.

    Attributes
    ----------
    qt_ : sklearn.preprocessing.QuantileTransformer
        Fitted sklearn transformer, available after calling ``fit``.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.parametric import QuantileTransform
    >>> rng = np.random.default_rng(42)
    >>> X = rng.exponential(scale=1.0, size=(500, 2))
    >>> tr = QuantileTransform(n_quantiles=200).fit(X)
    >>> Y = tr.transform(X)  # approximately standard Gaussian
    >>> Y.shape
    (500, 2)
    >>> # Marginal means should be near zero, stds near 1
    >>> np.allclose(Y.mean(axis=0), 0, atol=0.1)
    True
    """

    def __init__(self, n_quantiles: int = 1000, output_distribution: str = "normal"):
        self.n_quantiles = n_quantiles
        self.output_distribution = output_distribution

    def fit(self, X: np.ndarray, y=None) -> QuantileTransform:
        """Fit the quantile transform to the training data.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.

        Returns
        -------
        self : QuantileTransform
            Fitted instance with ``qt_`` attribute set.
        """
        from sklearn.preprocessing import QuantileTransformer

        # Cap n_quantiles at the number of available training samples
        n_quantiles = min(self.n_quantiles, X.shape[0])
        self.qt_ = QuantileTransformer(
            n_quantiles=n_quantiles,
            output_distribution=self.output_distribution,
            random_state=0,
        )
        self.qt_.fit(X)
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the fitted quantile transform.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data.

        Returns
        -------
        Y : np.ndarray of shape (n_samples, n_features)
            Data mapped to the target distribution.
        """
        return self.qt_.transform(X)

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Invert the quantile transform.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in the target distribution space.

        Returns
        -------
        Z : np.ndarray of shape (n_samples, n_features)
            Recovered data in the original distribution space.
        """
        return self.qt_.inverse_transform(X)

fit(X, y=None)

Fit the quantile transform to the training data.

Parameters

X : np.ndarray of shape (n_samples, n_features) Training data.

Returns

self : QuantileTransform Fitted instance with qt_ attribute set.

Source code in rbig/_src/parametric.py
def fit(self, X: np.ndarray, y=None) -> QuantileTransform:
    """Fit the quantile transform to the training data.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.

    Returns
    -------
    self : QuantileTransform
        Fitted instance with ``qt_`` attribute set.
    """
    from sklearn.preprocessing import QuantileTransformer

    # Cap n_quantiles at the number of available training samples
    n_quantiles = min(self.n_quantiles, X.shape[0])
    self.qt_ = QuantileTransformer(
        n_quantiles=n_quantiles,
        output_distribution=self.output_distribution,
        random_state=0,
    )
    self.qt_.fit(X)
    return self

transform(X)

Apply the fitted quantile transform.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data.

Returns

Y : np.ndarray of shape (n_samples, n_features) Data mapped to the target distribution.

Source code in rbig/_src/parametric.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the fitted quantile transform.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data.

    Returns
    -------
    Y : np.ndarray of shape (n_samples, n_features)
        Data mapped to the target distribution.
    """
    return self.qt_.transform(X)

inverse_transform(X)

Invert the quantile transform.

Parameters

X : np.ndarray of shape (n_samples, n_features) Data in the target distribution space.

Returns

Z : np.ndarray of shape (n_samples, n_features) Recovered data in the original distribution space.

Source code in rbig/_src/parametric.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Invert the quantile transform.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in the target distribution space.

    Returns
    -------
    Z : np.ndarray of shape (n_samples, n_features)
        Recovered data in the original distribution space.
    """
    return self.qt_.inverse_transform(X)
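The wrapper above delegates entirely to scikit-learn's `QuantileTransformer`, so its round-trip behaviour can be sketched with sklearn alone, without assuming the exact rbig import path:

```python
import numpy as np
from sklearn.preprocessing import QuantileTransformer

rng = np.random.default_rng(0)
X = rng.exponential(scale=2.0, size=(500, 3))  # skewed, non-Gaussian data

# Same construction as in fit() above: cap n_quantiles at n_samples
qt = QuantileTransformer(
    n_quantiles=min(1000, X.shape[0]),
    output_distribution="normal",
    random_state=0,
)
Z = qt.fit_transform(X)        # marginals mapped toward N(0, 1)
Xr = qt.inverse_transform(Z)   # approximate round-trip

# Reconstruction is accurate away from the extreme tails, where the
# normal output distribution is clipped
print(np.median(np.abs(X - Xr)))
```

Note that the round-trip is only approximate: quantiles are interpolated, and extreme samples are clipped where the probit saturates.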

Base Classes

rbig.BaseTransform

Bases: TransformerMixin, BaseEstimator, ABC

Abstract base class for all RBIG transforms.

Defines the common interface shared by every learnable data transformation in this library: fitting to data, forward mapping, and its inverse. Subclasses that support density estimation should also implement log_det_jacobian.

Notes

The change-of-variables formula for a normalizing flow relates the density of the input x to a base density p_Z via a bijection f:

log p(x) = log p_Z(f(x)) + log|det J_f(x)|

where J_f(x) is the Jacobian of f evaluated at x.

Source code in rbig/_src/base.py
class BaseTransform(TransformerMixin, BaseEstimator, ABC):
    """Abstract base class for all RBIG transforms.

    Defines the common interface shared by every learnable data transformation
    in this library: fitting to data, forward mapping, and its inverse.
    Subclasses that support density estimation should also implement
    ``log_det_jacobian``.

    Notes
    -----
    The change-of-variables formula for a normalizing flow relates the density
    of the input ``x`` to a base density ``p_Z`` via a bijection ``f``:

        log p(x) = log p_Z(f(x)) + log|det J_f(x)|

    where ``J_f(x)`` is the Jacobian of ``f`` evaluated at ``x``.
    """

    @abstractmethod
    def fit(self, X: np.ndarray, y=None) -> "BaseTransform":
        """Fit the transform to data X.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data used to estimate any internal parameters.
        y : ignored
            Not used, present for sklearn pipeline compatibility.

        Returns
        -------
        self : BaseTransform
            The fitted transform instance.
        """
        ...

    @abstractmethod
    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the fitted forward transform to data X.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data to transform.

        Returns
        -------
        Xt : np.ndarray of shape (n_samples, n_features)
            Transformed data.
        """
        ...

    @abstractmethod
    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the fitted inverse transform to data X.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in the transformed (latent) space.

        Returns
        -------
        Xr : np.ndarray of shape (n_samples, n_features)
            Data recovered in the original input space.
        """
        ...

    def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Log absolute determinant of the Jacobian evaluated at X.

        For a transform f, this returns ``log|det J_f(x)|`` per sample,
        which is the volume-correction term required in the change-of-variables
        formula for density estimation.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points at which to evaluate the log-det-Jacobian.

        Returns
        -------
        ldj : np.ndarray of shape (n_samples,)
            Per-sample log absolute determinant of the Jacobian.

        Raises
        ------
        NotImplementedError
            If the subclass does not implement this method.
        """
        raise NotImplementedError

fit(X, y=None) abstractmethod

Fit the transform to data X.

Parameters

X : np.ndarray of shape (n_samples, n_features) Training data used to estimate any internal parameters.

y : ignored Not used, present for sklearn pipeline compatibility.


Returns

self : BaseTransform The fitted transform instance.

Source code in rbig/_src/base.py
@abstractmethod
def fit(self, X: np.ndarray, y=None) -> "BaseTransform":
    """Fit the transform to data X.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data used to estimate any internal parameters.
    y : ignored
        Not used, present for sklearn pipeline compatibility.

    Returns
    -------
    self : BaseTransform
        The fitted transform instance.
    """
    ...

transform(X) abstractmethod

Apply the fitted forward transform to data X.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data to transform.

Returns

Xt : np.ndarray of shape (n_samples, n_features) Transformed data.

Source code in rbig/_src/base.py
@abstractmethod
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the fitted forward transform to data X.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data to transform.

    Returns
    -------
    Xt : np.ndarray of shape (n_samples, n_features)
        Transformed data.
    """
    ...

inverse_transform(X) abstractmethod

Apply the fitted inverse transform to data X.

Parameters

X : np.ndarray of shape (n_samples, n_features) Data in the transformed (latent) space.

Returns

Xr : np.ndarray of shape (n_samples, n_features) Data recovered in the original input space.

Source code in rbig/_src/base.py
@abstractmethod
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the fitted inverse transform to data X.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in the transformed (latent) space.

    Returns
    -------
    Xr : np.ndarray of shape (n_samples, n_features)
        Data recovered in the original input space.
    """
    ...

log_det_jacobian(X)

Log absolute determinant of the Jacobian evaluated at X.

For a transform f, this returns log|det J_f(x)| per sample, which is the volume-correction term required in the change-of-variables formula for density estimation.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input points at which to evaluate the log-det-Jacobian.

Returns

ldj : np.ndarray of shape (n_samples,) Per-sample log absolute determinant of the Jacobian.

Raises

NotImplementedError If the subclass does not implement this method.

Source code in rbig/_src/base.py
def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Log absolute determinant of the Jacobian evaluated at X.

    For a transform f, this returns ``log|det J_f(x)|`` per sample,
    which is the volume-correction term required in the change-of-variables
    formula for density estimation.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points at which to evaluate the log-det-Jacobian.

    Returns
    -------
    ldj : np.ndarray of shape (n_samples,)
        Per-sample log absolute determinant of the Jacobian.

    Raises
    ------
    NotImplementedError
        If the subclass does not implement this method.
    """
    raise NotImplementedError
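As a concrete illustration of this interface, here is a minimal, self-contained standardization transform that mirrors the `BaseTransform` contract without importing rbig (the class name `StandardizeTransform` is hypothetical and not part of the library):

```python
import numpy as np

class StandardizeTransform:
    """Per-feature affine map z = (x - mu) / sigma, mirroring BaseTransform."""

    def fit(self, X, y=None):
        self.mu_ = X.mean(axis=0)
        self.sigma_ = X.std(axis=0)
        return self

    def transform(self, X):
        return (X - self.mu_) / self.sigma_

    def inverse_transform(self, X):
        return X * self.sigma_ + self.mu_

    def log_det_jacobian(self, X):
        # Diagonal Jacobian with entries 1/sigma_i, identical for every sample
        ldj = -np.sum(np.log(self.sigma_))
        return np.full(X.shape[0], ldj)

rng = np.random.default_rng(0)
X = rng.normal(loc=3.0, scale=2.0, size=(100, 2))
t = StandardizeTransform().fit(X)
Xr = t.inverse_transform(t.transform(X))
print(np.allclose(X, Xr))  # affine maps round-trip exactly (to float precision)
```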

rbig.Bijector

Bases: TransformerMixin, BaseEstimator, ABC

Abstract base class for invertible transformations (bijectors).

A bijector implements a differentiable, invertible map f : ℝᵈ → ℝᵈ and provides the log absolute determinant of its Jacobian. These are the building blocks of normalizing flows.

The density of a random variable X = f⁻¹(Z) where Z ~ p_Z is:

log p(x) = log p_Z(f(x)) + log|det J_f(x)|

Notes

Concrete subclasses must implement fit, transform, inverse_transform, and get_log_det_jacobian. log_det_jacobian is provided as a convenience alias for get_log_det_jacobian, for compatibility with RBIGLayer.

References

Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: From ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537–549. https://doi.org/10.1109/TNN.2011.2106511

Source code in rbig/_src/base.py
class Bijector(TransformerMixin, BaseEstimator, ABC):
    """Abstract base class for invertible transformations (bijectors).

    A bijector implements a differentiable, invertible map ``f : ℝᵈ → ℝᵈ``
    and provides the log absolute determinant of its Jacobian.  These are
    the building blocks of normalizing flows.

    The density of a random variable ``X = f⁻¹(Z)`` where ``Z ~ p_Z`` is:

        log p(x) = log p_Z(f(x)) + log|det J_f(x)|

    Notes
    -----
    Concrete subclasses must implement :meth:`fit`, :meth:`transform`,
    :meth:`inverse_transform`, and :meth:`get_log_det_jacobian`.
    ``log_det_jacobian`` is provided as a convenience alias for the last
    method, for compatibility with ``RBIGLayer``.

    References
    ----------
    Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization:
    From ICA to Random Rotations. *IEEE Transactions on Neural Networks*, 22(4),
    537–549. https://doi.org/10.1109/TNN.2011.2106511
    """

    @abstractmethod
    def fit(self, X: np.ndarray, y=None) -> "Bijector":
        """Fit the bijector to data X.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.
        y : ignored
            Not used, present for sklearn pipeline compatibility.

        Returns
        -------
        self : Bijector
            The fitted bijector.
        """
        ...

    @abstractmethod
    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the forward bijection f(x).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data in the original space.

        Returns
        -------
        Z : np.ndarray of shape (n_samples, n_features)
            Data mapped to the latent space.
        """
        ...

    @abstractmethod
    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the inverse bijection f⁻¹(z).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in the latent space.

        Returns
        -------
        Xr : np.ndarray of shape (n_samples, n_features)
            Data recovered in the original space.
        """
        ...

    @abstractmethod
    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Compute log|det J_f(x)| per sample.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points at which to evaluate the log-det-Jacobian.

        Returns
        -------
        ldj : np.ndarray of shape (n_samples,)
            Per-sample log absolute determinant of the forward Jacobian J_f.
        """
        ...

    def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Alias for get_log_det_jacobian for compatibility with RBIGLayer.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points.

        Returns
        -------
        ldj : np.ndarray of shape (n_samples,)
            Per-sample log|det J_f(x)|.
        """
        return self.get_log_det_jacobian(X)

fit(X, y=None) abstractmethod

Fit the bijector to data X.

Parameters

X : np.ndarray of shape (n_samples, n_features) Training data.

y : ignored Not used, present for sklearn pipeline compatibility.

Returns

self : Bijector The fitted bijector.

Source code in rbig/_src/base.py
@abstractmethod
def fit(self, X: np.ndarray, y=None) -> "Bijector":
    """Fit the bijector to data X.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.
    y : ignored
        Not used, present for sklearn pipeline compatibility.

    Returns
    -------
    self : Bijector
        The fitted bijector.
    """
    ...

transform(X) abstractmethod

Apply the forward bijection f(x).

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data in the original space.

Returns

Z : np.ndarray of shape (n_samples, n_features) Data mapped to the latent space.

Source code in rbig/_src/base.py
@abstractmethod
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the forward bijection f(x).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data in the original space.

    Returns
    -------
    Z : np.ndarray of shape (n_samples, n_features)
        Data mapped to the latent space.
    """
    ...

inverse_transform(X) abstractmethod

Apply the inverse bijection f⁻¹(z).

Parameters

X : np.ndarray of shape (n_samples, n_features) Data in the latent space.

Returns

Xr : np.ndarray of shape (n_samples, n_features) Data recovered in the original space.

Source code in rbig/_src/base.py
@abstractmethod
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the inverse bijection f⁻¹(z).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in the latent space.

    Returns
    -------
    Xr : np.ndarray of shape (n_samples, n_features)
        Data recovered in the original space.
    """
    ...

get_log_det_jacobian(X) abstractmethod

Compute log|det J_f(x)| per sample.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input points at which to evaluate the log-det-Jacobian.

Returns

ldj : np.ndarray of shape (n_samples,) Per-sample log absolute determinant of the forward Jacobian J_f.

Source code in rbig/_src/base.py
@abstractmethod
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Compute log|det J_f(x)| per sample.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points at which to evaluate the log-det-Jacobian.

    Returns
    -------
    ldj : np.ndarray of shape (n_samples,)
        Per-sample log absolute determinant of the forward Jacobian J_f.
    """
    ...

log_det_jacobian(X)

Alias for get_log_det_jacobian for compatibility with RBIGLayer.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input points.

Returns

ldj : np.ndarray of shape (n_samples,) Per-sample log|det J_f(x)|.

Source code in rbig/_src/base.py
def log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Alias for get_log_det_jacobian for compatibility with RBIGLayer.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points.

    Returns
    -------
    ldj : np.ndarray of shape (n_samples,)
        Per-sample log|det J_f(x)|.
    """
    return self.get_log_det_jacobian(X)
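The change-of-variables identity above can be checked numerically for the simplest possible bijection, f(x) = x/s, whose log-det-Jacobian is the constant -log s (a sketch using SciPy only; nothing here is rbig API):

```python
import numpy as np
from scipy.stats import norm

# The bijection f(x) = x / s maps X = s * Z back to a standard normal Z.
s = 2.0
rng = np.random.default_rng(0)
x = rng.normal(scale=s, size=1000)

# Change of variables: log p(x) = log p_Z(f(x)) + log|det J_f(x)|, J_f = 1/s
log_p = norm.logpdf(x / s) + np.log(1.0 / s)

# Compare against the analytic N(0, s^2) log-density
log_p_true = norm.logpdf(x, scale=s)
print(np.allclose(log_p, log_p_true))  # True
```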

rbig.MarginalBijector

Bases: Bijector

Abstract bijector for independent, per-dimension (marginal) transforms.

Each feature dimension is transformed by a separate invertible function. Because the transform is applied independently to each coordinate, the Jacobian is diagonal and its log-determinant is the sum of per-dimension log-derivatives:

log|det J_f(x)| = ∑ᵢ log|f′(xᵢ)|

Subclasses implement concrete marginal mappings such as empirical CDF Gaussianization, quantile transform, or kernel density estimation.

Notes

In RBIG, the marginal step maps each dimension to a standard Gaussian via

z = Φ⁻¹(F̂ₙ(x))

where F̂ₙ is the estimated marginal CDF and Φ⁻¹ is the standard normal quantile function (probit).

Source code in rbig/_src/base.py
class MarginalBijector(Bijector):
    """Abstract bijector for independent, per-dimension (marginal) transforms.

    Each feature dimension is transformed by a separate invertible function.
    Because the transform is applied independently to each coordinate, the
    Jacobian is diagonal and its log-determinant is the sum of per-dimension
    log-derivatives:

        log|det J_f(x)| = ∑ᵢ log|f′(xᵢ)|

    Subclasses implement concrete marginal mappings such as empirical CDF
    Gaussianization, quantile transform, or kernel density estimation.

    Notes
    -----
    In RBIG, the marginal step maps each dimension to a standard Gaussian via

        z = Φ⁻¹(F̂ₙ(x))

    where F̂ₙ is the estimated marginal CDF and Φ⁻¹ is the standard normal
    quantile function (probit).
    """

rbig.RotationBijector

Bases: Bijector

Abstract bijector for orthogonal rotation transforms.

Rotation matrices Q satisfy QᵀQ = I and |det Q| = 1, so the log-absolute-determinant of the Jacobian is exactly zero:

log|det J_Q(x)| = log|det Q| = log 1 = 0

This default implementation of get_log_det_jacobian returns a zero vector of length n_samples, which concrete subclasses (e.g. PCA, ICA, random orthogonal) can inherit without override.

Notes

In RBIG, the rotation step de-correlates the marginally Gaussianized data, driving the joint distribution closer to a standard multivariate Gaussian with each iteration.

Source code in rbig/_src/base.py
class RotationBijector(Bijector):
    """Abstract bijector for orthogonal rotation transforms.

    Rotation matrices Q satisfy QᵀQ = I and |det Q| = 1, so the
    log-absolute-determinant of the Jacobian is exactly zero:

        log|det J_Q(x)| = log|det Q| = log 1 = 0

    This default implementation of ``get_log_det_jacobian`` returns a
    zero vector of length ``n_samples``, which concrete subclasses (e.g.
    PCA, ICA, random orthogonal) can inherit without override.

    Notes
    -----
    In RBIG, the rotation step de-correlates the marginally Gaussianized
    data, driving the joint distribution closer to a standard multivariate
    Gaussian with each iteration.
    """

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Return zeros because |det Q| = 1 for any orthogonal matrix Q.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points (used only to determine n_samples).

        Returns
        -------
        ldj : np.ndarray of shape (n_samples,)
            Array of zeros; rotations do not change volume.
        """
        # Orthogonal matrices preserve volume: log|det Q| = 0 for all x
        return np.zeros(X.shape[0])

get_log_det_jacobian(X)

Return zeros because |det Q| = 1 for any orthogonal matrix Q.

Parameters

X : np.ndarray of shape (n_samples, n_features) Input points (used only to determine n_samples).

Returns

ldj : np.ndarray of shape (n_samples,) Array of zeros; rotations do not change volume.

Source code in rbig/_src/base.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Return zeros because |det Q| = 1 for any orthogonal matrix Q.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points (used only to determine n_samples).

    Returns
    -------
    ldj : np.ndarray of shape (n_samples,)
        Array of zeros; rotations do not change volume.
    """
    # Orthogonal matrices preserve volume: log|det Q| = 0 for all x
    return np.zeros(X.shape[0])
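The zero-Jacobian property is easy to verify numerically for an orthogonal matrix obtained from a QR decomposition (a quick sketch; this is not the rotation code rbig itself uses, and QR of a Gaussian matrix is only Haar-distributed after correcting R's diagonal signs):

```python
import numpy as np

rng = np.random.default_rng(0)
# Orthogonal matrix from the QR decomposition of a Gaussian matrix
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))

print(np.allclose(Q.T @ Q, np.eye(5)))  # Q^T Q = I
sign, logabsdet = np.linalg.slogdet(Q)
print(np.isclose(logabsdet, 0.0))       # log|det Q| = 0: rotations preserve volume
```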

rbig.CompositeBijector

Bases: Bijector

A bijector that chains a sequence of bijectors in order.

Applies bijectors f₁, f₂, …, fₖ in sequence so that the composite map is g = fₖ ∘ … ∘ f₂ ∘ f₁. The log-det-Jacobian of the composition follows the chain rule:

log|det J_g(x)| = ∑ₖ log|det J_fₖ(xₖ₋₁)|

where xₖ₋₁ = fₖ₋₁ ∘ … ∘ f₁(x) is the input to the k-th bijector.

Parameters

bijectors : list of Bijector Ordered list of bijectors to chain. They are applied left-to-right during transform and right-to-left during inverse_transform.

Attributes

bijectors : list of Bijector The constituent bijectors in application order.

Notes

Fitting is done sequentially: each bijector is fitted to the output of the previous one, so that the full model is trained in a single fit call.

Examples

>>> import numpy as np
>>> from rbig._src.base import CompositeBijector
>>> from rbig._src.marginal import MarginalGaussianize
>>> from rbig._src.rotation import PCARotation
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((200, 4))
>>> cb = CompositeBijector([MarginalGaussianize(), PCARotation()])
>>> cb.fit(X)  # doctest: +ELLIPSIS
<rbig._src.base.CompositeBijector ...>
>>> Z = cb.transform(X)
>>> Z.shape
(200, 4)

Source code in rbig/_src/base.py
class CompositeBijector(Bijector):
    """A bijector that chains a sequence of bijectors in order.

    Applies bijectors ``f₁, f₂, …, fₖ`` in sequence so that the composite
    map is ``g = fₖ ∘ … ∘ f₂ ∘ f₁``.  The log-det-Jacobian of the
    composition follows the chain rule:

        log|det J_g(x)| = ∑ₖ log|det J_fₖ(xₖ₋₁)|

    where ``xₖ₋₁ = fₖ₋₁ ∘ … ∘ f₁(x)`` is the input to the k-th bijector.

    Parameters
    ----------
    bijectors : list of Bijector
        Ordered list of bijectors to chain.  They are applied left-to-right
        during ``transform`` and right-to-left during ``inverse_transform``.

    Attributes
    ----------
    bijectors : list of Bijector
        The constituent bijectors in application order.

    Notes
    -----
    Fitting is done sequentially: each bijector is fitted to the output of
    the previous one, so that the full model is trained in a single
    ``fit`` call.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.base import CompositeBijector
    >>> from rbig._src.marginal import MarginalGaussianize
    >>> from rbig._src.rotation import PCARotation
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((200, 4))
    >>> cb = CompositeBijector([MarginalGaussianize(), PCARotation()])
    >>> cb.fit(X)  # doctest: +ELLIPSIS
    <rbig._src.base.CompositeBijector ...>
    >>> Z = cb.transform(X)
    >>> Z.shape
    (200, 4)
    """

    def __init__(self, bijectors: list):
        self.bijectors = bijectors

    def fit(self, X: np.ndarray, y=None) -> "CompositeBijector":
        """Fit each bijector sequentially on the output of the previous one.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Training data.
        y : ignored
            Not used, present for sklearn pipeline compatibility.

        Returns
        -------
        self : CompositeBijector
            The fitted composite bijector.
        """
        Xt = X.copy()  # working copy; shape (n_samples, n_features)
        for b in self.bijectors:
            # fit bijector b on current Xt, then advance Xt to b's output
            Xt = b.fit_transform(Xt)
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply all bijectors left-to-right: g(x) = fₖ(… f₁(x) …).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input data.

        Returns
        -------
        Z : np.ndarray of shape (n_samples, n_features)
            Data after passing through every bijector in sequence.
        """
        Xt = X.copy()  # shape (n_samples, n_features)
        for b in self.bijectors:
            Xt = b.transform(Xt)
        return Xt

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Invert the composite map: g⁻¹(z) = f₁⁻¹(… fₖ⁻¹(z) …).

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Data in the latent space.

        Returns
        -------
        Xr : np.ndarray of shape (n_samples, n_features)
            Data recovered in the original input space.
        """
        Xt = X.copy()  # shape (n_samples, n_features)
        # reverse order to undo the forward composition
        for b in reversed(self.bijectors):
            Xt = b.inverse_transform(Xt)
        return Xt

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Sum log|det Jₖ| over all bijectors (chain rule).

        Uses the chain rule for Jacobian determinants:

            log|det J_g(x)| = ∑ₖ log|det J_fₖ(xₖ₋₁)|

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            Input points.

        Returns
        -------
        log_det : np.ndarray of shape (n_samples,)
            Per-sample sum of log-det-Jacobians across all bijectors.
        """
        Xt = X.copy()  # shape (n_samples, n_features)
        log_det = np.zeros(X.shape[0])  # accumulator, shape (n_samples,)
        for b in self.bijectors:
            # add log|det Jₖ| at the *current* intermediate input Xt
            log_det += b.get_log_det_jacobian(Xt)
            # advance Xt to the output of bijector b for the next iteration
            Xt = b.transform(Xt)
        return log_det

fit(X, y=None)

Fit each bijector sequentially on the output of the previous one.

Parameters

X : np.ndarray of shape (n_samples, n_features) Training data.

y : ignored Not used, present for sklearn pipeline compatibility.

Returns

self : CompositeBijector The fitted composite bijector.

Source code in rbig/_src/base.py
def fit(self, X: np.ndarray, y=None) -> "CompositeBijector":
    """Fit each bijector sequentially on the output of the previous one.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Training data.
    y : ignored
        Not used, present for sklearn pipeline compatibility.

    Returns
    -------
    self : CompositeBijector
        The fitted composite bijector.
    """
    Xt = X.copy()  # working copy; shape (n_samples, n_features)
    for b in self.bijectors:
        # fit bijector b on current Xt, then advance Xt to b's output
        Xt = b.fit_transform(Xt)
    return self

transform(X)

Apply all bijectors left-to-right: g(x) = fₖ(… f₁(x) …).

Parameters

X : np.ndarray of shape (n_samples, n_features) Input data.

Returns

Z : np.ndarray of shape (n_samples, n_features) Data after passing through every bijector in sequence.

Source code in rbig/_src/base.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply all bijectors left-to-right: g(x) = fₖ(… f₁(x) …).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input data.

    Returns
    -------
    Z : np.ndarray of shape (n_samples, n_features)
        Data after passing through every bijector in sequence.
    """
    Xt = X.copy()  # shape (n_samples, n_features)
    for b in self.bijectors:
        Xt = b.transform(Xt)
    return Xt

inverse_transform(X)

Invert the composite map: g⁻¹(z) = f₁⁻¹(… fₖ⁻¹(z) …).

Parameters

X : np.ndarray of shape (n_samples, n_features) Data in the latent space.

Returns

Xr : np.ndarray of shape (n_samples, n_features) Data recovered in the original input space.

Source code in rbig/_src/base.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Invert the composite map: g⁻¹(z) = f₁⁻¹(… fₖ⁻¹(z) …).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data in the latent space.

    Returns
    -------
    Xr : np.ndarray of shape (n_samples, n_features)
        Data recovered in the original input space.
    """
    Xt = X.copy()  # shape (n_samples, n_features)
    # reverse order to undo the forward composition
    for b in reversed(self.bijectors):
        Xt = b.inverse_transform(Xt)
    return Xt

get_log_det_jacobian(X)

Sum log|det Jₖ| over all bijectors (chain rule).

Uses the chain rule for Jacobian determinants:

log|det J_g(x)| = ∑ₖ log|det J_fₖ(xₖ₋₁)|

Parameters

X : np.ndarray of shape (n_samples, n_features) Input points.

Returns

log_det : np.ndarray of shape (n_samples,) Per-sample sum of log-det-Jacobians across all bijectors.

Source code in rbig/_src/base.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Sum log|det Jₖ| over all bijectors (chain rule).

    Uses the chain rule for Jacobian determinants:

        log|det J_g(x)| = ∑ₖ log|det J_fₖ(xₖ₋₁)|

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Input points.

    Returns
    -------
    log_det : np.ndarray of shape (n_samples,)
        Per-sample sum of log-det-Jacobians across all bijectors.
    """
    Xt = X.copy()  # shape (n_samples, n_features)
    log_det = np.zeros(X.shape[0])  # accumulator, shape (n_samples,)
    for b in self.bijectors:
        # add log|det Jₖ| at the *current* intermediate input Xt
        log_det += b.get_log_det_jacobian(Xt)
        # advance Xt to the output of bijector b for the next iteration
        Xt = b.transform(Xt)
    return log_det
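For constant-Jacobian (linear) bijectors, the chain rule above reduces to a sum of matrix log-determinants, which can be checked directly:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))

# Composite linear map g(x) = B @ (A @ x); both Jacobians are constant here.
_, ld_A = np.linalg.slogdet(A)
_, ld_B = np.linalg.slogdet(B)
_, ld_BA = np.linalg.slogdet(B @ A)

# Chain rule: log|det J_g| = log|det J_B| + log|det J_A|
print(np.isclose(ld_BA, ld_A + ld_B))  # True
```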

Information Theory

rbig.total_correlation_rbig(X)

Estimate Total Correlation (multivariate mutual information) of X.

Total Correlation is defined as:

TC(X) = ∑ᵢ H(Xᵢ) − H(X)

where the marginal entropies H(Xᵢ) are estimated via KDE (using marginal_entropy) and the joint entropy H(X) is estimated by fitting a multivariate Gaussian to the data (joint_entropy_gaussian).

Parameters

X : np.ndarray of shape (n_samples, n_features) Data matrix.

Returns

tc : float Estimated total correlation in nats. Values close to zero indicate approximate statistical independence among the dimensions.

Notes

See rbig._src.densities.total_correlation for identical logic. This function is kept in metrics for API convenience.

Examples

>>> import numpy as np
>>> from rbig._src.metrics import total_correlation_rbig
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((300, 4))  # independent Gaussians
>>> tc = total_correlation_rbig(X)
>>> tc >= -0.5  # should be near 0
True

Source code in rbig/_src/metrics.py
def total_correlation_rbig(X: np.ndarray) -> float:
    """Estimate Total Correlation (multivariate mutual information) of X.

    Total Correlation is defined as:

        TC(X) = ∑ᵢ H(Xᵢ) − H(X)

    where the marginal entropies H(Xᵢ) are estimated via KDE (using
    ``marginal_entropy``) and the joint entropy H(X) is estimated by fitting
    a multivariate Gaussian to the data (``joint_entropy_gaussian``).

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data matrix.

    Returns
    -------
    tc : float
        Estimated total correlation in nats.  Values close to zero indicate
        approximate statistical independence among the dimensions.

    Notes
    -----
    See :func:`rbig._src.densities.total_correlation` for identical logic.
    This function is kept in ``metrics`` for API convenience.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.metrics import total_correlation_rbig
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((300, 4))  # independent Gaussians
    >>> tc = total_correlation_rbig(X)
    >>> tc >= -0.5  # should be near 0
    True
    """
    from rbig._src.densities import joint_entropy_gaussian, marginal_entropy

    marg_h = marginal_entropy(X)  # ∑ᵢ H(Xᵢ), shape (n_features,)
    joint_h = joint_entropy_gaussian(X)  # H(X) via Gaussian approximation
    return float(np.sum(marg_h) - joint_h)
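For a multivariate Gaussian the total correlation has the closed form TC = -(1/2) log det R, with R the correlation matrix, which gives a quick sanity check for any TC estimator (a sketch using NumPy only, independent of the KDE-based estimator above):

```python
import numpy as np

rng = np.random.default_rng(0)
rho = 0.8
cov = np.array([[1.0, rho], [rho, 1.0]])
X = rng.multivariate_normal(np.zeros(2), cov, size=5000)

# For a Gaussian, TC(X) = -(1/2) log det R, with R the correlation matrix
R = np.corrcoef(X, rowvar=False)
tc_hat = -0.5 * np.linalg.slogdet(R)[1]

# Analytic value for rho = 0.8: -(1/2) log(1 - rho^2)
tc_true = -0.5 * np.log(1 - rho**2)
print(tc_hat, tc_true)  # both close to 0.51 nats
```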

rbig.mutual_information_rbig(model_X, model_Y, model_XY)

Estimate mutual information between X and Y via RBIG models.

Uses the identity:

MI(X; Y) = H(X) + H(Y) − H(X, Y)

where each entropy is estimated from a separately fitted RBIG model.

Parameters

model_X : AnnealedRBIG RBIG model fitted on samples from the marginal distribution of X.

model_Y : AnnealedRBIG RBIG model fitted on samples from the marginal distribution of Y.

model_XY : AnnealedRBIG RBIG model fitted on joint samples [X, Y] (i.e. columns concatenated).

Returns

mi : float
    Estimated mutual information MI(X; Y) in nats.  Non-negative for well-calibrated models; small negative values may appear due to numerical imprecision.

Notes

Each model.entropy() call returns the differential entropy estimated from the RBIG-transformed representation.

Examples

Assumes pre-fitted models; see AnnealedRBIG for fitting details.

>>> mi = mutual_information_rbig(model_X, model_Y, model_XY)
>>> mi >= 0  # MI is non-negative
True

Source code in rbig/_src/metrics.py
def mutual_information_rbig(
    model_X: AnnealedRBIG,
    model_Y: AnnealedRBIG,
    model_XY: AnnealedRBIG,
) -> float:
    """Estimate mutual information between X and Y via RBIG models.

    Uses the identity:

        MI(X; Y) = H(X) + H(Y) − H(X, Y)

    where each entropy is estimated from a separately fitted RBIG model.

    Parameters
    ----------
    model_X : AnnealedRBIG
        RBIG model fitted on samples from the marginal distribution of X.
    model_Y : AnnealedRBIG
        RBIG model fitted on samples from the marginal distribution of Y.
    model_XY : AnnealedRBIG
        RBIG model fitted on joint samples [X, Y] (i.e. columns concatenated).

    Returns
    -------
    mi : float
        Estimated mutual information MI(X; Y) in nats.  Non-negative for
        well-calibrated models; small negative values may appear due to
        numerical imprecision.

    Notes
    -----
    Each ``model.entropy()`` call returns the differential entropy estimated
    from the RBIG-transformed representation.

    Examples
    --------
    >>> # Assumes pre-fitted models; see AnnealedRBIG for fitting details.
    >>> mi = mutual_information_rbig(model_X, model_Y, model_XY)
    >>> mi >= 0  # MI is non-negative
    True
    """
    hx = model_X.entropy()  # H(X)
    hy = model_Y.entropy()  # H(Y)
    hxy = model_XY.entropy()  # H(X, Y)
    return float(hx + hy - hxy)  # MI(X;Y) = H(X) + H(Y) - H(X,Y)
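The entropy identity above can be sanity-checked without fitting any RBIG models: for a correlated Gaussian pair, plug-in entropies computed from the true log-densities (here SciPy's `norm` and `multivariate_normal`, standing in for the fitted models' `score_samples`/`entropy`) recover the analytic MI = −½ log(1 − ρ²). A minimal sketch under those assumptions:

```python
import numpy as np
from scipy.stats import multivariate_normal, norm

rng = np.random.default_rng(0)
rho = 0.8
cov = np.array([[1.0, rho], [rho, 1.0]])
XY = rng.multivariate_normal([0.0, 0.0], cov, size=200_000)

# Plug-in entropies H ≈ -(1/N) Σ log p(x), mirroring what model.entropy() estimates
hx = -np.mean(norm.logpdf(XY[:, 0]))
hy = -np.mean(norm.logpdf(XY[:, 1]))
hxy = -np.mean(multivariate_normal(cov=cov).logpdf(XY))

mi = hx + hy - hxy                    # MI = H(X) + H(Y) - H(X, Y)
mi_true = -0.5 * np.log(1 - rho**2)   # analytic Gaussian MI ≈ 0.511 nats
```

With RBIG, the fitted models play the role of the exact densities used here.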

rbig.kl_divergence_rbig(model_P, X_Q)

Estimate a divergence between distributions P and Q via a fitted RBIG model.

As implemented, this function returns:

−𝔼_Q[log p(x)] − H(P)

where 𝔼_Q is the expectation over samples X_Q from Q, log p is the log-density of model_P, and H(P) is the entropy of P estimated from the training data. Expanding H(P) = −𝔼_P[log p(x)]:

result = 𝔼_P[log p(x)] − 𝔼_Q[log p(x)]

Note: This quantity is not the standard KL divergence KL(P ‖ Q) = 𝔼_P[log p(x)/q(x)], because Q's log-density log q is never evaluated. The result is a measure of how differently P's log-density scores the P-samples versus the Q-samples. It equals zero when P and Q assign identical average log-probability under P's model.

Parameters

model_P : AnnealedRBIG
    RBIG model fitted on samples from distribution P.  Must expose score_samples(X) and entropy().
X_Q : np.ndarray of shape (n_samples, n_features)
    Samples drawn from distribution Q against which P is compared.

Returns

divergence : float
    Estimated 𝔼_P[log p(x)] − 𝔼_Q[log p(x)] in nats.

Examples

When P == Q the divergence should be near zero.

>>> kl = kl_divergence_rbig(model_P, X_from_P)
>>> kl >= -0.1  # small negative values possible due to approximation
True

Source code in rbig/_src/metrics.py
def kl_divergence_rbig(
    model_P: AnnealedRBIG,
    X_Q: np.ndarray,
) -> float:
    """Estimate a divergence between distributions P and Q via a fitted RBIG model.

    As implemented, this function returns:

        −𝔼_Q[log p(x)] − H(P)

    where ``𝔼_Q`` is the expectation over samples ``X_Q`` from Q, ``log p``
    is the log-density of ``model_P``, and ``H(P)`` is the entropy of P
    estimated from the training data.  Expanding ``H(P) = −𝔼_P[log p(x)]``:

        result = 𝔼_P[log p(x)] − 𝔼_Q[log p(x)]

    .. note::
        This quantity is **not** the standard KL divergence
        ``KL(P ‖ Q) = 𝔼_P[log p(x)/q(x)]``, because Q's log-density
        ``log q`` is never evaluated.  The result is a measure of how
        differently P's log-density scores the P-samples versus the
        Q-samples.  It equals zero when P and Q assign identical average
        log-probability under P's model.

    Parameters
    ----------
    model_P : AnnealedRBIG
        RBIG model fitted on samples from distribution P.  Must expose
        ``score_samples(X)`` and ``entropy()``.
    X_Q : np.ndarray of shape (n_samples, n_features)
        Samples drawn from distribution Q against which P is compared.

    Returns
    -------
    divergence : float
        Estimated ``𝔼_P[log p(x)] − 𝔼_Q[log p(x)]`` in nats.

    Examples
    --------
    >>> # When P == Q the divergence should be near zero.
    >>> kl = kl_divergence_rbig(model_P, X_from_P)
    >>> kl >= -0.1  # small negative values possible due to approximation
    True
    """
    # Evaluate log p(x) for samples drawn from Q
    log_pq = model_P.score_samples(X_Q)  # shape (n_samples,)
    hp = model_P.entropy()  # H(P) estimated by RBIG
    return float(-np.mean(log_pq) - hp)  # -E_Q[log p] - H(P)
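The P == Q base case can be illustrated without an RBIG model by substituting an exact Gaussian density: with `norm.logpdf` standing in for `model_P.score_samples` and the analytic entropy ½ log(2πe) standing in for `model_P.entropy()`, the quantity `-E_Q[log p] - H(P)` is near zero up to Monte Carlo error. A sketch under those stand-ins:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
X_Q = rng.standard_normal(100_000)      # here Q == P == N(0, 1)

log_pq = norm.logpdf(X_Q)               # stand-in for model_P.score_samples(X_Q)
h_p = 0.5 * np.log(2 * np.pi * np.e)    # analytic H(P), stand-in for model_P.entropy()

div = -np.mean(log_pq) - h_p            # -E_Q[log p] - H(P), ≈ 0 when P == Q
```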

rbig.entropy_rbig(model, X)

Estimate differential entropy of X using a fitted RBIG model.

Approximates the entropy via the plug-in estimator:

H(X) = −𝔼[log p(x)] ≈ −(1/N) ∑ᵢ log p(xᵢ)

where log p(xᵢ) is provided by model.score_samples.

Parameters

model : AnnealedRBIG
    RBIG model fitted on data from the same distribution as X.  Must expose a score_samples(X) method returning per-sample log probabilities.
X : np.ndarray of shape (n_samples, n_features)
    Evaluation data used to compute the empirical expectation.

Returns

entropy : float
    Estimated differential entropy in nats.

Examples

Assumes a pre-fitted AnnealedRBIG model.

>>> h = entropy_rbig(fitted_model, X_test)
>>> h > 0  # entropy is typically positive for continuous distributions
True

Source code in rbig/_src/metrics.py
def entropy_rbig(model: AnnealedRBIG, X: np.ndarray) -> float:
    """Estimate differential entropy of X using a fitted RBIG model.

    Approximates the entropy via the plug-in estimator:

        H(X) = −𝔼[log p(x)] ≈ −(1/N) ∑ᵢ log p(xᵢ)

    where log p(xᵢ) is provided by ``model.score_samples``.

    Parameters
    ----------
    model : AnnealedRBIG
        RBIG model fitted on data from the same distribution as X.  Must
        expose a ``score_samples(X)`` method returning per-sample log
        probabilities.
    X : np.ndarray of shape (n_samples, n_features)
        Evaluation data used to compute the empirical expectation.

    Returns
    -------
    entropy : float
        Estimated differential entropy in nats.

    Examples
    --------
    >>> # Assumes a pre-fitted AnnealedRBIG model.
    >>> h = entropy_rbig(fitted_model, X_test)
    >>> h > 0  # entropy is typically positive for continuous distributions
    True
    """
    log_probs = model.score_samples(X)  # log p(xᵢ) for each sample, shape (N,)
    return float(-np.mean(log_probs))  # H ~= -(1/N) sum log p(xi)

rbig.entropy_marginal(X)

Per-dimension marginal entropy using the Vasicek spacing estimator.

Applies entropy_univariate independently to each column of X.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Data matrix.

Returns

entropies : np.ndarray of shape (n_features,)
    Vasicek entropy estimate (nats) for each feature dimension.

Examples

>>> import numpy as np
>>> from rbig._src.metrics import entropy_marginal
>>> rng = np.random.default_rng(9)
>>> X = rng.standard_normal((800, 3))
>>> h = entropy_marginal(X)
>>> h.shape
(3,)

Source code in rbig/_src/metrics.py
def entropy_marginal(X: np.ndarray) -> np.ndarray:
    """Per-dimension marginal entropy using the Vasicek spacing estimator.

    Applies :func:`entropy_univariate` independently to each column of X.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data matrix.

    Returns
    -------
    entropies : np.ndarray of shape (n_features,)
        Vasicek entropy estimate (nats) for each feature dimension.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.metrics import entropy_marginal
    >>> rng = np.random.default_rng(9)
    >>> X = rng.standard_normal((800, 3))
    >>> h = entropy_marginal(X)
    >>> h.shape
    (3,)
    """
    n_features = X.shape[1]
    # Apply 1-D Vasicek estimator to each column independently
    return np.array([entropy_univariate(X[:, i]) for i in range(n_features)])

rbig.negentropy(X)

Compute per-dimension negentropy J(Xᵢ) = H_Gauss(Xᵢ) − H(Xᵢ).

Negentropy measures non-Gaussianity for each marginal:

J(Xᵢ) = H_Gauss(Xᵢ) − H(Xᵢ) ≥ 0

where H_Gauss(Xᵢ) = ½(1 + log 2π) + ½ log Var(Xᵢ) is the Gaussian entropy matched to the observed variance and H(Xᵢ) is estimated via KDE.

Parameters

X : np.ndarray of shape (n_samples, n_features)
    Data matrix.

Returns

neg_entropy : np.ndarray of shape (n_features,)
    Non-negative negentropy for each dimension.  A value of 0 indicates that the marginal is Gaussian; larger values indicate more non-Gaussianity.

Notes

Negentropy is guaranteed non-negative by the maximum-entropy principle: among all distributions with a given variance, the Gaussian has the highest entropy.

Examples

>>> import numpy as np
>>> from rbig._src.metrics import negentropy
>>> rng = np.random.default_rng(3)
>>> X_gauss = rng.standard_normal((500, 2))
>>> J_gauss = negentropy(X_gauss)
>>> np.all(J_gauss >= -0.05)  # nearly zero for Gaussian data
True

Source code in rbig/_src/metrics.py
def negentropy(X: np.ndarray) -> np.ndarray:
    """Compute per-dimension negentropy J(Xᵢ) = H_Gauss(Xᵢ) − H(Xᵢ).

    Negentropy measures non-Gaussianity for each marginal:

        J(Xᵢ) = H_Gauss(Xᵢ) − H(Xᵢ) ≥ 0

    where H_Gauss(Xᵢ) = ½(1 + log 2π) + ½ log Var(Xᵢ) is the Gaussian
    entropy matched to the observed variance and H(Xᵢ) is estimated via KDE.

    Parameters
    ----------
    X : np.ndarray of shape (n_samples, n_features)
        Data matrix.

    Returns
    -------
    neg_entropy : np.ndarray of shape (n_features,)
        Non-negative negentropy for each dimension.  A value of 0 indicates
        that the marginal is Gaussian; larger values indicate more
        non-Gaussianity.

    Notes
    -----
    Negentropy is guaranteed non-negative by the maximum-entropy principle:
    among all distributions with a given variance, the Gaussian has the
    highest entropy.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.metrics import negentropy
    >>> rng = np.random.default_rng(3)
    >>> X_gauss = rng.standard_normal((500, 2))
    >>> J_gauss = negentropy(X_gauss)
    >>> np.all(J_gauss >= -0.05)  # nearly zero for Gaussian data
    True
    """
    _n, _d = X.shape
    # Gaussian entropy matched to empirical variance: H_Gauss = ½(1+log 2π) + ½ log σ²
    gauss_h = 0.5 * (1 + np.log(2 * np.pi)) + 0.5 * np.log(np.var(X, axis=0))
    from rbig._src.densities import marginal_entropy

    marg_h = marginal_entropy(X)  # KDE-based entropy, shape (n_features,)
    return gauss_h - marg_h  # J(Xi) = H_Gauss - H >= 0
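As an analytic worked example (no KDE involved): for a uniform variable on [a, b], H = log(b − a) and the variance-matched Gaussian has σ² = (b − a)²/12, so the negentropy is a constant independent of the interval width:

```python
import numpy as np

# J_uniform = ½ log(2πe σ²) − log(b − a)  with σ² = (b − a)²/12
#           = ½ log(2πe / 12)             (the width b − a cancels)
J_uniform = 0.5 * np.log(2 * np.pi * np.e / 12)  # ≈ 0.1765 nats
```

This gives a useful reference scale when interpreting negentropy values returned for real data.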

rbig.negative_log_likelihood(model, X)

Average negative log-likelihood of X under the RBIG model.

Computes:

NLL = −(1/N) ∑ᵢ log p(xᵢ)

This is equivalent to entropy_rbig but is exposed separately to make its role as a loss / evaluation metric explicit.

Parameters

model : AnnealedRBIG
    Fitted RBIG model.  Must expose score_samples(X).
X : np.ndarray of shape (n_samples, n_features)
    Evaluation data.

Returns

nll : float
    Average negative log-likelihood in nats.

Examples

>>> nll = negative_log_likelihood(fitted_model, X_test)
>>> nll > 0  # NLL is positive for well-calibrated models
True

Source code in rbig/_src/metrics.py
def negative_log_likelihood(model: AnnealedRBIG, X: np.ndarray) -> float:
    """Average negative log-likelihood of X under the RBIG model.

    Computes:

        NLL = −(1/N) ∑ᵢ log p(xᵢ)

    This is equivalent to :func:`entropy_rbig` but is exposed separately to
    make its role as a loss / evaluation metric explicit.

    Parameters
    ----------
    model : AnnealedRBIG
        Fitted RBIG model.  Must expose ``score_samples(X)``.
    X : np.ndarray of shape (n_samples, n_features)
        Evaluation data.

    Returns
    -------
    nll : float
        Average negative log-likelihood in nats.

    Examples
    --------
    >>> nll = negative_log_likelihood(fitted_model, X_test)
    >>> nll > 0  # NLL is positive for well-calibrated models
    True
    """
    log_probs = model.score_samples(X)  # log p(xᵢ), shape (N,)
    return float(-np.mean(log_probs))  # NLL = -(1/N) sum log p(xi)

Parametric Distributions

rbig.gaussian(n_samples=1000, loc=0.0, scale=1.0, random_state=None)

Sample from a univariate Gaussian (normal) distribution.

Parameters

n_samples : int, optional (default=1000)
    Number of samples to draw.
loc : float, optional (default=0.0)
    Mean μ of the distribution.
scale : float, optional (default=1.0)
    Standard deviation σ > 0 of the distribution.
random_state : int or None, optional (default=None)
    Seed for the random number generator.  Pass an integer for reproducible results.

Returns

samples : np.ndarray of shape (n_samples,)
    Independent draws from 𝒩(loc, scale²).

Examples

>>> from rbig._src.parametric import gaussian
>>> x = gaussian(n_samples=500, loc=2.0, scale=0.5, random_state=0)
>>> x.shape
(500,)
>>> import numpy as np
>>> np.isclose(x.mean(), 2.0, atol=0.1)
True

Source code in rbig/_src/parametric.py
def gaussian(
    n_samples: int = 1000,
    loc: float = 0.0,
    scale: float = 1.0,
    random_state: int | None = None,
) -> np.ndarray:
    """Sample from a univariate Gaussian (normal) distribution.

    Parameters
    ----------
    n_samples : int, optional (default=1000)
        Number of samples to draw.
    loc : float, optional (default=0.0)
        Mean μ of the distribution.
    scale : float, optional (default=1.0)
        Standard deviation σ > 0 of the distribution.
    random_state : int or None, optional (default=None)
        Seed for the random number generator.  Pass an integer for
        reproducible results.

    Returns
    -------
    samples : np.ndarray of shape (n_samples,)
        Independent draws from 𝒩(loc, scale²).

    Examples
    --------
    >>> from rbig._src.parametric import gaussian
    >>> x = gaussian(n_samples=500, loc=2.0, scale=0.5, random_state=0)
    >>> x.shape
    (500,)
    >>> import numpy as np
    >>> np.isclose(x.mean(), 2.0, atol=0.1)
    True
    """
    rng = np.random.default_rng(random_state)
    return rng.normal(loc=loc, scale=scale, size=n_samples)

rbig.multivariate_gaussian(n_samples=1000, mean=None, cov=None, d=2, random_state=None)

Sample from a multivariate Gaussian distribution.

Parameters

n_samples : int, optional (default=1000)
    Number of samples to draw.
mean : np.ndarray of shape (d,) or None, optional
    Mean vector μ.  Defaults to the zero vector of length d.
cov : np.ndarray of shape (d, d) or None, optional
    Covariance matrix Σ.  Must be symmetric positive semi-definite.  Defaults to the d × d identity matrix.
d : int, optional (default=2)
    Dimensionality used when mean is None.  Ignored when mean is provided (its length determines the dimension).
random_state : int or None, optional (default=None)
    Seed for the random number generator.

Returns

samples : np.ndarray of shape (n_samples, d)
    Independent draws from 𝒩(mean, cov).

Examples

>>> import numpy as np
>>> from rbig._src.parametric import multivariate_gaussian
>>> cov = np.array([[1.0, 0.8], [0.8, 1.0]])
>>> X = multivariate_gaussian(n_samples=200, cov=cov, d=2, random_state=7)
>>> X.shape
(200, 2)
>>> np.isclose(np.corrcoef(X.T)[0, 1], 0.8, atol=0.1)
True

Source code in rbig/_src/parametric.py
def multivariate_gaussian(
    n_samples: int = 1000,
    mean: np.ndarray | None = None,
    cov: np.ndarray | None = None,
    d: int = 2,
    random_state: int | None = None,
) -> np.ndarray:
    """Sample from a multivariate Gaussian distribution.

    Parameters
    ----------
    n_samples : int, optional (default=1000)
        Number of samples to draw.
    mean : np.ndarray of shape (d,) or None, optional
        Mean vector μ.  Defaults to the zero vector of length ``d``.
    cov : np.ndarray of shape (d, d) or None, optional
        Covariance matrix Σ.  Must be symmetric positive semi-definite.
        Defaults to the d × d identity matrix.
    d : int, optional (default=2)
        Dimensionality used when ``mean`` is ``None``.  Ignored when
        ``mean`` is provided (its length determines the dimension).
    random_state : int or None, optional (default=None)
        Seed for the random number generator.

    Returns
    -------
    samples : np.ndarray of shape (n_samples, d)
        Independent draws from 𝒩(mean, cov).

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.parametric import multivariate_gaussian
    >>> cov = np.array([[1.0, 0.8], [0.8, 1.0]])
    >>> X = multivariate_gaussian(n_samples=200, cov=cov, d=2, random_state=7)
    >>> X.shape
    (200, 2)
    >>> np.isclose(np.corrcoef(X.T)[0, 1], 0.8, atol=0.1)
    True
    """
    rng = np.random.default_rng(random_state)
    if mean is None:
        mean = np.zeros(d)  # default: zero mean
    if cov is None:
        cov = np.eye(len(mean))  # default: identity covariance
    return rng.multivariate_normal(mean, cov, size=n_samples)

rbig.total_correlation_gaussian(cov)

Analytic Total Correlation of a multivariate Gaussian.

For a Gaussian with covariance Σ, the TC reduces to a function of the correlation matrix R = D^{-½} Σ D^{-½} (where D = diag(Σ)):

TC = ∑ᵢ H(Xᵢ) − H(X) = −½ log|R|

Equivalently, it measures how far the distribution is from being a product of its marginals.

Parameters

cov : np.ndarray of shape (d, d)
    Covariance matrix Σ.  Coerced to at least 2-D.

Returns

tc : float
    Total correlation in nats.  Returns +inf if Σ is singular.

Notes

The computation uses:

TC = (∑ᵢ ½ log(2πe σᵢ²)) − ½(d(1 + log 2π) + log|Σ|)
   = ½ ∑ᵢ log σᵢ² − ½ log|Σ|
   = −½ log|corr(Σ)|

Examples

>>> import numpy as np
>>> from rbig._src.parametric import total_correlation_gaussian
>>> # Identity covariance → all marginals independent → TC = 0
>>> tc = total_correlation_gaussian(np.eye(3))
>>> np.isclose(tc, 0.0)
True
>>> # Correlated covariance → TC > 0
>>> cov = np.array([[1.0, 0.9], [0.9, 1.0]])
>>> total_correlation_gaussian(cov) > 0
True

Source code in rbig/_src/parametric.py
def total_correlation_gaussian(cov: np.ndarray) -> float:
    """Analytic Total Correlation of a multivariate Gaussian.

    For a Gaussian with covariance Σ, the TC reduces to a function of the
    correlation matrix R = D^{-½} Σ D^{-½} (where D = diag(Σ)):

        TC = ∑ᵢ H(Xᵢ) − H(X) = −½ log|R|

    Equivalently, it measures how far the distribution is from being a
    product of its marginals.

    Parameters
    ----------
    cov : np.ndarray of shape (d, d)
        Covariance matrix Σ.  Coerced to at least 2-D.

    Returns
    -------
    tc : float
        Total correlation in nats.  Returns ``+inf`` if Σ is singular.

    Notes
    -----
    The computation uses:

        TC = (∑ᵢ ½ log(2πe σᵢ²)) − ½(d(1 + log 2π) + log|Σ|)
           = ½ ∑ᵢ log σᵢ² − ½ log|Σ|
           = −½ log|corr(Σ)|

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.parametric import total_correlation_gaussian
    >>> # Identity covariance → all marginals independent → TC = 0
    >>> tc = total_correlation_gaussian(np.eye(3))
    >>> np.isclose(tc, 0.0)
    True
    >>> # Correlated covariance → TC > 0
    >>> cov = np.array([[1.0, 0.9], [0.9, 1.0]])
    >>> total_correlation_gaussian(cov) > 0
    True
    """
    cov = np.atleast_2d(cov)
    d = cov.shape[0]
    marginal_vars = np.diag(cov)  # σᵢ²
    # ∑ᵢ H(Xᵢ) = ∑ᵢ ½ log(2πe σᵢ²)
    sum_marg_h = 0.5 * np.sum(np.log(2 * np.pi * np.e * marginal_vars))
    sign, log_det = np.linalg.slogdet(cov)  # log|Σ|
    if sign <= 0:
        return np.inf  # singular Σ
    # H(X) = ½ (d(1 + log 2π) + log|Σ|)
    joint_h = 0.5 * (d * (1 + np.log(2 * np.pi)) + log_det)
    return float(sum_marg_h - joint_h)  # TC = sum H(Xi) - H(X)
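The equivalence between the entropy form and the correlation-determinant form in the Notes can be verified directly in NumPy for an arbitrary valid covariance:

```python
import numpy as np

cov = np.array([[2.0, 1.2],
                [1.2, 1.5]])
d = cov.shape[0]

# Entropy form: ∑ᵢ H(Xᵢ) − H(X)
sum_marg = 0.5 * np.sum(np.log(2 * np.pi * np.e * np.diag(cov)))
joint = 0.5 * (d * (1 + np.log(2 * np.pi)) + np.linalg.slogdet(cov)[1])
tc_entropy = sum_marg - joint

# Correlation-determinant form: −½ log|R| with R = D^{-½} Σ D^{-½}
std = np.sqrt(np.diag(cov))
R = cov / np.outer(std, std)
tc_corr = -0.5 * np.linalg.slogdet(R)[1]
```

Both forms agree because the (1 + log 2π) terms cancel, leaving only the variance and determinant contributions.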

rbig.mutual_information_gaussian(cov_X, cov_Y, cov_XY)

Analytic mutual information between two jointly Gaussian variables.

Uses the entropy identity:

MI(X; Y) = H(X) + H(Y) − H(X, Y)

where each entropy is computed analytically from the corresponding covariance matrix via entropy_gaussian.

Parameters

cov_X : np.ndarray of shape (d_X, d_X)
    Marginal covariance of X.
cov_Y : np.ndarray of shape (d_Y, d_Y)
    Marginal covariance of Y.
cov_XY : np.ndarray of shape (d_X + d_Y, d_X + d_Y)
    Joint covariance matrix of the concatenated variable [X, Y].

Returns

mi : float
    Mutual information in nats.  Non-negative for valid covariance matrices; small negative values indicate numerical imprecision.

Notes

For Gaussians the MI can also be expressed as:

MI(X; Y) = ½ log(|Σ_{XX}| · |Σ_{YY}| / |Σ_{XY}|)

Examples

>>> import numpy as np
>>> from rbig._src.parametric import mutual_information_gaussian
>>> # Block-diagonal joint covariance → MI = 0
>>> cov_X = np.eye(2)
>>> cov_Y = np.eye(2)
>>> cov_XY = np.block([[cov_X, np.zeros((2, 2))], [np.zeros((2, 2)), cov_Y]])
>>> mi = mutual_information_gaussian(cov_X, cov_Y, cov_XY)
>>> np.isclose(mi, 0.0, atol=1e-10)
True

Source code in rbig/_src/parametric.py
def mutual_information_gaussian(
    cov_X: np.ndarray,
    cov_Y: np.ndarray,
    cov_XY: np.ndarray,
) -> float:
    """Analytic mutual information between two jointly Gaussian variables.

    Uses the entropy identity:

        MI(X; Y) = H(X) + H(Y) − H(X, Y)

    where each entropy is computed analytically from the corresponding
    covariance matrix via :func:`entropy_gaussian`.

    Parameters
    ----------
    cov_X : np.ndarray of shape (d_X, d_X)
        Marginal covariance of X.
    cov_Y : np.ndarray of shape (d_Y, d_Y)
        Marginal covariance of Y.
    cov_XY : np.ndarray of shape (d_X + d_Y, d_X + d_Y)
        Joint covariance matrix of the concatenated variable [X, Y].

    Returns
    -------
    mi : float
        Mutual information in nats.  Non-negative for valid covariance
        matrices; small negative values indicate numerical imprecision.

    Notes
    -----
    For Gaussians the MI can also be expressed as:

        MI(X; Y) = ½ log(|Σ_{XX}| · |Σ_{YY}| / |Σ_{XY}|)

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.parametric import mutual_information_gaussian
    >>> # Block-diagonal joint covariance → MI = 0
    >>> cov_X = np.eye(2)
    >>> cov_Y = np.eye(2)
    >>> cov_XY = np.block([[cov_X, np.zeros((2, 2))], [np.zeros((2, 2)), cov_Y]])
    >>> mi = mutual_information_gaussian(cov_X, cov_Y, cov_XY)
    >>> np.isclose(mi, 0.0, atol=1e-10)
    True
    """
    hx = entropy_gaussian(cov_X)  # H(X)
    hy = entropy_gaussian(cov_Y)  # H(Y)
    hxy = entropy_gaussian(cov_XY)  # H(X, Y)
    return float(hx + hy - hxy)  # MI(X;Y) = H(X) + H(Y) - H(X,Y)
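The agreement between the entropy identity and the determinant form ½ log(|Σ_XX|·|Σ_YY|/|Σ_XY|) can be checked numerically (using the same Gaussian entropy formula the function relies on):

```python
import numpy as np

def h_gauss(cov: np.ndarray) -> float:
    """Analytic Gaussian entropy ½ (d(1 + log 2π) + log|Σ|)."""
    cov = np.atleast_2d(cov)
    d = cov.shape[0]
    return 0.5 * (d * (1 + np.log(2 * np.pi)) + np.linalg.slogdet(cov)[1])

rho = 0.6
cov_XY = np.array([[1.0, rho],
                   [rho, 1.0]])
cov_X = cov_XY[:1, :1]
cov_Y = cov_XY[1:, 1:]

mi_entropy = h_gauss(cov_X) + h_gauss(cov_Y) - h_gauss(cov_XY)
# Determinant form: ½ log(|Σ_XX|·|Σ_YY| / |Σ_XY|) = −½ log(1 − ρ²) here
mi_det = 0.5 * np.log(
    np.linalg.det(cov_X) * np.linalg.det(cov_Y) / np.linalg.det(cov_XY)
)
```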

rbig.kl_divergence_gaussian(mu0, cov0, mu1, cov1)

Analytic KL divergence KL(P₀ ‖ P₁) between two multivariate Gaussians.

Both distributions are assumed to be multivariate Gaussian:

P₀ = 𝒩(μ₀, Σ₀)   and   P₁ = 𝒩(μ₁, Σ₁)

The closed-form KL divergence is:

KL(P₀ ‖ P₁) = ½ [ tr(Σ₁⁻¹Σ₀) + (μ₁ − μ₀)ᵀ Σ₁⁻¹ (μ₁ − μ₀)
                   − d + log(|Σ₁| / |Σ₀|) ]

Parameters

mu0 : np.ndarray of shape (d,)
    Mean of the source distribution P₀.
cov0 : np.ndarray of shape (d, d)
    Covariance Σ₀ of the source distribution P₀.
mu1 : np.ndarray of shape (d,)
    Mean of the target distribution P₁.
cov1 : np.ndarray of shape (d, d)
    Covariance Σ₁ of the target distribution P₁.

Returns

kl : float
    KL divergence KL(P₀ ‖ P₁) in nats.  Always non-negative for valid covariance matrices.

Notes

The matrix inverse Σ₁⁻¹ is computed via np.linalg.inv; for large d a Cholesky-based approach would be more numerically stable.

Examples

>>> import numpy as np
>>> from rbig._src.parametric import kl_divergence_gaussian
>>> # KL(P ‖ P) = 0 for identical distributions
>>> mu = np.array([1.0, 2.0])
>>> cov = np.array([[2.0, 0.5], [0.5, 1.5]])
>>> kl = kl_divergence_gaussian(mu, cov, mu, cov)
>>> np.isclose(kl, 0.0, atol=1e-10)
True

Source code in rbig/_src/parametric.py
def kl_divergence_gaussian(
    mu0: np.ndarray,
    cov0: np.ndarray,
    mu1: np.ndarray,
    cov1: np.ndarray,
) -> float:
    """Analytic KL divergence KL(P₀ ‖ P₁) between two multivariate Gaussians.

    Both distributions are assumed to be multivariate Gaussian:

        P₀ = 𝒩(μ₀, Σ₀)   and   P₁ = 𝒩(μ₁, Σ₁)

    The closed-form KL divergence is:

        KL(P₀ ‖ P₁) = ½ [ tr(Σ₁⁻¹Σ₀) + (μ₁ − μ₀)ᵀ Σ₁⁻¹ (μ₁ − μ₀)
                           − d + log(|Σ₁| / |Σ₀|) ]

    Parameters
    ----------
    mu0 : np.ndarray of shape (d,)
        Mean of the *source* distribution P₀.
    cov0 : np.ndarray of shape (d, d)
        Covariance Σ₀ of the source distribution P₀.
    mu1 : np.ndarray of shape (d,)
        Mean of the *target* distribution P₁.
    cov1 : np.ndarray of shape (d, d)
        Covariance Σ₁ of the target distribution P₁.

    Returns
    -------
    kl : float
        KL divergence KL(P₀ ‖ P₁) in nats.  Always non-negative for valid
        covariance matrices.

    Notes
    -----
    The matrix inverse Σ₁⁻¹ is computed via ``np.linalg.inv``; for large d
    a Cholesky-based approach would be more numerically stable.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.parametric import kl_divergence_gaussian
    >>> # KL(P ‖ P) = 0 for identical distributions
    >>> mu = np.array([1.0, 2.0])
    >>> cov = np.array([[2.0, 0.5], [0.5, 1.5]])
    >>> kl = kl_divergence_gaussian(mu, cov, mu, cov)
    >>> np.isclose(kl, 0.0, atol=1e-10)
    True
    """
    mu0 = np.atleast_1d(mu0)
    mu1 = np.atleast_1d(mu1)
    cov0 = np.atleast_2d(cov0)
    cov1 = np.atleast_2d(cov1)
    d = len(mu0)
    cov1_inv = np.linalg.inv(cov1)  # Σ₁⁻¹, shape (d, d)
    diff = mu1 - mu0  # mu1 - mu0, shape (d,)
    _sign0, log_det0 = np.linalg.slogdet(cov0)  # log|Σ₀|
    _sign1, log_det1 = np.linalg.slogdet(cov1)  # log|Σ₁|
    trace_term = np.trace(cov1_inv @ cov0)  # tr(Σ₁⁻¹Σ₀)
    quad_term = diff @ cov1_inv @ diff  # (mu1-mu0)^T Sigma1^-1 (mu1-mu0)
    log_det_term = log_det1 - log_det0  # log(|Sigma1|/|Sigma0|)
    # KL = 0.5 [tr(Sigma1^-1 Sigma0) + quad - d + log(|Sigma1|/|Sigma0|)]
    return float(0.5 * (trace_term + quad_term - d + log_det_term))
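The matrix formula can be cross-checked against the well-known 1-D closed form KL(N(0,1) ‖ N(μ,σ²)) = log σ + (1 + μ²)/(2σ²) − ½. The sketch below re-implements the same formula as the function above, self-contained for illustration:

```python
import numpy as np

def kl_mvn(mu0, cov0, mu1, cov1) -> float:
    """½ [tr(Σ₁⁻¹Σ₀) + (μ₁−μ₀)ᵀΣ₁⁻¹(μ₁−μ₀) − d + log(|Σ₁|/|Σ₀|)]."""
    mu0, mu1 = np.atleast_1d(mu0), np.atleast_1d(mu1)
    cov0, cov1 = np.atleast_2d(cov0), np.atleast_2d(cov1)
    d = mu0.size
    inv1 = np.linalg.inv(cov1)
    diff = mu1 - mu0
    return float(0.5 * (np.trace(inv1 @ cov0) + diff @ inv1 @ diff - d
                        + np.linalg.slogdet(cov1)[1] - np.linalg.slogdet(cov0)[1]))

mu, sigma = 1.5, 2.0
kl_matrix = kl_mvn(0.0, 1.0, mu, sigma**2)
kl_closed = np.log(sigma) + (1 + mu**2) / (2 * sigma**2) - 0.5
```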

rbig.entropy_gaussian(cov)

Analytic differential entropy of a multivariate Gaussian.

Computes the closed-form entropy of 𝒩(μ, Σ):

H(X) = ½ log|2πeΣ| = ½ (d(1 + log 2π) + log|Σ|)

where d is the dimensionality and |·| denotes the matrix determinant. The mean μ does not affect the entropy.

Parameters

cov : np.ndarray of shape (d, d) or (1,) for scalar
    Covariance matrix Σ (or variance for d=1).  Coerced to at least 2-D via np.atleast_2d.

Returns

entropy : float
    Differential entropy in nats.  Returns -inf if Σ is singular or not positive definite.

Notes

np.linalg.slogdet is used for numerically stable log-determinant computation.

Examples

>>> import numpy as np
>>> from rbig._src.parametric import entropy_gaussian
>>> # 2-D standard Gaussian: H = 0.5 * 2 * (1 + log 2π) ≈ 2.838 nats
>>> h = entropy_gaussian(np.eye(2))
>>> np.isclose(h, 0.5 * 2 * (1 + np.log(2 * np.pi)))
True

Source code in rbig/_src/parametric.py
def entropy_gaussian(cov: np.ndarray) -> float:
    """Analytic differential entropy of a multivariate Gaussian.

    Computes the closed-form entropy of 𝒩(μ, Σ):

        H(X) = ½ log|2πeΣ| = ½ (d(1 + log 2π) + log|Σ|)

    where d is the dimensionality and |·| denotes the matrix determinant.
    The mean μ does not affect the entropy.

    Parameters
    ----------
    cov : np.ndarray of shape (d, d) or (1,) for scalar
        Covariance matrix Σ (or variance for d=1).  Coerced to at least 2-D
        via ``np.atleast_2d``.

    Returns
    -------
    entropy : float
        Differential entropy in nats.  Returns ``-inf`` if Σ is singular or
        not positive definite.

    Notes
    -----
    ``np.linalg.slogdet`` is used for numerically stable log-determinant
    computation.

    Examples
    --------
    >>> import numpy as np
    >>> from rbig._src.parametric import entropy_gaussian
    >>> # 2-D standard Gaussian: H = 0.5 * 2 * (1 + log 2π) ≈ 2.838 nats
    >>> h = entropy_gaussian(np.eye(2))
    >>> np.isclose(h, 0.5 * 2 * (1 + np.log(2 * np.pi)))
    True
    """
    cov = np.atleast_2d(cov)
    d = cov.shape[0]
    sign, log_det = np.linalg.slogdet(cov)  # stable log|Σ|
    if sign <= 0:
        return -np.inf  # singular covariance
    # H = ½ (d(1 + log 2π) + log|Σ|)
    return 0.5 * (d * (1 + np.log(2 * np.pi)) + log_det)

Image Processing

rbig.ImageRBIG

RBIG orchestrator for image data.

Alternates between marginal Gaussianisation and an orthonormal spatial rotation for n_layers steps, progressively pushing the joint distribution of image pixels towards a multivariate Gaussian.

Each layer applies:

  1. MarginalGaussianize — maps every feature dimension to a standard normal marginal distribution.
  2. An orthonormal rotation selected by strategy — decorrelates features without changing the differential entropy.

Parameters

n_layers : int, default 10
    Number of (marginal + rotation) layer pairs to apply.
C : int, default 1
    Number of image channels passed to the rotation layers.
H : int, default 8
    Image height in pixels passed to the rotation layers.
W : int, default 8
    Image width in pixels passed to the rotation layers.
strategy : str, default "dct"
    Rotation strategy.  One of:

* ``"dct"`` — Type-II orthonormal DCT (:class:`DCTRotation`).
* ``"hartley"`` — Discrete Hartley Transform
  (:class:`HartleyRotation`).
* ``"random_channel"`` — Random orthogonal channel mixing
  (:class:`RandomChannelRotation`).

Any unknown string falls back to ``"dct"``.

random_state : int or None, default None
    Base seed for rotation layers that use randomness (random_channel).  Layer i uses seed random_state + i.
verbose : bool or int, default=False
    Controls progress bar display.  False (or 0) disables all progress bars.  True (or 1) shows a progress bar for the fit loop.  2 additionally shows progress bars for transform and inverse_transform.

Attributes

layers_ : list of tuple (MarginalGaussianize, ImageBijector)
    Fitted (marginal, rotation) pairs in application order.
X_transformed_ : np.ndarray, shape (N, C*H*W)
    Final transformed representation after the last layer.

Notes

The composed forward transform for a single sample x is

    z = (R_L ∘ G_L ∘ ⋯ ∘ R_1 ∘ G_1)(x)

where G_ℓ is marginal Gaussianisation and R_ℓ is an orthonormal rotation at layer ℓ. Because each rotation is orthonormal, the total log-determinant is determined entirely by the marginal transforms.

Examples

>>> import numpy as np
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((50, 64))  # 50 images, C=1, H=8, W=8
>>> model = ImageRBIG(n_layers=3, C=1, H=8, W=8, strategy="dct", random_state=0)
>>> model.fit(X)  # doctest: +ELLIPSIS
ImageRBIG(...)
>>> Xt = model.transform(X)
>>> Xt.shape
(50, 64)
>>> Xr = model.inverse_transform(Xt)
>>> Xr.shape
(50, 64)
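The claim that the rotations contribute nothing to the log-determinant can be checked for the "dct" strategy: the orthonormal type-II DCT (here built with scipy.fft.dct, which a DCTRotation layer of this kind presumably wraps) is an orthogonal matrix, so its log-determinant is zero. A quick sketch:

```python
import numpy as np
from scipy.fft import dct

n = 8
# Columns of M are the orthonormal type-II DCT of the standard basis vectors,
# i.e. M is the DCT transform matrix itself.
M = dct(np.eye(n), type=2, norm="ortho", axis=0)
sign, logdet = np.linalg.slogdet(M)
# Orthonormal ⇒ MᵀM = I and log|det M| = 0 (volume-preserving)
```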

Source code in rbig/_src/image.py
class ImageRBIG:
    """RBIG orchestrator for image data.

    Alternates between marginal Gaussianisation and an orthonormal spatial
    rotation for ``n_layers`` steps, progressively pushing the joint
    distribution of image pixels towards a multivariate Gaussian.

    Each layer applies:

    1. :class:`~rbig._src.marginal.MarginalGaussianize` — maps every feature
       dimension to a standard normal marginal distribution.
    2. An orthonormal rotation selected by ``strategy`` — decorrelates
       features without changing the differential entropy.

    Parameters
    ----------
    n_layers : int, default 10
        Number of (marginal + rotation) layer pairs to apply.
    C : int, default 1
        Number of image channels passed to the rotation layers.
    H : int, default 8
        Image height in pixels passed to the rotation layers.
    W : int, default 8
        Image width in pixels passed to the rotation layers.
    strategy : str, default ``"dct"``
        Rotation strategy.  One of:

        * ``"dct"`` — Type-II orthonormal DCT (:class:`DCTRotation`).
        * ``"hartley"`` — Discrete Hartley Transform
          (:class:`HartleyRotation`).
        * ``"random_channel"`` — Random orthogonal channel mixing
          (:class:`RandomChannelRotation`).

        Any unknown string falls back to ``"dct"``.
    random_state : int or None, default None
        Base seed for rotation layers that use randomness (``random_channel``).
        Layer ``i`` uses seed ``random_state + i``.
    verbose : bool or int, default=False
        Controls progress bar display.  ``False`` (or ``0``) disables all
        progress bars.  ``True`` (or ``1``) shows a progress bar for the
        ``fit`` loop.  ``2`` additionally shows progress bars for
        ``transform`` and ``inverse_transform``.

    Attributes
    ----------
    layers_ : list of tuple (MarginalGaussianize, ImageBijector)
        Fitted (marginal, rotation) pairs in application order.
    X_transformed_ : np.ndarray, shape ``(N, C*H*W)``
        Final transformed representation after the last layer.

    Notes
    -----
    The composed forward transform for a single sample :math:`\\mathbf{x}` is

    .. math::

        \\mathbf{z} = (R_L \\circ G_L \\circ \\cdots \\circ R_1 \\circ G_1)(\\mathbf{x})

    where :math:`G_\\ell` is marginal Gaussianisation and :math:`R_\\ell` is an
    orthonormal rotation at layer :math:`\\ell`.  Because each rotation is
    orthonormal, the total log-determinant is determined entirely by the
    marginal transforms.

    Examples
    --------
    >>> import numpy as np
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((50, 64))  # 50 images, C=1, H=8, W=8
    >>> model = ImageRBIG(n_layers=3, C=1, H=8, W=8, strategy="dct", random_state=0)
    >>> model.fit(X)  # doctest: +ELLIPSIS
    ImageRBIG(...)
    >>> Xt = model.transform(X)
    >>> Xt.shape
    (50, 64)
    >>> Xr = model.inverse_transform(Xt)
    >>> Xr.shape
    (50, 64)
    """

    def __init__(
        self,
        n_layers: int = 10,
        C: int = 1,
        H: int = 8,
        W: int = 8,
        strategy: str = "dct",
        random_state: int | None = None,
        verbose: bool | int = False,
    ):
        self.n_layers = n_layers
        self.C = C
        self.H = H
        self.W = W
        self.strategy = strategy
        self.random_state = random_state
        self.verbose = verbose

    def fit(self, X: np.ndarray, y=None) -> ImageRBIG:
        """Fit all (marginal, rotation) layer pairs sequentially.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, C*H*W)``
            Training image batch in flattened format.

        Returns
        -------
        self : ImageRBIG
        """
        from rbig._src._progress import maybe_tqdm
        from rbig._src.marginal import MarginalGaussianize

        self.layers_ = []
        Xt = X.copy()  # working copy updated layer by layer
        pbar = maybe_tqdm(
            range(self.n_layers),
            verbose=self.verbose,
            level=1,
            desc="Fitting ImageRBIG",
            total=self.n_layers,
        )
        for i in pbar:
            # Step 1: marginal Gaussianisation
            marginal = MarginalGaussianize()
            Xt_m = marginal.fit_transform(Xt)
            # Step 2: orthonormal spatial rotation
            rotation = self._make_rotation(seed=i)
            rotation.fit(Xt_m)
            self.layers_.append((marginal, rotation))
            Xt = rotation.transform(Xt_m)  # update for the next iteration
        self.X_transformed_ = Xt  # final representation after all layers
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply all fitted layers in forward order.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, C*H*W)``
            Image batch to transform.

        Returns
        -------
        Xt : np.ndarray, shape ``(N, C*H*W)``
            Gaussianised representation.
        """
        from rbig._src._progress import maybe_tqdm

        Xt = X.copy()
        layers_iter = maybe_tqdm(
            self.layers_,
            verbose=self.verbose,
            level=2,
            desc="Transforming",
            total=len(self.layers_),
        )
        for marginal, rotation in layers_iter:
            Xt = rotation.transform(marginal.transform(Xt))
        return Xt

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Apply all fitted layers in reverse order.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, C*H*W)``
            Gaussianised representation to invert.

        Returns
        -------
        Xr : np.ndarray, shape ``(N, C*H*W)``
            Reconstructed image batch in the original domain.
        """
        from rbig._src._progress import maybe_tqdm

        Xt = X.copy()
        # Reverse the layer list; apply inverse of each (rotation first, then marginal)
        layers_iter = maybe_tqdm(
            reversed(self.layers_),
            verbose=self.verbose,
            level=2,
            desc="Inverse transforming",
            total=len(self.layers_),
        )
        for marginal, rotation in layers_iter:
            Xt = marginal.inverse_transform(rotation.inverse_transform(Xt))
        return Xt

    def _make_rotation(self, seed: int = 0):
        """Instantiate the rotation layer for the given layer index.

        Parameters
        ----------
        seed : int, default 0
            Layer index; combined with ``random_state`` for reproducibility.

        Returns
        -------
        rotation : ImageBijector
            An unfitted rotation bijector of the type specified by
            ``self.strategy``.
        """
        # Combine base seed with layer index so each layer gets a unique seed
        rng_seed = (self.random_state or 0) + seed
        if self.strategy == "dct":
            return DCTRotation(C=self.C, H=self.H, W=self.W)
        elif self.strategy == "hartley":
            return HartleyRotation(C=self.C, H=self.H, W=self.W)
        elif self.strategy == "random_channel":
            return RandomChannelRotation(
                C=self.C, H=self.H, W=self.W, random_state=rng_seed
            )
        else:
            # Unknown strategy: fall back to DCT
            return DCTRotation(C=self.C, H=self.H, W=self.W)

fit(X, y=None)

Fit all (marginal, rotation) layer pairs sequentially.

Parameters

X : np.ndarray, shape (N, C*H*W)
    Training image batch in flattened format.

Returns

self : ImageRBIG

Source code in rbig/_src/image.py
def fit(self, X: np.ndarray, y=None) -> ImageRBIG:
    """Fit all (marginal, rotation) layer pairs sequentially.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, C*H*W)``
        Training image batch in flattened format.

    Returns
    -------
    self : ImageRBIG
    """
    from rbig._src._progress import maybe_tqdm
    from rbig._src.marginal import MarginalGaussianize

    self.layers_ = []
    Xt = X.copy()  # working copy updated layer by layer
    pbar = maybe_tqdm(
        range(self.n_layers),
        verbose=self.verbose,
        level=1,
        desc="Fitting ImageRBIG",
        total=self.n_layers,
    )
    for i in pbar:
        # Step 1: marginal Gaussianisation
        marginal = MarginalGaussianize()
        Xt_m = marginal.fit_transform(Xt)
        # Step 2: orthonormal spatial rotation
        rotation = self._make_rotation(seed=i)
        rotation.fit(Xt_m)
        self.layers_.append((marginal, rotation))
        Xt = rotation.transform(Xt_m)  # update for the next iteration
    self.X_transformed_ = Xt  # final representation after all layers
    return self

transform(X)

Apply all fitted layers in forward order.

Parameters

X : np.ndarray, shape (N, C*H*W)
    Image batch to transform.

Returns

Xt : np.ndarray, shape (N, C*H*W)
    Gaussianised representation.

Source code in rbig/_src/image.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply all fitted layers in forward order.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, C*H*W)``
        Image batch to transform.

    Returns
    -------
    Xt : np.ndarray, shape ``(N, C*H*W)``
        Gaussianised representation.
    """
    from rbig._src._progress import maybe_tqdm

    Xt = X.copy()
    layers_iter = maybe_tqdm(
        self.layers_,
        verbose=self.verbose,
        level=2,
        desc="Transforming",
        total=len(self.layers_),
    )
    for marginal, rotation in layers_iter:
        Xt = rotation.transform(marginal.transform(Xt))
    return Xt

inverse_transform(X)

Apply all fitted layers in reverse order.

Parameters

X : np.ndarray, shape (N, C*H*W)
    Gaussianised representation to invert.

Returns

Xr : np.ndarray, shape (N, C*H*W)
    Reconstructed image batch in the original domain.

Source code in rbig/_src/image.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Apply all fitted layers in reverse order.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, C*H*W)``
        Gaussianised representation to invert.

    Returns
    -------
    Xr : np.ndarray, shape ``(N, C*H*W)``
        Reconstructed image batch in the original domain.
    """
    from rbig._src._progress import maybe_tqdm

    Xt = X.copy()
    # Reverse the layer list; apply inverse of each (rotation first, then marginal)
    layers_iter = maybe_tqdm(
        reversed(self.layers_),
        verbose=self.verbose,
        level=2,
        desc="Inverse transforming",
        total=len(self.layers_),
    )
    for marginal, rotation in layers_iter:
        Xt = marginal.inverse_transform(rotation.inverse_transform(Xt))
    return Xt

rbig.ImageBijector

Bases: Bijector

Abstract base class for bijective image transforms.

Manages the conversion between the flattened representation (N, C·H·W) expected by RBIG and the 4-D tensor (N, C, H, W) used internally by spatial transforms.

Subclasses must implement fit, transform, and inverse_transform. The default get_log_det_jacobian returns zeros, which is correct for all orthonormal transforms defined in this module (|det J| = 1).

Attributes

C_ : int
    Number of channels (set during fit).
H_ : int
    Image height in pixels (set during fit).
W_ : int
    Image width in pixels (set during fit).

Notes

The two helper methods implement:

    _to_tensor: (N, C·H·W) → (N, C, H, W)
    _to_flat:   (N, C, H, W) → (N, C·H·W)
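Both helpers are thin wrappers around numpy reshapes, so the conversion is lossless. A standalone numpy sketch of the round trip (shapes as illustration):

```python
import numpy as np

N, C, H, W = 5, 1, 8, 8
X_flat = np.arange(N * C * H * W, dtype=float).reshape(N, C * H * W)

# _to_tensor: (N, C*H*W) -> (N, C, H, W)
X_tensor = X_flat.reshape(N, C, H, W)

# _to_flat: (N, C, H, W) -> (N, C*H*W)
X_back = X_tensor.reshape(N, -1)

# Reshaping only reinterprets the memory layout; no values change.
assert np.array_equal(X_flat, X_back)
```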
Source code in rbig/_src/image.py
class ImageBijector(Bijector):
    """Abstract base class for bijective image transforms.

    Manages the conversion between the flattened representation
    ``(N, C·H·W)`` expected by RBIG and the 4-D tensor ``(N, C, H, W)``
    used internally by spatial transforms.

    Subclasses must implement :meth:`fit`, :meth:`transform`, and
    :meth:`inverse_transform`.  The default :meth:`get_log_det_jacobian`
    returns zeros, which is correct for all orthonormal transforms defined
    in this module (``|det J| = 1``).

    Attributes
    ----------
    C_ : int
        Number of channels (set during :meth:`fit`).
    H_ : int
        Image height in pixels (set during :meth:`fit`).
    W_ : int
        Image width in pixels (set during :meth:`fit`).

    Notes
    -----
    The two helper methods implement:

    .. math::

        \\text{_to_tensor}: (N, C \\cdot H \\cdot W)
            \\longrightarrow (N, C, H, W)

        \\text{_to_flat}: (N, C, H, W)
            \\longrightarrow (N, C \\cdot H \\cdot W)
    """

    def _to_tensor(self, X: np.ndarray) -> np.ndarray:
        """Reshape flat ``(N, C*H*W)`` array to tensor ``(N, C, H, W)``.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, C*H*W)``
            Flattened image batch.

        Returns
        -------
        tensor : np.ndarray, shape ``(N, C, H, W)``
        """
        N = X.shape[0]
        C, H, W = self.C_, self.H_, self.W_
        return X.reshape(N, C, H, W)

    def _to_flat(self, X: np.ndarray) -> np.ndarray:
        """Reshape tensor ``(N, C, H, W)`` to flat ``(N, C*H*W)``.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, C, H, W)``
            Image tensor batch.

        Returns
        -------
        flat : np.ndarray, shape ``(N, C*H*W)``
        """
        N = X.shape[0]
        return X.reshape(N, -1)

    def fit(self, X: np.ndarray, y=None) -> ImageBijector:
        raise NotImplementedError

    def transform(self, X: np.ndarray) -> np.ndarray:
        raise NotImplementedError

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        raise NotImplementedError

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Return per-sample log |det J| = 0 (orthonormal transform).

        Parameters
        ----------
        X : np.ndarray, shape ``(N, D)``

        Returns
        -------
        log_det : np.ndarray, shape ``(N,)``
            All-zero array because the Jacobian determinant is ±1 for every
            orthonormal linear map.
        """
        return np.zeros(X.shape[0])

get_log_det_jacobian(X)

Return per-sample log |det J| = 0 (orthonormal transform).

Parameters

X : np.ndarray, shape (N, D)

Returns

log_det : np.ndarray, shape (N,)
    All-zero array because the Jacobian determinant is ±1 for every orthonormal linear map.

Source code in rbig/_src/image.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Return per-sample log |det J| = 0 (orthonormal transform).

    Parameters
    ----------
    X : np.ndarray, shape ``(N, D)``

    Returns
    -------
    log_det : np.ndarray, shape ``(N,)``
        All-zero array because the Jacobian determinant is ±1 for every
        orthonormal linear map.
    """
    return np.zeros(X.shape[0])

rbig.DCTRotation

Bases: ImageBijector

Type-II orthonormal 2-D Discrete Cosine Transform rotation.

Applies the 2-D DCT-II with orthonormal normalisation (norm="ortho") to each spatial channel. Because the ortho-normalised DCT is an orthogonal matrix, log|det J| = 0 for all inputs.

Parameters

C : int, default 1
    Number of image channels.
H : int, default 8
    Image height in pixels.
W : int, default 8
    Image width in pixels.

Attributes

C_ : int
    Fitted number of channels.
H_ : int
    Fitted image height.
W_ : int
    Fitted image width.

Notes

For an orthonormal DCT matrix D acting on the vectorised image x:

    y = D x,    log|det J| = log|det D| = 0

because D is orthogonal (det = ±1).
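Orthogonality of the ortho-normalised DCT can be verified directly: Parseval's relation holds (the L2 norm is preserved), and idctn inverts dctn exactly. A small check using scipy:

```python
import numpy as np
from scipy.fft import dctn, idctn

rng = np.random.default_rng(0)
img = rng.standard_normal((8, 8))

coeffs = dctn(img, norm="ortho")

# Orthogonality implies Parseval's relation: the L2 norm is preserved,
# which is exactly the statement |det D| = 1, i.e. log|det J| = 0.
assert np.isclose(np.linalg.norm(img), np.linalg.norm(coeffs))

# idctn with norm="ortho" is the exact inverse of dctn with norm="ortho".
assert np.allclose(img, idctn(coeffs, norm="ortho"))
```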

Examples

>>> import numpy as np
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((5, 64))  # N=5, C=1, H=8, W=8
>>> layer = DCTRotation(C=1, H=8, W=8)
>>> layer.fit(X)  # doctest: +ELLIPSIS
DCTRotation(...)
>>> Xt = layer.transform(X)
>>> Xr = layer.inverse_transform(Xt)
>>> np.allclose(X, Xr, atol=1e-10)
True

Source code in rbig/_src/image.py
class DCTRotation(ImageBijector):
    """Type-II orthonormal 2-D Discrete Cosine Transform rotation.

    Applies the 2-D DCT-II with orthonormal normalisation (``norm="ortho"``)
    to each spatial channel.  Because the ortho-normalised DCT is an
    orthogonal matrix, ``log|det J| = 0`` for all inputs.

    Parameters
    ----------
    C : int, default 1
        Number of image channels.
    H : int, default 8
        Image height in pixels.
    W : int, default 8
        Image width in pixels.

    Attributes
    ----------
    C_ : int
        Fitted number of channels.
    H_ : int
        Fitted image height.
    W_ : int
        Fitted image width.

    Notes
    -----
    For an orthonormal DCT matrix :math:`\\mathbf{D}` acting on the
    vectorised image :math:`\\mathbf{x}`:

    .. math::

        \\mathbf{y} = \\mathbf{D}\\,\\mathbf{x},
        \\quad
        \\log |\\det J| = \\log |\\det \\mathbf{D}| = 0

    because :math:`\\mathbf{D}` is orthogonal (``det = ±1``).

    Examples
    --------
    >>> import numpy as np
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((5, 64))  # N=5, C=1, H=8, W=8
    >>> layer = DCTRotation(C=1, H=8, W=8)
    >>> layer.fit(X)  # doctest: +ELLIPSIS
    DCTRotation(...)
    >>> Xt = layer.transform(X)
    >>> Xr = layer.inverse_transform(Xt)
    >>> np.allclose(X, Xr, atol=1e-10)
    True
    """

    def __init__(self, C: int = 1, H: int = 8, W: int = 8):
        self.C = C
        self.H = H
        self.W = W

    def fit(self, X: np.ndarray, y=None) -> DCTRotation:
        """Store spatial dimensions; no data-dependent fitting required.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, C*H*W)``

        Returns
        -------
        self : DCTRotation
        """
        self.C_ = self.C
        self.H_ = self.H
        self.W_ = self.W
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply orthonormal 2-D DCT-II to every image channel.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, C*H*W)``
            Flattened image batch.

        Returns
        -------
        Xt : np.ndarray, shape ``(N, C*H*W)``
            Orthonormal DCT-II coefficients.
        """
        from scipy.fft import dctn

        N = X.shape[0]
        imgs = self._to_tensor(X)  # (N, C, H, W)
        result = np.zeros_like(imgs)
        for n in range(N):
            for c in range(self.C_):
                # norm="ortho" yields the orthonormal variant of the DCT-II
                result[n, c] = dctn(imgs[n, c], norm="ortho")
        return self._to_flat(result)

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Apply orthonormal 2-D inverse DCT (DCT-III scaled).

        Parameters
        ----------
        X : np.ndarray, shape ``(N, C*H*W)``
            DCT coefficient batch from :meth:`transform`.

        Returns
        -------
        Xr : np.ndarray, shape ``(N, C*H*W)``
            Reconstructed image batch.
        """
        from scipy.fft import idctn

        N = X.shape[0]
        imgs = X.reshape(N, self.C_, self.H_, self.W_)  # (N, C, H, W)
        result = np.zeros_like(imgs)
        for n in range(N):
            for c in range(self.C_):
                # idctn with norm="ortho" is the exact inverse of dctn with norm="ortho"
                result[n, c] = idctn(imgs[n, c], norm="ortho")
        return self._to_flat(result)

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Return zeros: orthonormal DCT has ``log|det J| = 0``.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, D)``

        Returns
        -------
        log_det : np.ndarray, shape ``(N,)``
        """
        return np.zeros(X.shape[0])

fit(X, y=None)

Store spatial dimensions; no data-dependent fitting required.

Parameters

X : np.ndarray, shape (N, C*H*W)

Returns

self : DCTRotation

Source code in rbig/_src/image.py
def fit(self, X: np.ndarray, y=None) -> DCTRotation:
    """Store spatial dimensions; no data-dependent fitting required.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, C*H*W)``

    Returns
    -------
    self : DCTRotation
    """
    self.C_ = self.C
    self.H_ = self.H
    self.W_ = self.W
    return self

transform(X)

Apply orthonormal 2-D DCT-II to every image channel.

Parameters

X : np.ndarray, shape (N, C*H*W)
    Flattened image batch.

Returns

Xt : np.ndarray, shape (N, C*H*W)
    Orthonormal DCT-II coefficients.

Source code in rbig/_src/image.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply orthonormal 2-D DCT-II to every image channel.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, C*H*W)``
        Flattened image batch.

    Returns
    -------
    Xt : np.ndarray, shape ``(N, C*H*W)``
        Orthonormal DCT-II coefficients.
    """
    from scipy.fft import dctn

    N = X.shape[0]
    imgs = self._to_tensor(X)  # (N, C, H, W)
    result = np.zeros_like(imgs)
    for n in range(N):
        for c in range(self.C_):
            # norm="ortho" yields the orthonormal variant of the DCT-II
            result[n, c] = dctn(imgs[n, c], norm="ortho")
    return self._to_flat(result)

inverse_transform(X)

Apply orthonormal 2-D inverse DCT (DCT-III scaled).

Parameters

X : np.ndarray, shape (N, C*H*W)
    DCT coefficient batch from transform.

Returns

Xr : np.ndarray, shape (N, C*H*W)
    Reconstructed image batch.

Source code in rbig/_src/image.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Apply orthonormal 2-D inverse DCT (DCT-III scaled).

    Parameters
    ----------
    X : np.ndarray, shape ``(N, C*H*W)``
        DCT coefficient batch from :meth:`transform`.

    Returns
    -------
    Xr : np.ndarray, shape ``(N, C*H*W)``
        Reconstructed image batch.
    """
    from scipy.fft import idctn

    N = X.shape[0]
    imgs = X.reshape(N, self.C_, self.H_, self.W_)  # (N, C, H, W)
    result = np.zeros_like(imgs)
    for n in range(N):
        for c in range(self.C_):
            # idctn with norm="ortho" is the exact inverse of dctn with norm="ortho"
            result[n, c] = idctn(imgs[n, c], norm="ortho")
    return self._to_flat(result)

get_log_det_jacobian(X)

Return zeros: orthonormal DCT has log|det J| = 0.

Parameters

X : np.ndarray, shape (N, D)

Returns

log_det : np.ndarray, shape (N,)

Source code in rbig/_src/image.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Return zeros: orthonormal DCT has ``log|det J| = 0``.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, D)``

    Returns
    -------
    log_det : np.ndarray, shape ``(N,)``
    """
    return np.zeros(X.shape[0])

rbig.HartleyRotation

Bases: ImageBijector

Discrete Hartley Transform — real-to-real orthonormal rotation.

The 2-D Discrete Hartley Transform (DHT) is defined as

    H(x) = Re(FFT(x)) − Im(FFT(x))

and is normalised by 1/√(H·W) to make it orthonormal. Because the normalised DHT is its own inverse, the same operation is applied in both transform and inverse_transform.

Since the transform is orthonormal, log|det J| = 0 for all inputs.

Parameters

C : int, default 1
    Number of image channels.
H : int, default 8
    Image height in pixels.
W : int, default 8
    Image width in pixels.

Attributes

C_ : int
    Fitted number of channels.
H_ : int
    Fitted image height.
W_ : int
    Fitted image width.

Notes

The normalised DHT satisfies

    H(H(x)) = x

making it a self-inverse bijection. The scaling factor is 1/√(H·W).
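The self-inverse property is easy to confirm numerically. A standalone sketch of the normalised DHT built from scipy's fft2 (the helper name is illustrative, not the library's):

```python
import numpy as np
from scipy.fft import fft2

def dht2_ortho(x):
    """Normalised 2-D DHT: Re(FFT2) - Im(FFT2), scaled by 1/sqrt(H*W)."""
    F = fft2(x)
    return (F.real - F.imag) / np.sqrt(x.shape[0] * x.shape[1])

rng = np.random.default_rng(0)
img = rng.standard_normal((8, 8))

h = dht2_ortho(img)
# Orthonormal: the L2 norm is preserved, so log|det J| = 0.
assert np.isclose(np.linalg.norm(img), np.linalg.norm(h))
# Self-inverse: applying the same transform twice recovers the input.
assert np.allclose(dht2_ortho(h), img)
```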

Examples

>>> import numpy as np
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((5, 64))  # N=5, C=1, H=8, W=8
>>> layer = HartleyRotation(C=1, H=8, W=8)
>>> layer.fit(X)  # doctest: +ELLIPSIS
HartleyRotation(...)
>>> Xt = layer.transform(X)
>>> Xr = layer.inverse_transform(Xt)
>>> np.allclose(X, Xr, atol=1e-10)
True

Source code in rbig/_src/image.py
class HartleyRotation(ImageBijector):
    """Discrete Hartley Transform — real-to-real orthonormal rotation.

    The 2-D Discrete Hartley Transform (DHT) is defined as

    .. math::

        H(\\mathbf{x}) = \\operatorname{Re}\\bigl(\\text{FFT}(\\mathbf{x})\\bigr)
                        - \\operatorname{Im}\\bigl(\\text{FFT}(\\mathbf{x})\\bigr)

    and is normalised by ``1/√(H·W)`` to make it orthonormal (unitary).
    Because the DHT is its own inverse (self-inverse), the same operation is
    applied in both :meth:`transform` and :meth:`inverse_transform`.

    Since the transform is orthonormal ``log|det J| = 0`` for all inputs.

    Parameters
    ----------
    C : int, default 1
        Number of image channels.
    H : int, default 8
        Image height in pixels.
    W : int, default 8
        Image width in pixels.

    Attributes
    ----------
    C_ : int
        Fitted number of channels.
    H_ : int
        Fitted image height.
    W_ : int
        Fitted image width.

    Notes
    -----
    The normalised DHT satisfies

    .. math::

        H(H(\\mathbf{x})) = \\mathbf{x}

    making it a self-inverse bijection.  The scaling factor is
    :math:`1 / \\sqrt{H \\cdot W}`.

    Examples
    --------
    >>> import numpy as np
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((5, 64))  # N=5, C=1, H=8, W=8
    >>> layer = HartleyRotation(C=1, H=8, W=8)
    >>> layer.fit(X)  # doctest: +ELLIPSIS
    HartleyRotation(...)
    >>> Xt = layer.transform(X)
    >>> Xr = layer.inverse_transform(Xt)
    >>> np.allclose(X, Xr, atol=1e-10)
    True
    """

    def __init__(self, C: int = 1, H: int = 8, W: int = 8):
        self.C = C
        self.H = H
        self.W = W

    def fit(self, X: np.ndarray, y=None) -> HartleyRotation:
        """Store spatial dimensions; no data-dependent fitting required.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, C*H*W)``

        Returns
        -------
        self : HartleyRotation
        """
        self.C_ = self.C
        self.H_ = self.H
        self.W_ = self.W
        return self

    def _dht2(self, x: np.ndarray) -> np.ndarray:
        """Compute un-normalised 2-D Discrete Hartley Transform.

        .. math::

            H_{m,n} = \\operatorname{Re}(F_{m,n}) - \\operatorname{Im}(F_{m,n})

        where :math:`F = \\text{FFT2}(x)`.

        Parameters
        ----------
        x : np.ndarray, shape ``(H, W)``
            Single real-valued image channel.

        Returns
        -------
        h : np.ndarray, shape ``(H, W)``
            Un-normalised DHT coefficients.
        """
        from scipy.fft import fft2

        X_fft = fft2(x)  # complex FFT2, shape (H, W)
        # DHT = Re(FFT) - Im(FFT)
        return X_fft.real - X_fft.imag

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Apply normalised 2-D DHT to every image channel.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, C*H*W)``
            Flattened image batch.

        Returns
        -------
        Xt : np.ndarray, shape ``(N, C*H*W)``
            DHT coefficients scaled by ``1/√(H*W)``.
        """
        N = X.shape[0]
        imgs = self._to_tensor(X)  # (N, C, H, W)
        result = np.zeros_like(imgs)
        for n in range(N):
            for c in range(self.C_):
                # Normalise by 1/√(H·W) to make the transform orthonormal
                result[n, c] = self._dht2(imgs[n, c]) / np.sqrt(self.H_ * self.W_)
        return self._to_flat(result)

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Apply the inverse DHT (identical to the forward transform).

        The normalised DHT satisfies ``H(H(x)) = x``, so the forward and
        inverse transforms are the same function.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, C*H*W)``

        Returns
        -------
        Xr : np.ndarray, shape ``(N, C*H*W)``
        """
        # DHT is self-inverse (up to scale factor already applied in transform)
        return self.transform(X)

    def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
        """Return zeros: normalised DHT has ``|det J| = 1``.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, D)``

        Returns
        -------
        log_det : np.ndarray, shape ``(N,)``
        """
        return np.zeros(X.shape[0])

fit(X, y=None)

Store spatial dimensions; no data-dependent fitting required.

Parameters

X : np.ndarray, shape (N, C*H*W)

Returns

self : HartleyRotation

Source code in rbig/_src/image.py
def fit(self, X: np.ndarray, y=None) -> HartleyRotation:
    """Store spatial dimensions; no data-dependent fitting required.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, C*H*W)``

    Returns
    -------
    self : HartleyRotation
    """
    self.C_ = self.C
    self.H_ = self.H
    self.W_ = self.W
    return self

transform(X)

Apply normalised 2-D DHT to every image channel.

Parameters

X : np.ndarray, shape (N, C*H*W)
    Flattened image batch.

Returns

Xt : np.ndarray, shape (N, C*H*W)
    DHT coefficients scaled by 1/√(H*W).

Source code in rbig/_src/image.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Apply normalised 2-D DHT to every image channel.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, C*H*W)``
        Flattened image batch.

    Returns
    -------
    Xt : np.ndarray, shape ``(N, C*H*W)``
        DHT coefficients scaled by ``1/√(H*W)``.
    """
    N = X.shape[0]
    imgs = self._to_tensor(X)  # (N, C, H, W)
    result = np.zeros_like(imgs)
    for n in range(N):
        for c in range(self.C_):
            # Normalise by 1/√(H·W) to make the transform orthonormal
            result[n, c] = self._dht2(imgs[n, c]) / np.sqrt(self.H_ * self.W_)
    return self._to_flat(result)

inverse_transform(X)

Apply the inverse DHT (identical to the forward transform).

The normalised DHT satisfies H(H(x)) = x, so the forward and inverse transforms are the same function.

Parameters

X : np.ndarray, shape (N, C*H*W)

Returns

Xr : np.ndarray, shape (N, C*H*W)

Source code in rbig/_src/image.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Apply the inverse DHT (identical to the forward transform).

    The normalised DHT satisfies ``H(H(x)) = x``, so the forward and
    inverse transforms are the same function.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, C*H*W)``

    Returns
    -------
    Xr : np.ndarray, shape ``(N, C*H*W)``
    """
    # DHT is self-inverse (up to scale factor already applied in transform)
    return self.transform(X)

get_log_det_jacobian(X)

Return zeros: normalised DHT has |det J| = 1.

Parameters

X : np.ndarray, shape (N, D)

Returns

log_det : np.ndarray, shape (N,)

Source code in rbig/_src/image.py
def get_log_det_jacobian(self, X: np.ndarray) -> np.ndarray:
    """Return zeros: normalised DHT has ``|det J| = 1``.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, D)``

    Returns
    -------
    log_det : np.ndarray, shape ``(N,)``
    """
    return np.zeros(X.shape[0])

rbig.WaveletTransform

Bases: BaseTransform

Multi-level 2-D wavelet decomposition for image data.

Wraps PyWavelets wavedec2 / waverec2 to provide the standard fit / transform / inverse_transform interface expected by RBIG pipeline components.

The forward transform maps each (H, W) image to a flat coefficient vector of length H * W (for periodization boundary mode the coefficient array has the same number of elements as the input image).

Requires PyWavelets (pip install PyWavelets).

Parameters

wavelet : str, default "haar"
    Wavelet name accepted by pywt.Wavelet, e.g. "haar", "db2", "sym4".
level : int, default 1
    Decomposition depth. Higher levels produce coarser approximation sub-bands.
mode : str, default "periodization"
    Signal extension mode passed to PyWavelets. "periodization" ensures the output coefficient array has the same total size as the input.

Attributes

original_shape_ : tuple of int
    Shape (N, H, W) of the training data passed to fit.
coeff_slices_ : list
    PyWavelets slicing metadata needed to pack/unpack the coefficient array. Set during fit.
coeff_shape_ : tuple of int
    Shape of the 2-D coefficient array produced by pywt.coeffs_to_array.

Notes

The mapping from image to coefficients is

    wavedec2: (N, H·W) → (N, H·W)

For level=1 and wavelet="haar" the four sub-bands are: approximation (LL), horizontal detail (LH), vertical detail (HL), and diagonal detail (HH).
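The level-1 Haar split can be sketched in plain numpy, without PyWavelets: each 2×2 block is mapped to one average and three difference coefficients. The sub-band naming below follows the common averaging/differencing convention and may not match PyWavelets' exact ordering:

```python
import numpy as np

def haar2_level1(img):
    """One level of the orthonormal 2-D Haar transform on 2x2 blocks."""
    a = img[0::2, 0::2]  # top-left of each 2x2 block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    LL = (a + b + c + d) / 2  # approximation
    LH = (a + b - c - d) / 2  # row-difference detail
    HL = (a - b + c - d) / 2  # column-difference detail
    HH = (a - b - c + d) / 2  # diagonal detail
    return LL, LH, HL, HH

rng = np.random.default_rng(0)
img = rng.standard_normal((8, 8))
bands = haar2_level1(img)

# The four sub-bands together hold exactly H*W coefficients, and because
# the transform is orthonormal the total energy is preserved.
assert sum(b.size for b in bands) == img.size
assert np.isclose(np.linalg.norm(img) ** 2,
                  sum(np.linalg.norm(b) ** 2 for b in bands))
```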

Examples

>>> import numpy as np
>>> rng = np.random.default_rng(0)
>>> X = rng.standard_normal((50, 8, 8))  # 50 grayscale 8×8 images
>>> wt = WaveletTransform(wavelet="haar", level=1)
>>> wt.fit(X)  # doctest: +ELLIPSIS
WaveletTransform(...)
>>> Xt = wt.transform(X)
>>> Xt.shape
(50, 64)
>>> Xr = wt.inverse_transform(Xt)
>>> Xr.shape
(50, 8, 8)

Source code in rbig/_src/image.py
class WaveletTransform(BaseTransform):
    """Multi-level 2-D wavelet decomposition for image data.

    Wraps PyWavelets ``wavedec2`` / ``waverec2`` to provide the standard
    ``fit`` / ``transform`` / ``inverse_transform`` interface expected by
    RBIG pipeline components.

    The forward transform maps each ``(H, W)`` image to a flat coefficient
    vector of length ``H * W`` (for periodization boundary mode the coefficient
    array has the same number of elements as the input image).

    Requires PyWavelets (``pip install PyWavelets``).

    Parameters
    ----------
    wavelet : str, default ``"haar"``
        Wavelet name accepted by :func:`pywt.Wavelet`, e.g. ``"haar"``,
        ``"db2"``, ``"sym4"``.
    level : int, default 1
        Decomposition depth.  Higher levels produce coarser approximation
        sub-bands.
    mode : str, default ``"periodization"``
        Signal extension mode passed to PyWavelets.  ``"periodization"``
        ensures the output coefficient array has the same total size as the
        input.

    Attributes
    ----------
    original_shape_ : tuple of int
        Shape ``(N, H, W)`` of the training data passed to :meth:`fit`.
    coeff_slices_ : list
        PyWavelets slicing metadata needed to pack/unpack the coefficient
        array.  Set during :meth:`fit`.
    coeff_shape_ : tuple of int
        Shape of the 2-D coefficient array produced by
        :func:`pywt.coeffs_to_array`.

    Notes
    -----
    The mapping from image to coefficients is

    .. math::

        (N,\\, H,\\, W) \\xrightarrow{\\text{wavedec2}}
        (N,\\, H \\cdot W)

    For ``level=1`` and ``wavelet="haar"`` the four sub-bands are:
    approximation (LL), horizontal detail (LH), vertical detail (HL), and
    diagonal detail (HH).

    Examples
    --------
    >>> import numpy as np
    >>> rng = np.random.default_rng(0)
    >>> X = rng.standard_normal((50, 8, 8))  # 50 grayscale 8×8 images
    >>> wt = WaveletTransform(wavelet="haar", level=1)
    >>> wt.fit(X)  # doctest: +ELLIPSIS
    WaveletTransform(...)
    >>> Xt = wt.transform(X)
    >>> Xt.shape
    (50, 64)
    >>> Xr = wt.inverse_transform(Xt)
    >>> Xr.shape
    (50, 8, 8)
    """

    def __init__(
        self, wavelet: str = "haar", level: int = 1, mode: str = "periodization"
    ):
        self.wavelet = wavelet
        self.level = level
        self.mode = mode

    def fit(self, X: np.ndarray, y=None) -> WaveletTransform:
        """Compute and store coefficient layout from the first sample.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, H, W)``
            Training images.  Only the first sample is used to determine the
            coefficient array shape; the data values are not retained.

        Returns
        -------
        self : WaveletTransform
        """
        import pywt

        self.pywt_ = pywt
        self.original_shape_ = X.shape  # store (N, H, W) for reference
        test = X[0]  # single image of shape (H, W)
        coeffs = pywt.wavedec2(test, self.wavelet, level=self.level, mode=self.mode)
        # coeffs_to_array packs all sub-bands into one 2-D array
        arr, self.coeff_slices_ = pywt.coeffs_to_array(coeffs)
        self.coeff_shape_ = arr.shape  # e.g. (H, W) for periodization
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        """Decompose images into flattened wavelet coefficients.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, H, W)``
            Images to transform.

        Returns
        -------
        Xt : np.ndarray, shape ``(N, H*W)``
            Flattened coefficient vectors, one row per image.
        """
        import pywt

        result = []
        for xi in X:  # xi shape: (H, W)
            coeffs = pywt.wavedec2(xi, self.wavelet, level=self.level, mode=self.mode)
            arr, _ = pywt.coeffs_to_array(coeffs)  # arr shape: coeff_shape_
            result.append(arr.ravel())  # flatten to 1-D coefficient vector
        return np.array(result)  # (N, H*W)

    def inverse_transform(self, X: np.ndarray) -> np.ndarray:
        """Reconstruct images from flattened wavelet coefficients.

        Parameters
        ----------
        X : np.ndarray, shape ``(N, H*W)``
            Flattened coefficient vectors produced by :meth:`transform`.

        Returns
        -------
        Xr : np.ndarray, shape ``(N, H, W)``
            Reconstructed images.
        """
        import pywt

        result = []
        for xi in X:  # xi shape: (H*W,)
            arr = xi.reshape(self.coeff_shape_)  # restore 2-D coefficient array
            coeffs = pywt.array_to_coeffs(
                arr, self.coeff_slices_, output_format="wavedec2"
            )
            img = pywt.waverec2(coeffs, self.wavelet, mode=self.mode)  # (H, W)
            result.append(img)
        return np.array(result)  # (N, H, W)

fit(X, y=None)

Compute and store coefficient layout from the first sample.

Parameters

X : np.ndarray, shape (N, H, W)
    Training images. Only the first sample is used to determine the coefficient array shape; the data values are not retained.

Returns

self : WaveletTransform

Source code in rbig/_src/image.py
def fit(self, X: np.ndarray, y=None) -> WaveletTransform:
    """Compute and store coefficient layout from the first sample.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, H, W)``
        Training images.  Only the first sample is used to determine the
        coefficient array shape; the data values are not retained.

    Returns
    -------
    self : WaveletTransform
    """
    import pywt

    self.pywt_ = pywt
    self.original_shape_ = X.shape  # store (N, H, W) for reference
    test = X[0]  # single image of shape (H, W)
    coeffs = pywt.wavedec2(test, self.wavelet, level=self.level, mode=self.mode)
    # coeffs_to_array packs all sub-bands into one 2-D array
    arr, self.coeff_slices_ = pywt.coeffs_to_array(coeffs)
    self.coeff_shape_ = arr.shape  # e.g. (H, W) for periodization
    return self

transform(X)

Decompose images into flattened wavelet coefficients.

Parameters

X : np.ndarray, shape (N, H, W)
    Images to transform.

Returns

Xt : np.ndarray, shape (N, H*W)
    Flattened coefficient vectors, one row per image.

Source code in rbig/_src/image.py
def transform(self, X: np.ndarray) -> np.ndarray:
    """Decompose images into flattened wavelet coefficients.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, H, W)``
        Images to transform.

    Returns
    -------
    Xt : np.ndarray, shape ``(N, H*W)``
        Flattened coefficient vectors, one row per image.
    """
    import pywt

    result = []
    for xi in X:  # xi shape: (H, W)
        coeffs = pywt.wavedec2(xi, self.wavelet, level=self.level, mode=self.mode)
        arr, _ = pywt.coeffs_to_array(coeffs)  # arr shape: coeff_shape_
        result.append(arr.ravel())  # flatten to 1-D coefficient vector
    return np.array(result)  # (N, H*W)

inverse_transform(X)

Reconstruct images from flattened wavelet coefficients.

Parameters

X : np.ndarray, shape (N, H*W)
    Flattened coefficient vectors produced by transform.

Returns

Xr : np.ndarray, shape (N, H, W)
    Reconstructed images.

Source code in rbig/_src/image.py
def inverse_transform(self, X: np.ndarray) -> np.ndarray:
    """Reconstruct images from flattened wavelet coefficients.

    Parameters
    ----------
    X : np.ndarray, shape ``(N, H*W)``
        Flattened coefficient vectors produced by :meth:`transform`.

    Returns
    -------
    Xr : np.ndarray, shape ``(N, H, W)``
        Reconstructed images.
    """
    import pywt

    result = []
    for xi in X:  # xi shape: (H*W,)
        arr = xi.reshape(self.coeff_shape_)  # restore 2-D coefficient array
        coeffs = pywt.array_to_coeffs(
            arr, self.coeff_slices_, output_format="wavedec2"
        )
        img = pywt.waverec2(coeffs, self.wavelet, mode=self.mode)  # (H, W)
        result.append(img)
    return np.array(result)  # (N, H, W)

Xarray Integration

rbig.XarrayRBIG

RBIG model with an xarray-aware interface.

Wraps an AnnealedRBIG (or compatible class) so that it can be fitted and applied directly to xarray.DataArray / xarray.Dataset objects with spatiotemporal dimensions. The underlying model operates on a 2-D (samples, features) matrix obtained via xr_st_to_matrix.
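Conceptually, a (time, lat, lon) cube becomes a (samples, features) matrix by flattening the non-sample dimensions, and the original shape is enough to undo the flattening afterwards. A minimal NumPy sketch of this convention (the exact metadata handling inside xr_st_to_matrix is internal, and treating time as the sample axis is an assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
cube = rng.standard_normal((30, 4, 5))  # (time, lat, lon)

# Treat time as the sample axis; flatten the spatial dims into features.
matrix = cube.reshape(cube.shape[0], -1)  # (samples, features) = (30, 20)

# Keeping the original shape is enough to invert the flattening, which is
# the role played by the stored metadata after a transform.
restored = matrix.reshape(cube.shape)

assert matrix.shape == (30, 20)
assert np.array_equal(restored, cube)
```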

Parameters

n_layers : int, default 100
    Maximum number of RBIG layers.
strategy : list or None, default None
    Rotation strategy list passed to the underlying RBIG model. If None, the default rotation of the model class is used.
tol : float, default 1e-5
    Convergence tolerance for early stopping.
random_state : int or None, default None
    Random seed for reproducibility.
rbig_class : class or None, default None
    RBIG model class to instantiate. Defaults to AnnealedRBIG when None.
rbig_kwargs : dict or None, default None
    Additional keyword arguments forwarded to rbig_class.
verbose : bool or int, default False
    Controls progress bar display. Passed through to the underlying RBIG model.

Attributes

model_ : AnnealedRBIG
    The fitted underlying RBIG model.
meta_ : dict
    xarray metadata captured during fit, used to reconstruct output arrays.

Examples

>>> import numpy as np
>>> import xarray as xr
>>> rng = np.random.default_rng(0)
>>> da = xr.DataArray(
...     rng.standard_normal((30, 4, 5)),
...     dims=["time", "lat", "lon"],
... )
>>> xrbig = XarrayRBIG(n_layers=10, random_state=0)
>>> # info = xrbig.fit(da)
>>> # da_t = xrbig.transform(da)
Source code in rbig/_src/xarray_st.py
class XarrayRBIG:
    """RBIG model with an xarray-aware interface.

    Wraps an :class:`~rbig._src.model.AnnealedRBIG` (or compatible class) so
    that it can be fitted and applied directly to :class:`xarray.DataArray` /
    :class:`xarray.Dataset` objects with spatiotemporal dimensions.  The
    underlying model operates on a 2-D ``(samples, features)`` matrix obtained
    via :func:`xr_st_to_matrix`.

    Parameters
    ----------
    n_layers : int, default 100
        Maximum number of RBIG layers.
    strategy : list or None, default None
        Rotation strategy list passed to the underlying RBIG model.  If
        ``None``, the default rotation of the model class is used.
    tol : float, default 1e-5
        Convergence tolerance for early stopping.
    random_state : int or None, default None
        Random seed for reproducibility.
    rbig_class : class or None, default None
        RBIG model class to instantiate.  Defaults to
        :class:`~rbig._src.model.AnnealedRBIG` when ``None``.
    rbig_kwargs : dict or None, default None
        Additional keyword arguments forwarded to ``rbig_class``.
    verbose : bool or int, default=False
        Controls progress bar display.  Passed through to the underlying
        RBIG model.

    Attributes
    ----------
    model_ : AnnealedRBIG
        The fitted underlying RBIG model.
    meta_ : dict
        xarray metadata captured during :meth:`fit`, used to reconstruct
        output arrays.

    Examples
    --------
    >>> import numpy as np
    >>> import xarray as xr
    >>> rng = np.random.default_rng(0)
    >>> da = xr.DataArray(
    ...     rng.standard_normal((30, 4, 5)),
    ...     dims=["time", "lat", "lon"],
    ... )
    >>> xrbig = XarrayRBIG(n_layers=10, random_state=0)
    >>> # info = xrbig.fit(da)
    >>> # da_t = xrbig.transform(da)
    """

    def __init__(
        self,
        n_layers: int = 100,
        strategy: list | None = None,
        tol: float = 1e-5,
        random_state: int | None = None,
        rbig_class=None,
        rbig_kwargs: dict | None = None,
        verbose: bool | int = False,
    ):
        self.n_layers = n_layers
        self.strategy = strategy
        self.tol = tol
        self.random_state = random_state
        self.rbig_class = rbig_class
        self.rbig_kwargs = rbig_kwargs or {}
        self.verbose = verbose

    def fit(self, X) -> dict:
        """Fit the RBIG model to xarray data and return an information summary.

        Parameters
        ----------
        X : xr.DataArray or xr.Dataset
            Input spatiotemporal data.  Internally converted to a 2-D matrix
            via :func:`xr_st_to_matrix`.

        Returns
        -------
        info : dict
            Dictionary of RBIG information metrics (e.g. total correlation,
            entropy estimates) as returned by
            :func:`~rbig._src.metrics.information_summary`.
        """
        from rbig._src.metrics import information_summary
        from rbig._src.model import AnnealedRBIG

        rbig_cls = self.rbig_class if self.rbig_class is not None else AnnealedRBIG
        kwargs = {
            "n_layers": self.n_layers,
            "tol": self.tol,
            "random_state": self.random_state,
        }
        if self.strategy is not None:
            kwargs["strategy"] = self.strategy
        kwargs.update(self.rbig_kwargs)
        kwargs["verbose"] = self.verbose

        # Convert xarray → (n_samples, n_features) matrix and store metadata
        matrix, self.meta_ = xr_st_to_matrix(X)
        self.model_ = rbig_cls(**kwargs)
        self.model_.fit(matrix)
        return information_summary(self.model_, matrix)

    def transform(self, X):
        """Gaussianise samples and return an xarray object.

        Applies the fitted RBIG transform to ``X``, then reconstructs the
        original xarray structure.  Original coordinates and DataArray name
        are re-attached when possible.

        Parameters
        ----------
        X : xr.DataArray or xr.Dataset
            Data to transform.  Must have the same structure as the data
            passed to :meth:`fit`.

        Returns
        -------
        out : xr.DataArray or xr.Dataset
            Gaussianised data with the same shape and dimension names as ``X``.
        """
        matrix, _ = xr_st_to_matrix(X)
        Xt = self.model_.transform(matrix)
        out = matrix_to_xr_st(Xt, self.meta_)
        # Re-attach original xarray coordinates and name when available
        if hasattr(X, "assign_coords") and hasattr(X, "coords"):
            try:
                out = out.assign_coords(X.coords)
            except Exception:
                pass
        if hasattr(X, "name") and hasattr(out, "name") and X.name is not None:
            try:
                out.name = X.name
            except Exception:
                pass
        return out

    def score_samples(self, X):
        """Compute per-sample log-probability log p(x).

        Parameters
        ----------
        X : xr.DataArray or xr.Dataset
            Input data.

        Returns
        -------
        log_prob : np.ndarray, shape ``(n_samples,)``
            Log-probability of each sample under the fitted RBIG model.
        """
        matrix, _ = xr_st_to_matrix(X)
        return self.model_.score_samples(matrix)

    def mutual_information(self, X, Y) -> float:
        """Estimate mutual information between two xarray variables via RBIG.

        Fits independent RBIG models to ``X``, ``Y``, and their concatenation
        ``[X, Y]``, then computes:

        .. math::

            \\mathrm{MI}(X;\\,Y)
            = H(X) + H(Y) - H(X,\\,Y)

        where each differential entropy :math:`H` is estimated from the RBIG
        log-determinant accumulation.

        Parameters
        ----------
        X : xr.DataArray or xr.Dataset
            First variable.
        Y : xr.DataArray or xr.Dataset
            Second variable.  Must have the same number of samples as ``X``
            after flattening.

        Returns
        -------
        mi : float
            Estimated mutual information in nats.

        Notes
        -----
        All three RBIG models share the same ``n_layers``, ``tol``, and
        ``random_state`` settings as the parent :class:`XarrayRBIG` instance.

        Examples
        --------
        >>> import numpy as np
        >>> import xarray as xr
        >>> rng = np.random.default_rng(0)
        >>> x = xr.DataArray(rng.standard_normal((50, 1)), dims=["time", "f"])
        >>> y = xr.DataArray(rng.standard_normal((50, 1)), dims=["time", "f"])
        >>> xrbig = XarrayRBIG(n_layers=5, random_state=0)
        >>> # mi = xrbig.mutual_information(x, y)
        """
        from rbig._src.metrics import entropy_rbig
        from rbig._src.model import AnnealedRBIG

        rbig_cls = self.rbig_class if self.rbig_class is not None else AnnealedRBIG
        kwargs = {
            "n_layers": self.n_layers,
            "tol": self.tol,
            "random_state": self.random_state,
        }
        kwargs.update(self.rbig_kwargs)

        # Flatten both variables to 2-D matrices
        X_mat, _ = xr_st_to_matrix(X)
        Y_mat, _ = xr_st_to_matrix(Y)
        XY_mat = np.hstack([X_mat, Y_mat])  # joint representation (n, dx + dy)

        # Fit three separate RBIG models for H(X), H(Y), H(X,Y)
        mx = rbig_cls(**kwargs).fit(X_mat)
        my = rbig_cls(**kwargs).fit(Y_mat)
        mxy = rbig_cls(**kwargs).fit(XY_mat)

        hx = entropy_rbig(mx, X_mat)  # H(X)
        hy = entropy_rbig(my, Y_mat)  # H(Y)
        hxy = entropy_rbig(mxy, XY_mat)  # H(X, Y)
        # MI(X;Y) = H(X) + H(Y) - H(X,Y)
        return float(hx + hy - hxy)

fit(X)

Fit the RBIG model to xarray data and return an information summary.

Parameters

X : xr.DataArray or xr.Dataset
    Input spatiotemporal data. Internally converted to a 2-D matrix via xr_st_to_matrix.

Returns

info : dict
    Dictionary of RBIG information metrics (e.g. total correlation, entropy estimates) as returned by information_summary.

Source code in rbig/_src/xarray_st.py
def fit(self, X) -> dict:
    """Fit the RBIG model to xarray data and return an information summary.

    Parameters
    ----------
    X : xr.DataArray or xr.Dataset
        Input spatiotemporal data.  Internally converted to a 2-D matrix
        via :func:`xr_st_to_matrix`.

    Returns
    -------
    info : dict
        Dictionary of RBIG information metrics (e.g. total correlation,
        entropy estimates) as returned by
        :func:`~rbig._src.metrics.information_summary`.
    """
    from rbig._src.metrics import information_summary
    from rbig._src.model import AnnealedRBIG

    rbig_cls = self.rbig_class if self.rbig_class is not None else AnnealedRBIG
    kwargs = {
        "n_layers": self.n_layers,
        "tol": self.tol,
        "random_state": self.random_state,
    }
    if self.strategy is not None:
        kwargs["strategy"] = self.strategy
    kwargs.update(self.rbig_kwargs)
    kwargs["verbose"] = self.verbose

    # Convert xarray → (n_samples, n_features) matrix and store metadata
    matrix, self.meta_ = xr_st_to_matrix(X)
    self.model_ = rbig_cls(**kwargs)
    self.model_.fit(matrix)
    return information_summary(self.model_, matrix)

transform(X)

Gaussianise samples and return an xarray object.

Applies the fitted RBIG transform to X, then reconstructs the original xarray structure. Original coordinates and DataArray name are re-attached when possible.

Parameters

X : xr.DataArray or xr.Dataset
    Data to transform. Must have the same structure as the data passed to fit.

Returns

out : xr.DataArray or xr.Dataset
    Gaussianised data with the same shape and dimension names as X.

Source code in rbig/_src/xarray_st.py
def transform(self, X):
    """Gaussianise samples and return an xarray object.

    Applies the fitted RBIG transform to ``X``, then reconstructs the
    original xarray structure.  Original coordinates and DataArray name
    are re-attached when possible.

    Parameters
    ----------
    X : xr.DataArray or xr.Dataset
        Data to transform.  Must have the same structure as the data
        passed to :meth:`fit`.

    Returns
    -------
    out : xr.DataArray or xr.Dataset
        Gaussianised data with the same shape and dimension names as ``X``.
    """
    matrix, _ = xr_st_to_matrix(X)
    Xt = self.model_.transform(matrix)
    out = matrix_to_xr_st(Xt, self.meta_)
    # Re-attach original xarray coordinates and name when available
    if hasattr(X, "assign_coords") and hasattr(X, "coords"):
        try:
            out = out.assign_coords(X.coords)
        except Exception:
            pass
    if hasattr(X, "name") and hasattr(out, "name") and X.name is not None:
        try:
            out.name = X.name
        except Exception:
            pass
    return out

score_samples(X)

Compute per-sample log-probability log p(x).

Parameters

X : xr.DataArray or xr.Dataset
    Input data.

Returns

log_prob : np.ndarray, shape (n_samples,)
    Log-probability of each sample under the fitted RBIG model.
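This is the change-of-variables formula from the model overview, log p(x) = log p_Z(f(x)) + log|det J_f(x)|. A minimal NumPy sketch of that bookkeeping, assuming the Gaussianised samples z = f(x) and the accumulated per-sample log-determinants are already available (which is what the fitted layers provide):

```python
import numpy as np

def log_prob_from_flow(z, log_det):
    """log p(x) = log N(z; 0, I) + log|det J_f(x)|, per sample.

    z       : (N, D) Gaussianised samples, z = f(x)
    log_det : (N,)   accumulated log|det J| over all fitted layers
    """
    d = z.shape[1]
    # Standard multivariate normal log-density of each row of z.
    log_pz = -0.5 * (z**2).sum(axis=1) - 0.5 * d * np.log(2 * np.pi)
    return log_pz + log_det

# Sanity check: z = 0 with an identity flow (log_det = 0) gives the mode
# density of a standard 2-D Gaussian, i.e. log(1 / (2*pi)).
lp = log_prob_from_flow(np.zeros((1, 2)), np.zeros(1))
assert np.isclose(lp[0], -np.log(2 * np.pi))
```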

Source code in rbig/_src/xarray_st.py
def score_samples(self, X):
    """Compute per-sample log-probability log p(x).

    Parameters
    ----------
    X : xr.DataArray or xr.Dataset
        Input data.

    Returns
    -------
    log_prob : np.ndarray, shape ``(n_samples,)``
        Log-probability of each sample under the fitted RBIG model.
    """
    matrix, _ = xr_st_to_matrix(X)
    return self.model_.score_samples(matrix)

mutual_information(X, Y)

Estimate mutual information between two xarray variables via RBIG.

Fits independent RBIG models to X, Y, and their concatenation [X, Y], then computes

    MI(X; Y) = H(X) + H(Y) − H(X, Y)

where each differential entropy H is estimated from the RBIG log-determinant accumulation.

Parameters

X : xr.DataArray or xr.Dataset
    First variable.
Y : xr.DataArray or xr.Dataset
    Second variable. Must have the same number of samples as X after flattening.

Returns

mi : float
    Estimated mutual information in nats.

Notes

All three RBIG models share the same n_layers, tol, and random_state settings as the parent XarrayRBIG instance.
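As a sanity check of the identity itself (independent of RBIG), the Gaussian closed form can be recovered from analytic entropies: for unit-variance jointly Gaussian X and Y with correlation ρ, MI = −½ ln(1 − ρ²). A NumPy sketch using the analytic Gaussian entropy formula rather than the RBIG estimators:

```python
import numpy as np

def gaussian_entropy(cov):
    """Differential entropy (nats) of N(0, cov): 0.5 * ln((2*pi*e)^d |cov|)."""
    cov = np.atleast_2d(cov)
    d = cov.shape[0]
    return 0.5 * (d * np.log(2 * np.pi * np.e) + np.log(np.linalg.det(cov)))

rho = 0.8
joint = np.array([[1.0, rho], [rho, 1.0]])

hx = gaussian_entropy([[1.0]])  # H(X)
hy = gaussian_entropy([[1.0]])  # H(Y)
hxy = gaussian_entropy(joint)   # H(X, Y)
mi = hx + hy - hxy              # MI(X; Y) = H(X) + H(Y) - H(X, Y)

# Closed form for a correlated Gaussian pair: -0.5 * ln(1 - rho**2).
assert np.isclose(mi, -0.5 * np.log(1 - rho**2))
```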

Examples

>>> import numpy as np
>>> import xarray as xr
>>> rng = np.random.default_rng(0)
>>> x = xr.DataArray(rng.standard_normal((50, 1)), dims=["time", "f"])
>>> y = xr.DataArray(rng.standard_normal((50, 1)), dims=["time", "f"])
>>> xrbig = XarrayRBIG(n_layers=5, random_state=0)
>>> # mi = xrbig.mutual_information(x, y)

Source code in rbig/_src/xarray_st.py
def mutual_information(self, X, Y) -> float:
    """Estimate mutual information between two xarray variables via RBIG.

    Fits independent RBIG models to ``X``, ``Y``, and their concatenation
    ``[X, Y]``, then computes:

    .. math::

        \\mathrm{MI}(X;\\,Y)
        = H(X) + H(Y) - H(X,\\,Y)

    where each differential entropy :math:`H` is estimated from the RBIG
    log-determinant accumulation.

    Parameters
    ----------
    X : xr.DataArray or xr.Dataset
        First variable.
    Y : xr.DataArray or xr.Dataset
        Second variable.  Must have the same number of samples as ``X``
        after flattening.

    Returns
    -------
    mi : float
        Estimated mutual information in nats.

    Notes
    -----
    All three RBIG models share the same ``n_layers``, ``tol``, and
    ``random_state`` settings as the parent :class:`XarrayRBIG` instance.

    Examples
    --------
    >>> import numpy as np
    >>> import xarray as xr
    >>> rng = np.random.default_rng(0)
    >>> x = xr.DataArray(rng.standard_normal((50, 1)), dims=["time", "f"])
    >>> y = xr.DataArray(rng.standard_normal((50, 1)), dims=["time", "f"])
    >>> xrbig = XarrayRBIG(n_layers=5, random_state=0)
    >>> # mi = xrbig.mutual_information(x, y)
    """
    from rbig._src.metrics import entropy_rbig
    from rbig._src.model import AnnealedRBIG

    rbig_cls = self.rbig_class if self.rbig_class is not None else AnnealedRBIG
    kwargs = {
        "n_layers": self.n_layers,
        "tol": self.tol,
        "random_state": self.random_state,
    }
    kwargs.update(self.rbig_kwargs)

    # Flatten both variables to 2-D matrices
    X_mat, _ = xr_st_to_matrix(X)
    Y_mat, _ = xr_st_to_matrix(Y)
    XY_mat = np.hstack([X_mat, Y_mat])  # joint representation (n, dx + dy)

    # Fit three separate RBIG models for H(X), H(Y), H(X,Y)
    mx = rbig_cls(**kwargs).fit(X_mat)
    my = rbig_cls(**kwargs).fit(Y_mat)
    mxy = rbig_cls(**kwargs).fit(XY_mat)

    hx = entropy_rbig(mx, X_mat)  # H(X)
    hy = entropy_rbig(my, Y_mat)  # H(Y)
    hxy = entropy_rbig(mxy, XY_mat)  # H(X, Y)
    # MI(X;Y) = H(X) + H(Y) - H(X,Y)
    return float(hx + hy - hxy)