Fair Gaussianization — input-side follow-up experiments
Seven alternatives that put the flow on inputs instead of outputs
0. Why a follow-up
The original experiment puts the Gaussianization flow on the predictor’s output side: it Gaussianises the prediction $\hat{y} = f_\theta(X)$ and the sensitive attribute $q$, and measures their dependence as a training-time fairness penalty. That works, but it has two structural problems we surfaced empirically in notebooks 06–07:
- Moving target on $\hat{y}$. The predictor’s output distribution shifts during training, so the output-side flow (pretrained on a baseline) goes off-support. Gradients keep flowing, but they encode “distance from the baseline distribution” rather than “distance from independence.” G-TC’s constant-predictor collapse is the symptom.
- Inputs are untouched. $X$ never moves during predictor training, so a flow on $X$ is always in-support. We’re not using that.
This doc proposes seven follow-up experiments that move the flow’s role around the pipeline. Per Table 1, six of them run the flow offline (A–D as preprocessing, E as data preparation, F as weight preparation) and one (G, a stretch goal) runs it at training time with periodic refresh. Each comes with math, pseudocode, an explicit “ask” of what new infrastructure is needed, honest tradeoffs, and a falsifiable hypothesis.
1. TL;DR — the seven approaches at a glance
Table 1: The seven follow-up approaches. Each row is a separate downstream-training recipe; the flow’s job changes from row to row.
| Approach | Flow’s job | When it runs | Predictor sees | Fairness mechanism |
|---|---|---|---|---|
| A. Input whitening | Gaussianises features | Offline | $T_X(X)$ | indirect (better conditioning) |
| B. Fair feature selection | Per-feature dependence score | Offline | $X_{S_K}$ (top-$K$ independent) | hard subset selection |
| C. Subspace projection | Joint $T_X$ + q-orthogonal projection | Offline | $T_X(X)\,P$ | hard linear projection |
| D. Conditional flow | Gaussianises $X$ given $q$ | Offline | $Z = T_{X \mid q}(X; q)$ | structural ($Z \perp q$ by construction) |
| E. Counterfactual augmentation | Generates $\tilde{X}$ with $q$ flipped | Offline (data prep) | $X$ and $\tilde{X}$ | data-level + consistency loss |
| F. Density-ratio reweighting | Estimates $p(X \mid q)$ | Offline (weight prep) | $X$ with weights $w_i$ | classical importance weighting |
| G. Representation bottleneck (stretch) | $T_h$ on encoder output $h$ | Training-time (with refresh) | encoded $h$, then head | soft penalty on intermediate representation |
All seven leave the predictor’s training-loop architecture unchanged (or nearly so) — the work happens before training, and the predictor sees a tweaked feature space or a weighted task loss. Compare to the original experiment, where the fairness logic was inside the optimisation loop. The follow-ups are easier to compose, easier to debug, and don’t carry the moving-target risk.
Notation throughout: $X$ inputs, $y$ targets, $q$ sensitive attribute, $f_\theta$ the trainable predictor.
2. Approach A — Input whitening
The simplest and most boring of the seven. A frozen Gaussianization flow serves as a preprocessor, a direct descendant of RBIG-style whitening (Laparra et al., 2011), with no fairness machinery added on top.
2.1 Math
Pretrain a joint Gaussianization flow
$$T_X : \mathbb{R}^d \to \mathbb{R}^d, \qquad T_X(X) \sim \mathcal{N}(0, I_d) \ \text{(approximately)}.$$
Freeze. The predictor sees Gaussianised inputs:
$$\hat{y} = f_\theta\big(T_X(X)\big).$$
This is information-preserving ($T_X$ is a diffeomorphism, so $T_X(X)$ carries the same information as $X$), but the predictor operates on features with controlled marginals and a closer-to-isotropic joint.
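To make the information-preserving claim concrete, here is a minimal one-dimensional sketch. It is a stand-in under stated assumptions, not the library's flow: it uses the empirical rank transform followed by the standard-normal quantile function, whereas the $T_X$ described above is a learned, parametric, multi-dimensional flow. The helper names `rank_gaussianize` and `skewness` are hypothetical.

```python
import numpy as np
from statistics import NormalDist


def rank_gaussianize(x):
    """Map a 1-D sample to approximately N(0, 1) via ranks and the probit.

    Hypothetical stand-in for a fitted 1-D Gaussianization flow.
    """
    n = len(x)
    ranks = np.argsort(np.argsort(x))        # 0 .. n-1 (no ties for continuous data)
    u = (ranks + 0.5) / n                    # empirical CDF values, strictly inside (0, 1)
    inv_cdf = NormalDist().inv_cdf           # standard-normal quantile function
    return np.array([inv_cdf(p) for p in u])


def skewness(v):
    """Sample skewness (third standardised moment)."""
    v = v - v.mean()
    return float(np.mean(v ** 3) / np.mean(v ** 2) ** 1.5)


rng = np.random.default_rng(0)
x = rng.lognormal(mean=0.0, sigma=1.0, size=2000)   # heavy right tail
z = rank_gaussianize(x)

skew_before = skewness(x)
skew_after = skewness(z)
# A strictly increasing map keeps the sample ordering, so nothing is lost.
order_preserved = bool(np.all(np.argsort(x) == np.argsort(z)))
```

Because the map is strictly increasing, the ordering of the sample is unchanged, while the heavy right tail of the log-normal input is replaced by a symmetric, unit-scale marginal.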
2.2 Pipeline
2.3 Pseudocode

import keras
from gaussianization.fair import fit_and_freeze

# Stage 1: pretrain the joint flow once, then freeze it
T_X, _ = fit_and_freeze(
    X_train, num_blocks=8, num_components=12, epochs=200, seed=0,
)

# Stage 2: standard Keras training on Gaussianised inputs
X_train_whitened = T_X(X_train)
X_test_whitened = T_X(X_test)
mlp = keras.Sequential([
    keras.Input(shape=(d,)),
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(1),
])
mlp.compile(optimizer="adam", loss="mse")
mlp.fit(X_train_whitened, y_train, ...)

Or, if you want $T_X$ inside the predictor graph (so saliency explanations live in original-X space), wrap it as a frozen layer:

inputs = keras.Input(shape=(d,))
whitened = GaussianizationLayer(T_X)(inputs)  # NEW: thin wrapper
out = keras.layers.Dense(32, activation="relu")(whitened)
out = keras.layers.Dense(1)(out)
mlp = keras.Model(inputs, out)

2.4 Asks (new infrastructure)
| Item | Effort | Notes |
|---|---|---|
| gauss_keras.GaussianizationLayer(flow, trainable=False) | S (one wrapper) | A keras.layers.Layer that forwards through the flow and refuses gradient updates on its params. |
| Notebook 08_input_whitening_baseline.ipynb | M | Adult + synthetic; ablation of whitening on/off. |
No new losses. The existing fit_and_freeze handles the offline step.
2.5 Tradeoffs
Plus
- Pure preprocessing: composes with any predictor, any task loss, any other fairness method.
- Better-conditioned input tends to speed up training and removes the need for per-feature scale tuning.
- Helps when features have heavy tails or multi-modal marginals.
Minus
- Doesn’t reduce the fairness gap on its own. If the bias was already in the joint of $(X, q)$, $T_X$ preserves it.
- One extra forward pass per minibatch. Tiny on CPU; negligible on GPU.
- Interpretability cost: a “feature value” is now in Gaussianised space, so domain-meaningful values (age = 45) need an inverse pass to recover.
2.6 Hypothesis
3. Approach B — Fair feature selection
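Use per-feature Gaussianised dependence to rank features by their “q-leakage” and select the most-independent subset, a non-linear generalisation of the linear-CKA filter (Cortes et al., 2012). The flow’s job is to compute a score per feature, not to enter the predictor’s forward pass.
3.1 Math
For each feature dimension $i \in \{1, \dots, d\}$:
$$\rho_g^{(i)} = \text{G-XCOV}\big(T_i(X_i),\, T_q(q)\big),$$
where $T_i$ is a per-feature 1-D Gaussianization flow (or, more efficiently, the $i$-th marginal of a joint $T_X$). Rank features by $|\rho_g^{(i)}|$ ascending; select the top-$K$ smallest:
$$S_K = \big\{\, i : |\rho_g^{(i)}| \text{ is among the } K \text{ smallest} \,\big\}.$$
Train the predictor on $X_{S_K}$.
Soft variant. Replace the hard top-$K$ with a learnable sigmoid mask $m = \sigma(\alpha) \in (0, 1)^d$:
$$\mathcal{L}(\theta, \alpha) = \mathcal{L}_{\text{task}}\big(f_\theta(m \odot X),\, y\big) + \lambda \sum_{i=1}^{d} m_i\, |\rho_g^{(i)}| + \gamma \sum_{i=1}^{d} m_i,$$
with $\rho_g$ pre-computed (frozen). This is end-to-end differentiable in $\alpha$ and θ, with $\rho_g$ as a fixed weight vector.
Higher-order variant. Replace G-XCOV with G-MI or G-TC per feature; this picks up non-monotone dependence that the Pearson-corr analog (|cor(X_i, q)|) misses.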
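The scoring and hard top-$K$ selection described in this section can be sketched on synthetic data. This is an illustration under assumptions, not the proposed `score_features_g`: the per-feature flow is replaced by a rank-based Gaussianisation, the score is a squared correlation, and the binary $q$ is used raw. The helper names and the synthetic features are hypothetical.

```python
import numpy as np
from statistics import NormalDist


def rank_gaussianize(x):
    """Hypothetical 1-D stand-in for a per-feature flow: ranks, then probit."""
    ranks = np.argsort(np.argsort(x))
    u = (ranks + 0.5) / len(x)
    inv_cdf = NormalDist().inv_cdf
    return np.array([inv_cdf(p) for p in u])


def squared_corr(a, b):
    """Squared Pearson correlation between two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1] ** 2)


rng = np.random.default_rng(0)
n = 4000
q = rng.integers(0, 2, size=n).astype(float)      # binary sensitive attribute
X = np.column_stack([
    np.exp(2.0 * q + rng.normal(size=n)),         # X_0: heavy-tailed monotone proxy for q
    rng.normal(size=n),                           # X_1: independent of q
    rng.normal(size=n),                           # X_2: independent of q
])
d = X.shape[1]

# Linear baseline on raw features vs. score on Gaussianised features.
# q is binary, so it is used as-is: correlation is invariant to affine maps of q.
pearson_sq = np.array([squared_corr(X[:, i], q) for i in range(d)])
rho_g = np.array([squared_corr(rank_gaussianize(X[:, i]), q) for i in range(d)])

# Hard selection: keep the K features with the smallest score.
K = 2
S_K = np.argsort(np.abs(rho_g))[:K]
```

In this toy setup the heavy tail of `X[:, 0]` depresses the raw Pearson score, while the Gaussianised score stays high, so the proxy is the feature that gets dropped.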
3.2 Pipeline
3.3 Pseudocode
Hard selection:

import numpy as np
from gaussianization.fair import (
    fit_and_freeze,
    score_features_g,  # NEW
)

# Stage 1: per-feature flows + dependence scores
flows = [fit_and_freeze(X_train[:, i:i+1], ...)[0] for i in range(d)]
T_q, _ = fit_and_freeze(q_train.reshape(-1, 1), ...)
rho_g = score_features_g(X_train, q_train, flows, T_q, metric="g_xcov")
# rho_g : np.ndarray of shape (d,), values in [0, 1]

# Stage 2: select the K least q-dependent features
S_K = np.argsort(np.abs(rho_g))[:K]

# Stage 3: standard training on the selected subset
mlp = build_mlp(input_dim=K)
mlp.fit(X_train[:, S_K], y_train, ...)

Soft selection:

import keras
from keras import ops

# rho_g pre-computed as above; freeze it as a non-trainable constant
class FeatureMaskedMLP(keras.Model):
    def __init__(self, d, rho_g, lam=1.0, gamma=0.01):
        super().__init__()
        # Trainable logits for the mask; the sigmoid maps them into (0, 1)
        self.mask_logits = self.add_weight(shape=(d,), initializer="zeros")
        self.rho_g = ops.convert_to_tensor(rho_g, dtype="float32")
        self.mlp = build_mlp(d)
        self.lam, self.gamma = lam, gamma

    def call(self, x, training=False):
        m = ops.sigmoid(self.mask_logits)
        if training:
            # Penalise keeping q-dependent features open
            self.add_loss(self.lam * ops.sum(m * ops.abs(self.rho_g)))
            self.add_loss(self.gamma * ops.sum(m))  # sparsity
        return self.mlp(x * m)

3.4 Asks (new infrastructure)
| Item | Effort | Notes |
|---|---|---|
| gaussianization.fair.score_features_g(X, q, flows, T_q, metric) | S | Vectorised per-feature scoring; returns shape (d,). |
| gaussianization.fair.fit_marginals(X, ...) | S | Convenience: fit one 1-D flow per feature in parallel. |
| Notebook 08_fair_feature_selection.ipynb | L | Bar chart of per-feature dependence scores. |
No new losses; the soft-variant uses standard add_loss plumbing.
3.5 Tradeoffs
Plus
- Once features are selected, the predictor’s training loop has no fairness logic: it composes with any architecture, any optimiser.
- Catches non-linear proxies that the linear |Pearson| baseline misses. A monotone but strongly non-linear relationship is picked up by the default G-XCOV score. A non-monotone one (e.g. a feature with a U-shaped relationship to gender, where linear corr ≈ 0) needs the G-MI or G-TC variant, because a per-feature Gaussianisation is a monotone map and leaves a symmetric U-shape uncorrelated.
- Interpretable: one score per feature, easy to communicate to stakeholders.
- Naturally extends to soft selection with a sparsity-regularised mask, which is end-to-end differentiable.
Minus
- Hard selection loses information: some unfair features are also predictive, and dropping them costs accuracy.
- Per-feature flows ignore correlations across features. A bivariate proxy ((X_5, X_8) together leak $q$ but neither alone does) is invisible. Use a joint flow or HSIC-over-feature-blocks to catch this.
- Static: the dependence ranking is computed once and doesn’t adapt to which features the predictor actually uses.
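The bivariate-proxy limitation is easy to reproduce on synthetic data. The sketch below is a constructed example, not taken from the experiments: $q$ is defined as the sign agreement of two independent standard-normal features, so each feature alone is uncorrelated with $q$ while their product is not. Since the features are already Gaussian, a per-feature Gaussianisation would leave them essentially unchanged, so the raw squared correlation is used as the score.

```python
import numpy as np


def squared_corr(a, b):
    """Squared Pearson correlation between two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1] ** 2)


rng = np.random.default_rng(0)
n = 4000
x5 = rng.normal(size=n)
x8 = rng.normal(size=n)
# q depends only on the *pair*: 1 when the two features share a sign.
q = (x5 * x8 > 0).astype(float)

score_x5 = squared_corr(x5, q)          # per-feature score, close to 0
score_x8 = squared_corr(x8, q)          # per-feature score, close to 0
score_pair = squared_corr(x5 * x8, q)   # joint statistic, clearly non-zero
```

A per-feature ranking would rate both features as safe to keep, even though a predictor with access to both can recover $q$ exactly.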
3.6 Hypothesis
4. Approach C — Gaussianised subspace projection
Generalise B from “drop features” to “project out the q-direction in Gaussianised space”, a Gaussianisation analogue of fair PCA (Olfat & Aswani, 2019). Same flow, more powerful selection.
4.1 Math
Pretrain a joint flow $T_X$ with $Z = T_X(X) \in \mathbb{R}^d$. The flow’s rotation layers have implicitly chosen a basis for the Gaussianised latent space. In that basis, the “q-direction” is the cross-covariance:
$$C = \operatorname{Cov}\big(Z,\, T_q(q)\big) \in \mathbb{R}^{d \times d_q}.$$
Hard projection (single sensitive attribute, $d_q = 1$). The unit-length q-direction in $Z$-space is $u = C / \lVert C \rVert_2$. The orthogonal projection is
$$P = I_d - u u^\top.$$
By construction $\operatorname{Cov}(ZP,\, T_q(q)) = P^\top C = 0$: the linear component of dependence is zero. Because $Z$ and $T_q(q)$ are marginally Gaussian, this is most of the dependence.
Hard projection (multi-class, $d_q > 1$). SVD of $C$, project onto the orthogonal complement of its $r$ largest singular vectors. Strips the top-$r$ q-correlated directions.
Soft projection (learnable basis). Parameterise a linear map $P \in \mathbb{R}^{d \times k}$ with orthogonality constraint $P^\top P = I_k$, train end-to-end with task loss and a G-XCOV penalty on $ZP$:
$$\min_{\theta, P}\ \mathcal{L}_{\text{task}}\big(f_\theta(ZP),\, y\big) + \mu\, \text{G-XCOV}\big(ZP,\, T_q(q)\big) \quad \text{s.t.} \quad P^\top P = I_k.$$
The orthogonality constraint can be enforced via Stiefel-manifold optimisation or a soft penalty $\lambda_{\text{ortho}} \lVert P^\top P - I_k \rVert_F^2$.
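A minimal NumPy sketch of the hard projection, assuming the SVD-based construction described in this section. The function name matches the proposed `q_orthogonal_projection`, but this body is an illustration rather than the library implementation, and the latent `Z` here is simulated Gaussian data instead of the output of a fitted flow.

```python
import numpy as np


def cross_cov(A, B):
    """Sample cross-covariance between the columns of A (n, a) and B (n, b)."""
    Ac = A - A.mean(axis=0, keepdims=True)
    Bc = B - B.mean(axis=0, keepdims=True)
    return Ac.T @ Bc / (len(A) - 1)


def q_orthogonal_projection(Z, Q, rank=1):
    """Projector onto the complement of the top-`rank` q-correlated directions.

    Sketch only: takes the left singular vectors of Cov(Z, Q).
    """
    C = cross_cov(Z, Q)                               # (d, d_q)
    U, _, _ = np.linalg.svd(C, full_matrices=False)   # columns sorted by singular value
    U_r = U[:, :rank]
    return np.eye(Z.shape[1]) - U_r @ U_r.T           # symmetric (d, d) projector


rng = np.random.default_rng(0)
n, d = 2000, 4
Z = rng.normal(size=(n, d))                           # stand-in for T_X(X)
Q = (Z @ np.array([1.0, -0.5, 0.0, 0.0]) + 0.5 * rng.normal(size=n)).reshape(-1, 1)

P = q_orthogonal_projection(Z, Q, rank=1)
xcov_before = float(np.abs(cross_cov(Z, Q)).max())
xcov_after = float(np.abs(cross_cov(Z @ P, Q)).max())
```

The cross-covariance vanishes exactly on the sample used to estimate $C$; on held-out data it is only approximately zero, which is worth monitoring.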
4.2 Pipeline
4.3 Pseudocode
Hard projection:

import numpy as np
from keras import ops
from gaussianization.fair import fit_and_freeze, q_orthogonal_projection

T_X, _ = fit_and_freeze(X_train, num_blocks=8, ...)  # joint flow
T_q, _ = fit_and_freeze(q_train.reshape(-1, 1), ...)
Z_train = np.asarray(T_X(X_train))
Q_train = np.asarray(T_q(q_train.reshape(-1, 1)))
P = q_orthogonal_projection(Z_train, Q_train)  # NEW: (d, d) matrix

# Predictor sees the projected representation
mlp = build_mlp(input_dim=d)

def features(X):
    return ops.matmul(T_X(X), P)

mlp.fit(features(X_train), y_train, ...)

Soft variant (Stiefel-soft):
GaussianizedXCovLoss would re-apply T_X to its z_pred input and also shape-mismatch when k != d, so we compute the cross-covariance penalty directly on the already-projected $Z_p = ZP$ against $T_q(q)$:

import keras
from keras import ops

class FairProjMLP(keras.Model):
    def __init__(self, d, k, T_X, T_q, mu=1.0, ortho_lam=10.0):
        super().__init__()
        self.T_X, self.T_q = T_X, T_q
        self.P = self.add_weight(
            shape=(d, k), initializer="orthogonal"
        )
        self.mlp = build_mlp(input_dim=k)
        self.mu, self.ortho_lam = mu, ortho_lam

    def _xcov_penalty(self, Zp, q):
        # ||Cov(Zp, T_q(q))||_F^2 / (||Cov(Zp)||_F · ||Cov(T_q(q))||_F)
        qg = self.T_q(q)
        Zp_c = Zp - ops.mean(Zp, axis=0, keepdims=True)
        qg_c = qg - ops.mean(qg, axis=0, keepdims=True)
        n = ops.cast(ops.shape(Zp_c)[0], Zp_c.dtype)
        denom = ops.maximum(n - 1.0, 1.0)
        C = ops.matmul(ops.transpose(Zp_c), qg_c) / denom
        S_z = ops.matmul(ops.transpose(Zp_c), Zp_c) / denom
        S_q = ops.matmul(ops.transpose(qg_c), qg_c) / denom
        fz = ops.sqrt(ops.sum(S_z * S_z))
        fq = ops.sqrt(ops.sum(S_q * S_q))
        return ops.sum(C * C) / (fz * fq + 1e-12)

    def call(self, inputs, training=False):
        x, q = inputs["x"], inputs["q"]
        Z = self.T_X(x)
        Zp = ops.matmul(Z, self.P)
        if training:
            # Fairness penalty on the projected latent (k-dim, not d-dim)
            self.add_loss(self.mu * self._xcov_penalty(Zp, q))
            # Orthogonality constraint as soft penalty
            PtP = ops.matmul(ops.transpose(self.P), self.P)
            self.add_loss(self.ortho_lam *
                          ops.sum((PtP - ops.eye(self.P.shape[1])) ** 2))
        return self.mlp(Zp)

(For the eventual implementation, factor out _xcov_penalty as a free function; it’s the same linear-CKA computation used in GaussianizedXCovLoss, just without the leading $T_X$ pass.)
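As a rough picture of what the factored-out free function could look like, here is a NumPy mirror of the penalty formula in the comment above. It is a sketch for checking the arithmetic offline, not the Keras implementation: the name `xcov_penalty` and the `eps` argument are assumptions, and inputs are expected to be already Gaussianised 2-D arrays.

```python
import numpy as np


def xcov_penalty(Zp, Qg, eps=1e-12):
    """||Cov(Zp, Qg)||_F^2 / (||Cov(Zp)||_F * ||Cov(Qg)||_F) for 2-D arrays."""
    Zc = Zp - Zp.mean(axis=0, keepdims=True)
    Qc = Qg - Qg.mean(axis=0, keepdims=True)
    denom = max(len(Zp) - 1, 1)
    C = Zc.T @ Qc / denom            # cross-covariance, shape (k, d_q)
    S_z = Zc.T @ Zc / denom          # covariance of Zp, shape (k, k)
    S_q = Qc.T @ Qc / denom          # covariance of Qg, shape (d_q, d_q)
    # np.linalg.norm on a 2-D array defaults to the Frobenius norm.
    return float(np.sum(C * C) / (np.linalg.norm(S_z) * np.linalg.norm(S_q) + eps))


rng = np.random.default_rng(0)
n = 3000
Qg = rng.normal(size=(n, 1))
Z_indep = rng.normal(size=(n, 3))                    # no dependence on Qg
Z_leaky = Z_indep.copy()
Z_leaky[:, 0] = Qg[:, 0] + 0.5 * rng.normal(size=n)  # first coordinate leaks Qg

pen_indep = xcov_penalty(Z_indep, Qg)
pen_leaky = xcov_penalty(Z_leaky, Qg)

# With one column on each side the penalty reduces to a squared correlation.
pen_scalar = xcov_penalty(Z_leaky[:, :1], Qg)
corr_sq = float(np.corrcoef(Z_leaky[:, 0], Qg[:, 0])[0, 1] ** 2)
```

The scalar case is a convenient unit test for the Keras version: it should agree with the squared Pearson correlation up to the `eps` term.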
4.4 Asks
| Item | Effort | Notes |
|---|---|---|
| gaussianization.fair.q_orthogonal_projection(Z, Q, rank=1) | S | SVD-based; handles multi-dim $q$. |
| gaussianization.fair.FairProjModel (soft variant) | M | Composes GaussianizationLayer (Approach A) + Stiefel-soft trainable $P$. |
| Notebook 09_subspace_projection.ipynb | L | Hard vs soft on Adult; orthogonality monitoring. |
4.5 Tradeoffs
Plus
- Hard variant has closed form: no extra training, just one SVD.
- Captures the direction of dependence in $Z$-space, not merely dimensional selection. Stronger than B in two ways: (i) handles bivariate proxies (info leakage that lives across two original features), (ii) removes only the $q$-correlated component, keeping the rest of each feature.
- Composes with any of our existing losses (project first, then add G-XCOV on top, for defence in depth).
Minus
- The projection is linear in $Z$-space, which is non-linear in $X$-space (the flow is non-linear). So the “fair subspace” doesn’t have a clean interpretation in original feature units. Interpretation requires composing with $T_X^{-1}$.
- One direction at a time; multi-q (race in COMPAS) needs SVD or iterative.
- Information loss: removes $r$ dimensions of variance entirely. Predictive signal that happened to align with the $q$-direction is gone.
4.6 Hypothesis
5. Approach D — Conditional flow $T_{X \mid q}$
The most ambitious, and close in spirit to the conditional normalising flows of Winkler et al. (2019) and the invariant-representation objective of Moyer et al. (2018). A flow whose parameters depend on $q$, Gaussianising $X$ given $q$. The residual $Z$ is structurally independent of $q$.
5.1 Math
Train a conditional Gaussianization flow
$$T_{X \mid q} : \mathbb{R}^d \times \mathcal{Q} \to \mathbb{R}^d$$
such that for every value of $q$,
$$Z = T_{X \mid q}(X; q) \sim \mathcal{N}(0, I_d).$$
Freeze. Train a predictor on the residual $Z$:
$$\hat{y} = f_\theta(Z).$$
Mechanism. Coupling-layer Gaussianization with FiLM conditioning: each coupling layer’s conditioner MLP takes both the active half of $X$ and the conditioning $q$, and produces shift/scale parameters that depend on both. The marginal Gaussianization layers’ mixture-CDF parameters also become $q$-dependent (a small MLP from $q$ to mixture parameters).
Why it’s “by construction”. $Z$ has the same distribution for every value of $q$; that’s the training objective, and it holds to the extent the flow fits. So $p(Z \mid q) = p(Z)$, i.e. $Z \perp q$.
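The "same distribution for every value of $q$" idea can be illustrated with the simplest possible conditional flow. The sketch below is a deliberately crude stand-in: an affine map per group (per-group, per-feature standardisation) for a binary $q$, which only matches the first two moments across groups. The coupling-layer mixture-CDF flow described above would match the full distribution. The class name `AffineConditionalFlow` is hypothetical.

```python
import numpy as np


class AffineConditionalFlow:
    """Toy conditional flow for discrete q: standardise each group separately."""

    def fit(self, X, q):
        # One (mean, std) pair per group; these play the role of q-dependent parameters.
        self.params = {
            int(g): (X[q == g].mean(axis=0), X[q == g].std(axis=0))
            for g in np.unique(q)
        }
        return self

    def forward(self, X, q):
        Z = np.empty_like(X, dtype=float)
        for g, (mu, sigma) in self.params.items():
            Z[q == g] = (X[q == g] - mu) / sigma
        return Z


rng = np.random.default_rng(0)
n = 2000
q = rng.integers(0, 2, size=n)
# The two groups differ in both location and scale.
X = np.where(
    q[:, None] == 1,
    rng.normal(loc=[2.0, -1.0], scale=[3.0, 0.5], size=(n, 2)),
    rng.normal(loc=0.0, scale=1.0, size=(n, 2)),
)

flow = AffineConditionalFlow().fit(X, q)
Z = flow.forward(X, q)

gap_before = float(np.abs(X[q == 1].mean(axis=0) - X[q == 0].mean(axis=0)).max())
gap_after = float(np.abs(Z[q == 1].mean(axis=0) - Z[q == 0].mean(axis=0)).max())
std_after = np.array([Z[q == g].std(axis=0) for g in (0, 1)])
```

After the transform the group means and scales coincide on the training sample, which is the moment-level version of the $Z \perp q$ sanity check.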
5.2 Pipeline
5.3 Pseudocode

import keras
from gaussianization.gauss_keras.conditional import (
    ConditionalGaussianizationFlow,  # NEW class
)
from gaussianization.fair import freeze_flow

# Stage 1: pretrain conditional flow
T_xq = ConditionalGaussianizationFlow(
    input_dim=d,
    cond_dim=d_q,
    num_blocks=8,
    num_components=12,
)
# base_nll_loss: the flow's negative log-likelihood loss (its import is not shown here)
T_xq.compile(optimizer=keras.optimizers.Adam(1e-3), loss=base_nll_loss)
T_xq.fit(
    [X_train, q_train],  # input is a (data, condition) pair
    X_train,             # NLL target
    epochs=200,
    batch_size=256,
)
freeze_flow(T_xq)

# Stage 2: standard predictor on the residual
def residual(X, q):
    return T_xq([X, q])

mlp = build_mlp(input_dim=d)
mlp.compile(optimizer="adam", loss="mse")
mlp.fit(residual(X_train, q_train), y_train, ...)

5.4 Asks
| Item | Effort | Notes |
|---|---|---|
| gauss_keras.bijectors.MixtureCDFGaussianization accepts a condition input | M | FiLM-style conditioning on mixture params via a small head MLP. |
| gauss_keras.bijectors.MixtureCDFCoupling already takes a conditioner; extend its conditioner to accept q alongside the active half. | S | One arg change. |
| gauss_keras.flows.ConditionalGaussianizationFlow | M | Threads q through every layer; subclasses or wraps existing GaussianizationFlow. |
| fair.fit_and_freeze_conditional(X, q, ...) | S | Convenience helper. |
| Notebook 10_conditional_flow.ipynb | L | Comparison against A–C on Adult; $Z \perp q$ sanity check after freezing. |
This is the most invasive change: it touches the core gauss_keras
library, not just fair/. Worth doing because conditional flows are
broadly useful (density estimation conditional on covariates).
5.5 Tradeoffs
Plus
- Structurally enforces $Z \perp q$: the strongest fairness guarantee.
- The predictor cannot reverse-engineer $q$ from $Z$ no matter how hard it tries (the information is gone).
- Naturally handles continuous $q$ (the conditional flow’s parameters interpolate). Binary, multi-class, real-valued sensitive attributes are all handled the same way.
Minus
- Information loss is unbounded. Removes everything about $X$ that varies with $q$, including useful predictive signal. The “fair” representation is structurally pure but may be predictively poor.
- Most expensive to pretrain: per-$q$ Gaussianisation. The flow has to model $p(X \mid q)$ everywhere, not just $p(X)$.
- Architectural lift to gauss_keras is non-trivial.
- “Frozen” is now more fragile: at inference time we need to apply $T_{X \mid q}$ with the test $q$, and if the test $q$ has a value never seen during pretraining (continuous $q$, distribution shift), the flow is off-support.
5.6 Hypothesis
6. Approach E — Counterfactual sample augmentation
Use the (conditional) flow’s inverse pass to generate counterfactual $\tilde{X}$ with $q$ flipped, then train a predictor to make the same decision on both. Targets individual counterfactual fairness in the sense of Kusner et al. (2017), not just population-level statistics.
6.1 Math
For each training example $(X_i, q_i, y_i)$, define the counterfactual
$$\tilde{X}_i = T_{X \mid q}^{-1}\big(T_{X \mid q}(X_i;\, q_i);\; 1 - q_i\big),$$
i.e. Gaussianise $X_i$ given $q_i$, then invert the Gaussianisation under the opposite $q$-value. The result has the same “position in the Gaussianised latent” but the marginal of the opposite group.
(For continuous or multi-class $q$, swap $1 - q_i$ for a chosen reference value or sample of values.)
Augmented dataset: $\{(X_i, \tilde{X}_i, y_i)\}_{i=1}^{n}$. Train with a consistency loss:
$$\mathcal{L}(\theta) = \mathcal{L}_{\text{task}}\big(f_\theta(X),\, y\big) + \lambda\, \mathbb{E}\Big[\big(f_\theta(X) - f_\theta(\tilde{X})\big)^2\Big].$$
The consistency term explicitly says: an individual’s prediction must not change if you flip their sensitive attribute (Kusner et al. 2017, “Counterfactual Fairness”).
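The forward-then-inverse construction of the counterfactual can be sketched with a toy invertible conditional map. As a loud assumption, the conditional flow is replaced here by per-group standardisation for a binary $q$, so the "latent position" is just a per-group z-score. The function name mirrors the proposed `generate_counterfactuals`, but its signature and body are illustrative only.

```python
import numpy as np


def fit_group_stats(X, q):
    """Per-group (mean, std): the parameters of the toy affine conditional flow."""
    return {g: (X[q == g].mean(axis=0), X[q == g].std(axis=0)) for g in (0, 1)}


def generate_counterfactuals(stats, X, q):
    """Forward pass under the observed q, inverse pass under the flipped q."""
    X_tilde = np.empty_like(X, dtype=float)
    for g in (0, 1):
        mu, sigma = stats[g]
        mu_cf, sigma_cf = stats[1 - g]
        z = (X[q == g] - mu) / sigma             # latent position under own group
        X_tilde[q == g] = mu_cf + sigma_cf * z   # decode under the opposite group
    return X_tilde


rng = np.random.default_rng(0)
n = 2000
q = rng.integers(0, 2, size=n)
X = np.where(
    q[:, None] == 1,
    rng.normal(loc=[2.0, -1.0], scale=[3.0, 0.5], size=(n, 2)),
    rng.normal(loc=0.0, scale=1.0, size=(n, 2)),
)

stats = fit_group_stats(X, q)
X_tilde = generate_counterfactuals(stats, X, q)

# Flipping twice must return the original sample (the map is invertible).
X_back = generate_counterfactuals(stats, X_tilde, 1 - q)
```

The round trip is a cheap regression test for any real implementation: if flipping $q$ twice does not reproduce $X$, the inverse pass is not the true inverse of the forward pass.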
6.2 Pipeline
6.3 Pseudocode

import keras
from keras import ops
from gaussianization.fair import (
    fit_and_freeze_conditional,     # Approach D
    generate_counterfactuals,       # NEW
    CounterfactualConsistencyLoss,  # NEW
)

# Stage 1: pretrain a conditional flow (Approach D's machinery)
T_xq, _ = fit_and_freeze_conditional(X_train, q_train, ...)

# Stage 2: generate counterfactuals for the training set
X_tilde = generate_counterfactuals(T_xq, X_train, q_train)
# X_tilde[i] is the counterfactual of X_train[i] with q flipped

# Stage 3: train predictor with consistency loss
class FairCFMLP(keras.Model):
    def __init__(self, d, lam=1.0):
        super().__init__()
        self.mlp = build_mlp(d)
        self.lam = lam

    def call(self, inputs, training=False):
        x, x_tilde = inputs["x"], inputs["x_tilde"]
        y_hat = self.mlp(x)
        if training:
            # Same network, counterfactual input: penalise any change in output
            y_hat_tilde = self.mlp(x_tilde)
            self.add_loss(self.lam * ops.mean((y_hat - y_hat_tilde) ** 2))
        return y_hat

model = FairCFMLP(d, lam=1.0)
model.compile(optimizer="adam", loss="mse")
model.fit({"x": X_train, "x_tilde": X_tilde}, y_train, ...)

6.4 Asks
| Item | Effort | Notes |
|---|---|---|
| Approach D’s ConditionalGaussianizationFlow and its inverse | L | Big; see Approach D. |
| generate_counterfactuals(flow, X, q) | S | One forward + one inverse pass per batch; precomputable. |
| CounterfactualConsistencyLoss (or use plain add_loss as above) | S | Just an MSE between two predictor calls. |
| Notebook 11_counterfactual_fairness.ipynb | L | Visualise counterfactual quality (a few pairs); individual fairness metrics. |
Building blocks: needs Approach D first.
6.5 Tradeoffs
Plus
- Targets individual counterfactual fairness (“Would the same applicant get the same prediction if their gender were flipped?”), which population-level DP/EO doesn’t capture.
- Composes with any predictor architecture.
- Naturally handles continuous $q$: average the consistency loss over multiple counterfactual draws from $p(q)$.
Minus
- Needs an invertible flow with a faithful conditional. Counterfactuals are only as good as $T_{X \mid q}$’s coverage.
- Doubles dataset size (or doubles forward passes per step if precomputed). Manageable.
- “Counterfactual” is a fantasy when the actual joint $p(X, q)$ has no support at $(X_i, 1 - q_i)$. Adult example: if a feature like “occupation = mining engineer” essentially never co-occurs with female in the data, the counterfactual is extrapolating. Honest warning needed.
6.6 Hypothesis
7. Approach F — Density-ratio reweighting
Use the (conditional) flow’s log-density to estimate per-sample weights that rebalance the training set. Classical importance weighting in the style of Calders & Verwer (2010), with flow-based densities replacing the usual kernel-density or logistic-regression-style propensity estimates.
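7.1 Math
Two closely related formulations:
Group-balanced reweighting. Estimate the group-conditional density ratio against a target group $q^\star$:
$$w_i = \frac{p(X_i \mid q = q^\star)}{p(X_i \mid q = q_i)}.$$
That is, each group is reweighted to the target group’s feature distribution, so the weighted task loss is the same across groups, closing the population-level disparity without a fairness penalty.
Inverse-propensity reweighting. Use the flow to estimate $p(q \mid X)$ via Bayes, then $w_i \propto 1 / p(q_i \mid X_i)$.
Both forms come from the same set of densities; the choice is just which factorisation is more numerically stable.
The flow estimates $\log p(X \mid q)$ directly via the change-of-variables formula $\log p(X \mid q) = \log \mathcal{N}\big(T_{X \mid q}(X; q);\, 0, I_d\big) + \log \lvert \det J_{T_{X \mid q}}(X; q) \rvert$.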
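The group-balanced weights can be sketched once some estimate of $\log p(X \mid q)$ is available. In the sketch below the conditional flow is replaced, as an explicit simplification, by a 1-D Gaussian fitted per group; the function name follows the proposed `density_ratio_weights`, but the signature and body are assumptions for illustration.

```python
import numpy as np


def gaussian_logpdf(x, mu, sigma):
    """Log-density of N(mu, sigma^2); stand-in for the flow's log p(x | q)."""
    return -0.5 * np.log(2.0 * np.pi * sigma ** 2) - 0.5 * ((x - mu) / sigma) ** 2


def density_ratio_weights(x, q, target_q=0, clip=10.0):
    """w_i = p(x_i | q=target_q) / p(x_i | q=q_i), clipped, then normalised to mean 1."""
    stats = {int(g): (x[q == g].mean(), x[q == g].std()) for g in np.unique(q)}
    log_p_target = gaussian_logpdf(x, *stats[target_q])
    log_p_own = np.empty_like(x, dtype=float)
    for g, (mu, sigma) in stats.items():
        log_p_own[q == g] = gaussian_logpdf(x[q == g], mu, sigma)
    # Work in log space, exponentiate once, then clip the heavy upper tail.
    w = np.minimum(np.exp(log_p_target - log_p_own), clip)
    return w / w.mean()


rng = np.random.default_rng(0)
n = 10000
q = rng.integers(0, 2, size=n)
x = rng.normal(loc=1.0 * q, scale=1.0)      # group 1 is shifted by +1

w = density_ratio_weights(x, q, target_q=0, clip=10.0)

mean_g0 = float(x[q == 0].mean())
gap_before = abs(float(x[q == 1].mean()) - mean_g0)
gap_after = abs(float(np.average(x[q == 1], weights=w[q == 1])) - mean_g0)
```

Samples from the target group all receive the same weight, and the reweighted mean of the other group moves close to the target group's mean. Clipping introduces a small bias, as noted in the tradeoffs for this approach.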
7.2 Pipeline
7.3 Pseudocode

from gaussianization.fair import (
    fit_and_freeze_conditional,
    density_ratio_weights,  # NEW
)

# Stage 1: conditional flow gives log p(X | q)
T_xq, _ = fit_and_freeze_conditional(X_train, q_train, ...)

# Stage 2: per-sample weights w_i = p(X_i | q=0) / p(X_i | q=q_i)
w_train = density_ratio_weights(
    T_xq, X_train, q_train, target_q=0, clip=10.0,
)
# Returns shape (n,), positive, normalised to mean 1.

# Stage 3: standard weighted training
mlp = build_mlp(d)
mlp.compile(optimizer="adam", loss="mse")
mlp.fit(X_train, y_train, sample_weight=w_train, ...)

7.4 Asks
| Item | Effort | Notes |
|---|---|---|
| Conditional flow log-density (Approach D’s machinery) | L | Needed first. |
| density_ratio_weights(flow, X, q, target_q, clip) | S | One-liner once log-density is available; clipping handles tail. |
| Notebook 12_density_ratio_reweighting.ipynb | M | Pareto: G-XCOV penalty vs IPW weighting. |
7.5 Tradeoffs
Plus
- Classical importance-weighting machinery, well understood theoretically. Pearl & co. would approve.
- Composes with any task loss; just multiply by weights.
- One-time per-sample weight computation; training is otherwise vanilla.
- Direct attack on population-level fairness; can be cleanly combined with E (CF augmentation) for individual + population fairness.
Minus
- Per-sample weights can be high-variance for tails of $p(X \mid q)$. Need clipping (w_i = min(w_i, 10)), which biases the estimator.
- Effective sample size shrinks. If one group is very different from the target, you’re effectively training on the overlap.
- Needs robust flow log-density estimates. Mixture-CDF Gaussianisation gives them cleanly (analytic Jacobian), but a miscalibrated flow translates 1:1 into miscalibrated weights.
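One way to put a number on the shrinking effective sample size is Kish's formula, $(\sum_i w_i)^2 / \sum_i w_i^2$. It is a standard diagnostic rather than something the experiments above specify, and the helper name `effective_sample_size` is hypothetical.

```python
import numpy as np


def effective_sample_size(w):
    """Kish's effective sample size: (sum w)^2 / sum(w^2)."""
    w = np.asarray(w, dtype=float)
    return float(w.sum() ** 2 / np.sum(w ** 2))


n = 1000
# Uniform weights: every sample counts fully, so ESS equals n.
ess_uniform = effective_sample_size(np.ones(n))

# Ten samples sitting at the clip value of 10, the rest at 1.
w_skewed = np.ones(n)
w_skewed[:10] = 10.0
ess_skewed = effective_sample_size(w_skewed)   # about 597 of the 1000 samples
```

Reporting this alongside the clip value makes it visible when the reweighted training set is effectively much smaller than the raw one.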
7.6 Hypothesis
8. Approach G — Information-bottleneck on representations (stretch)
A seventh idea worth recording, even if it’s the most speculative: apply the fairness penalty to an intermediate layer of the predictor, not to its output.
8.1 Sketch
Most fair-representation literature does exactly this; see e.g. the VFAE of Louizos et al. (2016). An encoder $e_\phi : X \mapsto h$, a head $f_\theta : h \mapsto \hat{y}$, and a penalty on the bottleneck representation $h$. The encoder learns a representation that is task-useful but $q$-uninformative.
Drop in any of our existing losses on $h$ and $q$:
$$\mathcal{L}(\theta, \phi) = \mathcal{L}_{\text{task}}\big(f_\theta(e_\phi(X)),\, y\big) + \mu\, D\big(h,\, q\big), \qquad D \in \{\text{G-XCOV},\ \text{G-MI},\ \text{G-TC}\}.$$
For G-MI we need a flow $T_h$ on $h$, but $h$ is high-dim and moves during training, the same problem as the original output-side experiment. Two workarounds:
- Periodic refresh: refit $T_h$ on $h$ every $N$ epochs. Costs a few extra minutes; gives a fresh dependence probe.
- VAE-style structural fix: make $h$ Gaussian by construction (e.g. with a KL-to-$\mathcal{N}(0, I)$ regulariser on $h$, like a VAE encoder). Now $T_h \approx \mathrm{id}$ and G-XCOV reduces to plain linear cross-covariance on the bottleneck, which is cheap and, to the extent $h$ really is Gaussian, exact.
8.2 Tradeoffs (briefly)
Plus: composes with any downstream architecture; lets a small classifier head sit on top of a strongly-fair representation; the representation itself can be reused for multiple downstream tasks.
Minus: moving target on $h$ (same risk as the original experiment); requires architectural surgery on the predictor.
8.3 Hypothesis
9. Cross-cutting comparison matrix
| | A (whiten) | B (select) | C (project) | D (cond. flow) | E (CF aug.) | F (IPW) | G (bottleneck) |
|---|---|---|---|---|---|---|---|
| Flow on inputs? | ✅ joint | ✅ marginals | ✅ joint | ✅ conditional | ✅ conditional | ✅ conditional | ❌ representations |
| Flow frozen? | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ⚠️ refresh |
| Predictor sees fairness penalty during training? | ❌ | ❌ (soft: ✅) | ❌ (soft: ✅) | ❌ | ✅ (consistency) | ❌ | ✅ |
| Pretraining cost | low | medium (d marginals) | medium | high | high | high | medium |
| Information loss | none | hard | rank-$r$ | total q-conditioned | none | importance-weighted | task-driven |
| Granularity of fairness | n/a | population | population | structural | individual | population | representation |
| Needs new core (gauss_keras) infra? | thin layer | n/a | n/a | yes | yes | yes | maybe |
| Composes with G-XCOV / G-MI? | ✅ | ✅ | ✅ | redundant | ✅ | ✅ | ✅ |
| Effort estimate (S/M/L) | S | M | M | L | L | M | L |
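10. Recommended sequencing
The rounds, as reflected in the API summary in Section 11: Round 1 covers A, B and C (preprocessing only); Round 2 builds the conditional-flow infrastructure (D); Round 3 adds E and F on top of it; that leaves the stretch goal G for Round 4.
Round 1 alone is plausibly the strongest paper of the set: A+B+C give three preprocessing-only fairness baselines that the original output-side losses can be benchmarked against. If Round 1 is sufficient in practice, Rounds 2–4 become “did you really need to add a fairness loss at all?”, a sharp result either way.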
11. New library additions, summarised
If we eventually ship Rounds 1–3, the public API of gaussianization.fair grows by:

from gaussianization.fair import (
    # existing
    GaussianizedXCovLoss,
    GaussianizedMutualInfoLoss,
    GaussianizedTotalCorrelationLoss,
    fit_and_freeze,
    fit_and_freeze_joint,
    freeze_flow,
    is_fully_frozen,
    demographic_parity_difference,
    equalized_odds_difference,
    pearson_corr,
    # Round 1
    score_features_g,            # B
    q_orthogonal_projection,     # C
    fit_marginals,               # B convenience
    # Round 2 (infrastructure)
    fit_and_freeze_conditional,  # D
    # Round 3
    generate_counterfactuals,    # E
    density_ratio_weights,       # F
)

And gauss_keras grows:

from gaussianization.gauss_keras import (
    # existing
    GaussianizationFlow,
    make_gaussianization_flow,
    make_coupling_flow,
    # ...
    # Round 1
    GaussianizationLayer,            # A: frozen-flow wrapper as keras.Layer
    # Round 2
    ConditionalGaussianizationFlow,  # D
)

Six new public symbols in gaussianization.fair (score_features_g, q_orthogonal_projection, fit_marginals, fit_and_freeze_conditional, generate_counterfactuals, density_ratio_weights), two new classes in gauss_keras (GaussianizationLayer, a thin frozen-flow wrapper for Approach A; ConditionalGaussianizationFlow, the conditional flow for D/E/F), and one extended bijector (MixtureCDFGaussianization gains an optional condition input).
None of the existing API breaks.
12. Open questions
Joint flow vs marginal flows for Approach B. Per-feature flows are independent and parallelisable, but they ignore cross-feature structure. A joint flow scores features via its marginal projections but is harder to fit. Worth a small ablation in the B notebook.
Pre-vs-post Gaussianisation for $q$. All approaches assume $q$ has its own Gaussianisation flow $T_q$. For binary $q$ this is overkill: $T_q$ is essentially a sign-flip + scale. Is there an ablation showing $T_q$ matters? Or can we use raw $q$ for the sensitive side when $q \in \{0, 1\}$?
Continuous $q$ semantics. For age-as-sensitive-attribute (a continuous variable), what does “DP-diff” even mean? Approaches D and E need to specify a counterfactual policy: do we flip a 25-year-old to a 45-year-old, or to the average, or to the marginal distribution?
Flow capacity vs predictor capacity. All approaches assume the flow has “enough” capacity to faithfully model $p(X)$, $p(X \mid q)$, etc. An under-capacity flow gives bad scores / bad projections / bad counterfactuals, and the failure mode is silent (the predictor just inherits the flow’s blind spots). Possible mitigation: a diagnostic that monitors per-feature log-likelihood on held-out data.
Combining approaches. Nothing prevents stacking, e.g. whiten inputs (A), select features (B), project out the residual q-direction (C), and then add a small G-XCOV penalty for defence in depth. Does that compound, or does each subsequent step add nothing?
- Laparra, V., Camps-Valls, G., & Malo, J. (2011). Iterative Gaussianization: From ICA to Random Rotations. IEEE Transactions on Neural Networks, 22(4), 537–549. 10.1109/TNN.2011.2106511
- Cortes, C., Mohri, M., & Rostamizadeh, A. (2012). Algorithms for Learning Kernels Based on Centered Alignment. Journal of Machine Learning Research, 13, 795–828.
- Olfat, M., & Aswani, A. (2019). Convex Formulations for Fair Principal Component Analysis. AAAI Conference on Artificial Intelligence. 10.1609/aaai.v33i01.3301663
- Winkler, C., Worrall, D. E., Hoogeboom, E., & Welling, M. (2019). Learning Likelihoods with Conditional Normalizing Flows. arXiv Preprint arXiv:1912.00042.
- Moyer, D., Gao, S., Brekelmans, R., Galstyan, A., & Ver Steeg, G. (2018). Invariant Representations without Adversarial Training. Advances in Neural Information Processing Systems (NeurIPS).
- Kusner, M. J., Loftus, J. R., Russell, C., & Silva, R. (2017). Counterfactual Fairness. Advances in Neural Information Processing Systems (NeurIPS).
- Calders, T., & Verwer, S. (2010). Three Naive Bayes Approaches for Discrimination-Free Classification. Data Mining and Knowledge Discovery, 21(2), 277–292. 10.1007/s10618-010-0190-x
- Louizos, C., Swersky, K., Li, Y., Welling, M., & Zemel, R. (2016). The Variational Fair Autoencoder. International Conference on Learning Representations (ICLR).