Fair learning with frozen Gaussianization flows
Overview, reading order, and status
This sub-project replaces the CKA fairness penalty in
keras-fairkl with a
family of penalties built from a frozen Gaussianization flow. The
flow is trained once, frozen, and reused as a differentiable
Gaussian-space probe inside any downstream predictor’s optimisation
loop.
The one-paragraph pitch
A Gaussianization flow turns arbitrary marginals into approximately standard normals while preserving all statistical dependence. Once trained and frozen, the flow acts as a fixed, scale-normalising, differentiable preprocessor: it absorbs the kernel and bandwidth choices of CKA and HSIC into its mixture-CDF parameters, and turns “measure non-linear dependence between two variables” into “measure linear dependence between near-Gaussian variables.” Three concrete penalties exploit this: G-XCOV, G-MI, and G-TC.
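The effect described above can be illustrated with a minimal sketch. It does not use the project's flow: as an assumption for illustration, a rank-based empirical-CDF-plus-probit transform (the hypothetical helper `rank_gaussianize`) stands in for a trained mixture-CDF flow, and unlike a flow it is not differentiable. The toy data and the chosen coefficients are also assumptions.

```python
from statistics import NormalDist

import numpy as np


def rank_gaussianize(x):
    """Map a 1-D sample to roughly N(0, 1): empirical CDF, then probit.

    Illustrative stand-in for a trained flow; not differentiable.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    ranks = np.argsort(np.argsort(x)) + 1   # ranks 1..n
    u = ranks / (n + 1.0)                   # strictly inside (0, 1)
    probit = NormalDist().inv_cdf           # inverse standard-normal CDF
    return np.array([probit(p) for p in u])


rng = np.random.default_rng(0)
s = rng.normal(size=2000)
# y depends on s monotonically but strongly non-linearly (log-normal scale).
y = np.exp(2.0 * s + 0.3 * rng.normal(size=2000))

z_s, z_y = rank_gaussianize(s), rank_gaussianize(y)

# Pearson correlation only sees linear dependence, so it understates the
# link on the raw scale and recovers it after marginal Gaussianization.
corr_raw = np.corrcoef(s, y)[0, 1]
corr_gauss = np.corrcoef(z_s, z_y)[0, 1]
print(f"Pearson on raw data:          {corr_raw:.3f}")
print(f"Pearson on Gaussianised data: {corr_gauss:.3f}")
```

Because the transform acts on each variable separately and is monotone, it changes the marginals but not the dependence structure, which is why a purely linear statistic becomes informative afterwards.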
Reading order
| # | Page | What it is for |
|---|---|---|
| 1 | Fair learning with frozen Gaussianization flows — design doc | Design doc. Mental model, math, hypotheses, experiment plan, risks, milestones. Read first. |
| 2 | Pretrain & freeze a Gaussianization flow | Notebook 05. Pretrain + freeze a flow on a 2-D dataset; four diagnostics that prove the flow Gaussianises, freezes, and inverts. |
| 3 | Fair MLP regression with a frozen Gaussianization flow | Notebook 06. Fair MLP regression on synthetic data; Pareto curve across G-XCOV, G-MI, G-TC, and CKA. |
| 4 | UCI Adult Census — real-data fair classification | Notebook 07. Same setup on UCI Adult Census; Pareto curves on AUC vs. DP-diff and EO-diff. |
| 5 | Fair Gaussianization — input-side follow-up experiments | Follow-up doc. Seven input-side alternatives that move the flow from the predictor’s output to its input / representation / data pipeline. |
The three penalties at a glance
Table 2: Output-side fairness penalties built on a frozen Gaussianization flow.
| Loss | Captures | Closed form? | Joint flow needed? |
|---|---|---|---|
| G-XCOV | 2nd-moment dependence in Gaussianised space (linear CKA) | yes | no (two marginal flows) |
| G-MI | MI assuming joint-Gaussian after Gaussianisation | yes | no (two marginal flows) |
| G-TC | Full MI / total correlation, no joint-Gaussian assumption | no (via flow NLL) | yes (one joint flow over both variables) |
All three are differentiable in the predictor’s parameters and plug
into FairModelWrapper via its fairness_loss=... argument. See
§4 of the design doc for the math, and §4.4 (the comparison table)
for a property-by-property comparison.
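To make the two closed-form rows of the table concrete, here is a hedged NumPy sketch of the textbook quantities they refer to. It is not the project's implementation: the function names `g_xcov` and `g_mi`, the `eps` ridge term, and the choice to leave the cross-covariance unnormalised (linear CKA would divide by the norms of the two auto-covariances) are all assumptions. Inputs are taken to be batches that have already passed through the frozen marginal flows, and a trainable version would use the deep-learning framework's differentiable ops instead of NumPy.

```python
import numpy as np


def _center(z):
    """Return z as a column-centred (n, d) array."""
    z = np.asarray(z, dtype=float)
    if z.ndim == 1:
        z = z[:, None]
    return z - z.mean(axis=0, keepdims=True)


def g_xcov(z_a, z_b):
    """Squared Frobenius norm of the sample cross-covariance (assumed form)."""
    a, b = _center(z_a), _center(z_b)
    c_ab = a.T @ b / (a.shape[0] - 1)
    return float(np.sum(c_ab ** 2))


def g_mi(z_a, z_b, eps=1e-6):
    """Mutual information under a joint-Gaussian assumption:

    0.5 * (log det C_aa + log det C_bb - log det C_joint).
    """
    a, b = _center(z_a), _center(z_b)
    joint = np.concatenate([a, b], axis=1)
    # Small ridge (assumption) keeps the log-determinants finite.
    c = joint.T @ joint / (a.shape[0] - 1) + eps * np.eye(joint.shape[1])
    d_a = a.shape[1]
    _, ld_joint = np.linalg.slogdet(c)
    _, ld_a = np.linalg.slogdet(c[:d_a, :d_a])
    _, ld_b = np.linalg.slogdet(c[d_a:, d_a:])
    return float(0.5 * (ld_a + ld_b - ld_joint))


rng = np.random.default_rng(0)
n = 20_000
z1 = rng.normal(size=n)
z2 = 0.8 * z1 + 0.6 * rng.normal(size=n)   # unit variance, correlation 0.8
z3 = rng.normal(size=n)                    # independent of z1

xcov_dep, xcov_ind = g_xcov(z1, z2), g_xcov(z1, z3)
mi_dep, mi_ind = g_mi(z1, z2), g_mi(z1, z3)
print(f"G-XCOV dependent / independent: {xcov_dep:.4f} / {xcov_ind:.4f}")
print(f"G-MI   dependent / independent: {mi_dep:.4f} / {mi_ind:.4f}")
```

For scalar variables with correlation ρ the Gaussian mutual information reduces to −½ log(1 − ρ²), so both quantities are driven to zero together when the Gaussianised variables decorrelate; G-MI simply penalises strong correlation more steeply.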
Status
| Status | Milestone | Acceptance |
|---|---|---|
| ✅ | Skeleton: fair/{losses,freeze,pretrain,metrics}.py + tests | pytest tests/test_fair.py green |
| ✅ | Notebook 05: pretrain + freeze + 4 diagnostics | Executed and committed |
| ✅ | Notebook 06: synthetic Pareto with G-XCOV vs CKA | Pareto curve from RMSE 0.11 → 1.35 |
| ✅ | Notebook 07: Adult Pareto with G-XCOV vs CKA | Pareto traced |
| 🟡 | G-MI + G-TC losses + tests | In flight |
| 🟡 | Notebooks 06/07 re-executed with G-MI and G-TC curves | Pending |
| ⏳ | H3 quadratic-dependence experiment (08_quadratic_dependence.ipynb) | Pending |
| ⏳ | Input-side follow-ups (see Fair Gaussianization — input-side follow-up experiments) | Pending |