Physics-aware SSH reconstruction — where physics enters the pathwise GP
Physics-aware SSH reconstruction¶
The pipeline in SSH reconstruction via GP pathwise sampling and Efficient machinery for SSH pathwise sampling — a tour of gaussx + pyrox is a physics-agnostic Gaussian process: a single Matérn × OU kernel encodes “the field is smoothly varying in space and time.” That is a strong baseline — essentially what classical DUACS optimal interpolation does[1] [2] — but it leaves on the table a half-century of ocean-dynamics knowledge.
This note organises how that knowledge gets injected. To frame it, recall the pathwise posterior formula from 00:
Every term in this expression is a knob:
| Symbol | What it controls | Where physics enters | Treated in |
|---|---|---|---|
| The observations and their noise | DATA — pre-process them, augment with new modalities (§2) | §2 below | |
| The prior covariance — the kernel | KERNEL — choose a physically motivated kernel; structure it as a sum of physically meaningful components (§3) | §3 below | |
| The linear inversion | COMPUTE — Woodbury, PCG, low-rank approximations | 01 | |
| The prior sample (RFF) | COMPUTE — RFF basis, Bochner | 01 | |
| The prediction inputs (the grid) | GEOMETRY — patch decomposition, localisation | 03 |
This note is about the first two columns: DATA (§2) and KERNEL (§3). A final §4 places geostrophic-current diagnostics, which are post-hoc and touch nothing in the pathwise formula above.
Two concrete axes summarise the menu:
- Modify the data — subtract physics that is already understood (tides, atmospheric loading, mean dynamic topography); add velocity observations from drifters / HF radar / Doppler altimetry as gradient constraints on η.
- Modify the kernel — pick a kernel whose spectral density encodes ocean dynamics (SQG); decompose the kernel as a sum of physically meaningful components on multiscale localised bases (MIOST/MASSH Gabor wavelets); keep low-rank structure so the gaussx Woodbury machinery from 01 still applies.
The grouping is deliberate: the GP machinery from 00/01/03 stays the same throughout — only the inputs and the kernel change. Methods that abandon the static-prior GP framework — variational data assimilation with dynamical priors (3D-Var, 4D-Var, DYMOST, BFN-QG) and learned variational schemes (4DVarNet) — live in Variational SSH reconstruction — 3D-Var, 4D-Var, DYMOST, BFN-QG, 4DVarNet, since they swap out the entire + block, not just the kernel.
2. The DATA knob — modify before the GP sees it¶
The observation vector in the pathwise formula is a passive input — anything you do to before passing it to the solver is honoured exactly. Two physical interventions:
2a. Subtract — pre-processing the obs¶
The numbers in 01 quietly assume that the observations have already been corrected for the standard altimetry noise sources. They have not, by default — CMEMS L3 products are minimally corrected. The standard pipeline:
| Correction | Source | Removes |
|---|---|---|
| Wet/dry tropospheric | Radiometer + ECMWF | Atmospheric refraction (~10 cm) |
| Ionospheric | Dual-frequency altimeter | Plasma path delay (~5 cm at low latitudes) |
| Sea-state bias | Empirical (SWH, U10 dependence) | EM bias from wave shape (~few cm) |
| Solid Earth + pole tide | Cartwright–Tayler model | Earth-body tides (~30 cm) |
| Ocean tides | FES2014 / FES2022[3] | M2, S2, K1, O1, … (~1 m peak) |
| Dynamic Atmospheric Correction (DAC) | MOG2D barotropic + IB[4] | High-frequency wind/pressure response (~10 cm at mid-latitudes) |
| Mean Dynamic Topography (MDT) | CNES-CLS-22[5] | Time-mean SSH so you work with anomaly |
After all corrections, what remains is the sea-level anomaly — the field the GP should be modelling. Three things to know:
- All corrections are pre-computed and provided in the L3 product. Reading the right CMEMS variable (
sla_unfilteredorsla_filtered) gets you the already-corrected anomaly. - DAC is essential when pooling across short time windows. The 5–10 cm barotropic response to a passing storm is highly correlated across all observations within a few days and will break the GP’s stationarity assumption if not removed.
- Tide residuals at coastal points are the dominant error budget. FES tidal models have ~3 cm RMS on the open ocean but degrade to >10 cm in narrow shelf seas (English Channel, Patagonian shelf). For coastal SSH work, expect to filter or down-weight these.
In code: this is a one-shot upstream step. The GP machinery is unchanged.
2b. Augment — derivative observations from velocities¶
Lagrangian drifter trajectories give direct estimates of surface velocity (after removing Ekman, Stokes, and inertial-oscillation components)[6] [7]. HF-radar gives the same on shelf seas. SKIM[8] and the recently selected ESA Harmony / future Doppler-altimetry missions give it from space. All of these constrain gradients of η via geostrophic balance:
with and Coriolis at latitude ϕ. These are linear functionals of η — partial derivatives with a latitude-dependent multiplier. Gaussian processes are closed under linear operators, so derivative observations slot into the GP as additional rows of the Gram matrix:
For Matérn-3/2 these have closed form; in JAX, jax.grad of the kernel function gives you all the entries automatically — no analytical work needed. Recipe:
- Pre-process drifter velocities to remove Ekman + inertial + Stokes (standard CMEMS workflow).
- Multiply by and to obtain “SSH-gradient observations.”
- Stack them onto the SLA observation vector .
- Define a kernel function that branches on observation type —
eta×eta,eta×∂η,∂η×∂η— usingjax.grad(k)for the differentiated entries. - Pass the augmented kernel to
gx.ImplicitKernelOperatorand the augmented togx.solve. Everything else from 00/01 is unchanged.
This is mathematically identical to how the CMEMS Mean Dynamic Topography is constructed[6] [7] [5], where steady-state geostrophic balance jointly constrains altimetry residuals and drifter velocities.
Cyclostrophic / equatorial caveats. Geostrophy degenerates as at the equator and inside tight eddy cores where centrifugal acceleration competes with Coriolis. Standard fixes are β-plane geostrophy near the equator[9] and gradient-wind balance for cyclostrophic eddies — both add a quadratic-in- correction term, and the GP is no longer linear. Skip unless you specifically care about the equatorial band or eddy-core diagnostics.
3. The KERNEL knob — physics-informed priors¶
The kernel is the entire prior — it determines the smoothness, anisotropy, multi-scale content, and spectral slope of every reconstruction. The Matérn × OU baseline says only “smooth, isotropic, one length scale per axis.” Two physical refinements, in increasing order of structural complexity:
3a. Clever spectral density — the SQG kernel¶
In a uniformly stratified ocean with zero interior potential vorticity, the dynamics reduce to a single equation for surface buoyancy[10]:
with surface streamfunction recovered from by a half-Laplacian inversion:
where is the upper-ocean buoyancy frequency. Sea level inherits the same spectral relation: . Combined with the canonical SQG buoyancy spectrum , this gives an SSH spectral slope of in the SQG regime, vs for interior QG turbulence[11] [12].
Use as a GP kernel. The “SQG kernel” is a stationary kernel whose spectral density follows the slope, with a low-wavenumber cutoff at the Rossby deformation radius and a high-wavenumber cutoff at the altimeter noise floor. From the GP point of view this is just a different choice of in Bochner’s theorem — feed it into [pyrox._basis._rff._draw_spectral_frequencies](file:///home/azureuser/localfiles/pyrox/src/pyrox/_basis/_rff.py) and the rest of the pathwise machinery from 01 stays unchanged. Physically, the prior says “fields with energy concentrated at wavelengths near the deformation radius and a known fall-off rate are a priori more plausible” — much sharper than the Matérn fall-off, and it concentrates statistical strength where the ocean actually has variance.
Where SQG is and isn’t valid. SQG works well[12] [13] in the upper ocean of energetic mid-latitude regions, in winter when deep mixed layers maintain near-zero PV, and at mesoscale wavelengths (~50–300 km). It breaks at submesoscale (<50 km) where mixed-layer instabilities dominate[14], in summer when stratification is shallow, and below the surface mixed layer (use eSQG / interior + surface combinations instead). Pragmatically: use SQG as the spatial part of a multi-component kernel (next subsection) and let other components absorb what SQG misses.
3b. Structured multi-scale kernels — MIOST / MASSH Gabor wavelets¶
DUACS uses one global Matérn-like spatial covariance per region with hyperparameters tuned by latitude band[1]. MIOST (Multiscale Inversion of Ocean Surface Topography, CNES/CLS) replaces this with a sum of independent components, each expanded on a redundant tight frame of localised wavelet atoms[15] [16]:
Components in operational MIOST: large-scale geostrophy, mesoscale geostrophy, equatorial waves, barotropic motion, internal tides[15] [17]. Each component carries its own covariance, diagonal in its wavelet basis (one variance per atom). The global covariance is then a sum of structured operators — one block per component.
The wavelet atoms¶
The reference open implementation is the Le Guillou MASSH code, mirrored locally at [jej_vc_snippets/quasigeostrophic_model/massh/basis.py](file:///home/azureuser/localfiles/jej_vc_snippets/quasigeostrophic_model/massh/basis.py). Each atom is a Gaspari–Cohn windowed plane wave (Gabor / Morlet wavelet):
with a Gaspari–Cohn compact-support window (so atoms genuinely vanish outside , unlike Gaussians). The atom is parameterised by
| Symbol | Meaning |
|---|---|
| atom centre — placed on a hierarchy of grids, one grid per scale octave | |
| central wavenumbers (multiples of at the atom’s scale) | |
| spatial / temporal support radii — typically , with wavelengths per atom | |
| φ | — gives both cosine and sine atoms (real wavelet pair) |
Each scale octave gets its own grid spacing and its own band of wavenumbers; the user picks the number of octaves to span the resolved range (e.g. to , log-spaced).
Why this fits naturally into the pathwise GP¶
The wavelet expansion is exactly a finite-rank, structured prior on η. Stack the basis evaluations into a feature matrix Ψ and the per-atom variances into a diagonal :
This is a low-rank + diagonal kernel — structurally identical to the RFF Woodbury route from 01, with three concrete differences that make it strictly better as an SSH prior:
- Atoms are localised and bandpass, not global plane waves. Ψ is sparse in space and frequency: each pixel is touched by atoms (one per octave that overlaps it), and each atom touches pixels. Matrix-vector products with Ψ are sparse and embarrassingly parallel — much cheaper than the dense of RFF.
- The diagonal encodes the energy spectrum directly. Set from a target spectral slope (e.g. for SQG, for QG, observed slope from in-region spectra). No kernel-fitting needed.
- Components are additive and independent. block-stacks per-component bases; is block-diagonal. Each component lives in its own wavelet sub-frame with its own physical interpretation.
In gaussx vocabulary: is a gaussx.LowRankUpdate with sparse Ψ. The noisy Gram goes through Woodbury exactly as in 01’s Route B, but with Ψ implemented as a sparse-matvec instead of a dense feature matrix. You get the per-day cost numbers from 01’s “fully-RFF” column with a physically motivated basis, plus the multi-component decomposition that DUACS lacks.
Resolution comparison (operational, mid-latitudes): DUACS recovers down to wavelengths, MIOST down to [2]. With SWOT KaRIn ingested, MIOST resolves [17].
MIOST vs inducing points — same family, different basis¶
Both MIOST wavelets and Nyström-style inducing points (gaussx.nystrom_operator from 01) yield a low-rank approximation . The difference is what Ψ contains:
| Method | Ψ columns | Picks up... | |
|---|---|---|---|
| RFF (01) | random plane waves at frequencies | identity | matches the kernel spectral density on average |
| Nyström inducing points | with uniformly subsampled obs | matches the kernel exactly at the inducing locations | |
| MIOST Gabor wavelets | localised bandpass atoms on per-octave grids | diagonal physical variances per scale | matches the desired energy spectrum at every scale |
MIOST is the natural choice when (a) you want a per-scale interpretation and (b) you want to add or remove components without retraining. Nyström is the natural choice when you have a kernel you trust and want the cheapest possible approximation. RFF is the natural choice when neither structure helps.
Code pointers¶
- [
massh/basis.py](file:///home/azureuser/localfiles/jej_vc_snippets/quasigeostrophic_model/massh/basis.py) —WaveletBasis,WaveletParameters,WaveletGridclasses; the canonical reference. - [
massh/inversion.py](file:///home/azureuser/localfiles/jej_vc_snippets/quasigeostrophic_model/massh/inversion.py) — variational inversion for the wavelet coefficients. - Ocean Data Challenges — open MIOST baselines for the 2021/2023/2024 SWOT mapping benchmarks.
4. Diagnostics — geostrophic currents from (post-hoc)¶
This is not a physics-aware reconstruction step — it is a one-line post-processing of the GP mean using the geostrophic relation in §2b. Compute from the GP, then take finite differences on the grid:
This is what DUACS, MIOST, and OSCAR[9] all publish as their L4 surface-current product. Zero impact on the GP inversion. Use the user’s local [derivatives/finite_difference_geo.py](file:///home/azureuser/localfiles/jej_vc_snippets/derivatives/finite_difference_geo.py) — implements on a lat–lon grid with the spherical metric , — drop the GP mean field through it and you have operational currents.
If you want uncertainty on the currents, take finite differences of each posterior sample from the pathwise loop and compute the empirical std of the resulting — same pathwise machinery from 01, no extra cost.
Recap — where each physics knob plugs in¶
| Physics axis | Knob | Touches in the pathwise formula | Cost vs baseline | gaussx primitive |
|---|---|---|---|---|
| DATA — subtract | Pre-process: tides, DAC, MDT | (one-shot upstream) | Free | None — pure data step |
| DATA — augment | Drifter / HF radar / Doppler-altimetry velocities as derivative obs | adds rows to and ; uses jax.grad(k) for the derivative-row entries | Linear in number of velocity obs; usually | ImplicitKernelOperator with a typed kernel; jax.grad |
| KERNEL — clever choice | SQG spectral prior | inside RFF / Bochner | Free — same RFF cost | pyrox._basis._rff._draw_spectral_frequencies |
| KERNEL — multi-scale basis | MIOST/MASSH Gabor wavelets | via sparse Ψ | Cheaper — same as 01’s “fully-RFF” column with sparse Ψ | gaussx.LowRankUpdate + Woodbury solve |
| KERNEL — multi-component | Block-stacked MIOST components (geostrophic + equatorial + barotropic + IT) | Ψ becomes block-stacked, block-diagonal | Linear in number of components (free) | gaussx.LowRankUpdate (one per component, summed) |
| KERNEL — inducing points | Nyström approximation | Same as 01 Route B | gaussx.nystrom_operator | |
| DIAGNOSTIC | Geostrophic currents from | Finite-difference of GP output | Free (post-hoc) | jej_vc_snippets/derivatives/finite_difference_geo.py |
| (separate framework) | 3D-Var / 4D-Var / DYMOST / BFN-QG / 4DVarNet | Replaces the GP prior with a dynamical or learned regulariser | Different framework — see 04 | Outside the gaussx pathwise stack |
Recommended sequence¶
For the next concrete step in this project, taking the menu in increasing order of effort and decreasing order of leverage:
- DATA — subtract (a half-day fix). Confirm that you read CMEMS
sla_filtered, with DAC and MDT already removed. This is the lowest-effort, highest-impact change — running on uncorrected η will be all wrong regardless of what kernel you use. - KERNEL — multi-scale wavelet basis on top of
gaussx.LowRankUpdate. Highest-leverage extension that stays inside the pathwise framework — well-documented, locally-mirrored reference code (MASSH), gives a resolution improvement over a single-scale Matérn, costs the same as 01’s fully-RFF column. - KERNEL — SQG spectral prior as the spatial component of one of the wavelet sub-frames. One-line variation on (2); especially helpful in mid-latitude winter.
- DATA — augment with drifter velocities. Once the kernel is right, derivative obs add cheap, physically meaningful constraints. Plug into the same
gx.ImplicitKernelOperatorwith a typed kernel.
For variational baselines (3D-Var, 4D-Var, DYMOST, BFN-QG, 4DVarNet) that replace the GP prior with a dynamical or learned regulariser, see Variational SSH reconstruction — 3D-Var, 4D-Var, DYMOST, BFN-QG, 4DVarNet.
Lyard, F. H., Allain, D. J., Cancet, M., Carrère, L., Picot, N. (2021). “FES2014 global ocean tide atlas.” Ocean Science 17, 615–649.
Carrère, L., Lyard, F. (2003). “Modeling the barotropic response of the global ocean to atmospheric wind and pressure forcing — comparisons with observations.” Geophys. Res. Lett. 30, 1275.
Ardhuin, F., et al. (2019). “SKIM, a candidate satellite mission exploring global ocean currents and waves.” Frontiers in Marine Science 6, 209.
Held, I., Pierrehumbert, R., Garner, S., Swanson, K. (1995). “Surface quasi-geostrophic dynamics.” J. Fluid Mech. 282, 1–20.
Lapeyre, G., Klein, P. (2006). “Dynamics of the upper oceanic layers in terms of surface quasigeostrophy theory.” J. Phys. Oceanogr. 36, 165–176.
González-Haro, C., Isern-Fontanet, J. (2014). “Global ocean current reconstruction from altimetric and microwave SST measurements.” J. Geophys. Res. Oceans 119, 3378–3391.
Callies, J., Ferrari, R. (2013). “Interpreting energy and tracer spectra of upper-ocean turbulence in the submesoscale range (1–200 km).” J. Phys. Oceanogr. 43, 2456–2474.
Ballarotta, M., et al. (2023). “Improved global sea surface height and current maps from remote sensing and in situ observations.” Earth Syst. Sci. Data 15, 295–315.