

Fig.1 - Various quantities related to sea surface height. Source: GGOS

Assumptions

$$\text{Pressure} \propto \text{Sea Surface Height (SSH)}$$
$$\text{Coriolis Force} \propto \text{Velocity}$$

Representation

We have a coordinate vector, $\mathbf{x} \in \mathbb{R}^D$, which describes the 2D spatial domain $(x,y)$. We also have a time component, $t$, which describes the 2D space at a specific point in time. In addition, we have two functions that describe the flow state: the potential vorticity, $\boldsymbol{q}$, and the stream function, $\boldsymbol{\psi}$, an auxiliary function (both are defined precisely below). Both functions take in the coordinate vector and time and output a scalar value.

$$\begin{aligned} q &= \boldsymbol{q}(\mathbf{x},t) \\ \psi &= \boldsymbol{\psi}(\mathbf{x},t) \end{aligned}$$

PDE

We have the following PDE for the QG dynamics:

$$\partial_t q + \det J(\psi, q) = 0$$

where $q(\mathbf{x},t): \mathbb{R}^2 \times \mathbb{R} \rightarrow \mathbb{R}$ is the potential vorticity (PV), $\psi(\mathbf{x},t): \mathbb{R}^2 \times \mathbb{R} \rightarrow \mathbb{R}$ is the stream function, $\partial_t$ is the partial derivative with respect to $t$, $\boldsymbol{J}$ is the Jacobian operator, and $\det \boldsymbol{J}(\cdot,\cdot)$ is the determinant of the Jacobian.

Objective: We want to rewrite this PDE in terms of sea surface height (SSH) instead of the PV and the stream function.


PINNs Loss Derivation

Stream Function

Let’s define the relationship between the SSH, $u$, and the stream function, $\psi$:

$$\psi = \frac{f}{g}u = c_1 u$$

where $c_1 = \frac{f}{g}$. If we substitute the SSH into the PDE, we get:

$$\partial_t q + \det J(c_1 u, q) = 0$$

To simplify the notation, we will factor the constant, $c_1$, out of the determinant Jacobian term.

$$\partial_t q + c_1\det J(u, q) = 0$$

Proof: Constants and determinant Jacobians

$$\begin{aligned} \det J(c_1 u, q) &= \partial_x (c_1 u) \, \partial_y q - \partial_y (c_1 u) \, \partial_x q \\ \det J(c_1 u, q) &= c_1 \partial_x u \, \partial_y q - c_1 \partial_y u \, \partial_x q \\ c_1\det J(u, q) &= c_1 \left(\partial_x u \, \partial_y q - \partial_y u \, \partial_x q\right) \end{aligned}$$

Note: we used the property that $\partial (c f) = c \, \partial f$.

QED.
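
As a quick sanity check (not part of the original derivation), this identity can also be verified symbolically with sympy; det_jacobian below is a small helper I am assuming for the 2D determinant Jacobian.

import sympy as sp

# symbols and generic 2D scalar fields
x, y, c1 = sp.symbols("x y c_1")
u = sp.Function("u")(x, y)
q = sp.Function("q")(x, y)

# det J(a, b) = da/dx * db/dy - da/dy * db/dx  (hypothetical helper)
def det_jacobian(a, b):
    return sp.diff(a, x) * sp.diff(b, y) - sp.diff(a, y) * sp.diff(b, x)

# check: det J(c1 * u, q) == c1 * det J(u, q)
assert sp.simplify(det_jacobian(c1 * u, q) - c1 * det_jacobian(u, q)) == 0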


Potential Vorticity

Now, let’s define the relationship between the stream function and the PV. This is given by:

$$\begin{aligned} q &= \nabla^2 \psi + \frac{1}{L_R^2} \psi \\ &= \nabla^2 \psi + c_2 \psi \end{aligned}$$

where $\nabla^2$ is the Laplacian operator and $c_2 = \frac{1}{L_R^2}$. If we substitute the SSH, $u$, into the stream function, as defined above, we get:

$$\begin{aligned} q &= \nabla^2 (c_1 u) + c_2 (c_1 u) \\ &= c_1 \nabla^2 u + c_3 u \end{aligned}$$

where $c_3 = c_1 c_2 = \frac{f}{g L_R^2}$. We can now substitute this expression for the PV, $q$, into the PDE written in terms of SSH, $u$:

$$\partial_t \left(c_1 \nabla^2 u + c_3 u \right) + c_1\det J \left(u, c_1 \nabla^2 u + c_3 u \right) = 0$$

Now, we can expand this equation, but first let's break it up into two terms:

$$\underbrace{\partial_t \left(c_1 \nabla^2 u + c_3 u \right)}_{\text{Term I}} + c_1\underbrace{\det J \left( u, c_1 \nabla^2 u + c_3 u \right)}_{\text{Term II}} = 0$$

Now we will tackle both of these terms one at a time.


Term I

So term I is:

$$f_1 := \partial_t \left(c_1 \nabla^2 u + c_3 u \right)$$

We can expand this via the partial derivative, $\partial_t$:

$$f_1 := c_1 \partial_t \nabla^2 u + c_3 \partial_t u$$

So plugging this back into the PDE gives us:

$$c_1 \partial_t \nabla^2 u + c_3 \partial_t u + c_1 \underbrace{\det J \left(u, c_1 \nabla^2 u + c_3 u \right)}_{\text{Term II}} = 0$$

Term II

We can factorize this determinant Jacobian term.

$$c_1\det J \left(u, c_1 \nabla^2 u + c_3 u \right) = c_1\det \boldsymbol{J}(u, c_1 \nabla^2 u) + \det \boldsymbol{J}(c_1 u, c_3 u)$$

And furthermore, we can factor out the constants

$$c_1\det J \left( u, c_1 \nabla^2 u + c_3 u \right) = c_1^2\det \boldsymbol{J}(u, \nabla^2 u) + c_1c_3 \det \boldsymbol{J}(u, u)$$

Proof: Determinant Jacobian Expansion

So term II (with the constant $c_1$ absorbed into the first argument) is:

$$f_2 := \det J \left(c_1 u, c_1 \nabla^2 u + c_3 u \right)$$

We know the definition of the determinant of the Jacobian for a vector-valued function, $\boldsymbol{f} = [f_1(x,y), f_2(x,y)]^\top: \mathbb{R}^2 \rightarrow \mathbb{R}^2$, via the following identity:

$$\det J \left( f_1(x,y), f_2(x,y)\right) = \partial_x f_1 \, \partial_y f_2 - \partial_y f_1 \, \partial_x f_2$$

If we use this identity on term II, we get:

$$\det J \left(c_1 u, c_1 \nabla^2 u + c_3 u \right) = \partial_x (c_1 u) \, \partial_y (c_1 \nabla^2u + c_3 u) - \partial_y (c_1 u) \, \partial_x (c_1 \nabla^2u + c_3 u)$$

Again, let’s split this up into two subterms and tackle them one-by-one.

$$\det J \left(c_1 u, c_1 \nabla^2 u + c_3 u \right) = \underbrace{\partial_x (c_1 u) \, \partial_y (c_1 \nabla^2u + c_3 u)}_{\text{Term IIa}} - \underbrace{\partial_y (c_1 u) \, \partial_x (c_1 \nabla^2u + c_3 u)}_{\text{Term IIb}}$$

Term IIa

We have the following term for Term IIa:

$$f_{2a} := \partial_x (c_1 u) \, \partial_y (c_1 \nabla^2u + c_3 u)$$

We can expand the terms to get the following:

$$f_{2a} := \partial_x (c_1 u) \, \partial_y \nabla^2 (c_1 u) + \partial_x (c_1 u) \, \partial_y (c_3 u)$$

And we can simplify, by factoring out constants, to get:

$$f_{2a} := c_1^2 \partial_x u \, \partial_y \nabla^2 u + c_1c_3 \partial_x u \, \partial_y u$$

Term IIb

$$f_{2b} := \partial_y (c_1 u) \, \partial_x (c_1 \nabla^2u + c_3 u)$$

We can expand the terms to get the following:

$$f_{2b} := c_1^2 \partial_y u \, \partial_x \nabla^2u + c_1 c_3 \partial_y u \, \partial_x u$$

Combined (Term IIa, IIb)

We can substitute these two terms back into our original expression:

$$\det J \left(c_1 u, c_1 \nabla^2 u + c_3 u \right) = c_1^2 \partial_x u \, \partial_y \nabla^2 u + c_1c_3 \partial_x u \, \partial_y u - c_1^2 \partial_y u \, \partial_x \nabla^2u - c_1 c_3 \partial_y u \, \partial_x u$$

If we group the terms by operator, $\nabla^2$ and $\partial$, then we get:

$$\det J \left(c_1 u, c_1 \nabla^2 u + c_3 u \right) = \underbrace{c_1^2 \partial_x u \, \partial_y \nabla^2 u - c_1^2 \partial_y u \, \partial_x \nabla^2u}_{\nabla^2} + \underbrace{c_1c_3 \partial_x u \, \partial_y u - c_1c_3\partial_y u \, \partial_x u}_{\partial}$$

Each of these groups is itself a determinant Jacobian term, $\det \boldsymbol J$:

$$\det J \left(c_1 u, c_1 \nabla^2 u + c_3 u \right) = c_1^2\det \boldsymbol{J}(u, \nabla^2 u) + c_1c_3 \det \boldsymbol{J}(u, u)$$

Combining this result with term I gives the final form of our PDE in terms of SSH, $u$, below.

QED.
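
The full expansion can be checked the same way; the sketch below (again my own sympy check, using the same hypothetical det_jacobian helper) verifies the identity symbolically.

import sympy as sp

x, y, c1, c3 = sp.symbols("x y c_1 c_3")
u = sp.Function("u")(x, y)

def det_jacobian(a, b):
    return sp.diff(a, x) * sp.diff(b, y) - sp.diff(a, y) * sp.diff(b, x)

# Laplacian of u
lap_u = sp.diff(u, x, 2) + sp.diff(u, y, 2)

# check: det J(c1 u, c1 lap(u) + c3 u) == c1^2 det J(u, lap(u)) + c1 c3 det J(u, u)
lhs = det_jacobian(c1 * u, c1 * lap_u + c3 * u)
rhs = c1**2 * det_jacobian(u, lap_u) + c1 * c3 * det_jacobian(u, u)
assert sp.simplify(lhs - rhs) == 0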


Final Form

So we have the final form for our PDE in terms of SSH.

$$c_1 \partial_t \nabla^2 u + c_3 \partial_t u + c_1^2\det \boldsymbol{J}(u, \nabla^2 u) + c_1c_3 \det \boldsymbol{J}(u, u) = 0$$

We can divide through by the constant $c_1$ (recall that $c_3 / c_1 = c_2$) to get:

$$\partial_t \nabla^2 u + c_2 \partial_t u + c_1\det \boldsymbol{J}(u, \nabla^2 u) + c_3 \det \boldsymbol{J}(u, u) = 0$$

Expanded Form

It is also useful to write out the expanded form with all of the partial derivatives, since this exposes the gradient operations we will need for the computational part.

$$\partial_t \nabla^2 u + c_2 \partial_t u + c_1 \partial_x u \, \partial_y \nabla^2 u - c_1 \partial_y u \, \partial_x \nabla^2u + c_3 \partial_x u \, \partial_y u - c_3\partial_y u \, \partial_x u = 0$$

Again, we will group the terms by the order of their derivatives, $(\nabla^2, \partial)$:

$$\partial_t \nabla^2 u + c_1 \partial_x u \, \partial_y \nabla^2 u - c_1 \partial_y u \, \partial_x \nabla^2u + c_2 \partial_t u + c_3 \partial_x u \, \partial_y u - c_3\partial_y u \, \partial_x u = 0$$

Now, another grouping:

$$\underbrace{\partial_t \nabla^2 u + c_1 \partial_x u \, \partial_y \nabla^2 u - c_1 \partial_y u \, \partial_x \nabla^2u}_{\nabla^2} + \underbrace{c_2 \partial_t u + c_3 \partial_x u \, \partial_y u - c_3\partial_y u \, \partial_x u}_{\partial} = 0$$

QG Equation (SSH)

Note: For ease of notation, let's denote the SSH by $u$. The above PDE can then be written in terms of $u$:

$$\partial_t \nabla^2 u + c_2 \partial_t u + c_1\det \boldsymbol{J}(u, \nabla^2 u) + c_3 \det \boldsymbol{J}(u, u) = 0$$

Note: See the derivation above for how this equation was obtained.


Expanded Form

We will write out the expanded form with all of the partial derivatives so that we can see the gradient operations needed for the computational part.

$$\partial_t \nabla^2 u + c_2 \partial_t u + c_1 \partial_x u \, \partial_y \nabla^2 u - c_1 \partial_y u \, \partial_x \nabla^2u + c_3 \partial_x u \, \partial_y u - c_3\partial_y u \, \partial_x u = 0$$

Again, we will group the terms by the order of their derivatives, $(\nabla^2, \partial)$:

$$\partial_t \nabla^2 u + c_1 \partial_x u \, \partial_y \nabla^2 u - c_1 \partial_y u \, \partial_x \nabla^2u + c_2 \partial_t u + c_3 \partial_x u \, \partial_y u - c_3\partial_y u \, \partial_x u = 0$$

Now, another grouping:

$$\underbrace{\partial_t \nabla^2 u + c_1 \partial_x u \, \partial_y \nabla^2 u - c_1 \partial_y u \, \partial_x \nabla^2u}_{\nabla^2} + \underbrace{c_2 \partial_t u + c_3 \partial_x u \, \partial_y u - c_3\partial_y u \, \partial_x u}_{\partial} = 0$$

Now, we will use these expanded forms to write the code. I have snippets using PyTorch and JAX (TODO). They use the DAG/OOP and functional approaches (respectively), so the notation is slightly different.


Gradient (1st Order)

Let’s look at the first-order gradients.

$$\mathcal{L}_\partial := c_2 \partial_t u + c_3 \partial_x u \, \partial_y u - c_3\partial_y u \, \partial_x u$$

Code

Let’s use the following notation:

$$u = \boldsymbol{f_\theta}(\mathbf{x})$$

where $\mathbf{x} \in \mathbb{R}^D$ and $u \in \mathbb{R}$; the coordinate vector stacks the spatial and temporal coordinates $(x, y, t)$, so $D = 3$, and the domain is sampled on a grid of size $N_x \times N_y \times N_t$.

Note: This is the PyTorch notation.
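
The snippets below assume a coordinate network, model, and a few autodiff helpers (jacobian, laplacian, split). Here is a minimal sketch of what these could look like; the names are my stand-ins, and the actual operators live in the repo's operators module used further below.

import torch
import torch.nn as nn

# a small coordinate network: (x, y, t) -> SSH, i.e. u = f_theta(x)
model = nn.Sequential(
    nn.Linear(3, 64), nn.Tanh(), nn.Linear(64, 64), nn.Tanh(), nn.Linear(64, 1)
)

def jacobian(y, x):
    # gradient of a scalar field y, (B, 1), w.r.t. the coords x, (B, D) -> (B, D)
    return torch.autograd.grad(y, x, grad_outputs=torch.ones_like(y), create_graph=True)[0]

def laplacian(y, x):
    # spatial Laplacian d2y/dx2 + d2y/dy2, assuming the coords are ordered (x, y, t)
    grad = jacobian(y, x)
    y_xx = jacobian(grad[:, 0:1], x)[:, 0:1]
    y_yy = jacobian(grad[:, 1:2], x)[:, 1:2]
    return y_xx + y_yy

def split(jac, dim=1):
    # split a (B, D) Jacobian into its D partial derivatives, each (B, 1)
    return torch.split(jac, 1, dim=dim)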

# coords variable, (B, D)
x = x.clone().requires_grad_(True)

# output variable, (B, 1)
u = model(x)

# Jacobian vector, (B, D)
u_jac = jacobian(u, x)

# partial derivatives, (B, 1) each
u_x, u_y, u_t = split(u_jac, dim=1)

# calculate term 2, (B, 1); c2, c3 are the PDE constants from the derivation above
term2 = c2 * u_t + c3 * u_x * u_y - c3 * u_y * u_x

Laplacian + Gradient

Now we can look at the second order gradients.

$$\mathcal{L}_{\nabla^2} := \partial_t \nabla^2 u + c_1 \partial_x u \, \partial_y \nabla^2 u - c_1 \partial_y u \, \partial_x \nabla^2u$$

Note: This is where all of the computational expense will come from.


Code

# coords variable, (B, D)
x = x.clone().requires_grad_(True)

# output variable, (B, 1)
u = model(x)

# first-order partials (as in the previous snippet), (B, 1) each
u_x, u_y, u_t = split(jacobian(u, x), dim=1)

# Laplacian, (B, 1)
u_lap = laplacian(u, x)

# gradient of the Laplacian, (B, D)
u_lap_jac = jacobian(u_lap, x)

# partial derivatives, (B, 1) each
u_lap_x, u_lap_y, u_lap_t = split(u_lap_jac, dim=1)

# calculate term 1, (B, 1)
term1 = u_lap_t + c1 * u_x * u_lap_y - c1 * u_y * u_lap_x

Alternative (cheaper?)

We can also try re-using the gradients we have already computed in the first-order snippet. This reduces the computation a bit by building the Laplacian from the previous Jacobian calculation instead of recomputing it from scratch.

# coords variable, (B, D)
x = x.clone().requires_grad_(True)

# output variable, (B, 1)
u = model(x)

# Jacobian vector, (B, D)
u_jac = jacobian(u, x)

# first-order partials, (B, 1) each
u_x, u_y, u_t = split(u_jac, dim=1)

# second derivatives of the first-order partials (only the diagonal entries are needed)
u_xx = jacobian(u_x, x)[:, 0:1]
u_yy = jacobian(u_y, x)[:, 1:2]

# Laplacian, (B, 1)
u_lap = u_xx + u_yy

# gradient of the Laplacian, (B, D)
u_lap_jac = jacobian(u_lap, x)

# partial derivatives, (B, 1) each
u_lap_x, u_lap_y, u_lap_t = split(u_lap_jac, dim=1)

# calculate term 1, (B, 1)
term1 = u_lap_t + c1 * u_x * u_lap_y - c1 * u_y * u_lap_x

Loss

So now we combine both losses to get the full loss.

$$\mathcal{L}_{phy} := \left(\mathcal{L}_{\partial} + \mathcal{L}_{\nabla^2}\right)^2$$

Note: We typically square this term. It helps during minimization to always have a positive value; otherwise the optimizer would try to drive the residual to a large negative value. Instead of the squared term, we can also use the absolute value or perhaps the square root of the squared term.
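
In practice, the squared residual is averaged over a batch of $N$ collocation points, which is exactly what the .square().mean() call in the snippet below computes:

$$\mathcal{L}_{phy} = \frac{1}{N}\sum_{i=1}^{N} \left(\mathcal{L}_{\partial}(\mathbf{x}_i) + \mathcal{L}_{\nabla^2}(\mathbf{x}_i)\right)^2$$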


# (B,)
term_partial = ...

# (B,)
term_nabla = ...

# combine terms
loss = term_partial + term_nabla

# compute loss term
loss = loss.square().mean()

# scale the loss
loss *= alpha

My Code

You can find the full function here. I have included it below for anyone interested.

# from ..operators.differential_simp import gradient
from ..operators.differential import grad as gradient
import torch
import torch.nn as nn


class QGRegularization(nn.Module):
    def __init__(
        self, f: float = 1e-4, g: float = 9.81, Lr: float = 1.0, reduction: str = "mean"
    ):
        super().__init__()

        self.f = f
        self.g = g
        self.Lr = Lr
        self.reduction = reduction

    def forward(self, out, x):

        x = x.requires_grad_(True)

        # gradient, nabla x
        out_jac = gradient(out, x)
        assert out_jac.shape == x.shape

        # calculate term 1
        loss1 = _qg_term1(out_jac, x, self.f, self.g, self.Lr)
        # calculate term 2
        loss2 = _qg_term2(out_jac, self.f, self.g, self.Lr)

        loss = (loss1 + loss2).square()

        if self.reduction == "sum":
            return loss.sum()
        elif self.reduction == "mean":
            return loss.mean()
        else:
            raise ValueError(f"Unrecognized reduction: {self.reduction}")


def qg_constants(f, g, L_r):
    c_1 = f / g
    c_2 = 1 / L_r**2
    c_3 = c_1 * c_2
    return c_1, c_2, c_3


def qg_loss(
    ssh, x, f: float = 1e-4, g: float = 9.81, Lr: float = 1.0, reduction: str = "mean"
):
    # gradient, nabla x
    # x = x.detach().clone().requires_grad_(True)
    # print(x.shape, ssh.shape)
    ssh_jac = gradient(ssh, x)
    assert ssh_jac.shape == x.shape

    # calculate term 1
    loss1 = _qg_term1(ssh_jac, x, f, g, Lr)
    # calculate term 2
    loss2 = _qg_term2(ssh_jac, f, g, Lr)

    loss = torch.sqrt((loss1 + loss2) ** 2)

    if reduction == "sum":
        return loss.sum()
    elif reduction == "mean":
        return loss.mean()
    else:
        raise ValueError(f"Unrecognized reduction: {reduction}")


def _qg_term1(u_grad, x_var, f: float = 1.0, g: float = 1.0, L_r: float = 1.0):
    """
    t1 = ∂𝑡∇2𝑢 + 𝑐1 ∂𝑥𝑢 ∂𝑦∇2𝑢 − 𝑐1 ∂𝑦𝑢 ∂𝑥∇2𝑢
    Parameters:
    ----------
    u_grad: torch.Tensor, (B, D)
    x_var: torch.Tensor, (B, D)
    f: float, (,)
    g: float, (,)
    Lr: float, (,)
    Returns:
    --------
    loss : torch.Tensor, (B,)
    """

    x_var = x_var.requires_grad_(True)
    c_1, c_2, c_3 = qg_constants(f, g, L_r)

    # get partial derivatives | partial x, y, t
    u_x, u_y, u_t = torch.split(u_grad, [1, 1, 1], dim=1)

    # jacobian^2 x2, ∇2
    u_grad2 = gradient(u_grad, x_var)
    assert u_grad2.shape == x_var.shape

    # split jacobian -> partial x, partial y, partial t
    u_xx, u_yy, u_tt = torch.split(u_grad2, [1, 1, 1], dim=1)
    assert u_xx.shape == u_yy.shape == u_tt.shape

    # laplacian (spatial), nabla^2
    u_lap = u_xx + u_yy
    assert u_lap.shape == u_xx.shape == u_yy.shape

    # gradient of laplacian, ∇ ∇2
    u_lap_grad = gradient(u_lap, x_var)
    assert u_lap_grad.shape == x_var.shape

    # split laplacian into partials
    u_lap_grad_x, u_lap_grad_y, u_lap_grad_t = torch.split(u_lap_grad, [1, 1, 1], dim=1)
    assert u_lap_grad_x.shape == u_lap_grad_y.shape == u_lap_grad_t.shape

    # term 1
    loss = u_lap_grad_t + c_1 * u_x * u_lap_grad_y - c_1 * u_y * u_lap_grad_x
    assert loss.shape == u_lap_grad_t.shape == u_lap_grad_y.shape == u_lap_grad_x.shape

    return loss


def _qg_term2(u_grad, f: float = 1.0, g: float = 1.0, Lr: float = 1.0):
    """
    t2 = 𝑐2 ∂𝑡(𝑢) + 𝑐3 ∂𝑥(𝑢) ∂𝑦(𝑢) − 𝑐3 ∂𝑦(𝑢) ∂𝑥(𝑢)
    Parameters:
    ----------
    u_grad: torch.Tensor, (B, D)
    f: float, (,)
    g: float, (,)
    Lr: float, (,)
    Returns:
    --------
    loss : torch.Tensor, (B,)
    """
    _, c_2, c_3 = qg_constants(f, g, Lr)

    # get partial derivatives | partial x, y, t
    u_x, u_y, u_t = torch.split(u_grad, [1, 1, 1], dim=1)

    # calculate term 2
    loss = c_2 * u_t + c_3 * u_x * u_y - c_3 * u_y * u_x

    return loss
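
For completeness, here is a hypothetical usage sketch of the QGRegularization module inside a training step. The toy network, batches, and weight alpha are placeholders I introduce for illustration, and it assumes the repo's gradient operator is importable as above.

import torch
import torch.nn as nn

# toy coordinate network: (x, y, t) -> SSH
model = nn.Sequential(nn.Linear(3, 64), nn.Tanh(), nn.Linear(64, 1))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

qg_reg = QGRegularization(f=1e-4, g=9.81, Lr=1.0, reduction="mean")
alpha = 1e-4  # weight on the physics term (placeholder value)

# hypothetical batches: observed SSH and collocation points for the PDE residual
coords_obs, ssh_obs = torch.rand(64, 3), torch.rand(64, 1)
coords_col = torch.rand(256, 3, requires_grad=True)

optimizer.zero_grad()
data_loss = (model(coords_obs) - ssh_obs).square().mean()    # fit the observations
phys_loss = qg_reg(model(coords_col), coords_col)            # QG residual at collocation points
loss = data_loss + alpha * phys_loss
loss.backward()
optimizer.step()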

Previous Code


Redouane

He previously made some attempts to use this as a regularization term. Here is his code snippet.

def forward(self, coords):
        coords = coords.clone().detach().requires_grad_(True) # allows to take derivative w.r.t. input
        SSH = self.firstnet(coords)
        gradSSH = self.gradient(SSH, coords)
        dSSHdx = gradSSH[:,0:1]
        dSSHdy = gradSSH[:,1:2]
        d2SHHd2x = self.gradient(dSSHdx, coords)[:,0:1]
        d2SHHd2y = self.gradient(dSSHdy, coords)[:,1:2]
        dQ = self.gradient(d2SHHd2x+d2SHHd2y, coords)
        output = self.secondnet(self.Bnorm(torch.cat((dSSHdy,
                                                      dSSHdx,
                                                      d2SHHd2x+d2SHHd2y,
                                                      dQ[:,0:1],
                                                      dQ[:,1:2]),1)))
                                                      #dQ[:,0:1] * dSSHdy,
                                                      #dQ[:,1:2] * dSSHdx
        output =  dSSHdx *  dQ[:,1:2] -   dSSHdy * dQ[:,0:1]
        return (1e-5*dQ[:,2:3]-output), coords, SSH