Ensemble Kalman Filter#
Notes for the Ensemble Kalman Filter (EnKF).
Contents: Model, Forecast Step, Analysis Step (Forward), Analysis Step (Backwards).
Model#
In many applications, we cannot write the transition model and/or the emission model as a simple matrix operation. Instead, we use non-linear functions.
This gives us added flexibility and allows us to capture more of the complexity found in the data. However, it means we can no longer use the linear Kalman filter equations for inference.
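For concreteness, a sketch of the non-linear state-space model this implies (notation assumed here, with \(\boldsymbol{f}\) the transition function and \(\boldsymbol{H}\) the emission function):

\[
\mathbf{x}_t = \boldsymbol{f}(\mathbf{x}_{t-1}) + \boldsymbol{\eta}_t, \qquad \boldsymbol{\eta}_t \sim \mathcal{N}(\mathbf{0}, \mathbf{Q})
\]

\[
\mathbf{y}_t = \boldsymbol{H}(\mathbf{x}_t) + \boldsymbol{\epsilon}_t, \qquad \boldsymbol{\epsilon}_t \sim \mathcal{N}(\mathbf{0}, \mathbf{R})
\]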
Solutions: in general, you'll find three classes of algorithms to deal with this: Extended Kalman Filters (EKF), Ensemble Kalman Filters (EnKF), and Particle Filters (PF).
Ensemble Kalman Filter#
In the EnKF, we use an ensemble to track the state of the system: the state \(\mathbf{x}(t)\) is approximated by \(N_e\) ensemble members, which are used to track the evolution of the state. The empirical mean and covariance are then calculated, followed by the Kalman filter equations to update the state based on observations. This provides a robust and fast solution.
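The empirical moments are the standard sample estimates over the \(N_e\) members:

\[
\bar{\mathbf{x}} = \frac{1}{N_e}\sum_{i=1}^{N_e}\mathbf{x}_i, \qquad
\mathbf{P} = \frac{1}{N_e - 1}\sum_{i=1}^{N_e}\left(\mathbf{x}_i - \bar{\mathbf{x}}\right)\left(\mathbf{x}_i - \bar{\mathbf{x}}\right)^\top
\]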
Forecast Step#
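In the forecast step, each ensemble member is propagated through the (possibly non-linear) dynamics, with the model noise sampled independently per member (a standard formulation, using the notation above):

\[
\mathbf{x}_i^f = \boldsymbol{f}(\mathbf{x}_i^a) + \boldsymbol{\eta}_i, \qquad \boldsymbol{\eta}_i \sim \mathcal{N}(\mathbf{0}, \mathbf{Q}), \qquad i = 1, \ldots, N_e
\]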
Deterministic Ensemble Kalman Filter#
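Deterministic (square-root) variants update the mean with the Kalman gain and update the anomalies with a deterministic transform \(\mathbf{T}\), without perturbing the observations; one common formulation (assumed here, with \(\mathbf{S}\) the scaled anomalies defined below) is:

\[
\mathbf{X}'^a = \mathbf{X}'^f\,\mathbf{T}, \qquad \mathbf{T}\mathbf{T}^\top = \left(\mathbf{I} + (\mathbf{H}\mathbf{S})^\top\mathbf{R}^{-1}\mathbf{H}\mathbf{S}\right)^{-1}
\]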
Stochastic Ensemble Kalman Filter#
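The stochastic variant instead has each member assimilate its own perturbed copy of the observation (this is the perturbed-observations scheme detailed below):

\[
\mathbf{x}_i^a = \mathbf{x}_i^f + \mathbf{K}\left(\mathbf{y} + \boldsymbol{\epsilon}_i - \boldsymbol{H}(\mathbf{x}_i^f)\right), \qquad \boldsymbol{\epsilon}_i \sim \mathcal{N}(\mathbf{0}, \mathbf{R})
\]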
Computational Issues#
Sherman-Morrison-Woodbury
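The identity lets us trade one matrix inverse for a (potentially much smaller) one:

\[
\left(\mathbf{A} + \mathbf{U}\mathbf{C}\mathbf{V}\right)^{-1} = \mathbf{A}^{-1} - \mathbf{A}^{-1}\mathbf{U}\left(\mathbf{C}^{-1} + \mathbf{V}\mathbf{A}^{-1}\mathbf{U}\right)^{-1}\mathbf{V}\mathbf{A}^{-1}
\]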
Kalman Gain (Observation State)
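This is the textbook gain, which requires inverting a \(D_y \times D_y\) matrix in observation space:

\[
\mathbf{K} = \mathbf{P}^f\mathbf{H}^\top\left(\mathbf{H}\mathbf{P}^f\mathbf{H}^\top + \mathbf{R}\right)^{-1}
\]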
Reduced Space
where \(\mathbf{S} = \frac{\mathbf{X}'}{\sqrt{N_e - 1}}\) and \(\mathbf{X}' = \mathbf{X} - \bar{\mathbf{x}}\mathbf{1}^\top\) are the ensemble anomalies.
Inverting Symmetric Positive-Definite Matrices#
Standard
# pseudo-inverse of a symmetric (hermitian) matrix
from numpy import linalg
C_inv = linalg.pinv(C, hermitian=True)
SVD
# SVD of a symmetric matrix: C = U @ diag(D) @ VH
from numpy import diag, linalg
U, D, VH = linalg.svd(C, full_matrices=False, hermitian=True)
# invert via the singular values
C_inv = VH.T @ diag(1/D) @ U.T
Note: this is useful for square-root data assimilation schemes.
Cholesky
from numpy import eye
from scipy.linalg import cholesky, solve_triangular
# lower-triangular factor: C = L @ L.T
L = cholesky(C, lower=True)
# solve L @ U = I for U = L^-1
U = solve_triangular(L, eye(L.shape[0]), lower=True)
# C^-1 = L^-T @ L^-1
C_inv = solve_triangular(L.T, U, lower=False)
Note: This is useful for optimal interpolation.
Analysis Step (SVD)#
Forecast Ensemble Anomalies
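Using the scaling from \(\mathbf{S}\) above:

\[
\mathbf{X}'^f = \mathbf{X}^f - \bar{\mathbf{x}}^f\mathbf{1}^\top, \qquad \mathbf{S} = \frac{\mathbf{X}'^f}{\sqrt{N_e - 1}}
\]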
Forecast Ensemble Covariance
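With this scaling, the empirical forecast covariance is simply:

\[
\mathbf{P}^f = \mathbf{S}\mathbf{S}^\top
\]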
Kalman Gain (Ensemble Space)
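In equations (this is the Sherman-Morrison-Woodbury rearrangement of the gain, so the inverse is now \(N_e \times N_e\); the snippet below computes the same quantity):

\[
\mathbf{K} = \mathbf{S}\left(\mathbf{I} + (\mathbf{H}\mathbf{S})^\top\mathbf{R}^{-1}\mathbf{H}\mathbf{S}\right)^{-1}(\mathbf{H}\mathbf{S})^\top\mathbf{R}^{-1}
\]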
from numpy import eye
from numpy.linalg import inv
K = S @ inv(eye(Ne) + (H @ S).T @ inv(R) @ H @ S) @ (H @ S).T @ inv(R)
# [Dx, Dy] = [Dx, Ne] @ [Ne, Ne] @ [Ne, Dy]
SVD of Signal/Noise
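A common choice (assumed here) is to factor the scaled observation-space anomalies:

\[
\mathbf{R}^{-1/2}\mathbf{H}\mathbf{S} = \mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^\top
\]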
Kalman Gain with SVD
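Substituting this SVD into the ensemble-space gain collapses the \(N_e \times N_e\) inverse onto the diagonal singular values:

\[
\mathbf{K} = \mathbf{S}\mathbf{V}\boldsymbol{\Sigma}\left(\mathbf{I} + \boldsymbol{\Sigma}^2\right)^{-1}\mathbf{U}^\top\mathbf{R}^{-1/2}
\]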
Analysis Ensemble of Anomalies
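For a square-root update, one standard choice (assumed here) is the symmetric transform, which keeps the anomalies centered:

\[
\mathbf{X}'^a = \mathbf{X}'^f\,\mathbf{V}\left(\mathbf{I} + \boldsymbol{\Sigma}^2\right)^{-1/2}\mathbf{V}^\top
\]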
Analysis Ensemble of Covariances
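This transform reproduces the Kalman analysis covariance exactly:

\[
\mathbf{P}^a = \left(\mathbf{I} - \mathbf{K}\mathbf{H}\right)\mathbf{P}^f = \mathbf{S}\mathbf{V}\left(\mathbf{I} + \boldsymbol{\Sigma}^2\right)^{-1}\mathbf{V}^\top\mathbf{S}^\top
\]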
Analysis Mean
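The mean is updated with the standard Kalman correction:

\[
\bar{\mathbf{x}}^a = \bar{\mathbf{x}}^f + \mathbf{K}\left(\mathbf{y} - \mathbf{H}\bar{\mathbf{x}}^f\right)
\]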
Analysis Ensemble
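The analysis ensemble recombines the updated mean and anomalies:

\[
\mathbf{X}^a = \bar{\mathbf{x}}^a\mathbf{1}^\top + \mathbf{X}'^a
\]

Putting the pieces together, a minimal sketch of the whole analysis step (assuming a linear observation operator given as a matrix `H`, ensembles stored column-wise, and an eigendecomposition of the symmetric matrix \(\mathbf{I} + (\mathbf{H}\mathbf{S})^\top\mathbf{R}^{-1}\mathbf{H}\mathbf{S}\) playing the role of the SVD step):

```python
import numpy as np
from numpy.linalg import inv, eigh

def enkf_analysis(X_f, H, R, y):
    """Square-root EnKF analysis in ensemble space.

    X_f : (Dx, Ne) forecast ensemble (columns = members)
    H   : (Dy, Dx) linear observation operator
    R   : (Dy, Dy) observation noise covariance
    y   : (Dy,)    observation
    """
    Dx, Ne = X_f.shape
    x_mu = X_f.mean(axis=1)                      # forecast mean (Dx,)
    S = (X_f - x_mu[:, None]) / np.sqrt(Ne - 1)  # scaled anomalies (Dx, Ne)
    A = H @ S                                    # obs-space anomalies (Dy, Ne)
    R_inv = inv(R)
    # symmetric positive-definite ensemble-space matrix
    C = np.eye(Ne) + A.T @ R_inv @ A             # (Ne, Ne)
    w, V = eigh(C)                               # C = V @ diag(w) @ V.T
    C_inv = V @ np.diag(1 / w) @ V.T
    # ensemble-space Kalman gain (Dx, Dy)
    K = S @ C_inv @ A.T @ R_inv
    # analysis mean
    x_mu_a = x_mu + K @ (y - H @ x_mu)
    # symmetric square-root transform for the anomalies
    T = V @ np.diag(1 / np.sqrt(w)) @ V.T
    S_a = S @ T
    # recombine mean and anomalies
    return x_mu_a[:, None] + np.sqrt(Ne - 1) * S_a
```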
Perturbed Observations#
Sequential Data Assimilation with a Nonlinear Quasi-Geostrophic Model Using Monte Carlo Methods to Forecast Error Statistics - Evensen (1994)
Data Assimilation Using an Ensemble Kalman Filter Technique - Houtekamer & Mitchell (1998)
Inputs
We have some ensemble members and an observation:
\(\mathbf{X}_{e} \in \mathbb{R}^{N_e \times D_x}\)
\(\mathbf{y} \in \mathbb{R}^{D_y}\)
Sample the Observations given the state
Here, we propagate the ensemble members through our emission function, \(\boldsymbol{H}\), to give us predictions for the observations:

\[
\mathbf{Y}_e = \boldsymbol{H}(\mathbf{X}_e)
\]

where \(\mathbf{Y}_e \in \mathbb{R}^{N_e \times D_y}\).
# emission function: map each ensemble member to observation space
Y_ens = H(X_ens, t, rng)
# [Ne, Dy]
Calculate 1st Moment
Now we need to calculate the 1st moment (the mean) to approximate the Gaussian distribution. We need to do this for both the predicted state, \(\mathbf{X}_e\), and the predicted observations, \(\mathbf{Y}_e\):

\[
\bar{\mathbf{x}} = \frac{1}{N_e}\sum_{i=1}^{N_e}\mathbf{x}_i, \qquad
\bar{\mathbf{y}} = \frac{1}{N_e}\sum_{i=1}^{N_e}\mathbf{y}_i
\]

where \(\bar{\mathbf{x}} \in \mathbb{R}^{D_x}\) and \(\bar{\mathbf{y}} \in \mathbb{R}^{D_y}\).
from numpy import mean
# mean of state ensembles (Dx,)
x_mu = mean(X_ens, axis=0)
# mean of obs ensembles (Dy,)
y_mu = mean(Y_ens, axis=0)
Calculate Residuals
We calculate the errors (residuals) for our ensemble of predictions, \(\mathbf{Y}_e\), with respect to the true observation value, \(\mathbf{y}_\text{obs}\).
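A minimal sketch of this step for the perturbed-observations scheme, following the idea in the references above (`y_obs`, `R`, `Y_ens`, and `rng` are assumed from the surrounding context):

```python
import numpy as np

# one perturbed copy of the observation per member: y_i = y_obs + e_i, e_i ~ N(0, R)
Ne, Dy = Y_ens.shape
Y_obs_pert = y_obs + rng.multivariate_normal(np.zeros(Dy), R, size=Ne)  # (Ne, Dy)

# residuals (innovations) of the predicted observations wrt the perturbed observations
residuals = Y_obs_pert - Y_ens  # (Ne, Dy)
```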