
In this blog, we discuss some different types of missing values.

  • Missing Completely At Random (MCAR)
  • Missing Not At Random (MNAR)
  • Missing At Random (MAR)

Definitions


A simple schematic for different types of missingness. Source: Bessenbacher, 2022.

Missing Completely At Random. If the missingness is completely independent of any process, then we consider it Missing Completely At Random (MCAR). This process could be physical or instrumental.

$$m_n \sim p(m)$$

where $n$ is the spatiotemporal coordinate.
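As a minimal sketch of an MCAR mask, the snippet below draws each entry of the mask independently from a Bernoulli distribution with a fixed rate. The choice of a Bernoulli $p(m)$, the function name `mcar_mask`, the 30% rate, and the synthetic field are all assumptions made for illustration, not something prescribed by the definition above.

```python
import numpy as np

def mcar_mask(shape, missing_rate, seed=0):
    """Return a boolean mask where True marks a missing value."""
    rng = np.random.default_rng(seed)
    # Each coordinate n is dropped independently with the same probability,
    # so the mask depends neither on the data nor on any other process.
    return rng.random(shape) < missing_rate

# Hypothetical field: 120 time steps on a 10 x 20 grid.
field = np.random.default_rng(1).normal(size=(120, 10, 20))
mask = mcar_mask(field.shape, missing_rate=0.3)

# Apply the mask: missing entries become NaN.
observed = np.where(mask, np.nan, field)
print(f"fraction missing: {mask.mean():.3f}")
```

Because the mask is generated without ever looking at `field`, any statistic of the observed values is an unbiased (if noisier) estimate of the same statistic on the full field.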

Missing At Random. If the missingness is completely independent of the underlying physical process that we are measuring, but could depend on some other auxiliary process, then we consider it Missing At Random (MAR). An example is satellite swaths that measure spectral channels or along-track data: the gaps follow the sampling pattern of the instrument, not the underlying physical process.
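To make the swath example concrete, here is a toy sketch of a MAR mask in which the observed band depends only on time, standing in for orbit geometry. The function `swath_mask`, the band width, and the drift of the swath centre are hypothetical simplifications; real swath geometry is considerably more involved.

```python
import numpy as np

def swath_mask(n_time, n_lat, n_lon, half_width=2, shift_per_step=3):
    """Toy swath mask: True marks a missing value.

    The observed band depends only on time (a stand-in for orbit
    geometry), never on the values of the measured field.
    """
    t = np.arange(n_time)[:, None]            # (n_time, 1)
    lon = np.arange(n_lon)[None, :]           # (1, n_lon)
    centre = (t * shift_per_step) % n_lon     # swath centre per time step
    dist = np.abs(lon - centre)
    dist = np.minimum(dist, n_lon - dist)     # wrap around in longitude
    observed = dist <= half_width             # (n_time, n_lon)
    # Every latitude row shares the same observed longitudes.
    missing = ~observed[:, None, :]
    return np.broadcast_to(missing, (n_time, n_lat, n_lon)).copy()

mask = swath_mask(n_time=120, n_lat=10, n_lon=20)
print(f"fraction missing: {mask.mean():.2f}")
```

The function never receives the measured field as an argument, which is the point: the missingness is structured, but the structure comes from the auxiliary process alone.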

Steps

Interpolation Step - Create Initial Estimates by Spatial Interpolation. For example, we can divide the signal into a climatology and monthly anomalies. Gaps can be filled with splines and/or kriging.

$$
\begin{aligned}
\text{Monthly Anomalies}: && \bar{\mathbf{y}}_m &= \sum_{t=1} \mathbf{k}_m * \mathbf{y}, && \bar{\mathbf{y}}_m\in\mathbb{R}^{D_\Omega\times 12}\\
\text{Climatology}: && \bar{\mathbf{y}}_c &= \frac{1}{T_c}\sum_{t=1}^{T_c}\mathbf{y}_t, && \bar{\mathbf{y}}_c\in\mathbb{R}^{D_\Omega\times 12}
\end{aligned}
$$

where $T_c$ is the period of the considered climatology, e.g., 30 years, and $\mathbf{k}_m$ is a kernel for a month.
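The climatology/anomaly split can be sketched with NumPy as below. This is a simplified stand-in, not the CLIMFILL procedure: it assumes monthly data starting in January, uses a plain per-calendar-month mean that ignores gaps, and fills gaps with the climatology as a first guess instead of splines or kriging. The synthetic data and the names `monthly_climatology`, `first_guess`, and `anom` are illustrative assumptions.

```python
import numpy as np

def monthly_climatology(y):
    """Mean for each calendar month, ignoring gaps (NaN).

    y has shape (T, D): T monthly steps (a multiple of 12, starting in
    January) and D flattened spatial points. Returns shape (12, D).
    """
    n_years = y.shape[0] // 12
    by_month = y.reshape(n_years, 12, -1)      # (years, month, D)
    return np.nanmean(by_month, axis=0)

# Synthetic example: 30 years of monthly data at 50 points.
rng = np.random.default_rng(0)
n_years, n_points = 30, 50
months = np.arange(n_years * 12) % 12
seasonal = 10.0 * np.sin(2 * np.pi * months / 12)[:, None]
truth = seasonal + rng.normal(scale=1.0, size=(n_years * 12, n_points))

# Remove 30% of the values completely at random.
y = np.where(rng.random(truth.shape) < 0.3, np.nan, truth)

clim = monthly_climatology(y)                  # (12, D)
anom = y - clim[months]                        # NaN wherever y is missing
# First guess: keep observations, fall back to climatology in the gaps.
first_guess = np.where(np.isnan(y), clim[months], y)
```

A later step could then interpolate `anom` in space and add the climatology back, which is usually easier than interpolating the raw signal because the seasonal cycle has already been removed.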

Feature Engineering

References
  1. Bessenbacher, V., Seneviratne, S. I., & Gudmundsson, L. (2022). CLIMFILL v0.9: a framework for intelligently gap filling Earth observations. Geoscientific Model Development, 15(11), 4569–4596. https://doi.org/10.5194/gmd-15-4569-2022