Skip to article frontmatterSkip to article content
Site not loading correctly?

This may be due to an incorrect BASE_URL configuration. See the MyST Documentation for reference.

Griddata

Irregular-grid interpolation and GLT orthorectification

UNEP
IMEO
MARS

Module: georeader/griddata.py (617 LOC) Role: the onramp for curvilinear sensors — pushbroom imagers, swath scanners, and any sensor that gives you per-pixel lons and lats rather than a clean affine transform. Where read.py ends and EMIT / PRISMA / EnMAP / MODIS / VIIRS begin.


1. Why this module exists

Rectilinear rasters (S2, Landsat, Planet) come with an Affine transform — six numbers that map pixel (col, row) to geographic (x, y). Every function in read.py assumes this transform exists.

But many sensors don’t ship that way. EMIT, PRISMA, EnMAP, MODIS, VIIRS, AVHRR all give you:

There’s no transform. Two adjacent pixels in the array don’t necessarily map to adjacent points on Earth. Reading a polygon AOI requires either interpolating the irregular grid onto a regular one, or looking up which sensor pixel each output pixel came from. This module does both.


2. Irregular vs regular grids

┌─────────────────────────────────────────────────────────────────────────┐
│   IRREGULAR vs REGULAR GRIDS                                            │
│                                                                         │
│   Irregular (Swath/Sensor)           Regular (Orthorectified)           │
│   ─────────────────────────           ──────────────────────            │
│                                                                         │
│       ●  ●   ●  ●                     ┌──┬──┬──┬──┐                     │
│     ●    ●  ●    ●                    ├──┼──┼──┼──┤                     │
│      ●   ● ●  ●                       ├──┼──┼──┼──┤                     │
│    ●   ●    ●   ●                     ├──┼──┼──┼──┤                     │
│                                       └──┴──┴──┴──┘                     │
│                                                                         │
│   Each pixel has unique (lon, lat)    Fixed transform: pixel → geo      │
│   Spacing varies with scan angle      Uniform spacing, axis-aligned     │
│   Common in: pushbroom sensors        Required for: GIS, web maps       │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘

The job of griddata.py is to take the left and produce the right — a GeoTensor with a real affine transform that the rest of the package can consume.

There are two strategies for doing this:

  1. Interpolation (reproject, read_to_crs, read_reproject_like) — sample the scattered points onto a regular grid via scipy.interpolate.griddata. Works for any sensor that gives you (lons, lats) arrays. Slow for large hyperspectral cubes; smooth output.

  2. GLT lookup (georreference) — apply a precomputed Geolocation Lookup Table that names “for each output pixel, which sensor pixel?” Fast, lossless (no interpolation artefacts), but requires the sensor’s product to ship a GLT (NASA EMIT does; PRISMA doesn’t).


3. Interpolation method comparison

┌────────────────────────────────────────────────────────────────────────┐
│  INTERPOLATION METHOD COMPARISON                                       │
│                                                                        │
│  Method      │ Continuity │ Speed  │ Best For                          │
│  ────────────┼────────────┼────────┼─────────────────────────────────  │
│  "nearest"   │ C⁰         │ Fast   │ Categorical data, masks           │
│  "linear"    │ C⁰         │ Medium │ Simple surfaces, quick preview    │
│  "cubic"     │ C²         │ Slow   │ Smooth continuous data (default)  │
│                                                                        │
│  C⁰ = continuous but not differentiable (may have sharp edges)         │
│  C² = smooth, twice differentiable (recommended for radiance/refl)     │
│                                                                        │
└────────────────────────────────────────────────────────────────────────┘

The default is "cubic" — Clough-Tocher interpolation on the Delaunay triangulation of the scattered points. C² smoothness matters for hyperspectral retrievals because matched filters and unmixing react badly to discontinuities. The cost: cubic on a ~10⁶-point hyperspectral scene is multi-minute; consider downsampling first if the geometry doesn’t need full-res.

"linear" is the right default for “I just want to look at it”; "nearest" for categorical data (cloud masks, class labels).


4. GLT — geolocation lookup tables

┌────────────────────────────────────────────────────────────────────────┐
│  GLT-BASED ORTHORECTIFICATION                                          │
│                                                                        │
│  Sensor Array (irregular)              Output Grid (regular)           │
│  ┌───────────────────────┐             ┌──┬──┬──┬──┬──┐                │
│  │  0   1   2   3   ...  │             │  │  │  │  │  │                │
│  │                       │             ├──┼──┼──┼──┼──┤                │
│  │ [r,c] = radiance      │     GLT     │  │██│██│  │  │                │
│  │                       │  ────────►  ├──┼──┼──┼──┼──┤                │
│  │                       │             │  │██│██│██│  │                │
│  └───────────────────────┘             └──┴──┴──┴──┴──┘                │
│                                                                        │
│  GLT[0, i, j] = column in sensor array                                 │
│  GLT[1, i, j] = row in sensor array                                    │
│  output[i, j] = sensor[GLT[1,i,j], GLT[0,i,j]]                         │
│                                                                        │
└────────────────────────────────────────────────────────────────────────┘

A GLT is a (2, H_out, W_out) integer array. Channel 0 holds the source column, channel 1 holds the source row, and each output pixel reads from data[..., glt[1, i, j], glt[0, i, j]] — a pure index gather. Invalid pixels (output cells with no source coverage) are marked with fill_value_default (typically -1).

GLTs are produced once by the data provider’s processing chain (using the full sensor model — orbital ephemeris, look angles, terrain DEM) and shipped alongside the radiance product. NASA EMIT does this; the EnMAP and PRISMA L1 products instead ship raw lons/lats and require interpolation at consumption time. The structural diagram repeats inside the georreference docstring:

┌────────────────────────────────────────────────────────────────────┐
│  GLT ARRAY STRUCTURE                                               │
│                                                                    │
│  glt.shape = (2, H_out, W_out)                                     │
│                                                                    │
│  glt[0, i, j] = source column (x-index in sensor array)            │
│  glt[1, i, j] = source row    (y-index in sensor array)            │
│                                                                    │
│  For each output pixel (i, j):                                     │
│    output[..., i, j] = data[..., glt[1,i,j], glt[0,i,j]]           │
│                                                                    │
│  ┌──────────────────────┐      ┌──────────────────────┐            │
│  │    Sensor Array     │       │   Output Grid        │            │
│  │    (raw data)       │       │   (orthorectified)   │            │
│  │   ┌───┬───┬───┐     │       │  ┌──┬──┬──┬──┐       │            │
│  │   │ A │ B │ C │     │       │  │  │ A│ B│  │       │            │
│  │   ├───┼───┼───┤     │ GLT   │  ├──┼──┼──┼──┤       │            │
│  │   │ D │ E │ F │  ───────►   │  │  │ D│ E│ F│       │            │
│  │   ├───┼───┼───┤     │       │  ├──┼──┼──┼──┤       │            │
│  │   │ G │ H │ I │     │       │  │ G│ H│ I│  │       │            │
│  │   └───┴───┴───┘     │       │  └──┴──┴──┴──┘       │            │
│  └──────────────────────┘      └──────────────────────┘            │
│                                                                    │
│  GLT handles: terrain distortion, sensor geometry, Earth curvature │
│  Invalid pixels: glt values = fill_value_default (typically -1)    │
│                                                                    │
└────────────────────────────────────────────────────────────────────┘

The output grid in the diagram has cells that pull from non-adjacent sensor pixels — that’s terrain distortion encoded in the GLT. Because it’s a pure gather, GLT orthorectification:

The trade-off: fixed output grid. If you want a different resolution or CRS than the GLT was built for, you either re-ship the GLT or fall back to interpolation.


5. The interpolation workflow, step by step

The internal walkthrough lives inside reproject’s docstring:

┌────────────────────────────────────────────────────────────────────┐
│  INTERPOLATION WORKFLOW                                            │
│                                                                    │
│  1. Flatten inputs                                                 │
│     data: (H, W, C) → (H×W, C)  [or (H, W) → (H×W,)]               │
│     lons/lats: (H, W) → (H×W,)                                     │
│                                                                    │
│  2. Generate output coordinate grid                                │
│     meshgrid(transform, width, height) → (xs, ys)                  │
│     Transform xs, ys from dst_crs to input crs if different        │
│                                                                    │
│  3. Call scipy.interpolate.griddata                                │
│     points = (lons_flat, lats_flat)                                │
│     values = data_flat                                             │
│     xi = (xs_grid, ys_grid)                                        │
│     result = griddata(points, values, xi, method=method)           │
│                                                                    │
│  4. Reshape and handle nodata                                      │
│     Fill NaN regions with fill_value_default                       │
│     Transpose to (C, H, W) if multi-band                           │
│                                                                    │
└────────────────────────────────────────────────────────────────────┘

A few details that bite:


6. The three high-level entry points

All three return a GeoTensor with a real affine transform. Pick by what you have on hand:

read_to_crs(data, lons, lats, resolution_dst, dst_crs=..., crs="EPSG:4326", method="cubic")

You have (data, lons, lats) and a target resolution. Auto-computes the output bounds (from footprint(lons, lats)), the output shape, and a figure_out_transform. Default dst_crs is the auto-detected UTM zone.

Use this for the first orthorectification of a scene with no preferred grid.

read_reproject_like(data, lons, lats, data_like, method="cubic", ...)

You have (data, lons, lats) and want the result on another raster’s grid (e.g., to coregister an EMIT scene with a Sentinel-2 scene). Reads data_like.transform, data_like.crs, data_like.shape and dispatches to reproject.

Use this when stacking heterogeneous sensors onto a common grid.

reproject(data, lons, lats, width, height, transform, dst_crs, crs="EPSG:4326", fill_value_default=-1, method="cubic")

The full-control entry point. You provide the output grid explicitly. Both wrappers above ultimately call this.

Use this for production pipelines where the output grid is fixed by upstream config (e.g., a tile catalog).

There’s also get_shape_transform_crs(lons, lats, resolution_dst, ...) — the inverse-design helper that read_to_crs uses internally. Useful when you want to plan a grid before committing to the slow interpolation step.


7. The fast path: georreference(glt, data)

georreference(glt: GeoTensor, data: NDArray,
              valid_glt: Optional[NDArray] = None,
              fill_value_default: Optional[Union[int, float]] = None) -> GeoTensor

glt carries the transform and CRS (because it’s a GeoTensor); data is the raw sensor array. The function:

  1. Checks the valid_glt mask (or computes it from glt != fill_value_default).

  2. Allocates the output array with shape inferred from glt.shape[1:] plus data’s leading dims.

  3. Issues output[..., valid_glt] = data[..., glt[1, valid_glt], glt[0, valid_glt]] — a single fancy-index gather.

Speed scales with output pixel count and band count, not with input scene size. A 285-band EMIT scene orthorectifies in seconds vs minutes for cubic interpolation.

Source: griddata.py:473.


8. Footprint helper

footprint(lons: NDArray, lats: NDArray) -> Polygon

Builds a Shapely polygon for an irregular grid. Picks the four extremum points (argmin/argmax of lon and lat in raveled form) and connects them. Approximate — not the actual hull — but fast and good enough for AOI-overlap checks.

For exact hulls use shapely.MultiPoint(zip(lons.ravel(), lats.ravel())).convex_hull. The fast version is used inside read_to_crs to derive output bounds.


9. Function reference

FunctionReturnsUse for
reproject(data, lons, lats, width, height, transform, dst_crs, ...)GeoTensorfull-control interpolation
read_to_crs(data, lons, lats, resolution_dst, ...)GeoTensorauto-grid interpolation
read_reproject_like(data, lons, lats, data_like, ...)GeoTensormatch another raster’s grid
get_shape_transform_crs(lons, lats, resolution_dst, ...)(shape, transform, crs)plan before interpolating
meshgrid(transform, width, height, source_crs=..., dst_crs=...)(xs, ys) arraysbuild query points
georreference(glt, data, valid_glt=..., fill_value_default=...)GeoTensorGLT-based orthorectification
footprint(lons, lats)Polygonquick irregular-grid hull

METHOD_DEFAULT = "cubic" is the module-level default for all interpolation entry points.


10. Sharp edges


11. How the curvilinear sensor readers use this module

The geotoolz plan is to keep this division: GLT-equipped sensors get the fast path, everyone else gets cubic interpolation. The griddata module is the substrate; the readers are the per-sensor wrappers.


12. Why this matters for geotoolz

Three concrete things downstream:

  1. Hyperspectral operators (matched filter, ACE, RX, unmixing) consume orthorectified GeoTensors. The interpolation step happens at read time, inside the sensor reader. By the time geotoolz.hyperspectral.MatchedFilter runs, it’s just a GeoTensor like any other — the curvilinear-ness is invisible.

  2. The cost asymmetry shapes the pipeline. Interpolating a 285-band EMIT scene cubically is minutes. Operators that re-read the scene (e.g., parameter sweeps) should load() once and reuse, not ortho-on-demand.

  3. GLT support is sensor-specific. A geotoolz.presets.emit.EMIT_METHANE_MF preset can rely on the fast GLT path; the equivalent EnMAP/PRISMA preset cannot. This shows up as different default behaviour in those presets — not a bug, a substrate constraint.

Next chapter: Mosaic — combining multiple rasters into seamless composites with reprojection and nodata fill.