Skip to article frontmatterSkip to article content
Site not loading correctly?

This may be due to an incorrect BASE_URL configuration. See the MyST Documentation for reference.

spectral_indices — index catalog and transformation engine

Apply named spectral transformations (NDVI, NDWI, NBR, EVI, …) to scenes and time-series across any sensor whose bands you have. Browse a catalog of ~250 named indices; build composed formulas from the algebra; export results as derived bands.

Status: design only. No code in this PR.

Question answered

“Given this AOI and this sensor, what does index X look like — as a map for one scene, and as a time-series across many scenes?”

This is the daylight-to-dusk applied tool: most users of remote sensing data spend their time computing and comparing indices, and right now there isn’t a clean local app for it — they either reach for Earth Engine or write per-index notebooks by hand.

Scope

Algorithm

1. Index registry: load awesome-spectral-indices/spectral-indices.json
       into a dict keyed by short_name. Each entry has formula,
       bands, domain (vegetation/water/fire/...), and reference.
2. Sensor → band-to-wavelength mapping: maintained as a small per-
       sensor dict in our own sensors.py (overlays satellite_viewer's
       registry).
3. Resolve an index against a sensor: substitute the index's named
       band slots (B, G, R, RE1, ..., N, S1, S2) with the sensor's
       actual asset/file keys. If any required band is missing, raise.
4. Read the required bands at the AOI bbox via georeader (or
       odc-stac depending on stack choice).
5. Evaluate the index formula on the band arrays. The formula is a
       small arithmetic expression — safe to eval through `numexpr`
       or `asteval`, never raw eval().
6. Output: derived band raster at the union extent of the input bands,
       resampled to the coarsest input resolution.
7. (v2 only) repeat per scene in the catalog, stack along time dim.

Stages

vWhat it does
v0Pick sensor + AOI + scene + index → derived band map.
v1Searchable index catalog: filter by domain (vegetation / water /
fire / soil / snow / urban / generic), see formula + bands + citation.
v2Time-series mode: catalog of N scenes (12-month window default) →
derived band per scene → chart of the AOI-mean over time.
v3Formula builder: user types (NDVI - NDWI) / (NDVI + NDWI) and it
gets parsed → safely evaluated → mapped, same as named indices.

You ship a useful tool after v1; v2/v3 are the depth.

Architecture

projects/spectral_indices/
├── pyproject.toml
├── README.md
├── data/
│   └── spectral-indices.json   # pinned snapshot of the registry
├── src/spectral_indices/
│   ├── __init__.py
│   ├── registry.py     # load_indices(), get(name), filter_by_domain(...)
│   ├── sensors.py      # per-sensor band→wavelength + asset-key map
│   ├── resolve.py      # match an index's band slots to a sensor
│   ├── evaluate.py     # safe formula evaluation on ndarrays
│   ├── reader.py       # AOI + bands → ndarray (delegates to georeader)
│   └── series.py       # v2: catalog → time series
├── tests/
│   ├── test_registry.py
│   ├── test_resolve.py
│   ├── test_evaluate.py    # hand-check NDVI / NDWI on synthetic arrays
│   └── test_series.py
└── apps/
    └── (Panel / Streamlit / notebook subapp; see Stack options)

geotoolz is the natural place for evaluate.py — each named index becomes a tiny Operator subclass and the formula builder becomes Sequential / Graph composition. But that’s a refactor decision, not blocking.

Output schema

For v0/v1 (single scene): an xarray.DataArray:

dims    : (y, x)
coords  : y, x   (in the sensor's UTM CRS)
attrs   : index, sensor, scene_id, formula, bands_used, units

For v2 (time series): xarray.DataArray with extra time dim, or a DataFrame for the AOI-mean trace:

columns : [datetime, sensor, scene_id, index, value_mean, value_std,
           value_p10, value_p90, n_valid_pixels]

Stack options

Each named index is an Operator subclass returning a single band:

class NDVI(Operator):
    def _apply(self, gt):
        nir, red = gt.values[NIR_IDX], gt.values[RED_IDX]
        return gt.array_as_geotensor((nir - red) / (nir + red + 1e-10))
    def get_config(self):
        return {"nir_idx": NIR_IDX, "red_idx": RED_IDX}

Composed formulas are Sequential / Graph. Time series via geocatalog + a per-scene Operator application loop. Reads via georeader.

Pro: showcases the entire geotoolz + geocatalog + georeader stack — this is the canonical use case. Reproducibility via Operator.get_config(). Con: writing all 250 indices as Operators by hand is silly. Better: one generic IndexOp that takes a name and looks up the formula from the registry, then evaluates it dynamically.

Option B — Earth Engine

ee.Image.expression("(N - R) / (N + R)", {"N": img.select("B8"), ...}) evaluates band-math in EE natively. Combine with the eemont extension which has the same awesome-spectral-indices catalog wired in:

img = (
    ee.ImageCollection("COPERNICUS/S2_SR")
    .filterBounds(aoi)
    .filterDate(t0, t1)
    .map(lambda img: img.spectralIndices(["NDVI", "NDWI", "NBR"]))
)

Pro: zero infrastructure, every sensor EE has is pre-supported, the index catalog is already wired. Implementation is ~50 lines. Con: hosted, EE’s terms-of-use, doesn’t help build out your own stack. Best as a comparison / sanity-check tool, not the primary app.

Option C — odc-stac + xarray directly

Skip both geotoolz and georeader:

items = stac_client.search(...).items()
ds = odc.stac.load(items, bands=["B04", "B08"], bbox=aoi.bounds)
ndvi = (ds.B08 - ds.B04) / (ds.B08 + ds.B04 + 1e-10)

Pro: industry-standard stack, plays well with dask + zarr, every modern Python remote-sensing project uses this shape. Con: less reproducibility metadata than Operator.get_config(); doesn’t showcase your stack.

Option D — GDAL band-math via gdal_calc.py

Shell-tool / CLI approach. Each index becomes a saved GDAL --calc expression.

Pro: zero Python at compute time; great for batch jobs that just need to land COGs on disk. Con: no interactivity, no time series, no UI.

My recommendation: A as the library, with C as the time-series back-end if geocatalog scaling becomes a bottleneck. B as a notebook sibling for “what does Earth Engine say?” cross-check.

UI integration

Two natural shapes:

  1. Notebook (always ship this): a spectral_indices.compute(aoi, sensor, index, t0, t1) helper that returns an xarray.DataArray, plus a .plot() recipe. Time-series via .compute_series(...).
  2. Streamlit / Panel app: sidebar = sensor + AOI + date + index picker (with searchable catalog) + formula editor; main = map + time-series chart + formula display + citation.

The catalog browser is the headline feature:

+---------------------+----------------------------------------+
| Domain  : water  ▾  |  Search: "modified"                    |
| Sensor  : sent.-2 ▾ |  Filter: sensors compatible only       |
+---------------------+----------------------------------------+
| MNDWI - Modified Normalized Difference Water Index           |
|   formula : (G - S1) / (G + S1)                              |
|   bands   : G=B3, S1=B11   (Sentinel-2)                      |
|   citation: Xu (2006)                                        |
|   [Apply to AOI]                                             |
+--------------------------------------------------------------+
| WI2015 - Water Index 2015                                    |
| ...                                                          |
+--------------------------------------------------------------+

Compute budget

Risks / open questions

Acceptance

Out of scope