pixel_spectra — pixel-scale spectral inspector

Given a single pixel, a multi-point selection, or a small polygon AOI, pull the per-pixel reflectance spectrum from a hyperspectral scene, overlay a reference spectrum, and score similarity with a chosen distance metric. Built for vetting candidate detections (methane plumes, mineral identifications, snow / vegetation verification) before downstream analysis.

Status: design only. No code in this PR.

Question answered

“Does the pixel I clicked on (or the patch I drew) actually look like the thing I think it is — and how close is it?”

This is upstream of the methane_pod workflow: before fitting a point-process to a list of plume detections, you’d open each candidate’s AOI here, sanity-check that the spectrum matches a methane-enhanced reference (Mag1c-style absorption signature, or a HITRAN-derived synthetic), and reject false positives. The same tool works for any “is this thing X?” question over a hyperspectral scene.

Scope

Algorithm

1. Discover scenes intersecting the AOI for the chosen sensor
       (delegate to satellite_viewer.search; the SENSORS registry
       already has emit-l2a-rfl).
2. For each scene:
     a. Sign asset hrefs (planetary-computer or earthaccess token).
     b. Read the full reflectance cube clipped to the AOI bbox via
        georeader → ndarray of shape (n_bands, h, w) plus the band
        wavelengths from the asset's metadata.
     c. Convert AOI geometry → pixel-coordinate list (snap/contains).
3. Build a pixel-spectra DataFrame: one row per (scene, pixel),
       columns = [lon, lat, datetime] + 285 reflectance values.
4. Reference spectrum: resample to scene's wavelength grid (linear
       interp, drop bands with no overlap).
5. Apply distance metric per pixel against the reference → one float
       per (scene, pixel).
6. (Optional) Mask non-usable bands: EMIT bands inside the atmospheric
       water-vapor absorption features around 1.4 μm and 1.9 μm (~80 of
       the 285 bands).
       Mask choice is a knob exposed in the UI.
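
Steps 4 and 6 can be sketched with plain NumPy. This is a minimal illustration rather than the planned implementation: the function names, the synthetic wavelength grid, and the blocked wavelength ranges are all assumptions made for the example, and the real mask edges would come from the chosen mask_id.

```python
import numpy as np

def resample_reference(ref_wl, ref_vals, scene_wl):
    """Linearly interpolate a reference spectrum onto the scene's band centres.

    Returns (values, overlap). `overlap` is False for scene bands that fall
    outside the reference's wavelength range; np.interp would silently clamp
    those to the edge values, so they must be dropped by the caller.
    """
    ref_wl = np.asarray(ref_wl, dtype=float)
    ref_vals = np.asarray(ref_vals, dtype=float)
    scene_wl = np.asarray(scene_wl, dtype=float)
    order = np.argsort(ref_wl)  # np.interp requires increasing x
    on_grid = np.interp(scene_wl, ref_wl[order], ref_vals[order])
    overlap = (scene_wl >= ref_wl.min()) & (scene_wl <= ref_wl.max())
    return on_grid, overlap

def usable_band_mask(scene_wl, blocked=((1.32, 1.50), (1.78, 2.00))):
    """True for usable bands. The blocked ranges (in μm) are illustrative only."""
    wl = np.asarray(scene_wl, dtype=float)
    usable = np.ones(wl.shape, dtype=bool)
    for lo, hi in blocked:
        usable &= ~((wl >= lo) & (wl <= hi))
    return usable

# Synthetic stand-ins: a 285-band scene grid and a finer reference spectrum.
scene_wl = np.linspace(0.38, 2.50, 285)
ref_wl = np.linspace(0.40, 2.45, 500)
ref_vals = 0.3 + 0.1 * np.sin(4.0 * ref_wl)

ref_on_grid, overlap = resample_reference(ref_wl, ref_vals, scene_wl)
keep = overlap & usable_band_mask(scene_wl)  # bands passed on to the metric
reference = ref_on_grid[keep]
```

The same `keep` index would be applied to every pixel spectrum before step 5, so that the pixel and the reference are compared over identical bands.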

Stages

Stage  What it does
v0     Point AOI → spectrum vs. reference, side-by-side plot. No metric.
v1     Add distance metric: single number per pixel, alongside the plot.
v2     Polygon / MultiPoint AOI → per-pixel metric histogram + mean spectrum
       with uncertainty (±σ) band; “good vs. bad” pixel scatter on the map.
v3     Bulk vetting: upload a GeoJSON / CSV of candidate detection AOIs, run
       the per-pixel metric over each, return a sorted table with score + the
       best-matching scene per candidate.

Ship after v2 for interactive use; v3 turns the tool into a vetting pipeline.
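
A rough sketch of the v2 summary and the v3 ranking, under stated assumptions: the function names are hypothetical, lower scores are taken to mean a better match (true for SAM), and the per-scene score is the median of the per-pixel distances, which is one possible choice and not something this design has settled.

```python
import numpy as np

def summarize_aoi(spectra, distances, bins=10):
    """v2 summary: mean spectrum with a ±σ band plus a histogram of scores."""
    spectra = np.atleast_2d(np.asarray(spectra, dtype=float))  # (n_pixels, n_bands)
    mean = spectra.mean(axis=0)
    sigma = spectra.std(axis=0)  # population σ across pixels, per band
    counts, edges = np.histogram(np.asarray(distances, dtype=float), bins=bins)
    return {"mean": mean, "lower": mean - sigma, "upper": mean + sigma,
            "counts": counts, "edges": edges}

def rank_candidates(scores):
    """v3 ranking.

    `scores` maps candidate_id -> {scene_id: per-pixel distances}.
    Returns (candidate_id, best_scene_id, score) tuples, best match first.
    """
    rows = []
    for candidate_id, scenes in scores.items():
        per_scene = [(scene_id, float(np.median(d))) for scene_id, d in scenes.items()]
        best_scene, best_score = min(per_scene, key=lambda item: item[1])
        rows.append((candidate_id, best_scene, best_score))
    return sorted(rows, key=lambda row: row[2])

summary = summarize_aoi([[0.1, 0.2], [0.3, 0.4]], distances=[0.05, 0.15], bins=2)
ranking = rank_candidates({
    "cand-a": {"scene-1": [0.3, 0.5], "scene-2": [0.1, 0.2]},
    "cand-b": {"scene-3": [0.05]},
})
```

Here `ranking` lists cand-b first, and cand-a is paired with scene-2 because that scene has the lower median distance.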

Architecture

projects/pixel_spectra/
├── pyproject.toml
├── README.md
├── src/pixel_spectra/
│   ├── __init__.py
│   ├── sensors.py        # HSI-only subset of the satellite_viewer registry
│   ├── reader.py         # AOI → pixel-spectra DataFrame (via georeader)
│   ├── references.py     # library loader (USGS splib, ECOSTRESS), interp helpers
│   ├── metrics.py        # SAM, SID, cosine, Euclidean, matched_filter
│   └── bulk.py           # the v3 vetting pipeline
├── tests/
│   ├── test_metrics.py   # numerical correctness vs. hand-computed cases
│   ├── test_reader.py    # AOI → pixel-coord conversion (offline)
│   └── test_references.py# library loader + resample
└── apps/                 # whichever stack we pick from "Stack options"

Repos reused: satellite_viewer (discovery + credentials), georeader (windowed reads). Nothing from the geotoolz / geopatcher / geocatalog stack is needed at this scale (single scene, ≤ few hundred pixels).
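
To make the metrics.py entry concrete, here is a minimal NumPy sketch of four of the listed metrics, with the kind of hand-computed cases test_metrics.py could hold. The function names and the METRICS registry are assumptions, "cosine" is interpreted here as cosine distance (1 minus cosine similarity), and the matched filter is left out because it needs a background covariance estimate that this document does not specify.

```python
import numpy as np

def _rows(spectra):
    """Coerce input to a 2-D float array of shape (n_pixels, n_bands)."""
    return np.atleast_2d(np.asarray(spectra, dtype=float))

def cosine_similarity(spectra, reference):
    x, r = _rows(spectra), np.asarray(reference, dtype=float)
    # Note: an all-zero spectrum gives a zero norm and therefore NaN.
    return (x @ r) / (np.linalg.norm(x, axis=1) * np.linalg.norm(r))

def sam(spectra, reference):
    """Spectral angle in radians; invariant to per-pixel brightness scaling."""
    return np.arccos(np.clip(cosine_similarity(spectra, reference), -1.0, 1.0))

def cosine_distance(spectra, reference):
    return 1.0 - cosine_similarity(spectra, reference)

def euclidean(spectra, reference):
    return np.linalg.norm(_rows(spectra) - np.asarray(reference, dtype=float), axis=1)

def sid(spectra, reference, eps=1e-12):
    """Spectral information divergence: symmetric KL divergence between
    spectra normalised to sum to 1 (values are clipped at eps first)."""
    x = np.clip(_rows(spectra), eps, None)
    r = np.clip(np.asarray(reference, dtype=float), eps, None)
    p = x / x.sum(axis=1, keepdims=True)
    q = r / r.sum()
    return np.sum(p * np.log(p / q), axis=1) + np.sum(q * np.log(q / p), axis=1)

# Hypothetical registry keyed by the strings used in the `metric` column.
METRICS = {"SAM": sam, "SID": sid, "cosine": cosine_distance, "Euclidean": euclidean}

# Hand-computed cases: orthogonal vectors, a 45-degree pair, a 3-4-5 triangle.
angle_orthogonal = METRICS["SAM"]([[1.0, 0.0]], [0.0, 1.0])[0]   # pi / 2
angle_diagonal = METRICS["SAM"]([[1.0, 1.0]], [1.0, 0.0])[0]     # pi / 4
length_345 = METRICS["Euclidean"]([[3.0, 4.0]], [0.0, 0.0])[0]   # 5
sid_scaled = METRICS["SID"]([[2.0, 4.0, 6.0]], [1.0, 2.0, 3.0])[0]  # 0 for a scaled copy
```

Each function takes an (n_pixels, n_bands) array and returns one float per pixel, which matches the "one float per (scene, pixel)" shape the algorithm expects.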

Output schema

A long-format pandas DataFrame:

scene_id      : string         # STAC item id
pixel_lon     : float64        # WGS84 lon of the pixel centre
pixel_lat     : float64        # WGS84 lat of the pixel centre
datetime      : datetime[UTC]  # acquisition time
n_bands       : int            # how many bands made it past masking
distance      : float64        # the chosen metric vs. reference
spectrum      : object         # list[float] of length n_bands (or
                               # path to a per-row Zarr for large dumps)
reference_id  : string         # which reference spectrum was used
metric        : string         # "SAM" / "SID" / "cosine" / ...
mask_id       : string         # which band-mask was applied
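
For illustration, a frame with this schema could be assembled as below. The helper name make_rows and all sample values are hypothetical, and the sketch uses the in-memory list form of the spectrum column rather than the per-row Zarr path.

```python
import numpy as np
import pandas as pd

def make_rows(scene_id, lons, lats, when, spectra, distances,
              reference_id, metric, mask_id):
    """Build one long-format row per pixel for a single scene."""
    spectra = np.atleast_2d(np.asarray(spectra, dtype=float))  # (n_pixels, n_bands)
    frame = pd.DataFrame({
        "scene_id": scene_id,
        "pixel_lon": np.asarray(lons, dtype="float64"),
        "pixel_lat": np.asarray(lats, dtype="float64"),
        "datetime": pd.to_datetime(when, utc=True),  # tz-aware UTC timestamp
        "n_bands": spectra.shape[1],                 # bands left after masking
        "distance": np.asarray(distances, dtype="float64"),
        "spectrum": [row.tolist() for row in spectra],
        "reference_id": reference_id,
        "metric": metric,
        "mask_id": mask_id,
    })
    # Scalars above are broadcast to every row; pin the text columns to "string".
    text_cols = ["scene_id", "reference_id", "metric", "mask_id"]
    return frame.astype({col: "string" for col in text_cols})

df = make_rows(
    scene_id="example-scene",
    lons=[-105.10, -105.09], lats=[40.01, 40.01],
    when="2024-06-01T18:30:00Z",
    spectra=[[0.10, 0.20, 0.30], [0.12, 0.21, 0.29]],
    distances=[0.03, 0.05],
    reference_id="example-reference", metric="SAM", mask_id="default",
)
```

Frames from several scenes can then be combined with pd.concat and sorted on distance.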

Stack options

Different ways to build this, in increasing order of how much exists already vs. how much you control.

Option A: build on what’s already in research_notebook

Pro: zero new dependencies, lights up the existing stack, plays nicely with the planned satellite_climatology work. Con: georeader’s HSI ergonomics on EMIT specifically still need a micro-bench (it was designed against S2/L8 first).

Option B — rioxarray + pystac-client directly

Skip georeader; open each EMIT asset as an xarray Dataset via rioxarray.open_rasterio(..., chunks={"band": 32}) and use xarray’s windowed indexing.

Pro: industry-standard pipeline, plays well with dask if it grows. Every Python remote-sensing engineer can read it. Con: more boilerplate; loses the georeader abstraction for non-COG sensors you might add later (PRISMA HDF5).

Option C — Earth Engine

ee.ImageCollection("EMIT/L2A").filterBounds(aoi).getRegion(...) returns pixel values directly.

Pro: no asset signing, no scene reads, no infrastructure. Con: EMIT may not be ingested in EE (check at start of M1); EE also can’t handle 285-band image returns cleanly — you’d hit element limits. Probably a dead-end for HSI at this scale.

Option D — Notebook recipe only

No app, just a Jupyter notebook + a pixel_spectra.read_aoi(...) helper. Click in JupyterLab via ipyleaflet Draw control. Spectrum plot inline.

Pro: shortest path to a working demo. Matches your geostack notebook pattern. Con: no bulk vetting UI; the v3 pipeline becomes a script, not an app.

My recommendation: A for the library + a notebook at v0/v1 (option D’s shape) + a Panel/Streamlit subapp at v2/v3 (option A’s infra under a UI). Ship v0 as the notebook in week 1; promote to a real app once the metric set stabilises.

UI integration

Single-screen layout:

+---------------------------+--------------------------------------+
| Sensor   : emit-l2a-rfl   |  [ leafmap basemap ]                 |
| AOI      : draw or upload |  AOI overlay + selected pixels (dots)|
| Date     : range slider   |                                      |
| Reference: library | up.. |                                      |
| Metric   : SAM ▾          |                                      |
| Mask     : default ▾      |                                      |
| [ Inspect ]               |                                      |
+---------------------------+--------------------------------------+
| Spectrum panel  (selected pixel + reference, with mask shaded)   |
+------------------------------------------------------------------+
| Metric panel    (histogram for multi-pixel AOI, or scalar)       |
+------------------------------------------------------------------+
| Bulk table      (v3 only: sorted scores per candidate)           |
+------------------------------------------------------------------+

Click on the map → highlight the pixel → re-render the spectrum panel against the same reference.
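
The click-to-pixel step could look roughly like this. It is a sketch that assumes the cube has already been placed on a regular north-up grid in the same coordinate system as the click; the function names and grid parameters are hypothetical, and in practice georeader or the raster's affine transform would supply them.

```python
import math

def lonlat_to_rowcol(lon, lat, x_min, y_max, pixel_w, pixel_h, shape):
    """Map a clicked coordinate to (row, col), or None if it falls off the grid.

    (x_min, y_max) is the outer corner of the top-left pixel; pixel_w and
    pixel_h are positive pixel sizes; shape is (height, width).
    """
    col = math.floor((lon - x_min) / pixel_w)
    row = math.floor((y_max - lat) / pixel_h)  # rows count downward from the top
    height, width = shape
    if 0 <= row < height and 0 <= col < width:
        return row, col
    return None

def rowcol_to_center(row, col, x_min, y_max, pixel_w, pixel_h):
    """Coordinates of the pixel centre, for drawing the highlight marker."""
    return x_min + (col + 0.5) * pixel_w, y_max - (row + 0.5) * pixel_h

grid = dict(x_min=10.0, y_max=50.0, pixel_w=0.5, pixel_h=0.5)
hit = lonlat_to_rowcol(10.6, 49.2, shape=(4, 4), **grid)
center = rowcol_to_center(*hit, **grid)
miss = lonlat_to_rowcol(9.0, 49.0, shape=(4, 4), **grid)
```

The (row, col) pair indexes the cached (n_bands, h, w) cube directly, so re-rendering the spectrum panel needs no new read.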

Compute budget

Memory: one EMIT scene clipped to a 1 km AOI is ~30 MB float32. Trivial.
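
A back-of-envelope helper for checking the memory figure. The 60 m pixel size is an assumption about EMIT's ground sampling, the square window is a simplification, and the function name is hypothetical.

```python
import math

def cube_bytes(aoi_side_m, pixel_size_m=60.0, n_bands=285, bytes_per_value=4):
    """Bytes needed for a square AOI window read as a float32 cube."""
    side_px = math.ceil(aoi_side_m / pixel_size_m)
    return side_px * side_px * n_bands * bytes_per_value

aoi_1km = cube_bytes(1_000)        # 17 x 17 pixels
window_160px = cube_bytes(9_600)   # 160 x 160 pixels
```

Under these assumptions the bare pixel payload for a 1 km AOI is about 0.33 MB, while a read of roughly 160 × 160 pixels comes to about 29 MB. The cube therefore fits comfortably in memory even if the reader fetches whole chunks around the AOI.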

Risks / open questions

Acceptance

Out of scope