Skip to article frontmatterSkip to article content
Site not loading correctly?

This may be due to an incorrect BASE_URL configuration. See the MyST Documentation for reference.

REST APIs — a background primer

UNEP
IMEO
MARS

REST APIs: A Background Primer

A companion to the STAC primer. STAC API is itself a REST API, so this document is both the foundation under STAC and the broader framework that explains why some catalogs aren’t STAC-compliant — and what you do when they aren’t.


Part 1 — ELI5: What is a REST API?

The restaurant analogy

You walk into a restaurant. You sit down. The waiter comes over with a menu. You point at “burger, medium rare, fries.” A while later the waiter brings the burger. You eat, you pay, you leave.

That entire interaction is a REST exchange:

If you wanted dessert, you’d make a new order (“brownie”). If you wanted to change your burger to well-done halfway through, you’d issue a new request. The waiter doesn’t carry mental notes about your preferences from yesterday — every request must say everything the kitchen needs to know to make the dish.

That’s REST. It’s a way to interact with a server where every request is self-contained and the server’s job is to look at the request, do the work, and return a response. No memory, no session, no implicit state.

You already use REST every day

When you load a webpage, your browser sends a request like “GET this URL” to the server, the server responds with HTML, and your browser renders it. When you click a link, that’s another request. When you fill out a form and click submit, that’s a request with form data attached.

The web is REST. The HTTP protocol that makes browsers work is the same protocol that makes REST APIs work. The only difference between “the web” and “a REST API” is who’s on the other end of the response — a human reading HTML, or a program parsing JSON.

This is the deep reason REST won: it didn’t invent anything. It just said “use HTTP the way HTTP was designed to be used.” Everything else (servers, caches, load balancers, proxies, browsers, debuggers) already worked.

A simpler analogy: filing requests at a government office

You fill out a form (the request). You hand it to the clerk at window 3 (the URL endpoint). The clerk does what the form says — registers your car, issues a permit, looks up a record (the operation, determined by the form type — GET, POST, PUT, DELETE). They hand you back paperwork (the response) with a stamp showing it worked (status code 200) or a rejection slip (status code 400 or 500).

If you need to file another request, you fill out another form. The clerk doesn’t remember you from earlier. Every interaction is one form, one outcome.

What REST is not


Part 2 — The shape of REST

The four moving parts of every REST interaction

       ┌──────────────────────────────────────┐
       │                                      │
       │            REQUEST                   │
       │                                      │
       │   ┌────────┐                         │
       │   │ VERB   │  GET                    │   "What action?"
       │   └────────┘                         │
       │   ┌────────┐                         │
       │   │ URL    │  /collections/sentinel-2-l2a/items/S2A_...
       │   └────────┘                         │   "What resource?"
       │   ┌────────┐                         │
       │   │HEADERS │  Authorization: Bearer eyJ...
       │   │        │  Accept: application/json
       │   └────────┘                         │   "Auth, format, etc."
       │   ┌────────┐                         │
       │   │ BODY   │  (empty for GET)        │   "Payload, for writes"
       │   └────────┘                         │
       │                                      │
       └──────────────────────────────────────┘
                          │
                          ▼
       ┌──────────────────────────────────────┐
       │                                      │
       │            RESPONSE                  │
       │                                      │
       │   ┌────────┐                         │
       │   │ STATUS │  200 OK                 │   "Did it work?"
       │   └────────┘                         │
       │   ┌────────┐                         │
       │   │HEADERS │  Content-Type: application/json
       │   │        │  X-RateLimit-Remaining: 99
       │   └────────┘                         │
       │   ┌────────┐                         │
       │   │ BODY   │  { "id": "S2A_...", ... }
       │   └────────┘                         │   "The actual data"
       │                                      │
       └──────────────────────────────────────┘

That’s every REST call. The whole protocol fits in this box.

The verbs (HTTP methods)

REST inherits HTTP’s verbs and assigns them meaning:

VerbMeansIdempotent?Has body?
GETRead a resource. Should have no side effects.YesNo
POSTCreate a new resource, or trigger an action.NoYes
PUTReplace a resource entirely with the request body.YesYes
PATCHPartially modify a resource.UsuallyYes
DELETERemove a resource.YesNo
HEADLike GET, but only return headers.YesNo
OPTIONSAsk what verbs/headers an endpoint supports.YesNo

Idempotent means: making the same request twice has the same effect as making it once. GET /scenes/123 returns the same scene whether you call it 1 time or 100 times. POST /scenes creates a new scene every time you call it — not idempotent.

In remote sensing APIs, you’ll see ~95% GETs (reading catalog data) and ~5% POSTs (submitting search queries with bodies too big for a URL, or triggering processing jobs). PUT/PATCH/DELETE are rare unless you’re publishing to a catalog you own.

URLs as resource identifiers

The core REST idea about URLs is that a URL identifies a “thing” (a resource), not an “action.”

Good REST URL structure:

GET    /collections                              → list of collections
GET    /collections/sentinel-2-l2a               → one specific collection
GET    /collections/sentinel-2-l2a/items         → list of items in that collection
GET    /collections/sentinel-2-l2a/items/S2A_... → one specific item
POST   /collections/sentinel-2-l2a/items         → create a new item
DELETE /collections/sentinel-2-l2a/items/S2A_... → delete that item

Notice that the URL describes what, not how. The verb describes the how. This is the “REST way.” There’s no /getSentinel2Items or /createNewItem URL — those would be RPC-style (“call this function”), not REST-style (“act on this resource”).

In practice, many real APIs blend styles. CMR’s /search/granules.json is RPC-flavored. Sentinel Hub’s /process endpoint is action-oriented. Don’t let the purity discussion distract you — the practical question is always “does this endpoint do what I need,” not “is it pure REST.”

Status codes

The response status code tells you how it went. Categories:

The 401/403 distinction is genuinely useful and often misused. 401 means “I don’t know who you are” — your token is missing, expired, or malformed. 403 means “I know who you are, but you don’t have permission for this” — your token is valid but lacks the scope. If you’re debugging an auth issue and you see 403, don’t go regenerating tokens — your problem is permissions, not credentials.

The 429 case matters for catalogs: nearly every public STAC API will rate-limit you if you hammer it. Look for Retry-After and X-RateLimit-* headers in the response and back off accordingly.

Headers

Headers carry metadata about the request or response. The ones you’ll see constantly:

Request headers you send:

Response headers you receive:


Part 3 — The constraints that make something “REST”

REST is short for Representational State Transfer, coined by Roy Fielding in his 2000 PhD dissertation. Fielding wasn’t inventing a protocol; he was describing the architectural style behind the web that had already won. He named six constraints. If your API follows all six, it’s “RESTful.”

In practice, most APIs follow ~4 of them. The community has stopped policing this.

The six constraints

  1. Client-server. The client (your code) and server (the catalog) are separate. Either can evolve independently as long as the interface stays compatible.

  2. Stateless. Every request contains everything the server needs to handle it. The server doesn’t store session state. If you need to authenticate, you send the auth on every request — not once at the start.

  3. Cacheable. Responses must mark themselves cacheable or not. Caches (browsers, CDNs, intermediate proxies) can store responses and serve them again without bothering the server. This is why CDN-fronted catalogs (Planetary Computer, Earth Search) are fast even under heavy load.

  4. Uniform interface. Everyone uses the same verbs, the same URL conventions, the same status codes. You can use the same HTTP client (curl, requests, httpx) against any REST API. This constraint is the reason REST won — uniformity at the protocol level meant infrastructure could be shared.

  5. Layered system. Between you and the actual server there might be caches, load balancers, authentication proxies, rate limiters. You can’t tell. As far as your code is concerned, you’re talking to one server.

  6. Code on demand (optional). The server can send code (JavaScript) that the client executes. Browsers do this; REST APIs rarely do. Often ignored.

HATEOAS — the constraint everyone skips

There’s a seventh idea inside “uniform interface” that’s important enough to break out: HATEOAS (Hypermedia As The Engine Of Application State). It says: the server should tell the client what it can do next by including links in its responses.

A response from a “true” REST API doesn’t just give you data; it gives you data plus links to related resources and available actions:

{
  "id": "S2A_10SEG_20240615_0_L2A",
  "datetime": "2024-06-15T18:42:13Z",
  "links": [
    {"rel": "self",       "href": "https://.../items/S2A_..."},
    {"rel": "parent",     "href": "https://.../collections/sentinel-2-l2a"},
    {"rel": "thumbnail",  "href": "https://.../thumbnail.jpg"},
    {"rel": "next",       "href": "https://.../items?token=abc"}
  ]
}

The client can follow parent to navigate up, next for pagination, thumbnail for the preview — without knowing the URL structure ahead of time. The server tells you what’s possible by handing you the URLs.

STAC takes HATEOAS seriously. Every STAC object has a links array, and a well-behaved STAC client crawls links rather than constructing URLs by hand. This is why static STAC catalogs work — you give a client a root URL and it can discover everything by following links.

Most non-STAC REST APIs ignore HATEOAS and require you to read documentation and build URLs by hand. This is fine in practice but means the client has to know more about the API ahead of time.

The Richardson Maturity Model

A useful (informal) way to grade how “RESTful” an API is:

Nobody actually cares about the level — the model is just useful for diagnosing what your API is missing. Level 2 is good enough for almost everything.


Part 4 — REST APIs in remote sensing

The geospatial world is full of REST APIs. Some are STAC-compliant; many aren’t. Roughly four categories of what they do:

4a. Data search / discovery APIs

These find scenes matching your query. Examples:

The key thing to internalize: STAC API is one of many ways to do data discovery over REST. A provider that offers STAC almost always has a non-STAC native API as well (often older), and the non-STAC API often has features the STAC view doesn’t expose (e.g., ordering, processing options, provider-specific metadata).

4b. Processing APIs (compute on demand)

These don’t just find data — they process it server-side and return results.

Processing APIs trade flexibility for convenience. If you just need NDVI over a small AOI, Sentinel Hub Process returns it in one HTTP call. If you need to do something they don’t support (e.g., your own retrieval algorithm), you fall back to downloading data and processing locally.

For methane work, processing APIs are usually not the right fit — your algorithms are too specialized and the providers don’t host the dependencies (PyTorch, JAX, your specific NumPyro models). Discovery + local processing is the standard pattern.

4c. Tasking APIs (commercial satellites)

For commercial constellations (Planet, Maxar, ICEYE, Capella, Umbra, BlackSky), you can order new acquisitions over their REST API. You POST a JSON describing the target area, acquisition window, quality requirements, and you get back an order ID. You poll for status. Eventually the imagery shows up in their catalog.

These are POST-heavy APIs with complex state machines (order placed → accepted → scheduled → acquired → processed → delivered). They almost always include async notification mechanisms (webhooks, polling endpoints) since tasking can take hours to days.

4d. Tile / serving APIs (visualization)

For rendering imagery on web maps:

These exist in the ecosystem but rarely matter for scientific ML work — you’d use them in a dashboard, not in a training pipeline.

How REST compares to OGC’s older standards

Before REST won, the OGC (Open Geospatial Consortium) had its own family of standards that did similar things in different ways:

These services use HTTP but in a more verbose, XML-heavy, less idiomatic way (often everything is GET with KVP query strings, response is GML/XML). They’re still widely deployed in government GIS infrastructure but are gradually being replaced by:

If you see XML-based WMS/WFS/WCS in the wild today, it’s usually legacy government infrastructure. Modern catalogs go REST/JSON.


Part 5 — Authentication: where the practical pain lives

REST authentication is where 80% of the operational complexity in EO pipelines actually lives. The recipe files you have already touch on this; this section gives the systematic view.

5a. No authentication

Some catalogs are fully public: Element 84 Earth Search, NASA CMR-STAC (for search — assets need auth), VEDA, Digital Earth Africa/Australia, Maxar Open Data, Capella Open, Umbra Open. You just hit the URL.

import httpx
response = httpx.get("https://earth-search.aws.element84.com/v1/collections")

When this works, life is easy. When it doesn’t, you fall into one of the patterns below.

5b. API keys

The simplest authenticated pattern: you sign up, get a long string, send it on every request. Two common placements:

# As a header (most common)
httpx.get(url, headers={"Authorization": f"ApiKey {API_KEY}"})
httpx.get(url, headers={"X-API-Key": API_KEY})

# As a query parameter (less secure — keys end up in logs)
httpx.get(f"{url}?api_key={API_KEY}")

Used by: Radiant MLHub, Carbon Mapper, many smaller providers. Trivial to use, no token refresh, no expiry handling. Just don’t commit them to git.

5c. HTTP Basic Auth

You send a username and password (base64-encoded) on every request.

import httpx
response = httpx.get(url, auth=(username, password))
# Sends: Authorization: Basic base64(username:password)

Used by: DLR Geoservice (UMS), some legacy archives, the THREDDS data server. Simple but inflexible — no scoping, no expiry, no revocation without changing the password.

5d. Bearer tokens (the OAuth 2.0 family)

You first authenticate to an authorization server (often a different host) and get a token. Then you send that token on every subsequent request to the resource server.

# Step 1: Get token
token_response = httpx.post(
    "https://auth.example.com/token",
    data={"grant_type": "...", ...},
).json()
access_token = token_response["access_token"]

# Step 2: Use token on data requests
httpx.get(
    "https://data.example.com/items",
    headers={"Authorization": f"Bearer {access_token}"},
)

The differences between OAuth flows are in how step 1 works:

Client Credentials flow — for machine-to-machine. You have a client_id and client_secret (issued by the provider). You POST them to the token endpoint. Used by: Copernicus Data Space (CDSE) for Sentinel Hub, Planet for some operations, most commercial APIs for backend services.

Authorization Code flow — for user-facing applications. The user is redirected to the provider’s login page, authorizes your app, and you receive a code that you exchange for a token. Used when an app acts on behalf of a logged-in user. Rare in scientific pipelines but common in web UIs.

Device Code flow — for headless devices. You display a code to the user, they log in on their phone, your device polls for completion. Used by NASA Earthdata Login in headless contexts.

Refresh tokens — most OAuth flows give you both an access_token (short-lived, ~1 hour) and a refresh_token (long-lived, days to months). When the access token expires you use the refresh token to get a new one without re-authenticating. A robust client handles this automatically — manually re-authenticating every hour is the most common bug in OAuth code.

5e. NASA Earthdata Login (a specific story)

EDL is OAuth 2.0 + bespoke conventions. You log in once (browser or .netrc), get a bearer token, and use it against any NASA DAAC. For S3 direct access you need an extra step:

  1. Authenticate to EDL → bearer token.

  2. Use the bearer token to hit /s3credentials on the relevant DAAC.

  3. That endpoint returns temporary AWS STS credentials (access key, secret, session token, expiry).

  4. Use those STS credentials with boto3/obstore/fsspec for S3 access.

  5. Refresh ~hourly.

This is why earthaccess and obstore.auth.earthdata.NasaEarthdataCredentialProvider exist — they automate the whole loop. Don’t do it by hand if you can avoid it.

5f. AWS SigV4 (when REST meets cloud auth)

S3 and many AWS-flavored services authenticate with Signature Version 4 — a signature derived from your AWS access key, secret, the request method, URL, headers, body, and a timestamp. The signature goes in the Authorization header.

You almost never sign by hand. boto3, obstore, fsspec’s s3fs, and rasterio’s GDAL all do it for you. But you’ll see SigV4 errors in the wild, and recognizing what they mean (“the timestamp on your request is too skewed from the server’s clock,” “your credentials are valid but you don’t have permission to read this bucket”) is useful.

CDSE’s S3-compatible endpoint at eodata.dataspace.copernicus.eu uses SigV4 with CDSE-issued keys.

5g. mTLS (mutual TLS)

In some government and security-sensitive APIs, the client presents a certificate, not just the server. Your code holds a .pem file with a private key, and HTTPS handshake validates you on both sides.

httpx.get(url, cert=("client.pem", "client-key.pem"))

Rare in commercial EO but appears in some defense / national-archive APIs.

The auth decision tree

When approaching a new provider:

  1. Is search public? Yes → just hit it. No → find the auth docs.

  2. Is asset access public? Yes → use HTTPS hrefs directly. No → continue.

  3. What pattern do they use? Look for keywords: “API key” = §5b. “Basic auth” or .netrc = §5c. “OAuth,” “client_id,” “Bearer” = §5d. “Earthdata Login” = §5e. “STS,” “AWS credentials” = §5f.

  4. Is there a Python library that handles it? Almost always yes. earthaccess for NASA, planetary-computer for PC, sentinelhub-py for Sentinel Hub, requests_oauthlib for generic OAuth. Use it.


Part 6 — STAC API as a REST API

Now we can be precise about how STAC fits in.

STAC API is a REST API with a specific schema and specific endpoints. Specifically, it’s:

In Richardson Maturity Model terms, STAC is unusual in actually implementing Level 3 (HATEOAS) — every Item has a links array, every search response has next/prev links, every Collection links to its items. Most REST APIs you’ll encounter don’t bother.

So: everything in this primer applies to STAC API. The auth patterns, the status codes, the headers, the pagination styles — STAC inherits all of it. STAC just adds:

  1. A schema (Items must look like a GeoJSON Feature with specific fields).

  2. A vocabulary (specific extensions for eo, proj, raster, etc.).

  3. A capability (full cross-collection search via /search).

  4. Discoverability (conformance classes + HATEOAS).

If a provider’s REST API doesn’t follow STAC, it’s not because STAC is impossible — it’s usually because the provider’s API predates STAC (CMR, Planet Data API, Sentinel Hub Catalog v1) and the cost of migrating is high. Most providers now offer STAC alongside their legacy API.


Part 7 — When you’d use a non-STAC REST API

In your day-to-day work, you’ll occasionally hit cases where STAC isn’t enough or isn’t available:

Processing services. If you want server-side NDVI computation, mosaicking, or compositing, you need a Process API (Sentinel Hub, OpenEO). STAC has no concept of “compute on this data.”

Tasking. Ordering new acquisitions from Planet, Maxar, ICEYE, Capella, Umbra — these all use provider-specific REST APIs. STAC doesn’t have tasking.

Provider-specific archives that haven’t adopted STAC. Some research data centers (CEDA, some EUMETSAT services) still expose primarily legacy REST or OData. You either use their native API or wait for the STAC view.

Auth and credential exchanges. EDL token endpoints, CDSE OAuth endpoints, AWS STS endpoints — these are all REST but not STAC. You hit them as part of auth setup before doing STAC work.

Internal/private APIs. Your team’s internal “list of processed scenes” API is probably non-STAC REST unless someone built it to be STAC-compliant.

Granule-level operations on NASA data. CMR’s native API exposes things STAC-CMR doesn’t always surface (collection metadata, variable subsetting via Harmony, OPeNDAP links).

For your work specifically (methane, hyperspectral, ocean): the MARS system you’ve worked on, the IMEO operational pipeline, and most plume-detection databases are non-STAC REST. Even when there’s a STAC view, it usually doesn’t cover all of the operational metadata.


Part 8 — Python tooling for REST

The Python REST ecosystem is mature. The libraries you’ll actually use:

requests — the default

import requests
r = requests.get("https://earth-search.aws.element84.com/v1")
r.json()

Synchronous, blocking, the default in nearly every Python tutorial since 2011. Has session objects for connection pooling and persistent auth:

session = requests.Session()
session.headers.update({"Authorization": f"Bearer {token}"})
r = session.get(url)

Good for: scripts, notebooks, anywhere you don’t care about concurrency.

httpx — the modern default

import httpx

# Sync usage — drop-in replacement for requests
r = httpx.get(url)

# Async usage
async with httpx.AsyncClient() as client:
    r = await client.get(url)

Drop-in replacement API, plus async support, plus HTTP/2 support. Use this for new code. It’s what rustac and most modern async tooling expects to interoperate with.

aiohttp — when you need pure async

import aiohttp
async with aiohttp.ClientSession() as session:
    async with session.get(url) as response:
        data = await response.json()

More verbose API than httpx but slightly faster at high concurrency. Use if you’re benchmarking 10k concurrent requests and httpx isn’t keeping up; otherwise prefer httpx.

urllib3 — the low level

What requests is built on. Use directly only when you need control over the underlying connection pool (custom SSL contexts, connection-level retry policies, raw socket access). Rare.

The auth helper libraries

Pagination helpers

REST APIs paginate in several styles; clients must handle them:

Link header pagination (RFC 5988):

Link: <https://.../page2>; rel="next", <https://.../page10>; rel="last"

Parse the Link header, follow rel="next" until it’s missing.

Cursor/token pagination: Response includes "next_page_token": "abc123". Pass it as ?page_token=abc123 in the next request.

Offset/limit pagination: Response includes "total": 1000, you request ?offset=0&limit=100, then ?offset=100&limit=100, etc.

STAC’s pagination: Links array in the response includes a rel: "next" link with a fully-formed URL. Just follow it. pystac-client and rustac handle this automatically.

A common gotcha: don’t assume the page size you request is what you get. Most APIs cap it at 100 or 1000. Always look at what the server actually returned.


Part 9 — Common gotchas

1. Pagination is silent. If you call a search endpoint without pagination handling, you’ll get the first page (often 10 or 100 items) and not realize you missed the rest. STAC has 50,000 matching items? You get 10 unless you iterate. pystac-client and rustac handle this; manual requests.get(url).json()["features"] does not.

2. Status codes lie sometimes. Some APIs return 200 OK with an error message in the body. Always check both the status code and the response body shape. Look for error or errors keys before assuming success.

3. Rate limiting is unforgiving. Hit a STAC API with for i in range(10000): requests.get(...) and you’ll get banned for an hour. Look at X-RateLimit-Remaining headers; back off when it gets low. Better: use a library that does this for you (httpx with retry transport, tenacity for exponential backoff).

4. Tokens expire silently mid-pipeline. You start a long-running analysis with a fresh token, three hours later your access token has expired, and suddenly every request returns 401. Robust pipelines refresh tokens proactively or wrap requests in retry-with-refresh logic. The provider’s SDK usually does this; hand-rolled clients usually don’t.

5. Stateless means stateless. “I logged in earlier in this script” is meaningless to the server. Every request must carry auth. (HTTP cookies are a session-mimicking layer, but most APIs use bearer tokens and require them on every request.)

6. Error response shapes are inconsistent. Provider A returns {"error": "..."}. Provider B returns {"errors": [{"code": "...", "message": "..."}]}. Provider C returns a free-text message. Standardize at the client boundary.

7. Idempotency matters for retries. If a POST fails, can you safely retry it? Maybe — depends on whether the server treats it idempotently. Some APIs support Idempotency-Key headers (Stripe-style); most don’t. Retrying POSTs on timeouts can create duplicate orders / duplicate jobs.

8. Content negotiation pitfalls. A server might return JSON if you send Accept: application/json and XML if you send Accept: application/xml. Most modern APIs default to JSON if you don’t specify, but the safe move is always to explicitly set Accept.

9. URL construction footguns. Trailing slashes matter on some servers (/items/items/). Query string ordering doesn’t matter logically but can affect caching (CDNs sometimes cache by exact URL). When in doubt, copy the URL from the docs.

10. The “REST” label tells you almost nothing. “Our API is RESTful” can mean anything from Level 1 RPC-over-HTTP to a fully HATEOAS-compliant Level 3 API. Always read the docs, never assume conventions.

11. Sync vs async mismatch in libraries. requests.get() blocks. httpx.AsyncClient().get() returns a coroutine. Mixing them in the same code is the most common cause of “why is my async code synchronous” confusion. Pick one for a given module.

12. Compression is on by default but worth checking. Servers will gzip responses if you send Accept-Encoding: gzip (httpx and requests both do this automatically). If you’re piping raw responses through middleware, make sure the middleware doesn’t strip the encoding header without decompressing.


Part 10 — REST in the GeoStack

Where REST sits in the overall picture:

                          ┌──────────────────────────────────────────┐
                          │              SCIENCE / ML LAYER          │
                          │   JAX  ·  PyTorch  ·  scikit-learn       │
                          │   plumax  ·  somax  ·  gpyroX  ·  gaussx │
                          └──────────────────▲───────────────────────┘
                                             │  jnp.ndarray / tensors
                          ┌──────────────────┴───────────────────────┐
                          │            ARRAY / LABELED-ARRAY         │
                          │     xarray  ·  rioxarray  ·  dask        │
                          │     odc-stac  ·  stackstac  ·  zarr      │
                          └──────────────────▲───────────────────────┘
                                             │  lazy dask-backed DataArrays
                          ┌──────────────────┴───────────────────────┐
                          │              RASTER I/O                  │
                          │   rasterio (GDAL)  ·  kerchunk  ·  h5py  │
                          └──────────────────▲───────────────────────┘
                                             │  byte streams / file handles
                          ┌──────────────────┴───────────────────────┐
                          │           OBJECT-STORE LAYER             │
                          │   obstore  ·  fsspec  ·  boto3           │
                          │   (auth via REST token endpoints!)       │
                          └──────────────────▲───────────────────────┘
                                             │  signed hrefs / credentials
                          ┌──────────────────┴───────────────────────┐
                          │       CATALOG / METADATA LAYER           │
                          │   rustac  ·  pystac  ·  pystac-client    │
                          │   stac-geoparquet  ·  Arrow / DuckDB     │
                          │                                          │
                          │   Non-STAC alternatives:                 │
                          │   earthaccess (CMR)  ·  sentinelhub-py   │
                          │   planet-sdk  ·  raw httpx/requests      │
                          └──────────────────▲───────────────────────┘
                                             │  REST API calls
                          ┌──────────────────┴───────────────────────┐
                          │              REST API LAYER              │
                          │                                          │
                          │   STAC APIs:                             │
                          │     Planetary Computer · Earth Search    │
                          │     CMR-STAC · CDSE STAC                 │
                          │     DLR Geoservice · ...                 │
                          │                                          │
                          │   Non-STAC REST APIs:                    │
                          │     NASA CMR native · CDSE OData         │
                          │     Sentinel Hub Process · OpenEO        │
                          │     Planet Data · Maxar eAPI             │
                          │     EUMETSAT Data Store · ...            │
                          │                                          │
                          │   Auth endpoints (also REST):            │
                          │     EDL · CDSE OAuth · AWS STS · ...     │
                          └──────────────────────────────────────────┘

REST is everywhere in the lower half of the stack. It’s the foundation under STAC, under auth, under object stores’ credential exchanges. Every arrow between layers that crosses a network boundary is almost certainly a REST call.

The mental model that matters: STAC is a specific kind of REST. The auth flow is a different specific kind of REST. Object store APIs are yet another kind of REST. Different schemas, same underlying protocol. Once you can read HTTP requests fluently — verb, URL, headers, body, status, response body — you can debug anything in this layer of the stack.


Part 11 — REST vs STAC: a direct comparison

AspectGeneric REST APISTAC API
ProtocolHTTPHTTP
FormatAnything (usually JSON)JSON only
SchemaProvider-specificStrictly specified (Catalog, Collection, Item)
EndpointsProvider-specificStandardized (/collections, /search, etc.)
Filter languageProvider-specificCQL2 (standardized)
PaginationMultiple stylesLink-based (HATEOAS)
Conformance discoveryUsually none/conformance endpoint
Cross-API toolingCustom per providerUniversal (pystac-client, rustac)
HATEOASRarelyYes
VersioningProvider-specificSTAC version negotiated via stac_version
Asset descriptionsFree-formStandardized assets dict with roles, types
Spatial queryProvider-specificbbox, intersects
Temporal queryProvider-specificdatetime (ISO 8601, range syntax)

When STAC wins: multi-provider workflows, reproducible pipelines, scripting against many catalogs, anything where uniform tooling pays off.

When generic REST wins: provider-specific features (tasking, processing, ordering, provenance details), legacy catalogs that haven’t adopted STAC, internal/private APIs, anything that needs functionality outside STAC’s metadata-only scope.

In practice: most modern EO workflows are mostly-STAC with a few non-STAC REST calls for auth and for provider-specific operations. Your plumax pipeline will likely look like: STAC for discovery, EDL REST for auth, S3 (REST-flavored) for byte access, your code for everything above that.


Where to go from here

REST is a less glamorous foundation than STAC — it predates the modern geospatial stack and has no special connection to it. But that’s exactly the point: REST won because it’s general, and STAC inherits its universality. Once REST is comfortable, everything in the catalog and auth layers stops feeling like mystery and starts feeling like variations on one theme.