Community Attribute Service — harmonized access to global geospatial attribute datasets
CAS — Community Attribute Service
Harmonized access to global geospatial attribute datasets (DEM, soil, land cover, climate, vegetation) through a community-driven, open-source passthrough service.
CAS is not a data warehouse: it is a QC layer and one-stop shop that pulls from upstream providers on demand, validates responses, and returns harmonized results.
CAS ships 228 active providers spanning DEM/elevation, soil, land cover, hydrology, vegetation/canopy, climate/water-balance, geology, and biodiversity — including global/flagship datasets plus 169 national/regional providers across 38 countries (incl. MapBiomas land cover for Amazonia, Chaco, Pampa, Bolivia, Colombia, Peru, Paraguay, Uruguay and Venezuela in South America, and Indonesia in South-East Asia). Every provider listed below is registered in the runtime connector registry and exercised by the end-to-end health sweep; run cas providers to see the live list.
Status: Alpha (v0.1.0)
Statement of need
Large-sample hydrology depends on harmonized catchment attributes — terrain, soil, land cover, climate, and geology summarized over thousands of basins — as popularized by CAMELS-style datasets. Assembling such attributes today still means writing bespoke, per-dataset extraction scripts: every provider exposes a different protocol (WCS, STAC+COG, OPeNDAP, Zarr), grid, projection, and no-data convention, and the resulting one-off pipelines are rarely reusable or comparable across studies. CAS replaces that with a single interface for harmonized, quality-controlled zonal attribute extraction across 200+ providers: given a geometry and dataset identifiers, it fans out to the upstream services, subsets server-side, computes zonal statistics, applies QC (range, coverage, cross-provider consistency), and returns uniform results with provenance and citations. It is aimed at hydrologists, land-surface modelers, and large-sample studies that need reproducible attribute datasets without maintaining their own extraction code.
Quick Start
pip install "community-attribute-service[stac]"
The PyPI distribution is named community-attribute-service; the package you import is still cas and the CLI command is still cas. To work from a source checkout instead:
git clone https://github.com/DarriEy/CAS.git && cd CAS
pip install -e ".[dev,stac]"
# List registered providers
cas providers
# List available datasets from a provider
cas datasets -p isric_soilgrids
# Extract mean clay content for a polygon
cas extract \
-g '{"type":"Polygon","coordinates":[[[-96.6,39],[-96.5,39],[-96.5,39.1],[-96.6,39.1],[-96.6,39]]]}' \
-d isric_soilgrids:clay_0-5cm
# Cross-provider DEM comparison
cas extract \
-g @my_catchment.geojson \
-d copernicus_dem:elevation \
-d usgs_3dep:elevation \
-d nasadem:elevation \
-d alos_dem:elevation
# Multi-attribute extraction
cas extract \
-g @my_catchment.geojson \
-d copernicus_dem:elevation \
-d isric_soilgrids:clay_0-5cm \
-d esa_worldcover:land_cover
# Run health checks
cas health
Note for reviewers: commands that contact live providers (cas extract, cas health, cas verify, the -m network tests) can occasionally fail due to transient upstream outages outside CAS's control; the daily CI health sweep compared against the committed baseline (health/baseline.json) is the mitigation that separates real regressions from provider downtime.
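The inline polygon in the first cas extract example is a plain lon/lat bounding box. A minimal sketch for generating such a geometry string is shown here; bbox_to_polygon is a hypothetical helper for illustration, not part of CAS.

```python
import json


def bbox_to_polygon(west: float, south: float, east: float, north: float) -> dict:
    """Return a GeoJSON Polygon for a lon/lat bounding box."""
    if west >= east or south >= north:
        raise ValueError("expected west < east and south < north")
    ring = [
        [west, south],
        [east, south],
        [east, north],
        [west, north],
        [west, south],  # GeoJSON rings repeat the first position to close
    ]
    return {"type": "Polygon", "coordinates": [ring]}


# Compact JSON string that can be pasted after `cas extract -g`
geometry_arg = json.dumps(
    bbox_to_polygon(-96.6, 39, -96.5, 39.1), separators=(",", ":")
)
print(geometry_arg)
```

The printed string reproduces the polygon used in the Quick Start example.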
API
pip install "community-attribute-service[api,stac]"
uvicorn cas.api.app:create_app --factory --reload
POST /api/v1/extract — Extract attributes for a geometry
POST /api/v1/extract/batch — Extract attributes for many geometries
GET /api/v1/datasets — List available datasets (paginated)
GET /api/v1/providers — List registered providers (paginated)
GET /api/v1/providers/{slug} — One provider with full dataset metadata
GET /health — Liveness check + result-cache stats
GET /metrics — Prometheus metrics exposition
GET /docs — Interactive OpenAPI docs
The endpoints are typed with Pydantic response models, so /openapi.json and
/docs are a complete, always-in-sync description of the service. The full
228-provider catalog is discoverable over HTTP: list GET /api/v1/providers,
then drill into GET /api/v1/providers/{slug} for resolution, bbox, license,
citation, and variables.
/datasets and /providers accept limit (1–1000, default 100) and offset
query params and return {total, limit, offset, count, ...}. Catalog responses
are served from an in-memory metadata cache (TTL CAS_METADATA_CACHE_TTL_S).
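A client can walk the whole catalog by advancing offset until total is reached. The sketch below assumes each page is a dict shaped like {total, limit, offset, count, providers: [...]}; the providers key and the fetch_page callable are assumptions for illustration, and fake_fetch stands in for the real HTTP call so the example runs offline.

```python
from typing import Callable, Iterator, List


def iter_catalog(
    fetch_page: Callable[[int, int], dict], items_key: str, limit: int = 100
) -> Iterator[dict]:
    """Yield every item of a limit/offset-paginated listing.

    fetch_page(limit, offset) is a hypothetical callable returning one
    parsed JSON page.
    """
    if not 1 <= limit <= 1000:
        raise ValueError("limit must be between 1 and 1000")
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        items = page[items_key]
        yield from items
        offset += len(items)
        # Stop on an empty page or once every item has been seen
        if not items or offset >= page["total"]:
            break


# Offline stand-in for GET /api/v1/providers?limit=...&offset=...
_FAKE: List[dict] = [{"slug": f"provider_{i}"} for i in range(228)]


def fake_fetch(limit: int, offset: int) -> dict:
    chunk = _FAKE[offset:offset + limit]
    return {
        "total": len(_FAKE),
        "limit": limit,
        "offset": offset,
        "count": len(chunk),
        "providers": chunk,
    }


slugs = [p["slug"] for p in iter_catalog(fake_fetch, "providers", limit=100)]
print(len(slugs))
```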
Every response carries an X-Request-ID header; errors use a consistent envelope:
{"error": {"type": "request_limit", "message": "...", "request_id": "abc123"}}
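For callers that talk to the HTTP API without the SDK, the envelope can be turned into an exception along these lines. This is a sketch only: ApiError and raise_for_envelope are hypothetical names, and the SDK's own cas.client.CASError may be implemented differently.

```python
import json
from typing import Optional


class ApiError(Exception):
    """Hypothetical exception carrying the fields of the error envelope."""

    def __init__(self, status_code: int, error_type: str, message: str,
                 request_id: Optional[str]) -> None:
        super().__init__(f"{status_code} {error_type}: {message}")
        self.status_code = status_code
        self.error_type = error_type
        self.message = message
        self.request_id = request_id


def raise_for_envelope(status_code: int, body: str,
                       header_request_id: Optional[str] = None) -> None:
    """Raise ApiError for non-2xx responses; return None otherwise."""
    if 200 <= status_code < 300:
        return
    try:
        err = json.loads(body).get("error", {})
    except (ValueError, AttributeError):
        err = {}  # body was not JSON, or not a JSON object
    if not isinstance(err, dict):
        err = {}
    raise ApiError(
        status_code,
        err.get("type", "unknown"),
        err.get("message", body),
        # Fall back to the X-Request-ID header when the body has no id
        err.get("request_id", header_request_id),
    )


body = '{"error": {"type": "request_limit", "message": "too many datasets", "request_id": "abc123"}}'
try:
    raise_for_envelope(422, body)
except ApiError as exc:
    caught = exc
    print(exc.error_type, exc.request_id)
```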
Configuration
All runtime config is read from CAS_-prefixed environment variables. Hardening
features are off by default — the same image runs internal or public depending
only on env:
| Variable | Default | Purpose |
|---|---|---|
| CAS_PROVIDER_TIMEOUT_S | 30 | Per-provider extraction deadline (slow upstream → warning) |
| CAS_REQUEST_TIMEOUT_S | 120 | Whole-request backstop deadline |
| CAS_MAX_DATASETS_PER_REQUEST | 50 | Reject oversized requests (422) |
| CAS_MAX_POLYGON_VERTICES | 10000 | Reject overly complex geometries (422) |
| CAS_RESULT_CACHE_TTL_S / CAS_RESULT_CACHE_MAX_ENTRIES | 600 / 10000 | Result cache tuning |
| CAS_METADATA_CACHE_TTL_S | 3600 | Catalog cache TTL |
| CAS_CORS_ORIGINS | * | Allowed origins (comma-separated or JSON) |
| CAS_AUTH_ENABLED / CAS_API_KEYS | false / (none) | Require X-API-Key from a comma-separated allowlist |
| CAS_RATE_LIMIT_ENABLED / CAS_RATE_LIMIT_PER_MINUTE | false / 60 | Per-caller fixed-window rate limit (per process) |
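To illustrate how CAS_-prefixed variables with defaults can be resolved, here is a minimal sketch covering a few of the settings above. Settings and load_settings are hypothetical names, the accepted boolean spellings are an assumption, and CAS's real settings loader may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Tuple


@dataclass(frozen=True)
class Settings:
    """Hypothetical subset of the documented settings, with their defaults."""
    provider_timeout_s: float = 30.0
    request_timeout_s: float = 120.0
    max_datasets_per_request: int = 50
    auth_enabled: bool = False
    api_keys: Tuple[str, ...] = ()


def _as_bool(raw: str) -> bool:
    # Assumption: these spellings count as "true"
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read CAS_-prefixed variables from env, falling back to defaults."""
    d = Settings()
    return Settings(
        provider_timeout_s=float(env.get("CAS_PROVIDER_TIMEOUT_S", d.provider_timeout_s)),
        request_timeout_s=float(env.get("CAS_REQUEST_TIMEOUT_S", d.request_timeout_s)),
        max_datasets_per_request=int(
            env.get("CAS_MAX_DATASETS_PER_REQUEST", d.max_datasets_per_request)
        ),
        auth_enabled=_as_bool(env.get("CAS_AUTH_ENABLED", "false")),
        # Comma-separated allowlist, ignoring empty entries
        api_keys=tuple(
            k.strip() for k in env.get("CAS_API_KEYS", "").split(",") if k.strip()
        ),
    )


# Passing a dict instead of os.environ keeps the example deterministic
settings = load_settings({"CAS_AUTH_ENABLED": "true", "CAS_API_KEYS": "key1,key2"})
print(settings)
```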
Deployment (Docker)
docker build -t cas-api .
docker run -p 8000:8000 \
-e CAS_AUTH_ENABLED=true -e CAS_API_KEYS=key1,key2 \
-e CAS_RATE_LIMIT_ENABLED=true \
cas-api
The rate limiter is in-memory and per-process; for multi-replica deployments, enforce limits at the ingress/gateway instead.
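Why the limit does not carry across replicas is easiest to see in code. The following is a minimal sketch of a per-caller fixed-window limiter, not CAS's actual implementation; FixedWindowLimiter and its injectable clock are hypothetical. All state lives in a plain dict inside one process, so each replica would count independently.

```python
import time
from typing import Callable, Dict, Tuple


class FixedWindowLimiter:
    """Allow at most per_minute calls per caller in each 60-second window."""

    def __init__(self, per_minute: int = 60,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.per_minute = per_minute
        self.clock = clock
        # caller -> (window index, hits in that window); process-local state
        self._counts: Dict[str, Tuple[int, int]] = {}

    def allow(self, caller: str) -> bool:
        window = int(self.clock() // 60)
        seen_window, hits = self._counts.get(caller, (window, 0))
        if seen_window != window:
            hits = 0  # a new window started, so the counter resets
        if hits >= self.per_minute:
            self._counts[caller] = (window, hits)
            return False
        self._counts[caller] = (window, hits + 1)
        return True


# A fake clock makes the window rollover reproducible
now = [0.0]
limiter = FixedWindowLimiter(per_minute=2, clock=lambda: now[0])
first_window = [limiter.allow("key1") for _ in range(3)]
now[0] = 61.0  # move into the next 60-second window
after_reset = limiter.allow("key1")
print(first_window, after_reset)
```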
Python API (embedded)
import cas is the supported interface for using CAS in-process — no service
to deploy. Build a request, extract, iterate results:
import cas

cas.configure(provider_timeout_s=60)  # optional: override env-based settings

request = cas.BatchAttributeRequest(
    geometries=[{"type": "Point", "coordinates": [-96.5, 39.0]}],
    dataset_ids=["copernicus_dem:elevation", "isric_soilgrids:clay_0-5cm"],
)
batch = cas.batch_extract_sync(request)
for resp in batch.responses:
    for r in resp.results:
        print(r.dataset_id, r.value, r.units, r.quality)
Async callers can await cas.extract(...) / await cas.batch_extract(...)
directly. See the Python API docs
for the full blessed surface (cas.__all__).
Python SDK (HTTP client)
cas.client is a typed wrapper over the HTTP API of a deployed service (ships
with the core package, no extra needed). It returns the same cas.core.models
types the service uses, and offers both a synchronous and an asynchronous
client.
from cas.client import CASClient

with CASClient("http://localhost:8000") as cas:
    # Discover the catalog over HTTP
    for p in cas.providers(limit=1000).providers:
        print(p.slug, p.protocol)
    detail = cas.provider("copernicus_dem")  # full dataset metadata

    # Extract (geometry accepts a GeoJSON geometry or Feature)
    resp = cas.extract(
        geometry={"type": "Point", "coordinates": [-96.5, 39.0]},
        dataset_ids=["copernicus_dem:elevation", "isric_soilgrids:clay_0-5cm"],
    )
    for r in resp.results:
        print(r.dataset_id, r.value, r.units, r.quality)
Non-2xx responses raise cas.client.CASError, carrying the parsed error
envelope (status_code, error_type, message, request_id). An async
AsyncCASClient mirrors the same methods.
Documentation
Full documentation (quick start, HTTP API, SDK guide + reference, CLI, provider catalog, architecture) is built with MkDocs:
pip install -e ".[docs]"
mkdocs serve # http://localhost:8000
mkdocs build # static site → ./site
The site is published to GitHub Pages on every push to main via
.github/workflows/docs.yml.
Architecture
Geometry in → CAS engine → fan out to providers → server-side subset → zonal stats → QC → results out
- Passthrough: No data storage. Every request goes to the upstream provider.
- Plugin connectors: Each provider is a self-contained module with a @register decorator.
- Protocol mixins: WCS, STAC+COG, OPeNDAP; composed into connectors via multiple inheritance.
- Zonal statistics: Continuous (mean/median/min/max/std) and categorical (majority/distribution).
- QC validation: Range checks, coverage thresholds, cross-provider consistency.
- Daily CI health checks: Verify providers are up with known test polygons.
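The zonal statistics and QC steps listed above can be sketched with the standard library alone. This is an illustration under simplifying assumptions, not CAS's engine: cells are unweighted (a real implementation would typically weight by cell/polygon overlap), None marks no-data, std is the population standard deviation, and the function names, thresholds, and sample values are all made up for the example.

```python
import math
import statistics
from collections import Counter
from typing import Dict, List, Optional, Sequence, Tuple


def continuous_stats(values: Sequence[Optional[float]]) -> dict:
    """Mean/median/min/max/std over valid cells, plus coverage fraction."""
    valid = [v for v in values if v is not None and not math.isnan(v)]
    if not valid:
        return {"count": 0, "coverage": 0.0}
    return {
        "count": len(valid),
        "coverage": len(valid) / len(values),
        "mean": statistics.fmean(valid),
        "median": statistics.median(valid),
        "min": min(valid),
        "max": max(valid),
        "std": statistics.pstdev(valid),
    }


def categorical_stats(classes: Sequence[Optional[int]]) -> dict:
    """Majority class and per-class fraction over valid cells."""
    valid = [c for c in classes if c is not None]
    if not valid:
        return {"majority": None, "distribution": {}}
    counts = Counter(valid)
    return {
        "majority": counts.most_common(1)[0][0],
        "distribution": {c: n / len(valid) for c, n in counts.items()},
    }


def qc_flags(stats: dict, valid_range: Tuple[float, float],
             min_coverage: float = 0.8) -> List[str]:
    """Range and coverage checks on the output of continuous_stats."""
    flags = []
    if stats["coverage"] < min_coverage:
        flags.append("low_coverage")
    lo, hi = valid_range
    if stats["count"] and not (lo <= stats["min"] and stats["max"] <= hi):
        flags.append("out_of_range")
    return flags


def inconsistent_providers(means: Dict[str, float], tolerance: float) -> List[str]:
    """Cross-provider check: providers far from the median of all providers."""
    center = statistics.median(means.values())
    return sorted(p for p, m in means.items() if abs(m - center) > tolerance)


elev = continuous_stats([312.0, 318.5, None, 305.0, 321.5])
cover = categorical_stats([10, 10, 30, None, 10, 40])
flags = qc_flags(elev, valid_range=(-500.0, 9000.0))
outliers = inconsistent_providers(
    {"copernicus_dem": 314.2, "nasadem": 315.0, "alos_dem": 352.0}, tolerance=10.0
)
print(elev["mean"], cover["majority"], flags, outliers)
```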
Implemented Providers
CAS registers 228 active providers. The tables below list the headline global/flagship
datasets per category; the national breadth (169 national/regional providers across 38 countries)
is summarized in National providers by country. The complete
machine-readable catalog (resolution, bbox, license, variables) lives in
inventory/providers.yaml and is regenerated with cas export-inventory. Get the live list any
time with cas providers.
Roughly by category: DEM/Elevation ~46, Soil ~46, Land Cover ~46, Hydrology/Water ~38, Vegetation/Canopy ~20, Geology ~7, plus Biodiversity/Ecology and other thematic layers.
DEM / Elevation — global & flagship
| Provider | Slug | Resolution | Coverage | Access |
|---|---|---|---|---|
| Copernicus DEM GLO-30 | copernicus_dem | 30m | Global | Open |
| Copernicus DEM GLO-90 | cop_dem_90 | 90m | Global | Open |
| USGS 3DEP | usgs_3dep | 10m | US | Public domain |
| NASADEM (SRTM) | nasadem | 30m | 56S–60N | Public domain |
| ALOS World 3D | alos_dem | 30m | Global | JAXA (research) |
| ASTER GDEM v3 | aster_gdem | 30m | Global | NASA Earthdata login |
| ArcticDEM | arctic_dem | 10m | >50N | Open (PGC) |
| REMA (Antarctica) | rema | 8m | <53S | Open (PGC) |
| ETOPO 2022 (topo+bathy) | etopo_2022 | ~2km | Global | Open (NOAA) |
| GEBCO Bathymetry | gebco | 500m | Global | Open (GEBCO) |
| OpenTopography | opentopography | 30–90m | Global | API key (free) |
| MERIT DEM | merit_dem | 90m | Global | Registration (CC-BY-NC) |
| TanDEM-X 90m | tandem_x | 90m | Global | Registration (DLR) |
Plus ~32 national/regional DEMs (Australia, Canada HRDEM, Japan GSI, and most of Europe).
Soil — global & flagship
| Provider | Slug | Resolution | Coverage | Access |
|---|---|---|---|---|
| ISRIC SoilGrids 2.0 | isric_soilgrids | 250m | Global | CC-BY-4.0 |
| SoilGrids derived (OCS / WRB) | soilgrids_derived | 250m | Global | CC-BY-4.0 |
| OpenLandMap | openlandmap | 250m | Global | CC-BY-SA-4.0 |
| SSURGO / gNATSGO | ssurgo, gnatsgo | 30m | US | Public domain |
| POLARIS | polaris | 30m | US | CC-BY-NC-4.0 |
| SLGA | slga | 90m | Australia | CC-BY-4.0 |
Plus ~40 national soil products (Germany soil-quality/texture/water/erosion suite, Ireland, France, Netherlands, Nordics, Brazil, India, Mexico, Argentina, and more).
Land Cover — global & flagship
| Provider | Slug | Resolution | Coverage | Access |
|---|---|---|---|---|
| ESA WorldCover | esa_worldcover | 10m | Global | CC-BY-4.0 |
| ESA CCI Land Cover | esa_cci_lc | 300m | Global | ESA CCI |
| Dynamic World | dynamic_world | 10m | Global | CC-BY-4.0 (GEE) |
| Esri 10m LULC | esri_lulc | 10m | Global | CC-BY-4.0 |
| Impact Observatory LULC | io_lulc | 10m | Global | CC-BY-4.0 |
| Impact Observatory LULC (9-class) | io_lulc_9class | 10m | Global | CC-BY-4.0 |
| MODIS MCD12Q1 | modis_lc | 500m | Global | Open (NASA) |
| GHSL Human Settlement | ghsl | 100m | Global | CC-BY-4.0 |
| MS Building Footprints | ms_buildings | ~1m | Global | ODbL |
| CORINE Land Cover | corine_lc | 100m | Europe | Copernicus |
| NLCD | nlcd | 30m | US | Public domain |
Plus ~35 national/regional land-cover & cropland products (USDA CDL, NRCan, DEA Africa, and national maps for ~25 countries).
Hydrology / Water (~38)
merit_hydro (global flow/accumulation), jrc_gsw (Global Surface Water), modis_snow,
permafrost (global), ramsar_wetlands, hydrobasins (HydroSHEDS upstream drainage area,
Americas), hydrolakes (HydroSHEDS mean lake depth, Central Asia), hydrorivers (HydroSHEDS
nearest/main-river discharge, South America), plus national hydrography, groundwater, aquifer,
flood, lake, catchment and runoff layers (USGS NHD/WBD, Switzerland rivers/glaciers, Ireland,
UK, Spain, Belgium, Finland, Australia water observations).
Vegetation / Canopy (~20)
canopy_height (ETH 10m), hansen_forest (Global Forest Change), chloris_biomass,
hgb_biomass, alos_fnf, plus national forest height/volume/species, fractional cover,
EEA High-Resolution Layers (tree/leaf/woody/grassland), and Norway vegetation/landslide layers.
Geology (~7)
National bedrock, lithology, and quaternary geology for Belgium, Estonia, Greece, Norway, Portugal, and Spain.
Climate / Water Balance
terraclimate (global ~4 km monthly TerraClimate via Planetary Computer, Zarr datacube):
potential & actual evapotranspiration, climatic water deficit, soil moisture, precipitation,
runoff, Palmer Drought Severity Index, temperature, plus a derived UNEP aridity index.
Biodiversity / Ecology & other
biodiversity (global Biodiversity Intactness), mobi (US biodiversity importance),
brazil_biomes, hrea (electricity access), mtbs (US burn severity).
National providers by country
166 of the 228 providers are country-specific (DEM, soil, land cover, hydrology, geology) across 38 countries, plus trinational MapBiomas land cover for Amazonia, Chaco and Pampa (169 national/regional providers in total). Counts:
| Country | Providers | Country | Providers |
|---|---|---|---|
| Germany | 20 | Czechia | 2 |
| USA | 18 | Denmark | 2 |
| Ireland | 13 | Estonia | 2 |
| Switzerland | 12 | India | 2 |
| Australia | 10 | Lithuania | 2 |
| UK | 10 | Mexico | 2 |
| Norway | 9 | Portugal | 2 |
| Belgium | 7 | Sweden | 2 |
| Spain | 7 | Argentina | 1 |
| Finland | 6 | Austria | 1 |
| Canada | 5 | Colombia | 2 |
| France | 3 | Ethiopia | 1 |
| Netherlands | 5 | Greece | 1 |
| Slovenia | 3 | Indonesia | 2 |
| Brazil | 2 | Japan | 1 |
| Croatia | 2 | Peru | 3 |
| Bolivia | 1 | Paraguay | 1 |
| Uruguay | 1 | Venezuela | 1 |
| Luxembourg, Nigeria | 1 each | | |
Most European, North American, and Australian connectors are open (OGC WCS/WMS or STAC+COG). A handful require free registration — see below.
Providers Requiring Registration
Some providers require free registration before use. CAS will display clear instructions when you attempt to use them without credentials.
OpenTopography (free API key)
Provides server-side subsetting for SRTM, COP30, COP90, NASADEM, AW3D30, EU_DTM.
- Register at https://portal.opentopography.org/
- Go to My Account → API Keys → Request API Key
- Set the key:
export CAS_OPENTOPOGRAPHY_API_KEY=your_key
MERIT DEM (University of Tokyo)
Global 90m hydrologically adjusted DEM (noise, canopy, speckle removed).
- Visit https://hydro.iis.u-tokyo.ac.jp/~yamadai/MERIT_DEM/
- Fill out the registration form
- A download password will be emailed to you
- Set credentials:
export CAS_MERIT_USER=your_email
export CAS_MERIT_PASSWORD=your_password
TanDEM-X 90m (DLR)
Global 90m DEM from radar interferometry. Ellipsoidal heights (WGS84).
- Register at https://sso.eoc.dlr.de/pwm-tdmdem90
- Set credentials:
export CAS_TANDEMX_USER=your_email
export CAS_TANDEMX_PASSWORD=your_password
Google Earth Engine / Dynamic World (Google Cloud)
Global 10m near-real-time land cover from Sentinel-2.
- Create a Google Cloud project at https://console.cloud.google.com/
- Enable the Earth Engine API
- Create a service account with Earth Engine scope
- Download the JSON key file
- Set the path:
export CAS_GEE_SERVICE_ACCOUNT_KEY=/path/to/service-account-key.json
Or authenticate interactively: earthengine authenticate
Finland DEM 2m (Maanmittauslaitos)
National high-resolution DEM from the Finnish Land Survey.
- Register at https://www.maanmittauslaitos.fi/rajapinnat/api-avaimen-ohje
- Create a free API key
- Set the key:
export CAS_MML_API_KEY=your_key
Denmark DEMs (Dataforsyningen / SDFI)
National DHM 0.4m and terrain data from the Danish Agency for Data Supply.
- Register at https://dataforsyningen.dk/
- Create a user and generate an API token
- Set the token:
export CAS_DATAFORSYNINGEN_TOKEN=your_token
Germany DGM200 (BKG)
National 200m DEM from the German Federal Agency for Cartography.
- Register at https://gdz.bkg.bund.de/
- Request a UUID access token (free for open data services)
- Set the token:
export CAS_BKG_UUID=your_uuid
Copernicus CORINE Land Cover (Copernicus Dataspace)
European land cover at 100m from the Copernicus programme.
- Register at https://dataspace.copernicus.eu/
- Generate an API token
- Set the token:
export CAS_COPERNICUS_TOKEN=your_token
Digital Earth Africa (DEA)
Pan-African SRTM derivatives, ESA WorldCover, fractional cover, water observations, cropland extent, and NDVI climatology via WCS.
Access may be restricted by region. If you receive 403 errors:
- Check https://www.digitalearthafrica.org/ for current access policies
- DEA services may require access from African IP ranges or API registration
NASA Earthdata (ASTER GDEM, MODIS)
Some NASA products (e.g. aster_gdem, MODIS via modis_lc) require a free Earthdata login.
- Register at https://urs.earthdata.nasa.gov/
- Set credentials:
export CAS_EARTHDATA_USER=your_username
export CAS_EARTHDATA_PASSWORD=your_password
Adding a Provider
- Create src/cas/connectors/my_provider.py
- Subclass BaseConnector, implement list_datasets() and extract()
- Decorate with @register("my_provider")
- Add entry to inventory/providers.yaml
- Create tests/connectors/test_my_provider.py
@register("my_provider")
class MyProviderConnector(WCSMixin, BaseConnector):
    slug = "my_provider"
    display_name = "My Provider"
    base_url = "https://api.example.com"
    protocol = "wcs"

    async def list_datasets(self) -> list[Dataset]:
        ...

    async def extract(self, dataset_id, geometry, time_range=None) -> AttributeResult:
        ...
For providers requiring registration, use RegistrationRequiredError with clear instructions:
import os

from cas.core.exceptions import RegistrationRequiredError

class MyGatedConnector(BaseConnector):
    def _get_credentials(self):
        # Read the key from the environment; fail with instructions if absent
        key = os.environ.get("CAS_MY_PROVIDER_KEY", "")
        if not key:
            raise RegistrationRequiredError(
                self.slug,
                "https://provider.example.com/register",
                "Register for a free API key, then:\n export CAS_MY_PROVIDER_KEY=your_key",
            )
        return key
Development
pip install -e ".[dev,stac]"
ruff check src/ tests/
mypy src/cas/ --ignore-missing-imports
pytest tests/ -v # unit tests (no network)
End-to-end extraction checks
Tests marked network run a real extract() against live upstream
providers and are excluded from the default run. Each provider is tested
over a coverage-derived test polygon — a small area inside the
provider's own declared coverage (see cas.monitor.geometry_check), so
country-specific connectors are exercised over data they actually serve
rather than a single fixed point.
pytest tests/test_e2e_extract.py -m network -v # sweep all providers
pytest tests/test_e2e_extract.py -m network -k usgs_3dep # one provider
cas health # CLI equivalent: end-to-end sweep + summary
cas health -s usgs_3dep # single provider
cas health --strict # exit non-zero if any provider is down
cas verify # fast endpoint reachability only (no extraction)
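The idea of a coverage-derived test polygon can be sketched as follows. This is an assumption-laden illustration, not the logic in cas.monitor.geometry_check: coverage_test_polygon is a hypothetical helper that places a small square at the centre of a provider's declared bbox, and the sample bbox is invented. The centre of a bounding box is not guaranteed to contain data (it may fall in the sea, for example), so a real check may need a smarter choice.

```python
from typing import Sequence


def coverage_test_polygon(bbox: Sequence[float], half_size_deg: float = 0.05) -> dict:
    """Small square GeoJSON Polygon centred in bbox = (west, south, east, north)."""
    west, south, east, north = bbox
    if west >= east or south >= north:
        raise ValueError("expected west < east and south < north")
    cx, cy = (west + east) / 2, (south + north) / 2
    # Shrink the square when the coverage box is smaller than requested
    dx = min(half_size_deg, (east - west) / 4)
    dy = min(half_size_deg, (north - south) / 4)
    ring = [
        [cx - dx, cy - dy],
        [cx + dx, cy - dy],
        [cx + dx, cy + dy],
        [cx - dx, cy + dy],
        [cx - dx, cy - dy],  # close the ring
    ]
    return {"type": "Polygon", "coordinates": [ring]}


# Invented bbox for a hypothetical national provider
sample_bbox = (5.0, 45.0, 11.0, 48.0)
polygon = coverage_test_polygon(sample_bbox)
print(polygon["coordinates"][0][0])
```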
License
MIT