restgdf

lightweight async Esri REST client with optional GeoPandas extras

Python ≥ 3.9 | async-first | zero mandatory geo dependencies | pydantic-validated responses

import asyncio
from aiohttp import ClientSession
from restgdf import FeatureLayer

async def main():
    async with ClientSession() as session:
        layer = await FeatureLayer.from_url(
            "https://maps1.vcgov.org/arcgis/rest/services/Beaches/MapServer/6",
            session=session,
        )
        print(layer.name, layer.count)
        async for row in layer.stream_rows():
            print(row)
            break

asyncio.run(main())

2.0 release highlights and migration summary

Release highlights

restgdf 2.0.0 includes the following major additions alongside the core typed-model migration described below.

  • Streaming APIs. FeatureLayer.stream_features, stream_feature_batches, and stream_rows expose ArcGIS pagination as async generators. They share three knobs: on_truncation="raise" | "ignore" | "split", order="request" | "completion", and max_concurrent_pages. When telemetry is enabled, they also emit an R-61 feature_layer.stream parent span. stream_gdf_chunks is the legacy GeoDataFrame-per-page shape (requires restgdf[geo], completion-order only, none of the shared knobs). stream_rows works on the base install.
  • Pandas-first output. FeatureLayer.get_df() returns a pandas.DataFrame without requiring the geo extra, sibling to get_gdf().
  • Output adapters. restgdf.adapters.{dict,stream,pandas,geopandas} compose the streaming primitives into tabular shapes.
  • Nested config. restgdf.Config / restgdf.get_config() replace the flat Settings object with eight frozen sub-configs and RESTGDF_<CATEGORY>_<FIELD> env vars. The old flat variables keep working with a DeprecationWarning.
  • Error taxonomy. restgdf.errors exposes RestgdfError, ConfigurationError, OptionalDependencyError, TransportError, RestgdfTimeoutError, RateLimitError, ArcGISServiceError, PaginationError, FieldDoesNotExistError, SchemaValidationError, AuthenticationError, and OutputConversionError — all with URL, status-code, and retry-after context populated where applicable.
  • Optional telemetry. pip install restgdf[telemetry] unlocks RestgdfInstrumentor and trace/span log correlation; see the new tracing recipe and streaming recipe.
  • Header-token default. Tokens now ride the X-Esri-Authorization header by default; set AuthConfig.transport="body" to restore the old behavior.
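
The RESTGDF_<CATEGORY>_<FIELD> naming mentioned under "Nested config" can be pictured with a small standard-library sketch that groups such variables by category. This is not restgdf's loader: the category names in CATEGORIES and the RESTGDF_AUTH_TRANSPORT variable are assumptions made for illustration (only RESTGDF_RESILIENCE_ENABLED appears in this README).

```python
import os
from typing import Dict, Mapping, Optional

PREFIX = "RESTGDF_"

# Hypothetical category names, chosen only for this illustration.
CATEGORIES = ("TRANSPORT", "AUTH", "RESILIENCE")


def parse_nested_env(
    environ: Optional[Mapping[str, str]] = None,
) -> Dict[str, Dict[str, str]]:
    """Group RESTGDF_<CATEGORY>_<FIELD> variables into {category: {field: value}}."""
    environ = os.environ if environ is None else environ
    nested: Dict[str, Dict[str, str]] = {}
    for key, value in environ.items():
        if not key.startswith(PREFIX):
            continue  # unrelated variable
        rest = key[len(PREFIX):]
        for category in CATEGORIES:
            if rest.startswith(category + "_"):
                field = rest[len(category) + 1:].lower()
                nested.setdefault(category.lower(), {})[field] = value
                break
    return nested


example = parse_nested_env(
    {
        "RESTGDF_RESILIENCE_ENABLED": "1",
        "RESTGDF_AUTH_TRANSPORT": "body",  # assumed name, following the pattern
        "PATH": "/usr/bin",                # ignored: no RESTGDF_ prefix
    }
)
print(example)
```

Matching on a known category first keeps field names that themselves contain underscores intact.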

See CHANGELOG.md and MIGRATION.md for the full release notes and upgrade guidance.

2.0 migration changes

restgdf 2.0 is a major release built on pydantic 2.13. See MIGRATION.md for the full breaking-changes table and code-rewrite recipes.

  • Typed responses. FeatureLayer.metadata, Directory.metadata / .services / .report, and helpers like get_metadata and safe_crawl now return pydantic models instead of raw dicts.
  • Validated envelopes. get_feature_count, get_object_ids, and token refresh surface malformed ArcGIS payloads as a typed RestgdfResponseError (with model_name, context, raw).
  • Schema-drift observability. Vendor variance in permissive payloads (metadata, crawl) is logged through the opt-in restgdf.schema_drift logger instead of silently KeyError-ing.
  • Redacted credentials. AGOLUserPass.password is a pydantic.SecretStr, so passwords never appear in repr() output or logs.
  • Centralized settings. Settings / get_settings() reads RESTGDF_* environment variables (chunk size, timeout, user agent, token URL, refresh threshold, etc.).
  • Migration helpers. restgdf.compat.as_dict and as_json_dict convert any returned model back to a plain dict during a transitional upgrade window.
  • Deprecated shim. restgdf._types.* still imports the legacy TypedDict names, but they now re-export the pydantic classes and emit DeprecationWarning. The shim will be removed in a future major release.
  • Dependency bump. pydantic>=2.13.3,<3 is a new required dependency.

Resilience extra

For production workloads that need automatic retry with jitter and per-service-root rate limiting, install the optional resilience extra:

pip install restgdf[resilience]

This adds stamina and aiolimiter. Wrap any AsyncHTTPSession with restgdf.resilience.ResilientSession and configure it via restgdf.resilience.ResilienceConfig or RESTGDF_RESILIENCE_ENABLED=1. See MIGRATION.md for details.
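
To make "retry with jitter" concrete, here is a minimal standard-library sketch of exponential backoff with full jitter. It does not use stamina, aiolimiter, or ResilientSession, and the function name, parameters, and defaults are all hypothetical; it only illustrates the general technique the extra is described as providing.

```python
import asyncio
import random
from typing import Any, Awaitable, Callable, List


async def retry_with_jitter(
    call: Callable[[], Awaitable[Any]],
    attempts: int = 4,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> Any:
    """Retry `call` on ConnectionError, sleeping a random time up to a growing cap."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return await call()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            cap = min(max_delay, base_delay * 2 ** attempt)
            await sleep(random.random() * cap)  # full jitter: uniform in [0, cap)


calls = {"n": 0}
delays: List[float] = []


async def flaky() -> str:
    """Stand-in request that fails twice, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


async def record_sleep(seconds: float) -> None:
    """Record the delay instead of actually waiting, to keep the demo fast."""
    delays.append(seconds)


result = asyncio.run(retry_with_jitter(flaky, sleep=record_sleep))
print(result, calls["n"], len(delays))
```

Randomizing each delay spreads out retries from many clients so they do not hit the same service root in lockstep.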

gpd.read_file(url, driver="ESRIJSON") does not account for max record count limitations, so large services get truncated at the server's maxRecordCount.

restgdf uses asyncio to read all features from a service, not just the first page, while letting you choose between a light-core install and an optional GeoPandas extra.
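
As a rough illustration of what reading "all features, not just the first page" involves, the sketch below computes one set of paging parameters per request from a feature count and the server's maxRecordCount. resultOffset and resultRecordCount are standard ArcGIS REST query parameters, but whether restgdf pages by offset or by object id is not stated here, so treat page_params as a hypothetical helper rather than the library's internals.

```python
from typing import Dict, List


def page_params(total: int, max_record_count: int) -> List[Dict[str, int]]:
    """Return one {resultOffset, resultRecordCount} dict per page needed."""
    if max_record_count <= 0:
        raise ValueError("max_record_count must be positive")
    return [
        {
            "resultOffset": offset,
            # The last page may be smaller than the server cap.
            "resultRecordCount": min(max_record_count, total - offset),
        }
        for offset in range(0, total, max_record_count)
    ]


pages = page_params(total=2500, max_record_count=1000)
for page in pages:
    print(page)
```

Each dict could be sent as a separate query, and the requests can then run concurrently.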

Installation

Requires Python ≥ 3.9.

Install the lightweight core package when you want typed metadata, query helpers, crawl/auth utilities, or raw feature rows without pulling in pandas, geopandas, or pyogrio:

pip install restgdf

Base-install capabilities include:

  • typed pydantic response models like LayerMetadata and CrawlReport
  • FeatureLayer.from_url, .metadata, .count, and .get_oids()
  • single-field get_unique_values() queries
  • raw feature dictionaries via FeatureLayer.stream_features() / stream_rows() (deprecated row_dict_generator() still works)
  • Directory crawling and ArcGISTokenSession authentication helpers

Install the geo extra for GeoDataFrame and pandas-backed workflows:

pip install "restgdf[geo]"

restgdf[geo] adds:

  • FeatureLayer.get_gdf() / deprecated getgdf()
  • FeatureLayer.sample_gdf() and head_gdf()
  • FeatureLayer.fieldtypes
  • pandas-backed helpers like get_value_counts() and get_nested_count()
  • low-level restgdf.utils.getgdf helpers

Treat the split above as the stable dependency boundary: geo-enabled environments should depend on restgdf[geo] explicitly. See MIGRATION.md for the full 1.x → 2.0 rewrite table and upgrade recipes.

Light-core usage

import asyncio

from aiohttp import ClientSession

from restgdf import FeatureLayer


beaches_url = r"https://maps1.vcgov.org/arcgis/rest/services/Beaches/MapServer/6"


async def main():
    async with ClientSession() as session:
        beaches = await FeatureLayer.from_url(beaches_url, session=session)
        cities = await beaches.get_unique_values("CITY")

        first_rows = []
        async for row in beaches.stream_rows(data={"outFields": "CITY,STATE"}):
            first_rows.append(row)
            if len(first_rows) == 2:
                break

    return beaches.count, beaches.metadata.max_record_count, cities[:3], first_rows


count, max_record_count, cities, first_rows = asyncio.run(main())

print(count, max_record_count)
print(cities)
print(first_rows[0])

Streaming

FeatureLayer exposes ArcGIS pagination as three async generators so you can process millions of rows without buffering them in memory. The on_truncation knob controls what happens when the server caps a page at maxRecordCount: "raise" (default), "ignore" (log + continue), or "split" (bisect by object-id and retry, up to depth 32).

import asyncio

from aiohttp import ClientSession

from restgdf import FeatureLayer


zipcodes_url = "https://services.arcgis.com/P3ePLMYs2RVChkJx/ArcGIS/rest/services/USA_ZIP_Codes_2016/FeatureServer/0"


async def main():
    async with ClientSession() as session:
        oh = await FeatureLayer.from_url(
            zipcodes_url,
            where="STATE = 'OH'",
            session=session,
        )

        # 1. Feature dicts, one per row (base install)
        first_feature = None
        async for feature in oh.stream_features(on_truncation="split"):
            first_feature = feature
            break

        # 2. Per-page batches, preserving ArcGIS page boundaries
        page_sizes = []
        async for batch in oh.stream_feature_batches(order="request"):
            page_sizes.append(len(batch))
            if len(page_sizes) == 3:
                break

        # 3. GeoDataFrame chunks (requires `restgdf[geo]`; note that
        # stream_gdf_chunks does *not* accept on_truncation / order /
        # max_concurrent_pages — it yields in completion order).
        chunk_shapes = []
        async for chunk in oh.stream_gdf_chunks():
            chunk_shapes.append(chunk.shape)
            if len(chunk_shapes) == 2:
                break

    return first_feature, page_sizes, chunk_shapes


first_feature, page_sizes, chunk_shapes = asyncio.run(main())
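
The "split" strategy can be pictured as a recursive bisection over an object-id range. The sketch below is a self-contained stand-in, not restgdf's implementation: fake_query simulates a server that caps each response at MAX_RECORD_COUNT rows, and stream_split is a hypothetical name.

```python
import asyncio
from typing import AsyncIterator, List, Tuple

MAX_RECORD_COUNT = 4  # stand-in for the server's page cap
MAX_DEPTH = 32        # bisection depth limit mentioned above


async def fake_query(lo: int, hi: int) -> Tuple[List[int], bool]:
    """Simulate one query over object ids lo..hi (inclusive)."""
    ids = list(range(lo, hi + 1))
    truncated = len(ids) > MAX_RECORD_COUNT
    return ids[:MAX_RECORD_COUNT], truncated


async def stream_split(lo: int, hi: int, depth: int = 0) -> AsyncIterator[int]:
    """Yield every id in lo..hi, bisecting the range whenever a page is truncated."""
    rows, truncated = await fake_query(lo, hi)
    if not truncated:
        for row in rows:
            yield row
        return
    if depth >= MAX_DEPTH or lo >= hi:
        raise RuntimeError("cannot split the range any further")
    mid = (lo + hi) // 2
    async for row in stream_split(lo, mid, depth + 1):
        yield row
    async for row in stream_split(mid + 1, hi, depth + 1):
        yield row


async def collect() -> List[int]:
    return [row async for row in stream_split(1, 10)]


rows = asyncio.run(collect())
print(rows)
```

Truncated pages are discarded and re-requested as two smaller ranges, so no row is yielded twice.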

See the streaming recipe for the full matrix of on_truncation, order, and max_concurrent_pages combinations on the iter_pages-based shapes (stream_features, stream_feature_batches, stream_rows).
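
The difference between order="request" and order="completion" mirrors a familiar asyncio distinction. The sketch below uses only the standard library and simulated page fetches; fetch_page and the delays are made up, and the semaphore plays the role that max_concurrent_pages is described as playing.

```python
import asyncio
from typing import List, Tuple


async def fetch_page(index: int, delay: float, limiter: asyncio.Semaphore) -> int:
    """Simulated page fetch that takes `delay` seconds and returns its page index."""
    async with limiter:  # at most N fetches in flight at once
        await asyncio.sleep(delay)
        return index


async def demo(max_concurrent_pages: int = 3) -> Tuple[List[int], List[int]]:
    delays = [0.15, 0.05, 0.10]  # page 0 is the slowest
    limiter = asyncio.Semaphore(max_concurrent_pages)

    # "request" order: results come back in the order the pages were requested.
    request_order = await asyncio.gather(
        *[fetch_page(i, d, limiter) for i, d in enumerate(delays)]
    )

    # "completion" order: results are consumed as soon as each page finishes.
    completion_order = []
    pending = [fetch_page(i, d, limiter) for i, d in enumerate(delays)]
    for finished in asyncio.as_completed(pending):
        completion_order.append(await finished)

    return list(request_order), completion_order


request_order, completion_order = asyncio.run(demo())
print(request_order)     # always page-index order
print(completion_order)  # depends on timing; the fastest page usually comes first
```

Request order suits consumers that need stable, reproducible output, while completion order lets a consumer start work on whichever page arrives first.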

GeoDataFrame workflows (restgdf[geo])

import asyncio

from aiohttp import ClientSession

from restgdf import FeatureLayer


beaches_url = r"https://maps1.vcgov.org/arcgis/rest/services/Beaches/MapServer/6"

zipcodes_url = "https://services.arcgis.com/P3ePLMYs2RVChkJx/ArcGIS/rest/services/USA_ZIP_Codes_2016/FeatureServer/0"


async def main():
    async with ClientSession() as session:
        beaches = await FeatureLayer.from_url(beaches_url, session=session)
        beaches_gdf = await beaches.get_gdf()

        daytona = await beaches.where("LOWER(City) LIKE 'daytona%'")
        daytona_gdf = await daytona.get_gdf()

        oh_zipcodes = await FeatureLayer.from_url(
            zipcodes_url,
            where="STATE = 'OH'",
            session=session,
        )
        oh_zipcodes_gdf = await oh_zipcodes.get_gdf()

    return beaches_gdf, daytona_gdf, oh_zipcodes_gdf


beaches_gdf, daytona_gdf, oh_zipcodes_gdf = asyncio.run(main())

print(beaches_gdf.shape)
# (243, 10)

print(daytona_gdf.shape)
# (83, 10)

print(oh_zipcodes_gdf.shape)
# (1026, 8)

Keyword arguments to FeatureLayer.get_gdf() are passed on to aiohttp.ClientSession.post; include query parameters like where and token in the data dict when needed.

Token authentication

Token helpers are available in the base install. The GeoDataFrame example below requires restgdf[geo] because it calls get_gdf().

import asyncio

from aiohttp import ClientSession

from restgdf import AGOLUserPass, ArcGISTokenSession, FeatureLayer


secured_url = "https://example.com/arcgis/rest/services/Secured/FeatureServer/0"


async def main():
    async with ClientSession() as base_session:
        token_session = ArcGISTokenSession(
            session=base_session,
            credentials=AGOLUserPass(
                username="my-username",
                password="my-password",
            ),
        )
        layer = await FeatureLayer.from_url(secured_url, session=token_session)
        return await layer.get_gdf()


secured_gdf = asyncio.run(main())

If you already have a token, you can pass it with token="..." or data={"token": "..."}.
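
For intuition about the header-versus-body choice noted in the release highlights, here is a hedged sketch of how a token could be attached to a request in each mode. apply_token is a hypothetical helper, not part of restgdf, and the "Bearer" value format follows Esri's documented X-Esri-Authorization convention rather than anything confirmed by this README.

```python
from typing import Any, Dict


def apply_token(token: str, transport: str = "header") -> Dict[str, Any]:
    """Return request keyword arguments carrying the token in the chosen place."""
    if transport == "header":
        # Keeps the token out of form bodies.
        return {"headers": {"X-Esri-Authorization": f"Bearer {token}"}, "data": {}}
    if transport == "body":
        # Older style: the token travels as a form field.
        return {"headers": {}, "data": {"token": token}}
    raise ValueError(f"unknown transport: {transport!r}")


print(apply_token("abc123"))
print(apply_token("abc123", transport="body"))
```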

Typed responses

Typed responses are part of the base pip install restgdf surface.

Every response is a pydantic model. Attribute access replaces dict indexing, and model_dump(by_alias=True) round-trips back to ArcGIS camelCase:

import asyncio

from aiohttp import ClientSession

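from restgdf import FeatureLayer


# Same public layer used in the earlier examples.
beaches_url = "https://maps1.vcgov.org/arcgis/rest/services/Beaches/MapServer/6"

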
async def main():
    async with ClientSession() as session:
        fl = await FeatureLayer.from_url(beaches_url, session=session)
        md = fl.metadata                      # restgdf.LayerMetadata
        return md.name, md.max_record_count, md.model_dump(by_alias=True)


name, max_record_count, arcgis_dict = asyncio.run(main())

Need a plain dict during a transitional migration? Use restgdf.compat.as_dict(md). See MIGRATION.md for the full 1.x → 2.0 rewrite table.
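
The snake_case-to-camelCase correspondence behind by_alias=True can be sketched without pydantic. The helpers below are hypothetical and only approximate the naming rule; real ArcGIS keys do not always follow it, and restgdf's models define their own aliases.

```python
from typing import Any, Dict


def to_camel(name: str) -> str:
    """Convert a snake_case attribute name to an ArcGIS-style camelCase key."""
    head, *tail = name.split("_")
    return head + "".join(part.capitalize() for part in tail)


def dump_by_alias(fields: Dict[str, Any]) -> Dict[str, Any]:
    """Rename every key with to_camel, leaving the values untouched."""
    return {to_camel(key): value for key, value in fields.items()}


arcgis_style = dump_by_alias({"name": "Beaches", "max_record_count": 1000})
print(arcgis_style)
```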

ArcGIS drift guarantees and limitations

  • Strict envelopes (CountResponse, ObjectIdsResponse, TokenResponse) fail fast with RestgdfResponseError when required keys are missing or malformed.
  • Permissive envelopes (LayerMetadata, FeaturesResponse, crawl/service payloads) still tolerate unknown extras and missing optional fields, but a top-level ArcGIS JSON error envelope ({"error": {...}}) now raises RestgdfResponseError instead of being mistaken for partial metadata or an empty feature page.
  • Query helpers intentionally call aiohttp JSON decoding with content_type=None, so mislabeled JSON bodies such as text/plain still parse.
  • Non-JSON/HTML bodies are not normalized: malformed query bodies still bubble the underlying JSON decoder error, and ArcGISTokenSession.update_token() still preserves aiohttp's native ContentTypeError behavior for HTML token pages.
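
The behaviors listed above can be approximated with the standard library alone. In this sketch, ArcGISEnvelopeError is a hypothetical stand-in for RestgdfResponseError and parse_body is not a restgdf function; the point is the order of checks: decode the body regardless of its Content-Type label, reject a top-level error envelope, and let decoder errors for HTML bodies propagate unchanged.

```python
import json
from typing import Any, Dict, Optional


class ArcGISEnvelopeError(Exception):
    """Hypothetical stand-in for a typed response error."""

    def __init__(self, code: Optional[int], message: Optional[str]) -> None:
        super().__init__(f"ArcGIS error {code}: {message}")
        self.code = code
        self.message = message


def parse_body(text: str) -> Dict[str, Any]:
    """Decode a response body and raise if it is a top-level error envelope."""
    payload = json.loads(text)  # raises json.JSONDecodeError for HTML bodies
    if isinstance(payload, dict) and isinstance(payload.get("error"), dict):
        error = payload["error"]
        raise ArcGISEnvelopeError(error.get("code"), error.get("message"))
    return payload


print(parse_body('{"count": 3}'))

try:
    parse_body('{"error": {"code": 498, "message": "Invalid token."}}')
except ArcGISEnvelopeError as exc:
    print(exc)

try:
    parse_body("<html>Sign in</html>")
except json.JSONDecodeError as exc:
    print("not JSON:", exc.msg)
```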

Documentation

Full docs live at https://restgdf.readthedocs.io/ (hosted by Read the Docs).

Docs for humans and LLMs

Every page is published in three formats so you can feed it to a teammate or to a language model without any preprocessing:

Format | URL
Rendered HTML | https://restgdf.readthedocs.io/en/latest/
Plain Markdown (per page) | append .md to any page, e.g. https://restgdf.readthedocs.io/en/latest/quickstart.html.md
llms.txt index | https://restgdf.readthedocs.io/en/latest/llms.txt
llms-full.txt (all pages) | https://restgdf.readthedocs.io/en/latest/llms-full.txt
Ask DeepWiki | https://deepwiki.com/joshuasundance-swca/restgdf

Point your coding agent or RAG pipeline at llms-full.txt for the entire reference in a single file, or at llms.txt for a concise table of contents.

For contributors

  • CONTRIBUTING.md — local setup, PR checklist, commit conventions, gate suite.
  • ARCHITECTURE.md — module layout, exception taxonomy, logger hierarchy, config precedence, session ownership, streaming shapes, extras matrix.
  • CHANGELOG.md — every user-visible change.
  • MIGRATION.md — upgrading from 1.x to 2.0.
