A Python API wrapper for the raw-data-api service.

Raw Data API Python Client

A Python client for the Humanitarian OpenStreetMap Team (HOT) Raw Data API.


📖 Documentation: https://hotosm.github.io/raw-data-api-py/

🖥️ Source Code: https://github.com/hotosm/raw-data-api-py


Installation

pip install raw-data-api-py

Conceptual Overview

The OSM Data Client allows you to extract OpenStreetMap data for specific geographic areas through the HOT Raw Data API. The workflow follows this pattern:

  1. Define an area of interest (GeoJSON polygon)
  2. Configure filters for specific OpenStreetMap features
  3. Submit a request and wait for processing
  4. Download and use the resulting data

Quick Start

import asyncio
from osm_data_client import get_osm_data

async def main():
    # Define area of interest
    geometry = {
        "type": "Polygon",
        "coordinates": [[
            [-73.98, 40.75],  # NYC area
            [-73.98, 40.76],
            [-73.97, 40.76],
            [-73.97, 40.75],
            [-73.98, 40.75]
        ]]
    }

    # Request building data
    result = await get_osm_data(
        geometry,
        fileName="nyc_buildings",
        outputType="geojson",
        filters={
            "tags": {
                "all_geometry": {
                    "building": []  # All buildings
                }
            }
        }
    )

    print(f"Data downloaded to: {result.path}")

if __name__ == "__main__":
    asyncio.run(main())
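
The nested filters dictionary in the Quick Start is easy to mistype, so a small builder can help. The sketch below is an illustration only: build_all_geometry_filter is a hypothetical helper, not part of osm_data_client, and it covers only the "key with an empty list" case shown above.

```python
from typing import Any, Iterable


def build_all_geometry_filter(tag_keys: Iterable[str]) -> dict[str, Any]:
    """Return a filters dict shaped like the one in the Quick Start.

    Hypothetical helper, not part of osm_data_client. Each tag key maps
    to an empty list, which the Quick Start uses to mean "all buildings".
    """
    return {"tags": {"all_geometry": {key: [] for key in tag_keys}}}


filters = build_all_geometry_filter(["building", "highway"])
print(filters)
```

The resulting dictionary can be passed as the filters argument in the Quick Start call.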

Command-Line Interface

Extract data using the CLI:

python -m osm_data_client.cli \
  --bounds -73.98 40.75 -73.97 40.76 \
  --feature-type building --out buildings.geojson
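
The bounds in this command describe the same rectangle as the Quick Start polygon. As a rough sketch of how XMIN YMIN XMAX YMAX values map onto a GeoJSON polygon, the hypothetical helper below (bounds_to_polygon is not part of the library, and the CLI's own conversion may differ) builds the ring in the same order as the Quick Start example.

```python
from typing import Any


def bounds_to_polygon(xmin: float, ymin: float, xmax: float, ymax: float) -> dict[str, Any]:
    """Convert bounding-box coordinates to a closed GeoJSON Polygon.

    Hypothetical helper for illustration; not part of osm_data_client.
    """
    if xmin >= xmax or ymin >= ymax:
        raise ValueError("expected xmin < xmax and ymin < ymax")
    return {
        "type": "Polygon",
        "coordinates": [[
            [xmin, ymin],
            [xmin, ymax],
            [xmax, ymax],
            [xmax, ymin],
            [xmin, ymin],  # repeat the first position to close the ring
        ]],
    }


geometry = bounds_to_polygon(-73.98, 40.75, -73.97, 40.76)
print(geometry["coordinates"][0])
```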

Key Components

  • get_osm_data: Main function for simple requests
  • RawDataClient: Configurable client for advanced usage
  • GeometryInput: Handles polygon validation
  • RequestParams: Handles request configuration
  • RawDataResult: Contains the result file path and metadata
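
To give a sense of what polygon validation can involve, here is a minimal sketch of basic GeoJSON checks. It does not reproduce GeometryInput's actual rules, which are not described here; check_polygon is a hypothetical function that only applies the GeoJSON (RFC 7946) requirement that each linear ring has at least four positions and is closed.

```python
from typing import Any


def check_polygon(geometry: dict[str, Any]) -> None:
    """Raise ValueError if geometry is not a basic, closed GeoJSON Polygon.

    Hypothetical illustration; not the library's GeometryInput logic.
    """
    if geometry.get("type") != "Polygon":
        raise ValueError("geometry type must be 'Polygon'")
    rings = geometry.get("coordinates")
    if not rings:
        raise ValueError("polygon has no coordinate rings")
    for ring in rings:
        if len(ring) < 4:
            raise ValueError("a linear ring needs at least four positions")
        if list(ring[0]) != list(ring[-1]):
            raise ValueError("ring is not closed: first and last positions differ")


# The Quick Start polygon passes these checks.
check_polygon({
    "type": "Polygon",
    "coordinates": [[
        [-73.98, 40.75],
        [-73.98, 40.76],
        [-73.97, 40.76],
        [-73.97, 40.75],
        [-73.98, 40.75],
    ]],
})
print("polygon looks valid")
```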

Common Use Cases

Configuring Output Directory

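from pathlib import Path

from osm_data_client import RawDataClient, RawDataClientConfig

# output_directory is declared as a Path in RawDataClientConfig
config = RawDataClientConfig(output_directory=Path("/path/to/outputs"))
client = RawDataClient(config)

# geometry is a GeoJSON polygon; params is a dict of request options
# such as outputType and filters. Run this inside an async function.
result = await client.get_osm_data(geometry, **params)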

Streaming Data Directly (No Download)

from osm_data_client import RawDataOutputOptions

# Do not download the file, just return the response
options = RawDataOutputOptions(download_file=False)

# Request parameters are passed as keyword arguments
result = await client.get_osm_data(
    geometry,
    options,
    outputType="geojson",
    bindZip=False,
)

Note: This configuration works best with bindZip=False and geojson output, as shown above.

Controlling File Extraction

from osm_data_client import RawDataOutputOptions, AutoExtractOption

# Always extract from zip archives
options = RawDataOutputOptions(auto_extract=AutoExtractOption.force_extract)

result = await client.get_osm_data(geometry, options, **params)

Using Different Output Formats

# GeoJSON example
result = await get_osm_data(
    geometry,
    outputType="geojson",
    filters={"tags": {"all_geometry": {"building": []}}}
)

# Shapefile example
result = await get_osm_data(
    geometry,
    outputType="shp",
    filters={"tags": {"all_geometry": {"highway": []}}}
)

Error Handling

The client uses specific exception types for different errors:

from osm_data_client.exceptions import ValidationError, APIRequestError

try:
    result = await get_osm_data(geometry, **params)
except ValidationError as e:
    print(f"Invalid input: {e}")
except APIRequestError as e:
    print(f"API error: {e}")

API Reference

Core Functions

async def get_osm_data(
    geometry: dict[str, Any] | str,
    **kwargs
) -> RawDataResult

Client Classes

class RawDataClient:
    async def get_osm_data(
        self,
        geometry: dict[str, Any] | str,
        output_options: RawDataOutputOptions = RawDataOutputOptions.default(),
        **kwargs
    ) -> RawDataResult

Configuration Classes

@dataclass
class RawDataClientConfig:
    access_token: Optional[str] = None
    memory_threshold_mb: int = 50
    base_api_url: str = "https://api-prod.raw-data.hotosm.org/v1"
    output_directory: Path = Path.cwd()

class AutoExtractOption(Enum):
    automatic = auto()     # Decide based on format and size
    force_zip = auto()     # Always keep as zip
    force_extract = auto() # Always extract

CLI Options

python -m osm_data_client.cli [options]

Options:
  --geojson PATH          Path to GeoJSON file or GeoJSON string
  --bounds XMIN YMIN XMAX YMAX
                          Bounds coordinates in EPSG:4326
  --feature-type TYPE     Type of feature to download (default: "building")
  --out PATH              Output path (default: "./osm_data.geojson")
  --format FORMAT         Output format (geojson, shp, kml, etc.)
  --no-zip                Do not request data as a zip file
  --extract               Extract files from zip archive
  --verbose, -v           Enable verbose logging
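
When the CLI needs to be driven from another Python script, the argument list can be assembled programmatically. The sketch below uses only flags listed above; build_cli_command is a hypothetical helper, not part of the package, and the subprocess call is left commented out because it requires the package to be installed.

```python
import sys


def build_cli_command(bounds, feature_type="building", out="./osm_data.geojson", verbose=False):
    """Assemble an argument list for `python -m osm_data_client.cli`.

    Hypothetical helper; it only arranges the documented flags.
    bounds is (xmin, ymin, xmax, ymax) in EPSG:4326.
    """
    xmin, ymin, xmax, ymax = bounds
    command = [
        sys.executable, "-m", "osm_data_client.cli",
        "--bounds", str(xmin), str(ymin), str(xmax), str(ymax),
        "--feature-type", feature_type,
        "--out", out,
    ]
    if verbose:
        command.append("--verbose")
    return command


command = build_cli_command((-73.98, 40.75, -73.97, 40.76), out="buildings.geojson")
print(" ".join(command))

# To run the command (requires raw-data-api-py to be installed):
# import subprocess
# subprocess.run(command, check=True)
```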
