Raw Data API Python Client

A Python client for the Humanitarian OpenStreetMap Team (HOT) Raw Data API.


📖 Documentation: https://hotosm.github.io/raw-data-api-py/

🖥️ Source Code: https://github.com/hotosm/raw-data-api-py


Installation

pip install raw-data-api-py

Conceptual Overview

The OSM Data Client allows you to extract OpenStreetMap data for specific geographic areas through the HOT Raw Data API. The workflow follows this pattern:

  1. Define an area of interest (GeoJSON polygon)
  2. Configure filters for specific OpenStreetMap features
  3. Submit a request and wait for processing
  4. Download and use the resulting data
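Step 1 can be sketched with the standard library alone. The helper below, `bbox_to_polygon`, is a hypothetical name for illustration and is not part of the package; it only shows one way to turn a bounding box into a closed GeoJSON polygon suitable as an area of interest.

```python
from typing import Any


def bbox_to_polygon(xmin: float, ymin: float, xmax: float, ymax: float) -> dict[str, Any]:
    """Build a GeoJSON Polygon covering a bounding box (lon/lat, EPSG:4326).

    Hypothetical helper for illustration; not part of osm_data_client.
    """
    if xmin >= xmax or ymin >= ymax:
        raise ValueError("expected xmin < xmax and ymin < ymax")
    ring = [
        [xmin, ymin],
        [xmax, ymin],
        [xmax, ymax],
        [xmin, ymax],
        [xmin, ymin],  # repeat the first position to close the ring
    ]
    return {"type": "Polygon", "coordinates": [ring]}


geometry = bbox_to_polygon(-73.98, 40.75, -73.97, 40.76)
```

The resulting dict has the same shape as a hand-written GeoJSON polygon, so it can be passed wherever a geometry is expected.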

Quick Start

import asyncio
from osm_data_client import get_osm_data

async def main():
    # Define area of interest
    geometry = {
        "type": "Polygon",
        "coordinates": [[
            [-73.98, 40.75],  # NYC area
            [-73.98, 40.76],
            [-73.97, 40.76],
            [-73.97, 40.75],
            [-73.98, 40.75]
        ]]
    }

    # Request building data
    result = await get_osm_data(
        geometry,
        fileName="nyc_buildings",
        outputType="geojson",
        filters={
            "tags": {
                "all_geometry": {
                    "building": []  # All buildings
                }
            }
        }
    )

    print(f"Data downloaded to: {result.path}")

if __name__ == "__main__":
    asyncio.run(main())

Command-Line Interface

Extract data using the CLI:

python -m osm_data_client.cli \
  --bounds -73.98 40.75 -73.97 40.76 \
  --feature-type building --out buildings.geojson

Key Components

  • get_osm_data: Main function for simple requests
  • RawDataClient: Configurable client for advanced usage
  • GeometryInput: Handles polygon validation
  • RequestParams: Handles request configuration
  • RawDataResult: Contains the result file path and metadata

Common Use Cases

Configuring Output Directory

from osm_data_client import RawDataClient, RawDataClientConfig

config = RawDataClientConfig(output_directory="/path/to/outputs")
client = RawDataClient(config)

result = await client.get_osm_data(geometry, **params)

Streaming Data Directly (No Download)

from osm_data_client import RawDataOutputOptions

# Do not download the file, just return the response
options = RawDataOutputOptions(download_file=False)

# Request parameters are passed as keyword arguments (client as created above)
result = await client.get_osm_data(
    geometry,
    options,
    outputType="geojson",
    bindZip=False,
)

Note: This configuration works best with bindZip=False and geojson output, as shown above.

Controlling File Extraction

from osm_data_client import RawDataOutputOptions, AutoExtractOption

# Always extract from zip archives
options = RawDataOutputOptions(auto_extract=AutoExtractOption.force_extract)

result = await client.get_osm_data(geometry, options, **params)

Using Different Output Formats

# GeoJSON example
result = await get_osm_data(
    geometry,
    outputType="geojson",
    filters={"tags": {"all_geometry": {"building": []}}}
)

# Shapefile example
result = await get_osm_data(
    geometry,
    outputType="shp",
    filters={"tags": {"all_geometry": {"highway": []}}}
)
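Both requests above use the same filter shape and differ only in the tag key. A small helper can build that dict; `tag_filter` is a hypothetical name, not something the package provides, and it only reproduces the structure shown in these examples.

```python
from typing import Any


def tag_filter(key: str) -> dict[str, Any]:
    """Build a filters dict that selects every feature carrying the given tag key.

    Hypothetical helper; it only mirrors the dict shape used in the examples.
    """
    if not key:
        raise ValueError("a tag key is required")
    # An empty list selects all values of the key, as in {"building": []}
    return {"tags": {"all_geometry": {key: []}}}


building_filters = tag_filter("building")
highway_filters = tag_filter("highway")
```

The returned dict can then be passed as the `filters` keyword argument.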

Error Handling

The client raises distinct exception types for invalid input and for failed API requests:

from osm_data_client.exceptions import ValidationError, APIRequestError

try:
    result = await get_osm_data(geometry, **params)
except ValidationError as e:
    print(f"Invalid input: {e}")
except APIRequestError as e:
    print(f"API error: {e}")
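Some input problems can be caught locally before a request is made. The sketch below checks the two ring rules from the GeoJSON specification (at least four positions, first equal to last). `check_polygon` is a hypothetical helper and raises a plain `ValueError`; the client's own validation may cover more or different cases.

```python
from typing import Any


def check_polygon(geometry: dict[str, Any]) -> None:
    """Raise ValueError if a GeoJSON Polygon has an obviously malformed ring.

    Hypothetical local check; not the client's validation logic.
    """
    if geometry.get("type") != "Polygon":
        raise ValueError("expected a GeoJSON Polygon")
    rings = geometry.get("coordinates") or []
    if not rings:
        raise ValueError("polygon has no rings")
    for ring in rings:
        if len(ring) < 4:
            raise ValueError("a linear ring needs at least four positions")
        if list(ring[0]) != list(ring[-1]):
            raise ValueError("first and last positions of a ring must match")


# A closed five-position ring passes silently
check_polygon({
    "type": "Polygon",
    "coordinates": [[
        [-73.98, 40.75], [-73.97, 40.75], [-73.97, 40.76],
        [-73.98, 40.76], [-73.98, 40.75],
    ]],
})
```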

API Reference

Core Functions

async def get_osm_data(
    geometry: dict[str, Any] | str,
    **kwargs
) -> RawDataResult

Client Classes

class RawDataClient:
    async def get_osm_data(
        self,
        geometry: dict[str, Any] | str,
        output_options: RawDataOutputOptions = RawDataOutputOptions.default(),
        **kwargs
    ) -> RawDataResult

Configuration Classes

@dataclass
class RawDataClientConfig:
    access_token: Optional[str] = None
    memory_threshold_mb: int = 50
    base_api_url: str = "https://api-prod.raw-data.hotosm.org/v1"
    output_directory: Path = Path.cwd()

class AutoExtractOption(Enum):
    automatic = auto()     # Decide based on format and size
    force_zip = auto()     # Always keep as zip
    force_extract = auto() # Always extract

CLI Options

python -m osm_data_client.cli [options]

Options:
  --geojson PATH          Path to GeoJSON file or GeoJSON string
  --bounds XMIN YMIN XMAX YMAX
                          Bounds coordinates in EPSG:4326
  --feature-type TYPE     Type of feature to download (default: "building")
  --out PATH              Output path (default: "./osm_data.geojson")
  --format FORMAT         Output format (geojson, shp, kml, etc.)
  --no-zip                Do not request data as a zip file
  --extract               Extract files from zip archive
  --verbose, -v           Enable verbose logging
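To make the option list concrete, here is a minimal sketch of how options like these could be declared with `argparse`. It is not the package's actual parser: the option names and defaults are copied from the list above, while `build_parser` and the choice of `store_true` for the flags are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the options listed above; a sketch, not the package's parser."""
    parser = argparse.ArgumentParser(prog="osm_data_client.cli")
    parser.add_argument("--geojson", metavar="PATH")
    # Four floats in EPSG:4326; negative longitudes are parsed as values
    parser.add_argument("--bounds", nargs=4, type=float,
                        metavar=("XMIN", "YMIN", "XMAX", "YMAX"))
    parser.add_argument("--feature-type", default="building", metavar="TYPE")
    parser.add_argument("--out", default="./osm_data.geojson", metavar="PATH")
    parser.add_argument("--format", metavar="FORMAT")
    parser.add_argument("--no-zip", action="store_true")
    parser.add_argument("--extract", action="store_true")
    parser.add_argument("--verbose", "-v", action="store_true")
    return parser


args = build_parser().parse_args(
    ["--bounds", "-73.98", "40.75", "-73.97", "40.76", "--out", "buildings.geojson"]
)
```

Here `args.bounds` holds the four coordinates as floats, and options that were not given fall back to the defaults listed above.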
