openplanetdata-airflow

Shared Airflow infrastructure for OpenPlanetData workflows. Provides custom operators, shared constants, and static geospatial data for OpenStreetMap data processing pipelines.

Installation

pip install openplanetdata-airflow

Operators

GolOperator

Runs gol commands inside the OpenPlanetData Docker container, optionally redirecting the command's output to a file.

from openplanetdata.airflow.operators.gol import GolOperator

query = GolOperator(
    task_id="query",
    args=["query", "planet.gol", "a[boundary=administrative]"],
    output_file="/data/output.geojson",
)
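This README does not show how the operator turns args and output_file into a command line. As a rough sketch only, assuming that output_file corresponds to shell-style redirection of standard output, the resulting command might be assembled as follows. The helper build_gol_command is hypothetical and not part of the package; it uses only the Python standard library.

```python
import shlex


def build_gol_command(args, output_file=None):
    """Hypothetical helper: compose a `gol` command line from a list of arguments.

    This is NOT the package's implementation, only an illustration of
    argument quoting and optional stdout redirection.
    """
    # shlex.join quotes each argument, so the brackets in the query
    # string are not interpreted by the shell.
    command = shlex.join(["gol", *args])
    if output_file is not None:
        # Assumption: output_file means "redirect stdout to this path".
        command += " > " + shlex.quote(output_file)
    return command


print(build_gol_command(
    ["query", "planet.gol", "a[boundary=administrative]"],
    output_file="/data/output.geojson",
))
```

Quoting matters here because the query contains square brackets, which an unquoted shell command would treat as a glob pattern.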

Ogr2OgrOperator

Runs ogr2ogr inside a GDAL Docker container. Supports Airflow template rendering and dynamic task mapping with expand().

from openplanetdata.airflow.operators.ogr2ogr import Ogr2OgrOperator

convert = Ogr2OgrOperator(
    task_id="convert",
    args=["-f", "GPKG", "output.gpkg", "input.geojson"],
)
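Dynamic task mapping needs one argument list per mapped task. The sketch below builds those lists with the standard library. The helper name gpkg_conversion_args and the input paths are illustrative, and the commented usage assumes that args is the parameter being mapped, which this README does not state explicitly.

```python
from pathlib import PurePosixPath


def gpkg_conversion_args(input_paths):
    """Hypothetical helper: one ogr2ogr argument list per input file."""
    arg_lists = []
    for path in input_paths:
        source = PurePosixPath(path)
        # Write the GeoPackage next to the input, swapping the extension.
        target = source.with_suffix(".gpkg")
        arg_lists.append(["-f", "GPKG", str(target), str(source)])
    return arg_lists


mapped_args = gpkg_conversion_args(
    ["/data/europe.geojson", "/data/asia.geojson"]  # illustrative paths
)

# Inside a DAG, and assuming `args` can be mapped, this might be used as:
#   Ogr2OgrOperator.partial(task_id="convert").expand(args=mapped_args)
```

Each inner list follows the same order as the example above: output format, output file, then input file.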

Shared defaults

The openplanetdata.airflow.defaults module provides shared constants:

Constant                    Description
DOCKER_MOUNT                Bind mount configuration for /data
EMAIL_ALERT_RECIPIENTS      Alert email recipients
GDAL_FULL_IMAGE             GDAL Docker image reference (ubuntu-full)
GDAL_SMALL_IMAGE            GDAL Docker image reference (ubuntu-small)
OPENPLANETDATA_IMAGE        OpenPlanetData worker Docker image
OPENPLANETDATA_SHARED_DIR   Shared data directory path
OPENPLANETDATA_WORK_DIR     Working directory path
R2_BUCKET                   Cloudflare R2 bucket name
R2INDEX_CONNECTION_ID       Airflow connection ID for R2
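
Constants like these are typically imported once and reused across DAG definitions. As a sketch of one possible use, the hypothetical helper below builds an Airflow default_args dictionary from a list of alert recipients. It assumes that EMAIL_ALERT_RECIPIENTS is a list of address strings, which this README does not confirm, and the retry settings are arbitrary illustration values.

```python
from datetime import timedelta


def build_default_args(recipients):
    """Hypothetical helper: default_args for a DAG, given alert recipients."""
    return {
        "email": list(recipients),
        "email_on_failure": True,
        "retries": 1,                         # arbitrary value
        "retry_delay": timedelta(minutes=5),  # arbitrary value
    }


# In a real DAG file the argument would presumably be the shared constant:
#   from openplanetdata.airflow.defaults import EMAIL_ALERT_RECIPIENTS
# A placeholder address is used here so the sketch runs on its own.
default_args = build_default_args(["alerts@example.org"])
```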

Static data

  • openplanetdata.airflow.data.continents — list of 7 continents with names and slugs
  • openplanetdata.airflow.data.countries — 250 countries and territories with ISO 3166-1 alpha-2 codes and coastline flags
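
The coastline flag presumably lets a pipeline skip coastline processing for landlocked countries. The sketch below shows that kind of filter on a small hand-written sample. The record shape (dictionaries with code, name, and has_coastline keys) is an assumption made for illustration; the actual structure of the packaged data is not described in this README.

```python
# Assumption: each country is a dict with a code, a name, and a coastline
# flag. The real field names and container types may differ.
sample_countries = [
    {"code": "CH", "name": "Switzerland", "has_coastline": False},
    {"code": "FR", "name": "France", "has_coastline": True},
    {"code": "NP", "name": "Nepal", "has_coastline": False},
]


def coastal_codes(countries):
    """Return ISO 3166-1 alpha-2 codes of entries flagged as coastal."""
    return [c["code"] for c in countries if c["has_coastline"]]


print(coastal_codes(sample_countries))
```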

Docker image

The docker/ directory contains a multi-stage Dockerfile that builds an Airflow worker image with geospatial tools pre-installed:

  • aria2 for fast downloads
  • GDAL/OGR for format conversions
  • GOL (Geodesk) for OSM boundary queries
  • osmcoastline for coastline extraction
  • pyosmium for OSM data updates

License

MIT
