openplanetdata-airflow

Shared Airflow infrastructure for OpenPlanetData workflows. Provides custom operators, shared constants, and static geospatial data for OpenStreetMap data-processing pipelines.

Installation

pip install openplanetdata-airflow

Operators

Ogr2OgrOperator

Runs ogr2ogr inside a GDAL Docker container. Supports Airflow template rendering and dynamic task mapping with expand().

from openplanetdata.airflow.operators.ogr2ogr import Ogr2OgrOperator

convert = Ogr2OgrOperator(
    task_id="convert",
    args=["-f", "GPKG", "output.gpkg", "input.geojson"],
)
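
Dynamic task mapping needs one args list per mapped task. A minimal sketch of building those lists in plain Python follows. The helper name build_convert_args and the file paths are hypothetical, and the commented partial(...).expand(args=...) call assumes that args is the mapped parameter, which the description above suggests but does not spell out.

```python
from pathlib import PurePosixPath


def build_convert_args(inputs, out_dir="/data/out", fmt="GPKG", ext=".gpkg"):
    """Return one ogr2ogr argument list per input file (hypothetical helper)."""
    arg_lists = []
    for src in inputs:
        # ogr2ogr expects the destination before the source.
        dst = PurePosixPath(out_dir) / (PurePosixPath(src).stem + ext)
        arg_lists.append(["-f", fmt, str(dst), src])
    return arg_lists


mapped_args = build_convert_args(["/data/in/fr.geojson", "/data/in/de.geojson"])

# Assumed usage inside a DAG (not executed here):
# Ogr2OgrOperator.partial(task_id="convert").expand(args=mapped_args)
```

Keeping the list construction in a plain function makes it easy to unit-test separately from the DAG.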

GolOperator

Runs gol commands inside the OpenPlanetData Docker container, with optional output file redirection.

from openplanetdata.airflow.operators.gol import GolOperator

query = GolOperator(
    task_id="query",
    args=["query", "planet.gol", "a[boundary=administrative]"],
    output_file="/data/output.geojson",
)
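
Output file redirection presumably means sending the command's standard output to a file. The sketch below illustrates that idea as a shell command line built with the standard library. It is not the operator's actual implementation, and build_shell_command is a hypothetical name. It also shows why a query such as a[boundary=administrative] needs quoting when it passes through a shell.

```python
import shlex


def build_shell_command(program, args, output_file=None):
    """Compose a shell command line, optionally redirecting stdout to a file."""
    # shlex.join quotes any argument containing shell metacharacters.
    command = shlex.join([program, *args])
    if output_file is not None:
        command += " > " + shlex.quote(output_file)
    return command


cmd = build_shell_command(
    "gol",
    ["query", "planet.gol", "a[boundary=administrative]"],
    output_file="/data/output.geojson",
)
# cmd == "gol query planet.gol 'a[boundary=administrative]' > /data/output.geojson"
```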

Shared defaults

The openplanetdata.airflow.defaults module provides shared constants:

Constant                    Description
DOCKER_MOUNT                Bind mount configuration for /data
GDAL_IMAGE                  GDAL Docker image reference
OPENPLANETDATA_IMAGE        OpenPlanetData worker Docker image
OPENPLANETDATA_WORK_DIR     Working directory path
OPENPLANETDATA_SHARED_DIR   Shared data directory path
EMAIL_ALERT_RECIPIENTS      Alert email recipients
R2_BUCKET                   Cloudflare R2 bucket name
R2INDEX_CONNECTION_ID       Airflow connection ID for R2

Static data

  • openplanetdata.airflow.data.continents — list of 7 continents with names and slugs
  • openplanetdata.airflow.data.countries — 250 countries and territories with ISO 3166-1 alpha-2 codes and coastline flags
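
The exact shape of these records is not documented here. As a sketch only, assume each country is a mapping with hypothetical code, name, and has_coastline keys. Selecting the countries that need coastline processing could then look like this (the three sample rows are illustrative and not taken from the package):

```python
# Illustrative sample; the real module lists 250 entries and may use different keys.
SAMPLE_COUNTRIES = [
    {"code": "FR", "name": "France", "has_coastline": True},
    {"code": "CH", "name": "Switzerland", "has_coastline": False},
    {"code": "JP", "name": "Japan", "has_coastline": True},
]


def coastal_codes(countries):
    """Return sorted alpha-2 codes of entries flagged as having a coastline."""
    return sorted(c["code"] for c in countries if c["has_coastline"])


codes = coastal_codes(SAMPLE_COUNTRIES)
```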

Docker image

The docker/ directory contains a multi-stage Dockerfile that builds an Airflow worker image with geospatial tools pre-installed:

  • GOL (GeoDesk) for OSM boundary queries
  • osmcoastline for coastline extraction
  • GDAL/OGR for format conversions
  • pyosmium for OSM data updates
  • aria2 for fast downloads

License

MIT
