Shared Airflow infrastructure for OpenPlanetData workflows

openplanetdata-airflow

Shared Airflow infrastructure for OpenPlanetData workflows. Provides custom operators, shared constants, and static geospatial data for processing OpenStreetMap data pipelines.

Installation

pip install openplanetdata-airflow

Operators

GolOperator

Runs gol commands inside the OpenPlanetData Docker container, with optional output file redirection.

from openplanetdata.airflow.operators.gol import GolOperator

query = GolOperator(
    task_id="query",
    args=["query", "planet.gol", "a[boundary=administrative]"],
    output_file="/data/output.geojson",
)
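The output redirection described above can be pictured as a shell redirect appended to the gol command line. The following is a minimal sketch under that assumption; `build_gol_command` is a hypothetical helper written for illustration, not part of the package, and the operator itself may handle redirection differently.

```python
import shlex
from typing import Optional, Sequence


def build_gol_command(args: Sequence[str], output_file: Optional[str] = None) -> str:
    """Hypothetical helper: quote gol arguments and optionally redirect stdout."""
    # shlex.join quotes each argument, so query brackets survive the shell.
    command = shlex.join(["gol", *args])
    if output_file is not None:
        # Assumption: redirection is a plain stdout redirect to the given path.
        command += " > " + shlex.quote(output_file)
    return command


command = build_gol_command(
    ["query", "planet.gol", "a[boundary=administrative]"],
    output_file="/data/output.geojson",
)
print(command)
```

Quoting the query matters here because a bare `a[boundary=administrative]` would otherwise be open to shell glob expansion.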

Ogr2OgrOperator

Runs ogr2ogr inside a GDAL Docker container. Supports Airflow template rendering and dynamic task mapping with expand().

from openplanetdata.airflow.operators.ogr2ogr import Ogr2OgrOperator

convert = Ogr2OgrOperator(
    task_id="convert",
    args=["-f", "GPKG", "output.gpkg", "input.geojson"],
)
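For dynamic task mapping, one argument list is needed per mapped task. The sketch below builds such lists in plain Python; `ogr2ogr_args_for` is a hypothetical helper, the file paths are placeholders, and the commented `partial().expand()` call assumes the operator accepts `args` as a mappable argument.

```python
from pathlib import PurePosixPath
from typing import List, Sequence


def ogr2ogr_args_for(
    inputs: Sequence[str], driver: str = "GPKG", suffix: str = ".gpkg"
) -> List[List[str]]:
    """Hypothetical helper: build one ogr2ogr argument list per input file."""
    mapped = []
    for src in inputs:
        dst = str(PurePosixPath(src).with_suffix(suffix))
        # ogr2ogr expects the destination dataset before the source dataset.
        mapped.append(["-f", driver, dst, src])
    return mapped


arg_sets = ogr2ogr_args_for(["/data/europe.geojson", "/data/asia.geojson"])

# With Airflow installed, each list could become one mapped task instance:
#   Ogr2OgrOperator.partial(task_id="convert").expand(args=arg_sets)
```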

Shared defaults

The openplanetdata.airflow.defaults module provides shared constants:

Constant                     Description
DOCKER_MOUNT                 Bind mount configuration for /data
EMAIL_ALERT_RECIPIENTS       Alert email recipients
GDAL_FULL_IMAGE              GDAL Docker image reference (ubuntu-full)
GDAL_SMALL_IMAGE             GDAL Docker image reference (ubuntu-small)
OPENPLANETDATA_IMAGE         OpenPlanetData worker Docker image
OPENPLANETDATA_SHARED_DIR    Shared data directory path
OPENPLANETDATA_WORK_DIR      Working directory path
R2_BUCKET                    Cloudflare R2 bucket name
R2INDEX_CONNECTION_ID        Airflow connection ID for R2

Static data

  • openplanetdata.airflow.data.continents — list of 7 continents with names and slugs
  • openplanetdata.airflow.data.countries — 250 countries and territories with ISO 3166-1 alpha-2 codes and coastline flags
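The coastline flag is useful for skipping coastline work on landlocked countries. The sketch below illustrates the idea with a stand-in record type; the `Country` class, its field names, and the four sample entries are assumptions made for illustration and may not match the shape of the packaged data.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Country:
    """Hypothetical record shape; the package's real structure may differ."""
    code: str            # ISO 3166-1 alpha-2 code
    name: str
    has_coastline: bool  # assumed name for the coastline flag


# A small hand-written sample, not the packaged list of 250 entries.
SAMPLE_COUNTRIES = [
    Country("FR", "France", True),
    Country("CH", "Switzerland", False),
    Country("BO", "Bolivia", False),
    Country("JP", "Japan", True),
]


def coastal_codes(countries: List[Country]) -> List[str]:
    """Return the codes of countries flagged as having a coastline."""
    return [c.code for c in countries if c.has_coastline]


print(coastal_codes(SAMPLE_COUNTRIES))  # prints ['FR', 'JP']
```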

Docker image

The docker/ directory contains a multi-stage Dockerfile that builds an Airflow worker image with geospatial tools pre-installed:

  • aria2 for fast downloads
  • GDAL/OGR for format conversions
  • GOL (Geodesk) for OSM boundary queries
  • osmcoastline for coastline extraction
  • pyosmium for OSM data updates
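A quick way to confirm that a worker image exposes the tools listed above is to look them up on `PATH`. This is a minimal sketch: the executable names in `EXPECTED_TOOLS` are assumptions about how each tool is installed, and `missing_tools` is a hypothetical helper, not something the package provides.

```python
import shutil
from typing import Iterable, List

# Assumed executable names; adjust them to match the actual image.
EXPECTED_TOOLS = ["aria2c", "ogr2ogr", "gol", "osmcoastline", "pyosmium-up-to-date"]


def missing_tools(names: Iterable[str]) -> List[str]:
    """Return the names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


print("Missing tools:", missing_tools(EXPECTED_TOOLS))
```

Run inside the container, an empty list suggests that every expected executable is available.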

License

MIT

Download files

Source Distribution

openplanetdata_airflow-1.1.0.tar.gz (9.6 kB view details)

Uploaded Source

Built Distribution

openplanetdata_airflow-1.1.0-py3-none-any.whl (8.4 kB view details)

Uploaded Python 3

File details

Details for the file openplanetdata_airflow-1.1.0.tar.gz.

File metadata

  • Download URL: openplanetdata_airflow-1.1.0.tar.gz
  • Size: 9.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for openplanetdata_airflow-1.1.0.tar.gz
Algorithm      Hash digest
SHA256         23e37ec0933e5be9d84bc41897bc51630a7933b447abf8c0f9a602bd1428dc70
MD5            20cd27b99ed446ea39d595a43d77a03a
BLAKE2b-256    e25ba68caa80fcd9ca27cc32660575655001311f1834b9e414b3082edc928be2

Provenance

The following attestation bundles were made for openplanetdata_airflow-1.1.0.tar.gz:

Publisher: release.yml on openplanetdata/openplanetdata-airflow

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file openplanetdata_airflow-1.1.0-py3-none-any.whl.

File hashes

Hashes for openplanetdata_airflow-1.1.0-py3-none-any.whl
Algorithm      Hash digest
SHA256         3ccd3fb01fb26973e98312b4095c416fcbc57f3c5e507043f356c65af4b641f6
MD5            4e935e7342787d8a7288387ae17f17c9
BLAKE2b-256    f611951c1e8b8ca8d0c46eb0e2f0e7ff826e9c71f61ca17f7ed3fcbc9e0ceb46

Provenance

The following attestation bundles were made for openplanetdata_airflow-1.1.0-py3-none-any.whl:

Publisher: release.yml on openplanetdata/openplanetdata-airflow

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
