Abstraction of GDAL datasets for doing basic math operations

Yirgacheffe: a declarative geospatial library for Python to make data-science with maps easier

Overview

Yirgacheffe is a declarative geospatial library that lets you operate on both raster and polygon geospatial datasets without the tedious bookkeeping of layer alignment, and without worrying about hardware concerns such as memory or parallelism. This includes working with datasets larger than you can load into memory safely.

Example common use-cases:

  • Do the datasets overlap? Yirgacheffe will let you define either the intersection or the union of a set of different datasets, scaling up or down the area as required.
  • Rasterisation of vector layers: if you have a vector dataset, you can add it to your computation and Yirgacheffe will rasterise it on demand, so you never need to store more data in memory than necessary.
  • Do the raster layers get big and take up large amounts of memory? Yirgacheffe lets you do simple numerical operations on layers directly and takes care of the memory management behind the scenes for you.
  • Parallelisation of operations over many CPU cores.
  • Built-in optional support for GPUs via MLX.
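
To make the first use-case concrete, here is a minimal pure-Python sketch of what taking the intersection or union of two dataset extents means. The `Area` class and its methods are hypothetical names invented for this illustration and are not Yirgacheffe's API; the library does this bookkeeping for you.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Area:
    """Hypothetical bounding box in map units (not a Yirgacheffe type)."""
    left: float
    top: float
    right: float
    bottom: float

    def intersection(self, other: "Area") -> Optional["Area"]:
        """Return the region covered by both boxes, or None if they do not overlap."""
        left = max(self.left, other.left)
        right = min(self.right, other.right)
        top = min(self.top, other.top)
        bottom = max(self.bottom, other.bottom)
        # An empty or inverted box means there is no overlap
        if left >= right or bottom >= top:
            return None
        return Area(left, top, right, bottom)

    def union(self, other: "Area") -> "Area":
        """Return the smallest box that covers both inputs."""
        return Area(
            min(self.left, other.left),
            max(self.top, other.top),
            max(self.right, other.right),
            min(self.bottom, other.bottom),
        )


# Two made-up extents that partially overlap
a = Area(left=0.0, top=10.0, right=10.0, bottom=0.0)
b = Area(left=5.0, top=15.0, right=20.0, bottom=5.0)
print(a.intersection(b))  # Area(left=5.0, top=10.0, right=10.0, bottom=5.0)
print(a.union(b))         # Area(left=0.0, top=15.0, right=20.0, bottom=0.0)
```

An intersection shrinks the working area to where every dataset has data, while a union grows it to cover all of them.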

Installation

Yirgacheffe is available on PyPI, so it can be installed with pip, for example:

$ pip install yirgacheffe
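
One way to confirm that the install worked is to ask Python for the installed version. This is a small sketch using only the standard library; `installed_version` is a helper name made up for this example.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("yirgacheffe")
print(version if version else "yirgacheffe is not installed")
```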

Documentation

The documentation can be found at yirgacheffe.org.

Simple examples:

Here is how to do cloud removal from Sentinel-2 data, using the Scene Classification Layer data:

import yirgacheffe as yg

with (
  yg.read_raster("T37NCG_20250909T073609_B06_20m.jp2") as vre2,
  yg.read_raster("T37NCG_20250909T073609_SCL_20m.jp2") as scl,
):
  is_cloud = (scl == 8) | (scl == 9) | (scl == 10)  # various cloud types
  is_shadow = (scl == 3)
  is_bad = is_cloud | is_shadow

  masked_vre2 = yg.where(is_bad, float("nan"), vre2)
  masked_vre2.to_geotiff("vre2_cleaned.tif")
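
To show what that expression computes for each pixel, here is a pure-Python sketch of the same masking rule applied to tiny made-up grids. It does not use Yirgacheffe; `mask_bad_pixels` and the sample values are illustrative assumptions, while the class codes (8, 9, 10 for cloud, 3 for shadow) are the ones used in the example above.

```python
import math

CLOUD_CLASSES = {8, 9, 10}  # cloud classes from the Scene Classification Layer
SHADOW_CLASS = 3            # cloud shadow class


def mask_bad_pixels(band, scl):
    """Return a copy of band with cloud and shadow pixels replaced by NaN."""
    return [
        [
            math.nan if (cls in CLOUD_CLASSES or cls == SHADOW_CLASS) else value
            for value, cls in zip(band_row, scl_row)
        ]
        for band_row, scl_row in zip(band, scl)
    ]


# Made-up 2x3 reflectance values and matching classification codes
band = [[0.21, 0.35, 0.40],
        [0.18, 0.50, 0.33]]
scl = [[4, 9, 3],
       [5, 10, 4]]

masked = mask_bad_pixels(band, scl)
print(masked)  # [[0.21, nan, nan], [0.18, nan, 0.33]]
```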

or a species' Area of Habitat calculation:

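import yirgacheffe as yg

# species_min, species_max, and area_per_pixel_map are assumed to be defined
# elsewhere; the habitat code list is a placeholder to fill in.
with (
    yg.read_raster("habitats.tif") as habitat_map,
    yg.read_raster("elevation.tif") as elevation_map,
    yg.read_shape("species123.geojson") as range_map,
):
    # Keep only pixels whose habitat class suits the species
    refined_habitat = habitat_map.isin([...species habitat codes...])
    # Keep only pixels within the species' elevation limits
    refined_elevation = (elevation_map >= species_min) & (elevation_map <= species_max)
    # Combine the masks, clip to the species range, and weight by pixel area
    aoh = refined_habitat * refined_elevation * range_map * area_per_pixel_map
    print(f"Area of habitat: {aoh.sum()}")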

Citation

If you use Yirgacheffe in your research, please cite our paper:

Michael Winston Dales, Alison Eyres, Patrick Ferris, Francesca A. Ridley, Simon Tarr, and Anil Madhavapeddy. 2025. Yirgacheffe: A Declarative Approach to Geospatial Data. In Proceedings of the 2nd ACM SIGPLAN International Workshop on Programming for the Planet (PROPL '25). Association for Computing Machinery, New York, NY, USA, 47–54. https://doi.org/10.1145/3759536.3763806

BibTeX
@inproceedings{10.1145/3759536.3763806,
  author = {Dales, Michael Winston and Eyres, Alison and Ferris, Patrick and Ridley, Francesca A. and Tarr, Simon and Madhavapeddy, Anil},
  title = {Yirgacheffe: A Declarative Approach to Geospatial Data},
  year = {2025},
  isbn = {9798400721618},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3759536.3763806},
  doi = {10.1145/3759536.3763806},
  abstract = {We present Yirgacheffe, a declarative geospatial library that allows spatial algorithms to be implemented concisely, supports parallel execution, and avoids common errors by automatically handling data (large geospatial rasters) and resources (cores, memory, GPUs). Our primary user domain comprises ecologists, where a typical problem involves cleaning messy occurrence data, overlaying it over tiled rasters, combining layers, and deriving actionable insights from the results. We describe the successes of this approach towards driving key pipelines related to global biodiversity and describe the capability gaps that remain, hoping to motivate more research into geospatial domain-specific languages.},
  booktitle = {Proceedings of the 2nd ACM SIGPLAN International Workshop on Programming for the Planet},
  pages = {47–54},
  numpages = {8},
  keywords = {Biodiversity, Declarative, Geospatial, Python},
  location = {Singapore, Singapore},
  series = {PROPL '25}
}

Thanks

Thanks to discussion and feedback from my colleagues, particularly Alison Eyres, Patrick Ferris, Amelia Holcomb, and Anil Madhavapeddy.

Inspired by the work of Daniele Baisero in his AoH library.
