Skip to main content

Abstraction of gdal datasets for doing basic math operations

Project description

Yirgacheffe: a declarative geospatial library for Python to make data-science with maps easier

CI Documentation PyPI version

Overview

Yirgacheffe is a declarative geospatial library, allowing you to operate on both raster and polygon geospatial datasets without having to do all the tedious book keeping around layer alignment or dealing with hardware concerns around memory or parallelism. you can load into memory safely.

Example common use-cases:

  • Do the datasets overlap? Yirgacheffe will let you define either the intersection or the union of a set of different datasets, scaling up or down the area as required.
  • Rasterisation of vector layers: if you have a vector dataset then you can add that to your computation and yirgaceffe will rasterize it on demand, so you never need to store more data in memory than necessary.
  • Do the raster layers get big and take up large amounts of memory? Yirgacheffe will let you do simple numerical operations with layers directly and then worry about the memory management behind the scenes for you.
  • Parallelisation of operations over many CPU cores.
  • Built in support for optionally using GPUs via MLX support.

Installation

Yirgacheffe is available via pypi, so can be installed with pip for example:

$ pip install yirgacheffe

Documentation

The documentation can be found on yirgacheffe.org

Simple examples:

Here is how to do cloud removal from Sentinel-2 data, using the Scene Classification Layer data:

import yirgaceffe as yg

with (
  yg.read_raster("T37NCG_20250909T073609_B06_20m.jp2") as vre2,
  yg.read_raster("T37NCG_20250909T073609_SCL_20m.jp2") as scl,
):
  is_cloud = (scl == 8) | (scl == 9) | (scl == 10)  # various cloud types
  is_shadow = (scl == 3)
  is_bad = is_cloud | is_shadow

  masked_vre2 = yg.where(is_bad, float("nan"), vre2)
  masked_vre2.to_geotiff("vre2_cleaned.tif")

or a species' Area of Habitat calculation:

import yirgaceffe as yg

with (
    yg.read_raster("habitats.tif") as habitat_map,
    yg.read_raster('elevation.tif') as elevation_map,
    yg.read_shape('species123.geojson') as range_map,
):
    refined_habitat = habitat_map.isin([...species habitat codes...])
    refined_elevation = (elevation_map >= species_min) & (elevation_map <= species_max)
    aoh = refined_habitat * refined_elevation * range_polygon * area_per_pixel_map
    print(f'Area of habitat: {aoh.sum()}')

Citation

If you use Yirgacheffe in your research, please cite our paper:

Michael Winston Dales, Alison Eyres, Patrick Ferris, Francesca A. Ridley, Simon Tarr, and Anil Madhavapeddy. 2025. Yirgacheffe: A Declarative Approach to Geospatial Data. In Proceedings of the 2nd ACM SIGPLAN International Workshop on Programming for the Planet (PROPL '25). Association for Computing Machinery, New York, NY, USA, 47–54. https://doi.org/10.1145/3759536.3763806

BibTeX
@inproceedings{10.1145/3759536.3763806,
  author = {Dales, Michael Winston and Eyres, Alison and Ferris, Patrick and Ridley, Francesca A. and Tarr, Simon and Madhavapeddy, Anil},
  title = {Yirgacheffe: A Declarative Approach to Geospatial Data},
  year = {2025},
  isbn = {9798400721618},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3759536.3763806},
  doi = {10.1145/3759536.3763806},
  abstract = {We present Yirgacheffe, a declarative geospatial library that allows spatial algorithms to be implemented concisely, supports parallel execution, and avoids common errors by automatically handling data (large geospatial rasters) and resources (cores, memory, GPUs). Our primary user domain comprises ecologists, where a typical problem involves cleaning messy occurrence data, overlaying it over tiled rasters, combining layers, and deriving actionable insights from the results. We describe the successes of this approach towards driving key pipelines related to global biodiversity and describe the capability gaps that remain, hoping to motivate more research into geospatial domain-specific languages.},
  booktitle = {Proceedings of the 2nd ACM SIGPLAN International Workshop on Programming for the Planet},
  pages = {47–54},
  numpages = {8},
  keywords = {Biodiversity, Declarative, Geospatial, Python},
  location = {Singapore, Singapore},
  series = {PROPL '25}
}

Thanks

Thanks to discussion and feedback from my colleagues, particularly Alison Eyres, Patrick Ferris, Amelia Holcomb, and Anil Madhavapeddy.

Inspired by the work of Daniele Baisero in his AoH library.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

yirgacheffe-1.10.2.tar.gz (78.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

yirgacheffe-1.10.2-py3-none-any.whl (52.5 kB view details)

Uploaded Python 3

File details

Details for the file yirgacheffe-1.10.2.tar.gz.

File metadata

  • Download URL: yirgacheffe-1.10.2.tar.gz
  • Upload date:
  • Size: 78.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.19

File hashes

Hashes for yirgacheffe-1.10.2.tar.gz
Algorithm Hash digest
SHA256 99eddb52986b733af044ad606c25ff204f8a94afd8820d3a5464749096008165
MD5 8da0254e10e18a92398b33ddcb210c72
BLAKE2b-256 47057f4838ae45c7fcea47e8525ee44a513b0d28ae90875f28943c6407c47d70

See more details on using hashes here.

File details

Details for the file yirgacheffe-1.10.2-py3-none-any.whl.

File metadata

  • Download URL: yirgacheffe-1.10.2-py3-none-any.whl
  • Upload date:
  • Size: 52.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.19

File hashes

Hashes for yirgacheffe-1.10.2-py3-none-any.whl
Algorithm Hash digest
SHA256 9a9121175c7c3ad957714a946d33608eb574cf2a38b301e901a59db0492455c0
MD5 1647a1796dabfbfbde9bba30092caf36
BLAKE2b-256 9fced065db9c6e7557997449ff2abe49da67549ac5f47cca39cb2cc3e43f4306

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page