Abstraction of gdal datasets for doing basic math operations

Yirgacheffe: a declarative geospatial library for Python to make data-science with maps easier

Overview

Yirgacheffe is a declarative geospatial library that lets you operate on both raster and polygon geospatial datasets without the tedious bookkeeping around layer alignment, and without having to deal with hardware concerns such as memory or parallelism.

Example common use-cases:

  • Do the datasets overlap? Yirgacheffe will let you define either the intersection or the union of a set of different datasets, scaling up or down the area as required.
  • Rasterisation of vector layers: if you have a vector dataset then you can add that to your computation and Yirgacheffe will rasterise it on demand, so you never need to store more data in memory than necessary.
  • Do the raster layers get big and take up large amounts of memory? Yirgacheffe will let you do simple numerical operations with layers directly and will handle the memory management behind the scenes for you.
  • Parallelisation of operations over many CPU cores.
  • Built-in support for optionally using GPUs via MLX.
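
To make the first use-case concrete, the following is a minimal conceptual sketch of what "intersection" and "union" mean for the extents of several layers. It uses only the standard library; the `Extent` class and both helper functions are hypothetical names invented for illustration and are not part of Yirgacheffe's API, which performs this bookkeeping for you.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Extent:
    """Hypothetical bounding box in map units (not a Yirgacheffe type)."""
    left: float
    bottom: float
    right: float
    top: float


def intersection(a: Extent, b: Extent) -> Extent:
    """Area covered by both extents; raises if they do not overlap."""
    box = Extent(
        max(a.left, b.left), max(a.bottom, b.bottom),
        min(a.right, b.right), min(a.top, b.top),
    )
    if box.left >= box.right or box.bottom >= box.top:
        raise ValueError("extents do not overlap")
    return box


def union(a: Extent, b: Extent) -> Extent:
    """Smallest extent that covers both inputs."""
    return Extent(
        min(a.left, b.left), min(a.bottom, b.bottom),
        max(a.right, b.right), max(a.top, b.top),
    )


# Made-up extents standing in for three datasets.
layers = [
    Extent(0.0, 0.0, 10.0, 10.0),
    Extent(5.0, 2.0, 15.0, 12.0),
    Extent(4.0, -3.0, 9.0, 8.0),
]
common_area = reduce(intersection, layers)  # region every layer covers
total_area = reduce(union, layers)          # region any layer covers
print(common_area)
print(total_area)
```

A calculation over the intersection only touches pixels that every layer has data for, whereas one over the union has to treat the missing regions of each layer as empty.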

Installation

Yirgacheffe is available via PyPI, so it can be installed with pip, for example:

$ pip install yirgacheffe
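
One way to confirm that the install succeeded is to ask Python's packaging metadata for the installed version. This is a small sketch using only the standard library; `installed_version` is a hypothetical helper name, not something Yirgacheffe provides.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("yirgacheffe")
print(version if version else "yirgacheffe is not installed")
```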

Documentation

The documentation can be found at yirgacheffe.org.

Simple examples:

Here is how to do cloud removal from Sentinel-2 data, using the Scene Classification Layer data:

import yirgacheffe as yg

with (
  yg.read_raster("T37NCG_20250909T073609_B06_20m.jp2") as vre2,
  yg.read_raster("T37NCG_20250909T073609_SCL_20m.jp2") as scl,
):
  is_cloud = (scl == 8) | (scl == 9) | (scl == 10)  # various cloud types
  is_shadow = (scl == 3)
  is_bad = is_cloud | is_shadow

  masked_vre2 = yg.where(is_bad, float("nan"), vre2)
  masked_vre2.to_geotiff("vre2_cleaned.tif")
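
For readers unfamiliar with the Scene Classification Layer, the sketch below spells out the same per-pixel rule in plain Python on two tiny made-up tiles. It does not use Yirgacheffe and is only meant to show what the masking expression computes for each pixel; `mask_pixel` and the sample values are assumptions for illustration.

```python
import math

# Sentinel-2 SCL classes treated as unusable, matching the example above.
CLOUD_CLASSES = {8, 9, 10}  # cloud medium probability, cloud high probability, thin cirrus
SHADOW_CLASSES = {3}        # cloud shadow
BAD_CLASSES = CLOUD_CLASSES | SHADOW_CLASSES


def mask_pixel(band_value: float, scl_class: int) -> float:
    """Return NaN for cloudy or shadowed pixels, else the band value."""
    return math.nan if scl_class in BAD_CLASSES else band_value


# Tiny made-up 2x3 tiles standing in for the real rasters.
band = [[0.21, 0.35, 0.40], [0.18, 0.27, 0.33]]
scl = [[4, 8, 5], [3, 4, 10]]

masked = [
    [mask_pixel(value, cls) for value, cls in zip(band_row, scl_row)]
    for band_row, scl_row in zip(band, scl)
]
print(masked)
```

The library version expresses the same rule over whole layers at once, which is what allows it to decide how much of each raster to hold in memory at a time.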

A species' Area of Habitat (AoH) calculation follows the same pattern:

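import yirgacheffe as yg

# species_min and species_max are the species' elevation limits; they and the
# habitat codes below must be supplied by the caller.
with (
    yg.read_raster("habitats.tif") as habitat_map,
    yg.read_raster("elevation.tif") as elevation_map,
    yg.read_shape("species123.geojson") as range_map,
    # Assumed file name: a raster holding the area of each pixel.
    yg.read_raster("area_per_pixel.tif") as area_per_pixel_map,
):
    refined_habitat = habitat_map.isin([...species habitat codes...])
    refined_elevation = (elevation_map >= species_min) & (elevation_map <= species_max)
    aoh = refined_habitat * refined_elevation * range_map * area_per_pixel_map
    print(f'Area of habitat: {aoh.sum()}')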
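
The arithmetic behind that product can be sketched in plain Python: each condition acts as a 0/1 mask, so a pixel contributes its area only when every condition holds. The example below also sums one row at a time to illustrate how a total can be accumulated without holding the full result in memory. All values, codes and names here are made up for illustration and say nothing about how Yirgacheffe schedules the work internally.

```python
# All values below are invented for illustration.
HABITAT_CODES = {100, 110}                # hypothetical suitable habitat classes
ELEVATION_MIN, ELEVATION_MAX = 200, 1500  # hypothetical limits in metres

habitat = [[100, 200, 110], [110, 100, 300]]
elevation = [[250, 900, 1600], [300, 150, 700]]
in_range = [[1, 1, 1], [1, 1, 0]]               # rasterised range polygon
pixel_area = [[1.0, 1.0, 1.0], [0.9, 0.9, 0.9]]  # area of each pixel


def aoh_rows(habitat_rows, elevation_rows, range_rows, area_rows):
    """Yield the Area of Habitat contribution of each row, one row at a time."""
    for h_row, e_row, r_row, a_row in zip(
        habitat_rows, elevation_rows, range_rows, area_rows
    ):
        yield sum(
            (h in HABITAT_CODES) * (ELEVATION_MIN <= e <= ELEVATION_MAX) * r * a
            for h, e, r, a in zip(h_row, e_row, r_row, a_row)
        )


total_aoh = sum(aoh_rows(habitat, elevation, in_range, pixel_area))
print(f"Area of habitat: {total_aoh}")
```

Only two pixels pass every test in this toy data: the top-left one and the first pixel of the second row.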

Citation

If you use Yirgacheffe in your research, please cite our paper:

Michael Winston Dales, Alison Eyres, Patrick Ferris, Francesca A. Ridley, Simon Tarr, and Anil Madhavapeddy. 2025. Yirgacheffe: A Declarative Approach to Geospatial Data. In Proceedings of the 2nd ACM SIGPLAN International Workshop on Programming for the Planet (PROPL '25). Association for Computing Machinery, New York, NY, USA, 47–54. https://doi.org/10.1145/3759536.3763806

BibTeX
@inproceedings{10.1145/3759536.3763806,
  author = {Dales, Michael Winston and Eyres, Alison and Ferris, Patrick and Ridley, Francesca A. and Tarr, Simon and Madhavapeddy, Anil},
  title = {Yirgacheffe: A Declarative Approach to Geospatial Data},
  year = {2025},
  isbn = {9798400721618},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3759536.3763806},
  doi = {10.1145/3759536.3763806},
  abstract = {We present Yirgacheffe, a declarative geospatial library that allows spatial algorithms to be implemented concisely, supports parallel execution, and avoids common errors by automatically handling data (large geospatial rasters) and resources (cores, memory, GPUs). Our primary user domain comprises ecologists, where a typical problem involves cleaning messy occurrence data, overlaying it over tiled rasters, combining layers, and deriving actionable insights from the results. We describe the successes of this approach towards driving key pipelines related to global biodiversity and describe the capability gaps that remain, hoping to motivate more research into geospatial domain-specific languages.},
  booktitle = {Proceedings of the 2nd ACM SIGPLAN International Workshop on Programming for the Planet},
  pages = {47–54},
  numpages = {8},
  keywords = {Biodiversity, Declarative, Geospatial, Python},
  location = {Singapore, Singapore},
  series = {PROPL '25}
}

Thanks

Thanks to discussion and feedback from my colleagues, particularly Alison Eyres, Patrick Ferris, Amelia Holcomb, and Anil Madhavapeddy.

Inspired by the work of Daniele Baisero in his AoH library.
