A toolbox to download, process, store and visualise Global Ecosystem Dynamics Investigation (GEDI) L2A-B and L4A-C data

Project description

gediDB: A toolbox for Global Ecosystem Dynamics Investigation (GEDI) L2A-B and L4A-C data

gediDB is an open-source Python package designed to streamline the processing, analysis, and management of GEDI L2A-B and L4A-C data. This toolbox enables efficient and flexible data querying and management of large GEDI datasets stored with TileDB, a high-performance, multi-dimensional array database.

gediDB integrates key functionalities such as structured data querying, multi-dimensional data processing, and metadata management. With built-in support for parallel engines (e.g. Dask), the toolbox ensures scalability for large datasets, allowing efficient parallel processing on local machines or clusters.
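To make the idea of bounded parallelism concrete, here is a minimal sketch using only the Python standard library. It is not gediDB's implementation and does not use Dask: `process_granule`, `process_all`, and the granule IDs are hypothetical names chosen for illustration, and `concurrent.futures` stands in for whichever parallel engine is actually configured.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def process_granule(granule_id):
    """Hypothetical stand-in for downloading, parsing and storing one granule."""
    # A real pipeline would fetch and parse a GEDI file here; this is a placeholder.
    return {"granule": granule_id, "status": "ok"}


def process_all(granule_ids, max_workers=4):
    """Process granules concurrently with at most max_workers running at once."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each submitted future back to the granule it belongs to.
        futures = {pool.submit(process_granule, g): g for g in granule_ids}
        for future in as_completed(futures):
            results[futures[future]] = future.result()
    return results


if __name__ == "__main__":
    summary = process_all(["granule_a", "granule_b", "granule_c"], max_workers=2)
    for granule_id in sorted(summary):
        print(granule_id, summary[granule_id]["status"])
```

Lowering `max_workers` trades throughput for a smaller memory and network footprint, which is the kind of control over concurrency that matters on a shared machine.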

Key Features of gediDB

  • TileDB-Based Storage: GEDI data is stored and managed in TileDB arrays, providing efficient, scalable, multi-dimensional data storage, enabling fast and flexible access to large volumes of data.
  • Flexible Data Querying: Query GEDI data across spatial, temporal, and variable dimensions. Access data within bounding boxes, or retrieve the nearest shots to a specific location, and narrow the results with additional filters.
  • Parallel Processing: Process large GEDI datasets in parallel, enabling concurrent downloading, processing, and TileDB insertion of GEDI products. The number of concurrent processes can be easily controlled based on available system resources.
  • Metadata-Driven: Maintain and manage metadata for each dataset, ensuring that important contextual information such as units, descriptions, and source details is stored and accessible.
  • Geospatial Data Management: Integrate seamlessly with TileDB to enable spatial queries, transformations, and geospatial analyses.
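The two spatial query types mentioned above, bounding-box selection and nearest-shot lookup, can be illustrated in plain Python. This is a conceptual sketch only, not gediDB's query API: the `Shot` record, the function names, and the sample coordinates and height values are all invented for illustration, and a real query would run against TileDB arrays rather than an in-memory list.

```python
import math
from typing import NamedTuple


class Shot(NamedTuple):
    """Hypothetical minimal record for one GEDI shot."""
    shot_number: int
    lon: float
    lat: float
    rh98: float  # illustrative relative-height value in metres


def in_bbox(shots, min_lon, min_lat, max_lon, max_lat):
    """Return the shots whose coordinates fall inside the bounding box."""
    return [s for s in shots
            if min_lon <= s.lon <= max_lon and min_lat <= s.lat <= max_lat]


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometres between two lon/lat points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def nearest_shots(shots, lon, lat, k=1):
    """Return the k shots closest to the given point, nearest first."""
    return sorted(shots, key=lambda s: haversine_km(lon, lat, s.lon, s.lat))[:k]


# Made-up sample data, for illustration only.
SHOTS = [
    Shot(1, 13.05, 52.40, 21.3),
    Shot(2, 13.10, 52.45, 18.7),
    Shot(3, 14.50, 51.00, 25.1),
    Shot(4, 2.35, 48.85, 12.4),
]

if __name__ == "__main__":
    print([s.shot_number for s in in_bbox(SHOTS, 13.0, 52.0, 13.5, 53.0)])
    print([s.shot_number for s in nearest_shots(SHOTS, 13.06, 52.41, k=2)])
```

A linear scan like this is only workable for small samples; the point of an array store such as TileDB is to avoid touching data outside the requested region.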

Why gediDB?

gediDB simplifies and automates the workflow for GEDI data processing, making it easier to retrieve, filter, and analyze complex datasets efficiently and at scale. Whether you're investigating biomass distribution, monitoring forest dynamics, or conducting large-scale ecological studies, gediDB provides the tools to handle and analyze large GEDI datasets.

Documentation

Learn more about gediDB in its official documentation at https://gedidb.readthedocs.io/en/latest/.

Contributing

You can find information about contributing to gediDB on our Contributing page.

History

The development of the gediDB package began during the PhD of Amelia Holcomb, who initially created part of this toolset to analyze and manage GEDI data for her research. Recognizing the potential of her work to benefit the broader scientific community, the Global Land Monitoring team collaborated with Amelia in March 2024 to expand and optimize her code, transforming it into a scalable and versatile Python package named gediDB. This collaboration refined the toolbox to handle large-scale datasets with TileDB, integrate parallel processing, and incorporate a robust querying and metadata management system. Today, gediDB is designed to help researchers in ecological and environmental sciences by making GEDI data processing more efficient and accessible.

About the authors

Simon Besnard, a senior researcher in the Global Land Monitoring Group at GFZ Helmholtz Centre Potsdam, studies terrestrial ecosystems' dynamics and their feedback on environmental conditions. He specializes in developing methods to analyze large EO and climate datasets to understand ecosystem functioning in a changing climate. His current research focuses on forest structure changes over the past decade and their links to the carbon cycle.

Felix Dombrowski is a Bachelor’s student in Computer Science at the University of Potsdam and a research intern in the Global Land Monitoring Group at GFZ Helmholtz Centre Potsdam. At GFZ, his work has focused on developing toolboxes to process Earth Observation data efficiently.

Amelia Holcomb is a PhD candidate in Computer Science at the University of Cambridge, researching remote sensing and machine learning to study carbon sequestration and forest regrowth. Previously, she worked as a site reliability engineer at Google on Bigtable. She holds an MMath from the University of Waterloo and a B.A. in Mathematics from Yale.

Contact

For any questions or inquiries, please contact:

Acknowledgments

We acknowledge funding support by the European Union through the FORWARDS project. We would also like to thank the R2D2 Workshop (March 2024, GFZ Potsdam) for providing the opportunity to meet and discuss GEDI data processing.

License

This project is licensed under the European Union Public Licence (EUPL) v1.2 - see the LICENSE file for details.

Download files

Source Distribution

gedidb-2025.9.24.tar.gz (10.2 MB)

Uploaded Source

Built Distribution

gedidb-2025.9.24-py3-none-any.whl (10.4 MB)

Uploaded Python 3

File details

Details for the file gedidb-2025.9.24.tar.gz.

File metadata

  • Download URL: gedidb-2025.9.24.tar.gz
  • Upload date:
  • Size: 10.2 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for gedidb-2025.9.24.tar.gz
Algorithm    Hash digest
SHA256       970f3ac59f4d34a3b9e0ae3be495975fce8f70d3b783328f394f724d9624bb0d
MD5          08b44ccb0db27ceed8d9bb894d798404
BLAKE2b-256  1eb8e820cb8952f39069b7efb8e261ba0926f793c10841c2d5e7b34dc1089ed6
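
A downloaded file can be checked against the published SHA256 digest with Python's standard `hashlib` module. The sketch below is one way to do it; the helper names `sha256_of_file` and `matches_published` are illustrative, and the file path in the usage note assumes the archive sits in the current directory.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, expected_hex):
    """Return True if the file's SHA256 digest equals the published value."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For the source distribution, the call would look like `matches_published("gedidb-2025.9.24.tar.gz", "970f3ac59f4d34a3b9e0ae3be495975fce8f70d3b783328f394f724d9624bb0d")`.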

Provenance

The following attestation bundles were made for gedidb-2025.9.24.tar.gz:

Publisher: pypi-release.yaml on simonbesnard1/gedidb

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file gedidb-2025.9.24-py3-none-any.whl.

File metadata

  • Download URL: gedidb-2025.9.24-py3-none-any.whl
  • Upload date:
  • Size: 10.4 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for gedidb-2025.9.24-py3-none-any.whl
Algorithm    Hash digest
SHA256       6c2098e97ddb1688f70a38b16bc3c18f5ec9611aa4f94db274ee89a872b4f720
MD5          ab8a7c25c7caa332bd19041b6bc4289c
BLAKE2b-256  0965c2dd97414887ff05ec0ee53adca4d301ca72242d0053588a0d6e07ee5a48

Provenance

The following attestation bundles were made for gedidb-2025.9.24-py3-none-any.whl:

Publisher: pypi-release.yaml on simonbesnard1/gedidb
