An automated, convenient downloader for Google Earth Engine datasets in Python

eeharvest

An Agricultural Research Federation (AgReFed) project, the eeharvest package simplifies access to Google Earth Engine and its data catalog with a quartet of convenient methods to collect, process and download data:

  • preprocess(): server-side processing, cloud and shadow masking, image reduction and calculation of spectral indices
  • aggregate(): 🚧(work-in-progress)🚧 perform additional temporal aggregation on data
  • download(): download data collection(s) to disk without limits on size or number of files
  • map(): preview assets automatically in an interactive map

⚠ WARNING: eeharvest does only a few things, but it does them well. Its main objective is to provide a simple, intuitive interface to Google Earth Engine for researchers who may have little experience with Python or Google Earth Engine and "just want to download some maps". eeharvest is designed primarily for use with geodata-harvester, for fully automated and reproducible data extraction and processing, but it also works as a standalone package.

If you are an advanced user, we recommend that you use the Earth Engine API directly (but see useful add-on packages such as eemont and geemap in our acknowledgements below).

Why eeharvest?

This package is part of the AgReFed Geodata-Harvester project which extends the vision of providing Findable, Accessible, Interoperable and Reusable (FAIR) agricultural data (and beyond) to Australian researchers and stakeholders.

There are currently three packages that have been produced under AgReFed:

  • 🐍 geodata-harvester: a Python package for data extraction and processing from a wide range of data sources in Australia, with support for Google Earth Engine via a dependency on eeharvest (see below)
  • 🐍 eeharvest: this package, which provides access to Google Earth Engine and is designed to work as a standalone package
  • R dataharvester: an R package that replicates the functionality of geodata-harvester, but with additional support for functional R programming and the tidyverse

Features

  • Download from any dataset available on the Google Earth Engine Data Catalog
  • ✅ Perform automatic cloud and shadow masking (credit: eemont)
  • Scale and offset image bands instantly (credit: eemont)
  • Spatial aggregation/reduction (e.g. median)
  • Temporal aggregation/reduction (🚧 work-in-progress 🚧)
  • ✅ Quickly calculate from a vast library of spectral indices, e.g. NDVI, BAI (credit: Awesome Spectral Indices)
  • Preview assets instantly using interactive maps, including calculated spectral indices (credit: geemap)
  • Download any number of image assets with (almost) no size limits - please be sensible with this feature (credit: geedim)
  • Automate all of the above with the use of YAML config files
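
To make the band scaling and spectral index features above more concrete, here is a minimal local sketch of the arithmetic involved. It is not how eeharvest performs these steps (the package delegates them to eemont and runs them on Google's servers); the function names `scale_band` and `ndvi` are hypothetical, and the scale factor and offset are the ones published for Landsat Collection 2 Level-2 surface reflectance bands, which may differ for other datasets.

```python
# Illustration only: plain-Python arithmetic, not the eeharvest or eemont API.

def scale_band(dn, scale=0.0000275, offset=-0.2):
    """Convert a raw digital number to surface reflectance.

    Defaults are the Landsat Collection 2 Level-2 surface reflectance values;
    other datasets use different factors.
    """
    return dn * scale + offset


def ndvi(nir, red):
    """Normalised Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    total = nir + red
    if total == 0:
        # Avoid dividing by zero for empty or fully masked pixels.
        return float("nan")
    return (nir - red) / total


# Hypothetical raw pixel values for the near-infrared and red bands.
nir = scale_band(20000)
red = scale_band(10000)
value = ndvi(nir, red)
print(round(value, 3))
```

NDVI ranges from -1 to 1, with higher values generally indicating denser green vegetation.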

Examples

import eeharvest

eeharvest.initialise()

# specify collection, coordinates and date range
img = eeharvest.collect(
    collection="LANDSAT/LC08/C02/T1_L2",
    coords=[149.799, -30.31, 149.80, -30.309],
    date_min="2019-01-01",
    date_max="2019-02-01",
)

# cloud and shadow masking, spatial aggregation, NDVI calculation
img.preprocess(mask_clouds=True, reduce="median", spectral="NDVI")

# visualise (optional, but fun)
img.map(bands="NDVI")

# download to disk (defaults to a "downloads" folder in working directory)
img.download(bands="NDVI")
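
Before sending a request, it can help to sanity-check the bounding box locally. The sketch below assumes that `coords` is ordered as `[min_lon, min_lat, max_lon, max_lat]` in decimal degrees, which is consistent with the values in the example above but is not confirmed here. `check_bbox` is a hypothetical helper, not part of eeharvest.

```python
# Hypothetical helper, assuming coords = [min_lon, min_lat, max_lon, max_lat].

def check_bbox(coords):
    """Validate a bounding box and return its (width, height) in degrees."""
    if len(coords) != 4:
        raise ValueError("expected four values: min_lon, min_lat, max_lon, max_lat")
    min_lon, min_lat, max_lon, max_lat = coords
    if not (-180 <= min_lon < max_lon <= 180):
        raise ValueError("longitudes must satisfy -180 <= min_lon < max_lon <= 180")
    if not (-90 <= min_lat < max_lat <= 90):
        raise ValueError("latitudes must satisfy -90 <= min_lat < max_lat <= 90")
    return (max_lon - min_lon, max_lat - min_lat)


# The bounding box from the example above.
width, height = check_bbox([149.799, -30.31, 149.80, -30.309])
print(f"bounding box spans {width:.4f} x {height:.4f} degrees")
```

This simple check rejects boxes that cross the antimeridian, so it would need adjusting for areas that straddle 180 degrees longitude.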

Installation

Installing dependencies from conda

Before installing the package you may need to install the following packages manually:

  • GDAL: to manipulate raster and vector geospatial data
  • gcloud CLI: needed to authenticate to Google servers

In most cases, these can be installed through conda-forge (but see alternatives below if not):

conda install -c conda-forge gdal google-cloud-sdk

Installing dependencies from binaries

If conda is not an option, you can install the two dependencies from binaries. For GDAL, use apt-get (Debian/Ubuntu) or brew (macOS). Clear instructions are available on the rasterio and PyPI GDAL websites. For the Google Cloud SDK, follow the instructions on the gcloud CLI website.

Conda - recommended

conda install -c conda-forge eeharvest

Pip

pip install -U eeharvest
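
To confirm that the installation succeeded, one option is to query the installed package metadata using only the Python standard library. This is a minimal sketch; `installed_version` is a hypothetical helper, not part of eeharvest.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version("eeharvest")
if found is None:
    print("eeharvest is not installed in this environment")
else:
    print(f"eeharvest {found} is installed")
```

If the package is reported as missing, check that the conda environment or virtual environment you installed into is the one currently active.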

Attribution and Acknowledgments

This software was developed by the Sydney Informatics Hub, a core research facility of the University of Sydney, as part of the Data Harvesting project for the Agricultural Research Federation (AgReFed). AgReFed is supported by the Australian Research Data Commons (ARDC) and the Australian Government through the National Collaborative Research Infrastructure Strategy (NCRIS).

Acknowledgments are an important way for us to demonstrate the value we bring to your research. Your research outcomes are vital for ongoing funding of the Sydney Informatics Hub. If you make use of this software for your research project, please include the following acknowledgment:

This research was supported by the Sydney Informatics Hub, a Core Research Facility of the University of Sydney, and the Agricultural Research Federation (AgReFed).

Credits

Note

This project has been set up using PyScaffold 4.3.1 and the dsproject extension 0.7.2. For more information see CONTRIBUTING.md in this repository.
