Because using satellite data should not be rocket science.

AgriGEE.lite

{Mascot}

AgriGEE.lite is a Google Earth Engine (GEE) wrapper that makes it easy to download Analysis Ready Multimodal Data (ARD), with a focus on time series for agricultural and native vegetation areas.

For example, to download and view a time series of cloud-free Sentinel 2 imagery cropped to a specific field and date range, only a few lines of code are required. Here’s an example:

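import agrigee_lite as agl
import ee
import geopandas as gpd  # needed for gpd.read_parquet below

# Initialize the Earth Engine client (assumes you have already authenticated).
ee.Initialize()

# Each row holds a field geometry plus the start and end dates of interest.
gdf = gpd.read_parquet("data/sample.parquet")
row = gdf.iloc[0]

# Request only the RGB bands and plot the images for this field and date range.
satellite = agl.sat.Sentinel2(bands=["red", "green", "blue"])
agl.vis.images(row.geometry, row.start_date, row.end_date, satellite)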

This example already shows how the library works at a basic level. The whole library is designed to be used together with GeoPandas, so basic knowledge of GeoPandas is required.

{Sentinel 2 RGB Agricultural Area from Mato Grosso, Brazil}

You can also download aggregations, such as spatial median aggregations of indices. Here's an example with median from multiple satellites:

{Multiple satellites EVI2 time series}
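To make the idea of a spatial median aggregation concrete, here is a minimal sketch in plain Python. It assumes EVI2 is computed with the usual two-band formula, 2.5 * (NIR - Red) / (NIR + 2.4 * Red + 1), on reflectances scaled to the 0-1 range. The function names and pixel values are made up for illustration and are not part of the AgriGEE.lite API.

```python
from statistics import median


def evi2(nir: float, red: float) -> float:
    """Two-band Enhanced Vegetation Index for reflectances in the 0-1 range."""
    return 2.5 * (nir - red) / (nir + 2.4 * red + 1.0)


def spatial_median_evi2(pixels: list[tuple[float, float]]) -> float:
    """Median EVI2 over all (nir, red) pixels inside one field on one date."""
    return median(evi2(nir, red) for nir, red in pixels)


# Hypothetical (nir, red) reflectances for the pixels of one field on one date.
field_pixels = [(0.45, 0.05), (0.50, 0.04), (0.30, 0.10), (0.48, 0.06)]
print(round(spatial_median_evi2(field_pixels), 3))
```

Repeating this for every available date yields one value per date, which is the kind of time series shown in the figure. The median is less sensitive than the mean to a few odd pixels, such as field borders or leftover cloud.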

For more examples, see the examples folder.

Finally, the library supports multithreaded downloading, which reaches 16-22 time series per second on average (assuming 1-year, cloud-free Sentinel 2 Surface Reflectance series).
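To give a rough feel for what that throughput means, and for how thread-based downloading works in general, here is a small sketch. It is not the library's implementation: `fetch_series` is a hypothetical stand-in that just sleeps to imitate a network-bound request, and the worker count is an arbitrary choice.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_series(field_id: int) -> dict:
    """Hypothetical stand-in for one network-bound time series request."""
    time.sleep(0.01)  # pretend we are waiting on a remote server
    return {"field_id": field_id, "n_observations": 0}


def download_all(field_ids: list[int], max_workers: int = 8) -> list[dict]:
    """Run the requests in a thread pool; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch_series, field_ids))


def estimated_minutes(n_series: int, series_per_second: float) -> float:
    """Back-of-the-envelope download time at a given sustained rate."""
    return n_series / series_per_second / 60.0


results = download_all(list(range(40)))
print(len(results))

# 100,000 fields at the quoted 16 and 22 series per second, in minutes.
print(round(estimated_minutes(100_000, 16), 1), round(estimated_minutes(100_000, 22), 1))
```

Threads help here because each request spends most of its time waiting on the server rather than computing. At the quoted rates, 100,000 one-year series would take roughly 76 to 104 minutes.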

The library has three kinds of elements, each in its own module:

  • agl.sat = Data sources, usually coming from satellites/sensors. When defining a sensor, it is possible to choose which bands you want to view/download, or whether you want to use atmospheric corrections or not. By default, all bands are downloaded, and all atmospheric corrections and harmonizations are used.

  • agl.vis = Module that allows you to view data, either through time series or images.

  • agl.get = Module that allows you to download data on a large scale.

Available data sources (satellites, sensors, models and so on)

| Name | Bands | Start Date | End Date | Regionality | Pixel Size (m) | Revisit Time | Variations |
|---|---|---|---|---|---|---|---|
| Sentinel 2 | Blue, Green, Red, Re1, Re2, Re3, Nir, Re4, Swir1, Swir2 | 2016-01-01 | (still operational) | Worldwide | 10 -- 60 | 5 days (with clouds), 8 days (without clouds) | Surface Reflectance, Top of Atmosphere |
| Landsat 5 | Blue, Green, Red, Nir, Swir1, Swir2 | 1984-03-01 | 2013-05-05 | Worldwide* | 15 -- 30 | 16 days | Surface Reflectance, Top of Atmosphere; Tier 1 and Tier 2 |
| Landsat 7 | Blue, Green, Red, Nir, Swir1, Swir2 | 1999-04-15 | 2022-04-06 | Worldwide* | 15 -- 30 | 16 days | Surface Reflectance, Top of Atmosphere; Tier 1 and Tier 2 |
| Landsat 8 | Blue, Green, Red, Nir, Swir1, Swir2 | 2013-04-11 | (still operational) | Worldwide | 15 -- 30 | 16 days | Surface Reflectance, Top of Atmosphere; Tier 1 and Tier 2 |
| Landsat 9 | Blue, Green, Red, Nir, Swir1, Swir2 | 2021-11-01 | (still operational) | Worldwide | 15 -- 30 | 16 days | Surface Reflectance, Top of Atmosphere; Tier 1 and Tier 2 |
| MODIS Terra/Aqua | Red, Nir | 2000-02-18 | (still operational) | Worldwide | 15 -- 30 | daily (with clouds) | |
| Sentinel 1 | VV, VH - C Band | 2014-10-03 | (still operational) | Worldwide* | 10** | 5 days**** | GRD, ARD*** |
| JAXOS PalSAR 1/2 | HH, HV - L Band | 2014-08-04 | (still operational) | Worldwide | 25** | 15 days | GRD |
| Satellite Embeddings V1 | 64-dimensional embedding | 2017-01-01 | 2024-01-01 | Worldwide | 10 | 1 year | |
| Mapbiomas Brazil | 37 Land Usage Land Cover Classes | 1985-01-01 | 2024-12-31 | Brazil | 30 | 1 year | |
| ANADEM | Slope, Elevation, Aspect | (single image) | (single image) | South America | 30** | (single image) | |
| World Reference Base (2006) Soil Groups - SoilGrids | WRB Soil Classes (30 categories) | (single image) | (single image) | Worldwide | 250 | (single image) | |

Observations

  • *Landsat 7 images have had artifacts caused by a sensor problem since 2003-05-31.
  • **Pixel size/spatial resolution for active sensors (or models that use active sensors) often lacks a single clear value, as it depends on the angle of incidence. The value listed here is the one reported by GEE, which represents the highest resolution captured.
  • ***Analysis Ready Data (ARD) is an advanced post-processing method applied to SAR data. However, it is quite costly, and its usefulness must be evaluated on a case-by-case basis.
  • ****Sentinel 1 originally operated as a pair of twin satellites, one of which went out of service due to a malfunction. Therefore, the revisit time varies greatly depending on the location.
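The start and end dates in the table determine which sources can cover a given field and season. The sketch below copies a few rows of the table into a plain dictionary and lists the sources whose operating period fully contains a requested date range. The dictionary and the function are illustrative only and are not part of the library; sources marked "still operational" are given an open end date.

```python
from datetime import date

# (start, end) taken from the table above; None means still operational.
SOURCE_PERIODS = {
    "Sentinel 2": (date(2016, 1, 1), None),
    "Landsat 5": (date(1984, 3, 1), date(2013, 5, 5)),
    "Landsat 7": (date(1999, 4, 15), date(2022, 4, 6)),
    "Landsat 8": (date(2013, 4, 11), None),
    "Landsat 9": (date(2021, 11, 1), None),
    "Sentinel 1": (date(2014, 10, 3), None),
}


def sources_covering(start: date, end: date) -> list[str]:
    """Names of sources whose operating period contains [start, end]."""
    covering = []
    for name, (first, last) in SOURCE_PERIODS.items():
        if first <= start and (last is None or end <= last):
            covering.append(name)
    return covering


print(sources_covering(date(2019, 10, 1), date(2020, 9, 30)))
# ['Sentinel 2', 'Landsat 7', 'Landsat 8', 'Sentinel 1']
```

Keep the footnotes in mind when reading the result: Landsat 7 is listed for this period, but its images from those years carry the sensor artifacts noted above.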

Motivations: what an average data scientist - me - thought when I started learning GEE

My journey with GEE started two and a half years ago. GEE is excellent: it gives you access to data from many different satellites, but it is very complex to code against. Besides requiring A LOT of boilerplate code, its errors are extremely confusing, since the code is executed server-side, most likely in a purely functional language (like Haskell). During my master's degree, I struggled a lot writing code for GEE. Furthermore, harmonizing all satellites at the same time is difficult: typically, each satellite has a different range of values and different cloud masks. Tired of rewriting similar code, I decided to create a library with a simple goal: using satellite data should be as simple as reading a CSV in Pandas, and you shouldn't need to be a Remote Sensing expert to do it.

Objectives and target audience

The main objective of the library is to be a simple, direct and high-performance way to download satellite data, for both academic and commercial use. Did you like it? Give it a star. Want to contribute? Open a Pull Request; I will be happy to include it.

Questions possibly asked

But isn't it just a case of using STAC? Why pay Google?

This is a tempting proposition, and it actually makes sense for large-scale projects. However, processing satellite data locally can easily escalate to hell, especially for countries with huge agricultural areas like Australia, the United States or Brazil. So you have to do the math to figure out whether using GEE is cheaper than running your own processing infrastructure on top of STAC. Note that GEE is completely free for students and non-commercial projects.

"Hello, I am a Remote Sensing expert, and I believe that the term satellite is not the most appropriate, radars do not have bands and.... "

Yes, I was told that. The term "satellite" is used instead of sensor, data source or something else to keep things simple, even though it is not the most correct term. Note that even Mapbiomas, a WONDERFUL project that is made using models and AMAZING PEOPLE (❤️), is called a satellite and is treated exactly the same as Sentinel 2 or any Landsat. The same goes for the idea of "bands" in a radar like Sentinel 1. The more standardized it is, the easier it is to keep the library code working. However, note that your help to the project is VERY WELCOME.

The library mascot is cute! Did you make it?

Absolutely not, I'm terrible at drawing. I made it using GPT-4, and all the rights belong to God knows who. The base art is from the Odd-Eyes Venom Dragon card from the Yu-Gi-Oh card game. The inspiration has nothing to do with venom; I chose it because it is a plant dragon (agriculture), it is a fusion card (multimodal data) and it has odd eyes (like satellites, seeing the world through different eyes). If you're a cartoonist and want to design a new mascot, I'd be more than happy to make it official.

Known Bugs

  • QuadTree clustering functions produce absurd results when there are very uneven geographic density distributions, for example, thousands of points in one country and a few dozen in another. Some prior geospatial partitioning is recommended.
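One simple way to do such prior partitioning is to bucket the points into coarse longitude/latitude cells and run the clustering separately inside each bucket. The sketch below shows only the bucketing step in plain Python. The 10-degree cell size and the function name are arbitrary assumptions, and the clustering call itself is left out.

```python
import math
from collections import defaultdict


def partition_by_grid(points, cell_degrees=10.0):
    """Group (lon, lat) points by the coarse grid cell they fall into."""
    cells = defaultdict(list)
    for lon, lat in points:
        key = (math.floor(lon / cell_degrees), math.floor(lat / cell_degrees))
        cells[key].append((lon, lat))
    return dict(cells)


# Hypothetical sample: many points in central Brazil, a couple in Australia.
points = [(-55.0 + i * 0.001, -12.0) for i in range(1000)]
points += [(135.0, -25.0), (136.0, -26.0)]

for cell, members in partition_by_grid(points).items():
    print(cell, len(members))
```

Each bucket can then be processed on its own, so a dense region no longer shares a single tree with a sparse, distant one.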

TO-DO

  • Add Sentinel 2 as a satellite;
  • Add Landsats 5, 7, 8, 9 as a satellite;
  • Add Sentinel 1 GRD as a satellite;
  • Add Mapbiomas Brazil as a satellite (data source);
  • Add MODIS Terra/Aqua;
  • Add Satellite Image Time Series Aggregations online download;
  • Add Satellite Image Time Series Aggregations task download;
  • Add Images online download/visualization with matplotlib;
  • Add single/multiple SITS visualization;
  • Add smart_open[gcs] for autorecovery SITS from GCS;
  • Add ALOS-2 PALSAR-2 radar;
  • Add Images online visualization with plotly;
  • Make cloud mask removable;
  • Add all other Mapbiomas;
  • Add Sentinel 1 ARD;
  • Add Sentinel 3;
  • Add jurassic Landsats (1-4);
  • Add Landsat Pansharpening for image download;
