Virtualizarr access for AEF embeddings.

Reason this release was yanked:

Introduced a regression resulting in default unmasked values pulling aef values

Project description

aef-loader

VirtualiZarr access to AEF embeddings as an analysis-ready data cube, with rapid querying of the GCS and Source Cooperative indexes. 2x quicker than rioxarray for single-tile downloads.

What is AEF?

AlphaEarth Foundations (AEF) embeddings are a dataset produced by Google DeepMind, providing yearly 64-channel embeddings derived from numerous satellite image sources, with many downstream applications. The embeddings are stored as multi-band Cloud-Optimised GeoTIFFs (COGs), alongside a Parquet index file.

AEF is stored by two hosts:

  • Google Cloud Storage - requester pays (requires gcp_project)
  • Source Cooperative - AWS hosted and free to access

More in the docs.

What does aef-loader do?

aef-loader provides two key functionalities:

  1. Rapid download and querying of the Source Cooperative and GCS indexes with obstore and geopandas
  2. Lazy loading of the COGs through VirtualiZarr as a datatree grouped by UTM zone. COG headers are cached, so repeated reads are cheap(er)
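
To illustrate why a single query can return more than one zone, here is a minimal sketch of which UTM zones a lon/lat bounding box touches. It assumes the standard 6-degree UTM zone layout and WGS 84 / UTM EPSG numbering (326xx north, 327xx south), and ignores the Norway/Svalbard exceptions and boxes that cross the antimeridian. `utm_zone` and `utm_epsg_codes` are hypothetical helpers, not part of aef-loader, and the actual tile-to-zone assignment in the AEF index may differ at zone edges.

```python
import math


def utm_zone(lon: float) -> int:
    """Standard UTM zone number (1-60) for a longitude in degrees."""
    return int(math.floor((lon + 180.0) / 6.0)) % 60 + 1


def utm_epsg_codes(bbox: tuple[float, float, float, float]) -> list[int]:
    """EPSG codes of the WGS 84 / UTM zones a (min_lon, min_lat, max_lon, max_lat) box touches.

    Assumes the box does not cross the antimeridian.
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    zones = range(utm_zone(min_lon), utm_zone(max_lon) + 1)
    bases = set()
    if max_lat >= 0:
        bases.add(32600)  # northern hemisphere zones
    if min_lat < 0:
        bases.add(32700)  # southern hemisphere zones
    return sorted(base + zone for base in bases for zone in zones)


# A box around San Francisco Bay sits entirely in zone 10N.
print(utm_epsg_codes((-122.5, 37.5, -122.0, 38.0)))
# A box straddling 120 degrees west touches zones 10N and 11N.
print(utm_epsg_codes((-121.0, 37.0, -119.0, 38.0)))
```

A box that touches several zones yields several native CRSs, which is why a common target grid is needed before the zones can be combined.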

As additional utilities:

  • dequantize, requantize, or raw-cast (int8 → float32) the embeddings
  • split the "embeddings" dataset into 64 datasets
  • use an odc-geo GeoBox for dask-aware reprojection when creating multi-zone composites
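
As a rough illustration of what dequantizing and requantizing involve, the NumPy sketch below maps int8 values to float32 and back. It assumes the mapping described in the dataset's published documentation as I understand it (divide by 127.5, square, restore the sign, with -128 as the no-data value). Verify this against the official AEF documentation before relying on it. The function names are hypothetical and are not aef-loader's API.

```python
import numpy as np

NODATA = -128  # assumed int8 no-data sentinel


def dequantize(q) -> np.ndarray:
    """Map int8 embeddings to float32 in [-1, 1]; no-data becomes NaN."""
    q = np.asarray(q)
    scaled = q.astype(np.float32) / np.float32(127.5)
    values = np.sign(scaled) * scaled**2
    return np.where(q == NODATA, np.float32(np.nan), values).astype(np.float32)


def requantize(x) -> np.ndarray:
    """Inverse of dequantize: signed square root, scale, round, clip to int8."""
    x = np.asarray(x, dtype=np.float32)
    q = np.round(np.sign(x) * np.sqrt(np.abs(x)) * 127.5)
    q = np.clip(q, -127, 127)
    q = np.where(np.isnan(x), NODATA, q)  # NaN goes back to the sentinel
    return q.astype(np.int8)


raw = np.array([-128, -127, -1, 0, 1, 64, 127], dtype=np.int8)
floats = dequantize(raw)
print(floats)
print(requantize(floats))
```

A raw cast (`raw.astype(np.float32)`) skips the scaling entirely and leaves -128 as an ordinary value, so masking has to be handled separately in that case.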

Overview

AlphaEarth Foundations (AEF) embeddings are a dataset produced by Google DeepMind, providing yearly 64-channel embeddings derived from numerous satellite image sources, with many downstream applications. The embeddings are stored as multi-band Cloud-Optimised GeoTIFFs (COGs).

aef-loader supports two dataset hosts, each with tradeoffs:

  1. Google Cloud Storage - maintained by the Earth Engine team and more up to date, but requires authentication and is "requester pays", meaning users must pay egress and other charges.
  2. Source Cooperative (recommended) - hosted on AWS S3 and free to access; includes the full time range (2017-2025).

Installation

pip install aef-loader

or:

uv add aef-loader

Quick Start

import asyncio
from aef_loader import AEFIndex, VirtualTiffReader, DataSource
from aef_loader.utils import reproject_datatree
from odc.geo.geobox import GeoBox

async def main():
    # Initialize index (Source Cooperative - no auth needed)
    index = AEFIndex(source=DataSource.SOURCE_COOP)
    await index.download()
    index.load()  # returns a GeoDataFrame (gdf) for alternative use

    # Query for tiles
    tiles = await index.query(
        bbox=(-122.5, 37.5, -122.0, 38.0),
        years=(2020, 2023),
    )

    # Load tiles organized by UTM zone
    async with VirtualTiffReader() as reader:
        tree = await reader.open_tiles_by_zone(tiles)

    # Each zone is a separate Dataset with its native CRS
    for zone in tree.children:
        ds = tree[zone].ds
        print(f"{zone}: {ds.odc.crs}, {dict(ds.sizes)}")

    # Optionally reproject all zones to a common CRS
    target = GeoBox.from_bbox(
        bbox=(-122.5, 37.5, -122.0, 38.0),
        crs="EPSG:4326",
        resolution=0.0001,
    )
    combined = reproject_datatree(tree, target)

asyncio.run(main())
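
The reprojection target in the Quick Start spans 0.5 x 0.5 degrees at a resolution of 0.0001 degrees, so it is worth estimating the output size before computing anything. The sketch below does that arithmetic, assuming the grid is simply extent divided by resolution (GeoBox.from_bbox may snap or pad the edges slightly) and 64 channels per year. `grid_shape` and `estimate_bytes` are hypothetical helpers, not part of aef-loader.

```python
def grid_shape(bbox: tuple[float, float, float, float], resolution: float) -> tuple[int, int]:
    """Approximate (rows, cols) of a regular grid covering bbox at the given resolution."""
    min_x, min_y, max_x, max_y = bbox
    cols = round((max_x - min_x) / resolution)
    rows = round((max_y - min_y) / resolution)
    return rows, cols


def estimate_bytes(shape: tuple[int, int], channels: int = 64,
                   n_years: int = 1, bytes_per_value: int = 1) -> int:
    """Uncompressed in-memory size: rows * cols * channels * years * bytes per value."""
    rows, cols = shape
    return rows * cols * channels * n_years * bytes_per_value


shape = grid_shape((-122.5, 37.5, -122.0, 38.0), 0.0001)
print(shape)                                              # grid rows and columns
print(estimate_bytes(shape) / 1e9, "GB per year as int8")
print(estimate_bytes(shape, bytes_per_value=4) / 1e9, "GB per year as float32")
```

Under these assumptions one year of the example grid is about 1.6 GB as int8 and 6.4 GB once cast to float32, which is a good reason to keep the result lazy and chunked with dask rather than loading it eagerly.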

Attribution and Dataset License

This dataset is licensed under CC-BY 4.0 and requires the following attribution text: "The AlphaEarth Foundations Satellite Embedding dataset is produced by Google and Google DeepMind."

Special notes

Thanks to Max Jones, Virtual-tiff and VirtualiZarr.

Download files

Source Distribution

aef_loader-0.2.2.tar.gz (168.1 kB)

Built Distribution

aef_loader-0.2.2-py3-none-any.whl (20.4 kB)
