
Efficient processing of cubic Earth-observation (EO) data.


A Python package for efficient processing of cubic Earth observation (EO) data 🚀



GitHub: https://github.com/andesdatacube/cubexpress/ 🌐

PyPI: https://pypi.org/project/cubexpress/ 🛠️


Overview

CubeXpress is a Python package that simplifies and speeds up working with Google Earth Engine (GEE) data cubes. It provides multi-threaded downloads, automatic subdivision of large requests, and pixel-level computations that run directly on GEE, so you can retrieve large datasets without tiling them by hand.

Key Features

  • Fast Image and Collection Downloads
    Retrieve single images or entire collections at once, taking advantage of multi-threaded requests.
  • Automatic Tiling
    Large images are split ("quadsplit") into smaller sub-tiles so that each request stays within GEE’s size limits.
  • Direct Pixel Computations
    Perform computations (e.g., band math) directly on GEE, then fetch results in a single step.
  • Scalable & Efficient
    Optimized memory usage and parallelism let you handle complex tasks in big data environments.
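
The automatic tiling feature can be pictured with a small, self-contained sketch. This is not CubeXpress's actual implementation: the function name `quadsplit`, the `max_pixels` limit, and the tile tuple layout are all assumptions made for illustration. It shows only the general idea of recursively splitting a tile into four quadrants until each piece is small enough, with a depth cap that plays the role of `max_deep_level`.

```python
from typing import List, Tuple

# (x_offset, y_offset, width, height), all in pixels (hypothetical layout)
Tile = Tuple[int, int, int, int]


def quadsplit(tile: Tile, max_pixels: int, max_depth: int, depth: int = 0) -> List[Tile]:
    """Recursively split a tile into quadrants until each has <= max_pixels pixels."""
    x, y, w, h = tile
    if w * h <= max_pixels:
        return [tile]
    if depth >= max_depth:
        raise ValueError("tile is still too large at the maximum split depth")

    half_w, half_h = w // 2, h // 2
    quadrants = [
        (x, y, half_w, half_h),                            # upper-left
        (x + half_w, y, w - half_w, half_h),               # upper-right
        (x, y + half_h, half_w, h - half_h),               # lower-left
        (x + half_w, y + half_h, w - half_w, h - half_h),  # lower-right
    ]

    tiles: List[Tile] = []
    for quadrant in quadrants:
        # Skip degenerate quadrants that can appear when a side is 1 pixel long.
        if quadrant[2] > 0 and quadrant[3] > 0:
            tiles.extend(quadsplit(quadrant, max_pixels, max_depth, depth + 1))
    return tiles


# A 1024x1024 request with an assumed limit of 512x512 pixels per tile.
tiles = quadsplit((0, 0, 1024, 1024), max_pixels=512 * 512, max_depth=5)
print(len(tiles), "sub-tiles")
```

Each level of recursion multiplies the number of sub-tiles by up to four, which is why a depth cap matters: it bounds how many requests a single oversized image can generate.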

Installation

Install the latest version from PyPI:

pip install cubexpress

Note: You need a valid Google Earth Engine account and earthengine-api installed (pip install earthengine-api). Also run ee.Initialize() before using CubeXpress.


Basic Usage

Download a single ee.Image

import ee
import cubexpress

# Initialize Earth Engine
ee.Initialize(project="your-project-id")

# Create a raster transform
geotransform = cubexpress.lonlat2rt(
    lon=-76.5,
    lat=-9.5,
    edge_size=128,  # Width=Height=128 pixels
    scale=90        # 90m resolution
)

# Define a single Request
request = cubexpress.Request(
    id="dem_test",
    raster_transform=geotransform,
    bands=["elevation"],
    image="NASA/NASADEM_HGT/001" # Note: you can wrap with ee.Image("NASA/NASADEM_HGT/001").divide(10000) if needed
)

# Build the RequestSet
cube_requests = cubexpress.RequestSet(requestset=[request])

# Download with multi-threading
cubexpress.getcube(
    request=cube_requests,
    output_path="output_dem",
    nworkers=4,
    max_deep_level=5
)

This will create a GeoTIFF named dem_test.tif in the output_dem folder, containing the elevation band.
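
To get a feel for how much ground a request covers, the side length of the tile is simply `edge_size` multiplied by `scale`. The helper below is a hypothetical convenience function, not part of CubeXpress, and it assumes `scale` is expressed in metres per pixel, as the "90m resolution" comment in the example suggests.

```python
def footprint_m(edge_size: int, scale: float) -> float:
    """Side length in metres of a square tile of `edge_size` pixels at `scale` m/pixel."""
    return edge_size * scale


dem_side = footprint_m(edge_size=128, scale=90)   # the DEM example above
s2_side = footprint_m(edge_size=512, scale=10)    # a 512-pixel tile at 10 m

print(f"DEM tile: {dem_side / 1000} km per side, {128 * 128} pixels")
print(f"10 m tile: {s2_side / 1000} km per side, {512 * 512} pixels")
```

For the DEM request this gives a tile of 11.52 km per side. The exact footprint on the ground also depends on the coordinate reference system that `lonlat2rt` selects, which this arithmetic does not model.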


Download pixel values from an ee.ImageCollection

You can fetch multiple images by constructing a RequestSet with several Request objects. For example, filter Sentinel-2 images near a point:

import ee
import cubexpress

ee.Initialize(project="your-project-id")

# Filter a Sentinel-2 collection
point = ee.Geometry.Point([-97.59, 33.37])
collection = ee.ImageCollection("COPERNICUS/S2_SR_HARMONIZED") \
               .filterBounds(point) \
               .filterDate('2024-01-01', '2024-01-31')

# Extract image IDs
image_ids = collection.aggregate_array('system:id').getInfo()

# Set the geotransform
geotransform = cubexpress.lonlat2rt(
    lon=-97.59, 
    lat=33.37, 
    edge_size=512, 
    scale=10
)

# Build multiple requests
requests = [
    cubexpress.Request(
        id=f"s2test_{i}",
        raster_transform=geotransform,
        bands=["B4", "B3", "B2"],
        image=image_id  # Note: you can wrap with ee.Image(image_id).divide(10000) if needed
    )
    for i, image_id in enumerate(image_ids)
]

# Create the RequestSet
cube_requests = cubexpress.RequestSet(requestset=requests)

# Download
cubexpress.getcube(
    request=cube_requests,
    output_path="output_sentinel",
    nworkers=4,
    max_deep_level=5
)

Process and extract a pixel from an ee.Image

If you provide an ee.Image with custom calculations (e.g., .divide(10000), .normalizedDifference(...)), CubeXpress can run those on GEE, then download the result. For large results, it automatically splits the image into sub-tiles.

import ee
import cubexpress

ee.Initialize(project="your-project-id")

# Example: NDVI from Sentinel-2
image = ee.Image("COPERNICUS/S2_HARMONIZED/20170804T154911_20170804T155116_T18SUJ") \
           .normalizedDifference(["B8", "B4"]) \
           .rename("NDVI")

geotransform = cubexpress.lonlat2rt(
    lon=-76.59, 
    lat=38.89, 
    edge_size=256, 
    scale=10
)

request = cubexpress.Request(
    id="ndvi_test",
    raster_transform=geotransform,
    bands=["NDVI"],
    image=image  # custom expression
)

cube_requests = cubexpress.RequestSet(requestset=[request])

cubexpress.getcube(
    request=cube_requests,
    output_path="output_ndvi",
    nworkers=2,
    max_deep_level=5
)
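
For reference, `normalizedDifference(["B8", "B4"])` computes (B8 − B4) / (B8 + B4) for every pixel on the GEE side. The plain-Python function below reproduces that arithmetic for a single pixel so the expected value range is easy to check. It is only an illustration: returning `None` when the denominator is zero is a choice made for this sketch, not a statement about how GEE handles that case.

```python
from typing import Optional


def normalized_difference(first: float, second: float) -> Optional[float]:
    """Return (first - second) / (first + second), or None if the sum is zero."""
    total = first + second
    if total == 0:
        return None  # sketch-only convention for an undefined ratio
    return (first - second) / total


# Hypothetical reflectance values for one pixel: B8 (near infrared) and B4 (red).
ndvi = normalized_difference(first=3000, second=1000)
print(ndvi)
```

Because the ratio is computed before download, the GeoTIFF contains a single floating-point NDVI band rather than the two source bands.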

Advanced Usage

Same Set of Sentinel-2 Images for Multiple Points

Below is an advanced example demonstrating how to work with multiple points and a Sentinel-2 image collection in one script. We first create a global collection, then filter it point by point, keeping only the images that intersect each coordinate. Finally, we download them in parallel using CubeXpress.

import ee
import cubexpress

# Initialize Earth Engine with your project
ee.Initialize(project="your-project-id")

# Define multiple points (longitude, latitude)
points = [
    (-97.64, 33.37),
    (-97.59, 33.37)
]

# Start with a broad Sentinel-2 collection
collection = (
    ee.ImageCollection("COPERNICUS/S2_SR_HARMONIZED")
    .filterDate("2024-01-01", "2024-01-31")
)

# Build a list of Request objects
requestset = []
for i, (lon, lat) in enumerate(points):
    # Create a point geometry for the current coordinates
    point_geom = ee.Geometry.Point([lon, lat])
    collection_filtered = collection.filterBounds(point_geom)
    
    # Convert the filtered collection into a list of asset IDs
    image_ids = collection_filtered.aggregate_array("system:id").getInfo()
    
    # Define a geotransform for this point
    geotransform = cubexpress.lonlat2rt(
        lon=lon,
        lat=lat,
        edge_size=512,  # Adjust the image size in pixels
        scale=10        # 10m resolution for Sentinel-2
    )
    
    # Create one Request per image found for this point
    requestset.extend([
        cubexpress.Request(
            id=f"s2test_{i}_{idx}",
            raster_transform=geotransform,
            bands=["B4", "B3", "B2"],
            image=image_id
        )
        for idx, image_id in enumerate(image_ids)
    ])

# Combine into a RequestSet
cube_requests = cubexpress.RequestSet(requestset=requestset)

# Download everything in parallel
results = cubexpress.getcube(
    request=cube_requests,
    nworkers=4,
    output_path="images_s2",
    max_deep_level=5
)

print("Downloaded files:", results)

How it works:

  1. Points: We define multiple coordinates in points.
  2. Global collection: We retrieve a broad Sentinel-2 collection covering the desired date range.
  3. Per-point filter: For each point, we call .filterBounds(...) to get only images intersecting that location.
  4. Geotransform: We create a local geotransform (edge_size, scale) defining the spatial extent and resolution around each point.
  5. Requests: Each point-image pair becomes a Request, stored in a single list.
  6. Parallel download: With cubexpress.getcube(), the requests are fetched in parallel by up to nworkers threads, and large outputs are automatically split into sub-tiles if needed (up to max_deep_level).
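
Steps 5 and 6 can be mimicked without Earth Engine using the standard library. In this sketch, `fake_fetch` is a hypothetical stand-in for a real download, the image counts per point are invented, and the thread pool only illustrates the general pattern of fanning requests out to a fixed number of workers. It makes no claim about how `getcube` is implemented internally.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def fake_fetch(request_id: str) -> str:
    """Stand-in for a download: return the path a real fetch might write."""
    return f"images_s2/{request_id}.tif"


# Hypothetical result of the per-point filter: point index -> number of images found.
images_per_point: Dict[int, int] = {0: 3, 1: 2}

# One id per point-image pair, following the s2test_{i}_{idx} pattern used above.
request_ids: List[str] = [
    f"s2test_{i}_{idx}"
    for i, n_images in images_per_point.items()
    for idx in range(n_images)
]

# Up to 4 requests run at once; map() returns results in input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    paths = list(pool.map(fake_fetch, request_ids))

print(paths)
```

Including both the point index and the image index in each id keeps the ids unique, so results from different points cannot overwrite each other in a shared output folder.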

License

This project is licensed under the MIT License.


Built with 🌎 and ❤️ by the CubeXpress team
