Efficient processing of cubic Earth-observation (EO) data.

A Python package for efficient processing of cubic earth observation (EO) data 🚀


GitHub: https://github.com/andesdatacube/cubexpress/ 🌐

PyPI: https://pypi.org/project/cubexpress/ 🛠️

Overview

CubeXpress is a Python package designed to simplify and accelerate the process of working with Google Earth Engine (GEE) data cubes. With features like multi-threaded downloads, automatic subdivision of large requests, and direct pixel-level computations on GEE, CubeXpress helps you handle massive datasets with ease.

Key Features

  • Fast Image and Collection Downloads
    Retrieve single images or entire collections at once, taking advantage of multi-threaded requests.
  • Automatic Tiling
    Large images are split ("quadsplit") into smaller sub-tiles, preventing errors with GEE’s size limits.
  • Direct Pixel Computations
    Perform computations (e.g., band math) directly on GEE, then fetch results in a single step.
  • Scalable & Efficient
    Optimized memory usage and parallelism let you handle complex tasks in big data environments.
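
The tiling idea can be illustrated with a minimal, self-contained sketch. This is not CubeXpress's implementation: the helper `quadsplit`, the `max_pixels` limit, and the window layout are assumptions made for illustration. It splits a pixel window into four quadrants recursively until each piece fits under the limit or a maximum depth is reached.

```python
from typing import List, Tuple

# (x_offset, y_offset, width, height), all in pixels
Window = Tuple[int, int, int, int]


def quadsplit(window: Window, max_pixels: int, max_depth: int, depth: int = 0) -> List[Window]:
    """Hypothetical helper: split a window into quadrants until each has <= max_pixels pixels."""
    x, y, w, h = window
    # Stop when the window is small enough, the depth limit is hit, or it cannot be halved
    if w * h <= max_pixels or depth >= max_depth or w < 2 or h < 2:
        return [window]
    half_w, half_h = w // 2, h // 2
    quadrants = [
        (x, y, half_w, half_h),                            # top-left
        (x + half_w, y, w - half_w, half_h),               # top-right
        (x, y + half_h, half_w, h - half_h),               # bottom-left
        (x + half_w, y + half_h, w - half_w, h - half_h),  # bottom-right
    ]
    tiles: List[Window] = []
    for quadrant in quadrants:
        tiles.extend(quadsplit(quadrant, max_pixels, max_depth, depth + 1))
    return tiles


# A 1024x1024 window with an assumed limit of 256x256 pixels per tile
tiles = quadsplit((0, 0, 1024, 1024), max_pixels=256 * 256, max_depth=5)
print(len(tiles))  # 16
```

In this sketch the depth limit plays the same role as a cap on recursion: if a window is still too large at the deepest level, it is returned as-is rather than split further.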

Installation

Install the latest version from PyPI:

pip install cubexpress

Note: CubeXpress requires a valid Google Earth Engine account and the earthengine-api package (pip install earthengine-api). Call ee.Initialize() before using CubeXpress.


Basic Usage

Download a single ee.Image

import ee
import cubexpress

# Initialize Earth Engine
ee.Initialize(project="your-project-id")

# Create a raster transform
geotransform = cubexpress.lonlat2rt(
    lon=-76.5,
    lat=-9.5,
    edge_size=128,  # Width=Height=128 pixels
    scale=90        # 90m resolution
)

# Define a single Request
request = cubexpress.Request(
    id="dem_test",
    raster_transform=geotransform,
    bands=["elevation"],
    image="NASA/NASADEM_HGT/001"  # Note: you can wrap with ee.Image("NASA/NASADEM_HGT/001").divide(10000) if needed
)

# Build the RequestSet
cube_requests = cubexpress.RequestSet(requestset=[request])

# Download with multi-threading
cubexpress.getcube(
    request=cube_requests,
    output_path="output_dem",
    nworkers=4,
    max_deep_level=5
)

This will create a GeoTIFF named dem_test.tif in the output_dem folder, containing the elevation band.
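
To get a feel for how much ground such a request covers, the side length of a square tile is simply the number of pixels per side times the pixel size. The sketch below assumes `scale` is expressed in metres per pixel, as the comments in the example suggest; `tile_side_m` is a hypothetical helper, not part of CubeXpress.

```python
def tile_side_m(edge_size: int, scale: float) -> float:
    """Side length of a square tile in metres: pixels per side times metres per pixel."""
    return edge_size * scale


# Values from the DEM example: 128 pixels at 90 m per pixel
side = tile_side_m(edge_size=128, scale=90)
print(f"{side / 1000:.2f} km per side")  # 11.52 km per side
```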


Download pixel values from an ee.ImageCollection

You can fetch multiple images by constructing a RequestSet with several Request objects. For example, filter Sentinel-2 images near a point:

import ee
import cubexpress

ee.Initialize(project="your-project-id")

# Filter a Sentinel-2 collection
point = ee.Geometry.Point([-97.59, 33.37])
collection = ee.ImageCollection("COPERNICUS/S2_SR_HARMONIZED") \
               .filterBounds(point) \
               .filterDate('2024-01-01', '2024-01-31')

# Extract image IDs
image_ids = collection.aggregate_array('system:id').getInfo()

# Set the geotransform
geotransform = cubexpress.lonlat2rt(
    lon=-97.59, 
    lat=33.37, 
    edge_size=512, 
    scale=10
)

# Build multiple requests
requests = [
    cubexpress.Request(
        id=f"s2test_{i}",
        raster_transform=geotransform,
        bands=["B4", "B3", "B2"],
        image=image_id  # Note: you can wrap with ee.Image(image_id).divide(10000) if needed
    )
    for i, image_id in enumerate(image_ids)
]

# Create the RequestSet
cube_requests = cubexpress.RequestSet(requestset=requests)

# Download
cubexpress.getcube(
    request=cube_requests,
    output_path="output_sentinel",
    nworkers=4,
    max_deep_level=5
)

Process and extract a pixel from an ee.Image

If you provide an ee.Image with custom calculations (e.g., .divide(10000), .normalizedDifference(...)), CubeXpress can run those on GEE, then download the result. For large results, it automatically splits the image into sub-tiles.

import ee
import cubexpress

ee.Initialize(project="your-project-id")

# Example: NDVI from Sentinel-2
image = ee.Image("COPERNICUS/S2_HARMONIZED/20170804T154911_20170804T155116_T18SUJ") \
           .normalizedDifference(["B8", "B4"]) \
           .rename("NDVI")

geotransform = cubexpress.lonlat2rt(
    lon=-76.59, 
    lat=38.89, 
    edge_size=256, 
    scale=10
)

request = cubexpress.Request(
    id="ndvi_test",
    raster_transform=geotransform,
    bands=["NDVI"],
    image=image  # custom expression
)

cube_requests = cubexpress.RequestSet(requestset=[request])

cubexpress.getcube(
    request=cube_requests,
    output_path="output_ndvi",
    nworkers=2,
    max_deep_level=5
)
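
For reference, a normalized difference of two bands a and b is (a - b) / (a + b), so NDVI is (B8 - B4) / (B8 + B4). The pure-Python sketch below shows the arithmetic for a single pixel. The function name and the choice to return None for a zero denominator are assumptions of this sketch and are not taken from Earth Engine or CubeXpress.

```python
from typing import Optional


def normalized_difference(a: float, b: float) -> Optional[float]:
    """Return (a - b) / (a + b), or None when the denominator is zero (sketch-only choice)."""
    total = a + b
    if total == 0:
        return None
    return (a - b) / total


# Illustrative values only: near-infrared (B8) = 4500, red (B4) = 500
ndvi = normalized_difference(4500, 500)
print(ndvi)  # 0.8
```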

Advanced Usage

Same Set of Sentinel-2 Images for Multiple Points

Below is an advanced example demonstrating how to work with multiple points and a Sentinel-2 image collection in one script. We first create a global collection, then filter it point by point, keeping only the images that intersect each coordinate. Finally, we download them in parallel using CubeXpress.

import ee
import cubexpress

# Initialize Earth Engine with your project
ee.Initialize(project="your-project-id")

# Define multiple points (longitude, latitude)
points = [
    (-97.64, 33.37),
    (-97.59, 33.37)
]

# Start with a broad Sentinel-2 collection
collection = (
    ee.ImageCollection("COPERNICUS/S2_SR_HARMONIZED")
    .filterDate("2024-01-01", "2024-01-31")
)

# Build a list of Request objects
requestset = []
for i, (lon, lat) in enumerate(points):
    # Create a point geometry for the current coordinates
    point_geom = ee.Geometry.Point([lon, lat])
    collection_filtered = collection.filterBounds(point_geom)
    
    # Convert the filtered collection into a list of asset IDs
    image_ids = collection_filtered.aggregate_array("system:id").getInfo()
    
    # Define a geotransform for this point
    geotransform = cubexpress.lonlat2rt(
        lon=lon,
        lat=lat,
        edge_size=512,  # Adjust the image size in pixels
        scale=10        # 10m resolution for Sentinel-2
    )
    
    # Create one Request per image found for this point
    requestset.extend([
        cubexpress.Request(
            id=f"s2test_{i}_{idx}",
            raster_transform=geotransform,
            bands=["B4", "B3", "B2"],
            image=image_id
        )
        for idx, image_id in enumerate(image_ids)
    ])

# Combine into a RequestSet
cube_requests = cubexpress.RequestSet(requestset=requestset)

# Download everything in parallel
results = cubexpress.getcube(
    request=cube_requests,
    nworkers=4,
    output_path="images_s2",
    max_deep_level=5
)

print("Downloaded files:", results)

How it works:

  1. Points: We define multiple coordinates in points.
  2. Global collection: We retrieve a broad Sentinel-2 collection covering the desired date range.
  3. Per-point filter: For each point, we call .filterBounds(...) to get only images intersecting that location.
  4. Geotransform: We create a local geotransform (edge_size, scale) defining the spatial extent and resolution around each point.
  5. Requests: Each point-image pair becomes a Request, stored in a single list.
  6. Parallel download: With cubexpress.getcube(), the requests are fetched in parallel by nworkers threads, and large outputs are automatically split into sub-tiles if needed (up to max_deep_level).
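
The request-building and parallel-download steps above can be sketched without Earth Engine using Python's standard thread pool. This is an illustration of the general pattern only, not how CubeXpress is implemented: `fake_download` and `download_all` are hypothetical stand-ins, and the `.tif` naming simply mirrors the `dem_test.tif` output described earlier.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def fake_download(request_id: str) -> str:
    """Stand-in for a network download; returns the file name it would write."""
    return f"{request_id}.tif"


def download_all(request_ids: List[str], nworkers: int) -> List[str]:
    """Run fake_download over all ids using a pool of nworkers threads."""
    with ThreadPoolExecutor(max_workers=nworkers) as pool:
        # map() returns results in input order even though calls run concurrently
        return list(pool.map(fake_download, request_ids))


# Two points with three images each, named like the ids in the example above
ids = [f"s2test_{i}_{idx}" for i in range(2) for idx in range(3)]
files = download_all(ids, nworkers=4)
print(files)
```

Because each point-image pair has its own id, the outputs never collide even when several images cover the same point.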

License

This project is licensed under the MIT License.


Built with 🌎 and ❤️ by the CubeXpress team
