
Efficient processing of cubic Earth-observation (EO) data.

Project description

A Python package for efficient processing of cubic Earth observation (EO) data 🚀



GitHub: https://github.com/andesdatacube/cubexpress/ 🌐

PyPI: https://pypi.org/project/cubexpress/ 🛠️


Overview

CubeXpress is a Python package designed to simplify and accelerate the process of working with Google Earth Engine (GEE) data cubes. With features like multi-threaded downloads, automatic subdivision of large requests, and direct pixel-level computations on GEE, CubeXpress helps you handle massive datasets with ease.

Key Features

  • Fast Image and Collection Downloads
    Retrieve single images or entire collections at once, taking advantage of multi-threaded requests.
  • Automatic Tiling
    Large images are split ("quadsplit") into smaller sub-tiles, preventing errors with GEE’s size limits.
  • Direct Pixel Computations
    Perform computations (e.g., band math) directly on GEE, then fetch results in a single step.
  • Scalable & Efficient
    Optimized memory usage and parallelism let you handle complex tasks in big data environments.
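
The tiling idea above can be sketched in plain Python. This is only an illustration of recursive four-way splitting, not the CubeXpress implementation: the helper names (quadsplit, split_until_small), the pixel-count threshold, and the max_depth argument are all assumptions made for the example.

```python
from typing import List, Tuple

# (x_offset, y_offset, width, height), all in pixels
Tile = Tuple[int, int, int, int]


def quadsplit(tile: Tile) -> List[Tile]:
    """Split one tile into four quadrants (hypothetical helper, not the CubeXpress API)."""
    x, y, w, h = tile
    half_w, half_h = w // 2, h // 2
    return [
        (x, y, half_w, half_h),                            # upper-left
        (x + half_w, y, w - half_w, half_h),               # upper-right
        (x, y + half_h, half_w, h - half_h),               # lower-left
        (x + half_w, y + half_h, w - half_w, h - half_h),  # lower-right
    ]


def split_until_small(
    tile: Tile, max_pixels: int, max_depth: int, depth: int = 0
) -> List[Tile]:
    """Split recursively until a tile is small enough or max_depth is reached."""
    _, _, w, h = tile
    # Stop when the tile fits, the depth budget is spent, or it cannot be halved.
    if w * h <= max_pixels or depth >= max_depth or w < 2 or h < 2:
        return [tile]
    tiles: List[Tile] = []
    for quadrant in quadsplit(tile):
        tiles.extend(split_until_small(quadrant, max_pixels, max_depth, depth + 1))
    return tiles


# Assumed limit for the example: at most 256 x 256 pixels per sub-tile.
tiles = split_until_small((0, 0, 1024, 1024), max_pixels=256 * 256, max_depth=5)
print(len(tiles), "sub-tiles")
```

With these example numbers, a 1024 x 1024 tile is split twice, giving sixteen 256 x 256 sub-tiles that together cover the original area.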

Installation

Install the latest version from PyPI:

pip install cubexpress

Note: You need a Google Earth Engine account and the earthengine-api package (pip install earthengine-api). Call ee.Initialize() before using CubeXpress.
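
Before running the examples, it can help to confirm that both packages are importable. The snippet below is a small sketch using only the standard library. The helper name missing_packages is hypothetical, and "ee" is the import name of the earthengine-api package.

```python
from importlib.util import find_spec
from typing import Iterable, List


def missing_packages(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found (hypothetical helper)."""
    return [name for name in names if find_spec(name) is None]


# "ee" is installed by earthengine-api; "cubexpress" by this package.
missing = missing_packages(["ee", "cubexpress"])
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

This only checks that the modules can be found; it does not verify Earth Engine credentials, which ee.Initialize() handles.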


Basic Usage

Download a single ee.Image

import ee
import cubexpress

# Initialize Earth Engine
ee.Initialize(project="your-project-id")

# Create a raster transform
geotransform = cubexpress.lonlat2rt(
    lon=-76.5,
    lat=-9.5,
    edge_size=128,  # Width=Height=128 pixels
    scale=90        # 90m resolution
)

# Define a single Request
request = cubexpress.Request(
    id="dem_test",
    raster_transform=geotransform,
    bands=["elevation"],
    image="NASA/NASADEM_HGT/001"  # Note: you can wrap with ee.Image("NASA/NASADEM_HGT/001").divide(10000) if needed
)

# Build the RequestSet
cube_requests = cubexpress.RequestSet(requestset=[request])

# Download with multi-threading
cubexpress.getcube(
    request=cube_requests,
    output_path="output_dem",
    nworkers=4,
    max_deep_level=5
)

This will create a GeoTIFF named dem_test.tif in the output_dem folder, containing the elevation band.
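
To get a feel for how edge_size and scale determine the area covered, the arithmetic can be sketched as follows. This assumes a square tile centred on the requested point, in a projected coordinate system measured in metres. Whether lonlat2rt centres the tile exactly this way is an assumption, and tile_footprint is a hypothetical helper, not part of CubeXpress.

```python
from typing import Dict, Tuple, Union

Number = Union[int, float]


def tile_footprint(
    center_x: float, center_y: float, edge_size: int, scale: Number
) -> Dict[str, object]:
    """Side length and bounds of a square tile (hypothetical helper).

    center_x, center_y: tile centre in projected coordinates (metres)
    edge_size: width and height of the tile in pixels
    scale: pixel size in metres
    """
    side = edge_size * scale  # metres along each edge
    half = side / 2
    bounds: Tuple[float, float, float, float] = (
        center_x - half,  # min x
        center_y - half,  # min y
        center_x + half,  # max x
        center_y + half,  # max y
    )
    return {"side_m": side, "bounds": bounds}


# Same numbers as the DEM example: 128 pixels at 90 m per pixel.
footprint = tile_footprint(0.0, 0.0, edge_size=128, scale=90)
print(footprint["side_m"], "metres per side")
```

For the DEM request above, 128 pixels at 90 m gives a tile 11,520 m on each side, so about 11.5 km by 11.5 km.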


Download pixel values from an ee.ImageCollection

You can fetch multiple images by constructing a RequestSet with several Request objects. For example, filter Sentinel-2 images near a point:

import ee
import cubexpress

ee.Initialize(project="your-project-id")

# Filter a Sentinel-2 collection
point = ee.Geometry.Point([-97.59, 33.37])
collection = ee.ImageCollection("COPERNICUS/S2_SR_HARMONIZED") \
               .filterBounds(point) \
               .filterDate('2024-01-01', '2024-01-31')

# Extract image IDs
image_ids = collection.aggregate_array('system:id').getInfo()

# Set the geotransform
geotransform = cubexpress.lonlat2rt(
    lon=-97.59, 
    lat=33.37, 
    edge_size=512, 
    scale=10
)

# Build multiple requests
requests = [
    cubexpress.Request(
        id=f"s2test_{i}",
        raster_transform=geotransform,
        bands=["B4", "B3", "B2"],
        image=image_id  # Note: you can wrap with ee.Image(image_id).divide(10000) if needed
    )
    for i, image_id in enumerate(image_ids)
]

# Create the RequestSet
cube_requests = cubexpress.RequestSet(requestset=requests)

# Download
cubexpress.getcube(
    request=cube_requests,
    output_path="output_sentinel",
    nworkers=4,
    max_deep_level=5
)

Process and extract a pixel from an ee.Image

If you provide an ee.Image with custom calculations (e.g., .divide(10000), .normalizedDifference(...)), CubeXpress can run those on GEE, then download the result. For large results, it automatically splits the image into sub-tiles.

import ee
import cubexpress

ee.Initialize(project="your-project-id")

# Example: NDVI from Sentinel-2
image = ee.Image("COPERNICUS/S2_HARMONIZED/20170804T154911_20170804T155116_T18SUJ") \
           .normalizedDifference(["B8", "B4"]) \
           .rename("NDVI")

geotransform = cubexpress.lonlat2rt(
    lon=-76.59, 
    lat=38.89, 
    edge_size=256, 
    scale=10
)

request = cubexpress.Request(
    id="ndvi_test",
    raster_transform=geotransform,
    bands=["NDVI"],
    image=image  # custom expression
)

cube_requests = cubexpress.RequestSet(requestset=[request])

cubexpress.getcube(
    request=cube_requests,
    output_path="output_ndvi",
    nworkers=2,
    max_deep_level=5
)
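
For reference, the normalized difference used for NDVI is (NIR - Red) / (NIR + Red), with B8 as near-infrared and B4 as red in the example above. The sketch below shows that arithmetic for a single pixel in plain Python. Returning None for a zero denominator is a choice made for this sketch and is not a description of how Earth Engine masks pixels.

```python
from typing import Optional


def normalized_difference(nir: float, red: float) -> Optional[float]:
    """Compute (nir - red) / (nir + red) for one pixel.

    Returns None when the denominator is zero (a choice made for this sketch).
    """
    total = nir + red
    if total == 0:
        return None
    return (nir - red) / total


# Illustrative reflectance values, not taken from a real image.
ndvi = normalized_difference(nir=0.5, red=0.1)
print(ndvi)
```

Dense vegetation reflects strongly in the near-infrared and weakly in red, so its NDVI approaches 1. Equal reflectance in both bands gives 0.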

Advanced Usage

Same Set of Sentinel-2 Images for Multiple Points

Below is an advanced example demonstrating how to work with multiple points and a Sentinel-2 image collection in one script. We first create a global collection, then filter it point by point, keeping only the images that intersect each coordinate. Finally, we download them in parallel using CubeXpress.

import ee
import cubexpress

# Initialize Earth Engine with your project
ee.Initialize(project="your-project-id")

# Define multiple points (longitude, latitude)
points = [
    (-97.64, 33.37),
    (-97.59, 33.37)
]

# Start with a broad Sentinel-2 collection
collection = (
    ee.ImageCollection("COPERNICUS/S2_SR_HARMONIZED")
    .filterDate("2024-01-01", "2024-01-31")
)

# Build a list of Request objects
requestset = []
for i, (lon, lat) in enumerate(points):
    # Create a point geometry for the current coordinates
    point_geom = ee.Geometry.Point([lon, lat])
    collection_filtered = collection.filterBounds(point_geom)
    
    # Convert the filtered collection into a list of asset IDs
    image_ids = collection_filtered.aggregate_array("system:id").getInfo()
    
    # Define a geotransform for this point
    geotransform = cubexpress.lonlat2rt(
        lon=lon,
        lat=lat,
        edge_size=512,  # Adjust the image size in pixels
        scale=10        # 10m resolution for Sentinel-2
    )
    
    # Create one Request per image found for this point
    requestset.extend([
        cubexpress.Request(
            id=f"s2test_{i}_{idx}",
            raster_transform=geotransform,
            bands=["B4", "B3", "B2"],
            image=image_id
        )
        for idx, image_id in enumerate(image_ids)
    ])

# Combine into a RequestSet
cube_requests = cubexpress.RequestSet(requestset=requestset)

# Download everything in parallel
results = cubexpress.getcube(
    request=cube_requests,
    nworkers=4,
    output_path="images_s2",
    max_deep_level=5
)

print("Downloaded files:", results)

How it works:

  1. Points: We define multiple coordinates in points.
  2. Global collection: We retrieve a broad Sentinel-2 collection covering the desired date range.
  3. Per-point filter: For each point, we call .filterBounds(...) to get only images intersecting that location.
  4. Geotransform: We create a local geotransform (edge_size, scale) defining the spatial extent and resolution around each point.
  5. Requests: Each point-image pair becomes a Request, stored in a single list.
  6. Parallel download: With cubexpress.getcube(), all requests are fetched simultaneously, automatically splitting large outputs into sub-tiles if needed (up to max_deep_level).
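
The parallel step can be pictured as a thread pool that maps one worker function over the list of requests. The sketch below uses the standard library's ThreadPoolExecutor with a stand-in download function. It illustrates the pattern only and is not how cubexpress.getcube() is implemented; fake_download and run_parallel are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def fake_download(request_id: str) -> str:
    """Stand-in for a real download; returns the file name it would write."""
    return f"{request_id}.tif"


def run_parallel(
    request_ids: List[str], worker: Callable[[str], str], nworkers: int = 4
) -> List[str]:
    """Run worker over every request ID using up to nworkers threads."""
    with ThreadPoolExecutor(max_workers=nworkers) as pool:
        # pool.map returns results in the same order as the inputs.
        return list(pool.map(worker, request_ids))


# IDs shaped like the ones above: two points with three images each.
ids = [f"s2test_{i}_{idx}" for i in range(2) for idx in range(3)]
files = run_parallel(ids, fake_download, nworkers=4)
print(files)
```

Threads suit this kind of workload because each download spends most of its time waiting on the network rather than using the CPU.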

License

This project is licensed under the MIT License.


Built with 🌎 and ❤️ by the CubeXpress team
