Standalone Python core for defining and executing openEO processes locally with xarray, dask, and ML backends.

openeo-core

A standalone Python library providing a fluent, Pythonic API for working with raster data cubes and vector cubes, implementing selected openEO processes locally using xarray and dask, with STAC MLM-compatible ML model objects.

Features

Fluent DataCube API — chain raster and vector operations in a readable pipeline
openEO process-aligned — function signatures match the openEO process specs
STAC MLM-compatible models — every model carries full STAC Machine Learning Model metadata
Multiple ML backends — scikit-learn, XGBoost, and PyTorch (TempCNN, LightTAE)
Flexible feature dimensions — control which cube dimensions become model features via dimension
Spatial indexing — accelerated vector operations with R-tree spatial index
Process Registry — discover and search bundled openEO process specifications

Installation

Install from GitHub

# With uv
uv pip install git+https://github.com/PondiB/openeo-core.git

# With pip
pip install git+https://github.com/PondiB/openeo-core.git

Optional extras (ML backends, dev):

# ML backends
uv pip install "openeo-core[ml-sklearn] @ git+https://github.com/PondiB/openeo-core.git"
uv pip install "openeo-core[ml-xgboost] @ git+https://github.com/PondiB/openeo-core.git"
uv pip install "openeo-core[ml-torch] @ git+https://github.com/PondiB/openeo-core.git"

# Everything
uv pip install "openeo-core[all] @ git+https://github.com/PondiB/openeo-core.git"

# Dev tools
pip install "openeo-core[dev] @ git+https://github.com/PondiB/openeo-core.git"
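
After installing one or more extras, it can be useful to confirm which ML backends are actually importable in the current environment. The following is a minimal sketch using only the standard library; the mapping from extra name to import name (`sklearn`, `xgboost`, `torch`) and the helper `available_backends` are assumptions made for illustration, not part of openeo-core.

```python
from importlib.util import find_spec

# Assumed mapping from the extras named above to the top-level module
# each backend is normally imported as. Not taken from openeo-core itself.
BACKEND_MODULES = {
    "ml-sklearn": "sklearn",
    "ml-xgboost": "xgboost",
    "ml-torch": "torch",
}


def available_backends() -> dict[str, bool]:
    """Report, per extra, whether its backend module can be found."""
    return {
        extra: find_spec(module) is not None
        for extra, module in BACKEND_MODULES.items()
    }


if __name__ == "__main__":
    for extra, found in available_backends().items():
        print(f"{extra}: {'installed' if found else 'missing'}")
```

`find_spec` only locates the module without importing it, so the check stays cheap even when PyTorch is installed.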

Install from source (development)

Clone the repository and sync dependencies:

git clone https://github.com/PondiB/openeo-core.git
cd openeo-core

# Core install (xarray, dask, geopandas, pystac-client, stackstac)
uv sync

# With ML backends
uv sync --extra ml-sklearn
uv sync --extra ml-xgboost
uv sync --extra ml-torch

# Dev tools only
uv sync --extra dev

# Everything, including dev tools
uv sync --all-extras

Quick Start

Fluent DataCube API

from openeo_core import DataCube

# Load from Microsoft Planetary Computer (Sentinel-2)
cube = DataCube.load_collection(
    "sentinel-2-l2a",
    spatial_extent={"west": 10.0, "south": 50.0, "east": 11.0, "north": 51.0},
    temporal_extent=("2023-06-01", "2023-06-30"),
    bands=["red", "nir"],
)

# Fluent chaining
result = (
    cube
    .filter_bbox(west=10.2, south=50.2, east=10.8, north=50.8)
    .filter_temporal(extent=("2023-06-10", "2023-06-20"))
    .ndvi(nir="nir", red="red")
    .compute()
)
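
For readers new to the index, the `ndvi` step in the chain above corresponds to the standard formula (nir - red) / (nir + red), applied per pixel. The sketch below shows that arithmetic in plain Python, independent of openeo-core; returning NaN when both bands are zero is an assumption chosen for illustration, since this README does not say how the library treats that case.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index for a single pixel."""
    denominator = nir + red
    if denominator == 0:
        # Assumed handling for an undefined ratio (illustration only).
        return float("nan")
    return (nir - red) / denominator


# (nir, red) reflectance pairs: vegetated, neutral, and empty pixels.
pixels = [(0.5, 0.1), (0.3, 0.3), (0.0, 0.0)]
values = [ndvi(nir, red) for nir, red in pixels]

for (nir, red), value in zip(pixels, values):
    print(f"nir={nir}, red={red} -> ndvi={value:.3f}")
```

Values close to 1 indicate dense green vegetation, while values near 0 or below indicate bare soil, water, or cloud.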

ML Models (openEO process-aligned, STAC MLM-compatible)

Model objects are STAC MLM-compatible and the API follows the openEO process specs exactly:

from openeo_core.model import (
    mlm_class_random_forest,
    mlm_regr_random_forest,
    mlm_class_xgboost,
    mlm_class_tempcnn,
    mlm_class_lighttae,
    ml_fit,
    ml_predict,
    save_ml_model,
    load_stac_ml,
)

# 1. Initialize (openEO: mlm_class_random_forest)
model = mlm_class_random_forest(
    max_variables="sqrt",
    num_trees=200,
    seed=42,
)

# 2. Train (openEO: ml_fit)
trained = ml_fit(model, training_gdf, target="label")

# 3. Predict (openEO: ml_predict)
predictions = ml_predict(raster_cube, trained)

# 4. Save with STAC Item (openEO: save_ml_model)
save_ml_model(trained, name="my_rf_model")

# 5. Load from STAC Item (openEO: load_stac_ml)
restored = load_stac_ml("my_rf_model/my_rf_model.stac.json")
predictions = ml_predict(new_raster, restored)

Feature dimensions

The dimension parameter controls which data cube dimensions are flattened into the feature vector for model training and prediction. It is set once at model initialisation and used automatically by ml_predict:

# Default: only the "bands" dimension becomes features
model = mlm_class_random_forest(dimension=["bands"])

# Use both spectral and temporal dimensions as features
model = mlm_class_random_forest(
    max_variables="sqrt",
    num_trees=200,
    dimension=["bands", "t"],
)
trained = ml_fit(model, training_gdf, target="label")
predictions = ml_predict(raster_cube, trained)  # dimension handled automatically

Default values per model type:

Model	Default `dimension`
Random Forest	`["bands"]`
XGBoost	`["bands"]`
TempCNN	`["bands", "t"]`
LightTAE	`["bands", "t"]`
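
To make the effect of `dimension` concrete, the sketch below computes the shape of the sample-by-feature matrix that results from flattening a cube. It illustrates the general idea only: the helper `feature_matrix_shape` and the example sizes are hypothetical, and openeo-core's internal reshaping may differ.

```python
from math import prod


def feature_matrix_shape(sizes: dict[str, int], dimension: list[str]) -> tuple[int, int]:
    """Return (n_samples, n_features) after flattening a cube.

    Dimensions listed in `dimension` are multiplied into the feature axis;
    every remaining dimension is multiplied into the sample axis.
    """
    missing = [name for name in dimension if name not in sizes]
    if missing:
        raise ValueError(f"unknown dimension(s): {missing}")
    n_features = prod(sizes[name] for name in dimension)
    n_samples = prod(size for name, size in sizes.items() if name not in dimension)
    return n_samples, n_features


# Hypothetical cube: 6 time steps, 4 bands, 100 x 100 pixels.
cube_sizes = {"t": 6, "bands": 4, "y": 100, "x": 100}

bands_only = feature_matrix_shape(cube_sizes, ["bands"])
bands_and_time = feature_matrix_shape(cube_sizes, ["bands", "t"])

print(bands_only)      # (60000, 4): each pixel at each time step is a sample
print(bands_and_time)  # (10000, 24): each pixel is one sample with a full time series
```

With `["bands"]` every observation is classified on its own, whereas `["bands", "t"]` gives temporal models such as TempCNN the whole time series per pixel.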

XGBoost classification

model = mlm_class_xgboost(
    learning_rate=0.15,
    max_depth=5,
    min_child_weight=1,
    subsample=0.8,
    min_split_loss=1,
    seed=42,
)
trained = ml_fit(model, training_gdf, target="label")

TempCNN classification (PyTorch)

model = mlm_class_tempcnn(
    epochs=100,
    batch_size=64,
    learning_rate=0.001,
    seed=42,
)
trained = ml_fit(model, training_gdf, target="label")
predictions = ml_predict(raster_cube, trained)

LightTAE classification (PyTorch)

model = mlm_class_lighttae(
    epochs=150,
    batch_size=128,
    learning_rate=0.0005,
    seed=42,
)
trained = ml_fit(model, training_gdf, target="label")
predictions = ml_predict(raster_cube, trained)

STAC MLM metadata on model objects

Every model carries full STAC MLM metadata:

model = mlm_class_random_forest(max_variables="sqrt", num_trees=100)
props = model.to_stac_properties()
# {
#   "mlm:name": "Random Forest Classifier",
#   "mlm:architecture": "Random Forest",
#   "mlm:tasks": ["classification"],
#   "mlm:framework": "scikit-learn",
#   "mlm:hyperparameters": {"max_variables": "sqrt", "num_trees": 100, "seed": null},
#   "mlm:input": [...],
#   "mlm:output": [...],
#   ...
# }

stac_item = model.to_stac_item()
# Full STAC Feature with MLM extension
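
The property listing above uses JSON notation, which is why an unset seed appears as `null`. The sketch below builds a comparable dictionary by hand and serialises it with the standard library to show how Python's `None` maps to JSON `null`. The helper `mlm_properties` is hypothetical and only mirrors the keys shown above; it is not openeo-core's implementation.

```python
import json


def mlm_properties(name, architecture, tasks, framework, hyperparameters):
    """Assemble a dict using the mlm:-prefixed keys shown in this README."""
    return {
        "mlm:name": name,
        "mlm:architecture": architecture,
        "mlm:tasks": list(tasks),
        "mlm:framework": framework,
        "mlm:hyperparameters": dict(hyperparameters),
    }


props = mlm_properties(
    name="Random Forest Classifier",
    architecture="Random Forest",
    tasks=["classification"],
    framework="scikit-learn",
    hyperparameters={"max_variables": "sqrt", "num_trees": 100, "seed": None},
)

# json.dumps converts None to null, matching the listing above.
encoded = json.dumps(props, indent=2)
print(encoded)
```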

Convenience factory (backward-compatible)

from openeo_core.model import Model, ml_fit, ml_predict

model = Model.random_forest(task="classification", max_variables="sqrt", num_trees=200)
trained = ml_fit(model, gdf, target="label")
preds = ml_predict(raster, trained)

# PyTorch models
model = Model.tempcnn(epochs=50, batch_size=32)
model = Model.lighttae(epochs=100, learning_rate=0.001)

Process Registry

from openeo_core.processes import ProcessRegistry

registry = ProcessRegistry()
print(registry.list_processes())
ndvi_spec = registry.get_process("ndvi")
results = registry.search("vegetation")
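
A registry of this kind can be pictured as a dictionary of process specs keyed by `id`, with keyword search over the text fields. The sketch below is a simplified stand-in written for illustration: `TinyRegistry` and the abbreviated specs are hypothetical, and the return types of the real `ProcessRegistry` methods are not documented in this README.

```python
class TinyRegistry:
    """Minimal in-memory registry over openEO-style process specs."""

    def __init__(self, specs):
        # Index specs by their "id" field.
        self._specs = {spec["id"]: spec for spec in specs}

    def list_processes(self):
        """Return all process ids in alphabetical order."""
        return sorted(self._specs)

    def get_process(self, process_id):
        """Return the full spec for one process id."""
        return self._specs[process_id]

    def search(self, keyword):
        """Return ids whose id, summary, or description contains the keyword."""
        needle = keyword.lower()
        matches = []
        for process_id, spec in sorted(self._specs.items()):
            haystack = " ".join(
                [process_id, spec.get("summary", ""), spec.get("description", "")]
            ).lower()
            if needle in haystack:
                matches.append(process_id)
        return matches


# Abbreviated, illustrative specs (not the bundled JSON files).
specs = [
    {
        "id": "ndvi",
        "summary": "Normalized Difference Vegetation Index",
        "description": "Computes the NDVI from the red and near-infrared bands.",
    },
    {
        "id": "filter_bbox",
        "summary": "Spatial filter using a bounding box",
        "description": "Limits the data cube to the specified bounding box.",
    },
]

registry = TinyRegistry(specs)
print(registry.list_processes())
print(registry.search("vegetation"))
```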

Load from STAC / GeoJSON

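from openeo_core import DataCube

# Raster cube from a STAC collection
cube = DataCube.load_stac(
    "https://earth-search.aws.element84.com/v1/collections/sentinel-2-l2a",
    assets=["red", "nir"],
)

# Vector cube from a GeoJSON FeatureCollection
vector = DataCube.load_geojson({"type": "FeatureCollection", "features": [...]})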

Vector cubes (GeoDataFrame and xvec)

Vector cubes can be GeoDataFrames or xarray DataArrays/Datasets with xvec geometry coordinates. For the xvec-backed form, install the geo extra first:

uv pip install "openeo-core[geo]"

import xarray as xr
import xvec  # noqa: F401  (importing xvec registers the .xvec accessor)
from shapely.geometry import Point

from openeo_core import DataCube

# Create xvec-backed vector cube
da = xr.DataArray(
    [1.0, 2.0, 3.0],
    dims=["geom"],
    coords={"geom": [Point(10, 50), Point(10.5, 50.5), Point(11, 51)]},
).xvec.set_geom_indexes("geom", crs=4326)

# Wrap the DataArray and filter it spatially
cube = DataCube(da)
result = cube.filter_bbox(west=9, south=49, east=11, north=51)
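
Conceptually, a bounding-box filter on point geometries keeps the points whose coordinates fall inside the box. The sketch below shows that test in plain Python with the same three points. Treating the box edges as inclusive is an assumption made for illustration, and `BBox` and `points_in_bbox` are hypothetical names, not openeo-core API.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    """Bounding box in the same axis order as filter_bbox's keywords."""

    west: float
    south: float
    east: float
    north: float


def points_in_bbox(points, bbox):
    """Keep (x, y) points inside the box; edges count as inside (assumed)."""
    return [
        (x, y)
        for x, y in points
        if bbox.west <= x <= bbox.east and bbox.south <= y <= bbox.north
    ]


points = [(10.0, 50.0), (10.5, 50.5), (11.0, 51.0)]

# Same box as the example above: all three points fall inside.
inside_wide = points_in_bbox(points, BBox(west=9, south=49, east=11, north=51))

# A tighter box keeps only the middle point.
inside_tight = points_in_bbox(points, BBox(west=10.2, south=50.2, east=10.8, north=50.8))

print(inside_wide)
print(inside_tight)
```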

Documentation

docs/index.md — Documentation index
docs/architecture.md — Software structure, design, and component overview

Architecture

openeo_core/
  __init__.py          # DataCube, type aliases
  datacube.py          # Fluent wrapper + dispatch
  types.py             # RasterCube/VectorCube/Cube aliases
  ops/
    raster.py          # xarray/dask raster operations
    vector.py          # geopandas, dask-geopandas, xvec vector operations
  io/
    collection.py      # load_collection (pystac-client + stackstac)
    stac.py            # load_stac (pystac + stackstac)
    geojson.py         # load_geojson (geopandas)
  model/
    __init__.py        # Public API exports
    mlm.py             # MLModel (STAC MLM-compatible object)
    base.py            # openEO process functions + Model factory
    sklearn.py         # scikit-learn estimator builder (internal)
    xgboost_backend.py # XGBoost estimator builder (internal)
    torch.py           # PyTorch wrapper (TempCNN, LightTAE)
    torch_models/      # PyTorch nn.Module implementations
      tempcnn.py       # TempCNN architecture
      lighttae.py      # LightTAE architecture
  processes/
    registry.py        # JSON spec registry
    resources/         # Packaged process JSON specs

openEO ML Process Mapping

openEO Process	Python Function	Description
`mlm_class_random_forest`	`mlm_class_random_forest()`	Init RF classifier
`mlm_regr_random_forest`	`mlm_regr_random_forest()`	Init RF regressor
`mlm_class_xgboost`	`mlm_class_xgboost()`	Init XGBoost classifier
`mlm_class_tempcnn`	`mlm_class_tempcnn()`	Init TempCNN classifier
`mlm_class_lighttae`	`mlm_class_lighttae()`	Init LightTAE classifier
`ml_fit`	`ml_fit(model, training_set, target)`	Train a model
`ml_predict`	`ml_predict(data, model)`	Predict with trained model
`save_ml_model`	`save_ml_model(data, name, options)`	Save model + STAC Item
`load_stac_ml`	`load_stac_ml(uri, ...)`	Load model from STAC Item

Examples

Notebook	Description
01_ndvi.ipynb	NDVI computation with the DataCube API
02_ml_random_forest.ipynb	Random Forest classification pipeline
03_process_registry.ipynb	Exploring the Process Registry
04_ml_tempcnn.ipynb	TempCNN temporal classification with PyTorch

Running Tests

uv run pytest tests/ -v

License

Apache-2.0

Project details

Maintainers

bpondi

Release history

Version	Release date
0.2.0 (this version)	Apr 20, 2026
0.1.2	Mar 11, 2026
0.1.1	Mar 2, 2026
0.1.0	Feb 28, 2026

Download files

Source Distribution

openeo_core-0.2.0.tar.gz (3.2 MB)

Uploaded Apr 20, 2026 Source

Built Distribution

openeo_core-0.2.0-py3-none-any.whl (102.8 kB)

Uploaded Apr 20, 2026 Python 3

File details

Details for the file openeo_core-0.2.0.tar.gz.

File metadata

Download URL: openeo_core-0.2.0.tar.gz
Upload date: Apr 20, 2026
Size: 3.2 MB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for openeo_core-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`d7260f2a7ec078be81763dd3d7b8209cc7387985cdef28ea604325d0baa8afec`
MD5	`08e85b24deaf7967efc279fe2a149cbd`
BLAKE2b-256	`3f8f803186d084c16d7cee56fc6fe007d0d603c0adfc16033695e2200eb37726`
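
A downloaded file can be checked against the SHA256 digest listed above with the standard library. The sketch below is one way to do it: the helper names `sha256_of` and `matches_digest` are hypothetical, and the local path in the commented usage line assumes the archive was saved to the current directory.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()


# SHA256 digest of openeo_core-0.2.0.tar.gz as listed on this page.
EXPECTED_SDIST_SHA256 = "d7260f2a7ec078be81763dd3d7b8209cc7387985cdef28ea604325d0baa8afec"

# Usage (assumes the archive is in the current directory):
# print(matches_digest("openeo_core-0.2.0.tar.gz", EXPECTED_SDIST_SHA256))
```

Reading in chunks keeps memory use constant regardless of file size.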

Provenance

The following attestation bundles were made for openeo_core-0.2.0.tar.gz:

Publisher: publish.yml on PondiB/openeo-core

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: openeo_core-0.2.0.tar.gz
- Subject digest: d7260f2a7ec078be81763dd3d7b8209cc7387985cdef28ea604325d0baa8afec
- Sigstore transparency entry: 1342018007
- Sigstore integration time: Apr 20, 2026
Source repository:
- Permalink: PondiB/openeo-core@0ad1cc3885e34d470a3d803550f4b205c2f5dc72
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/PondiB
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@0ad1cc3885e34d470a3d803550f4b205c2f5dc72
- Trigger Event: release

File details

Details for the file openeo_core-0.2.0-py3-none-any.whl.

File metadata

Download URL: openeo_core-0.2.0-py3-none-any.whl
Upload date: Apr 20, 2026
Size: 102.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for openeo_core-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`e37b77e4aff417040a2b743629c41d1afd0533173ce0c7de10d6daf92f3b7e7f`
MD5	`c65aa3e29c6bfa9c05fdae168d86730c`
BLAKE2b-256	`b4ef09d6b02bbc15576549384d9eec82322eef11780f78e34afe94deecaed08d`

Provenance

The following attestation bundles were made for openeo_core-0.2.0-py3-none-any.whl:

Publisher: publish.yml on PondiB/openeo-core

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: openeo_core-0.2.0-py3-none-any.whl
- Subject digest: e37b77e4aff417040a2b743629c41d1afd0533173ce0c7de10d6daf92f3b7e7f
- Sigstore transparency entry: 1342018050
- Sigstore integration time: Apr 20, 2026
Source repository:
- Permalink: PondiB/openeo-core@0ad1cc3885e34d470a3d803550f4b205c2f5dc72
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/PondiB
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@0ad1cc3885e34d470a3d803550f4b205c2f5dc72
- Trigger Event: release
