High-performance Rust UDFs for Earth Observation processing

eo-processor

High-performance Rust UDFs for Earth Observation (EO) processing with Python bindings. Provides fast spectral indices, temporal statistics, and (internally) spatial distance utilities.

Overview

eo-processor accelerates common Earth Observation and geospatial computations using Rust + PyO3, exposing a Python API compatible with NumPy, XArray, and Dask. The Rust kernels release Python's Global Interpreter Lock (GIL) while they run, so several threads can process large arrays in parallel on multi-core machines.

Primary focus areas:

Spectral indices (NDVI, NDWI, EVI, generic normalized differences)
Temporal compositing/statistics (median, mean, standard deviation)
Spatial utilities (distance computations — currently available via internal module)

Key Features

Rust-accelerated numerical kernels (safe, no unsafe code)
Automatic dimensional dispatch (1D vs 2D for spectral indices)
Temporal statistics across a leading “time” axis for 1D–4D arrays
Optional skipping of NaN values (skip_na=True)
Ready for XArray / Dask parallelized workflows
Type hints and stubs for IDE assistance
Deterministic, GIL-efficient performance

Installation

Using `pip` (PyPI)

pip install eo-processor

Optional extras (for distributed / parallel array workflows):

pip install eo-processor[dask]

Using `uv` (fast dependency manager)

# Create and sync environment
uv venv
source .venv/bin/activate
uv pip install eo-processor

From Source

Requirements:

Python 3.8+
Rust toolchain (install via https://rustup.rs/)
maturin for building the extension

git clone https://github.com/BnJam/eo-processor.git
cd eo-processor

# Build & install in editable (development) mode
pip install maturin
maturin develop --release

# Or build a wheel
maturin build --release
pip install target/wheels/*.whl

Quick Start

import numpy as np
from eo_processor import ndvi, ndwi, evi, normalized_difference

nir  = np.array([0.8, 0.7, 0.6])
red  = np.array([0.2, 0.1, 0.3])
blue = np.array([0.1, 0.05, 0.08])
green = np.array([0.35, 0.42, 0.55])

ndvi_vals = ndvi(nir, red)
ndwi_vals = ndwi(green, nir)
evi_vals  = evi(nir, red, blue)
nd_generic = normalized_difference(nir, red)

print(ndvi_vals, ndwi_vals, evi_vals, nd_generic)

All spectral index functions return NumPy arrays directly (no tuple wrappers).

API Summary

Top-level Python exports (via eo_processor):

Function	Description
`normalized_difference(a, b)`	Generic `(a - b) / (a + b)` with zero-denominator safeguard
`ndvi(nir, red)`	Normalized Difference Vegetation Index
`ndwi(green, nir)`	Normalized Difference Water Index
`enhanced_vegetation_index(nir, red, blue)` / `evi(...)`	Enhanced Vegetation Index (EVI): G * (NIR - Red) / (NIR + C1 * Red - C2 * Blue + L)
`median(arr, skip_na=True)`	Temporal median across leading axis for 1D–4D arrays
`composite(arr, method="median", **kwargs)`	Convenience wrapper (currently only median)
`temporal_mean(arr, skip_na=True)`	Mean across time dimension
`temporal_std(arr, skip_na=True)`	Sample standard deviation (n-1 denominator) across time
`savi(nir, red, L=0.5)`	Soil Adjusted Vegetation Index: (NIR - Red)/(NIR + Red + L) * (1 + L); variable L (≥ 0)
`nbr(nir, swir2)`	Normalized Burn Ratio: (NIR - SWIR2)/(NIR + SWIR2)
`ndmi(nir, swir1)`	Normalized Difference Moisture Index: (NIR - SWIR1)/(NIR + SWIR1)
`nbr2(swir1, swir2)`	Normalized Burn Ratio 2: (SWIR1 - SWIR2)/(SWIR1 + SWIR2)
`gci(nir, green)`	Green Chlorophyll Index: (NIR / Green) - 1 (division guarded)
`delta_ndvi(pre_nir, pre_red, post_nir, post_red)`	Change in NDVI (pre - post); positive values mean NDVI dropped after the event (vegetation loss)
`delta_nbr(pre_nir, pre_swir2, post_nir, post_swir2)`	Change in NBR (pre - post); burn severity (higher positive change suggests more severe burn)
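
Most of the indices in the table are instances of the generic normalized difference. The following is a minimal NumPy sketch of that formula, not the library's Rust code. The function name `normalized_difference_ref`, the tolerance `eps`, and the choice to return 0.0 where the denominator is (near) zero are assumptions made for illustration; the library's actual safeguard may behave differently.

```python
import numpy as np

def normalized_difference_ref(a, b, eps=1e-10):
    """NumPy reference for (a - b) / (a + b).

    Assumption: pixels whose denominator is (near) zero are set to 0.0.
    """
    # Cast to float64 first so unsigned integer inputs cannot wrap around
    # when a < b.
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    denom = a + b
    out = np.zeros_like(denom)
    valid = np.abs(denom) > eps
    # Divide only where the denominator is safe; other pixels keep 0.0.
    np.divide(a - b, denom, out=out, where=valid)
    return out

nir = np.array([0.8, 0.7, 0.0])
red = np.array([0.2, 0.1, 0.0])
ndvi_ref = normalized_difference_ref(nir, red)  # NDVI is the (nir, red) case

# Unsigned integer reflectances (e.g. scaled by 10000) still give a signed result.
nir_u16 = np.array([1000], dtype=np.uint16)
red_u16 = np.array([3000], dtype=np.uint16)
ndvi_u16 = normalized_difference_ref(nir_u16, red_u16)
```

The explicit float64 cast matters for integer inputs: subtracting `uint16` arrays directly would wrap around instead of producing a negative value.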

All spectral/temporal index functions accept any numeric NumPy dtype (int, uint, float32, float64, etc.); inputs are automatically coerced to float64 internally for consistency.

Spatial distance functions compute pairwise distance matrices between two point sets and are currently available via the internal `_core` module. Time and memory are O(N*M), which matters for large point sets. Formulas (a, b ∈ ℝ^D):

Euclidean: √(∑ᵢ (aᵢ - bᵢ)²)
Manhattan (L₁): ∑ᵢ |aᵢ - bᵢ|
Chebyshev (L_∞): maxᵢ |aᵢ - bᵢ|
Minkowski (L_p): (∑ᵢ |aᵢ - bᵢ|^p)^(1/p), with p ≥ 1.0 (this library enforces p ≥ 1)

Function	Description
`euclidean_distance(points_a, points_b)`	Pairwise Euclidean distances (shape (N,M))
`manhattan_distance(points_a, points_b)`	Pairwise L1 distance
`chebyshev_distance(points_a, points_b)`	Pairwise max-abs (L∞) distance
`minkowski_distance(points_a, points_b, p)`	Pairwise L^p distance
(Median helpers for dimension dispatch)	Implementations backing `median`

If you need spatial distance functions at the top level, add them to python/eo_processor/__init__.py and re-export.

Spectral Indices

NDVI

Formula: (NIR - Red) / (NIR + Red)

Typical interpretation:

Water / snow: < 0 (often strongly negative for clear water)
Bare soil / built surfaces: ~ 0.0 – 0.2
Sparse vegetation / stressed crops: 0.2 – 0.5
Healthy dense vegetation: > 0.5 (tropical forest can exceed 0.7)

NDWI

Formula: (Green - NIR) / (Green + NIR)

Typical interpretation:

Open water bodies: > 0.3 (often 0.4–0.6)
Moist vegetation / wetlands: 0.0 – 0.3
Dry vegetation / bare soil: < 0.0 (negative values)

EVI

Formula: EVI = G * (NIR - Red) / (NIR + C1 * Red - C2 * Blue + L)

Constants (MODIS): G=2.5, C1=6.0, C2=7.5, L=1.0

Typical interpretation:

EVI dampens soil & atmospheric effects relative to NDVI
Moderate vegetation: ~0.2 – 0.4
Dense / healthy canopy: >0.4 (can reach ~0.8 in lush tropical zones)
Very low / senescent vegetation: <0.2
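
As a worked example of the EVI formula with the MODIS constants listed above, here is a plain NumPy sketch. `evi_ref` is a hypothetical helper written for this illustration; it does not guard against a zero denominator, which the library may handle differently.

```python
import numpy as np

def evi_ref(nir, red, blue, G=2.5, C1=6.0, C2=7.5, L=1.0):
    """NumPy reference: G * (NIR - Red) / (NIR + C1*Red - C2*Blue + L)."""
    nir, red, blue = (np.asarray(x, dtype=np.float64) for x in (nir, red, blue))
    return G * (nir - red) / (nir + C1 * red - C2 * blue + L)

# Same band values as the Quick Start example.
nir  = [0.8, 0.7, 0.6]
red  = [0.2, 0.1, 0.3]
blue = [0.1, 0.05, 0.08]
evi_vals = evi_ref(nir, red, blue)

# First pixel by hand: 2.5 * 0.6 / (0.8 + 1.2 - 0.75 + 1.0) = 1.5 / 2.25
print(np.round(evi_vals, 4))
```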

SAVI

Formula: SAVI = (NIR - Red) / (NIR + Red + L) * (1 + L)

Typical soil factor L=0.5; recommended range 0–1. Higher L reduces soil background effects. Implementation supports variable L (must be ≥ 0).

Interpretation (similar to NDVI but more robust over bright soil):

Bare / bright soil: ~0.0 – 0.2
Moderate vegetation: 0.2 – 0.5
Healthy dense vegetation: > 0.5

Use smaller L (e.g. 0.25) for dense vegetation, larger L (~1.0) for very sparse vegetation / bright soil conditions.
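
To see how the soil factor changes the result, the sketch below evaluates the SAVI formula for one pixel at several values of L. `savi_ref` is a hypothetical NumPy helper, and raising `ValueError` for negative L is an assumption about how the constraint could be enforced, not a description of the library's behaviour.

```python
import numpy as np

def savi_ref(nir, red, L=0.5):
    """NumPy reference: (NIR - Red) / (NIR + Red + L) * (1 + L)."""
    if L < 0:
        raise ValueError("L must be >= 0")
    nir = np.asarray(nir, dtype=np.float64)
    red = np.asarray(red, dtype=np.float64)
    return (nir - red) / (nir + red + L) * (1.0 + L)

nir, red = 0.5, 0.1
savi_by_L = {L: float(savi_ref(nir, red, L=L)) for L in (0.0, 0.5, 1.0)}
# L = 0.0 reduces to NDVI: 0.4 / 0.6
# L = 0.5:                 0.4 / 1.1 * 1.5
# L = 1.0:                 0.4 / 1.6 * 2.0
print(savi_by_L)
```

For this pixel the value decreases as L grows, which is the damping of soil background described above.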

NBR

Formula: NBR = (NIR - SWIR2) / (NIR + SWIR2)

Used for burn severity and post-fire change detection. Typical interpretation (pre-fire vs post-fire):

Healthy vegetation (pre-fire): high positive (≈0.4 – 0.7)
Recently burned areas: strong drop; post-fire NBR often < 0.1 or negative

Change analysis often uses ΔNBR (pre - post). Common burn severity thresholds (example ranges, refine per study):

ΔNBR > 0.66: High severity
0.44 – 0.66: Moderate-high
0.27 – 0.44: Moderate-low
0.1 – 0.27: Low severity
< 0.1: Unburned / noise
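
The example thresholds above can be turned into a class map with `np.digitize`. This is only a sketch: `classify_dnbr`, the label strings, and the "nodata" label for NaN pixels are hypothetical, and because the listed ranges do not say which class owns a boundary value, the code assumes that a value exactly on a threshold falls into the lower class.

```python
import numpy as np

# Class edges taken from the example ranges above (refine per study).
EDGES = np.array([0.1, 0.27, 0.44, 0.66])
LABELS = np.array(["unburned", "low", "moderate-low", "moderate-high", "high"])

def classify_dnbr(dnbr):
    """Map delta-NBR values to severity labels; NaN pixels become 'nodata'."""
    dnbr = np.asarray(dnbr, dtype=np.float64)
    # right=True: EDGES[i-1] < x <= EDGES[i], so boundary values go to the lower class.
    idx = np.digitize(dnbr, EDGES, right=True)
    labels = LABELS[idx]
    # digitize would otherwise place NaN in the top class, so override it.
    return np.where(np.isnan(dnbr), "nodata", labels)

dnbr = np.array([0.05, 0.2, 0.3, 0.5, 0.7, np.nan])
severity = classify_dnbr(dnbr)
print(severity.tolist())
```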

NDMI

Formula: NDMI = (NIR - SWIR1) / (NIR + SWIR1)

Moisture / canopy water content indicator. Typical interpretation:

High positive (>0.3): Moist / healthy canopy (leaf water content high)
Near zero (0.0 – 0.3): Moderate moisture / possible stress onset
Negative (<0.0): Dry vegetation / senescence / possible drought stress

NBR2

Formula: NBR2 = (SWIR1 - SWIR2) / (SWIR1 + SWIR2)

Highlights burn severity and subtle thermal / moisture differences. Typical interpretation:

Lower values: Increased moisture / less burn impact
Higher values: Greater dryness / potential higher burn severity

Use in tandem with NBR or NDMI for refined burn severity or moisture discrimination.

GCI

Formula: GCI = (NIR / Green) - 1

Green Chlorophyll Index; division by near-zero Green values is guarded to return 0. Typical interpretation:

Values > 0 indicate chlorophyll presence
0 – 2: Sparse to moderate chlorophyll (grassland, early growth)
2 – 8: Higher chlorophyll density (crops peak growth, healthy canopy)
> 8: Very dense chlorophyll (may indicate saturation; verify sensor & calibration)

Absolute ranges vary with sensor, atmospheric correction, and reflectance scaling; use relative comparisons or time-series trends.
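
The guarded division described for GCI can be sketched in NumPy as follows. `gci_ref` and the tolerance `eps` are assumptions for illustration; the README only states that near-zero Green values yield 0, not the exact threshold used.

```python
import numpy as np

def gci_ref(nir, green, eps=1e-10):
    """NumPy reference: (NIR / Green) - 1, returning 0 where Green is near zero."""
    nir = np.asarray(nir, dtype=np.float64)
    green = np.asarray(green, dtype=np.float64)
    out = np.zeros_like(green)          # guarded pixels keep this 0.0
    valid = np.abs(green) > eps
    np.divide(nir, green, out=out, where=valid)
    out[valid] -= 1.0                   # subtract 1 only where the ratio was computed
    return out

nir   = np.array([0.8, 0.6, 0.5])
green = np.array([0.2, 0.3, 0.0])       # last pixel triggers the guard
gci_vals = gci_ref(nir, green)
```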

All indices auto-dispatch between 1D and 2D input arrays; shapes must match.

Change Detection Indices

Change detection indices operate on “pre” and “post” event imagery (e.g., before vs after fire, storm, harvest):

Formulae:

ΔNDVI = NDVI(pre) - NDVI(post)
ΔNBR = NBR(pre) - NBR(post)

Typical interpretation:

Positive ΔNDVI: vegetation loss / canopy degradation.
Near-zero ΔNDVI: minimal change.
Positive ΔNBR: higher burn severity (consult study-specific threshold tables).
Use masks (cloud, snow, shadow) to set unreliable pre/post pixels to NaN before computing deltas.

These delta indices also accept any numeric dtype; values are coerced to float64.
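
The masking advice above can be sketched in plain NumPy: unreliable pixels are set to NaN in each scene first, so the NaN propagates into the delta. The helper names (`mask_to_nan`, `nbr_ref`) and the boolean "clear" masks are hypothetical; how you derive the masks is outside this library's scope.

```python
import numpy as np

def mask_to_nan(band, clear):
    """Keep pixels where `clear` is True, replace the rest with NaN."""
    return np.where(clear, np.asarray(band, dtype=np.float64), np.nan)

def nbr_ref(nir, swir2):
    """Plain (NIR - SWIR2) / (NIR + SWIR2), without a zero-denominator guard."""
    return (nir - swir2) / (nir + swir2)

pre_nir    = np.array([[0.6, 0.5], [0.7, 0.4]])
pre_swir2  = np.array([[0.2, 0.1], [0.1, 0.2]])
post_nir   = np.array([[0.3, 0.5], [0.2, 0.4]])
post_swir2 = np.array([[0.3, 0.1], [0.4, 0.2]])

# Hypothetical quality masks: True = usable pixel.
pre_clear  = np.array([[True, True], [True, False]])
post_clear = np.array([[True, False], [True, True]])

pre_nbr  = nbr_ref(mask_to_nan(pre_nir, pre_clear),  mask_to_nan(pre_swir2, pre_clear))
post_nbr = nbr_ref(mask_to_nan(post_nir, post_clear), mask_to_nan(post_swir2, post_clear))

# A pixel is NaN in the delta if it was masked in either scene.
delta_nbr_ref = pre_nbr - post_nbr
```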

CLI Usage

A command-line helper is available (scripts/eo_cli.py) to batch compute indices from .npy band files:

Single index:

python scripts/eo_cli.py --index ndvi --nir data/nir.npy --red data/red.npy --out outputs/ndvi.npy

Multiple indices:

python scripts/eo_cli.py --index ndvi savi ndmi nbr --nir data/nir.npy --red data/red.npy --swir1 data/swir1.npy --swir2 data/swir2.npy --out-dir outputs/

Change detection:

python scripts/eo_cli.py --index delta_nbr \
  --pre-nir pre/nir.npy --pre-swir2 pre/swir2.npy \
  --post-nir post/nir.npy --post-swir2 post/swir2.npy \
  --out outputs/delta_nbr.npy

Cloud mask (0=cloud, 1=clear):

python scripts/eo_cli.py --index ndvi --nir data/nir.npy --red data/red.npy --mask data/cloudmask.npy --out outputs/ndvi_masked.npy

PNG preview:

python scripts/eo_cli.py --index ndvi --nir data/nir.npy --red data/red.npy --out outputs/ndvi.npy --png-preview outputs/ndvi.png

Use --savi-l to adjust soil factor for SAVI; use --clamp MIN MAX to restrict output range before saving; --allow-missing skips indices lacking required bands.
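
The CLI reads bands from .npy files, so inputs have to be written with `np.save` first. The sketch below prepares such files in a temporary directory and then mimics, in plain NumPy, what masking and clamping an NDVI result could look like. It does not call `scripts/eo_cli.py`; treating masked pixels as NaN and clamping with `np.clip` are assumptions about the CLI's behaviour, and the file names are only examples.

```python
import os
import tempfile

import numpy as np

nir  = np.array([[0.8, 0.6], [0.5, 0.9]])
red  = np.array([[0.2, 0.3], [0.5, 0.05]])
mask = np.array([[1, 0], [1, 1]], dtype=np.uint8)   # 0 = cloud, 1 = clear

with tempfile.TemporaryDirectory() as workdir:
    # Write band files in the format the CLI expects as input.
    paths = {name: os.path.join(workdir, f"{name}.npy") for name in ("nir", "red", "mask")}
    np.save(paths["nir"], nir)
    np.save(paths["red"], red)
    np.save(paths["mask"], mask)

    # Plain NumPy stand-in for: --index ndvi --mask ... --clamp 0.0 0.8
    nir_in, red_in, mask_in = (np.load(paths[k]) for k in ("nir", "red", "mask"))
    ndvi = (nir_in - red_in) / (nir_in + red_in)
    ndvi = np.where(mask_in == 1, ndvi, np.nan)      # assumption: cloudy pixels become NaN
    ndvi = np.clip(ndvi, 0.0, 0.8)                   # NaN values pass through np.clip unchanged

    out_path = os.path.join(workdir, "ndvi_masked.npy")
    np.save(out_path, ndvi)
    reloaded = np.load(out_path)
```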

Temporal Statistics & Compositing

Temporal functions assume the first axis is “time”:

1D: (time,)
2D: (time, band)
3D: (time, y, x)
4D: (time, band, y, x)

Example (temporal mean of a stack):

import numpy as np
from eo_processor import temporal_mean, temporal_std

# Simulate (time, y, x) stack: 10 timesteps of 256x256
cube = np.random.rand(10, 256, 256)
mean_image = temporal_mean(cube)       # shape (256, 256)
std_image  = temporal_std(cube)        # shape (256, 256)

Median compositing:

from eo_processor import median, composite
median_image = median(cube)          # same as composite(cube, method="median")

Skip NaNs:

cloudy_series = np.array([[0.2, np.nan, 0.5],
                          [0.25, 0.3,   0.45],
                          [0.22, np.nan, 0.47]])  # (time, band)
clean_mean = temporal_mean(cloudy_series, skip_na=True)   # ignores NaNs
strict_mean = temporal_mean(cloudy_series, skip_na=False) # bands with NaN → NaN
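
For cross-checking results, the temporal functions have close NumPy counterparts. The mapping below is an assumption based on the descriptions in this README (reduction over axis 0, sample standard deviation with an n-1 denominator, optional NaN skipping); edge cases such as a band with a single valid observation may be handled differently by the library.

```python
import warnings

import numpy as np

cloudy_series = np.array([[0.2,  np.nan, 0.5],
                          [0.25, 0.3,    0.45],
                          [0.22, np.nan, 0.47]])  # (time, band)

# skip_na=True analogues: NaN observations are ignored per band.
mean_skip   = np.nanmean(cloudy_series, axis=0)
median_skip = np.nanmedian(cloudy_series, axis=0)

# The middle band has one valid sample, so an n-1 denominator is undefined there;
# NumPy returns NaN and emits a RuntimeWarning, which is silenced here.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", RuntimeWarning)
    std_skip = np.nanstd(cloudy_series, axis=0, ddof=1)

# skip_na=False analogue: any NaN in a band makes the result NaN.
mean_strict = np.mean(cloudy_series, axis=0)
```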

Spatial Distances (Internal)

Currently available in the Rust core module:

from eo_processor import _core

import numpy as np
points_a = np.array([[0.0, 0.0],
                     [1.0, 1.0]])
points_b = np.array([[1.0, 0.0],
                     [0.0, 1.0]])

dist_euclid = _core.euclidean_distance(points_a, points_b)
dist_manhat = _core.manhattan_distance(points_a, points_b)
dist_cheby  = _core.chebyshev_distance(points_a, points_b)
dist_mink   = _core.minkowski_distance(points_a, points_b, 3.0)

Each returns an (N, M) array of pairwise distances. Note: These perform O(N*M) computations; for very large sets consider spatial indexing approaches (not yet implemented here).
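
To make the (N, M) output shape concrete, here is a NumPy broadcasting sketch of the four distance formulas. `pairwise_minkowski` is a hypothetical reference helper, not the Rust implementation, and it builds an (N, M, D) intermediate array, so it uses more memory than a kernel that accumulates per pair.

```python
import numpy as np

def pairwise_minkowski(points_a, points_b, p):
    """Pairwise L_p distances between rows of points_a (N, D) and points_b (M, D)."""
    if p < 1.0:
        raise ValueError("p must be >= 1")
    a = np.asarray(points_a, dtype=np.float64)
    b = np.asarray(points_b, dtype=np.float64)
    diff = np.abs(a[:, None, :] - b[None, :, :])   # shape (N, M, D)
    if np.isinf(p):
        return diff.max(axis=-1)                    # Chebyshev (L_inf)
    return (diff ** p).sum(axis=-1) ** (1.0 / p)

points_a = np.array([[0.0, 0.0],
                     [1.0, 1.0]])
points_b = np.array([[3.0, 4.0],
                     [0.0, 1.0]])

euclid = pairwise_minkowski(points_a, points_b, 2.0)     # Euclidean
manhat = pairwise_minkowski(points_a, points_b, 1.0)     # Manhattan
cheby  = pairwise_minkowski(points_a, points_b, np.inf)  # Chebyshev
```

Row i, column j of each result is the distance between `points_a[i]` and `points_b[j]`.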

XArray / Dask Integration

import dask.array as da
import xarray as xr
from eo_processor import ndvi

nir_dask = da.random.random((5000, 5000), chunks=(500, 500))
red_dask = da.random.random((5000, 5000), chunks=(500, 500))

nir_xr = xr.DataArray(nir_dask, dims=["y", "x"])
red_xr = xr.DataArray(red_dask, dims=["y", "x"])

ndvi_da = xr.apply_ufunc(
    ndvi,
    nir_xr,
    red_xr,
    dask="parallelized",
    output_dtypes=[float],
)

result = ndvi_da.compute()

Performance

Rust implementations avoid Python-loop overhead and release the GIL. Example benchmark (single-thread baseline):

import time

import numpy as np
from eo_processor import ndvi

nir = np.random.rand(5000, 5000)
red = np.random.rand(5000, 5000)

# perf_counter is a monotonic, high-resolution clock intended for timing
t0 = time.perf_counter()
rust_out = ndvi(nir, red)
t_rust = time.perf_counter() - t0

t0 = time.perf_counter()
numpy_out = (nir - red) / (nir + red)
t_numpy = time.perf_counter() - t0

print(f"Rust: {t_rust:.4f}s  NumPy: {t_numpy:.4f}s  Speedup: {t_numpy/t_rust:.2f}x")

Observed speedups vary by platform and array size. Always benchmark in your environment.

Test Coverage

The coverage badge is generated from coverage.xml via scripts/generate_coverage_badge.py. To regenerate after test changes:

tox -e coverage
python scripts/generate_coverage_badge.py coverage.xml coverage-badge.svg

Contributing

See CONTRIBUTING.md and AGENTS.md for guidelines (workflows, security posture, and pre-commit checklist). Typical steps:

cargo fmt
cargo clippy --all-targets -- -D warnings
pytest
tox -e coverage

Add new Rust functions → export via #[pyfunction] → register in src/lib.rs → expose in python/eo_processor/__init__.py → add type stubs → add tests → update README.

Roadmap (Indicative)

Additional spectral indices (SAVI, NBR, GCI)
Sliding window / neighborhood stats
Direct distance exports at top-level
Distributed temporal composites (chunk-aware)
Optional GPU acceleration feasibility study

Scientific Citation

@software{eo_processor,
  title = {eo-processor: High-performance Rust UDFs for Earth Observation},
  author = {Ben Smith},
  year = {2025},
  url = {https://github.com/BnJam/eo-processor}
}

License

MIT License. See LICENSE.

Disclaimer

This library focuses on computational primitives; it does not handle:

Cloud masking
Sensor-specific calibration
CRS reprojection
I/O of remote datasets

Combine with domain tools (e.g., rasterio, xarray, dask-geopandas) for complete EO pipelines.

Support

Open issues for bugs or enhancements. Feature proposals with benchmarks and EO relevance are welcome.

Benchmark Harness

A minimal benchmarking harness is provided at scripts/benchmark.py to measure performance of spectral, temporal, and spatial distance functions. The spectral group currently includes: ndvi, ndwi, evi, savi, nbr, ndmi, nbr2, gci, and normalized_difference.

Basic usage (spectral functions on a 2048x2048 image):

python scripts/benchmark.py --group spectral --height 2048 --width 2048

Compare Rust vs pure NumPy baselines (supported for spectral & temporal functions):

python scripts/benchmark.py --group temporal --time 24 --height 1024 --width 1024 --compare-numpy

Distance benchmarks (pairwise matrix; O(N*M)):

python scripts/benchmark.py --group distances --points-a 2000 --points-b 2000 --point-dim 8

Write JSON and Markdown reports:

python scripts/benchmark.py --group all --compare-numpy --json-out bench.json --md-out bench.md

Key options:

--group {spectral|temporal|distances|all} selects predefined function sets.
--functions <names...> overrides group selection with explicit functions.
--compare-numpy enables baseline timing (speedup > 1.0 indicates Rust faster).
--loops / --warmups control timing repetitions.
--json-out writes structured results (including baseline metrics when enabled).
--md-out writes a Markdown table suitable for PRs / reports.
--quiet suppresses console table output (still writes artifacts).
--minkowski-p sets the norm order for Minkowski distance (must be ≥ 1.0).

Speedup vs NumPy = (NumPy mean time / Rust mean time); values > 1 indicate Rust is faster.
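
The speedup definition above, together with the warmup and loop options, can be sketched as a small timing helper. This is not the code in scripts/benchmark.py; `mean_time` and `speedup` are hypothetical names, and a pure-Python loop versus vectorized NumPy is used as a stand-in pair so the example runs without the extension installed.

```python
import time

import numpy as np

def mean_time(fn, *args, warmups=1, loops=5):
    """Mean wall-clock time of fn(*args) over `loops` runs, after `warmups` untimed runs."""
    for _ in range(warmups):
        fn(*args)
    samples = []
    for _ in range(loops):
        t0 = time.perf_counter()
        fn(*args)
        samples.append(time.perf_counter() - t0)
    return sum(samples) / len(samples)

def speedup(baseline_mean, candidate_mean):
    """Baseline mean time / candidate mean time; > 1 means the candidate is faster."""
    return baseline_mean / candidate_mean

def ndvi_loop(nir, red):
    # Slow stand-in baseline: element-by-element Python loop.
    return [(n - r) / (n + r) for n, r in zip(nir, red)]

def ndvi_vectorized(nir, red):
    # Fast stand-in candidate: one vectorized NumPy expression.
    return (nir - red) / (nir + red)

rng = np.random.default_rng(0)
nir = rng.uniform(0.1, 1.0, 10_000)
red = rng.uniform(0.1, 1.0, 10_000)

t_loop = mean_time(ndvi_loop, nir, red)
t_vec = mean_time(ndvi_vectorized, nir, red)
print(f"loop: {t_loop:.5f}s  vectorized: {t_vec:.5f}s  speedup: {speedup(t_loop, t_vec):.1f}x")
```

Warmup runs keep one-off costs such as imports and cache misses out of the reported mean.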

Happy processing!

Release history

Version	Release date
0.19.0	Jan 7, 2026
0.18.0	Dec 15, 2025
0.17.0	Dec 14, 2025
0.16.0	Dec 13, 2025
0.15.0	Dec 9, 2025
0.14.1	Dec 6, 2025
0.13.3	Dec 6, 2025
0.13.2	Nov 24, 2025
0.12.3	Nov 24, 2025
0.11.0	Nov 23, 2025
0.10.1	Nov 21, 2025
0.10.0	Nov 21, 2025
0.9.0	Nov 18, 2025
0.8.0	Nov 16, 2025
0.7.2	Nov 15, 2025
0.7.1	Nov 15, 2025
0.7.0	Nov 14, 2025
0.6.0	Nov 12, 2025
0.5.0	Nov 12, 2025
0.4.0 (this version)	Nov 11, 2025
0.3.0	Nov 10, 2025
0.2.0	Nov 10, 2025
0.1.0	Nov 9, 2025
0.0.0	Nov 9, 2025

