High-performance Rust UDFs for Earth Observation processing

These details have not been verified by PyPI

Project description

eo-processor

High-performance Rust (PyO3) UDFs for Earth Observation (EO) processing with Python bindings. Fast spectral indices, temporal statistics, masking utilities, and spatial distance functions.

Documentation

Full documentation is hosted online: https://bnjam.dev/eo-processor/

The site contains:

Quick start guides and worked examples (NDVI, EVI, compositing, masking)
Complete API reference with function signatures, expected input shapes/dtypes, and return types
CLI usage and examples for batch processing and PNG preview generation
Tutorials for integrating with XArray / Dask and for building reproducible benchmarks
Developer & contribution notes (how to add Rust UDFs, register Python bindings, test, and create type stubs)
Guidance on building from source, running the benchmark harness, and regenerating coverage badges
Release notes / changelog and citation information

Overview

eo-processor accelerates common remote sensing computations using safe Rust (no unsafe) exposed via PyO3. All public functions interoperate with NumPy and can be embedded in XArray / Dask pipelines. Rust kernels release Python's GIL; multi-core parallelism (via Rayon) is leveraged for selected operations (larger temporal aggregations, pairwise distances).

Focus areas:

Spectral & change-detection indices
Temporal statistics & median compositing (1D–4D stacks)
Masking & data quality filtering (value / range / SCL / invalid sentinels)
Pairwise spatial distances (utility layer)
Benchmark harness for reproducible performance measurements

Key Features

Rust-accelerated numerical kernels (float64 internal, stable results)
Automatic dimensional dispatch (1D / 2D for spectral indices, 1D–4D for temporal/masking)
Change detection support (ΔNDVI, ΔNBR)
Flexible masking utilities (exact values, ranges, SCL codes)
Median, mean, sample standard deviation over time axis
Pairwise distance functions (Euclidean, Manhattan, Chebyshev, Minkowski)
Type stubs (__init__.pyi) for IDE / mypy
Benchmark script with optional NumPy baseline comparison
Pure CPU, no external network or storage side-effects in core path

Installation

PyPI (standard)

pip install eo-processor

Optional extras for array ecosystem:

pip install eo-processor[dask]

Using `uv`

uv venv
source .venv/bin/activate
uv pip install eo-processor

From Source

Requirements:

Python 3.8+
Rust toolchain (rustup recommended)
maturin for building the extension module

git clone https://github.com/BnJam/eo-processor.git
cd eo-processor

pip install maturin
maturin develop --release        # build & install in-place
# or wheel:
maturin build --release
pip install target/wheels/*.whl

Quick Start

import numpy as np
from eo_processor import ndvi, ndwi, evi, normalized_difference

nir   = np.array([0.8, 0.7, 0.6])
red   = np.array([0.2, 0.1, 0.3])
blue  = np.array([0.1, 0.05, 0.08])
green = np.array([0.35, 0.42, 0.55])

print(ndvi(nir, red))               # NDVI
print(ndwi(green, nir))             # NDWI
print(evi(nir, red, blue))          # EVI
print(normalized_difference(nir, red))

All inputs may be any numeric NumPy dtype (int/uint/float); internal coercion to float64.

API Summary

Function	Purpose
`normalized_difference(a, b)`	Generic normalized difference `(a - b) / (a + b)` with near-zero denominator safeguard
`ndvi(nir, red)`	Normalized Difference Vegetation Index
`ndwi(green, nir)`	Normalized Difference Water Index
`evi(nir, red, blue)` / `enhanced_vegetation_index(...)`	Enhanced Vegetation Index (G*(NIR - Red)/(NIR + C1Red - C2Blue + L))
`savi(nir, red, L=0.5)`	Soil Adjusted Vegetation Index `(NIR - Red)/(NIR + Red + L) * (1 + L)`
`nbr(nir, swir2)`	Normalized Burn Ratio `(NIR - SWIR2)/(NIR + SWIR2)`
`ndmi(nir, swir1)`	Normalized Difference Moisture Index `(NIR - SWIR1)/(NIR + SWIR1)`
`nbr2(swir1, swir2)`	Normalized Burn Ratio 2 `(SWIR1 - SWIR2)/(SWIR1 + SWIR2)`
`gci(nir, green)`	Green Chlorophyll Index `(NIR / Green) - 1` (division guard)
`delta_ndvi(pre_nir, pre_red, post_nir, post_red)`	Change in NDVI `(NDVI_pre - NDVI_post)`
`delta_nbr(pre_nir, pre_swir2, post_nir, post_swir2)`	Change in NBR `(NBR_pre - NBR_post)`
`median(arr, skip_na=True)`	Temporal median (time axis) with NaN skipping
`composite(arr, method="median")`	Compositing convenience (currently median only)
`temporal_mean(arr, skip_na=True)`	Mean across time axis
`temporal_std(arr, skip_na=True)`	Sample standard deviation (n-1) across time
`moving_average_temporal(arr, window, skip_na=True, mode="same")`	Sliding window mean (same/valid edge modes, NaN skip/propagate)
`moving_average_temporal_stride(arr, window, stride, skip_na=True, mode="same")`	Strided moving average (downsampled temporal smoothing)
`pixelwise_transform(arr, scale=1.0, offset=0.0, clamp_min=None, clamp_max=None)`	Per-pixel linear transform with optional clamping
`euclidean_distance(points_a, points_b)`	Pairwise Euclidean distances
`manhattan_distance(points_a, points_b)`	Pairwise L1 distances
`chebyshev_distance(points_a, points_b)`	Pairwise L∞ distances
`minkowski_distance(points_a, points_b, p)`	Pairwise L^p distances (p ≥ 1)
`mask_vals(arr, values=None, fill_value=None, nan_to=None)`	Mask exact codes, optional fill & NaN normalization
`replace_nans(arr, value)`	Replace all NaNs with `value`
`mask_out_range(arr, min_val=None, max_val=None, fill_value=None)`	Mask values outside `[min, max]`
`mask_in_range(arr, min_val=None, max_val=None, fill_value=None)`	Mask values inside `[min, max]`
`mask_invalid(arr, invalid_values, fill_value=None)`	Mask list of sentinel values (e.g., `0, -9999`)
`mask_scl(scl, keep_codes=None, fill_value=None)`	Mask Sentinel‑2 SCL codes, keeping selected classes

Temporal dimension expectations:

1D: (time,)
2D: (time, band)
3D: (time, y, x)
4D: (time, band, y, x)

Distance functions: input shape (N, D) and (M, D) → output (N, M) (O(N*M) memory/time).

Spectral & Change Detection Indices

All indices auto-dispatch 1D vs 2D arrays (matching shapes required).

NDVI

(NIR - Red) / (NIR + Red) Interpretation (approximate):

< 0: water / snow
0.0–0.2: bare soil / built surfaces
0.2–0.5: sparse to moderate vegetation
0.5: healthy dense vegetation

NDWI

(Green - NIR) / (Green + NIR)

0.3: open water (often 0.4–0.6)
0.0–0.3: moist vegetation / wetlands
< 0.0: dry vegetation / soil

EVI

G * (NIR - Red) / (NIR + C1*Red - C2*Blue + L) (MODIS constants: G=2.5, C1=6.0, C2=7.5, L=1.0) Improves sensitivity over high biomass & reduces soil/atmospheric noise vs NDVI.

SAVI

(NIR - Red) / (NIR + Red + L) * (1 + L) Typical L=0.5. Larger L for sparse vegetation (bright soil), smaller for dense vegetation.

NBR

(NIR - SWIR2) / (NIR + SWIR2) Used for burn severity. Compare pre/post via ΔNBR.

NDMI

(NIR - SWIR1) / (NIR + SWIR1) Moisture / canopy water content indicator.

NBR2

(SWIR1 - SWIR2) / (SWIR1 + SWIR2) Highlights moisture & thermal differences; complementary to NBR/NDMI.

GCI

(NIR / Green) - 1 Chlorophyll proxy; division by near-zero guarded to avoid instability.

Change Detection

ΔNDVI = NDVI_pre - NDVI_post ΔNBR = NBR_pre - NBR_post Positive ΔNDVI: vegetation loss. Positive ΔNBR: burn severity increase.

Masking Utilities

Rust-accelerated preprocessing helpers for quality filtering.

Function	Notes
`mask_vals`	Exact equality masking (codes → `fill_value` or NaN) + optional NaN normalization
`replace_nans`	Force all NaNs to a scalar
`mask_out_range`	Mask outside interval
`mask_in_range`	Mask inside interval
`mask_invalid`	Shorthand for common invalid sentinels
`mask_scl`	Keep only selected Sentinel‑2 SCL classes

Example:

import numpy as np
from eo_processor import mask_vals, replace_nans, mask_out_range, mask_scl

scl = np.array([4,5,6,8,9])  # vegetation, vegetation, water, cloud (med), cloud (high)
clear = mask_scl(scl, keep_codes=[4,5,6])   # -> [4., 5., 6., nan, nan]

ndvi = np.array([-0.3, 0.1, 0.8, 1.2])
valid = mask_out_range(ndvi, min_val=-0.2, max_val=1.0)  # -> [nan,0.1,0.8,nan]

arr = np.array([0, 100, -9999, 50])
clean = mask_vals(arr, values=[0, -9999])  # -> [nan,100.,nan,50.]
filled = replace_nans(clean, -9999.0)      # -> [-9999.,100.,-9999.,50.]

Temporal Statistics & Compositing

Median, mean, and standard deviation across time axis (skip NaNs optional):

import numpy as np
from eo_processor import temporal_mean, temporal_std, median

cube = np.random.rand(12, 256, 256)  # (time, y, x)
mean_img  = temporal_mean(cube)      # (256, 256)
std_img   = temporal_std(cube)       # (256, 256)
median_img = median(cube)

composite(cube, method="median") currently routes to median.

Trend Analysis

eo-processor provides a simple trend analysis UDF to detect breaks in a time series. This implementation uses a recursive approach, which is more performant than the previous iterative version.

Function	Purpose
`trend_analysis(y, threshold)`	Detects breaks in a time series by recursively fitting linear models.

Example:

import numpy as np
from eo_processor._core import trend_analysis

# Generate some sample time-series data with a break
y = np.concatenate([
    np.linspace(0, 10, 50),
    np.linspace(10, 0, 50)
]) + np.random.normal(0, 0.5, 100)

# Run the trend analysis
segments = trend_analysis(y.tolist(), threshold=5.0)

# Print the results
print("Trend Analysis Results:")
for segment in segments:
    print(
        f"  Start: {segment.start_index}, "
        f"End: {segment.end_index}, "
        f"Slope: {segment.slope:.4f}, "
        f"Intercept: {segment.intercept:.4f}"
    )

Advanced Temporal & Pixelwise Processing

High-performance smoothing and per-pixel transforms for deep temporal stacks and large spatial tiles.

Formulas:

Moving average: MA_t = mean(x_{start..end}) where [start, end] is the window centered (same) or fixed (valid) around t.
Strided moving average: sample MA_{k*stride} for integer k to downsample temporal resolution.
Pixelwise transform: y = clamp(scale * x + offset) (clamping optional).

Example (moving average with edge handling and NaN skipping):

from eo_processor import moving_average_temporal
import numpy as np

series = np.array([1.0, 2.0, 3.0, 4.0])
ma_same  = moving_average_temporal(series, window=3, mode="same")   # length preserved
ma_valid = moving_average_temporal(series, window=3, mode="valid")  # only full windows

3D temporal cube smoothing (deep stack):

cube = np.random.rand(48, 1024, 1024)
smoothed = moving_average_temporal(cube, window=5, mode="same", skip_na=True)

Strided downsampling (reduce temporal resolution):

from eo_processor import moving_average_temporal_stride
downsampled = moving_average_temporal_stride(cube, window=5, stride=4, mode="same")
print(downsampled.shape)  # (ceil(48/4), 1024, 1024)

Pixelwise transform (scale + offset + clamping):

from eo_processor import pixelwise_transform
arr = np.random.rand(2048, 2048)
stretched = pixelwise_transform(arr, scale=1.2, offset=-0.1, clamp_min=0.0, clamp_max=1.0)

Chaining operations (temporal smoothing then per-pixel adjustment):

ma = moving_average_temporal(cube, window=7)
enhanced = pixelwise_transform(ma, scale=1.1, offset=0.05, clamp_min=0.0, clamp_max=1.0)

Performance Notes:

Prefix-sum approach makes moving average O(T) per pixel independent of window size.
Parallelization occurs over spatial/band pixels for 3D/4D arrays.
Strided variant reduces output size for downstream tasks (e.g., model inference, feature extraction).
Pixelwise transforms are single-pass and can be fused with other operations in custom workflows.

Use Cases:

Smoothing noisy temporal reflectance or index stacks prior to trend analysis.
Reducing temporal dimension before ML model training (stride-based smoothing).
Intensity scaling & clamping for visualization or input normalization.

Spatial Distances

Pairwise distance matrices:

import numpy as np
from eo_processor import euclidean_distance, manhattan_distance

A = np.random.rand(100, 8)  # (N, D)
B = np.random.rand(250, 8)  # (M, D)

dist_e = euclidean_distance(A, B)    # (100, 250)
dist_l1 = manhattan_distance(A, B)

For large N*M consider spatial indexing or chunking (not implemented).

XArray / Dask Integration

import dask.array as da
import xarray as xr
from eo_processor import ndvi

nir_dask  = da.random.random((5000, 5000), chunks=(500, 500))
red_dask  = da.random.random((5000, 5000), chunks=(500, 500))

nir_xr = xr.DataArray(nir_dask, dims=["y", "x"])
red_xr = xr.DataArray(red_dask, dims=["y", "x"])

ndvi_xr = xr.apply_ufunc(
    ndvi,
    nir_xr,
    red_xr,
    dask="parallelized",
    output_dtypes=[float],
)

result = ndvi_xr.compute()

CLI Usage

Console script exposed as eo-processor (installed via PyPI):

# Single index
eo-processor --index ndvi --nir nir.npy --red red.npy --out ndvi.npy

# Multiple indices (provide necessary bands)
eo-processor --index ndvi savi ndmi nbr --nir nir.npy --red red.npy --swir1 swir1.npy --swir2 swir2.npy --out-dir outputs/

# Change detection (ΔNBR)
eo-processor --index delta_nbr \
  --pre-nir pre/nir.npy --pre-swir2 pre/swir2.npy \
  --post-nir post/nir.npy --post-swir2 post/swir2.npy \
  --out outputs/delta_nbr.npy

# List supported indices
eo-processor --list

# Apply cloud mask (0=cloud, 1=clear)
eo-processor --index ndvi --nir nir.npy --red red.npy --mask cloudmask.npy --out ndvi_masked.npy

# PNG preview (requires optional Pillow)
eo-processor --index ndvi --nir nir.npy --red red.npy --out ndvi.npy --png-preview ndvi.png

Selected flags:

--savi-l soil brightness factor for SAVI.
--clamp MIN MAX output range clamping.
--allow-missing skip indices lacking required bands instead of error.

Performance

Example benchmark (NDVI on a large array):

import numpy as np, time
from eo_processor import ndvi

nir = np.random.rand(5000, 5000)
red = np.random.rand(5000, 5000)

t0 = time.time()
rust_out = ndvi(nir, red)
t_rust = time.time() - t0

t0 = time.time()
numpy_out = (nir - red) / (nir + red)
t_numpy = time.time() - t0

print(f"Rust: {t_rust:.3f}s  NumPy: {t_numpy:.3f}s  Speedup: {t_numpy/t_rust:.2f}x")

Speedups depend on array shape, memory bandwidth, and CPU cores. Use the benchmark harness for systematic comparison.

Benchmark Harness

scripts/benchmark.py provides grouped tests:

# Spectral functions (e.g., NDVI, NDWI, EVI, SAVI, NBR, NDMI, NBR2, GCI)
python scripts/benchmark.py --group spectral --height 2048 --width 2048

# Temporal (compare Rust vs NumPy)
python scripts/benchmark.py --group temporal --time 24 --height 1024 --width 1024 --compare-numpy

# Distances
python scripts/benchmark.py --group distances --points-a 2000 --points-b 2000 --point-dim 8

# All groups; write reports
python scripts/benchmark.py --group all --compare-numpy --json-out bench.json --md-out bench.md

Key options:

--functions <list> override group selection.
--compare-numpy baseline timings (speedup > 1.0 ⇒ Rust faster).
--minkowski-p <p> set order (p ≥ 1).
--loops, --warmups repetition control.
--json-out, --md-out artifact outputs.

Test Coverage

Regenerate badge after modifying logic/tests:

tox -e coverage
python scripts/generate_coverage_badge.py coverage.xml coverage-badge.svg

Ensure the badge is committed if coverage changes materially.

Contributing

Follow repository guidelines (AGENTS.md, copilot instructions). Checklist before proposing a PR:

Implement Rust function(s) (no unsafe)
Register via wrap_pyfunction! in src/lib.rs
Export in python/eo_processor/__init__.py
Add type stubs in python/eo_processor/__init__.pyi
Add tests (tests/test_<feature>.py) including edge cases & NaN handling
Update README (API Summary, examples, formulas)
Run:
- cargo fmt
- cargo clippy -- -D warnings
- cargo test (if Rust tests)
- pytest
- tox -e coverage
- ruff and mypy (if configured)
Update version if public API added (minor bump)
Regenerate coverage badge if changed
Confirm no secrets / large binaries staged

Commit message pattern:

<type>(scope): concise summary

Optional rationale, benchmarks, references

Types: feat, fix, perf, docs, test, chore, build, ci

Example:

feat(indices): add Green Chlorophyll Index (GCI)

Implements 1D/2D dispatch, tests, docs, benchmark entry.

Semantic Versioning

Patch: Internal fixes, refactors, docs only
Minor: New functions (backward-compatible)
Major: Breaking changes (signature changes, removals)

Roadmap (Indicative)

Additional spectral indices (future: NBR derivatives, custom moisture composites)
Sliding window / neighborhood statistics (mean, variance)
Optional multithread strategies for very large temporal cubes
Expanded masking (boolean predicate composition)
Extended change metrics (ΔNDMI, fractional vegetation cover)

(Items requiring strategic design will request human review before implementation.)

Scientific Citation

@software{eo_processor,
  title   = {eo-processor: High-performance Rust UDFs for Earth Observation},
  author  = {Ben Smith},
  year    = {2025},
  url     = {https://github.com/BnJam/eo-processor}
}

License

MIT License – see LICENSE.

Disclaimer

Core library focuses on computational primitives. It does NOT perform:

Sensor-specific radiometric calibration
Atmospheric correction
CRS reprojection / spatial indexing
Cloud/shadow detection algorithms beyond simple masking
Data acquisition / I/O orchestration

Integrate with domain tools (rasterio, xarray, dask, geopandas) for full pipelines.

Support

Open issues for bugs or enhancements. Provide:

Reproducible snippet
Input shapes / dtypes
Expected vs actual output
Benchmark data (if performance-related)

Acknowledgements

Built with PyO3, NumPy, ndarray, and Rayon. Thanks to the scientific EO community for standardized index formulations.

Enjoy fast, reproducible Earth Observation processing!

Project details

These details have not been verified by PyPI

Release history Release notifications | RSS feed

0.19.0

Jan 7, 2026

0.18.0

Dec 15, 2025

0.17.0

Dec 14, 2025

0.16.0

Dec 13, 2025

0.15.0

Dec 9, 2025

0.14.1

Dec 6, 2025

0.13.3

Dec 6, 2025

0.13.2

Nov 24, 2025

0.12.3

Nov 24, 2025

0.11.0

Nov 23, 2025

This version

0.10.1

Nov 21, 2025

0.10.0

Nov 21, 2025

0.9.0

Nov 18, 2025

0.8.0

Nov 16, 2025

0.7.2

Nov 15, 2025

0.7.1

Nov 15, 2025

0.7.0

Nov 14, 2025

0.6.0

Nov 12, 2025

0.5.0

Nov 12, 2025

0.4.0

Nov 11, 2025

0.3.0

Nov 10, 2025

0.2.0

Nov 10, 2025

0.1.0

Nov 9, 2025

0.0.0

Nov 9, 2025

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

eo_processor-0.10.1.tar.gz (337.1 kB view details)

Uploaded Nov 21, 2025 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

eo_processor-0.10.1-cp312-cp312-manylinux_2_34_x86_64.whl (483.1 kB view details)

Uploaded Nov 21, 2025 CPython 3.12manylinux: glibc 2.34+ x86-64

File details

Details for the file eo_processor-0.10.1.tar.gz.

File metadata

Download URL: eo_processor-0.10.1.tar.gz
Upload date: Nov 21, 2025
Size: 337.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: maturin/1.10.2

File hashes

Hashes for eo_processor-0.10.1.tar.gz
Algorithm	Hash digest
SHA256	`94db997e9fab4232fe3e246a5dd61004863b581995d288c5730a80f91f441b8f`
MD5	`157d06b1803f758dda11f4e880efd047`
BLAKE2b-256	`57eb766f1f14b4d64f100f98e07f18444bad1f5e0afaf3112ada873739818b08`

See more details on using hashes here.

File details

Details for the file eo_processor-0.10.1-cp312-cp312-manylinux_2_34_x86_64.whl.

File metadata

Download URL: eo_processor-0.10.1-cp312-cp312-manylinux_2_34_x86_64.whl
Upload date: Nov 21, 2025
Size: 483.1 kB
Tags: CPython 3.12, manylinux: glibc 2.34+ x86-64
Uploaded using Trusted Publishing? No
Uploaded via: maturin/1.10.2

File hashes

Hashes for eo_processor-0.10.1-cp312-cp312-manylinux_2_34_x86_64.whl
Algorithm	Hash digest
SHA256	`18d08ffedc47ccc4952d129af97c93e8f37cd9cf981b7eb1e5009056d6ce2d6c`
MD5	`f2deab9f2c76e72c1216fb781e23e1a9`
BLAKE2b-256	`57e18ac32b11bfceb611ee5f4d61d22193be51d803e0293971ee13f776e815e2`

See more details on using hashes here.

eo-processor 0.10.1

Navigation

Verified details

Maintainers

Unverified details

Meta

Classifiers

Project description

eo-processor

Documentation

Overview

Key Features

Installation

PyPI (standard)

Using uv

From Source

Quick Start

API Summary

Spectral & Change Detection Indices

NDVI

NDWI

EVI

SAVI

NBR

NDMI

NBR2

GCI

Change Detection

Masking Utilities

Temporal Statistics & Compositing

Trend Analysis

Advanced Temporal & Pixelwise Processing

Spatial Distances

XArray / Dask Integration

CLI Usage

Performance

Benchmark Harness

Test Coverage

Contributing

Semantic Versioning

Roadmap (Indicative)

Scientific Citation

License

Disclaimer

Support

Acknowledgements

Project details

Verified details

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes

Using `uv`