Open-source provenance SDK and specification for verifiable EO and climate data workflows
trazaeo V1
This repository contains the trazaeo Rust crate and Python bindings for
verifiable provenance in Earth observation and climate data workflows. The
project includes hashing, provenance records, proof logging adaptors, and
examples for NC to Zarr or Icechunk verification flows.
The V1 protocol covers three primary use cases:
- source-device capture, where a sensor or edge device signs captured bytes
- transport receipt, where a ground station or relay attests to received bytes or helper processing
- dataset transforms and publication, where one or more inputs are turned into derived artifacts and checkpointed for audit
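As a rough illustration of how the three use cases relate, the sketch below models each one as a record that carries a digest of its payload and references to its inputs. The `RecordKind` and `ProvenanceRecord` names and their fields are hypothetical and are not the V1 envelope schema; SHA-256 from the Python standard library stands in for BLAKE3.

```python
import hashlib
import json
from dataclasses import dataclass, field
from enum import Enum


class RecordKind(Enum):
    # Hypothetical labels mirroring the three V1 use cases.
    SOURCE_CAPTURE = "source-capture"
    TRANSPORT_RECEIPT = "transport-receipt"
    TRANSFORM = "transform"


@dataclass
class ProvenanceRecord:
    # Hypothetical record shape, not the trazaeo envelope schema.
    kind: RecordKind
    actor_id: str
    output_ref: str
    payload_digest: str
    input_refs: list = field(default_factory=list)

    def canonical_bytes(self) -> bytes:
        # Sorted keys and fixed separators give a stable byte encoding to sign or hash.
        body = {
            "kind": self.kind.value,
            "actor_id": self.actor_id,
            "input_refs": self.input_refs,
            "output_ref": self.output_ref,
            "payload_digest": self.payload_digest,
        }
        return json.dumps(body, sort_keys=True, separators=(",", ":")).encode()


def digest(data: bytes) -> str:
    # Stand-in digest: SHA-256 instead of BLAKE3.
    return hashlib.sha256(data).hexdigest()


capture = ProvenanceRecord(
    RecordKind.SOURCE_CAPTURE, "sensor-1", "obj://raw/1", digest(b"telemetry")
)
receipt = ProvenanceRecord(
    RecordKind.TRANSPORT_RECEIPT, "ground-station-1", "obj://relay/1",
    digest(b"downlink-frame"), input_refs=["uplink://pass-1"],
)
# A transform names the artifacts it consumed, so an auditor can walk back to the sources.
transform = ProvenanceRecord(
    RecordKind.TRANSFORM, "pipeline-1", "obj://derived/1",
    digest(b"derived-artifact"), input_refs=[capture.output_ref, receipt.output_ref],
)
print(transform.canonical_bytes().decode())
```

The point of the sketch is only the chaining: each derived record refers to earlier outputs, which is what makes a later audit possible.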
V1 envelope schemas live in trazaeo/schemas/.
The current release is V1: a stable core verification model with optional adaptor-backed assurance for storage binding and proof logging.
Repository contracts:
- Compatibility matrix: docs/contracts/compatibility.md
- Quality gates: docs/contracts/quality-gates.md
- Python example boundaries: docs/contracts/architecture.md
- Merkle/Bao replacement proposal: docs/proposals/merkle-bao-replacement.md
- Roadmap: ROADMAP.md
- Documentation site source: website/ (Vocs)
Building
You can build the crate with the standard Rust toolchain. From the repository root run:
cargo build --release
This will produce the trazaeo library in target/release.
To run the unit tests execute:
cargo test
For full local quality gates (lint + type checks + tests), run from repo root:
make ci
To run coverage locally (Rust LCOV + Python coverage XML), run:
make coverage
To run Rust fuzz targets locally, install cargo-fuzz and run a target from
trazaeo/fuzz/:
cargo install cargo-fuzz
cargo fuzz run decode_range_proof_package --manifest-path trazaeo/fuzz/Cargo.toml
To install local commit-time checks (pre-commit parity with CI):
make precommit-install
The exact gate policy is documented in docs/contracts/quality-gates.md.
To run the streaming BLAKE3 performance harness:
cargo run --example perf_hashing -- <path-to-file> [chunk_size_bytes] [threads]
Reliability examples (source, transport, and transform to reward readiness)
These examples support reliability validation for the V1 flow described in TRAZAEO_V1_SPEC.md sections 15, 8, and 12.
Rust retry + idempotency demo:
cargo run --example reliability_demo
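For readers who want the idea behind a retry plus idempotency demo without reading the Rust example, here is a minimal Python sketch of the general pattern. It is not a port of reliability_demo: the `FlakyProofLog` class and `submit_with_retry` function are hypothetical, and the failure being simulated is a lost acknowledgement after a successful commit.

```python
import time


class FlakyProofLog:
    """Hypothetical in-memory log that commits an entry but may lose the acknowledgement."""

    def __init__(self, lost_acks: int) -> None:
        self.lost_acks = lost_acks
        self.entries = {}

    def submit(self, idempotency_key: str, payload: bytes) -> str:
        # A repeated key is recognised and not written a second time.
        if idempotency_key in self.entries:
            return "duplicate"
        self.entries[idempotency_key] = payload
        if self.lost_acks > 0:
            self.lost_acks -= 1
            raise ConnectionError("acknowledgement lost")
        return "committed"


def submit_with_retry(log, key, payload, attempts=5, base_delay=0.01):
    # Retry transient failures with exponential backoff; re-raise after the last attempt.
    for attempt in range(attempts):
        try:
            return log.submit(key, payload)
        except ConnectionError:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)


log = FlakyProofLog(lost_acks=1)
result = submit_with_retry(log, "checkpoint-1", b"content-root-bytes")
print(result, len(log.entries))
```

Because the first attempt already committed the entry, the retry is answered as a duplicate and the log still holds exactly one entry. Without the idempotency key, the same retry would have written the checkpoint twice.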
Python file-root reliability check (after building Python bindings):
python -m trazaeo_workflows reliability-check <path-to-file> --chunk 1048576 --threads 4
Python netCDF content-root check:
python -m trazaeo_workflows hash-netcdf <path-to-file> --chunk 4096 --threads 4
Python source-device capture demo:
python -m trazaeo_workflows capture-source \
--subject-id capture-source-1 \
--capture-actor-id sensor-1 \
--capture-system-id sensor-pipeline-1 \
--output-ref obj://raw/1 \
--segment-id frame-1 \
--payload-text telemetry
Python transport-receipt capture demo:
python -m trazaeo_workflows capture-transport \
--subject-id capture-transport-1 \
--capture-actor-id ground-station-1 \
--capture-system-id rx-1 \
--input-ref uplink://pass-1 \
--output-ref obj://relay/1 \
--segment-id seg-transport-1 \
--payload-text downlink-frame
Python publish+verify envelope demo:
TRUST_POLICY_JSON='{"allowed_keys":["18e6a97db14c236f52bb13ee7c843ee077ae77c43a37d2f8c548abd79036e599"],"revoked_keys":[],"audit_log":[{"action":"allow","key_id":"18e6a97db14c236f52bb13ee7c843ee077ae77c43a37d2f8c548abd79036e599","reason":"local demo trust policy","effective_at":"2026-01-01T00:00:00Z"}]}'
python -m trazaeo_workflows publish-demo --mode sampled --trust-policy-json "$TRUST_POLICY_JSON"
publish-demo prints one JSON object with publish_input, publish_envelope,
and verification_report.
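Hand-writing the trust policy as a shell string is easy to get wrong, so it can help to build it in Python and serialize it. The sketch below reuses the field names from the demo policy (`allowed_keys`, `revoked_keys`, `audit_log`). The `is_key_trusted` helper is hypothetical, and its rule that a revoked key overrides an allowed one is an assumption, not documented trazaeo behavior.

```python
import json

KEY_ID = "18e6a97db14c236f52bb13ee7c843ee077ae77c43a37d2f8c548abd79036e599"

policy = {
    "allowed_keys": [KEY_ID],
    "revoked_keys": [],
    "audit_log": [
        {
            "action": "allow",
            "key_id": KEY_ID,
            "reason": "local demo trust policy",
            "effective_at": "2026-01-01T00:00:00Z",
        }
    ],
}

# Compact JSON suitable for passing to --trust-policy-json.
TRUST_POLICY_JSON = json.dumps(policy, separators=(",", ":"))


def is_key_trusted(policy: dict, key_id: str) -> bool:
    # Assumption: a key must be allowed and must not be revoked.
    return key_id in policy["allowed_keys"] and key_id not in policy["revoked_keys"]


print(TRUST_POLICY_JSON)
print(is_key_trusted(policy, KEY_ID))
```

The printed JSON can be exported as TRUST_POLICY_JSON in the shell before running the demos.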
Python adaptor demo with S3-style storage + public-RPC Solana proof log:
python -m trazaeo_workflows publish-solana --mode sampled --trust-policy-json "$TRUST_POLICY_JSON"
By default the demo uses https://api.devnet.solana.com with an ephemeral devnet
signer for local testing. For solana-mainnet, pass a funded Solana keypair file:
python -m trazaeo_workflows publish-solana \
--cluster solana-mainnet \
--rpc-url https://api.mainnet-beta.solana.com \
--proof-log-keypair-path ~/.config/solana/id.json \
--trust-policy-json "$TRUST_POLICY_JSON"
The memo-backed public-RPC proof-log adaptor verifies the committed transaction
and signer, but it does not expose a chain root, so the CLI reports chain_root: null for that adaptor.
Python NC collection to Zarr/Icechunk conversion + verification demo:
python -m trazaeo_workflows icechunk \
path/to/a.nc path/to/b.nc \
--zarr-store outputs/sst.zarr \
--dataset-id sst \
--dataset-version v1 \
--trust-policy-json "$TRUST_POLICY_JSON"
Jupyter notebook walkthrough for pre/post conversion visualization and verification:
python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install -e '.[notebooks]'
jupyter lab examples/python_netcdf/notebooks/nc_to_zarr_provenance_walkthrough.ipynb
This notebook install is self-contained for the walkthrough: it includes the
example runtime dependencies, dask[array], and ipykernel.
Documentation site
Install docs dependencies and run local dev mode:
cd website
npm install
npm run dev
Build the static docs site:
cd website
npm run build
Python bindings
Python bindings are provided via PyO3. The easiest way to build them is with
maturin:
python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install maturin
cd trazaeo
maturin develop --release --features python-extension,python-proof-log-rpc
After building you can import the trazaeo module from Python.
Python dependencies for examples/tests
Install optional Python dependencies for netCDF examples and test tooling:
python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install -e '.[python-examples,test]'
Example
Below is a minimal Rust example that hashes a file into a content descriptor:
use trazaeo::hashing::hash_file_content_descriptor;

fn main() {
    let descriptor = hash_file_content_descriptor(
        "data.bin",
        "artifact-1",
        1024,
        4,
        "application/octet-stream",
        "2026-01-01T00:00:00Z",
    )
    .expect("content descriptor");
    println!("content root: {}", descriptor.content_root_hash);
}
In Python, you can call the provided hash helpers after installing the editable package in your virtual environment. Both single-threaded and multithreaded variants are exposed:
>>> from trazaeo import blake3_hash
>>> blake3_hash(b"hello world")
>>> from trazaeo import blake3_hash_mt
>>> blake3_hash_mt(b"hello world", 4)
Optional Bao range proofs
trazaeo can generate Bao outboard data and byte-range proof packages internally,
so downstream apps do not need to bolt this on themselves.
This is an integrity feature, not a secrecy feature. In the current V1 model, Bao verifies byte ranges against the BLAKE3 file hash recorded in the content descriptor.
Bao support is optional and gated behind the Rust feature bao-range-proofs.
Default builds do not expose the Bao helpers.
>>> from trazaeo import bao_outboard_json, bao_range_proof_package_json
>>> outboard_json = bao_outboard_json("example.nc", 4096, 4, None)
>>> proof_json = bao_range_proof_package_json("example.nc", 0, 4096, 4096, 4)
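To show what verifying a byte range against a root hash means, the sketch below builds a small binary Merkle tree over fixed-size chunks and checks one chunk with a sibling path. This is a simplified stand-in, not Bao: it uses SHA-256 from the standard library instead of BLAKE3, and its tree layout and proof encoding differ from Bao's. All function names are hypothetical.

```python
import hashlib


def _h(tag: bytes, data: bytes) -> bytes:
    # Distinct tags keep leaf hashes and parent hashes in separate domains.
    return hashlib.sha256(tag + data).digest()


def leaf_hashes(data: bytes, chunk_size: int) -> list:
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)] or [b""]
    return [_h(b"\x00", chunk) for chunk in chunks]


def build_levels(leaves: list) -> list:
    # levels[0] holds the leaves; the last level holds only the root.
    levels = [leaves]
    while len(levels[-1]) > 1:
        cur, nxt = levels[-1], []
        for i in range(0, len(cur), 2):
            if i + 1 < len(cur):
                nxt.append(_h(b"\x01", cur[i] + cur[i + 1]))
            else:
                nxt.append(cur[i])  # an unpaired node is promoted unchanged
        levels.append(nxt)
    return levels


def prove(levels: list, index: int) -> list:
    # Collect (sibling_is_left, sibling_hash) pairs from the leaf up to the root.
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling < len(level):
            proof.append((sibling < index, level[sibling]))
        index //= 2
    return proof


def verify(root: bytes, chunk: bytes, proof: list) -> bool:
    node = _h(b"\x00", chunk)
    for sibling_is_left, sibling in proof:
        pair = sibling + node if sibling_is_left else node + sibling
        node = _h(b"\x01", pair)
    return node == root


data = bytes(range(256)) * 40          # 10240 bytes -> three 4096-byte chunks
levels = build_levels(leaf_hashes(data, 4096))
root = levels[-1][0]
proof = prove(levels, 2)
print(verify(root, data[8192:], proof))
```

A verifier that holds only the root and the proof can check the third chunk without the rest of the file, which is the property a range proof provides. As with Bao, this establishes integrity only and does nothing for secrecy.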
Hashing a netCDF file with zero copy
The crate provides a helper to hash a file directly into a content descriptor using memory mapping. From Python you can compute the content root of a netCDF file as follows:
>>> from trazaeo import blake3_content_root
>>> root = blake3_content_root("example.nc", 4096, 4)
>>> print(root.hex())
blake3_content_root reads the input using a zero-copy memory map to minimize
RAM usage.
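The memory-mapping idea can be sketched in pure Python as follows. This is an illustration of the technique, not the crate's implementation: `content_root_mmap` is a hypothetical name, BLAKE2b from the standard library stands in for BLAKE3, and the digest is a plain streaming hash rather than a chunked content root, so `chunk_size` only controls how much of the mapping is fed to the hasher at a time.

```python
import hashlib
import mmap
import os
import tempfile


def content_root_mmap(path: str, chunk_size: int) -> bytes:
    # Stand-in hasher: BLAKE2b with a 32-byte digest instead of BLAKE3.
    hasher = hashlib.blake2b(digest_size=32)
    with open(path, "rb") as f:
        if os.fstat(f.fileno()).st_size == 0:
            return hasher.digest()  # an empty file cannot be memory-mapped
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            with memoryview(mapped) as view:
                for offset in range(0, len(view), chunk_size):
                    # Slicing a memoryview does not copy the underlying bytes.
                    with view[offset:offset + chunk_size] as chunk:
                        hasher.update(chunk)
    return hasher.digest()


payload = b"netcdf-like bytes " * 1000
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "example.bin")
    with open(sample_path, "wb") as out:
        out.write(payload)
    root = content_root_mmap(sample_path, 4096)

expected = hashlib.blake2b(payload, digest_size=32).digest()
print(root.hex())
print(root == expected)
```

The operating system pages the file in on demand, so the process does not need a second in-memory copy of the whole file while hashing.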
File details
Details for the file trazaeo-0.5.0.tar.gz.
File metadata
- Download URL: trazaeo-0.5.0.tar.gz
- Upload date:
- Size: 95.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/1.12.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | facb3cac003560fe550e65ee715537c712df6454a31ebc6f8a24cf804e2ee0c0 |
| MD5 | e59a3378ef66a0d7933fa5bfc4216b5f |
| BLAKE2b-256 | 7abc2525a019a7a4c25a2c0f69dfc28614347aa81776f836de4fc107634fab0b |
File details
Details for the file trazaeo-0.5.0-cp314-cp314-macosx_11_0_arm64.whl.
File metadata
- Download URL: trazaeo-0.5.0-cp314-cp314-macosx_11_0_arm64.whl
- Upload date:
- Size: 2.7 MB
- Tags: CPython 3.14, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/1.12.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5ef0f1c295ee7592337746f5d81ae71f73b1284c169bb888906da2ae9d510dc6 |
| MD5 | 7b7c6e74fb8f1d3494be16544fbb8d81 |
| BLAKE2b-256 | 52949c0a84cf96f49d35e50016e456d8458806369e9d232768a3789dc46fef10 |