ar.io MLflow plugin — verifiable provenance for the ML lifecycle
ar-io-mlflow
Verifiable provenance for the MLflow lifecycle — training, registration, promotion, inference. Signed cryptographic proofs are anchored to ar.io, so an auditor can verify a model or decision long after your MLflow server is gone.
Status. Alpha. The cryptography, packaging, and verification flow are stable; default behaviors prioritize frictionless evaluation over production hardening. See docs/plugin-production.md for deployment guidance and CHANGELOG.md for what's shipped.
Install
# From source — a 0.1.0 release is also published to PyPI as ar-io-mlflow
git clone https://github.com/ar-io/ar-io-mlflow.git
cd ar-io-mlflow
pip install -e .
Python 3.10+. Pulls in MLflow, PyNaCl, the ar.io Turbo SDK, and cryptography.
MLflow version compatibility
Tested against MLflow 2.14 through 3.x. The plugin's prediction-side
verify_source_of_truth reads trace tags directly via the lighter
_tracing_client.get_trace_info API, sidestepping MLflow 3.x's stricter
mlflow.artifactLocation requirement on client.get_trace(). Training,
registration, and prediction verification all work on either major version.
Quickstart
import mlflow
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris

import ario_mlflow

# Point MLflow at a tracking store. Skip if MLFLOW_TRACKING_URI is
# already set in your env, or if you're happy with the cwd's ./mlruns.
mlflow.set_tracking_uri("file:///tmp/mlruns")

X, y = load_iris(return_X_y=True)
with mlflow.start_run():
    model = LogisticRegression(max_iter=200).fit(X, y)
    mlflow.log_metric("accuracy", model.score(X, y))
    mlflow.sklearn.log_model(model, name="model")

    # Signs a proof, hashes the logged artifacts, writes ario.* tags,
    # and uploads ~500 bytes to Arweave via Turbo (free for small payloads).
    # allow_empty_dataset_inputs=True opts out of dataset anchoring; see
    # "Dataset anchoring" below for the recommended pattern.
    result = ario_mlflow.anchor(allow_empty_dataset_inputs=True)
    print(result["tags"]["ario.training_tx"])
No wallet configured? The plugin auto-generates one on first run and persists it
to ~/.ario-mlflow/wallet.json so your signing address stays stable across
sessions. Set ARIO_MLFLOW_ARWEAVE_WALLET=/path/to/wallet.json to use your own.
The auto-generated wallet starts unfunded — that's fine for typical usage
because Turbo's free tier covers small uploads (see "Wallet & cost" below).
A full runnable example lives in examples/sklearn-quickstart/.
The three integration points
1. ario_mlflow.anchor() — training provenance
Call inside an active mlflow.start_run() after logging your model. The plugin
auto-resolves the logged model's artifact_path from MLflow's log-model history,
so you rarely need to pass it explicitly.
Returns a dict with envelope, payload, payload_bytes, payload_hash,
previous_hash, anchor_result, tags, artifact_path, artifact_status
("hashed" / "no_artifacts" / "hash_failed"), and artifact_error.
Failure modes. anchor() is synchronous and runs to completion before the with block exits.
- Arweave upload fails (gateway down, network): the envelope is still signed locally, ario.verify_status is set to signed, and ario.training_tx is absent. Your MLflow run still succeeds. Re-run later to retry. If you pass an explicit arweave= instance, its last_error attribute carries the cause for inspection.
- Artifact hashing fails (artifacts not yet logged, store unreachable): raises ario_mlflow.anchoring.ArtifactAccessError. Wrap the call if you want to log-and-continue.
- Caller-supplied wallet missing or malformed (when constructing your own ArweaveAnchor(wallet_path=...) and passing it as arweave=): the constructor raises ario_mlflow.WalletLoadError — operator intent must not be silently overridden by an auto-generated wallet under a different on-chain identity. Pass wallet_path=None (or omit the arg) to use the auto-generated default.
- No active run: raises RuntimeError. The function requires an active mlflow.start_run() block.
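The upload-failed outcome above can be detected from the returned dict. A minimal sketch, using a hand-built sample of the documented return shape (no plugin import needed; the values are placeholders, not real output):

```python
# Hand-built sample of anchor()'s documented return shape.
result = {
    "artifact_status": "hashed",               # "hashed" / "no_artifacts" / "hash_failed"
    "tags": {"ario.verify_status": "signed"},  # no "ario.training_tx": upload failed
}

def anchor_outcome(result):
    """Classify an anchor() result per the failure modes above."""
    if "ario.training_tx" in result["tags"]:
        return "anchored"      # envelope is on Arweave
    if result["tags"].get("ario.verify_status") == "signed":
        return "signed-only"   # signed locally; re-run anchor() later to retry
    return "unknown"

print(anchor_outcome(result))  # signed-only
```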
2. ario_mlflow.ArioMlflowClient — registration + promotion
A drop-in replacement for mlflow.tracking.MlflowClient. Registration and stage
promotions are anchored automatically in a background thread. Query the outcome
via the client:
from ario_mlflow import ArioMlflowClient
client = ArioMlflowClient()
mv = client.create_model_version("credit-scorer", "runs:/<run_id>/model")
# Block until the async anchor finishes (optional):
client.wait_for_anchor("registration", "credit-scorer", mv.version, timeout=30)
status = client.anchor_status("registration", "credit-scorer", mv.version)
# {"status": "anchored", "tx_id": "...", "error": None, "done": True}
Failure modes. Registration and promotion both return their MLflow ModelVersion immediately; anchoring runs in a daemon thread.
- The MLflow operation always succeeds independently — anchoring failures never break create_model_version() or transition_model_version_stage().
- anchor_status() returns {"status": ...} where status is one of anchoring (in flight), anchored (Arweave upload succeeded), signed (envelope signed but Arweave upload failed), failed (anchoring crashed — see error), or unknown (no anchor was ever queued for this key).
- wait_for_anchor() returns False on timeout.
- Process exit before the daemon completes is fine — the daemon is non-blocking by design.
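As a sketch of how a deployment script might act on that dict, hand-built here with the documented status values (no client needed; "signed" and "failed" are the terminal states worth re-anchoring):

```python
# Sample anchor_status() dict per the shape documented above.
status = {"status": "signed", "tx_id": None, "error": None, "done": True}

RETRYABLE = {"signed", "failed"}   # terminal, but the proof never reached Arweave

def needs_retry(status):
    """True when anchoring finished without landing a TX on Arweave."""
    return status["done"] and status["status"] in RETRYABLE

print(needs_retry(status))  # True
```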
3. ario_mlflow.VerifiedModel — inference
Wraps a registered model with an integrity check that runs before the underlying pyfunc model is loaded (so a tampered artifact never gets a chance to execute user code):
from ario_mlflow import VerifiedModel
vm = VerifiedModel("models:/credit-scorer/1") # raises IntegrityError on hash mismatch
# Features, in order: annual_income, credit_utilization, debt_to_income_ratio,
# months_employed, credit_score.
result = vm.predict([78000, 0.18, 0.22, 72, 745])
print(result.decision_id, result.proof_status) # "anchoring" → "anchored"
# Wait for the background anchor if you want the TX synchronously:
result.wait_for_anchor(timeout=10)
print(result.tx_id, result.anchor_error)
Failure modes.
- Tampered model artifact: VerifiedModel(model_uri) raises ario_mlflow.IntegrityError before the underlying pyfunc model is loaded, so a swapped model never gets the chance to execute user code. Catch this exception to alert your security operations rather than silently fail open.
- predict() always returns the model's output even if anchoring later fails. Inspect result.proof_status: anchoring (in flight), anchored (Arweave upload succeeded), failed (see result.anchor_error), or disabled (no wallet / Turbo unavailable).
- No registered model TX yet: predictions chain to the model version's ario.registration_tx. If ArioMlflowClient's registration daemon hasn't finished, the first few predictions chain to GENESIS (the registration TX is read once at model init and never re-read on per-prediction calls — this avoids races).
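A monitoring hook over those proof_status values might look like the following. The result object here is a hypothetical stand-in for what predict() returns, built with SimpleNamespace so the sketch runs without the plugin:

```python
from types import SimpleNamespace

# Hypothetical stand-in for VerifiedModel.predict()'s result object.
result = SimpleNamespace(proof_status="failed", anchor_error="gateway down")

def proof_alert(result):
    """Return an alert string for failed anchors, else None."""
    if result.proof_status == "failed":
        return f"anchor failed: {result.anchor_error}"
    return None  # "anchoring" / "anchored" / "disabled" need no alert

print(proof_alert(result))  # anchor failed: gateway down
```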
Dataset anchoring
Each MLflow dataset can have its own signed Arweave proof, independent of any specific training run. Useful for:
- Auditors who need to prove "this dataset existed at time T, signed by X" without depending on a particular model run.
- Dataset publishers who anchor once and hand the TX to downstream model trainers.
- Compliance (e.g. EU AI Act Article 53 GPAI training-data summaries) that expects dataset-level artifacts, not fragments inside a model proof.
Two ways to use it:
import mlflow
import ario_mlflow

ds = mlflow.data.from_pandas(df, source="s3://bucket/train_q1.parquet", name="train_q1")

# A) Implicit — auto-anchored inside training (recommended for typical use)
with mlflow.start_run():
    mlflow.log_input(ds, context="training")
    model.fit(...)
    mlflow.sklearn.log_model(model, "model")
    ario_mlflow.anchor()
    # Each logged dataset gets its own Arweave TX automatically;
    # the training proof references each by TX.

# B) Explicit — publisher pattern, no MLflow run needed
result = ario_mlflow.anchor(dataset=ds)
print(result["tx_id"])  # standalone dataset proof, hand off to downstream
The standalone-dataset envelope commits to the dataset's name, source URI, digest, and schema hash — not to its rows. Datasets stay private; the commitment is portable.
Wallet & cost
Each anchored event is a ~500-byte signed commitment (bounded 400–700 bytes by the plugin's smoke test). Turbo's free tier covers uploads under 105 KiB, so typical usage is free — the auto-generated wallet works out of the box with zero balance, and most teams never need to fund it.
You'd only need to fund the wallet if you're hitting Turbo's per-account free-tier limits or anchoring larger payloads. To top up:
- Visit console.ar.io — credit-card or crypto top-up for the wallet address logged by the plugin on first use (wallet: <address>, mode=persistent).
For production deployments, generate a dedicated wallet (don't rely on the
auto-generated one), set ARIO_MLFLOW_ARWEAVE_WALLET=/path/to/your/wallet.json,
and treat the wallet like any other production secret. Source data (params,
metrics, artifact bytes) always stays in MLflow — nothing else goes on chain —
so costs are flat regardless of how big your training run was.
At scale. Each event is one upload, so cost grows linearly with anchor
volume, not with model size. A high-throughput inference service anchoring
every prediction sees one ~500-byte upload per call — well under the per-file
free threshold. Account-level limits and any paid-tier rates are documented
at console.ar.io and the
ardrive.io Turbo docs; model your projected
volume against current rates before going live. See also
docs/plugin-production.md for wallet
ops, monitoring, and balance alerting.
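As back-of-envelope arithmetic, assuming the ~500-byte-per-event figure above (the daily volume is a made-up example workload; model your own against current Turbo rates):

```python
# Volume model for anchoring every prediction. EVENT_BYTES comes from
# the ~500-byte figure above; events_per_day is a hypothetical workload.
EVENT_BYTES = 500
events_per_day = 1_000_000

daily_mib = events_per_day * EVENT_BYTES / (1024 ** 2)
print(f"~{daily_mib:.0f} MiB/day across {events_per_day:,} uploads")
# Each individual upload stays far below the 105 KiB free threshold;
# at this volume, account-level limits are what to model.
```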
Network requirements
If your environment restricts outbound traffic, allowlist:
| Host | Used for |
|---|---|
| turbo-gateway.com | Uploads (Turbo bundler) and proof fetches |
| arweave.net or other ar.io gateways | Proof fetches (fallback) |
| turbo.ardrive.io | TX bundler-status checks |
| Your configured ARIO_MLFLOW_ARIO_VERIFY_URL | Optional ar.io Verify attestations |
Override the upload/fetch host with ARIO_MLFLOW_GATEWAY_HOST if you want to
route through a specific gateway operator.
Performance
What blocks vs what runs in the background:
| Call | Behavior |
|---|---|
| anchor() | Synchronous. Hashes artifacts, signs, uploads to Turbo before returning. Typically a few seconds end-to-end; longer if artifact hashing is large. |
| ArioMlflowClient.create_model_version() / transition_model_version_stage() | Returns immediately, anchors in a daemon thread. Use wait_for_anchor() if you need the TX before continuing. |
| VerifiedModel.__init__ | Synchronous. Re-hashes artifacts, compares to ario.artifact_hash, raises IntegrityError on mismatch. One-time cost per model load. |
| VerifiedModel.predict() | Returns immediately with the prediction; anchor runs in a daemon thread. No per-prediction latency added by anchoring. |
For high-throughput inference, the predict path is the hot one — predictions return as soon as the model produces an output. The Arweave upload happens asynchronously and writes back to the trace tags when it completes.
Resilience
The plugin's HTTP layer is built to absorb transient ar.io gateway weather without bubbling up as user-visible failures.
- Retries on transient failures. All upload, fetch, and ar.io Verify requests share a requests.Session with a urllib3 Retry adapter: HTTP 5xx and 429 responses are retried with exponential backoff (default: 2 retries, 0.5s/1.0s waits, Retry-After honored). 4xx responses other than 429 are not retried — they're hard failures. Tunable via max_retries and retry_backoff_factor constructor kwargs on ArweaveAnchor and ArioVerifyClient.
- Multi-gateway fetch fallback. ArweaveAnchor.fetch_proof() walks self.gateways in order: on a transient failure for one, the next is tried automatically. Default list is ["turbo-gateway.com", "ardrive.net"]; override via the gateways= kwarg or the ARIO_MLFLOW_GATEWAYS env var. A single flaky gateway no longer surfaces as a hard "Proof Found" FAIL in any verifier UI.
- Failure introspection via last_error. When upload_proof(), fetch_proof(), or ArioVerifyClient.submit_verification() returns None, the instance's last_error attribute carries a string describing the cause — gateway down, retries exhausted, response body unparseable, etc. Distinguish "anchor disabled" from "everything we tried failed" without parsing logs.
- Attestation-level polling. ArioVerifyClient.poll_attestation(tx_id, target_level=2, timeout=120, interval=5) repeatedly submits the verification request until the desired attestation level is reached or the timeout expires. Returns the latest result either way (so callers can render "level 1, still propagating" status rather than nothing). Useful when you want to wait for full maturity before surfacing a Verified badge.
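The documented retry defaults can be reproduced with requests + urllib3 directly, which is handy for pre-flight testing against your own gateway. The values below mirror the defaults stated above; the plugin's internal session construction may differ:

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Mirror of the documented defaults: 2 retries, exponential backoff
# starting at 0.5s, only 429/5xx retried, Retry-After honored.
retry = Retry(
    total=2,
    backoff_factor=0.5,                         # ~0.5s then ~1.0s waits
    status_forcelist=[429, 500, 502, 503, 504],
    respect_retry_after_header=True,
    allowed_methods=None,                       # retry uploads (POST) too
)
session = requests.Session()
session.mount("https://", HTTPAdapter(max_retries=retry))
```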
Environment variables
| Variable | Purpose | Default |
|---|---|---|
| ARIO_MLFLOW_ARWEAVE_WALLET | Path to an Arweave JWK wallet file | auto-generates + persists at ~/.ario-mlflow/wallet.json |
| ARIO_MLFLOW_GATEWAY_HOST | Primary ar.io gateway used in returned URLs | turbo-gateway.com |
| ARIO_MLFLOW_GATEWAYS | Comma-separated list of ar.io gateways tried in order on fetch failures (e.g. g1.com,g2.com) | primary + ardrive.net fallback |
| ARIO_MLFLOW_SIGNING_KEY | Base64-encoded Ed25519 seed | auto-generates at ~/.ario-mlflow/keys/ |
| ARIO_MLFLOW_ARIO_VERIFY_URL | ar.io Verify REST API base URL — e.g. https://perma.online/local/verify (an ar.io operator's Verify endpoint) | ar.io attestation disabled if unset |
Tags the plugin writes
On the training run (anchor()):
- ario.enabled, ario.version — via the registered RunContextProvider
- ario.public_key, ario.verify_status, ario.artifact_hash
- ario.payload_hash — SHA-256 of the canonical payload bytes (the same hash committed in the envelope)
- ario.training_tx, ario.arweave_url — when the Arweave upload succeeded
- ario.wallet_mode — user-configured / persistent / ephemeral

On the registered model (chain head, written by anchor()):
- ario.last_training_hash — pointer to the most recent training proof for this registered model; the next training reads it to set its previous_hash

On model versions (ArioMlflowClient):
- ario.artifact_verified — true/false from re-hashing at registration
- ario.registration_tx, ario.promotion_tx, ario.arweave_url

After running ar-io-mlflow verify … (training run or model version):
- ario.verify_status → verified
- ario.attestation_level — 1, 2, or 3 (see levels section below)
- ario.report_url — link to the ar.io Verify dashboard for this proof
- ario.attested_by, ario.attested_at — gateway operator and timestamp, only present when the operator has configured a signing wallet

On @mlflow.trace spans emitted by VerifiedModel.predict():
- ario.payload_json — the full canonical payload (mirror of the ario/predictions/<id>/payload.json artifact). Read by verify_source_of_truth as the second MLflow surface for prediction check 3.
- ario.decision_id, ario.model_name, ario.model_version
- ario.input_hash, ario.output_hash, ario.payload_hash
- ario.proof_status, ario.prediction_tx, ario.arweave_url
- ario.artifact_verified (when known)
CLI
ar-io-mlflow verify run <run_id> # verify training proof
ar-io-mlflow verify model <name>/<version> # verify registration proof
ar-io-mlflow verify trace <trace_id> # verify an inference proof
ar-io-mlflow audit <name>/<version> # full model-lineage audit
The CLI reads MLFLOW_TRACKING_URI (default ./mlruns) — export it to point
at the same store you used at training time, otherwise the run lookup will
fail with Run '<id>' not found. Set ARIO_MLFLOW_ARIO_VERIFY_URL to enable
the optional ar.io attestation row.
All verify commands run the same three-row verify flow plus the optional ar.io attestation:
- Proof Found — fetch the pure-commitment envelope from ar.io for the recorded TX ID.
- Decision / Training / Registration Record Matches — download ario/payload.json from MLflow, re-hash, and compare to the envelope's payload_hash; then re-derive canonical bytes from a separate live MLflow surface and compare to the anchored payload. This catches MLflow tampering — if either surface was modified after anchoring, the two won't agree. verify run (Training Record Matches) re-fetches run.data.params/metrics/artifact_checksums. verify model (Registration Record Matches) re-derives the artifact-verified state from the source run. verify trace (Decision Record Matches) re-fetches the ario.payload_json trace tag (mirrored by VerifiedModel.predict at write time) and compares to the artifact.
- Signature Confirmed — the signature on the envelope verifies against the embedded public key.

Plus an Attested by line — an independent third-party check by an ar.io gateway operator (when ARIO_MLFLOW_ARIO_VERIFY_URL is configured).
Results are written back to the MLflow tags and the HTML report is regenerated.
If an MLflow retention policy has pruned a prediction's trace, row 2 returns
reason=live_refetch_incomplete rather than silently passing — the proof
itself (signature + anchored bytes + ar.io) is on permanent storage and
remains verifiable.
What the ar.io attestation means
ar-io-mlflow verify reports the ar.io attestation as Verified or
Pending verification. A proof reads Verified once an ar.io gateway has:
- Found it permanently stored on Arweave.
- Re-downloaded the bytes and matched the SHA-256 against the gateway's own digest.
- Verified the signature against the original signer's public key.
For programmatic callers, ario.attestation_level exposes the same status as
an integer (1, 2, or 3) — useful when you want to distinguish "still
propagating" from "fully verified."
Operator attestation. When the ar.io gateway operator has configured a
signing wallet, the verification result is itself signed and
ario.attested_by / ario.attested_at get written back to your MLflow tags.
That's an independent statement from a known operator, verifiable by any
third party against their public key (standard RSA-PSS SHA-256).
These attestations cover integrity and authenticity of the anchored record. Semantic verification (whether this model produced this output on this input) is a separate problem and on the roadmap, not in v0.1.
Verifying without Python
The proof envelope spec is language-neutral: an Ed25519 signature over an
RFC-8785 (JCS) canonicalized JSON object, with a SHA-256 commitment to the
canonical payload bytes that live as an MLflow artifact. Any RFC-8785 +
Ed25519 + SHA-256 implementation in any language can verify a proof — no
ar-io-mlflow install needed.
The auditor recipe:
- Fetch the envelope from any ar.io gateway: GET https://<gateway>/raw/<tx_id>.
- Verify the signature. Strip the signature field from the envelope, JCS-canonicalize the rest (RFC 8785), then verify the original signature (hex) against the embedded public_key (hex) using Ed25519.
- Re-hash the canonical payload. Download ario/payload.json from the MLflow run's artifacts. Compute SHA-256 of the raw bytes. Compare to the envelope's payload_hash.
- Walk the chain (optional). Each envelope's previous_hash is the prior anchor's payload_hash for that event type, or "GENESIS". Fetch the predecessor by its TX (recorded in the relevant tag, e.g. ario.last_training_hash) and recurse.
JCS implementations exist for Python (jcs), JavaScript (canonicalize),
Go (gowebpki/jcs), Java, Rust, and others — interoperable with the same
ecosystem as Notary and Sigstore.
The Python plugin is a convenience wrapper around this recipe; the proof itself doesn't depend on the plugin's continued existence.
Tests
pytest tests/test_plugin_smoke.py tests/test_plugin_verify.py tests/test_input_anchoring.py
No network or MLflow server required.
Related docs
- CHANGELOG.md — release history and known limitations.
- docs/architecture.md — system design (pure-commitment proofs, per-event chains, JCS canonicalization).
- docs/plugin-production.md — production deployment guide: wallet ops, CI/CD patterns, monitoring, runbooks.
- docs/plugin-threat-model.md — what the plugin defends against, what it doesn't, trust boundaries.
- A reference demo app using this plugin lives at vilenarios/Verifiable-AI-Decision-Records-Demo.