Verifiable Reconciliation Proofs — cryptographic data pipeline integrity verification
veridata-recon
Verifiable Reconciliation Proofs for Python — powered by Rust.
⚠️ This is not the veridata pandas-cleaning package. This library provides cryptographic data pipeline reconciliation using Merkle trees, Ed25519 signatures, and the VRP (Verifiable Reconciliation Proof) format.
What It Does
veridata-recon lets you mathematically prove that data made it from a source
system to a sink system without:
- Drops — records lost in transit
- Mutations — records altered during transfer
- Duplicates — records replicated unexpectedly
It generates cryptographic proofs (VRP documents) that can be verified offline by any party with the public key — no access to the original data required.
Installation
pip install veridata-recon
Quick Start
import veridata_recon as vr
# Generate a random salt for this reconciliation run
salt = vr.generate_salt()
# Your source and sink records (e.g., from Kafka topic and Iceberg table)
source = [
    {"order_id": "1001", "item": "widget", "qty": "5", "status": "shipped"},
    {"order_id": "1002", "item": "gadget", "qty": "3", "status": "pending"},
    {"order_id": "1003", "item": "gizmo", "qty": "1", "status": "shipped"},
]
sink = [
    {"order_id": "1001", "item": "widget", "qty": "5", "status": "shipped"},
    {"order_id": "1002", "item": "gadget", "qty": "3", "status": "pending"},
    {"order_id": "1003", "item": "gizmo", "qty": "1", "status": "shipped"},
]
# Reconcile with cryptographic proof
result = vr.reconcile(
    source=source,
    sink=sink,
    identity_rule="composite:[order_id]",
    content_fields=["order_id", "item", "qty", "status"],
    salt=salt,
)
print(result["verdict"]) # "PASS"
print(result["matched_count"]) # 3
Detecting Issues
# Source has 3 records, sink is missing one
sink_missing = source[:2]
result = vr.reconcile(
    source=source,
    sink=sink_missing,
    identity_rule="composite:[order_id]",
    content_fields=["order_id", "item", "qty", "status"],
    salt=salt,
)
print(result["verdict"]) # "FAIL"
print(len(result["missing"])) # 1 — order_id 1003 dropped
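To make the three failure modes (drops, mutations, duplicates) concrete, here is a minimal pure-Python sketch of how they can be told apart by comparing identities and content hashes. It is an illustration only, not veridata-recon's implementation: the `classify` helper, its return keys, and the unsalted SHA-256 content hash are assumptions made for this example.

```python
import hashlib
from collections import Counter


def _content_hash(record, fields):
    # Hypothetical helper: hash the selected fields with an unambiguous separator.
    payload = "\x1f".join(f"{name}={record[name]}" for name in fields)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def classify(source, sink, id_fields, content_fields):
    """Return the identities that were dropped, mutated, or duplicated in the sink."""
    def identity(record):
        return tuple(record[name] for name in id_fields)

    src = {identity(r): _content_hash(r, content_fields) for r in source}
    snk = {identity(r): _content_hash(r, content_fields) for r in sink}
    sink_counts = Counter(identity(r) for r in sink)

    return {
        # Present in the source but absent from the sink.
        "missing": sorted(i for i in src if i not in snk),
        # Present on both sides, but the content hash differs.
        "mutated": sorted(i for i in src if i in snk and src[i] != snk[i]),
        # Identity appears more than once in the sink.
        "duplicated": sorted(i for i, n in sink_counts.items() if n > 1),
    }


source_rows = [
    {"order_id": "1001", "qty": "5"},
    {"order_id": "1002", "qty": "3"},
    {"order_id": "1003", "qty": "1"},
]
sink_rows = [
    {"order_id": "1001", "qty": "5"},
    {"order_id": "1001", "qty": "5"},  # duplicate
    {"order_id": "1002", "qty": "4"},  # mutated qty
]
report = classify(source_rows, sink_rows, ["order_id"], ["order_id", "qty"])
print(report)
```

Here order 1003 is reported as missing, 1002 as mutated, and 1001 as duplicated. The real library works on salted fingerprints rather than raw identity tuples, so treat this only as a mental model.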
Key Features
Hashing
# SHA-256 (default) or BLAKE3
digest = vr.hash_bytes(b"hello world")
digest_b3 = vr.hash_bytes(b"hello world", algorithm="blake3")
Fingerprinting
fp = vr.fingerprint(
    record={"order_id": "1001", "amount": "99.99"},
    identity_rule="composite:[order_id]",
    content_fields=["order_id", "amount"],
    salt=salt,
)
# Returns: {"id_hash": "ab12...", "content_hash": "cd34...", "fingerprint": "ef56..."}
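As a rough illustration of what salted, domain-separated fingerprinting can look like, the sketch below derives the three values with `hashlib`. The tag strings, length-prefixed encoding, and the `sketch_fingerprint` name are assumptions for this example; the library's actual byte layout is not documented here and its hashes will differ.

```python
import hashlib


def _tagged_hash(tag, salt, parts):
    # Length-prefix every component so different inputs cannot collide
    # by shifting bytes across field boundaries.
    h = hashlib.sha256()
    for chunk in [tag, salt] + [p.encode("utf-8") for p in parts]:
        h.update(len(chunk).to_bytes(4, "big"))
        h.update(chunk)
    return h.hexdigest()


def sketch_fingerprint(record, id_fields, content_fields, salt):
    """Hypothetical stand-in for a fingerprint function (string field values assumed)."""
    # Distinct tags keep the identity and content hashes in separate domains.
    id_hash = _tagged_hash(b"id", salt, [record[f] for f in id_fields])
    content_hash = _tagged_hash(b"content", salt, [record[f] for f in content_fields])
    fingerprint = _tagged_hash(b"fingerprint", salt, [id_hash, content_hash])
    return {"id_hash": id_hash, "content_hash": content_hash, "fingerprint": fingerprint}


record = {"order_id": "1001", "amount": "99.99"}
fp_a = sketch_fingerprint(record, ["order_id"], ["order_id", "amount"], b"salt-a")
fp_b = sketch_fingerprint(record, ["order_id"], ["order_id", "amount"], b"salt-b")
print(fp_a["fingerprint"] != fp_b["fingerprint"])  # True: a new salt changes every hash
```

The salt means fingerprints from one run cannot be correlated with another run, and the tags mean an identity hash can never be mistaken for a content hash even when both cover the same fields.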
Key Management
# Generate a new Ed25519 key pair
keys = vr.generate_keypair()
print(keys["public_key"]) # base64-encoded
print(keys["private_key"]) # base64-encoded — keep secret!
# Reload from private key
keys2 = vr.keypair_from_private(keys["private_key"])
assert keys2["public_key"] == keys["public_key"]
Proof Verification
# Verify a .vrp.json proof file offline.
# public_key_b64 is the signer's base64-encoded public key, e.g. keys["public_key"].
outcome = vr.verify_proof("path/to/proof.vrp.json", public_key_b64)
# Returns: "PASS", "FAIL", or "UNVERIFIED"
Use Cases
- Data Pipeline Integrity: Prove Kafka→Iceberg pipelines don't lose data
- Regulatory Compliance: Cryptographic evidence of data completeness
- CI/CD Gates: Fail builds if reconciliation doesn't pass
- Cross-Team Trust: Share proofs without sharing raw data
- Audit Trails: Chain proofs for continuous monitoring
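For the CI/CD gate use case, one possible shape is a small helper that turns a reconciliation result into a process exit code. The `gate_exit_code` name is hypothetical, and treating "UNVERIFIED" as a failure is a design assumption of this sketch, not documented library behavior; the only thing taken from the examples above is the `"verdict"` key.

```python
def gate_exit_code(result):
    """Map a reconciliation result dict to an exit code (0 means the gate passes)."""
    # Anything other than an explicit PASS blocks the build, including UNVERIFIED.
    return 0 if result.get("verdict") == "PASS" else 1


print(gate_exit_code({"verdict": "PASS"}))        # 0
print(gate_exit_code({"verdict": "FAIL"}))        # 1
print(gate_exit_code({"verdict": "UNVERIFIED"}))  # 1
```

In a pipeline script, the result of `vr.reconcile(...)` would be passed to this helper and the return value handed to `sys.exit`, so that a non-PASS verdict fails the job.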
How It Works
- Fingerprint each record using salted, domain-separated hashing
- Reconcile source vs sink fingerprint sets
- Produce a verdict: PASS, FAIL, or UNVERIFIED
- Generate Merkle proofs for missing records (offline verifiable)
- Sign the entire proof with Ed25519
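The Merkle proof step above can be sketched in a few lines of standard-library Python. This shows the general technique only: the leaf and node prefixes, the duplicate-last-node rule for odd levels, and both function names are assumptions for illustration, and the VRP format may construct its trees differently.

```python
import hashlib


def _leaf(item):
    # Prefix leaves and inner nodes differently so one cannot pose as the other.
    return hashlib.sha256(b"\x00" + item).digest()


def _node(left, right):
    return hashlib.sha256(b"\x01" + left + right).digest()


def merkle_root_and_proof(items, index):
    """Return (root, proof) for items[index]; items must be non-empty.

    The proof is a list of (sibling_hash, sibling_is_left) pairs, leaf to root.
    """
    level = [_leaf(item) for item in items]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # simplification: duplicate the last node
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        level = [_node(level[i], level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify_inclusion(item, proof, root):
    """Recompute the root from one item and its proof; needs no other records."""
    acc = _leaf(item)
    for sibling, sibling_is_left in proof:
        acc = _node(sibling, acc) if sibling_is_left else _node(acc, sibling)
    return acc == root


fingerprints = [b"fp-1001", b"fp-1002", b"fp-1003"]
root, proof = merkle_root_and_proof(fingerprints, 2)
print(verify_inclusion(b"fp-1003", proof, root))  # True
print(verify_inclusion(b"fp-9999", proof, root))  # False
```

A verifier holding only the signed root and this short proof can confirm that the fingerprint of a missing record really was part of the source set, which is what makes offline verification possible without the original data.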
Performance
Built in Rust, with zero-copy data handling where possible. It handles millions of records
efficiently thanks to the underlying veridata-core engine.
License
Apache-2.0
Download files
File details
Details for the file veridata_recon-0.1.0.tar.gz.
File metadata
- Download URL: veridata_recon-0.1.0.tar.gz
- Upload date:
- Size: 59.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 89a1be3c76a70f44e29cc51d9ca4716cb7141086f8480201415948316e25fefc |
| MD5 | 4c9ef61215323685da74e6300c4ab718 |
| BLAKE2b-256 | 631504ff10c47436f6123d82bdb8b9f9938c32110b8c2fd4391cc139cffc665f |
Provenance
The following attestation bundles were made for veridata_recon-0.1.0.tar.gz:
Publisher: publish.yml on vaquarkhan/veridata
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: veridata_recon-0.1.0.tar.gz
- Subject digest: 89a1be3c76a70f44e29cc51d9ca4716cb7141086f8480201415948316e25fefc
- Sigstore transparency entry: 1820651915
- Sigstore integration time:
- Permalink: vaquarkhan/veridata@0e13e7571e678a96129eb0e938a4cc6f8c37e4d3
- Branch / Tag: refs/heads/main
- Owner: https://github.com/vaquarkhan
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@0e13e7571e678a96129eb0e938a4cc6f8c37e4d3
- Trigger Event: workflow_dispatch
File details
Details for the file veridata_recon-0.1.0-cp312-cp312-win_amd64.whl.
File metadata
- Download URL: veridata_recon-0.1.0-cp312-cp312-win_amd64.whl
- Upload date:
- Size: 504.7 kB
- Tags: CPython 3.12, Windows x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c776c86236d505d03e9c70b58cc43143044906503c4e693a01aa1723de1eed22 |
| MD5 | c620f315695aa1dce5806d2651f58eac |
| BLAKE2b-256 | 6321dabcc0049b5de453c7d109b17e6267a6b1d481ed7ca3c37514b1f688b988 |
Provenance
The following attestation bundles were made for veridata_recon-0.1.0-cp312-cp312-win_amd64.whl:
Publisher: publish.yml on vaquarkhan/veridata
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: veridata_recon-0.1.0-cp312-cp312-win_amd64.whl
- Subject digest: c776c86236d505d03e9c70b58cc43143044906503c4e693a01aa1723de1eed22
- Sigstore transparency entry: 1820652853
- Sigstore integration time:
- Permalink: vaquarkhan/veridata@0e13e7571e678a96129eb0e938a4cc6f8c37e4d3
- Branch / Tag: refs/heads/main
- Owner: https://github.com/vaquarkhan
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@0e13e7571e678a96129eb0e938a4cc6f8c37e4d3
- Trigger Event: workflow_dispatch
File details
Details for the file veridata_recon-0.1.0-cp312-cp312-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl.
File metadata
- Download URL: veridata_recon-0.1.0-cp312-cp312-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl
- Upload date:
- Size: 1.2 MB
- Tags: CPython 3.12, macOS 10.12+ universal2 (ARM64, x86-64), macOS 10.12+ x86-64, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0a3e89fad523ab5e9463ba86ccfda1edb55591c1b7da2b38c3576c5a9788d7ca |
| MD5 | 567cc603eb424e4fdff77c1b1a17cb16 |
| BLAKE2b-256 | c55cd98279f86c1f2034827fbaf2b925ce2d125fb2021ffeaf9f4933e9ece1bc |
Provenance
The following attestation bundles were made for veridata_recon-0.1.0-cp312-cp312-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl:
Publisher: publish.yml on vaquarkhan/veridata
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: veridata_recon-0.1.0-cp312-cp312-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl
- Subject digest: 0a3e89fad523ab5e9463ba86ccfda1edb55591c1b7da2b38c3576c5a9788d7ca
- Sigstore transparency entry: 1820652459
- Sigstore integration time:
- Permalink: vaquarkhan/veridata@0e13e7571e678a96129eb0e938a4cc6f8c37e4d3
- Branch / Tag: refs/heads/main
- Owner: https://github.com/vaquarkhan
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@0e13e7571e678a96129eb0e938a4cc6f8c37e4d3
- Trigger Event: workflow_dispatch
File details
Details for the file veridata_recon-0.1.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: veridata_recon-0.1.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 667.3 kB
- Tags: CPython 3.9, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 200114899c52770bb1fe7bc9fc97f6fe8d667ba6d394a53314dfe885bade1dd4 |
| MD5 | 75631dce8810dbfc5c1ae54983e8bc05 |
| BLAKE2b-256 | 884c92096b6aa5bdc8ea5530735972a20941d18f16436fd813134939406638e8 |
Provenance
The following attestation bundles were made for veridata_recon-0.1.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:
Publisher: publish.yml on vaquarkhan/veridata
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: veridata_recon-0.1.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Subject digest: 200114899c52770bb1fe7bc9fc97f6fe8d667ba6d394a53314dfe885bade1dd4
- Sigstore transparency entry: 1820653225
- Sigstore integration time:
- Permalink: vaquarkhan/veridata@0e13e7571e678a96129eb0e938a4cc6f8c37e4d3
- Branch / Tag: refs/heads/main
- Owner: https://github.com/vaquarkhan
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@0e13e7571e678a96129eb0e938a4cc6f8c37e4d3
- Trigger Event: workflow_dispatch