Verifiable Reconciliation Proofs — cryptographic data pipeline integrity verification

veridata-recon

Verifiable Reconciliation Proofs for Python — powered by Rust.

⚠️ This is not the veridata pandas-cleaning package. This library provides cryptographic data pipeline reconciliation using Merkle trees, Ed25519 signatures, and the VRP (Verifiable Reconciliation Proof) format.

What It Does

veridata-recon lets you mathematically prove that data made it from a source system to a sink system without:

  • Drops — records lost in transit
  • Mutations — records altered during transfer
  • Duplicates — records replicated unexpectedly

It generates cryptographic proofs (VRP documents) that can be verified offline by any party with the public key — no access to the original data required.
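The three failure modes above can be illustrated with a minimal standard-library sketch. This is not how veridata-recon is implemented; the `classify` helper, its parameters, and the field encoding are hypothetical and exist only to make the definitions concrete.

```python
import hashlib
from collections import Counter


def classify(source, sink, key, fields):
    """Compare two record lists by key and report drops, mutations, duplicates."""

    def content_hash(rec):
        # Hash the selected fields in a fixed order; the unit separator
        # keeps field boundaries unambiguous (an assumed encoding).
        joined = "\x1f".join(f"{f}={rec[f]}" for f in fields)
        return hashlib.sha256(joined.encode("utf-8")).hexdigest()

    src = {rec[key]: content_hash(rec) for rec in source}
    # If a key repeats in the sink, the last occurrence's hash is kept here;
    # the repetition itself is reported through sink_counts.
    snk = {rec[key]: content_hash(rec) for rec in sink}
    sink_counts = Counter(rec[key] for rec in sink)

    return {
        "drops": sorted(k for k in src if k not in snk),
        "mutations": sorted(k for k in src if k in snk and src[k] != snk[k]),
        "duplicates": sorted(k for k, n in sink_counts.items() if n > 1),
    }


source = [
    {"order_id": "1001", "qty": "5"},
    {"order_id": "1002", "qty": "3"},
    {"order_id": "1003", "qty": "1"},
]
sink = [
    {"order_id": "1001", "qty": "5"},
    {"order_id": "1002", "qty": "4"},  # altered in transit
    {"order_id": "1001", "qty": "5"},  # replicated
]

report = classify(source, sink, key="order_id", fields=["order_id", "qty"])
print(report)
# {'drops': ['1003'], 'mutations': ['1002'], 'duplicates': ['1001']}
```

The library adds salting, Merkle proofs, and signatures on top of this kind of comparison, so that a third party can check the result without seeing the records.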

Installation

pip install veridata-recon

Quick Start

import veridata_recon as vr

# Generate a random salt for this reconciliation run
salt = vr.generate_salt()

# Your source and sink records (e.g., from Kafka topic and Iceberg table)
source = [
    {"order_id": "1001", "item": "widget", "qty": "5", "status": "shipped"},
    {"order_id": "1002", "item": "gadget", "qty": "3", "status": "pending"},
    {"order_id": "1003", "item": "gizmo",  "qty": "1", "status": "shipped"},
]

sink = [
    {"order_id": "1001", "item": "widget", "qty": "5", "status": "shipped"},
    {"order_id": "1002", "item": "gadget", "qty": "3", "status": "pending"},
    {"order_id": "1003", "item": "gizmo",  "qty": "1", "status": "shipped"},
]

# Reconcile with cryptographic proof
result = vr.reconcile(
    source=source,
    sink=sink,
    identity_rule="composite:[order_id]",
    content_fields=["order_id", "item", "qty", "status"],
    salt=salt,
)

print(result["verdict"])        # "PASS"
print(result["matched_count"])  # 3

Detecting Issues

# Source has 3 records, sink is missing one
sink_missing = source[:2]

result = vr.reconcile(
    source=source,
    sink=sink_missing,
    identity_rule="composite:[order_id]",
    content_fields=["order_id", "item", "qty", "status"],
    salt=salt,
)

print(result["verdict"])      # "FAIL"
print(len(result["missing"])) # 1 — order_id 1003 dropped

Key Features

Hashing

# SHA-256 (default) or BLAKE3
digest = vr.hash_bytes(b"hello world")
digest_b3 = vr.hash_bytes(b"hello world", algorithm="blake3")

Fingerprinting

fp = vr.fingerprint(
    record={"order_id": "1001", "amount": "99.99"},
    identity_rule="composite:[order_id]",
    content_fields=["order_id", "amount"],
    salt=salt,
)
# Returns: {"id_hash": "ab12...", "content_hash": "cd34...", "fingerprint": "ef56..."}
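To show why a fingerprint has separate identity and content parts, here is a rough standard-library sketch of salted, domain-separated hashing. The tags, the field encoding, and the helper names (`tagged_hash`, `encode_fields`, `sketch_fingerprint`) are assumptions made for illustration; the actual VRP encoding is not described in this document and the digests will not match those produced by `vr.fingerprint`.

```python
import hashlib
import os


def tagged_hash(tag: bytes, salt: bytes, payload: bytes) -> str:
    """SHA-256 over length-prefixed (tag, salt, payload).

    The tag gives domain separation: the same payload hashed under
    different tags produces unrelated digests.
    """
    h = hashlib.sha256()
    for part in (tag, salt, payload):
        h.update(len(part).to_bytes(4, "big"))
        h.update(part)
    return h.hexdigest()


def encode_fields(record: dict, fields: list) -> bytes:
    # Assumed canonical encoding: name=value pairs in the given order,
    # joined by the ASCII unit separator.
    return "\x1f".join(f"{name}={record[name]}" for name in fields).encode("utf-8")


def sketch_fingerprint(record, id_fields, content_fields, salt):
    id_hash = tagged_hash(b"id", salt, encode_fields(record, id_fields))
    content_hash = tagged_hash(b"content", salt, encode_fields(record, content_fields))
    fingerprint = tagged_hash(b"fingerprint", salt, (id_hash + content_hash).encode("ascii"))
    return {"id_hash": id_hash, "content_hash": content_hash, "fingerprint": fingerprint}


demo_salt = os.urandom(16)  # stand-in for a per-run salt
original = sketch_fingerprint(
    {"order_id": "1001", "amount": "99.99"}, ["order_id"], ["order_id", "amount"], demo_salt
)
altered = sketch_fingerprint(
    {"order_id": "1001", "amount": "100.00"}, ["order_id"], ["order_id", "amount"], demo_salt
)

# Same identity, different content: this is what a mutation looks like.
print(original["id_hash"] == altered["id_hash"])            # True
print(original["content_hash"] == altered["content_hash"])  # False
```

Keeping the identity hash separate from the content hash is what lets a reconciler tell a mutated record (same identity, different content) apart from a dropped one (identity absent from the sink).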

Key Management

# Generate a new Ed25519 key pair
keys = vr.generate_keypair()
print(keys["public_key"])   # base64-encoded
print(keys["private_key"])  # base64-encoded — keep secret!

# Reload from private key
keys2 = vr.keypair_from_private(keys["private_key"])
assert keys2["public_key"] == keys["public_key"]

Proof Verification

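# Verify a .vrp.json proof file offline.
# public_key_b64 is the signer's base64-encoded Ed25519 public key,
# for example keys["public_key"] from generate_keypair() above.
public_key_b64 = keys["public_key"]
outcome = vr.verify_proof("path/to/proof.vrp.json", public_key_b64)
# Returns: "PASS", "FAIL", or "UNVERIFIED"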

Use Cases

  • Data Pipeline Integrity: Prove Kafka→Iceberg pipelines don't lose data
  • Regulatory Compliance: Cryptographic evidence of data completeness
  • CI/CD Gates: Fail builds if reconciliation doesn't pass
  • Cross-Team Trust: Share proofs without sharing raw data
  • Audit Trails: Chain proofs for continuous monitoring
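As one way to picture the CI/CD gate use case, the sketch below maps a verification outcome to a process exit code. Only the three outcome strings come from this document; the `gate_exit_code` helper and the specific exit codes are hypothetical choices, and unknown values are treated as failures so the gate fails closed.

```python
import sys

# Hypothetical mapping: only "PASS" lets the build continue.
EXIT_CODES = {"PASS": 0, "FAIL": 1, "UNVERIFIED": 2}


def gate_exit_code(outcome: str) -> int:
    """Return 0 for PASS and a non-zero code for anything else."""
    return EXIT_CODES.get(outcome, 2)


def run_gate(outcome: str) -> None:
    """Exit the process with the code for this outcome (call from a CI step)."""
    code = gate_exit_code(outcome)
    if code != 0:
        print(f"reconciliation gate failed: {outcome}", file=sys.stderr)
    sys.exit(code)


# In a CI script the outcome would come from the library, for example:
#   outcome = vr.verify_proof("path/to/proof.vrp.json", public_key_b64)
#   run_gate(outcome)
print(gate_exit_code("PASS"), gate_exit_code("FAIL"), gate_exit_code("UNVERIFIED"))
# 0 1 2
```

Treating UNVERIFIED as a failure is a deliberate choice in this sketch: a proof that could not be checked gives no evidence that the pipeline is intact.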

How It Works

  1. Fingerprint each record using salted, domain-separated hashing
  2. Reconcile source vs sink fingerprint sets
  3. Produce a verdict: PASS, FAIL, or UNVERIFIED
  4. Generate Merkle proofs for missing records (offline verifiable)
  5. Sign the entire proof with Ed25519
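Step 4 relies on Merkle inclusion proofs, which can be sketched with the standard library alone. The tree layout below (0x00/0x01 prefixes for leaves and interior nodes, duplicating the last node on odd levels) is a common textbook simplification and an assumption here, not the VRP format; all function names are illustrative. Signing (step 5) is left out because it needs an Ed25519 implementation outside the standard library.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf_hash(item: bytes) -> bytes:
    return _h(b"\x00" + item)  # prefix separates leaves from interior nodes


def node_hash(left: bytes, right: bytes) -> bytes:
    return _h(b"\x01" + left + right)


def build_levels(items):
    """Return all tree levels, leaves first, root last. Needs at least one item."""
    level = [leaf_hash(i) for i in items]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]  # pair the odd node with itself
        level = [node_hash(level[i], level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def inclusion_proof(levels, index):
    """Collect (sibling_hash, node_is_left) pairs from leaf to root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index  # odd level: the node was paired with itself
        proof.append((level[sibling], index % 2 == 0))
        index //= 2
    return proof


def verify_inclusion(item: bytes, proof, root: bytes) -> bool:
    current = leaf_hash(item)
    for sibling, node_is_left in proof:
        current = node_hash(current, sibling) if node_is_left else node_hash(sibling, current)
    return current == root


# Stand-ins for source fingerprints; a real run would use hashed records.
source_fps = [b"fp-1001", b"fp-1002", b"fp-1003"]
levels = build_levels(source_fps)
root = levels[-1][0]

# Prove that the record missing from the sink was part of the source set.
proof = inclusion_proof(levels, 2)
print(verify_inclusion(b"fp-1003", proof, root))  # True
print(verify_inclusion(b"fp-9999", proof, root))  # False
```

A verifier needs only the root, the leaf, and the short sibling path, which is why a proof of this shape can be checked offline without the full source data.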

Performance

Built on Rust, with zero-copy data handling where possible. The underlying veridata-core engine is designed to handle millions of records efficiently.

License

Apache-2.0

Download files

Source Distribution

veridata_recon-0.1.0.tar.gz (59.3 kB)

Uploaded Source

Built Distributions

veridata_recon-0.1.0-cp312-cp312-win_amd64.whl (504.7 kB)

Uploaded CPython 3.12, Windows x86-64

veridata_recon-0.1.0-cp312-cp312-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl (1.2 MB)

Uploaded CPython 3.12, macOS 10.12+ universal2 (ARM64, x86-64), macOS 10.12+ x86-64, macOS 11.0+ ARM64

veridata_recon-0.1.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (667.3 kB)

Uploaded CPython 3.9, manylinux: glibc 2.17+ x86-64

File details

Details for the file veridata_recon-0.1.0.tar.gz.

File metadata

  • Download URL: veridata_recon-0.1.0.tar.gz
  • Upload date:
  • Size: 59.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for veridata_recon-0.1.0.tar.gz
Algorithm Hash digest
SHA256 89a1be3c76a70f44e29cc51d9ca4716cb7141086f8480201415948316e25fefc
MD5 4c9ef61215323685da74e6300c4ab718
BLAKE2b-256 631504ff10c47436f6123d82bdb8b9f9938c32110b8c2fd4391cc139cffc665f

Provenance

The following attestation bundles were made for veridata_recon-0.1.0.tar.gz:

Publisher: publish.yml on vaquarkhan/veridata

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file veridata_recon-0.1.0-cp312-cp312-win_amd64.whl.

File hashes

Hashes for veridata_recon-0.1.0-cp312-cp312-win_amd64.whl
Algorithm Hash digest
SHA256 c776c86236d505d03e9c70b58cc43143044906503c4e693a01aa1723de1eed22
MD5 c620f315695aa1dce5806d2651f58eac
BLAKE2b-256 6321dabcc0049b5de453c7d109b17e6267a6b1d481ed7ca3c37514b1f688b988

Provenance

The following attestation bundles were made for veridata_recon-0.1.0-cp312-cp312-win_amd64.whl:

Publisher: publish.yml on vaquarkhan/veridata

File details

Details for the file veridata_recon-0.1.0-cp312-cp312-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl.

File hashes

Hashes for veridata_recon-0.1.0-cp312-cp312-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl
Algorithm Hash digest
SHA256 0a3e89fad523ab5e9463ba86ccfda1edb55591c1b7da2b38c3576c5a9788d7ca
MD5 567cc603eb424e4fdff77c1b1a17cb16
BLAKE2b-256 c55cd98279f86c1f2034827fbaf2b925ce2d125fb2021ffeaf9f4933e9ece1bc

Provenance

The following attestation bundles were made for veridata_recon-0.1.0-cp312-cp312-macosx_10_12_x86_64.macosx_11_0_arm64.macosx_10_12_universal2.whl:

Publisher: publish.yml on vaquarkhan/veridata

File details

Details for the file veridata_recon-0.1.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for veridata_recon-0.1.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 200114899c52770bb1fe7bc9fc97f6fe8d667ba6d394a53314dfe885bade1dd4
MD5 75631dce8810dbfc5c1ae54983e8bc05
BLAKE2b-256 884c92096b6aa5bdc8ea5530735972a20941d18f16436fd813134939406638e8

Provenance

The following attestation bundles were made for veridata_recon-0.1.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: publish.yml on vaquarkhan/veridata
