Attested Data Lineage specification and reference implementation

Kest: Attested Data Lineage

Kest is a high-integrity data lineage and security framework built for secure data pipelines and agentic workflows. It ensures that every piece of data carries a Kest Passport—a cryptographically verifiable record of its origin, the systems it traversed, and its accumulated risk profile (taints).

Core Features

  • Data Lineage as a DAG: Every execution step is recorded in a Directed Acyclic Graph (DAG) for non-repudiable audit trails.
  • Taint Tracking: Data is automatically marked with "taints" as it flows through untrusted or sensitive processing nodes.
  • OPA Policy Enforcement: Native integration with Open Policy Agent (Rego) to enforce security constraints at runtime based on the data's entire history.
  • Implicit Tracking: Secure-by-default behavior. Any data crossing a @verified boundary is automatically tracked, even if it enters the system as a raw primitive.
  • Cryptographic Integrity: Recursive DAG hashing ($H_{bind}$) ensures that any modification to historical data or node identities invalidates the final signature.
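To make the recursive hashing idea concrete, here is a minimal sketch using only the Python standard library. It is not Kest's actual $H_{bind}$ construction (that is defined in the specification); the `node_hash` helper, the record layout, and the choice of SHA-256 over canonical JSON are all assumptions made for illustration. Each node's hash covers its own identity and payload plus the hashes of its parents, so changing any ancestor changes every descendant hash.

```python
import hashlib
import json


def node_hash(node_id: str, payload: dict, parent_hashes: list[str]) -> str:
    """Hypothetical helper: hash a node together with its parents' hashes."""
    record = {
        "node_id": node_id,
        "payload": payload,
        # Sort parent hashes so the result does not depend on argument order.
        "parents": sorted(parent_hashes),
    }
    # Canonical JSON (sorted keys, fixed separators) gives a stable byte encoding.
    encoded = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


# Build a small lineage: originate -> strip_pii, fetch_internet_data, both -> merge_data.
origin = node_hash("originate", {"taints": ["pii_data"]}, [])
stripped = node_hash("strip_pii", {"taints": ["pii_stripped"]}, [origin])
fetched = node_hash("fetch_internet_data", {"taints": ["internet_data"]}, [])
merged = node_hash("merge_data", {"taints": []}, [stripped, fetched])

# Rewrite history at the genesis node and recompute the chain.
tampered_origin = node_hash("originate", {"taints": []}, [])
tampered_stripped = node_hash("strip_pii", {"taints": ["pii_stripped"]}, [tampered_origin])
tampered_merged = node_hash("merge_data", {"taints": []}, [tampered_stripped, fetched])

# The final hash no longer matches, so a signature over it would fail to verify.
print(merged != tampered_merged)  # True
```

The sketch omits signing entirely; in a real system the final hash would be signed, and verification would recompute the chain before checking that signature.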

Installation

Using pip

pip install kest

To enable support for running OPA (Open Policy Agent) locally (via lakera-regorus):

[!NOTE] The kest[opa] local evaluation extra currently only supports Python 3.11 due to underlying upstream dependencies (lakera-regorus). For other Python versions, use the remote opa-client extra instead.

pip install kest[opa]

Using uv

uv add kest

To enable support for running OPA (Open Policy Agent) locally (via lakera-regorus):

[!NOTE] The kest[opa] local evaluation extra currently only supports Python 3.11 due to underlying upstream dependencies (lakera-regorus). For other Python versions, use the remote opa-client extra instead.

uv add kest --extra opa

To enable support for running OPA against a remote server (via opa-python-client):

uv add kest --extra opa-client
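As a quick way to apply the note above, the following sketch picks an extra based on the running interpreter. The `recommended_extra` helper is hypothetical and not part of Kest; it only encodes the stated constraint that the local `opa` extra currently supports Python 3.11, and it assumes the remote extra is installable with pip as `kest[opa-client]` using standard extras syntax.

```python
import sys


def recommended_extra(version_info=sys.version_info) -> str:
    """Hypothetical helper: choose a Kest extra for the given interpreter version."""
    major, minor = version_info[0], version_info[1]
    # Local evaluation (lakera-regorus) is documented as Python 3.11 only.
    return "opa" if (major, minor) == (3, 11) else "opa-client"


print(f"pip install 'kest[{recommended_extra()}]'")
```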

Quick Start

from kest import verified, originate, config
from kest.core.policy import LocalOpaEngine

# 1. Setup a global policy engine
config.policy_engine = LocalOpaEngine()

policy = """
package kest.policy
default allow = false

# Specific rule: only allow input that came from System A
allow_system_a_only {
    input.taints[_] == "system_a"
}

# Generic rule: must not mix unstructured internet data with unstripped PII
allow_merge {
    not unsafe_mix
}

unsafe_mix {
    input.taints[_] == "pii_data"
    input.taints[_] == "internet_data"
    not input.taints[_] == "pii_stripped"
}
"""
config.policy_engine.add_policy("access", policy)

# 2. Annotate your domain functions
@verified(added_taint=["system_a"])
def process_on_system_a(data: dict):
    """Simulates processing on a specific approved system."""
    return {"system": "System A", "processed_data": data}

@verified(enforce_rules=["data.kest.policy.allow_system_a_only"])
def secure_restricted_process(data: dict):
    """A highly secure function that ONLY accepts data processed by System A."""
    return {"status": "highly_secure", "data": data}

@verified(added_taint=["internet_data"])
def fetch_internet_data(query: str):
    return {"source": "internet", "query": query}

@verified(added_taint=["pii_stripped"])
def strip_pii(data: dict):
    safe_data = data.copy()
    if "ssn" in safe_data:
        safe_data["ssn"] = "***-**-****"
    return safe_data

@verified(enforce_rules=["data.kest.policy.allow_merge"])
def merge_data(packet_a: dict, packet_b: dict):
    return {"merged": True, "a": packet_a, "b": packet_b}

# 3. Execute with tracking
# Explicitly originate a Passport for the raw dict and tag it as PII
raw_pii = originate({"user": "Alice", "ssn": "123-45-6789"}, taint=["pii_data"])

# Strip the PII, then fetch internet data; the raw string "news" crosses a
# @verified boundary, so Kest implicitly originates a Passport for it
safe_pii = strip_pii(raw_pii)
internet_data = fetch_internet_data("news")

# Lineage and taints are propagated dynamically across the DAG
result = merge_data(safe_pii, internet_data)

# Test System Origin Policy
system_a_data = process_on_system_a(internet_data)
restricted_result = secure_restricted_process(system_a_data)

print(result.data)
# {'merged': True, ...}
print(result.passport.history)
# Contains the full DAG: 'originate' -> 'strip_pii' and 'fetch_internet_data', both feeding 'merge_data'
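To see why both policy checks in the Quick Start are expected to pass, the rules can be mirrored in plain Python. This is an illustrative sketch only: it does not call Kest or OPA, it follows the intended reading of the Rego rules, and it assumes that a result's taints are the union of its inputs' taints plus any `added_taint` on the function.

```python
def unsafe_mix(taints: set[str]) -> bool:
    """Mirror of the unsafe_mix rule: PII mixed with internet data, never stripped."""
    return (
        "pii_data" in taints
        and "internet_data" in taints
        and "pii_stripped" not in taints
    )


def allow_merge(taints: set[str]) -> bool:
    """Mirror of allow_merge: permitted unless the mix is unsafe."""
    return not unsafe_mix(taints)


def allow_system_a_only(taints: set[str]) -> bool:
    """Mirror of allow_system_a_only: requires the 'system_a' taint."""
    return "system_a" in taints


# Assumed propagation: merge_data sees the union of both inputs' taints.
safe_pii_taints = {"pii_data", "pii_stripped"}
internet_taints = {"internet_data"}
merged_taints = safe_pii_taints | internet_taints

print(allow_merge(merged_taints))                              # True
print(allow_merge({"pii_data", "internet_data"}))              # False
print(allow_system_a_only(internet_taints | {"system_a"}))     # True
print(allow_system_a_only(internet_taints))                    # False
```

Under these assumptions, skipping `strip_pii` before `merge_data` would trip `unsafe_mix`, and passing `internet_data` straight to `secure_restricted_process` would fail the System A check.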

Manual Origination

For data entering from external or untrusted sources, use originate to define the genesis node:

data = originate(
    {"raw": "payload"},
    taint=["untrusted_source"],
    labels={"env": "prod"}
)

Documentation

For the full technical specification, see Kest v0.1.0 Specification. See the Changelog for a high-level overview of the initial release and version history.

Contributing

We welcome contributions! Please see our Contributor Guide for details on our development process, coding standards, and architectural principles.

Download files

Source Distribution

kest-0.1.0.tar.gz (193.2 kB view details)

Uploaded Source

Built Distribution

kest-0.1.0-py3-none-any.whl (23.3 kB view details)

Uploaded Python 3

File details

Details for the file kest-0.1.0.tar.gz.

File metadata

  • Download URL: kest-0.1.0.tar.gz
  • Upload date:
  • Size: 193.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for kest-0.1.0.tar.gz
Algorithm Hash digest
SHA256 7954821f0ce89192b5993dd4e004d6dc5bbcd66b7f12681f04874d0a8c6b9715
MD5 2286947ed701be9c7e7bca8c0eb2c234
BLAKE2b-256 da3edf99e62e56abbd117a9dfefaa417216be16bb9bf6c30292a53b9c7da7c14

Provenance

The following attestation bundles were made for kest-0.1.0.tar.gz:

Publisher: release.yml on eterna2/kest

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file kest-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: kest-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 23.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for kest-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 69e0bc2bcafe6834f5d4552d2b3608ab21a81fb1383e31ea72127663ab0da1b6
MD5 19f9d5b3adc54fdde997fb1c63b0b213
BLAKE2b-256 4bd1339b9c6b453a08fe7e24ac0c9dd6eafd35cc5b6aefb05e1a68c56bdd5b6f

Provenance

The following attestation bundles were made for kest-0.1.0-py3-none-any.whl:

Publisher: release.yml on eterna2/kest

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
