Attested Data Lineage specification and reference implementation
Kest: Attested Data Lineage
Kest is a high-integrity data lineage and security framework for data pipelines and agentic workflows. Every piece of data carries a Kest Passport: a cryptographically verifiable record of its origin, the systems it traversed, and its accumulated risk profile (taints).
Core Features
- Data Lineage as a DAG: Every execution step is recorded in a Directed Acyclic Graph (DAG) for non-repudiable audit trails.
- Taint Tracking: Data is automatically marked with "taints" as it flows through untrusted or sensitive processing nodes.
- OPA Policy Enforcement: Native integration with Open Policy Agent (Rego) to enforce security constraints at runtime based on the data's entire history.
- Implicit Tracking: Secure-by-default behavior. Any data crossing a @verified boundary is automatically tracked, even if it enters the system as a raw primitive.
- Cryptographic Integrity: Recursive DAG hashing ($H_{bind}$) ensures that any modification to historical data or node identities invalidates the final signature.
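The Cryptographic Integrity property can be illustrated with a minimal Merkle-style sketch. This is not Kest's actual $H_{bind}$ construction: the `Node` class, the `bind_hash` function, and the choice of SHA-256 over canonical JSON are all assumptions made for illustration. It only shows why hashing each node together with its parents' hashes makes any historical change visible in the final digest.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Node:
    """Hypothetical lineage node; not Kest's actual data model."""
    node_id: str
    payload: dict
    parents: Tuple["Node", ...] = ()


def bind_hash(node: Node) -> str:
    """Hash a node's identity and payload together with its parents' hashes."""
    # Recursing into parents means every ancestor contributes to the result.
    parent_hashes = sorted(bind_hash(p) for p in node.parents)
    material = json.dumps(
        {"id": node.node_id, "payload": node.payload, "parents": parent_hashes},
        sort_keys=True,
    )
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


# Two DAGs that differ only in the payload of a historical (genesis) node.
origin = Node("originate", {"user": "Alice"})
step = Node("strip_pii", {"user": "Alice"}, parents=(origin,))

tampered_origin = Node("originate", {"user": "Mallory"})
tampered_step = Node("strip_pii", {"user": "Alice"}, parents=(tampered_origin,))

# The final hashes differ even though the last node looks identical.
assert bind_hash(step) != bind_hash(tampered_step)
```

A signature over the final hash would therefore cover the whole history; changing a node identity instead of a payload has the same effect in this sketch.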
Installation
Using pip
pip install kest
To enable support for running OPA (Open Policy Agent) locally (via lakera-regorus):
[!NOTE] The kest[opa] local evaluation extra currently only supports Python 3.11 due to underlying upstream dependencies (lakera-regorus). For other Python versions, use the remote opa-client extra instead.
pip install "kest[opa]"
Using uv
uv add kest
To enable support for running OPA (Open Policy Agent) locally (via lakera-regorus):
[!NOTE] The kest[opa] local evaluation extra currently only supports Python 3.11 due to underlying upstream dependencies (lakera-regorus). For other Python versions, use the remote opa-client extra instead.
uv add kest --extra opa
To enable support for running OPA against a remote server (via opa-python-client):
uv add kest --extra opa-client
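The choice between the two extras depends only on the interpreter version, so it can be scripted. The helper below is a small sketch, not part of Kest: `recommended_opa_extra` is a hypothetical name, and the rule it encodes is simply the Python 3.11 constraint stated in the note above.

```python
import sys


def recommended_opa_extra(version=None):
    """Return the extra name suggested by the note: 'opa' on 3.11, else 'opa-client'."""
    # Hypothetical helper; compares only the (major, minor) part of the version.
    major_minor = tuple((version or sys.version_info)[:2])
    return "opa" if major_minor == (3, 11) else "opa-client"


if __name__ == "__main__":
    # Prints a pip command for the extra that matches the running interpreter.
    print(f'pip install "kest[{recommended_opa_extra()}]"')
```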
Quick Start
from kest import verified, originate, config
from kest.core.policy import LocalOpaEngine

# 1. Set up a global policy engine
config.policy_engine = LocalOpaEngine()

policy = """
package kest.policy

default allow = false

# Specific rule: only allow input that came from System A
allow_system_a_only {
    input.taints[_] == "system_a"
}

# Generic rule: must not mix unstructured internet data with unstripped PII
allow_merge {
    not unsafe_mix
}

unsafe_mix {
    input.taints[_] == "pii_data"
    input.taints[_] == "internet_data"
    not input.taints[_] == "pii_stripped"
}
"""
config.policy_engine.add_policy("access", policy)

# 2. Annotate your domain functions
@verified(added_taint=["system_a"])
def process_on_system_a(data: dict):
    """Simulates processing on a specific approved system."""
    return {"system": "System A", "processed_data": data}

@verified(enforce_rules=["data.kest.policy.allow_system_a_only"])
def secure_restricted_process(data: dict):
    """A highly secure function that ONLY accepts data processed by System A."""
    return {"status": "highly_secure", "data": data}

@verified(added_taint=["internet_data"])
def fetch_internet_data(query: str):
    return {"source": "internet", "query": query}

@verified(added_taint=["pii_stripped"])
def strip_pii(data: dict):
    safe_data = data.copy()
    if "ssn" in safe_data:
        safe_data["ssn"] = "***-**-****"
    return safe_data

@verified(enforce_rules=["data.kest.policy.allow_merge"])
def merge_data(packet_a: dict, packet_b: dict):
    return {"merged": True, "a": packet_a, "b": packet_b}

# 3. Execute with tracking
# Explicitly originate the raw PII record so it starts with the "pii_data" taint
raw_pii = originate({"user": "Alice", "ssn": "123-45-6789"}, taint=["pii_data"])

# Securely process PII and internet data
safe_pii = strip_pii(raw_pii)
# "news" is a raw str; Kest implicitly originates a Passport at the @verified boundary
internet_data = fetch_internet_data("news")

# Lineage and taints are propagated dynamically across the DAG
result = merge_data(safe_pii, internet_data)

# Test System Origin Policy
system_a_data = process_on_system_a(internet_data)
restricted_result = secure_restricted_process(system_a_data)

print(result.data)
# {'merged': True, ...}
print(result.passport.history)
# Contains full DAG of 'originate' -> 'strip_pii', and 'fetch_internet_data' -> 'merge_data'
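To make the merge policy easier to reason about, the decision it encodes can be mirrored in plain Python. This is only a sketch of the rule's intent, not how Kest or OPA evaluates it: the function names are hypothetical, and it assumes that a merged value carries the union of its inputs' taints.

```python
def unsafe_mix(taints):
    """Plain-Python reading of the Rego `unsafe_mix` rule from the Quick Start."""
    return (
        "pii_data" in taints
        and "internet_data" in taints
        and "pii_stripped" not in taints
    )


def allow_merge(*input_taints):
    """Allow the merge unless the combined taints form an unsafe mix."""
    # Assumption: taints accumulate as the union of all inputs' taints.
    merged = set().union(*input_taints)
    return not unsafe_mix(merged)


# Stripped PII merged with internet data, as in the Quick Start.
print(allow_merge({"pii_data", "pii_stripped"}, {"internet_data"}))  # True

# Raw PII merged with internet data is rejected.
print(allow_merge({"pii_data"}, {"internet_data"}))  # False
```

Under this reading, skipping `strip_pii` before `merge_data` is what would cause the `allow_merge` rule to deny the call.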
Manual Origination
For data entering from external or untrusted sources, use originate to define the genesis node:
data = originate(
    {"raw": "payload"},
    taint=["untrusted_source"],
    labels={"env": "prod"},
)
Documentation
For the full technical specification, see Kest v0.1.0 Specification. See the Changelog for a high-level overview of the initial release and version history.
Contributing
We welcome contributions! Please see our Contributor Guide for details on our development process, coding standards, and architectural principles.
File details
Details for the file kest-0.1.0.tar.gz.
File metadata
- Download URL: kest-0.1.0.tar.gz
- Upload date:
- Size: 193.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7954821f0ce89192b5993dd4e004d6dc5bbcd66b7f12681f04874d0a8c6b9715 |
| MD5 | 2286947ed701be9c7e7bca8c0eb2c234 |
| BLAKE2b-256 | da3edf99e62e56abbd117a9dfefaa417216be16bb9bf6c30292a53b9c7da7c14 |
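A downloaded archive can be checked against the SHA256 digest listed above for kest-0.1.0.tar.gz using only the Python standard library. The sketch below assumes the file sits in the current directory under that name; `sha256_of` and `verify` are illustrative helper names, not part of Kest.

```python
import hashlib
from pathlib import Path

# SHA256 digest published for kest-0.1.0.tar.gz in the table above.
EXPECTED_SHA256 = "7954821f0ce89192b5993dd4e004d6dc5bbcd66b7f12681f04874d0a8c6b9715"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest matches the expected value."""
    return sha256_of(path) == expected.lower()


if __name__ == "__main__":
    archive = Path("kest-0.1.0.tar.gz")  # assumed download location
    if archive.exists():
        print("OK" if verify(archive) else "MISMATCH")
    else:
        print(f"{archive} not found")
```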
Provenance
The following attestation bundles were made for kest-0.1.0.tar.gz:
Publisher: release.yml on eterna2/kest
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: kest-0.1.0.tar.gz
  - Subject digest: 7954821f0ce89192b5993dd4e004d6dc5bbcd66b7f12681f04874d0a8c6b9715
- Sigstore transparency entry: 1117463747
- Permalink: eterna2/kest@abc1c3f24d3a83e9d18292251ff0d28fd488b63f
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/eterna2
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@abc1c3f24d3a83e9d18292251ff0d28fd488b63f
- Trigger Event: push
File details
Details for the file kest-0.1.0-py3-none-any.whl.
File metadata
- Download URL: kest-0.1.0-py3-none-any.whl
- Upload date:
- Size: 23.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 69e0bc2bcafe6834f5d4552d2b3608ab21a81fb1383e31ea72127663ab0da1b6 |
| MD5 | 19f9d5b3adc54fdde997fb1c63b0b213 |
| BLAKE2b-256 | 4bd1339b9c6b453a08fe7e24ac0c9dd6eafd35cc5b6aefb05e1a68c56bdd5b6f |
Provenance
The following attestation bundles were made for kest-0.1.0-py3-none-any.whl:
Publisher: release.yml on eterna2/kest
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: kest-0.1.0-py3-none-any.whl
  - Subject digest: 69e0bc2bcafe6834f5d4552d2b3608ab21a81fb1383e31ea72127663ab0da1b6
- Sigstore transparency entry: 1117463761
- Permalink: eterna2/kest@abc1c3f24d3a83e9d18292251ff0d28fd488b63f
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/eterna2
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@abc1c3f24d3a83e9d18292251ff0d28fd488b63f
- Trigger Event: push