Data Reliability Index - native validation and filtering of datasets.
Data Reliability Index
Data Reliability Index is a Python package for attaching reliability metadata to data points, enforcing trust policies, and filtering unreliable records before they reach analysis or API boundaries.
The package is built around a simple rule: data should carry the evidence needed to decide whether it is safe to use.
Features
- Pydantic models for reliability metadata and policies.
- Scanning engine for computing reliability scores from validation evidence.
- Tiered trust classification with scores from 0 to 100.
- Policy-based acceptance checks for individual records.
- Pandas helpers for filtering DataFrames by reliability metadata.
- FastAPI example for rejecting low-reliability input at ingestion time.
- MkDocs documentation for concepts and API usage.
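The scoring, tiering, and policy features listed above can be pictured with a small standalone sketch. It does not use the package's API. The `Tier` enum, the unweighted averaging, the tier thresholds, and the rule that a lower tier number means more trust are all assumptions made for illustration, and the package's actual scoring rules may differ.

```python
from enum import IntEnum
from typing import Mapping


class Tier(IntEnum):
    """Hypothetical stand-in for the package's DataTier (1 = most trusted, assumed)."""
    TIER_1 = 1
    TIER_2 = 2
    TIER_3 = 3
    TIER_4 = 4


def score_from_evidence(evidence: Mapping[str, float]) -> float:
    """Scale the unweighted mean of evidence values in [0, 1] to a 0-100 score."""
    if not evidence:
        return 0.0
    for name, value in evidence.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"evidence {name!r} must be in [0, 1], got {value}")
    return 100.0 * sum(evidence.values()) / len(evidence)


def tier_for(score: float) -> Tier:
    """Map a 0-100 score to a tier. The cut-offs are illustrative assumptions."""
    if score >= 90:
        return Tier.TIER_1
    if score >= 70:
        return Tier.TIER_2
    if score >= 50:
        return Tier.TIER_3
    return Tier.TIER_4


def accepts(score: float, tier: Tier, minimum_score: float, maximum_tier: Tier) -> bool:
    """Accept a record only if it meets the score floor and the tier ceiling."""
    return score >= minimum_score and tier <= maximum_tier


score = score_from_evidence({"completeness": 1.0, "consistency": 0.5})
tier = tier_for(score)
accepted = accepts(score, tier, minimum_score=90, maximum_tier=Tier.TIER_2)
```

In this sketch the record scores 75, lands in the assumed second tier, and is rejected by a policy that requires a score of at least 90.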
Installation
pip install data-reliability-index
For local development from this repository:
pip install -e ".[test]"
Quick Start
from data_reliability import DataTier, ReliabilityPolicy, ReliabilityScanner, ValidationEvidence

scanner = ReliabilityScanner()
data = scanner.scan(
    {"temperature": 21.4, "unit": "celsius"},
    source_id="sensor-a",
    evidence=ValidationEvidence(
        completeness=1.0,
        consistency=1.0,
        provenance=1.0,
        cryptographic_verification=1.0,
        calibration=1.0,
        schema_compliance=1.0,
        anomaly_detection=1.0,
        duplicate_detection=1.0,
        metadata_quality=1.0,
    ),
)

policy = ReliabilityPolicy(
    minimum_score=90,
    maximum_tier=DataTier.TIER_2,
)

assert policy.resolve(data) == {"temperature": 21.4, "unit": "celsius"}
Pandas Filtering
import pandas as pd

from data_reliability import DataTier, ReliabilityMetadata, ReliabilityPolicy, filter_reliable_df

df = pd.DataFrame([
    {
        "value": 10,
        "reliability": ReliabilityMetadata(
            score=95,
            tier=DataTier.TIER_1,
            source_id="sensor-a",
            trace_hash="abc123",
        ),
    },
])

policy = ReliabilityPolicy(minimum_score=90, maximum_tier=DataTier.TIER_2)
trusted = filter_reliable_df(df, policy)
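To make the idea of row-level filtering concrete, here is a sketch in plain pandas that does not import the package. `Meta` and `filter_by_reliability` are hypothetical stand-ins, and the assumption that tier 1 is the most trusted is ours. The real `filter_reliable_df` may work differently.

```python
from dataclasses import dataclass

import pandas as pd


@dataclass(frozen=True)
class Meta:
    """Hypothetical stand-in for ReliabilityMetadata."""
    score: float
    tier: int  # 1 = most trusted (assumed ordering)


def filter_by_reliability(
    df: pd.DataFrame,
    minimum_score: float,
    maximum_tier: int,
    column: str = "reliability",
) -> pd.DataFrame:
    """Keep rows whose metadata meets the score floor and the tier ceiling."""
    mask = df[column].map(
        lambda meta: meta.score >= minimum_score and meta.tier <= maximum_tier
    )
    return df[mask.astype(bool)]


df = pd.DataFrame([
    {"value": 10, "reliability": Meta(score=95, tier=1)},
    {"value": 11, "reliability": Meta(score=40, tier=4)},
])

# Only the first row satisfies both conditions.
trusted = filter_by_reliability(df, minimum_score=90, maximum_tier=2)
```

Storing the metadata object in its own column keeps the evidence attached to each row, so the same DataFrame can be filtered again under a stricter or looser policy.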
Documentation
The project documentation lives in docs/ and can be served locally with MkDocs:
pip install mkdocs-material
mkdocs serve
Start with:
The longer project rationale is available in data-reliability.md.
Development
Run the test suite:
pip install -e ".[test]"
pytest
Build package artifacts:
pip install -e ".[build]"
python -m build
python -m twine check dist/*
Run the FastAPI example:
uvicorn examples.fastapi_app:app --reload
Contributing
Contributions are welcome. See CONTRIBUTING.md for the local workflow and pull request expectations.
Security
Please report security issues privately. See SECURITY.md.
License
Licensed under the Apache License 2.0.
File details
Details for the file data_reliability_index-0.2.0.tar.gz.
File metadata
- Download URL: data_reliability_index-0.2.0.tar.gz
- Upload date:
- Size: 11.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3065a561cde34e5835c372199f330184efba159a971f263ffe0a2f367538957d |
| MD5 | 9822e41307b520647adcb8e0637b3899 |
| BLAKE2b-256 | ff82fc3809a04c7374f4c27cf8761540e40f8fcb4bfe17ca26f61ea5606895c3 |
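A downloaded file can be checked against a published SHA256 digest with the standard library alone. The following is a minimal sketch. The helper names `sha256_of` and `matches_published_digest` are illustrative and not part of the package.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: str, expected_hex: str) -> bool:
    """Compare the file's digest with the published one, ignoring case and padding."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

For example, after downloading the source distribution into the current directory, `matches_published_digest("data_reliability_index-0.2.0.tar.gz", "3065a561cde34e5835c372199f330184efba159a971f263ffe0a2f367538957d")` should return `True` if the file is intact.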
Provenance
The following attestation bundles were made for data_reliability_index-0.2.0.tar.gz:
Publisher: publish.yml on h3pdesign/data-reliability-index
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: data_reliability_index-0.2.0.tar.gz
- Subject digest: 3065a561cde34e5835c372199f330184efba159a971f263ffe0a2f367538957d
- Sigstore transparency entry: 2022308999
- Sigstore integration time:
- Permalink: h3pdesign/data-reliability-index@0e16e54748445c59eb6d4cb6946d0ece4a1bfaf7
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/h3pdesign
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@0e16e54748445c59eb6d4cb6946d0ece4a1bfaf7
- Trigger Event: release
File details
Details for the file data_reliability_index-0.2.0-py3-none-any.whl.
File metadata
- Download URL: data_reliability_index-0.2.0-py3-none-any.whl
- Upload date:
- Size: 11.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e08662023e80d10870c2080863840ec875227e83592b5c9e2f80cd60ae63b6f3 |
| MD5 | fd0061384f71ba09fabf2e0d8aa53785 |
| BLAKE2b-256 | 030903f385ac3636571eff19054146c26cac0d7890f4f34c7f78e8643b10f82b |
Provenance
The following attestation bundles were made for data_reliability_index-0.2.0-py3-none-any.whl:
Publisher: publish.yml on h3pdesign/data-reliability-index
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: data_reliability_index-0.2.0-py3-none-any.whl
- Subject digest: e08662023e80d10870c2080863840ec875227e83592b5c9e2f80cd60ae63b6f3
- Sigstore transparency entry: 2022309152
- Sigstore integration time:
- Permalink: h3pdesign/data-reliability-index@0e16e54748445c59eb6d4cb6946d0ece4a1bfaf7
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/h3pdesign
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@0e16e54748445c59eb6d4cb6946d0ece4a1bfaf7
- Trigger Event: release