A solemn vow on your data. From YAML to verdict.
Trust Your Data. Know Why You Can't.
Open-source data contract enforcement for modern data teams.
Define contracts in YAML. Sync to dbt. Validate in CI. Block bad data before it reaches production.
Official ODCS Vendor — Listed on the Bitol registry alongside Data Contract CLI, Data Caterer, and DQC.ai.
The problem
89% of data teams report pain points with data modeling and ownership. Data contracts are the solution — but the tooling is fragmented:
- dbt tests → SQL only, no formal contract, no pre-ingestion validation
- Great Expectations → verbose Python, steep learning curve, no standard format
- Soda → good YAML checks, but no CI/CD gate, no stakeholder reporting, no ODCS
- Data Contract CLI → ODCS compatible, but no dbt sync, no scoring, no CI gate
DataVow covers the full lifecycle: define → sync dbt → validate → block → report. One tool. One standard.
Quick start
```shell
pip install datavow

# Initialize a project
datavow init my-project

# Define a contract
datavow define contracts/orders.yaml

# Validate data against contracts
datavow validate contracts/orders.yaml --source data/orders.csv

# Generate an HTML report
datavow report contracts/orders.yaml --source data/orders.csv --format html

# Run in CI mode (exit code 1 on critical violations)
datavow ci contracts/ --source data/
```
Key features
YAML-first contracts (ODCS v3.1 native)
Define schemas, quality rules, and SLAs in readable YAML. DataVow supports both its own format and native ODCS v3.1 contracts — auto-detected, no config needed.
```yaml
apiVersion: datavow/v1
kind: DataContract
metadata:
  name: orders
  version: 1.0.0
  owner: data-team@company.com
  domain: sales
schema:
  type: table
  fields:
    - name: order_id
      type: integer
      required: true
      unique: true
    - name: customer_email
      type: string
      required: true
      pii: true
quality:
  rules:
    - name: no_negative_totals
      type: sql
      query: "SELECT COUNT(*) FROM {table} WHERE total_amount < 0"
      threshold: 0
      severity: CRITICAL
```
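To make the `no_negative_totals` rule concrete, here is a minimal sketch of how a SQL rule with a threshold could be evaluated. It is not DataVow's implementation: it uses Python's built-in `sqlite3` as a stand-in for DuckDB, the helper `check_sql_rule` is hypothetical, and it assumes a rule passes when the query's count is less than or equal to `threshold`.

```python
import sqlite3


def check_sql_rule(conn, table, query_template, threshold):
    """Evaluate one contract-style SQL rule (hypothetical helper).

    Assumption: the rule passes when the query result is <= threshold.
    """
    # Substitute the {table} placeholder used in the contract's query.
    query = query_template.format(table=table)
    observed = conn.execute(query).fetchone()[0]
    return observed <= threshold, observed


# Tiny in-memory dataset with one negative total.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, total_amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 19.99), (2, -5.00), (3, 42.00)],
)

passed, observed = check_sql_rule(
    conn,
    "orders",
    "SELECT COUNT(*) FROM {table} WHERE total_amount < 0",
    threshold=0,
)
print(passed, observed)  # False 1
```

With one negative row, the count (1) exceeds the threshold (0), so under the stated assumption the rule fails and would count as one CRITICAL violation.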
datavow dbt sync — the killer feature
One command generates dbt-native tests from your contracts. Works on every dbt adapter — no connector needed.
```shell
# Generate dbt tests from contracts
datavow dbt sync contracts/ --dbt-project-dir .

# Generates generic + singular tests from your contracts
# All tagged `datavow` for easy filtering
```
Vow Score — every validation renders a verdict
Vow Score = 100 - (20 × CRITICAL + 5 × WARNING + 1 × INFO)

| Score | Verdict | Meaning |
|---|---|---|
| 95-100 | ✅ Vow Kept | Fully compliant, ship it |
| 80-94 | ⚠️ Vow Strained | Action needed |
| 50-79 | 🔧 Vow Broken | Blocking issues |
| 0-49 | ❌ Vow Shattered | Critical violations |
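The scoring formula and verdict bands above can be sketched in a few lines of Python. The function names are illustrative, not DataVow's API, and clamping the score at 0 is an assumption: the formula alone can go negative, while the lowest band starts at 0.

```python
def vow_score(critical, warning, info):
    """Compute the Vow Score from violation counts, using the stated weights."""
    raw = 100 - (20 * critical + 5 * warning + 1 * info)
    # Assumption: scores are floored at 0 so they fit the 0-49 band.
    return max(0, raw)


def verdict(score):
    """Map a score to its verdict band."""
    if score >= 95:
        return "Vow Kept"
    if score >= 80:
        return "Vow Strained"
    if score >= 50:
        return "Vow Broken"
    return "Vow Shattered"


# One CRITICAL, two WARNING, three INFO: 100 - (20 + 10 + 3) = 67
score = vow_score(critical=1, warning=2, info=3)
print(score, verdict(score))  # 67 Vow Broken
```

Note that a single CRITICAL violation already drops the score to 80, so a contract with any critical violation can never be reported as Vow Kept.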
CI pipeline gating
Block bad data automatically. No manual intervention.
GitHub Action (Marketplace):
```yaml
- uses: ludovicschmetz-stack/datavow-action@v1
  with:
    contracts: contracts/
    source: data/
    fail-on: critical
    comment-on-pr: "true"
```
dbt on-run-end hook (datavow-dbt):
```yaml
# dbt_project.yml
on-run-end:
  - "{{ datavow_summary() }}"
vars:
  datavow_fail_on: broken  # block pipeline on Vow Broken or worse
```
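The "Vow Broken or worse" behaviour of `datavow_fail_on` amounts to comparing verdicts on an ordered scale. The sketch below illustrates that logic only; it is not the datavow-dbt macro. The value `broken` comes from the config above, while the other level names (`kept`, `strained`, `shattered`) and the function `should_block` are assumptions for illustration.

```python
# Verdict levels ordered from best to worst.
# Assumption: only "broken" is confirmed by the config above;
# the other names are illustrative.
VERDICT_ORDER = ["kept", "strained", "broken", "shattered"]


def should_block(verdict, fail_on):
    """Return True when the verdict is at or below the fail_on threshold."""
    return VERDICT_ORDER.index(verdict) >= VERDICT_ORDER.index(fail_on)


for v in VERDICT_ORDER:
    print(v, should_block(v, fail_on="broken"))
# kept False
# strained False
# broken True
# shattered True
```

A stricter team could set the threshold one level higher so that a strained contract also stops the run, at the cost of blocking on warnings.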
ODCS v3.1 — validate against the official standard
```shell
# Validate a contract against the ODCS v3.1 JSON Schema
datavow odcs check contracts/orders.yaml

# Convert ODCS native → DataVow format
datavow odcs convert contracts/orders-odcs.yaml -o contracts/orders.yaml
```
DataVow bundles the official ODCS v3.1.0 JSON Schema (2928 lines, Draft 2019-09). No other CLI tool does this.
Full command reference
| Command | Description |
|---|---|
| `datavow init` | Initialize project with config and example contract |
| `datavow define` | Create or edit a data contract interactively |
| `datavow validate` | Validate data against contracts |
| `datavow report` | Generate HTML or Markdown reports |
| `datavow ci` | CI mode: validate and return exit code 0 or 1 |
| `datavow dbt generate` | Auto-generate contracts from dbt manifest |
| `datavow dbt validate` | Validate against dbt warehouse (via profiles.yml) |
| `datavow dbt sync` | Generate dbt tests from contracts |
| `datavow dbt ci` | Full pipeline: sync → dbt test → Vow Score |
| `datavow odcs check` | Validate contract against ODCS v3.1 JSON Schema |
| `datavow odcs convert` | Convert ODCS native → DataVow format |
Data sources
DataVow validates files and databases via DuckDB:
| Source | How |
|---|---|
| CSV, Parquet, JSON, TSV | Direct file validation |
| PostgreSQL | datavow validate --source postgresql://... |
| DuckDB | datavow validate --source path/to/db.duckdb |
For cloud warehouses (Snowflake, BigQuery, Redshift, Databricks), use datavow dbt sync — it generates dbt-native tests that run on your existing dbt adapter. No extra connector needed.
Built for your whole team
| Persona | Uses | Gets |
|---|---|---|
| Data Engineer | `datavow ci` in pipeline | Automated quality gate |
| Analytics Engineer | `datavow dbt sync` | One source of truth, zero test duplication |
| Domain Data Owner | YAML contracts in git | Versioned, reviewable data agreements |
| Data Governance | HTML reports | Conformity view across domains |
| Tech Lead | CI gate + Vow Score | No pipeline in prod without a contract |
| Freelance / Consultant | `datavow report` | Quality proof attached to every delivery |
Architecture
```
┌─────────────┐    ┌──────────────┐    ┌─────────────┐
│    YAML     │    │   DataVow    │    │   Outputs   │
│  Contracts  │───▶│    Engine    │───▶│             │
│  (ODCS/DV)  │    │   (DuckDB)   │    │  ✅ Score   │
└─────────────┘    └──────┬───────┘    │  📊 Report  │
                          │            │  🚦 Exit 1  │
              ┌───────────┼──────┐     └─────────────┘
              ▼           ▼      ▼
         CSV/Parquet  PostgreSQL dbt
```
Ecosystem
| Package | Description | Version |
|---|---|---|
| `datavow` | CLI: define, validate, report, CI | v0.4.0 |
| `datavow-action` | GitHub Action: CI gate | v1.0.0 |
| `datavow-dbt` | dbt package: on-run-end Vow Score | v1.0.0 |
Contributing
Contributions are welcome! See CONTRIBUTING.md for guidelines.
```shell
# Development setup
git clone https://github.com/ludovicschmetz-stack/datavow.git
cd datavow
python -m venv .venv && source .venv/bin/activate
uv pip install -e ".[dev]"
pytest  # 137 tests
```
License
Apache 2.0 — free forever. Use it, fork it, ship it.
Website · Documentation · PyPI · Issues