Prevent production API breaks by validating data contracts between DBT models and API frameworks
🛡️ Data Contract Validator
Prevent production API breaks by validating data contracts between your data pipelines and API frameworks
🎯 What This Solves
Ever deployed a DBT model change only to break your FastAPI service in production? This tool prevents that by validating the data contracts between your data pipelines and your APIs before deployment.
DBT Models        Contract        FastAPI Models
(What data        Validator       (What APIs
 produces)        VALIDATES        expect)
     ↓                ↓                ↓
  Schema            Finds           Schema
 Extraction       Mismatches      Extraction
⚡ Quick Start
Installation
pip install data-contract-validator
Basic Usage
# Validate local DBT project against FastAPI models
contract-validator validate \
--dbt-project ./my-dbt-project \
--fastapi-models ./my-api/models.py
# Validate across repositories (perfect for microservices)
contract-validator validate \
--dbt-project . \
--fastapi-repo "my-org/my-api-repo" \
--fastapi-path "app/models.py"
GitHub Actions Integration
# .github/workflows/validate-contracts.yml
name: Validate Data Contracts
on: [pull_request]
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v4
        with:
          python-version: '3.9'
      - name: Install validator
        run: pip install data-contract-validator
      - name: Validate contracts
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          contract-validator validate \
            --dbt-project . \
            --fastapi-repo "my-org/my-api" \
            --github-token "$GITHUB_TOKEN"
What It Validates
❌ Critical Issues (Block Deployment)
- Missing tables: API expects user_analytics but DBT doesn't provide it
- Missing required columns: API requires total_revenue but DBT model doesn't have it
⚠️ Warnings (Non-blocking)
- Type mismatches: DBT provides varchar but API expects integer
- Missing optional columns: API can handle missing optional fields
ℹ️ Info (Good to Know)
- Extra columns: DBT provides columns that API doesn't use
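The three severity levels above can be sketched as a small comparison routine. This is an illustration only, not the package's implementation: the `Column`, `Issue`, and `compare` names and the nested-dict schema shape are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Column:
    type: str
    required: bool = True


@dataclass
class Issue:
    severity: str  # "critical", "warning", or "info"
    table: str
    column: str
    message: str


# Hypothetical schema shape: table name -> column name -> Column
Schema = Dict[str, Dict[str, Column]]


def compare(source: Schema, target: Schema) -> List[Issue]:
    """Compare what the source provides with what the target expects."""
    issues: List[Issue] = []
    for table, expected in target.items():
        if table not in source:
            issues.append(Issue("critical", table, "", "missing table"))
            continue
        provided = source[table]
        for name, col in expected.items():
            if name not in provided:
                # Required columns block deployment; optional ones only warn.
                severity = "critical" if col.required else "warning"
                kind = "required" if col.required else "optional"
                issues.append(Issue(severity, table, name, f"missing {kind} column"))
            elif provided[name].type != col.type:
                message = f"type mismatch ({provided[name].type} vs {col.type})"
                issues.append(Issue("warning", table, name, message))
        for name in provided:
            if name not in expected:
                issues.append(Issue("info", table, name, "extra column"))
    return issues
```

A type comparison this literal would flag `varchar` against `str` as a mismatch, so a real validator presumably normalizes warehouse and Python type names before comparing them.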
🎯 Real-World Example
Before (Production Breaks) 💥
-- DBT model changes
select
    user_id,
    email,
    -- total_orders,  ❌ REMOVED this column
    revenue
from users
# FastAPI model (unchanged)
from pydantic import BaseModel

class UserAnalytics(BaseModel):
    user_id: str
    email: str
    total_orders: int  # ❌ Still expects this!
    revenue: float
Result: API breaks in production
After (Caught by Validator) ✅
❌ VALIDATION FAILED
💥 user_analytics.total_orders: FastAPI REQUIRES column but DBT removed it
🔧 Fix: Add 'total_orders' back to DBT model or update FastAPI model
Result: Issue caught in CI/CD, production safe! 🛡️
Supported Frameworks
Data Sources
- ✅ DBT (dbt-core, all adapters)
- Databricks (coming soon)
- Airflow (coming soon)
API Frameworks
- ✅ FastAPI (Pydantic + SQLModel)
- Django (coming soon)
- Flask-SQLAlchemy (coming soon)
Want to add support for your framework? See extending guide
📦 Installation Options
Option 1: PyPI (Recommended)
pip install data-contract-validator
Option 2: From Source
git clone https://github.com/your-org/data-contract-validator
cd data-contract-validator
pip install -e .
Option 3: GitHub Actions Only
- name: Validate Contracts
  uses: your-org/data-contract-validator@v1
  with:
    dbt-project: '.'
    fastapi-repo: 'my-org/my-api'
🔧 Configuration
Command Line
# Flags: --dbt-project is the DBT project path, --fastapi-repo the GitHub repo,
# --fastapi-path the path to the models file, --github-token a token for
# private repos, and --output the output format.
contract-validator validate \
  --dbt-project ./dbt-project \
  --fastapi-repo "org/repo" \
  --fastapi-path "app/models.py" \
  --github-token "$GITHUB_TOKEN" \
  --output json
Configuration File
# .contract-validator.yml
version: '1.0'
sources:
  dbt:
    project_path: './dbt-project'
    auto_update_schemas: true
targets:
  fastapi:
    repo: 'my-org/my-api'
    path: 'app/models.py'
validation:
  fail_on: ['missing_tables', 'missing_required_columns']
  warn_on: ['type_mismatches', 'missing_optional_columns']
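One plausible reading of the `validation` section is that `fail_on` kinds produce a non-zero exit code while `warn_on` kinds are only reported. The sketch below illustrates that reading; `triage` and `exit_code` are hypothetical helpers, `extra_columns` is an invented issue kind used for the example, and the config is written as a plain dict to avoid a YAML dependency.

```python
from typing import Dict, List

# Mirrors the `validation` section of .contract-validator.yml.
VALIDATION = {
    "fail_on": ["missing_tables", "missing_required_columns"],
    "warn_on": ["type_mismatches", "missing_optional_columns"],
}


def triage(issue_kinds: List[str], validation: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Sort issue kinds into blocking, warning, and ignored buckets."""
    fail_on = set(validation.get("fail_on", []))
    warn_on = set(validation.get("warn_on", []))
    buckets: Dict[str, List[str]] = {"fail": [], "warn": [], "ignore": []}
    for kind in issue_kinds:
        if kind in fail_on:
            buckets["fail"].append(kind)
        elif kind in warn_on:
            buckets["warn"].append(kind)
        else:
            buckets["ignore"].append(kind)
    return buckets


def exit_code(issue_kinds: List[str], validation: Dict[str, List[str]]) -> int:
    """Return 1 when any issue kind is listed under fail_on, otherwise 0."""
    return 1 if triage(issue_kinds, validation)["fail"] else 0
```

Under this reading, moving `type_mismatches` from `warn_on` to `fail_on` would be enough to make type drift block a pull request.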
Output Formats
Terminal (Default)
Contract Validation Results:
❌ CRITICAL ISSUES:
  💥 user_analytics.total_revenue: FastAPI expects this column but DBT doesn't provide it
  🔧 Fix: Add 'total_revenue' to your DBT model
✅ VALIDATION PASSED (with warnings)
GitHub Actions
::error::user_analytics.total_revenue: Missing required column
::warning::user_analytics.age: Type mismatch (varchar vs integer)
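The two lines above use GitHub Actions workflow commands, which turn log output into annotations on the pull request. A minimal sketch of producing them from an issue follows; the function names are hypothetical, and mapping any other severity to `notice` is an assumption made for the example.

```python
def escape_data(message: str) -> str:
    """Escape characters that GitHub treats specially in workflow command data."""
    return message.replace("%", "%25").replace("\r", "%0D").replace("\n", "%0A")


def to_workflow_command(severity: str, table: str, column: str, message: str) -> str:
    """Format one issue as a GitHub Actions workflow command line."""
    # Assumption: anything that is not an error or warning becomes a notice.
    level = severity if severity in ("error", "warning") else "notice"
    text = f"{table}.{column}: {message}"
    return f"::{level}::{escape_data(text)}"


if __name__ == "__main__":
    print(to_workflow_command("error", "user_analytics", "total_revenue",
                              "Missing required column"))
```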
JSON
{
  "success": false,
  "issues": [
    {
      "severity": "error",
      "table": "user_analytics",
      "column": "total_revenue",
      "message": "FastAPI expects column but DBT doesn't provide it",
      "suggestion": "Add 'total_revenue' to your DBT model"
    }
  ]
}
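A JSON report in this shape is easy to consume from another script, for example to post a summary comment or feed a dashboard. The sketch below assumes only the fields shown in the sample; `blocking_issues` is a hypothetical helper, not part of the package.

```python
import json
from typing import List

# Sample report copied from the JSON output format above.
REPORT = """
{
  "success": false,
  "issues": [
    {
      "severity": "error",
      "table": "user_analytics",
      "column": "total_revenue",
      "message": "FastAPI expects column but DBT doesn't provide it",
      "suggestion": "Add 'total_revenue' to your DBT model"
    }
  ]
}
"""


def blocking_issues(report_text: str) -> List[dict]:
    """Return the issues whose severity is "error"."""
    report = json.loads(report_text)
    return [issue for issue in report.get("issues", []) if issue.get("severity") == "error"]


if __name__ == "__main__":
    for issue in blocking_issues(REPORT):
        print(f"{issue['table']}.{issue['column']}: {issue['suggestion']}")
```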
🏗️ Architecture
# Simple, extensible architecture
from data_contract_validator import ContractValidator
from data_contract_validator.extractors import DBTExtractor, FastAPIExtractor

# Initialize extractors
dbt = DBTExtractor(project_path='./dbt-project')
fastapi = FastAPIExtractor(repo='my-org/my-api', path='app/models.py')

# Run validation
validator = ContractValidator(source=dbt, target=fastapi)
result = validator.validate()
if not result.success:
    print(f"❌ {len(result.critical_issues)} critical issues found")
    for issue in result.critical_issues:
        print(f"💥 {issue.table}.{issue.column}: {issue.message}")
🤝 Contributing
We love contributions! See CONTRIBUTING.md for guidelines.
Quick Setup
git clone https://github.com/your-org/data-contract-validator
cd data-contract-validator
pip install -e ".[dev]"
pytest
Adding New Extractors
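from typing import Dict
from data_contract_validator.extractors import BaseExtractor

class MyFrameworkExtractor(BaseExtractor):
    # "Schema" is quoted because its import path is not shown here.
    def extract_schemas(self) -> Dict[str, "Schema"]:
        # Your implementation: map each table name to its Schema
        schemas: Dict[str, "Schema"] = {}
        return schemas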
Success Stories
"We prevented 15 production incidents in our first month using this tool. It's now required in all our data pipeline PRs."
-- Data Engineering Team, TechCorp
"Finally! A tool that validates the contract between our DBT models and FastAPI services. No more surprise 500 errors."
-- Platform Team, StartupCo
Documentation
- Installation Guide
- Configuration Reference
- GitHub Actions Setup
- Extending with New Extractors
- API Reference
License
MIT License - see LICENSE file for details.
Support
- Bug reports: GitHub Issues
- 💡 Feature requests: GitHub Discussions
- 📧 Email: your-email@example.com
⭐ Star History
If this tool helps you prevent production incidents, please star the repo! ⭐
Built with ❤️ by data engineers, for data engineers.
File details
Details for the file data_contract_validator-1.0.0.tar.gz.
File metadata
- Download URL: data_contract_validator-1.0.0.tar.gz
- Upload date:
- Size: 24.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c107c8d04f9936468b4f751f0b5b402f46e4bc8fb9b0c8b1b932f74ae9e83a34 |
| MD5 | 98d6bcd62181f85e32d513272cdaa7d9 |
| BLAKE2b-256 | 61345fdda55ed5ad3cbc6d731039cb42e60223c11a1025c0f647d1f7c71121ba |
File details
Details for the file data_contract_validator-1.0.0-py3-none-any.whl.
File metadata
- Download URL: data_contract_validator-1.0.0-py3-none-any.whl
- Upload date:
- Size: 23.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | dcca25302a0734484756f67af4539224feb0fc339a33718bf5439678df27be66 |
| MD5 | c2d4c60ecfca82e4705551e4560af661 |
| BLAKE2b-256 | e3885db7cd62a574fbba2591c48dac32d7882114e02a869e98d5177d2172abdd |