High-Performance Version Control for Database Schemas with Intelligent Processing
# Datatrack - Version Control for Database Schemas
A high-performance CLI tool that brings Git-like version control to your database schemas and automatically picks a processing strategy based on schema size. Built for data engineers, analytics engineers, and platform teams.
## Key Features
- High Performance: up to 70-75% faster schema introspection on large schemas (200+ tables) compared with standard processing
- Intelligent Processing: Auto-selects optimal strategy based on schema size
- Multi-Database Support: PostgreSQL, MySQL, SQLite, SQL Server
- Schema Comparison: Generate detailed diffs between versions
- Quality Linting: Enforce naming conventions and best practices
- Multiple Export Formats: JSON, YAML, Markdown, HTML
## Performance Improvements
| Schema Size | Processing Method | Performance Gain |
|---|---|---|
| 1-49 tables | Standard | Baseline |
| 50-199 tables | Parallel (4 workers) | 65-70% faster |
| 200+ tables | Parallel + Batched | 70-75% faster |
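The tiers in the table can be read as a simple selection rule. The sketch below is illustrative only: `Strategy` and `choose_strategy` are hypothetical names rather than Datatrack's internal API, and the worker count for the 200+ tier is an assumption, since the table does not state one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    """Hypothetical description of how a snapshot would be processed."""
    name: str
    workers: int
    batched: bool


def choose_strategy(table_count: int) -> Strategy:
    """Map a table count onto the tiers listed in the table above."""
    if table_count < 0:
        raise ValueError("table_count must be non-negative")
    if table_count < 50:
        return Strategy("standard", workers=1, batched=False)
    if table_count < 200:
        return Strategy("parallel", workers=4, batched=False)
    # Assumption: the top tier keeps 4 workers; the table does not say.
    return Strategy("parallel+batched", workers=4, batched=True)


for count in (10, 120, 500):
    print(count, choose_strategy(count).name)
```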
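## Installation
Install the released package from PyPI:
```bash
pip install datatrack-core
```
Alternatively, from a local checkout of the source code, install in editable mode:
```bash
pip install -e .
```
The editable install is ideal if you want to contribute to or modify the tool.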
## Helpful Commands
Datatrack comes with built-in help and guidance for every command. Use this to quickly learn syntax and options:
```bash
datatrack --help
# or
datatrack -h
```
## How to Use
### 1. Initialize Tracking
```bash
datatrack init
```
Creates .datatrack/, .databases/, and optional initial files.
### 2. Connect to a Database
Save your database connection for future use.

MySQL:
```bash
datatrack connect mysql+pymysql://root:<password>@localhost:3306/<database-name>
```
PostgreSQL:
```bash
datatrack connect postgresql+psycopg2://postgres:<password>@localhost:5432/<database-name>
```
SQLite:
```bash
datatrack connect sqlite:///.databases/<database-name>
```
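The connection strings above follow the SQLAlchemy-style `dialect+driver://user:password@host:port/database` layout, in which characters such as `@` or `/` inside a password generally need percent-encoding. A minimal sketch of assembling such a URL with the standard library is shown here; `build_db_url` is a hypothetical helper, not part of Datatrack, and the credentials are made up.

```python
from urllib.parse import quote_plus


def build_db_url(scheme: str, user: str, password: str,
                 host: str, port: int, database: str) -> str:
    """Assemble a dialect+driver URL, percent-encoding the credentials."""
    return (
        f"{scheme}://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# Example values are placeholders, not real credentials.
url = build_db_url("postgresql+psycopg2", "postgres", "p@ss/word",
                   "localhost", 5432, "shop")
print(url)  # postgresql+psycopg2://postgres:p%40ss%2Fword@localhost:5432/shop
```

The resulting string can be passed to `datatrack connect`; quoting it in the shell avoids surprises with special characters.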
### 3. Take a Schema Snapshot
```bash
# Standard snapshot
datatrack snapshot
# High-performance snapshot with parallel processing
datatrack snapshot --parallel
# Custom performance configuration
datatrack snapshot --parallel --max-workers 8 --batch-size 50
# For large schemas (200+ tables) - automatically optimized
datatrack snapshot  # Auto-enables parallel + batched processing
```
Saves the current schema to `.databases/exports/<db_name>/snapshots/`.
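To illustrate what `--max-workers` and `--batch-size` plausibly control, here is a sketch of batched, thread-based processing. It is not Datatrack's implementation: `introspect_table` is a stand-in that returns dummy data, and mapping the two flags onto the `max_workers` and `batch_size` parameters is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Iterator


def batches(items: list[str], batch_size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def introspect_table(name: str) -> dict:
    """Stand-in for a real per-table metadata query (hypothetical)."""
    return {"table": name, "columns": []}


def snapshot_tables(tables: list[str], max_workers: int = 8,
                    batch_size: int = 50) -> list[dict]:
    """Process tables batch by batch, fanning each batch out to a thread pool."""
    results: list[dict] = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for batch in batches(tables, batch_size):
            # pool.map preserves input order within the batch.
            results.extend(pool.map(introspect_table, batch))
    return results


tables = [f"table_{i}" for i in range(120)]
snapshot = snapshot_tables(tables)
print(len(snapshot))  # 120
```

Batching bounds how much work is queued at once, while the worker count bounds how many tables are queried concurrently.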
### 4. Lint the Schema
```bash
datatrack lint
```
Detects issues in naming and structure.
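As an illustration of the kind of naming check a schema linter might run, the sketch below flags identifiers that are not lowercase snake_case. The rule and the `find_naming_issues` helper are assumptions made for illustration; Datatrack's actual lint rules may differ.

```python
import re

# Lowercase words separated by single underscores, e.g. "user_id".
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


def find_naming_issues(schema: dict[str, list[str]]) -> list[str]:
    """Return one message per table or column name that is not snake_case."""
    issues = []
    for table, columns in schema.items():
        if not SNAKE_CASE.match(table):
            issues.append(f"table '{table}' is not snake_case")
        for column in columns:
            if not SNAKE_CASE.match(column):
                issues.append(f"column '{table}.{column}' is not snake_case")
    return issues


# Hypothetical schema: table name -> column names.
schema = {"orders": ["id", "createdAt"], "UserAccounts": ["user_id"]}
for issue in find_naming_issues(schema):
    print(issue)
```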
### 5. Verify Schema Rules
```bash
datatrack verify
```
Validates the schema against the rules defined in `schema_rules.yaml`.
### 6. View Schema Differences
```bash
datatrack diff
```
Shows table and column changes between the latest two snapshots.
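Conceptually, a schema diff compares two snapshots table by table and column by column. The sketch below assumes a snapshot is a plain mapping of table name to column names, which is a simplification and not necessarily Datatrack's snapshot format; `diff_snapshots` is a hypothetical helper.

```python
def diff_snapshots(old: dict[str, list[str]],
                   new: dict[str, list[str]]) -> dict:
    """Report added/removed tables and, for shared tables, added/removed columns."""
    changes = {
        "tables_added": sorted(new.keys() - old.keys()),
        "tables_removed": sorted(old.keys() - new.keys()),
        "columns_changed": {},
    }
    for table in sorted(old.keys() & new.keys()):
        added = sorted(set(new[table]) - set(old[table]))
        removed = sorted(set(old[table]) - set(new[table]))
        if added or removed:
            changes["columns_changed"][table] = {"added": added, "removed": removed}
    return changes


# Two made-up snapshots for demonstration.
old = {"users": ["id", "name"], "legacy_log": ["id"]}
new = {"users": ["id", "name", "email"], "orders": ["id", "user_id"]}
result = diff_snapshots(old, new)
print(result)
```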
### 7. Export Snapshots or Diffs
```bash
# Export the latest snapshot as YAML (default)
datatrack export
# Explicitly export the snapshot as YAML
datatrack export --type snapshot --format yaml
# Export the latest diff as JSON
datatrack export --type diff --format json
```
Output is saved in `.databases/exports/<db_name>/`.
### 8. View Snapshot History
```bash
datatrack history
```
Displays all snapshot timestamps and table counts.
### 9. Run the Full Pipeline
```bash
datatrack pipeline run
```
Runs lint, snapshot, verify, diff, and export together.
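In a CI job, the same stages could also be driven one at a time so that a failing step stops the run. The sketch below shells out to the subcommands documented above using `subprocess.run`. The `run_stages` helper and its injectable `runner` parameter are illustrative, and it assumes that `datatrack` is on `PATH` and exits with a non-zero code on failure, which the source does not confirm.

```python
import subprocess
from typing import Callable, Optional, Sequence

# Stage order as listed for `datatrack pipeline run`.
STAGES = ["lint", "snapshot", "verify", "diff", "export"]


def run_stages(runner: Optional[Callable[[Sequence[str]], int]] = None) -> list[str]:
    """Run each stage in order and stop at the first non-zero exit code.

    Returns the stages that completed successfully.
    """
    if runner is None:
        # Default runner invokes the real CLI (assumes it is installed).
        runner = lambda cmd: subprocess.run(cmd).returncode
    completed = []
    for stage in STAGES:
        if runner(["datatrack", stage]) != 0:
            break
        completed.append(stage)
    return completed


# Dry run with a fake runner that simulates a failure at "verify".
fake_runner = lambda cmd: 1 if cmd[1] == "verify" else 0
print(run_stages(fake_runner))  # ['lint', 'snapshot']
```

Injecting the runner keeps the sketch testable without a database or an installed CLI.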
For advanced use cases and integration into CI/CD, visit:
File details
Details for the file datatrack_core-1.1.1.tar.gz.
File metadata
- Download URL: datatrack_core-1.1.1.tar.gz
- Upload date:
- Size: 46.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6bd25715c0b50fa38c896ccd166b724ffc69fb4b1f432da0088e459f029abd57 |
| MD5 | 7f7fdadf8a20497b858ac2150a7290ac |
| BLAKE2b-256 | 3c4b75c693c5f17c544b8340846e79d209c39cb0c680b2086379b3e20aa4542a |
File details
Details for the file datatrack_core-1.1.1-py3-none-any.whl.
File metadata
- Download URL: datatrack_core-1.1.1-py3-none-any.whl
- Upload date:
- Size: 35.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 04f1508dc28e4d91185b5bb6b0ece35eb30f5b0198fed00d490bfaebc2d132a6 |
| MD5 | 7861584295ce898847726d251928886d |
| BLAKE2b-256 | 1880f0916427b62c69730836ea1cbe66d751a693e18d1405494ab7b462b5353c |