Auto-generated data quality testing — find data problems automatically, no config, no test writing.
DQLens
Find data problems automatically — no config, no test writing.
DQLens auto-generates data quality tests by profiling your database. You do not write YAML, Python test code, or configuration files: point it at your database and it reports the data quality issues it finds.
Quick Start
pip install dqlens
# Initialize (stores connection config)
dqlens init postgres://localhost/mydb --schema public
# Profile your database (auto-generates tests)
dqlens profile
# Run checks and see problems
dqlens run
What It Does
DQLens connects to your database, profiles every table, and automatically generates tests based on what it finds:
- Null anomalies — detects columns with unexpected null rates or null rate drift
- Uniqueness violations — finds duplicate values in columns that should be unique
- Foreign key mismatches — discovers orphaned rows referencing non-existent records
- Pattern violations — identifies columns where values don't match detected patterns (email, UUID, URL, etc.)
- Row count anomalies — flags unusual growth or shrinkage compared to baseline
- Freshness checks — alerts when data hasn't been updated recently
- Distribution shifts — catches value range changes between profiles
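To make two of these checks concrete, here is a minimal sketch of how a null-rate check and an orphaned-row check could be written as plain SQL against SQLite. It is not DQLens's implementation: the helper names `null_rate` and `orphan_count`, and the `customers` / `orders` tables, are hypothetical and exist only for illustration.

```python
import sqlite3


def null_rate(conn, table, column):
    """Fraction of rows in `table` where `column` IS NULL (0.0 for an empty table)."""
    # Identifiers are interpolated directly, so only pass trusted names.
    total, nulls = conn.execute(
        f'SELECT COUNT(*), SUM("{column}" IS NULL) FROM "{table}"'
    ).fetchone()
    return (nulls or 0) / total if total else 0.0


def orphan_count(conn, child, fk_column, parent, pk_column):
    """Number of child rows whose non-null foreign key has no matching parent row."""
    query = (
        f'SELECT COUNT(*) FROM "{child}" c '
        f'LEFT JOIN "{parent}" p ON c."{fk_column}" = p."{pk_column}" '
        f'WHERE c."{fk_column}" IS NOT NULL AND p."{pk_column}" IS NULL'
    )
    return conn.execute(query).fetchone()[0]


# Hypothetical sample data: one NULL email and one order pointing at a missing customer.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, email TEXT);
    INSERT INTO customers (id) VALUES (1), (2);
    INSERT INTO orders (id, customer_id, email) VALUES
        (1, 1, 'a@example.com'),
        (2, 2, NULL),
        (3, 99, 'c@example.com'),
        (4, 1, 'd@example.com');
    """
)

email_null_rate = null_rate(conn, "orders", "email")
orphans = orphan_count(conn, "orders", "customer_id", "customers", "id")
print(f"email null rate: {email_null_rate:.2%}")
print(f"orphaned orders: {orphans}")
```

A drift check would then compare a value such as `email_null_rate` against the one stored from an earlier profile, rather than against a fixed threshold.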
Signal Over Coverage
DQLens shows problems first, not 20 green checkmarks:
public.orders: 14 tests, 11 passed, 3 PROBLEMS FOUND
PROBLEMS:
HIGH customer_id: 142 rows reference non-existent customers (FK mismatch)
HIGH email: 3.2% null (was 0.1% in baseline) — 32x increase
MEDIUM orders grew 47% today (usual daily growth: 2-5%)
✓ 11 checks passed (use --verbose to see all)
Every finding includes:
- Severity level (HIGH / MEDIUM / LOW)
- Explanation of why it was flagged
- Baseline comparison when available
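As an illustration of how a finding could carry these three pieces of information, the sketch below builds one from a current and a baseline null rate. The `Finding` class, the `null_rate_finding` helper, and the severity cut-offs are all assumptions made for this example; DQLens's actual data model and thresholds may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Finding:
    severity: str                     # "HIGH", "MEDIUM", or "LOW"
    column: str
    explanation: str                  # why the check was flagged
    baseline: Optional[float] = None  # previous value, when a baseline exists


def null_rate_finding(column, current, baseline=None):
    """Build a Finding for a null-rate change; the thresholds are illustrative only."""
    if not baseline:  # None or 0.0: no ratio can be computed
        return Finding("LOW", column, f"{current:.1%} null (no usable baseline)")
    ratio = current / baseline
    detail = (
        f"{current:.1%} null (was {baseline:.1%} in baseline), "
        f"{ratio:.1f}x the baseline rate"
    )
    # Hypothetical cut-offs: 10x or more is HIGH, 2x or more is MEDIUM.
    if ratio >= 10:
        severity = "HIGH"
    elif ratio >= 2:
        severity = "MEDIUM"
    else:
        severity = "LOW"
    return Finding(severity, column, detail, baseline)


finding = null_rate_finding("email", current=0.032, baseline=0.001)
print(f"{finding.severity} {finding.column}: {finding.explanation}")
```

Keeping the baseline value on the finding itself means the explanation can always be regenerated or reformatted later, for example when emitting JSON instead of text.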
Commands
| Command | Description |
|---|---|
| `dqlens init <url>` | Initialize config with database connection |
| `dqlens profile` | Profile tables and save baseline |
| `dqlens run` | Run checks, show problems |
| `dqlens run --verbose` | Show all checks including passing |
| `dqlens run --focus high` | Only HIGH severity findings |
| `dqlens run --ci` | Exit code 1 on failure (for CI/CD) |
| `dqlens run --json-output` | Output as JSON |
| `dqlens ignore <key>` | Suppress a known finding |
Python API
import dqlens

# Profile the database, then run the generated checks
suite = dqlens.profile("postgres://localhost/mydb", schema="public")
results = suite.run()

# Print one line per failed test
for table in results:
    for test in table.tests:
        if test.failed:
            print(f"{table.name}.{test.column}: {test.message}")
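The same loop can be wrapped in a small helper when the failures need to be collected rather than printed, for instance to post them to a chat channel. The sketch below is hypothetical: `failed_messages` is not part of DQLens, and the stand-in objects only mimic the attributes used in the example above (`name`, `tests`, `column`, `failed`, `message`) so that the snippet runs without a database.

```python
from types import SimpleNamespace


def failed_messages(results):
    """Return one 'table.column: message' string per failed test."""
    return [
        f"{table.name}.{test.column}: {test.message}"
        for table in results
        for test in table.tests
        if test.failed
    ]


# Stand-in objects that mimic only the attributes read by failed_messages.
fake_results = [
    SimpleNamespace(
        name="public.orders",
        tests=[
            SimpleNamespace(column="email", failed=True, message="3.2% null"),
            SimpleNamespace(column="id", failed=False, message="unique"),
        ],
    )
]

messages = failed_messages(fake_results)
for line in messages:
    print(line)
```

With a real run, the value returned by `suite.run()` would be passed in place of `fake_results`, assuming it exposes the attributes shown in the example above.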
Supported Databases
- PostgreSQL
- SQLite
- MySQL, Parquet, CSV (coming soon)
Development
# Clone and install
git clone https://github.com/vahid110/dqlens.git
cd dqlens
pip install -e ".[dev]"
# Run unit tests (no database needed)
pytest tests/ -k "unit" -v
# Run integration tests (needs PostgreSQL — see .env.example)
pytest tests/ -k "integration" -v
# Run all tests
pytest tests/ -v
Demo
See demo/README.md for a 5-minute walkthrough with a local PostgreSQL database.
License
MIT