Auto-generated data quality testing — find data problems automatically, no config, no test writing.

DQLens

Find data problems automatically — no config, no test writing.

DQLens auto-generates data quality tests by profiling your database. There are no YAML files to maintain and no test code to write: point it at your database and it reports the data quality issues it finds.

Quick Start

pip install dqlens

# Initialize (stores connection config)
dqlens init postgres://localhost/mydb --schema public

# Profile your database (auto-generates tests)
dqlens profile

# Run checks and see problems
dqlens run

What It Does

DQLens connects to your database, profiles every table, and automatically generates tests based on what it finds:

  • Null anomalies — detects columns with unexpected null rates or null rate drift
  • Uniqueness violations — finds duplicate values in columns that should be unique
  • Foreign key mismatches — discovers orphaned rows referencing non-existent records
  • Pattern violations — identifies columns where values don't match detected patterns (email, UUID, URL, etc.)
  • Row count anomalies — flags unusual growth or shrinkage compared to baseline
  • Freshness checks — alerts when data hasn't been updated recently
  • Distribution shifts — catches value range changes between profiles
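To make two of these checks concrete, here is a minimal sketch of a null-rate drift check and an orphaned-row (foreign key mismatch) check, written against an in-memory SQLite database. It illustrates the general technique only and is not DQLens's implementation. The helper names `null_rate` and `orphan_count`, the sample tables, the baseline value, and the 0.05 drift threshold are all assumptions made for the example.

```python
import sqlite3


def null_rate(conn, table, column):
    """Fraction of rows where `column` is NULL (0.0 for an empty table)."""
    # Identifiers are interpolated directly, so only pass trusted names.
    total, nulls = conn.execute(
        f"SELECT COUNT(*), SUM(CASE WHEN {column} IS NULL THEN 1 ELSE 0 END) "
        f"FROM {table}"
    ).fetchone()
    return (nulls or 0) / total if total else 0.0


def orphan_count(conn, child, fk, parent, pk):
    """Count child rows whose non-NULL foreign key matches no parent row."""
    (count,) = conn.execute(
        f"SELECT COUNT(*) FROM {child} c "
        f"LEFT JOIN {parent} p ON c.{fk} = p.{pk} "
        f"WHERE c.{fk} IS NOT NULL AND p.{pk} IS NULL"
    ).fetchone()
    return count


# Sample data (hypothetical): one NULL email, one order pointing at customer 3.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, email TEXT);
    INSERT INTO customers (id) VALUES (1), (2);
    INSERT INTO orders (customer_id, email) VALUES
        (1, 'a@example.com'), (2, NULL), (3, 'c@example.com'), (1, 'd@example.com');
""")

baseline_null_rate = 0.0  # hypothetical value saved by an earlier profile
current = null_rate(conn, "orders", "email")
drifted = current - baseline_null_rate > 0.05  # illustrative threshold
orphans = orphan_count(conn, "orders", "customer_id", "customers", "id")
print(f"email null rate: {current:.0%}, drifted: {drifted}, orphaned orders: {orphans}")
```

A profiling tool would presumably store the baseline per column and compare against it on each run; the fixed threshold here only stands in for that comparison.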

Signal Over Coverage

DQLens shows problems first, not 20 green checkmarks:

public.orders: 14 tests, 11 passed, 3 PROBLEMS FOUND

  PROBLEMS:
  HIGH   customer_id: 142 rows reference non-existent customers (FK mismatch)
  HIGH   email: 3.2% null (was 0.1% in baseline) — 32x increase
  MEDIUM orders grew 47% today (usual daily growth: 2-5%)

  ✓ 11 checks passed (use --verbose to see all)

Every finding includes:

  • Severity level (HIGH / MEDIUM / LOW)
  • Explanation of why it was flagged
  • Baseline comparison when available
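The problems-first layout can be sketched as a small rendering function: failed checks are sorted by severity and printed in full, while passing checks collapse into a single summary line. This is an illustration of the idea, not DQLens's reporting code. The `Finding` dataclass, the `render` function, and the sample findings are hypothetical.

```python
from dataclasses import dataclass

# Lower number sorts first, so HIGH findings lead the report.
SEVERITY_ORDER = {"HIGH": 0, "MEDIUM": 1, "LOW": 2}


@dataclass
class Finding:  # hypothetical shape, not the DQLens result type
    severity: str
    message: str
    passed: bool


def render(table, findings, verbose=False):
    """Build a report that lists problems first and summarizes passing checks."""
    problems = sorted(
        (f for f in findings if not f.passed),
        key=lambda f: SEVERITY_ORDER[f.severity],
    )
    passed = [f for f in findings if f.passed]
    lines = [
        f"{table}: {len(findings)} tests, {len(passed)} passed, "
        f"{len(problems)} PROBLEMS FOUND"
    ]
    if problems:
        lines.append("  PROBLEMS:")
        lines += [f"  {f.severity:<6} {f.message}" for f in problems]
    if verbose:
        # Verbose mode lists each passing check individually.
        lines += [f"  ✓ {f.message}" for f in passed]
    elif passed:
        lines.append(f"  ✓ {len(passed)} checks passed (use --verbose to see all)")
    return "\n".join(lines)


# Sample findings, deliberately out of severity order.
findings = [
    Finding("MEDIUM", "orders grew 47% today (usual daily growth: 2-5%)", False),
    Finding("LOW", "id: all values unique", True),
    Finding("HIGH", "customer_id: 142 rows reference non-existent customers (FK mismatch)", False),
]
report = render("public.orders", findings)
print(report)
```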

Commands

Command                     Description
dqlens init <url>           Initialize config with database connection
dqlens profile              Profile tables and save baseline
dqlens run                  Run checks, show problems
dqlens run --verbose        Show all checks including passing
dqlens run --focus high     Only HIGH severity findings
dqlens run --ci             Exit code 1 on failure (for CI/CD)
dqlens run --json-output    Output as JSON
dqlens ignore <key>         Suppress a known finding
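Because `dqlens run --ci` signals failure through its exit code, a pipeline step can gate on it. The sketch below wraps a command in a hypothetical `checks_passed` helper. It assumes that a passing run exits with 0, which is the usual convention but is not stated above. So that the example runs without a configured database, the demonstration uses stand-in Python commands in place of the real CLI call.

```python
import subprocess
import sys


def checks_passed(command=("dqlens", "run", "--ci")):
    """Run the check command and report whether it exited with code 0."""
    # Assumption: exit code 0 means no failures; the table documents only
    # exit code 1 on failure.
    completed = subprocess.run(list(command))
    return completed.returncode == 0


# Stand-ins for the real CLI: one exits with 1, the other exits cleanly.
failing = [sys.executable, "-c", "raise SystemExit(1)"]
passing = [sys.executable, "-c", "pass"]
print(checks_passed(failing), checks_passed(passing))
```

In a real pipeline step, calling `checks_passed()` with its default command and exiting non-zero when it returns `False` would stop the build on data quality failures.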

Python API

import dqlens

suite = dqlens.profile("postgres://localhost/mydb", schema="public")
results = suite.run()

for table in results:
    for test in table.tests:
        if test.failed:
            print(f"{table.name}.{test.column}: {test.message}")
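Building on the loop above, failures can be grouped per table for use in alerts or reports. This sketch relies only on the attributes the example already uses (`table.name`, `table.tests`, `test.failed`, `test.column`, `test.message`). The `failures_by_table` helper is hypothetical, and the `SimpleNamespace` objects are stand-ins for real results so the snippet runs without a database.

```python
from collections import defaultdict
from types import SimpleNamespace


def failures_by_table(results):
    """Map each table name to the messages of its failed tests."""
    grouped = defaultdict(list)
    for table in results:
        for test in table.tests:
            if test.failed:
                grouped[table.name].append(f"{test.column}: {test.message}")
    return dict(grouped)


# Stand-in objects that mimic only the attributes used above.
fake_results = [
    SimpleNamespace(name="public.orders", tests=[
        SimpleNamespace(column="email", failed=True,
                        message="3.2% null (was 0.1% in baseline)"),
        SimpleNamespace(column="id", failed=False, message="unique"),
    ]),
]
print(failures_by_table(fake_results))
```

With real results, `failures_by_table(suite.run())` should work the same way, provided the result objects expose the attributes shown in the example above.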

Supported Databases

  • PostgreSQL
  • SQLite
  • MySQL, Parquet, CSV (coming soon)

Development

# Clone and install
git clone https://github.com/vahid110/dqlens.git
cd dqlens
pip install -e ".[dev]"

# Run unit tests (no database needed)
pytest tests/ -k "unit" -v

# Run integration tests (needs PostgreSQL — see .env.example)
pytest tests/ -k "integration" -v

# Run all tests
pytest tests/ -v

Demo

See demo/README.md for a 5-minute walkthrough with a local PostgreSQL database.

License

MIT
