DQLens

Find data problems automatically. No config, no test writing.

DQLens auto-generates data quality tests by profiling your database. No YAML, no Python, no configuration files. Just point it at your database and get instant visibility into data quality issues.

Quick Start

pip install dqlens

# Initialize (stores connection config)
dqlens init postgres://localhost/mydb --schema public

# Profile your database (auto-generates tests)
dqlens profile

# Run checks and see problems
dqlens run

What It Does

DQLens connects to your database, profiles every table, and automatically generates tests based on what it finds:

  • Null anomalies: columns with unexpected null rates or null rate drift
  • Uniqueness violations: duplicate values in columns that should be unique
  • Foreign key mismatches: orphaned rows referencing non-existent records
  • Pattern violations: values that don't match detected patterns (email, UUID, URL, etc.)
  • Row count anomalies: unusual growth or shrinkage compared to baseline
  • Freshness checks: data that hasn't been updated recently
  • Distribution shifts: value range changes between profiles
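Two of these checks can be illustrated with plain SQL. The following is a minimal sketch, not DQLens's actual implementation: it uses Python's built-in sqlite3 module with a made-up in-memory schema, and the helper names null_rate and orphan_count are hypothetical.

```python
import sqlite3


def null_rate(conn, table, column):
    """Fraction of rows where `column` is NULL (0.0 for an empty table)."""
    # Identifiers are interpolated directly, so they must come from a trusted source.
    total, nulls = conn.execute(
        f"SELECT COUNT(*), SUM({column} IS NULL) FROM {table}"
    ).fetchone()
    return (nulls or 0) / total if total else 0.0


def orphan_count(conn, child, fk_column, parent, pk_column):
    """Rows in `child` whose non-null foreign key has no match in `parent`."""
    query = (
        f"SELECT COUNT(*) FROM {child} c "
        f"LEFT JOIN {parent} p ON c.{fk_column} = p.{pk_column} "
        f"WHERE c.{fk_column} IS NOT NULL AND p.{pk_column} IS NULL"
    )
    return conn.execute(query).fetchone()[0]


# Hypothetical sample data: one order has no email, one points at a missing customer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, email TEXT);
    INSERT INTO customers (id) VALUES (1), (2);
    INSERT INTO orders (customer_id, email) VALUES
        (1, 'a@example.com'), (2, NULL), (3, 'c@example.com'), (1, 'd@example.com');
""")

email_null_rate = null_rate(conn, "orders", "email")
orphans = orphan_count(conn, "orders", "customer_id", "customers", "id")
print(f"email null rate: {email_null_rate:.0%}, orphaned orders: {orphans}")
```

Comparing a rate like this against the value stored in an earlier profile is what turns a raw measurement into a drift finding.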

Signal Over Coverage

DQLens shows problems first, not 20 green checkmarks:

public.orders: 14 tests, 11 passed, 3 PROBLEMS FOUND

  PROBLEMS:
  HIGH   customer_id: 142 rows reference non-existent customers (FK mismatch)
  HIGH   email: 3.2% null (was 0.1% in baseline), 32x increase
  MEDIUM orders grew 47% today (usual daily growth: 2-5%)

  ✓ 11 checks passed (use --verbose to see all)

Every finding includes:

  • Severity level (HIGH / MEDIUM / LOW)
  • Explanation of why it was flagged
  • Baseline comparison when available
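To make the shape of a finding concrete, here is a small sketch of how such records could be modeled and ordered so that the most severe problems come first. The Finding class, its field names, and problems_first are illustrative assumptions, not part of the DQLens API; the sample values are taken from the report above.

```python
from dataclasses import dataclass
from typing import Optional

# Lower rank sorts first, so HIGH findings lead the report.
SEVERITY_ORDER = {"HIGH": 0, "MEDIUM": 1, "LOW": 2}


@dataclass
class Finding:
    severity: str                   # "HIGH", "MEDIUM", or "LOW"
    subject: str                    # column or table the finding is about
    explanation: str                # why it was flagged
    baseline: Optional[str] = None  # baseline comparison, when available


def problems_first(findings):
    """Return findings ordered by severity; ties keep their input order."""
    return sorted(findings, key=lambda f: SEVERITY_ORDER[f.severity])


findings = [
    Finding("MEDIUM", "orders", "grew 47% today", baseline="usual daily growth: 2-5%"),
    Finding("HIGH", "email", "3.2% null", baseline="was 0.1% in baseline"),
    Finding("HIGH", "customer_id", "142 rows reference non-existent customers"),
]

ordered = problems_first(findings)
for f in ordered:
    note = f" ({f.baseline})" if f.baseline else ""
    print(f"{f.severity:<6} {f.subject}: {f.explanation}{note}")
```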

Commands

Command                     Description
dqlens init <url>           Initialize config with database connection
dqlens profile              Profile tables and save baseline
dqlens profile --quick      Quick mode: sample data, under 5 seconds
dqlens run                  Run checks, show problems
dqlens run --verbose        Show all checks including passing
dqlens run --focus high     Only HIGH severity findings
dqlens run --ci             Exit code 1 on failure (for CI/CD)
dqlens run --json-output    Output as JSON
dqlens diff                 Compare two most recent profiles
dqlens diff --json-output   Diff as JSON
dqlens ignore <key>         Suppress a known finding
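Because dqlens run --ci signals failure through its exit code, a pipeline step only needs to propagate that code. The wrapper below is a rough sketch under that assumption; quality_gate and main are hypothetical names, and the script relies only on the documented exit-code behavior, not on any particular output format.

```python
import subprocess
import sys


def quality_gate(command=("dqlens", "run", "--ci")):
    """Run the check command and return its exit code (0 means no failures)."""
    completed = subprocess.run(list(command))
    return completed.returncode


def main():
    # Propagate the exit code so the surrounding CI job fails when checks fail.
    sys.exit(quality_gate())
```

Calling main() from a CI script makes the job fail whenever the checks report a problem.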

Python API

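import dqlens

# Profile the database and build the auto-generated test suite
suite = dqlens.profile("postgres://localhost/mydb", schema="public")

# Run every generated test
results = suite.run()

# Print only the failing tests, one line per table and column
for table in results:
    for test in table.tests:
        if test.failed:
            print(f"{table.name}.{test.column}: {test.message}")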
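The same loop can feed a summary structure instead of printing. The sketch below groups failures by table and uses only the attributes shown above (name, tests, column, failed, message). The SimpleNamespace objects are stand-ins so the example runs without a database, and failures_by_table is a hypothetical helper, not a DQLens function.

```python
from collections import defaultdict
from types import SimpleNamespace


def failures_by_table(results):
    """Map each table name to the messages of its failing tests."""
    grouped = defaultdict(list)
    for table in results:
        for test in table.tests:
            if test.failed:
                grouped[table.name].append(f"{test.column}: {test.message}")
    return dict(grouped)


# Stand-in objects that mimic the attributes used in the loop above.
fake_results = [
    SimpleNamespace(name="public.orders", tests=[
        SimpleNamespace(column="email", failed=True,
                        message="3.2% null (was 0.1% in baseline)"),
        SimpleNamespace(column="id", failed=False, message="unique"),
    ]),
    SimpleNamespace(name="public.customers", tests=[
        SimpleNamespace(column="id", failed=False, message="unique"),
    ]),
]

summary = failures_by_table(fake_results)
print(summary)
```

Tables with no failing tests are omitted from the result, which keeps the summary focused on problems.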

Supported Databases

  • PostgreSQL
  • SQLite
  • MySQL
  • Parquet, CSV (coming soon)

Development

# Clone and install
git clone https://github.com/vahid110/dqlens.git
cd dqlens
pip install -e ".[dev]"

# Run unit tests (no database needed)
pytest tests/ -k "unit" -v

# Run integration tests (needs PostgreSQL, see .env.example)
pytest tests/ -k "integration" -v

# Run all tests
pytest tests/ -v

Demo

See demo/README.md for a 5-minute walkthrough with a local PostgreSQL database.

License

MIT
