Data quality for dbt, without writing tests. Auto-generates dbt tests from DQLens profiling.

dbt-dqlens

Data quality for dbt, without writing tests.

dbt-dqlens brings auto-generated data quality checks into your dbt project. It profiles your models, detects problems (null spikes, orphaned records, schema drift, outliers), and exposes findings as native dbt tests and a queryable model.

You don't write tests. DQLens writes them for you.

Quick Start

1. Install

Add to your packages.yml:

packages:
  - package: vahid110/dbt_dqlens
    version: 0.1.0

Then:

pip install "dqlens[duckdb]"   # quoted so shells like zsh don't expand the brackets; or just: pip install dqlens (for PostgreSQL/MySQL/SQLite)
dbt deps

2. Profile your models

After dbt run, profile your warehouse:

dbt run-operation dqlens_profile

This connects to your warehouse (using your dbt profile), profiles every model, and stores baselines in a _dqlens schema.

3. Generate tests

dbt run-operation dqlens_generate_tests

This creates a _dqlens_tests.yml file with auto-generated tests for every model. Review it, commit it, done.

4. Run tests

dbt test --select tag:dqlens

Your auto-generated tests run as native dbt tests. Failures show up in dbt docs, dbt Cloud, and your CI pipeline.

What it detects

| Check | What it catches |
|---|---|
| Null drift | Null rate increased significantly from baseline |
| Schema drift | Columns added, removed, or changed in type |
| Orphaned records | FK references to non-existent rows |
| Empty strings | Columns full of '' that look populated but aren't |
| Outliers | Values beyond 1.5x IQR bounds |
| Row count anomalies | Unusual growth or shrinkage |
| Freshness | Data that hasn't been updated recently |
| Pattern violations | Values that don't match detected patterns (email, UUID, etc.) |
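
The outlier check refers to the usual 1.5x IQR fence. As a minimal sketch of that rule only, it can be written in plain Python. The function name `iqr_outliers` and the sample data are hypothetical, and how DQLens itself computes quartiles is not stated here.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR].

    Illustrative only: not taken from the DQLens source code.
    """
    q1, _, q3 = quantiles(values, n=4)  # the three quartile cut points
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


order_totals = [10, 12, 11, 13, 12, 11, 95]  # made-up sample
print(iqr_outliers(order_totals))  # 95 falls outside the fences
```

Different quartile conventions shift the fences slightly, so values close to a bound may be classified differently by another tool.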

How it works

dbt run                          (your models build as usual)
    |
dbt run-operation dqlens_profile (DQLens profiles the output tables)
    |
dbt run-operation dqlens_generate_tests  (writes tests to _dqlens_tests.yml)
    |
dbt test --select tag:dqlens     (runs the generated tests)

DQLens reads your dbt profiles.yml to connect to the same warehouse. No double configuration.
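
If you want to drive the four steps from a script (for example in CI), one possible wrapper looks like the sketch below. The commands are the ones shown above; the names `DQLENS_STEPS` and `run_pipeline` are hypothetical, and the script assumes `dbt` is on your PATH and is run from the project directory.

```python
import subprocess

# The four steps from the diagram above, in order.
DQLENS_STEPS = [
    ["dbt", "run"],
    ["dbt", "run-operation", "dqlens_profile"],
    ["dbt", "run-operation", "dqlens_generate_tests"],
    ["dbt", "test", "--select", "tag:dqlens"],
]


def run_pipeline(runner=subprocess.run):
    """Run each step in order; check=True raises on the first failing command."""
    for command in DQLENS_STEPS:
        runner(command, check=True)


if __name__ == "__main__":
    run_pipeline()
```

The `runner` parameter exists only so the sequence can be exercised without invoking dbt, for instance by passing a function that records the commands.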

The dqlens_findings model

Every profiling run materializes a dqlens_findings table in your warehouse:

| column | type | description |
|---|---|---|
| finding_id | text | Unique identifier |
| table_name | text | Which model |
| column_name | text | Which column (null for table-level findings) |
| severity | text | HIGH / MEDIUM / LOW |
| category | text | null_anomaly, schema_change, fk_mismatch, etc. |
| message | text | Human-readable description |
| detail | text | Why it was flagged |
| current_value | text | Current metric value |
| baseline_value | text | Previous metric value |
| detected_at | timestamp | When the finding was detected |

Query it in your BI tool, build alerts on it, or just SELECT * FROM dqlens.dqlens_findings WHERE severity = 'HIGH'.
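
To show the shape of that query from Python, here is a self-contained sketch using an in-memory SQLite database. The table layout follows the columns listed above; the two rows, the helper names `make_demo_db` and `high_severity_findings`, and the use of `ATTACH` as a stand-in for the `dqlens` schema are all assumptions made for illustration.

```python
import sqlite3

# Columns as listed in the findings table description.
CREATE_FINDINGS = """
CREATE TABLE dqlens.dqlens_findings (
    finding_id TEXT, table_name TEXT, column_name TEXT, severity TEXT,
    category TEXT, message TEXT, detail TEXT,
    current_value TEXT, baseline_value TEXT, detected_at TIMESTAMP
)
"""


def make_demo_db():
    """Build an in-memory database holding two made-up findings."""
    conn = sqlite3.connect(":memory:")
    # SQLite has no schemas; an attached database named dqlens plays that role.
    conn.execute("ATTACH DATABASE ':memory:' AS dqlens")
    conn.execute(CREATE_FINDINGS)
    rows = [
        ("f-1", "orders", "customer_id", "HIGH", "null_anomaly",
         "Null rate increased", "hypothetical example row", "0.31", "0.02",
         "2024-01-01 00:00:00"),
        ("f-2", "orders", None, "LOW", "schema_change",
         "Column added", "hypothetical example row", "12", "11",
         "2024-01-01 00:00:00"),
    ]
    conn.executemany(
        "INSERT INTO dqlens.dqlens_findings VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
        rows,
    )
    return conn


def high_severity_findings(conn):
    """Return (table_name, column_name, message) for every HIGH finding."""
    cursor = conn.execute(
        "SELECT table_name, column_name, message "
        "FROM dqlens.dqlens_findings WHERE severity = 'HIGH'"
    )
    return cursor.fetchall()


print(high_severity_findings(make_demo_db()))
```

Against a real warehouse you would keep only the `SELECT` and swap the connection for your own driver.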

Configuration

In your dbt_project.yml:

vars:
  dqlens:
    dqlens_schema: "dqlens"        # where findings table lives
    min_severity: "MEDIUM"          # only store MEDIUM+ findings
    exclude_tables: ["staging_*"]   # skip these models
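
One way to read these two settings is sketched below. It assumes `exclude_tables` entries are shell-style globs and that severities are ordered LOW < MEDIUM < HIGH. The package's actual matching logic is not documented here, so treat `is_excluded` and `is_stored` as hypothetical helpers, not the DQLens API.

```python
from fnmatch import fnmatchcase

# Assumed ordering of the severity levels named in this README.
SEVERITY_RANK = {"LOW": 0, "MEDIUM": 1, "HIGH": 2}


def is_excluded(model_name, exclude_tables):
    """True if the model name matches any pattern (assumed: shell-style globs)."""
    return any(fnmatchcase(model_name, pattern) for pattern in exclude_tables)


def is_stored(severity, min_severity):
    """True if a finding is at or above the configured minimum severity."""
    return SEVERITY_RANK[severity] >= SEVERITY_RANK[min_severity]


print(is_excluded("staging_orders", ["staging_*"]))  # matches the glob
print(is_stored("LOW", "MEDIUM"))                    # below the minimum
```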

vs other dbt quality packages

| Feature | dbt_expectations | elementary | dbt-dqlens |
|---|---|---|---|
| Auto-generates tests | No | Partial | Yes |
| Requires writing config | Yes (per column) | Yes (YAML) | No |
| Drift detection | No | Yes (paid) | Yes (free) |
| Baseline comparison | No | Yes (paid) | Yes (free) |
| Outlier detection | No | Yes (paid) | Yes (free) |
| Pricing | Free | Free + paid cloud | Free |

Requirements

  • dbt-core >= 1.0.0
  • Python with dqlens installed (pip install "dqlens[duckdb]" for DuckDB)
  • Supported databases: PostgreSQL, DuckDB, SQLite, MySQL (Snowflake, BigQuery coming soon)

License

MIT
