Skip to main content

Data quality for dbt, without writing tests. Auto-generates dbt tests from DQLens profiling.

Project description

dbt-dqlens

Data quality for dbt, without writing tests.

dbt-dqlens brings auto-generated data quality checks into your dbt project. It profiles your models, detects problems (null spikes, orphaned records, schema drift, outliers), and exposes findings as native dbt tests and a queryable model.

You don't write tests. DQLens writes them for you.

Quick Start

1. Install

Add to your packages.yml:

packages:
  - package: vahid110/dbt_dqlens
    version: 0.1.0

Then:

pip install dqlens
dbt deps

2. Profile your models

After dbt run, profile your warehouse:

dbt run-operation dqlens_profile

This connects to your warehouse (using your dbt profile), profiles every model, and stores baselines in a _dqlens schema.

3. Generate tests

dbt run-operation dqlens_generate_tests

This creates a _dqlens_tests.yml file with auto-generated tests for every model. Review it, commit it, done.

4. Run tests

dbt test --select tag:dqlens

Your auto-generated tests run as native dbt tests. Failures show up in dbt docs, dbt Cloud, and your CI pipeline.

What it detects

Check What it catches
Null drift Null rate increased significantly from baseline
Schema drift Columns added, removed, or type changed
Orphaned records FK references to non-existent rows
Empty strings Columns full of '' that look non-null but aren't
Outliers Values beyond 1.5x IQR bounds
Row count anomalies Unusual growth or shrinkage
Freshness Data that hasn't been updated recently
Pattern violations Values that don't match detected patterns (email, UUID, etc.)

How it works

dbt run                          (your models build as usual)
    |
dbt run-operation dqlens_profile (DQLens profiles the output tables)
    |
dbt run-operation dqlens_generate_tests  (auto-generates schema.yml tests)
    |
dbt test --select tag:dqlens     (runs the generated tests)

DQLens reads your dbt profiles.yml to connect to the same warehouse. No double configuration.

The dqlens_findings model

Every profiling run materializes a dqlens_findings table in your warehouse:

column type description
finding_id text Unique identifier
table_name text Which model
column_name text Which column (null for table-level)
severity text HIGH / MEDIUM / LOW
category text null_anomaly, schema_change, fk_mismatch, etc.
message text Human-readable description
detail text Why it was flagged
current_value text Current metric value
baseline_value text Previous metric value
detected_at timestamp When the finding was detected

Query it in your BI tool, build alerts on it, or just SELECT * FROM dqlens.dqlens_findings WHERE severity = 'HIGH'.

Configuration

In your dbt_project.yml:

vars:
  dqlens:
    dqlens_schema: "dqlens"        # where findings table lives
    min_severity: "MEDIUM"          # only store MEDIUM+ findings
    exclude_tables: ["staging_*"]   # skip these models

vs other dbt quality packages

dbt_expectations elementary dbt-dqlens
Auto-generates tests No Partial Yes
Requires writing config Yes (per column) Yes (YAML) No
Drift detection No Yes (paid) Yes (free)
Baseline comparison No Yes (paid) Yes (free)
Outlier detection No Yes (paid) Yes (free)
Pricing Free Free + paid cloud Free

Requirements

  • dbt-core >= 1.0.0
  • Python with dqlens installed (pip install dqlens)
  • PostgreSQL, Snowflake, or BigQuery (PostgreSQL for MVP)

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dbt_dqlens-0.1.0.tar.gz (11.8 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

dbt_dqlens-0.1.0-py3-none-any.whl (9.6 kB view details)

Uploaded Python 3

File details

Details for the file dbt_dqlens-0.1.0.tar.gz.

File metadata

  • Download URL: dbt_dqlens-0.1.0.tar.gz
  • Upload date:
  • Size: 11.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for dbt_dqlens-0.1.0.tar.gz
Algorithm Hash digest
SHA256 80e0ac8b4b7af1f2ec5f1fe6f7b9e8dc9e3057e8995a8e762ba1813f2aa91bfc
MD5 7d5c5c15bcccba9be98f1a84565ef406
BLAKE2b-256 b05ee1e74b48e8c1e18f5a91fc89290c275ea4174851d2ae59af8714962a8761

See more details on using hashes here.

File details

Details for the file dbt_dqlens-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: dbt_dqlens-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 9.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for dbt_dqlens-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 2bf0abdb3a2cfef6b82daefa99c796114e37162cf444eb5b3064d0c52d1a01ab
MD5 f8be9a41ca49f6ca33b2453b272e629b
BLAKE2b-256 17f07c3b5ebf6a94da95062a0ed910235f17861b8a200bbfe6dc7c52e309a9ee

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page