Data quality for dbt, without writing tests. Auto-generates dbt tests from DQLens profiling.
dbt-dqlens
Data quality for dbt, without writing tests.
dbt-dqlens brings auto-generated data quality checks into your dbt project. It profiles your models, detects problems (null spikes, orphaned records, schema drift, outliers), and exposes findings as native dbt tests and a queryable model.
You don't write tests. DQLens writes them for you.
Quick Start
1. Install
Add to your packages.yml:
packages:
  - package: vahid110/dbt_dqlens
    version: 0.1.0
Then:
pip install dqlens[duckdb] # or just: pip install dqlens (for PostgreSQL/MySQL/SQLite)
dbt deps
2. Profile your models
After dbt run, profile your warehouse:
dbt run-operation dqlens_profile
This connects to your warehouse (using your dbt profile), profiles every model, and stores baselines in a _dqlens schema.
3. Generate tests
dbt run-operation dqlens_generate_tests
This creates a _dqlens_tests.yml file with auto-generated tests for every model. Review it, commit it, done.
4. Run tests
dbt test --select tag:dqlens
Your auto-generated tests run as native dbt tests. Failures show up in dbt docs, dbt Cloud, and your CI pipeline.
What it detects
| Check | What it catches |
|---|---|
| Null drift | Null rate increased significantly from baseline |
| Schema drift | Columns added, removed, or type changed |
| Orphaned records | FK references to non-existent rows |
| Empty strings | Columns full of '' that pass not-null checks but hold no data |
| Outliers | Values beyond 1.5x IQR bounds |
| Row count anomalies | Unusual growth or shrinkage |
| Freshness | Data that hasn't been updated recently |
| Pattern violations | Values that don't match detected patterns (email, UUID, etc.) |
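To make two rows of the table concrete, here is a minimal Python sketch of a null-drift check and a 1.5x IQR outlier check. It is an illustration only, not DQLens's implementation: the function names, the 0.05 drift threshold, and the quartile method (Python's default in `statistics.quantiles`) are all assumptions, since this README does not say how "significantly" is defined or how quartiles are computed.

```python
import statistics


def null_rate(values):
    """Fraction of entries that are None (0.0 for an empty column)."""
    if not values:
        return 0.0
    return sum(v is None for v in values) / len(values)


def null_drift(baseline_rate, current_rate, threshold=0.05):
    """True when the null rate rose by more than `threshold`.

    The 0.05 default is an assumed cutoff chosen for this example.
    """
    return (current_rate - baseline_rate) > threshold


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR].

    Needs at least two data points; quartiles use Python's default method.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]
```

For example, `iqr_outliers([10, 12, 11, 13, 12, 11, 95])` flags only `95`, and `null_drift(0.01, 0.20)` reports drift while `null_drift(0.01, 0.02)` does not.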
How it works
dbt run (your models build as usual)
|
dbt run-operation dqlens_profile (DQLens profiles the output tables)
|
dbt run-operation dqlens_generate_tests (auto-generates tests in _dqlens_tests.yml)
|
dbt test --select tag:dqlens (runs the generated tests)
DQLens reads your dbt profiles.yml to connect to the same warehouse. No double configuration.
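If you want to script the four steps, for example in a CI job, a small Python wrapper is one option. This is a sketch, not part of the package: `run_dqlens_pipeline` and its `runner` parameter are hypothetical names, and only the four commands themselves come from this README.

```python
import shlex
import subprocess

# Commands copied from the steps above, in the order they run.
PIPELINE = [
    "dbt run",
    "dbt run-operation dqlens_profile",
    "dbt run-operation dqlens_generate_tests",
    "dbt test --select tag:dqlens",
]


def run_dqlens_pipeline(runner=subprocess.run):
    """Run each step in order and return the commands that completed.

    `runner` is injectable so the sequence can be exercised without dbt
    installed. With the default runner, check=True raises
    subprocess.CalledProcessError on the first non-zero exit code,
    so later steps do not run after a failure.
    """
    executed = []
    for command in PIPELINE:
        runner(shlex.split(command), check=True)
        executed.append(command)
    return executed
```

Calling `run_dqlens_pipeline()` from the project directory runs the full sequence. If you prefer to review and commit `_dqlens_tests.yml` by hand, you may want to drop the generate step from `PIPELINE` in CI.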
The dqlens_findings model
Every profiling run materializes a dqlens_findings table in your warehouse:
| column | type | description |
|---|---|---|
| finding_id | text | Unique identifier |
| table_name | text | Which model |
| column_name | text | Which column (null for table-level) |
| severity | text | HIGH / MEDIUM / LOW |
| category | text | null_anomaly, schema_change, fk_mismatch, etc. |
| message | text | Human-readable description |
| detail | text | Why it was flagged |
| current_value | text | Current metric value |
| baseline_value | text | Previous metric value |
| detected_at | timestamp | When the finding was detected |
Query it in your BI tool, build alerts on it, or just SELECT * FROM dqlens.dqlens_findings WHERE severity = 'HIGH'.
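As a rough illustration of building an alert on the findings table, the Python sketch below counts HIGH findings per model. It uses an in-memory SQLite database with a hand-built copy of the table so that it runs anywhere; the rows are made-up sample data, the helper name is hypothetical, and in a real project you would connect to your warehouse and query the table DQLens materialized instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Stand-in for the real table, using the columns documented above.
conn.execute(
    """
    CREATE TABLE dqlens_findings (
        finding_id TEXT, table_name TEXT, column_name TEXT, severity TEXT,
        category TEXT, message TEXT, detail TEXT,
        current_value TEXT, baseline_value TEXT, detected_at TIMESTAMP
    )
    """
)

# Made-up sample findings, for illustration only.
sample_rows = [
    ("f1", "orders", "customer_id", "HIGH", "fk_mismatch",
     "Orphaned customer_id values", "12 rows reference missing customers",
     "12", "0", "2024-01-01 00:00:00"),
    ("f2", "orders", "email", "HIGH", "null_anomaly",
     "Null rate increased", "Null rate rose from baseline",
     "0.20", "0.01", "2024-01-01 00:00:00"),
    ("f3", "customers", None, "MEDIUM", "schema_change",
     "Column added", "New column detected",
     "8", "7", "2024-01-01 00:00:00"),
    ("f4", "customers", "email", "HIGH", "null_anomaly",
     "Null rate increased", "Null rate rose from baseline",
     "0.15", "0.02", "2024-01-01 00:00:00"),
]
conn.executemany(
    "INSERT INTO dqlens_findings VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
    sample_rows,
)


def high_severity_by_table(connection):
    """Return {table_name: number of HIGH findings}."""
    cursor = connection.execute(
        """
        SELECT table_name, COUNT(*)
        FROM dqlens_findings
        WHERE severity = 'HIGH'
        GROUP BY table_name
        ORDER BY table_name
        """
    )
    return dict(cursor.fetchall())
```

An alerting job could call `high_severity_by_table(conn)` after each profiling run and notify the owning team whenever the result is non-empty.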
Configuration
In your dbt_project.yml:
vars:
  dqlens:
    dqlens_schema: "dqlens"        # where findings table lives
    min_severity: "MEDIUM"         # only store MEDIUM+ findings
    exclude_tables: ["staging_*"]  # skip these models
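The Python sketch below shows one plausible reading of `exclude_tables` and `min_severity`. It assumes shell-style glob matching for the patterns and a LOW < MEDIUM < HIGH ordering for severities; this README does not spell out either rule, so treat the helper names and the matching semantics as assumptions and check them against DQLens's actual behavior.

```python
from fnmatch import fnmatchcase

# Assumed ordering of the severity levels listed in the findings table.
SEVERITY_RANK = {"LOW": 0, "MEDIUM": 1, "HIGH": 2}


def is_excluded(table_name, exclude_tables):
    """True if the model name matches any glob pattern (case-sensitive)."""
    return any(fnmatchcase(table_name, pattern) for pattern in exclude_tables)


def meets_min_severity(severity, min_severity):
    """True if `severity` is at or above the configured minimum."""
    return SEVERITY_RANK[severity] >= SEVERITY_RANK[min_severity]
```

Under these assumptions, `staging_orders` is skipped by the pattern `staging_*` while `orders` is profiled, and with `min_severity: "MEDIUM"` a LOW finding is not stored.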
vs other dbt quality packages
| | dbt_expectations | elementary | dbt-dqlens |
|---|---|---|---|
| Auto-generates tests | No | Partial | Yes |
| Requires writing config | Yes (per column) | Yes (YAML) | No |
| Drift detection | No | Yes (paid) | Yes (free) |
| Baseline comparison | No | Yes (paid) | Yes (free) |
| Outlier detection | No | Yes (paid) | Yes (free) |
| Pricing | Free | Free + paid cloud | Free |
Requirements
- dbt-core >= 1.0.0
- Python with dqlens installed (pip install dqlens[duckdb] for DuckDB)
- Supported databases: PostgreSQL, DuckDB, SQLite, MySQL (Snowflake, BigQuery coming soon)
License
MIT