Skip to main content

Apache Airflow provider for DQLens data quality checks

Project description

airflow-provider-dqlens

PyPI

Apache Airflow provider for DQLens. Add auto-generated data quality checks to your DAGs with one task.

Install

pip install airflow-provider-dqlens

Usage

from dqlens_airflow import DQLensOperator

quality_check = DQLensOperator(
    task_id="dqlens_quality_check",
    conn_id="my_postgres",
    schema="public",
    focus="high",
)

load_data >> quality_check >> downstream_tasks

If DQLens finds problems, the task fails and downstream tasks don't run.

Parameters

Parameter Required Default Description
conn_id Yes Airflow connection ID
schema No public Schema to profile
focus No all Severity filter: high, medium, all
quick No False Sampled profiling (faster)
tables No None Specific tables to profile
exclude_tables No None Tables to skip (glob patterns)
fail_on_findings No True Fail task if problems found

What it does

  1. Reads connection from your Airflow connection
  2. Profiles all tables (null rates, uniqueness, patterns, FKs, freshness)
  3. Compares against previous profile (drift detection)
  4. Fails the task if findings exceed your severity threshold
  5. Pushes results to XCom for downstream use

XCom output

{
    "findings_count": 3,
    "total_findings": 3,
    "passed_count": 47,
    "tables_profiled": 6,
    "findings": [
        {"table": "public.orders", "column": "email", "severity": "HIGH", "message": "..."},
    ]
}

Supported databases

PostgreSQL, DuckDB, SQLite, MySQL (via Airflow connection types).

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

airflow_provider_dqlens-0.1.0.tar.gz (5.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

airflow_provider_dqlens-0.1.0-py3-none-any.whl (5.8 kB view details)

Uploaded Python 3

File details

Details for the file airflow_provider_dqlens-0.1.0.tar.gz.

File metadata

  • Download URL: airflow_provider_dqlens-0.1.0.tar.gz
  • Upload date:
  • Size: 5.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.13

File hashes

Hashes for airflow_provider_dqlens-0.1.0.tar.gz
Algorithm Hash digest
SHA256 f98f70294618fc8d202f9add8a44452edd6b01188bf9881a6746fc4821ca3d4b
MD5 c0d7308f7a2764e02760dc552e014e86
BLAKE2b-256 f7c2e7a6dcff4ef34d93d7d0f0457ef5eaeb6cc67ae26bfdfa46487074c8fc80

See more details on using hashes here.

File details

Details for the file airflow_provider_dqlens-0.1.0-py3-none-any.whl.

File metadata

File hashes

Hashes for airflow_provider_dqlens-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 3314dc3a8fea7394a17d545f7729bda68c7ba3dd09b107c4d8970cb113d13aa1
MD5 38cba37274add73d84c0339ede5508bd
BLAKE2b-256 8f941e9f94f7f3f6a564398c734d22526570c652f82a87160bbe8609f3d953f1

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page