Package for running Soda Core data quality scans in Dagster.

dagster-soda

dagster-soda integrates Soda Core data quality checks with Dagster. It provides the SodaScanComponent, a Dagster component that runs Soda Core scans and maps SodaCL check results to Dagster asset checks.

Installation

pip install dagster-soda

Note: dagster-soda requires soda-core 3.x (the soda.scan API). It pins soda-core>=3.0,<4 by default.

Usage

Component: SodaScanComponent

Configure a SodaScanComponent in your Dagster project to:

  • Point at SodaCL YAML check files and a Soda configuration.yml
  • Map Soda dataset names to Dagster asset keys
  • Run scans and report pass/fail as Dagster asset check results
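The mapping from Soda results to asset checks can be pictured with a small sketch. This is not the component's actual implementation. It assumes the result shape that Soda Core 3.x returns from scan.get_scan_results() (a "checks" list whose entries carry "name", "table", and "outcome"), and it treats only the "pass" outcome as passing. CheckSummary and summarize_checks are hypothetical names used for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckSummary:
    """Hypothetical stand-in for a Dagster asset check result."""

    asset_key: str
    check_name: str
    passed: bool


def summarize_checks(scan_results: dict, asset_key_map: dict) -> list:
    """Attribute each Soda check to an asset key via the dataset name.

    Checks on datasets missing from asset_key_map are skipped (an assumption;
    the real component may handle them differently).
    """
    summaries = []
    for check in scan_results.get("checks", []):
        asset_key = asset_key_map.get(check.get("table"))
        if asset_key is None:
            continue
        summaries.append(
            CheckSummary(
                asset_key=asset_key,
                check_name=check["name"],
                # Assumption: "warn" and "fail" both count as not passed.
                passed=check.get("outcome") == "pass",
            )
        )
    return summaries


# Hand-written sample data in the assumed shape, not real scan output.
example_results = {
    "checks": [
        {"name": "row_count > 0", "table": "my_table", "outcome": "pass"},
        {"name": "missing_count(id) = 0", "table": "my_table", "outcome": "fail"},
        {"name": "row_count > 0", "table": "unmapped_table", "outcome": "pass"},
    ]
}

for summary in summarize_checks(example_results, {"my_table": "my_table"}):
    print(summary.asset_key, summary.check_name, summary.passed)
```

Keeping the mapping as a plain dictionary lookup mirrors the asset_key_map attribute: a dataset only produces results for an asset when it has been mapped explicitly.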

Scaffolding with the CLI

Use the Dagster CLI to scaffold a new Soda scan component in your project (requires dagster-dg-cli):

dg scaffold defs dagster_soda.SodaScanComponent <path>

Example (scaffold into a folder named soda_checks under your defs directory):

dg scaffold defs dagster_soda.SodaScanComponent soda_checks

This generates:

  • A defs.yaml with default attributes (checks_paths, configuration_path, data_source_name, asset_key_map)
  • A checks.yml template with example SodaCL (e.g. a "checks for my_table:" block containing the check "row_count > 0")

Edit the generated files to match your data source and checks, then load your definitions as usual.

Minimal defs.yaml example

type: dagster_soda.SodaScanComponent
attributes:
  checks_paths:
    - checks.yml
  configuration_path: configuration.yml
  data_source_name: my_datasource
  asset_key_map:
    my_table: my_table
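As a rough guide to what these attributes mean, the sketch below shows how they could be fed to the Soda Core 3.x soda.scan API by hand. It is an assumption about the correspondence, not the component's source: configure_scan and the scan_factory parameter are hypothetical, and paths are passed through as given, whereas the component may resolve them relative to defs.yaml.

```python
from typing import Any, Callable, Optional


def configure_scan(
    attributes: dict,
    scan_factory: Optional[Callable[[], Any]] = None,
) -> Any:
    """Build a Soda scan object from defs.yaml-style attributes (hypothetical helper)."""
    if scan_factory is None:
        # Imported lazily so the sketch can be read without soda-core installed.
        from soda.scan import Scan

        scan_factory = Scan

    scan = scan_factory()
    scan.set_data_source_name(attributes["data_source_name"])
    scan.add_configuration_yaml_file(file_path=attributes["configuration_path"])
    for checks_path in attributes["checks_paths"]:
        scan.add_sodacl_yaml_files(checks_path)
    return scan


# Mirrors the attributes block of the minimal defs.yaml above.
attributes = {
    "checks_paths": ["checks.yml"],
    "configuration_path": "configuration.yml",
    "data_source_name": "my_datasource",
    "asset_key_map": {"my_table": "my_table"},
}
```

With soda-core installed, configure_scan(attributes).execute() would run the scan. The asset_key_map entry is not handed to Soda here, since it describes how results are attributed on the Dagster side.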

Documentation

The docs for dagster-soda can be found here.
