Data validation and integrity testing for your datasets using pytest.

pytest-dataguard

A pytest plugin for validating CSV data files as part of your test suite. It helps ensure your data files meet quality standards by checking for null values and enforcing uniqueness constraints on specified columns.

Features

  • Null value checks: Ensure your CSV files have no missing values.
  • Uniqueness checks: Verify that specified columns contain only unique values.
  • Easy integration: Run data validation as part of your regular pytest workflow.

Installation

Install via pip:

pip install pytest-dataguard

Or install with uv:

uv add pytest-dataguard

Usage

Run pytest with the plugin and specify the options:

pytest --file path/to/data.csv [--not_null] [--unique column1 --unique column2]
  • --file: Path to the CSV file to validate (required).
  • --not_null: Check that there are no null values in the file (optional, enabled by default).
  • --unique: Specify one or more columns to check for uniqueness. Can be used multiple times.
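
The way these options combine can be illustrated with a small sketch. This is not the plugin's source: it uses the standard-library argparse module as a stand-in for pytest's option registration, and the function name build_parser, the `default=True` choice for --not_null, and the append behaviour for --unique are assumptions based only on the option descriptions above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for the plugin's option registration.
    parser = argparse.ArgumentParser(prog="pytest")
    parser.add_argument("--file", required=True, help="CSV file to validate")
    # Assumed: the null check is on whether or not the flag is passed.
    parser.add_argument("--not_null", action="store_true", default=True)
    # action="append" collects every --unique occurrence into one list.
    parser.add_argument("--unique", action="append", default=[])
    return parser


args = build_parser().parse_args(
    ["--file", "data.csv", "--unique", "id", "--unique", "email"]
)
print(args.file, args.not_null, args.unique)
```

Under these assumptions, passing --unique twice yields a list of two column names, and omitting --not_null still leaves the null check enabled.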

Example

Suppose you have a CSV file data.csv and want to ensure there are no nulls and that the id column is unique:

pytest --file data.csv --unique id

To check multiple columns for uniqueness:

pytest --file data.csv --unique id --unique email

How it works

When you run pytest with the pytest-dataguard options, the plugin will:

  • Load the specified CSV file using Polars
  • Check for null values (--not_null is set by default)
  • Check that specified columns have unique values if --unique is used
  • Fail the test session if any validation fails
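
The steps above can be sketched roughly as follows. This is an illustration only, not the plugin's implementation: it uses the standard-library csv module instead of Polars so that it runs without extra dependencies, the function name validate_csv is hypothetical, and treating an empty field as a null value is an assumption.

```python
import csv
import io
from collections import Counter


def validate_csv(text, unique_columns=(), not_null=True):
    """Return a list of validation error messages (empty if the data is valid)."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    header = reader.fieldnames or []
    errors = []

    if not_null:
        # Assumption: an empty or missing field counts as a null value.
        for row_no, row in enumerate(rows, start=1):
            empty = [name for name in header if row.get(name) in (None, "")]
            if empty:
                errors.append(f"data row {row_no}: null value in {empty}")

    for column in unique_columns:
        if column not in header:
            errors.append(f"column {column!r} not found")
            continue
        counts = Counter(row[column] for row in rows)
        duplicates = sorted(value for value, n in counts.items() if n > 1)
        if duplicates:
            errors.append(f"column {column!r} has duplicate values: {duplicates}")

    return errors


sample = "id,email\n1,a@example.com\n2,\n2,c@example.com\n"
for message in validate_csv(sample, unique_columns=["id"]):
    print(message)
```

A plugin built this way would fail the session whenever the returned list is non-empty. With Polars, the counterparts to these loops would likely be DataFrame.null_count() and Series.is_duplicated().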

Requirements

Contributing

Contributions are welcome! Please open issues or submit pull requests.

License

MIT License

Download files

Source Distribution

pytest_dataguard-1.0.3.tar.gz (4.7 kB view details)

Uploaded Source

Built Distribution

pytest_dataguard-1.0.3-py3-none-any.whl (5.5 kB view details)

Uploaded Python 3

File details

Details for the file pytest_dataguard-1.0.3.tar.gz.

File metadata

  • Download URL: pytest_dataguard-1.0.3.tar.gz
  • Upload date:
  • Size: 4.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for pytest_dataguard-1.0.3.tar.gz
Algorithm Hash digest
SHA256 1d476a79240f7a9a77d9974c2f75ddcff7c5ccac9d9c87bac503c35543f31c72
MD5 2bd92443353a823ff177a7a7f63f5c44
BLAKE2b-256 86160abb3fc064b9eee6846f08ca920b3b26275b5316ea7df4e46cf68ba931dd

Provenance

The following attestation bundles were made for pytest_dataguard-1.0.3.tar.gz:

Publisher: python-package.yml on olaaustine/pytest_dataguard

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file pytest_dataguard-1.0.3-py3-none-any.whl.

File metadata

File hashes

Hashes for pytest_dataguard-1.0.3-py3-none-any.whl
Algorithm Hash digest
SHA256 d14a3440af70d1047daea1ff7600e9df86e513ec4972dac19b32e01b1cfc7ab7
MD5 dd63303e8b414b38701e208d8c20c1f7
BLAKE2b-256 c29efd5ba416227d6bf96e6ce97bfe5e395bae46757ab47932d7cb0018bb4602

Provenance

The following attestation bundles were made for pytest_dataguard-1.0.3-py3-none-any.whl:

Publisher: python-package.yml on olaaustine/pytest_dataguard
