Data validation and integrity testing for your datasets using pytest.
pytest-dataguard
A pytest plugin for validating CSV data files as part of your test suite. It helps ensure your data files meet quality standards by checking for null values and enforcing uniqueness constraints on specified columns.
Features
- Null value checks: Ensure your CSV files have no missing values.
- Uniqueness checks: Verify that specified columns contain only unique values.
- Easy integration: Run data validation as part of your regular pytest workflow.
Installation
Install via pip:
pip install pytest-dataguard
Or add it to a project with uv:
uv add pytest-dataguard
Usage
Run pytest with the plugin and specify the options:
pytest --file path/to/data.csv [--not_null] [--unique column1 --unique column2]
- --file: Path to the CSV file to validate (required).
- --not_null: Check that there are no null values in the file (optional, enabled by default).
- --unique: Specify a column to check for uniqueness. Can be used multiple times to check several columns.
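The plugin's source is not reproduced here, so the following is only a sketch of how options like these are commonly registered in a pytest plugin through the `pytest_addoption` hook. The option names come from the usage line above; the help strings, the defaults, and the choice of the `store_true` and `append` actions are assumptions made for illustration.

```python
def pytest_addoption(parser):
    """Register hypothetical dataguard options (a sketch, not the plugin's code)."""
    # Path to the CSV file that should be validated.
    parser.addoption(
        "--file",
        action="store",
        default=None,
        help="Path to the CSV file to validate.",
    )
    # Assumed to default to True, matching "enabled by default" above.
    parser.addoption(
        "--not_null",
        action="store_true",
        default=True,
        help="Check that the file contains no null values.",
    )
    # "append" collects every occurrence of the flag into one list,
    # which is what allows --unique to be given more than once.
    parser.addoption(
        "--unique",
        action="append",
        default=[],
        help="Column that must contain only unique values (repeatable).",
    )
```

With this registration, `--unique id --unique email` would arrive in the plugin as the list `["id", "email"]`.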
Example
Suppose you have a CSV file data.csv and want to ensure there are no nulls and that the id column is unique:
pytest --file data.csv --unique id
To check multiple columns for uniqueness:
pytest --file data.csv --unique id --unique email
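To try the commands above, a small sample file helps. The sketch below writes a hypothetical `data.csv` using only the standard library. The column names match the example, while the rows and the `write_sample_csv` helper are invented for illustration. The `email` column contains a deliberate duplicate, so, going by the description of the uniqueness check, `--unique email` would be expected to fail on it while `--unique id` would not.

```python
import csv
from pathlib import Path


def write_sample_csv(path):
    """Write a tiny CSV with unique ids and one duplicated email."""
    rows = [
        {"id": "1", "email": "ada@example.com"},
        {"id": "2", "email": "grace@example.com"},
        {"id": "3", "email": "ada@example.com"},  # duplicate on purpose
    ]
    target = Path(path)
    # newline="" lets the csv module control line endings itself.
    with target.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["id", "email"])
        writer.writeheader()
        writer.writerows(rows)
    return target


if __name__ == "__main__":
    write_sample_csv("data.csv")
```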
How it works
When you run pytest with the pytest-dataguard options, the plugin will:
- Load the specified CSV file using Polars
- Check for null values (--not_null is set by default)
- Check that specified columns have unique values if --unique is used
- Fail the test session if any validation fails
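The steps above can be sketched in a few lines. Note that this sketch swaps in the standard library `csv` module for Polars so that it runs without third-party dependencies. The plugin itself is described as using Polars, and its real implementation may differ. The `validate_csv` function, its parameters, and the message wording are all hypothetical, and empty fields are treated as nulls here by assumption.

```python
import csv
import io


def validate_csv(text, not_null=True, unique=()):
    """Return a list of problems found in CSV text (empty list means valid)."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    header = reader.fieldnames or []
    problems = []

    if not_null:
        # Empty fields, and fields missing from short rows, count as nulls.
        for row_no, row in enumerate(rows, start=1):
            empty = [name for name in header if row.get(name) in (None, "")]
            if empty:
                problems.append(f"row {row_no}: null value in {', '.join(empty)}")

    for column in unique:
        if column not in header:
            problems.append(f"column {column!r}: not found in file")
            continue
        seen, duplicates = set(), set()
        for row in rows:
            value = row[column] or ""
            if value in seen:
                duplicates.add(value)
            seen.add(value)
        if duplicates:
            problems.append(
                f"column {column!r}: duplicate values {sorted(duplicates)}"
            )

    return problems
```

A plugin built along these lines would fail the test session whenever the returned list is non-empty, and could report each entry as part of the failure message.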
Requirements
Contributing
Contributions are welcome! Please open issues or submit pull requests.
License