Validate column specifications and constraints for SQL tables and polars data frames.
pydiverse.colspec
A data validation library that checks whether the columns of SQL tables and polars data frames have the expected types. It can also validate constraints on the data, as defined in a user-provided column specification.
The purpose is to make data pipelines more robust by ensuring that data meets expectations, and more readable by adding type hints when working with tables and data frames.
ColSpec builds on the ideas of dataframely, which does the same but focuses on polars data frames. Behind the scenes, ColSpec delegates to dataframely, especially for features such as sampling random input data that conforms to a given column specification. dataframely uses the term schema, as is common in the polars community. Since ColSpec also works with SQL databases, where a schema is a collection of tables, ColSpec avoids the term as much as possible. The term column specification means exactly the same thing without the ambiguity.
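To make the idea of a column specification concrete, the following is a minimal, library-independent sketch in plain Python. It does not use the pydiverse.colspec or dataframely API; the names `Column`, `validate`, and `HOUSE_SPEC` are hypothetical and exist only for illustration. It shows the two kinds of checks described above: type conformity per column and a user-defined constraint on the values.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class Column:
    """Hypothetical description of one column: expected type plus an optional constraint."""

    dtype: type
    nullable: bool = False
    check: Optional[Callable[[Any], bool]] = None


def validate(rows: list[dict[str, Any]], spec: dict[str, Column]) -> list[str]:
    """Return human-readable violations; an empty list means the rows conform."""
    errors = []
    for i, row in enumerate(rows):
        for name, col in spec.items():
            if name not in row:
                errors.append(f"row {i}: missing column {name!r}")
                continue
            value = row[name]
            if value is None:
                # Nulls are only acceptable for columns declared nullable.
                if not col.nullable:
                    errors.append(f"row {i}: {name!r} must not be null")
                continue
            if not isinstance(value, col.dtype):
                errors.append(f"row {i}: {name!r} expected {col.dtype.__name__}")
            elif col.check is not None and not col.check(value):
                errors.append(f"row {i}: {name!r} violates its constraint")
    return errors


# An illustrative column specification (assumed example data, not from the library docs).
HOUSE_SPEC = {
    "zip_code": Column(str, check=lambda s: len(s) >= 3),
    "num_bedrooms": Column(int, check=lambda n: n >= 0),
    "price": Column(float, nullable=True),
}

rows = [
    {"zip_code": "01234", "num_bedrooms": 2, "price": 100000.0},
    {"zip_code": "1", "num_bedrooms": -1, "price": None},
]
violations = validate(rows, HOUSE_SPEC)
for message in violations:
    print(message)
```

The real library expresses the same idea declaratively and runs the checks on SQL tables or polars data frames rather than on Python dictionaries; consult its documentation for the actual class and method names.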
Merit attribution
ColSpec is derived directly from dataframely, and large parts of its codebase are duplicated from it. Unfortunately, integrating SQL-native validation into dataframely would have made it a less clean solution for people who only work with polars. The decision was therefore made to replicate the same functionality in the pydiverse library collection, which also enables smoother integration with other pydiverse libraries.
Usage
pydiverse.colspec can be installed from PyPI with `pip install pydiverse-colspec` or from conda-forge with `conda install pydiverse-colspec -c conda-forge`.
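After installing, one way to confirm that the distribution is visible to the current Python environment is to query its metadata with the standard library. This is a small sketch; the helper name `installed_version` is hypothetical, and only the distribution name `pydiverse-colspec` comes from the install commands above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints the version string if the package is installed, otherwise None.
print(installed_version("pydiverse-colspec"))
```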
File details
Details for the file pydiverse_colspec-0.2.4.tar.gz.
File metadata
- Download URL: pydiverse_colspec-0.2.4.tar.gz
- Upload date:
- Size: 300.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 413742217f8ca55b661f10b6faf3e57aa24349d5f091a460cee11ef2bde13ae9 |
| MD5 | a3c297adcadd6593e001f54317015624 |
| BLAKE2b-256 | 57f2d68439acc613052f763a6fd588787eef46a2ed439b73cca3b22b89f99615 |
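A downloaded file can be checked against the SHA256 digest listed above using only the Python standard library. The sketch below assumes the source distribution was saved under its original file name in the current directory; the helper names `sha256_of` and `matches_expected` are hypothetical.

```python
import hashlib
from pathlib import Path

# SHA256 digest of pydiverse_colspec-0.2.4.tar.gz as listed in the table above.
EXPECTED_SHA256 = "413742217f8ca55b661f10b6faf3e57aa24349d5f091a460cee11ef2bde13ae9"


def sha256_of(path, chunk_size: int = 1 << 20) -> str:
    """Compute the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected: str = EXPECTED_SHA256) -> bool:
    """Return True if the file's SHA256 digest equals the expected hex digest."""
    return sha256_of(path) == expected.lower()


# Assumed download location; adjust to wherever the file was saved.
archive = Path("pydiverse_colspec-0.2.4.tar.gz")
if archive.exists():
    print("digest OK" if matches_expected(archive) else "digest MISMATCH")
```

Alternatively, pip can enforce digests itself when requirements are pinned with `--hash` entries and installed with `--require-hashes`.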
Provenance
The following attestation bundles were made for pydiverse_colspec-0.2.4.tar.gz:
Publisher: release.yml on pydiverse/pydiverse.colspec
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: pydiverse_colspec-0.2.4.tar.gz
- Subject digest: 413742217f8ca55b661f10b6faf3e57aa24349d5f091a460cee11ef2bde13ae9
- Sigstore transparency entry: 262118177
- Sigstore integration time:
- Permalink: pydiverse/pydiverse.colspec@5d411fd37a36effe48397cd643bc0227fb7ddeb8
- Branch / Tag: refs/tags/0.2.4
- Owner: https://github.com/pydiverse
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@5d411fd37a36effe48397cd643bc0227fb7ddeb8
- Trigger Event: push
File details
Details for the file pydiverse_colspec-0.2.4-py3-none-any.whl.
File metadata
- Download URL: pydiverse_colspec-0.2.4-py3-none-any.whl
- Upload date:
- Size: 52.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0eae74b080e6b29b9e2f43a640521b07d527ae2e76ba2910e1d11c1c06599a32 |
| MD5 | 751887756b8b018cdd71cfd4b51789a1 |
| BLAKE2b-256 | cdce77e29130b9506d78876d053895b526ebf7df6f7c9657d2589fd980ecf766 |
Provenance
The following attestation bundles were made for pydiverse_colspec-0.2.4-py3-none-any.whl:
Publisher: release.yml on pydiverse/pydiverse.colspec
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: pydiverse_colspec-0.2.4-py3-none-any.whl
- Subject digest: 0eae74b080e6b29b9e2f43a640521b07d527ae2e76ba2910e1d11c1c06599a32
- Sigstore transparency entry: 262118185
- Sigstore integration time:
- Permalink: pydiverse/pydiverse.colspec@5d411fd37a36effe48397cd643bc0227fb7ddeb8
- Branch / Tag: refs/tags/0.2.4
- Owner: https://github.com/pydiverse
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@5d411fd37a36effe48397cd643bc0227fb7ddeb8
- Trigger Event: push