A declarative, polars-native data frame validation library

dataframely — A declarative, 🐻‍❄️-native data frame validation library

📖 Introduction

Dataframely is a Python package to validate the schema and content of polars data frames. Its purpose is to make data pipelines more robust by ensuring that data meets expectations and more readable by adding schema information to data frame type hints.

💿 Installation

You can install dataframely using your favorite package manager, e.g., pixi or pip:

pixi add dataframely
pip install dataframely

🎯 Usage

Defining a data frame schema

import dataframely as dy
import polars as pl

class HouseSchema(dy.Schema):
    zip_code = dy.String(nullable=False, min_length=3)
    num_bedrooms = dy.UInt8(nullable=False)
    num_bathrooms = dy.UInt8(nullable=False)
    price = dy.Float64(nullable=False)

    @dy.rule()
    def reasonable_bathroom_to_bedroom_ratio() -> pl.Expr:
        ratio = pl.col("num_bathrooms") / pl.col("num_bedrooms")
        return (ratio >= 1 / 3) & (ratio <= 3)

    @dy.rule(group_by=["zip_code"])
    def minimum_zip_code_count() -> pl.Expr:
        return pl.len() >= 2
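
To make the two kinds of rules concrete, here is a minimal plain-Python sketch of the logic they express. It does not use dataframely or polars and is not how the library evaluates rules; the helper names `ratio_ok` and `zip_count_ok` and the three rows are hypothetical, chosen only for illustration. The first rule is checked row by row, while the second one (declared with `group_by=["zip_code"]`) is checked once per group of rows sharing a zip code.

```python
from collections import Counter

# Hypothetical rows, used only for this illustration.
rows = [
    {"zip_code": "01234", "num_bedrooms": 2, "num_bathrooms": 1},
    {"zip_code": "01234", "num_bedrooms": 2, "num_bathrooms": 2},
    {"zip_code": "213", "num_bedrooms": 2, "num_bathrooms": 8},
]


def ratio_ok(row: dict) -> bool:
    """Row-level rule: bathrooms per bedroom must lie in [1/3, 3]."""
    ratio = row["num_bathrooms"] / row["num_bedrooms"]
    return 1 / 3 <= ratio <= 3


def zip_count_ok(all_rows: list) -> list:
    """Group-level rule: every zip code must occur in at least two rows."""
    counts = Counter(row["zip_code"] for row in all_rows)
    return [counts[row["zip_code"]] >= 2 for row in all_rows]


ratio_results = [ratio_ok(row) for row in rows]
group_results = zip_count_ok(rows)

print(ratio_results)  # [True, True, False]: 8 bathrooms for 2 bedrooms is a ratio of 4
print(group_results)  # [True, True, False]: zip code "213" occurs only once
```

A group-level rule marks all rows of a failing group together, which is why the result has one entry per row rather than one per zip code.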

Validating data against the schema

import dataframely as dy
import polars as pl

df = pl.DataFrame({
    "zip_code": ["01234", "01234", "1", "213", "123", "213"],
    "num_bedrooms": [2, 2, 1, None, None, 2],
    "num_bathrooms": [1, 2, 1, 1, 0, 8],
    "price": [100_000, 110_000, 50_000, 80_000, 60_000, 160_000]
})

# Validate the data and cast columns to expected types
validated_df: dy.DataFrame[HouseSchema] = HouseSchema.validate(df, cast=True)
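
Note that the sample frame contains rows that do not satisfy the column constraints of `HouseSchema`, so validating it as is should not be expected to succeed. The plain-Python sketch below checks only the two column-level constraints that the sample data violates (`min_length=3` on `zip_code` and `nullable=False` on `num_bedrooms`) and reports the offending row indices. It is an illustration under the assumption that these parameters mean what their names suggest; the helper `column_violations` is hypothetical and not part of dataframely.

```python
# Sample columns copied from the data frame above, as plain Python lists.
data = {
    "zip_code": ["01234", "01234", "1", "213", "123", "213"],
    "num_bedrooms": [2, 2, 1, None, None, 2],
}


def column_violations(columns: dict) -> dict:
    """Map row index -> list of violated column-level constraints."""
    problems = {}
    pairs = zip(columns["zip_code"], columns["num_bedrooms"])
    for index, (zip_code, bedrooms) in enumerate(pairs):
        reasons = []
        if len(zip_code) < 3:  # mirrors min_length=3
            reasons.append("zip_code shorter than 3 characters")
        if bedrooms is None:  # mirrors nullable=False
            reasons.append("num_bedrooms is null")
        if reasons:
            problems[index] = reasons
    return problems


violations = column_violations(data)
for index, reasons in violations.items():
    print(index, reasons)
# 2 ['zip_code shorter than 3 characters']
# 3 ['num_bedrooms is null']
# 4 ['num_bedrooms is null']
```

The custom rules add further failures on top of these: for example, the last row has 8 bathrooms for 2 bedrooms, which lies outside the allowed ratio.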

See more advanced usage examples in the documentation.
