A declarative, polars-native data frame validation library

dataframely — A declarative, 🐻‍❄️-native data frame validation library

📖 Introduction

Dataframely is a Python package to validate the schema and content of polars data frames. Its purpose is to make data pipelines more robust by ensuring that data meets expectations and more readable by adding schema information to data frame type hints.

💿 Installation

You can install dataframely using your favorite package manager, e.g., pixi or pip:

pixi add dataframely
pip install dataframely

🎯 Usage

Defining a data frame schema

import dataframely as dy
import polars as pl

class HouseSchema(dy.Schema):
    # Column definitions: data type plus per-column constraints.
    zip_code = dy.String(nullable=False, min_length=3)
    num_bedrooms = dy.UInt8(nullable=False)
    num_bathrooms = dy.UInt8(nullable=False)
    price = dy.Float64(nullable=False)

    # Row-level rule: the expression is evaluated for every row.
    @dy.rule()
    def reasonable_bathroom_to_bedroom_ratio() -> pl.Expr:
        ratio = pl.col("num_bathrooms") / pl.col("num_bedrooms")
        return (ratio >= 1 / 3) & (ratio <= 3)

    # Group-level rule: the expression is evaluated once per zip code.
    @dy.rule(group_by=["zip_code"])
    def minimum_zip_code_count() -> pl.Expr:
        return pl.len() >= 2

Validating data against schema

import dataframely as dy
import polars as pl

df = pl.DataFrame({
    "zip_code": ["01234", "01234", "1", "213", "123", "213"],
    "num_bedrooms": [2, 2, 1, None, None, 2],
    "num_bathrooms": [1, 2, 1, 1, 0, 8],
    "price": [100_000, 110_000, 50_000, 80_000, 60_000, 160_000]
})

# Validate the data and cast columns to expected types
validated_df: dy.DataFrame[HouseSchema] = HouseSchema.validate(df, cast=True)
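
Several rows of the sample frame do not satisfy HouseSchema. To make it easier to see which ones, the sketch below re-implements the declared checks by hand in plain Python. It is an illustration only, not how dataframely evaluates a schema: the helper `failed_checks` and the failure labels are hypothetical, data type casting is left out, and the handling of a ratio with a missing or zero operand is an assumption spelled out in the comments.

```python
from collections import Counter

COLUMNS = ("zip_code", "num_bedrooms", "num_bathrooms", "price")

# Sample rows from the data frame above, in COLUMNS order.
rows = [
    ("01234", 2, 1, 100_000),
    ("01234", 2, 2, 110_000),
    ("1", 1, 1, 50_000),
    ("213", None, 1, 80_000),
    ("123", None, 0, 60_000),
    ("213", 2, 8, 160_000),
]

# The group-level rule needs the number of rows per zip code in the whole input.
zip_counts = Counter(row[0] for row in rows)


def failed_checks(row):
    """Return illustrative labels for the HouseSchema checks a row violates."""
    zip_code, bedrooms, bathrooms, _price = row
    failures = []

    # nullable=False on every column.
    for name, value in zip(COLUMNS, row):
        if value is None:
            failures.append(f"{name} is null")

    # min_length=3 on zip_code.
    if zip_code is not None and len(zip_code) < 3:
        failures.append("zip_code shorter than 3")

    # Ratio rule. Assumption: if an operand is missing, the ratio is not
    # reported here, because the null check above already flags the row.
    if bedrooms is not None and bathrooms is not None:
        # Assumption: zero bedrooms is treated as an out-of-range ratio.
        ratio = bathrooms / bedrooms if bedrooms else float("inf")
        if not (1 / 3 <= ratio <= 3):
            failures.append("reasonable_bathroom_to_bedroom_ratio")

    # Group rule: each zip code must occur at least twice.
    if zip_counts[zip_code] < 2:
        failures.append("minimum_zip_code_count")

    return failures


report = [failed_checks(row) for row in rows]
valid_rows = [row for row, failures in zip(rows, report) if not failures]

for row, failures in zip(rows, report):
    print(row, "->", failures or "ok")
```

Under these assumptions only the two "01234" rows pass every check. The row-level and group-level checks are kept separate in the sketch to mirror the two kinds of rules in the schema.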

See more advanced usage examples in the documentation.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dataframely-1.8.2.tar.gz (300.1 kB)

Tags: Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

dataframely-1.8.2-pp310-pypy310_pp73-macosx_10_12_x86_64.whl (512.1 kB)

Tags: PyPy, macOS 10.12+ x86-64

dataframely-1.8.2-cp310-abi3-win_amd64.whl (417.2 kB)

Tags: CPython 3.10+, Windows x86-64

dataframely-1.8.2-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (548.5 kB)

Tags: CPython 3.10+, manylinux: glibc 2.17+ x86-64

dataframely-1.8.2-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (539.7 kB)

Tags: CPython 3.10+, manylinux: glibc 2.17+ ARM64

dataframely-1.8.2-cp310-abi3-macosx_11_0_arm64.whl (496.0 kB)

Tags: CPython 3.10+, macOS 11.0+ ARM64

File details

Details for the file dataframely-1.8.2.tar.gz.

File metadata

  • Download URL: dataframely-1.8.2.tar.gz
  • Size: 300.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for dataframely-1.8.2.tar.gz
Algorithm Hash digest
SHA256 60065ff2f9b6c0dd6e6d1eff4879b66ea1f1ebd214208d75482f6c93865b0f5e
MD5 5ef52e36bcd3a8e69181c05e1909c92f
BLAKE2b-256 a6dbb5c3e3cdc586d56e954250f3623d1071cc5389583980d2256d0993b96868

See more details on using hashes here.
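
As a rough illustration of how a published digest can be used, the sketch below compares a downloaded file against the SHA256 value listed above using only the Python standard library. The helper names `sha256_of` and `matches_digest` are hypothetical, and the local file path is an assumption about where the archive was saved.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with an expected hex string."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())


# SHA256 digest listed above for dataframely-1.8.2.tar.gz.
EXPECTED_SDIST_SHA256 = (
    "60065ff2f9b6c0dd6e6d1eff4879b66ea1f1ebd214208d75482f6c93865b0f5e"
)

if __name__ == "__main__":
    # Assumed download location; adjust to wherever the archive was saved.
    archive = Path("dataframely-1.8.2.tar.gz")
    if archive.exists():
        print("digest matches:", matches_digest(archive, EXPECTED_SDIST_SHA256))
    else:
        print("archive not found:", archive)
```

Reading in chunks keeps memory use flat for large wheels, and the comparison is case-insensitive so that digests copied in upper case still match.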

Provenance

The following attestation bundles were made for dataframely-1.8.2.tar.gz:

Publisher: build.yml on Quantco/dataframely

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file dataframely-1.8.2-pp310-pypy310_pp73-macosx_10_12_x86_64.whl.

File hashes

Hashes for dataframely-1.8.2-pp310-pypy310_pp73-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 589e5264e54dc944ee51b84dfcea1ca203d5c5504e9aa015aa0730446168a120
MD5 d4bed9bff7878917e39e71515f3bfc43
BLAKE2b-256 d14671aca1133e93be5ca7de07fec807d109346c6a470c259b55929a53e4e4cd

Provenance

The following attestation bundles were made for dataframely-1.8.2-pp310-pypy310_pp73-macosx_10_12_x86_64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.8.2-cp310-abi3-win_amd64.whl.

File metadata

  • Download URL: dataframely-1.8.2-cp310-abi3-win_amd64.whl
  • Size: 417.2 kB
  • Tags: CPython 3.10+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for dataframely-1.8.2-cp310-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 0089579a01904438b4986214413b5d41e948fe23a6048b3d9f9f41bf383faf41
MD5 44b0091495e7f094f58d7ef1a3a3beec
BLAKE2b-256 c11a056e9e5682710ee2de7b73a391ab4e4a82a5a0969eee591033c0708b4cdd

Provenance

The following attestation bundles were made for dataframely-1.8.2-cp310-abi3-win_amd64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.8.2-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for dataframely-1.8.2-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 75c6db521a3b6081ec87a464a6bf4b98680c9a32c54baa68474daab11ba1edba
MD5 8a98d84a56cf1759413892ffdc725620
BLAKE2b-256 a40f50c5dac1692e5e8a3d92a3f517098382432e43aa4e5f0886dd0122cbd9e2

Provenance

The following attestation bundles were made for dataframely-1.8.2-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.8.2-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for dataframely-1.8.2-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 6eddd62d67c7fef02ffd1318b9a06d70c0edf10b3dec01f341ea1f38c4682027
MD5 5981ea3fbc6dc810786bf93388ae9eb7
BLAKE2b-256 a41839ef28a27649ae1e2fc21ecc58da83a92df36f8e57b2c14a760c42ea1b97

Provenance

The following attestation bundles were made for dataframely-1.8.2-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.8.2-cp310-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for dataframely-1.8.2-cp310-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 106f721e381be2942feda5ce1f68e501146ea9c7d3d2245d4df9f62fcc2cbdcc
MD5 a2dd0ebe1c419e82784cc1e79ffa29f2
BLAKE2b-256 c848800e9cef8d9d642128e8cb5a51e768e2e313e54387ea3bb49dbbed3b05f6

Provenance

The following attestation bundles were made for dataframely-1.8.2-cp310-abi3-macosx_11_0_arm64.whl:

Publisher: build.yml on Quantco/dataframely
