A declarative, polars-native data frame validation library

dataframely — A declarative, 🐻‍❄️-native data frame validation library

📖 Introduction

Dataframely is a Python package for validating the schema and content of polars data frames. It aims to make data pipelines more robust by ensuring that data meets expectations, and more readable by adding schema information to data frame type hints.

💿 Installation

You can install dataframely using your favorite package manager, e.g., pixi or pip:

pixi add dataframely
pip install dataframely
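
After installing, one quick way to confirm that the package is visible to the current interpreter is to query its metadata. The sketch below uses only the Python standard library; the helper name `installed_version` is made up for illustration and is not part of dataframely.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string if dataframely is installed, otherwise None.
print(installed_version("dataframely"))
```

A `None` result usually means the package was installed into a different environment than the one running the script.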

🎯 Usage

Defining a data frame schema

import dataframely as dy
import polars as pl

class HouseSchema(dy.Schema):
    zip_code = dy.String(nullable=False, min_length=3)
    num_bedrooms = dy.UInt8(nullable=False)
    num_bathrooms = dy.UInt8(nullable=False)
    price = dy.Float64(nullable=False)

    @dy.rule()
    def reasonable_bathroom_to_bedroom_ratio(cls) -> pl.Expr:
        ratio = pl.col("num_bathrooms") / pl.col("num_bedrooms")
        return (ratio >= 1 / 3) & (ratio <= 3)

    @dy.rule(group_by=["zip_code"])
    def minimum_zip_code_count(cls) -> pl.Expr:
        return pl.len() >= 2

Validating data against schema

import polars as pl

df = pl.DataFrame({
    "zip_code": ["01234", "01234", "1", "213", "123", "213"],
    "num_bedrooms": [2, 2, 1, None, None, 2],
    "num_bathrooms": [1, 2, 1, 1, 0, 8],
    "price": [100_000, 110_000, 50_000, 80_000, 60_000, 160_000]
})

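# Validate the data and cast columns to the expected types.
# Note: validate raises if any row violates the schema, and this sample
# contains such rows (e.g. the zip code "1" is shorter than min_length=3).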
validated_df: dy.DataFrame[HouseSchema] = HouseSchema.validate(df, cast=True)
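
To make the checks declared in HouseSchema concrete, here is a plain-Python sketch that applies the same conditions to the sample rows by hand. It illustrates what the schema asks for and is not how dataframely evaluates rules internally. The helper `failed_checks` is hypothetical, and two behaviors are assumptions of the sketch: a row with a missing bedroom or bathroom count skips the ratio rule (the nullability check already flags it), and zero bedrooms is treated as an infinite ratio.

```python
from collections import Counter

# Rows copied from the sample data frame above.
rows = [
    {"zip_code": "01234", "num_bedrooms": 2, "num_bathrooms": 1, "price": 100_000},
    {"zip_code": "01234", "num_bedrooms": 2, "num_bathrooms": 2, "price": 110_000},
    {"zip_code": "1", "num_bedrooms": 1, "num_bathrooms": 1, "price": 50_000},
    {"zip_code": "213", "num_bedrooms": None, "num_bathrooms": 1, "price": 80_000},
    {"zip_code": "123", "num_bedrooms": None, "num_bathrooms": 0, "price": 60_000},
    {"zip_code": "213", "num_bedrooms": 2, "num_bathrooms": 8, "price": 160_000},
]

# Group-level input: how many rows share each zip code.
zip_counts = Counter(row["zip_code"] for row in rows)


def failed_checks(row: dict) -> list[str]:
    """Return the names of the schema checks that this row fails."""
    failures = []

    # Column constraints: non-null everywhere, zip_code at least 3 characters.
    if row["zip_code"] is None or len(row["zip_code"]) < 3:
        failures.append("zip_code")
    for column in ("num_bedrooms", "num_bathrooms", "price"):
        if row[column] is None:
            failures.append(column)

    # Row-level rule: bathrooms / bedrooms must lie in [1/3, 3].
    beds, baths = row["num_bedrooms"], row["num_bathrooms"]
    if beds is not None and baths is not None:
        # Assumption of this sketch: zero bedrooms counts as an infinite ratio.
        ratio = baths / beds if beds else float("inf")
        if not (1 / 3 <= ratio <= 3):
            failures.append("reasonable_bathroom_to_bedroom_ratio")

    # Group-level rule: every zip code must occur at least twice.
    if zip_counts[row["zip_code"]] < 2:
        failures.append("minimum_zip_code_count")

    return failures


failures_by_row = [failed_checks(row) for row in rows]
for index, failures in enumerate(failures_by_row):
    print(index, failures or "ok")
```

Under these assumptions, only the first two rows pass every check; each of the remaining four rows fails at least one column constraint or rule.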

See more advanced usage examples in the documentation.

Download files

Source Distribution

  • dataframely-2.0.0.tar.gz (379.7 kB): Source

Built Distributions

  • dataframely-2.0.0-cp310-abi3-win_amd64.whl (5.1 MB): CPython 3.10+, Windows x86-64
  • dataframely-2.0.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (5.2 MB): CPython 3.10+, manylinux: glibc 2.17+ x86-64
  • dataframely-2.0.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (4.9 MB): CPython 3.10+, manylinux: glibc 2.17+ ARM64
  • dataframely-2.0.0-cp310-abi3-macosx_11_0_arm64.whl (4.6 MB): CPython 3.10+, macOS 11.0+ ARM64
  • dataframely-2.0.0-cp310-abi3-macosx_10_12_x86_64.whl (5.1 MB): CPython 3.10+, macOS 10.12+ x86-64

File details

Details for the file dataframely-2.0.0.tar.gz.

File metadata

  • Download URL: dataframely-2.0.0.tar.gz
  • Upload date:
  • Size: 379.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for dataframely-2.0.0.tar.gz
Algorithm Hash digest
SHA256 0d67cbe1f98c806147f8c471641205a6b868fa3d58406523b8af927d21f3f503
MD5 709a9afbd4ed79cfc2e6de0e73e66cc8
BLAKE2b-256 ba276cffd0e57ed376bb1fa19adc534f3bcda0b3a1ac0a1fadc9f66a9f3a9639

See more details on using hashes here.

Provenance

The following attestation bundles were made for dataframely-2.0.0.tar.gz:

Publisher: build.yml on Quantco/dataframely

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file dataframely-2.0.0-cp310-abi3-win_amd64.whl.

File metadata

  • Download URL: dataframely-2.0.0-cp310-abi3-win_amd64.whl
  • Upload date:
  • Size: 5.1 MB
  • Tags: CPython 3.10+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for dataframely-2.0.0-cp310-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 cdf3ee0a0f6b22c911c3b9f37005f3f231865bfe0ba3b174d6a07ffbcd8156f2
MD5 db7d184d4c6773e633d9d27cc8caecc8
BLAKE2b-256 46fa47ee729aae110f8bcbce368c27beb8c6c235e5faa28abbc45a1335947333

Provenance

The following attestation bundles were made for dataframely-2.0.0-cp310-abi3-win_amd64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.0.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for dataframely-2.0.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 d52d35c2fa7c0a13a96cc1826abc70f3a1e6f3d2db7cb60d10ee99d7b6a581e4
MD5 de6716d86bd99a5e2fc9b38917b8a454
BLAKE2b-256 945bda3e26bab6c3f41ca75639477404eeee727000c99949fdf1fc7cbe448f4a

Provenance

The following attestation bundles were made for dataframely-2.0.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.0.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for dataframely-2.0.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 f4d3411297d88cdbe945d6d7f86825c5f38cdf0b1544c20e5b48ead3b901cf7a
MD5 797643fb45e022d5caa6b78bb8f54389
BLAKE2b-256 892ad53621ca3818ccedcc37a80bb2b960f51b1f190186adc645390d7d14f591

Provenance

The following attestation bundles were made for dataframely-2.0.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.0.0-cp310-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for dataframely-2.0.0-cp310-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 9c7fbf4984fe4ee34db15844636e19d71ddf6b641b103b4e751ee0416ba64647
MD5 373ad9614f1c59f424f8399ff6083251
BLAKE2b-256 74da88c5261c7b6ede89a1737078963e8fa0eda8bcea99cc8354365a09fefdd6

Provenance

The following attestation bundles were made for dataframely-2.0.0-cp310-abi3-macosx_11_0_arm64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.0.0-cp310-abi3-macosx_10_12_x86_64.whl.

File hashes

Hashes for dataframely-2.0.0-cp310-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 2ce33d5f6958006ae15ed5b71a2ef76218595bddfbace97d3f990a98f0bcd40a
MD5 66dcfc43a52dabcc3d0ba928a1797b98
BLAKE2b-256 e5c4302c86b382f9c9d374011125bca7f14163e1258ba478d99e50e60bf788b5

Provenance

The following attestation bundles were made for dataframely-2.0.0-cp310-abi3-macosx_10_12_x86_64.whl:

Publisher: build.yml on Quantco/dataframely
