A declarative, polars-native data frame validation library

dataframely — A declarative, 🐻‍❄️-native data frame validation library

📖 Introduction

Dataframely is a Python package to validate the schema and content of polars data frames. Its purpose is to make data pipelines more robust by ensuring that data meets expectations and more readable by adding schema information to data frame type hints.

💿 Installation

You can install dataframely using your favorite package manager, e.g., pixi or pip:

pixi add dataframely
pip install dataframely

🎯 Usage

Defining a data frame schema

import dataframely as dy
import polars as pl

class HouseSchema(dy.Schema):
    zip_code = dy.String(nullable=False, min_length=3)
    num_bedrooms = dy.UInt8(nullable=False)
    num_bathrooms = dy.UInt8(nullable=False)
    price = dy.Float64(nullable=False)

    @dy.rule()
    def reasonable_bathroom_to_bedroom_ratio(cls) -> pl.Expr:
        ratio = pl.col("num_bathrooms") / pl.col("num_bedrooms")
        return (ratio >= 1 / 3) & (ratio <= 3)

    @dy.rule(group_by=["zip_code"])
    def minimum_zip_code_count(cls) -> pl.Expr:
        return pl.len() >= 2

Validating data against schema

import dataframely as dy
import polars as pl

df = pl.DataFrame({
    "zip_code": ["01234", "01234", "1", "213", "123", "213"],
    "num_bedrooms": [2, 2, 1, None, None, 2],
    "num_bathrooms": [1, 2, 1, 1, 0, 8],
    "price": [100_000, 110_000, 50_000, 80_000, 60_000, 160_000]
})

# Validate the data and cast columns to expected types
validated_df: dy.DataFrame[HouseSchema] = HouseSchema.validate(df, cast=True)
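
To make the expectations encoded in HouseSchema concrete, the following library-free sketch re-checks the example rows in plain Python. It is an illustration only, not how dataframely evaluates rules internally: the helper `failed_checks`, the check labels, and the choice to skip the ratio check when `num_bedrooms` is missing are assumptions made for this example.

```python
from collections import Counter

# Same rows as the example data frame, as plain tuples:
# (zip_code, num_bedrooms, num_bathrooms, price)
rows = [
    ("01234", 2, 1, 100_000),
    ("01234", 2, 2, 110_000),
    ("1", 1, 1, 50_000),
    ("213", None, 1, 80_000),
    ("123", None, 0, 60_000),
    ("213", 2, 8, 160_000),
]

# Group sizes for the check that is evaluated per zip code.
zip_counts = Counter(zip_code for zip_code, *_ in rows)


def failed_checks(row):
    """Return the labels of the expectations that a single row violates.

    Hypothetical helper for illustration; the labels are made up here.
    """
    zip_code, bedrooms, bathrooms, _price = row
    failed = []
    if len(zip_code) < 3:
        failed.append("zip_code min_length")
    if bedrooms is None:
        failed.append("num_bedrooms nullable")
    # Assumption: a ratio that cannot be computed because a value is
    # missing is not counted as a ratio violation.
    if bedrooms is not None and bathrooms is not None:
        # Guard against division by zero (not present in this data).
        ratio = bathrooms / bedrooms if bedrooms else float("inf")
        if not (1 / 3 <= ratio <= 3):
            failed.append("bathroom_to_bedroom_ratio")
    if zip_counts[zip_code] < 2:
        failed.append("minimum_zip_code_count")
    return failed


# Map each row index to the list of checks it fails.
report = {index: failed_checks(row) for index, row in enumerate(rows)}
valid_rows = [index for index, failed in report.items() if not failed]
print(valid_rows)
```

Only the first two rows pass every check in this sketch, so a strict validation of the example frame should be expected to report failures rather than accept all six rows.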

See more advanced usage examples in the documentation.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dataframely-2.7.0.tar.gz (394.3 kB)

Uploaded: Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

dataframely-2.7.0-cp310-abi3-win_amd64.whl (5.4 MB)

Uploaded: CPython 3.10+, Windows x86-64

dataframely-2.7.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (5.3 MB)

Uploaded: CPython 3.10+, manylinux: glibc 2.17+ x86-64

dataframely-2.7.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (5.0 MB)

Uploaded: CPython 3.10+, manylinux: glibc 2.17+ ARM64

dataframely-2.7.0-cp310-abi3-macosx_11_0_arm64.whl (4.8 MB)

Uploaded: CPython 3.10+, macOS 11.0+ ARM64

dataframely-2.7.0-cp310-abi3-macosx_10_12_x86_64.whl (5.2 MB)

Uploaded: CPython 3.10+, macOS 10.12+ x86-64

File details

Details for the file dataframely-2.7.0.tar.gz.

File metadata

  • Download URL: dataframely-2.7.0.tar.gz
  • Upload date:
  • Size: 394.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for dataframely-2.7.0.tar.gz
Algorithm Hash digest
SHA256 752a39abff4494374040dea3662ba5fed63217c9467d8ca52b68e4587988f6ff
MD5 29bc84b2d5ac08f100746d422f529d38
BLAKE2b-256 96140416e93e85d2e2c485654a6d0e2c3e207e7fd4e7bbdb95cf92f2cd386859

See more details on using hashes here.

Provenance

The following attestation bundles were made for dataframely-2.7.0.tar.gz:

Publisher: build.yml on Quantco/dataframely

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file dataframely-2.7.0-cp310-abi3-win_amd64.whl.

File metadata

  • Download URL: dataframely-2.7.0-cp310-abi3-win_amd64.whl
  • Upload date:
  • Size: 5.4 MB
  • Tags: CPython 3.10+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for dataframely-2.7.0-cp310-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 ab84c8bc74ae41cfc68f5e9d13968d1583d5449ccc9e2f85053a362e72eb9024
MD5 75d707aab1ef6a7f663e26079ea2c0ee
BLAKE2b-256 4b35276f76f675f6aa6daf68aa136521958f2305d2d53f28dd55ad000956df24

Provenance

The following attestation bundles were made for dataframely-2.7.0-cp310-abi3-win_amd64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.7.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for dataframely-2.7.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 6fc8850a27abe357f03f8a3f7dafb6895c3e13ead160dba67ca04d95e91059ac
MD5 20f343119a5a3f1940ca829515d41072
BLAKE2b-256 7bf5a383d8c068e546f081a058298695aa80d43a9142c1bd8bc0cac70da1abce

Provenance

The following attestation bundles were made for dataframely-2.7.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.7.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for dataframely-2.7.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 6f726c49fe907d87ef121b3a24050dbe155b6cf2a98c48dee999d30e5170097f
MD5 c69a1721bfb4cc14813a8e806894dbd7
BLAKE2b-256 e443270ec5bffc8bbd4b392ae004615f58c887c5a4eaf0ea43c28755c20a38e6

Provenance

The following attestation bundles were made for dataframely-2.7.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.7.0-cp310-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for dataframely-2.7.0-cp310-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 492eb798d336720e7df82ea794a04ce7d8eaa5c23a9e346ce3918addef579da8
MD5 cd7f02c2f5fcdfca3bcdacbfc70307ab
BLAKE2b-256 46b1c2d4cc98a72f41ad142a3c2dacb400d61eeda973cccae492dec473598315

Provenance

The following attestation bundles were made for dataframely-2.7.0-cp310-abi3-macosx_11_0_arm64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.7.0-cp310-abi3-macosx_10_12_x86_64.whl.

File hashes

Hashes for dataframely-2.7.0-cp310-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 9cb9717fc4ba54e718468c019a7850672caba4b3d770f7fd729c74f74ec54eae
MD5 2876bd10837d9b91ddf327876cbfe6a2
BLAKE2b-256 bf85ed400079179f8630be56ab22c1ea082c1da3433dddffcbf31dde33f20594

Provenance

The following attestation bundles were made for dataframely-2.7.0-cp310-abi3-macosx_10_12_x86_64.whl:

Publisher: build.yml on Quantco/dataframely
