
dataframely: A declarative, polars-native data frame validation library



📖 Introduction

Dataframely is a Python package for validating the schema and content of polars data frames. It aims to make data pipelines more robust, by ensuring that data meets expectations, and more readable, by adding schema information to data frame type hints.

💿 Installation

You can install dataframely using your favorite package manager, e.g., pixi or pip:

pixi add dataframely
pip install dataframely

🎯 Usage

Defining a data frame schema

import dataframely as dy
import polars as pl

class HouseSchema(dy.Schema):
    zip_code = dy.String(nullable=False, min_length=3)
    num_bedrooms = dy.UInt8(nullable=False)
    num_bathrooms = dy.UInt8(nullable=False)
    price = dy.Float64(nullable=False)

    # Row-level rule: the expression is evaluated once per row.
    @dy.rule()
    def reasonable_bathroom_to_bedroom_ratio(cls) -> pl.Expr:
        ratio = pl.col("num_bathrooms") / pl.col("num_bedrooms")
        return (ratio >= 1 / 3) & (ratio <= 3)

    # Group-level rule: each zip code must appear in at least two rows.
    @dy.rule(group_by=["zip_code"])
    def minimum_zip_code_count(cls) -> pl.Expr:
        return pl.len() >= 2

Validating data against a schema

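import dataframely as dy
import polars as pl

# Sample data; several rows deliberately violate HouseSchema (defined above):
# a too-short zip code, null bedroom counts, and an out-of-range bathroom ratio.
df = pl.DataFrame({
    "zip_code": ["01234", "01234", "1", "213", "123", "213"],
    "num_bedrooms": [2, 2, 1, None, None, 2],
    "num_bathrooms": [1, 2, 1, 1, 0, 8],
    "price": [100_000, 110_000, 50_000, 80_000, 60_000, 160_000]
})

# Validate the data and cast columns to expected types.
# validate() raises if any row fails a column constraint or rule.
validated_df: dy.DataFrame[HouseSchema] = HouseSchema.validate(df, cast=True)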
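
To make the schema's checks concrete, the following plain-Python sketch re-implements them by hand for the sample rows above, without dataframely or polars. It is only an illustration of what the rules are meant to express, not how dataframely evaluates them. The helper `violations`, the check names it returns, and the choices to count zip codes over the full input and to skip the ratio check when `num_bedrooms` is missing are all assumptions made for this example.

```python
from collections import Counter

# Same sample rows as above, as plain tuples:
# (zip_code, num_bedrooms, num_bathrooms, price)
rows = [
    ("01234", 2, 1, 100_000),
    ("01234", 2, 2, 110_000),
    ("1", 1, 1, 50_000),
    ("213", None, 1, 80_000),
    ("123", None, 0, 60_000),
    ("213", 2, 8, 160_000),
]

# Group-level check: count how often each zip code occurs.
# Assumption: counts are taken over the full input, before any row is dropped.
zip_counts = Counter(zip_code for zip_code, *_ in rows)


def violations(row):
    """Return the (hypothetical) names of the checks a single row fails."""
    zip_code, bedrooms, bathrooms, _price = row
    failed = []
    if len(zip_code) < 3:  # mirrors min_length=3
        failed.append("zip_code min_length")
    if bedrooms is None:  # mirrors nullable=False
        failed.append("num_bedrooms nullable")
    elif bedrooms == 0 or not (1 / 3 <= bathrooms / bedrooms <= 3):
        # Assumption: the ratio is only checked when bedrooms is present.
        failed.append("bathroom_to_bedroom_ratio")
    if zip_counts[zip_code] < 2:  # mirrors the group_by=["zip_code"] rule
        failed.append("minimum_zip_code_count")
    return failed


# Rows that pass every check in this hand-written sketch.
valid_rows = [row for row in rows if not violations(row)]

for index, row in enumerate(rows):
    print(index, violations(row) or "ok")
```

In this sketch only the two "01234" rows pass every check, so strict validation of the sample frame should be expected to fail rather than return a validated frame.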

See more advanced usage examples in the documentation.


