
A declarative, polars-native data frame validation library



dataframely — A declarative, 🐻‍❄️-native data frame validation library



📖 Introduction

Dataframely is a Python package for validating the schema and content of polars data frames. It aims to make data pipelines more robust, by ensuring that data meets expectations, and more readable, by adding schema information to data frame type hints.

💿 Installation

You can install dataframely using your favorite package manager, e.g., pixi or pip:

pixi add dataframely
pip install dataframely

🎯 Usage

Defining a data frame schema

import dataframely as dy
import polars as pl

class HouseSchema(dy.Schema):
    zip_code = dy.String(nullable=False, min_length=3)
    num_bedrooms = dy.UInt8(nullable=False)
    num_bathrooms = dy.UInt8(nullable=False)
    price = dy.Float64(nullable=False)

    @dy.rule()
    def reasonable_bathroom_to_bedroom_ratio(cls) -> pl.Expr:
        ratio = pl.col("num_bathrooms") / pl.col("num_bedrooms")
        return (ratio >= 1 / 3) & (ratio <= 3)

    @dy.rule(group_by=["zip_code"])
    def minimum_zip_code_count(cls) -> pl.Expr:
        return pl.len() >= 2

Validating data against the schema

import dataframely as dy
import polars as pl

df = pl.DataFrame({
    "zip_code": ["01234", "01234", "1", "213", "123", "213"],
    "num_bedrooms": [2, 2, 1, None, None, 2],
    "num_bathrooms": [1, 2, 1, 1, 0, 8],
    "price": [100_000, 110_000, 50_000, 80_000, 60_000, 160_000]
})

# Validate the data and cast columns to expected types
validated_df: dy.DataFrame[HouseSchema] = HouseSchema.validate(df, cast=True)
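
Not every row of the sample frame above satisfies HouseSchema. To reason about which rows pass, the schema's checks can be restated in plain Python. This is only an illustrative sketch: it does not use dataframely, the helper failed_checks and the check labels are hypothetical names, and it assumes that a rule which cannot be evaluated because of a missing value (here, the ratio when num_bedrooms is null) is not counted as a violation of that rule.

```python
from collections import Counter

# Same values as the sample frame; price is left out because the schema
# only requires it to be non-null, which every sample row satisfies.
rows = [
    {"zip_code": "01234", "num_bedrooms": 2, "num_bathrooms": 1},
    {"zip_code": "01234", "num_bedrooms": 2, "num_bathrooms": 2},
    {"zip_code": "1", "num_bedrooms": 1, "num_bathrooms": 1},
    {"zip_code": "213", "num_bedrooms": None, "num_bathrooms": 1},
    {"zip_code": "123", "num_bedrooms": None, "num_bathrooms": 0},
    {"zip_code": "213", "num_bedrooms": 2, "num_bathrooms": 8},
]


def failed_checks(rows):
    """Return, for each row, the labels of the checks that the row violates."""
    # Group-level check: count how often each zip code occurs in the frame.
    zip_counts = Counter(row["zip_code"] for row in rows)
    results = []
    for row in rows:
        failed = []
        # Column-level check: zip_code needs at least three characters.
        if len(row["zip_code"]) < 3:
            failed.append("zip_code min_length")
        # Column-level check: num_bedrooms must not be null.
        if row["num_bedrooms"] is None:
            failed.append("num_bedrooms nullability")
        elif row["num_bedrooms"] > 0:
            # Row-level rule: bathroom-to-bedroom ratio within [1/3, 3].
            # Zero bedrooms does not occur in the sample and is skipped here.
            ratio = row["num_bathrooms"] / row["num_bedrooms"]
            if not (1 / 3 <= ratio <= 3):
                failed.append("bathroom_to_bedroom_ratio")
        # Group-level rule: each zip code must appear at least twice.
        if zip_counts[row["zip_code"]] < 2:
            failed.append("minimum_zip_code_count")
        results.append(failed)
    return results


for index, failed in enumerate(failed_checks(rows)):
    print(index, failed or "ok")
```

Under these assumptions, only the first two rows pass every check, so a strict validation of the full sample frame should be expected to report failures rather than succeed.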

See more advanced usage examples in the documentation.



Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dataframely-2.8.1.tar.gz (416.8 kB)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

dataframely-2.8.1-cp310-abi3-win_amd64.whl (5.6 MB)

Uploaded CPython 3.10+ Windows x86-64

dataframely-2.8.1-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (5.6 MB)

Uploaded CPython 3.10+ manylinux: glibc 2.17+ x86-64

dataframely-2.8.1-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (5.3 MB)

Uploaded CPython 3.10+ manylinux: glibc 2.17+ ARM64

dataframely-2.8.1-cp310-abi3-macosx_11_0_arm64.whl (5.0 MB)

Uploaded CPython 3.10+ macOS 11.0+ ARM64

dataframely-2.8.1-cp310-abi3-macosx_10_12_x86_64.whl (5.4 MB)

Uploaded CPython 3.10+ macOS 10.12+ x86-64

File details

Details for the file dataframely-2.8.1.tar.gz.

File metadata

  • Download URL: dataframely-2.8.1.tar.gz
  • Size: 416.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for dataframely-2.8.1.tar.gz
Algorithm Hash digest
SHA256 d96c6497432f1e5d825eb7880f4a6989bd1778b6a2953357a5c6859bb90f10c1
MD5 3c1fce6c889d286439692d9f01a629b5
BLAKE2b-256 c65112c248cbc1b2783daceb203c9904a27a6c3e83fedf3acf87c6b5eed8b400

See more details on using hashes here.
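
A downloaded file can be checked against the SHA256 digest listed above with Python's standard hashlib module. The sketch below is illustrative only: the helper names sha256_of and matches_published_digest and the local path are assumptions, and it presumes the source archive has already been downloaded into the working directory.

```python
import hashlib
from pathlib import Path

# SHA256 digest as listed for dataframely-2.8.1.tar.gz
EXPECTED_SHA256 = "d96c6497432f1e5d825eb7880f4a6989bd1778b6a2953357a5c6859bb90f10c1"


def sha256_of(path, chunk_size=1 << 20):
    """Stream the file in chunks and return its SHA256 hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected value."""
    return sha256_of(path) == expected


if __name__ == "__main__":
    archive = Path("dataframely-2.8.1.tar.gz")  # hypothetical local path
    if archive.exists():
        print("digest matches:", matches_published_digest(archive))
    else:
        print("archive not found, nothing to verify")
```

Reading in chunks keeps memory use flat for the larger wheel files; the same helper works for any of the listed files by passing the corresponding digest as expected.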

Provenance

The following attestation bundles were made for dataframely-2.8.1.tar.gz:

Publisher: build.yml on Quantco/dataframely

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file dataframely-2.8.1-cp310-abi3-win_amd64.whl.

File metadata

  • Download URL: dataframely-2.8.1-cp310-abi3-win_amd64.whl
  • Size: 5.6 MB
  • Tags: CPython 3.10+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for dataframely-2.8.1-cp310-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 488f82745af1450647633713d55ba7571c1b9832be9bcd861d9278bc689c6431
MD5 14611452a9995ed0426aaa6a206aa326
BLAKE2b-256 29d2df3f89c4271f0d07d3ae4d6042802063e8905cf5077dba40847f49c4c139


Provenance

The following attestation bundles were made for dataframely-2.8.1-cp310-abi3-win_amd64.whl:

Publisher: build.yml on Quantco/dataframely


File details

Details for the file dataframely-2.8.1-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.


File hashes

Hashes for dataframely-2.8.1-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 236580d0edfb9aa2353469f4ff66044f2946967858bb16cf3bf26f0560b2cc5a
MD5 7c4813edc0ee06f929bbb25fd74f9089
BLAKE2b-256 5f37d17e17e0694d3abb223fe99ee9fa1e5571e58159663c412fd69c26d59211


Provenance

The following attestation bundles were made for dataframely-2.8.1-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: build.yml on Quantco/dataframely


File details

Details for the file dataframely-2.8.1-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.


File hashes

Hashes for dataframely-2.8.1-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 9739c23acebd2845e46d40a5955da29e8c4b0535a4dcca32c41978372cc4c307
MD5 a23804d71bdbe7a42d4ed3db7d77aa76
BLAKE2b-256 91f67a312dd8fbfbd87b5aeb031d965d7180056a3c095297632fcfb9fcde07ca


Provenance

The following attestation bundles were made for dataframely-2.8.1-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl:

Publisher: build.yml on Quantco/dataframely


File details

Details for the file dataframely-2.8.1-cp310-abi3-macosx_11_0_arm64.whl.


File hashes

Hashes for dataframely-2.8.1-cp310-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 a1f305c5ab3cf0fed87a90ace616e8b1c64b73312a628dd8d8468e204f8d8b50
MD5 9592cfb962a1af9c988cc1eb74effbe2
BLAKE2b-256 a9f36928f1cd90ea668431c123dc69fc2d6fef3b95b164834aab666e0c1bb1bb


Provenance

The following attestation bundles were made for dataframely-2.8.1-cp310-abi3-macosx_11_0_arm64.whl:

Publisher: build.yml on Quantco/dataframely


File details

Details for the file dataframely-2.8.1-cp310-abi3-macosx_10_12_x86_64.whl.


File hashes

Hashes for dataframely-2.8.1-cp310-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 5abab5fb9bd017d174b3a65db4ac4b7bafd51b1839001413f998afe56335c8b9
MD5 e884f9ef5cca1325385c19d9ff9132a1
BLAKE2b-256 e75ebe2e679ae5a6a8f5b53d14154c4b2b2e57a502d346b76d9765d9c322554e


Provenance

The following attestation bundles were made for dataframely-2.8.1-cp310-abi3-macosx_10_12_x86_64.whl:

Publisher: build.yml on Quantco/dataframely

