A declarative, polars-native data frame validation library

dataframely — A declarative, 🐻‍❄️-native data frame validation library

📖 Introduction

Dataframely is a Python package to validate the schema and content of polars data frames. It aims to make data pipelines more robust by ensuring that data meets expectations, and more readable by adding schema information to data frame type hints.

💿 Installation

You can install dataframely using your favorite package manager, e.g., pixi or pip:

pixi add dataframely
pip install dataframely

🎯 Usage

Defining a data frame schema

import dataframely as dy
import polars as pl

class HouseSchema(dy.Schema):
    zip_code = dy.String(nullable=False, min_length=3)
    num_bedrooms = dy.UInt8(nullable=False)
    num_bathrooms = dy.UInt8(nullable=False)
    price = dy.Float64(nullable=False)

    @dy.rule()
    def reasonable_bathroom_to_bedroom_ratio(cls) -> pl.Expr:
        ratio = pl.col("num_bathrooms") / pl.col("num_bedrooms")
        return (ratio >= 1 / 3) & (ratio <= 3)

    @dy.rule(group_by=["zip_code"])
    def minimum_zip_code_count(cls) -> pl.Expr:
        return pl.len() >= 2

Validating data against the schema

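import dataframely as dy
import polars as pl

# Sample data. Several rows deliberately violate HouseSchema:
# null num_bedrooms, a one-character zip code, and an 8:2 bathroom-to-bedroom ratio.
df = pl.DataFrame({
    "zip_code": ["01234", "01234", "1", "213", "123", "213"],
    "num_bedrooms": [2, 2, 1, None, None, 2],
    "num_bathrooms": [1, 2, 1, 1, 0, 8],
    "price": [100_000, 110_000, 50_000, 80_000, 60_000, 160_000]
})

# Validate the data and cast columns to the expected types.
# With the invalid rows above, validate is expected to raise instead of returning;
# for fully valid data it returns the frame typed as dy.DataFrame[HouseSchema].
validated_df: dy.DataFrame[HouseSchema] = HouseSchema.validate(df, cast=True)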

See more advanced usage examples in the documentation.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dataframely-2.9.1.tar.gz (422.8 kB)

Uploaded: Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

dataframely-2.9.1-cp310-abi3-win_amd64.whl (5.6 MB)

Uploaded: CPython 3.10+, Windows x86-64

dataframely-2.9.1-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (5.6 MB)

Uploaded: CPython 3.10+, manylinux: glibc 2.17+ x86-64

dataframely-2.9.1-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (5.3 MB)

Uploaded: CPython 3.10+, manylinux: glibc 2.17+ ARM64

dataframely-2.9.1-cp310-abi3-macosx_11_0_arm64.whl (5.0 MB)

Uploaded: CPython 3.10+, macOS 11.0+ ARM64

dataframely-2.9.1-cp310-abi3-macosx_10_12_x86_64.whl (5.4 MB)

Uploaded: CPython 3.10+, macOS 10.12+ x86-64

File details

Details for the file dataframely-2.9.1.tar.gz.

File metadata

  • Download URL: dataframely-2.9.1.tar.gz
  • Upload date:
  • Size: 422.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for dataframely-2.9.1.tar.gz
Algorithm Hash digest
SHA256 450dc40f9c7d442e5493db45ece18861c0609ff12b0df4ac32c05b47f543978c
MD5 f7458cd4a829bc72174b695231f6b351
BLAKE2b-256 e18ee58ad5c3ab3d3f09a4d3cc49ba066c6f41b716b8a2d88b35ec7bc7d2e0f8

See more details on using hashes here.
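
One way to use the SHA256 digest listed above is to recompute it for the downloaded file and compare the two. The following is a minimal sketch using only the Python standard library; the helper names `file_sha256` and `matches_published_hash` and the assumed download location in the current directory are illustrative, not part of dataframely or PyPI tooling.

```python
import hashlib
import hmac
from pathlib import Path

# SHA256 digest listed above for dataframely-2.9.1.tar.gz.
PUBLISHED_SHA256 = "450dc40f9c7d442e5493db45ece18861c0609ff12b0df4ac32c05b47f543978c"


def file_sha256(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=PUBLISHED_SHA256):
    """Compare the file's digest with the expected hex digest (case-insensitive)."""
    return hmac.compare_digest(file_sha256(path), expected.lower())


# Hypothetical location: the sdist downloaded into the current directory.
sdist = Path("dataframely-2.9.1.tar.gz")
if sdist.exists():
    print("hash OK" if matches_published_hash(sdist) else "hash MISMATCH")
```

Reading in chunks keeps memory use flat for the larger wheel files; the same function works for them with the corresponding digest passed as `expected`.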

Provenance

The following attestation bundles were made for dataframely-2.9.1.tar.gz:

Publisher: build.yml on Quantco/dataframely

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file dataframely-2.9.1-cp310-abi3-win_amd64.whl.

File metadata

  • Download URL: dataframely-2.9.1-cp310-abi3-win_amd64.whl
  • Upload date:
  • Size: 5.6 MB
  • Tags: CPython 3.10+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for dataframely-2.9.1-cp310-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 4c6f79012a3f1388f6c53c2b991e03f26b4c58bbc0d6055a585342c4bbc093bf
MD5 1d7ce3e43e71cdefef817bae26a3057d
BLAKE2b-256 1bac495db153170034549ffbb4b4edbb264309f21c7078afa42d37ddb96881fd

Provenance

The following attestation bundles were made for dataframely-2.9.1-cp310-abi3-win_amd64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.9.1-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for dataframely-2.9.1-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 d040e432944632e3ebdbb97710ec54f2abac4cc0fb8c4ca1e0eaa9612426a114
MD5 604de248b1de3870bea1b417496fa37d
BLAKE2b-256 590dbd6ac8608749658e4f46ee2af31ba18717eb38a5815c5e594caf8d51335f

Provenance

The following attestation bundles were made for dataframely-2.9.1-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.9.1-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for dataframely-2.9.1-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 fb73dc25bc288ef177a0499d9a9bb34e9538238681b6351d926d7c1df49470c9
MD5 1c7e0ed5198e553acdd8bbaad7a54739
BLAKE2b-256 62ca2632f806f430adc8af37a6fa7c809c6e20487a5d480d01e8a6dca2c28744

Provenance

The following attestation bundles were made for dataframely-2.9.1-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.9.1-cp310-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for dataframely-2.9.1-cp310-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 5dfa28f36151aabe4f42683766660741e8824cd5d29d4c061284681aa796ec43
MD5 cf185aead3dffdc8b9de53114a00f821
BLAKE2b-256 c5c4445984e535e6f9e7ae9be0557d91b0a60af8a4d13d04fdf0b5938845d351

Provenance

The following attestation bundles were made for dataframely-2.9.1-cp310-abi3-macosx_11_0_arm64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.9.1-cp310-abi3-macosx_10_12_x86_64.whl.

File hashes

Hashes for dataframely-2.9.1-cp310-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 37c365860eb7bf3e10c24f9f04af6c217c0b91e7611070334152f23487c79708
MD5 412f494c8b9f6b1391ecb8b8ad8e5648
BLAKE2b-256 fce00f432cf872dbeeb5e2b3f24aff0b9f4120367e33f575f35491e9cb4d7d72

Provenance

The following attestation bundles were made for dataframely-2.9.1-cp310-abi3-macosx_10_12_x86_64.whl:

Publisher: build.yml on Quantco/dataframely
