A declarative, polars-native data frame validation library

dataframely — A declarative, 🐻‍❄️-native data frame validation library

📖 Introduction

Dataframely is a Python package for validating the schema and content of polars data frames. It aims to make data pipelines more robust, by ensuring that data meets expectations, and more readable, by adding schema information to data frame type hints.

💿 Installation

You can install dataframely using your favorite package manager, e.g., pixi or pip:

pixi add dataframely
pip install dataframely

🎯 Usage

Defining a data frame schema

import dataframely as dy
import polars as pl

class HouseSchema(dy.Schema):
    zip_code = dy.String(nullable=False, min_length=3)
    num_bedrooms = dy.UInt8(nullable=False)
    num_bathrooms = dy.UInt8(nullable=False)
    price = dy.Float64(nullable=False)

    @dy.rule()
    def reasonable_bathroom_to_bedroom_ratio(cls) -> pl.Expr:
        ratio = pl.col("num_bathrooms") / pl.col("num_bedrooms")
        return (ratio >= 1 / 3) & (ratio <= 3)

    @dy.rule(group_by=["zip_code"])
    def minimum_zip_code_count(cls) -> pl.Expr:
        return pl.len() >= 2
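
To make the two rule styles concrete, here is a plain-Python sketch of what each rule asserts: the first rule is checked per row, while the `group_by` rule is checked once per `zip_code` group and its outcome applies to every row of that group. This mirrors only the logic; dataframely evaluates rules as polars expressions, and the helper names `ratio_ok` and `zip_counts_ok` as well as the sample rows are hypothetical.

```python
from collections import Counter

# Hypothetical sample rows: (zip_code, num_bedrooms, num_bathrooms).
rows = [
    ("01234", 2, 1),
    ("01234", 2, 2),
    ("213", 2, 8),
]


def ratio_ok(num_bedrooms: int, num_bathrooms: int) -> bool:
    """Row-wise check: bathrooms per bedroom must lie in [1/3, 3]."""
    ratio = num_bathrooms / num_bedrooms
    return 1 / 3 <= ratio <= 3


def zip_counts_ok(zip_codes: list[str], minimum: int = 2) -> list[bool]:
    """Group-wise check: each zip code must occur at least `minimum` times."""
    counts = Counter(zip_codes)
    return [counts[z] >= minimum for z in zip_codes]


row_results = [ratio_ok(beds, baths) for _, beds, baths in rows]
group_results = zip_counts_ok([z for z, _, _ in rows])
print(row_results)    # [True, True, False]
print(group_results)  # [True, True, False]
```

The last row fails both checks: its ratio is 4, and its zip code occurs only once.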

Validating data against the schema

import polars as pl

df = pl.DataFrame({
    "zip_code": ["01234", "01234", "1", "213", "123", "213"],
    "num_bedrooms": [2, 2, 1, None, None, 2],
    "num_bathrooms": [1, 2, 1, 1, 0, 8],
    "price": [100_000, 110_000, 50_000, 80_000, 60_000, 160_000]
})

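# Validate the data and cast columns to the expected types.
# `validate` raises if any row violates the schema, which is the case for
# this example data (e.g., the null values in `num_bedrooms`).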
validated_df: dy.DataFrame[HouseSchema] = HouseSchema.validate(df, cast=True)

See more advanced usage examples in the documentation.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dataframely-2.3.1.tar.gz (378.7 kB)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

dataframely-2.3.1-cp310-abi3-win_amd64.whl (5.3 MB)

Uploaded CPython 3.10+, Windows x86-64

dataframely-2.3.1-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (5.3 MB)

Uploaded CPython 3.10+, manylinux: glibc 2.17+ x86-64

dataframely-2.3.1-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (5.0 MB)

Uploaded CPython 3.10+, manylinux: glibc 2.17+ ARM64

dataframely-2.3.1-cp310-abi3-macosx_11_0_arm64.whl (4.8 MB)

Uploaded CPython 3.10+, macOS 11.0+ ARM64

dataframely-2.3.1-cp310-abi3-macosx_10_12_x86_64.whl (5.2 MB)

Uploaded CPython 3.10+, macOS 10.12+ x86-64
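
The wheel names listed here follow the standard wheel naming scheme, `{distribution}-{version}-{python tag}-{abi tag}-{platform tag}.whl`, with an optional build tag after the version. As a rough illustration of how to read them, the sketch below splits a name into those fields. `parse_wheel_name` is a hypothetical helper written for this example; real tooling should rely on `packaging.utils.parse_wheel_filename` instead.

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_name(filename: str) -> WheelName:
    """Split a wheel file name into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        dist, version, py, abi, plat = parts
        build = None
    elif len(parts) == 6:
        # The optional build tag sits between version and python tag.
        dist, version, build, py, abi, plat = parts
    else:
        raise ValueError(f"unexpected number of components in {filename!r}")
    return WheelName(dist, version, build, py, abi, plat)


name = parse_wheel_name(
    "dataframely-2.3.1-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"
)
print(name.python_tag, name.abi_tag, name.platform_tag)
```

The combination `cp310` and `abi3` marks a wheel built against CPython's stable ABI starting at version 3.10, which is why each wheel is labeled "CPython 3.10+".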

File details

Details for the file dataframely-2.3.1.tar.gz.

File metadata

  • Download URL: dataframely-2.3.1.tar.gz
  • Size: 378.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for dataframely-2.3.1.tar.gz
Algorithm Hash digest
SHA256 70b937d9d6fa30b0c60318509a1c67ce22142a9f54617825da66ad7a3330127d
MD5 7c5c3576c9f2530cb3e9dc65ff955a0c
BLAKE2b-256 1a732066358030c8d0ecc2789b0c347abffec9179760a1c3cbf5a865a89eb983

See more details on using hashes here.
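
A published digest only helps if the downloaded file is actually checked against it. A minimal sketch of that check with Python's standard `hashlib` follows. `sha256_of` and `matches_digest` are hypothetical helper names, and the archive path is an assumption about where the file was saved.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())


# SHA256 value listed for the source distribution.
expected = "70b937d9d6fa30b0c60318509a1c67ce22142a9f54617825da66ad7a3330127d"
# Assumed download location; adjust to wherever the archive was saved.
archive = Path("dataframely-2.3.1.tar.gz")
if archive.exists():
    print("digest OK" if matches_digest(archive, expected) else "digest MISMATCH")
```

When installing with pip, the same guarantee can be obtained without custom code by pinning hashes in a requirements file and using pip's hash-checking mode.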

Provenance

The following attestation bundles were made for dataframely-2.3.1.tar.gz:

Publisher: build.yml on Quantco/dataframely

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file dataframely-2.3.1-cp310-abi3-win_amd64.whl.

File metadata

  • Download URL: dataframely-2.3.1-cp310-abi3-win_amd64.whl
  • Size: 5.3 MB
  • Tags: CPython 3.10+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for dataframely-2.3.1-cp310-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 415e2148645ed44fcd2a920a71275d77283f7efb3f19a1f4cce66537c8d9cae3
MD5 45df18675d1a36d05547dbec590e0e51
BLAKE2b-256 78574d2baa2f3b0d28d0cb2515ad1d94103d5cebed5e62dd5eff7f329b17d39e

Provenance

The following attestation bundles were made for dataframely-2.3.1-cp310-abi3-win_amd64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.3.1-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for dataframely-2.3.1-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 faa92e74c4afde4509ed10dd6bedd5e98a38ac835123e9fc8d7f0b4c76e0c059
MD5 8f9d51e6e1273d76c56853f7ca85b9c3
BLAKE2b-256 c39b926612ac0cb46d2b1f606c69485e514afabbb44a902828c25777978e768f

Provenance

The following attestation bundles were made for dataframely-2.3.1-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.3.1-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for dataframely-2.3.1-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 4e26e20d0f8d9e70fbc5926e6c1398951eb6dc74e345832f132fe1ca3e972500
MD5 60487c5771fe149263b787fc546c1d92
BLAKE2b-256 73f2235f1ac9fa998e6e08b500d50f24c35719bb3e8d40f288b3934a05770ef5

Provenance

The following attestation bundles were made for dataframely-2.3.1-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.3.1-cp310-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for dataframely-2.3.1-cp310-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 c65ba3c6bf4e30586fc85256618921162712b2fcb80bfd8304f5a99c384389fe
MD5 4dc62618bcbd453657fa2257f0dcb66e
BLAKE2b-256 e4140b58c86fe685c3c3c461cb2bf8934d1db37b40ca4789c83cdf9e1520b1e1

Provenance

The following attestation bundles were made for dataframely-2.3.1-cp310-abi3-macosx_11_0_arm64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.3.1-cp310-abi3-macosx_10_12_x86_64.whl.

File hashes

Hashes for dataframely-2.3.1-cp310-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 c4685865392bbe5c7dafbf06394f9779d25690d37fa4f87cc0196019504101dd
MD5 403e6c02d3dfa8cde973afd711a699fe
BLAKE2b-256 fab79034788ccc16cfef1c13bcedca7bb80b45a93a7eee02f13ac7e34dbc58c9

Provenance

The following attestation bundles were made for dataframely-2.3.1-cp310-abi3-macosx_10_12_x86_64.whl:

Publisher: build.yml on Quantco/dataframely
