A declarative, polars-native data frame validation library

dataframely — A declarative, 🐻‍❄️-native data frame validation library

📖 Introduction

Dataframely is a Python package for validating the schema and content of polars data frames. It aims to make data pipelines more robust, by ensuring that data meets expectations, and more readable, by adding schema information to data frame type hints.

💿 Installation

You can install dataframely using your favorite package manager, e.g., pixi or pip:

pixi add dataframely
pip install dataframely

🎯 Usage

Defining a data frame schema

import dataframely as dy
import polars as pl

class HouseSchema(dy.Schema):
    zip_code = dy.String(nullable=False, min_length=3)
    num_bedrooms = dy.UInt8(nullable=False)
    num_bathrooms = dy.UInt8(nullable=False)
    price = dy.Float64(nullable=False)

    @dy.rule()
    def reasonable_bathroom_to_bedroom_ratio(cls) -> pl.Expr:
        ratio = pl.col("num_bathrooms") / pl.col("num_bedrooms")
        return (ratio >= 1 / 3) & (ratio <= 3)

    @dy.rule(group_by=["zip_code"])
    def minimum_zip_code_count(cls) -> pl.Expr:
        return pl.len() >= 2

Validating data against a schema

import dataframely as dy
import polars as pl

# HouseSchema is the schema class defined in the previous example

df = pl.DataFrame({
    "zip_code": ["01234", "01234", "1", "213", "123", "213"],
    "num_bedrooms": [2, 2, 1, None, None, 2],
    "num_bathrooms": [1, 2, 1, 1, 0, 8],
    "price": [100_000, 110_000, 50_000, 80_000, 60_000, 160_000]
})

# Validate the data and cast columns to expected types
validated_df: dy.DataFrame[HouseSchema] = HouseSchema.validate(df, cast=True)
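To make the checks in `HouseSchema` concrete, the following is a plain-Python restatement of what the column constraints and the two rules ask of the example rows. It is a sketch and not how dataframely evaluates a schema: the names `violations`, `report`, and `passing` are hypothetical, and it assumes that the ratio rule is only checked when both values are present and that zip codes are counted over the whole input.

```python
from collections import Counter

# The example rows from above as plain tuples:
# (zip_code, num_bedrooms, num_bathrooms). The price column has no
# constraint beyond its type, so it is left out of this sketch.
rows = [
    ("01234", 2, 1),
    ("01234", 2, 2),
    ("1", 1, 1),
    ("213", None, 1),
    ("123", None, 0),
    ("213", 2, 8),
]

# Assumption of this sketch: group sizes are counted over all input rows.
zip_counts = Counter(zip_code for zip_code, _, _ in rows)


def violations(zip_code, bedrooms, bathrooms):
    """Return the names of the checks that a single row fails."""
    failed = []
    if len(zip_code) < 3:
        failed.append("zip_code min_length")
    if bedrooms is None:
        failed.append("num_bedrooms nullability")
    if bathrooms is None:
        failed.append("num_bathrooms nullability")
    # Assumption: the ratio is only checked when both values are present.
    # No row in this data has zero bedrooms, so the division is safe here.
    if bedrooms is not None and bathrooms is not None:
        ratio = bathrooms / bedrooms
        if not (1 / 3 <= ratio <= 3):
            failed.append("reasonable_bathroom_to_bedroom_ratio")
    if zip_counts[zip_code] < 2:
        failed.append("minimum_zip_code_count")
    return failed


report = [violations(*row) for row in rows]
passing = [index for index, failed in enumerate(report) if not failed]

for index, failed in enumerate(report):
    print(index, failed or "ok")
```

Under these assumptions only the first two rows satisfy every check, so the example data is intentionally not fully valid with respect to `HouseSchema`.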

See more advanced usage examples in the documentation.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dataframely-2.5.0.tar.gz (381.6 kB): Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

dataframely-2.5.0-cp310-abi3-win_amd64.whl (5.3 MB): CPython 3.10+, Windows x86-64

dataframely-2.5.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (5.3 MB): CPython 3.10+, manylinux: glibc 2.17+ x86-64

dataframely-2.5.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (5.0 MB): CPython 3.10+, manylinux: glibc 2.17+ ARM64

dataframely-2.5.0-cp310-abi3-macosx_11_0_arm64.whl (4.8 MB): CPython 3.10+, macOS 11.0+ ARM64

dataframely-2.5.0-cp310-abi3-macosx_10_12_x86_64.whl (5.2 MB): CPython 3.10+, macOS 10.12+ x86-64

File details

Details for the file dataframely-2.5.0.tar.gz.

File metadata

  • Download URL: dataframely-2.5.0.tar.gz
  • Upload date:
  • Size: 381.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for dataframely-2.5.0.tar.gz
Algorithm Hash digest
SHA256 307073154bda665fd7bb1bfaa413bbf60f3479de758dd85ad3bb5ceaddc4be1e
MD5 afc5d0464be2eefb96701270efd3a7cd
BLAKE2b-256 3a1c1ccb3d5f2aba4ba35a26af380f88cdc91c8028ac8e59bd8a72da70aa2bc6

See more details on using hashes here.
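As an illustration of how a published digest can be used, here is a minimal sketch that checks a downloaded file against the SHA256 value listed above using Python's standard `hashlib`. The helper names `sha256_of_file` and `matches_published_hash` and the local file location are assumptions made for this example; they are not part of dataframely or of any packaging tool.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Return True if the file's SHA256 digest equals the published one."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# SHA256 digest listed above for dataframely-2.5.0.tar.gz
EXPECTED = "307073154bda665fd7bb1bfaa413bbf60f3479de758dd85ad3bb5ceaddc4be1e"

# Assumed location of a manually downloaded copy of the source distribution.
archive = Path("dataframely-2.5.0.tar.gz")
if archive.exists():
    print("hash ok" if matches_published_hash(archive, EXPECTED) else "hash mismatch")
```

Reading in chunks keeps memory use flat for the larger wheel files; the same function works for any of the files listed on this page when paired with the matching SHA256 value.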

Provenance

The following attestation bundles were made for dataframely-2.5.0.tar.gz:

Publisher: build.yml on Quantco/dataframely

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file dataframely-2.5.0-cp310-abi3-win_amd64.whl.

File metadata

  • Download URL: dataframely-2.5.0-cp310-abi3-win_amd64.whl
  • Upload date:
  • Size: 5.3 MB
  • Tags: CPython 3.10+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for dataframely-2.5.0-cp310-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 7ee93aa32e6c7ef7791008a9a363b40cea47175e8a8b56833e27646a4d9b9082
MD5 f39248e707f686106745aed0ab4db4f7
BLAKE2b-256 1acac3d837fd4e81f688af15ef9394e8964cf322974ab825559cce81ece304f5

Provenance

The following attestation bundles were made for dataframely-2.5.0-cp310-abi3-win_amd64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.5.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for dataframely-2.5.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 ccbc5d0147095f146a52c56def76eb7e13d6538d366245062c7020c73c5c37d8
MD5 f577807a004bead9fa0892a7dca291af
BLAKE2b-256 99f54f8241ca8f363c13d5b9a3a62ddaa1547cfcfa45f6496d5b96a7ec53defe

Provenance

The following attestation bundles were made for dataframely-2.5.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.5.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for dataframely-2.5.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 a984ef8a91415ae9d34003564833720f42a216bfea9223a8df86655ec761b05f
MD5 c5d40d35664833db30166e1fb7404246
BLAKE2b-256 eacbb1da6ce9d8db15cbceb8be2243ead23c3bd14cb5eeaca81b78bf9e157445

Provenance

The following attestation bundles were made for dataframely-2.5.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.5.0-cp310-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for dataframely-2.5.0-cp310-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 4fcfb312390d3819769298676cd694924a6c3a2908b16a129166450f745ab4ac
MD5 d4760db4108bb3fb16ac64898625a0eb
BLAKE2b-256 4028d91abd9a6c987c3ae1d9a063b8959b0df10c0434e436384d138178d24d0d

Provenance

The following attestation bundles were made for dataframely-2.5.0-cp310-abi3-macosx_11_0_arm64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.5.0-cp310-abi3-macosx_10_12_x86_64.whl.

File hashes

Hashes for dataframely-2.5.0-cp310-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 cf1b822a3b50c8411ce2ed50eb6fe2f386bf029d06df25a41d16ede8435e2641
MD5 9a0f88219a80a2c94e0e43dcbf2430fd
BLAKE2b-256 a580b63b9ee639ebd39113d092e4508633095163c2004dbb3f588ca5308acfe3

Provenance

The following attestation bundles were made for dataframely-2.5.0-cp310-abi3-macosx_10_12_x86_64.whl:

Publisher: build.yml on Quantco/dataframely
