A declarative, polars-native data frame validation library

dataframely — A declarative, 🐻‍❄️-native data frame validation library

📖 Introduction

Dataframely is a Python package for validating the schema and content of polars data frames. It aims to make data pipelines more robust, by ensuring that data meets expectations, and more readable, by adding schema information to data frame type hints.

💿 Installation

You can install dataframely using your favorite package manager, e.g., pixi or pip:

pixi add dataframely
pip install dataframely

🎯 Usage

Defining a data frame schema

import dataframely as dy
import polars as pl

class HouseSchema(dy.Schema):
    # Column definitions: data type plus per-column constraints
    zip_code = dy.String(nullable=False, min_length=3)
    num_bedrooms = dy.UInt8(nullable=False)
    num_bathrooms = dy.UInt8(nullable=False)
    price = dy.Float64(nullable=False)

    # Row-level rule: the expression is evaluated for each row
    @dy.rule()
    def reasonable_bathroom_to_bedroom_ratio(cls) -> pl.Expr:
        ratio = pl.col("num_bathrooms") / pl.col("num_bedrooms")
        return (ratio >= 1 / 3) & (ratio <= 3)

    # Group-level rule: the expression is evaluated once per zip_code group
    @dy.rule(group_by=["zip_code"])
    def minimum_zip_code_count(cls) -> pl.Expr:
        return pl.len() >= 2
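
To make the difference between the two rules concrete, here is a minimal sketch in plain Python that mimics what they check. It is not how dataframely evaluates rules; the sample rows and the helper names `ratio_ok` and `zip_count_ok` are made up for illustration, and null handling is left out entirely.

```python
from collections import Counter

# Illustrative rows: (zip_code, num_bedrooms, num_bathrooms).
rows = [
    ("01234", 2, 1),
    ("01234", 2, 2),
    ("213", 2, 8),
    ("213", 1, 1),
    ("123", 1, 1),
]


def ratio_ok(bedrooms: int, bathrooms: int) -> bool:
    """Row-level check: bathrooms / bedrooms must lie in [1/3, 3]."""
    ratio = bathrooms / bedrooms
    return 1 / 3 <= ratio <= 3


# Group-level check: count rows per zip code once, then look the count up.
zip_counts = Counter(zip_code for zip_code, _, _ in rows)


def zip_count_ok(zip_code: str) -> bool:
    """Group-level check: a zip code must occur in at least two rows."""
    return zip_counts[zip_code] >= 2


# One (ratio rule, zip count rule) pair per row.
results = [
    (ratio_ok(bedrooms, bathrooms), zip_count_ok(zip_code))
    for zip_code, bedrooms, bathrooms in rows
]

for row, result in zip(rows, results):
    print(row, result)
```

The row-level check depends only on the values within a single row, whereas the group-level check depends on how many rows share the same `zip_code`, so adding or removing a row can change the outcome for other rows in that group.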

Validating data against the schema

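import dataframely as dy
import polars as pl

# Sample data; several rows deliberately violate HouseSchema
# (a zip code that is too short, missing bedroom counts, an implausible ratio)
df = pl.DataFrame({
    "zip_code": ["01234", "01234", "1", "213", "123", "213"],
    "num_bedrooms": [2, 2, 1, None, None, 2],
    "num_bathrooms": [1, 2, 1, 1, 0, 8],
    "price": [100_000, 110_000, 50_000, 80_000, 60_000, 160_000],
})

# Validate the data and cast columns to expected types.
# Because of the invalid rows above, this call is expected to raise a validation error.
validated_df: dy.DataFrame[HouseSchema] = HouseSchema.validate(df, cast=True)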

See more advanced usage examples in the documentation.
