dataframely — A declarative, 🐻‍❄️-native data frame validation library

📖 Introduction

Dataframely is a Python package for validating the schema and content of polars data frames. It aims to make data pipelines more robust, by ensuring that data meets expectations, and more readable, by adding schema information to data frame type hints.

💿 Installation

You can install dataframely using your favorite package manager, e.g., pixi or pip:

pixi add dataframely
pip install dataframely

🎯 Usage

Defining a data frame schema

import dataframely as dy
import polars as pl

class HouseSchema(dy.Schema):
    # Column definitions: data type plus column-level constraints
    zip_code = dy.String(nullable=False, min_length=3)
    num_bedrooms = dy.UInt8(nullable=False)
    num_bathrooms = dy.UInt8(nullable=False)
    price = dy.Float64(nullable=False)

    # Row-level rule: the returned expression is evaluated for each row
    @dy.rule()
    def reasonable_bathroom_to_bedroom_ratio() -> pl.Expr:
        ratio = pl.col("num_bathrooms") / pl.col("num_bedrooms")
        return (ratio >= 1 / 3) & (ratio <= 3)

    # Group-level rule: the returned expression is evaluated per zip code
    @dy.rule(group_by=["zip_code"])
    def minimum_zip_code_count() -> pl.Expr:
        return pl.len() >= 2
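
To make the two rules above more concrete, here is a minimal plain-Python sketch of the conditions they express. It does not use dataframely or polars and is not how the library evaluates rules internally. The rows and the helper names `ratio_ok` and `zip_count_ok` are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical sample rows (complete, no missing values), for illustration only.
rows = [
    {"zip_code": "01234", "num_bedrooms": 2, "num_bathrooms": 1},
    {"zip_code": "01234", "num_bedrooms": 2, "num_bathrooms": 2},
    {"zip_code": "213", "num_bedrooms": 2, "num_bathrooms": 8},
]


def ratio_ok(row):
    """Row-level condition: bathrooms per bedroom must lie in [1/3, 3]."""
    ratio = row["num_bathrooms"] / row["num_bedrooms"]
    return 1 / 3 <= ratio <= 3


# Group-level condition: count rows per zip code once, then check each row's group.
zip_counts = Counter(row["zip_code"] for row in rows)


def zip_count_ok(row):
    """Group-level condition: the row's zip code must occur at least twice."""
    return zip_counts[row["zip_code"]] >= 2


ratio_results = [ratio_ok(row) for row in rows]
count_results = [zip_count_ok(row) for row in rows]

print(ratio_results)  # [True, True, False]
print(count_results)  # [True, True, False]
```

The third row fails both conditions: its ratio is 8 / 2 = 4, and its zip code appears only once.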

Validating data against a schema

import polars as pl

df = pl.DataFrame({
    "zip_code": ["01234", "01234", "1", "213", "123", "213"],
    "num_bedrooms": [2, 2, 1, None, None, 2],
    "num_bathrooms": [1, 2, 1, 1, 0, 8],
    "price": [100_000, 110_000, 50_000, 80_000, 60_000, 160_000]
})

# Validate the data and cast columns to expected types
validated_df: dy.DataFrame[HouseSchema] = HouseSchema.validate(df, cast=True)
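
Note that the sample frame above contains rows that do not satisfy `HouseSchema`, so validation of this exact data is expected to fail rather than return a frame; the documentation describes the error raised and the alternatives. As a rough illustration of why, the plain-Python sketch below checks two of the column constraints (`min_length=3` on `zip_code` and `nullable=False` on `num_bedrooms`) against the same values. It does not call dataframely, and the variable names are chosen only for this example.

```python
# Same values as the "zip_code" and "num_bedrooms" columns of the sample frame.
zip_codes = ["01234", "01234", "1", "213", "123", "213"]
num_bedrooms = [2, 2, 1, None, None, 2]

# Row indices whose zip code is shorter than the required minimum length of 3.
too_short = [i for i, zip_code in enumerate(zip_codes) if len(zip_code) < 3]

# Row indices where the non-nullable bedroom count is missing.
null_bedrooms = [i for i, value in enumerate(num_bedrooms) if value is None]

print(too_short)      # [2]
print(null_bedrooms)  # [3, 4]
```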

See more advanced usage examples in the documentation.
