A declarative, polars-native data frame validation library

dataframely — A declarative, 🐻‍❄️-native data frame validation library

📖 Introduction

Dataframely is a Python package to validate the schema and content of polars data frames. Its purpose is to make data pipelines more robust by ensuring that data meets expectations and more readable by adding schema information to data frame type hints.

💿 Installation

You can install dataframely using your favorite package manager, e.g., pixi or pip:

pixi add dataframely
pip install dataframely

🎯 Usage

Defining a data frame schema

import dataframely as dy
import polars as pl

class HouseSchema(dy.Schema):
    # Column definitions: data type plus per-column constraints
    zip_code = dy.String(nullable=False, min_length=3)
    num_bedrooms = dy.UInt8(nullable=False)
    num_bathrooms = dy.UInt8(nullable=False)
    price = dy.Float64(nullable=False)

    # Row-level rule: the expression is evaluated for every row
    @dy.rule()
    def reasonable_bathroom_to_bedroom_ratio() -> pl.Expr:
        ratio = pl.col("num_bathrooms") / pl.col("num_bedrooms")
        return (ratio >= 1 / 3) & (ratio <= 3)

    # Group-level rule: the expression is evaluated once per zip code
    @dy.rule(group_by=["zip_code"])
    def minimum_zip_code_count() -> pl.Expr:
        return pl.len() >= 2

Validating data against schema

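import polars as pl

# Continues from the schema definition above (uses dy and HouseSchema)
df = pl.DataFrame({
    "zip_code": ["01234", "01234", "1", "213", "123", "213"],
    "num_bedrooms": [2, 2, 1, None, None, 2],
    "num_bathrooms": [1, 2, 1, 1, 0, 8],
    "price": [100_000, 110_000, 50_000, 80_000, 60_000, 160_000]
})

# Validate the data and cast columns to expected types.
# Note: several rows here violate HouseSchema (the zip code "1" is shorter
# than 3 characters, two num_bedrooms values are null), so with this sample
# data the call is expected to raise a validation error rather than return.
validated_df: dy.DataFrame[HouseSchema] = HouseSchema.validate(df, cast=True)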
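
To make the outcome of this example easier to follow, here is a minimal, library-free sketch that re-checks the same six rows in plain Python. It only illustrates what the column constraints and the two rules express; it is not how dataframely evaluates them internally. The names `ratio_ok`, `checks`, `failure_counts`, and `valid_rows` are hypothetical, and treating a null ratio as passing the ratio rule is an assumption of this sketch.

```python
from collections import Counter

# The sample rows from the example (price is left out: none of its values are null).
rows = [
    {"zip_code": "01234", "num_bedrooms": 2, "num_bathrooms": 1},
    {"zip_code": "01234", "num_bedrooms": 2, "num_bathrooms": 2},
    {"zip_code": "1", "num_bedrooms": 1, "num_bathrooms": 1},
    {"zip_code": "213", "num_bedrooms": None, "num_bathrooms": 1},
    {"zip_code": "123", "num_bedrooms": None, "num_bathrooms": 0},
    {"zip_code": "213", "num_bedrooms": 2, "num_bathrooms": 8},
]


def ratio_ok(row):
    """Plain-Python counterpart of reasonable_bathroom_to_bedroom_ratio."""
    beds, baths = row["num_bedrooms"], row["num_bathrooms"]
    if beds is None or baths is None:
        # Assumption: a null ratio does not count as a failure of this rule;
        # the null itself is reported by the nullability check instead.
        return True
    if beds == 0:
        # Sketch-only guard against division by zero (not in the sample data).
        return False
    return 1 / 3 <= baths / beds <= 3


# Group size per zip code, the counterpart of pl.len() with group_by=["zip_code"].
zip_counts = Counter(row["zip_code"] for row in rows)

checks = {
    "zip_code min_length": lambda r: len(r["zip_code"]) >= 3,
    "num_bedrooms not null": lambda r: r["num_bedrooms"] is not None,
    "reasonable_bathroom_to_bedroom_ratio": ratio_ok,
    "minimum_zip_code_count": lambda r: zip_counts[r["zip_code"]] >= 2,
}

# Number of rows failing each check, and the rows that pass every check.
failure_counts = {
    name: sum(not check(row) for row in rows) for name, check in checks.items()
}
valid_rows = [row for row in rows if all(check(row) for check in checks.values())]

print(failure_counts)
print(len(valid_rows))
```

Under these assumptions only the two "01234" rows pass every check, which is why validating the full sample frame is expected to fail.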

See more advanced usage examples in the documentation.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dataframely-1.11.0.tar.gz (306.2 kB)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

dataframely-1.11.0-pp310-pypy310_pp73-macosx_10_12_x86_64.whl (522.9 kB)

Uploaded PyPy, macOS 10.12+ x86-64

dataframely-1.11.0-cp310-abi3-win_amd64.whl (426.8 kB)

Uploaded CPython 3.10+, Windows x86-64

dataframely-1.11.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (559.9 kB)

Uploaded CPython 3.10+, manylinux: glibc 2.17+ x86-64

dataframely-1.11.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (551.2 kB)

Uploaded CPython 3.10+, manylinux: glibc 2.17+ ARM64

dataframely-1.11.0-cp310-abi3-macosx_11_0_arm64.whl (505.6 kB)

Uploaded CPython 3.10+, macOS 11.0+ ARM64

File details

Details for the file dataframely-1.11.0.tar.gz.

File metadata

  • Download URL: dataframely-1.11.0.tar.gz
  • Size: 306.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for dataframely-1.11.0.tar.gz
Algorithm Hash digest
SHA256 0240db7f92e3af09c73f13af92f3687db2d1b91ec4464cdf6fe0ba31a9c03984
MD5 2b9e21fad4a0e7c18f3edc1461956e94
BLAKE2b-256 0085a9d189277ef03b2c4cf18773f82e4f52612b313cc397b174a6b620719c2e
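
A published digest can be compared against a locally downloaded file. The following is a minimal sketch using only Python's standard `hashlib`; the helper names `sha256_of_file` and `matches_expected` and the local file path in the usage comment are hypothetical.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Usage, assuming the sdist was downloaded to the current directory:
# matches_expected(
#     "dataframely-1.11.0.tar.gz",
#     "0240db7f92e3af09c73f13af92f3687db2d1b91ec4464cdf6fe0ba31a9c03984",
# )
```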

See more details on using hashes here.

Provenance

The following attestation bundles were made for dataframely-1.11.0.tar.gz:

Publisher: build.yml on Quantco/dataframely

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file dataframely-1.11.0-pp310-pypy310_pp73-macosx_10_12_x86_64.whl.

File hashes

Hashes for dataframely-1.11.0-pp310-pypy310_pp73-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 03b41806e21816d773284780fa0238cfaa6299dd4eaa0d252e119c0f208aaed3
MD5 c6a05cccf27bbf1362503aacf199aeef
BLAKE2b-256 9811097b4d1f4f1149946596af165460a5eceaf21d7aa46300c2020f04deadb9

Provenance

The following attestation bundles were made for dataframely-1.11.0-pp310-pypy310_pp73-macosx_10_12_x86_64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.11.0-cp310-abi3-win_amd64.whl.

File metadata

  • Download URL: dataframely-1.11.0-cp310-abi3-win_amd64.whl
  • Size: 426.8 kB
  • Tags: CPython 3.10+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for dataframely-1.11.0-cp310-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 8169943f618bafbeb56681a9fc3c731da6686591c490e84d164b84221a797091
MD5 7ea986fa878e43e90f9aa1fa01548660
BLAKE2b-256 460526f77e542705b42b34da0e34aa49d47e143fcaee7c9bcdfcf68ca89fdf32

Provenance

The following attestation bundles were made for dataframely-1.11.0-cp310-abi3-win_amd64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.11.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for dataframely-1.11.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 81b7ff693690add5da54cb78d131f58846f3473b5d947863c708c261db11042f
MD5 5a763faf393b777f02a8f7d4cafcd4ef
BLAKE2b-256 9a508b37f048fe4bef19031e1da821333497049f006481dbda6fdb6ed5114e23

Provenance

The following attestation bundles were made for dataframely-1.11.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.11.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for dataframely-1.11.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 072a00ccc3fdaeada7b6f076d35f33ff19f57b9c1c8fe8d9eecbbf5c45cda87b
MD5 2acb22b0a7b37ca433b45db5803f6379
BLAKE2b-256 c6be872d87c6f4976d21a21c0e5051803db32e3e9a39d867895c77c61776eb84

Provenance

The following attestation bundles were made for dataframely-1.11.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.11.0-cp310-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for dataframely-1.11.0-cp310-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 49fb1506d63513d4a6246a5788abebea86396bdeee1eb55f148b375f59ed0cfb
MD5 c58c1b08d0c3888b0ed929e7da2fa73a
BLAKE2b-256 bffcb7353e1dbdde9e08b6de56eb6cff83b2776c6deca9d8852f86fe6c6ee560

Provenance

The following attestation bundles were made for dataframely-1.11.0-cp310-abi3-macosx_11_0_arm64.whl:

Publisher: build.yml on Quantco/dataframely
