A declarative, polars-native data frame validation library

dataframely — A declarative, 🐻‍❄️-native data frame validation library

📖 Introduction

Dataframely is a Python package for validating the schema and content of polars data frames. It makes data pipelines more robust by ensuring that data meets expectations, and more readable by adding schema information to data frame type hints.

💿 Installation

You can install dataframely using your favorite package manager, e.g., pixi or pip:

pixi add dataframely
pip install dataframely

🎯 Usage

Defining a data frame schema

import dataframely as dy
import polars as pl

class HouseSchema(dy.Schema):
    zip_code = dy.String(nullable=False, min_length=3)
    num_bedrooms = dy.UInt8(nullable=False)
    num_bathrooms = dy.UInt8(nullable=False)
    price = dy.Float64(nullable=False)

    @dy.rule()
    def reasonable_bathroom_to_bedroom_ratio(cls) -> pl.Expr:
        ratio = pl.col("num_bathrooms") / pl.col("num_bedrooms")
        return (ratio >= 1 / 3) & (ratio <= 3)

    @dy.rule(group_by=["zip_code"])
    def minimum_zip_code_count(cls) -> pl.Expr:
        return pl.len() >= 2

Validating data against schema

import polars as pl

df = pl.DataFrame({
    "zip_code": ["01234", "01234", "1", "213", "123", "213"],
    "num_bedrooms": [2, 2, 1, None, None, 2],
    "num_bathrooms": [1, 2, 1, 1, 0, 8],
    "price": [100_000, 110_000, 50_000, 80_000, 60_000, 160_000]
})

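# Validate the data and cast columns to the expected types.
# validate() raises an exception if any row violates the schema,
# which is the case for several rows of this sample frame.
validated_df: dy.DataFrame[HouseSchema] = HouseSchema.validate(df, cast=True)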

See more advanced usage examples in the documentation.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • dataframely-2.8.2.tar.gz (419.9 kB): Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

  • dataframely-2.8.2-cp310-abi3-win_amd64.whl (5.6 MB): CPython 3.10+, Windows x86-64
  • dataframely-2.8.2-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (5.6 MB): CPython 3.10+, manylinux: glibc 2.17+ x86-64
  • dataframely-2.8.2-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (5.3 MB): CPython 3.10+, manylinux: glibc 2.17+ ARM64
  • dataframely-2.8.2-cp310-abi3-macosx_11_0_arm64.whl (5.0 MB): CPython 3.10+, macOS 11.0+ ARM64
  • dataframely-2.8.2-cp310-abi3-macosx_10_12_x86_64.whl (5.4 MB): CPython 3.10+, macOS 10.12+ x86-64
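
The wheel file names above encode the interpreter, ABI, and platform each build targets. As an illustration, the sketch below splits a file name into those parts. The helper `split_wheel_filename` and the `WheelParts` type are hypothetical names introduced here, and the parsing is simplified: it assumes the standard `name-version[-build]-python-abi-platform.whl` layout and does no validation of the individual tags.

```python
from typing import NamedTuple


class WheelParts(NamedTuple):
    name: str
    version: str
    python_tags: list[str]
    abi_tags: list[str]
    platform_tags: list[str]


def split_wheel_filename(filename: str) -> WheelParts:
    """Split a wheel file name into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    # Five components, or six when an optional build tag is present.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected number of components: {filename!r}")
    name, version = parts[0], parts[1]
    python_tag, abi_tag, platform_tag = parts[-3:]
    # Each tag may be a dot-separated set of alternatives.
    return WheelParts(
        name,
        version,
        python_tag.split("."),
        abi_tag.split("."),
        platform_tag.split("."),
    )


parts = split_wheel_filename(
    "dataframely-2.8.2-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"
)
print(parts.python_tags, parts.abi_tags, parts.platform_tags)
```

For anything beyond illustration, the `packaging` library's `packaging.utils.parse_wheel_filename` is the more robust choice.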

File details

Details for the file dataframely-2.8.2.tar.gz.

File metadata

  • Download URL: dataframely-2.8.2.tar.gz
  • Upload date:
  • Size: 419.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for dataframely-2.8.2.tar.gz
Algorithm Hash digest
SHA256 fb3c78a88552ed9e3fc1f1b6db6fd61dde93afce67978eca3d23c24504b35ffc
MD5 7c1bb61f6f34e9c3044861a76a0f6b1f
BLAKE2b-256 cb514ee1367c82139aa96bb465e3fc8327082b29adad82e94156b184eb5d86bf

See more details on using hashes here.
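
A downloaded file can be checked against the published SHA256 digest before installation. The following is a minimal sketch using only the standard library; the helper names `sha256_of` and `matches_sha256` and the local path are assumptions made for illustration, and the expected digest is the one listed above for the source distribution.

```python
import hashlib
from pathlib import Path

# SHA256 digest published above for dataframely-2.8.2.tar.gz.
EXPECTED_SHA256 = "fb3c78a88552ed9e3fc1f1b6db6fd61dde93afce67978eca3d23c24504b35ffc"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with an expected hex digest, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()


# Assumed download location; adjust to wherever the archive was saved.
archive = Path("dataframely-2.8.2.tar.gz")
if archive.exists():
    status = "OK" if matches_sha256(archive, EXPECTED_SHA256) else "MISMATCH"
    print(f"{archive.name}: {status}")
```

Reading in chunks keeps memory use constant, which matters little for a 419.9 kB archive but keeps the helper usable for the larger wheels as well.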

Provenance

The following attestation bundles were made for dataframely-2.8.2.tar.gz:

Publisher: build.yml on Quantco/dataframely

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file dataframely-2.8.2-cp310-abi3-win_amd64.whl.

File metadata

  • Download URL: dataframely-2.8.2-cp310-abi3-win_amd64.whl
  • Upload date:
  • Size: 5.6 MB
  • Tags: CPython 3.10+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for dataframely-2.8.2-cp310-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 6e69e7244e82a147cf58c25b0fa459ee848ddc37d13e746a655db45f8e66eb8f
MD5 f63c80849b9c8cf5e235ba989eac08fd
BLAKE2b-256 d73a10ed5be11a463901f7f0149ce749b103ae0c11efff3fc92bce78548e5f95

Provenance

The following attestation bundles were made for dataframely-2.8.2-cp310-abi3-win_amd64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.8.2-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for dataframely-2.8.2-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 478ee45e5a61f0145d888a5f903f5fb727fd562c6b97779650cd95d74c563579
MD5 b83cc41aadc9eb0cf21aa582033fc4ba
BLAKE2b-256 c912752819e62e2983ebdd0d75b940b12da1aacf819d983349ef9d9995a8bf16

Provenance

The following attestation bundles were made for dataframely-2.8.2-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.8.2-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for dataframely-2.8.2-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 f39bf75d96e094f0d359682bef4c90e26a21a192c6c541fc65c8a2f3244258cf
MD5 c67d1ec643cca7d45602c4704169e239
BLAKE2b-256 c7c53e95a17ac55546ddc79ce66f60298cc01bb3c35e29cfb1c3a33e64044c23

Provenance

The following attestation bundles were made for dataframely-2.8.2-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.8.2-cp310-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for dataframely-2.8.2-cp310-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 02e6e16675d96959b8dacc8ff1eae40a37b3136f60ce674ddbc39caab85ea6ea
MD5 8ffe938228d0cad9906da34c66decb65
BLAKE2b-256 65b70e42caafe2cf1717c4691ead8567dde61687a7f192a496a2f74d165fb96d

Provenance

The following attestation bundles were made for dataframely-2.8.2-cp310-abi3-macosx_11_0_arm64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.8.2-cp310-abi3-macosx_10_12_x86_64.whl.

File hashes

Hashes for dataframely-2.8.2-cp310-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 994b1a5d1f642ac60f7b0efed1b67a995411546aef8dfa5c7bb33e80e5c4b3ac
MD5 6ba41288b6dd87bbced053051856c645
BLAKE2b-256 be3486d71906343811819eef6407948a4ac4b143d11b431adb555c5ed0c9641a

Provenance

The following attestation bundles were made for dataframely-2.8.2-cp310-abi3-macosx_10_12_x86_64.whl:

Publisher: build.yml on Quantco/dataframely
