A declarative, polars-native data frame validation library

dataframely — A declarative, 🐻‍❄️-native data frame validation library

📖 Introduction

Dataframely is a Python package for validating the schema and content of polars data frames. It aims to make data pipelines more robust, by ensuring that data meets expectations, and more readable, by adding schema information to data frame type hints.

💿 Installation

You can install dataframely using your favorite package manager, e.g., pixi or pip:

pixi add dataframely
pip install dataframely
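
To confirm that the installation worked, you can query the installed version from Python. This is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper name, not part of dataframely.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string if dataframely is installed, otherwise None.
print(installed_version("dataframely"))
```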

🎯 Usage

Defining a data frame schema

import dataframely as dy
import polars as pl


class HouseSchema(dy.Schema):
    # Column definitions: data type plus per-column constraints
    zip_code = dy.String(nullable=False, min_length=3)
    num_bedrooms = dy.UInt8(nullable=False)
    num_bathrooms = dy.UInt8(nullable=False)
    price = dy.Float64(nullable=False)

    # Row-level rule: the expression is evaluated for every row
    @dy.rule()
    def reasonable_bathroom_to_bedroom_ratio(cls) -> pl.Expr:
        ratio = pl.col("num_bathrooms") / pl.col("num_bedrooms")
        return (ratio >= 1 / 3) & (ratio <= 3)

    # Group-level rule: the expression is evaluated once per zip code
    @dy.rule(group_by=["zip_code"])
    def minimum_zip_code_count(cls) -> pl.Expr:
        return pl.len() >= 2
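
To make the two kinds of rule concrete, here is a plain-Python sketch of the checks they describe. It is an illustration only, not how dataframely evaluates rules (the library works with polars expressions), and the helper names `ratio_ok` and `group_size_ok` are hypothetical.

```python
from collections import Counter


def ratio_ok(num_bathrooms: int, num_bedrooms: int) -> bool:
    """Row-level check: bathrooms / bedrooms must lie in [1/3, 3]."""
    ratio = num_bathrooms / num_bedrooms
    return 1 / 3 <= ratio <= 3


def group_size_ok(zip_codes: list, minimum: int = 2) -> list:
    """Group-level check: one verdict per zip code, repeated for each of its rows."""
    counts = Counter(zip_codes)
    return [counts[zip_code] >= minimum for zip_code in zip_codes]


print(ratio_ok(1, 2))  # ratio is 0.5
print(ratio_ok(8, 2))  # ratio is 4.0
print(group_size_ok(["a", "a", "b"]))
```

The difference is the unit being judged: the first check looks at one row at a time, while the second looks at all rows that share a zip code and gives them a common result.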

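Validating data against a schema

import dataframely as dy
import polars as pl

# Sample data; several rows deliberately violate HouseSchema
# (a too-short zip code, null bedroom counts, an implausible ratio).
df = pl.DataFrame({
    "zip_code": ["01234", "01234", "1", "213", "123", "213"],
    "num_bedrooms": [2, 2, 1, None, None, 2],
    "num_bathrooms": [1, 2, 1, 1, 0, 8],
    "price": [100_000, 110_000, 50_000, 80_000, 60_000, 160_000]
})

# Validate the data and cast columns to expected types.
# With the invalid rows above, this call is expected to raise a validation error.
validated_df: dy.DataFrame[HouseSchema] = HouseSchema.validate(df, cast=True)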
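
The sample data contains rows that break the schema, so it helps to see which rows fail which check. The following plain-Python sketch re-implements the checks from `HouseSchema` by hand and applies them to the same six rows. It does not use dataframely, the `violations` helper and the message strings are hypothetical, and two behaviors are assumptions of this sketch: a null bedroom count is reported only by the nullability check (the ratio check is skipped for that row), and a bedroom count of zero is treated as a ratio violation.

```python
from collections import Counter

rows = [
    # (zip_code, num_bedrooms, num_bathrooms, price)
    ("01234", 2, 1, 100_000),
    ("01234", 2, 2, 110_000),
    ("1", 1, 1, 50_000),
    ("213", None, 1, 80_000),
    ("123", None, 0, 60_000),
    ("213", 2, 8, 160_000),
]

# Group sizes are counted over all rows, valid or not.
zip_counts = Counter(zip_code for zip_code, *_ in rows)


def violations(row):
    """Return the list of checks that the given row fails."""
    zip_code, bedrooms, bathrooms, _price = row
    found = []
    if len(zip_code) < 3:
        found.append("zip_code shorter than 3 characters")
    if bedrooms is None:
        # Assumption: the ratio check is skipped when the input is null.
        found.append("num_bedrooms is null")
    elif bedrooms == 0 or not (1 / 3 <= bathrooms / bedrooms <= 3):
        found.append("bathroom-to-bedroom ratio outside [1/3, 3]")
    if zip_counts[zip_code] < 2:
        found.append("zip code appears fewer than 2 times")
    return found


report = {index: violations(row) for index, row in enumerate(rows)}
valid_rows = [index for index, found in report.items() if not found]

for index, found in report.items():
    print(index, found or "ok")
print("valid rows:", valid_rows)
```

Under these checks only the first two rows pass, which is why strict validation of the full data frame cannot succeed without first removing or fixing the other four.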

See more advanced usage examples in the documentation.
