dataframely — A declarative, 🐻‍❄️-native data frame validation library

📖 Introduction

Dataframely is a Python package to validate the schema and content of polars data frames. Its purpose is to make data pipelines more robust by ensuring that data meets expectations and more readable by adding schema information to data frame type hints.

💿 Installation

You can install dataframely using your favorite package manager, e.g., pixi or pip:

pixi add dataframely
pip install dataframely
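After installing, it can be useful to confirm which version ended up in the active environment. The following is a minimal sketch using only the standard library; the helper name `installed_version` is made up for this illustration and is not part of dataframely.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed.

    `installed_version` is a hypothetical helper for this example only.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string if dataframely is installed, otherwise None.
print(installed_version("dataframely"))
```

Returning `None` instead of raising keeps the check usable in setup scripts that should continue when the package is missing.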

🎯 Usage

Defining a data frame schema

import dataframely as dy
import polars as pl

class HouseSchema(dy.Schema):
    zip_code = dy.String(nullable=False, min_length=3)
    num_bedrooms = dy.UInt8(nullable=False)
    num_bathrooms = dy.UInt8(nullable=False)
    price = dy.Float64(nullable=False)

    @dy.rule()
    def reasonable_bathroom_to_bedroom_ratio() -> pl.Expr:
        ratio = pl.col("num_bathrooms") / pl.col("num_bedrooms")
        return (ratio >= 1 / 3) & (ratio <= 3)

    @dy.rule(group_by=["zip_code"])
    def minimum_zip_code_count() -> pl.Expr:
        return pl.len() >= 2

Validating data against a schema

import dataframely as dy
import polars as pl

df = pl.DataFrame({
    "zip_code": ["01234", "01234", "1", "213", "123", "213"],
    "num_bedrooms": [2, 2, 1, None, None, 2],
    "num_bathrooms": [1, 2, 1, 1, 0, 8],
    "price": [100_000, 110_000, 50_000, 80_000, 60_000, 160_000]
})

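# Validate the data and cast columns to the expected types.
# Note: this sample data contains rows that violate HouseSchema (e.g. the zip code "1"
# and the null bedroom counts), so this call should raise instead of returning a frame.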
validated_df: dy.DataFrame[HouseSchema] = HouseSchema.validate(df, cast=True)
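To make concrete which rows of the sample data meet the expectations encoded in HouseSchema, the checks can be mirrored in plain Python. This is only an illustrative sketch and not how dataframely works internally: the function `failed_checks` and the failure labels are made up for this example, the ratio check is simply skipped when `num_bedrooms` is null, and zero bedrooms is treated as a failure. Both of those choices are assumptions and may differ from the library's behavior.

```python
from collections import Counter

# Same sample rows as in the data frame above, written as plain dictionaries.
rows = [
    {"zip_code": "01234", "num_bedrooms": 2, "num_bathrooms": 1, "price": 100_000},
    {"zip_code": "01234", "num_bedrooms": 2, "num_bathrooms": 2, "price": 110_000},
    {"zip_code": "1", "num_bedrooms": 1, "num_bathrooms": 1, "price": 50_000},
    {"zip_code": "213", "num_bedrooms": None, "num_bathrooms": 1, "price": 80_000},
    {"zip_code": "123", "num_bedrooms": None, "num_bathrooms": 0, "price": 60_000},
    {"zip_code": "213", "num_bedrooms": 2, "num_bathrooms": 8, "price": 160_000},
]


def failed_checks(row, zip_counts):
    """Return hypothetical labels for every check that the row fails."""
    failures = []
    # Column constraint: zip_code needs at least 3 characters.
    if len(row["zip_code"]) < 3:
        failures.append("zip_code shorter than 3 characters")
    # Column constraint: num_bedrooms must not be null.
    bedrooms = row["num_bedrooms"]
    if bedrooms is None:
        failures.append("num_bedrooms is null")
    # Row-level rule: ratio within [1/3, 3] (assumption: zero bedrooms fails).
    elif bedrooms == 0 or not (1 / 3 <= row["num_bathrooms"] / bedrooms <= 3):
        failures.append("bathroom-to-bedroom ratio outside [1/3, 3]")
    # Group-level rule: each zip_code must occur at least twice in the input.
    if zip_counts[row["zip_code"]] < 2:
        failures.append("fewer than 2 rows share this zip_code")
    return failures


zip_counts = Counter(row["zip_code"] for row in rows)
report = {i: failed_checks(row, zip_counts) for i, row in enumerate(rows)}
valid_indices = [i for i, failures in report.items() if not failures]

for i, failures in report.items():
    print(i, failures or "ok")
```

Under these assumptions only the first two rows pass every check, which is why validating the full sample frame is expected to fail.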

See more advanced usage examples in the documentation.

Download files

Source distribution

  • dataframely-1.14.0.tar.gz (305.9 kB)

Built distributions

  • dataframely-1.14.0-pp310-pypy310_pp73-macosx_10_12_x86_64.whl (525.8 kB): PyPy, macOS 10.12+ x86-64
  • dataframely-1.14.0-cp310-abi3-win_amd64.whl (429.8 kB): CPython 3.10+, Windows x86-64
  • dataframely-1.14.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (553.0 kB): CPython 3.10+, manylinux (glibc 2.17+) x86-64
  • dataframely-1.14.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (554.1 kB): CPython 3.10+, manylinux (glibc 2.17+) ARM64
  • dataframely-1.14.0-cp310-abi3-macosx_11_0_arm64.whl (508.4 kB): CPython 3.10+, macOS 11.0+ ARM64

File details

Details for the file dataframely-1.14.0.tar.gz.

File metadata

  • Download URL: dataframely-1.14.0.tar.gz
  • Size: 305.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for dataframely-1.14.0.tar.gz
Algorithm Hash digest
SHA256 3135c4475a3906a4d62d845d20da7afafef0bdec268c322d1f54473bb852e087
MD5 23be9c36b38c8bb76c0c584251673fd3
BLAKE2b-256 f12e90ab7650f822e43838f31b79201e104991593678f6ca89eae36c2fb2c34f

See more details on using hashes here.
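A downloaded file can be checked against the published SHA256 digest with the standard library. The sketch below assumes the source distribution has been downloaded locally; the helper names `sha256_of` and `matches_expected` are made up for this example.

```python
import hashlib

# SHA256 digest listed above for dataframely-1.14.0.tar.gz.
EXPECTED_SHA256 = "3135c4475a3906a4d62d845d20da7afafef0bdec268c322d1f54473bb852e087"


def sha256_of(path, chunk_size=1 << 20):
    """Compute the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected digest."""
    return sha256_of(path) == expected.lower()


# Example usage (the local path is an assumption about where the file was saved):
# print(matches_expected("dataframely-1.14.0.tar.gz"))
```

Reading in chunks keeps memory use constant regardless of file size.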

Provenance

The following attestation bundles were made for dataframely-1.14.0.tar.gz:

Publisher: build.yml on Quantco/dataframely

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file dataframely-1.14.0-pp310-pypy310_pp73-macosx_10_12_x86_64.whl.

File hashes

Hashes for dataframely-1.14.0-pp310-pypy310_pp73-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 6d95f430b2b15eaab64b0e4b746f5cb2b4caf82334cd70015a9a674b1a5510fa
MD5 37346cbf93f108f4f6a781a6a0b172bf
BLAKE2b-256 cd652c4696a9d96a3b008156a99a93b8d80ff92eae8dd463959934ba191f41d4

Provenance

The following attestation bundles were made for dataframely-1.14.0-pp310-pypy310_pp73-macosx_10_12_x86_64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.14.0-cp310-abi3-win_amd64.whl.

File metadata

  • Download URL: dataframely-1.14.0-cp310-abi3-win_amd64.whl
  • Size: 429.8 kB
  • Tags: CPython 3.10+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for dataframely-1.14.0-cp310-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 de4de51c954b9efcfd33e5d4725b12e750c394ab5c9577c6d20be6b01a8fbd76
MD5 0dbddadefb9fa90d140d85abfc884ae4
BLAKE2b-256 3164061f4220892fe18b4b53840e85869eae015306c4cc8d4077a276a564d326

Provenance

The following attestation bundles were made for dataframely-1.14.0-cp310-abi3-win_amd64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.14.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for dataframely-1.14.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 95a6b53f40e74764811c93efa9af98a66b05fd85e5a12d609acb5ae0715e8dee
MD5 6f582cdd417c5c44cf76a2b0db4ad2e1
BLAKE2b-256 d999a0304789af90d9c9c84a221df8e6dd12759c11832ba20ad9445436d740f9

Provenance

The following attestation bundles were made for dataframely-1.14.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.14.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for dataframely-1.14.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 38a59df983e26531ba6303101570c3589f9f28158705a6098df270c390aa10a9
MD5 56a2789781d8c5310edc1cb83a3a44be
BLAKE2b-256 4c37de92bbbe5e94f6db9f96ed326e0d904f7fb3cd938ddbe6c1da08c4e49230

Provenance

The following attestation bundles were made for dataframely-1.14.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.14.0-cp310-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for dataframely-1.14.0-cp310-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 d3d16cbfdcd05d497e904794d082aad9d83bd443573cfe20621104195e62b13b
MD5 77dba47e175382768fcf8ff168fe53c0
BLAKE2b-256 34c5a703f9b2b4e398b8f831308159824543675ec2cdefb5795e36a91f9439d6

Provenance

The following attestation bundles were made for dataframely-1.14.0-cp310-abi3-macosx_11_0_arm64.whl:

Publisher: build.yml on Quantco/dataframely
