A declarative, polars-native data frame validation library


dataframely — A declarative, 🐻‍❄️-native data frame validation library

📖 Introduction

Dataframely is a Python package to validate the schema and content of polars data frames. Its purpose is to make data pipelines more robust by ensuring that data meets expectations and more readable by adding schema information to data frame type hints.

💿 Installation

You can install dataframely using your favorite package manager, e.g., pixi or pip:

pixi add dataframely
pip install dataframely

🎯 Usage

Defining a data frame schema

import dataframely as dy
import polars as pl

class HouseSchema(dy.Schema):
    zip_code = dy.String(nullable=False, min_length=3)
    num_bedrooms = dy.UInt8(nullable=False)
    num_bathrooms = dy.UInt8(nullable=False)
    price = dy.Float64(nullable=False)

    @dy.rule()
    def reasonable_bathroom_to_bedroom_ratio(cls) -> pl.Expr:
        ratio = pl.col("num_bathrooms") / pl.col("num_bedrooms")
        return (ratio >= 1 / 3) & (ratio <= 3)

    @dy.rule(group_by=["zip_code"])
    def minimum_zip_code_count(cls) -> pl.Expr:
        return pl.len() >= 2

Validating data against schema

import polars as pl

df = pl.DataFrame({
    "zip_code": ["01234", "01234", "1", "213", "123", "213"],
    "num_bedrooms": [2, 2, 1, None, None, 2],
    "num_bathrooms": [1, 2, 1, 1, 0, 8],
    "price": [100_000, 110_000, 50_000, 80_000, 60_000, 160_000]
})

# Validate the data and cast columns to expected types.
# Validation raises an error if any row violates the schema, as several rows above do.
validated_df: dy.DataFrame[HouseSchema] = HouseSchema.validate(df, cast=True)

See more advanced usage examples in the documentation.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dataframely-2.10.1.tar.gz (454.9 kB)

Uploaded: Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

dataframely-2.10.1-cp310-abi3-win_amd64.whl (8.6 MB)

Uploaded: CPython 3.10+, Windows x86-64

dataframely-2.10.1-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (8.6 MB)

Uploaded: CPython 3.10+, manylinux: glibc 2.17+ x86-64

dataframely-2.10.1-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (8.0 MB)

Uploaded: CPython 3.10+, manylinux: glibc 2.17+ ARM64

dataframely-2.10.1-cp310-abi3-macosx_11_0_arm64.whl (7.6 MB)

Uploaded: CPython 3.10+, macOS 11.0+ ARM64

dataframely-2.10.1-cp310-abi3-macosx_10_12_x86_64.whl (8.2 MB)

Uploaded: CPython 3.10+, macOS 10.12+ x86-64

File details

Details for the file dataframely-2.10.1.tar.gz.

File metadata

  • Download URL: dataframely-2.10.1.tar.gz
  • Upload date:
  • Size: 454.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for dataframely-2.10.1.tar.gz
Algorithm Hash digest
SHA256 983c73d21654b8ce8f1ecfdbd4e5f5f35b3a44c0823a562c46558eb915261cce
MD5 e6e892c0b66cdbf77b825a5416a237c3
BLAKE2b-256 449a6e49739c5fd066fa95cfa0252a55c919577dd38b4ae41d5a7498629cdab5

See more details on using hashes here.

Provenance

The following attestation bundles were made for dataframely-2.10.1.tar.gz:

Publisher: build.yml on Quantco/dataframely

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file dataframely-2.10.1-cp310-abi3-win_amd64.whl.

File metadata

  • Download URL: dataframely-2.10.1-cp310-abi3-win_amd64.whl
  • Upload date:
  • Size: 8.6 MB
  • Tags: CPython 3.10+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for dataframely-2.10.1-cp310-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 ab7f5ce91df543c9d7cd541e551fc78f0f0068c09c8c74cc707e5c35da992224
MD5 a29d183692c69a3b413162561d089cdd
BLAKE2b-256 dc37e552f7357f0ed3ddff98a4419591c7155a786a7b6a4a5ceebc905443ccf4

Provenance

The following attestation bundles were made for dataframely-2.10.1-cp310-abi3-win_amd64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.10.1-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for dataframely-2.10.1-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 4776d8738ef439a9fda3184338f61a608431c5a0b2fcbd2ac74cc2ecff1fdfe9
MD5 835172d335589c8620da326999388268
BLAKE2b-256 58ebd7f97b91aa2b27e28693fafbed70d54af1683b7341c6d1a546399bc1e8cb

Provenance

The following attestation bundles were made for dataframely-2.10.1-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.10.1-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for dataframely-2.10.1-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 436619c825c65ae56ca6702b1b544e6bdeae89e9d9876846c1bae5c5e56d619b
MD5 593fd0f614973492ab3f86cbaa9d3a3b
BLAKE2b-256 3569b0634d7e839809505033bfadf17fa3d11384c8b6f12c085409d1d518f505

Provenance

The following attestation bundles were made for dataframely-2.10.1-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.10.1-cp310-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for dataframely-2.10.1-cp310-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 3240dcdcf00b9f69da23dd517de5967bcca9e2be1ea34da4e2c3a0a5688381cc
MD5 533b1dfa9115c6aee47b3041c8347ac8
BLAKE2b-256 4a3d164066237b600c886e35d62cd4f373e6dc25703b707b1416454bce3c4abf

Provenance

The following attestation bundles were made for dataframely-2.10.1-cp310-abi3-macosx_11_0_arm64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.10.1-cp310-abi3-macosx_10_12_x86_64.whl.

File hashes

Hashes for dataframely-2.10.1-cp310-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 20fbff132de54934fce4444c55c3b86a5562103011d1622c06b8d6f9c7c67750
MD5 941d2bb7cc35b710596cc4aa73fb7b0d
BLAKE2b-256 a80e7e874a446e15aa755653ecf3546c99fabfbb8e83e6be825bb2736be08617

Provenance

The following attestation bundles were made for dataframely-2.10.1-cp310-abi3-macosx_10_12_x86_64.whl:

Publisher: build.yml on Quantco/dataframely
