A declarative, polars-native data frame validation library

Project description


dataframely — A declarative, 🐻‍❄️-native data frame validation library

📖 Introduction

Dataframely is a Python package for validating the schema and content of polars data frames. It makes data pipelines more robust by ensuring that data meets expectations, and more readable by adding schema information to data frame type hints.

💿 Installation

You can install dataframely using your favorite package manager, e.g., pixi or pip:

pixi add dataframely
pip install dataframely

🎯 Usage

Defining a data frame schema

import dataframely as dy
import polars as pl

class HouseSchema(dy.Schema):
    zip_code = dy.String(nullable=False, min_length=3)
    num_bedrooms = dy.UInt8(nullable=False)
    num_bathrooms = dy.UInt8(nullable=False)
    price = dy.Float64(nullable=False)

    # Row-level rule: the expression must hold for each row.
    @dy.rule()
    def reasonable_bathroom_to_bedroom_ratio(cls) -> pl.Expr:
        ratio = pl.col("num_bathrooms") / pl.col("num_bedrooms")
        return (ratio >= 1 / 3) & (ratio <= 3)

    # Group-level rule: each zip code must appear in at least two rows.
    @dy.rule(group_by=["zip_code"])
    def minimum_zip_code_count(cls) -> pl.Expr:
        return pl.len() >= 2

Validating data against the schema

import polars as pl

df = pl.DataFrame({
    "zip_code": ["01234", "01234", "1", "213", "123", "213"],
    "num_bedrooms": [2, 2, 1, None, None, 2],
    "num_bathrooms": [1, 2, 1, 1, 0, 8],
    "price": [100_000, 110_000, 50_000, 80_000, 60_000, 160_000]
})

# Validate the data and cast columns to expected types
validated_df: dy.DataFrame[HouseSchema] = HouseSchema.validate(df, cast=True)
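
To make the checks concrete, the following plain-Python sketch re-implements what HouseSchema declares and applies it to the sample rows above. It uses neither dataframely nor polars and is not how the library evaluates rules; the helper name `failed_checks` and the check labels are made up for illustration. Rows with a missing `num_bedrooms` are reported under the nullability check and skipped by the ratio check, which is a simplification of this sketch.

```python
from collections import Counter

# Sample data from the example above, one dict per row. The price column is
# omitted because no check in this sketch looks at it.
rows = [
    {"zip_code": "01234", "num_bedrooms": 2, "num_bathrooms": 1},
    {"zip_code": "01234", "num_bedrooms": 2, "num_bathrooms": 2},
    {"zip_code": "1", "num_bedrooms": 1, "num_bathrooms": 1},
    {"zip_code": "213", "num_bedrooms": None, "num_bathrooms": 1},
    {"zip_code": "123", "num_bedrooms": None, "num_bathrooms": 0},
    {"zip_code": "213", "num_bedrooms": 2, "num_bathrooms": 8},
]

# Group-level information: how often each zip code occurs in the input.
zip_counts = Counter(row["zip_code"] for row in rows)


def failed_checks(row: dict) -> list[str]:
    """Return the labels of all checks that the given row violates."""
    failures = []
    if len(row["zip_code"]) < 3:
        failures.append("zip_code: min_length")
    if row["num_bedrooms"] is None:
        failures.append("num_bedrooms: nullability")
    else:
        # The sample data has no zero bedroom counts, so division by zero
        # is not handled here.
        ratio = row["num_bathrooms"] / row["num_bedrooms"]
        if not (1 / 3 <= ratio <= 3):
            failures.append("reasonable_bathroom_to_bedroom_ratio")
    if zip_counts[row["zip_code"]] < 2:
        failures.append("minimum_zip_code_count")
    return failures


report = {index: failed_checks(row) for index, row in enumerate(rows)}
valid_rows = [index for index, failures in report.items() if not failures]

for index, failures in report.items():
    print(index, failures or "ok")
```

In this sketch only rows 0 and 1 (both with zip code 01234) pass every check; each of the other four rows violates at least one column constraint or rule.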

See more advanced usage examples in the documentation.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • dataframely-2.3.0.tar.gz (378.7 kB), Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

  • dataframely-2.3.0-cp310-abi3-win_amd64.whl (5.3 MB), CPython 3.10+, Windows x86-64
  • dataframely-2.3.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (5.3 MB), CPython 3.10+, manylinux: glibc 2.17+ x86-64
  • dataframely-2.3.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (5.0 MB), CPython 3.10+, manylinux: glibc 2.17+ ARM64
  • dataframely-2.3.0-cp310-abi3-macosx_11_0_arm64.whl (4.8 MB), CPython 3.10+, macOS 11.0+ ARM64
  • dataframely-2.3.0-cp310-abi3-macosx_10_12_x86_64.whl (5.2 MB), CPython 3.10+, macOS 10.12+ x86-64
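
The wheel file names listed here follow the pattern `{distribution}-{version}-{python tag}-{abi tag}-{platform tag}.whl` from the wheel binary distribution format, where an optional build tag may follow the version. As a rough illustration, the sketch below splits one of the listed names into those parts using only string operations. `parse_wheel_name` and `WheelName` are hypothetical names, not part of pip or dataframely, and real tooling should prefer a maintained parser.

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tags: list[str]


def parse_wheel_name(filename: str) -> WheelName:
    """Split a wheel file name into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        distribution, version, python_tag, abi_tag, platform = parts
        build = None
    elif len(parts) == 6:
        distribution, version, build, python_tag, abi_tag, platform = parts
    else:
        raise ValueError(f"unexpected number of components: {filename!r}")
    # Compressed tag sets join several platform tags with dots.
    return WheelName(
        distribution, version, build, python_tag, abi_tag, platform.split(".")
    )


wheel = parse_wheel_name(
    "dataframely-2.3.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"
)
print(wheel)
```

The `cp310` Python tag combined with the `abi3` ABI tag marks a wheel built against CPython's stable ABI, which is why each wheel is described as CPython 3.10+ rather than being tied to a single Python version.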

File details

Details for the file dataframely-2.3.0.tar.gz.

File metadata

  • Download URL: dataframely-2.3.0.tar.gz
  • Upload date:
  • Size: 378.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for dataframely-2.3.0.tar.gz
Algorithm Hash digest
SHA256 9d077439e9a0a01937ac2560363cac00ad066f8feea33dee3a085175ee3101a8
MD5 f0f14d46109e0a94b8a93dd7fffb2baa
BLAKE2b-256 05162f7a85c93453989412ca2e4492290d0c2fdefb0e8bc421cab75b93afb57b

See more details on using hashes here.
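
A published digest can be compared against a downloaded file with Python's standard `hashlib` module. The sketch below shows one way to do that; `sha256_of_file` and `matches_digest` are illustrative names, and the demonstration hashes a small temporary file rather than the real archive so that it runs without a download.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path: str, expected_hex: str) -> bool:
    """Check a file against an expected SHA256 digest (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Demonstration on a throwaway file. For a real check, pass the path of the
# downloaded dataframely-2.3.0.tar.gz and the SHA256 value listed above.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example payload")
    demo_path = tmp.name

try:
    expected = hashlib.sha256(b"example payload").hexdigest()
    demo_ok = matches_digest(demo_path, expected)
    demo_bad = matches_digest(demo_path, "0" * 64)
finally:
    os.remove(demo_path)

print(demo_ok, demo_bad)
```

Reading in chunks keeps memory use flat for larger files such as the multi-megabyte wheels.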

Provenance

The following attestation bundles were made for dataframely-2.3.0.tar.gz:

Publisher: build.yml on Quantco/dataframely

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file dataframely-2.3.0-cp310-abi3-win_amd64.whl.

File metadata

  • Download URL: dataframely-2.3.0-cp310-abi3-win_amd64.whl
  • Upload date:
  • Size: 5.3 MB
  • Tags: CPython 3.10+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for dataframely-2.3.0-cp310-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 acf7b4e3c3c13a90f14a331a30ac42a517ef14a65c26690c8e0d0c88557f0ce0
MD5 296bc17f5eb3ccbab4e46dfe64667c3d
BLAKE2b-256 96bd02b4271dfaee79cee299522e2404be046bcca04c36ef0aa64626e4b8f5c9

Provenance

The following attestation bundles were made for dataframely-2.3.0-cp310-abi3-win_amd64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.3.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for dataframely-2.3.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 226f98553eea90f0fb257575ca5ce7ee6124411b52cdc6548009eeb073032b3c
MD5 fcc7adb8e2bef3917eafbf2ed6e93cf8
BLAKE2b-256 4add5a20d0e90e9024abb76ae868179d2dfcd03b1b554af28be48b73cb10b8ed

Provenance

The following attestation bundles were made for dataframely-2.3.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.3.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for dataframely-2.3.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 0cbcc41a275906caa4601e9e29d265649c2f569c90d0b318b54c06ab4d85a114
MD5 0fe89d6bc64a97ffa32d0eb7b57847f5
BLAKE2b-256 b16675b9cdd39a7abef985db42271bc37bb3fbf90ec9ed5541c81aa0cebcddb2

Provenance

The following attestation bundles were made for dataframely-2.3.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.3.0-cp310-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for dataframely-2.3.0-cp310-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 dd66756a46d2e8e3194e9f3c2f0de35d9afba31b78a3c197d0687566c255555a
MD5 167afd10a5f5170a0f3bc96bc19a4635
BLAKE2b-256 bd3a0577fdeb8f87d4cb98e5754748b000b513b4fc1fb333115e41878e5da9fd

Provenance

The following attestation bundles were made for dataframely-2.3.0-cp310-abi3-macosx_11_0_arm64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.3.0-cp310-abi3-macosx_10_12_x86_64.whl.

File hashes

Hashes for dataframely-2.3.0-cp310-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 a4624f670b1f05e1aeec5deec624d1acfe23d24a23246d554c6ef148d8bde75e
MD5 88e553df675eeca6761e4816699c0d17
BLAKE2b-256 b433019fe8958eb1e42a1c4c1d02f055e364bcc7ea7b5451bcad57f0df5b6c4b

Provenance

The following attestation bundles were made for dataframely-2.3.0-cp310-abi3-macosx_10_12_x86_64.whl:

Publisher: build.yml on Quantco/dataframely
