A declarative, polars-native data frame validation library

dataframely — A declarative, 🐻‍❄️-native data frame validation library

📖 Introduction

Dataframely is a Python package for validating the schema and content of polars data frames. It makes data pipelines more robust by ensuring that data meets expectations, and more readable by adding schema information to data frame type hints.

💿 Installation

You can install dataframely using your favorite package manager, e.g., pixi or pip:

pixi add dataframely
pip install dataframely

🎯 Usage

Defining a data frame schema

import dataframely as dy
import polars as pl

class HouseSchema(dy.Schema):
    zip_code = dy.String(nullable=False, min_length=3)
    num_bedrooms = dy.UInt8(nullable=False)
    num_bathrooms = dy.UInt8(nullable=False)
    price = dy.Float64(nullable=False)

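    # Row-level rule: the bathroom-to-bedroom ratio of each row must lie between 1/3 and 3.
    @dy.rule()
    def reasonable_bathroom_to_bedroom_ratio(cls) -> pl.Expr:
        ratio = pl.col("num_bathrooms") / pl.col("num_bedrooms")
        return (ratio >= 1 / 3) & (ratio <= 3)

    # Group-level rule: each zip code must appear in at least two rows.
    @dy.rule(group_by=["zip_code"])
    def minimum_zip_code_count(cls) -> pl.Expr:
        return pl.len() >= 2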
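
To make the meaning of the two rules concrete, here is a minimal plain-Python sketch that evaluates the same conditions without dataframely or polars. The sample rows and the helper name `ratio_ok` are hypothetical, and the sketch assumes non-null values and a positive number of bedrooms; it illustrates the rule logic only, not how the library evaluates rules internally.

```python
from collections import Counter

# Hypothetical sample rows: (zip_code, num_bedrooms, num_bathrooms).
# All values are non-null and num_bedrooms is positive.
rows = [
    ("01234", 2, 1),
    ("01234", 2, 2),
    ("213", 2, 8),
]


def ratio_ok(bedrooms: int, bathrooms: int) -> bool:
    """Row-level condition: bathrooms / bedrooms must lie in [1/3, 3]."""
    ratio = bathrooms / bedrooms
    return 1 / 3 <= ratio <= 3


# Group-level condition: count rows per zip code, then require at least two.
zip_counts = Counter(zip_code for zip_code, _, _ in rows)

# One (zip_code, ratio rule passed, group rule passed) tuple per row.
results = [
    (zip_code, ratio_ok(bedrooms, bathrooms), zip_counts[zip_code] >= 2)
    for zip_code, bedrooms, bathrooms in rows
]

for zip_code, ratio_passed, group_passed in results:
    print(zip_code, ratio_passed, group_passed)
```

In this sketch the two "01234" rows pass both conditions, while the "213" row fails both: its ratio is 8 / 2 = 4, and its zip code occurs only once.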

Validating data against schema

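import polars as pl

# Sample data. Note that some rows do not satisfy HouseSchema: the zip code "1"
# is shorter than 3 characters, and num_bedrooms contains nulls although the
# column is declared with nullable=False.
df = pl.DataFrame({
    "zip_code": ["01234", "01234", "1", "213", "123", "213"],
    "num_bedrooms": [2, 2, 1, None, None, 2],
    "num_bathrooms": [1, 2, 1, 1, 0, 8],
    "price": [100_000, 110_000, 50_000, 80_000, 60_000, 160_000],
})

# Validate the data and cast columns to expected types.
# Validation is expected to fail for input that violates the schema, as this sample does.
validated_df: dy.DataFrame[HouseSchema] = HouseSchema.validate(df, cast=True)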

See more advanced usage examples in the documentation.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • dataframely-2.9.0.tar.gz (422.8 kB), Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.
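
As a rough illustration of the wheel file name format, the following sketch splits a wheel name into its components, assuming the standard layout `{distribution}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl`. The `WheelName` type and `parse_wheel_name` function are hypothetical helpers written for this example; for real tooling, the `packaging` library's `parse_wheel_filename` is likely the better choice.

```python
from typing import NamedTuple


class WheelName(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_name(filename: str) -> WheelName:
    """Split a wheel file name into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 6:
        # An optional build tag sits between the version and the Python tag.
        del parts[2]
    if len(parts) != 5:
        raise ValueError(f"unexpected wheel file name: {filename!r}")
    return WheelName(*parts)


wheel = parse_wheel_name(
    "dataframely-2.9.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"
)
# A platform tag may bundle several platforms, separated by dots.
platforms = wheel.platform_tag.split(".")
print(wheel.python_tag, wheel.abi_tag, platforms)
```

Here `cp310` together with `abi3` indicates a wheel built against the stable ABI starting at CPython 3.10, which matches the "CPython 3.10+" label shown for the built distributions.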

  • dataframely-2.9.0-cp310-abi3-win_amd64.whl (5.6 MB), CPython 3.10+, Windows x86-64
  • dataframely-2.9.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (5.6 MB), CPython 3.10+, manylinux: glibc 2.17+ x86-64
  • dataframely-2.9.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (5.3 MB), CPython 3.10+, manylinux: glibc 2.17+ ARM64
  • dataframely-2.9.0-cp310-abi3-macosx_11_0_arm64.whl (5.0 MB), CPython 3.10+, macOS 11.0+ ARM64
  • dataframely-2.9.0-cp310-abi3-macosx_10_12_x86_64.whl (5.4 MB), CPython 3.10+, macOS 10.12+ x86-64

File details

Details for the file dataframely-2.9.0.tar.gz.

File metadata

  • Download URL: dataframely-2.9.0.tar.gz
  • Upload date:
  • Size: 422.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for dataframely-2.9.0.tar.gz
Algorithm Hash digest
SHA256 5a90c61cf3b76c76c2902ce1c6a9a277ea43e735b0c036bbd6e66488f6f2d6f3
MD5 7518e18f8aaaa9a69d35ddf9c5bb6ec1
BLAKE2b-256 d03aadb30c3902fa5d3bb08721718ae3307bbf974f0d0c9fcf8103893389752e

See more details on using hashes here.
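
One way to use these digests is to check a downloaded file against the listed SHA256 value. The sketch below uses only the standard library; the helper names `sha256_of` and `matches_expected` are hypothetical, and the file path in the usage note is an assumption about where the archive was saved.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for dataframely-2.9.0.tar.gz.
EXPECTED_SHA256 = "5a90c61cf3b76c76c2902ce1c6a9a277ea43e735b0c036bbd6e66488f6f2d6f3"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: Path, expected: str = EXPECTED_SHA256) -> bool:
    """Compare the file's digest with the expected hex digest."""
    return sha256_of(path) == expected.lower()
```

For example, `matches_expected(Path("dataframely-2.9.0.tar.gz"))` should return `True` only if the file in the current directory is byte-for-byte identical to the published source distribution.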

Provenance

The following attestation bundles were made for dataframely-2.9.0.tar.gz:

Publisher: build.yml on Quantco/dataframely

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file dataframely-2.9.0-cp310-abi3-win_amd64.whl.

File metadata

  • Download URL: dataframely-2.9.0-cp310-abi3-win_amd64.whl
  • Upload date:
  • Size: 5.6 MB
  • Tags: CPython 3.10+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for dataframely-2.9.0-cp310-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 185315cb90aced547bd0eb6f5381ba2f9fb16960ce9ee6ed550850a100bcbe24
MD5 0d638b2b55e50b7652770064d1712de5
BLAKE2b-256 1aa11ea0fe4a56230c5930ac29e6a444ae11ea0244d2cd8fbb973de2c48f0a08

Provenance

The following attestation bundles were made for dataframely-2.9.0-cp310-abi3-win_amd64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.9.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for dataframely-2.9.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 e5b66e7786a37b653262aa2956ce6d94440f81a7395c7a57c96796d20b9e6e87
MD5 518ef454c8edbd283ef33cb8fd6a6639
BLAKE2b-256 308d9fecb83126023ce598046b3d049053c9a494ed6c2ef9463be6e66ad04fe8

Provenance

The following attestation bundles were made for dataframely-2.9.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.9.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for dataframely-2.9.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 d07a9072c6670b3db2d01a3a945b444b49e91e7ca3d329e8392367ca0d998cd3
MD5 759fffd1493b3b13e6444c4fcf697e48
BLAKE2b-256 6dbd581e795c94beb7fdc0f2863aeca26e5c1c4d2c81bd2773d0f7dda965f79b

Provenance

The following attestation bundles were made for dataframely-2.9.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.9.0-cp310-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for dataframely-2.9.0-cp310-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 44054d8821c45e9ad214f25ee781c7e775837214947e38105a996a8d8e9a77ea
MD5 e3026a207ace81e7fb0e519423b76016
BLAKE2b-256 fe790dbde4c86973ad51bf75c629b8c2db16614239405cd452a5f997f090fbdc

Provenance

The following attestation bundles were made for dataframely-2.9.0-cp310-abi3-macosx_11_0_arm64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.9.0-cp310-abi3-macosx_10_12_x86_64.whl.

File hashes

Hashes for dataframely-2.9.0-cp310-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 27e2634863eb9c11f503ab9c158bf3527c51b6e307dddbd625e6eee7b7c80a11
MD5 9ce687d0f41ddf1725976b737e99a76a
BLAKE2b-256 33f2f822390eeb9c00812f6db0260268cedd5c8626f5ac187e26ae6b269fa272

Provenance

The following attestation bundles were made for dataframely-2.9.0-cp310-abi3-macosx_10_12_x86_64.whl:

Publisher: build.yml on Quantco/dataframely
