
A declarative, polars-native data frame validation library


dataframely — A declarative, 🐻‍❄️-native data frame validation library

📖 Introduction

Dataframely is a Python package for validating the schema and content of polars data frames. It aims to make data pipelines more robust, by ensuring that data meets expectations, and more readable, by adding schema information to data frame type hints.

💿 Installation

You can install dataframely using your favorite package manager, e.g., pixi or pip:

pixi add dataframely
pip install dataframely

🎯 Usage

Defining a data frame schema

import dataframely as dy
import polars as pl

class HouseSchema(dy.Schema):
    zip_code = dy.String(nullable=False, min_length=3)
    num_bedrooms = dy.UInt8(nullable=False)
    num_bathrooms = dy.UInt8(nullable=False)
    price = dy.Float64(nullable=False)

    @dy.rule()
    def reasonable_bathroom_to_bedroom_ratio() -> pl.Expr:
        ratio = pl.col("num_bathrooms") / pl.col("num_bedrooms")
        return (ratio >= 1 / 3) & (ratio <= 3)

    @dy.rule(group_by=["zip_code"])
    def minimum_zip_code_count() -> pl.Expr:
        return pl.len() >= 2

Validating data against schema

import polars as pl

df = pl.DataFrame({
    "zip_code": ["01234", "01234", "1", "213", "123", "213"],
    "num_bedrooms": [2, 2, 1, None, None, 2],
    "num_bathrooms": [1, 2, 1, 1, 0, 8],
    "price": [100_000, 110_000, 50_000, 80_000, 60_000, 160_000]
})

# Validate the data and cast columns to expected types.
# Note: validate raises an exception when rows violate the schema, as the sample rows above do.
validated_df: dy.DataFrame[HouseSchema] = HouseSchema.validate(df, cast=True)

See more advanced usage examples in the documentation.
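To make the checks in HouseSchema concrete, here is a minimal plain-Python sketch that applies the same conditions to the sample rows by hand. It is an illustration only and not how dataframely evaluates a schema: the helper name find_violations and the check labels are made up for this example, the price column is left out for brevity, and two behaviours are assumptions (a ratio involving a missing value is not counted as a violation, and the per-zip-code count is taken over all input rows).

```python
from collections import Counter

# Sample rows from the example above as plain dicts (price omitted for brevity).
rows = [
    {"zip_code": "01234", "num_bedrooms": 2, "num_bathrooms": 1},
    {"zip_code": "01234", "num_bedrooms": 2, "num_bathrooms": 2},
    {"zip_code": "1", "num_bedrooms": 1, "num_bathrooms": 1},
    {"zip_code": "213", "num_bedrooms": None, "num_bathrooms": 1},
    {"zip_code": "123", "num_bedrooms": None, "num_bathrooms": 0},
    {"zip_code": "213", "num_bedrooms": 2, "num_bathrooms": 8},
]


def find_violations(rows):
    """Return a dict mapping row index to the list of checks that row fails."""
    # Assumption: group sizes are counted over all input rows.
    zip_counts = Counter(row["zip_code"] for row in rows)
    violations = {}
    for index, row in enumerate(rows):
        failed = []
        bedrooms, bathrooms = row["num_bedrooms"], row["num_bathrooms"]

        # Column-level checks: min_length=3 and nullable=False.
        if len(row["zip_code"]) < 3:
            failed.append("zip_code min_length")
        if bedrooms is None:
            failed.append("num_bedrooms not null")
        if bathrooms is None:
            failed.append("num_bathrooms not null")

        # Row-level rule: ratio must lie in [1/3, 3].
        # Assumption: rows with a missing value are skipped for this rule.
        if bedrooms is not None and bathrooms is not None:
            ratio = bathrooms / bedrooms if bedrooms else float("inf")
            if not (1 / 3 <= ratio <= 3):
                failed.append("bathroom to bedroom ratio")

        # Group-level rule: each zip code must occur at least twice.
        if zip_counts[row["zip_code"]] < 2:
            failed.append("minimum zip code count")

        if failed:
            violations[index] = failed
    return violations


for index, failed in find_violations(rows).items():
    print(index, failed)
```

Under these assumptions only the first two rows pass every check, which is why strict validation of the sample data is not expected to succeed.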


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dataframely-1.3.1.tar.gz (248.6 kB)

Tags: Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

dataframely-1.3.1-cp311-abi3-win_amd64.whl (398.3 kB)

Tags: CPython 3.11+, Windows x86-64

dataframely-1.3.1-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (533.2 kB)

Tags: CPython 3.11+, manylinux: glibc 2.17+ x86-64

dataframely-1.3.1-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (528.7 kB)

Tags: CPython 3.11+, manylinux: glibc 2.17+ ARM64

dataframely-1.3.1-cp311-abi3-macosx_11_0_arm64.whl (476.2 kB)

Tags: CPython 3.11+, macOS 11.0+ ARM64

dataframely-1.3.1-cp311-abi3-macosx_10_12_x86_64.whl (489.6 kB)

Tags: CPython 3.11+, macOS 10.12+ x86-64
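
The wheel file names listed above follow the standard layout name-version-python tag-abi tag-platform tag, with an optional build tag after the version. The sketch below splits a file name into those parts; the helper name split_wheel_filename is hypothetical, and the function only handles the simple dash-separated layout rather than every edge case of the specification.

```python
def split_wheel_filename(filename):
    """Split a wheel file name into its dash-separated components."""
    suffix = ".whl"
    if not filename.endswith(suffix):
        raise ValueError(f"not a wheel file name: {filename!r}")

    parts = filename[: -len(suffix)].split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected number of components in {filename!r}")

    # The last three components are always the python, abi and platform tags.
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        # A build tag is present only in the six-component form.
        "build": parts[2] if len(parts) == 6 else None,
        # Each tag may bundle several alternatives separated by dots.
        "python_tags": python_tag.split("."),
        "abi_tags": abi_tag.split("."),
        "platform_tags": platform_tag.split("."),
    }


info = split_wheel_filename(
    "dataframely-1.3.1-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"
)
print(info["python_tags"], info["abi_tags"], info["platform_tags"])
```

Here cp311 together with abi3 indicates a wheel built against the CPython stable ABI starting at 3.11, which matches the "CPython 3.11+" label in the listing.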

File details

Details for the file dataframely-1.3.1.tar.gz.

File metadata

  • Download URL: dataframely-1.3.1.tar.gz
  • Size: 248.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for dataframely-1.3.1.tar.gz
Algorithm Hash digest
SHA256 a88e579201a471263eaad91c008ec90ffa81afb9cad514aa32debd39a07ec8c5
MD5 35f86761683250a549b215639384a394
BLAKE2b-256 64afe04a29c74b2b8b320c81615b0d1e49b1795e388266aedc5f33dc5c9bf560

See more details on using hashes here.
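
One way to use a published digest is to recompute the SHA256 of the downloaded file and compare the two. The following is a minimal sketch using only the Python standard library; the helper names sha256_of_file and matches_published_hash are hypothetical, and the local path in the commented usage is an assumption about where the file was saved.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Usage, assuming the source distribution was saved to the current directory:
# expected = "a88e579201a471263eaad91c008ec90ffa81afb9cad514aa32debd39a07ec8c5"
# print(matches_published_hash("dataframely-1.3.1.tar.gz", expected))
```

A mismatch means the local file differs from the one the digest was computed for, so the file should not be installed until the cause is understood.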

Provenance

The following attestation bundles were made for dataframely-1.3.1.tar.gz:

Publisher: build.yml on Quantco/dataframely

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file dataframely-1.3.1-cp311-abi3-win_amd64.whl.

File metadata

  • Download URL: dataframely-1.3.1-cp311-abi3-win_amd64.whl
  • Size: 398.3 kB
  • Tags: CPython 3.11+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for dataframely-1.3.1-cp311-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 7152c5506b6768784c6cb4f34544032d76c0f4ab2cb48a441338d619a11dd716
MD5 aab07d0efcf658fa4ac109a43f2b16f4
BLAKE2b-256 c60a0a48abdc5f245ee687e23db72c1fb20b3b611c78577a222e6927e0ed1d6d

Provenance

The following attestation bundles were made for dataframely-1.3.1-cp311-abi3-win_amd64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.3.1-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for dataframely-1.3.1-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 20813977b07bfd1d77f7cf6bde4b51a7e9fbe42e7e971a36fa7072f24703cd89
MD5 e34daadb0a70d32b2a44be8a7e679aa2
BLAKE2b-256 cb10abb8d96377943352cb76e7ffddba1ac50dfcef4c7835442870915cc84e99

Provenance

The following attestation bundles were made for dataframely-1.3.1-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.3.1-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for dataframely-1.3.1-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 8903e390a12d8ff1afdb7a3ac2e16d9bbf56cf98e9c287f2efc94c0eff5e9816
MD5 a1270bd9fa55622ab04cc03f622dbb25
BLAKE2b-256 05b000ce4c08dca54523995344f4b189c63628c17f3a9b36e38a16bb4bbeb96d

Provenance

The following attestation bundles were made for dataframely-1.3.1-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.3.1-cp311-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for dataframely-1.3.1-cp311-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 c3c1b3008d9da703bac30a5661a10060a2139884088a0dcdee46ad0bbd069167
MD5 6f4ab2d3600c412d5964ab9851df4fcc
BLAKE2b-256 15f4e52c78a66d4280ef0f792983d96d0e7d1b65ab99ddeeabe4fc250f4fed26

Provenance

The following attestation bundles were made for dataframely-1.3.1-cp311-abi3-macosx_11_0_arm64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.3.1-cp311-abi3-macosx_10_12_x86_64.whl.

File hashes

Hashes for dataframely-1.3.1-cp311-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 759e647a99af80a7ea23cf1dc6ae397e89f0b048b34839807dddca83dc755f09
MD5 716224282df8a42eaee455d780a467fa
BLAKE2b-256 d5f6c8c944eac0db6812d137e825af66a896b2a64a6ef0eb5031ab11f2479ce2

Provenance

The following attestation bundles were made for dataframely-1.3.1-cp311-abi3-macosx_10_12_x86_64.whl:

Publisher: build.yml on Quantco/dataframely
