dataframely — A declarative, 🐻‍❄️-native data frame validation library

📖 Introduction

Dataframely is a Python package for validating the schema and content of polars data frames. It aims to make data pipelines more robust by ensuring that data meets expectations, and more readable by adding schema information to data frame type hints.

💿 Installation

You can install dataframely using your favorite package manager, e.g., pixi or pip:

pixi add dataframely
pip install dataframely
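
After installing, one way to confirm that the package is visible to the current interpreter is to query its installed version through the standard library. This is a minimal sketch; the helper name `installed_version` is made up for illustration and is not part of dataframely.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string if dataframely is installed, otherwise None.
print(installed_version("dataframely"))
```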

🎯 Usage

Defining a data frame schema

import dataframely as dy
import polars as pl

class HouseSchema(dy.Schema):
    # Column definitions: data type plus column-level constraints
    zip_code = dy.String(nullable=False, min_length=3)
    num_bedrooms = dy.UInt8(nullable=False)
    num_bathrooms = dy.UInt8(nullable=False)
    price = dy.Float64(nullable=False)

    # Row-level rule: the expression is evaluated for every row
    @dy.rule()
    def reasonable_bathroom_to_bedroom_ratio() -> pl.Expr:
        ratio = pl.col("num_bathrooms") / pl.col("num_bedrooms")
        return (ratio >= 1 / 3) & (ratio <= 3)

    # Group-level rule: the expression is evaluated once per zip code
    @dy.rule(group_by=["zip_code"])
    def minimum_zip_code_count() -> pl.Expr:
        return pl.len() >= 2

Validating data against schema

import dataframely as dy
import polars as pl

# Example data; HouseSchema is the schema defined in the previous section
df = pl.DataFrame({
    "zip_code": ["01234", "01234", "1", "213", "123", "213"],
    "num_bedrooms": [2, 2, 1, None, None, 2],
    "num_bathrooms": [1, 2, 1, 1, 0, 8],
    "price": [100_000, 110_000, 50_000, 80_000, 60_000, 160_000]
})

# Validate the data and cast columns to expected types.
# Note: validate raises a validation error if any row violates the schema.
validated_df: dy.DataFrame[HouseSchema] = HouseSchema.validate(df, cast=True)
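
To make the checks declared by HouseSchema concrete, the following plain-Python sketch applies the same conditions to the example data row by row. It deliberately uses neither dataframely nor polars, so it only illustrates what the constraints mean and says nothing about how the library evaluates them. Two assumptions are made here: a missing `num_bedrooms` is reported as a nullability failure and the ratio check is skipped for that row, and the failure labels are free-form strings chosen for this example.

```python
from collections import Counter
from typing import Dict, List

# The example data from above, written as one dict per row.
rows = [
    {"zip_code": "01234", "num_bedrooms": 2, "num_bathrooms": 1, "price": 100_000},
    {"zip_code": "01234", "num_bedrooms": 2, "num_bathrooms": 2, "price": 110_000},
    {"zip_code": "1", "num_bedrooms": 1, "num_bathrooms": 1, "price": 50_000},
    {"zip_code": "213", "num_bedrooms": None, "num_bathrooms": 1, "price": 80_000},
    {"zip_code": "123", "num_bedrooms": None, "num_bathrooms": 0, "price": 60_000},
    {"zip_code": "213", "num_bedrooms": 2, "num_bathrooms": 8, "price": 160_000},
]

# Group-level information: how often each zip code occurs in the input.
zip_counts = Counter(row["zip_code"] for row in rows)


def failed_checks(row: Dict) -> List[str]:
    """Return illustrative labels for every check the row fails."""
    failures = []
    # Column constraint: zip_code must have at least 3 characters.
    if len(row["zip_code"]) < 3:
        failures.append("zip_code too short")
    # Column constraint: num_bedrooms must not be null.
    if row["num_bedrooms"] is None:
        failures.append("num_bedrooms is null")
    else:
        # Row-level rule: bathroom-to-bedroom ratio between 1/3 and 3.
        ratio = row["num_bathrooms"] / row["num_bedrooms"]
        if not (1 / 3 <= ratio <= 3):
            failures.append("bathroom-to-bedroom ratio out of range")
    # Group-level rule: each zip code must occur at least twice.
    if zip_counts[row["zip_code"]] < 2:
        failures.append("zip code occurs fewer than 2 times")
    return failures


for index, row in enumerate(rows):
    print(index, failed_checks(row) or "ok")

valid_indices = [i for i, row in enumerate(rows) if not failed_checks(row)]
print(valid_indices)  # → [0, 1]
```

Under these assumptions only the two "01234" rows pass every check, so this example data contains several rows that violate the schema.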

See more advanced usage examples in the documentation.

Download files

Download the file for your platform.

Source Distribution

  • dataframely-1.7.5.tar.gz (326.5 kB): Source

Built Distributions

  • dataframely-1.7.5-pp310-pypy310_pp73-macosx_10_12_x86_64.whl (509.7 kB): PyPy, macOS 10.12+ x86-64
  • dataframely-1.7.5-cp310-abi3-win_amd64.whl (414.8 kB): CPython 3.10+, Windows x86-64
  • dataframely-1.7.5-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (546.8 kB): CPython 3.10+, manylinux: glibc 2.17+ x86-64
  • dataframely-1.7.5-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (539.4 kB): CPython 3.10+, manylinux: glibc 2.17+ ARM64
  • dataframely-1.7.5-cp310-abi3-macosx_11_0_arm64.whl (493.6 kB): CPython 3.10+, macOS 11.0+ ARM64

File details

Details for the file dataframely-1.7.5.tar.gz.

File metadata

  • Download URL: dataframely-1.7.5.tar.gz
  • Upload date:
  • Size: 326.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for dataframely-1.7.5.tar.gz
Algorithm Hash digest
SHA256 1c40e5c29217fc584636ae1f5ffaf8b46729ce3d49d88ce3d6b3551bf4cded8f
MD5 6c82e6f4699996564321b2b425d0b2f5
BLAKE2b-256 598d0023f429c0cd00b7d2c81fa8e729847fa90f106dc9cca684298b02efae22
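
A downloaded file can be checked against the SHA256 digest listed above. The sketch below uses only the standard library; the helper name `sha256_of_file` and the local file path are hypothetical and should be adapted to wherever the archive was actually saved.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# SHA256 digest as listed above for dataframely-1.7.5.tar.gz
EXPECTED_SHA256 = "1c40e5c29217fc584636ae1f5ffaf8b46729ce3d49d88ce3d6b3551bf4cded8f"

# Hypothetical download location; nothing happens if the file is not there.
archive = Path("dataframely-1.7.5.tar.gz")
if archive.exists():
    matches = sha256_of_file(archive) == EXPECTED_SHA256
    print("digest matches" if matches else "digest MISMATCH")
```

Reading in chunks keeps memory use constant regardless of file size. The same function can be used for the wheel files by swapping in the corresponding digest.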

Provenance

The following attestation bundles were made for dataframely-1.7.5.tar.gz:

Publisher: build.yml on Quantco/dataframely

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file dataframely-1.7.5-pp310-pypy310_pp73-macosx_10_12_x86_64.whl.

File hashes

Hashes for dataframely-1.7.5-pp310-pypy310_pp73-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 2281f875ac045c48dfe1a8984566efb3cb04a6063a277441c32b9fb7f349b986
MD5 78992d52e0f785b690584ce8fa42d2ef
BLAKE2b-256 94b03f92e54dc322bb928c81a786da5ce22e57782ee4b245a11c755ba0bcfce9

Provenance

The following attestation bundles were made for dataframely-1.7.5-pp310-pypy310_pp73-macosx_10_12_x86_64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.7.5-cp310-abi3-win_amd64.whl.

File metadata

  • Download URL: dataframely-1.7.5-cp310-abi3-win_amd64.whl
  • Upload date:
  • Size: 414.8 kB
  • Tags: CPython 3.10+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for dataframely-1.7.5-cp310-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 08256435281cb12bccb779e558ab4091e689da1a886010e85d4d01c34ea9f74c
MD5 c2b755b4763f3d49f4300313248187d4
BLAKE2b-256 f74180224ef353330455abeea65878ecdce86ae6fc7731d5aef79c8cbd09ae14

Provenance

The following attestation bundles were made for dataframely-1.7.5-cp310-abi3-win_amd64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.7.5-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for dataframely-1.7.5-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 94595751caed34f3604c9bbc30a50ed336749a13367c737c9bf24181316091b0
MD5 573c719c083d10c5ba5ef0169939b617
BLAKE2b-256 3c592e5c1fa386c62cbfcf31555951deae2719517adec78597fc9293ff8c9ba7

Provenance

The following attestation bundles were made for dataframely-1.7.5-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.7.5-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for dataframely-1.7.5-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 7d6790de10f9badeb28a852e66f1a7bed220390d03eff1947a447b3a20072ebf
MD5 b783b5c2ce3e2da558250e4e98509e6b
BLAKE2b-256 ee59911c0e7fe786bade4830fdef317aa650c2f384ea9b92ed7c052c578448c2

Provenance

The following attestation bundles were made for dataframely-1.7.5-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.7.5-cp310-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for dataframely-1.7.5-cp310-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 bd311a74ab037ae1660ba28e8f4020c0b60e86c8f10bdec5a1f770f688d5fa0c
MD5 f571fcda7d2872f96fa1ff3eb1cd9a1b
BLAKE2b-256 5a7c52231050de5a717a530fdcbd2519304eb27f332b4fb06462cbe26bc9f687

Provenance

The following attestation bundles were made for dataframely-1.7.5-cp310-abi3-macosx_11_0_arm64.whl:

Publisher: build.yml on Quantco/dataframely
