
A declarative, polars-native data frame validation library


dataframely — A declarative, 🐻‍❄️-native data frame validation library


📖 Introduction

Dataframely is a Python package for validating the schema and content of polars data frames. It aims to make data pipelines more robust, by ensuring that data meets expectations, and more readable, by adding schema information to data frame type hints.

💿 Installation

You can install dataframely using your favorite package manager, e.g., pixi or pip:

pixi add dataframely
pip install dataframely

🎯 Usage

Defining a data frame schema

import dataframely as dy
import polars as pl

class HouseSchema(dy.Schema):
    # Column definitions: data type plus per-column constraints.
    zip_code = dy.String(nullable=False, min_length=3)
    num_bedrooms = dy.UInt8(nullable=False)
    num_bathrooms = dy.UInt8(nullable=False)
    price = dy.Float64(nullable=False)

    # Row-level rule: the expression yields one boolean per row.
    @dy.rule()
    def reasonable_bathroom_to_bedroom_ratio() -> pl.Expr:
        ratio = pl.col("num_bathrooms") / pl.col("num_bedrooms")
        return (ratio >= 1 / 3) & (ratio <= 3)

    # Group-level rule: the expression is an aggregation evaluated per zip code.
    @dy.rule(group_by=["zip_code"])
    def minimum_zip_code_count() -> pl.Expr:
        return pl.len() >= 2
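
The rule declared with `group_by=["zip_code"]` is an aggregation: `pl.len() >= 2` is computed per zip code rather than per row. As a rough illustration of that idea only, here is a plain-Python sketch using the standard library. The zip codes are hypothetical, and this is not how dataframely implements the rule.

```python
from collections import Counter

# Hypothetical zip codes, one per row (not taken from the README's data).
zip_codes = ["10115", "10115", "80331", "10115", "20095"]

# Group-level check: evaluated once per distinct zip code,
# mirroring the aggregation `pl.len() >= 2`.
group_ok = {zip_code: count >= 2 for zip_code, count in Counter(zip_codes).items()}

# Every row inherits the verdict of the group it belongs to.
row_ok = [group_ok[zip_code] for zip_code in zip_codes]

print(group_ok)
print(row_ok)
```

In this sketch a zip code that occurs only once marks its single row as failing, while every row of a zip code that occurs at least twice passes.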

Validating data against the schema

import polars as pl

df = pl.DataFrame({
    "zip_code": ["01234", "01234", "1", "213", "123", "213"],
    "num_bedrooms": [2, 2, 1, None, None, 2],
    "num_bathrooms": [1, 2, 1, 1, 0, 8],
    "price": [100_000, 110_000, 50_000, 80_000, 60_000, 160_000]
})

# Validate the data and cast columns to expected types
validated_df: dy.DataFrame[HouseSchema] = HouseSchema.validate(df, cast=True)
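
Several rows in the sample frame do not meet `HouseSchema`. To make that concrete, the sketch below restates each constraint by hand in plain Python and lists the row indices that violate it. It uses only the standard library and is not dataframely's implementation. The check labels are made up for illustration, and skipping the ratio rule when `num_bedrooms` is missing is an assumption about how nulls are handled.

```python
from collections import Counter

# Same sample values as the data frame above, kept as plain Python lists.
data = {
    "zip_code": ["01234", "01234", "1", "213", "123", "213"],
    "num_bedrooms": [2, 2, 1, None, None, 2],
    "num_bathrooms": [1, 2, 1, 1, 0, 8],
    "price": [100_000, 110_000, 50_000, 80_000, 60_000, 160_000],
}
n_rows = len(data["zip_code"])
zip_counts = Counter(data["zip_code"])  # rows per zip code, for the group-level rule


def ratio_ok(bathrooms, bedrooms):
    # Assumption: a missing value is reported by the nullability check instead.
    if bathrooms is None or bedrooms is None:
        return True
    # A zero bedroom count gives an unbounded ratio; treated as out of range here.
    if bedrooms == 0:
        return False
    return 1 / 3 <= bathrooms / bedrooms <= 3


# Hypothetical labels; each maps to a per-row pass/fail list.
checks = {
    "zip_code min_length=3": [len(z) >= 3 for z in data["zip_code"]],
    "num_bedrooms not null": [b is not None for b in data["num_bedrooms"]],
    "bathroom/bedroom ratio": [
        ratio_ok(ba, be)
        for ba, be in zip(data["num_bathrooms"], data["num_bedrooms"])
    ],
    "zip code appears >= 2 times": [zip_counts[z] >= 2 for z in data["zip_code"]],
}

# Row indices that fail each check, and rows that pass all of them.
failing_rows = {
    name: [i for i, ok in enumerate(passed) if not ok]
    for name, passed in checks.items()
}
valid_rows = [i for i in range(n_rows) if all(passed[i] for passed in checks.values())]

print(failing_rows)
print(valid_rows)
```

Only rows 0 and 1 pass every check in this sketch. Assuming `validate` rejects a frame that contains any failing row, the call above would be expected to raise for this sample rather than return a validated frame.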

See more advanced usage examples in the documentation.

Download files

Download the file for your platform.

Source Distribution

  • dataframely-1.8.1.tar.gz (300.0 kB): Source

Built Distributions

  • dataframely-1.8.1-pp310-pypy310_pp73-macosx_10_12_x86_64.whl (512.1 kB): PyPy, macOS 10.12+ x86-64
  • dataframely-1.8.1-cp310-abi3-win_amd64.whl (417.2 kB): CPython 3.10+, Windows x86-64
  • dataframely-1.8.1-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (548.4 kB): CPython 3.10+, manylinux: glibc 2.17+ x86-64
  • dataframely-1.8.1-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (539.7 kB): CPython 3.10+, manylinux: glibc 2.17+ ARM64
  • dataframely-1.8.1-cp310-abi3-macosx_11_0_arm64.whl (496.0 kB): CPython 3.10+, macOS 11.0+ ARM64
