A declarative, polars-native data frame validation library

dataframely — A declarative, 🐻‍❄️-native data frame validation library

📖 Introduction

Dataframely is a Python package to validate the schema and content of polars data frames. Its purpose is to make data pipelines more robust by ensuring that data meets expectations and more readable by adding schema information to data frame type hints.

💿 Installation

You can install dataframely using your favorite package manager, e.g., pixi or pip:

pixi add dataframely
pip install dataframely

🎯 Usage

Defining a data frame schema

import dataframely as dy
import polars as pl

class HouseSchema(dy.Schema):
    zip_code = dy.String(nullable=False, min_length=3)
    num_bedrooms = dy.UInt8(nullable=False)
    num_bathrooms = dy.UInt8(nullable=False)
    price = dy.Float64(nullable=False)

    @dy.rule()
    def reasonable_bathroom_to_bedroom_ratio() -> pl.Expr:
        ratio = pl.col("num_bathrooms") / pl.col("num_bedrooms")
        return (ratio >= 1 / 3) & (ratio <= 3)

    @dy.rule(group_by=["zip_code"])
    def minimum_zip_code_count() -> pl.Expr:
        return pl.len() >= 2

Validating data against the schema

import polars as pl

df = pl.DataFrame({
    "zip_code": ["01234", "01234", "1", "213", "123", "213"],
    "num_bedrooms": [2, 2, 1, None, None, 2],
    "num_bathrooms": [1, 2, 1, 1, 0, 8],
    "price": [100_000, 110_000, 50_000, 80_000, 60_000, 160_000]
})

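# Validate the data and cast columns to expected types.
# Note: validate raises an exception if any row violates the schema,
# and the sample data above deliberately contains such rows.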
validated_df: dy.DataFrame[HouseSchema] = HouseSchema.validate(df, cast=True)

See more advanced usage examples in the documentation.

Download files

Source Distribution

  • dataframely-1.0.0.tar.gz (246.4 kB): Source

Built Distributions

  • dataframely-1.0.0-cp311-abi3-win_amd64.whl (432.3 kB): CPython 3.11+, Windows x86-64
  • dataframely-1.0.0-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (626.3 kB): CPython 3.11+, manylinux: glibc 2.17+ x86-64
  • dataframely-1.0.0-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (625.9 kB): CPython 3.11+, manylinux: glibc 2.17+ ARM64
  • dataframely-1.0.0-cp311-abi3-macosx_11_0_arm64.whl (546.4 kB): CPython 3.11+, macOS 11.0+ ARM64
  • dataframely-1.0.0-cp311-abi3-macosx_10_12_x86_64.whl (557.0 kB): CPython 3.11+, macOS 10.12+ x86-64

File details

Details for the file dataframely-1.0.0.tar.gz.

File metadata

  • Download URL: dataframely-1.0.0.tar.gz
  • Size: 246.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for dataframely-1.0.0.tar.gz
Algorithm Hash digest
SHA256 56681a29e424dbc1e6b2c2decb879eea46abca06e416742dfccd565dda627c65
MD5 386c82f46a350647e1b7eafa5f4faff3
BLAKE2b-256 34d6bce2dba11af407320a8a307cdfc8caaa5919141970a95c3577d039d1659f

See more details on using hashes here.

Provenance

The following attestation bundles were made for dataframely-1.0.0.tar.gz:

Publisher: build.yml on Quantco/dataframely

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file dataframely-1.0.0-cp311-abi3-win_amd64.whl.

File metadata

  • Download URL: dataframely-1.0.0-cp311-abi3-win_amd64.whl
  • Size: 432.3 kB
  • Tags: CPython 3.11+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for dataframely-1.0.0-cp311-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 d8ddb1eacda20c4c2fdbf9c8f8f3137369e15ec0d70a223790d444e509eaeb17
MD5 31a2ddcf9f1d9a6e86d8acb0f1b5211d
BLAKE2b-256 451238dc6a99a52026d30a30fede2e44634852c5c9745c2c9372d12a92dcdfc4

Provenance

The following attestation bundles were made for dataframely-1.0.0-cp311-abi3-win_amd64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.0.0-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for dataframely-1.0.0-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 e8016bd4c12001080e3f4572281e2a0847e2b4bf7275439ba6733d999b2b0264
MD5 c62bab68e603fb2c646fb45e76db4b34
BLAKE2b-256 089dbcf9d0ec641e9c39a24c01c0fec55ff71551112a7631a6e59406cd63b46f

Provenance

The following attestation bundles were made for dataframely-1.0.0-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.0.0-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for dataframely-1.0.0-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 b881ab20c4ebff37bbb1657fe537cfad3c90e1dfccbaaa046c8a6acc9584edfa
MD5 23960972bdc181e04a7bce54405da4dc
BLAKE2b-256 9fdf1ecd738d82b9106f4ec1a2689806a5f95351df86ed5694b65e2f5da0401d

Provenance

The following attestation bundles were made for dataframely-1.0.0-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.0.0-cp311-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for dataframely-1.0.0-cp311-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 23d52a2dde50cf85a2508d8bb175e79dd66ca8a5e9c1618e13bb19ea4161368f
MD5 ec2fcf7c6e840ea50a4243d32922ed33
BLAKE2b-256 20a45df6f5edcc873a5a9a7d104adb732af222695d89d4488038c0da397b8fd9

Provenance

The following attestation bundles were made for dataframely-1.0.0-cp311-abi3-macosx_11_0_arm64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.0.0-cp311-abi3-macosx_10_12_x86_64.whl.

File hashes

Hashes for dataframely-1.0.0-cp311-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 f00722594cea1856229848bc56eb76880e5f9bca67af1f5a394f940571d0a0ef
MD5 e59ac3dbf85edc9432d0a4bd37987cfe
BLAKE2b-256 4ef7e8271f1c6b4e27a9403b0765f155e4269d89d1e6b5e54249f09070ade918

Provenance

The following attestation bundles were made for dataframely-1.0.0-cp311-abi3-macosx_10_12_x86_64.whl:

Publisher: build.yml on Quantco/dataframely
