A declarative, polars-native data frame validation library

dataframely — A declarative, 🐻‍❄️-native data frame validation library

📖 Introduction

Dataframely is a Python package to validate the schema and content of polars data frames. Its purpose is to make data pipelines more robust, by ensuring that data meets expectations, and more readable, by adding schema information to data frame type hints.

💿 Installation

You can install dataframely using your favorite package manager, e.g., pixi or pip:

pixi add dataframely
pip install dataframely

🎯 Usage

Defining a data frame schema

import dataframely as dy
import polars as pl

class HouseSchema(dy.Schema):
    zip_code = dy.String(nullable=False, min_length=3)
    num_bedrooms = dy.UInt8(nullable=False)
    num_bathrooms = dy.UInt8(nullable=False)
    price = dy.Float64(nullable=False)

    @dy.rule()
    def reasonable_bathroom_to_bedroom_ratio() -> pl.Expr:
        ratio = pl.col("num_bathrooms") / pl.col("num_bedrooms")
        return (ratio >= 1 / 3) & (ratio <= 3)

    @dy.rule(group_by=["zip_code"])
    def minimum_zip_code_count() -> pl.Expr:
        return pl.len() >= 2

Validating data against schema

import polars as pl

df = pl.DataFrame({
    "zip_code": ["01234", "01234", "1", "213", "123", "213"],
    "num_bedrooms": [2, 2, 1, None, None, 2],
    "num_bathrooms": [1, 2, 1, 1, 0, 8],
    "price": [100_000, 110_000, 50_000, 80_000, 60_000, 160_000]
})

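# Validate the data and cast columns to expected types.
# Some rows above violate the schema (e.g. the zip code "1" and the null
# bedroom counts), so this call raises a validation error for this input.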
validated_df: dy.DataFrame[HouseSchema] = HouseSchema.validate(df, cast=True)

See more advanced usage examples in the documentation.
