A declarative, polars-native data frame validation library

dataframely — A declarative, 🐻‍❄️-native data frame validation library

📖 Introduction

Dataframely is a Python package for validating the schema and content of polars data frames. Its purpose is to make data pipelines more robust, by ensuring that data meets expectations, and more readable, by adding schema information to data frame type hints.

💿 Installation

You can install dataframely using your favorite package manager, e.g., pixi or pip:

pixi add dataframely
pip install dataframely
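
After installing, it can be handy to confirm which version ended up in the current environment. The sketch below uses only the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical and not part of dataframely.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the installed version string, or None when dataframely
# is missing from the current environment.
print(installed_version("dataframely"))
```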

🎯 Usage

Defining a data frame schema

import dataframely as dy
import polars as pl

class HouseSchema(dy.Schema):
    zip_code = dy.String(nullable=False, min_length=3)
    num_bedrooms = dy.UInt8(nullable=False)
    num_bathrooms = dy.UInt8(nullable=False)
    price = dy.Float64(nullable=False)

    @dy.rule()
    def reasonable_bathroom_to_bedroom_ratio() -> pl.Expr:
        ratio = pl.col("num_bathrooms") / pl.col("num_bedrooms")
        return (ratio >= 1 / 3) & (ratio <= 3)

    @dy.rule(group_by=["zip_code"])
    def minimum_zip_code_count() -> pl.Expr:
        return pl.len() >= 2
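
To make the intent of the two rules concrete, here is a plain-Python restatement of what they check, using only the standard library. It illustrates the logic and is not how dataframely evaluates rules; the helper names `ratio_ok` and `zip_code_count_ok` and the sample rows are hypothetical, and null or zero bedroom counts are deliberately left out.

```python
from collections import Counter
from typing import List


def ratio_ok(num_bathrooms: int, num_bedrooms: int) -> bool:
    """Row-level rule: the bathroom-to-bedroom ratio must lie in [1/3, 3].

    Assumes num_bedrooms is a positive integer; nulls and zeros are out of
    scope for this illustration.
    """
    ratio = num_bathrooms / num_bedrooms
    return 1 / 3 <= ratio <= 3


def zip_code_count_ok(zip_codes: List[str], minimum: int = 2) -> List[bool]:
    """Group-level rule: a row passes only if its zip code appears at least
    `minimum` times across all rows."""
    counts = Counter(zip_codes)
    return [counts[code] >= minimum for code in zip_codes]


# Hypothetical sample rows: (zip_code, num_bedrooms, num_bathrooms)
rows = [("01234", 2, 1), ("01234", 2, 2), ("213", 2, 8)]

row_results = [ratio_ok(bath, bed) for _, bed, bath in rows]
group_results = zip_code_count_ok([zip_code for zip_code, _, _ in rows])

print(row_results)    # [True, True, False]
print(group_results)  # [True, True, False]
```

The second helper mirrors why the schema passes `group_by=["zip_code"]` to the decorator: the count is taken per zip code, and every row in a group shares that group's result.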

Validating data against a schema

import polars as pl

df = pl.DataFrame({
    "zip_code": ["01234", "01234", "1", "213", "123", "213"],
    "num_bedrooms": [2, 2, 1, None, None, 2],
    "num_bathrooms": [1, 2, 1, 1, 0, 8],
    "price": [100_000, 110_000, 50_000, 80_000, 60_000, 160_000]
})

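# Validate the data and cast columns to the expected types.
# Note: this sample contains rows that violate the schema (for example, the
# zip code "1" and the null bedroom counts), so validate raises an exception.
validated_df: dy.DataFrame[HouseSchema] = HouseSchema.validate(df, cast=True)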

See more advanced usage examples in the documentation.

Download files

Source Distribution

dataframely-1.9.0.tar.gz (302.8 kB)

Uploaded Source

Built Distributions

dataframely-1.9.0-pp310-pypy310_pp73-macosx_10_12_x86_64.whl (512.8 kB)

Uploaded PyPy, macOS 10.12+ x86-64

dataframely-1.9.0-cp310-abi3-win_amd64.whl (416.5 kB)

Uploaded CPython 3.10+, Windows x86-64

dataframely-1.9.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (549.8 kB)

Uploaded CPython 3.10+, manylinux: glibc 2.17+ x86-64

dataframely-1.9.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (541.0 kB)

Uploaded CPython 3.10+, manylinux: glibc 2.17+ ARM64

dataframely-1.9.0-cp310-abi3-macosx_11_0_arm64.whl (495.5 kB)

Uploaded CPython 3.10+, macOS 11.0+ ARM64

File details

Details for the file dataframely-1.9.0.tar.gz.

File metadata

  • Download URL: dataframely-1.9.0.tar.gz
  • Size: 302.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for dataframely-1.9.0.tar.gz
Algorithm Hash digest
SHA256 c8aaedc284c530975b466cb8a4cef4e046a86bc1b1af013d3cb22d6d2256d1fb
MD5 ba24ba485b95bc4d9958ddd41ae285ea
BLAKE2b-256 25c94d63c49659c2855b25f22b73d71fa1980b48949695a0045ad09361d3047e

See more details on using hashes here.
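
The published digests can be checked locally after downloading a file. The following is a minimal sketch using the standard library's `hashlib`; the helper names `sha256_hex`, `file_chunks`, and `matches_published_digest` are hypothetical, and the local file path is whatever location the file was downloaded to.

```python
import hashlib
from typing import Iterable, Iterator


def sha256_hex(chunks: Iterable[bytes]) -> str:
    """Return the hex-encoded SHA256 digest of a sequence of byte chunks."""
    digest = hashlib.sha256()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def file_chunks(path: str, chunk_size: int = 1 << 20) -> Iterator[bytes]:
    """Yield the contents of the file at `path` in chunks of `chunk_size` bytes."""
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            yield chunk


def matches_published_digest(path: str, expected_hex: str) -> bool:
    """Compare a local file's SHA256 digest with a published hex digest."""
    return sha256_hex(file_chunks(path)) == expected_hex.strip().lower()
```

For the source distribution, this would be called as `matches_published_digest("dataframely-1.9.0.tar.gz", "c8aaedc284c530975b466cb8a4cef4e046a86bc1b1af013d3cb22d6d2256d1fb")`, assuming the file sits in the current working directory. Reading in chunks keeps memory use flat regardless of file size.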

Provenance

The following attestation bundles were made for dataframely-1.9.0.tar.gz:

Publisher: build.yml on Quantco/dataframely

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file dataframely-1.9.0-pp310-pypy310_pp73-macosx_10_12_x86_64.whl.

File hashes

Hashes for dataframely-1.9.0-pp310-pypy310_pp73-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 4791131865619e31e543aaee7a3d211507dfc259848b4c8cf978e8266677b96e
MD5 ea50e6a039950f9c61ea1ef7495e5907
BLAKE2b-256 4ba0c2d8959693b82c3d2adf8e0188ccdd27611fd74bf9b6d26a0fff1f12aa90

Provenance

The following attestation bundles were made for dataframely-1.9.0-pp310-pypy310_pp73-macosx_10_12_x86_64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.9.0-cp310-abi3-win_amd64.whl.

File metadata

  • Download URL: dataframely-1.9.0-cp310-abi3-win_amd64.whl
  • Size: 416.5 kB
  • Tags: CPython 3.10+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for dataframely-1.9.0-cp310-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 26ebdefe437dc58942418fee8b68ba9ca8a112f8cf9f0927f24bd601672b9002
MD5 c1b0ebddbc0ef8ecd54173604fa045eb
BLAKE2b-256 8d8b6cc3dafbbddb5cce3c798d5c719ddbf03c2634eef658b4837dbf1ec46cfb

Provenance

The following attestation bundles were made for dataframely-1.9.0-cp310-abi3-win_amd64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.9.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for dataframely-1.9.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 a115c94c1da0a80b287e07c765e893f38080ebeb4e422de093e4d4c67e92aaf0
MD5 5bea0b81a99dd4652c5eb387817c6544
BLAKE2b-256 5864ece8f99c398d593d8f2b3183d54fcf5547c17bfca18539c0f4a5b40996a1

Provenance

The following attestation bundles were made for dataframely-1.9.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.9.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for dataframely-1.9.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 1e0502af7e51c065c59425bc272539a6e491820bc8754161a24d85e91430f364
MD5 6713824442d72ebb888d66ec08cf6517
BLAKE2b-256 e4bc8f43d5c36c837b6912f3f257ab5ce8fbcaa7db678b03df9df3266477ab16

Provenance

The following attestation bundles were made for dataframely-1.9.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.9.0-cp310-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for dataframely-1.9.0-cp310-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 ca7ced6f690ae6fbb13d822f09514c9bcc68ddce4ecc44175184b8830c79bf8c
MD5 f86d6581cb7020369d07ed2eeac0750b
BLAKE2b-256 e6f47ad3f2847a286f62f83874ef14e52e87b7fe21e8fa30e7225e602749656e

Provenance

The following attestation bundles were made for dataframely-1.9.0-cp310-abi3-macosx_11_0_arm64.whl:

Publisher: build.yml on Quantco/dataframely
