A declarative, polars-native data frame validation library

Project description


dataframely — A declarative, 🐻‍❄️-native data frame validation library

📖 Introduction

Dataframely is a Python package for validating the schema and content of polars data frames. It aims to make data pipelines more robust, by ensuring that data meets expectations, and more readable, by adding schema information to data frame type hints.

💿 Installation

You can install dataframely using your favorite package manager, e.g., pixi or pip:

pixi add dataframely
pip install dataframely

🎯 Usage

Defining a data frame schema

import dataframely as dy
import polars as pl

class HouseSchema(dy.Schema):
    zip_code = dy.String(nullable=False, min_length=3)
    num_bedrooms = dy.UInt8(nullable=False)
    num_bathrooms = dy.UInt8(nullable=False)
    price = dy.Float64(nullable=False)

    @dy.rule()
    def reasonable_bathroom_to_bedroom_ratio(cls) -> pl.Expr:
        ratio = pl.col("num_bathrooms") / pl.col("num_bedrooms")
        return (ratio >= 1 / 3) & (ratio <= 3)

    @dy.rule(group_by=["zip_code"])
    def minimum_zip_code_count(cls) -> pl.Expr:
        return pl.len() >= 2

Validating data against the schema

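import dataframely as dy
import polars as pl

df = pl.DataFrame({
    "zip_code": ["01234", "01234", "1", "213", "123", "213"],
    "num_bedrooms": [2, 2, 1, None, None, 2],
    "num_bathrooms": [1, 2, 1, 1, 0, 8],
    "price": [100_000, 110_000, 50_000, 80_000, 60_000, 160_000],
})

# Validate the data and cast columns to expected types.
# Note: this sample contains rows that break HouseSchema (for example, the
# zip code "1" is shorter than 3 characters and num_bedrooms contains nulls),
# so this call is expected to raise a validation error instead of returning.
validated_df: dy.DataFrame[HouseSchema] = HouseSchema.validate(df, cast=True)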
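
To make it easier to see which sample rows conflict with HouseSchema, the checks can be reproduced by hand. The following is a minimal sketch in plain Python, not a description of how dataframely evaluates rules internally: the helper `failed_checks` and the label strings are hypothetical names chosen for illustration, and the ratio check is simply skipped when `num_bedrooms` is missing, which is an assumption of this sketch.

```python
from collections import Counter

# Sample rows copied from the example above (price is omitted because no
# check in this sketch depends on it).
zip_codes = ["01234", "01234", "1", "213", "123", "213"]
num_bedrooms = [2, 2, 1, None, None, 2]
num_bathrooms = [1, 2, 1, 1, 0, 8]

# Group-level check: every zip code must occur at least twice.
zip_counts = Counter(zip_codes)


def failed_checks(zip_code, bedrooms, bathrooms):
    """Return illustrative labels for the checks that one row fails."""
    failed = []
    if len(zip_code) < 3:
        failed.append("zip_code min_length")
    if bedrooms is None:
        failed.append("num_bedrooms not null")
    elif bedrooms == 0 or not (1 / 3 <= bathrooms / bedrooms <= 3):
        # Assumption of this sketch: the ratio is only checked when the
        # number of bedrooms is present.
        failed.append("reasonable_bathroom_to_bedroom_ratio")
    if zip_counts[zip_code] < 2:
        failed.append("minimum_zip_code_count")
    return failed


failures = [
    failed_checks(*row) for row in zip(zip_codes, num_bedrooms, num_bathrooms)
]
valid_rows = [i for i, failed in enumerate(failures) if not failed]

for i, failed in enumerate(failures):
    print(i, failed or "ok")
```

Under these assumptions only the first two rows pass every check; each of the remaining four rows fails at least one column constraint or rule.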

See more advanced usage examples in the documentation.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dataframely-2.1.0.tar.gz (382.1 kB)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

dataframely-2.1.0-cp310-abi3-win_amd64.whl (5.3 MB)

Uploaded CPython 3.10+, Windows x86-64

dataframely-2.1.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (5.3 MB)

Uploaded CPython 3.10+, manylinux: glibc 2.17+ x86-64

dataframely-2.1.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (5.0 MB)

Uploaded CPython 3.10+, manylinux: glibc 2.17+ ARM64

dataframely-2.1.0-cp310-abi3-macosx_11_0_arm64.whl (4.8 MB)

Uploaded CPython 3.10+, macOS 11.0+ ARM64

dataframely-2.1.0-cp310-abi3-macosx_10_12_x86_64.whl (5.2 MB)

Uploaded CPython 3.10+, macOS 10.12+ x86-64

File details

Details for the file dataframely-2.1.0.tar.gz.

File metadata

  • Download URL: dataframely-2.1.0.tar.gz
  • Size: 382.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for dataframely-2.1.0.tar.gz
Algorithm Hash digest
SHA256 dcf67393b445e87100d8a64ea96be572ea29cacb4c836388b9ded5af3dc02514
MD5 df4984ae1bec93741b05ad8e8e6de024
BLAKE2b-256 fb36dfbf5e739b9f18e07dfd23ba8a63c775f5bb0e63e03d3c4012b167581d95

See more details on using hashes here.
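
A downloaded file can be checked against the SHA256 digest listed above before it is installed. The following is a minimal sketch using only the Python standard library; the helper names `sha256_hex` and `matches_sha256` are hypothetical, and the file is assumed to have been downloaded to a local path already.

```python
import hashlib


def sha256_hex(path, chunk_size=1024 * 1024):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large wheels are not loaded into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path, expected_hex):
    """Return True if the file's digest equals the published digest."""
    return sha256_hex(path) == expected_hex.strip().lower()
```

For the source distribution, this would be called as `matches_sha256("dataframely-2.1.0.tar.gz", "dcf67393b445e87100d8a64ea96be572ea29cacb4c836388b9ded5af3dc02514")`, assuming the archive sits in the current directory.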

Provenance

The following attestation bundles were made for dataframely-2.1.0.tar.gz:

Publisher: build.yml on Quantco/dataframely

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file dataframely-2.1.0-cp310-abi3-win_amd64.whl.

File metadata

  • Download URL: dataframely-2.1.0-cp310-abi3-win_amd64.whl
  • Size: 5.3 MB
  • Tags: CPython 3.10+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for dataframely-2.1.0-cp310-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 1a98e6e4801e46a550aac67dd4dcaafe77bfeb0b1148a786a59181cca2dc542e
MD5 d764e003c8c20ce2d25c1ed5b3ddaa7f
BLAKE2b-256 c683209760737dc526cdc141d6505e8e4f42edf743514f00368610149c6810e5

Provenance

The following attestation bundles were made for dataframely-2.1.0-cp310-abi3-win_amd64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.1.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for dataframely-2.1.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 30c2621468c5484010b4e55d13c4b49ca8cdab5f7bf2ee31141dce91f41572fa
MD5 80a37a50c2220ccf1d0fb73e3d19ad39
BLAKE2b-256 3e1d553c8faef4018ec952129b436b9acf5140f6598e905ad4aaf2ae5af03feb

Provenance

The following attestation bundles were made for dataframely-2.1.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.1.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for dataframely-2.1.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 3f28d9326c2c37184c667f5e39191590a501f940d488f89a65c53c8214e7e9ac
MD5 2fcee2fc247f8036a9889c47f6755519
BLAKE2b-256 1f2df7e4c81a93ed9575c3d89c2b357743440388b4dc1f52e5ded9360c286531

Provenance

The following attestation bundles were made for dataframely-2.1.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.1.0-cp310-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for dataframely-2.1.0-cp310-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 f4bfb98a28f0bdd680c602d512e1c2ffc8c2401d949c4f9ab67fab50cc992def
MD5 2dd9a3816b8ee53ff0e2ba497134b29f
BLAKE2b-256 f0530f0fbede52751a2fcb20380cf99cc4955d673eeee7e225c974122009b668

Provenance

The following attestation bundles were made for dataframely-2.1.0-cp310-abi3-macosx_11_0_arm64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.1.0-cp310-abi3-macosx_10_12_x86_64.whl.

File hashes

Hashes for dataframely-2.1.0-cp310-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 fcdd6cc5cd9b8cb5a20043647a296b3bf6d6b496ab4d48ee1fa25850212e3501
MD5 daea4fe6c6de288d5ab00c98a067c52d
BLAKE2b-256 1eceba96f35e80050ec8de147bb8b461cd29a08250b8c187303f3ceafa05772f

Provenance

The following attestation bundles were made for dataframely-2.1.0-cp310-abi3-macosx_10_12_x86_64.whl:

Publisher: build.yml on Quantco/dataframely

