A declarative, polars-native data frame validation library

dataframely — A declarative, 🐻‍❄️-native data frame validation library

📖 Introduction

Dataframely is a Python package for validating the schema and content of polars data frames. It aims to make data pipelines more robust, by ensuring that data meets expectations, and more readable, by adding schema information to data frame type hints.

💿 Installation

You can install dataframely using your favorite package manager, e.g., pixi or pip:

pixi add dataframely
pip install dataframely

🎯 Usage

Defining a data frame schema

import dataframely as dy
import polars as pl

class HouseSchema(dy.Schema):
    # Column definitions: data type plus per-column constraints
    zip_code = dy.String(nullable=False, min_length=3)
    num_bedrooms = dy.UInt8(nullable=False)
    num_bathrooms = dy.UInt8(nullable=False)
    price = dy.Float64(nullable=False)

    # Row-level rule: the expression is evaluated for every row
    @dy.rule()
    def reasonable_bathroom_to_bedroom_ratio(cls) -> pl.Expr:
        ratio = pl.col("num_bathrooms") / pl.col("num_bedrooms")
        return (ratio >= 1 / 3) & (ratio <= 3)

    # Group-level rule: evaluated once per zip_code group,
    # so every zip code must occur at least twice
    @dy.rule(group_by=["zip_code"])
    def minimum_zip_code_count(cls) -> pl.Expr:
        return pl.len() >= 2

Validating data against the schema

import polars as pl

df = pl.DataFrame({
    "zip_code": ["01234", "01234", "1", "213", "123", "213"],
    "num_bedrooms": [2, 2, 1, None, None, 2],
    "num_bathrooms": [1, 2, 1, 1, 0, 8],
    "price": [100_000, 110_000, 50_000, 80_000, 60_000, 160_000]
})

# Validate the data and cast columns to expected types
validated_df: dy.DataFrame[HouseSchema] = HouseSchema.validate(df, cast=True)
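
The sample data above appears to contain rows that violate the schema, so a strict validation call would be expected to reject it. To make the individual checks concrete, here is a minimal plain-Python sketch that re-applies the column constraints and the two rules by hand. It does not use dataframely and is not how the library works internally; the check names, the `failed_checks` helper, and the choice to skip the ratio check when a value is missing are assumptions made for illustration only.

```python
from collections import Counter

# Sample data copied from the example above (price is omitted because no check uses it).
rows = list(zip(
    ["01234", "01234", "1", "213", "123", "213"],  # zip_code
    [2, 2, 1, None, None, 2],                      # num_bedrooms
    [1, 2, 1, 1, 0, 8],                            # num_bathrooms
))

# Group-level information: how often each zip code occurs in the input.
zip_counts = Counter(zip_code for zip_code, _, _ in rows)


def failed_checks(zip_code, bedrooms, bathrooms):
    """Return the (illustrative) names of all checks that one row fails."""
    failures = []
    if len(zip_code) < 3:
        failures.append("zip_code min_length")
    if bedrooms is None:
        failures.append("num_bedrooms nullability")
    if bathrooms is None:
        failures.append("num_bathrooms nullability")
    # Assumption: if the ratio cannot be computed because a value is missing,
    # the row is not counted as a ratio violation.
    if bedrooms is not None and bathrooms is not None:
        # Simplification: zero bedrooms is treated as out of range.
        in_range = bedrooms > 0 and 1 / 3 <= bathrooms / bedrooms <= 3
        if not in_range:
            failures.append("bathroom_to_bedroom_ratio")
    if zip_counts[zip_code] < 2:
        failures.append("minimum_zip_code_count")
    return failures


report = [failed_checks(*row) for row in rows]
valid_rows = [i for i, failures in enumerate(report) if not failures]
failure_counts = Counter(name for failures in report for name in failures)

for i, failures in enumerate(report):
    print(i, failures or "ok")
```

Under these assumptions only the first two rows pass every check. Each remaining row fails at least one column constraint or rule, which is why this input would need to be cleaned or filtered before it can satisfy `HouseSchema`.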

See more advanced usage examples in the documentation.

Download files

Source Distribution

dataframely-2.11.0.tar.gz (455.8 kB)

Uploaded Source

Built Distributions

dataframely-2.11.0-cp310-abi3-win_amd64.whl (8.6 MB)

Uploaded CPython 3.10+ Windows x86-64

dataframely-2.11.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (8.6 MB)

Uploaded CPython 3.10+ manylinux: glibc 2.17+ x86-64

dataframely-2.11.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (8.0 MB)

Uploaded CPython 3.10+ manylinux: glibc 2.17+ ARM64

dataframely-2.11.0-cp310-abi3-macosx_11_0_arm64.whl (7.6 MB)

Uploaded CPython 3.10+ macOS 11.0+ ARM64

dataframely-2.11.0-cp310-abi3-macosx_10_12_x86_64.whl (8.2 MB)

Uploaded CPython 3.10+ macOS 10.12+ x86-64

File details

Details for the file dataframely-2.11.0.tar.gz.

File metadata

  • Download URL: dataframely-2.11.0.tar.gz
  • Upload date:
  • Size: 455.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for dataframely-2.11.0.tar.gz
Algorithm Hash digest
SHA256 91b368ce7b3d0408d9f181f7967e316eaff9e56fb13493a1e6de071ca1df342d
MD5 3533e218fa709765bd823a36440d3f4e
BLAKE2b-256 6d1894278123698c1947b39c0e9edeb3b75ea78884f0718c2bff6d1292257ca0

See more details on using hashes here.

Provenance

The following attestation bundles were made for dataframely-2.11.0.tar.gz:

Publisher: build.yml on Quantco/dataframely

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file dataframely-2.11.0-cp310-abi3-win_amd64.whl.

File hashes

Hashes for dataframely-2.11.0-cp310-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 aa62ca3dfcb4396d2045319db3d1fbaba96c7bdffcdc0f76bd47eca5a8a6e4e4
MD5 1df6624043c66fe112ae652f4fca9aef
BLAKE2b-256 94f81d12d6b0e2d93250bab7febff817f7e753917ca36ba4356c82ef48619748

Provenance

The following attestation bundles were made for dataframely-2.11.0-cp310-abi3-win_amd64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.11.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for dataframely-2.11.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 4036fcf7a7b5c68c62f755d9b4e183786df63ef23defb0ceae0079cf8c09203a
MD5 eb040501112b4dbb43a0f7a82fc0a4c0
BLAKE2b-256 07bd2088125322507287a75b7aaca9ae84b3caf4148ee71bfd307eb2f2e12f69

Provenance

The following attestation bundles were made for dataframely-2.11.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.11.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for dataframely-2.11.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 7ca6ab8deee474e0a6bea18abd5c14d3afc5797dccbaf24f7292cda8da624942
MD5 96f859543a1f71bae6d8274650209703
BLAKE2b-256 5f4c000909af5242d7093243a1a6d183b82a80e27e682fbe57c90e51a0cf6e4f

Provenance

The following attestation bundles were made for dataframely-2.11.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.11.0-cp310-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for dataframely-2.11.0-cp310-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 0de2ff87b01c46b0691d91a0fc40c2e134b4be1eabe447e8ecff47a56119d5a6
MD5 882d32e864e40e0a4428a979e1f2b7c9
BLAKE2b-256 621d7ea662c2fca127420207eecfdd7d491e1cdc60cae4509fa3262c531fd940

Provenance

The following attestation bundles were made for dataframely-2.11.0-cp310-abi3-macosx_11_0_arm64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-2.11.0-cp310-abi3-macosx_10_12_x86_64.whl.

File hashes

Hashes for dataframely-2.11.0-cp310-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 9a7d15a4831a1a7715afeac1673ec9ce3d6cb6ef09070df9ec0f4ac54fc23ff9
MD5 e55116c5f2e101bbc1baa1d231cc93a1
BLAKE2b-256 57c2dce9ada639bfef04ea7369382f7a1c705c987d407f76a0e80e5b8b89d532

Provenance

The following attestation bundles were made for dataframely-2.11.0-cp310-abi3-macosx_10_12_x86_64.whl:

Publisher: build.yml on Quantco/dataframely
