A declarative, polars-native data frame validation library

dataframely — A declarative, 🐻‍❄️-native data frame validation library

📖 Introduction

Dataframely is a Python package to validate the schema and content of polars data frames. Its purpose is to make data pipelines more robust by ensuring that data meets expectations and more readable by adding schema information to data frame type hints.

💿 Installation

You can install dataframely using your favorite package manager, e.g., pixi or pip:

pixi add dataframely
pip install dataframely
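
After installing, it can be handy to confirm that the package is visible to the interpreter you intend to use. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of dataframely.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string when dataframely is installed, otherwise None.
print(installed_version("dataframely"))
```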

🎯 Usage

Defining a data frame schema

import dataframely as dy
import polars as pl

class HouseSchema(dy.Schema):
    zip_code = dy.String(nullable=False, min_length=3)
    num_bedrooms = dy.UInt8(nullable=False)
    num_bathrooms = dy.UInt8(nullable=False)
    price = dy.Float64(nullable=False)

    @dy.rule()
    def reasonable_bathroom_to_bedroom_ratio() -> pl.Expr:
        ratio = pl.col("num_bathrooms") / pl.col("num_bedrooms")
        return (ratio >= 1 / 3) & (ratio <= 3)

    @dy.rule(group_by=["zip_code"])
    def minimum_zip_code_count() -> pl.Expr:
        return pl.len() >= 2
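
To make the intent of the two rules concrete, here is a dependency-free sketch in plain Python. It is an illustration only and does not reflect how dataframely evaluates rules (the library works with polars expressions); the helper names `ratio_ok` and `zip_code_count_ok` are hypothetical.

```python
from collections import Counter


def ratio_ok(num_bathrooms: int, num_bedrooms: int) -> bool:
    """Row-level check: bathrooms per bedroom must lie in [1/3, 3]."""
    ratio = num_bathrooms / num_bedrooms
    return 1 / 3 <= ratio <= 3


def zip_code_count_ok(zip_codes: list[str], minimum: int = 2) -> list[bool]:
    """Group-level check: each row inherits the verdict of its zip code group."""
    counts = Counter(zip_codes)
    return [counts[z] >= minimum for z in zip_codes]


print(ratio_ok(1, 2))  # True (ratio 0.5)
print(ratio_ok(8, 2))  # False (ratio 4.0)
print(zip_code_count_ok(["01234", "01234", "213"]))  # [True, True, False]
```

The first function looks at one row at a time, while the second needs the whole column to decide about any single row, which is the distinction the `group_by` argument expresses in the schema.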

Validating data against the schema

import dataframely as dy
import polars as pl

# HouseSchema is the schema class defined in the previous snippet.

df = pl.DataFrame({
    "zip_code": ["01234", "01234", "1", "213", "123", "213"],
    "num_bedrooms": [2, 2, 1, None, None, 2],
    "num_bathrooms": [1, 2, 1, 1, 0, 8],
    "price": [100_000, 110_000, 50_000, 80_000, 60_000, 160_000]
})

# Validate the data and cast columns to the expected types;
# this raises an exception if any row violates the schema
validated_df: dy.DataFrame[HouseSchema] = HouseSchema.validate(df, cast=True)
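
The sample frame above contains several rows that break the schema. The plain-Python sketch below tallies which checks each row would fail, mirroring the column constraints and rules as written in `HouseSchema`. It does not call dataframely, the check labels are made up for this illustration, and skipping the ratio rule when `num_bedrooms` is missing is an assumption of the sketch rather than a statement about the library's behavior.

```python
from collections import Counter

# Same values as the sample frame; price is omitted because no check here depends on it.
data = {
    "zip_code": ["01234", "01234", "1", "213", "123", "213"],
    "num_bedrooms": [2, 2, 1, None, None, 2],
    "num_bathrooms": [1, 2, 1, 1, 0, 8],
}

zip_counts = Counter(data["zip_code"])
failures = Counter()  # check label -> number of failing rows
valid_rows = []       # indices of rows that pass every check

columns = zip(data["zip_code"], data["num_bedrooms"], data["num_bathrooms"])
for index, (zip_code, bedrooms, bathrooms) in enumerate(columns):
    failed = []
    if len(zip_code) < 3:
        failed.append("zip_code min_length")
    if bedrooms is None:
        failed.append("num_bedrooms not null")
    # Assumption of this sketch: the ratio rule is only checked when bedrooms is present.
    elif not (1 / 3 <= bathrooms / bedrooms <= 3):
        failed.append("bathroom to bedroom ratio")
    if zip_counts[zip_code] < 2:
        failed.append("zip_code count")
    failures.update(failed)
    if not failed:
        valid_rows.append(index)

print(dict(failures))
print(valid_rows)  # [0, 1]
```

Under these assumptions only the first two rows satisfy every check, so validating the sample frame as given should not be expected to succeed.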

See more advanced usage examples in the documentation.
