dataframely — A declarative, 🐻‍❄️-native data frame validation library

📖 Introduction

Dataframely is a Python package for validating the schema and content of polars data frames. It aims to make data pipelines more robust, by ensuring that data meets expectations, and more readable, by adding schema information to data frame type hints.

💿 Installation

You can install dataframely using your favorite package manager, e.g., pixi or pip:

pixi add dataframely
pip install dataframely

🎯 Usage

Defining a data frame schema

import dataframely as dy
import polars as pl

class HouseSchema(dy.Schema):
    zip_code = dy.String(nullable=False, min_length=3)
    num_bedrooms = dy.UInt8(nullable=False)
    num_bathrooms = dy.UInt8(nullable=False)
    price = dy.Float64(nullable=False)

    @dy.rule()
    def reasonable_bathroom_to_bedroom_ratio() -> pl.Expr:
        ratio = pl.col("num_bathrooms") / pl.col("num_bedrooms")
        return (ratio >= 1 / 3) & (ratio <= 3)

    @dy.rule(group_by=["zip_code"])
    def minimum_zip_code_count() -> pl.Expr:
        return pl.len() >= 2
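
To make the two rules concrete, here is a dependency-free sketch of what they check, written in plain Python. It is only an illustration of the rule semantics, not how dataframely evaluates them (the library works with polars expressions). The sample rows and the helper names `ratio_ok` and `zip_count_ok` are hypothetical, and the rows contain no missing values so the sketch does not depend on how nulls are treated.

```python
from collections import Counter

# Hypothetical sample rows (not taken from the README).
rows = [
    {"zip_code": "01234", "num_bedrooms": 2, "num_bathrooms": 1},
    {"zip_code": "01234", "num_bedrooms": 2, "num_bathrooms": 2},
    {"zip_code": "213", "num_bedrooms": 2, "num_bathrooms": 8},
]


def ratio_ok(row):
    """Row-wise rule: bathrooms / bedrooms must lie in [1/3, 3]."""
    ratio = row["num_bathrooms"] / row["num_bedrooms"]
    return 1 / 3 <= ratio <= 3


# Group rule: count rows per zip code once, then look the count up per row.
zip_counts = Counter(row["zip_code"] for row in rows)


def zip_count_ok(row):
    """Group rule: the row's zip code must occur in at least two rows."""
    return zip_counts[row["zip_code"]] >= 2


# One (ratio rule, group rule) pair per row.
results = [(ratio_ok(r), zip_count_ok(r)) for r in rows]
print(results)  # [(True, True), (True, True), (False, False)]
```

The third row fails both checks: its ratio is 4, and its zip code appears only once. The group rule is the reason `group_by=["zip_code"]` is passed to the decorator: the condition is evaluated per group rather than per row.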

Validating data against the schema

import polars as pl

df = pl.DataFrame({
    "zip_code": ["01234", "01234", "1", "213", "123", "213"],
    "num_bedrooms": [2, 2, 1, None, None, 2],
    "num_bathrooms": [1, 2, 1, 1, 0, 8],
    "price": [100_000, 110_000, 50_000, 80_000, 60_000, 160_000]
})

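# Validate the data and cast columns to the expected types.
# Note: the sample data above contains invalid rows (for example, zip code "1"
# is shorter than 3 characters and two `num_bedrooms` values are null), so
# `validate` raises an error here instead of returning a data frame.
validated_df: dy.DataFrame[HouseSchema] = HouseSchema.validate(df, cast=True)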

See more advanced usage examples in the documentation.

