A declarative, polars-native data frame validation library

dataframely — A declarative, 🐻‍❄️-native data frame validation library

📖 Introduction

Dataframely is a Python package for validating the schema and content of polars data frames. It aims to make data pipelines more robust, by ensuring that data meets expectations, and more readable, by adding schema information to data frame type hints.

💿 Installation

You can install dataframely using your favorite package manager, e.g., pixi or pip:

pixi add dataframely
pip install dataframely

🎯 Usage

Defining a data frame schema

import dataframely as dy
import polars as pl

class HouseSchema(dy.Schema):
    zip_code = dy.String(nullable=False, min_length=3)
    num_bedrooms = dy.UInt8(nullable=False)
    num_bathrooms = dy.UInt8(nullable=False)
    price = dy.Float64(nullable=False)

    @dy.rule()
    def reasonable_bathroom_to_bedroom_ratio() -> pl.Expr:
        ratio = pl.col("num_bathrooms") / pl.col("num_bedrooms")
        return (ratio >= 1 / 3) & (ratio <= 3)

    @dy.rule(group_by=["zip_code"])
    def minimum_zip_code_count() -> pl.Expr:
        return pl.len() >= 2
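
To make the intent of the two custom rules concrete, here is a small plain-Python sketch that mirrors their logic on a few hand-written rows. It is not how dataframely evaluates rules internally; the names `rows`, `ratio_ok`, and `zip_count_ok` are hypothetical, the rows contain no nulls, and the sketch assumes that the grouped rule is computed once per `zip_code` group with its result applied to every row in that group.

```python
from collections import Counter

# Illustrative rows: (zip_code, num_bedrooms, num_bathrooms). Hypothetical data.
rows = [
    ("01234", 2, 1),
    ("01234", 2, 2),
    ("213", 2, 8),
    ("555", 3, 2),
]


def ratio_ok(bedrooms: int, bathrooms: int) -> bool:
    """Row-level check: bathrooms / bedrooms must lie in [1/3, 3]."""
    ratio = bathrooms / bedrooms
    return 1 / 3 <= ratio <= 3


# Group-level check: count rows per zip code once, then look the count up per row.
zip_counts = Counter(zip_code for zip_code, _, _ in rows)


def zip_count_ok(zip_code: str) -> bool:
    """Group-level check: the zip code must occur at least twice."""
    return zip_counts[zip_code] >= 2


# One (zip_code, ratio check, group check) tuple per input row.
results = [
    (zip_code, ratio_ok(bedrooms, bathrooms), zip_count_ok(zip_code))
    for zip_code, bedrooms, bathrooms in rows
]
```

In this sketch a row passes only if both checks are true, so the two "01234" rows pass while the "213" and "555" rows do not.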

Validating data against the schema

import polars as pl

df = pl.DataFrame({
    "zip_code": ["01234", "01234", "1", "213", "123", "213"],
    "num_bedrooms": [2, 2, 1, None, None, 2],
    "num_bathrooms": [1, 2, 1, 1, 0, 8],
    "price": [100_000, 110_000, 50_000, 80_000, 60_000, 160_000]
})

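# Validate the data and cast columns to expected types.
# Note: validate() raises an exception if any row violates the schema,
# as several of the sample rows above do (e.g., zip code "1" is too short).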
validated_df: dy.DataFrame[HouseSchema] = HouseSchema.validate(df, cast=True)

See more advanced usage examples in the documentation.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • dataframely-1.1.0.tar.gz (247.2 kB): Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

  • dataframely-1.1.0-cp311-abi3-win_amd64.whl (432.9 kB): CPython 3.11+, Windows x86-64
  • dataframely-1.1.0-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (626.9 kB): CPython 3.11+, manylinux: glibc 2.17+ x86-64
  • dataframely-1.1.0-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (626.5 kB): CPython 3.11+, manylinux: glibc 2.17+ ARM64
  • dataframely-1.1.0-cp311-abi3-macosx_11_0_arm64.whl (547.0 kB): CPython 3.11+, macOS 11.0+ ARM64
  • dataframely-1.1.0-cp311-abi3-macosx_10_12_x86_64.whl (557.6 kB): CPython 3.11+, macOS 10.12+ x86-64

File details

Details for the file dataframely-1.1.0.tar.gz.

File metadata

  • Download URL: dataframely-1.1.0.tar.gz
  • Upload date:
  • Size: 247.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for dataframely-1.1.0.tar.gz
Algorithm Hash digest
SHA256 b7a7be69e4fe0a4b3157514ac58312b3520c98bd9931672749a520e82ed80fbd
MD5 9d3d2561e6545191a6b572ae13c23cb6
BLAKE2b-256 557a4a5e4e2bd0cdb7d5addd642fde5c4335db4bc35ea6164bdddf7a8b71db2e

See more details on using hashes here.
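
One way to check a downloaded file against a SHA256 digest listed on this page is to recompute the digest locally and compare. The following is a minimal sketch using only Python's standard library; the helper names `sha256_of_file` and `matches_published_digest` are illustrative and not part of dataframely or any PyPI tooling.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Assuming the source distribution was saved to the current directory, calling `matches_published_digest("dataframely-1.1.0.tar.gz", "b7a7be69e4fe0a4b3157514ac58312b3520c98bd9931672749a520e82ed80fbd")` should return `True` for an intact download.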

Provenance

The following attestation bundles were made for dataframely-1.1.0.tar.gz:

Publisher: build.yml on Quantco/dataframely

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file dataframely-1.1.0-cp311-abi3-win_amd64.whl.

File metadata

  • Download URL: dataframely-1.1.0-cp311-abi3-win_amd64.whl
  • Upload date:
  • Size: 432.9 kB
  • Tags: CPython 3.11+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for dataframely-1.1.0-cp311-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 220d63ecc9abe6071d92ae12be7e955ffde8da93d3b8ee239a7d048c8160e4f7
MD5 1127c155738ea06945095ef6bfe82b99
BLAKE2b-256 f630a07de889afa15037815da2dd3c4cc53794efae449414ddfd940ab6f4ad4a

Provenance

The following attestation bundles were made for dataframely-1.1.0-cp311-abi3-win_amd64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.1.0-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for dataframely-1.1.0-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 6422b36f0822872d47f01d20975e0e575823032403a411cbd09006301d5e7211
MD5 e62448d2f37457a7d59b1afcd8fbf160
BLAKE2b-256 174907c00abf2649fdfe9502b528ac27a09b399296437a74309dc05fae306748

Provenance

The following attestation bundles were made for dataframely-1.1.0-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.1.0-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for dataframely-1.1.0-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 602fe7e676df342848d304d2c7cba436bd3641665d79de1385742d0736c09fd5
MD5 ec4f3bff905da3faa1cbecd073f221d9
BLAKE2b-256 021a0cd5e30d0c6f492a4fa862d6d2fb59d54d53a1ce207c51caa2e262c45ce9

Provenance

The following attestation bundles were made for dataframely-1.1.0-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.1.0-cp311-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for dataframely-1.1.0-cp311-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 3b2144bc7ea064d06aff6693021d1f9dc27fdfb529ad148b39d0ad8bb7de2611
MD5 658195e28d8ba7330efee0dc144eeda2
BLAKE2b-256 65e80984f7b327efeb9f416f815b0f1b473347aa7014ca80fcc0dd09b3cc0726

Provenance

The following attestation bundles were made for dataframely-1.1.0-cp311-abi3-macosx_11_0_arm64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.1.0-cp311-abi3-macosx_10_12_x86_64.whl.

File hashes

Hashes for dataframely-1.1.0-cp311-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 e18e4057fb5624ea2a3f88378b76ebb2985e2dd9421fdc436688d39118cf2d4f
MD5 a941c30f597b7c533f4122d0edb5f94f
BLAKE2b-256 dda6f2a22ee18c97140a5b3407a20f0e07a54af8140861b91a968e410b6f5e38

Provenance

The following attestation bundles were made for dataframely-1.1.0-cp311-abi3-macosx_10_12_x86_64.whl:

Publisher: build.yml on Quantco/dataframely
