dataframely — A declarative, 🐻‍❄️-native data frame validation library

📖 Introduction

Dataframely is a Python package to validate the schema and content of polars data frames. Its purpose is to make data pipelines more robust by ensuring that data meets expectations and more readable by adding schema information to data frame type hints.

💿 Installation

You can install dataframely using your favorite package manager, e.g., pixi or pip:

pixi add dataframely
pip install dataframely

🎯 Usage

Defining a data frame schema

import dataframely as dy
import polars as pl

class HouseSchema(dy.Schema):
    # Column definitions: data type plus per-column constraints
    zip_code = dy.String(nullable=False, min_length=3)
    num_bedrooms = dy.UInt8(nullable=False)
    num_bathrooms = dy.UInt8(nullable=False)
    price = dy.Float64(nullable=False)

    # Row-level rule: the expression is evaluated for every row
    @dy.rule()
    def reasonable_bathroom_to_bedroom_ratio() -> pl.Expr:
        ratio = pl.col("num_bathrooms") / pl.col("num_bedrooms")
        return (ratio >= 1 / 3) & (ratio <= 3)

    # Group-level rule: the expression is evaluated once per zip code
    @dy.rule(group_by=["zip_code"])
    def minimum_zip_code_count() -> pl.Expr:
        return pl.len() >= 2

Validating data against schema

import polars as pl

df = pl.DataFrame({
    "zip_code": ["01234", "01234", "1", "213", "123", "213"],
    "num_bedrooms": [2, 2, 1, None, None, 2],
    "num_bathrooms": [1, 2, 1, 1, 0, 8],
    "price": [100_000, 110_000, 50_000, 80_000, 60_000, 160_000]
})

# Validate the data and cast columns to expected types
validated_df: dy.DataFrame[HouseSchema] = HouseSchema.validate(df, cast=True)
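
To make the two custom rules concrete, the following plain-Python sketch evaluates the same conditions on the example rows. It deliberately uses neither dataframely nor polars, so it only illustrates the logic of the checks and not how the library evaluates or reports them. The helper name `ratio_ok` and the choice to return `None` for missing or zero bedroom counts are assumptions made for this illustration.

```python
from collections import Counter

# Example rows copied from the data frame above (None marks a missing value).
zip_codes = ["01234", "01234", "1", "213", "123", "213"]
num_bedrooms = [2, 2, 1, None, None, 2]
num_bathrooms = [1, 2, 1, 1, 0, 8]


def ratio_ok(bathrooms, bedrooms):
    """Plain-Python mirror of the ratio rule: 1/3 <= bathrooms / bedrooms <= 3."""
    if bathrooms is None or bedrooms is None or bedrooms == 0:
        # Assumption for this sketch only: treat the result as unknown.
        return None
    ratio = bathrooms / bedrooms
    return 1 / 3 <= ratio <= 3


# Row-level rule: one result per row.
ratio_results = [ratio_ok(ba, be) for ba, be in zip(num_bathrooms, num_bedrooms)]

# Group-level rule: each zip code must occur in at least two rows.
counts = Counter(zip_codes)
group_results = [counts[z] >= 2 for z in zip_codes]

print(ratio_results)  # [True, True, True, None, None, False]
print(group_results)  # [True, True, False, True, False, True]
```

Only the first two rows satisfy every constraint declared in the schema: the third row has a zip code shorter than three characters, the fourth and fifth rows have a missing `num_bedrooms` value in a non-nullable column, and the last row has a bathroom-to-bedroom ratio of 4.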

See more advanced usage examples in the documentation.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dataframely-1.6.0.tar.gz (261.3 kB)

Uploaded: Source

Built Distributions

dataframely-1.6.0-cp311-abi3-win_amd64.whl (410.2 kB)

Uploaded: CPython 3.11+, Windows x86-64

dataframely-1.6.0-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (544.9 kB)

Uploaded: CPython 3.11+, manylinux: glibc 2.17+ x86-64

dataframely-1.6.0-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (537.3 kB)

Uploaded: CPython 3.11+, manylinux: glibc 2.17+ ARM64

dataframely-1.6.0-cp311-abi3-macosx_11_0_arm64.whl (488.4 kB)

Uploaded: CPython 3.11+, macOS 11.0+ ARM64

dataframely-1.6.0-cp311-abi3-macosx_10_12_x86_64.whl (505.0 kB)

Uploaded: CPython 3.11+, macOS 10.12+ x86-64
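
When choosing between the wheels listed above, it can help to read the tags encoded in each file name. The sketch below splits a wheel file name into its dash-separated fields following the standard wheel naming convention (distribution, version, Python tag, ABI tag, platform tag). The names `WheelTags` and `parse_wheel_filename` are hypothetical helpers for this illustration, and the sketch assumes file names without the optional build tag, which holds for every wheel in this release.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel file name (without a build tag) into its five fields."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) != 5:
        raise ValueError(f"expected 5 dash-separated fields, got {len(parts)}")
    return WheelTags(*parts)


tags = parse_wheel_filename("dataframely-1.6.0-cp311-abi3-win_amd64.whl")
print(tags.python_tag, tags.abi_tag, tags.platform_tag)  # cp311 abi3 win_amd64
```

Here `cp311` together with `abi3` corresponds to the "CPython 3.11+" label shown next to each wheel, and the platform tag identifies the operating system and CPU architecture.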

File details

Details for the file dataframely-1.6.0.tar.gz.

File metadata

  • Download URL: dataframely-1.6.0.tar.gz
  • Size: 261.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for dataframely-1.6.0.tar.gz
Algorithm Hash digest
SHA256 51f2158dc116a04f0d6466e5db6be418773c46981b43ef89537a8fdbc7ea4f76
MD5 5a1f7e3c75eb884703ecaf4d247cd872
BLAKE2b-256 b30916944959f7570d1d48919e6860002fd0a4f06dc3a3ae0fb9d57d3f67d1a4

See more details on using hashes here.
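
As an illustration of how the published digests can be used, the following sketch computes the SHA256 digest of a downloaded file with Python's standard `hashlib` module and compares it with the value listed above for `dataframely-1.6.0.tar.gz`. The local file path and the helper names `sha256_of` and `matches_published_hash` are assumptions made for this example.

```python
import hashlib
from pathlib import Path

# SHA256 digest published above for dataframely-1.6.0.tar.gz.
EXPECTED_SHA256 = "51f2158dc116a04f0d6466e5db6be418773c46981b43ef89537a8fdbc7ea4f76"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Path, expected: str = EXPECTED_SHA256) -> bool:
    """Compare a file's SHA256 digest with the expected hex digest."""
    return sha256_of(path) == expected.lower()


if __name__ == "__main__":
    # Hypothetical local path; adjust to wherever the archive was saved.
    archive = Path("dataframely-1.6.0.tar.gz")
    if archive.exists():
        print("hash OK" if matches_published_hash(archive) else "hash MISMATCH")
```

The same approach applies to the wheel files by substituting the corresponding SHA256 value from their own hash listings.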

Provenance

The following attestation bundles were made for dataframely-1.6.0.tar.gz:

Publisher: build.yml on Quantco/dataframely

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file dataframely-1.6.0-cp311-abi3-win_amd64.whl.

File metadata

  • Download URL: dataframely-1.6.0-cp311-abi3-win_amd64.whl
  • Size: 410.2 kB
  • Tags: CPython 3.11+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for dataframely-1.6.0-cp311-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 ecbb4cbcaf1d308459113bd6add0680ef84bd13074b6b74501a70e21878ea0c4
MD5 688a0303e557f5e03669fea9f4789c49
BLAKE2b-256 a3dc446378f9f4b255798eacb7f5da3156a0d7eb723fe84fb9fcd08dde5f3921

Provenance

The following attestation bundles were made for dataframely-1.6.0-cp311-abi3-win_amd64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.6.0-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for dataframely-1.6.0-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 01d143ebbc6836d42ec987a00fcbe67a4e35a9dbbdf7e334719d7b5f3fc6f65a
MD5 ced22e6582abcb96b04b97d278600262
BLAKE2b-256 f525c8ddfcaafa8bac5ecc40b4913660b27309f04a008511b312e1a6d15d9e91

Provenance

The following attestation bundles were made for dataframely-1.6.0-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.6.0-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for dataframely-1.6.0-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 83ce064613bd4caea856698c66a5c2e8f0ae8401d59b7474a4d617bc2869fd51
MD5 3f2b7f966693077b03c027aeff9602b1
BLAKE2b-256 db65309c9ef696a406dd91d6bed70eb0ce90c7e141ca52a15dc163fb209faebd

Provenance

The following attestation bundles were made for dataframely-1.6.0-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.6.0-cp311-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for dataframely-1.6.0-cp311-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 dcca445d0827795084718c5316f8cda7e696c1d587b4561700c099f0c7488a85
MD5 3f537ffcdd32c0f73e85e273e3c14ea9
BLAKE2b-256 585196afa03beaa2f8d3e1a0ae6eaa774a3f4ba9b96e9d8c03cb08605c87eb7d

Provenance

The following attestation bundles were made for dataframely-1.6.0-cp311-abi3-macosx_11_0_arm64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.6.0-cp311-abi3-macosx_10_12_x86_64.whl.

File hashes

Hashes for dataframely-1.6.0-cp311-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 63abdafb2776a493550bca2c08917b4e67d4cf4b4ebcdd16bda93698312f1333
MD5 8f46db53192c08004525ef7d3c9d44d3
BLAKE2b-256 fe0f1ebff6ecaceb84a1ce49713adf9b07de481b1590e13dd9a5cd44ff794687

Provenance

The following attestation bundles were made for dataframely-1.6.0-cp311-abi3-macosx_10_12_x86_64.whl:

Publisher: build.yml on Quantco/dataframely
