A declarative, polars-native data frame validation library

dataframely — A declarative, 🐻‍❄️-native data frame validation library

📖 Introduction

Dataframely is a Python package for validating the schema and content of polars data frames. It aims to make data pipelines more robust by ensuring that data meets expectations, and more readable by adding schema information to data frame type hints.

💿 Installation

You can install dataframely using your favorite package manager, e.g., pixi or pip:

pixi add dataframely
pip install dataframely
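
After installing, it can be useful to confirm which version ended up in the active environment. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of dataframely.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string such as "1.4.0" if the package is installed, else None.
print(installed_version("dataframely"))
```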

🎯 Usage

Defining a data frame schema

import dataframely as dy
import polars as pl

class HouseSchema(dy.Schema):
    zip_code = dy.String(nullable=False, min_length=3)
    num_bedrooms = dy.UInt8(nullable=False)
    num_bathrooms = dy.UInt8(nullable=False)
    price = dy.Float64(nullable=False)

    @dy.rule()
    def reasonable_bathroom_to_bedroom_ratio() -> pl.Expr:
        ratio = pl.col("num_bathrooms") / pl.col("num_bedrooms")
        return (ratio >= 1 / 3) & (ratio <= 3)

    @dy.rule(group_by=["zip_code"])
    def minimum_zip_code_count() -> pl.Expr:
        return pl.len() >= 2

Validating data against schema

import polars as pl

df = pl.DataFrame({
    "zip_code": ["01234", "01234", "1", "213", "123", "213"],
    "num_bedrooms": [2, 2, 1, None, None, 2],
    "num_bathrooms": [1, 2, 1, 1, 0, 8],
    "price": [100_000, 110_000, 50_000, 80_000, 60_000, 160_000]
})

# Validate the data and cast columns to expected types
validated_df: dy.DataFrame[HouseSchema] = HouseSchema.validate(df, cast=True)
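
The two custom rules are easier to follow with concrete numbers. The sketch below re-implements their logic in plain Python, without dataframely or polars, and applies it to the sample rows above. The helper names `ratio_ok` and `zip_code_count_ok` are hypothetical, and returning `None` for rows with a missing value is a choice made for this sketch, not a statement about how dataframely treats nulls.

```python
from collections import Counter

# Sample rows copied from the data frame above (price omitted: no custom rule uses it).
rows = [
    {"zip_code": "01234", "num_bedrooms": 2, "num_bathrooms": 1},
    {"zip_code": "01234", "num_bedrooms": 2, "num_bathrooms": 2},
    {"zip_code": "1", "num_bedrooms": 1, "num_bathrooms": 1},
    {"zip_code": "213", "num_bedrooms": None, "num_bathrooms": 1},
    {"zip_code": "123", "num_bedrooms": None, "num_bathrooms": 0},
    {"zip_code": "213", "num_bedrooms": 2, "num_bathrooms": 8},
]


def ratio_ok(row):
    """Row-wise rule: bathrooms / bedrooms must lie in [1/3, 3]."""
    bedrooms, bathrooms = row["num_bedrooms"], row["num_bathrooms"]
    if bedrooms is None or bathrooms is None:
        return None  # sketch assumption: undecided when a value is missing
    if bedrooms == 0:
        return False  # sketch assumption: avoid dividing by zero
    ratio = bathrooms / bedrooms
    return 1 / 3 <= ratio <= 3


def zip_code_count_ok(rows):
    """Group rule: every zip code must occur in at least two rows."""
    counts = Counter(row["zip_code"] for row in rows)
    return [counts[row["zip_code"]] >= 2 for row in rows]


ratio_results = [ratio_ok(row) for row in rows]
group_results = zip_code_count_ok(rows)

for row, r_ok, g_ok in zip(rows, ratio_results, group_results):
    print(row["zip_code"], "ratio:", r_ok, "zip count:", g_ok)
```

In this sketch the last row fails the ratio check (8 bathrooms for 2 bedrooms gives a ratio of 4), and the rows with zip codes "1" and "123" fail the group check because each zip code appears only once.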

See more advanced usage examples in the documentation.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dataframely-1.4.0.tar.gz (252.1 kB)

Tags: Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

dataframely-1.4.0-cp311-abi3-win_amd64.whl (403.2 kB)

Tags: CPython 3.11+, Windows x86-64

dataframely-1.4.0-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (538.4 kB)

Tags: CPython 3.11+, manylinux: glibc 2.17+ x86-64

dataframely-1.4.0-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (530.9 kB)

Tags: CPython 3.11+, manylinux: glibc 2.17+ ARM64

dataframely-1.4.0-cp311-abi3-macosx_11_0_arm64.whl (481.1 kB)

Tags: CPython 3.11+, macOS 11.0+ ARM64

dataframely-1.4.0-cp311-abi3-macosx_10_12_x86_64.whl (494.5 kB)

Tags: CPython 3.11+, macOS 10.12+ x86-64
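
The wheel file names above follow the standard wheel naming scheme: distribution, version, an optional build tag, then Python, ABI, and platform tags, separated by hyphens. As a rough illustration, the sketch below splits a file name into these parts using only the standard library. The `WheelName` type and `parse_wheel_filename` helper are hypothetical names written for this example and do no validation beyond counting the fields.

```python
from typing import List, NamedTuple, Optional


class WheelName(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]
    python_tags: List[str]
    abi_tags: List[str]
    platform_tags: List[str]


def parse_wheel_filename(filename: str) -> WheelName:
    """Split a wheel file name into its hyphen-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        dist, ver, py, abi, plat = parts
        build = None
    elif len(parts) == 6:
        dist, ver, build, py, abi, plat = parts
    else:
        raise ValueError(f"unexpected number of fields in {filename!r}")
    # Each tag field may hold several alternatives joined by dots.
    return WheelName(dist, ver, build, py.split("."), abi.split("."), plat.split("."))


wheel = parse_wheel_filename(
    "dataframely-1.4.0-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"
)
print(wheel.python_tags, wheel.abi_tags, wheel.platform_tags)
```

Here `cp311` together with `abi3` indicates a wheel built against CPython's stable ABI, which is why a single file is listed for CPython 3.11+. For real tooling, the `packaging` library's `packaging.utils.parse_wheel_filename` is likely a better choice than a hand-written parser.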

File details

Details for the file dataframely-1.4.0.tar.gz.

File metadata

  • Download URL: dataframely-1.4.0.tar.gz
  • Size: 252.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for dataframely-1.4.0.tar.gz
Algorithm Hash digest
SHA256 36cea41d1841166821eaacc3f163806f35c3b683a0c250acf398fe88b04d12e6
MD5 e1606cd9ea160a052bb7959b5b67b6c7
BLAKE2b-256 090ff8aa235d55c668684e2f49450252a95bda2777cc4df10e860d75c5664994

See more details on using hashes here.
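
One common use of the published digests is to check a manually downloaded file before installing it. The following sketch computes a file's SHA256 digest with the standard library and compares it with the value listed above. The helper names `sha256_of` and `matches_published_hash` are hypothetical, and the local path is an assumption about where the archive was saved.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()


# SHA256 digest listed above for dataframely-1.4.0.tar.gz.
EXPECTED = "36cea41d1841166821eaacc3f163806f35c3b683a0c250acf398fe88b04d12e6"

archive = Path("dataframely-1.4.0.tar.gz")  # assumed local download location
if archive.exists():
    print("hash matches:", matches_published_hash(archive, EXPECTED))
```

When installing with pip, a requirements file can also pin a package to a digest with `--hash=sha256:...`, in which case pip performs this comparison itself.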

Provenance

The following attestation bundles were made for dataframely-1.4.0.tar.gz:

Publisher: build.yml on Quantco/dataframely

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file dataframely-1.4.0-cp311-abi3-win_amd64.whl.

File metadata

  • Download URL: dataframely-1.4.0-cp311-abi3-win_amd64.whl
  • Size: 403.2 kB
  • Tags: CPython 3.11+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for dataframely-1.4.0-cp311-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 eb1e5262159dc82a8b39bf50e357644fb59e22f533054e05638ace4a2d5aafeb
MD5 bdd7e1e8711ebd5f8fb7253e10c64f3e
BLAKE2b-256 020f02192a936fb068fa0c338ac8e1c5975b50be2de6f2fb218f39f6a2273918

Provenance

The following attestation bundles were made for dataframely-1.4.0-cp311-abi3-win_amd64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.4.0-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for dataframely-1.4.0-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 bcf1d58590853a5049d85768cbf2698468112ec7aeeca37215332bbf435d4ab1
MD5 df85917ea9c3692e9f9537416838afa4
BLAKE2b-256 3cfe0106217a825e819c0680a365d46e80e3ebe7bfbec43497fb750f117ba530

Provenance

The following attestation bundles were made for dataframely-1.4.0-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.4.0-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for dataframely-1.4.0-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 3970c41e025c750c06e160cbb8695cbbf7b63f716bf0216871aa9135fda8ace1
MD5 269a74c8b6217941cae7a7ae9dd4aba9
BLAKE2b-256 36166c17f57a1d807177ba02f5613aa2ec2d469cb3aac4cfe0ac84c5e5c76a67

Provenance

The following attestation bundles were made for dataframely-1.4.0-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.4.0-cp311-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for dataframely-1.4.0-cp311-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 c37a6feeb164bcc984feadf37be774ad720af4216b6859d8e713d148cc2d95c5
MD5 2eecd70570d3421469a83b0aad8e6169
BLAKE2b-256 baeeadfc92f6aef761b15c2767b5fa888d0679800384ceb1c003df685b9a21e0

Provenance

The following attestation bundles were made for dataframely-1.4.0-cp311-abi3-macosx_11_0_arm64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.4.0-cp311-abi3-macosx_10_12_x86_64.whl.

File hashes

Hashes for dataframely-1.4.0-cp311-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 a4eb2d732ec4a249c77bf25f95cd2a324b636ed5dc2d12364ffef70426ad6f2c
MD5 bd16ae3005db2f29e3f6fbcea1d83825
BLAKE2b-256 dad335a668bcda4fc6efd540889516c4dc7f16865209e9292c7c43b73800ba2f

Provenance

The following attestation bundles were made for dataframely-1.4.0-cp311-abi3-macosx_10_12_x86_64.whl:

Publisher: build.yml on Quantco/dataframely
