dataframely — A declarative, 🐻‍❄️-native data frame validation library

📖 Introduction

Dataframely is a Python package to validate the schema and content of polars data frames. Its purpose is to make data pipelines more robust by ensuring that data meets expectations and more readable by adding schema information to data frame type hints.

💿 Installation

You can install dataframely using your favorite package manager, e.g., pixi or pip:

pixi add dataframely
pip install dataframely

🎯 Usage

Defining a data frame schema

import dataframely as dy
import polars as pl

class HouseSchema(dy.Schema):
    zip_code = dy.String(nullable=False, min_length=3)
    num_bedrooms = dy.UInt8(nullable=False)
    num_bathrooms = dy.UInt8(nullable=False)
    price = dy.Float64(nullable=False)

    @dy.rule()
    def reasonable_bathroom_to_bedroom_ratio() -> pl.Expr:
        ratio = pl.col("num_bathrooms") / pl.col("num_bedrooms")
        return (ratio >= 1 / 3) & (ratio <= 3)

    @dy.rule(group_by=["zip_code"])
    def minimum_zip_code_count() -> pl.Expr:
        return pl.len() >= 2

Validating data against the schema

import polars as pl

df = pl.DataFrame({
    "zip_code": ["01234", "01234", "1", "213", "123", "213"],
    "num_bedrooms": [2, 2, 1, None, None, 2],
    "num_bathrooms": [1, 2, 1, 1, 0, 8],
    "price": [100_000, 110_000, 50_000, 80_000, 60_000, 160_000]
})

# Validate the data and cast columns to expected types
validated_df: dy.DataFrame[HouseSchema] = HouseSchema.validate(df, cast=True)
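
Several rows in the sample data above do not satisfy the schema as declared. To make the checks concrete, here is a dependency-free sketch in plain Python that applies the same conditions to the sample rows. It illustrates what the schema expresses, not how dataframely evaluates it. The check names, the `failed_checks` helper, skipping the ratio check when `num_bedrooms` is null, and counting zip codes over all input rows are assumptions of this sketch.

```python
from collections import Counter

# Sample rows copied from the example above (None marks a null value).
# The price column is omitted because no check in this sketch uses it.
rows = [
    {"zip_code": "01234", "num_bedrooms": 2, "num_bathrooms": 1},
    {"zip_code": "01234", "num_bedrooms": 2, "num_bathrooms": 2},
    {"zip_code": "1", "num_bedrooms": 1, "num_bathrooms": 1},
    {"zip_code": "213", "num_bedrooms": None, "num_bathrooms": 1},
    {"zip_code": "123", "num_bedrooms": None, "num_bathrooms": 0},
    {"zip_code": "213", "num_bedrooms": 2, "num_bathrooms": 8},
]

# Group-level information: number of rows per zip code (assumption: counted
# over all input rows, including rows that fail other checks).
zip_counts = Counter(row["zip_code"] for row in rows)


def failed_checks(row: dict) -> list[str]:
    """Return the (hypothetical) names of the checks that this row violates."""
    failed = []
    if len(row["zip_code"]) < 3:
        failed.append("zip_code min_length")
    if row["num_bedrooms"] is None:
        failed.append("num_bedrooms nullability")
    else:
        # Only evaluated for non-null bedrooms (assumption of this sketch).
        ratio = row["num_bathrooms"] / row["num_bedrooms"]
        if not (1 / 3 <= ratio <= 3):
            failed.append("bathroom_to_bedroom_ratio")
    if zip_counts[row["zip_code"]] < 2:
        failed.append("minimum_zip_code_count")
    return failed


# Tally violations per check and collect the rows that pass everything.
failures = Counter(name for row in rows for name in failed_checks(row))
valid_rows = [row for row in rows if not failed_checks(row)]

print(dict(failures))
print(valid_rows)
```

In this sketch only the two rows with zip code "01234" pass every check. The remaining rows fail at least one column constraint or rule, so the sample data needs cleaning or filtering before it can be treated as valid.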

See more advanced usage examples in the documentation.

Download files

Source Distribution

  • dataframely-1.2.1.tar.gz (248.1 kB): Source

Built Distributions

  • dataframely-1.2.1-cp311-abi3-win_amd64.whl (434.1 kB): CPython 3.11+, Windows x86-64
  • dataframely-1.2.1-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (628.0 kB): CPython 3.11+, manylinux: glibc 2.17+ x86-64
  • dataframely-1.2.1-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (627.7 kB): CPython 3.11+, manylinux: glibc 2.17+ ARM64
  • dataframely-1.2.1-cp311-abi3-macosx_11_0_arm64.whl (551.0 kB): CPython 3.11+, macOS 11.0+ ARM64
  • dataframely-1.2.1-cp311-abi3-macosx_10_12_x86_64.whl (558.8 kB): CPython 3.11+, macOS 10.12+ x86-64

File details

Details for the file dataframely-1.2.1.tar.gz.

File metadata

  • Download URL: dataframely-1.2.1.tar.gz
  • Upload date:
  • Size: 248.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for dataframely-1.2.1.tar.gz
Algorithm Hash digest
SHA256 8900487b7a8febcbde5ed6ecd73744fcc317fdcc2585674fd5b8b537272badc1
MD5 c2b44a5179edcd321f580f77aca30f01
BLAKE2b-256 80d2ca6161af3175781b063a7e64d9c815bac24410214df3880bff78c323c9bc

See more details on using hashes here.

Provenance

The following attestation bundles were made for dataframely-1.2.1.tar.gz:

Publisher: build.yml on Quantco/dataframely

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file dataframely-1.2.1-cp311-abi3-win_amd64.whl.

File metadata

  • Download URL: dataframely-1.2.1-cp311-abi3-win_amd64.whl
  • Upload date:
  • Size: 434.1 kB
  • Tags: CPython 3.11+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for dataframely-1.2.1-cp311-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 da51a7b1a671034d8267852e51a8a08a7d1281d07fdf71caf340fa6ef372bc0f
MD5 3cdada92b2dbeba54caf5d35ecbe0576
BLAKE2b-256 71d89f5c72f84fb18e0f1716be41f4c18f3caf95b5653bdb293485331f70d5e0

Provenance

The following attestation bundles were made for dataframely-1.2.1-cp311-abi3-win_amd64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.2.1-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for dataframely-1.2.1-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 6e29a105a273f23aba040a836ca86657f0b1d509fcd42d8f4229ca20831a9b52
MD5 db1d843be3fcb8f5aab05fd3e863d2ea
BLAKE2b-256 80fbddbf7667c9baabf493212a5492eaac1b53bf619f3f9b95ca447a5eb1e59e

Provenance

The following attestation bundles were made for dataframely-1.2.1-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.2.1-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for dataframely-1.2.1-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 843a6149e221c91ee74d556ad0e93ebe5c8d8a6e8afdaa9b021e94fcd25762f0
MD5 8e5b907f459689036e87187039e9d6cd
BLAKE2b-256 2db0829050114678e5fbafab7b0fa004e21ee3fb80f889dbf001fef1d748dd3b

Provenance

The following attestation bundles were made for dataframely-1.2.1-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.2.1-cp311-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for dataframely-1.2.1-cp311-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 e9677b37b47921efc82b767a3fec52f6e1227fb87b9f5a22fc183b5435d156a1
MD5 b3c680c8ff505699c7651b9f23827a74
BLAKE2b-256 21a8ee286d8a5162f4045150eeb759d83784747342d1a609b06231da7e260f6f

Provenance

The following attestation bundles were made for dataframely-1.2.1-cp311-abi3-macosx_11_0_arm64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.2.1-cp311-abi3-macosx_10_12_x86_64.whl.

File hashes

Hashes for dataframely-1.2.1-cp311-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 6b75ace0349e13a8879ad93173d589880701f1e92707aa6552adf5ed389c7c34
MD5 35fac111978a60f553d9a16cdc0385b3
BLAKE2b-256 f054340b4edb7105cd1967238849b30608168ab97e00daf09509a89ef6fa13cd

Provenance

The following attestation bundles were made for dataframely-1.2.1-cp311-abi3-macosx_10_12_x86_64.whl:

Publisher: build.yml on Quantco/dataframely
