dataframely — A declarative, 🐻‍❄️-native data frame validation library

📖 Introduction

Dataframely is a Python package for validating the schema and content of polars data frames. It aims to make data pipelines more robust by ensuring that data meets expectations, and more readable by adding schema information to data frame type hints.

💿 Installation

You can install dataframely using your favorite package manager, e.g., pixi or pip:

pixi add dataframely
pip install dataframely

🎯 Usage

Defining a data frame schema

import dataframely as dy
import polars as pl

class HouseSchema(dy.Schema):
    zip_code = dy.String(nullable=False, min_length=3)
    num_bedrooms = dy.UInt8(nullable=False)
    num_bathrooms = dy.UInt8(nullable=False)
    price = dy.Float64(nullable=False)

    @dy.rule()
    def reasonable_bathroom_to_bedroom_ratio() -> pl.Expr:
        ratio = pl.col("num_bathrooms") / pl.col("num_bedrooms")
        return (ratio >= 1 / 3) & (ratio <= 3)

    @dy.rule(group_by=["zip_code"])
    def minimum_zip_code_count() -> pl.Expr:
        return pl.len() >= 2

Validating data against the schema

import polars as pl

df = pl.DataFrame({
    "zip_code": ["01234", "01234", "1", "213", "123", "213"],
    "num_bedrooms": [2, 2, 1, None, None, 2],
    "num_bathrooms": [1, 2, 1, 1, 0, 8],
    "price": [100_000, 110_000, 50_000, 80_000, 60_000, 160_000]
})

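# Validate the data and cast columns to expected types.
# validate raises an exception if any row violates the schema; this sample
# contains such rows (e.g. the zip code "1" and the null bedroom counts).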
validated_df: dy.DataFrame[HouseSchema] = HouseSchema.validate(df, cast=True)

See more advanced usage examples in the documentation.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • dataframely-1.7.1.tar.gz (305.0 kB): Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

  • dataframely-1.7.1-cp311-abi3-win_amd64.whl (413.9 kB): CPython 3.11+, Windows x86-64
  • dataframely-1.7.1-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (546.0 kB): CPython 3.11+, manylinux: glibc 2.17+ x86-64
  • dataframely-1.7.1-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (538.7 kB): CPython 3.11+, manylinux: glibc 2.17+ ARM64
  • dataframely-1.7.1-cp311-abi3-macosx_11_0_arm64.whl (492.8 kB): CPython 3.11+, macOS 11.0+ ARM64
  • dataframely-1.7.1-cp311-abi3-macosx_10_12_x86_64.whl (508.7 kB): CPython 3.11+, macOS 10.12+ x86-64

File details

Details for the file dataframely-1.7.1.tar.gz.

File metadata

  • Download URL: dataframely-1.7.1.tar.gz
  • Size: 305.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for dataframely-1.7.1.tar.gz
Algorithm Hash digest
SHA256 290688535297565dff4b2d9bb86aa2a78d20ab8ebff7eeee87577b18d589c33f
MD5 757e562b8ca1ba25e05c45479eab9d19
BLAKE2b-256 d5894faf38f15c1819448b177ed3e6df5e15b165bb938b6c8ee02998e31fc6fa

See more details on using hashes here.
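
A downloaded file can be checked against the SHA256 digest listed above using only the Python standard library. This is a minimal sketch: the helper names `sha256_of` and `matches_expected` and the chunk size are assumptions chosen for illustration, and the local file path is whatever location the archive was saved to.

```python
import hashlib

# SHA256 digest listed above for dataframely-1.7.1.tar.gz.
EXPECTED_SHA256 = "290688535297565dff4b2d9bb86aa2a78d20ab8ebff7eeee87577b18d589c33f"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected hex digest."""
    return sha256_of(path) == expected.lower()
```

For example, `matches_expected("dataframely-1.7.1.tar.gz")` would be expected to return `True` for an intact copy of the source distribution saved under that name in the current directory.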

Provenance

The following attestation bundles were made for dataframely-1.7.1.tar.gz:

Publisher: build.yml on Quantco/dataframely

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file dataframely-1.7.1-cp311-abi3-win_amd64.whl.

File metadata

  • Download URL: dataframely-1.7.1-cp311-abi3-win_amd64.whl
  • Size: 413.9 kB
  • Tags: CPython 3.11+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for dataframely-1.7.1-cp311-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 9bc8c562b6ae720c416ecba2b532b98009685dc0ad25a950a0216021455bb998
MD5 4d3c982487823c3cbba3ddaa90209b7c
BLAKE2b-256 ab56ae052fb17951ca51b64ea32275b042d054cc3a81531c1ce419efb25ca00d

Provenance

The following attestation bundles were made for dataframely-1.7.1-cp311-abi3-win_amd64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.7.1-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for dataframely-1.7.1-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 fb035680521e40484475bd067061991bdf5f1dbb65e71861388348469e0f6177
MD5 cfb6ff84df5d5df2ad89f9014a42e5ce
BLAKE2b-256 3c339d1a2bb5e997dd4bf29a5dcc746eb77b82d5fcccec165d1bfd7d0a16514e

Provenance

The following attestation bundles were made for dataframely-1.7.1-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.7.1-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for dataframely-1.7.1-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 3ea16ac96f4011605cca3a0479de3f4e203c8dd226dbb0c12c65a19067634aa2
MD5 0c0fae9f9a3592b87c2843d60c4140bc
BLAKE2b-256 f120346a317a39085a87cdbdbbf58322c69eafb3b0d74edfc68eca47a8040a51

Provenance

The following attestation bundles were made for dataframely-1.7.1-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.7.1-cp311-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for dataframely-1.7.1-cp311-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 ef8599badb683681df1208bfece7766e257e8f8a7ffcbe318c0f84433c7b6ba5
MD5 cadb4aea76b36ab319fd4c74d639b5df
BLAKE2b-256 b4c06ce25d74259bcaa1d41ef60b8777bb1169de6b43cc8ed86e9e12b67b1e28

Provenance

The following attestation bundles were made for dataframely-1.7.1-cp311-abi3-macosx_11_0_arm64.whl:

Publisher: build.yml on Quantco/dataframely

File details

Details for the file dataframely-1.7.1-cp311-abi3-macosx_10_12_x86_64.whl.

File hashes

Hashes for dataframely-1.7.1-cp311-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 a6acc798a8ef1edb6a8eb91941e3d2bbfe799c5482286c3bf2dfae5c41114ef5
MD5 a1d6bb48fb6abb172313ecacefb61e1f
BLAKE2b-256 0e0cbda57f1a3851f121601b77747368c7e6bd73cb4451fe7dfde2561f2d9ea2

Provenance

The following attestation bundles were made for dataframely-1.7.1-cp311-abi3-macosx_10_12_x86_64.whl:

Publisher: build.yml on Quantco/dataframely
