dataframely — A declarative, 🐻‍❄️-native data frame validation library

📖 Introduction

Dataframely is a Python package for validating the schema and content of polars data frames. It aims to make data pipelines more robust by ensuring that data meets expectations, and more readable by adding schema information to data frame type hints.

💿 Installation

You can install dataframely using your favorite package manager, e.g., pixi or pip:

pixi add dataframely
pip install dataframely

🎯 Usage

Defining a data frame schema

import dataframely as dy
import polars as pl

class HouseSchema(dy.Schema):
    zip_code = dy.String(nullable=False, min_length=3)
    num_bedrooms = dy.UInt8(nullable=False)
    num_bathrooms = dy.UInt8(nullable=False)
    price = dy.Float64(nullable=False)

    @dy.rule()
    def reasonable_bathroom_to_bedroom_ratio() -> pl.Expr:
        ratio = pl.col("num_bathrooms") / pl.col("num_bedrooms")
        return (ratio >= 1 / 3) & (ratio <= 3)

    @dy.rule(group_by=["zip_code"])
    def minimum_zip_code_count() -> pl.Expr:
        return pl.len() >= 2
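
To make the intent of the two custom rules concrete, here is a minimal plain-Python sketch of the same checks. It does not use dataframely or polars and is not how the library evaluates rules; the helper names `ratio_ok` and `zip_count_ok` and the three sample rows are hypothetical, and the sketch assumes `num_bedrooms` is a positive number.

```python
from collections import Counter


def ratio_ok(row: dict) -> bool:
    """Row-level check: bathrooms per bedroom must lie in [1/3, 3].

    Assumes num_bedrooms is a positive number (no nulls, no zeros).
    """
    ratio = row["num_bathrooms"] / row["num_bedrooms"]
    return 1 / 3 <= ratio <= 3


def zip_count_ok(rows: list[dict]) -> list[bool]:
    """Group-level check: each zip code must occur at least twice.

    The result is computed per group and reported once per row.
    """
    counts = Counter(row["zip_code"] for row in rows)
    return [counts[row["zip_code"]] >= 2 for row in rows]


# Hypothetical sample rows, chosen only for illustration.
rows = [
    {"zip_code": "01234", "num_bedrooms": 2, "num_bathrooms": 1},
    {"zip_code": "01234", "num_bedrooms": 2, "num_bathrooms": 2},
    {"zip_code": "213", "num_bedrooms": 2, "num_bathrooms": 8},
]

ratio_flags = [ratio_ok(row) for row in rows]
group_flags = zip_count_ok(rows)
print(ratio_flags)
print(group_flags)
```

The first function looks at one row at a time, while the second needs the whole data set because its outcome depends on how often a zip code appears. This mirrors the difference between a plain rule and one declared with `group_by`.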

Validating data against schema

import polars as pl

df = pl.DataFrame({
    "zip_code": ["01234", "01234", "1", "213", "123", "213"],
    "num_bedrooms": [2, 2, 1, None, None, 2],
    "num_bathrooms": [1, 2, 1, 1, 0, 8],
    "price": [100_000, 110_000, 50_000, 80_000, 60_000, 160_000]
})

# Validate the data and cast columns to expected types
validated_df: dy.DataFrame[HouseSchema] = HouseSchema.validate(df, cast=True)
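
The sample frame above contains several rows that do not meet the expectations declared in `HouseSchema`. As a library-independent sanity check, the sketch below re-implements those expectations in plain Python and counts the violations per check. The check labels and the handling of missing values (a row with a missing `num_bedrooms` is simply skipped by the ratio check) are assumptions of this sketch, not a description of dataframely's behavior; `price` is left out because every value is present.

```python
from collections import Counter

# Columns copied from the sample frame above.
zip_code = ["01234", "01234", "1", "213", "123", "213"]
num_bedrooms = [2, 2, 1, None, None, 2]
num_bathrooms = [1, 2, 1, 1, 0, 8]

zip_counts = Counter(zip_code)
violations: Counter = Counter()
valid_rows = []

for index, (zc, beds, baths) in enumerate(zip(zip_code, num_bedrooms, num_bathrooms)):
    problems = []
    if len(zc) < 3:
        problems.append("zip_code too short")
    if beds is None:
        problems.append("num_bedrooms missing")
    # Assumption: only evaluate the ratio when num_bedrooms is present and positive.
    if beds is not None and beds > 0 and not (1 / 3 <= baths / beds <= 3):
        problems.append("bathroom/bedroom ratio out of range")
    if zip_counts[zc] < 2:
        problems.append("zip code occurs only once")
    violations.update(problems)
    if not problems:
        valid_rows.append(index)

print(dict(violations))
print(valid_rows)
```

Under these assumptions only the first two rows (both with zip code `01234`) pass every check.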

See more advanced usage examples in the documentation.
