A declarative, polars-native data frame validation library

dataframely — A declarative, 🐻‍❄️-native data frame validation library

📖 Introduction

Dataframely is a Python package for validating the schema and content of polars data frames. It makes data pipelines more robust by ensuring that data meets expectations, and more readable by adding schema information to data frame type hints.

💿 Installation

You can install dataframely using your favorite package manager, e.g., pixi or pip:

pixi add dataframely
pip install dataframely

🎯 Usage

Defining a data frame schema

import dataframely as dy
import polars as pl

class HouseSchema(dy.Schema):
    zip_code = dy.String(nullable=False, min_length=3)
    num_bedrooms = dy.UInt8(nullable=False)
    num_bathrooms = dy.UInt8(nullable=False)
    price = dy.Float64(nullable=False)

    @dy.rule()
    def reasonable_bathroom_to_bedroom_ratio() -> pl.Expr:
        ratio = pl.col("num_bathrooms") / pl.col("num_bedrooms")
        return (ratio >= 1 / 3) & (ratio <= 3)

    @dy.rule(group_by=["zip_code"])
    def minimum_zip_code_count() -> pl.Expr:
        return pl.len() >= 2

Validating data against the schema

import polars as pl

df = pl.DataFrame({
    "zip_code": ["01234", "01234", "1", "213", "123", "213"],
    "num_bedrooms": [2, 2, 1, None, None, 2],
    "num_bathrooms": [1, 2, 1, 1, 0, 8],
    "price": [100_000, 110_000, 50_000, 80_000, 60_000, 160_000]
})

# Validate the data and cast columns to expected types
validated_df: dy.DataFrame[HouseSchema] = HouseSchema.validate(df, cast=True)
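
To make the checks concrete, the sample frame can be walked through by hand. The sketch below re-implements the constraints of `HouseSchema` in plain Python, without polars or dataframely, so it only illustrates what the rules ask for and not how the library evaluates them. The check labels, the `failed_checks` helper, and the treatment of a missing ratio as "not a failure" are assumptions made for this illustration.

```python
from collections import Counter

# Sample columns copied from the frame above (price is not constrained by any rule).
zip_codes = ["01234", "01234", "1", "213", "123", "213"]
bedrooms = [2, 2, 1, None, None, 2]
bathrooms = [1, 2, 1, 1, 0, 8]

# Group-level rule: every zip code must occur at least twice.
zip_counts = Counter(zip_codes)


def failed_checks(zip_code, num_bedrooms, num_bathrooms):
    """Return the names of the checks that one row fails (labels are illustrative)."""
    failed = []
    if len(zip_code) < 3:
        failed.append("zip_code min_length")
    if num_bedrooms is None:
        failed.append("num_bedrooms nullable")
    elif num_bedrooms > 0:
        # Assumption: a missing ratio is not counted as a rule failure,
        # and zero bedrooms is left out of this sketch.
        ratio = num_bathrooms / num_bedrooms
        if not (1 / 3 <= ratio <= 3):
            failed.append("reasonable_bathroom_to_bedroom_ratio")
    if zip_counts[zip_code] < 2:
        failed.append("minimum_zip_code_count")
    return failed


failures = {
    index: failed_checks(*row)
    for index, row in enumerate(zip(zip_codes, bedrooms, bathrooms))
}
valid_rows = [index for index, failed in failures.items() if not failed]

for index, failed in failures.items():
    print(index, failed or "ok")
```

Under these assumptions only the first two rows pass every check, so the `validate` call above should be expected to raise for this sample frame rather than return it.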

See more advanced usage examples in the documentation.
