Function decorators for Pandas DataFrame column name and data type validation

DAFFY DataFrame Column Validator

Description

In projects using Pandas, functions commonly take DataFrames as input or produce them as output, and it is hard to tell quickly what those DataFrames contain. This library offers simple decorators for annotating your functions so that they document themselves, and that documentation stays up to date because the inputs and outputs are validated at runtime.

Installation

Install with your preferred Python dependency manager, for example:

pip install daffy

or

poetry add daffy

Usage

Start by importing the needed decorators:

from daffy import df_in, df_out

To check a DataFrame passed to a function, decorate the function with @df_in. For example, the following function expects a DataFrame with the columns Brand and Price:

@df_in(columns=["Brand", "Price"])
def process_cars(car_df):
    # do stuff with cars
    ...

If your function takes multiple arguments, specify the argument to be checked by its name:

@df_in(name="car_df", columns=["Brand", "Price"])
def process_cars(year, style, car_df):
    # do stuff with cars
    ...
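
To make the name lookup concrete, here is a minimal sketch of how a decorator of this kind could locate the argument and check its columns. It is not daffy's implementation: `require_columns_in` is a hypothetical name, the `ValueError` is an arbitrary choice, and falling back to the first argument when no name is given is an assumption.

```python
import functools
import inspect


def require_columns_in(columns, name=None):
    """Hypothetical stand-in for an input-checking decorator; not daffy's code."""

    def decorator(func):
        signature = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Map positional and keyword arguments to parameter names.
            bound = signature.bind(*args, **kwargs)
            # Assumption: without a name, the first argument is the one checked.
            target = name if name is not None else next(iter(bound.arguments))
            df = bound.arguments[target]
            missing = [col for col in columns if col not in df.columns]
            if missing:
                raise ValueError(f"Argument {target!r} is missing columns: {missing}")
            return func(*args, **kwargs)

        return wrapper

    return decorator


@require_columns_in(columns=["Brand", "Price"], name="car_df")
def process_cars(year, style, car_df):
    # Placeholder body: report how many columns were received.
    return len(car_df.columns)
```

Because the sketch only reads the `columns` attribute, it should behave the same whether `car_df` is passed positionally or as a keyword argument.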

To check that a function returns a DataFrame with specific columns, use the @df_out decorator:

@df_out(columns=["Brand", "Price"])
def get_all_cars():
    # get those cars
    return all_cars_df
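
An output check differs from an input check mainly in when it runs: the wrapped function is called first and its return value is inspected afterwards. The following is a rough sketch of that idea under the same caveats; `require_columns_out` is a hypothetical name and the error type is an assumption, not daffy's behaviour.

```python
import functools


def require_columns_out(columns):
    """Hypothetical stand-in for an output-checking decorator; not daffy's code."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Run the function first, then validate what it returned.
            result = func(*args, **kwargs)
            missing = [col for col in columns if col not in result.columns]
            if missing:
                raise ValueError(f"Return value is missing columns: {missing}")
            return result

        return wrapper

    return decorator
```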

To check both input and output, apply both decorators to the same function:

@df_in(columns=["Brand", "Price"])
@df_out(columns=["Brand", "Price"])
def filter_cars(car_df):
    # filter some cars
    return filtered_cars_df

If you also want to check the data type of each column, replace the column list:

columns=["Brand", "Price"]

with a dict:

columns={"Brand": "object", "Price": "int64"}

This checks not only that the specified columns are present in the DataFrame but also that each one has the expected dtype.
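
The list and dict forms can be handled by one code path. The helper below is a sketch of that logic rather than daffy's own code: `find_column_problems` is a hypothetical name, and it assumes dtypes are compared by their string form (for example "int64").

```python
def find_column_problems(df, columns):
    """Return a list of problems; an empty list means the frame passed.

    `columns` is a list of names, or a dict mapping name -> expected dtype string.
    Hypothetical helper for illustration; not part of daffy.
    """
    # Normalise the list form to a dict whose dtypes are all None (unchecked).
    expected = columns if isinstance(columns, dict) else dict.fromkeys(columns)
    problems = []
    for col, dtype in expected.items():
        if col not in df.columns:
            problems.append(f"missing column {col!r}")
        elif dtype is not None and str(df.dtypes[col]) != dtype:
            problems.append(
                f"column {col!r} has dtype {df.dtypes[col]}, expected {dtype}"
            )
    return problems
```

The helper only reads `df.columns` and `df.dtypes`, so it should accept a Pandas DataFrame as is. Returning a list instead of raising on the first failure lets a caller report every mismatch at once.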

Contributing

Contributions are welcome. Please include tests in your PRs.

Development

To run the tests, clone the repository, install the dependencies with Poetry, and run the tests with pytest:

poetry install
poetry shell
pytest

To enable linting on each commit, run pre-commit install. After that, every commit will be checked with isort, black and flake8.

License

MIT

Download files

Source Distribution

daffy-0.2.1.tar.gz (4.0 kB)

Uploaded Source

Built Distribution

daffy-0.2.1-py3-none-any.whl (4.6 kB)

Uploaded Python 3
