
Define input and output columns for functions working on pandas dataframes.


pandas-contract

Provides decorators that check pandas DataFrames passed to a function as arguments or returned from it.

The decorators utilize the pandera.io library to validate data types and constraints of the input arguments and output values of functions.

Documentation

Full documentation is available at https://pandas-contract.readthedocs.io/en/latest/

Installation

pip install pandas-contract

Usage

ℹ️ Info: The standard abbreviations for the package imports are:

import pandas as pd
import pandas_contract as pc
import pandera as pa

Setup

See Setup for first-time setup information.

Check Dataframe structure

The following defines a function that takes a DataFrame with a column 'x' of type integer as input and returns a DataFrame with the column 'x' of type string as output.

See pandera.io for the full documentation.

import pandas as pd
import pandas_contract as pc
import pandera as pa

# Each schema entry is a pandera Column that carries the expected dtype.
@pc.argument("df", schema=pa.DataFrameSchema({"x": pa.Column(pa.Int)}))
@pc.result(schema=pa.DataFrameSchema({"x": pa.Column(pa.String)}))
def col_x_to_string(df: pd.DataFrame) -> pd.DataFrame:
    """Convert column x to string"""
    return df.assign(x=df["x"].astype(str))
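
The mechanics behind the two decorators can be illustrated without pandas or pandera. The following is a minimal sketch, not pandas-contract's implementation: check_argument, check_result and column_has_type are hypothetical helpers, and a plain dict of lists stands in for the DataFrame.

```python
import functools
import inspect


def check_argument(name, validate):
    """Hypothetical helper: run validate(value) on one argument before the call."""
    def decorator(func):
        sig = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            bound = sig.bind(*args, **kwargs)
            bound.apply_defaults()
            validate(bound.arguments[name])
            return func(*args, **kwargs)
        return wrapper
    return decorator


def check_result(validate):
    """Hypothetical helper: run validate(result) on the return value."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            validate(result)
            return result
        return wrapper
    return decorator


def column_has_type(column, expected_type):
    """Build a validator for a table stored as a dict of lists (assumption)."""
    def validate(table):
        if column not in table:
            raise ValueError(f"missing column {column!r}")
        if not all(isinstance(v, expected_type) for v in table[column]):
            raise TypeError(f"column {column!r} is not {expected_type.__name__}")
    return validate


@check_argument("table", column_has_type("x", int))
@check_result(column_has_type("x", str))
def col_x_to_string(table):
    """Convert column x to string."""
    return {**table, "x": [str(v) for v in table["x"]]}
```

In this sketch, calling col_x_to_string({"x": [1, 2]}) passes both checks, while a table whose column x already holds strings is rejected before the function body runs.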

Retrieve dataframes from a more complex argument

Sometimes the dataframe is not a direct argument of the function, but is part of a more complex argument. In this case, the decorator argument key can be used to specify the key of the dataframe in the argument.

If key is a callable, it is called with the argument and its return value is used as the dataframe. Otherwise, it is used as a key to retrieve the dataframe from the argument, i.e. arg[key].
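
The lookup rule can be sketched in a few lines of plain Python. This is an illustration of the rule as described here, not the library's code: resolve_dataframe is a hypothetical name, and treating a missing key as "use the value itself" is an assumption.

```python
def resolve_dataframe(value, key=None):
    """Return the object to validate (hypothetical helper)."""
    if key is None:
        return value          # assumption: no key means the value is the dataframe
    if callable(key):
        return key(value)     # callable: call it with the argument
    return value[key]         # anything else: plain item lookup, value[key]


class Out:
    foo = "frame in attribute"


from_dict = resolve_dataframe({"data": "frame in dict"}, key="data")
from_list = resolve_dataframe(["frame in list", None], key=0)
from_object = resolve_dataframe(Out(), key=lambda out: out.foo)
```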

Dataframe result is wrapped within another object

import pandas as pd
import pandas_contract as pc

@pc.result(key="data")
def into_dict():
    """Dataframe wrapped in a dict"""
    return dict(data=pd.DataFrame())


@pc.result(key=0)
def into_list():
    """Dataframe wrapped in a list"""
    return [pd.DataFrame(), ...]


@pc.result(key=lambda out: out.foo)
def into_object():
    """Dataframe wrapped in an object"""
    class Out:
        foo = pd.DataFrame()
    # result.foo holds the dataframe
    return Out()

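Note: if the lookup key is itself a callable object (for example, a function used as a dictionary key), wrap the lookup in a lambda function. Otherwise the decorator calls the key with the argument instead of using it for the lookup: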

import pandas as pd
import pandas_contract as pc
import pandera as pa

def f1():
    ...

# Get the dataframe from the output item `f1`.
# @pc.result(key=f1, schema=pa.DataFrameSchema({"name": pa.String}))  - this will fail
@pc.result(key=lambda res: res[f1], schema=pa.DataFrameSchema({"name": pa.String}))
def return_generators():
    # f1 is a key to a dictionary holding the data frame to be tested.
    return {
        f1: pd.DataFrame([{"name": "f1"}])
    }
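
Why the direct form fails can be reproduced without the library. The sketch below assumes the resolution rule described earlier (call the key if it is callable, otherwise index with it); resolve is a hypothetical stand-in, and a string stands in for the dataframe.

```python
def f1():
    ...


def resolve(value, key):
    """Hypothetical stand-in for the key handling described above."""
    return key(value) if callable(key) else value[key]


result = {f1: "the dataframe"}

error = None
try:
    # f1 is callable, so it is called as f1(result) instead of being looked up.
    resolve(result, f1)
except TypeError as exc:
    error = exc  # f1 accepts no arguments

# Wrapping the lookup in a lambda makes the intent explicit.
looked_up = resolve(result, lambda res: res[f1])
```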

Dynamic Arguments and return values

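Required columns and arguments can also be specified dynamically using a function that returns a schema. In the following example, pc.from_arg("col") stands for the column whose name is passed in the function argument col.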

import pandas as pd
import pandas_contract as pc
import pandera as pa

@pc.argument("df", schema=pa.DataFrameSchema(
    {pc.from_arg("col"): pa.Column()})
)
@pc.result(schema=pa.DataFrameSchema({pc.from_arg("col"): pa.Column(pa.String)}))
def col_to_string(df: pd.DataFrame, col: str) -> pd.DataFrame:
    return df.assign(**{col: df[col].astype(str)})

Multiple columns in function argument

The function argument may also hold a list of column names; the check then applies to every column in the list.

import pandas as pd
import pandas_contract as pc
import pandera as pa

@pc.argument("df", schema=pa.DataFrameSchema(
        {pc.from_arg("cols"): pa.Column()}
    )
)
@pc.result(schema=pa.DataFrameSchema({pc.from_arg("cols"): pa.Column(pa.String)}))
def cols_to_string(df: pd.DataFrame, cols: list[str]) -> pd.DataFrame:
    return df.assign(**{col: df[col].astype(str) for col in cols})
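
One way to picture the placeholder mechanism is as a substitution step performed at call time. The sketch below is not how pandas-contract implements from_arg; FromArg and resolve_columns are hypothetical names, and plain strings stand in for the column checks.

```python
import inspect
from dataclasses import dataclass


@dataclass(frozen=True)
class FromArg:
    """Hypothetical placeholder: 'use the value of this function argument'."""
    name: str


def resolve_columns(spec, func, *args, **kwargs):
    """Replace FromArg keys in spec with the column names passed to func."""
    bound = inspect.signature(func).bind(*args, **kwargs)
    bound.apply_defaults()
    resolved = {}
    for key, check in spec.items():
        if isinstance(key, FromArg):
            value = bound.arguments[key.name]
            # A single name yields one column, a list yields one per entry.
            names = [value] if isinstance(value, str) else list(value)
            for column in names:
                resolved[column] = check
        else:
            resolved[key] = check
    return resolved


def col_to_string(df, col):
    ...


def cols_to_string(df, cols):
    ...


single = resolve_columns({FromArg("col"): "string"}, col_to_string, None, col="a")
multiple = resolve_columns({FromArg("cols"): "string"}, cols_to_string, None, ["a", "b"])
```

Binding the call through inspect.signature lets the placeholder find its value whether the argument was passed positionally or by keyword.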
