Define input and output columns for functions working on pandas dataframes.

pandas-contract

Provides decorators that check pandas DataFrames passed as function arguments or returned from functions.

The decorators use the pandera library to validate the data types and constraints of a function's input arguments and output values.

Installation

pip install pandas-contract

Usage

The library provides two decorators: pc.argument checks an input argument of a function, and pc.result checks its return value.

The following defines a function that takes a DataFrame with a column x of type integer as input and returns a DataFrame with the column x of type string as output.

import pandas as pd
import pandas_contract as pc
import pandera as pa

@pc.argument("df", schema=pa.DataFrameSchema({"x": pa.Column(pa.Int)}))
@pc.result(schema=pa.DataFrameSchema({"x": pa.Column(pa.String)}))
def col_x_to_string(df: pd.DataFrame) -> pd.DataFrame:
    return df.assign(x=df["x"].astype(str))
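
To make the contract concrete, the following sketch spells out in plain pandas roughly what the two decorators above assert. It is only an illustration: it uses neither pandas-contract nor pandera, it assumes that "integer" and "string" mean what pandas' own type helpers and Python's str report, and the helper name check_col_x_contract is hypothetical.

```python
import pandas as pd


def col_x_to_string(df: pd.DataFrame) -> pd.DataFrame:
    # Same body as the decorated function above, without the decorators.
    return df.assign(x=df["x"].astype(str))


def check_col_x_contract(df: pd.DataFrame) -> pd.DataFrame:
    # Hypothetical helper: a hand-written version of the input and output checks.
    if "x" not in df.columns or not pd.api.types.is_integer_dtype(df["x"]):
        raise ValueError("argument df: column 'x' must exist and be of integer type")
    out = col_x_to_string(df)
    if not all(isinstance(value, str) for value in out["x"]):
        raise ValueError("result: column 'x' must contain strings")
    return out


result = check_col_x_contract(pd.DataFrame({"x": [1, 2, 3]}))
print(result["x"].tolist())  # ['1', '2', '3']

try:
    check_col_x_contract(pd.DataFrame({"x": [1.5, 2.5]}))
except ValueError as err:
    print("rejected:", err)
```

Writing these checks by hand in every function is what the decorators are meant to replace.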

Dynamic arguments and return values

Required columns can also be specified dynamically: pc.from_arg("name") takes the column name from the value that the function argument of that name has when the function is called.

import pandas as pd
import pandas_contract as pc
import pandera as pa

# The column named by the val_col argument must be present in df.
@pc.argument(
    "df",
    schema=pa.DataFrameSchema({pc.from_arg("val_col"): pa.Column()}),
)
@pc.result(schema=pa.DataFrameSchema({pc.from_arg("val_col"): pa.Column(pa.String)}))
def col_to_string(df: pd.DataFrame, val_col: str) -> pd.DataFrame:
    return df.assign(**{val_col: df[val_col].astype(str)})
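
One way to picture how a column name can depend on another argument is to bind the actual call to the function's signature and look the value up by parameter name. The sketch below does this with the standard library only. resolve_arg is a hypothetical helper written for this illustration; it is not part of pandas-contract, and this README does not show how pc.from_arg is implemented. The default value for val_col is also an addition made for the example.

```python
import inspect
from typing import Any, Callable


def resolve_arg(func: Callable[..., Any], name: str, *args: Any, **kwargs: Any) -> Any:
    # Hypothetical helper: return the value that parameter `name` would
    # receive if `func` were called with the given arguments.
    bound = inspect.signature(func).bind(*args, **kwargs)
    bound.apply_defaults()  # also fill in parameters left at their default
    return bound.arguments[name]


def col_to_string(df, val_col="x"):
    # Stand-in with the same parameter names as the function above.
    ...


# The column name is found whether it is passed positionally, by keyword,
# or left at its default value.
print(resolve_arg(col_to_string, "val_col", None, "price"))           # price
print(resolve_arg(col_to_string, "val_col", None, val_col="amount"))  # amount
print(resolve_arg(col_to_string, "val_col", None))                    # x
```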

Cross-argument and output constraints

The decorators can also enforce constraints across arguments and the return value, such as:

Dataframes should have the same index.

import pandas as pd
import pandas_contract as pc

@pc.result(same_index_as=["df1", "df2"])
def my_func(df1: pd.DataFrame, df2: pd.DataFrame) -> pd.DataFrame:
    # The output must have the same index as both inputs
    return pd.DataFrame(index=df1.index)

@pc.argument("df1", same_index_as="df2")
def my_func(df1: pd.DataFrame, df2: pd.DataFrame) -> None:
    # The input dataframes must have the same index
    df1["x"] = df2["x"]
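
The same-index requirement matters because pandas aligns on index labels when a column is assigned. The short example below uses plain pandas and no decorators to show what happens silently when the indexes differ; the two frames are made-up illustration data.

```python
import pandas as pd

df1 = pd.DataFrame({"a": [1, 2, 3]}, index=[0, 1, 2])
df2 = pd.DataFrame({"x": [10, 20, 30]}, index=[1, 2, 3])

# Assignment aligns df2["x"] to df1's index labels: label 0 has no match
# in df2 and becomes NaN, and label 3 from df2 is dropped.
df1["x"] = df2["x"]
print(df1)

# The condition that rules this out is index equality.
print(df1.index.equals(df2.index))  # False
```

No exception is raised here, which is why an explicit check on the inputs is useful.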

Size of dataframes should be equal.

import pandas as pd
import pandas_contract as pc

@pc.result(same_size_as="df1")
def my_func(df1: pd.DataFrame) -> pd.DataFrame:
    # The output has the same size as the input, but a different index
    return pd.DataFrame(index=df1.index + 1)
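
The difference between the two constraints can be checked directly in pandas. This sketch assumes that "size" refers to the number of rows; it does not use pandas-contract, and the variable names are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({"a": [1, 2, 3]})
shifted = pd.DataFrame(index=df1.index + 1)  # labels 1, 2, 3 instead of 0, 1, 2

same_size = len(shifted) == len(df1)
same_index = shifted.index.equals(df1.index)
print(same_size, same_index)  # True False
```

A result like this one would satisfy a same-size constraint while failing a same-index constraint.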

A dataframe can also be extracted from a more complex argument or return value by specifying a key.

import pandas as pd
import pandas_contract as pc
import pandera as pa

@pc.argument(arg="df", schema=pa.DataFrameSchema({"x": pa.Column(pa.Int)}))
@pc.result(key="data", schema=pa.DataFrameSchema({"x": pa.Column(pa.Int)}), same_index_as="df")
def into_dict(df: pd.DataFrame) -> dict[str, pd.DataFrame]:
    return dict(data=df)
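
Conceptually, a key selects the DataFrame to check out of a larger container. The sketch below uses a hypothetical helper, extract, and assumes that the key is used for item access (as with the dict above) and that omitting the key means the value itself is the DataFrame. pandas-contract may support other kinds of keys that are not covered here.

```python
from typing import Any

import pandas as pd


def extract(value: Any, key: Any = None) -> pd.DataFrame:
    # Hypothetical helper: pick the DataFrame out of a container by key.
    target = value if key is None else value[key]
    if not isinstance(target, pd.DataFrame):
        raise TypeError(f"expected a DataFrame, got {type(target).__name__}")
    return target


df = pd.DataFrame({"x": [1, 2]})
wrapped = {"data": df, "note": "anything else"}

print(extract(wrapped, key="data") is df)  # True
print(extract(df) is df)                   # True
```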

Dataframe output extends input
