Define input and output columns for functions working on pandas dataframes.
pandas-contract
Provides decorators to check function arguments and return values that are pandas DataFrames.
The decorators utilize the pandera.io library to validate data types and constraints of the input arguments and output values of functions.
Documentation
Full documentation on https://pandas-contract.readthedocs.io/en/latest/
Installation
pip install pandas-contract
Usage
ℹ️ Info: The standard abbreviations for the package imports are:
import pandas as pd
import pandas_contract as pc
import pandera as pa
Setup
See Setup for first-time setup information.
Check Dataframe structure
The following defines a function that takes a DataFrame with a column 'x' of type
integer as input and returns a DataFrame with the column 'x' of type string as output.
See pandera.io for the full documentation.
import pandas as pd
import pandas_contract as pc
import pandera as pa
@pc.argument("df", schema=pa.DataFrameSchema({"x": pa.Int}))
@pc.result(schema=pa.DataFrameSchema({"x": pa.String}))
def col_x_to_string(df: pd.DataFrame) -> pd.DataFrame:
    """Convert column x to string"""
    return df.assign(x=df["x"].astype(str))
Retrieve dataframes from a more complex argument
Sometimes the dataframe is not a direct argument of the function, but is part of a more complex argument.
In this case, the decorator argument key specifies how to retrieve the dataframe from that argument.
If key is a callable, it is called with the argument and its return value is used as the dataframe.
Otherwise, it is used as a key to retrieve the dataframe from the argument, i.e. arg[key].
Dataframe result is wrapped within another object
import pandas as pd
import pandas_contract as pc

@pc.result(key="data")
def into_dict():
    """Dataframe wrapped in a dict"""
    return dict(data=pd.DataFrame())

@pc.result(key=0)
def into_list():
    """Dataframe wrapped in a list"""
    return [pd.DataFrame(), ...]

@pc.result(key=lambda out: out.foo)
def into_object():
    """Dataframe wrapped in an object"""
    class Out:
        foo = pd.DataFrame()
    # result.foo holds the dataframe
    return Out()
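Note: if the lookup key is itself a callable object (for example, a function used as a dictionary key), wrap the lookup in a lambda function. Otherwise the decorator calls the key with the argument instead of using it for item access: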
import pandas as pd
import pandas_contract as pc
import pandera as pa

def f1():
    ...

# Get the dataframe from the output item `f1`.
# @pc.result(key=f1, schema=pa.DataFrameSchema({"name": pa.String})) - this will fail
@pc.result(key=lambda res: res[f1], schema=pa.DataFrameSchema({"name": pa.String}))
def return_generators():
    # f1 is a key to a dictionary holding the data frame to be tested.
    return {
        f1: pd.DataFrame([{"name": "f1"}])
    }
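The lookup rule described above, and the reason a callable dictionary key needs a lambda wrapper, can be illustrated with a small stand-alone sketch. `resolve_key` is a hypothetical helper written for this illustration; it is not part of pandas_contract and only mirrors the rule as this README describes it. Treating a missing key as "use the value itself" is an assumption, and plain strings stand in for dataframes so the snippet has no dependencies.

```python
from typing import Any


def resolve_key(value: Any, key: Any = None) -> Any:
    """Hypothetical helper mirroring the documented rule for `key`."""
    if key is None:
        # Assumption: without a key, the value itself is the dataframe.
        return value
    if callable(key):
        # A callable key is called with the value.
        return key(value)
    # Anything else is used for item access, i.e. value[key].
    return value[key]


def f1():
    ...


# A function object used as a dictionary key, as in the example above.
result = {f1: "frame-for-f1"}

# Passing f1 directly calls f1(result), which fails: f1 takes no arguments.
try:
    resolve_key(result, f1)
    direct_lookup_failed = False
except TypeError:
    direct_lookup_failed = True

# Wrapping the lookup in a lambda performs the intended item access.
wrapped = resolve_key(result, lambda res: res[f1])

# Non-callable keys index into dicts and lists alike.
from_dict = resolve_key({"data": "frame"}, "data")
from_list = resolve_key(["frame", "other"], 0)
```

Checking for callables first keeps the rule simple, at the cost that a callable can never be used directly as an item key; the lambda wrapper is the way around that.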
Dynamic arguments and return values
Required columns can also be specified dynamically. In the following example, pc.from_arg("col") takes the column name from the function argument col when the function is called.
import pandas as pd
import pandas_contract as pc
import pandera as pa
@pc.argument(
    "df",
    schema=pa.DataFrameSchema({pc.from_arg("col"): pa.Column()}),
)
@pc.result(schema=pa.DataFrameSchema({pc.from_arg("col"): pa.String}))
def col_to_string(df: pd.DataFrame, col: str) -> pd.DataFrame:
    return df.assign(**{col: df[col].astype(str)})
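To make the idea of reading a column name from a call more concrete, here is a minimal sketch that uses only the standard library. `argument_value` is a hypothetical helper, not the pandas_contract implementation; it only shows one way a decorator could look up the value bound to a named parameter, regardless of how the caller passed it.

```python
import inspect
from typing import Any, Callable


def argument_value(func: Callable[..., Any], name: str, *args: Any, **kwargs: Any) -> Any:
    """Hypothetical helper: return the value bound to parameter `name` for a call."""
    bound = inspect.signature(func).bind(*args, **kwargs)
    bound.apply_defaults()  # include parameters that fall back to their defaults
    return bound.arguments[name]


def col_to_string(df, col="x"):
    # Stand-in with the same parameter names as the example above.
    # The default for `col` is added here only to demonstrate default handling.
    return df


# The column name is found whether it is passed by position, keyword, or default.
by_position = argument_value(col_to_string, "col", {"a": [1]}, "a")
by_keyword = argument_value(col_to_string, "col", df={"b": [1]}, col="b")
by_default = argument_value(col_to_string, "col", {"x": [1]})
```

Binding through `inspect.signature` avoids guessing the argument's position, so the lookup works the same for positional and keyword calls.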
Multiple columns in function argument
The referenced function argument can also hold a list of column names, in which case every listed column is checked.
import pandas as pd
import pandas_contract as pc
import pandera as pa
@pc.argument(
    "df",
    schema=pa.DataFrameSchema({pc.from_arg("cols"): pa.Column()}),
)
@pc.result(schema=pa.DataFrameSchema({pc.from_arg("cols"): pa.String}))
def cols_to_string(df: pd.DataFrame, cols: list[str]) -> pd.DataFrame:
    return df.assign(**{col: df[col].astype(str) for col in cols})
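One way to picture how a single placeholder can stand for several columns is the sketch below. `FromArg` and `expand_columns` are hypothetical names invented for this illustration, and the string specs are stand-ins for real column definitions; this is not how pandas_contract or pandera is implemented, only a model of the behaviour described above.

```python
from typing import Any


class FromArg:
    """Hypothetical placeholder naming the argument that holds column name(s)."""

    def __init__(self, arg_name: str) -> None:
        self.arg_name = arg_name


def expand_columns(columns: dict[Any, Any], call_args: dict[str, Any]) -> dict[str, Any]:
    """Replace placeholder keys with the concrete column names from `call_args`."""
    expanded: dict[str, Any] = {}
    for key, spec in columns.items():
        if isinstance(key, FromArg):
            value = call_args[key.arg_name]
            # A single name and a list of names are handled alike.
            names = [value] if isinstance(value, str) else list(value)
        else:
            names = [key]
        for name in names:
            expanded[name] = spec
    return expanded


template = {FromArg("cols"): "string", "id": "int"}
single = expand_columns(template, {"cols": "a"})
multiple = expand_columns(template, {"cols": ["a", "b"]})
```

Each name in the list receives the same spec as the placeholder, which matches the example above where every column in cols is expected to be a string column in the result.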