Define input and output columns for functions working on pandas dataframes.
pandas-contract
Provides decorators that check the arguments and return values of functions that work with pandas DataFrames.
The decorators use the pandera library (pandera.io) to validate the data types and constraints of a function's input arguments and return values.
Installation
pip install pandas-contract
Usage
The library provides decorators to check the input arguments and return values of functions.
The following defines a function that takes a DataFrame with a column x of type integer as input and returns a DataFrame with the column x of type string as output.
import pandas as pd
import pandas_contract as pc
import pandera as pa


@pc.argument("df", schema=pa.DataFrameSchema({"x": pa.Column(pa.Int)}))
@pc.result(schema=pa.DataFrameSchema({"x": pa.Column(pa.String)}))
def col_x_to_string(df: pd.DataFrame) -> pd.DataFrame:
    # Convert column x from integer to string
    return df.assign(x=df["x"].astype(str))
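To make the idea of an argument contract concrete, here is a minimal sketch written with pandas and the standard library only. It is not how pandas-contract or pandera implement their checks; the decorator name `require_int_column` and its error messages are hypothetical and exist only for illustration.

```python
import functools
import inspect

import pandas as pd


def require_int_column(arg_name: str, column: str):
    """Hypothetical decorator: reject calls unless `arg_name` has an integer `column`."""

    def decorator(func):
        sig = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Map positional and keyword arguments to parameter names.
            bound = sig.bind(*args, **kwargs)
            df = bound.arguments[arg_name]
            if column not in df.columns:
                raise ValueError(f"{arg_name}: missing column {column!r}")
            if not pd.api.types.is_integer_dtype(df[column]):
                raise ValueError(f"{arg_name}: column {column!r} is not an integer column")
            return func(*args, **kwargs)

        return wrapper

    return decorator


@require_int_column("df", "x")
def col_x_to_string(df: pd.DataFrame) -> pd.DataFrame:
    return df.assign(x=df["x"].astype(str))


ok = col_x_to_string(pd.DataFrame({"x": [1, 2]}))
```

Calling the decorated function with a non-integer column x raises before the function body runs, which is the behaviour a contract is meant to provide.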
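Dynamic arguments and return values
Required columns and arguments can also be specified dynamically using a function that returns a schema. In the example below, pc.from_arg("val_col") stands in for the column name, which is taken from the val_col argument of the decorated function.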
import pandas as pd
import pandas_contract as pc
import pandera as pa


@pc.argument("df", schema=pa.DataFrameSchema({pc.from_arg("val_col"): pa.Column(pa.Int)}))
@pc.result(schema=pa.DataFrameSchema({pc.from_arg("val_col"): pa.Column(pa.String)}))
def col_to_string(df: pd.DataFrame, val_col: str) -> pd.DataFrame:
    # Convert the column named by val_col from integer to string
    return df.assign(**{val_col: df[val_col].astype(str)})
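The sketch below shows one way a column name could be resolved from another argument at call time, using `inspect.signature` to bind the call. It uses pandas and the standard library only; `require_column_from_arg` is a hypothetical name, and this is an assumption about the general technique, not the library's actual mechanism.

```python
import functools
import inspect

import pandas as pd


def require_column_from_arg(df_arg: str, name_arg: str):
    """Hypothetical decorator: the required column name is read from `name_arg`."""

    def decorator(func):
        sig = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            bound = sig.bind(*args, **kwargs)
            bound.apply_defaults()  # also resolve names passed via default values
            df = bound.arguments[df_arg]
            column = bound.arguments[name_arg]
            if column not in df.columns:
                raise ValueError(f"{df_arg}: missing column {column!r}")
            return func(*args, **kwargs)

        return wrapper

    return decorator


@require_column_from_arg("df", "val_col")
def col_to_string(df: pd.DataFrame, val_col: str) -> pd.DataFrame:
    return df.assign(**{val_col: df[val_col].astype(str)})


result = col_to_string(pd.DataFrame({"price": [1, 2]}), val_col="price")
```

Because the name is looked up on every call, the same decorated function can be reused for any column without redefining the check.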
Cross-argument and output constraints
The library also provides checks for constraints across arguments and return values, for example:
DataFrames should have the same index.
import pandas as pd
import pandas_contract as pc


@pc.result(same_index_as=["df1", "df2"])
def my_func(df1: pd.DataFrame, df2: pd.DataFrame):
    # Output has the same index as input
    return pd.DataFrame(index=df1.index)


@pc.argument("df1", same_index_as="df2")
def my_func(df1, df2):
    # Input dataframes have the same index
    df1["x"] = df2["x"]
DataFrames should have the same size.
import pandas as pd
import pandas_contract as pc


@pc.result(same_size_as="df1")
def my_func(df1: pd.DataFrame):
    # Output has the same size as input
    return pd.DataFrame(index=df1.index + 1)
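The difference between the two constraints can be seen with plain pandas. The sketch below assumes that "same index" means the index labels are equal and that "same size" means the same number of rows; both are assumptions about the intended semantics, and the variable names are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({"x": [1, 2, 3]}, index=[10, 20, 30])
df2 = pd.DataFrame({"x": [4, 5, 6]}, index=[10, 20, 30])

# Same labels as df1, shifted by one: the row count matches, the index does not.
shifted = pd.DataFrame(index=df1.index + 1)

same_index = df1.index.equals(df2.index)              # True
same_index_shifted = df1.index.equals(shifted.index)  # False
same_size_shifted = len(df1) == len(shifted)          # True
```

A same-size constraint is therefore the weaker of the two: the shifted frame satisfies it while failing the same-index check.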
A DataFrame can also be extracted from a more complex argument or return value by specifying a key.
import pandas as pd
import pandas_contract as pc
import pandera as pa


@pc.argument(arg="df", schema=pa.DataFrameSchema({"x": pa.Column(pa.Int)}))
@pc.result(key="data", schema=pa.DataFrameSchema({"x": pa.Column(pa.Int)}), same_index_as="df")
def into_dict(df: pd.DataFrame) -> dict[str, pd.DataFrame]:
    # The DataFrame to check is stored under the key "data"
    return dict(data=df)
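Conceptually, a key tells the check where to find the DataFrame inside a container. A minimal sketch, assuming the key is simply used to subscript the value; `extract_frame` is a hypothetical helper and not part of pandas-contract.

```python
from typing import Any, Optional

import pandas as pd


def extract_frame(value: Any, key: Optional[str] = None) -> pd.DataFrame:
    """Hypothetical helper: return `value` itself, or `value[key]` when a key is given."""
    return value if key is None else value[key]


def into_dict(df: pd.DataFrame) -> dict:
    return dict(data=df)


df = pd.DataFrame({"x": [1, 2]})
out = into_dict(df)

# The frame to validate is looked up under "data" before any check runs.
checked = extract_frame(out, key="data")
index_matches = checked.index.equals(df.index)
```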
File details
Details for the file pandas_contract-0.1.1.tar.gz.
File metadata
- Download URL: pandas_contract-0.1.1.tar.gz
- Upload date:
- Size: 45.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | df0649024ed6c51f32929c018d357a88c25a1f2aa5adc4e7b354cd445ef08d97 |
| MD5 | ab2a2dbedd37604020f8182b9ec757bd |
| BLAKE2b-256 | 6da6a65506de7e2cf8fe61128da69e7b9630b5b16cbc01841ac1186d475c8a47 |
Provenance
The following attestation bundles were made for pandas_contract-0.1.1.tar.gz:
Publisher: python.yml on schollm/pandas-contract
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: pandas_contract-0.1.1.tar.gz
  - Subject digest: df0649024ed6c51f32929c018d357a88c25a1f2aa5adc4e7b354cd445ef08d97
- Sigstore transparency entry: 190136254
- Sigstore integration time:
- Permalink: schollm/pandas-contract@1b718a68d64078e66187372345415ba3dde411ce
- Branch / Tag: refs/tags/v0.1.1
- Owner: https://github.com/schollm
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python.yml@1b718a68d64078e66187372345415ba3dde411ce
- Trigger Event: push
File details
Details for the file pandas_contract-0.1.1-py3-none-any.whl.
File metadata
- Download URL: pandas_contract-0.1.1-py3-none-any.whl
- Upload date:
- Size: 10.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | dae339df2d43875d572e6f07404391df45897c15144b80f453742b12dd7760db |
| MD5 | cc8a6ffe3a83b408dc5b9437281d4607 |
| BLAKE2b-256 | f1b36aab859fe4ce4fc1cf89c7a9efdab0382e85865d286c073929204576a290 |
Provenance
The following attestation bundles were made for pandas_contract-0.1.1-py3-none-any.whl:
Publisher: python.yml on schollm/pandas-contract
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: pandas_contract-0.1.1-py3-none-any.whl
  - Subject digest: dae339df2d43875d572e6f07404391df45897c15144b80f453742b12dd7760db
- Sigstore transparency entry: 190136256
- Sigstore integration time:
- Permalink: schollm/pandas-contract@1b718a68d64078e66187372345415ba3dde411ce
- Branch / Tag: refs/tags/v0.1.1
- Owner: https://github.com/schollm
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python.yml@1b718a68d64078e66187372345415ba3dde411ce
- Trigger Event: push