Pavise
DataFrame validation library using Python Protocol for structural subtyping.
Features
- Use Python Protocol to define DataFrame schemas
- DataFrame[Schema] type annotation for static type checking
- Structural subtyping: validate only required columns, ignore extra columns
- Covariant type parameters: DataFrame[ChildSchema] is compatible with DataFrame[ParentSchema]
- Optional runtime validation
- No inheritance required
- Support for both pandas and polars backends
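To make the first feature concrete, the sketch below shows how the annotations on a Protocol class can be read back as a column-to-type mapping using only the standard library. It illustrates the general idea and is not a description of Pavise's internals; the helper name schema_columns is hypothetical.

```python
from typing import Protocol, get_type_hints


class UserSchema(Protocol):
    name: str
    age: int


def schema_columns(schema: type) -> dict:
    """Return the annotated members of a Protocol class as {column: type}.

    Hypothetical helper for illustration; not part of Pavise.
    """
    return get_type_hints(schema)


columns = schema_columns(UserSchema)
print(columns)
```

Because the schema is just a class with annotations, nothing has to inherit from it, which is what makes the "no inheritance required" feature possible.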
Installation
# For pandas support
pip install "pavise[pandas]"
# For polars support
pip install "pavise[polars]"
# For both
pip install "pavise[all]"
# The quotes keep shells such as zsh from treating the brackets as a glob pattern
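Since each backend is an optional extra, it can be useful to check which one is importable before choosing between pavise.pandas and pavise.polars. The following is a minimal standard-library sketch; the function name available_backends is hypothetical and not part of Pavise.

```python
from importlib.util import find_spec


def available_backends() -> dict:
    """Report which optional DataFrame backends can be imported.

    Hypothetical helper for illustration; not part of Pavise.
    """
    return {name: find_spec(name) is not None for name in ("pandas", "polars")}


backends = available_backends()
print(backends)
```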
Usage
Pandas Backend
Static Type Checking Only (Recommended)
from typing import Protocol
from pavise.pandas import DataFrame

class UserSchema(Protocol):
    name: str
    age: int

def process_users(df: DataFrame[UserSchema]) -> DataFrame[UserSchema]:
    # mypy/pyrefly will check types, no runtime validation
    return df[df['age'] >= 18]

# Use regular pandas DataFrame
import pandas as pd
df = pd.DataFrame({'name': ['Alice', 'Bob'], 'age': [30, 17]})
result = process_users(df)
Runtime Validation (Explicit)
from typing import Protocol
import pandas as pd
from pavise.pandas import DataFrame

class UserSchema(Protocol):
    name: str
    age: int

def load_users(raw_df: pd.DataFrame) -> DataFrame[UserSchema]:
    # Validate at runtime when needed
    return DataFrame[UserSchema](raw_df)

raw_df = pd.DataFrame({'name': ['Alice', 'Bob'], 'age': [30, 17]})
validated_df = load_users(raw_df)  # Runtime validation occurs here
Polars Backend
Static Type Checking Only (Recommended)
from typing import Protocol
from pavise.polars import DataFrame

class UserSchema(Protocol):
    name: str
    age: int

def process_users(df: DataFrame[UserSchema]) -> DataFrame[UserSchema]:
    # mypy/pyrefly will check types, no runtime validation
    return df.filter(df['age'] >= 18)

# Use regular polars DataFrame
import polars as pl
df = pl.DataFrame({'name': ['Alice', 'Bob'], 'age': [30, 17]})
result = process_users(df)
Runtime Validation (Explicit)
from typing import Protocol
import polars as pl
from pavise.polars import DataFrame

class UserSchema(Protocol):
    name: str
    age: int

def load_users(raw_df: pl.DataFrame) -> DataFrame[UserSchema]:
    # Validate at runtime when needed
    return DataFrame[UserSchema](raw_df)

raw_df = pl.DataFrame({'name': ['Alice', 'Bob'], 'age': [30, 17]})
validated_df = load_users(raw_df)  # Runtime validation occurs here
Structural Subtyping
from typing import Protocol
import pandas as pd
from pavise.pandas import DataFrame

class UserSchema(Protocol):
    name: str

class UserWithEmailSchema(Protocol):
    name: str
    email: str

def process_user(df: DataFrame[UserSchema]) -> None:
    print(df['name'])

# This works! UserWithEmailSchema has all required columns of UserSchema
df: DataFrame[UserWithEmailSchema] = pd.DataFrame({
    'name': ['Alice'],
    'email': ['alice@example.com']
})
process_user(df)  # OK - covariant type parameter
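The compatibility above relies on two standard typing rules: a Protocol with a superset of members is a structural subtype of the narrower one, and a generic class declared with a covariant type variable carries that relationship over to its parameterized forms. The sketch below reproduces the mechanism with the standard library only. Frame is a hypothetical stand-in class, not Pavise's DataFrame, and how Pavise declares its own type parameter is an assumption here.

```python
from typing import Generic, Protocol, TypeVar


class UserSchema(Protocol):
    name: str


class UserWithEmailSchema(Protocol):
    name: str
    email: str


# covariant=True lets Frame[Narrower] be used where Frame[Wider] is expected.
SchemaT_co = TypeVar("SchemaT_co", covariant=True)


class Frame(Generic[SchemaT_co]):
    """Hypothetical schema-parameterized container; not Pavise's DataFrame."""

    def __init__(self, columns: dict) -> None:
        self.columns = columns


def names(frame: Frame[UserSchema]) -> list:
    # Only the 'name' column is required by this function's schema.
    return frame.columns["name"]


wide: Frame[UserWithEmailSchema] = Frame(
    {"name": ["Alice"], "email": ["alice@example.com"]}
)
result = names(wide)  # accepted by a type checker because of covariance
print(result)
```

With an invariant type variable, a type checker would reject the call to names(wide) even though the data contains every required column.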
Extra Columns are Ignored
from typing import Protocol
import pandas as pd
from pavise.pandas import DataFrame

class SimpleSchema(Protocol):
    a: int

# Extra columns are ignored during validation
df = pd.DataFrame({
    'a': [1, 2, 3],
    'b': ['x', 'y', 'z'],     # Extra column - ignored
    'c': [10.0, 20.0, 30.0]   # Extra column - ignored
})
validated = DataFrame[SimpleSchema](df)  # OK
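The rule "required columns must exist, extra columns are ignored" can be sketched without any DataFrame library. The example below uses a plain dict of lists as a stand-in for a DataFrame, and the helper missing_columns is hypothetical. It shows the checking logic in general terms and makes no claim about how Pavise implements or reports validation failures.

```python
from typing import Protocol, get_type_hints


class SimpleSchema(Protocol):
    a: int


def missing_columns(schema: type, table: dict) -> list:
    """List schema columns absent from `table`; extra columns are not reported.

    Hypothetical helper for illustration; not part of Pavise.
    """
    required = get_type_hints(schema)
    return [name for name in required if name not in table]


# A plain dict of lists stands in for a DataFrame here.
table = {"a": [1, 2, 3], "b": ["x", "y", "z"], "c": [10.0, 20.0, 30.0]}

ok = missing_columns(SimpleSchema, table)           # extras 'b' and 'c' are ignored
bad = missing_columns(SimpleSchema, {"b": ["x"]})   # required column 'a' is absent
print(ok, bad)
```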
Supported Types
- int
- float
- str
- bool
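One detail worth keeping in mind with these four types is that in Python bool is a subclass of int, so a naive isinstance check accepts True as an integer. The sketch below shows one way to check plain Python values against the four types while keeping int and bool distinct. The helper column_matches is hypothetical, and whether Pavise draws the same distinction is an assumption, not something stated in this document.

```python
def column_matches(values: list, expected: type) -> bool:
    """Check every value in a column against one of int, float, str, bool.

    Hypothetical helper for illustration; not part of Pavise.
    """
    if expected is int:
        # bool is a subclass of int, so exclude it explicitly
        # (an assumed design choice for this sketch).
        return all(
            isinstance(v, int) and not isinstance(v, bool) for v in values
        )
    return all(isinstance(v, expected) for v in values)


print(column_matches([30, 17], int))
print(column_matches([True, False], int))
print(column_matches([True, False], bool))
```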
Development
# Install with dev dependencies (includes both pandas and polars)
uv pip install -e ".[dev]"
# Run all tests
uv run pytest
# Run tests for specific backend
uv run pytest tests/test_pandas.py
uv run pytest tests/test_polars.py
Testing with tox
# Run tests for all Python versions and backends
tox
# Run tests for specific environment
tox -e py312-pandas # Test pandas backend with Python 3.12
tox -e py312-polars # Test polars backend with Python 3.12
tox -e py312-all # Test both backends with Python 3.12
# Run linting
tox -e lint
# Run type checking
tox -e type
# Available Python versions: py39, py310, py311, py312
# Available backends: pandas, polars, all