Skip to main content

A light-weight and flexible validation package for pandas data structures.

Project description


A light-weight and flexible validation package for pandas data structures.


Build Status PyPI version shields.io PyPI license Project Status: Active – The project has reached a stable, usable state and is being actively developed. Documentation Status codecov PyPI pyversions

Why?

Because pandas data structures hide a lot of information, and explicitly validating them at runtime in production-critical or reproducible research settings is a good idea. It also makes it easier to review pandas code :)

Documentation

The official documentation is hosted on ReadTheDocs: https://pandera.readthedocs.io

Install

Using pip:

pip install pandera

Using conda:

conda install -c cosmicbboy pandera

Example Usage

DataFrameSchema

import pandas as pd

from pandera import Column, DataFrameSchema, Float, Int, String, Check


# validate columns
schema = DataFrameSchema({
    # the check function expects a series argument and should output a boolean
    # or a boolean Series.
    "column1": Column(Int, Check(lambda s: s <= 10)),
    "column2": Column(Float, Check(lambda s: s < -1.2)),
    # you can provide a list of validators
    "column3": Column(String, [
        Check(lambda s: s.str.startswith("value_")),
        Check(lambda s: s.str.split("_", expand=True).shape[1] == 2)
    ]),
})

# alternatively, you can pass strings representing the legal pandas datatypes:
# http://pandas.pydata.org/pandas-docs/stable/basics.html#dtypes
schema = DataFrameSchema({
    "column1": Column("int64", Check(lambda s: s <= 10)),
    ...
})

df = pd.DataFrame({
    "column1": [1, 4, 0, 10, 9],
    "column2": [-1.3, -1.4, -2.9, -10.1, -20.4],
    "column3": ["value_1", "value_2", "value_3", "value_2", "value_1"]
})

validated_df = schema.validate(df)
print(validated_df)

#     column1  column2  column3
#  0        1     -1.3  value_1
#  1        4     -1.4  value_2
#  2        0     -2.9  value_3
#  3       10    -10.1  value_2
#  4        9    -20.4  value_1

Tests

pip install pytest
pytest tests

Contributing to pandera GitHub contributors

All contributions, bug reports, bug fixes, documentation improvements, enhancements and ideas are welcome.

A detailed overview on how to contribute can be found in the contributing guide on GitHub.

Issues

Go here to submit feature requests or bugfixes.

Project details


Release history Release notifications | RSS feed

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pandera-0.1.4.tar.gz (12.5 kB view details)

Uploaded Source

Built Distribution

pandera-0.1.4-py3-none-any.whl (11.7 kB view details)

Uploaded Python 3

File details

Details for the file pandera-0.1.4.tar.gz.

File metadata

  • Download URL: pandera-0.1.4.tar.gz
  • Upload date:
  • Size: 12.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.21.0 setuptools/41.0.1 requests-toolbelt/0.8.0 tqdm/4.28.1 CPython/3.6.1

File hashes

Hashes for pandera-0.1.4.tar.gz
Algorithm Hash digest
SHA256 7a573067539d28e2257e1501b8df4ce4e7c62377468ceec5d30480a3a07c64d2
MD5 6481afc7c23352f5180ab6cee652a619
BLAKE2b-256 03740833f4552003a00e6c5c69627f6bac693b2fdb38ba03fc0f8dafb40aadcb

See more details on using hashes here.

File details

Details for the file pandera-0.1.4-py3-none-any.whl.

File metadata

  • Download URL: pandera-0.1.4-py3-none-any.whl
  • Upload date:
  • Size: 11.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.21.0 setuptools/41.0.1 requests-toolbelt/0.8.0 tqdm/4.28.1 CPython/3.6.1

File hashes

Hashes for pandera-0.1.4-py3-none-any.whl
Algorithm Hash digest
SHA256 e7d68d4470eb6ae6b0cfea1332eb6760911da8aa54438072ed898c1206d51777
MD5 0bdfd44ad4c975caf752524883dccbe1
BLAKE2b-256 adfd21dd9f8904666c544eedb6fb2253338d789c0d85a15e3dbfb6717ee49ad4

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page