Non-invasive health checks for Pandas method chains

Pandas Checks

Pandas Checks adds .check methods to Pandas so you can inspect method chains without cutting them up.

As Fleetwood Mac says, you would never break the chain.

import pandas_checks  # Importing registers the .check accessor on DataFrames and Series

# `iris` is assumed to be a DataFrame you have already loaded
iris_processed = (
    iris
    .dropna()
    .check.assert_positive(subset=["petal_length", "sepal_length"])  # 🐼🩺 Validate assumptions
    .check.hist(column="petal_length")  # 🐼🩺 Plot the distribution of a column after cleaning
    .query("species=='setosa'")
    .check.head(3)  # 🐼🩺 Display the first few rows after more cleaning
    .check.write("iris_processed.parquet")  # 🐼🩺 Export the interim data, with type inferred from name
)

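The .check methods didn't change how the iris data was processed: each one displays, validates, or exports the data, then passes it along unchanged. That's the difference between .head(), which returns only the first rows, and .check.head(), which shows the first rows and returns the full data.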
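To make the distinction concrete, here is a minimal sketch of the pass-through idea in plain pandas. It does not use Pandas Checks and is not how the library is implemented: `peek` is a hypothetical helper, and the small DataFrame is made-up illustration data rather than the real iris dataset.

```python
import pandas as pd

# Made-up illustration data, not the real iris dataset
df = pd.DataFrame({"petal_length": [1.4, 1.3, 4.7, 5.1, 6.0]})


def peek(data: pd.DataFrame, n: int = 3) -> pd.DataFrame:
    """Hypothetical helper: print the first n rows, then return the data unchanged."""
    print(data.head(n))
    return data


# .head() is part of the processing: only 3 rows continue down the chain
truncated = df.head(3).assign(doubled=lambda d: d["petal_length"] * 2)

# A pass-through inspection prints 3 rows, but all 5 rows continue down the chain
inspected = df.pipe(peek, n=3).assign(doubled=lambda d: d["petal_length"] * 2)

print(len(truncated), len(inspected))
```

The second chain behaves the way a .check method is described above: the inspection is a side effect, and the downstream steps see the same data they would have seen without it.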

💡 See the docs for details and configuration options.

Installation

# With uv
uv add pandas-checks

# Or with pip
pip install pandas-checks

.check methods

Here's what's in the doctor's bag.

Assertions

General:

  • .check.assert_data() - Check that data passes an arbitrary condition, expressed as a lambda function - DataFrame | Series
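A chainable assertion driven by a lambda can be sketched in plain pandas as follows. This is an illustration under assumptions, not the library's code: `assert_condition` is a hypothetical helper, the data is made up, and the actual parameters of .check.assert_data() are described in the docs.

```python
import pandas as pd


def assert_condition(data, condition, fail_message="Assertion failed"):
    """Hypothetical helper: raise if condition(data) is falsy, else return data unchanged."""
    if not condition(data):
        raise ValueError(fail_message)
    return data


# Made-up illustration data
df = pd.DataFrame({"price": [9.5, 12.0, 3.25], "qty": [1, 4, 2]})

# Passing case: the chain continues with the same data
total = (
    df
    .pipe(assert_condition, lambda d: (d["price"] > 0).all(), "Prices must be positive")
    .assign(total=lambda d: d["price"] * d["qty"])
)

# Failing case: the chain stops with an error instead of producing bad output
try:
    df.pipe(assert_condition, lambda d: d["qty"].sum() > 100, "Expected more than 100 units")
    failed = False
except ValueError as err:
    failed = True
    print(err)
```

Because the condition is an arbitrary function of the whole DataFrame or Series, it can express checks that span several columns or aggregate values, which fixed-purpose assertions cannot.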

Type assertions:

Value assertions:

Describe data

Disable Pandas Checks

These methods can disable Pandas Checks methods, temporarily or permanently.

  • .check.disable_checks() - Don't run checks. By default, still runs assertions. - DataFrame | Series
  • .check.enable_checks() - Run checks again. - DataFrame | Series

Export interim files

  • .check.write() - Export the current data, inferring file format from the name - DataFrame | Series
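Inferring the file format from the name can be sketched as a lookup on the file suffix. The helper below is hypothetical and written in plain pandas: `write_by_name` and the `WRITERS` mapping are assumptions for illustration, and the set of formats that .check.write() actually supports may differ. Parquet is left out here only to avoid an optional dependency.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical mapping from file suffix to a writer function
WRITERS = {
    ".csv": lambda data, path: data.to_csv(path, index=False),
    ".tsv": lambda data, path: data.to_csv(path, sep="\t", index=False),
    ".json": lambda data, path: data.to_json(path, orient="records"),
}


def write_by_name(data: pd.DataFrame, path: str) -> pd.DataFrame:
    """Hypothetical helper: pick a writer from the suffix, write, return data unchanged."""
    suffix = Path(path).suffix.lower()
    if suffix not in WRITERS:
        raise ValueError(f"Unsupported file type: {suffix!r}")
    WRITERS[suffix](data, path)
    return data


# Made-up illustration data
df = pd.DataFrame({"species": ["setosa", "virginica"], "petal_length": [1.4, 5.1]})

with tempfile.TemporaryDirectory() as tmp:
    out_path = Path(tmp) / "interim.csv"
    # The export sits mid-chain; the file holds both rows, the query then keeps one
    result = df.pipe(write_by_name, str(out_path)).query("species == 'setosa'")
    round_trip = pd.read_csv(out_path)
```

Returning the data unchanged is what lets the export sit in the middle of a chain, so the interim file captures the state at that step while later steps keep running.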

Time your code

  • .check.print_time_elapsed(start_time) - Print the time elapsed since you called start_time = pdc.start_timer(), where pdc is the imported pandas_checks module - DataFrame | Series

💡 Tip: You can use this stopwatch anywhere in your Python code.

from pandas_checks import print_time_elapsed, start_timer

start_time = start_timer()
...
print_time_elapsed(start_time)

Visualize data

Giving feedback and contributing

If you run into trouble or have questions, I'd love to know. Please open an issue.

Contributions are appreciated! Please see more details.

License

Pandas Checks is licensed under the BSD 3-Clause License.

🐼🩺
