Non-invasive health checks for Pandas method chains

Pandas Checks

Pandas Checks adds .check methods to Pandas so you can inspect method chains without cutting them up.

As Fleetwood Mac says, you would never break the chain.

import pandas_checks

iris_processed = (
    iris
    .dropna()
    .check.assert_positive(subset=["petal_length", "sepal_length"])  # 🐼🩺 Validate assumptions
    .check.hist(column="petal_length")  # 🐼🩺 Plot the distribution of a column after cleaning

    .query("species=='setosa'")
    .check.head(3)  # 🐼🩺 Display the first few rows after more cleaning
    .check.write("iris_processed.parquet")  # 🐼🩺 Export the interim data, with type inferred from name
)

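The .check methods didn't change how the iris data was processed. That's the difference between .head() and .check.head(): .head() passes only the first rows to the next step, while .check.head() displays them and passes the full DataFrame along unchanged.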
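To make the pass-through idea concrete, here is a minimal sketch in plain Pandas. It is not how Pandas Checks is implemented; peek_head is a hypothetical helper used with DataFrame.pipe(), and the small DataFrame is made-up illustration data.

```python
import pandas as pd


def peek_head(df: pd.DataFrame, n: int = 5) -> pd.DataFrame:
    """Hypothetical helper: print the first n rows, then return df untouched."""
    print(df.head(n))
    return df


# Made-up illustration data, not the real iris dataset.
df = pd.DataFrame(
    {
        "species": ["setosa", "setosa", "virginica"],
        "petal_length": [1.4, 1.3, 6.0],
    }
)

# .head() is a transformation: only 2 rows reach the next step.
truncated = df.head(2).query("petal_length > 1.0")

# A pass-through check shows 2 rows, but all 3 rows continue down the chain.
passed_through = df.pipe(peek_head, 2).query("petal_length > 1.0")
```

Here truncated ends up with 2 rows and passed_through with 3, which is the behavior a .check method is meant to give you without the extra helper.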

💡 See the docs for details and configuration options.

Installation

# With uv
uv add pandas-checks

# Or with pip
pip install pandas-checks

.check methods

Here's what's in the doctor's bag.

Assertions

General:

  • .check.assert_data() - Check that data passes an arbitrary condition, expressed as a lambda function - DataFrame | Series
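As a rough illustration of what "an arbitrary condition, expressed as a lambda function" can look like, the sketch below uses plain Pandas only. assert_condition is a hypothetical stand-in, not the library's assert_data(), and its signature and ValueError are assumptions made for this example.

```python
import pandas as pd


def assert_condition(df, condition, fail_message="Assertion failed"):
    """Hypothetical helper: raise unless condition(df) is entirely true, else return df."""
    result = condition(df)
    # Accept a plain bool, or a Series/DataFrame of booleans.
    if isinstance(result, pd.DataFrame):
        passed = bool(result.all().all())
    elif isinstance(result, pd.Series):
        passed = bool(result.all())
    else:
        passed = bool(result)
    if not passed:
        raise ValueError(fail_message)
    return df


# Made-up illustration data.
measurements = pd.DataFrame({"petal_length": [1.4, 4.7, 6.0]})

checked = (
    measurements
    .pipe(assert_condition, lambda df: df["petal_length"] > 0)  # passes, chain continues
    .assign(petal_length_mm=lambda df: df["petal_length"] * 10)
)

# A condition that does not hold for every row stops the chain with an error.
try:
    measurements.pipe(assert_condition, lambda df: df["petal_length"] > 5)
    raised = False
except ValueError:
    raised = True
```

Because the helper returns the DataFrame it received, the assertion can sit in the middle of a chain without changing what the later steps see.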

Type assertions:

Value assertions:

Describe data

Disable Pandas Checks

These methods can disable Pandas Checks methods, temporarily or permanently.

  • .check.disable_checks() - Don't run checks. By default, still runs assertions. - DataFrame | Series
  • .check.enable_checks() - Run checks again. - DataFrame | Series

Export interim files

  • .check.write() - Export the current data, inferring file format from the name - DataFrame | Series
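One way to picture "inferring file format from the name" is a lookup from file extension to a Pandas writer method. The sketch below is an assumption-laden illustration, not the library's code: write_and_continue and the WRITERS mapping are hypothetical, and the set of supported extensions is chosen only for this example.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical mapping from file extension to a pandas writer method name.
# Note: to_parquet needs an optional engine such as pyarrow to be installed.
WRITERS = {
    ".csv": "to_csv",
    ".json": "to_json",
    ".parquet": "to_parquet",
    ".pkl": "to_pickle",
}


def write_and_continue(df: pd.DataFrame, path) -> pd.DataFrame:
    """Hypothetical helper: write df using the format implied by the extension, return df."""
    suffix = Path(path).suffix.lower()
    if suffix not in WRITERS:
        raise ValueError(f"Unsupported file extension: {suffix!r}")
    getattr(df, WRITERS[suffix])(path)
    return df


df = pd.DataFrame({"x": [1, 2, 3]})

with tempfile.TemporaryDirectory() as tmp:
    out_path = Path(tmp) / "interim.csv"
    # The interim file is written mid-chain; the chain then carries on as usual.
    result = df.pipe(write_and_continue, out_path).query("x > 1")
    file_was_written = out_path.exists()
```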

Time your code

  • .check.print_time_elapsed(start_time) - Print the time elapsed since you called start_time = pdc.start_timer(), where pdc is pandas_checks imported under that alias - DataFrame | Series

💡 Tip: You can use this stopwatch anywhere in your Python code.

from pandas_checks import print_elapsed_time, start_timer

start_time = start_timer()
...
print_elapsed_time(start_time)

Visualize data

Giving feedback and contributing

If you run into trouble or have questions, I'd love to know. Please open an issue.

Contributions are appreciated! Please see more details.

License

Pandas Checks is licensed under the BSD-3 License.

🐼🩺
