A powerful, production-ready tabular data preprocessing and visualization library.

QPX Tabular

QPX Tabular is a production-ready tabular data preprocessing and visualization library designed to accelerate data science workflows. It turns raw, messy pandas DataFrames into machine-learning-ready datasets with a single function call.

Features

  • Automated Preprocessing (auto_preprocess): Handles missing values, drops constant columns, drops high-cardinality nominal columns, encodes categoricals, and downcasts numeric dtypes to reduce memory usage.
  • Fail-Loud Architecture: Built for production. Instead of failing silently, QPX raises an exception immediately (KeyError, ValueError) when you provide an invalid data configuration.
  • Comprehensive Data Health Diagnostics: Inspect your dataset's health via dataset_health and statistical_snapshot.
  • Beautiful Visualizations: One-line correlation heatmaps, distribution plots, and hierarchical feature clustering matrices.
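
To make the preprocessing steps listed above concrete, here is a minimal sketch in plain pandas. It is not QPX's implementation: the function name sketch_preprocess, the median/mode imputation strategy, and the exact checks are assumptions chosen for illustration. Only the max_onehot parameter name is borrowed from the Quickstart.

```python
import pandas as pd
from pandas.api import types as ptypes


def sketch_preprocess(df, max_onehot=10):
    """Hypothetical stand-in for the listed steps; not the QPX source code."""
    # Fail loud: reject unusable input instead of returning garbage.
    if not isinstance(df, pd.DataFrame):
        raise ValueError("expected a pandas DataFrame")
    if df.empty:
        raise ValueError("DataFrame has no rows or no columns")

    out = df.copy()

    # 1. Drop constant columns (at most one distinct non-null value).
    constant = [c for c in out.columns if out[c].nunique(dropna=True) <= 1]
    out = out.drop(columns=constant)

    # 2. Impute missing values (assumed strategy: median for numeric,
    #    most frequent value for everything else).
    for col in out.columns:
        if out[col].isna().any():
            if ptypes.is_numeric_dtype(out[col]):
                out[col] = out[col].fillna(out[col].median())
            else:
                out[col] = out[col].fillna(out[col].mode().iloc[0])

    # 3. Drop non-numeric columns with too many levels, one-hot encode the rest.
    nominal = [c for c in out.columns if not ptypes.is_numeric_dtype(out[c])]
    too_wide = [c for c in nominal if out[c].nunique() > max_onehot]
    out = out.drop(columns=too_wide)
    to_encode = [c for c in nominal if c not in too_wide]
    if to_encode:
        out = pd.get_dummies(out, columns=to_encode)

    # 4. Downcast numeric columns to the smallest dtype that holds the values.
    for col in out.columns:
        if ptypes.is_integer_dtype(out[col]):
            out[col] = pd.to_numeric(out[col], downcast="integer")
        elif ptypes.is_float_dtype(out[col]):
            out[col] = pd.to_numeric(out[col], downcast="float")
    return out


raw = pd.DataFrame({
    "age": [25.0, None, 40.0, 31.0],
    "city": ["A", "B", None, "A"],
    "const": [1, 1, 1, 1],
    "user_id": ["u1", "u2", "u3", "u4"],
})
clean = sketch_preprocess(raw, max_onehot=2)
print(list(clean.columns))
```

In this toy frame, const is dropped as constant and user_id is dropped because it has more distinct values than max_onehot allows, while city is one-hot encoded.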

Installation

To install from source, clone this repository and install it locally with pip (the package is also distributed on PyPI as qpx-tabular):

git clone https://github.com/punitxdev/QPX.git
cd QPX
pip install -e .

Dependencies

  • pandas
  • numpy
  • matplotlib
  • seaborn
  • scipy

Quickstart

Clean an entire dataset with one function:

import pandas as pd
from qpx.tabular import preprocessing

# Load your raw data
df = pd.read_csv("my_messy_data.csv")

# Clean, encode, impute, and downcast in one go!
clean_df, report = preprocessing.auto_preprocess(
    df,
    max_onehot=10, 
    return_report=True
)

print(report)

Generate a deep-dive correlation map:

from qpx.tabular import visuals

visuals.corr_map(clean_df, target="my_target_column")
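
For readers who want the numbers behind such a map, the sketch below ranks numeric features by their Pearson correlation with a target column using pandas alone. It is an illustration under assumptions, not a description of what corr_map computes internally; the helper name target_correlations is hypothetical.

```python
import pandas as pd


def target_correlations(df, target):
    """Hypothetical helper: correlation of each numeric column with `target`."""
    if target not in df.columns:
        raise KeyError(f"target column {target!r} not found")
    numeric = df.select_dtypes(include="number")
    if target not in numeric.columns:
        raise ValueError(f"target column {target!r} is not numeric")
    corr = numeric.corr()[target].drop(target)
    # Order by absolute strength so strong negative correlations rank high too.
    order = corr.abs().sort_values(ascending=False).index
    return corr.reindex(order)


demo = pd.DataFrame({
    "x": [1, 2, 3, 4, 5],
    "inv": [5, 4, 3, 2, 1],
    "label": ["a", "b", "a", "b", "a"],  # non-numeric, ignored
    "target": [2, 4, 6, 8, 10],
})
ranked = target_correlations(demo, "target")
print(ranked)
```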

Documentation

The complete API reference and user guide are hosted online at: https://punitxdev.github.io/QPX/

If you want to build the documentation locally for development:

pip install -e ".[dev]"
mkdocs serve

To publish the documentation to GitHub Pages, run:

mkdocs gh-deploy

License

This project is licensed under the MIT License - see the LICENSE file for details.

Made with love by Punit

Download files

Source Distribution

qpx_tabular-0.1.1.tar.gz (19.7 kB view details)

Uploaded Source

Built Distribution

qpx_tabular-0.1.1-py3-none-any.whl (19.0 kB view details)

Uploaded Python 3

File details

Details for the file qpx_tabular-0.1.1.tar.gz.

File metadata

  • Download URL: qpx_tabular-0.1.1.tar.gz
  • Upload date:
  • Size: 19.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for qpx_tabular-0.1.1.tar.gz
Algorithm Hash digest
SHA256 009485011eb7e1c76d205d8dae6fac7dc1d2eaa4e94d1d750126e0e301f5b7a6
MD5 e2817f4d22756894c042ca32889c0531
BLAKE2b-256 0c713fe831bcf504070dae5fc376ef6e1f82d9667972f10f8ce91ad276a603cd
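
A downloaded file can be checked against the SHA256 digest listed above with Python's standard hashlib module. The helper names sha256_of and matches_published_hash are hypothetical, and the demonstration uses a throwaway temporary file rather than the real archive.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's digest with a published one (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()


# Demonstration on a temporary file; for a real check, pass the path of the
# downloaded archive and the SHA256 value from the listing above.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    demo_path = tmp.name

expected = hashlib.sha256(b"hello").hexdigest()
ok = matches_published_hash(demo_path, expected)
bad = matches_published_hash(demo_path, "0" * 64)
os.remove(demo_path)
print(ok, bad)
```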

File details

Details for the file qpx_tabular-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: qpx_tabular-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 19.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for qpx_tabular-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 7876263422de118a5a811927880e6cdaf66e2fd9ea6416a3f6815b16c339238b
MD5 c98ba65cf05cac1c787c64f27265fc5d
BLAKE2b-256 25e9fd282a224c95309681fd09b34d16befdbb73c3b0780284d34ed7812271b9