

QPX Tabular

QPX Tabular is a production-ready library for preprocessing and visualizing tabular data, designed to accelerate data science workflows. It turns raw, messy pandas DataFrames into machine-learning-ready datasets with a single function call.

Features

  • Automated Preprocessing (auto_preprocess): Handles missing values, drops constant columns, drops high-cardinality nominal columns, encodes categorical columns, and downcasts dtypes to reduce memory usage.
  • Fail-Loud Architecture: Built for production. Instead of failing silently, QPX raises an exception (KeyError, ValueError) as soon as you provide an invalid data configuration.
  • Comprehensive Data Health Diagnostics: Get a full overview of your dataset's health via dataset_health and statistical_snapshot.
  • Beautiful Visualizations: One-line correlation heatmaps, distribution plots, and hierarchical feature clustering matrices.
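
The fail-loud idea from the list above can be sketched in a few lines of plain Python. This is only an illustration of the pattern, not QPX's code: the helper name validate_config and its exact checks are assumptions made for this example. The parameter names target and max_onehot are borrowed from the Quickstart.

```python
def validate_config(columns, target=None, max_onehot=10):
    """Hypothetical helper: reject a bad configuration up front instead of continuing silently."""
    columns = list(columns)
    # A column that does not exist is a failed lookup, so KeyError is the natural signal.
    if target is not None and target not in columns:
        raise KeyError(f"target column {target!r} not found; available columns: {columns}")
    # A wrongly typed or out-of-range option is a bad value, hence ValueError.
    if isinstance(max_onehot, bool) or not isinstance(max_onehot, int) or max_onehot < 1:
        raise ValueError(f"max_onehot must be a positive integer, got {max_onehot!r}")
    return True


# The caller learns about the problem immediately, with a message naming the cause.
try:
    validate_config(["age", "city"], target="income")
except KeyError as exc:
    print(f"Configuration rejected: {exc}")
```

Because the function accepts any iterable of column names, it could be called with df.columns from a pandas DataFrame as well.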

Installation

Install qpx-tabular from PyPI with pip install qpx-tabular, or install from source by cloning this repository and installing it locally with pip:

git clone https://github.com/punitxdev/QPX.git
cd QPX
pip install -e .

Dependencies

  • pandas
  • numpy
  • matplotlib
  • seaborn
  • scipy

Quickstart

Clean an entire dataset with one function:

import pandas as pd
from qpx.tabular import preprocessing

# Load your raw data
df = pd.read_csv("my_messy_data.csv")

# Clean, encode, impute, and downcast in one go!
clean_df, report = preprocessing.auto_preprocess(
    df,
    max_onehot=10,
    return_report=True
)

print(report)
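
To make the steps listed under Features more concrete, here is a rough sketch of the same kind of pipeline in plain pandas. It is not QPX's implementation: the function name sketch_preprocess, the median/mode imputation strategy, and the reading of max_onehot as a cap on the number of categories are all assumptions made for illustration.

```python
import pandas as pd


def sketch_preprocess(df, max_onehot=10):
    """Illustrative stand-in for a preprocessing pipeline; not QPX's implementation."""
    out = df.copy()

    # 1. Drop constant columns (at most one distinct value, counting NaN as a value).
    constant = [c for c in out.columns if out[c].nunique(dropna=False) <= 1]
    out = out.drop(columns=constant)

    # 2. Impute missing values: median for numeric columns, most frequent value otherwise.
    for col in out.columns:
        if out[col].isna().any():
            if pd.api.types.is_numeric_dtype(out[col]):
                out[col] = out[col].fillna(out[col].median())
            else:
                out[col] = out[col].fillna(out[col].mode().iloc[0])

    # 3. Treat non-numeric columns as nominal and drop those with too many categories.
    nominal = [c for c in out.columns if not pd.api.types.is_numeric_dtype(out[c])]
    too_wide = [c for c in nominal if out[c].nunique() > max_onehot]
    out = out.drop(columns=too_wide)

    # 4. Downcast numeric columns to the smallest dtype that holds their values.
    for col in out.columns:
        if pd.api.types.is_integer_dtype(out[col]):
            out[col] = pd.to_numeric(out[col], downcast="integer")
        elif pd.api.types.is_float_dtype(out[col]):
            out[col] = pd.to_numeric(out[col], downcast="float")

    # 5. One-hot encode the nominal columns that remain.
    keep = [c for c in nominal if c not in too_wide]
    if keep:
        out = pd.get_dummies(out, columns=keep, dtype="uint8")
    return out


raw = pd.DataFrame({
    "age": [25.0, None, 31.0, 40.0],
    "visits": [1, 2, 3, 4],
    "city": ["Pune", "Delhi", None, "Pune"],
    "user_id": ["u1", "u2", "u3", "u4"],     # one category per row: high cardinality
    "source": ["web", "web", "web", "web"],  # constant column
})
clean = sketch_preprocess(raw, max_onehot=3)
print(clean.dtypes)
```

The order of the steps is a design choice in this sketch: imputing before encoding keeps missing categories from becoming all-zero dummy rows.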

Generate a deep-dive correlation map:

from qpx.tabular import visuals

visuals.corr_map(clean_df, target="my_target_column")
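
How corr_map computes its values is not described here. As a point of comparison, the numbers behind a target-focused correlation view can be sketched with pandas alone. The helper target_correlations below is hypothetical, and the use of Pearson correlation (the pandas default) is an assumption.

```python
import pandas as pd


def target_correlations(df, target):
    """Pearson correlation of each numeric column with `target`, strongest first."""
    numeric = df.select_dtypes(include="number")
    if target not in numeric.columns:
        raise KeyError(f"{target!r} is not a numeric column of the DataFrame")
    # Take the target's column of the correlation matrix and drop its self-correlation.
    corr = numeric.corr()[target].drop(target)
    # Rank by absolute strength so strong negative relationships are not buried.
    order = corr.abs().sort_values(ascending=False).index
    return corr.loc[order]


demo = pd.DataFrame({
    "x": [1, 2, 3, 4],
    "noise": [1, -1, 1, -1],
    "label": ["a", "b", "a", "b"],  # non-numeric, so it is ignored
    "y": [2, 4, 6, 8],
})
ranked = target_correlations(demo, target="y")
print(ranked)
```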

Documentation

The complete API reference and user guide are hosted online at: https://punitxdev.github.io/QPX/

If you want to build the documentation locally for development:

pip install -e ".[dev]"
mkdocs serve

To publish the documentation to GitHub Pages, run:

mkdocs gh-deploy

License

This project is licensed under the MIT License - see the LICENSE file for details.

Made with love by Punit

Download files

Source Distribution

qpx_tabular-0.1.2.tar.gz (19.7 kB)

Uploaded Source

Built Distribution

qpx_tabular-0.1.2-py3-none-any.whl (19.0 kB)

Uploaded Python 3

File details

Details for the file qpx_tabular-0.1.2.tar.gz.

File metadata

  • Download URL: qpx_tabular-0.1.2.tar.gz
  • Upload date:
  • Size: 19.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for qpx_tabular-0.1.2.tar.gz
Algorithm    Hash digest
SHA256       511c6a86c272d9bffa46d8346c7d70580aaa2b8ff0511e4f2fdf7e351a16be0c
MD5          078e1bdd7109527b7728177170427bba
BLAKE2b-256  2b83055e88a384df52e101776ef70a61ae5df27ab91eb3a1a0b4e057e645b390

File details

Details for the file qpx_tabular-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: qpx_tabular-0.1.2-py3-none-any.whl
  • Upload date:
  • Size: 19.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for qpx_tabular-0.1.2-py3-none-any.whl
Algorithm Hash digest
SHA256       c3a982479e21074fb08c27d4d08fa20e8698a6af44e796fa21e57e44edc4f1cc
MD5          54f032f937b916e7507a1c0287773207
BLAKE2b-256  380e13fa7010ebbebc2e88031f55993cb27a6e2a4a1b0bb63e10d9c9193cdca2
