Zero-friction AutoML + Data Cleaning Toolkit
🚀 KaizenStat
KaizenStat is a zero-friction toolkit for data validation, automatic cleaning, and AutoML benchmarking. It diagnoses datasets, repairs common issues, trains baseline models, generates standalone Python code, and launches interactive dashboards. Each step is a single command, and `kz auto` chains audit, heal, and benchmark into one.
✨ Features
| Command | What it does |
|---|---|
| `kz audit` | 🔍 Diagnostic sweep: duplicates, NaNs, infs, ID columns, imbalance |
| `kz heal` | 🩹 Auto-clean: impute, deduplicate, drop dead columns |
| `kz benchmark` | 🚀 Train & rank ML models with cross-validation |
| `kz auto` | ⚡ Full pipeline in one command (audit → heal → benchmark) |
| `kz explain` | 💬 Plain-English summary of findings and recommendations |
| `kz codegen` | 📝 Generate a standalone Python training script |
| `kz export-model` | 💾 Train best model and save to .joblib |
| `kz report` | 📊 Generate interactive HTML report with charts |
| `kz serve` | 🌐 Launch interactive Streamlit web dashboard |
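To make the audit row concrete, here is a minimal pandas sketch of the kinds of checks it names (duplicates, NaNs, infs, ID columns, imbalance). It is an illustration only, not KaizenStat's implementation: `quick_audit`, its return keys, and the "unique non-float column" rule for ID detection are all assumptions made for this example.

```python
import numpy as np
import pandas as pd


def quick_audit(df: pd.DataFrame, target: str) -> dict:
    """Hypothetical helper (not part of KaizenStat): basic dataset diagnostics."""
    numeric = df.select_dtypes(include="number")
    features = df.drop(columns=[target])
    class_shares = df[target].value_counts(normalize=True)
    return {
        # Fully identical rows beyond their first occurrence
        "duplicate_rows": int(df.duplicated().sum()),
        # Columns that contain missing values, with their counts
        "nan_counts": {c: int(n) for c, n in df.isna().sum().items() if n > 0},
        # Numeric columns that contain +inf or -inf
        "inf_counts": {c: int(n) for c, n in np.isinf(numeric).sum().items() if n > 0},
        # Assumed heuristic: a non-float column whose values are all distinct
        "id_like_columns": [
            c for c in features.columns
            if features[c].is_unique and not pd.api.types.is_float_dtype(features[c])
        ],
        # Share of the rarest target value (only meaningful for classification)
        "minority_class_share": float(class_shares.min()),
    }


df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6],
    "x": [0.5, np.nan, np.inf, 1.5, 2.0, 2.5],
    "label": [0, 0, 0, 0, 0, 1],
})
report = quick_audit(df, target="label")
print(report)
```

A real audit would likely use thresholds (for example, flagging imbalance only below some minority share) rather than reporting raw numbers.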
📦 Installation
pip install kaizenstat
Optional extras:
pip install "kaizenstat[ui]" # + Streamlit dashboard
pip install "kaizenstat[gpu]" # + XGBoost GPU support
pip install "kaizenstat[fast]" # + Polars fast data loading
pip install "kaizenstat[all]" # everything (quotes keep shells such as zsh from expanding the brackets)
🚀 Quick Start
Python API
from kaizenstat import KaizenStat
# Full pipeline in one call
KaizenStat.auto("data.csv", target="price")
# Or step-by-step
import pandas as pd
df = pd.read_csv("data.csv")
KaizenStat.audit(df, target="price")
df_clean = KaizenStat.heal(df, target="price")
results = KaizenStat.benchmark(df_clean, target="price")
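For intuition about the heal step in the sequence above, the following sketch performs the three repairs the feature table lists (impute, deduplicate, drop dead columns) with plain pandas. It is not KaizenStat's code: `quick_heal`, the median/mode imputation choice, and the definition of a "dead" column as one with at most one distinct value are assumptions for illustration.

```python
import numpy as np
import pandas as pd


def quick_heal(df: pd.DataFrame, target: str) -> pd.DataFrame:
    """Hypothetical helper (not part of KaizenStat): basic automatic cleaning."""
    # Treat infinities as missing so they are imputed with everything else
    out = df.replace([np.inf, -np.inf], np.nan)
    # Deduplicate, then drop rows whose target is missing
    out = out.drop_duplicates().dropna(subset=[target])
    # Assumed definition of "dead": constant or entirely missing feature columns
    dead = [c for c in out.columns if c != target and out[c].nunique(dropna=True) <= 1]
    out = out.drop(columns=dead)
    # Impute: median for numeric features, most frequent value otherwise
    for col in out.columns:
        if col == target or not out[col].isna().any():
            continue
        if pd.api.types.is_numeric_dtype(out[col]):
            out[col] = out[col].fillna(out[col].median())
        else:
            out[col] = out[col].fillna(out[col].mode().iloc[0])
    return out.reset_index(drop=True)


raw = pd.DataFrame({
    "size": [50.0, 50.0, np.nan, 80.0, np.inf],
    "city": ["A", "A", None, "B", "B"],
    "const": [1, 1, 1, 1, 1],
    "price": [100.0, 100.0, 150.0, 200.0, 250.0],
})
healed = quick_heal(raw, target="price")
print(healed)
```

On this toy frame the duplicate row and the constant column are removed, and the remaining gaps are filled, leaving four complete rows.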
💬 Get a Plain-English Explanation
KaizenStat.explain("data.csv", target="price")
📝 Generate Standalone Code
KaizenStat.codegen("data.csv", target="price", output_path="deploy.py")
💾 Export & Load Models
# Train + save
KaizenStat.auto("data.csv", target="price")
KaizenStat.save_model(path="model.joblib")
# Load later
pipeline = KaizenStat.load_model("model.joblib")
# new_data is not defined in this snippet; supply your own rows to score
predictions = pipeline.predict(new_data)
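The save/load round trip above can be pictured with joblib and scikit-learn directly. This sketch assumes the exported artifact behaves like a fitted scikit-learn pipeline, which the `.predict` call suggests but the README does not state; the model, data, and file location here are invented for the example.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic training data standing in for a cleaned dataset
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 3.0

# Fit a small preprocessing + model pipeline
pipeline = make_pipeline(StandardScaler(), LinearRegression()).fit(X, y)

# Persist to a .joblib file and load it back
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.joblib")
    joblib.dump(pipeline, path)
    restored = joblib.load(path)

# The restored pipeline should reproduce the original predictions
new_data = rng.normal(size=(5, 3))
same = np.allclose(pipeline.predict(new_data), restored.predict(new_data))
print("predictions match after reload:", same)
```

Because joblib files are pickle-based, they should only be loaded from sources you trust, and ideally with the same library versions used to save them.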
📊 Generate HTML Report
KaizenStat.report("data.csv", target="price", output_path="report.html")
🌐 Launch Web Dashboard
KaizenStat.serve("data.csv", target="price")
💻 CLI Usage
# Diagnostic sweep
kz audit data.csv --target price
# Auto-clean dataset
kz heal data.csv --target price -o clean.csv
# Train & rank models
kz benchmark clean.csv --target price
# Full pipeline
kz auto data.csv --target price
# Plain-English summary
kz explain data.csv --target price
# Generate standalone Python script
kz codegen data.csv --target price -o deploy.py
# Train best model and export
kz export-model data.csv --target price -o model.joblib
# Generate interactive HTML report
kz report data.csv --target price -o report.html
# Launch web dashboard
kz serve data.csv --target price
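As a rough picture of what "train and rank models with cross-validation" involves, the sketch below scores a few scikit-learn regressors with 5-fold cross-validation and sorts them by mean R². The candidate models, the metric, and the synthetic data are assumptions chosen for this example; the README does not say which models or metrics `kz benchmark` uses.

```python
from sklearn.datasets import make_regression
from sklearn.dummy import DummyRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

# Synthetic regression problem standing in for a cleaned dataset
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Candidate models (an assumed shortlist, not KaizenStat's)
candidates = {
    "dummy_mean": DummyRegressor(strategy="mean"),
    "ridge": Ridge(alpha=1.0),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
}

# Mean R^2 over 5 folds for each candidate
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    for name, model in candidates.items()
}

# Rank from best to worst
ranking = sorted(scores.items(), key=lambda item: item[1], reverse=True)
for name, score in ranking:
    print(f"{name:15s} mean R^2 = {score:.3f}")
```

Including a trivial baseline such as `DummyRegressor` makes it easy to see whether the other models learn anything beyond predicting the mean.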
🛠 Development
git clone https://github.com/yourusername/kaizenstat.git
cd kaizenstat
pip install -e ".[all]"
📄 License
Distributed under the MIT License.
File details
Details for the file kaizenstat-0.2.0.tar.gz.
File metadata
- Download URL: kaizenstat-0.2.0.tar.gz
- Upload date:
- Size: 17.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2d8160dad9e8de3bc287376d013b4d7e29028d1cf6af1a74e8e0b60346ab1a3e |
| MD5 | 94d38ab32f179b062e0f4eaf08dc028d |
| BLAKE2b-256 | b20cb3c8d6f36984003148ce4e421fadf4cb5c1d5cf89a442ce143ee71ac797a |
File details
Details for the file kaizenstat-0.2.0-py3-none-any.whl.
File metadata
- Download URL: kaizenstat-0.2.0-py3-none-any.whl
- Upload date:
- Size: 16.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c1228d4b171f9db56bd161cffec0be2393186f454b414a582c9d25b0ce6f41a5 |
| MD5 | e468a083453dca9d7c0c086fee39600f |
| BLAKE2b-256 | 1a2b2e00f64f8c8d0e53b64632048e255dc82b27d62955479028504712a2dae4 |