Advising-first AutoML: EDA, preprocessing, selection/extraction, training, metrics, plots.
Project description
softauto 0.7.0 — Advising‑first AutoML
softauto is a light, zero‑boilerplate AutoML toolkit that starts with AI advice: it inspects your dataframe, infers the task, suggests preprocessing & model shortlists, then trains, evaluates, and saves the best pipeline — with plots and a JSON summary.
- ✅ Advisor: suggests task, preprocessing, feature selection, PCA, and model shortlists (with reasons)
- ✅ User‑choice or Auto: pass a model name or let softauto pick
- ✅ EDA: target distribution, missingness, correlation matrix
- ✅ Preprocessing: impute, encode, scale, rare‑category binning, outlier clip
- ✅ Selection/Extraction: Mutual Info, RFECV, PCA
- ✅ Training/Testing: robust CV, leaderboard, hold‑out metrics, artifacts + saved model
- ✅ Boosters optional: XGBoost, LightGBM, CatBoost auto‑detected (no hard dependency)
- ✅ Tiny API: autorun(df, target, task=...) or Runner(...).fit()
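To make the "infers the task" step concrete, here is a minimal sketch of one common heuristic. The function name `infer_task` and the `max_classes` threshold are hypothetical and chosen for illustration; softauto's own rule is not documented here and may differ.

```python
from numbers import Real


def infer_task(target_values, max_classes=20):
    """Guess 'classification' or 'regression' from a target column.

    Hypothetical heuristic for illustration only, not softauto's rule.
    """
    values = [v for v in target_values if v is not None]
    if not values:
        raise ValueError("target column has no non-missing values")
    # Non-numeric labels (strings, booleans) can only be class labels.
    if any(isinstance(v, bool) or not isinstance(v, Real) for v in values):
        return "classification"
    n_unique = len(set(values))
    all_integral = all(float(v).is_integer() for v in values)
    # A small number of distinct integer-like values looks like class labels.
    if all_integral and n_unique <= max_classes:
        return "classification"
    return "regression"
```

Passing `task=...` explicitly avoids relying on any such guess when the target is ambiguous, for example an integer count with few distinct values.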
Installation
Core (lightweight):
pip install softauto
With boosters & imbalance extras:
pip install "softauto[all]"
# or pick subsets:
# pip install "softauto[boosters]"
# pip install "softauto[imbalance]"
Boosters are optional; if not present they’re skipped silently.
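Optional dependencies of this kind are usually detected without importing them. The sketch below shows one way to do that with the standard library; `OPTIONAL_BOOSTERS` and `available_boosters` are hypothetical names, and this is not necessarily how softauto performs its detection.

```python
from importlib.util import find_spec

# Short model name -> importable package name for each optional booster.
OPTIONAL_BOOSTERS = {"xgb": "xgboost", "lgbm": "lightgbm", "catboost": "catboost"}


def available_boosters(candidates=OPTIONAL_BOOSTERS):
    """Return the short names whose package is installed, skipping the rest."""
    return [
        name
        for name, module in candidates.items()
        if find_spec(module) is not None  # None means the package is absent
    ]


print(available_boosters())
```

Because missing packages are simply left out of the list, a run without boosters proceeds with the core models only.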
Quick Start (Classification)
import pandas as pd
from sklearn.datasets import load_breast_cancer
from softauto import autorun
cancer = load_breast_cancer(as_frame=True)
df = cancer.frame.copy()
# The label column is already named "target", so no renaming is needed.
res = autorun(df, target="target", task="classification", report_dir="report_cls", model="auto")
print(res["best_model"], res["test_metrics"])
# Artifacts in report_cls/: target_dist.png, missingness.png, corr_matrix.png, best_<model>.joblib, summary.json
Quick Start (Regression)
from sklearn.datasets import fetch_california_housing
from softauto import autorun
cal = fetch_california_housing(as_frame=True)
df = cal.frame.copy()
df = df.rename(columns={"MedHouseVal": "target"})  # use a generic target column name
res = autorun(df, target="target", task="regression", report_dir="report_reg", model="auto")
print(res["best_model"], res["test_metrics"])
API
autorun(df, target, task=None, **kwargs) -> dict
Single‑shot run. Returns a summary dict with:
- task
- best_model
- cv_leaderboard
- test_metrics
- artifacts (plot paths)
- model_path
- advisor_notes
- selected_features
Common kwargs:
- report_dir="softauto_artifacts", random_state=42, test_size=0.2
- model="auto" or list/str of model names ("random_forest", "logreg", "xgb", "lgbm", "catboost", ...)
- Preprocess: scale, one_hot, cat_min_freq, numeric_impute, categorical_impute
- Selection: feature_selection=("mutual_info"|"rfecv"|None), top_k_features, pca_components
- CV: cv=5
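The preprocessing, selection, and CV options correspond to familiar scikit-learn building blocks. The sketch below wires up roughly what imputation, scaling, mutual-information selection of the top 10 features, and 5-fold CV look like in plain scikit-learn. It is an illustration under those assumptions, not softauto's internal pipeline, and the specific choices (median imputation, logistic regression) are made up for the example.

```python
from functools import partial

from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True, as_frame=True)

pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="median")),   # fill missing numerics
    ("scale", StandardScaler()),                    # zero mean, unit variance
    # Keep the 10 features with the highest mutual information with y.
    ("select", SelectKBest(partial(mutual_info_classif, random_state=42), k=10)),
    ("model", LogisticRegression(max_iter=1000)),
])

# Every step is refit inside each fold, so selection never sees held-out rows.
scores = cross_val_score(pipeline, X, y, cv=5, scoring="accuracy")
print(f"mean CV accuracy: {scores.mean():.3f}")
```

Keeping selection inside the pipeline, rather than applying it once to the full dataset, is what prevents information from the validation folds leaking into the feature choice.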
Runner
from softauto import Runner
r = Runner(df, target="target", task="classification", report_dir="report")
summary = r.fit()
y_pred = r.predict(r.X_test_) # after fit
Artifacts
- target_dist.png: class/target distribution
- missingness.png: top-30 missing rates
- corr_matrix.png: numeric correlation heatmap (skipped if the target is not numeric)
- best_<model>.joblib: saved sklearn pipeline
- summary.json: complete run data
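After a run, the artifacts can be picked up from the report directory with the standard library alone. The helper below is a hypothetical sketch (`load_report` is not part of softauto); it relies only on the file names listed above and makes no assumption about the keys inside summary.json.

```python
import json
from pathlib import Path


def load_report(report_dir):
    """Read summary.json and list the saved model files in a report directory."""
    report = Path(report_dir)
    with open(report / "summary.json", encoding="utf-8") as fh:
        summary = json.load(fh)
    # The model file name embeds the winning model, so match it by pattern.
    model_files = sorted(report.glob("best_*.joblib"))
    return summary, model_files
```

Since the saved file is described as an sklearn pipeline stored with joblib, `joblib.load(model_files[0])` should restore it for prediction, provided compatible versions of scikit-learn and any boosters are installed.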
Model Names
Classification: logreg, random_forest, gb, svm_rbf, knn, mlp, (xgb, lgbm, catboost if installed)
Regression: linear, ridge, lasso, random_forest, gb, svr_rbf, knn, mlp, (xgb, lgbm, catboost if installed)
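Because the valid names differ by task (for example `svm_rbf` versus `svr_rbf`), it can help to check a `model` argument before starting a long run. The sketch below is a hypothetical helper built from the two lists above; in particular, treating "auto" as "all candidates" is an assumption made for the example, not documented softauto behaviour.

```python
MODEL_NAMES = {
    "classification": ["logreg", "random_forest", "gb", "svm_rbf", "knn", "mlp",
                       "xgb", "lgbm", "catboost"],
    "regression": ["linear", "ridge", "lasso", "random_forest", "gb", "svr_rbf",
                   "knn", "mlp", "xgb", "lgbm", "catboost"],
}


def resolve_models(task, model="auto"):
    """Turn 'auto', a single name, or a list of names into a validated list."""
    if task not in MODEL_NAMES:
        raise ValueError(f"unknown task: {task!r}")
    valid = MODEL_NAMES[task]
    if model == "auto":
        return list(valid)  # assumption: 'auto' considers every candidate
    requested = [model] if isinstance(model, str) else list(model)
    unknown = [name for name in requested if name not in valid]
    if unknown:
        raise ValueError(f"not valid for {task}: {unknown}")
    return requested
```

For instance, `resolve_models("regression", "svm_rbf")` raises a `ValueError`, which catches the classification/regression name mix-up early.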
License
MIT © Soft Tech Talks
Download files
File details
Details for the file softauto-0.7.0.tar.gz.
File metadata
- Download URL: softauto-0.7.0.tar.gz
- Upload date:
- Size: 12.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 867db35a9bc18d97d44ce01c08ecdaeb5b0ac0c1fe14ebe55f212e272105a7e3 |
| MD5 | 14727b1c8c9645a637464feac3cfe3c5 |
| BLAKE2b-256 | ab4f19ac33a0aea64c7e422c0fc8ecf9abd943dd8b36029a839fe53986f75622 |
File details
Details for the file softauto-0.7.0-py3-none-any.whl.
File metadata
- Download URL: softauto-0.7.0-py3-none-any.whl
- Upload date:
- Size: 12.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b0b0da2d35b2918edca75cddeae3314490552c43f0684187f5f48938a489380c |
| MD5 | ce6d18b0d526a8112a74549eb70c089b |
| BLAKE2b-256 | 52abeb40f22d6b1272b5ce210d456880a5fcecc902a72ccd341e09160efce711 |