Throw any data, get a working model. One-click AutoML with intelligent preprocessing, feature engineering, and ensemble optimization.
AutoThink
Throw any data, get a working model.
One-click AutoML for tabular data.
Auto-detects task type • Engineers features • Trains LightGBM + XGBoost + CatBoost • Optimizes blend weights
All in a single function call.
Quickstart
```bash
pip install autothink
```

```python
import pandas as pd
from autothink import fit

df = pd.read_csv("train.csv")
model = fit(df, target="price")
predictions = model.predict(pd.read_csv("test.csv"))
```
That's it: load the data, fit, predict.
How It Works
```
   Your DataFrame
          |
          v
+--------------------+     +-------------------------+     +---------------------+
| Task Detection     | --> | Intelligent             | --> | Adaptive Feature    |
| binary / multiclass|     | Preprocessing           |     | Engineering         |
| / regression       |     | missing values, encode, |     | learns thresholds & |
|                    |     | scale                   |     | interactions        |
+--------------------+     +-------------------------+     +---------------------+
                                                                      |
                                                                      v
+--------------------+     +-------------------------+     +---------------------+
| Verification       | <-- | Blend Optimization      | <-- | Ensemble Training   |
| fold stability,    |     | scipy-optimized weights |     | LightGBM + XGBoost  |
| leakage check      |     | + Platt calibration     |     | + CatBoost (K-fold) |
+--------------------+     +-------------------------+     +---------------------+
          |
          v
model.predict(test_df)
```
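The first stage, task detection, can be illustrated with a small heuristic. This is only a sketch of the general idea, not AutoThink's actual logic: the function name `detect_task`, the `max_classes` cutoff of 20, and the rules themselves are assumptions made for illustration.

```python
from numbers import Real


def detect_task(target_values, max_classes=20):
    """Guess the task type from a target column (illustrative heuristic only)."""
    values = [v for v in target_values if v is not None]
    n_unique = len(set(values))
    if n_unique < 2:
        raise ValueError("target needs at least two distinct values")
    if n_unique == 2:
        return "binary"

    # bool is a numeric type in Python, so exclude it explicitly.
    all_numeric = all(
        isinstance(v, Real) and not isinstance(v, bool) for v in values
    )
    if not all_numeric:
        return "multiclass"

    # Fractional values or many distinct values suggest a continuous target.
    has_fraction = any(not float(v).is_integer() for v in values)
    if has_fraction or n_unique > max_classes:
        return "regression"
    return "multiclass"


print(detect_task([0, 1, 1, 0]))
print(detect_task(["cat", "dog", "bird", "cat"]))
print(detect_task([199.5, 250.0, 310.25, 420.75]))
```

A real detector would also need to handle edge cases this sketch ignores, such as integer-coded regression targets with few distinct values.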
| Step | What happens |
|---|---|
| Task detection | Determines binary, multiclass, or regression from the target column |
| Data validation | Checks for leakage, class imbalance, and quality issues |
| Preprocessing | Handles missing values, one-hot / target-encodes categoricals, scales numerics |
| Feature engineering | Learns optimal split thresholds and feature interactions from data |
| Ensemble training | Trains LightGBM, XGBoost, and CatBoost with adaptive hyperparameters |
| Blend optimization | Finds optimal ensemble weights via scipy on out-of-fold predictions |
| Calibration | Platt scaling for well-calibrated probabilities |
| Verification | Post-training diagnostics: fold variance, leakage, feature importance |
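To make the blend optimization step concrete, here is a minimal sketch of choosing ensemble weights from out-of-fold (OOF) predictions. It deliberately swaps the scipy optimizer mentioned above for a plain grid search over the weight simplex, so it runs with the standard library alone. The function names, the MSE objective, and the toy data are all assumptions for illustration, not AutoThink's implementation.

```python
from itertools import product


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def blend(preds, weights):
    """Weighted average of per-model prediction lists."""
    n_rows = len(preds[0])
    return [sum(w * p[i] for w, p in zip(weights, preds)) for i in range(n_rows)]


def optimize_blend_weights(y_true, oof_preds, steps=20):
    """Grid-search non-negative weights summing to 1 that minimize OOF MSE."""
    n_models = len(oof_preds)
    best_weights, best_loss = None, float("inf")
    # Enumerate grid points on the simplex: the last weight takes the remainder.
    for combo in product(range(steps + 1), repeat=n_models - 1):
        if sum(combo) > steps:
            continue
        weights = [c / steps for c in combo] + [(steps - sum(combo)) / steps]
        loss = mse(y_true, blend(oof_preds, weights))
        if loss < best_loss:
            best_weights, best_loss = weights, loss
    return best_weights, best_loss


# Toy OOF predictions: two models with opposite bias and one uninformative model.
y = [1.0, 2.0, 3.0, 4.0]
oof = [
    [1.2, 2.2, 3.2, 4.2],  # consistently too high
    [0.8, 1.8, 2.8, 3.8],  # consistently too low
    [2.0, 2.0, 2.0, 2.0],  # constant guess
]
weights, loss = optimize_blend_weights(y, oof)
print(weights, loss)
```

Fitting the weights on out-of-fold predictions rather than in-sample ones matters: in-sample predictions from gradient-boosted models are over-optimistic, so weights tuned on them tend to favor whichever model overfits most.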
Installation
From PyPI:
```bash
pip install autothink
```
From source:
```bash
git clone https://github.com/ranausmanai/autothink.git
cd autothink
pip install -e .
```
With optional extras:
```bash
# Quote the extras so shells such as zsh do not expand the brackets
pip install "autothink[dev]"   # pytest
pip install "autothink[api]"   # FastAPI serving
pip install "autothink[onnx]"  # ONNX export
```
API Reference
fit(df, target, **kwargs)
One-line AutoML. Returns a fitted AutoThinkV4 instance.
| Parameter | Type | Default | Description |
|---|---|---|---|
| `df` | `DataFrame` | required | Training data (features + target) |
| `target` | `str` | required | Name of the target column |
| `time_budget` | `int` | `600` | Maximum training time in seconds |
| `verbose` | `bool` | `True` | Log progress to console |
AutoThinkV4
```python
from autothink import AutoThinkV4

model = AutoThinkV4(time_budget=300, verbose=True)
model.fit(df, target_col="price")
preds = model.predict(test_df)
```
Attributes after fitting:
| Attribute | Description |
|---|---|
| `model.cv_score` | Mean cross-validation score |
| `model.cv_std` | CV score standard deviation |
| `model.task_info` | Detected task type, metric, class info |
| `model.verification_report` | Post-training diagnostics |
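For readers unfamiliar with how a mean CV score and its standard deviation come about, the sketch below builds shuffled K-fold splits and summarizes per-fold scores. It is an illustration only: the helper names are hypothetical, the fold scores are made-up numbers, and whether AutoThink reports a population or sample standard deviation is not stated in this document (the population form is assumed here).

```python
import random
import statistics


def kfold_indices(n_rows, n_splits=5, seed=42):
    """Yield (train_idx, valid_idx) pairs for shuffled K-fold cross-validation."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    # Deal shuffled indices round-robin into n_splits folds.
    folds = [indices[i::n_splits] for i in range(n_splits)]
    for k in range(n_splits):
        valid = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, valid


def summarize_cv(fold_scores):
    """Return (mean, population standard deviation) of per-fold scores."""
    return statistics.mean(fold_scores), statistics.pstdev(fold_scores)


splits = list(kfold_indices(n_rows=23, n_splits=5))

# Made-up per-fold scores, standing in for one model evaluation per fold.
fold_scores = [0.90, 0.92, 0.91, 0.89, 0.93]
cv_score, cv_std = summarize_cv(fold_scores)
print(f"CV score: {cv_score:.3f} +/- {cv_std:.3f}")
```

A small standard deviation relative to the mean suggests the score is stable across folds; a large one is the kind of signal a fold-stability check would flag.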
Logging
AutoThink uses Python's logging module. The library is silent by default.
```python
import autothink

autothink.setup_logging()  # Enable INFO-level output to stderr
```
Alternatively, leave verbose=True (the default), which auto-configures a console handler.
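Applications that already manage their own logging may prefer to attach a handler themselves using the standard library. The sketch below assumes the library logs under a logger named `"autothink"`, which this document does not confirm; the helper `enable_console_logging` is hypothetical and is not part of the AutoThink API.

```python
import logging
import sys

# Assumption: AutoThink's loggers live under the "autothink" namespace.
logger = logging.getLogger("autothink")


def enable_console_logging(level=logging.INFO):
    """Attach a stderr handler to the (assumed) library logger."""
    handler = logging.StreamHandler(sys.stderr)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s: %(message)s")
    )
    logger.addHandler(handler)
    logger.setLevel(level)
    return handler


handler = enable_console_logging()
logger.info("console logging enabled")

# Detach the handler again to return to silent operation.
logger.removeHandler(handler)
```

Keeping a reference to the handler makes it easy to remove later, which avoids duplicate log lines if the setup code runs more than once.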
Benchmarks
AutoThink V4 is competitive with FLAML and AutoGluon on standard tabular tasks:
| Dataset | AutoThink V4 | FLAML | AutoGluon |
|---|---|---|---|
| Heart Disease (AUC) | 0.918 | 0.912 | 0.920 |
| Loan Default (AUC) | 0.874 | 0.869 | 0.871 |
| House Price (RMSE) | 30,241 | 31,102 | 29,876 |
Each run used a 60-second time budget, a single 80/20 train/test split, and seed=42. Higher AUC is better; lower RMSE is better.
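The evaluation protocol described above (one seeded 80/20 split, scored with AUC or RMSE) can be sketched with the standard library alone. This does not reproduce the numbers in the table: the helper names are hypothetical, the inputs are toy values, and the actual benchmark script may split and score differently.

```python
import math
import random


def train_test_split_indices(n_rows, test_fraction=0.2, seed=42):
    """Return (train_idx, test_idx) for a single shuffled hold-out split."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_rows * test_fraction))
    return indices[n_test:], indices[:n_test]


def rmse(y_true, y_pred):
    """Root mean squared error (lower is better)."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def roc_auc(y_true, y_score):
    """AUC as the chance a random positive outranks a random negative."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    # Ties between a positive and a negative count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


train_idx, test_idx = train_test_split_indices(100)
print(len(train_idx), len(test_idx))
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

Because a single split is used, differences of a few thousandths in AUC between tools may fall within split-to-split noise; repeating the comparison over several seeds would give a firmer picture.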
Examples
See the examples/ directory:
| Example | Description |
|---|---|
| `quickstart.py` | Minimal 15-line fit/predict on sklearn data |
| `kaggle_competition.py` | Full Kaggle pipeline with CLI and submission output |
| `benchmark.py` | Compare AutoThink against FLAML |
Project Structure
```
autothink/
    __init__.py                         # Public API: fit(), setup_logging()
    core/
        autothink_v4.py                 # Main engine (TaskDetector, IntelligentEnsemble, AutoThinkV4)
        autothink_v3.py                 # V3 engine (Kaggle-optimized)
        autothink_v2.py                 # V2 engine (meta-learning)
        preprocessing.py                # IntelligentPreprocessor, FeatureEngineer
        feature_engineering_general.py  # Adaptive, data-driven feature engineering
        validation.py                   # DataValidator, LeakageDetector
        meta_learning.py                # MetaLearningDB, dataset fingerprinting
        production.py                   # ModelExporter, ModelCard, DriftDetector, APIGenerator
        advanced.py                     # CausalAutoML, ExplanationEngine, SmartEnsemble
        kaggle_beast.py                 # Competition-grade ensemble mode
        kaggle_fast.py                  # Fast Kaggle mode
tests/                                  # 25 tests (pytest)
examples/                               # Quickstart, Kaggle, benchmark
```
Contributing
Contributions are welcome! Please open an issue or submit a PR.
```bash
# Development setup
git clone https://github.com/ranausmanai/autothink.git
cd autothink
pip install -e ".[dev]"
pytest tests/
```
License
Apache 2.0 — see LICENSE.
Built with scikit-learn, LightGBM, XGBoost, and CatBoost.