vespatune: no-code training for tabular models
VespaTune
Gradient boosting + Optuna: a no-brainer
- Web UI for training, monitoring, and managing models
- Tune models directly from CSV files
- Real-time training progress with WebSocket updates
- Export models to ONNX format for deployment
Installation
Install using pip:
pip install vespatune
Quick Start
Web UI (Recommended)
Start the web interface:
vespatune
This launches the VespaTune UI at http://127.0.0.1:9999, where you can:
- Upload train/validation CSV files
- Configure model type, target columns, and hyperparameters
- Start training with real-time progress monitoring
- View trial results and metrics
- Download trained models and artifacts
- Manage multiple training runs
You can also specify host and port:
vespatune --host 0.0.0.0 --port 8080
CLI
Train a model:
vespatune train \
--train_filename train.csv \
--valid_filename valid.csv \
--output outputs/my_model \
--model xgboost
Make predictions:
vespatune predict \
--model_path outputs/my_model \
--test_filename test.csv \
--output_filename predictions.csv
Serve a trained model for predictions:
vespatune serve --model_path outputs/my_model --host 0.0.0.0 --port 8000
Python API
from vespatune import VespaTune
vtune = VespaTune(
train_filename="train.csv",
valid_filename="valid.csv",
output="outputs/my_model",
model_type="xgboost", # or "lightgbm" or "catboost"
targets=["target"],
num_trials=100,
time_limit=3600,
)
vtune.train()
Web UI Features
The web interface provides:
- File Upload: Drag and drop CSV files for training and validation
- Auto Column Detection: Automatically detects columns for target and ID selection
- Model Selection: Choose between XGBoost, LightGBM, or CatBoost
- Real-time Monitoring: Watch training progress with live trial updates via WebSocket
- Metrics Visualization: View loss curves and hyperparameter importance
- Run Management: Start, stop, and delete training runs
- Artifact Downloads: Download trained models, configs, and ONNX exports
Parameters
Required
| Parameter | Description |
|---|---|
| `train_filename` | Path to training CSV file |
| `valid_filename` | Path to validation CSV file |
| `output` | Path to output directory for model artifacts |
Optional
| Parameter | Default | Description |
|---|---|---|
| `model_type` | `"xgboost"` | Model to use: `"xgboost"`, `"lightgbm"`, or `"catboost"` |
| `test_filename` | `None` | Path to test CSV file (predictions saved if provided) |
| `task` | `None` | `"classification"` or `"regression"` (auto-detected if not specified) |
| `idx` | `"id"` | Name of the ID column |
| `targets` | `["target"]` | List of target column names |
| `features` | `None` | List of feature columns (all non-ID/target columns if not specified) |
| `categorical_features` | `None` | List of categorical columns (auto-detected if not specified) |
| `use_gpu` | `False` | Whether to use GPU for training |
| `seed` | `42` | Random seed for reproducibility |
| `num_trials` | `1000` | Number of Optuna trials for hyperparameter tuning |
| `time_limit` | `None` | Time limit for optimization in seconds |
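Several of these defaults are inferred from the data: features fall back to all non-ID/target columns, categorical columns are auto-detected, and the task is guessed from the target. VespaTune's actual heuristics are internal; the standard-library sketch below only illustrates the idea, and the `class_threshold` cutoff is an assumption of this example, not VespaTune's rule.

```python
import csv

def infer_schema(path, idx="id", targets=("target",), class_threshold=20):
    """Illustrative schema inference: features, categoricals, and task type."""
    with open(path, newline="") as f:
        rows = list(csv.DictReader(f))
    columns = list(rows[0].keys())

    # Features default to every column that is neither the ID nor a target.
    features = [c for c in columns if c != idx and c not in targets]

    def is_numeric(col):
        try:
            for row in rows:
                float(row[col])
            return True
        except ValueError:
            return False

    # Categorical columns: feature columns that fail numeric parsing.
    categorical = [c for c in features if not is_numeric(c)]

    # Task: a non-numeric target, or one with few distinct values, looks
    # like classification; anything else is treated as regression.
    target = targets[0]
    distinct = {row[target] for row in rows}
    if not is_numeric(target) or len(distinct) <= class_threshold:
        task = "classification"
    else:
        task = "regression"
    return features, categorical, task
```

If the auto-detection guesses wrong for your data, pass `task`, `features`, or `categorical_features` explicitly rather than relying on inference.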
Supported Models
XGBoost
- Default model with extensive hyperparameter search
- Supports GPU acceleration
- Best for general-purpose tasks
LightGBM
- Native categorical feature support
- Fast training on large datasets
- Supports GPU acceleration
CatBoost
- Best native categorical feature handling
- Robust to overfitting
- Supports GPU acceleration
Data Splitting
VespaTune uses an explicit train/validation split. If you have a single dataset, use the splitter utility:
vespatune splitter \
--data_filename data.csv \
--output splits/ \
--target target \
--task classification \
--num_folds 5
Or via Python:
from vespatune import VespaTuneSplitter
splitter = VespaTuneSplitter(
data_filename="data.csv",
output="splits/",
target="target",
task="classification",
num_folds=5,
)
splitter.split()
This creates fold_0_train.csv, fold_0_valid.csv, etc. for k-fold cross-validation.
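For readers who want to see what such a split does, here is a standard-library sketch that writes the same fold_0_train.csv / fold_0_valid.csv naming using a stratified round-robin assignment. It illustrates the idea only and is not VespaTune's implementation:

```python
import csv
import os
import random
from collections import defaultdict

def stratified_kfold_split(data_filename, output, target="target",
                           num_folds=5, seed=42):
    """Write fold_<k>_train.csv / fold_<k>_valid.csv pairs, keeping class balance."""
    with open(data_filename, newline="") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames
        rows = list(reader)

    # Group row indices by class so every fold sees every class.
    by_class = defaultdict(list)
    for i, row in enumerate(rows):
        by_class[row[target]].append(i)

    # Shuffle within each class, then deal rows out to folds round-robin.
    rng = random.Random(seed)
    fold_of = {}
    for indices in by_class.values():
        rng.shuffle(indices)
        for j, i in enumerate(indices):
            fold_of[i] = j % num_folds

    os.makedirs(output, exist_ok=True)
    for k in range(num_folds):
        for split in ("train", "valid"):
            path = os.path.join(output, f"fold_{k}_{split}.csv")
            with open(path, "w", newline="") as f:
                writer = csv.DictWriter(f, fieldnames=fieldnames)
                writer.writeheader()
                for i, row in enumerate(rows):
                    in_valid = fold_of[i] == k
                    if (split == "valid") == in_valid:
                        writer.writerow(row)
```

Each row lands in exactly one fold's validation file and in the training files of all other folds, which is the property k-fold cross-validation relies on.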
Prediction
Using the trained model
from vespatune import VespaTunePredict
predictor = VespaTunePredict(model_path="outputs/my_model")
predictions = predictor.predict_file("test.csv")
Using ONNX model
from vespatune import VespaTuneONNXPredict
predictor = VespaTuneONNXPredict(model_path="onnx_model/")
predictions = predictor.predict_file("test.csv")
CLI Reference
Default (UI)
vespatune [--host HOST] [--port PORT]
options:
--host Host to serve on (default: 127.0.0.1)
--port Port to serve on (default: 9999)
--version, -v Display VespaTune version
train
vespatune train --help
options:
--train_filename Path to training file (required)
--valid_filename Path to validation file (required)
--output Path to output directory (required)
--model Model type: xgboost, lightgbm, catboost (default: xgboost)
--test_filename Path to test file
--task Task type: classification, regression
--idx ID column name
--targets Target column(s); separate multiple values with ';'
--features Feature columns; separate multiple values with ';'
--use_gpu Use GPU for training
--seed Random seed (default: 42)
--num_trials Number of Optuna trials (default: 100)
--time_limit Time limit in seconds
predict
vespatune predict --help
options:
--model_path Path to trained model directory (required)
--test_filename Path to test file (required)
--output_filename Path to output predictions file (required)
export
vespatune export --help
options:
--model_path Path to trained model directory (required)
--output_dir Path to ONNX output directory
serve
vespatune serve --help
options:
--model_path Path to ONNX export directory
--host Host to bind (default: 127.0.0.1)
--port Port to bind (default: 9999)
--workers Number of workers (default: 1)
--reload Enable auto-reload for development
splitter
vespatune splitter --help
options:
--data_filename Path to data file (required)
--output Path to output directory (required)
--target Target column name (required)
--task Task type: classification, regression (required)
--num_folds Number of folds (default: 5)
Output Files
After training, the following files are created in the output directory:
| File | Description |
|---|---|
| `vtune_model.final` | Trained model |
| `vtune.config` | Model configuration |
| `vtune.best_params` | Best hyperparameters from Optuna |
| `vtune.categorical_encoder` | Categorical feature encoder |
| `vtune.target_encoder` | Target encoder (for classification) |
| `params.db` | Optuna study database |
| `train.feather` | Processed training data |
| `valid.feather` | Processed validation data |
| `onnx/` | ONNX export directory (after export) |
Example
from vespatune import VespaTune
# Train with LightGBM
vtune = VespaTune(
train_filename="data/train.csv",
valid_filename="data/valid.csv",
output="outputs/lgb_model",
model_type="lightgbm",
targets=["price"],
task="regression",
num_trials=200,
time_limit=1800,
use_gpu=False,
seed=42,
)
vtune.train()