MLFastOpt
MLFastOpt is a high-speed ensemble optimization system for Bayesian hyperparameter tuning of LightGBM, XGBoost, and Random Forest models.
Features
- 🚀 Fast Optimization: Advanced Bayesian optimization algorithms (Sobol + BoTorch).
- 🧩 Multi-Model Support: Tune LightGBM, XGBoost, or Random Forest ensembles.
- ⚙️ Simple Config: Hierarchical JSON configuration and YAML/Python search spaces.
- 📊 Rich Analytics: Built-in web dashboards and visualization tools.
Prerequisites
- Python 3.9+
- macOS users: you must install `libomp` for LightGBM/XGBoost to work: `brew install libomp`
Installation
1. Activate a virtual environment:

   ```bash
   source .venv/bin/activate
   # OR, if you haven't created one yet:
   # python3.12 -m venv .venv && source .venv/bin/activate
   ```

2. Install the package:

   ```bash
   pip install -e .[dev]
   ```
Quick Start (End Users)
If you installed the package via `pip install mlfastopt`, follow these steps:
1. Create configuration files: you need a `config.json` and a hyperparameter space file (e.g., `hyperparameters.yaml`).

   `config.json`:

   ```json
   {
     "data": {
       "path": "train.parquet",
       "label_column": "target",
       "features": "features.yaml"
     },
     "model": {
       "type": "xgboost",
       "hyperparameter_path": "config/hyperparameters/xgboost.yaml"
     },
     "training": {
       "metric": "f1",
       "total_trials": 20
     },
     "output": {
       "dir": "outputs"
     }
   }
   ```
2. Run the optimization:

   ```bash
   export OMP_NUM_THREADS=1
   mlfastopt-optimize --config config.json
   ```
Quick Start (Developers)
Prerequisite: input data must be preprocessed and numerical. Handle all categorical encoding (e.g., one-hot or label encoding) before using MLFastOpt, except with LightGBM/XGBoost, which have some native categorical support.
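As a concrete example of that prerequisite, here is a minimal stdlib-only label-encoding sketch (in practice you would likely use pandas or scikit-learn; the `city` column and its values are made-up example data):

```python
# Label-encode a categorical column before passing data to MLFastOpt.
# The "city" column and its values are hypothetical example data.
rows = [
    {"city": "tokyo", "target": 1},
    {"city": "paris", "target": 0},
    {"city": "tokyo", "target": 1},
]

# Build a stable category -> integer mapping.
categories = sorted({row["city"] for row in rows})
encoding = {cat: i for i, cat in enumerate(categories)}

# Replace the string column with its integer codes.
for row in rows:
    row["city"] = encoding[row["city"]]

print(rows)  # the "city" values are now numeric
```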
1. Setup
Create the required directory structure:
mkdir -p config/hyperparameters data
2. Define Parameter Space
We recommend using YAML for parameter spaces. Create `config/hyperparameters/my_space.yaml`:

```yaml
parameters:
  - name: learning_rate
    type: range
    bounds: [0.01, 0.3]
    value_type: float
    log_scale: true
  - name: max_depth
    type: range
    bounds: [3, 10]
    value_type: int
```
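To make the semantics of `bounds`, `value_type`, and `log_scale` concrete, here is a small stdlib-only sketch of how such a space could be sampled. This is illustrative only, not MLFastOpt's actual sampler (the real system uses Sobol + BoTorch), and the Python restatement of the YAML is an assumption:

```python
import math
import random

# Hypothetical re-statement of the YAML space above as Python data.
space = [
    {"name": "learning_rate", "type": "range", "bounds": [0.01, 0.3],
     "value_type": "float", "log_scale": True},
    {"name": "max_depth", "type": "range", "bounds": [3, 10],
     "value_type": "int"},
]

def sample(space, rng=random):
    """Draw one configuration from a list of range parameters."""
    config = {}
    for p in space:
        lo, hi = p["bounds"]
        if p.get("log_scale"):
            # Sample uniformly in log space, then exponentiate.
            value = math.exp(rng.uniform(math.log(lo), math.log(hi)))
        else:
            value = rng.uniform(lo, hi)
        if p["value_type"] == "int":
            value = int(round(value))
        config[p["name"]] = value
    return config

print(sample(space))
```

Sampling `learning_rate` in log space spreads trials evenly across orders of magnitude, which is why `log_scale: true` is the usual choice for learning rates.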
3. Configure
Create `my_config.json` using the nested structure:

```json
{
  "data": {
    "path": "data/your_dataset.parquet",
    "label_column": "target",
    "features": ["feature1", "feature2"],
    "class_weight": { "0": 1, "1": 5 },
    "under_sample_majority_ratio": 1.0
  },
  "model": {
    "type": "lightgbm",
    "hyperparameter_path": "config/hyperparameters/my_space.yaml",
    "ensemble_size": 5
  },
  "training": {
    "total_trials": 20,
    "sobol_trials": 5,
    "metric": "soft_recall",
    "parallel": true,
    "n_jobs": -1
  },
  "output": {
    "dir": "outputs/runs"
  }
}
```
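Before a long run, a quick sanity check of the nested structure can save time. A sketch using only the stdlib `json` module (this checker is not part of MLFastOpt, and the set of required keys is an assumption based on the tables below):

```python
import json

# Assumed minimal required keys per section, per the configuration reference.
REQUIRED = {
    "data": ["path", "label_column", "features"],
    "model": ["type", "hyperparameter_path"],
    "training": ["total_trials", "metric"],
    "output": ["dir"],
}

def check_config(text):
    """Parse a config JSON string and report missing required keys."""
    config = json.loads(text)
    missing = [
        f"{section}.{key}"
        for section, keys in REQUIRED.items()
        for key in keys
        if key not in config.get(section, {})
    ]
    return config, missing

# Deliberately incomplete example: only the "data" section is filled in.
example = '{"data": {"path": "d.parquet", "label_column": "target", "features": []}}'
config, missing = check_config(example)
print(missing)  # everything outside "data" is reported missing
```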
4. Run
Execute the optimization (force single-threading for LightGBM/XGBoost to avoid deadlocks):

```bash
OMP_NUM_THREADS=1 python -m mlfastopt.cli --config my_config.json
```
Configuration Reference
Data Section (`data`)
| Parameter | Description | Default |
|---|---|---|
| `path` | Path to dataset (CSV/Parquet). | Required |
| `label_column` | Name of the target column. | Required |
| `features` | List of features or path to a YAML file. | Required |
| `class_weight` | Dictionary of class weights (e.g., `{"0": 1, "1": 10}`). | `None` |
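If you are unsure what to put in `class_weight`, inverse-frequency weights are a common starting point. A stdlib-only sketch (the labels are made-up example data, and this heuristic is a suggestion, not an MLFastOpt default):

```python
from collections import Counter

# Hypothetical binary labels with a roughly 3.5:1 class imbalance.
labels = [0, 0, 0, 0, 1, 0, 0, 0, 1]

counts = Counter(labels)
majority = max(counts.values())

# Weight each class by how much rarer it is than the majority class,
# using string keys to match the JSON config format.
class_weight = {str(cls): majority / n for cls, n in counts.items()}

print(class_weight)
```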
Model Section (`model`)
| Parameter | Description | Default |
|---|---|---|
| `type` | Model type: `lightgbm`, `xgboost`, or `random_forest`. | `lightgbm` |
| `hyperparameter_path` | Path to the parameter space file. | Required |
| `ensemble_size` | Number of models per ensemble. | `1` |
Training Section (`training`)
| Parameter | Description | Default |
|---|---|---|
| `total_trials` | Total number of optimization trials. | `20` |
| `metric` | Metric to maximize (`soft_recall`, `soft_f1_score`, etc.). | `soft_recall` |
| `parallel` | Enable parallel training of ensemble members. | `false` |
Outputs
Results are saved to `outputs/`:
- `runs/`: detailed logs and models for each run.
- `best_trials/`: JSON configurations of the best-performing trials.
- `visualizations/`: generated plots.
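As a sketch of consuming these outputs, the snippet below creates a mock `best_trials/` directory, then selects the most recently modified JSON with `pathlib`. The file name and the fields inside the JSON are assumptions for illustration, not a documented format:

```python
import json
import tempfile
from pathlib import Path

# Simulate an outputs/best_trials directory with one mock trial file.
root = Path(tempfile.mkdtemp())
best_dir = root / "best_trials"
best_dir.mkdir()
(best_dir / "trial_007.json").write_text(
    json.dumps({"learning_rate": 0.05, "max_depth": 6})
)

# Pick the newest best-trial file and load its hyperparameters.
latest = max(best_dir.glob("*.json"), key=lambda p: p.stat().st_mtime)
params = json.loads(latest.read_text())
print(latest.name, params)
```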
File details
Details for the file mlfastopt-0.0.9b2.tar.gz.
File metadata
- Download URL: mlfastopt-0.0.9b2.tar.gz
- Upload date:
- Size: 64.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `762bfcc08c05699f6380ebe54795c11b77281421594d0a2dd2ac652fda655048` |
| MD5 | `fe6248d158d832d1ceaf8026dece10db` |
| BLAKE2b-256 | `d303c7518ba0c54bd0e55f1db4394730d831987a9529736815bff343bc5955ce` |
File details
Details for the file mlfastopt-0.0.9b2-py3-none-any.whl.
File metadata
- Download URL: mlfastopt-0.0.9b2-py3-none-any.whl
- Upload date:
- Size: 67.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `02f546fc7b60ae92f1aa40dad339f36096752fdb8af120d991facec47a31e419` |
| MD5 | `ee3bc2df1d222a693d406443877b7d98` |
| BLAKE2b-256 | `83e2b8708ad467652b96660a86aa490f71986fff30f0f12f45290860a43181d5` |