Sample-Efficient Bayesian Optimizer — GP surrogate with ensemble acquisition
SEBO — Sample-Efficient Bayesian Optimizer
Author: Nikolas Karefyllidis, PhD
I designed and implemented a GP-based sequential optimizer from scratch and applied it to a problem set in the format of the NeurIPS 2020 Black-Box Optimisation Challenge: 8 unknown objective functions (2D–8D), one evaluation per function per round, 13 rounds. I then benchmarked it head-to-head against common open-source solvers (Optuna-TPE, TuRBO, DE-GP-EI) on identical observation histories. The core contribution is a full BO pipeline: automatic kernel selection by log-marginal likelihood, ensemble acquisition (EI + PI + UCB with centroid fallback), and output warping for skewed objectives. The same suggest / observe API carries over to AutoML hyperparameter search, drug discovery, and materials design, or any other setting where evaluations are expensive and every query counts.
Use SEBO
pip install git+https://github.com/karefyllidis/SEBO.git
Or clone for development:
git clone https://github.com/karefyllidis/SEBO.git && cd SEBO && pip install -e .
With benchmark solvers (Optuna, TuRBO):
pip install "sebo[benchmark] @ git+https://github.com/karefyllidis/SEBO.git"
from sebo import BayesianOptimizer

optimizer = BayesianOptimizer(
    bounds=[(0.0, 1.0)] * 4,   # search space: any dimension
    output_warping="log",      # for skewed objectives (log or boxcox)
    use_ensemble=True,         # EI + PI + UCB with centroid fallback
)
optimizer.fit(X_init, y_init)  # warm-start with existing observations

for _ in range(n_rounds):
    x_next = optimizer.suggest()        # GP surrogate + ensemble acquisition
    y_next = oracle(x_next)             # your expensive function here
    optimizer.observe(x_next, y_next)   # update the surrogate

print(optimizer.best)  # (best_x, best_y)
See notebooks/demo_sklearn_hpo.ipynb for a fully worked example — no external data required.
Benchmark
notebooks/sebo_benchmark.ipynb — SEBO (built from scratch) benchmarked against common open-source solvers — Optuna-TPE, TuRBO, DE-GP-EI, and Random Search — on 6 synthetic black-box functions spanning four orders of magnitude in output scale (log-warping on F3, asymmetric Gaussian peaks on F6). Adaptive stopping: all solvers halt as soon as any one reaches ≥99% of the true maximum (cap: 80 evaluations) — each subplot shows a different evaluation count depending on function difficulty.
Incumbent best-y convergence. Green band = LHS warm-start. Dashed black line = true maximum. Dash-dot vertical line = stopping point (first solver to reach ≥99% of true max).
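The adaptive stopping rule can be sketched in a few lines. This is an illustration only, not the notebook's code: the function name `stopping_index` and the `histories` dictionary layout are hypothetical, and the threshold assumes a positive true maximum.

```python
def stopping_index(histories, true_max, frac=0.99, cap=80):
    """Return the evaluation count at which all solvers halt.

    histories: dict mapping solver name -> list of observed y values, in order.
    Assumes true_max > 0, so frac * true_max is a meaningful threshold.
    """
    threshold = frac * true_max
    longest = max(len(ys) for ys in histories.values())
    limit = min(cap, longest)
    for n in range(1, limit + 1):
        # Incumbent best-y of each solver after its first n evaluations.
        if any(max(ys[:n]) >= threshold for ys in histories.values() if ys):
            return n
    return limit  # nobody reached the threshold within the cap


# Toy histories: solver_a crosses 99% of the maximum on its third evaluation.
histories = {
    "solver_a": [0.2, 0.5, 0.995, 0.7],
    "solver_b": [0.1, 0.3, 0.4, 0.9],
}
stop = stopping_index(histories, true_max=1.0)
```

Because every solver is cut at the same count, the convergence curves in one subplot stay directly comparable.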
notebooks/demo_sklearn_hpo.ipynb — self-contained HPO demo. Tunes a RandomForestClassifier on sklearn's Digits dataset (4D search space: n_estimators, max_depth, min_samples_split, max_features). 10 LHS warm-start + 20 BO iterations vs 30 random search evaluations.
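As a rough sketch of how such a demo can be set up, the snippet below maps a point in [0, 1]^4 to the four RandomForest hyperparameters and scores it by cross-validation. The hyperparameter ranges, the 3-fold CV, and the helper names `decode` and `objective` are assumptions made for illustration; the notebook may use different choices.

```python
import numpy as np
from scipy.stats import qmc
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X_digits, y_digits = load_digits(return_X_y=True)


def decode(u):
    """Map u in [0, 1]^4 to RandomForest hyperparameters (assumed ranges)."""
    u = np.clip(np.asarray(u, dtype=float), 0.0, 1.0)
    return {
        "n_estimators": int(round(10 + u[0] * 190)),        # 10..200
        "max_depth": int(round(2 + u[1] * 18)),             # 2..20
        "min_samples_split": int(round(2 + u[2] * 8)),      # 2..10
        "max_features": float(min(1.0, 0.1 + u[3] * 0.9)),  # fraction 0.1..1.0
    }


def objective(u):
    """Mean 3-fold CV accuracy for the configuration encoded by u."""
    model = RandomForestClassifier(random_state=0, **decode(u))
    return float(cross_val_score(model, X_digits, y_digits, cv=3).mean())


# 10 Latin hypercube warm-start points in the 4D unit cube.
X_init = qmc.LatinHypercube(d=4, seed=0).random(n=10)
```

Working in the unit cube keeps the optimizer's bounds at `[(0.0, 1.0)] * 4`, and all rounding to integer hyperparameters happens inside `decode`.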
The Problem
Eight unknown objective functions, dimensions 2–8, domain [0, 1]^d. No formula, no gradients. One evaluation per function per round, across 13 rounds. Maximise each within budget.
| # | Dim | Real-world analogy | Landscape character |
|---|---|---|---|
| F1 | 2D | Radiation detection | Sparse signal; near-zero almost everywhere with a narrow high-value peak |
| F2 | 2D | Unknown ML model | Noisy; multiple local maxima |
| F3 | 3D | Drug discovery | Smooth, always negative; optimisation = least negative |
| F4 | 4D | Warehouse logistics | Many local optima, extreme outliers |
| F5 | 4D | Chemical process yield | Unimodal; output spans orders of magnitude near domain boundary |
| F6 | 5D | Recipe formulation | Noisy oracle; same input returned different y across rounds |
| F7 | 6D | Hyperparameter tuning | Sparse in 6D; smooth locally |
| F8 | 8D | High-dimensional ML model | Hardest; strong cumulative improvement with coverage |
Domain: [0, 1]^d for all functions. Higher y is always better; F3 and F6 outputs are negative.
Results
Best observed y after 13 rounds (10 warm-start points per function):
| Function | Initial best y | Final best y | Improvement |
|---|---|---|---|
| F1 | ~0.0 | 0.6704 | Large — narrow peak located in round 10 |
| F2 | ~0.19 | 0.7248 | Large |
| F3 | ~−0.44 | −0.0032 | Large (always negative; less negative = better) |
| F4 | ~0.04 | 0.2987 | Moderate |
| F5 | ~1700 | 7493.9 | Very large — near-boundary region [0.99, …] confirmed |
| F6 | ~−1.3 | −0.1402 | Large |
| F7 | ~0.003 | 2.7968 | Large |
| F8 | ~5.6 | 9.9619 | Large |
Full per-round strategy notes and GP diagnostics: docs/model_card.md.
GP Surrogate Evolution — Function 3 (Drug Discovery, 3D)
Weekly evolution of pairwise IDW-interpolated projections of observed y. Red dots are evaluations numbered by round. Colour scale fixed across all frames for direct comparison. Regenerate with python scripts/export_function3_gp_evolution_gif.py once local observations.csv is present.
Methodology
Bayesian Optimisation maintains a probabilistic surrogate (GP) over the unknown function and uses it to select the next query — balancing exploitation of known good regions against exploration of uncertain ones.
fit GP → maximise acquisition → evaluate f(x*) → append (x*, y*) → repeat
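The loop above can be sketched with scikit-learn's GaussianProcessRegressor and a plain Expected Improvement acquisition maximised over random candidates. This is a minimal stand-in, not SEBO's implementation: `toy_oracle`, the fixed Matérn kernel, the candidate count, and the round count are all assumptions.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern


def toy_oracle(x):
    # Hypothetical stand-in for an expensive black box; peak at (0.3, 0.7).
    return float(np.exp(-20.0 * np.sum((x - np.array([0.3, 0.7])) ** 2)))


rng = np.random.default_rng(0)
X = rng.random((5, 2))                      # warm-start points in [0, 1]^2
y = np.array([toy_oracle(x) for x in X])
y_init_best = y.max()

for _ in range(10):
    # 1. Fit the GP surrogate to everything observed so far.
    gp = GaussianProcessRegressor(
        kernel=Matern(nu=1.5), alpha=1e-6, normalize_y=True, random_state=0
    )
    gp.fit(X, y)

    # 2. Maximise the acquisition (EI) over random candidates.
    cand = rng.random((2000, 2))
    mu, sigma = gp.predict(cand, return_std=True)
    sigma = np.maximum(sigma, 1e-12)        # avoid division by zero
    z = (mu - y.max()) / sigma
    ei = (mu - y.max()) * norm.cdf(z) + sigma * norm.pdf(z)
    x_next = cand[np.argmax(ei)]

    # 3. Evaluate the oracle and append the new observation.
    y_next = toy_oracle(x_next)
    X = np.vstack([X, x_next])
    y = np.append(y, y_next)
```

Refitting the surrogate from scratch each round is affordable here because the observation count stays small.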
Kernel Selection
At each round, three kernels compete for the best log-marginal likelihood (LML): RBF, Matérn ν=1.5, and RBF + WhiteKernel. The winner is selected automatically; hyperparameters are tuned by L-BFGS-B MLE with multiple restarts.
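A minimal sketch of this selection step with scikit-learn is shown below. It assumes the three candidates map onto `RBF`, `Matern(nu=1.5)` and `RBF + WhiteKernel`, and relies on GaussianProcessRegressor's default L-BFGS-B optimizer with restarts. The function name `select_kernel`, the initial length scales, and the jitter `alpha` are illustrative assumptions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, Matern, WhiteKernel


def select_kernel(X, y, n_restarts=5, seed=0):
    """Fit one GP per candidate kernel and keep the highest-LML fit."""
    candidates = {
        "RBF": RBF(length_scale=0.2),
        "Matern-1.5": Matern(length_scale=0.2, nu=1.5),
        "RBF+White": RBF(length_scale=0.2) + WhiteKernel(noise_level=1e-2),
    }
    best_name, best_gp, best_lml = None, None, -np.inf
    for name, kernel in candidates.items():
        gp = GaussianProcessRegressor(
            kernel=kernel,
            alpha=1e-6,                       # small jitter for stability
            n_restarts_optimizer=n_restarts,  # restarts of the L-BFGS-B fit
            normalize_y=True,
            random_state=seed,
        )
        gp.fit(X, y)
        lml = gp.log_marginal_likelihood_value_  # LML at fitted hyperparameters
        if lml > best_lml:
            best_name, best_gp, best_lml = name, gp, lml
    return best_name, best_gp, best_lml


# Toy data: a smooth function sampled at 12 random points in [0, 1]^2.
rng = np.random.default_rng(0)
X_demo = rng.random((12, 2))
y_demo = np.sin(3.0 * X_demo[:, 0]) + X_demo[:, 1]
name, gp, lml = select_kernel(X_demo, y_demo)
```

Comparing raw LML values slightly favours kernels with more hyperparameters, so with very few observations the winner can change from round to round.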
Ensemble Acquisition
Three acquisition functions run simultaneously — EI, PI, and UCB. If their suggested next points are close together (agree), SEBO follows the EI recommendation. If they diverge (disagree), SEBO queries the centroid of all three — a soft blend that avoids over-committing to one strategy.
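The agree / disagree rule can be illustrated as follows. This is a sketch under stated assumptions, not SEBO's code: agreement is measured here as the largest pairwise Euclidean distance between the three suggestions, and `agree_tol`, `kappa`, `xi`, and the function name `ensemble_suggest` are hypothetical.

```python
import numpy as np
from scipy.stats import norm


def ensemble_suggest(mu, sigma, candidates, y_best,
                     kappa=2.0, xi=0.01, agree_tol=0.1):
    """Pick the next query from GP predictions at a set of candidate points."""
    sigma = np.maximum(sigma, 1e-12)
    z = (mu - y_best - xi) / sigma
    ei = (mu - y_best - xi) * norm.cdf(z) + sigma * norm.pdf(z)  # Expected Improvement
    pi = norm.cdf(z)                                             # Probability of Improvement
    ucb = mu + kappa * sigma                                     # Upper Confidence Bound

    # Each acquisition function proposes its own argmax candidate.
    picks = np.array([candidates[np.argmax(a)] for a in (ei, pi, ucb)])
    spread = max(
        np.linalg.norm(picks[i] - picks[j])
        for i in range(3) for j in range(i + 1, 3)
    )
    if spread <= agree_tol:
        return picks[0], "agree"         # follow the EI recommendation
    return picks.mean(axis=0), "centroid"  # soft blend of all three


candidates = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
# Disagreement: PI prefers the safe first point, EI and UCB the uncertain second.
x_dis, mode_dis = ensemble_suggest(
    mu=np.array([1.05, 0.5, 0.9]), sigma=np.array([0.01, 1.0, 0.3]),
    candidates=candidates, y_best=1.0,
)
# Agreement: one candidate dominates under all three criteria.
x_agr, mode_agr = ensemble_suggest(
    mu=np.array([2.0, 0.0, 0.0]), sigma=np.array([0.1, 0.1, 0.1]),
    candidates=candidates, y_best=1.0,
)
```

In a box domain such as [0, 1]^d the centroid of three in-bounds points is itself in bounds, so the fallback needs no extra clipping.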
Output Warping
For objectives whose output spans orders of magnitude (e.g. F5, where the best observed value alone rose from ~1700 to ~7500), targets are log-transformed before GP fitting. The GP, the acquisition functions, and incumbent tracking all operate in warped space; raw y values are stored and reported.
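A log warp of this kind can be sketched as below. The offset used for non-positive targets is an assumption made here so the logarithm stays defined; how `src/utils/warping.py` handles that case is not stated above, and the names `log_warp` and `log_unwarp` are illustrative.

```python
import numpy as np


def log_warp(y):
    """Return (warped targets, offset); offset is 0 for strictly positive y."""
    y = np.asarray(y, dtype=float)
    # Assumed handling of non-positive data: shift so the minimum maps to log(1) = 0.
    offset = y.min() - 1.0 if y.min() <= 0.0 else 0.0
    return np.log(y - offset), offset


def log_unwarp(z, offset):
    """Invert log_warp to recover raw y values for reporting."""
    return np.exp(np.asarray(z, dtype=float)) + offset


y_raw = np.array([1700.0, 3000.0, 7493.9])   # F5-like scale
z, offset = log_warp(y_raw)

y_neg = np.array([-0.44, -0.20, -0.0032])    # always-negative targets
z_neg, offset_neg = log_warp(y_neg)
```

The transform is strictly increasing, so the incumbent (the argmax) is the same point in raw and warped space; only the GP's view of the scale changes.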
Why This Matters — Real-World Applications
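The suggest / observe loop in SEBO follows the same sequential model-based optimisation pattern that underlies Optuna, SMAC, and Ax, although those libraries pair it with their own surrogates (TPE by default in Optuna, random forests by default in SMAC, BoTorch GPs in Ax). Building it from scratch makes every design decision explicit and auditable.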
- AutoML / Hyperparameter Optimisation — GP surrogate replaces grid/random search; finds better configs in fewer model-training calls. Demonstrated directly in the HPO demo: SEBO tunes a RandomForest on Digits using 30 evaluations and consistently outperforms random search.
- Drug Discovery & Materials Science — sample-efficient search over molecular or material property spaces where each lab measurement is costly (F3 analogue: drug potency proxy).
- Simulation Optimisation — engineering or physics simulations where one run takes minutes to hours; BO is the standard approach for tuning parameters.
- Neural Architecture Search — treating layer widths, learning rates, and dropout as a continuous search space; same GP + acquisition loop applies.
- Sequential Experiment Design — A/B tests, clinical dose-finding, adaptive sampling — any setting where observations arrive one at a time and each one is expensive.
Project Structure
sebo/
│
├── notebooks/
│ ├── function_{1..8}_*.ipynb # One notebook per objective — full BO pipeline
│ ├── sebo_benchmark.ipynb # Head-to-head benchmark vs open-source solvers
│ └── demo_sklearn_hpo.ipynb # Standalone HPO demo — start here
│
├── src/
│ ├── optimizers/
│ │ ├── optimizer.py # BayesianOptimizer — stateful suggest/observe API
│ │ ├── my_bayesian/my_gp_skopt.py # GP + ensemble acquisition (EI/PI/UCB)
│ │ └── wrappers/ # optuna, turbo, de_gp_ei, hyperopt solvers
│ └── utils/
│ ├── warping.py # log / Box-Cox output warping
│ ├── sampling_utils.py # Sobol / LHS candidate generation
│ ├── plot_utilities.py # Shared plot helpers
│ └── load_challenge_data.py # Data loader
│
├── append_results/
│ ├── append_week{1..13}_results.py # Round-append scripts (idempotent)
│ └── run_optimizers_on_data.py # Benchmark all solvers on oracle data
│
├── scripts/
│ └── export_function3_gp_evolution_gif.py
│
├── configs/ # Solver configs (optuna, turbo, de_gp_ei, hyperopt)
├── docs/
│ ├── model_card.md # Architecture, per-round performance, limitations
│ ├── datasheet.md # Data provenance, composition, uses
│ └── TECHNICAL_FOUNDATIONS.md # BO theory, kernel choices, key papers
│
├── run_pipeline.py
├── requirements.txt
└── requirements-benchmark.txt
Note on data: Raw evaluation CSVs (data/problems/, initial_data/) are gitignored. The demo and benchmark notebooks run without them; GP evolution plots require local observations.csv files.
Stack
Python 3.10+ · NumPy · SciPy · scikit-learn (GaussianProcessRegressor) · scikit-optimize · Optuna · BoTorch/TuRBO · Matplotlib
References
- Turner et al., PMLR Vol. 133 — Bayesian Optimization is Superior to Random Search for Machine Learning Hyperparameter Tuning: Analysis of the Black-Box Optimization Challenge 2020
- NeurIPS 2020 Black-Box Optimisation Challenge
Licence
MIT. Initial warm-start data provided by Imperial College London for educational use; redistribution permitted for non-commercial, academic purposes.