Stepwise hyperparameter search for scikit-learn estimators

sk-stepwise

sk-stepwise is a small Python library for staged hyperparameter optimization of scikit-learn compatible estimators.

The main API is StepwiseOptunaSearchCV, which runs Optuna search one step at a time. Each step optimizes a subset of parameters while carrying forward the best settings found in earlier steps.
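
The idea of optimizing one subset at a time while carrying earlier winners forward can be illustrated without any dependencies. The following is a minimal sketch, not the library's implementation: it swaps Optuna for plain random sampling, and `stepwise_random_search` and `toy_objective` are hypothetical names invented for this illustration.

```python
import random


def stepwise_random_search(objective, steps, n_trials_per_step, seed=0):
    """Tune one group of parameters at a time, keeping earlier winners fixed."""
    rng = random.Random(seed)
    best_params = {}  # settings carried forward from earlier steps
    best_score = float("-inf")
    for step in steps:
        step_params, step_score = None, float("-inf")
        for _ in range(n_trials_per_step):
            # Start from the carried-forward settings and sample only
            # the parameters that belong to the current step.
            candidate = dict(best_params)
            for name, (low, high) in step.items():
                candidate[name] = rng.uniform(low, high)
            score = objective(candidate)
            if score > step_score:
                step_params, step_score = candidate, score
        # Assumption of this sketch: each step's winner is adopted as-is.
        best_params, best_score = step_params, step_score
    return best_params, best_score


def toy_objective(params):
    """Higher is better; optimum at x=2, y=-1. Unset parameters default to 0."""
    x = params.get("x", 0.0)
    y = params.get("y", 0.0)
    return -((x - 2.0) ** 2) - (y + 1.0) ** 2


# Two steps: tune x first, then tune y with the best x held fixed.
steps = [{"x": (0.0, 4.0)}, {"y": (-3.0, 1.0)}]
best_params, best_score = stepwise_random_search(
    toy_objective, steps, n_trials_per_step=50
)
print(best_params, best_score)
```

Each entry in `steps` plays the role of one dictionary in `param_distributions`: parameters from later steps stay at their defaults until their step is reached.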

Why stepwise search

A flat search space is often larger than it needs to be. Many workflows are easier to reason about in stages:

tune structural parameters first
tune regularization or sampling parameters next
tune learning-rate style parameters later

That is the model this library supports.
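
To make the size argument concrete, here is a rough back-of-the-envelope comparison. It assumes a grid-style count with ten hypothetical candidate values per parameter; the library samples with Optuna rather than enumerating a grid, so the numbers are only illustrative.

```python
from math import prod

# Hypothetical number of candidate values considered for each parameter.
candidates = {"max_depth": 10, "subsample": 10, "learning_rate": 10}

# A flat grid evaluates every combination of values.
flat_evaluations = prod(candidates.values())

# A staged search evaluates each parameter's candidates once,
# holding the winners from earlier stages fixed.
staged_evaluations = sum(candidates.values())

print(flat_evaluations, staged_evaluations)  # 1000 30
```

The trade-off is that a staged search can miss interactions between parameters tuned in different stages, so parameters that interact strongly are better placed in the same step.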

Installation

uv add sk-stepwise

For development:

uv sync
uv run pytest
uv run pytest -q tests/test_readme_doctest.py

Quickstart

>>> import numpy as np
>>> import pandas as pd
>>> from sklearn.ensemble import RandomForestRegressor
>>> from sk_stepwise import Float, Int, StepwiseOptunaSearchCV
>>>
>>> rng = np.random.default_rng(42)
>>> X = pd.DataFrame(rng.random((100, 5)), columns=[f"feature_{i}" for i in range(5)])
>>> y = pd.Series(rng.random(100))
>>>
>>> estimator = RandomForestRegressor(random_state=0)
>>> param_distributions = [
...     {"n_estimators": Int(50, 150)},
...     {"max_depth": Int(3, 10)},
...     {"min_samples_split": Float(0.1, 1.0)},
... ]
>>>
>>> search = StepwiseOptunaSearchCV(
...     estimator=estimator,
...     param_distributions=param_distributions,
...     n_trials_per_step=2,
...     random_state=0,
... )
>>> search.fit(X, y)  # doctest: +ELLIPSIS
StepwiseOptunaSearchCV(...)
>>> predictions = search.predict(X)
>>> len(predictions)
100
>>> sorted(search.best_params_.keys())
['max_depth', 'min_samples_split', 'n_estimators']
>>> isinstance(search.best_score_, float)
True

Build a real model from the search results

You can use best_params_ directly with a fresh estimator instance.

>>> from sklearn.ensemble import RandomForestRegressor
>>>
>>> best_params = search.best_params_
>>> sorted(best_params)
['max_depth', 'min_samples_split', 'n_estimators']
>>> final_model = RandomForestRegressor(random_state=0, **best_params)
>>> final_model.fit(X, y)  # doctest: +ELLIPSIS
RandomForestRegressor(...)
>>> isinstance(final_model.get_params()["n_estimators"], int)
True
>>> tuned_predictions = final_model.predict(X)
>>> len(tuned_predictions)
100

Search-space types

Use the backend-neutral dimension helpers:

Int(low, high, log=False) for ordered integer values like n_estimators, max_depth, depth, min_samples_leaf
Float(low, high, log=False) for continuous values like learning_rate, subsample, regularization strengths
Categorical(choices) for unordered values like criterion, solver, bootstrap

Examples:

>>> from sk_stepwise import Categorical, Float, Int
>>>
>>> space = [
...     {"n_estimators": Int(50, 300)},
...     {"max_depth": Int(2, 12)},
...     {"learning_rate": Float(1e-3, 1e-1, log=True)},
...     {"criterion": Categorical(["squared_error", "absolute_error"])},
... ]
>>> len(space)
4
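
The effect of `log=True` may be easier to see with a small stand-alone sampler. This is a sketch under stated assumptions, not the library's code: `IntDim`, `FloatDim`, `CategoricalDim`, and `sample` are hypothetical stand-ins, and plain random sampling replaces the Optuna backend.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class IntDim:
    low: int
    high: int
    log: bool = False


@dataclass(frozen=True)
class FloatDim:
    low: float
    high: float
    log: bool = False


@dataclass(frozen=True)
class CategoricalDim:
    choices: tuple


def sample(dim, rng):
    """Draw one value from a dimension using the given random.Random."""
    if isinstance(dim, CategoricalDim):
        return rng.choice(dim.choices)  # unordered: every choice equally likely
    if isinstance(dim, IntDim) and not dim.log:
        return rng.randint(dim.low, dim.high)  # inclusive on both ends
    if dim.log:
        # Uniform in log space: each order of magnitude is equally likely.
        value = math.exp(rng.uniform(math.log(dim.low), math.log(dim.high)))
    else:
        value = rng.uniform(dim.low, dim.high)
    if isinstance(dim, IntDim):
        value = round(value)
    # Clamp to guard against floating-point drift at the bounds.
    return min(max(value, dim.low), dim.high)


rng = random.Random(0)
learning_rate = FloatDim(1e-3, 1e-1, log=True)
draws = [sample(learning_rate, rng) for _ in range(1000)]
# With log sampling, roughly half the draws fall below the geometric
# midpoint 1e-2, instead of about 9% under linear sampling.
share_below_mid = sum(d < 1e-2 for d in draws) / len(draws)
print(share_below_mid)
```

This is why a log scale tends to suit parameters such as `learning_rate`, where the useful values span several orders of magnitude.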

Numeric categorical warning

If you write Categorical([10, 20, 30]), the library now emits a warning. For ordered numeric values, Int(...) or Float(...) is usually a better fit because the optimizer can use the numeric ordering.
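
A check of this kind could look roughly like the sketch below. The function name, warning class, and message are assumptions made for illustration, as is the decision to exempt booleans; the library's actual check may differ.

```python
import warnings
from numbers import Number


def warn_if_numeric_choices(choices):
    """Warn when every categorical choice is a number (booleans excluded)."""
    numeric = [
        c for c in choices
        # bool is a subclass of int, so it has to be excluded explicitly.
        if isinstance(c, Number) and not isinstance(c, bool)
    ]
    if choices and len(numeric) == len(choices):
        warnings.warn(
            "All categorical choices are numeric; "
            "consider an ordered Int or Float dimension instead.",
            UserWarning,
            stacklevel=2,
        )


warn_if_numeric_choices(["squared_error", "absolute_error"])  # silent
warn_if_numeric_choices([True, False])  # silent in this sketch
```

Calling `warn_if_numeric_choices([10, 20, 30])` would emit the warning, because a categorical dimension discards the ordering of those values.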

Progress logging

Set verbose=1 to print step-by-step progress:

Optimizing step 1/3
Best parameters after step 1: ...
Best score after step 1: ...
Improvement: ...

Progress output is intentionally opt-in: nothing is printed unless verbose is set.

scikit-learn behavior

StepwiseOptunaSearchCV is designed to behave like a sklearn-style search estimator:

supports fit, predict, and score
exposes best_params_, best_score_, best_estimator_, study_, studies_, and step_results_
works with pipelines and namespaced params like regressor__max_depth
supports scorer strings and scorer callables
supports cv as an int, splitter object, or iterable of splits
passes fit metadata such as sample_weight through sklearn evaluation

Optional methods are delegated when supported by the fitted best estimator:

predict_proba
decision_function
transform
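
One common way to get this kind of conditional delegation is to forward a fixed set of method names to the fitted estimator only when it defines them. The sketch below uses plain `__getattr__` and hypothetical classes (`DelegatingSearch`, `ToyClassifier`); it is not taken from the library, which may use a different mechanism.

```python
class DelegatingSearch:
    """Forward optional methods to the best estimator when it has them."""

    _OPTIONAL = ("predict_proba", "decision_function", "transform")

    def __init__(self, best_estimator):
        self.best_estimator_ = best_estimator

    def __getattr__(self, name):
        # __getattr__ runs only after normal attribute lookup has failed.
        if name in self._OPTIONAL:
            best = self.__dict__.get("best_estimator_")
            if best is not None and hasattr(best, name):
                return getattr(best, name)
        raise AttributeError(
            f"{type(self).__name__!r} object has no attribute {name!r}"
        )


class ToyClassifier:
    """Stand-in estimator that supports predict_proba but not transform."""

    def predict_proba(self, X):
        return [[0.5, 0.5] for _ in X]


search_sketch = DelegatingSearch(ToyClassifier())
print(search_sketch.predict_proba([1, 2]))
print(hasattr(search_sketch, "transform"))
```

Because the missing case raises `AttributeError`, `hasattr` reports accurately whether an optional method is available for a given fitted estimator.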

Migration from Hyperopt

The old StepwiseHyperoptOptimizer name is deprecated.

Current behavior:

StepwiseHyperoptOptimizer(...) still works as a compatibility shim
it emits a DeprecationWarning
it maps the old constructor argument names onto those of StepwiseOptunaSearchCV

Example migration:

>>> import warnings
>>> from sk_stepwise import StepwiseHyperoptOptimizer, StepwiseOptunaSearchCV
>>> warnings.simplefilter("ignore", DeprecationWarning)
>>> # old
>>> search = StepwiseHyperoptOptimizer(
...     model=estimator,
...     param_space_sequence=space,
...     max_evals_per_step=20,
... )
>>> # new
>>> search = StepwiseOptunaSearchCV(
...     estimator=estimator,
...     param_distributions=space,
...     n_trials_per_step=20,
... )

Important:

backend-neutral dimensions such as Int, Float, and Categorical are the supported path
old Hyperopt space objects are not part of the new mainline API
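
For readers curious how a compatibility shim of this kind is typically built, here is a minimal sketch. `NewSearch` and `OldOptimizer` are hypothetical stand-ins, and the default of 10 trials is invented; only the three argument renames are taken from the migration example above.

```python
import warnings


class NewSearch:
    """Stand-in for the new search class."""

    def __init__(self, estimator=None, param_distributions=None,
                 n_trials_per_step=10):
        self.estimator = estimator
        self.param_distributions = param_distributions
        self.n_trials_per_step = n_trials_per_step


# Old constructor argument name -> new constructor argument name.
_RENAMED = {
    "model": "estimator",
    "param_space_sequence": "param_distributions",
    "max_evals_per_step": "n_trials_per_step",
}


class OldOptimizer(NewSearch):
    """Deprecated alias that translates old argument names."""

    def __init__(self, **kwargs):
        warnings.warn(
            "OldOptimizer is deprecated; use NewSearch instead.",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller's line
        )
        translated = {_RENAMED.get(key, key): value
                      for key, value in kwargs.items()}
        super().__init__(**translated)


with warnings.catch_warnings():
    warnings.simplefilter("ignore", DeprecationWarning)
    legacy = OldOptimizer(model="my-model", max_evals_per_step=20)
print(legacy.estimator, legacy.n_trials_per_step)  # my-model 20
```

Subclassing the new class keeps `isinstance` checks working for existing code while the old name is phased out.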

Example: pipeline usage

>>> from sklearn.ensemble import RandomForestRegressor
>>> from sklearn.pipeline import Pipeline
>>> from sklearn.preprocessing import StandardScaler
>>> from sk_stepwise import Int, StepwiseOptunaSearchCV
>>>
>>> pipeline = Pipeline(
...     [
...         ("scale", StandardScaler()),
...         ("regressor", RandomForestRegressor(random_state=0)),
...     ]
... )
>>> space = [
...     {"regressor__n_estimators": Int(50, 150)},
...     {"regressor__max_depth": Int(2, 8)},
... ]
>>> search = StepwiseOptunaSearchCV(
...     estimator=pipeline,
...     param_distributions=space,
...     n_trials_per_step=2,
...     random_state=0,
... )
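
The double-underscore names follow the scikit-learn convention of `<step>__<parameter>`. As a rough illustration of how such names are routed to pipeline steps, the sketch below splits them with plain string operations; `split_param_name` and `group_by_step` are hypothetical helpers, not part of sk-stepwise or scikit-learn.

```python
def split_param_name(name):
    """Split 'regressor__max_depth' into ('regressor', 'max_depth')."""
    step, delimiter, param = name.partition("__")
    if not delimiter:
        return None, step  # no prefix: a parameter of the outer estimator
    return step, param


def group_by_step(params):
    """Group a flat dict of namespaced parameters by pipeline step."""
    grouped = {}
    for name, value in params.items():
        step, param = split_param_name(name)
        grouped.setdefault(step, {})[param] = value
    return grouped


best = {"regressor__n_estimators": 100, "regressor__max_depth": 4}
print(group_by_step(best))
# {'regressor': {'n_estimators': 100, 'max_depth': 4}}
```

Only the first `__` is treated as a separator here, so deeper nesting such as `outer__inner__param` leaves `inner__param` to be resolved by the nested estimator.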

Example: sample weights

>>> import numpy as np
>>> from sklearn.linear_model import LinearRegression
>>> from sk_stepwise import Categorical, StepwiseOptunaSearchCV
>>>
>>> sample_weight = np.linspace(1.0, 2.0, len(y))
>>> search = StepwiseOptunaSearchCV(
...     estimator=LinearRegression(),
...     param_distributions=[{"fit_intercept": Categorical([True, False])}],
...     n_trials_per_step=2,
...     random_state=0,
... )
>>> search.fit(X, y, sample_weight=sample_weight)  # doctest: +ELLIPSIS
StepwiseOptunaSearchCV(...)

Status

The core Optuna path is implemented and covered by tests for:

NumPy, pandas, and plain list inputs
regression and classification
sklearn pipelines
XGBoost and CatBoost integration
deprecated Hyperopt shim behavior

License

MIT

Release history

0.2.0 (this version): Mar 20, 2026
0.1.6: Jul 1, 2025
0.1.5: Jun 26, 2025
0.1.4: Jun 26, 2025
0.1.3: Jun 25, 2025
0.1.2: May 23, 2025
0.1.1: May 23, 2025
0.1.0: Oct 10, 2024
