
A Python implementation of 'Explaining Hyperparameter Optimization via Partial Dependence Plots' by Moosbauer et al.


Python PDP with Partitioner


Installation

Either create a new conda environment or update an existing one, then activate it.

Create environment

conda env create -f environment.yml

Update environment (if env exists)

conda env update -f environment.yml --prune

Activate environment

conda activate pyPDPPartitioner

Installation via pip

pip install pyPDPPartitioner

For the HPOBench examples, you additionally need to install HPOBench from git (e.g. pip install git+https://github.com/automl/HPOBench.git@master).

Usage

Blackbox functions

To use this package you need:

  • A blackbox function (a function that takes a configuration as input and returns a score)
  • A configuration space that matches the required input of the blackbox function

Several synthetic blackbox functions are implemented and ready to use:

f = StyblinskiTang.for_n_dimensions(3)  # Create 3D-StyblinskiTang function
cs = f.config_space  # A config space that is suitable for this function
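For orientation, the Styblinski-Tang function itself can be written in a few lines. The following is a minimal standalone sketch in plain Python, not the package's implementation; the name `styblinski_tang` is chosen here for illustration only.

```python
def styblinski_tang(x):
    """Styblinski-Tang value for a point x given as a sequence of floats."""
    return 0.5 * sum(xi**4 - 16 * xi**2 + 5 * xi for xi in x)


# Each coordinate is usually restricted to [-5, 5]. The global minimum lies
# near x_i = -2.903534 in every dimension, at about -39.166 per dimension.
x_min = [-2.903534] * 3
value_at_min = styblinski_tang(x_min)
value_at_origin = styblinski_tang([0.0, 0.0, 0.0])
```

Because the function is a sum over dimensions, its minimum scales linearly with the number of dimensions, which makes it a convenient test function of arbitrary size.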

Samplers

To sample points for fitting a surrogate, there are multiple samplers available:

  • RandomSampler
  • GridSampler
  • BayesianOptimizationSampler with Acquisition-Functions:
    • LowerConfidenceBound
    • (ExpectedImprovement)
    • (ProbabilityOfImprovement)

sampler = BayesianOptimizationSampler(f, cs)
sampler.sample(80)
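To illustrate what a lower-confidence-bound acquisition function computes, here is a minimal sketch in plain Python. It assumes minimization and the common form mean minus a multiple of the standard deviation; the function name, the `tau` parameter, and the candidate values are hypothetical and not taken from the package.

```python
import math


def lower_confidence_bound(mean, variance, tau=1.0):
    """Optimistic estimate for minimization: mean - tau * standard deviation."""
    return mean - tau * math.sqrt(variance)


# Hypothetical surrogate predictions (mean, variance) for three candidates.
candidates = {"a": (1.0, 0.01), "b": (1.2, 1.0), "c": (0.9, 0.0)}
scores = {
    name: lower_confidence_bound(mean, variance, tau=1.0)
    for name, (mean, variance) in candidates.items()
}

# The candidate with the smallest bound would be evaluated next.
next_candidate = min(scores, key=scores.get)
```

In this toy example the uncertain candidate "b" wins despite having the worst mean, which shows how the bound trades off exploration against exploitation.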

Surrogate Models

All algorithms require a SurrogateModel, which can be fitted with SurrogateModel.fit(X, y) and yields means and variances with SurrogateModel.predict(X).

Currently, there is only a GaussianProcessSurrogate available.

surrogate = GaussianProcessSurrogate()
surrogate.fit(sampler.X, sampler.y)
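The fit/predict contract described above (means and variances) can be sketched with a tiny Gaussian process in NumPy. This is an illustrative stand-in under simplifying assumptions (zero prior mean, RBF kernel with unit prior variance, fixed length scale), not the package's GaussianProcessSurrogate; the class name `TinyGP` and its parameters are hypothetical.

```python
import numpy as np


class TinyGP:
    """Minimal GP regressor with an RBF kernel (illustration only)."""

    def __init__(self, length_scale=1.0, noise=1e-8):
        self.length_scale = length_scale
        self.noise = noise

    def _kernel(self, A, B):
        # Squared Euclidean distances between all rows of A and B.
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-0.5 * d2 / self.length_scale**2)

    def fit(self, X, y):
        self.X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        K = self._kernel(self.X, self.X) + self.noise * np.eye(len(self.X))
        self.L = np.linalg.cholesky(K)
        # alpha = K^-1 y, computed via the Cholesky factor.
        self.alpha = np.linalg.solve(self.L.T, np.linalg.solve(self.L, y))
        return self

    def predict(self, X):
        Ks = self._kernel(np.asarray(X, dtype=float), self.X)
        mean = Ks @ self.alpha
        v = np.linalg.solve(self.L, Ks.T)
        variance = np.maximum(1.0 - (v**2).sum(axis=0), 0.0)
        return mean, variance


X_train = np.array([[-1.0], [0.0], [1.0]])
y_train = np.array([1.0, 0.0, 1.0])
gp = TinyGP().fit(X_train, y_train)
mean_train, var_train = gp.predict(X_train)
mean_far, var_far = gp.predict(np.array([[10.0]]))
```

At the training points the predicted variance collapses towards zero, while far away from the data it returns to the prior variance. This uncertainty is what the confidence plots and the partitioners build on.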

Algorithms

The following algorithms are available:

  • ICE
  • PDP
  • DecisionTreePartitioner
  • RandomForestPartitioner

Each algorithm needs:

  • A SurrogateModel
  • One or more selected hyperparameters
  • Samples
  • num_grid_points_per_axis

Samples can be randomly generated via

# Algorithm.from_random_points(...)
ice = ICE.from_random_points(surrogate, selected_hyperparameter="x1")

All other algorithms can also be built from an ICE instance:

pdp = PDP.from_ICE(ice)
dt_partitioner = DecisionTreePartitioner.from_ICE(ice)
rf_partitioner = RandomForestPartitioner.from_ICE(ice)
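The relationship between ICE and PDP can be made concrete with a small NumPy sketch: each ICE curve varies the selected hyperparameter over a grid while keeping one sample's other values fixed, and the PDP is the pointwise mean of those curves. The helper `ice_curves` and the toy predictor below are hypothetical and only consider mean predictions, not variances.

```python
import numpy as np


def ice_curves(predict, samples, selected_idx, grid):
    """One curve per sample: predictions as the selected column sweeps the grid."""
    samples = np.asarray(samples, dtype=float)
    curves = np.empty((len(samples), len(grid)))
    for j, value in enumerate(grid):
        modified = samples.copy()
        modified[:, selected_idx] = value  # overwrite the selected hyperparameter
        curves[:, j] = predict(modified)
    return curves


def toy_predict(X):
    """Stand-in for a surrogate's mean prediction."""
    return X[:, 0] ** 2 + X[:, 1]


samples = np.array([[0.3, 0.0], [-0.7, 2.0]])
grid = np.linspace(-1.0, 1.0, 3)  # plays the role of num_grid_points_per_axis=3

curves = ice_curves(toy_predict, samples, selected_idx=0, grid=grid)
pdp_curve = curves.mean(axis=0)  # PDP = average over all ICE curves
```

Here the two ICE curves are vertically shifted copies of each other, so the PDP has the same shape as every individual curve. With interacting hyperparameters the curves differ in shape, which is where partitioning becomes useful.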

The partitioners split the space of the hyperparameters that were not selected into multiple regions. The best region is the one that contains the sampler's incumbent:

incumbent_config = sampler.incumbent_config
dt_partitioner.partition(max_depth=3)
dt_region = dt_partitioner.get_incumbent_region(incumbent_config)

rf_partitioner.partition(max_depth=1, num_trees=10)
rf_region = rf_partitioner.get_incumbent_region(incumbent_config)
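The idea behind a single partitioning step can be sketched as follows: choose a threshold on a non-selected hyperparameter so that the ICE curves within each child region are as similar as possible, then keep the child that contains the incumbent. This is a simplified illustration in NumPy. The loss used here (squared deviation of the ICE values from the child's mean curve) is an assumption made for the sketch, and the package's actual split criterion may differ; `best_split` and all data are hypothetical.

```python
import numpy as np


def best_split(feature_values, curves):
    """Return (threshold, loss) of the single split with the least within-child spread."""
    feature_values = np.asarray(feature_values, dtype=float)
    curves = np.asarray(curves, dtype=float)
    unique = np.unique(feature_values)
    thresholds = (unique[:-1] + unique[1:]) / 2  # midpoints keep both children non-empty
    best = (None, np.inf)
    for t in thresholds:
        left = feature_values <= t
        loss = 0.0
        for mask in (left, ~left):
            child = curves[mask]
            loss += ((child - child.mean(axis=0)) ** 2).sum()
        if loss < best[1]:
            best = (float(t), float(loss))
    return best


# Value of one non-selected hyperparameter per sample, and one ICE curve per sample.
x2 = np.array([0.0, 0.1, 0.9, 1.0])
curves = np.array([[0.0] * 3, [0.1] * 3, [5.0] * 3, [5.1] * 3])

threshold, loss = best_split(x2, curves)

# The incumbent region is the child that contains the incumbent's x2 value.
incumbent_x2 = 0.95
in_region = (x2 > threshold) if incumbent_x2 > threshold else (x2 <= threshold)
region_pdp = curves[in_region].mean(axis=0)
```

Applying such a split recursively up to a maximum depth yields a tree of regions, in the spirit of the max_depth argument used above.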

Finally, a new PDP can be obtained from a region. This PDP has the properties of a single ICE curve, since the mean of several ICE curves is itself an ICE curve.

pdp_region = dt_region.pdp_as_ice_curve

Plotting

Most components can create plots. A plot is drawn on the given axis if one is passed, and on plt.gca() otherwise.

Samplers

sampler.plot()  # Plots all samples

Surrogate

surrogate.plot_means()  # Plots mean predictions of surrogate
surrogate.plot_confidences()  # Plots confidences

Acquisition Function

surrogate.acq_func.plot()  # Plot acquisition function of surrogate model

ICE

ice.plot()  # Plots all ice curves. Only possible for 1 selected hyperparameter

ICE Curve

ice_curve = ice[0]  # Get first ice curve
ice_curve.plot_values()  # Plot values of ice curve 
ice_curve.plot_confidences()  # Plot confidences of ice curve 
ice_curve.plot_incumbent()  # Plot position of smallest value 

PDP

pdp.plot_values()  # Plot values of pdp
pdp.plot_confidences()  # Plot confidences of pdp 
pdp.plot_incumbent()  # Plot position of smallest value 

Partitioner

dt_partitioner.plot()  # only 1 selected hp, plots all ice curves in different color per region
dt_partitioner.plot_incumbent_cs(incumbent_config)  # plot config space of best region

rf_partitioner.plot_incumbent_cs(incumbent_config)  # plot incumbent config of all trees

Regions

dt_region.plot_values()  # plot pdp of region
dt_region.plot_confidences()  # plot confidence of pdp in region

Plotting examples

Surrogate

Source: tests/sampler/test_acquisition_function.py

  • 1D-Surrogate model with mean + confidence
  • acquisition function

Sampler

Source: tests/sampler/test_mmd.py

  • Underlying blackbox function (2D-Styblinski-Tang)
  • Samples from RandomSampler
  • Samples from BayesianOptimizationSampler

ICE

Source: tests/algorithms/test_ice.py

  • All ICE-Curves from 2D-Styblinski-Tang with 1 selected Hyperparameter

PDP

Source: tests/algorithms/test_pdp.py

  • 2D PDP (means)
  • 2D PDP (confidences)
  • All Samples for surrogate model

PDP

Source: examples/main_2d_pdp.py (num_grid_points_per_axis=100)

  • 2D PDP (means)

Decision Tree Partitioner

Source: tests/algorithms/partitioner/test_partitioner.py

  • All ICE curves split into 8 different regions (3 splits), using 2D-Styblinski-Tang with 1 selected hyperparameter

Decision Tree Config Spaces

Source: tests/algorithms/partitioner/test_partitioner.py

  • All leaf config spaces from the Decision Tree Partitioner with the 3D-Styblinski-Tang function and 1 selected hyperparameter (x3)
  • 2D-Styblinski-Tang in background
