Package with encoder-based approaches to warm-starting Bayesian Hyperparameter Optimization.
WSMF
tl;dr
This package contains implementations of two novel encoder-based approaches to warm-starting Bayesian Hyperparameter Optimization. It supports both training and using meta-models that help with this meta-task.
Contents
Meta-models - currently the package contains two approaches to encoder-based warm-starting (a sketch of the metric-learning objective follows this list):
- Metric learning (`Dataset2VecMetricLearning`) - uses Dataset2Vec as an encoder, trained so that distances between the produced dataset representations correspond to distances between the landmarkers (vectors of performances of a predefined set of hyperparameter configurations).
- Landmarker reconstruction (`LandmarkerReconstructionTrainingInterface`) - uses Dataset2Vec as an encoder to produce a latent representation of the entire dataset (of any size), which is passed to an MLP that predicts the landmarker vector.
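To make the metric-learning objective concrete, here is a minimal sketch of the kind of loss involved; the pairing of datasets, the function name, and the use of Euclidean distances are illustrative assumptions, not the package's exact implementation:

```python
import torch

def metric_learning_loss(encoder, X1, y1, X2, y2, l1, l2):
    """Push distances between dataset encodings towards distances
    between the corresponding landmarker vectors (illustrative sketch)."""
    z1 = encoder(X1, y1)           # Dataset2Vec-style encoding of dataset 1
    z2 = encoder(X2, y2)           # Dataset2Vec-style encoding of dataset 2
    d_repr = torch.norm(z1 - z2)   # distance in representation space
    d_land = torch.norm(l1 - l2)   # distance between landmarker vectors
    return (d_repr - d_land) ** 2  # squared mismatch between the two distances
```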
Selectors - for the intended meta-task, `wsmf` provides an API that uses the encoder to propose hyperparameter configurations (a sketch of the selection idea follows this list). It contains the following selectors:
- Selector that chooses configurations based on the learned representation, applicable in the metric-learning approach (`RepresentationBasedHpSelector`)
- Selector based on the reconstructed landmarkers (`ReconstructionBasedHpSelector`)
- Random selector drawing from the predefined portfolio (`RandomHpSelector`)
- Selector that chooses the configurations that are best on average (`RankBasedHpSelector`)
- Selector that chooses configurations based on the landmarker vectors themselves (`LandmarkerHpSelector`)
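A minimal sketch of the representation-based selection idea, assuming a trained encoder and that larger landmarker values mean better performance; all names here are illustrative, not the package's API:

```python
import torch

def propose_by_representation(encoder, new_dataset, datasets, landmarkers, configurations, n):
    """Encode the new dataset, find the closest known dataset in
    representation space, and return its n best configurations."""
    z_new = encoder(*new_dataset)
    distances = {
        name: torch.norm(z_new - encoder(X, y)).item()
        for name, (X, y) in datasets.items()
    }
    closest = min(distances, key=distances.get)                  # nearest known dataset
    best = torch.topk(landmarkers[closest], n).indices.tolist()  # its top-n configurations
    return [configurations[i] for i in best]
```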
Examples of usage
Training a metric-learning-based meta-model
```python
import pytorch_lightning as pl

# EncoderHpoDataset, EncoderMetricLearningLoader, GenericRepeatableDataLoader and
# Dataset2VecMetricLearning come from wsmf (exact import paths omitted here).

# Tensors X, y are torch.Tensor objects corresponding to feature and target matrices.
train_datasets = {  # training meta-dataset
    "dataset_train_1": (tensor_X1, tensor_y1),
    "dataset_train_2": (tensor_X2, tensor_y2),
    ...
}
val_datasets = {  # validation meta-dataset
    "dataset_val_1": (tensor_X1, tensor_y1),
    "dataset_val_2": (tensor_X2, tensor_y2),
    ...
}

# Tensors l1, l2, ... correspond to vectors of landmarkers in torch.Tensor format.
train_landmarkers = {  # training meta-dataset
    "dataset_train_1": l1,
    "dataset_train_2": l2,
    ...
}
val_landmarkers = {  # validation meta-dataset
    "dataset_val_1": l1,
    "dataset_val_2": l2,
    ...
}

train_dataset = EncoderHpoDataset(train_datasets, train_landmarkers)
train_dataloader = EncoderMetricLearningLoader(train_dataset, train_num_batches, train_batch_size)
val_dataset = EncoderHpoDataset(val_datasets, val_landmarkers)
val_dataloader = EncoderMetricLearningLoader(val_dataset, val_num_batches, val_batch_size)
val_dataloader = GenericRepeatableDataLoader(val_dataloader)  # loader producing repeatable batches

model = Dataset2VecMetricLearning()
trainer = pl.Trainer()
trainer.fit(model, train_dataloader, val_dataloader)
```
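The trained meta-model can be saved for later use with the selectors. A minimal follow-up using Lightning's standard checkpointing (the file name is illustrative):

```python
# Persist the trained meta-model; it can be restored later via
# Dataset2VecMetricLearning.load_from_checkpoint(...)
trainer.save_checkpoint("metric_learning_meta_model.ckpt")
```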
Using a selector based on reconstruction
```python
# Dataset2VecForLandmarkerReconstruction and ReconstructionBasedHpSelector
# come from wsmf (exact import paths omitted here).

datasets = {  # datasets to search from (in this case used for closest-dataset search)
    "dataset_1": (tensor_X1, tensor_y1),
    "dataset_2": (tensor_X2, tensor_y2),
    ...
}
landmarkers = {  # landmarkers to search from (in this case used for proposing the best configurations)
    "dataset_1": l1,
    "dataset_2": l2,
    ...
}
configurations = [
    {"hp1": val1, "hp2": val2},
    {"hp1": val3, "hp2": val4},
    ...
]

meta_model = Dataset2VecForLandmarkerReconstruction.load_from_checkpoint("path_to_meta_model.ckpt")
selector = ReconstructionBasedHpSelector(
    meta_model,
    datasets,
    landmarkers,
    configurations,
)

# Usage
new_dataset = (X, y)  # X, y are torch.Tensor objects
n_configurations = 10
proposed = selector.propose_configurations(new_dataset, n_configurations)
```
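The proposed configurations can then seed any Bayesian optimization loop. As one illustration (Optuna is not a stated dependency of `wsmf`, and the search space and scoring function below are hypothetical), they can be enqueued as the first trials of a study:

```python
import optuna

def objective(trial):
    # Hypothetical search space matching the configuration keys above.
    hp1 = trial.suggest_float("hp1", 0.0, 1.0)
    hp2 = trial.suggest_float("hp2", 0.0, 1.0)
    return train_and_score(hp1, hp2)  # user-supplied evaluation function

study = optuna.create_study(direction="maximize")
for config in proposed:
    study.enqueue_trial(config)  # evaluate the warm-start configurations first
study.optimize(objective, n_trials=50)
```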
Development
Commands useful during development:
- Setting environment variables - `` export PYTHONPATH=`pwd` ``
- Installing dependencies - `pip install -r requirements_dev.txt`
- Running unit tests - `pytest`
- Checking code quality - `./scripts/check_code.sh`
- Release - `python -m build && twine upload dist/*`
File details
Details for the file `wsmf-0.0.2.tar.gz`.
File metadata
- Download URL: wsmf-0.0.2.tar.gz
- Upload date:
- Size: 40.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1966063a19ed4b845c15aac77960ebee26a8b6db95eb99f59aa912e1cbd6b5a8 |
| MD5 | 234c12e297c72af9f7df0d502a0a6dc4 |
| BLAKE2b-256 | 30f279c8e2ea3668f6246d98f9ce214b12fa465a48ae960265c4140f1270a40e |
File details
Details for the file `wsmf-0.0.2-py3-none-any.whl`.
File metadata
- Download URL: wsmf-0.0.2-py3-none-any.whl
- Upload date:
- Size: 15.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 58028e13448fe28e30f1f47ed45d52aa77567fc34f0b20c2d3a4ec4a12d27312 |
| MD5 | 03eb2f960123a3b884f6e1ebd6cc5c75 |
| BLAKE2b-256 | 5254fc1e91001ad8cf9905d845b5df77ea6512df862536f7fa1c351ab4bbfe9c |