Significance Analysis for HPO-algorithms performing on multiple benchmarks
Significance Analysis
This package is used to analyse datasets of different HPO-algorithms performing on multiple benchmarks.
Note
As indicated by the v0.x.x version number, Significance Analysis is early-stage code and its APIs might change in the future.
Documentation
Please have a look at our example. The dataset should have the following format:
system_id (algorithm name) | input_id (benchmark name) | metric (mean/estimate) | optional: bin_id (budget/training round) |
---|---|---|---|
Algorithm1 | Benchmark1 | x.xxx | 1 |
Algorithm1 | Benchmark1 | x.xxx | 2 |
Algorithm1 | Benchmark2 | x.xxx | 1 |
... | ... | ... | ... |
Algorithm2 | Benchmark2 | x.xxx | 2 |
In this dataset, there are two different algorithms, each run on two benchmarks for two iterations. The variable names (system_id, input_id, ...) can be customized, but they have to be consistent throughout the dataset, i.e. not "mean" for one benchmark and "estimate" for another. The conduct_analysis function is then called with the dataset and the variable names as parameters.
Optionally, the dataset can be binned according to a fourth variable (bin_id), in which case the analysis is conducted on each bin separately, as shown in our example. To do this, provide the name of the bin_id variable and, if desired, the exact bins and bin labels. Otherwise, one bin is created for each unique value.
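As a minimal sketch of the format described above, the following builds a small dataset with pandas and illustrates what binning on a fourth variable means. The metric values are made-up placeholders, the `budget_bin` column and the labels "early"/"late" are hypothetical, and `pandas.cut` is used here only for illustration; it is not necessarily how the package bins internally.

```python
import pandas as pd

# Column names follow the table above; the metric values are placeholders.
data = pd.DataFrame(
    {
        "system_id": ["Algorithm1"] * 4 + ["Algorithm2"] * 4,
        "input_id": ["Benchmark1", "Benchmark1", "Benchmark2", "Benchmark2"] * 2,
        "metric": [0.81, 0.85, 0.62, 0.66, 0.79, 0.88, 0.60, 0.71],
        "bin_id": [1, 2, 1, 2] * 2,
    }
)

# Default behaviour described above: one bin per unique bin_id value.
default_bins = sorted(data["bin_id"].unique().tolist())

# Explicit bins and labels, illustrated with pandas.cut (an assumption,
# not the package's own code). Intervals are (0, 1] and (1, 2].
data["budget_bin"] = pd.cut(data["bin_id"], bins=[0, 1, 2], labels=["early", "late"])

# Each bin would then be analysed on its own subset of rows.
for label, group in data.groupby("budget_bin", observed=True):
    print(label, len(group))
```

Every row carries exactly one algorithm, one benchmark and one metric value, so the same long-format layout works with or without the optional bin column.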
Installation
Using R (>= 4.0.0), install the packages Matrix, emmeans, lmerTest and lme4
Using pip
pip install significance-analysis
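Since the installation relies on R being present, a quick check from Python can confirm that an R executable is reachable before running an analysis. This is only a sketch: the helper name `rscript_version` is hypothetical, it assumes R exposes an `Rscript` executable on the PATH, and it does not verify that the R packages listed above are installed.

```python
import shutil
import subprocess


def rscript_version():
    """Return the `Rscript --version` banner, or None if Rscript is not on PATH."""
    exe = shutil.which("Rscript")
    if exe is None:
        return None
    result = subprocess.run(
        [exe, "--version"], capture_output=True, text=True, check=False
    )
    # Depending on the R version, the banner is printed to stdout or stderr.
    return (result.stdout or result.stderr).strip()


version = rscript_version()
if version is None:
    print("Rscript not found; install R >= 4.0.0 first.")
else:
    print(version)
```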
Usage
- Generate data from HPO-algorithms on benchmarks, saving data according to our format.
- Call the conduct_analysis function on the dataset, specifying the variable names.
In code, the usage pattern can look like this:
import pandas as pd
from significance_analysis import conduct_analysis
# 1. Generate/import dataset
data = pd.read_csv("./significance_analysis_example/exampleDataset.csv")
# 2. Analyse dataset
conduct_analysis(data, "mean", "acquisition", "benchmark")
For more details and features please have a look at our example.
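Before calling conduct_analysis, it can help to confirm that the dataset really contains the columns whose names are passed in. The helper below is a hypothetical pre-check written for this README, not part of significance-analysis; the column names "mean", "acquisition" and "benchmark" are taken from the usage pattern above, while the row values are placeholders.

```python
import pandas as pd


def check_dataset(data, metric, system_id, input_id, bin_id=None):
    """Hypothetical pre-check; returns (number of algorithms, number of benchmarks)."""
    required = [metric, system_id, input_id] + ([bin_id] if bin_id else [])
    missing = [col for col in required if col not in data.columns]
    if missing:
        raise ValueError(f"Missing columns: {missing}")
    if not pd.api.types.is_numeric_dtype(data[metric]):
        raise TypeError(f"Column {metric!r} must be numeric")
    if data[required].isna().any().any():
        raise ValueError("Dataset contains missing values in required columns")
    return data[system_id].nunique(), data[input_id].nunique()


# Placeholder rows using the column names from the usage pattern above.
example = pd.DataFrame(
    {
        "mean": [0.5, 0.6, 0.7, 0.8],
        "acquisition": ["Algorithm1", "Algorithm1", "Algorithm2", "Algorithm2"],
        "benchmark": ["Benchmark1", "Benchmark2", "Benchmark1", "Benchmark2"],
    }
)
n_algorithms, n_benchmarks = check_dataset(example, "mean", "acquisition", "benchmark")
print(n_algorithms, n_benchmarks)
```

Failing early with a clear message is usually easier to debug than an error raised deep inside the statistical model fitting.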
Hashes for significance_analysis-0.1.7.tar.gz
Algorithm | Hash digest |
---|---|
SHA256 | 25b1bb3da0c934a8ca077a7fd511ac429155b7ccd5e9dde9de9e04252e1c9a78 |
MD5 | 46b033ccedf3a4df51f076ad7d2fc0f6 |
BLAKE2b-256 | 7f8588a6e1e6c61bb430f564f8b2a13de3ab0214dcc77d515a13b0a4b06d9be6 |
Hashes for significance_analysis-0.1.7-py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | b647a11df2b0872dfd47475e0a70d0243dedd3ad0d31ad2ac77237b960ea8890 |
MD5 | 3048b1322215156c9d9fd742fcf6af8d |
BLAKE2b-256 | bc2a1ece182fe36b50e4b2239d33fd40a7534c9d3440ee8b2977f8fcd728b18a |