Significance Analysis for HPO algorithms performing on multiple benchmarks
This package is used to analyse datasets of different HPO algorithms evaluated on multiple benchmarks, using a Linear Mixed-Effects Model (LMEM)-based approach.
Note
As indicated by the v0.x.x version number, Significance Analysis is early-stage code and its APIs might change in the future.
Documentation
Please have a look at our example. The dataset should be a pandas DataFrame in the following format:
algorithm | benchmark | metric | optional: budget/prior/...
---|---|---|---
Algorithm1 | Benchmark1 | x.xxx | 1.0
Algorithm1 | Benchmark1 | x.xxx | 2.0
Algorithm1 | Benchmark2 | x.xxx | 1.0
... | ... | ... | ...
Algorithm2 | Benchmark2 | x.xxx | 2.0
Our function dataframe_validator checks that a dataframe conforms to this format.
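To make the expected structure concrete, here is a simplified, hypothetical sketch of the kind of structural check such a validator performs. It is not the package's implementation (the real dataframe_validator operates on a pandas DataFrame); it only illustrates the required and optional columns:

```python
# Hypothetical, simplified sketch of a format check: every row must carry
# the algorithm, benchmark and metric columns; extra columns such as
# budget or prior are allowed.
REQUIRED_COLUMNS = {"algorithm", "benchmark", "metric"}

def has_required_columns(rows):
    """Return True if every row contains all required columns."""
    return all(REQUIRED_COLUMNS <= row.keys() for row in rows)

rows = [
    {"algorithm": "Algorithm1", "benchmark": "Benchmark1", "metric": 0.42, "budget": 1.0},
    {"algorithm": "Algorithm1", "benchmark": "Benchmark1", "metric": 0.37, "budget": 2.0},
]
print(has_required_columns(rows))  # optional columns like budget are fine
```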
Installation
Using R (>= 4.0.0), install the R packages Matrix, emmeans, lmerTest and lme4.
Then install the package itself using pip:
pip install significance-analysis
Usage for significance testing
- Generate data from HPO algorithms on benchmarks, saving it according to our format.
- Build a model with all factors of interest.
- Do post-hoc testing.
- Plot the results as a CD diagram.
In code, the usage pattern can look like this:
import pandas as pd
from significance_analysis import dataframe_validator, model, cd_diagram
# 1. Generate/import dataset
data = dataframe_validator(pd.read_parquet("datasets/priorband_data.parquet"))
# 2. Build the model
mod = model("value ~ algorithm + (1|benchmark) + prior", data)
# 3. Conduct the post-hoc analysis
post_hoc_results = mod.post_hoc("algorithm")
# 4. Plot the results
cd_diagram(post_hoc_results)
Usage for hypothesis testing
Use the GLRT implementation or our prepared sanity checks to conduct LMEM-based hypothesis testing.
In code:
import pandas as pd

from significance_analysis import (
    dataframe_validator,
    glrt,
    model,
    seed_dependency_check,
    benchmark_information_check,
    fidelity_check,
)
# 1. Generate/import dataset
data = dataframe_validator(pd.read_parquet("datasets/priorband_data.parquet"))
# 2. Run the preconfigured sanity checks
seed_dependency_check(data)
benchmark_information_check(data)
fidelity_check(data)
# 3. Run a custom hypothesis test, comparing model_1 and model_2
model_1 = model("value ~ algorithm", data)
model_2 = model("value ~ 1", data)
glrt(model_1, model_2)
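For intuition, the statistic behind a generalized likelihood-ratio test (GLRT) can be sketched as below. The log-likelihood values are made-up numbers and this is not the package's implementation; it only shows the standard chi-square approximation for comparing a model against a nested null model:

```python
import math

def glrt_statistic(loglik_full, loglik_null):
    """GLRT statistic: twice the log-likelihood gain of the larger model."""
    return 2.0 * (loglik_full - loglik_null)

# Made-up log-likelihoods of a full model and a nested null model.
stat = glrt_statistic(loglik_full=-120.3, loglik_null=-125.0)

# Under the null hypothesis the statistic is asymptotically chi-square
# distributed with df = difference in number of parameters. For df = 1,
# the chi-square survival function has the closed form erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(stat / 2.0))
print(f"statistic={stat:.2f}, p={p_value:.4f}")
```

A small p-value indicates that the additional term (here, the algorithm effect) significantly improves the fit.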
Usage for metafeature impact analysis
Analyse the influence a metafeature has on the performance of two algorithms.
In code:
import pandas as pd

from significance_analysis import dataframe_validator, metafeature_analysis
# 1. Generate/import dataset
data = dataframe_validator(pd.read_parquet("datasets/priorband_data.parquet"))
# 2. Run the metafeature analysis
scores = metafeature_analysis(data, ("HB", "PB"), "prior")
For more details and features, please have a look at our example.
Hashes for significance_analysis-0.2.1.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 78285d2af842b1cc296abbffda01f312e00f7efe37cee8ca336cddd7cb01e05f
MD5 | 58543cacca7900d9ce8b8918814c6557
BLAKE2b-256 | 9336f6d0ea6ee62ddfe638cac828aae69c3dc17f0e952ff34d5f0bb031f03891

Hashes for significance_analysis-0.2.1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 4be51cf86bcd8f8a1c5a4253969a08be9e2346b0e6fca981a791ca607667071d
MD5 | a0ef1f10d1cf3c0f8e0fb98906fa86fe
BLAKE2b-256 | 5f2a5b971b55d9240b4ee97a66e49eacd419d83249103128fb8a297e90332025