Benchmarking Likelihood Ratio systems
Project description
LRBenchmark
Repository for benchmarking Likelihood Ratio systems.
Prerequisites
- This repository is developed for Python 3.8.
Dependencies
All dependencies can either be installed by running pip install -r requirements.txt
or pip install .
.
Add new dependencies to the setup.py and always update the requirements by running:
pip-compile --output-file=requirements.txt setup.py
.
Usage
Running the benchmark can be done as follows:
- Specify the parameters for the benchmark in the
lrbenchmark.yaml
- Run
python run.py
The parameters for the benchmark must be provided in the following structure:
experiment:
repeats: 10
scorer:
- 'name scorer 1'
- 'name scorer 2'
calibrator:
- 'name calibrator'
dataset:
- 'name dataset'
preprocessor:
- 'name preprocessor 1'
- 'name preprocessor 2'
At least 1 setting needs to be provided for each parameter, but more settings per parameter can be provided. The pipeline will
create the cartesian product over all parameter settings (except repeats
) and will execute the experiments accordingly.
All possible settings can be found in params.py
. The parameters that need to be set are:
repeats
: Number of repeats for each experiment.scorer
: Scoring models for generating scores.calibrator
: Models for calibrating scores.dataset
: Datasets on which the experiments can be executed.preprocessor
: Data preprocessing steps. You can use the value'dummy'
if no preprocessing is needed.
Example: Benchmark feature rank based LR system
This repository supports several data transformations, such as the possibility to transform X features from values to ranks.
To benchmark these models against models without any transformations on X features, the following experiments (among others) could be
defined in lrbenchmark.yaml
.
experiment:
repeats: 10
scorer:
- 'LR'
- 'XGB'
calibrator:
- 'logit'
dataset:
drugs_xtc:
n_splits: 2
glass:
n_splits: 2
preprocessor:
- 'dummy'
- 'rank_transformer'
When executing python run.py
an experiment for all possible combination of parameters will be executed. The results for each experiment (metrics + plots)
will be stored in separate folders within the output
folder.
Datasets
There are currently two datasets implemented for this project:
- drugs_xtc: will be published on our github soon
- glass: LA-ICPMS measurements of elemental concentration from floatglass. The data will be downloaded automatically from https://github.com/NetherlandsForensicInstitute/elemental_composition_glass when used in the pipeline for the first time.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file lrbenchmark-0.1.2.tar.gz
.
File metadata
- Download URL: lrbenchmark-0.1.2.tar.gz
- Upload date:
- Size: 12.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.17
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | cb93e95285bd52c99da62d05b9152ba6cc7f99bd559218d79107cfbda684ddfc |
|
MD5 | b0142f7d3f0fc094d7c604ba2e59c5b4 |
|
BLAKE2b-256 | dda9c1b6c40c1fb565b33934a88d9592c72d404518ea271b20ffa54f7f9e8691 |
File details
Details for the file lrbenchmark-0.1.2-py3-none-any.whl
.
File metadata
- Download URL: lrbenchmark-0.1.2-py3-none-any.whl
- Upload date:
- Size: 12.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.17
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 35eb5cce45a76463dc4d4945866d103816c6275e47cb56641ca0d2bba60ea961 |
|
MD5 | cde537d17ae2bba6074c5a46b016c11a |
|
BLAKE2b-256 | 6e874c34fc7804dccf18e8c4c3bc7907c762a8d705714dc6a1c0b13aa4046de8 |