Provides utilities for the training and evaluation of multi-label rule learning algorithms
"MLRL-Testbed": Utilities for Evaluating Multi-label Rule Learning Algorithms
Important links: Documentation | Issue Tracker | Changelog | Contributors | Code of Conduct | License
This software package provides utilities for training and evaluating single- and multi-label rule learning algorithms that have been implemented using the "MLRL-Common" library, including the following ones:
- BOOMER (Gradient Boosted Multi-label Classification Rules): A state-of-the-art algorithm that uses gradient boosting to learn an ensemble of rules that is built with respect to a given multivariate loss function.
Functionalities
Most notably, the package includes command line APIs that allow configuring the algorithms mentioned above, applying them to different datasets, and evaluating their predictive performance in terms of commonly used measures (provided by the scikit-learn framework). In summary, it provides the following functionalities:
- Single- and multi-label datasets in the Mulan and Meka format are supported.
- Datasets can automatically be split into training and test data, including the possibility to use cross validation. Alternatively, predefined splits can be used by supplying the data as separate files.
- One-hot-encoding can be applied to nominal or binary features.
- Binary predictions, regression scores, or probability estimates can be obtained from a model. Evaluation measures that are suited for the respective type of predictions are picked automatically.
- Evaluation scores can be saved to output files and printed on the console.
- Rule models can be evaluated incrementally, i.e., they can be evaluated repeatedly using a subset of the rules with increasing size.
- Textual representations of rule models can be saved to output files and printed on the console. In addition, the characteristics of models can also be saved and printed.
- Characteristics of datasets can be saved to output files and printed on the console.
- Unique label vectors contained in a dataset can be saved to output files and printed on the console.
- Predictions can be saved to output files and printed on the console. In addition, characteristics of binary predictions can also be saved and printed.
- Models for the calibration of probabilities can be saved to output files and printed on the console.
- Models can be saved on disk in order to be reused by future experiments.
- Algorithmic parameters can be read from configuration files instead of providing them via command line arguments. When providing parameters via the command line, corresponding configuration files can automatically be saved on disk.
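To make two of the items above more concrete (splitting a dataset into cross validation folds and evaluating a rule model incrementally), the following is a minimal, self-contained sketch in plain Python. It does not use the package's own API: the toy `Rule` representation, the function names, and the hand-written Hamming loss are all assumptions made for illustration. The package itself relies on measures provided by scikit-learn.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical toy rule: a condition on a feature vector plus the per-label
# scores it contributes when the condition holds. Not the package's model format.
Rule = Tuple[Callable[[Sequence[float]], bool], List[float]]


def k_fold_indices(num_examples: int, num_folds: int) -> List[Tuple[List[int], List[int]]]:
    """Returns one (train_indices, test_indices) pair per fold."""
    folds = [list(range(i, num_examples, num_folds)) for i in range(num_folds)]
    splits = []
    for i, test in enumerate(folds):
        train = sorted(j for k, fold in enumerate(folds) if k != i for j in fold)
        splits.append((train, test))
    return splits


def predict(rules: Sequence[Rule], x: Sequence[float], num_labels: int) -> List[int]:
    """Sums the scores of all rules that cover x and thresholds them at zero."""
    scores = [0.0] * num_labels
    for condition, head in rules:
        if condition(x):
            scores = [s + h for s, h in zip(scores, head)]
    return [1 if s > 0 else 0 for s in scores]


def hamming_loss(y_true: Sequence[Sequence[int]], y_pred: Sequence[Sequence[int]]) -> float:
    """Fraction of individual labels that are predicted incorrectly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(t != p for tr, pr in zip(y_true, y_pred) for t, p in zip(tr, pr))
    return wrong / total


def evaluate_incrementally(rules: Sequence[Rule], X, Y, step: int = 1) -> List[Tuple[int, float]]:
    """Evaluates the first `size` rules for increasing values of `size`."""
    num_labels = len(Y[0])
    results = []
    for size in range(step, len(rules) + 1, step):
        predictions = [predict(rules[:size], x, num_labels) for x in X]
        results.append((size, hamming_loss(Y, predictions)))
    return results


# Made-up toy data: four examples, two features, two labels.
X = [[0.2, 1.0], [0.8, 0.0], [0.5, 1.0], [0.9, 1.0]]
Y = [[0, 1], [1, 0], [1, 1], [1, 1]]
RULES: List[Rule] = [
    (lambda x: True, [-1.0, 1.0]),       # default rule that covers every example
    (lambda x: x[0] > 0.4, [2.0, 0.0]),  # raises the score of the first label
    (lambda x: x[1] < 0.5, [0.0, -2.0]), # lowers the score of the second label
]

if __name__ == "__main__":
    print(k_fold_indices(len(X), 2))
    for size, loss in evaluate_incrementally(RULES, X, Y):
        print(f"first {size} rule(s): Hamming loss = {loss}")
```

In this sketch, the loss is recomputed from scratch for each prefix of the rule list, which keeps the code short. For large models, accumulating the scores rule by rule would avoid the repeated work.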
License
This project is open source software licensed under the terms of the MIT license. We welcome contributions to the project to enhance its functionality and make it more accessible to a broader audience. A frequently updated list of contributors is available here.
File details
Details for the file mlrl_testbed-0.10.1-py3-none-any.whl.
File metadata
- Download URL: mlrl_testbed-0.10.1-py3-none-any.whl
- Upload date:
- Size: 57.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.12.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | b6e8160d208a65dc4aedc3a291109ccf63d27c568d1b966374c278e1fbf5c4f3
MD5 | f4c99bd1b39c025f2eb4d2c977ebd166
BLAKE2b-256 | eecb94412c3d8945fda62b5b6cdd1c2567de40b1af2562988f375e3902593369