
Provides utilities for the training and evaluation of multi-label rule learning algorithms

Project description

"MLRL-Testbed": Utilities for Evaluating Multi-label Rule Learning Algorithms


Important links: Documentation | Issue Tracker | Changelog | Contributors | Code of Conduct | License

This software package provides utilities for training and evaluating single- and multi-label rule learning algorithms that have been implemented using the "MLRL-Common" library, including the following:

  • BOOMER (Gradient Boosted Multi-label Classification Rules): A state-of-the-art algorithm that uses gradient boosting to learn an ensemble of rules that is built with respect to a given multivariate loss function.
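
For example, to experiment with the BOOMER algorithm through this testbed, its package must be installed alongside this one. The following pip invocation is merely illustrative; the distribution names mlrl-boomer and mlrl-testbed correspond to the respective PyPI packages:

  pip install mlrl-boomer mlrl-testbed

Once both packages are installed, the algorithm's command line API (referred to as the boomer command below) should be available.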

Functionalities

Most notably, the package includes command line APIs that allow configuring the algorithms mentioned above, applying them to different datasets, and evaluating their predictive performance in terms of commonly used measures (provided by the scikit-learn framework). In summary, it provides the following functionalities (an example invocation is sketched below the list):

  • Single- and multi-label datasets in the Mulan and Meka format are supported.
  • Datasets can automatically be split into training and test data, including the possibility to use cross-validation. Alternatively, predefined splits can be used by supplying the data as separate files.
  • One-hot-encoding can be applied to nominal or binary features.
  • Binary predictions, regression scores, or probability estimates can be obtained from a model. Evaluation measures that are suited for the respective type of predictions are picked automatically.
  • Evaluation scores can be saved to output files and printed on the console.
  • Rule models can be evaluated incrementally, i.e., they can be evaluated repeatedly using subsets of their rules with increasing size.
  • Textual representations of rule models can be saved to output files and printed on the console. In addition, the characteristics of models can be saved and printed.
  • Characteristics of datasets can be saved to output files and printed on the console.
  • Unique label vectors contained in a dataset can be saved to output files and printed on the console.
  • Predictions can be saved to output files and printed on the console. In addition, characteristics of binary predictions can be saved and printed.
  • Models for the calibration of probabilities can be saved to output files and printed on the console.
  • Models can be saved on disk in order to be reused by future experiments.
  • Algorithmic parameters can be read from configuration files instead of providing them via command line arguments. When providing parameters via the command line, corresponding configuration files can automatically be saved on disk.
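
As an illustration, the following console command sketches how such an experiment might be launched. This is a hedged example only: the dataset name "emotions" is a placeholder, and the argument names shown (--data-dir, --dataset, --data-split, --print-evaluation, --store-evaluation, --output-dir) are assumptions based on typical usage that may differ between versions, so the documentation or the command's --help output should be consulted for the authoritative names:

  # Train and evaluate BOOMER on a dataset located in the given directory,
  # using a cross validation and writing evaluation results to an output directory.
  boomer --data-dir /path/to/datasets \
         --dataset emotions \
         --data-split cross-validation \
         --print-evaluation true \
         --store-evaluation true \
         --output-dir /path/to/results

Corresponding arguments exist for the other functionalities listed above, e.g., for saving models to disk, printing textual representations of rules, or reading algorithmic parameters from configuration files.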

License

This project is open source software licensed under the terms of the MIT license. We welcome contributions to the project to enhance its functionality and make it more accessible to a broader audience. A frequently updated list of contributors is available here.
