A scikit-learn implementation of BOOMER - an algorithm for learning gradient boosted multi-label classification rules

BOOMER - Gradient Boosted Multi-Label Classification Rules

This software package provides an implementation of BOOMER - an algorithm for learning gradient boosted multi-label classification rules that integrates with the popular scikit-learn machine learning framework.

The goal of multi-label classification is the automatic assignment of sets of labels to individual data points, for example, the annotation of text documents with topics. The BOOMER algorithm uses gradient boosting to learn an ensemble of rules that is built with respect to a given multivariate loss function. To provide a versatile tool for different use cases, great emphasis is put on the efficiency of the implementation. To ensure its flexibility, it is designed in a modular fashion and can therefore easily be adjusted to different requirements.
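To make the problem setting concrete, the following minimal sketch shows how sets of labels can be encoded as a binary label matrix, the representation commonly used for multi-label data. The documents and topic names are made up for illustration and the code does not use this package's API.

```python
# Toy data: each document is annotated with a set of topics (hypothetical labels).
label_sets = [
    {"politics", "economy"},
    {"sports"},
    set(),  # a data point may also have no labels at all
    {"economy", "sports", "politics"},
]

# Fix a column order so that every label corresponds to one column of the matrix.
labels = sorted(set().union(*label_sets))

# Row i, column j is 1 if example i is annotated with label j, and 0 otherwise.
label_matrix = [[1 if label in s else 0 for label in labels] for s in label_sets]

for row in label_matrix:
    print(row)
```

A multi-label classifier is trained on a feature matrix together with such a label matrix and predicts one row of label assignments per data point.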

References

The algorithm was first published in the following paper. A preprint version is publicly available here.

Michael Rapp, Eneldo Loza Mencía, Johannes Fürnkranz, Vu-Linh Nguyen and Eyke Hüllermeier. Learning Gradient Boosted Multi-label Classification Rules. In: Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases (ECML-PKDD), 2020, Springer.

If you use the algorithm in a scientific publication, we would appreciate citations to the paper mentioned above. An overview of publications concerned with the BOOMER algorithm, together with information on how to cite them, can be found in the References section of the documentation.

Features

The algorithm that is provided by this project currently supports the following core functionalities to learn an ensemble of boosted classification rules:

  • Different label-wise or example-wise loss functions can be minimized during training (optionally using L1 or L2 regularization).
  • The rules may predict for a single label or for all labels (which makes it possible to model local label dependencies).
  • When learning a new rule, random samples of the training examples, features or labels may be used (including different techniques such as sampling with or without replacement or stratification methods).
  • The impact of individual rules on the ensemble can be controlled using shrinkage.
  • Hyper-parameters that provide fine-grained control over the specificity/generality of rules are available.
  • The conditions of rules can be pruned based on a hold-out set.
  • The algorithm can natively handle numerical, ordinal and nominal features (without the need for pre-processing techniques such as one-hot encoding).
  • The algorithm is able to deal with missing feature values, i.e., occurrences of NaN in the feature matrix.
  • Different strategies for prediction, which can be tailored to the used loss function, are available.
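As an illustration of how a label-wise loss, L2 regularization and shrinkage interact, here is a minimal sketch of a textbook second-order boosting step for a rule that predicts for a single label. It assumes the logistic loss and a Newton-style update; the function name `single_label_head` and its parameters are hypothetical and are not part of this package's API, and the actual implementation may differ.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def single_label_head(true_labels, current_scores, l2=1.0, shrinkage=0.3):
    """Score that a new rule predicts for one label (hypothetical helper).

    true_labels:    0/1 ground truth of the examples covered by the rule
    current_scores: scores the ensemble currently predicts for these examples
    """
    # First and second derivatives of the logistic loss w.r.t. the scores.
    gradients = [sigmoid(s) - y for y, s in zip(true_labels, current_scores)]
    hessians = [sigmoid(s) * (1.0 - sigmoid(s)) for s in current_scores]
    # Newton step; the L2 weight in the denominator pulls the score towards zero.
    newton_step = -sum(gradients) / (sum(hessians) + l2)
    # Shrinkage scales down the impact of the rule on the ensemble.
    return shrinkage * newton_step


# Four covered examples, three of them relevant, before any rule has been learned.
covered_labels = [1, 1, 0, 1]
covered_scores = [0.0, 0.0, 0.0, 0.0]
score = single_label_head(covered_labels, covered_scores)
print(score)
```

Smaller shrinkage values make each rule contribute less, which usually requires more rules but tends to reduce overfitting.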

In addition, the following features that may speed up training or reduce the memory footprint are currently implemented:

  • Approximate methods for evaluating potential conditions of rules, based on unsupervised binning methods, can be used.
  • Gradient-based label binning (GBLB) can be used to assign the available labels to a limited number of bins. The use of label binning may speed up training significantly when using rules that predict for multiple labels to minimize a non-decomposable loss function.
  • Dense or sparse feature matrices can be used for training and prediction. The use of sparse matrices may speed up training significantly on some data sets.
  • Dense or sparse label matrices can be used for training. The use of sparse matrices may reduce the memory footprint in case of large data sets.
  • Dense or sparse matrices can be used to store predictions. The use of sparse matrices may reduce the memory footprint in case of large data sets.
  • Multi-threading can be used to parallelize the evaluation of a rule's potential refinements across multiple CPU cores.
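The idea behind approximate condition evaluation can be sketched with equal-width binning, one common unsupervised binning method. The helper `equal_width_bins` below is a hypothetical, pure-Python illustration and not the package's implementation; it only shows why binning reduces the number of thresholds that must be considered.

```python
def equal_width_bins(values, num_bins):
    """Assign each value to one of `num_bins` equally wide intervals."""
    low, high = min(values), max(values)
    if high == low:
        return [0] * len(values)
    width = (high - low) / num_bins
    # The maximum would fall into bin `num_bins`, so clamp it into the last bin.
    return [min(int((v - low) / width), num_bins - 1) for v in values]


feature_values = [0.0, 1.0, 2.5, 4.0, 7.5, 10.0]
bin_indices = equal_width_bins(feature_values, num_bins=4)

# Without binning, a threshold can be placed between any two distinct values.
num_candidates_exact = len(set(feature_values)) - 1
# With binning, thresholds are only placed between non-empty bins.
num_candidates_binned = len(set(bin_indices)) - 1

print(bin_indices, num_candidates_exact, num_candidates_binned)
```

Fewer candidate thresholds mean fewer potential conditions to evaluate per rule, at the cost of a coarser search.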

Documentation

An extensive user guide, as well as API documentation for developers, is available at https://mlrl-boomer.readthedocs.io. If you are new to the project, the user guide is a good place to start.

A collection of benchmark datasets that are compatible with the algorithm is provided in a separate repository.

For an overview of changes and new features that have been included in past releases, please refer to the changelog.

License

This project is open source software licensed under the terms of the MIT license. We welcome contributions to the project to enhance its functionality and make it more accessible to a broader audience. A frequently updated list of contributors is available here.

All contributions to the project and discussions on the issue tracker are expected to follow the code of conduct.
