A Python package implementing a distributed randomised feature selection algorithm.
drfsc
An open-source library for a distributed randomised feature selection and classification algorithm.
Authors and Contributors:
Mark Chiu Chong, Aida Brankovic.
Overview
drfsc is an open-source Python implementation of the Distributed Randomised Feature Selection algorithm for Classification problems (D-RFSC). Besides addressing some of the shortcomings of conventional feature selection (FS) methods, the algorithm has previously shown good performance on a range of benchmark datasets. Until now, however, no Python implementation has been available.
drfsc offers an easy-to-use, parallelized, probabilistic, population-based feature selection scheme. It is flexible, can be adapted to a wide range of binary classification problems, and is particularly useful for large-data problems where model interpretability and explainability are of high importance. It provides modules for model fitting, evaluation, and visualization. Tutorial notebooks are provided to demonstrate the use of the package.
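To give a feel for what a "probabilistic population-based" scheme means, here is a minimal standard-library sketch of the general idea. It is not the drfsc API and not the exact update rule from [1]. Each feature carries an inclusion probability, a population of random feature subsets is sampled and scored, and each probability is nudged towards inclusion or exclusion depending on how models with that feature compare to models without it. The names make_data, centroid_accuracy, and randomised_selection are hypothetical, the nearest-centroid classifier is a stand-in chosen only to keep the example dependency-free, and the distributed part of the algorithm is not shown.

```python
import random


def make_data(n_samples=200, n_features=8, informative=(0, 1), seed=0):
    """Synthetic binary data: only the `informative` columns depend on the label."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        label = rng.randint(0, 1)
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] += 2.0 * label  # shift informative features by class
        X.append(row)
        y.append(label)
    return X, y


def centroid_accuracy(X_train, y_train, X_test, y_test, features):
    """Test accuracy of a nearest-centroid classifier restricted to `features`."""
    if not features:
        return 0.5  # an empty model is treated as chance level
    centroids = {}
    for c in (0, 1):
        rows = [x for x, lab in zip(X_train, y_train) if lab == c]
        centroids[c] = [sum(r[j] for r in rows) / len(rows) for j in features]
    correct = 0
    for x, lab in zip(X_test, y_test):
        dist = {
            c: sum((x[j] - m) ** 2 for j, m in zip(features, centroids[c]))
            for c in (0, 1)
        }
        pred = 0 if dist[0] <= dist[1] else 1
        correct += int(pred == lab)
    return correct / len(y_test)


def randomised_selection(X, y, n_models=30, n_iter=20, step=0.5, seed=1):
    """Return per-feature inclusion probabilities after `n_iter` update rounds."""
    rng = random.Random(seed)
    n_features = len(X[0])
    split = len(X) // 2
    X_tr, y_tr, X_te, y_te = X[:split], y[:split], X[split:], y[split:]
    probs = [0.5] * n_features  # start undecided about every feature
    for _ in range(n_iter):
        # Sample a population of models (feature subsets) and score each one.
        population = []
        for _ in range(n_models):
            feats = [j for j in range(n_features) if rng.random() < probs[j]]
            score = centroid_accuracy(X_tr, y_tr, X_te, y_te, feats)
            population.append((set(feats), score))
        # Move each probability by the average score gap between models
        # that include the feature and models that leave it out.
        for j in range(n_features):
            with_j = [s for f, s in population if j in f]
            without_j = [s for f, s in population if j not in f]
            if with_j and without_j:
                gain = sum(with_j) / len(with_j) - sum(without_j) / len(without_j)
                probs[j] = min(0.95, max(0.05, probs[j] + step * gain))
    return probs


X, y = make_data()
probs = randomised_selection(X, y)
print([round(p, 2) for p in probs])
```

In this toy setting, features 0 and 1 are the only ones related to the label, so their inclusion probabilities should rise above the starting value while those of the noise features drift without a consistent direction. The final probabilities can be read directly as a ranking of feature relevance, which is what makes this family of methods attractive when interpretability matters.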
Installation
The easiest way to install drfsc is from PyPI:
pip install drfsc
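To confirm that the installation succeeded, one option is a quick check from Python using only the standard library. This sketch assumes the import name matches the PyPI name drfsc, and the helper is_installed is a hypothetical name introduced here for illustration.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if `module_name` can be located on the current Python path."""
    return importlib.util.find_spec(module_name) is not None


# Assumption: the package is importable under the same name as on PyPI.
print("drfsc installed:", is_installed("drfsc"))
```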
License
We invite anyone interested to use and modify this code under the MIT license.
Dependencies
drfsc depends on the following packages:
References
The package was developed based on research that came out of the Polytechnical University of Milan. See [2] for details of the distribution procedure, and [1] for a more thorough mathematical overview and for experimental comparisons with various alternative feature selection methods.
[1] Brankovic, A., Falsone, A., Prandini, M., Piroddi, L. (2018). A feature selection and classification algorithm based on randomized extraction of model populations
[2] Brankovic, A., Piroddi, L. (2019). A distributed feature selection scheme with partial information sharing
Citations
This package is developed at CSIRO’s Australian e-Health Research Centre. If you use the drfsc package in your research, we would appreciate a citation to the appropriate paper(s):
- For general use of the drfsc package, you can read/cite the original article.
- For information on or use of the Randomised Feature Selection and Classification concept, you can read/cite the original article [1].
- For information on or use of the Distributed Feature Selection architecture with partial information sharing, you can read/cite the original article [2].