A Python package implementing a distributed randomised feature selection algorithm.

drfsc

License: MIT version: 0.0.6

An open-source library for a distributed randomised feature selection and classification algorithm.

Authors and Contributors:

Mark Chiu Chong, Aida Brankovic.

Overview

drfsc is an open-source Python implementation of the Distributed Randomised Feature Selection algorithm for Classification problems (D-RFSC). Besides addressing some of the shortcomings of conventional feature selection methods, D-RFSC has previously shown good performance on a range of benchmark datasets. Until now, however, no Python implementation has been available. drfsc offers an easy-to-use, parallelized, probabilistic, population-based feature selection scheme that is flexible and can be adapted to a wide range of binary classification problems. It is particularly useful for large data problems where model interpretability and explainability are of high importance. The package provides modules for model fitting, evaluation, and visualization. Tutorial notebooks are provided to demonstrate its use.
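To give a feel for what a "probabilistic population-based" scheme means, the following is a toy sketch only. It does not use the drfsc API and is not the D-RFSC algorithm as published. All names (make_data, centroid_accuracy, randomised_selection), the nearest-centroid classifier, the sparsity penalty of 0.01 per feature, and the update step are assumptions chosen for illustration. The idea shown: each feature has an inclusion probability, a population of candidate models is drawn at random from those probabilities, and each probability is nudged up or down according to whether models containing that feature scored better on average than models without it.

```python
import random


def make_data(rng, n=200, d=8):
    """Synthetic binary data (illustrative): only features 0 and 1 carry signal."""
    X, y = [], []
    for i in range(n):
        label = i % 2  # alternate labels so both classes are always present
        row = [rng.gauss(0.0, 1.0) for _ in range(d)]
        row[0] += 2.0 * label
        row[1] -= 2.0 * label
        X.append(row)
        y.append(label)
    return X, y


def centroid_accuracy(X, y, feats):
    """Training accuracy of a nearest-centroid classifier on the chosen features."""
    if not feats:
        return 0.5  # treat an empty model as chance level
    cents = {}
    for c in (0, 1):
        rows = [x for x, t in zip(X, y) if t == c]
        cents[c] = [sum(r[j] for r in rows) / len(rows) for j in feats]
    correct = 0
    for x, t in zip(X, y):
        dist = {c: sum((x[j] - m) ** 2 for j, m in zip(feats, cents[c]))
                for c in (0, 1)}
        pred = 0 if dist[0] <= dist[1] else 1
        correct += pred == t
    return correct / len(X)


def randomised_selection(X, y, rng, n_iter=30, pop=40, step=0.5):
    """Toy randomised selection loop; returns one inclusion probability per feature."""
    d = len(X[0])
    probs = [0.5] * d
    for _ in range(n_iter):
        # Draw a population of feature subsets from the current probabilities.
        models = [[j for j in range(d) if rng.random() < probs[j]]
                  for _ in range(pop)]
        # Score each model; the small size penalty is an assumed sparsity nudge.
        scores = [centroid_accuracy(X, y, m) - 0.01 * len(m) for m in models]
        for j in range(d):
            with_j = [s for m, s in zip(models, scores) if j in m]
            without_j = [s for m, s in zip(models, scores) if j not in m]
            if with_j and without_j:
                gain = (sum(with_j) / len(with_j)
                        - sum(without_j) / len(without_j))
                probs[j] = min(1.0, max(0.0, probs[j] + step * gain))
    return probs


rng = random.Random(0)
X, y = make_data(rng)
probs = randomised_selection(X, y, rng)
selected = [j for j, p in enumerate(probs) if p > 0.5]
print("inclusion probabilities:", [round(p, 2) for p in probs])
print("selected features:", selected)
```

In this sketch the final model is simply the set of features whose inclusion probability ends above 0.5. The actual package should be consulted, via its tutorial notebooks, for the real interface and update rules.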

Installation

The easiest way to install drfsc is from PyPI:

pip install drfsc
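After installing, one way to confirm that the package is visible to the current Python environment is to query its metadata with the standard library. This is a minimal sketch: installed_version is a hypothetical helper name, and nothing here relies on the drfsc API itself.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if drfsc is installed in this environment, else None.
print(installed_version("drfsc"))
```

If this prints None, the package was most likely installed into a different environment from the one running the script.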

License

We invite anyone interested to use and modify this code under the MIT license.

Dependencies

drfsc depends on the following packages:

References

The package is based on research carried out at the Polytechnical University of Milan. The interested reader is referred to [2] for details of the distribution procedure, and to [1] for a more thorough mathematical overview and for experimental comparisons with various alternative feature selection methods.

[1] Brankovic, A., Falsone, A., Prandini, M., Piroddi, L. (2018). A feature selection and classification algorithm based on randomized extraction of model populations.

[2] Brankovic, A., Piroddi, L. (2019). A distributed feature selection scheme with partial information sharing.

Citations

This package is developed at CSIRO’s Australian e-Health Research Centre. If you use the drfsc package in your research, we would appreciate a citation to the appropriate paper(s):

  • For general use of the drfsc package, you can read/cite the original article.
  • For information on, or use of, the Randomised Feature Selection and Classification concept, you can read/cite the original article [1].
  • For information on, or use of, the Distributed Feature Selection architecture with partial information sharing, you can read/cite the original article [2].
