An information theoretic feature selection toolbox.

FEAST

A FEAture Selection Toolbox for C/C++, Java, Python, & MATLAB/Octave, v2.1.0.

FEAST provides implementations of common mutual information based filter feature selection algorithms, and an implementation of RELIEF for Matlab. All functions expect discrete inputs (except RELIEF, which does not depend on the MIToolbox), and they return the selected feature indices. These implementations were developed to help our research into the similarities between these algorithms, and our results are presented in the following paper:

 Conditional Likelihood Maximisation: A Unifying Framework for Information Theoretic Feature Selection
 G. Brown, A. Pocock, M.-J. Zhao, M. Lujan
 Journal of Machine Learning Research, 13:27-66 (2012)

The weighted feature selection algorithms are described in Chapter 7 of:

 Feature Selection via Joint Likelihood
 A. Pocock
 PhD Thesis, University of Manchester, 2012

If you use these implementations for academic research please cite the relevant paper above. All FEAST code is licensed under the BSD 3-Clause License.

Contains implementations of: mim, mrmr, mifs, cmim, jmi, disr, cife, icap, condred, cmi, relief, fcbf, betagamma

And weighted implementations of: mim, cmim, jmi, disr, cmi

References for these algorithms are provided in the accompanying feast.bib file (in BibTeX format).
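As a rough illustration of the simplest criterion in the list above, mim scores each feature by its mutual information with the label and keeps the top k. The sketch below is plain Python, not FEAST's C code: the names `mutual_information` and `mim_sketch` are hypothetical, features are assumed to be passed as a list of discrete columns, and the returned indices are 0-based.

```python
from collections import Counter
from math import log2


def mutual_information(x, y):
    """Empirical mutual information I(X;Y) in bits for two discrete sequences."""
    n = len(x)
    count_x = Counter(x)
    count_y = Counter(y)
    count_xy = Counter(zip(x, y))
    # Sum p(x,y) * log2( p(x,y) / (p(x) p(y)) ) over observed pairs only.
    return sum(
        (c / n) * log2(c * n / (count_x[a] * count_y[b]))
        for (a, b), c in count_xy.items()
    )


def mim_sketch(features, labels, k):
    """Return the 0-based indices of the k features with the highest I(X;Y).

    `features` is a list of columns (hypothetical layout for this sketch).
    """
    scores = [mutual_information(col, labels) for col in features]
    ranked = sorted(range(len(features)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


labels = [0, 0, 1, 1]
features = [
    [0, 0, 1, 1],  # identical to the label
    [0, 1, 0, 1],  # independent of the label
    [0, 0, 0, 1],  # partially informative
]
print(mim_sketch(features, labels, 2))
```

Because each feature is scored independently of the others, this criterion ignores redundancy between features, which is what the other criteria in the list try to account for.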

FEAST works on discrete inputs, so all continuous values must be discretised before use. In our experiments we've found that 10 equal-width bins are suitable for many problems, though this depends on the data set size. When given continuous inputs, FEAST produces unreliable results, runs slowly, and uses much more memory than usual. The discrete inputs should also have small cardinality: FEAST treats the values {1,10,100} the same way it treats {1,2,3}, but the latter is both faster and uses less memory.
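The preprocessing described above can be sketched in plain Python. This is one possible approach under stated assumptions, not part of FEAST: the helper names `equal_width_bins` and `compact_states` are hypothetical, and each feature is assumed to be a simple list of values.

```python
def equal_width_bins(values, n_bins=10):
    """Map continuous values to integer bin indices 0..n_bins-1 (equal-width bins)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature has a single state.
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum value would fall just past the last bin, so clamp it.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def compact_states(values):
    """Relabel arbitrary discrete values as consecutive integers 0..k-1."""
    mapping = {v: i for i, v in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values]


print(equal_width_bins([0, 10, 20, 100]))  # bin index per value
print(compact_states([1, 10, 100, 10]))    # same information, smaller state space
```

Relabelling sparse values such as {1,10,100} as {0,1,2} keeps the information content unchanged while reducing the range of state values.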

MATLAB Example (using "data" as our feature matrix, and "labels" as the class label vector):

>> size(data)
ans = 
     (569,30)                                     %% denoting 569 examples, and 30 features
>> selectedIndices = feast('jmi',5,data,labels) %% selecting the top 5 features using the jmi algorithm
selectedIndices =

    28
    21
     8
    27
    23
>> selectedIndices = feast('mrmr',10,data,labels) %% selecting the top 10 features using the mrmr algorithm
selectedIndices =

    28
    24
    22
     8
    27
    21
    29
     4
     7
    25
>> selectedIndices = feast('mifs',5,data,labels,0.7) %% selecting the top 5 features using the mifs algorithm with beta = 0.7
selectedIndices =

    28
    24
    22
    20
    29
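To give a feel for what a criterion such as jmi does beyond ranking features individually, here is a minimal Python sketch of greedy Joint Mutual Information selection: the first feature maximises I(X;Y), and each later feature maximises the sum of I((X_k, X_j); Y) over the already selected features X_j. It is an illustration under assumptions, not FEAST's implementation: `jmi_sketch` and `mutual_information` are hypothetical names, features are a list of discrete columns, and indices are 0-based, whereas the MATLAB output above is 1-based.

```python
from collections import Counter
from math import log2


def mutual_information(x, y):
    """Empirical mutual information I(X;Y) in bits for two discrete sequences."""
    n = len(x)
    count_x, count_y, count_xy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * log2(c * n / (count_x[a] * count_y[b]))
        for (a, b), c in count_xy.items()
    )


def jmi_sketch(features, labels, k):
    """Greedy forward selection using the joint mutual information score."""
    remaining = list(range(len(features)))
    # Step 1: start with the single most informative feature.
    first = max(remaining, key=lambda j: mutual_information(features[j], labels))
    selected = [first]
    remaining.remove(first)
    # Step 2: repeatedly add the feature with the best summed joint MI.
    while len(selected) < k and remaining:
        def score(j):
            return sum(
                mutual_information(list(zip(features[j], features[s])), labels)
                for s in selected
            )
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


a = [0, 0, 1, 1]
b = [0, 1, 0, 1]
labels = [0, 1, 2, 3]  # the label is determined by a and b together
print(jmi_sketch([a, a, b], labels, 2))
```

In this toy data the second column duplicates the first, so pairing it with the first adds nothing, while the third column supplies the missing information about the label.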

The library is written in ANSI C for compatibility with the MATLAB mex compiler, except for MIM, FCBF and RELIEF, which are written in MATLAB/OCTAVE script. A separate C implementation of MIM is available for use in the C library. FEAST depends on MIToolbox, which is incorporated as a git submodule.

MIToolbox is developed on GitHub.

The C library expects all matrices in column-major format (i.e. Fortran style). This is for two reasons: (a) MATLAB generates Fortran-style arrays, and (b) feature selection iterates over columns rather than rows, unlike most other ML processes.
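For readers whose data is stored row by row, the following sketch shows what column-major layout means in practice. It is plain Python for illustration only, and the helper names `to_column_major` and `column` are hypothetical rather than part of any FEAST API.

```python
def to_column_major(rows):
    """Flatten a list of rows (examples x features) into a column-major list."""
    n_rows, n_cols = len(rows), len(rows[0])
    # All values of feature 0 come first, then all values of feature 1, and so on.
    return [rows[r][c] for c in range(n_cols) for r in range(n_rows)]


def column(flat, n_rows, c):
    """Return feature c from a column-major buffer as one contiguous slice."""
    return flat[c * n_rows:(c + 1) * n_rows]


data = [
    [1, 2, 3],  # example 0
    [4, 5, 6],  # example 1
]
flat = to_column_major(data)
print(flat)                # element (r, c) lives at index c * n_rows + r
print(column(flat, 2, 1))  # values of feature 1 across both examples
```

With this layout each feature occupies a contiguous block, which suits algorithms that scan one feature at a time.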

Compilation instructions: run git submodule init, then:

  • MATLAB/OCTAVE
    • run CompileFEAST.m in the matlab folder.
  • Linux C shared library
    • run make x86 or make x64 for a 32-bit or 64-bit library.
  • Windows C dll (expects pre built libMIToolbox.dll)
  • Java (requires Java 8)
    • run make x64, then sudo make install, to build and install the C library.
    • then make java to build the JNI wrapper.
    • then run mvn package in the java directory to build the jar file.
    • Note: the Java code should work on all platforms and future versions of Java, but the included Makefile only works on Ubuntu & Java 8.
  • Python
    • run python setup.py in the python folder.

Update History

  • xx/xx/xxxx - v2.1.0 - Added a python API and refactored the package structure.
  • 07/01/2017 - v2.0.0 - Added weighted feature selection, major refactoring of the code to improve speed and portability. FEAST functions now also return the internal scores assigned by each criterion. Added a Java API via JNI. FEAST v2 is approximately 30% faster when called from Matlab.
  • 12/03/2016 - v1.1.4 - Fixed an issue where Matlab would segfault if all features had zero MI with the label.
  • 12/10/2014 - v1.1.2 - Updated documentation to note that FEAST expects column-major matrices.
  • 11/06/2014 - v1.1.1 - Fixed an issue where MIM wasn't compiled into libFSToolbox.
  • 22/02/2014 - v1.1.0 - Bug fixes in memory allocation, added a C implementation of MIM, moved the selected feature increment into the mex code.
  • 12/02/2013 - v1.0.1 - Bug fix for 32-bit Windows MATLAB's lcc.
  • 08/11/2011 - v1.0.0 - Public Release to complement the JMLR publication.
