Skip to main content

MICO: Mutual Information and Conic Optimization for feature selection.

Project description

MICO: Mutual Information and Conic Optimization for feature selection

MICO is a Python package that implements a conic optimization based feature selection method with mutual information (MI) measure. The idea behind the approach is to measure the features’relevance and redundancy using MI, and then formulate a feature selection problem as a pure-binary quadratic optimization problem, which can be heuristically solved by an efficient randomization algorithm via semidefinite programming. Optimization software Colin is used for solving the underlying conic optimization problems.

This package

  • implements three methods for feature selections:
    • MICO : Conic Optimization approach
    • MIFS : Forward Selection approach
    • MIBS : Backward Selection approach
  • supports three different MI measures:
    • JMI : Joint Mutual Information
    • JMIM : Joint Mutual Information Maximisation
    • MRMR : Max-Relevance Min-Redundancy
  • generates feature importance scores for all selected features.
  • provides scikit-learn compatible APIs.

Installation

  1. Download Colin distribution from http://www.colinopt.org/downloads.php and unpack it into a chosen directory (<CLNHOME>). Then install Colin package:
cd <CLNHOME>/python
pip install -r requirements.txt
python setup.py install
  1. Next, install MICO package dependencies:
pip install -r requirements.txt
  1. To install MICO package, use:
python setup.py install

or

pip install colin-mico

To install the development version, you may use:

pip install --upgrade git+https://github.com/jupiters1117/mico

Usage

This package provides scikit-learn compatible APIs:

  • fit(X, y)
  • transform(X)
  • fit_transform(X, y)

Examples

The following example illustrates the use of the package:

import pandas as pd
from sklearn.datasets import load_breast_cancer

# Prepare data.
data = load_breast_cancer()
y = data.target
X = pd.DataFrame(data.data, columns=data.feature_names)

# Perform feature selection.
mico = MutualInformationConicOptimization(verbose=1, categorical=True)
mico.fit(X, y)

# Populate selected features.
print("Selected features: {}".format(mico.get_support()))

# Populate feature importance scores.
print("Feature importance scores: {}".format(mico.feature_importances_))

# Call transform() on X.
X_transformed = mico.transform(X)

Documentation

User guide, examples, and API are available here.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Files for colin-mico, version 0.1.0a0
Filename, size File type Python version Upload date Hashes
Filename, size colin_mico-0.1.0a0-py3-none-any.whl (26.1 kB) File type Wheel Python version py3 Upload date Hashes View hashes
Filename, size colin-mico-0.1.0a0.tar.gz (25.4 kB) File type Source Python version None Upload date Hashes View hashes

Supported by

Elastic Elastic Search Pingdom Pingdom Monitoring Google Google BigQuery Sentry Sentry Error logging AWS AWS Cloud computing DataDog DataDog Monitoring Fastly Fastly CDN SignalFx SignalFx Supporter DigiCert DigiCert EV certificate StatusPage StatusPage Status page