
MICO: Mutual Information and Conic Optimization for feature selection.


MICO is a Python package that implements a conic-optimization-based feature selection method using mutual information (MI) measures. The approach measures each feature's relevance and redundancy with MI, formulates feature selection as a pure-binary quadratic optimization problem, and solves that problem heuristically with an efficient randomization algorithm based on semidefinite programming. The optimization software Colin is used to solve the underlying conic optimization problems.
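To make the relevance/redundancy idea concrete, here is a minimal sketch that is not MICO's actual formulation or solver. It estimates MI from counts on a made-up discrete dataset, scores a feature subset as total relevance minus pairwise redundancy (a quadratic function of a binary selection vector), and finds the best pair by brute force instead of semidefinite programming. The helper names `mutual_information` and `subset_score` and the toy data are assumptions introduced for illustration only.

```python
from collections import Counter
from itertools import combinations
from math import log


def mutual_information(a, b):
    """Empirical mutual information (in nats) of two discrete sequences."""
    n = len(a)
    count_a, count_b, count_ab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(
        (c / n) * log(c * n / (count_a[u] * count_b[v]))
        for (u, v), c in count_ab.items()
    )


def subset_score(subset, relevance, redundancy):
    """Total relevance of the subset minus its pairwise redundancy."""
    rel = sum(relevance[i] for i in subset)
    red = sum(redundancy[i][j] for i, j in combinations(subset, 2))
    return rel - red


# Toy data (illustrative only): feature 1 duplicates feature 0,
# feature 2 is independent of both, and the target is feature 0 + feature 2.
features = [
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 1, 1, 0, 0, 1, 1],
]
target = [0, 0, 1, 1, 1, 1, 2, 2]

relevance = [mutual_information(f, target) for f in features]
redundancy = [[mutual_information(f, g) for g in features] for f in features]

# Brute-force search over all pairs; only feasible for tiny problems.
best = max(
    combinations(range(len(features)), 2),
    key=lambda s: subset_score(s, relevance, redundancy),
)
print("Best pair of features:", best)
```

The duplicated pair (0, 1) is penalized by its high mutual redundancy, so the search prefers a pair that includes the independent feature 2. Exhaustive search grows exponentially with the number of features, which is why a relaxation-based heuristic is attractive for realistic problem sizes.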

This package

  • implements three methods for feature selection:

    • MICO : Conic Optimization approach

    • MIFS : Forward Selection approach

    • MIBS : Backward Selection approach

  • supports three different MI measures:

    • JMI : Joint Mutual Information

    • JMIM : Joint Mutual Information Maximisation

    • MRMR : Max-Relevance Min-Redundancy

  • generates feature importance scores for all selected features.

  • provides scikit-learn compatible APIs.
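As a rough illustration of how a forward selection approach can use an MRMR-style criterion, the sketch below greedily adds the feature whose relevance, minus its mean redundancy with the features already chosen, is largest. It is a generic textbook-style sketch, not the package's MIFS implementation; the function name `forward_select` and the numeric inputs are made up for the example.

```python
def forward_select(relevance, redundancy, k):
    """Greedy forward selection with an MRMR-style gain.

    relevance[j]     : MI between feature j and the target
    redundancy[j][s] : MI between features j and s
    """
    selected = []
    remaining = list(range(len(relevance)))
    while remaining and len(selected) < k:

        def gain(j):
            if not selected:
                return relevance[j]
            mean_red = sum(redundancy[j][s] for s in selected) / len(selected)
            return relevance[j] - mean_red

        best = max(remaining, key=gain)
        selected.append(best)
        remaining.remove(best)
    return selected


# Made-up MI values: features 0 and 1 are both relevant but highly redundant.
relevance = [0.9, 0.8, 0.3]
redundancy = [
    [0.0, 0.7, 0.05],
    [0.7, 0.0, 0.1],
    [0.05, 0.1, 0.0],
]
print(forward_select(relevance, redundancy, k=2))
```

With these inputs the first pick is feature 0, and the second pick is feature 2 rather than the more relevant feature 1, because feature 1 is largely redundant with feature 0.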

Installation

  1. Download the Colin distribution from http://www.colinopt.org/downloads.php and unpack it into a directory of your choice (<CLNHOME>). Then install the Colin Python package:

cd <CLNHOME>/python
pip install -r requirements.txt
python setup.py install
  2. Next, install the MICO package dependencies:

pip install -r requirements.txt
  3. To install the MICO package, use:

python setup.py install

or

pip install colin-mico

To install the development version, you may use:

pip install --upgrade git+https://github.com/jupiters1117/mico
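After installation, a quick check that the packages can be found by the interpreter may save some debugging. The sketch below uses only the standard library. The top-level module names `colin` and `mico` are assumptions, since this page does not state the import names; adjust them to match your installation.

```python
from importlib.util import find_spec


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return find_spec(module_name) is not None


# Assumed import names; they are not confirmed by this page.
for name in ("colin", "mico"):
    status = "found" if is_installed(name) else "NOT found"
    print("{}: {}".format(name, status))
```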

Usage

This package provides scikit-learn compatible APIs:

  • fit(X, y)

  • transform(X)

  • fit_transform(X, y)
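For readers unfamiliar with the scikit-learn selector convention, the following sketch shows the contract these three methods usually follow: `fit` learns which columns to keep, `transform` applies that choice, and `fit_transform` does both. `TopVarianceSelector` is a hypothetical stand-in written in plain Python for illustration; it is not part of MICO or scikit-learn and selects by variance rather than by mutual information.

```python
from statistics import pvariance


class TopVarianceSelector:
    """Hypothetical stand-in selector: keeps the k highest-variance columns."""

    def __init__(self, k):
        self.k = k

    def fit(self, X, y=None):
        # Learn a boolean mask over columns; y is accepted but unused here.
        variances = [pvariance(col) for col in zip(*X)]
        ranked = sorted(range(len(variances)), key=lambda j: variances[j], reverse=True)
        keep = set(ranked[: self.k])
        self.support_ = [j in keep for j in range(len(variances))]
        return self

    def get_support(self):
        return self.support_

    def transform(self, X):
        # Keep only the columns chosen during fit().
        return [[v for v, keep in zip(row, self.support_) if keep] for row in X]

    def fit_transform(self, X, y=None):
        return self.fit(X, y).transform(X)


X = [[1, 10, 0], [2, 10, 5], [3, 10, 0], [4, 10, 5]]
selector = TopVarianceSelector(k=2)
X_reduced = selector.fit_transform(X)
print(selector.get_support())
print(X_reduced)
```

Because `fit` returns the fitted object itself, calls can be chained, and the same fitted selector can later be applied to new data with `transform`.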

Examples

The following example illustrates the use of the package:

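import pandas as pd
from sklearn.datasets import load_breast_cancer
# Assumed import path (not stated in the original example); adjust if your install differs.
from mico import MutualInformationConicOptimization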

# Prepare data.
data = load_breast_cancer()
y = data.target
X = pd.DataFrame(data.data, columns=data.feature_names)

# Perform feature selection.
mico = MutualInformationConicOptimization(verbose=1, categorical=True)
mico.fit(X, y)

# Print the selected features.
print("Selected features: {}".format(mico.get_support()))

# Print the feature importance scores.
print("Feature importance scores: {}".format(mico.feature_importances_))

# Call transform() on X.
X_transformed = mico.transform(X)

Documentation

User guide, examples, and API are available here.

