MICO: Mutual Information and Conic Optimization for feature selection.
MICO is a Python package that implements a conic-optimization-based feature selection method using mutual information (MI) as the measure. The approach measures the features' relevance and redundancy using MI, then formulates feature selection as a pure-binary quadratic optimization problem, which is solved heuristically by an efficient randomization algorithm based on semidefinite programming. The optimization software Colin is used to solve the underlying conic optimization problems.
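To make the relevance/redundancy idea concrete, here is a minimal sketch of a relevance-minus-redundancy objective over a binary selection of features. It is not the package's method: it uses only the standard library, assumes discrete features, and replaces the semidefinite-programming randomization with brute-force enumeration, which is only viable for a handful of features. The names `mutual_information`, `objective`, and `best_subset` are hypothetical and do not come from MICO.

```python
from collections import Counter
from itertools import combinations
from math import log


def mutual_information(a, b):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    # sum over observed pairs of p(x, y) * log(p(x, y) / (p(x) * p(y)))
    return sum((c / n) * log(c * n / (pa[x] * pb[y])) for (x, y), c in pab.items())


def objective(selected, relevance, redundancy):
    """Total relevance of the selected features minus their pairwise redundancy."""
    rel = sum(relevance[i] for i in selected)
    red = sum(redundancy[i][j] for i, j in combinations(selected, 2))
    return rel - red


def best_subset(features, target, k):
    """Exhaustively score every k-subset (illustration only, exponential cost)."""
    m = len(features)
    relevance = [mutual_information(f, target) for f in features]
    redundancy = [[mutual_information(fi, fj) for fj in features] for fi in features]
    return max(combinations(range(m), k),
               key=lambda s: objective(s, relevance, redundancy))


# Toy data: the target is determined by features 0 and 2;
# feature 1 is a noisy copy of feature 0, so it is largely redundant.
features = [
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 1, 1, 1, 1, 1],
    [0, 0, 1, 1, 0, 0, 1, 1],
]
target = [2 * a + b for a, b in zip(features[0], features[2])]

print(best_subset(features, target, k=2))
```

On this toy data the search picks features 0 and 2: both are fully informative about the target and share no information with each other, whereas the noisy copy is penalized for overlapping with feature 0.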
This package
implements three methods for feature selection:
MICO : Conic Optimization approach
MIFS : Forward Selection approach
MIBS : Backward Selection approach
supports three different MI measures:
JMI : Joint Mutual Information
JMIM : Joint Mutual Information Maximisation
MRMR : Max-Relevance Min-Redundancy
generates feature importance scores for all selected features.
provides scikit-learn compatible APIs.
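As a rough illustration of what a forward selection approach combined with a Max-Relevance Min-Redundancy style criterion can look like, the sketch below greedily adds the feature whose relevance to the target, minus its average redundancy with the already-selected features, is largest. This is a generic textbook-style sketch under stated assumptions (discrete features, standard library only); the package's MIFS implementation and its exact scoring may differ, and `forward_select` is a hypothetical name.

```python
from collections import Counter
from math import log


def mutual_information(a, b):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum((c / n) * log(c * n / (pa[x] * pb[y])) for (x, y), c in pab.items())


def forward_select(features, target, k):
    """Greedily pick up to k feature indices using a relevance-minus-redundancy score."""
    relevance = [mutual_information(f, target) for f in features]
    selected = []
    remaining = list(range(len(features)))

    def score(i):
        # With nothing selected yet, only relevance matters.
        if not selected:
            return relevance[i]
        # Average MI between the candidate and the features chosen so far.
        redundancy = sum(
            mutual_information(features[i], features[j]) for j in selected
        ) / len(selected)
        return relevance[i] - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: features 0 and 2 jointly determine the target;
# feature 1 is a noisy copy of feature 0.
features = [
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 1, 1, 1, 1, 1],
    [0, 0, 1, 1, 0, 0, 1, 1],
]
target = [2 * a + b for a, b in zip(features[0], features[2])]

print(sorted(forward_select(features, target, k=2)))
```

A backward selection variant would start from the full feature set and repeatedly drop the feature whose removal hurts the score least.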
Installation
Download the Colin distribution from http://www.colinopt.org/downloads.php and unpack it into a directory of your choice (<CLNHOME>). Then install the Colin Python package:
cd <CLNHOME>/python
pip install -r requirements.txt
python setup.py install
Next, from the MICO source directory, install the MICO package dependencies:
pip install -r requirements.txt
To install the MICO package, use:
python setup.py install
or
pip install colin-mico
To install the development version, you may use:
pip install --upgrade git+https://github.com/jupiters1117/mico
Usage
This package provides scikit-learn compatible APIs:
fit(X, y)
transform(X)
fit_transform(X, y)
Examples
The following example illustrates the use of the package:
import pandas as pd
from sklearn.datasets import load_breast_cancer
# Assumption: the class is importable from the top-level `mico` module;
# adjust the import if your installation exposes it elsewhere.
from mico import MutualInformationConicOptimization
# Prepare data.
data = load_breast_cancer()
y = data.target
X = pd.DataFrame(data.data, columns=data.feature_names)
# Perform feature selection.
mico = MutualInformationConicOptimization(verbose=1, categorical=True)
mico.fit(X, y)
# Print the selected features.
print("Selected features: {}".format(mico.get_support()))
# Print the feature importance scores.
print("Feature importance scores: {}".format(mico.feature_importances_))
# Reduce X to the selected features.
X_transformed = mico.transform(X)
Documentation
User guide, examples, and API are available here.