A Python package for parallelized Minimum Redundancy, Maximum Relevance (mRMR) ensemble feature selection.
PymRMRe
Description
Feature selection is one of the main challenges in analyzing high-throughput genomic data. Minimum redundancy maximum relevance (mRMR) is a particularly fast feature selection method for finding a set of features that are both relevant and complementary. The PymRMRe package extends the mRMR technique with an ensemble approach to better explore the feature space and build more robust predictors. To handle the computational cost of the ensemble approach, the main functions of the package are implemented in C++ and parallelized using the OpenMP application programming interface. The package also supports selection with preselected (fixed) features: the remaining features are chosen to best complement the ones that are fixed in advance.
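To make the idea of "relevant and complementary" concrete, here is a minimal pure-Python sketch of a single greedy mRMR pass. It is not the package's C++ implementation: the function names `pearson` and `mrmr_select` are hypothetical, the absolute Pearson correlation is used as a simple stand-in for the relevance and redundancy measures, and only one solution (no ensemble) is produced.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def mrmr_select(features, target, solution_length, fixed=()):
    """Greedy mRMR sketch; `features` maps a feature name to its list of values.

    Assumption: score = |corr(feature, target)| minus the mean |corr| with the
    features already selected. Fixed features are placed in the solution first.
    """
    selected = list(fixed)
    candidates = [name for name in features if name not in selected]

    def score(name):
        relevance = abs(pearson(features[name], target))
        if not selected:
            return relevance
        redundancy = sum(
            abs(pearson(features[name], features[s])) for s in selected
        ) / len(selected)
        return relevance - redundancy

    while len(selected) < solution_length and candidates:
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy data: "b" is a noisy copy of "a"; "c" is uncorrelated with "a".
toy = {
    "a": [0, 1, 0, 1, 0, 1, 0, 1],
    "b": [0, 1, 0, 1, 0, 1, 0, 0],
    "c": [0, 0, 1, 1, 0, 0, 1, 1],
}
toy_target = [0, 2, 1, 3, 0, 2, 1, 3]
print(mrmr_select(toy, toy_target, solution_length=2))  # ['a', 'c']
```

In the toy data, `b` is more strongly correlated with the target than `c` is, yet `c` is selected second because `b` is largely redundant with `a`. Passing `fixed=["b"]` mimics preselection: `b` is kept and the remaining slot goes to the feature that best complements it.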
Prerequisite
Python(>=3.6.0)
Cython(>=0.29.12)
numpy(>=1.16.4)
pandas(>=0.25.0)
Installation
pip install Pymrmre
Instructions
The package currently provides two primary functions:
- mrmr_ensemble: Returns an ensemble of (one or more) feature selection solutions, given a feature dataset and a target column. It also supports selection with preselected (fixed) features.
- :param features: Pandas dataframe, the input dataset
- :param targets: Pandas dataframe, the target features
- :param fixed_features: List, the list of fixed features (column names), the default is empty list
- :param category_features: List, the list of features whose types are categorical (column names), the default is empty list
- :param solution_length: Integer, the number of features contained in one solution
- :param solution_count: Integer, the number of solutions to be returned, the default is 1
- :param estimator: String, the way of computing continuous estimators, the default is Pearson
- :param return_index: Boolean, to determine whether the solution contains the indices or column names of selected features, the default is False
- :param return_with_fixed: Boolean, to determine whether the solution contains the fixed selected features, the default is True
- :return: Pandas series, the solutions of selected features
- mrmr_ensemble_survival: Returns an ensemble of (one or more) feature selection solutions, given a feature dataset and survival targets (event and time). It also supports selection with preselected (fixed) features.
- :param features: Pandas dataframe, the input dataset
- :param targets: Pandas dataframe, the target features, it must have two columns (event and time of survival data)
- :param fixed_features: List, the list of fixed features (column names), the default is empty list
- :param category_features: List, the list of features whose types are categorical (column names), the default is empty list
- :param solution_length: Integer, the number of features contained in one solution
- :param solution_count: Integer, the number of solutions to be returned, the default is 1
- :param estimator: String, the way of computing continuous estimators, the default is Pearson
- :param return_index: Boolean, to determine whether the solution contains the indices or column names of selected features, the default is False
- :param return_with_fixed: Boolean, to determine whether the solution contains the fixed selected features, the default is True
- :return: Pandas series, the solutions of selected features
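As a rough illustration of the survival variant, the sketch below wraps `mrmr.mrmr_ensemble_survival` using only the parameter names listed above. The wrapper name `select_survival_features` is hypothetical, and the check on the number of target columns simply mirrors the documented requirement; the order of the event and time columns is not specified here, so it is left to the caller.

```python
def select_survival_features(features, targets, solution_length,
                             solution_count=1, fixed_features=None,
                             category_features=None):
    """Hypothetical convenience wrapper around mrmr.mrmr_ensemble_survival."""
    # The documentation requires exactly two target columns (event and time).
    if targets.shape[1] != 2:
        raise ValueError("targets must have exactly two columns (event and time)")

    # Imported lazily so the argument check above runs even without Pymrmre.
    from Pymrmre import mrmr

    return mrmr.mrmr_ensemble_survival(
        features=features,
        targets=targets,
        fixed_features=fixed_features or [],
        category_features=category_features or [],
        solution_length=solution_length,
        solution_count=solution_count,
    )
```

Given a feature dataframe `X` and a two-column survival dataframe `Y_surv` (an assumed name), `select_survival_features(X, Y_surv, solution_length=5, solution_count=3)` would request three solutions of five features each.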
Example code:
import pandas as pd
from Pymrmre import mrmr
Load the input data and target variable. Suppose the input X has ten features (f1, f2, ..., f10):
X = pd.read_csv('train_x.csv')
Y = pd.read_csv('train_y.csv')
Suppose we want to generate 3 solutions of 5 features each. We want f1 to appear in every solution (preselection), and we know that f4 and f5 are categorical variables. The call then looks like this:
solutions = mrmr.mrmr_ensemble(features=X, targets=Y, fixed_features=['f1'], category_features=['f4', 'f5'], solution_length=5, solution_count=3)
The result is a Pandas series that uses the target variable name as its header. To access the contents of all three solutions:
solutions.iloc[0]
To access a single solution, index into that result (i ranges from 0 to 2, since we generated 3 solutions):
solutions.iloc[0][i]