
MAFESE: Metaheuristic Algorithm for Feature Selection - An Open Source Python Library


MAFESE



MAFESE (Metaheuristic Algorithms for FEature SElection) is the largest Python library focused on feature selection using metaheuristic algorithms.

  • Free software: GNU General Public License (GPL) V3 license
  • Total Wrapper-based (Metaheuristic Algorithms): > 170 methods
  • Total Filter-based (Statistical-based): > 6 methods
  • Total classification datasets: > 20 datasets
  • Total estimator methods: > 3 methods
  • Total performance metrics (as fitness): > 10 metrics
  • Documentation: https://mafese.readthedocs.io/en/latest/
  • Python versions: 3.7.x, 3.8.x, 3.9.x, 3.10.x, 3.11.x
  • Dependencies: numpy, scipy, scikit-learn, pandas, matplotlib, mealpy, permetrics

Installation

Install with pip

Install the current PyPI release:

$ pip install mafese==0.1.0

Install directly from source code

$ git clone https://github.com/thieu1995/mafese.git
$ cd mafese
$ python setup.py install

Library structure

docs
examples
mafese
    wrapper
        recursive.py
        sequential.py
    filter.py
    utils
        correlation.py
        encoder.py
        estimator.py
        validator.py
    __init__.py
    selector.py
README.md
setup.py

Usage

After installation, you can import MAFESE like any other Python module:

$ python
>>> import mafese
>>> mafese.__version__

Let's go through some examples.

Examples

First, load a dataset. You can use one of the datasets bundled with MAFESE or load your own:

# Load available dataset from MAFESE
from mafese import get_dataset

# Try unknown data
get_dataset("unknown")
# Enter: 1

data = get_dataset("Arrhythmia")

# Load your own dataset
import pandas as pd
from mafese import Data

# load X and y
# NOTE mafese accepts numpy arrays only, hence the .values attribute
dataset = pd.read_csv('examples/dataset.csv', index_col=0).values
X, y = dataset[:, 0:-1], dataset[:, -1]
data = Data(X, y)

Next, split the dataset into training and test sets:

data.split_train_test(test_size=0.2, inplace=True)
print(data.X_train[:2].shape)
print(data.y_train[:2].shape)
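
To make the hold-out step concrete, here is a minimal NumPy sketch of what a train/test split with test_size=0.2 does to the arrays. It is an illustration only, not MAFESE's implementation: the helper name holdout_split, the seed parameter, and the toy arrays are assumptions introduced for this example.

```python
import numpy as np

def holdout_split(X, y, test_size=0.2, seed=42):
    """Hypothetical helper: shuffle the rows and hold out a fraction for testing."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(X.shape[0])        # shuffled row indices
    n_test = int(round(test_size * X.shape[0]))  # number of held-out rows
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

# Toy data: 10 samples, 4 features, binary labels
X = np.arange(40, dtype=float).reshape(10, 4)
y = np.array([0, 1] * 5)

X_train, X_test, y_train, y_test = holdout_split(X, y, test_size=0.2)
print(X_train.shape, X_test.shape)  # (8, 4) (2, 4)
```

Feature selectors should be fitted on the training part only, so that the test part stays unseen until evaluation.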

Next, use the Recursive wrapper-based method:

from mafese.wrapper.recursive import Recursive

# define mafese feature selection method
feat_selector = Recursive(problem="classification", estimator="rf", n_features=5)

# find all relevant features - 5 features should be selected
feat_selector.fit(data.X_train, data.y_train)

# check selected features - True (or 1) is selected, False (or 0) is not selected
print(feat_selector.selected_feature_masks)
print(feat_selector.selected_feature_solution)

# check the index of selected features
print(feat_selector.selected_feature_indexes)

# call transform() on X to filter it down to selected features
X_train_selected = feat_selector.transform(data.X_train)
X_test_selected = feat_selector.transform(data.X_test)
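
For readers who want to see the idea behind recursive selection without MAFESE installed, the sketch below uses scikit-learn's RFE with a random forest. It is an analogous technique from scikit-learn, not MAFESE's internal code, and the synthetic dataset and variable names are assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE

# Synthetic classification data (assumption: stands in for data.X_train / data.y_train)
X, y = make_classification(n_samples=200, n_features=12, n_informative=5, random_state=0)

# Repeatedly fit the estimator and drop the least important feature until 5 remain
rfe = RFE(estimator=RandomForestClassifier(n_estimators=50, random_state=0),
          n_features_to_select=5)
rfe.fit(X, y)

mask = rfe.support_              # boolean mask, True means the feature is kept
indexes = np.flatnonzero(mask)   # integer positions of the kept features
X_selected = rfe.transform(X)    # same rows, only the kept columns

print(mask)
print(indexes)
print(X_selected.shape)
```

The boolean mask and the index array carry the same information in two forms, which mirrors the mask and index attributes printed in the MAFESE example above.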

Alternatively, use the Sequential wrapper-based method:

from mafese.wrapper.sequential import Sequential

# define mafese feature selection method
feat_selector = Sequential(problem="classification", estimator="knn", n_features=3, direction="forward")

# find all relevant features - 3 features should be selected
feat_selector.fit(data.X_train, data.y_train)

# check selected features - True (or 1) is selected, False (or 0) is not selected
print(feat_selector.selected_feature_masks)
print(feat_selector.selected_feature_solution)

# check the index of selected features
print(feat_selector.selected_feature_indexes)

# call transform() on X to filter it down to selected features
X_train_selected = feat_selector.transform(data.X_train)
X_test_selected = feat_selector.transform(data.X_test)
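
Forward sequential selection starts from an empty set and greedily adds the feature that most improves the cross-validated score. The sketch below shows this with scikit-learn's SequentialFeatureSelector and a k-nearest-neighbours classifier. It illustrates the general technique and is not MAFESE's implementation; the synthetic data, cv=3, and n_neighbors=5 are assumptions chosen for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.neighbors import KNeighborsClassifier

# Synthetic classification data (assumption: stands in for the training split)
X, y = make_classification(n_samples=150, n_features=8, n_informative=3, random_state=0)

# Greedy forward search: add one feature at a time until 3 are selected
sfs = SequentialFeatureSelector(KNeighborsClassifier(n_neighbors=5),
                                n_features_to_select=3,
                                direction="forward",
                                cv=3)
sfs.fit(X, y)

mask = sfs.get_support()         # boolean mask of selected features
indexes = np.flatnonzero(mask)   # integer positions of selected features
X_selected = sfs.transform(X)

print(indexes)
print(X_selected.shape)
```

Setting direction="backward" would instead start from all features and remove one at a time, which is usually slower when only a few features are wanted.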

Alternatively, use Filter-based feature selection, which supports different correlation methods:

from mafese.filter import Filter

# define mafese feature selection method
feat_selector = Filter(problem='classification', method='SPEARMAN', n_features=5)

# find all relevant features - 5 features should be selected
feat_selector.fit(data.X_train, data.y_train)

# check selected features - True (or 1) is selected, False (or 0) is not selected
print(feat_selector.selected_feature_masks)
print(feat_selector.selected_feature_solution)

# check the index of selected features
print(feat_selector.selected_feature_indexes)

# call transform() on X to filter it down to selected features
X_train_selected = feat_selector.transform(data.X_train)
X_test_selected = feat_selector.transform(data.X_test)
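
A filter method scores each feature on its own, without training a model, and keeps the top-ranked ones. The sketch below ranks features by the absolute Spearman correlation with the target using scipy.stats.spearmanr. It is a generic illustration under stated assumptions: whether MAFESE ranks by absolute value or handles ties the same way is not confirmed by this README, and the helper name spearman_top_k and the synthetic data are hypothetical.

```python
import numpy as np
from scipy.stats import spearmanr

def spearman_top_k(X, y, k):
    """Hypothetical helper: indexes of the k features with the largest |Spearman rho|."""
    scores = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        rho, _ = spearmanr(X[:, j], y)   # rank correlation of one feature with the target
        scores[j] = abs(rho)
    top = np.argsort(scores)[::-1][:k]   # highest scores first
    return np.sort(top), scores

# Synthetic data: the label depends only on features 0 and 3
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 6))
y = (X[:, 0] + X[:, 3] > 0).astype(int)

selected, scores = spearman_top_k(X, y, k=2)
X_selected = X[:, selected]

print(selected)
print(np.round(scores, 3))
```

Because each feature is scored independently, a filter like this is cheap to compute but cannot detect features that are only useful in combination.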

For more usage examples, please see the examples folder.

Getting help (questions, problems)

Want instant assistance? Join our Telegram community at link. We share lots of information, questions, and answers there, so you will get more support and knowledge.

