A Python Package for Feature Selection

Py_FS: A Python Package for Feature Selection

Py_FS is a toolbox focused entirely on Feature Selection (FS), written in Python. It provides nature-inspired evolutionary feature selection algorithms, filter methods, and simple evaluation metrics that make it easy to apply and compare feature selection algorithms across datasets. The package is still under development, and we plan to extend it with a more extensive set of feature selection procedures and supporting utilities.


Installation

Install the utilities required by the package by running:

pip3 install -r requirements.txt

The package is publicly available on PyPI (the Python Package Index). Anyone who wants to use the package can install it by running:

pip3 install Py_FS

Structure

The current structure of the package is shown below. Import a module using its dotted (.) path, according to its level in the hierarchy.

Py_FS

For example, to use GA, import it with the following statement:

import Py_FS.wrapper.nature_inspired.GA
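The dotted hierarchy can also be resolved programmatically. The sketch below is only an illustration: `dotted_path` and `load_module` are hypothetical helpers, not part of Py_FS, and the grouping of names simply mirrors the import statements given in this README. `load_module` returns `None` when Py_FS is not installed.

```python
import importlib

# Module names as listed in this README, grouped by their import prefix.
WRAPPERS = {"BBA", "CS", "EO", "GA", "GSA", "GWO", "HS", "MA", "PSO", "RDA", "SCA", "WOA"}
FILTERS = {"PCC", "SCC", "Relief", "MI"}


def dotted_path(name):
    """Return the dotted import path for a module name (hypothetical helper)."""
    if name in WRAPPERS:
        return f"Py_FS.wrapper.nature_inspired.{name}"
    if name in FILTERS:
        return f"Py_FS.filter.{name}"
    raise ValueError(f"unknown Py_FS module: {name}")


def load_module(name):
    """Import the module if Py_FS is installed; return None otherwise."""
    try:
        return importlib.import_module(dotted_path(name))
    except ImportError:
        return None


print(dotted_path("GA"))  # Py_FS.wrapper.nature_inspired.GA
```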

The current version of the package provides three main utilities, each described in a numbered section below.

Quick User Guide

For a quick demonstration of how to use Py_FS, see this Colab link: Py_FS: Demonstration.

1. Wrapper-based Nature-inspired Feature Selection

Wrapper-based nature-inspired methods are popular feature selection approaches because of their efficiency and simplicity. They start from a random set of candidate solutions (agents modeled on natural elements such as particles, whales, or bats) and gradually improve these solutions using guidance from fitter agents. To calculate the fitness of a candidate solution, a wrapper needs a learning algorithm (such as a classifier) to measure the worth of that solution at every iteration. This makes wrapper methods reliable, but also computationally expensive.
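The loop described above can be sketched in a few lines of plain Python. This is a generic illustration, not the implementation of Py_FS or of any algorithm listed in this README. The toy data, the nearest-centroid classifier, the per-feature penalty, and all function names are assumptions chosen to keep the example self-contained.

```python
import random


def make_data(rng, n=60):
    """Toy data: features 0 and 1 carry the class signal, features 2-5 are noise."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [label * 2.0 + rng.gauss(0, 0.5), -label * 2.0 + rng.gauss(0, 0.5)]
        row += [rng.gauss(0, 1.0) for _ in range(4)]
        X.append(row)
        y.append(label)
    return X, y


def centroid_accuracy(X_tr, y_tr, X_te, y_te, mask):
    """Accuracy of a nearest-centroid classifier restricted to the masked features."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0
    centroids = {}
    for label in set(y_tr):
        rows = [x for x, t in zip(X_tr, y_tr) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]

    def distance(x, label):
        return sum((x[j] - m) ** 2 for j, m in zip(cols, centroids[label]))

    correct = sum(min(centroids, key=lambda c: distance(x, c)) == t
                  for x, t in zip(X_te, y_te))
    return correct / len(y_te)


def fitness(mask, data, penalty=0.01):
    """Wrapper fitness: classifier accuracy minus a small cost per selected feature."""
    return centroid_accuracy(*data, mask) - penalty * sum(mask)


def wrapper_search(data, n_features, n_agents=8, n_iter=20, seed=0):
    rng = random.Random(seed)
    # Start from random binary masks (1 = feature selected).
    agents = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(n_agents)]
    scores = [fitness(a, data) for a in agents]
    for _ in range(n_iter):
        best = agents[scores.index(max(scores))]
        for i, agent in enumerate(agents):
            # Copy bits from the fittest agent, then apply occasional random flips.
            candidate = [b if rng.random() < 0.5 else a for a, b in zip(agent, best)]
            candidate = [1 - bit if rng.random() < 0.1 else bit for bit in candidate]
            score = fitness(candidate, data)  # one classifier evaluation per candidate
            if score > scores[i]:  # keep the move only if it improves this agent
                agents[i], scores[i] = candidate, score
    top = scores.index(max(scores))
    return agents[top], scores[top]


X, y = make_data(random.Random(42))
data = (X[:40], y[:40], X[40:], y[40:])  # simple train/test split
mask, score = wrapper_search(data, n_features=6)
print(mask, round(score, 3))
```

Every candidate costs one classifier evaluation, so the run above trains and scores the classifier roughly `n_agents * n_iter` times. This is where the computational expense of wrapper methods comes from.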

Py_FS currently supports the following 12 wrapper-based FS methods:

  • Binary Bat Algorithm (BBA)
  • Cuckoo Search Algorithm (CS)
  • Equilibrium Optimizer (EO)
  • Genetic Algorithm (GA)
  • Gravitational Search Algorithm (GSA)
  • Grey Wolf Optimizer (GWO)
  • Harmony Search (HS)
  • Mayfly Algorithm (MA)
  • Particle Swarm Optimization (PSO)
  • Red Deer Algorithm (RDA)
  • Sine Cosine Algorithm (SCA)
  • Whale Optimization Algorithm (WOA)

These wrapper approaches can be imported in your code using the following statements:

import Py_FS.wrapper.nature_inspired.BBA
import Py_FS.wrapper.nature_inspired.CS
import Py_FS.wrapper.nature_inspired.EO
import Py_FS.wrapper.nature_inspired.GA
import Py_FS.wrapper.nature_inspired.GSA
import Py_FS.wrapper.nature_inspired.GWO
import Py_FS.wrapper.nature_inspired.HS
import Py_FS.wrapper.nature_inspired.MA
import Py_FS.wrapper.nature_inspired.PSO
import Py_FS.wrapper.nature_inspired.RDA
import Py_FS.wrapper.nature_inspired.SCA
import Py_FS.wrapper.nature_inspired.WOA

2. Filter-based Feature Selection

Filter methods do not use an intermediate learning algorithm to verify the strength of the generated solutions. Instead, they use statistical measures to score the importance of each feature. Every feature is then ranked according to its relevance in the dataset, and the top-ranked features can be used for classification.
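As a minimal sketch of this idea, the snippet below ranks features by the absolute Pearson correlation between each feature column and the class label. Scoring features against the label this way is an assumption made for illustration, and it may differ from how the filters in Py_FS compute their scores. `pearson` and `rank_features` are hypothetical names.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sx = math.sqrt(sum((a - mx) ** 2 for a in xs))
    sy = math.sqrt(sum((b - my) ** 2 for b in ys))
    if sx == 0 or sy == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / (sx * sy)


def rank_features(X, y):
    """Return feature indices sorted from most to least relevant, plus the scores."""
    n_features = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(n_features)]
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores


X = [
    [1.0, 5.0, 0.3],
    [2.0, 5.0, 0.9],
    [3.0, 5.0, 0.1],
    [4.0, 5.0, 0.8],
]
y = [0, 0, 1, 1]
order, scores = rank_features(X, y)
print(order)  # [0, 2, 1]
```

No classifier is trained at any point, which is why filter methods are much cheaper than wrappers.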

Py_FS currently supports the following 4 filter-based FS methods:

  • Pearson Correlation Coefficient (PCC)
  • Spearman Correlation Coefficient (SCC)
  • Relief
  • Mutual Information (MI)

These filter approaches can be imported in your code using the following statements:

import Py_FS.filter.PCC
import Py_FS.filter.SCC
import Py_FS.filter.Relief
import Py_FS.filter.MI

3. Evaluation Metrics

The package comes with tools to evaluate features before or after FS. This makes it easy to compare and analyze the performance of different FS procedures.

Py_FS currently supports the following evaluation metrics:

  • classification accuracy
  • average recall
  • average precision
  • average f1 score
  • confusion matrix
  • confusion graph

The evaluation capabilities can be imported in your code using the following statement:

from Py_FS.evaluation import evaluate
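For reference, the numeric metrics listed above can be computed from true and predicted labels as in the sketch below. It does not use or reproduce the `evaluate` function from Py_FS. It assumes that "average" means an unweighted (macro) average over classes, and `confusion_matrix` and `summarize` are hypothetical helper names.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def summarize(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1 score."""
    labels = sorted(set(y_true) | set(y_pred))
    cm = confusion_matrix(y_true, y_pred, labels)
    k = len(labels)
    precisions, recalls, f1s = [], [], []
    for i in range(k):
        tp = cm[i][i]
        predicted = sum(cm[r][i] for r in range(k))  # column sum
        actual = sum(cm[i])                          # row sum
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    return {
        "accuracy": sum(cm[i][i] for i in range(k)) / len(y_true),
        "avg_precision": sum(precisions) / k,
        "avg_recall": sum(recalls) / k,
        "avg_f1": sum(f1s) / k,
        "confusion_matrix": cm,
    }


y_true = [0, 0, 0, 1, 1, 1]
y_pred = [0, 0, 1, 1, 1, 0]
report = summarize(y_true, y_pred)
print(report["confusion_matrix"])  # [[2, 1], [1, 2]]
```

Computing the same report before and after feature selection gives a direct comparison of how a selected feature subset affects the classifier.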
