
Feature selection algorithm that combines a genetic algorithm with MRMR

GenMRMR

This repository proposes a new feature selection algorithm that combines a genetic algorithm with the MRMR (minimum redundancy, maximum relevance) algorithm. It aims to select a subset of features that maximizes classification performance. The algorithm creates a population of feature subsets, evaluates the weighted F1-score of each one, and then hybridizes the best individuals to create new subsets. The probability of adding a feature during hybridization is estimated with the MRMR algorithm. Finally, the subset with the best F1-score is selected.
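The loop described above can be illustrated with a minimal sketch. This is not the package's implementation: the function names (`weighted_f1`, `centroid_predict`, `mrmr_probabilities`, `select_features`), the nearest-centroid classifier, the population size, and the crossover rule are all assumptions made for illustration. MRMR is usually defined with mutual information; the sketch substitutes absolute Pearson correlation to stay NumPy-only, so it assumes numeric labels.

```python
import numpy as np

def weighted_f1(y_true, y_pred):
    """Weighted F1: per-class F1 averaged by class support."""
    total = 0.0
    for c in np.unique(y_true):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        total += f1 * np.mean(y_true == c)
    return float(total)

def centroid_predict(x_train, y_train, x_cv):
    """Nearest-centroid classifier, a stand-in for any real model (assumption)."""
    classes = np.unique(y_train)
    centroids = np.stack([x_train[y_train == c].mean(axis=0) for c in classes])
    dists = ((x_cv[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[dists.argmin(axis=1)]

def mrmr_probabilities(x, y):
    """MRMR-style score (relevance minus mean redundancy) squashed into (0, 1).

    Uses absolute Pearson correlation instead of mutual information (assumption).
    """
    corr = np.nan_to_num(np.abs(np.corrcoef(np.column_stack([x, y]), rowvar=False)))
    relevance = corr[:-1, -1]                      # feature vs. label
    feat = corr[:-1, :-1]                          # feature vs. feature
    n = feat.shape[0]
    redundancy = (feat.sum(axis=1) - 1.0) / max(n - 1, 1)  # drop self-correlation
    return 1.0 / (1.0 + np.exp(-4.0 * (relevance - redundancy)))

def select_features(x_train, y_train, x_cv, y_cv,
                    pop_size=20, generations=15, seed=0):
    """Return (boolean feature mask, weighted F1 on the cv split)."""
    rng = np.random.default_rng(seed)
    n_features = x_train.shape[1]
    probs = mrmr_probabilities(x_train, y_train)

    def fitness(mask):
        if not mask.any():
            return 0.0
        pred = centroid_predict(x_train[:, mask], y_train, x_cv[:, mask])
        return weighted_f1(y_cv, pred)

    population = rng.random((pop_size, n_features)) < 0.5
    best_mask, best_score = None, -1.0
    for _ in range(generations):
        scores = np.array([fitness(m) for m in population])
        order = np.argsort(scores)[::-1]
        if scores[order[0]] > best_score:
            best_score = float(scores[order[0]])
            best_mask = population[order[0]].copy()
        parents = population[order[: pop_size // 2]]  # keep the better half
        children = []
        for _ in range(pop_size):
            a, b = parents[rng.integers(len(parents), size=2)]
            # Features shared by both parents are kept; features present in
            # only one parent are added with the MRMR-derived probability.
            children.append((a & b) | ((a ^ b) & (rng.random(n_features) < probs)))
        population = np.array(children)
    return best_mask, best_score
```

Calling `select_features(x_train, y_train, x_cv, y_cv)` returns a boolean mask over the columns and the score of the best subset seen; `x[:, mask]` then keeps only the selected features.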

Documentation

Three methods are available: fit, transform, and fit_transform.

The fit method fits GenMRMR to the given training and cross-validation data.

Args:
    x_train, y_train, x_cv, y_cv (numpy.ndarray): training and cross-validation data and labels used for feature selection

The transform method transforms data using the fitted GenMRMR.

Args:
    data (numpy.ndarray): data to transform

The fit_transform method fits GenMRMR and then transforms the data.

Args:
    data (pandas.core.frame.DataFrame): data to transform
    labels (pandas.core.series.Series): labels for the data

Authors

  • Algorithm created by Maksim Tislenko (makstislenko@gmail.com)
    • Conceived the idea of the algorithm
    • Implemented the whole algorithm
  • Comparison with existing feature selection algorithms was conducted with the participation of Rustam Paringer (rusparinger@ssau.ru) from the Department of Technical Cybernetics at Samara University.

Acknowledgements

I would also like to thank Arkadiy Sotnikov (cotnikoarkady@gmail.com) for his assistance in preparing this README.
