A feature selection and feature ranking package for selecting and ranking features in datasets
Project description
Feature Selection and Feature Ranking Algorithms
A Python package that provides several feature selection and feature ranking algorithms.
Call the function as follows :
fsfr(dataset, fs='string_value', fr='string_value', ftf='string_value')
Parameters :
dataset : pandas DataFrame holding the original dataset
It must contain only numerical values (categorical and ordinal values are not supported), and
the class variable (the decision attribute) must also be numerical.
fs : string value - 'gpso' or 'ga'
fs sets the feature selection method and can be either :
gpso : Geometric Particle Swarm Optimisation
ga : Genetic Algorithm
fr : string value - 'rsm_a' , 'rsm_b' , 'rsm_c' , 'mifsnd' , 'mrmr'
fr sets the feature ranking method and can be either :
rsm_a : Rough Set Method 1
rsm_b : Rough Set Method 2
rsm_c : Rough Set Method 3
mifsnd : Mutual Information Feature Selection-ND
mrmr : Minimum Redundancy Maximum Relevance
ftf : string value - 'ftf_1' , 'ftf_2' , 'ftf_3'
ftf sets the fitness function
If 'fs' is used, 'ftf' must also be specified.
ftf_1 : fitness function = 0.75 * (100 / accuracy) + 0.25 * (no. of features)
ftf_2 : fitness function = 0.75 * accuracy + 0.25 * (1 / no. of features)
ftf_3 : fitness function = accuracy * (1 - no. of features / total no. of features)
no. of features = number of features selected by the algorithm at that instant
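As a rough illustration, the three fitness functions can be written as plain Python functions. This is a sketch, not the package's own code: the function and argument names are hypothetical, and accuracy is assumed to be a percentage (the 100 / accuracy term in ftf_1 suggests this, but the description does not state it).

```python
# Hypothetical stand-alone versions of the documented fitness formulas.
# Assumption: accuracy is a percentage greater than 0, and n_selected >= 1.

def ftf_1(accuracy, n_selected):
    """0.75 * (100 / accuracy) + 0.25 * (no. of features)."""
    return 0.75 * (100.0 / accuracy) + 0.25 * n_selected


def ftf_2(accuracy, n_selected):
    """0.75 * accuracy + 0.25 * (1 / no. of features)."""
    return 0.75 * accuracy + 0.25 * (1.0 / n_selected)


def ftf_3(accuracy, n_selected, n_total):
    """accuracy * (1 - no. of features / total no. of features)."""
    return accuracy * (1.0 - n_selected / n_total)


if __name__ == "__main__":
    # Example: 80% accuracy using 5 of 20 features.
    print(ftf_1(80.0, 5))
    print(ftf_2(80.0, 5))
    print(ftf_3(80.0, 5, 20))
```

Written this way, ftf_1 gets smaller as accuracy rises and the feature count falls, while ftf_2 and ftf_3 get larger under the same conditions, so ftf_1 reads as a cost and the other two as scores.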
Returns : a list of features ranked in descending order when 'fr' is used, either on its own or together with 'fs'.
Feature selection and feature ranking can be used independently of each other by passing either fs='' or fr='', but both cannot be ''. For larger datasets it is preferable to use both together.
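The argument rules described above can be summarised in a small checker. This is a sketch only: check_fsfr_args is a hypothetical helper written for illustration and is not part of the package. It encodes just the rules stated here (allowed string values, 'fs' and 'fr' not both empty, 'ftf' required whenever 'fs' is set).

```python
# Hypothetical pre-flight check for the documented fsfr arguments.
FS_METHODS = {"gpso", "ga"}
FR_METHODS = {"rsm_a", "rsm_b", "rsm_c", "mifsnd", "mrmr"}
FTF_CHOICES = {"ftf_1", "ftf_2", "ftf_3"}


def check_fsfr_args(fs="", fr="", ftf=""):
    """Return True if the combination follows the documented rules."""
    if fs == "" and fr == "":
        raise ValueError("fs and fr cannot both be ''")
    if fs != "" and fs not in FS_METHODS:
        raise ValueError("fs must be one of %s" % sorted(FS_METHODS))
    if fr != "" and fr not in FR_METHODS:
        raise ValueError("fr must be one of %s" % sorted(FR_METHODS))
    # ftf is mandatory only when a feature selection method is given.
    if fs != "" and ftf not in FTF_CHOICES:
        raise ValueError("ftf must be one of %s when fs is used" % sorted(FTF_CHOICES))
    return True


if __name__ == "__main__":
    print(check_fsfr_args(fs="gpso", fr="mrmr", ftf="ftf_1"))  # selection + ranking
    print(check_fsfr_args(fr="rsm_a"))                          # ranking only
```

The description does not say how 'ftf' is treated when only 'fr' is used, so the helper simply ignores it in that case.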
References for algorithms :
gpso with ftf_1 : https://www.researchgate.net/publication/4307926_Gene_selection_in_cancer_
rsm_a : http://library.isical.ac.in:8080/jspui/bitstream/10263/5158/1/Rough%20Sets%20for%20Selection%20of%20Molecular%20Descriptors%20to%20Predict%20Biological%20Activity%20of%20Molecules-IEEETOSMAC-%20Part%20C-AAR-40-6-2010-p%20639-648.pdf
rsm_b : https://ieeexplore.ieee.org/document/7104131
mifsnd : https://www.sciencedirect.com/science/article/pii/S0957417414002164
The rest of the algorithms were self-developed and do not contain material from any other source.
Hashes for Feature Selction-Ranking Algorithms-1.0.7.tar.gz
Algorithm | Hash digest
---|---
SHA256 | a3066cf2a58b08f03b61dccb7d660452017566eea9c653b3f50ca2fd6ffeb783
MD5 | 688ed7ca9dfa23ce7726d87585cb85d1
BLAKE2b-256 | d8b19808031fa37251a77c2c969c7e3567595cdd5b15b768e9d4b028c7365660
Hashes for Feature_Selction_Ranking_Algorithms-1.0.7-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 8762c755109b8967683ecf3101d9ef5e2637399cac6c590d6d23bfa7f9b3b307
MD5 | 15fe67c7c89aadffd860e4c3ec7a8f27
BLAKE2b-256 | 23bc9dcec20c1c366ef80de0a68b4fe36faa759c9e6cb82bfda80a092f0a92ac