# zoofs ( Zoo Feature Selection )

`zoofs` is a Python library for performing feature selection using a variety of nature-inspired wrapper algorithms. The algorithms range from swarm intelligence to physics-based to evolutionary. It's an easy-to-use, flexible, and powerful tool to reduce your feature set size.
## Installation

### Using pip

Use the package manager pip to install zoofs:

```shell
pip install zoofs
```
## Available Algorithms

| Algorithm Name | Class Name | Description |
|----------------|------------|-------------|
| Particle Swarm Algorithm | ParticleSwarmOptimization | Utilizes swarm behaviour |
| Grey Wolf Algorithm | GreyWolfOptimization | Utilizes wolf hunting behaviour |
| Dragon Fly Algorithm | DragonFlyOptimization | Utilizes dragonfly swarm behaviour |
| Genetic Algorithm | GeneticOptimization | Utilizes genetic mutation behaviour |
| Gravitational Algorithm | GravitationalOptimization | Utilizes Newton's gravitational behaviour |
## Usage

Define your own objective function for optimization!

```python
from sklearn.metrics import log_loss

# define your own objective function: it receives the model and the four
# data splits, fits the model, and returns the objective value
def objective_function_topass(model, X_train, y_train, X_valid, y_valid):
    model.fit(X_train, y_train)
    P = log_loss(y_valid, model.predict_proba(X_valid))
    return P

# import an algorithm
from zoofs import ParticleSwarmOptimization

# create an object of the algorithm
algo_object = ParticleSwarmOptimization(objective_function_topass, n_iteration=20,
                                        population_size=20, minimize=True)

import lightgbm as lgb
lgb_model = lgb.LGBMClassifier()

# fit the algorithm
algo_object.fit(lgb_model, X_train, y_train, X_valid, y_valid, verbose=True)

# plot your results
algo_object.plot_history()
```
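All of the algorithms above are wrapper methods: they search over candidate feature subsets and score each one with your objective function, keeping the best-scoring subset. A minimal, library-free sketch of that loop (plain random search standing in for the swarm/evolutionary updates; the toy data, objective, and all names here are illustrative, not part of zoofs):

```python
import numpy as np

rng = np.random.default_rng(0)

# toy data: 8 features, only the first two carry signal
X = rng.normal(size=(200, 8))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

def objective(mask):
    # score a feature subset with a trivial "model":
    # misclassification rate of sign(sum of the selected columns)
    if not mask.any():
        return 1.0
    pred = (X[:, mask].sum(axis=1) > 0).astype(int)
    return float((pred != y).mean())

# wrapper loop: propose feature masks, score each, keep the best
best_mask, best_score = None, np.inf
for _ in range(100):
    mask = rng.random(8) < 0.5
    score = objective(mask)
    if score < best_score:
        best_mask, best_score = mask, score

print(best_mask, best_score)
```

The nature-inspired algorithms replace the random proposals with guided updates, but the evaluate-and-keep-the-best skeleton is the same.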
## Algorithms

### Particle Swarm Algorithm

#### `class zoofs.ParticleSwarmOptimization(objective_function, n_iteration=50, population_size=50, minimize=True, c1=2, c2=2, w=0.9)`

**Parameters**

- `objective_function` : user-defined function with the signature `func(model, X_train, y_train, X_valid, y_valid)`. The function must return a value to be minimized/maximized.
- `n_iteration` : int, default=50 - Number of iterations the algorithm will run
- `population_size` : int, default=50 - Total size of the population
- `minimize` : bool, default=True - Whether the objective value is to be minimized (True) or maximized (False)
- `c1` : float, default=2.0 - First acceleration coefficient of the particle swarm
- `c2` : float, default=2.0 - Second acceleration coefficient of the particle swarm
- `w` : float, default=0.9 - Inertia weight parameter
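The roles of `c1`, `c2`, and `w` follow the canonical particle swarm velocity update. A hedged sketch of that update (variable names are assumptions, not zoofs's actual implementation; binary PSO variants commonly map continuous positions to feature masks through a sigmoid threshold):

```python
import numpy as np

rng = np.random.default_rng(0)
c1, c2, w = 2.0, 2.0, 0.9  # the defaults above

n_particles, n_features = 4, 6
position = rng.random((n_particles, n_features))   # continuous encoding of feature masks
velocity = np.zeros((n_particles, n_features))
personal_best = position.copy()                    # best position seen by each particle
global_best = position[0].copy()                   # best position seen by the swarm

# canonical PSO update: v <- w*v + c1*r1*(pbest - x) + c2*r2*(gbest - x)
r1 = rng.random((n_particles, n_features))
r2 = rng.random((n_particles, n_features))
velocity = (w * velocity
            + c1 * r1 * (personal_best - position)
            + c2 * r2 * (global_best - position))
position = position + velocity

# a binary feature mask is then obtained by thresholding, e.g. via a sigmoid
mask = 1 / (1 + np.exp(-position)) > 0.5
```

`w` damps the previous velocity, while `c1` and `c2` weight the pull toward each particle's personal best and the swarm's global best, respectively.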
**Attributes**

- `best_feature_list` : array-like - Final best set of features

**Methods**

| Method | Description |
|--------|-------------|
| `fit` | Run the algorithm |
| `plot_history` | Plot results achieved across iterations |

#### `fit(model, X_train, y_train, X_valid, y_valid, verbose=True)`

**Parameters**

- `model` : machine learning model object
- `X_train` : pandas.core.frame.DataFrame of shape (n_samples, n_features) - Training input samples for the machine learning model
- `y_train` : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples,) - The target values (class labels in classification, real numbers in regression)
- `X_valid` : pandas.core.frame.DataFrame of shape (n_samples, n_features) - Validation input samples
- `y_valid` : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples,) - The validation target values
- `verbose` : bool, default=True - Print results for each iteration

**Returns**

- `best_feature_list` : array-like - Final best set of features

#### `plot_history()`

Plot results across iterations
#### Example

```python
from sklearn.metrics import log_loss

# define your own objective function: it receives the model and the four
# data splits, fits the model, and returns the objective value
def objective_function_topass(model, X_train, y_train, X_valid, y_valid):
    model.fit(X_train, y_train)
    P = log_loss(y_valid, model.predict_proba(X_valid))
    return P

# import an algorithm
from zoofs import ParticleSwarmOptimization

# create an object of the algorithm
algo_object = ParticleSwarmOptimization(objective_function_topass, n_iteration=20,
                                        population_size=20, minimize=True, c1=2, c2=2, w=0.9)

import lightgbm as lgb
lgb_model = lgb.LGBMClassifier()

# fit the algorithm
algo_object.fit(lgb_model, X_train, y_train, X_valid, y_valid, verbose=True)

# plot your results
algo_object.plot_history()
```
### Grey Wolf Algorithm

#### `class zoofs.GreyWolfOptimization(objective_function, n_iteration=50, population_size=50, minimize=True)`

**Parameters**

- `objective_function` : user-defined function with the signature `func(model, X_train, y_train, X_valid, y_valid)`. The function must return a value to be minimized/maximized.
- `n_iteration` : int, default=50 - Number of iterations the algorithm will run
- `population_size` : int, default=50 - Total size of the population
- `minimize` : bool, default=True - Whether the objective value is to be minimized (True) or maximized (False)

**Attributes**

- `best_feature_list` : array-like - Final best set of features

**Methods**

| Method | Description |
|--------|-------------|
| `fit` | Run the algorithm |
| `plot_history` | Plot results achieved across iterations |
#### `fit(model, X_train, y_train, X_valid, y_valid, method=1, verbose=True)`

**Parameters**

- `model` : machine learning model object
- `X_train` : pandas.core.frame.DataFrame of shape (n_samples, n_features) - Training input samples for the machine learning model
- `y_train` : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples,) - The target values (class labels in classification, real numbers in regression)
- `X_valid` : pandas.core.frame.DataFrame of shape (n_samples, n_features) - Validation input samples
- `y_valid` : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples,) - The validation target values
- `method` : {1, 2}, default=1 - Choose between the two methods of grey wolf optimization
- `verbose` : bool, default=True - Print results for each iteration

**Returns**

- `best_feature_list` : array-like - Final best set of features

#### `plot_history()`

Plot results across iterations
#### Example

```python
from sklearn.metrics import log_loss

# define your own objective function: it receives the model and the four
# data splits, fits the model, and returns the objective value
def objective_function_topass(model, X_train, y_train, X_valid, y_valid):
    model.fit(X_train, y_train)
    P = log_loss(y_valid, model.predict_proba(X_valid))
    return P

# import an algorithm
from zoofs import GreyWolfOptimization

# create an object of the algorithm
algo_object = GreyWolfOptimization(objective_function_topass, n_iteration=20,
                                   population_size=20, minimize=True)

import lightgbm as lgb
lgb_model = lgb.LGBMClassifier()

# fit the algorithm
algo_object.fit(lgb_model, X_train, y_train, X_valid, y_valid, method=1, verbose=True)

# plot your results
algo_object.plot_history()
```
### Dragon Fly Algorithm

#### `class zoofs.DragonFlyOptimization(objective_function, n_iteration=50, population_size=50, minimize=True)`

**Parameters**

- `objective_function` : user-defined function with the signature `func(model, X_train, y_train, X_valid, y_valid)`. The function must return a value to be minimized/maximized.
- `n_iteration` : int, default=50 - Number of iterations the algorithm will run
- `population_size` : int, default=50 - Total size of the population
- `minimize` : bool, default=True - Whether the objective value is to be minimized (True) or maximized (False)

**Attributes**

- `best_feature_list` : array-like - Final best set of features

**Methods**

| Method | Description |
|--------|-------------|
| `fit` | Run the algorithm |
| `plot_history` | Plot results achieved across iterations |
#### `fit(model, X_train, y_train, X_valid, y_valid, method='sinusoidal', verbose=True)`

**Parameters**

- `model` : machine learning model object
- `X_train` : pandas.core.frame.DataFrame of shape (n_samples, n_features) - Training input samples for the machine learning model
- `y_train` : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples,) - The target values (class labels in classification, real numbers in regression)
- `X_valid` : pandas.core.frame.DataFrame of shape (n_samples, n_features) - Validation input samples
- `y_valid` : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples,) - The validation target values
- `method` : {'linear', 'random', 'quadratic', 'sinusoidal'}, default='sinusoidal' - Choose between the four methods of dragonfly optimization
- `verbose` : bool, default=True - Print results for each iteration

**Returns**

- `best_feature_list` : array-like - Final best set of features

#### `plot_history()`

Plot results across iterations
#### Example

```python
from sklearn.metrics import log_loss

# define your own objective function: it receives the model and the four
# data splits, fits the model, and returns the objective value
def objective_function_topass(model, X_train, y_train, X_valid, y_valid):
    model.fit(X_train, y_train)
    P = log_loss(y_valid, model.predict_proba(X_valid))
    return P

# import an algorithm
from zoofs import DragonFlyOptimization

# create an object of the algorithm
algo_object = DragonFlyOptimization(objective_function_topass, n_iteration=20,
                                    population_size=20, minimize=True)

import lightgbm as lgb
lgb_model = lgb.LGBMClassifier()

# fit the algorithm
algo_object.fit(lgb_model, X_train, y_train, X_valid, y_valid, method='sinusoidal', verbose=True)

# plot your results
algo_object.plot_history()
```
### Genetic Algorithm

#### `class zoofs.GeneticOptimization(objective_function, n_iteration=20, population_size=20, selective_pressure=2, elitism=2, mutation_rate=0.05, minimize=True)`

**Parameters**

- `objective_function` : user-defined function with the signature `func(model, X_train, y_train, X_valid, y_valid)`. The function must return a value to be minimized/maximized.
- `n_iteration` : int, default=20 - Number of iterations the algorithm will run
- `population_size` : int, default=20 - Total size of the population
- `selective_pressure` : int, default=2 - Measure of reproductive opportunities for each organism in the population
- `elitism` : int, default=2 - Number of top individuals to be considered as elites
- `mutation_rate` : float, default=0.05 - Rate of mutation in the population's genes
- `minimize` : bool, default=True - Whether the objective value is to be minimized (True) or maximized (False)
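To give intuition for `mutation_rate`: in a genetic feature selector, each individual is typically a boolean mask over the features, and mutation flips each bit independently with that probability. A hedged sketch of that step (illustrative only, not zoofs's exact implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
mutation_rate = 0.05  # the default above

# a population of 6 individuals, each a boolean mask over 10 features
population = rng.random((6, 10)) < 0.5

# mutation: flip each bit independently with probability mutation_rate
flip = rng.random(population.shape) < mutation_rate
mutated = population ^ flip
```

A small rate keeps the search mostly driven by selection and crossover while still injecting enough variation to escape local optima.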
**Attributes**

- `best_feature_list` : array-like - Final best set of features

**Methods**

| Method | Description |
|--------|-------------|
| `fit` | Run the algorithm |
| `plot_history` | Plot results achieved across iterations |

#### `fit(model, X_train, y_train, X_valid, y_valid, verbose=True)`

**Parameters**

- `model` : machine learning model object
- `X_train` : pandas.core.frame.DataFrame of shape (n_samples, n_features) - Training input samples for the machine learning model
- `y_train` : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples,) - The target values (class labels in classification, real numbers in regression)
- `X_valid` : pandas.core.frame.DataFrame of shape (n_samples, n_features) - Validation input samples
- `y_valid` : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples,) - The validation target values
- `verbose` : bool, default=True - Print results for each iteration

**Returns**

- `best_feature_list` : array-like - Final best set of features

#### `plot_history()`

Plot results across iterations
#### Example

```python
from sklearn.metrics import log_loss

# define your own objective function: it receives the model and the four
# data splits, fits the model, and returns the objective value
def objective_function_topass(model, X_train, y_train, X_valid, y_valid):
    model.fit(X_train, y_train)
    P = log_loss(y_valid, model.predict_proba(X_valid))
    return P

# import an algorithm
from zoofs import GeneticOptimization

# create an object of the algorithm
algo_object = GeneticOptimization(objective_function_topass, n_iteration=20,
                                  population_size=20, selective_pressure=2, elitism=2,
                                  mutation_rate=0.05, minimize=True)

import lightgbm as lgb
lgb_model = lgb.LGBMClassifier()

# fit the algorithm
algo_object.fit(lgb_model, X_train, y_train, X_valid, y_valid, verbose=True)

# plot your results
algo_object.plot_history()
```
### Gravitational Algorithm

#### `class zoofs.GravitationalOptimization(objective_function, n_iteration=50, population_size=50, g0=100, eps=0.5, minimize=True)`

**Parameters**

- `objective_function` : user-defined function with the signature `func(model, X_train, y_train, X_valid, y_valid)`. The function must return a value to be minimized/maximized.
- `n_iteration` : int, default=50 - Number of iterations the algorithm will run
- `population_size` : int, default=50 - Total size of the population
- `g0` : float, default=100 - Gravitational strength constant
- `eps` : float, default=0.5 - Distance constant
- `minimize` : bool, default=True - Whether the objective value is to be minimized (True) or maximized (False)
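`g0` and `eps` play their usual roles in the gravitational search algorithm's force law, where `eps` keeps the force finite when two agents come arbitrarily close. An illustrative sketch of the canonical force term (zoofs's exact update is not documented here; all names are assumptions):

```python
import numpy as np

g0, eps = 100.0, 0.5  # the defaults above

# canonical gravitational-search force between agents i and j:
#   F_ij = G(t) * M_i * M_j * (x_j - x_i) / (R_ij + eps)
def force(x_i, x_j, m_i, m_j, g_t):
    r = np.linalg.norm(x_j - x_i)  # Euclidean distance R_ij between agents
    return g_t * m_i * m_j * (x_j - x_i) / (r + eps)

# force pulling an agent at the origin toward one at (1, 1), unit masses
f = force(np.zeros(2), np.ones(2), 1.0, 1.0, g0)
```

Agents with better objective values receive larger masses, so the population is attracted toward the best solutions found so far.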
**Attributes**

- `best_feature_list` : array-like - Final best set of features

**Methods**

| Method | Description |
|--------|-------------|
| `fit` | Run the algorithm |
| `plot_history` | Plot results achieved across iterations |

#### `fit(model, X_train, y_train, X_valid, y_valid, verbose=True)`

**Parameters**

- `model` : machine learning model object
- `X_train` : pandas.core.frame.DataFrame of shape (n_samples, n_features) - Training input samples for the machine learning model
- `y_train` : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples,) - The target values (class labels in classification, real numbers in regression)
- `X_valid` : pandas.core.frame.DataFrame of shape (n_samples, n_features) - Validation input samples
- `y_valid` : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples,) - The validation target values
- `verbose` : bool, default=True - Print results for each iteration

**Returns**

- `best_feature_list` : array-like - Final best set of features

#### `plot_history()`

Plot results across iterations
#### Example

```python
from sklearn.metrics import log_loss

# define your own objective function: it receives the model and the four
# data splits, fits the model, and returns the objective value
def objective_function_topass(model, X_train, y_train, X_valid, y_valid):
    model.fit(X_train, y_train)
    P = log_loss(y_valid, model.predict_proba(X_valid))
    return P

# import an algorithm
from zoofs import GravitationalOptimization

# create an object of the algorithm
algo_object = GravitationalOptimization(objective_function_topass, n_iteration=50,
                                        population_size=50, g0=100, eps=0.5, minimize=True)

import lightgbm as lgb
lgb_model = lgb.LGBMClassifier()

# fit the algorithm
algo_object.fit(lgb_model, X_train, y_train, X_valid, y_valid, verbose=True)

# plot your results
algo_object.plot_history()
```
## Support zoofs

The development of zoofs relies completely on contributions.

## Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update tests as appropriate.

## First roll out

*, 2021 *

## License

Apache-2.0