zoofs is a Python library for performing feature selection using a variety of nature-inspired wrapper algorithms. The algorithms range from swarm-intelligence to physics-based to evolutionary.
It's an easy-to-use, flexible and powerful tool to reduce your feature set size.
🌟 Like this Project? Give us a star !
📘 Documentation
https://jaswinder9051998.github.io/zoofs/
🔗 What's new in V0.1.24
 pass kwargs through objective function
 improved logger for results
 added harris hawk algorithm
 now you can pass timeout as a parameter to stop the operation after the given number of seconds, as an alternative to passing a number of iterations
 feature score hashing of visited feature sets to increase overall performance
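The last two items can be pictured with a small sketch. This is not zoofs's internal code: `make_cached_objective`, `run_until`, and `toy_score` are hypothetical names, and the sketch only assumes that a feature set can be represented as a 0/1 mask. It shows how scores of already visited feature sets might be stored under a hashable key, and how a loop might stop on a time budget instead of an iteration count.

```python
import time


def make_cached_objective(score_fn):
    """Wrap score_fn so each distinct feature mask is scored only once."""
    cache = {}

    def cached(mask):
        key = tuple(mask)  # hashable key identifying the feature subset
        if key not in cache:
            cache[key] = score_fn(mask)
        return cache[key]

    return cached


def run_until(step_fn, timeout=None, n_iteration=None):
    """Call step_fn until `timeout` seconds elapse or `n_iteration` calls are made."""
    start = time.monotonic()
    done = 0
    while n_iteration is None or done < n_iteration:
        if timeout is not None and time.monotonic() - start >= timeout:
            break
        step_fn()
        done += 1
    return done


# Toy objective (hypothetical): records every real evaluation.
calls = []


def toy_score(mask):
    calls.append(tuple(mask))
    return sum(mask)


cached_score = make_cached_objective(toy_score)
cached_score([1, 0, 1])
cached_score([1, 0, 1])  # served from the cache; toy_score is not called again
iterations_run = run_until(lambda: cached_score([0, 1, 1]), n_iteration=5)
```

With an expensive model fit behind `score_fn`, every cache hit saves one full training run, which is where the performance gain would come from.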
🛠 Installation
Using pip
Use the package manager to install zoofs.
pip install zoofs
📜 Available Algorithms
Algorithm Name  Class Name  Description  Reference (DOI)

Particle Swarm Algorithm  ParticleSwarmOptimization  Utilizes swarm behaviour  https://doi.org/10.1007/978-3-319-13563-2_51
Grey Wolf Algorithm  GreyWolfOptimization  Utilizes wolf hunting behaviour  https://doi.org/10.1016/j.neucom.2015.06.083
Dragon Fly Algorithm  DragonFlyOptimization  Utilizes dragonfly swarm behaviour  https://doi.org/10.1016/j.knosys.2020.106131
Harris Hawk Algorithm  HarrisHawkOptimization  Utilizes hawk hunting behaviour  https://link.springer.com/chapter/10.1007/978-981-32-9990-0_12
Genetic Algorithm  GeneticOptimization  Utilizes genetic mutation behaviour  https://doi.org/10.1109/ICDAR.2001.953980
Gravitational Algorithm  GravitationalOptimization  Utilizes Newton's gravitational behaviour  https://doi.org/10.1109/ICASSP.2011.5946916
More algos soon, stay tuned !
⚡️ Usage
Define your own objective function for optimization !
Classification Example
from sklearn.metrics import log_loss

# define your own objective function; make sure it receives five parameters
# (model, X_train, y_train, X_valid, y_valid), fits your model and returns the objective value
def objective_function_topass(model, X_train, y_train, X_valid, y_valid):
    model.fit(X_train, y_train)
    P = log_loss(y_valid, model.predict_proba(X_valid))
    return P

# import an algorithm
from zoofs import ParticleSwarmOptimization
# create object of algorithm
algo_object = ParticleSwarmOptimization(objective_function_topass, n_iteration=20,
                                        population_size=20, minimize=True)
import lightgbm as lgb
lgb_model = lgb.LGBMClassifier()
# fit the algorithm (X_train, y_train, X_valid, y_valid are your own pandas data)
algo_object.fit(lgb_model, X_train, y_train, X_valid, y_valid, verbose=True)
# plot your results
algo_object.plot_history()
Regression Example
from sklearn.metrics import mean_squared_error

# define your own objective function; make sure it receives five parameters
# (model, X_train, y_train, X_valid, y_valid), fits your model and returns the objective value
def objective_function_topass(model, X_train, y_train, X_valid, y_valid):
    model.fit(X_train, y_train)
    P = mean_squared_error(y_valid, model.predict(X_valid))
    return P

# import an algorithm
from zoofs import ParticleSwarmOptimization
# create object of algorithm
algo_object = ParticleSwarmOptimization(objective_function_topass, n_iteration=20,
                                        population_size=20, minimize=True)
import lightgbm as lgb
lgb_model = lgb.LGBMRegressor()
# fit the algorithm (X_train, y_train, X_valid, y_valid are your own pandas data)
algo_object.fit(lgb_model, X_train, y_train, X_valid, y_valid, verbose=True)
# plot your results
algo_object.plot_history()
Suggestions for Usage
 As the available algorithms are wrapper algorithms, it is better to use ML models that train quickly, e.g. lightgbm or catboost.
 Choose a sufficiently large 'population_size', as it determines the extent of exploration and exploitation of the algorithm.
 Ensure that your ML model has its hyperparameters optimized before passing it to the zoofs algorithms.
objective score plot
Algorithms
Particle Swarm Algorithm
In computational science, particle swarm optimization (PSO) is a computational method that optimizes a problem by iteratively trying to improve a candidate solution with regard to a given measure of quality. It solves a problem by having a population of candidate solutions, here dubbed particles, and moving these particles around in the search-space according to simple mathematical formulae over the particle's position and velocity. Each particle's movement is influenced by its local best known position, but is also guided toward the best known positions in the search-space, which are updated as better positions are found by other particles. This is expected to move the swarm toward the best solutions.
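The update described above can be sketched in a few lines of plain Python. This is an illustrative binary PSO under stated assumptions, not the zoofs implementation: the function `binary_pso`, the sigmoid transfer step that turns velocities into 0/1 feature masks, the velocity clamp, and the toy objective are all choices made for this sketch. The parameter names `c1`, `c2`, and `w` mirror the class parameters listed below.

```python
import math
import random


def binary_pso(score_fn, n_features, n_iteration=20, population_size=10,
               c1=2.0, c2=2.0, w=0.9, seed=0):
    """Minimize score_fn over 0/1 feature masks with a basic binary PSO."""
    rng = random.Random(seed)
    positions = [[rng.randint(0, 1) for _ in range(n_features)]
                 for _ in range(population_size)]
    velocities = [[0.0] * n_features for _ in range(population_size)]
    personal_best = [p[:] for p in positions]
    personal_score = [score_fn(p) for p in positions]
    best_idx = min(range(population_size), key=personal_score.__getitem__)
    global_best = personal_best[best_idx][:]
    global_score = personal_score[best_idx]

    for _ in range(n_iteration):
        for i in range(population_size):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                # inertia + pull toward personal best + pull toward global best
                v = (w * velocities[i][d]
                     + c1 * r1 * (personal_best[i][d] - positions[i][d])
                     + c2 * r2 * (global_best[d] - positions[i][d]))
                v = max(-6.0, min(6.0, v))  # clamp so the sigmoid stays well-behaved
                velocities[i][d] = v
                # sigmoid transfer: velocity becomes the probability of selecting the feature
                prob = 1.0 / (1.0 + math.exp(-v))
                positions[i][d] = 1 if rng.random() < prob else 0
            score = score_fn(positions[i])
            if score < personal_score[i]:
                personal_best[i], personal_score[i] = positions[i][:], score
                if score < global_score:
                    global_best, global_score = positions[i][:], score
    return global_best, global_score


# Toy objective (hypothetical): number of bits that differ from a known target mask.
target = [1, 0, 1, 1, 0, 0]


def toy_score(mask):
    return sum(m != t for m, t in zip(mask, target))


best_mask, best_score = binary_pso(toy_score, n_features=len(target))
```

In a real wrapper setting, `score_fn` would train the model on the selected columns and return the validation metric, which is why each evaluation is expensive.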
class zoofs.ParticleSwarmOptimization(objective_function,n_iteration=50,population_size=50,minimize=True,c1=2,c2=2,w=0.9)
Parameters  objective_function : user made function of the signature 'func(model,X_train,y_train,X_test,y_test)'.
n_iteration : int, default=1000
timeout : int = None
population_size : int, default=50
minimize : bool, default=True
c1 : float, default=2.0
c2 : float, default=2.0
w : float, default=0.9

Attributes  best_feature_list : array-like

Methods
Method  Description

fit  Run the algorithm
plot_history  Plot results achieved across iterations
fit(model,X_train,y_train,X_valid,y_valid,verbose=True)
Parameters  model :
X_train : pandas.core.frame.DataFrame of shape (n_samples, n_features)
y_train : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples)
X_valid : pandas.core.frame.DataFrame of shape (n_samples, n_features)
y_valid : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples)
verbose : bool,default=True

Returns  best_feature_list : array-like

plot_history()
Plot results across iterations
Example
from sklearn.metrics import log_loss

# define your own objective function; make sure it receives five parameters
# (model, X_train, y_train, X_valid, y_valid), fits your model and returns the objective value
def objective_function_topass(model, X_train, y_train, X_valid, y_valid):
    model.fit(X_train, y_train)
    P = log_loss(y_valid, model.predict_proba(X_valid))
    return P

# import an algorithm
from zoofs import ParticleSwarmOptimization
# create object of algorithm
algo_object = ParticleSwarmOptimization(objective_function_topass, n_iteration=20,
                                        population_size=20, minimize=True,
                                        c1=2, c2=2, w=0.9)
import lightgbm as lgb
lgb_model = lgb.LGBMClassifier()
# fit the algorithm (X_train, y_train, X_valid, y_valid are your own pandas data)
algo_object.fit(lgb_model, X_train, y_train, X_valid, y_valid, verbose=True)
# plot your results
algo_object.plot_history()
Grey Wolf Algorithm
The Grey Wolf Optimizer (GWO) mimics the leadership hierarchy and hunting mechanism of grey wolves in nature. Four types of grey wolves (alpha, beta, delta, and omega) are employed to simulate the leadership hierarchy. In addition, the three main steps of hunting (searching for prey, encircling prey, and attacking prey) are implemented to perform optimization.
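A rough sketch of the hierarchy-driven update follows. It is a simplified, assumption-laden illustration rather than the zoofs code: `grey_wolf_search` is a hypothetical name, positions are kept continuous in [0, 1] and thresholded at 0.5 to obtain a feature mask, and the coefficient `a` shrinks linearly from 2 to 0 so that the pack moves from searching to attacking.

```python
import random


def grey_wolf_search(score_fn, n_features, n_iteration=30, population_size=8, seed=0):
    """Minimize score_fn over feature masks with a basic grey wolf update."""
    rng = random.Random(seed)

    def to_mask(pos):
        return [1 if x > 0.5 else 0 for x in pos]

    def fitness(pos):
        return score_fn(to_mask(pos))

    wolves = [[rng.random() for _ in range(n_features)]
              for _ in range(population_size)]
    best = min(wolves, key=fitness)[:]

    for it in range(n_iteration):
        # alpha, beta, delta: the three best wolves lead the hunt (copied before updating)
        alpha, beta, delta = [w[:] for w in sorted(wolves, key=fitness)[:3]]
        a = 2.0 * (1 - it / n_iteration)  # shrinks from 2 toward 0
        for wolf in wolves:
            for d in range(n_features):
                estimates = []
                for leader in (alpha, beta, delta):
                    A = 2 * a * rng.random() - a      # |A| > 1 favours searching
                    C = 2 * rng.random()
                    distance = abs(C * leader[d] - wolf[d])  # encircling the prey
                    estimates.append(leader[d] - A * distance)
                # move toward the average of the three leader-based estimates
                wolf[d] = min(1.0, max(0.0, sum(estimates) / 3))
        candidate = min(wolves, key=fitness)
        if fitness(candidate) < fitness(best):
            best = candidate[:]

    mask = to_mask(best)
    return mask, score_fn(mask)


# Toy objective (hypothetical): number of bits that differ from a known target mask.
target = [1, 0, 1, 1, 0, 0]


def toy_score(mask):
    return sum(m != t for m, t in zip(mask, target))


best_mask, best_score = grey_wolf_search(toy_score, n_features=len(target))
```

The omega wolves have no special role in this sketch; every wolf, including the leaders, is repositioned relative to the alpha, beta, and delta positions.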
class zoofs.GreyWolfOptimization(objective_function,n_iteration=50,population_size=50,method=1,minimize=True)
Parameters  objective_function : user made function of the signature 'func(model,X_train,y_train,X_test,y_test)'.
n_iteration : int, default=50
timeout : int = None
population_size : int, default=50
method : {1, 2}, default=1
minimize : bool, default=True

Attributes  best_feature_list : array-like

Methods
Method  Description

fit  Run the algorithm
plot_history  Plot results achieved across iterations
fit(model,X_train,y_train,X_valid,y_valid,verbose=True)
Parameters  model :
X_train : pandas.core.frame.DataFrame of shape (n_samples, n_features)
y_train : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples)
X_valid : pandas.core.frame.DataFrame of shape (n_samples, n_features)
y_valid : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples)
verbose : bool,default=True

Returns  best_feature_list : array-like

plot_history()
Plot results across iterations
Example
from sklearn.metrics import log_loss

# define your own objective function; make sure it receives five parameters
# (model, X_train, y_train, X_valid, y_valid), fits your model and returns the objective value
def objective_function_topass(model, X_train, y_train, X_valid, y_valid):
    model.fit(X_train, y_train)
    P = log_loss(y_valid, model.predict_proba(X_valid))
    return P

# import an algorithm
from zoofs import GreyWolfOptimization
# create object of algorithm
algo_object = GreyWolfOptimization(objective_function_topass, n_iteration=20, method=1,
                                   population_size=20, minimize=True)
import lightgbm as lgb
lgb_model = lgb.LGBMClassifier()
# fit the algorithm (X_train, y_train, X_valid, y_valid are your own pandas data)
algo_object.fit(lgb_model, X_train, y_train, X_valid, y_valid, verbose=True)
# plot your results
algo_object.plot_history()
Dragon Fly Algorithm
The main inspiration of the Dragonfly Algorithm (DA) originates from static and dynamic swarming behaviours. These two swarming behaviours are very similar to the two main phases of optimization using metaheuristics: exploration and exploitation. Dragonflies create sub-swarms and fly over different areas in a static swarm, which is the main objective of the exploration phase. In the dynamic swarm, however, dragonflies fly in bigger swarms and along one direction, which is favourable in the exploitation phase.
class zoofs.DragonFlyOptimization(objective_function,n_iteration=50,population_size=50,method='sinusoidal',minimize=True)
Parameters  objective_function : user made function of the signature 'func(model,X_train,y_train,X_test,y_test)'.
n_iteration : int, default=50
timeout : int = None
population_size : int, default=50
method : {'linear','random','quadraic','sinusoidal'}, default='sinusoidal'
minimize : bool, default=True

Attributes  best_feature_list : array-like

Methods
Method  Description

fit  Run the algorithm
plot_history  Plot results achieved across iterations
fit(model,X_train,y_train,X_valid,y_valid,verbose=True)
Parameters  model :
X_train : pandas.core.frame.DataFrame of shape (n_samples, n_features)
y_train : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples)
X_valid : pandas.core.frame.DataFrame of shape (n_samples, n_features)
y_valid : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples)
verbose : bool,default=True

Returns  best_feature_list : array-like

plot_history()
Plot results across iterations
Example
from sklearn.metrics import log_loss

# define your own objective function; make sure it receives five parameters
# (model, X_train, y_train, X_valid, y_valid), fits your model and returns the objective value
def objective_function_topass(model, X_train, y_train, X_valid, y_valid):
    model.fit(X_train, y_train)
    P = log_loss(y_valid, model.predict_proba(X_valid))
    return P

# import an algorithm
from zoofs import DragonFlyOptimization
# create object of algorithm
algo_object = DragonFlyOptimization(objective_function_topass, n_iteration=20,
                                    method='sinusoidal', population_size=20, minimize=True)
import lightgbm as lgb
lgb_model = lgb.LGBMClassifier()
# fit the algorithm (X_train, y_train, X_valid, y_valid are your own pandas data)
algo_object.fit(lgb_model, X_train, y_train, X_valid, y_valid, verbose=True)
# plot your results
algo_object.plot_history()
Harris Hawk Optimization
HHO is a popular swarm-based, gradient-free optimization algorithm with several active and time-varying phases of exploration and exploitation. The algorithm was first published in the journal Future Generation Computer Systems (FGCS) in 2019 and has since gained increasing attention among researchers due to its flexible structure, high performance, and high-quality results. The main logic of the HHO method is based on the cooperative behaviour and chasing style of Harris' hawks in nature, called the "surprise pounce". There are many suggestions on how to enhance the functionality of HHO, and several enhanced variants have been published in leading Elsevier and IEEE Transactions journals.
class zoofs.HarrisHawkOptimization(objective_function,n_iteration=50,population_size=50,minimize=True,beta=0.5)
Parameters  objective_function : user made function of the signature 'func(model,X_train,y_train,X_test,y_test)'.
n_iteration : int, default=1000
timeout : int = None
population_size : int, default=50
minimize : bool, default=True
beta : float, default=0.5

Attributes  best_feature_list : array-like

Methods
Method  Description

fit  Run the algorithm
plot_history  Plot results achieved across iterations
fit(model,X_train,y_train,X_valid,y_valid,verbose=True)
Parameters  model :
X_train : pandas.core.frame.DataFrame of shape (n_samples, n_features)
y_train : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples)
X_valid : pandas.core.frame.DataFrame of shape (n_samples, n_features)
y_valid : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples)
verbose : bool,default=True

Returns  best_feature_list : array-like

plot_history()
Plot results across iterations
Example
from sklearn.metrics import log_loss

# define your own objective function; make sure it receives five parameters
# (model, X_train, y_train, X_valid, y_valid), fits your model and returns the objective value
def objective_function_topass(model, X_train, y_train, X_valid, y_valid):
    model.fit(X_train, y_train)
    P = log_loss(y_valid, model.predict_proba(X_valid))
    return P

# import an algorithm
from zoofs import HarrisHawkOptimization
# create object of algorithm
algo_object = HarrisHawkOptimization(objective_function_topass, n_iteration=20,
                                     population_size=20, minimize=True)
import lightgbm as lgb
lgb_model = lgb.LGBMClassifier()
# fit the algorithm (X_train, y_train, X_valid, y_valid are your own pandas data)
algo_object.fit(lgb_model, X_train, y_train, X_valid, y_valid, verbose=True)
# plot your results
algo_object.plot_history()
Genetic Algorithm
In computer science and operations research, a genetic algorithm (GA) is a metaheuristic inspired by the process of natural selection that belongs to the larger class of evolutionary algorithms (EA). Genetic algorithms are commonly used to generate high-quality solutions to optimization and search problems by relying on biologically inspired operators such as mutation, crossover and selection. Some examples of GA applications include optimizing decision trees for better performance, automatically solving sudoku puzzles, and hyperparameter optimization.
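The three operators named above can be sketched as follows. This is a minimal illustration, not the zoofs implementation: `genetic_search` is a hypothetical name, and the sketch assumes tournament selection (with `selective_pressure` used as the tournament size), one-point crossover, and bit-flip mutation, which may differ from what the library does internally. The parameter names echo the class parameters listed below.

```python
import random


def genetic_search(score_fn, n_features, n_iteration=20, population_size=20,
                   selective_pressure=2, elitism=2, mutation_rate=0.05, seed=0):
    """Minimize score_fn over 0/1 feature masks with a basic genetic algorithm."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(population_size)]

    def select():
        # tournament selection: the best of `selective_pressure` random individuals
        return min(rng.sample(population, selective_pressure), key=score_fn)

    for _ in range(n_iteration):
        ranked = sorted(population, key=score_fn)
        # elitism: the best individuals survive unchanged
        next_gen = [ind[:] for ind in ranked[:elitism]]
        while len(next_gen) < population_size:
            p1, p2 = select(), select()
            cut = rng.randint(1, n_features - 1)   # one-point crossover
            child = p1[:cut] + p2[cut:]
            # bit-flip mutation: each gene flips with probability mutation_rate
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_gen.append(child)
        population = next_gen

    best = min(population, key=score_fn)
    return best, score_fn(best)


# Toy objective (hypothetical): number of bits that differ from a known target mask.
target = [1, 0, 1, 1, 0, 0]


def toy_score(mask):
    return sum(m != t for m, t in zip(mask, target))


best_mask, best_score = genetic_search(toy_score, n_features=len(target))
```

Because the elite individuals are copied forward untouched, the best score in the population can never get worse from one generation to the next.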
class zoofs.GeneticOptimization(objective_function,n_iteration=20,population_size=20,selective_pressure=2,elitism=2,mutation_rate=0.05,minimize=True)
Parameters  objective_function : user made function of the signature 'func(model,X_train,y_train,X_test,y_test)'.
n_iteration : int, default=50
timeout : int = None
population_size : int, default=50
selective_pressure : int, default=2
elitism : int, default=2
mutation_rate : float, default=0.05
minimize : bool, default=True

Attributes  best_feature_list : array-like

Methods
Method  Description

fit  Run the algorithm
plot_history  Plot results achieved across iterations
fit(model,X_train,y_train,X_valid,y_valid,verbose=True)
Parameters  model :
X_train : pandas.core.frame.DataFrame of shape (n_samples, n_features)
y_train : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples)
X_valid : pandas.core.frame.DataFrame of shape (n_samples, n_features)
y_valid : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples)
verbose : bool,default=True

Returns  best_feature_list : array-like

plot_history()
Plot results across iterations
Example
from sklearn.metrics import log_loss

# define your own objective function; make sure it receives five parameters
# (model, X_train, y_train, X_valid, y_valid), fits your model and returns the objective value
def objective_function_topass(model, X_train, y_train, X_valid, y_valid):
    model.fit(X_train, y_train)
    P = log_loss(y_valid, model.predict_proba(X_valid))
    return P

# import an algorithm
from zoofs import GeneticOptimization
# create object of algorithm
algo_object = GeneticOptimization(objective_function_topass, n_iteration=20,
                                  population_size=20, selective_pressure=2, elitism=2,
                                  mutation_rate=0.05, minimize=True)
import lightgbm as lgb
lgb_model = lgb.LGBMClassifier()
# fit the algorithm (X_train, y_train, X_valid, y_valid are your own pandas data)
algo_object.fit(lgb_model, X_train, y_train, X_valid, y_valid, verbose=True)
# plot your results
algo_object.plot_history()
Gravitational Algorithm
The Gravitational Algorithm is based on the law of gravity and mass interactions. In the algorithm, the searcher agents are a collection of masses that interact with each other according to Newtonian gravity and the laws of motion.
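To make the mass interaction concrete, here is a small sketch of a single acceleration step. It is an illustration under assumptions, not the zoofs code: `gravitational_accelerations` is a hypothetical helper, masses are derived from scores so that better (lower) scores are heavier, and `g` and `eps` play the roles suggested by the `g0` and `eps` parameters listed below (a gravitational constant and a small term that avoids division by zero). In a full search, `g` would typically decay over the iterations.

```python
import math
import random


def gravitational_accelerations(positions, scores, g, eps=0.5, rng=random):
    """Acceleration of each agent under the pull of all others (minimization)."""
    best, worst = min(scores), max(scores)
    if best == worst:
        raw = [1.0] * len(scores)
    else:
        # lower score -> larger raw mass; the worst agent gets mass 0
        raw = [(worst - s) / (worst - best) for s in scores]
    total = sum(raw)
    masses = [m / total for m in raw]

    n, dims = len(positions), len(positions[0])
    acc = [[0.0] * dims for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dist = math.dist(positions[i], positions[j])
            for d in range(dims):
                # force divided by the agent's own mass, so only masses[j] remains;
                # the random factor keeps the search stochastic
                pull = g * masses[j] * (positions[j][d] - positions[i][d]) / (dist + eps)
                acc[i][d] += rng.random() * pull
    return acc


# Two agents on a line: the one at 0.0 has the better (lower) score.
positions = [[0.0], [1.0]]
scores = [0.0, 1.0]
acc = gravitational_accelerations(positions, scores, g=100.0, eps=0.5,
                                  rng=random.Random(0))
```

Here the better agent is not pulled at all because the other agent has zero mass, while the worse agent is accelerated toward the better one.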
class zoofs.GravitationalOptimization(objective_function,n_iteration=50,population_size=50,g0=100,eps=0.5,minimize=True)
Parameters  objective_function : user made function of the signature 'func(model,X_train,y_train,X_test,y_test)'.
n_iteration : int, default=50
timeout : int = None
population_size : int, default=50
g0 : float, default=100
eps : float, default=0.5
minimize : bool, default=True

Attributes  best_feature_list : array-like

Methods
Method  Description

fit  Run the algorithm
plot_history  Plot results achieved across iterations
fit(model,X_train,y_train,X_valid,y_valid,verbose=True)
Parameters  model :
X_train : pandas.core.frame.DataFrame of shape (n_samples, n_features)
y_train : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples)
X_valid : pandas.core.frame.DataFrame of shape (n_samples, n_features)
y_valid : pandas.core.frame.DataFrame or pandas.core.series.Series of shape (n_samples)
verbose : bool,default=True

Returns  best_feature_list : array-like

plot_history()
Plot results across iterations
Example
from sklearn.metrics import log_loss

# define your own objective function; make sure it receives five parameters
# (model, X_train, y_train, X_valid, y_valid), fits your model and returns the objective value
def objective_function_topass(model, X_train, y_train, X_valid, y_valid):
    model.fit(X_train, y_train)
    P = log_loss(y_valid, model.predict_proba(X_valid))
    return P

# import an algorithm
from zoofs import GravitationalOptimization
# create object of algorithm
algo_object = GravitationalOptimization(objective_function_topass, n_iteration=50,
                                        population_size=50, g0=100, eps=0.5, minimize=True)
import lightgbm as lgb
lgb_model = lgb.LGBMClassifier()
# fit the algorithm (X_train, y_train, X_valid, y_valid are your own pandas data)
algo_object.fit(lgb_model, X_train, y_train, X_valid, y_valid, verbose=True)
# plot your results
algo_object.plot_history()
Support zoofs
The development of zoofs relies completely on contributions.
Contributing
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
Please make sure to update tests as appropriate.
First roll out
18,08,2021
License