A Python package that automates algorithm selection and hyperparameter tuning for the recommender system library Surprise
Auto-Surprise is built as a wrapper around the Python Surprise recommender-system library. It automates algorithm selection and hyperparameter optimization in a highly parallelized manner.
- Documentation is available at Auto-Surprise ReadTheDocs
- Auto-Surprise is currently in development.
Auto-Surprise is easy to install with pip. You will need Python >= 3.6 on a Linux system. Windows is not currently supported natively, but Auto-Surprise can be used through WSL.
$ pip install auto-surprise
Basic usage of Auto-Surprise is given below.
from surprise import Dataset
from auto_surprise.engine import Engine

# Load the dataset
data = Dataset.load_builtin('ml-100k')

# Initialize the Auto-Surprise engine
engine = Engine(verbose=True)

# Start the trainer
best_algo, best_params, best_score, tasks = engine.train(
    data=data,
    target_metric='test_rmse',
    cpu_time_limit=60 * 60,
    max_evals=100
)
In the above example, we first initialize the Engine. We then run engine.train() to begin training our model. To train the model we need to pass the following:
- data: The data as an instance of surprise.dataset.DatasetAutoFolds. Please read the Surprise Dataset docs. A sketch of loading a custom dataset is shown after this list.
- target_metric: The metric we seek to minimize. Available options are test_rmse and test_mae.
- cpu_time_limit: The time limit for training, in seconds. For datasets like MovieLens 100K, one hour is sufficient, but you may want to increase this based on the size of your dataset.
- max_evals: The maximum number of evaluations each algorithm gets for hyperparameter optimization.
- hpo_algo: Auto-Surprise uses Hyperopt for hyperparameter tuning. By default, it's set to use TPE, but you can change this to any algorithm supported by Hyperopt, such as adaptive TPE or random search.
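As a hedged sketch of the data requirement above, here is one way a custom dataset could be passed instead of a built-in one, using Surprise's Reader and Dataset.load_from_df. The DataFrame and its column names are placeholders for illustration, not part of Auto-Surprise.

import pandas as pd
from surprise import Dataset, Reader
from auto_surprise.engine import Engine

# Tiny placeholder ratings DataFrame; use a real ratings set in practice
ratings_df = pd.DataFrame({
    'userID': [1, 1, 2, 2, 3],
    'itemID': [10, 20, 10, 30, 20],
    'rating': [4.0, 3.0, 5.0, 2.0, 4.5],
})

# Build a DatasetAutoFolds instance from the DataFrame
reader = Reader(rating_scale=(1, 5))
data = Dataset.load_from_df(ratings_df[['userID', 'itemID', 'rating']], reader)

# Train exactly as in the basic usage example above
engine = Engine(verbose=True)
best_algo, best_params, best_score, tasks = engine.train(
    data=data,
    target_metric='test_rmse',
    cpu_time_limit=60 * 60,
    max_evals=100
)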
Setting the Hyperparameter Optimization Algorithm
Auto-Surprise uses Hyperopt for hyperparameter optimization. You can change the HPO algorithm as shown below.
# Example for setting the HPO algorithm to adaptive TPE
import hyperopt

...

engine = Engine(verbose=True)
engine.train(
    data=data,
    target_metric='test_rmse',
    cpu_time_limit=60 * 60,
    max_evals=100,
    hpo_algo=hyperopt.atpe.suggest
)
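Similarly, random search can be selected with Hyperopt's rand module; a minimal sketch, assuming the same data and Engine setup as above:

# Use Hyperopt's random search instead of adaptive TPE
import hyperopt

engine.train(
    data=data,
    target_metric='test_rmse',
    cpu_time_limit=60 * 60,
    max_evals=100,
    hpo_algo=hyperopt.rand.suggest
)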
Building back the best model
You can build a picklable model as shown.
model = engine.build_model(best_algo, best_params)
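As a hedged sketch of how the built model might then be used (assuming the data, model, best_algo, and best_params from the examples above), you can fit it on the full trainset with standard Surprise methods and persist it with pickle:

import pickle

# Fit the rebuilt model on the full trainset (standard Surprise workflow)
trainset = data.build_full_trainset()
model.fit(trainset)

# Sample prediction for raw user and item ids from ML-100K
prediction = model.predict(uid='196', iid='302')
print(prediction.est)

# Persist the trained model
with open('best_model.pkl', 'wb') as f:
    pickle.dump(model, f)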
In my testing, Auto-Surprise achieved anywhere from a 0.8% to 4% improvement in RMSE compared to the best-performing default algorithm configuration. The table below shows the results for the Jester 2 dataset. Benchmark results for the MovieLens and Book-Crossing datasets are also available here.
| Algorithm | RMSE | MAE | Time |
| --- | --- | --- | --- |
| KNN with Means | 5.124 | 3.955 | 00:02:16 |
| KNN with Z-score | 5.219 | 3.955 | 00:02:20 |