
A python package that automates algorithm selection and hyperparameter tuning for the recommender system library Surprise


Auto-Surprise

Auto-Surprise is built as a wrapper around the Python Surprise recommender-system library. It automates algorithm selection and hyperparameter optimization in a highly parallelized manner.

Setup

Auto-Surprise is easy to install with pip and requires Python >= 3.6 on a Linux system. Windows is not supported natively, but Auto-Surprise can be used through WSL.

$ pip install auto-surprise

Usage

Basic usage of Auto-Surprise is shown below.

from surprise import Dataset
from auto_surprise.engine import Engine

# Load the dataset
data = Dataset.load_builtin('ml-100k')

# Initialize the Auto-Surprise engine
engine = Engine(verbose=True)

# Start the trainer
best_algo, best_params, best_score, tasks = engine.train(
    data=data, 
    target_metric='test_rmse', 
    cpu_time_limit=60 * 60, 
    max_evals=100
)

In the example above, we first initialize the Engine and then run engine.train() to begin training. engine.train() takes the following arguments:

  • data: The dataset, as an instance of surprise.dataset.DatasetAutoFolds. See the Surprise Dataset docs.
  • target_metric: The metric to minimize. Available options are test_rmse and test_mae.
  • cpu_time_limit: The training time limit, in seconds. For a dataset like Movielens 100K, 1 hour is sufficient, but you may want to increase this for larger datasets.
  • max_evals: The maximum number of evaluations each algorithm gets for hyperparameter optimization.
  • hpo_algo: Auto-Surprise uses Hyperopt for hyperparameter tuning. By default it uses TPE, but you can change this to any algorithm supported by Hyperopt, such as Adaptive TPE or random search.

Setting the Hyperparameter Optimization Algorithm

Auto-Surprise uses Hyperopt. You can change the HPO algo as shown below.

# Example for setting the HPO algorithm to adaptive TPE
import hyperopt

...

engine = Engine(verbose=True)
engine.train(
    data=data,
    target_metric='test_rmse',
    cpu_time_limit=60 * 60,
    max_evals=100,
    hpo_algo=hyperopt.atpe.suggest
)

Building back the best model

You can build a picklable model as shown below.

model = engine.build_model(best_algo, best_params)

Benchmarks

In my testing, Auto-Surprise achieved anywhere from a 0.8% to 4% improvement in RMSE over the best-performing default algorithm configuration. The table below shows results for the Jester 2 dataset. Benchmark results for the Movielens and Book-Crossing datasets are also available here

Algorithm             RMSE     MAE      Time
Normal Predictor      7.277    5.886    00:00:01
SVD                   4.905    3.97     00:00:13
SVD++                 5.102    4.055    00:00:29
NMF                   --       --       --
Slope One             5.189    3.945    00:00:02
KNN Basic             5.078    4.034    00:02:14
KNN with Means        5.124    3.955    00:02:16
KNN with Z-score      5.219    3.955    00:02:20
KNN Baseline          4.898    3.896    00:02:14
Co-clustering         5.153    3.917    00:00:12
Baseline Only         4.849    3.934    00:00:01
GridSearch            4.7409   3.8147   80:52:35
Auto-Surprise (TPE)   4.6489   3.6837   02:00:10
Auto-Surprise (ATPE)  4.6555   3.6906   02:00:01

Papers

Auto-Surprise: An Automated Recommender-System (AutoRecSys) Library with Tree of Parzens Estimator (TPE) Optimization

