An easy-to-use library for recommender systems.

About this repository — This is a community fork of Nicolas Hug’s Surprise. I am not the original author or owner; I forked it so the codebase can be updated regularly (e.g. Python 3.13, NumPy 2.x). All credit goes to Nicolas Hug and the contributors of the original project.

Overview

Surprise is a Python scikit for building and analyzing recommender systems that deal with explicit rating data.

Surprise was designed with the following purposes in mind:

The name SurPRISE (roughly :) ) stands for Simple Python RecommendatIon System Engine.

Features — Easy to use (built-in datasets like Movielens and Jester, or your own), rich set of algorithms (SVD, SVD++, NMF, Slope One, k-NN, Co-Clustering, baselines, etc.), multiple similarity measures (cosine, MSD, Pearson), and scikit-learn–style tools for evaluation and parameter tuning (e.g. GridSearchCV).

Please note that Surprise does not support implicit ratings or content-based information.
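To make the similarity measures named in the features (cosine, MSD, Pearson) a bit more concrete, here is a minimal pure-Python sketch of their textbook definitions, computed over the items two users have both rated. This is an illustration only, not the library's implementation: the helper names (`common_ratings`, `cosine_sim`, `msd_sim`, `pearson_sim`) and the two example users are made up for this sketch, and MSD is turned into a similarity with the common `1 / (msd + 1)` convention.

```python
from math import sqrt


def common_ratings(u, v):
    """Return (rating_u, rating_v) pairs for the items both users rated."""
    shared = sorted(set(u) & set(v))
    return [(u[item], v[item]) for item in shared]


def cosine_sim(u, v):
    """Cosine of the angle between the two users' common-rating vectors."""
    pairs = common_ratings(u, v)
    num = sum(a * b for a, b in pairs)
    den = sqrt(sum(a * a for a, _ in pairs)) * sqrt(sum(b * b for _, b in pairs))
    return num / den if den else 0.0


def msd_sim(u, v):
    """Similarity derived from the mean squared difference of common ratings."""
    pairs = common_ratings(u, v)
    if not pairs:
        return 0.0
    msd = sum((a - b) ** 2 for a, b in pairs) / len(pairs)
    return 1.0 / (msd + 1.0)


def pearson_sim(u, v):
    """Pearson correlation of common ratings (means taken over common items)."""
    pairs = common_ratings(u, v)
    if not pairs:
        return 0.0
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    num = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    den = sqrt(sum((a - mean_a) ** 2 for a, _ in pairs)) * sqrt(
        sum((b - mean_b) ** 2 for _, b in pairs)
    )
    return num / den if den else 0.0


# Hypothetical explicit ratings on a 1-5 scale, keyed by item id.
alice = {"m1": 5, "m2": 3, "m3": 4}
bob = {"m1": 4, "m2": 2, "m3": 3, "m4": 5}

print(cosine_sim(alice, bob), msd_sim(alice, bob), pearson_sim(alice, bob))
```

Bob rates every common item exactly one point lower than Alice, so Pearson sees a perfect linear relationship while MSD penalizes the constant offset. That difference is the usual reason to try more than one measure.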

Getting started, example

Here is a simple example showing how you can (down)load a dataset, split it for 5-fold cross-validation, and compute the MAE and RMSE of the SVD algorithm.

from surprise import SVD
from surprise import Dataset
from surprise.model_selection import cross_validate

# Load the movielens-100k dataset (download it if needed).
data = Dataset.load_builtin('ml-100k')

# Use the famous SVD algorithm.
algo = SVD()

# Run 5-fold cross-validation and print results.
cross_validate(algo, data, measures=['RMSE', 'MAE'], cv=5, verbose=True)

Output:

Evaluating RMSE, MAE of algorithm SVD on 5 split(s).

                  Fold 1  Fold 2  Fold 3  Fold 4  Fold 5  Mean    Std     
RMSE (testset)    0.9367  0.9355  0.9378  0.9377  0.9300  0.9355  0.0029  
MAE (testset)     0.7387  0.7371  0.7393  0.7397  0.7325  0.7375  0.0026  
Fit time          0.62    0.63    0.63    0.65    0.63    0.63    0.01    
Test time         0.11    0.11    0.14    0.14    0.14    0.13    0.02    
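The Mean and Std columns can be reproduced from the five fold values in the table above. The short check below uses only the standard library and assumes Std is a population standard deviation (dividing by the number of folds). That assumption is inferred from the numbers shown, and the fold values are the rounded ones printed above, so agreement is only expected to four decimals.

```python
from statistics import fmean, pstdev

# Per-fold values copied from the output table above.
folds = {
    "RMSE": [0.9367, 0.9355, 0.9378, 0.9377, 0.9300],
    "MAE": [0.7387, 0.7371, 0.7393, 0.7397, 0.7325],
}

# Mean and population standard deviation, rounded like the table.
summary = {
    name: (round(fmean(values), 4), round(pstdev(values), 4))
    for name, values in folds.items()
}

for name, (mean, std) in summary.items():
    print(f"{name}: mean={mean:.4f} std={std:.4f}")
```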

Surprise can do much more (e.g., GridSearchCV)! You'll find more usage examples in the documentation.

Benchmarks

Here are the average RMSE, MAE and total execution time of various algorithms (with their default parameters) on a 5-fold cross-validation procedure. The datasets are the Movielens 100k and 1M datasets. The folds are the same for all the algorithms. All experiments were run on a laptop with an 11th Gen Intel i5 at 2.60GHz. The code for generating these tables can be found in the benchmark example.

Movielens 100k                RMSE    MAE     Time
SVD                           0.934   0.737   0:00:06
SVD++ (cache_ratings=False)   0.919   0.721   0:01:39
SVD++ (cache_ratings=True)    0.919   0.721   0:01:22
NMF                           0.963   0.758   0:00:06
Slope One                     0.946   0.743   0:00:09
k-NN                          0.98    0.774   0:00:08
Centered k-NN                 0.951   0.749   0:00:09
k-NN Baseline                 0.931   0.733   0:00:13
Co-Clustering                 0.963   0.753   0:00:06
Baseline                      0.944   0.748   0:00:02
Random                        1.518   1.219   0:00:01

Movielens 1M                  RMSE    MAE     Time
SVD                           0.873   0.686   0:01:07
SVD++ (cache_ratings=False)   0.862   0.672   0:41:06
SVD++ (cache_ratings=True)    0.862   0.672   0:34:55
NMF                           0.916   0.723   0:01:39
Slope One                     0.907   0.715   0:02:31
k-NN                          0.923   0.727   0:05:27
Centered k-NN                 0.929   0.738   0:05:43
k-NN Baseline                 0.895   0.706   0:05:55
Co-Clustering                 0.915   0.717   0:00:31
Baseline                      0.909   0.719   0:00:19
Random                        1.504   1.206   0:00:19
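The two SVD++ rows differ only in `cache_ratings`, so the tables give a quick way to estimate what that option saves in run time. The snippet below is plain arithmetic on the timings listed above; the helper `to_seconds` is written just for this example.

```python
def to_seconds(hms):
    """Convert an 'H:MM:SS' string from the tables into seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return 3600 * hours + 60 * minutes + seconds


# SVD++ timings from the tables: (cache_ratings=False, cache_ratings=True).
timings = {
    "Movielens 100k": ("0:01:39", "0:01:22"),
    "Movielens 1M": ("0:41:06", "0:34:55"),
}

# Fraction of run time saved by enabling the cache, per dataset.
savings = {
    name: 1 - to_seconds(cached) / to_seconds(uncached)
    for name, (uncached, cached) in timings.items()
}

for name, saved in savings.items():
    print(f"{name}: {saved:.1%} less time with cache_ratings=True")
```

With these timings the saving is roughly 17% on Movielens 100k and 15% on Movielens 1M, with identical RMSE and MAE in both cases.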

Installation

Requirements: Python ≥ 3.13, NumPy ≥ 2.4.2, SciPy ≥ 1.17.0, joblib ≥ 1.5.3.
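If you are unsure whether your environment meets these minimums, a quick check along the following lines may help before building. It is only a sketch: the version parsing is deliberately simple (it compares the leading numeric components and ignores pre-release suffixes), and the helper names are invented for this example.

```python
import re
import sys
from importlib.metadata import PackageNotFoundError, version

# Minimum versions taken from the requirements line above.
MINIMUMS = {"numpy": "2.4.2", "scipy": "1.17.0", "joblib": "1.5.3"}


def as_tuple(version_string):
    """Leading numeric components of a version, e.g. '2.4.2rc1' -> (2, 4, 2)."""
    parts = []
    for piece in version_string.split("."):
        match = re.match(r"\d+", piece)
        if not match:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, required):
    """True if the installed version is at least the required one."""
    return as_tuple(installed) >= as_tuple(required)


def report():
    """Return one status line for Python and for each required package."""
    python_ok = sys.version_info >= (3, 13)
    lines = [f"python {sys.version.split()[0]}: {'ok' if python_ok else 'too old'}"]
    for name, minimum in MINIMUMS.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            lines.append(f"{name}: not installed (need >= {minimum})")
            continue
        ok = meets_minimum(installed, minimum)
        lines.append(f"{name} {installed}: {'ok' if ok else f'need >= {minimum}'}")
    return lines


print("\n".join(report()))
```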

With pip (you'll need a C compiler. Windows users might prefer using conda):

$ pip install scikit-surprise

With conda:

$ conda install -c conda-forge scikit-surprise

For the latest version from this fork, clone the repo and build from source (you'll need Cython and NumPy); replace luissanchez with the fork's GitHub username if different:

$ git clone https://github.com/luissanchez/Surprise.git
$ cd Surprise
$ pip install .

Links

License and reference

This project is licensed under the BSD 3-Clause license, so it can be used for pretty much everything, including commercial applications.

If you find Surprise useful, consider opening an issue to share how you use it!

Please make sure to cite the paper if you use Surprise for your research:

@article{Hug2020,
  doi = {10.21105/joss.02174},
  url = {https://doi.org/10.21105/joss.02174},
  year = {2020},
  publisher = {The Open Journal},
  volume = {5},
  number = {52},
  pages = {2174},
  author = {Nicolas Hug},
  title = {Surprise: A Python library for recommender systems},
  journal = {Journal of Open Source Software}
}

Contributors

The following persons have contributed to Surprise:

ashtou, Abhishek Bhatia, bobbyinfj, caoyi, Chieh-Han Chen, Raphael-Dayan, Олег Демиденко, Charles-Emmanuel Dias, dmamylin, Lauriane Ducasse, Marc Feger, franckjay, Lukas Galke, Tim Gates, Pierre-François Gimenez, Zachary Glassman, Jeff Hale, Nicolas Hug, Janniks, jyesawtellrickson, Doruk Kilitcioglu, Ravi Raju Krishna, lapidshay, Hengji Liu, Ravi Makhija, Maher Malaeb, Manoj K, James McNeilis, Naturale0, nju-luke, Pierre-Louis Pécheux, Jay Qi, Lucas Rebscher, Craig Rodrigues, Skywhat, Hercules Smith, David Stevens, Vesna Tanko, TrWestdoor, Victor Wang, Mike Lee Williams, Jay Wong, Chenchen Xu, YaoZh1918.

Thanks a lot :) !

Development Status

This fork is maintained to keep Surprise working with recent Python and library versions (e.g. Python 3.13, NumPy 2.x). The original author’s last note (from version 1.1.0) was that the official package would focus on bugfixes and maintenance; this fork continues that in a community-driven way.

Recent updates in this fork: Python 3.13 support; NumPy 2.x compatibility (Cython types updated for NumPy 2.0, e.g. in co-clustering).

For bugs, issues, or questions, please use the GitHub project page (or this fork’s issues) so others can benefit from the discussion.
