Implicit feedback-based recommender system pack

irspack

irspack is a Python package to train, evaluate, and optimize recommender systems based on implicit feedback.

While there are already many other great packages for this purpose, I decided to implement my own in order to:

  • Use optuna for more efficient parameter search. In particular, if an early stopping scheme is available, optuna can prune unpromising trials based on the intermediate validation score, which drastically reduces the overall running time for tuning.
  • Use multi-threaded C++ implementations of a number of algorithms (KNN and IALS).
  • Deal with user cold-start scenarios using the CB2CF strategy, which I found very convenient in practice.
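
The pruning idea in the first bullet can be illustrated without optuna itself. The following is a toy sketch, assuming a simple median rule (a trial stops when its intermediate score falls below the median of earlier trials at the same epoch). The names `should_prune` and `run_trials` and the validation curves are made up for illustration; this is not optuna's or irspack's implementation.

```python
from statistics import median
from typing import Dict, List


def should_prune(history: Dict[int, List[float]], step: int, score: float) -> bool:
    """Return True if `score` is below the median of earlier trials at `step`."""
    earlier = history.get(step, [])
    # With no earlier trial to compare against, never prune.
    return bool(earlier) and score < median(earlier)


def run_trials(curves: List[List[float]]) -> List[int]:
    """Replay per-epoch validation scores; return the epochs actually run per trial."""
    history: Dict[int, List[float]] = {}
    epochs_run = []
    for curve in curves:
        n_run = 0
        for step, score in enumerate(curve):
            n_run += 1
            pruned = should_prune(history, step, score)
            history.setdefault(step, []).append(score)
            if pruned:
                break  # stop training this trial early
        epochs_run.append(n_run)
    return epochs_run


# Hypothetical validation scores (higher is better) for three trials.
curves = [
    [0.10, 0.20, 0.30, 0.35],  # first trial: nothing to compare against
    [0.05, 0.06, 0.07, 0.08],  # clearly worse: stopped after the first epoch
    [0.12, 0.25, 0.33, 0.38],  # never below the median: runs to completion
]
epochs_run = run_trials(curves)
```

In this toy run the second trial stops after one epoch instead of four, which is where the saving in tuning time comes from.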

Installation & Optional Dependencies

Prebuilt binaries are available for Linux and macOS with Python >= 3.6. You can install them via

pip install irspack

The binaries have been compiled to use AVX instructions. If you want to use AVX2/AVX512, or your environment does not support AVX, install from source:

CFLAGS="-march=native" pip install git+https://github.com/tohtsky/irspack.git

In that case, you need a reasonably recent C++ compiler (with C++11 support).
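
To decide whether a source build is worthwhile, it can help to check which AVX variants the CPU reports. The snippet below is a small sketch that assumes a Linux x86 machine, where `/proc/cpuinfo` contains a `flags` line. The helper name `simd_flags` and the sample text are made up for illustration and are not part of irspack.

```python
from typing import Set


def simd_flags(cpuinfo_text: str) -> Set[str]:
    """Extract AVX-related flags from the text of /proc/cpuinfo (Linux only)."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            flags = line.split(":", 1)[1].split()
            return {f for f in flags if f.startswith("avx")}
    return set()


# Shortened, made-up sample of a /proc/cpuinfo "flags" line.
sample = "processor\t: 0\nflags\t\t: fpu sse sse2 avx avx2 fma\n"
found = simd_flags(sample)

# On a real Linux machine one could instead read the file:
# with open("/proc/cpuinfo") as f:
#     print(simd_flags(f.read()))
```

An empty result would suggest that the prebuilt AVX binaries will not run on that machine.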

Optional Dependencies

I have also prepared a wrapper class (BPRFMRecommender) to train and optimize the BPR/WARP loss matrix factorization implemented in lightfm. To use it, you have to install lightfm separately, e.g.,

pip install lightfm

If you want to use Mult-VAE and CB2CF features in cold-start scenarios, you'll need the following additional (pip-installable) packages:

Basic Usage

Step 1. Train a recommender

We first represent the user/item interaction as a scipy.sparse matrix. Then we can feed it into our Recommender classes:

import numpy as np
import scipy.sparse as sps
from irspack.recommenders import P3alphaRecommender
from irspack.dataset.movielens import MovieLens100KDataManager

df = MovieLens100KDataManager().read_interaction()
unique_user_id, user_index = np.unique(df.userId, return_inverse=True)
unique_movie_id, movie_index = np.unique(df.movieId, return_inverse=True)
X_interaction = sps.csr_matrix(
  (np.ones(df.shape[0]), (user_index, movie_index))
)

recommender = P3alphaRecommender(X_interaction)
recommender.learn()

# for user 0 (whose userId is unique_user_id[0]),
# compute the masked score (i.e., already seen items have the score of negative infinity)
# of items.
recommender.get_score_remove_seen([0])
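
To make the notion of a "masked score" concrete, here is a toy sketch in plain NumPy/SciPy. The interaction matrix and raw scores are made up, and the masking shown is only an illustration of the idea, not irspack's internal code.

```python
import numpy as np
import scipy.sparse as sps

# Toy interaction matrix: 2 users x 4 items (1 = user interacted with item).
X = sps.csr_matrix(np.array([[1, 0, 1, 0],
                             [0, 1, 0, 0]], dtype=np.float64))

# Made-up raw scores, as some model might produce them.
raw_score = np.array([[0.9, 0.2, 0.8, 0.4],
                      [0.1, 0.7, 0.3, 0.6]])

# Mask already-seen items with negative infinity so they are never recommended.
masked = raw_score.copy()
masked[X.nonzero()] = -np.inf

# Top-2 unseen items per user, best first.
top2 = np.argsort(-masked, axis=1)[:, :2]
```

For the first user, items 0 and 2 have the highest raw scores but were already seen, so the recommendation falls back to items 3 and 1.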

Step 2. Evaluate on a validation set

We have to split the dataset into training and validation sets:

from irspack.split import rowwise_train_test_split
from irspack.evaluator import Evaluator
X_train, X_val = rowwise_train_test_split(
    X_interaction, test_ratio=0.2, random_seed=0
)

# often, X_val is defined for a subset of users.
# `offset` specifies where the validated user blocks begin.
# In this split, X_val is defined for the same users as X_train,
# so offset = 0
evaluator = Evaluator(ground_truth=X_val, offset=0)

recommender = P3alphaRecommender(X_train)
recommender.learn()
evaluator.get_score(recommender)

This will print something like

{
  'appeared_item': 106.0,
  'entropy': 3.840445116672292,
  'gini_index': 0.9794929280523742,
  'hit': 0.8854718981972428,
  'map': 0.11283343078231302,
  'n_items': 1682.0,
  'ndcg': 0.3401244303579389,
  'precision': 0.27560975609756017,
  'recall': 0.19399215770339678,
  'total_user': 943.0,
  'valid_user': 943.0
}
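
As a rough guide to reading some of these numbers, the sketch below computes hit, precision, recall, and NDCG at a cutoff for a single user, using common textbook definitions. The function name `ranking_metrics` and the example data are hypothetical, and the exact definitions and cutoff used by `Evaluator` may differ in detail.

```python
import math
from typing import Dict, List, Set


def ranking_metrics(recommended: List[int], relevant: Set[int], k: int) -> Dict[str, float]:
    """Hit, precision, recall and NDCG at cutoff k for a single user."""
    top_k = recommended[:k]
    gains = [1.0 if item in relevant else 0.0 for item in top_k]
    n_hit = sum(gains)
    # DCG discounts a hit at rank r (0-based) by log2(r + 2).
    dcg = sum(g / math.log2(rank + 2) for rank, g in enumerate(gains))
    # Ideal DCG: all relevant items (up to k) placed at the top.
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(min(k, len(relevant))))
    return {
        "hit": 1.0 if n_hit > 0 else 0.0,
        "precision": n_hit / k,
        "recall": n_hit / len(relevant) if relevant else 0.0,
        "ndcg": dcg / idcg if idcg > 0 else 0.0,
    }


# Made-up example: items 7 and 3 are held out for this user.
metrics = ranking_metrics(recommended=[7, 1, 3, 9], relevant={7, 3}, k=4)
```

Averaging such per-user values over all validated users gives dataset-level figures of the kind shown above.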

Step 3. Optimize the hyperparameters

Now that we can evaluate a recommender's performance against the validation set, we can use the optuna-backed hyperparameter optimizer.

from irspack.optimizers import P3alphaOptimizer

optimizer = P3alphaOptimizer(X_train, evaluator)
best_params, trial_dfs = optimizer.optimize(n_trials=20)

# maximal ndcg around 0.38 ~ 0.39
trial_dfs.ndcg.max()

Of course, we have to hold out another interaction set for testing, and measure the performance of the tuned recommender against that test set. See examples/ for more complete examples.
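
One way to picture the three-way hold-out is to split twice. The sketch below uses a hypothetical stand-in, `toy_rowwise_split`, written in plain NumPy/SciPy so that it runs on its own. It is not irspack's `rowwise_train_test_split`, and the ratios and toy data are arbitrary.

```python
import numpy as np
import scipy.sparse as sps


def toy_rowwise_split(X, test_ratio, seed):
    """Stand-in row-wise split: hold out ~test_ratio of each user's interactions."""
    rng = np.random.default_rng(seed)
    X = sps.csr_matrix(X)
    rows, cols = X.nonzero()
    held_out = np.zeros(rows.shape[0], dtype=bool)
    for u in np.unique(rows):
        idx = np.flatnonzero(rows == u)
        n_test = int(np.floor(idx.shape[0] * test_ratio))
        held_out[rng.choice(idx, size=n_test, replace=False)] = True

    def build(mask):
        # Rebuild a binary matrix of the original shape from the selected entries.
        return sps.csr_matrix(
            (np.ones(int(mask.sum())), (rows[mask], cols[mask])), shape=X.shape
        )

    return build(~held_out), build(held_out)


# Toy data: 3 users x 10 items, every user interacted with every item.
X_all = sps.csr_matrix(np.ones((3, 10)))

# First hold out a test set, then split the remainder into train/validation.
X_rest, X_test = toy_rowwise_split(X_all, test_ratio=0.2, seed=0)
X_train, X_val = toy_rowwise_split(X_rest, test_ratio=0.25, seed=1)
```

With a split like this, tuning would only ever see `X_train` and `X_val`, while `X_test` is kept aside for the final measurement.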

TODOs

  • complete documentation
  • more splitting schemes
  • more benchmark datasets
