Acf

A lightweight recommender engine for implicit feedback datasets

The package implements the algorithm described in the paper Collaborative Filtering for Implicit Feedback Datasets. The algorithm is based on the following ideas:

  • using collaborative filtering with latent factors
  • transforming feedback observations into binary preferences with associated confidence levels
  • using alternating least squares to compute the matrix factorization
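
The three ideas above can be illustrated with a short NumPy sketch. It is not the package's implementation: the function names als_half_step and weighted_loss, the toy matrix R, and the dense per-row solve are assumptions made for illustration, following the update rule given in the paper.

```python
import numpy as np

def als_half_step(fixed, R, alpha, reg_lambda):
    """Solve for one side's factors while the other side (`fixed`) is held constant.

    R has one row per entity being solved for and one column per row of `fixed`.
    """
    n_factors = fixed.shape[1]
    P = (R > 0).astype(float)   # binary preferences p_ui
    C = 1.0 + alpha * R         # confidence levels c_ui = 1 + alpha * r_ui
    solved = np.empty((R.shape[0], n_factors))
    for u in range(R.shape[0]):
        Cu = np.diag(C[u])
        A = fixed.T @ Cu @ fixed + reg_lambda * np.eye(n_factors)
        b = fixed.T @ Cu @ P[u]
        solved[u] = np.linalg.solve(A, b)   # regularized weighted least squares
    return solved

def weighted_loss(X, Y, R, alpha, reg_lambda):
    """Confidence-weighted squared error plus L2 regularization."""
    P = (R > 0).astype(float)
    C = 1.0 + alpha * R
    err = C * (P - X @ Y.T) ** 2
    return err.sum() + reg_lambda * ((X ** 2).sum() + (Y ** 2).sum())

# Toy feedback matrix (hypothetical data): rows are users, columns are items.
R = np.array([[2.0, 0.0, 1.0],
              [0.0, 3.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 1.0, 4.0]])

rng = np.random.default_rng(0)
X = rng.normal(scale=0.1, size=(4, 2))   # user factors
Y = rng.normal(scale=0.1, size=(3, 2))   # item factors

losses = []
for _ in range(10):
    X = als_half_step(Y, R, alpha=35, reg_lambda=1)      # update users, items fixed
    Y = als_half_step(X, R.T, alpha=35, reg_lambda=1)    # update items, users fixed
    losses.append(weighted_loss(X, Y, R, alpha=35, reg_lambda=1))

print(losses)
```

Each half-step minimizes the loss exactly with respect to one factor matrix, so the recorded loss should not increase from one iteration to the next.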

Install

The package requires Python 3.7 or newer; its only dependencies are numpy and pandas. To install it, run

pip install acf

Usage

The following example shows how to train a model and compute predictions.

import acf
import pandas as pd

# assuming the data are in the following format:
# | user_id | item_id | feedback |
# |---------|---------|----------|
# | 2491    | 129     | 2        |

interactions = pd.read_csv('interactions.csv')

engine = acf.Engine(reg_lambda=1, alpha=35, n_factors=2, random_state=0)

engine.fit(interactions,
           user_column='user_id',
           item_column='item_id',
           feedback_column='feedback',
           n_iter=20,
           n_jobs=4)

# get the best 20 recommendations
prediction = engine.predict(user=2491, top_n=20)

# print the training loss recorded at each iteration
print(engine.loss)
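
To try the example without an existing interactions.csv, a small table in the same layout can be built in memory. This is only a sketch: the rows are made up, and the non-negativity check is an assumption based on the confidence formula c_ui = 1 + alpha * r_ui, not a documented requirement of the package.

```python
import io
import pandas as pd

# Made-up rows in the user_id / item_id / feedback layout described above.
csv_text = """user_id,item_id,feedback
2491,129,2
2491,57,1
873,129,5
873,301,1
"""

interactions = pd.read_csv(io.StringIO(csv_text))

# Basic sanity checks before training.
required = ["user_id", "item_id", "feedback"]
missing = [c for c in required if c not in interactions.columns]
assert not missing, f"missing columns: {missing}"
# Assumption: feedback values are non-negative observation counts or strengths.
assert (interactions["feedback"] >= 0).all(), "feedback is expected to be non-negative"

print(interactions)
```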

Model Evaluation

For performance evaluation, the package offers the metrics.mean_rank function, which implements the "mean rank" metric defined by equation 8 in the paper.

The metric is a weighted mean of the percentile ranks of the recommendations, where the weights are the actual feedback values from the user-item matrix R. A percentile rank of rank_ui = 0 means that item i is the first to be recommended for user u, and rank_uj = 1 means that item j is the last to be recommended, so lower values of the metric indicate better recommendations.

interactions_test = pd.read_csv('interactions_test.csv')

print(acf.metrics.mean_rank(interactions=interactions_test,
                            user_column='user_id',
                            item_column='item_id',
                            feedback_column='feedback',
                            engine=engine))
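
To make the definition of the metric concrete, the following NumPy sketch computes it for dense score and feedback matrices. It is an illustration of the formula, not the code behind acf.metrics.mean_rank; the function name mean_rank_sketch and the dense-matrix inputs are assumptions.

```python
import numpy as np

def mean_rank_sketch(scores, feedback):
    """Feedback-weighted mean of percentile ranks.

    scores:   (n_users, n_items) predicted scores, higher means recommended earlier
    feedback: (n_users, n_items) observed test feedback r_ui, used as weights
    """
    n_users, n_items = scores.shape
    # Items ordered by descending score for every user.
    order = np.argsort(-scores, axis=1)
    # positions[u, i] = place of item i in user u's list (0 = first).
    positions = np.empty_like(order)
    rows = np.arange(n_users)[:, None]
    positions[rows, order] = np.arange(n_items)
    # Scale to [0, 1]: 0 is the first recommended item, 1 is the last.
    pct_rank = positions / (n_items - 1)
    return (feedback * pct_rank).sum() / feedback.sum()

scores = np.array([[0.9, 0.5, 0.1]])

# All feedback falls on the top-ranked item.
print(mean_rank_sketch(scores, np.array([[1.0, 0.0, 0.0]])))  # 0.0
# All feedback falls on the bottom-ranked item.
print(mean_rank_sketch(scores, np.array([[0.0, 0.0, 1.0]])))  # 1.0
```

A model that ranks items at random is expected to score about 0.5, so values well below 0.5 suggest the ranking carries useful signal.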

Model Persistence

A trained model can be serialized and stored using joblib or pickle.

To store a model:

import joblib

with open('engine.joblib', 'wb') as f:
    joblib.dump(engine, f)

To load a model:

import joblib

with open('engine.joblib', 'rb') as f:
    engine = joblib.load(f)
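
The same round trip with the standard-library pickle module might look like the sketch below. A plain dictionary stands in for a trained engine so that the snippet runs on its own; the file name engine.pkl is an arbitrary choice.

```python
import pickle
import tempfile
from pathlib import Path

# Stand-in for a trained engine; any picklable Python object behaves the same way here.
model = {"n_factors": 2, "loss": [3.2, 1.1, 0.7]}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "engine.pkl"

    # store the object
    with open(path, "wb") as f:
        pickle.dump(model, f)

    # load it back
    with open(path, "rb") as f:
        restored = pickle.load(f)

print(restored == model)
```

As with any pickle-based format, only load files that come from a trusted source.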

Public API

acf.Engine

acf.core.computation.Engine(reg_lambda=0.1, alpha=40,
                            n_factors=10, random_state=None)

Class exposing the recommender.

  • reg_lambda: regularization strength
  • alpha: gain parameter in feedback-confidence transformation c_ui = 1 + alpha * r_ui
  • n_factors: number of latent factors
  • random_state: initial RNG state

Properties:

  • user_factors: user factor matrix
  • item_factors: item factor matrix
  • loss: training loss history

Methods:

Engine.fit(interactions, user_column, item_column,
           feedback_column, n_iter=20, n_jobs=1)

Trains the model.

  • interactions: dataframe containing user-item feedback
  • user_column: name of the column containing user ids
  • item_column: name of the column containing item ids
  • feedback_column: name of the column containing feedback values
  • n_iter: number of alternating least squares iterations
  • n_jobs: number of parallel jobs

Engine.predict(user, top_n=None)

Predicts the recommendation.

  • user: user identification for whom the prediction is computed
  • top_n: if not None, only the best n items are included in the result

Returns: predicted recommendation score for each item as a pandas.Series
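
Because the result is a pandas.Series, it can be post-processed with ordinary pandas operations. The sketch below uses a hand-made Series in place of a real prediction; that the Series is indexed by item id, and that items the user has already interacted with may appear in it, are both assumptions rather than documented behavior.

```python
import pandas as pd

# Stand-in for the Series returned by Engine.predict (values are made up).
scores = pd.Series({129: 0.91, 57: 0.15, 301: 0.64, 12: 0.33}, name="score")

# Hypothetical list of items the user has already interacted with.
already_seen = [129]

# Drop seen items, then keep the two highest-scoring candidates.
candidates = scores.drop(index=already_seen, errors="ignore")
top2 = candidates.nlargest(2)

print(top2)
```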

acf.metrics.mean_rank

acf.core.metrics.mean_rank(interactions, user_column, item_column,
                           feedback_column, engine)

Computes mean rank evaluation.

  • interactions: dataframe containing user-item feedback
  • user_column: name of the column containing user ids
  • item_column: name of the column containing item ids
  • feedback_column: name of the column containing feedback values
  • engine: trained acf.Engine instance

Returns: the computed mean rank value

Tests

Tests can be run with pytest:

python -m pytest acf/tests
