Deep Recommenders with Python: A Python framework for building Deep Learning based Recommender Systems

DRecPy

Table of Contents

  1. Introduction
  2. Installation
  3. Getting Started
  4. Implemented Models
  5. Benchmarks
  6. License
  7. Contributors
  8. Development Status

Introduction

DRecPy is a Python framework that makes building deep learning based recommender systems easier, by making available various tools to develop and test new models.

The main features DRecPy provides are listed below:

  • Support for in-memory and out-of-memory data sets, by using an intermediary data structure called InteractionDataset.
  • Automatic conversion between raw and internal ids (raw ids being the identifiers present in the provided data sets): even if your data set contains identifiers that are not contiguous integers, a mapping is built automatically. If you're using an already built model, you won't need to use internal ids; if you're developing a model, you won't need to use raw ids.
  • Well-defined workflow for model building.
  • Data set splitting techniques adjusted for the distinct nature of data sets dedicated for recommender systems.
  • Sampling techniques for point-based and list-based models.
  • Evaluation processes for predictive models, as well as for learn-to-rank models.
  • Support for multi-column data sets, i.e. not being limited to (user, item, rating) triples.
  • Automatic plot generation for loss values during model training, as well as test scores during model evaluation.
  • All methods with stochastic factors receive a seed parameter, in order to allow result reproducibility.
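
The raw-to-internal id conversion mentioned in the list above can be pictured with a minimal sketch. This is not DRecPy's implementation: IdMapper and the sample interactions are hypothetical names and data used only to illustrate how arbitrary identifiers might be mapped to contiguous integers and back.

```python
# Illustrative sketch only; IdMapper is a hypothetical helper, not part of DRecPy.


class IdMapper:
    """Maps arbitrary hashable raw ids to contiguous internal ids (0, 1, 2, ...)."""

    def __init__(self):
        self._raw_to_internal = {}
        self._internal_to_raw = []

    def to_internal(self, raw_id):
        # Assign the next free integer the first time a raw id is seen.
        if raw_id not in self._raw_to_internal:
            self._raw_to_internal[raw_id] = len(self._internal_to_raw)
            self._internal_to_raw.append(raw_id)
        return self._raw_to_internal[raw_id]

    def to_raw(self, internal_id):
        # Reverse lookup: internal ids index directly into the list.
        return self._internal_to_raw[internal_id]


# Made-up (user, item, rating) triples with non-integer identifiers.
raw_interactions = [("u42", "movie-7", 5), ("u7", "movie-7", 3), ("u42", "movie-99", 1)]

users, items = IdMapper(), IdMapper()
internal_interactions = [
    (users.to_internal(u), items.to_internal(i), r) for u, i, r in raw_interactions
]
print(internal_interactions)  # [(0, 0, 5), (1, 0, 3), (0, 1, 1)]
print(users.to_raw(1))        # u7
```

With a mapping like this, model code can index embedding tables with internal ids, while users of a finished model keep working with the identifiers found in their data.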

For more information about the framework and its components, please visit the documentation page.

Installation

With pip:

$ pip install drecpy

If you can't get the latest version from PyPI:

$ pip install git+https://github.com/fabioiuri/DRecPy

Or directly by cloning the Git repo:

$ git clone https://github.com/fabioiuri/DRecPy
$ cd DRecPy
$ python setup.py install

Update Version

If you want to update to the newest DRecPy version, use:

$ pip install drecpy --upgrade

Getting Started

Here's an example script that uses one of the implemented recommenders (CDAE): it trains the model with a validation set and then evaluates its ranking performance on the MovieLens 100k data set.

from DRecPy.Recommender import CDAE
from DRecPy.Dataset import get_train_dataset
from DRecPy.Dataset import get_test_dataset
from DRecPy.Evaluation.Processes import ranking_evaluation
from DRecPy.Evaluation.Splits import leave_k_out
from DRecPy.Evaluation.Metrics import ndcg
from DRecPy.Evaluation.Metrics import hit_ratio
import time


ds_train = get_train_dataset('ml-100k')
ds_test = get_test_dataset('ml-100k')
ds_train, ds_val = leave_k_out(ds_train, k=1, min_user_interactions=10)


def epoch_callback_fn(model):
    # Evaluate on the validation set and prefix each metric name with 'val_'.
    return {'val_' + metric: v for metric, v in
            ranking_evaluation(model, ds_val, n_pos_interactions=1, n_neg_interactions=100,
                               generate_negative_pairs=True, k=10, verbose=False, seed=10,
                               metrics={'HR': (hit_ratio, {}), 'NDCG': (ndcg, {})}).items()}


start_train = time.time()
cdae = CDAE(hidden_factors=50, corruption_level=0.2, loss='bce', seed=10)
cdae.fit(ds_train, learning_rate=0.001, reg_rate=0.001, epochs=80, batch_size=64, neg_ratio=5,
         epoch_callback_fn=epoch_callback_fn, epoch_callback_freq=20)
print("Training took", time.time() - start_train)

print(ranking_evaluation(cdae, ds_test, k=[1, 5, 10], novelty=True, n_pos_interactions=1, 
                         n_neg_interactions=100, generate_negative_pairs=True, seed=10, 
                         max_concurrent_threads=4, verbose=True))

Output:

[CDAE] Max. interaction value: 5
[CDAE] Min. interaction value: 0
[CDAE] Interaction threshold value: 0
[CDAE] Number of unique users: 943
[CDAE] Number of unique items: 1680
[CDAE] Number of training points: 89627
[CDAE] Sparsity level: approx. 94.3426%
[CDAE] Creating auxiliary structures...
[CDAE] Model fitted.
Training took 1620.2718272209167

{'P@1': 0.141, 'P@5': 0.0793, 'P@10': 0.0591, 'R@1': 0.141, 'R@5': 0.3966, 'R@10': 0.5907, 
'HR@1': 0.141, 'HR@5': 0.3966, 'HR@10': 0.5907, 'NDCG@1': 0.141, 'NDCG@5': 0.2701, 'NDCG@10': 0.3327, 
'RR@1': 0.141, 'RR@5': 0.2286, 'RR@10': 0.2543, 'AP@1': 0.141, 'AP@5': 0.2286, 'AP@10': 0.2543}
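
The evaluation above ranks one held-out positive item against 100 sampled negatives per user. The sketch below shows the standard definitions of hit ratio and NDCG in that single-positive setting; it is an illustration under that assumption, not DRecPy's own metric code, and the function names and the ranks list are hypothetical.

```python
import math

# Illustrative sketch: standard HR@k and NDCG@k when each user has exactly one
# relevant (held-out) item. Not taken from DRecPy's implementation.


def hit_ratio_at_k(rank, k):
    """1.0 if the positive item is within the top k (rank is 1-based), else 0.0."""
    return 1.0 if rank <= k else 0.0


def ndcg_at_k(rank, k):
    """With a single relevant item the ideal DCG is 1, so NDCG = 1 / log2(rank + 1)."""
    return 1.0 / math.log2(rank + 1) if rank <= k else 0.0


# Hypothetical 1-based ranks of the held-out item for four users.
ranks = [1, 3, 12, 2]

for k in (1, 5, 10):
    hr = sum(hit_ratio_at_k(r, k) for r in ranks) / len(ranks)
    ndcg = sum(ndcg_at_k(r, k) for r in ranks) / len(ranks)
    print(f"HR@{k}={hr:.4f}  NDCG@{k}={ndcg:.4f}")
```

Under these definitions HR@1 and NDCG@1 coincide, recall equals the hit ratio, and precision is the hit ratio divided by k, which agrees with the reported scores (for example, P@5 = 0.0793 is approximately HR@5 / 5 = 0.3966 / 5).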

Generated Plots:

  • Training (figure: CDAE Training Performance)

  • Evaluation (figure: CDAE Evaluation Performance)

More quick and easy examples are available here.

Implemented Models

Recommender Type    Name
Learn-to-rank       CDAE (Collaborative Denoising Auto-Encoder)
Learn-to-rank       DMF (Deep Matrix Factorization)

Implemented Baselines (non deep learning based)

Recommender Type    Name
Predictive          User/Item KNN

Benchmarks

TODO

License

Check LICENCE.md.

Contributors

This work was conducted under the supervision of Prof. Francisco M. Couto. During the initial development phase, the project was financially supported by an FCT research scholarship (UID/CEC/00408/2019), under the research institution LASIGE, from the Faculty of Sciences, University of Lisbon.

Development Status

Project in pre-alpha stage.

Planned work:

  • Wrap up missing documentation
  • Implement more models
  • Implement list-wise sampling strategy
  • Refine and clean unit tests

If you have any bugs to report or suggestions for updates, you can use DRecPy's GitHub issues page or email me directly at fabioiuri@live.com.
