Deep Recommenders with Python: a Python framework for building deep learning based recommender systems

GitHub version Documentation Status License: MIT

Introduction

DRecPy is a Python framework that makes building deep learning based recommender systems easier, by making available various tools to develop and test new models.

The main features DRecPy provides are listed below:

  • Support for in-memory and out-of-memory data sets, by using an intermediary data structure called InteractionDataset.
  • Automatic conversion between raw ids (the identifiers present in the provided data sets) and internal ids: even if your data set contains identifiers that are not contiguous integers, a mapping is built automatically. If you're using an already built model, you won't need to handle internal ids; if you're developing a model, you won't need to handle raw ids.
  • Well-defined workflow for model building.
  • Data set splitting techniques adapted to the distinct nature of recommender system data sets.
  • Sampling techniques for point-based and list-based models.
  • Evaluation processes for predictive models, as well as for learn-to-rank models.
  • Support for multi-column data sets, i.e. not being limited to (user, item, rating) triples.
  • Automatic plot generation for loss values during model training, as well as test scores during model evaluation.
  • All methods with stochastic factors receive a seed parameter, in order to allow result reproducibility.
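
To illustrate the raw-to-internal id conversion mentioned in the list above, here is a minimal sketch in plain Python. It does not use DRecPy's actual implementation: the `IdMapper` class and the sample ids are hypothetical, and the only assumption is that internal ids are contiguous integers assigned in order of first appearance.

```python
class IdMapper:
    """Hypothetical helper (not part of DRecPy): raw ids <-> contiguous internal ids."""

    def __init__(self):
        self._raw_to_internal = {}  # raw id -> internal id
        self._internal_to_raw = []  # internal id (list index) -> raw id

    def to_internal(self, raw_id):
        # Assign the next free integer the first time a raw id is seen.
        if raw_id not in self._raw_to_internal:
            self._raw_to_internal[raw_id] = len(self._internal_to_raw)
            self._internal_to_raw.append(raw_id)
        return self._raw_to_internal[raw_id]

    def to_raw(self, internal_id):
        # Reverse lookup: internal ids index directly into the list.
        return self._internal_to_raw[internal_id]


users = IdMapper()
raw_user_ids = ["u-907", "u-12", "u-907", "u-55"]  # made-up raw ids
internal_ids = [users.to_internal(raw) for raw in raw_user_ids]

print(internal_ids)     # [0, 1, 0, 2]
print(users.to_raw(2))  # u-55
```

With a mapping like this, model code can index embedding tables or matrices by internal id, while callers of a finished model keep passing the identifiers found in their data.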

For more information about the framework and its components, please visit the documentation page.

Getting Started

Here's an example script using one of the implemented recommenders (CDAE), to train and evaluate its ranking performance on the MovieLens 100k data set.

from DRecPy.Recommender import CDAE
from DRecPy.Dataset import get_train_dataset
from DRecPy.Dataset import get_test_dataset
from DRecPy.Evaluation import ranking_evaluation
from DRecPy.Evaluation import predictive_evaluation
import time

ds_train = get_train_dataset('ml-100k', verbose=False)
ds_test = get_test_dataset('ml-100k', verbose=False)

start_train = time.time()
cdae = CDAE(min_interaction=0, seed=10)
cdae.fit(ds_train, epochs=50)
print("Training took", time.time() - start_train)

print(ranking_evaluation(cdae, ds_test, n_test_users=100, seed=10))
print(predictive_evaluation(cdae, ds_test, skip_errors=True))

Output:

[CDAE] Max. interaction value: 5
[CDAE] Min. interaction value: 0
[CDAE] Number of unique users: 943
[CDAE] Number of unique items: 1680
[CDAE] Number of training points: 90570
[CDAE] Sparsity level: approx. 94.2831%
[CDAE] Creating auxiliary structures...
[CDAE] Model fitted.
Training took 25.366847276687622

{'P@10': 0.047, 'R@10': 0.47, 'HR@10': 0.47, 'NDCG@10': 0.2601, 'RR@10': 0.1968, 'AP@10': 0.1968}
{'RMSE': 3.1662, 'MSE': 10.0245}
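
The ranking scores above follow a pattern (P@10 is one tenth of R@10, HR@10 equals R@10, and RR@10 equals AP@10) that is consistent with each test user having a single held-out relevant item. The sketch below shows how the metrics reduce in that case. It is an assumption about the evaluation setup, not DRecPy's actual code, and `ranking_metrics_single_item` is a hypothetical name.

```python
import math


def ranking_metrics_single_item(ranked_items, held_out_item, k=10):
    """Top-k ranking metrics for one user with exactly one relevant item.

    Hypothetical helper, not part of DRecPy.
    """
    top_k = ranked_items[:k]
    # 1-based position of the held-out item, or None if it missed the top k.
    rank = top_k.index(held_out_item) + 1 if held_out_item in top_k else None
    hit = 1.0 if rank else 0.0
    reciprocal = 1.0 / rank if rank else 0.0
    return {
        "P@k": hit / k,      # at most one of the k slots can be relevant
        "R@k": hit,          # the single relevant item is found or not
        "HR@k": hit,         # same as recall when there is one relevant item
        "RR@k": reciprocal,
        "AP@k": reciprocal,  # precision at the hit position, over 1 relevant item
        # Ideal DCG is 1 (relevant item ranked first), so NDCG equals DCG.
        "NDCG@k": 1.0 / math.log2(rank + 1) if rank else 0.0,
    }


# Made-up recommendation list where the held-out item is ranked third.
metrics = ranking_metrics_single_item(["i3", "i7", "i1", "i9"], "i1", k=10)
print(metrics)
```

Averaging these per-user values over all sampled test users would give aggregate scores with the same relationships as in the output above.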

More quick and easy examples are available here.

Implemented Models

Recommender Type    Name
Learn-to-rank       CDAE (Collaborative Denoising Auto-Encoder)
Learn-to-rank       DMF (Deep Matrix Factorization)

Implemented Baselines (non deep learning based)

Recommender Type    Name
Predictive          User/Item KNN

Benchmarks

TODO

Installation

With pip:

$ pip install drecpy

Via git:

$ git clone https://github.com/fabioiuri/DRecPy
$ cd DRecPy
$ python setup.py install

License

Check LICENCE.md.

Contributors

Some parts of this work were produced while I was receiving an FCT research scholarship (UID/CEC/00408/2019) under the research institution LASIGE, Faculty of Sciences, University of Lisbon.

Development Status

Project in pre-alpha stage.

Planned work:

  • Wrap up missing documentation
  • Implement more models
  • Implement list-wise sampling strategy
  • Refine and clean unit tests

If you have any bugs to report or suggestions for improvements, you can use DRecPy's GitHub issues page or email me directly at fabioiuri@live.com.
