Tools for recommendation systems development

ml-recsys-tools


This is an updated version of the stale ml-recsys-tools source repo.


Open source repo with various tools for recommender systems development work.

The main purpose is to provide a single wrapper around various recommender packages for training, tuning and evaluating models, getting data in, and getting recommendations / similarities out.

Installation:

Pip:

  • PyPI: pip install ml-recsys-tools
  • GitHub master: pip install git+https://github.com/artdgn/ml-recsys-tools@master#egg=ml_recsys_tools

Basic usage:

# dataset: download and prepare dataframes
from ml_recsys_tools.datasets.prep_movielense_data import get_and_prep_data
rating_csv_path, users_csv_path, movies_csv_path = get_and_prep_data()

# read the interactions dataframe, create a data handler object, and split into train and test
import pandas as pd
from ml_recsys_tools.data_handlers.interaction_handlers_base import ObservationsDF

ratings_df = pd.read_csv(rating_csv_path)
obs = ObservationsDF(ratings_df, uid_col='userid', iid_col='itemid')
train_obs, test_obs = obs.split_train_test(ratio=0.2)

# train the LightFM recommender
from ml_recsys_tools.recommenders.lightfm_recommender import LightFMRecommender
lfm_rec = LightFMRecommender()
lfm_rec.fit(train_obs, epochs=10)

# print summary evaluation report:
print(lfm_rec.eval_on_test_by_ranking(test_obs.df_obs, prefix='lfm ', n_rec=100))

# get all recommendations and print a sample (training interactions are filtered out by default)
recs = lfm_rec.get_recommendations(lfm_rec.all_users, n_rec=5)
print(recs.sample(5))

# get all similarities and print a sample
simils = lfm_rec.get_similar_items(lfm_rec.all_items, n_simil=5)
print(simils.sample(10))
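
The split_train_test(ratio=0.2) call above holds out part of the interactions for evaluation. The snippet below is a dependency-free sketch of that general idea only: it assumes the ratio is the held-out fraction and that rows are split uniformly at random, which may differ from what the library does. The function name split_interactions and the toy rows are hypothetical and are not part of ml-recsys-tools.

```python
import random


def split_interactions(rows, test_ratio=0.2, seed=0):
    """Randomly hold out a fraction of interaction rows for testing.

    Hypothetical helper for illustration, not the library's implementation.
    """
    rng = random.Random(seed)
    shuffled = list(rows)          # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_ratio))
    # first n_test shuffled rows become the test set, the rest the train set
    return shuffled[n_test:], shuffled[:n_test]


# toy (userid, itemid, rating) rows, assumed purely for the example
interactions = [(u, i, 1.0) for u in range(5) for i in range(4)]
train_rows, test_rows = split_interactions(interactions, test_ratio=0.2)
print(len(train_rows), len(test_rows))
```

A purely random split like this can leave some users with no training rows, which is one reason a dedicated data handler is useful in practice.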

Additional examples are in the examples/ folder.

Recommender models and tools:

  • LightFM package based recommender.

  • Implicit package based ALS recommender.

  • Evaluation features added for most recommenders:

    • Dataframes for all inputs and outputs
      • adding external features (for LightFM hybrid mode)
      • fast batched methods for:
        • user recommendation sampling
        • similar items sampling with different similarity measures
        • similar users sampling
        • evaluation by sampling and ranking
        • dense user x item recommendation and item x item similarity
  • Additional recommender models:

    • Similarity based:
      • cooccurrence (items, users)
      • generic similarity based (can be used with external features)
  • Ensembles:

    • subdivision based (multiple recommenders each on subset of data - e.g. geographical region):
      • geo based: simple grid, equidense grid, geo clustering
      • LightFM and cooccurrence based
    • combination based - combining recommendations from multiple recommenders
    • similarity combination based - similarity based recommender on similarities from multiple recommenders
    • cascade ensemble
  • Interaction dataframe and sparse matrix handlers / builders:

    • sampling, data splitting
    • external features matrix creation (additional item features), with feature engineering: binning / one-hot encoding (via pandas_sklearn)
    • evaluation and ranking helpers
    • handlers for observations coupled with external features and features with geo coordinates
  • Evaluation utils:

    • score reports on LightFM metrics (AUC, precision, recall, reciprocal rank)
    • n-DCG and n-MRR metrics, n-precision / recall
    • references: best possible ranking and chance ranking
  • Utilities:

    • similarity calculation helpers (similarities, dot, top N, top N on sparse)
    • parallelism utils
    • sklearn transformer extensions (for feature engineering)
    • logging, debug printouts decorators and other instrumentation and inspection tools
    • pandas utils
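
To make the ranking metrics listed under "Evaluation utils" more concrete, here is a small dependency-free sketch of precision at k, reciprocal rank and nDCG at k for a single user with binary relevance. These are the standard textbook definitions and not the library's code; the "n-" prefixed variants in ml-recsys-tools may be normalised differently (for example against the best possible ranking reference). All function and variable names below are hypothetical.

```python
import math


def precision_at_k(ranked_items, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    hits = sum(1 for item in ranked_items[:k] if item in relevant)
    return hits / k


def reciprocal_rank(ranked_items, relevant):
    """1 / rank of the first relevant item, or 0.0 if none is found."""
    for rank, item in enumerate(ranked_items, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranked_items, relevant, k):
    """DCG of the top-k list divided by the DCG of an ideal ordering."""
    dcg = sum(1.0 / math.log2(rank + 1)
              for rank, item in enumerate(ranked_items[:k], start=1)
              if item in relevant)
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


# toy data, assumed for the example: one user's ranked recommendations
# and the items held out for that user in the test set
recommended = ["b", "a", "d", "c"]
held_out = {"a", "c"}

print(precision_at_k(recommended, held_out, k=2))
print(reciprocal_rank(recommended, held_out))
print(ndcg_at_k(recommended, held_out, k=4))
```

In an evaluation report these per-user values would typically be averaged over all test users.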

