A library of recommender systems metrics for big data

recmetrics-pyspark: recommender systems metrics for big data

recmetrics-pyspark computes the most relevant internal metrics for item recommendations from PySpark DataFrames. It is designed to handle large amounts of data efficiently. Most routines are adapted from the recmetrics library, which works with pandas DataFrames.

DISCLAIMER: recmetrics-pyspark is neither affiliated with nor endorsed by recmetrics or its authors. Some routines have been adapted from recmetrics to work with PySpark DataFrames and/or to handle bigger datasets. As a result, some chunks of code have been copied verbatim, and function and parameter names have been kept the same wherever possible for better usability.

If you are dealing with small datasets, we recommend using the recmetrics library (https://github.com/statisticianinstilettos/recmetrics) instead, as it handles smaller datasets more efficiently.

Where to get it

The source code is currently hosted on GitHub at: https://github.com/camiloakv/recmetrics-pyspark

Binary installers for the latest released version are available at the Python Package Index (PyPI).

pip install recmetrics-pyspark

Available metrics as of version 0.0.1:

  • long_tail_plot
  • coverage
  • Novelty:
    • novelty_refac: a small refactoring of recmetrics' implementation.
    • novelty_pandas: similar to novelty_refac, but takes pandas DataFrames as inputs.
    • novelty: the PySpark implementation.
  • personalization
  • intra_list_similarities
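
To give a feel for what two of the metrics listed above measure, here is a minimal plain-Python sketch of coverage and novelty. It is not this package's API: the function signatures, argument names, and toy data are assumptions made for illustration, and the formulas follow the usual definitions (share of the catalog that gets recommended, and mean self-information of the recommended items). The package itself works on PySpark DataFrames rather than Python lists.

```python
import math

def coverage(predicted, catalog):
    """Percentage of catalog items that appear in at least one recommendation list.

    predicted: list of per-user recommendation lists (hypothetical input shape).
    catalog:   list of all items that could be recommended.
    """
    catalog_items = set(catalog)
    recommended = {item for user_list in predicted for item in user_list}
    return round(len(recommended & catalog_items) / len(catalog_items) * 100, 2)

def novelty(predicted, pop, n_users, list_len):
    """Mean self-information of the recommended items, averaged over users.

    pop:      dict mapping item -> number of users who interacted with it.
    n_users:  total number of users in the interaction data.
    list_len: length of each recommendation list.
    """
    per_user = []
    for user_list in predicted:
        # Rare items (small pop[item] / n_users) contribute more information.
        self_info = sum(-math.log2(pop[item] / n_users) for item in user_list)
        per_user.append(self_info / list_len)
    return sum(per_user) / len(predicted)

# Toy data, invented for this example only.
predicted = [["a", "b"], ["a", "c"]]
catalog = ["a", "b", "c", "d"]
pop = {"a": 4, "b": 2, "c": 1}

print(coverage(predicted, catalog))                  # 3 of 4 catalog items recommended
print(novelty(predicted, pop, n_users=4, list_len=2))
```

In this toy case item "a" was seen by every user, so it adds no novelty, while the rarer item "c" dominates the second user's score.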

