A Context-Aware Recommendation Framework for Python

Corec


Corec is a flexible and configurable framework designed for context-aware recommenders. It includes several object-oriented modules aimed at simplifying recommendation generation and metric evaluation.

The recommendation module supports Elliot-based models for non-contextual recommendations, plus RecBole-based models and several Corec-specific contextual heuristic models for context-aware predictions. The evaluation module computes metrics from prediction files, which can be generated either by the recommendation module or by an external tool. It supports metrics from the Ranx library, along with some custom Corec contextual metrics.

Corec was developed as part of a final degree project. Its goal is to provide a solid framework structure that users can easily extend with their own recommenders and metrics, making it a flexible foundation for further development.

Corec Modules List :mag:

Below is a complete list of the currently available classes in Corec. In the repository code, you can find the specific descriptions of the attributes, methods, and behavior of each module.

| Integrated recommenders | Heuristic recommenders | Evaluation | Postfiltering |
| --- | --- | --- | --- |
| ElliotRec | ContextPopRec | QrelsGenerator | Postfilter |
| RecBoleRec | ContextRandomRec | RunGenerator | |
| | ContextSatisfactionRec | MetricGenerator | |
| | | Evaluator | |

:eyes:: In case you want to implement your own recommender, you might find it helpful to use BaseRec as your parent class.

Installation :computer:

To install Corec, simply use pip:

pip install corec

If you want to use the evaluation module, then run:

pip install corec[evaluator]

:eyes:: If you plan to use any integrated recommender, please remember to install the extra packages it requires.

Data Structure :file_folder:

Dataset Input Format

Corec assumes the following structure for the input datasets:

  • Training Set: A file containing training data for recommender model training.
  • Test Set: A file containing test data for generating recommendations.
  • Optional Validation Set: An optional file for validation during model training or evaluation.

Each dataset should have the following columns:

  • User ID: A column representing the unique identifier for each user (either str or int).
  • Item ID: A column representing the unique identifier for each item (either str or int).
  • Rating: A column representing the rating given by the user (usually a float).
  • Context Columns: Additional columns representing the context for each recommendation (all int).

| User ID | Item ID | Rating | Context 1 | Context 2 | Context 3 | ... |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 101 | 4.5 | 1 | 0 | 1 | ... |
| 1 | 102 | 3.8 | 0 | 1 | 0 | ... |
| 2 | 110 | 9.6 | 0 | 1 | 1 | ... |
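
As a rough illustration of this layout, the sketch below writes the three example rows as tab-separated text. It assumes a TSV file (the usage examples use `.tsv` paths) with a header row; the column names and the `write_dataset` helper are hypothetical, and whether Corec expects a header at all is not stated here.

```python
import csv
import io

# Hypothetical column names; Corec's expected header (if any) is not stated here.
COLUMNS = ["user_id", "item_id", "rating", "ctx_1", "ctx_2", "ctx_3"]

# The three example rows from the table above.
ROWS = [
    (1, 101, 4.5, 1, 0, 1),
    (1, 102, 3.8, 0, 1, 0),
    (2, 110, 9.6, 0, 1, 1),
]


def write_dataset(rows, handle):
    """Write rows as tab-separated values: IDs, rating, then int context columns."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(COLUMNS)
    for user_id, item_id, rating, *context in rows:
        # The format requires every context column to be an integer.
        if not all(isinstance(value, int) for value in context):
            raise ValueError("context columns must be int")
        writer.writerow([user_id, item_id, float(rating), *context])


buffer = io.StringIO()
write_dataset(ROWS, buffer)
dataset_tsv = buffer.getvalue()
print(dataset_tsv)
```

Writing to an in-memory buffer keeps the sketch self-contained; passing an open file handle instead would produce a file such as `dataset/train.tsv`.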

Recommendation Output Format

The output of the recommendation process is stored in files containing tuples in the following format: (user ID, item ID, score, query item ID).

:warning:: The inclusion of the query item ID in the predictions file is intentional, as it serves to indicate the contextual anchor of the recommendation. The current approach assumes that each item in the dataset is associated with a single, fixed context. Therefore, by storing the query item, we can indirectly infer the context in which the recommendation was made. That said, this is a known limitation of the current design (there's room for improvement here). In future iterations, the methodology could be extended and refined to handle more flexible or multi-context scenarios, making it applicable to a wider range of datasets.
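
The tuple layout and the role of the query item can be sketched with the standard library. This is a minimal sketch, assuming tab-separated, gzip-compressed rows with no header line (the delimiter and header are not specified above); `Prediction`, `read_predictions`, and the `item_context` mapping are hypothetical names and are not part of Corec.

```python
import csv
import gzip
import os
import tempfile
from typing import NamedTuple


class Prediction(NamedTuple):
    user_id: str
    item_id: str
    score: float
    query_item_id: str


def read_predictions(path):
    """Read (user ID, item ID, score, query item ID) rows from a gzip-compressed TSV."""
    with gzip.open(path, "rt", newline="") as handle:
        return [
            Prediction(user, item, float(score), query)
            for user, item, score, query in csv.reader(handle, delimiter="\t")
        ]


# Assumed item -> context mapping: one fixed context per item, as the note above describes.
item_context = {"101": (1, 0, 1), "102": (0, 1, 0), "110": (0, 1, 1)}

# Write a tiny example file so the sketch runs end to end.
path = os.path.join(tempfile.mkdtemp(), "example_preds.tsv.gzip")
with gzip.open(path, "wt", newline="") as handle:
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerows([("1", "102", 0.91, "101"), ("1", "110", 0.40, "101")])

preds = read_predictions(path)
# The context of each recommendation is inferred from its query item.
contexts = [item_context[p.query_item_id] for p in preds]
```

Both predictions share query item `101`, so both are attributed to that item's context.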

Metrics Output Format

The evaluation metrics are stored in a CSV file, where each row corresponds to a particular experimental setting. The file includes the following columns:

  • Models (str): The model or combination of models being evaluated.
  • Fuse norm (str): The normalization strategy used during model fusion (if applicable).
  • Fuse method (str): The method applied to combine scores from multiple models (if applicable).
  • Metric (str): The specific evaluation metric (e.g., precision, recall, etc.).
  • Cutoff (int): The ranking cutoff value (e.g., 5, 10, etc.).
  • Score (float): The resulting score obtained for the given metric and cutoff.
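
One way to consume such a file is to pick the best-scoring model for each metric and cutoff. The sketch below assumes the CSV header uses exactly the column names listed above; the rows are made-up values for illustration only, and `best_per_setting` is a hypothetical helper.

```python
import csv
import io

# Made-up rows for illustration; the real header spelling and values may differ.
METRICS_CSV = """Models,Fuse norm,Fuse method,Metric,Cutoff,Score
ModelA,,,precision,5,0.20
ModelB,,,precision,5,0.35
ModelA,,,recall,5,0.10
ModelB,,,recall,5,0.05
"""


def best_per_setting(handle):
    """Return {(metric, cutoff): (models, score)} holding the highest score per setting."""
    best = {}
    for row in csv.DictReader(handle):
        key = (row["Metric"], int(row["Cutoff"]))
        score = float(row["Score"])
        if key not in best or score > best[key][1]:
            best[key] = (row["Models"], score)
    return best


best = best_per_setting(io.StringIO(METRICS_CSV))
for (metric, cutoff), (models, score) in sorted(best.items()):
    print(f"{metric}@{cutoff}: {models} ({score})")
```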

Usage :bulb:

Recommendation Module Examples

The following example uses the Elliot recommendation module to generate predictions with Elliot-based models:

from corec.recommenders.elliot_rec import ElliotRec

# Instantiate the Elliot recommender
elliot_rec = ElliotRec(
    train_path="dataset/train.tsv",
    test_path="dataset/test.tsv",
    valid_path="dataset/valid.tsv",
    preds_path_template="preds/{model}.tsv.gzip",
    elliot_work_dir="elliot_work_dir",
)

# Setup the model parameters according to the official docs from Elliot
models_config = {
    "ItemKNN": {
        "implementation": "classic",
        "neighbors": 40,
        "similarity": "cosine",
    },
    "FM": {
        "epochs": 10,
        "batch_size": 512,
        "factors": 10,
        "lr": 0.001,
        "reg": 0.1,
    }
}

# You are ready to compute the predictions
elliot_rec.recommend(
    models_config,
    K=50,
    clean_elliot_work_dir=True,
    clean_temp_dataset_files=True,
)

The following example shows how to use the RecBole recommendation module:

from recbole.model.context_aware_recommender.widedeep import WideDeep
from corec.recommenders.recbole_rec import RecBoleRec

# Instantiate the RecBole recommender
recbole_rec = RecBoleRec(
    train_path="dataset/train.tsv",
    test_path="dataset/test.tsv",
    valid_path="dataset/valid.tsv",
    logs_path="recbole_rec.log",
    rating_thr=7,
)

# You are ready to compute the predictions
recbole_rec.recommend(
    recbole_model=WideDeep,
    extra_config={"device": "gpu"},
    output_path="preds/WideDeep.tsv.gzip",
)

And the following example shows how to use a heuristic recommender:

from corec.recommenders import ContextPopRec

# Instantiate the context-aware recommender
cp_rec = ContextPopRec(
    train_path="dataset/train.tsv",
    test_path="dataset/test.tsv",
    valid_path="dataset/valid.tsv",
    preds_compression=None,
    chunk_size=100,
)

# You are ready to compute the predictions
cp_rec.compute_predictions(
    output_path="preds/ContextPop.tsv",
    K=5,
)

Post-filter Module Example

After generating the predictions, you might want to filter out those whose recommended item does not share a context with the test item (query). Below is an example of how to perform that filtering:

from corec.postfilters import PostFilter

# Instantiate the post-filter
pf = PostFilter(
    dataset_ctx_idxs=range(3, 15),
    train_path="dataset/train.tsv",
    valid_path="dataset/valid.tsv",
)

# You are ready to filter the predictions
pf.postfilter(
    preds_path="my_preds/WideDeep.tsv.gzip",
    output_path="my_preds/Postfiltered_WideDeep.tsv.gzip",
)
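
Conceptually, the filtering step can be pictured as below. This is only a sketch of the idea, not Corec's implementation: it assumes that "matching" means the recommended item and the query item have identical context vectors, and the `postfilter` function and `item_context` mapping are hypothetical.

```python
def postfilter(preds, item_context):
    """Keep predictions whose recommended item has the same context as the query item."""
    kept = []
    for user_id, item_id, score, query_item_id in preds:
        rec_ctx = item_context.get(item_id)
        query_ctx = item_context.get(query_item_id)
        # Items with an unknown context are dropped in this sketch.
        if rec_ctx is not None and rec_ctx == query_ctx:
            kept.append((user_id, item_id, score, query_item_id))
    return kept


# Assumed fixed context per item, e.g. collected from the training and validation sets.
item_context = {101: (1, 0, 1), 102: (0, 1, 0), 103: (1, 0, 1)}

# Tuples follow the (user ID, item ID, score, query item ID) layout.
preds = [(1, 102, 0.9, 101), (1, 103, 0.7, 101)]
filtered = postfilter(preds, item_context)
```

Here item `102` is removed because its context differs from that of query item `101`, while item `103` is kept.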

Evaluation Module Example

Finally, you can evaluate the recommendations with the Evaluation Module. Here's an example of how to use the module to compute metrics:

from corec.evaluation.evaluator import Evaluator

# Instantiate the evaluator
evaluator = Evaluator(
    train_path="dataset/train.tsv",
    test_path="dataset/test.tsv",
    valid_path="dataset/valid.tsv",
    preds_path_template="my_preds/{model}.tsv.gzip",
    runs_path_template="runs/{run}.run.json",
    output_path="metrics.csv",
    metrics=["precision", "recall", "mean_ctx_sat", "sum_ctx_sat"],
    cutoffs=[5, 15, 25],
    rating_thr=7,
)

# First, compute the Qrels
evaluator.compute_qrels()

# Then, you are ready to compute metrics for standard Runs
for model in ["ContextPop", "Postfiltered_WideDeep"]:
    evaluator.compute_run_metrics(model_name=model)

# Additionally, you can compute metrics for fused Runs
for method in ["sum", "med", "mnz"]:
    evaluator.compute_fuse_metrics(
        run_names=["ContextSatisfaction"],
        model_names=["FM"],
        method=method,
    )
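
For intuition about the `sum`, `med`, and `mnz` fusion methods, the sketch below implements the classic CombSUM, CombMED, and CombMNZ rules for a single user after min-max normalization. It is a simplified illustration, not the Ranx or Corec code: here an item missing from a run simply contributes no score, which may differ from how the library handles missing items, and `min_max` and `fuse` are hypothetical helpers.

```python
from statistics import median


def min_max(run):
    """Rescale one run's scores to [0, 1]; a constant run maps every item to 1.0."""
    low, high = min(run.values()), max(run.values())
    if high == low:
        return {item: 1.0 for item in run}
    return {item: (score - low) / (high - low) for item, score in run.items()}


def fuse(runs, method):
    """Combine per-item scores from several runs for one user."""
    normalized = [min_max(run) for run in runs]
    fused = {}
    for item in set().union(*normalized):
        # Only the runs that actually scored this item are considered.
        scores = [run[item] for run in normalized if item in run]
        if method == "sum":    # CombSUM: add the normalized scores
            fused[item] = sum(scores)
        elif method == "med":  # CombMED: take the median score
            fused[item] = median(scores)
        elif method == "mnz":  # CombMNZ: sum times the number of runs with the item
            fused[item] = sum(scores) * len(scores)
        else:
            raise ValueError(f"unknown method: {method}")
    return fused


run_a = {"i1": 3.0, "i2": 1.0, "i3": 2.0}
run_b = {"i1": 0.2, "i3": 0.8}
fused_sum = fuse([run_a, run_b], "sum")
fused_med = fuse([run_a, run_b], "med")
fused_mnz = fuse([run_a, run_b], "mnz")
```

CombMNZ rewards items retrieved by several runs, so `i3` (present in both runs) moves further ahead of the other items than it does under CombSUM.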
