Install with pip: `pip install FranKGraphBench`
Project description
FranKGraphBench: Knowledge Graph Aware Recommender Systems Framework for Benchmarking
FranKGraphBench is a framework that allows KG-aware RSs to be benchmarked in a reproducible and easy-to-implement manner. It was first created during Google Summer of Code 2023 for data integration between DBpedia and some standard RS datasets within a reproducible framework.
Check the docs for more information.
- This repository was first created for data integration between DBpedia and some standard Recommender Systems datasets, and as a framework for reproducible experiments. For more info, check the project proposal and the project progress, with weekly updates (as far as possible).
Data Integration Usage
Install the required packages in a Python virtualenv:

```shell
python3 -m venv venv_data_integration/
source venv_data_integration/bin/activate
pip3 install -r requirements_data_integration.txt
```
Download the full datasets using the bash scripts located at datasets/:

```shell
cd datasets
bash ml-100k.sh  # downloaded to the `datasets/ml-100k` folder
bash ml-1m.sh    # downloaded to the `datasets/ml-1m` folder
```
Usage
```shell
python3 data_integration.py [-h] -d DATASET -i INPUT_PATH -o OUTPUT_PATH [-ci] [-cu] [-cr] [-cs] [-map] [-w]
```
Arguments:
- -h: Shows the help message.
- -d: Name of a supported dataset. It will be the same name as the folder created by the bash script provided for the dataset. For now, check `data_integration/dataset2class.py` to see the supported ones.
- -i: Input path where the full dataset is placed.
- -o: Output path where the integrated dataset will be placed.
- -ci: Use this flag if you want to convert item data.
- -cu: Use this flag if you want to convert user data.
- -cr: Use this flag if you want to convert rating data.
- -cs: Use this flag if you want to convert social link data.
- -map: Use this flag if you want to map dataset items to DBpedia. At least the item data should already be converted.
- -w: Number of workers (threads) to use for parallel queries.
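As a rough illustration of how the -map and -w flags interact, here is a minimal sketch of parallel DBpedia lookups with a thread pool. The `build_sparql_query` helper and its query shape are hypothetical illustrations, not the framework's actual code:

```python
from concurrent.futures import ThreadPoolExecutor

def build_sparql_query(title: str) -> str:
    # Hypothetical query shape: match an item title against English DBpedia labels.
    return (
        'SELECT ?uri WHERE { '
        f'?uri rdfs:label "{title}"@en . '
        '?uri a dbo:Film . } LIMIT 1'
    )

def query_dbpedia(title: str) -> str:
    # Placeholder: a real worker would send build_sparql_query(title) to the
    # DBpedia SPARQL endpoint and parse the JSON bindings.
    return build_sparql_query(title)

titles = ["Toy Story (1995)", "GoldenEye (1995)"]
# -w 8 corresponds to max_workers=8: lookups run on parallel threads.
with ThreadPoolExecutor(max_workers=8) as pool:
    queries = list(pool.map(query_dbpedia, titles))
```

Since each item lookup is an independent network request, threads give a near-linear speedup for moderate worker counts.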
Usage Example:
```shell
python3 data_integration.py -d 'ml-100k' -i 'datasets/ml-100k' -o 'datasets/ml-100k/processed' \
    -ci -cu -cr -map -w 8
```
Check Makefile for more examples.
Supported datasets
| Dataset | #items matched | #items |
|---|---|---|
| MovieLens-100k | 1462 | 1681 |
| MovieLens-1M | 3356 | 3883 |
| LastFM-hetrec-2011 | 11815 | 17632 |
| Douban-Movie-Short-Comments-Dataset | --- | 28 |
| Yelp-Dataset | --- | 150348 |
| Amazon-Video-Games-5 | --- | 21106 |
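The match rates implied by the table are easy to compute; a quick sketch using the numbers above:

```python
# (#items matched, #items) per dataset, taken from the table above.
matches = {
    "MovieLens-100k": (1462, 1681),
    "MovieLens-1M": (3356, 3883),
    "LastFM-hetrec-2011": (11815, 17632),
}

coverage = {name: hit / total for name, (hit, total) in matches.items()}
for name, rate in coverage.items():
    print(f"{name}: {rate:.1%} of items matched to DBpedia")
# MovieLens-100k comes out at roughly 87%.
```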
Framework for reproducible experiments usage
Install the required packages in a Python virtualenv:

```shell
python3 -m venv venv_framework/
source venv_framework/bin/activate
pip3 install -r requirements_framework.txt
```
Usage
```shell
python3 framework.py -c 'config_files/test.yml'
```
Arguments:
- -c: Experiment configuration file path.
The experiment config file should be a .yaml file like this:
```yaml
experiment:
  dataset:
    name: ml-100k
    item:
      path: datasets/ml-100k/processed/item.csv
      extra_features: [movie_year, movie_title]
    user:
      path: datasets/ml-100k/processed/user.csv
      extra_features: [gender, occupation]
    ratings:
      path: datasets/ml-100k/processed/rating.csv
      timestamp: True
    enrich:
      map_path: datasets/ml-100k/processed/map.csv
      enrich_path: datasets/ml-100k/processed/enriched.csv
      remove_unmatched: False
      properties:
        - type: subject
          grouped: True
          sep: "::"
        - type: director
          grouped: True
          sep: "::"

  preprocess:
    - method: filter_kcore
      parameters:
        k: 20
        iterations: 1
        target: user

  split:
    seed: 42
    test:
      method: k_fold
      k: 2
      level: 'user'

  models:
    - name: deepwalk_based
      config:
        save_weights: True
      parameters:
        walk_len: 10
        p: 1.0
        q: 1.0
        n_walks: 50
        embedding_size: 64
        epochs: 1

  evaluation:
    k: 5
    relevance_threshold: 3
    metrics: [MAP, nDCG]

  report:
    file: 'experiment_results/ml100k_enriched/run1.csv'
```
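For reference, the nDCG metric configured above (k: 5, relevance_threshold: 3) can be sketched as follows; this is a standard binary-gain formulation and may differ in detail from the framework's implementation:

```python
import math

def ndcg_at_k(relevances, k=5):
    """nDCG@k over a ranked list of 0/1 relevance labels (standard binary-gain form)."""
    dcg = sum(g / math.log2(i + 2) for i, g in enumerate(relevances[:k]))
    ideal = sorted(relevances, reverse=True)[:k]
    idcg = sum(g / math.log2(i + 2) for i, g in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0

# Ratings at or above relevance_threshold (3) count as relevant.
ratings_in_ranked_order = [4, 2, 5, 1, 3]
labels = [1 if r >= 3 else 0 for r in ratings_in_ranked_order]
score = ndcg_at_k(labels, k=5)
```

A perfectly ordered list (all relevant items first) scores 1.0; misplacing relevant items further down the ranking lowers the score logarithmically.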
See the config_files/ directory for more examples.
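Programmatically, such a config can be read with a YAML loader; a minimal sketch assuming PyYAML is available (the framework's actual loader may differ):

```python
import yaml  # PyYAML, assumed available: `pip install pyyaml`

# A trimmed-down config in the same shape as the example above.
config_text = """
experiment:
  dataset:
    name: ml-100k
  evaluation:
    k: 5
    relevance_threshold: 3
    metrics: [MAP, nDCG]
"""

experiment = yaml.safe_load(config_text)["experiment"]
dataset_name = experiment["dataset"]["name"]
top_k = experiment["evaluation"]["k"]
```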
File details
Details for the file FranKGraphBench-0.0.2a0-py3-none-any.whl.
File metadata
- Download URL: FranKGraphBench-0.0.2a0-py3-none-any.whl
- Upload date:
- Size: 68.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.8.19
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `a20bf9747dc007f35b3fdca8fe3973d63574304cbae2c1bad9f6f730abff3043` |
| MD5 | `6f99692d9d593a662571cafc794ef384` |
| BLAKE2b-256 | `fc18b339d2f2138ea7a19ca8a573672094930f982010bcd22418c057bea67cdc` |