A unified framework for recommender system attacking

Project description

RecAD

A unified library for recommender attack and defense

RecAD is a unified library that aims to establish an open benchmark for recommender attack and defense. With a few lines of code, you can quickly construct an attack pipeline.

The supported modules currently include:

Datasets: ml1m, yelp, Amazon-game, Epinions, Book-crossing, BeerAdvocate, dianping, food, ModCloth, ratebeer, RentTheRunway. See data/ for more details.
Victim Models: MF, LightGCN, NCF.
Attack Models: Heuristic(random, average, segment, bandwagon); AUSH; AIA; Legup
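
To make the heuristic attacks more concrete, here is a minimal, self-contained sketch of how "random" and "average" shilling profiles are commonly built in the literature. It is not recad's implementation: the function name, its parameters, the noise level, and the toy ratings are all assumptions made for illustration. Segment and bandwagon attacks additionally pick selected items (for example popular ones), which is not shown here.

```python
import random
from statistics import mean


def build_fake_profiles(ratings, target_item, n_fake, filler_num,
                        mode="random", seed=0, r_min=1.0, r_max=5.0):
    """Build fake user profiles that push `target_item` (hypothetical helper).

    ratings: dict mapping user -> {item: rating}.
    mode="random": filler ratings are drawn around the global mean rating.
    mode="average": filler ratings are drawn around each item's own mean.
    """
    rng = random.Random(seed)
    # Group observed ratings by item so the means can be computed.
    per_item = {}
    for items in ratings.values():
        for item, rating in items.items():
            per_item.setdefault(item, []).append(rating)
    global_mean = mean(r for rs in per_item.values() for r in rs)
    candidates = sorted(i for i in per_item if i != target_item)

    profiles = []
    for _ in range(n_fake):
        fillers = rng.sample(candidates, min(filler_num, len(candidates)))
        profile = {}
        for item in fillers:
            center = global_mean if mode == "random" else mean(per_item[item])
            noisy = rng.gauss(center, 0.5)  # 0.5 is an arbitrary noise level
            profile[item] = min(r_max, max(r_min, round(noisy)))
        profile[target_item] = r_max  # push attack: give the target the top rating
        profiles.append(profile)
    return profiles


# Toy explicit-feedback data, made up for this example.
toy = {
    "u1": {"a": 5, "b": 3, "c": 4},
    "u2": {"a": 2, "b": 4, "d": 1},
    "u3": {"b": 5, "c": 2, "d": 3},
}
fake = build_fake_profiles(toy, target_item="d", n_fake=2, filler_num=2, mode="average")
```

Each fake profile contains `filler_num` filler items plus the target item, which is the usual role of a filler-count setting in this kind of attack.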

🚀🚀🚀 Contributions and suggestions for adding more datasets and models are welcome.

Install

Install by pip:

pip install recad

Or from source:

git clone https://github.com/gusye1234/recad.git
cd recad
pip install -e "."

Quick Start

Try it from command line:

recad_runner --attack="aush" --victim="lightgcn"

Or you can write your own script:

from recad import dataset, model, workflow

dataset_name = "ml1m"
config = {
    # quickly access the dataset with implicit feedback
    "victim_data": dataset.from_config("implicit", dataset_name, need_graph=True),
    # sample part of the explicit dataset as the attack data
    "attack_data": dataset.from_config("explicit", dataset_name).partial_sample(
        user_ratio=0.2
    ),
    # set up the model configs; the workflow instantiates them later
    "victim": model.from_config("victim", "lightgcn"),
    "attacker": model.from_config("attacker", "aush"),
    "rec_epoch": 20,
}
workflow_inst = workflow.from_config("no defense", **config)
# run the attack
workflow_inst.execute()
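
The script above gives the attacker only a sampled part of the explicit data (`user_ratio=0.2`). As a rough illustration of what sampling by user ratio can mean, the sketch below keeps every interaction of a randomly chosen fraction of users. This is an assumption about the idea, not recad's `partial_sample` code; the helper name and the toy data are hypothetical.

```python
import random


def sample_users(interactions, user_ratio, seed=0):
    """Keep all interactions of a random fraction of users (hypothetical helper).

    interactions: list of (user, item, rating) tuples.
    """
    rng = random.Random(seed)
    users = sorted({user for user, _, _ in interactions})
    keep_n = max(1, int(len(users) * user_ratio))
    kept = set(rng.sample(users, keep_n))
    return [row for row in interactions if row[0] in kept]


# Toy data: 10 users with 2 ratings each, made up for this example.
interactions = [(u, i, 1 + (u + i) % 5) for u in range(10) for i in range(2)]
subset = sample_users(interactions, user_ratio=0.2)
```

With 10 users and a ratio of 0.2, two users are kept, so the attacker sees only their four interactions.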

Docs

recad is designed for interactive use and debugging. It has three main modules: dataset, model, and workflow.

Dataset

import recad

# which datasets are supported?
print(recad.print_datasets())

# which config options does a dataset have?
recad.dataset_print_help("ml1m")

# init a dataset with default parameters
dataset = recad.dataset.from_config("implicit", "ml1m")

# init a dataset and override some parameters
dataset = recad.dataset.from_config("implicit", "ml1m", test_batch_size=50, device="cuda")

Model

import recad

# which models are supported?
print(recad.print_models())

# which config options does a model have?
recad.model_print_help("lightgcn")

# lazy-init a model with default parameters
# Not a torch.nn.Module yet! It can't be used for training or inference
victim_model = recad.model.from_config("victim", "lightgcn")

# lazy-init a model and override some parameters
# Not a torch.nn.Module yet! It can't be used for training or inference
victim_model = recad.model.from_config("victim", "lightgcn", latent_dim_rec=256, lightGCN_n_layers=2)

# Models have some parameters that depend on runtime modules, e.g. the dataset
# `victim_model` is now an actual torch.nn.Module
dataset = recad.dataset.from_config("implicit", "ml1m")
victim_model = recad.model.from_config("victim", "lightgcn", dataset=dataset).I()
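
The lazy-init idea above (store the parameters first, build the real object only once runtime information such as the dataset is known) can be sketched generically. The classes below are hypothetical stand-ins and do not reflect recad's internals; only the method name `I` is borrowed from the snippet above.

```python
class LazyConfig:
    """Hypothetical stand-in: holds a factory and its kwargs, builds on demand."""

    def __init__(self, factory, **defaults):
        self.factory = factory
        self.kwargs = dict(defaults)

    def update(self, **overrides):
        # Return a new config so the original defaults stay untouched.
        return LazyConfig(self.factory, **{**self.kwargs, **overrides})

    def I(self):
        # Instantiate the real object only when it is actually needed.
        return self.factory(**self.kwargs)


class ToyModel:
    """Toy model whose constructor needs a runtime value (n_users)."""

    def __init__(self, latent_dim, n_users):
        self.latent_dim = latent_dim
        self.n_users = n_users


cfg = LazyConfig(ToyModel, latent_dim=64)  # cheap: nothing is built yet
cfg = cfg.update(n_users=5950)             # fill in a dataset-dependent value
toy_model = cfg.I()                        # now an actual ToyModel instance
```

One benefit of this design is that configs can be created, inspected, and passed around before any heavy resources are allocated.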

Workflow

import recad
from recad import model, workflow

# which workflows are supported?
print(recad.print_workflows())

# which config options does a workflow have?
recad.workflow_print_help("no defense")

# a workflow takes all the components mentioned before
# ARG below stands for your own parsed options (e.g. an argparse namespace); it is not defined in this snippet
config = {
    "victim_data": ...,  # Your dataset for the victim model, using dataset.from_config...
    "attack_data": ...,  # Your dataset for the attacker model, using dataset.from_config...
    "victim": ...,  # Your victim model, using model.from_config...
    "attacker": model.from_config(
        "attacker", ARG.attack, filler_num=ARG.filler_num
    ),  # Your attacker model, using model.from_config...
    "rec_epoch": ...,  # Your training epochs for the victim model, int
    "attack_epoch": ...,  # Your training epochs for the attacker model, int
}
workflow_inst = workflow.from_config("no defense", **config)

# Start
workflow_inst.execute()

See recad.main for the whole pipeline and more details on each module 🤗.

Confused about what a component is doing? Each component instance in recad has a print_help method that reports its input/output information:

dataset.print_help()
# (ml1m)Information:
# ╒════════════════════╤═══════════════════════════════════╕
# │ n_users            │ 5950                              │
# ╞════════════════════╪═══════════════════════════════════╡
# │ n_items            │ 3702                              │
# ├────────────────────┼───────────────────────────────────┤
# │ train_interactions │ 468649                            │
# ├────────────────────┼───────────────────────────────────┤
# │ valid_interactions │ 49390                             │
# ├────────────────────┼───────────────────────────────────┤
# │ test_interactions  │ 49494                             │
# ├────────────────────┼───────────────────────────────────┤
# │ train_dict         │ <class 'dict'> 5950               │
# ├────────────────────┼───────────────────────────────────┤
# │ valid_dict         │ <class 'dict'> 5583               │
# ├────────────────────┼───────────────────────────────────┤
# │ test_dict          │ <class 'dict'> 5677               │
# ├────────────────────┼───────────────────────────────────┤
# │ graph              │ torch torch.float32, (9652, 9652) │
# ╘════════════════════╧═══════════════════════════════════╛
# (ml1m)Batch data:
# ╒════════════════╤═════════════╤═══════════════╕
# │ name           │ type        │ shape         │
# ╞════════════════╪═════════════╪═══════════════╡
# │ users          │ torch.int64 │ batch[0~2048] │
# ├────────────────┼─────────────┼───────────────┤
# │ positive_items │ torch.int64 │ batch[0~2048] │
# ├────────────────┼─────────────┼───────────────┤
# │ negative_items │ torch.int64 │ batch[0~2048] │
# ╘════════════════╧═════════════╧═══════════════╛
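
In the output above, the graph has shape (9652, 9652), which equals n_users + n_items (5950 + 3702). That is consistent with the bipartite user-item adjacency matrix commonly used by LightGCN-style models. The sketch below builds such a symmetrically normalized matrix for a toy dataset in plain Python. It assumes the standard construction and is not taken from recad's code; the function name and toy interactions are made up.

```python
import math


def normalized_bipartite_adjacency(n_users, n_items, interactions):
    """Dense D^{-1/2} A D^{-1/2} over users followed by items (illustrative only).

    interactions: list of (user_index, item_index) pairs.
    """
    n = n_users + n_items
    adj = [[0.0] * n for _ in range(n)]
    for user, item in interactions:
        # Item nodes are offset by n_users so both node types share one index space.
        adj[user][n_users + item] = 1.0
        adj[n_users + item][user] = 1.0
    degree = [sum(row) for row in adj]
    for r in range(n):
        for c in range(n):
            if adj[r][c]:
                adj[r][c] /= math.sqrt(degree[r] * degree[c])
    return adj


# Toy data: 3 users, 2 items, 4 interactions.
graph = normalized_bipartite_adjacency(3, 2, [(0, 0), (1, 0), (1, 1), (2, 1)])
```

The resulting matrix is 5 x 5 here, in the same way that the ml1m graph is 9652 x 9652. A real implementation would store it as a sparse tensor rather than nested lists.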

Contribution

Install pre-commit first to make sure your commits are well-formatted:

pip install pre-commit
pre-commit install

Project details

Release history

0.0.2 (this version): Sep 17, 2023
0.0.1: Sep 17, 2023

Download files

Source Distribution

recad-0.0.2.tar.gz (37.1 kB)

Uploaded Sep 17, 2023 Source

Hashes for recad-0.0.2.tar.gz
Algorithm	Hash digest
SHA256	`ecfe8d5589264578072880d47597b044020f49e651b3c07fe483b602eba9d59d`
MD5	`23f47965598ce6ba53766e9b3442ac8e`
BLAKE2b-256	`4bd81bee5d37c555b75ad487e173c636d2d4ce2e55a2dd6bc948986439352c41`
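
If you download the source distribution manually, you can check it against the SHA256 digest listed above. The following is a small sketch using Python's standard hashlib module; the helper names and the local file path you pass in are your own choices, not part of recad.

```python
import hashlib

# SHA256 digest listed above for recad-0.0.2.tar.gz.
EXPECTED_SHA256 = "ecfe8d5589264578072880d47597b044020f49e651b3c07fe483b602eba9d59d"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_SHA256):
    """True if the file at `path` matches the expected digest."""
    return sha256_of_file(path) == expected
```

For example, `verify_download("recad-0.0.2.tar.gz")` should return True for an intact download stored at that (assumed) path.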