Skip to main content

A unified framework for recommender system attacking

Project description

RecAD

A unified library for recommender attack and defense

RecAD is a unified library aiming at establishing an open benchmark for recommender attack and defense. With a few line of codes, you can quickly construct a attacking pipeline.

The supported modules currently include:

  • Datasets: ml1m, yelp, Amazon-game, Epinions, Book-crossing, BeerAdvocate, dianping, food, ModCloth, ratebeer, RentTheRunway. Please checkout more details in data/
  • Victim Models: MF, LightGCN, NCF.
  • Attack Models: Heuristic(random, average, segment, bandwagon); AUSH; AIA; Legup

๐Ÿš€๐Ÿš€๐Ÿš€ We are opening for any contribution or suggestion about adding more datasets and models

Install

Install by pip:

pip install recad

Or from source:

git clone https://github.com/gusye1234/recad.git
cd recad
pip install -e "."

Quick Start

Try it from command line:

recad_runner --attack="aush" --victim="lightgcn"

Or you can write your own script:

from recad import dataset, model, workflow

dataset_name = "ml1m"
config = {
    # quickly asscess the dataset with implicit feedback
    "victim_data": dataset.from_config("implicit", dataset_name, need_graph=True),
    # sample part of the explicit dataset as the attack data
    "attack_data": dataset.from_config("explicit", dataset_name).partial_sample(
        user_ratio=0.2
    ),
    # set up models config, and later will be instantiated in workflow
    "victim": model.from_config("victim", "lightgcn"),
    "attacker": model.from_config("attacker", "aush"),
    "rec_epoch": 20,
}
workflow_inst = workflow.from_config("no defense", **config)
# run the attacking
workflow_inst.execute()

Docs

recad is designed to help users use and debug interactively, and mainly has three modules: dataset, model, workflow

Dataset

import recad

# how many datasets we support?
print(recad.print_datasets())

# how many configs for one dataset?
recad.dataset_print_help("ml1m")

# init from a dataset with default parameters
dataset = recad.dataset.from_config("implicit", "ml1m")

# init from a dataset and modifies parameters
dataset = recad.dataset.from_config("implicit", "ml1m", test_batch_size=50, device="cuda")

Model

# how many models we support?
print(recad.print_models())

#how many configs for one model?
recad.model_print_help("lightgcn")
  
# lazy-init from a model with default parameters
# Not a torch.nn.Module class! Can't be used for training and inferring
victim_model = recad.model.from_config("victim", "lightgcn")

# lazy-init from a model and modifies parameters
# Not a torch.nn.Module class! Can't be used for training and inferring
victim_model = recad.model.from_config("victim", "lightgcn", latent_dim_rec=256, lightGCN_n_layers=2)

# Model's have some parameters that are related to some runtime module, e.g. dataset
# `victim_model` now is an actually torch.nn.Module 
dataset = recad.dataset.from_config("implicit", "ml1m")
victim_model = recad.model.from_config("victim", "lightgcn", dataset=dataset).I()

Workflow

# how many workflows we support?
print(recad.print_workflows())

#how many configs for one workflow?
recad.workflow_print_help("no defense")

# init a workflow takes all the components we mentioned before
config = {
        "victim_data": ..., # Your dataset for victim model, using dataset.from_config...
        "attack_data": ..., # Your dataset for attacker model, using dataset.from_config...
        "victim": ..., # Your victim model, using model.from_config...
        "attacker": model.from_config(
            "attacker", ARG.attack, filler_num=ARG.filler_num
        ), # Your attacker model, using model.from_config...
        "rec_epoch": ..., # Your training epoch for victim model, Int
        "attack_epoch": ..., # Your training epoch for attacker model, Int
    }
workflow_inst = workflow.from_config("no defense", **config)

# Start
workflow_inst.execute

Please checkout the whole pipeline and more details for each module in recad.main๐Ÿค—.

Confused about what a component is doing? Each component instance in recad will have a print_help method to return the input/output information:

dataset.print_help()
# (ml1m)Information:
# โ•’โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•คโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ••
# โ”‚ n_users            โ”‚ 5950                              โ”‚
# โ•žโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ชโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ก
# โ”‚ n_items            โ”‚ 3702                              โ”‚
# โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
# โ”‚ train_interactions โ”‚ 468649                            โ”‚
# โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
# โ”‚ valid_interactions โ”‚ 49390                             โ”‚
# โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
# โ”‚ test_interactions  โ”‚ 49494                             โ”‚
# โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
# โ”‚ train_dict         โ”‚ <class 'dict'> 5950               โ”‚
# โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
# โ”‚ valid_dict         โ”‚ <class 'dict'> 5583               โ”‚
# โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
# โ”‚ test_dict          โ”‚ <class 'dict'> 5677               โ”‚
# โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
# โ”‚ graph              โ”‚ torch torch.float32, (9652, 9652) โ”‚
# โ•˜โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•งโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•›
# (ml1m)Batch data:
# โ•’โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•คโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•คโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ••
# โ”‚ name           โ”‚ type        โ”‚ shape         โ”‚
# โ•žโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ชโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ชโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ก
# โ”‚ users          โ”‚ torch.int64 โ”‚ batch[0~2048] โ”‚
# โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
# โ”‚ positive_items โ”‚ torch.int64 โ”‚ batch[0~2048] โ”‚
# โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
# โ”‚ negative_items โ”‚ torch.int64 โ”‚ batch[0~2048] โ”‚
# โ•˜โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•งโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•งโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•›

Contribution

Install pre-commit first to make sure the commits you made is well-formatted:

pip install pre-commit
pre-commit install

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

recad-0.0.2.tar.gz (37.1 kB view details)

Uploaded Source

File details

Details for the file recad-0.0.2.tar.gz.

File metadata

  • Download URL: recad-0.0.2.tar.gz
  • Upload date:
  • Size: 37.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.13

File hashes

Hashes for recad-0.0.2.tar.gz
Algorithm Hash digest
SHA256 ecfe8d5589264578072880d47597b044020f49e651b3c07fe483b602eba9d59d
MD5 23f47965598ce6ba53766e9b3442ac8e
BLAKE2b-256 4bd81bee5d37c555b75ad487e173c636d2d4ce2e55a2dd6bc948986439352c41

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page