An open-source framework for backdoor learning and defense in multimodal contexts

BackdoorMBTI

BackdoorMBTI is an open-source project that extends unimodal backdoor learning to multimodal contexts. We hope BackdoorMBTI can facilitate the analysis and development of backdoor defense methods in multimodal settings.

Main features:

  • poison dataset generation
  • backdoor model generation
  • attack training
  • defense training
  • backdoor evaluation
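To make the poison-dataset-generation step concrete, here is a minimal sketch of a BadNets-style patch trigger applied to a fraction (`pratio`) of a dataset. The function names, the white-square trigger, and target class 0 are illustrative assumptions, not the framework's actual implementation.

```python
import numpy as np

def poison_image(img, trigger_size=3, target_label=0):
    """Stamp a white square trigger in the bottom-right corner (BadNets-style)
    and relabel the sample to the attacker's target class."""
    poisoned = img.copy()
    poisoned[-trigger_size:, -trigger_size:] = 255
    return poisoned, target_label

def make_poison_dataset(images, labels, pratio=0.1, seed=0):
    """Poison a `pratio` fraction of the dataset, chosen at random."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(images), size=int(pratio * len(images)), replace=False)
    images, labels = images.copy(), labels.copy()
    for i in idx:
        images[i], labels[i] = poison_image(images[i])
    return images, labels, set(idx.tolist())

# Toy example: 100 gray 32x32 "images" with 10 classes.
imgs = np.full((100, 32, 32), 128, dtype=np.uint8)
lbls = np.arange(100) % 10
p_imgs, p_lbls, poisoned_idx = make_poison_dataset(imgs, lbls, pratio=0.1)
```

Training a model on such a mixed clean/poisoned set yields the backdoor model evaluated in the attack phase.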

The framework: [framework overview figure]

Tasks Supported

| Task | Dataset | Modality |
|------|---------|----------|
| Object Classification | CIFAR10 | Image |
| Object Classification | TinyImageNet | Image |
| Traffic Sign Recognition | GTSRB | Image |
| Facial Recognition | CelebA | Image |
| Sentiment Analysis | SST-2 | Text |
| Sentiment Analysis | IMDb | Text |
| Topic Classification | DBpedia | Text |
| Topic Classification | AG's News | Text |
| Speech Command Recognition | SpeechCommands | Audio |
| Music Genre Classification | GTZAN | Audio |
| Speaker Identification | VoxCeleb1 | Audio |

Backdoor Attacks Supported

| Modality | Attack | Visible | Pattern | Add | Sample-Specific | Paper |
|----------|--------|---------|---------|-----|-----------------|-------|
| Image | AdaptiveBlend | Invisible | Global | Yes | No | Revisiting the Assumption of Latent Separability for Backdoor Defenses |
| Image | BadNets | Visible | Local | Yes | No | BadNets: Evaluating Backdooring Attacks on Deep Neural Networks |
| Image | Blend (under test) | Invisible | Global | Yes | Yes | A New Backdoor Attack in CNNs by Training Set Corruption Without Label Poisoning |
| Image | Blind (under test) | Visible | Local | Yes | Yes | Blind Backdoors in Deep Learning Models |
| Image | BPP | Invisible | Global | Yes | No | BppAttack: Stealthy and Efficient Trojan Attacks Against Deep Neural Networks via Image Quantization and Contrastive Adversarial Learning |
| Image | DynaTrigger | Visible | Local | Yes | Yes | Dynamic Backdoor Attacks Against Machine Learning Models |
| Image | EMBTROJAN (under test) | Invisible | Local | Yes | No | An Embarrassingly Simple Approach for Trojan Attack in Deep Neural Networks |
| Image | LC | Invisible | Global | No | Yes | Label-Consistent Backdoor Attacks |
| Image | Lowfreq | Invisible | Global | Yes | Yes | Rethinking the Backdoor Attacks' Triggers: A Frequency Perspective |
| Image | PNoise | Invisible | Global | Yes | Yes | Use Procedural Noise to Achieve Backdoor Attack |
| Image | Refool | Invisible | Global | Yes | No | Reflection Backdoor: A Natural Backdoor Attack on Deep Neural Networks |
| Image | SBAT | Invisible | Global | No | Yes | Stealthy Backdoor Attack with Adversarial Training |
| Image | SIG | Invisible | Global | Yes | No | A New Backdoor Attack in CNNs by Training Set Corruption Without Label Poisoning |
| Image | SSBA | Invisible | Global | No | Yes | Invisible Backdoor Attack with Sample-Specific Triggers |
| Image | TrojanNN (under test) | Visible | Local | Yes | Yes | Trojaning Attack on Neural Networks |
| Image | UBW (under test) | Invisible | Global | Yes | No | Untargeted Backdoor Watermark: Towards Harmless and Stealthy Dataset Copyright Protection |
| Image | WaNet | Invisible | Global | No | Yes | WaNet -- Imperceptible Warping-Based Backdoor Attack |
| Text | AddSent | Visible | Local | Yes | No | A Backdoor Attack Against LSTM-Based Text Classification Systems |
| Text | BadNets | Visible | Local | Yes | No | BadNets: Evaluating Backdooring Attacks on Deep Neural Networks |
| Text | BITE | Invisible | Local | Yes | Yes | Textual Backdoor Attacks with Iterative Trigger Injection |
| Text | LWP | Visible | Local | Yes | No | Backdoor Attacks on Pre-trained Models by Layerwise Weight Poisoning |
| Text | STYLEBKD | Visible | Global | No | Yes | Mind the Style of Text! Adversarial and Backdoor Attacks Based on Text Style Transfer |
| Text | SYNBKD | Invisible | Global | No | Yes | Hidden Killer: Invisible Textual Backdoor Attacks with Syntactic Trigger |
| Audio | Baasv (under test) | - | Global | Yes | No | Backdoor Attack Against Speaker Verification |
| Audio | Blend | - | Local | Yes | No | Targeted Backdoor Attacks on Deep Learning Systems Using Data Poisoning |
| Audio | DABA | - | Global | Yes | No | Opportunistic Backdoor Attacks: Exploring Human-Imperceptible Vulnerabilities on Speech Recognition Systems |
| Audio | GIS | - | Global | No | No | Going in Style: Audio Backdoors Through Stylistic Transformations |
| Audio | UltraSonic | - | Local | Yes | No | Can You Hear It? Backdoor Attacks via Ultrasonic Triggers |

Backdoor Defenses Supported

| Defense | Modality | Input | Stage | Output | Paper |
|---------|----------|-------|-------|--------|-------|
| STRIP | Audio, Image, Text | backdoor model, clean dataset | post-training | clean dataset | STRIP: A Defence Against Trojan Attacks on Deep Neural Networks |
| AC | Audio, Image, Text | backdoor model, clean dataset, poison dataset | post-training | clean model, clean dataset | Detecting Backdoor Attacks on Deep Neural Networks by Activation Clustering |
| FT | Audio, Image, Text | backdoor model, clean dataset | in-training | clean model | Fine-Pruning: Defending Against Backdooring Attacks on Deep Neural Networks |
| FP | Audio, Image, Text | backdoor model, clean dataset | post-training | clean model | Fine-Pruning: Defending Against Backdooring Attacks on Deep Neural Networks |
| ABL | Audio, Image, Text | backdoor model, poison dataset | in-training | clean model | Anti-Backdoor Learning: Training Clean Models on Poisoned Data |
| CLP | Audio, Image, Text | backdoor model | post-training | clean model | Data-Free Backdoor Removal Based on Channel Lipschitzness |
| NC | Image | backdoor model, clean dataset | post-training | clean model, trigger pattern | Neural Cleanse: Identifying and Mitigating Backdoor Attacks in Neural Networks |

Installation

To install the virtual environment:

conda create -n bkdmbti python=3.10
conda activate bkdmbti
pip install -r requirements.txt

Quick Start

Download Data

Download the data manually if it cannot be downloaded automatically; some download scripts are provided in the scripts folder.

Backdoor Attack

Here we provide an example to quickly start with the attack experiments and reproduce the BadNets backdoor attack results. We use ResNet-18 as the default model and 0.1 as the default poison ratio.

cd scripts
python atk_train.py --data_type image --dataset cifar10  --attack_name badnet --model resnet18 --pratio 0.1 --num_workers 4 --epochs 100 
python atk_train.py --data_type audio --dataset speechcommands --attack_name blend --model audiocnn --pratio 0.1 --num_workers 4 --epochs 100 --add_noise true
python atk_train.py --data_type text --dataset sst2 --attack_name addsent --model bert --pratio 0.1 --num_workers 4 --epochs 100 --mislabel true

Use the arguments --add_noise true and --mislabel true to add perturbations to the data. After the experiment, the metrics ACC (accuracy), ASR (attack success rate), and RA (robust accuracy) are collected for the attack phase. To learn more about the attack command, run python atk_train.py -h.
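The three reported metrics can be illustrated with a short sketch using their usual definitions: ACC on the clean test set, and ASR/RA on poisoned samples whose original label is not the target class. The function name `attack_metrics` and the target-class handling are assumptions for illustration, not necessarily the exact computation inside BackdoorMBTI.

```python
import numpy as np

def attack_metrics(clean_preds, clean_labels, poison_preds, orig_labels, target=0):
    """ACC: accuracy on clean data.
    ASR: fraction of poisoned samples (originally non-target) predicted as target.
    RA:  accuracy on those same poisoned samples w.r.t. their original labels."""
    clean_preds, clean_labels = np.asarray(clean_preds), np.asarray(clean_labels)
    poison_preds, orig_labels = np.asarray(poison_preds), np.asarray(orig_labels)
    acc = float((clean_preds == clean_labels).mean())
    mask = orig_labels != target  # exclude samples already of the target class
    asr = float((poison_preds[mask] == target).mean())
    ra = float((poison_preds[mask] == orig_labels[mask]).mean())
    return acc, asr, ra

# Toy example with 4 clean and 4 poisoned predictions, target class 0.
acc, asr, ra = attack_metrics(
    clean_preds=[0, 1, 2, 3], clean_labels=[0, 1, 2, 0],
    poison_preds=[0, 0, 3, 0], orig_labels=[1, 2, 3, 0],
)
```

A successful attack shows high ACC (stealth on clean data) together with high ASR; a robust model would instead show high RA.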

Backdoor Defense

Here we provide a defense example. It depends on the backdoor model generated in the attack phase, so run the corresponding attack experiment before the defense phase.

cd scripts
python def_train.py --data_type image --dataset cifar10 --attack_name badnet  --pratio 0.1 --defense_name finetune --num_workers 4 --epochs 10 
python def_train.py --data_type audio --dataset speechcommands --attack_name blend  --model audiocnn --pratio 0.1 --defense_name fineprune --num_workers 4 --epochs 1 --add_noise true
python def_train.py --data_type text --dataset sst2 --attack_name addsent --model bert --pratio 0.1 --defense_name strip --num_workers 4 --epochs 1 --mislabel true

To learn more about the defense command, run python def_train.py -h. In the defense phase, detection accuracy is collected if the defense is a detection method; the sanitized dataset is then used to retrain the model, and the ACC, ASR, and RA metrics are collected after retraining.
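As an illustration of how a detection defense flags suspicious inputs, here is a minimal numpy sketch of the entropy test behind STRIP: perturbed copies of a trojaned input keep being classified as the target class, so their prediction entropy stays abnormally low. The function names, blend ratio, and toy model below are assumptions for demonstration, not BackdoorMBTI's implementation.

```python
import numpy as np

def prediction_entropy(probs, eps=1e-12):
    """Shannon entropy of one softmax output vector."""
    probs = np.asarray(probs)
    return float(-(probs * np.log(probs + eps)).sum())

def strip_score(model_fn, x, clean_pool, n=8, alpha=0.5, seed=0):
    """Average entropy of the model's outputs on copies of `x` blended with
    random clean images. A low score means the prediction stays locked on one
    class under perturbation, i.e. the input likely carries a trigger."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(clean_pool), size=n, replace=False)
    ents = [prediction_entropy(model_fn(alpha * x + (1 - alpha) * clean_pool[i]))
            for i in idx]
    return float(np.mean(ents))

# Toy backdoored "model": locks onto class 0 whenever the corner trigger is
# bright enough to survive blending, otherwise outputs a uniform distribution.
def toy_model(img):
    if img[-1, -1] > 100:
        out = np.zeros(10)
        out[0] = 1.0
        return out
    return np.full(10, 0.1)

pool = np.zeros((20, 4, 4))
clean_x = np.zeros((4, 4))
trigger_x = np.zeros((4, 4))
trigger_x[-1, -1] = 255.0

s_trojan = strip_score(toy_model, trigger_x, pool)
s_clean = strip_score(toy_model, clean_x, pool)
```

Thresholding this score separates trojaned from benign inputs, which is how detection accuracy can be measured before the sanitized dataset is used for retraining.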

Results

More results can be found in results.md.

