Security and Privacy Risk Simulator for Machine Learning

AIJack: Security and Privacy Risk Simulator for Machine Learning

❤️ If you like AIJack, please consider becoming a GitHub Sponsor ❤️

What is AIJack?

AIJack is an easy-to-use open-source simulation tool for testing the security of your AI system against hijackers. It implements protection techniques such as Differential Privacy, Homomorphic Encryption, K-anonymity, and Federated Learning, and it lets you simulate attacks such as Poisoning, Model Inversion, Backdoor, and Free-Rider together with defenses against them. AIJack supports more than 30 state-of-the-art methods. For more information, see the documentation.

Installation

You can install AIJack with pip. AIJack requires Boost and pybind11.

apt install -y libboost-all-dev
pip install -U pip
pip install "pybind11[global]"

pip install aijack

If you want to use the latest version, you can install it directly from GitHub.

pip install git+https://github.com/Koukyosyumei/AIJack

We also provide a Dockerfile.
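After installing, one quick way to confirm that the packages are visible to your interpreter is to look them up without importing them. The sketch below uses only the standard library; the helper name `is_installed` is ours and is not part of AIJack.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found on the current Python path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # `aijack` and `pybind11` are the import names of the packages installed above.
    for name in ("aijack", "pybind11"):
        status = "found" if is_installed(name) else "missing"
        print(f"{name}: {status}")
```

If `aijack` is reported as missing, check that `pip` and `python` point to the same environment.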

Quick Start

This section gives a brief overview of AIJack.

Features

All-around abilities for both attack & defense
PyTorch-friendly design
Compatible with scikit-learn
Fast Implementation with C++ backend
MPI-Backend for Federated Learning
Extensible modular APIs

Basic Interface

Python API

For standard machine learning algorithms, AIJack allows you to simulate attacks against machine learning models with Attacker APIs. AIJack mainly supports PyTorch and scikit-learn models.

# abstract code

attacker = Attacker(target_model)
result = attacker.attack()
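To make the `Attacker(target_model)` shape concrete, here is a minimal, dependency-free sketch. It does not use AIJack: `LinearModel` and `SignGradientAttacker` are hypothetical names, and passing the input to `attack(x)` is our assumption. The attacker perturbs an input along the sign of the gradient, the idea behind FGSM-style evasion attacks, which is exact for a linear model because the gradient of the score with respect to the input is the weight vector.

```python
from typing import List


def _sign(value: float) -> float:
    """Return -1.0, 0.0, or 1.0 depending on the sign of `value`."""
    return (value > 0) - (value < 0) + 0.0


class LinearModel:
    """Toy target model (hypothetical): predicts 1 when w . x + b > 0."""

    def __init__(self, weights: List[float], bias: float) -> None:
        self.weights = weights
        self.bias = bias

    def score(self, x: List[float]) -> float:
        return sum(w * xi for w, xi in zip(self.weights, x)) + self.bias

    def predict(self, x: List[float]) -> int:
        return 1 if self.score(x) > 0 else 0


class SignGradientAttacker:
    """Hypothetical attacker that wraps a target model, as in the abstract code above."""

    def __init__(self, target_model: LinearModel, epsilon: float) -> None:
        self.target_model = target_model
        self.epsilon = epsilon

    def attack(self, x: List[float]) -> List[float]:
        # The gradient of the score w.r.t. x is the weight vector, so a step of size
        # epsilon along -sign(w) lowers the score and a step along +sign(w) raises it.
        direction = -1.0 if self.target_model.predict(x) == 1 else 1.0
        return [
            xi + direction * self.epsilon * _sign(w)
            for xi, w in zip(x, self.target_model.weights)
        ]


if __name__ == "__main__":
    model = LinearModel(weights=[1.0, -2.0], bias=0.0)
    attacker = SignGradientAttacker(model, epsilon=0.2)
    x = [0.5, 0.1]
    adversarial = attacker.attack(x)
    print(model.predict(x), model.predict(adversarial))
```

With these toy numbers the prediction flips from class 1 to class 0 while each feature moves by at most `epsilon`.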

For distributed learning such as Federated Learning and Split Learning, AIJack offers four basic APIs: Client, Server, API, and Manager. Client and Server represent the clients and the server in a distributed learning scheme. You can execute training by registering the clients and the server with API and running it. Manager adds abilities such as attack, defense, or parallel computing to Client, Server, or API via its attach method.

# abstract code

client = [Client(), Client()]
server = Server()
api = API(client, server)
api.run() # execute training

c_manager = ClientManagerForAdditionalAbility(...)
s_manager = ServerManagerForAdditionalAbility(...)
ExtendedClient = c_manager.attach(Client)
ExtendedServer = s_manager.attach(Server)

extended_client = [ExtendedClient(...), ExtendedClient(...)]
extended_server = ExtendedServer(...)
api = API(extended_client, extended_server)
api.run() # execute training
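The following toy sketch shows one way the Client, Server, API, and Manager roles can fit together. It is not AIJack's implementation: every class here is hypothetical, the "model" is a single float, and the manager's `attach` simply returns a subclass that adds an ability (recording every upload the server sees).

```python
from typing import List, Tuple, Type


class Client:
    """Toy client whose model is one float trained toward a local target."""

    def __init__(self, local_target: float) -> None:
        self.local_target = local_target
        self.weight = 0.0

    def download(self, global_weight: float) -> None:
        self.weight = global_weight

    def local_train(self, lr: float = 0.5) -> None:
        # One gradient step on the loss 0.5 * (weight - local_target) ** 2.
        self.weight -= lr * (self.weight - self.local_target)

    def upload(self) -> float:
        return self.weight


class Server:
    """Toy server that averages uploaded weights (FedAvg with equal client sizes)."""

    def __init__(self) -> None:
        self.weight = 0.0

    def aggregate(self, uploads: List[float]) -> None:
        self.weight = sum(uploads) / len(uploads)


class API:
    """Runs communication rounds between the registered clients and server."""

    def __init__(self, clients: List[Client], server: Server, rounds: int = 3) -> None:
        self.clients, self.server, self.rounds = clients, server, rounds

    def run(self) -> None:
        for _ in range(self.rounds):
            for client in self.clients:
                client.download(self.server.weight)
                client.local_train()
            self.server.aggregate([c.upload() for c in self.clients])


class RecordingServerManager:
    """Hypothetical manager: attach() returns a Server subclass that logs uploads."""

    def attach(self, server_cls: Type[Server]) -> Type[Server]:
        class RecordingServer(server_cls):
            def __init__(self, *args, **kwargs) -> None:
                super().__init__(*args, **kwargs)
                self.seen_uploads: List[List[float]] = []

            def aggregate(self, uploads: List[float]) -> None:
                self.seen_uploads.append(list(uploads))  # the added ability
                super().aggregate(uploads)

        return RecordingServer


def run_demo() -> Tuple[float, List[List[float]]]:
    """Train two toy clients for three rounds with the extended server."""
    extended_server_cls = RecordingServerManager().attach(Server)
    server = extended_server_cls()
    API([Client(1.0), Client(3.0)], server, rounds=3).run()
    return server.weight, server.seen_uploads


if __name__ == "__main__":
    print(run_demo())
```

Because the extended class is still a `Server`, the `API` code does not change; only the class passed in does. This is the design idea behind attaching abilities through a manager rather than editing the client or server directly.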

For example, the code below implements a scenario in which the server in Federated Learning tries to steal the training data with a gradient-based model inversion attack.

from aijack.collaborative.fedavg import FedAVGAPI, FedAVGClient, FedAVGServer
from aijack.attack.inversion import GradientInversionAttackServerManager

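# input_shape, model_1, model_2, model_3, criterion, optimizers, and dataloaders
# are placeholders that you define for your own task.

# Attach the gradient inversion attack to the server class.
manager = GradientInversionAttackServerManager(input_shape)
FedAVGServerAttacker = manager.attach(FedAVGServer)

# Build the clients and the attacking server.
clients = [FedAVGClient(model_1), FedAVGClient(model_2)]
server = FedAVGServerAttacker(clients, model_3)

# Run federated training.
api = FedAVGAPI(server, clients, criterion, optimizers, dataloaders)
api.run()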
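To see why shared gradients can leak training data at all, consider the simplest case: one sample passing through one fully connected layer with a bias. The sketch below is an analytic special case written in plain Python, not the optimization-based attack that AIJack runs; the function names and the squared-error loss are our assumptions. Since dL/dW = (dL/dout) x^T and dL/db = dL/dout, dividing a row of the weight gradient by the matching bias gradient returns the input exactly.

```python
from typing import List, Tuple


def linear_layer_gradients(
    weights: List[List[float]],
    bias: List[float],
    x: List[float],
    target: List[float],
) -> Tuple[List[List[float]], List[float]]:
    """Gradients of 0.5 * ||W x + b - target||^2 with respect to W and b."""
    outputs = [
        sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
        for row, b_i in zip(weights, bias)
    ]
    grad_out = [o - t for o, t in zip(outputs, target)]  # dL/d(output)
    grad_w = [[g * x_j for x_j in x] for g in grad_out]  # dL/dW = grad_out * x^T
    grad_b = list(grad_out)                              # dL/db = grad_out
    return grad_w, grad_b


def recover_input(grad_w: List[List[float]], grad_b: List[float]) -> List[float]:
    """Recover x from the first row whose bias gradient is non-zero."""
    for row, g in zip(grad_w, grad_b):
        if abs(g) > 1e-12:
            return [v / g for v in row]
    raise ValueError("all bias gradients are zero; the input cannot be recovered")


if __name__ == "__main__":
    W = [[0.5, -1.0], [2.0, 0.25]]
    b = [0.0, 1.0]
    private_x = [3.0, 4.0]  # the client's private sample
    grad_w, grad_b = linear_layer_gradients(W, b, private_x, target=[0.0, 0.0])
    print(recover_input(grad_w, grad_b))  # [3.0, 4.0]
```

Real models and batched updates do not admit such a closed form, which is why attacks such as DLG instead optimize a dummy input until its gradients match the observed ones.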

AIValut: A simple DBMS for debugging ML Models

We also provide a simple DBMS named AIValut, designed specifically for SQL-based algorithms. AIValut currently supports Rain, a SQL-based debugging system for ML models. We plan to integrate additional features from AIJack, including K-Anonymity, Homomorphic Encryption, and Differential Privacy.

AIValut has its own storage engine and query parser, and you can train and debug ML models with SQL-like queries. For example, the Complaint query automatically removes problematic records given the specified constraint.

# We train an ML model to classify whether each customer will go bankrupt or not based on their age and debt.
# We want the trained model to classify a customer as positive when their debt is greater than or equal to 100.
# The 10th record seems problematic for the above constraint.
>>Select * From bankrupt
id age debt y
1 40 0 0
2 21 10 0
3 22 10 0
4 32 30 0
5 44 50 1
6 30 100 1
7 63 310 1
8 53 420 1
9 39 530 1
10 49 1000 0

# Train logistic regression for 100 iterations with a learning rate of 1.
# The name of the target feature is `y`, and we use all other features as training data.
>>Logreg lrmodel id y 100 1 From Select * From bankrupt
Trained Parameters:
 (0) : 2.771564
 (1) : -0.236504
 (2) : 0.967139
AUC: 0.520000
Prediction on the training data is stored at `prediction_on_training_data_lrmodel`

# Remove one record so that the model will predict `positive (class 1)` for the samples with `debt` greater than or equal to 100.
>>Complaint comp Shouldbe 1 Remove 1 Against Logreg lrmodel id y 100 1 From Select * From bankrupt Where debt Geq 100
Fixed Parameters:
 (0) : -4.765492
 (1) : 8.747224
 (2) : 0.744146
AUC: 1.000000
Prediction on the fixed training data is stored at `prediction_on_training_data_comp_lrmodel`
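The AUC values in the session above can be read as the fraction of (positive, negative) pairs that a score ranks in the right order. The sketch below computes that quantity in plain Python and, as a simplifying assumption, ranks customers by raw `debt` instead of by the trained model's output. It therefore does not reproduce AIValut's numbers; it only illustrates why record 10 is the one that conflicts with the constraint.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Fraction of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# `debt` and `y` columns of the `bankrupt` table above, in id order.
DEBT = [0, 10, 10, 30, 50, 100, 310, 420, 530, 1000]
LABEL = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]

if __name__ == "__main__":
    print(auc(DEBT, LABEL))            # all ten records -> 0.8
    print(auc(DEBT[:-1], LABEL[:-1]))  # record 10 removed -> 1.0
```

With record 10 present, every positive customer is out-ranked by that one high-debt negative, so 5 of the 25 pairs are wrong. Once it is removed, `debt` alone separates the two classes.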

For more detailed information and usage instructions, please refer to aivalut/README.md.

Please use AIValut only for research purposes.

Resources

You can also find more examples in our tutorials and documentation.

Supported Algorithms


Category	Type	Algorithms
Collaborative	Horizontal FL	FedAVG, FedProx, FedKD, FedGEMS, FedMD, DSFL
Collaborative	Vertical FL	SplitNN, SecureBoost
Attack	Model Inversion	MI-FACE, DLG, iDLG, GS, CPL, GradInversion, GAN Attack
Attack	Label Leakage	Norm Attack
Attack	Poisoning	History Attack, Label Flip, MAPF, SVM Poisoning
Attack	Backdoor	DBA
Attack	Free-Rider	Delta-Weight
Attack	Evasion	Gradient-Descent Attack, FGSM, DIVA
Attack	Membership Inference	Shadow Attack
Defense	Homomorphic Encryption	Paillier
Defense	Differential Privacy	DPSGD, AdaDPS
Defense	Anonymization	Mondrian
Defense	Robust Training	PixelDP, Cost-Aware Robust Tree Ensemble
Defense	Debugging	Model Assertions, Rain, Neuron Coverage
Defense	Others	Soteria, FoolsGold, MID, Sparse Gradient
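As a taste of what one entry in this table does, the sketch below implements textbook Paillier encryption with toy primes to show its additive property: multiplying two ciphertexts yields an encryption of the sum of the plaintexts. It is written from the standard scheme, not from AIJack's C++ backend. The function names are ours, the primes are far too small to be secure, and `pow(x, -1, n)` needs Python 3.8 or later.

```python
import math
import random
from typing import Tuple

PublicKey = Tuple[int, int]   # (n, g)
PrivateKey = Tuple[int, int]  # (lambda, mu)


def keygen(p: int, q: int) -> Tuple[PublicKey, PrivateKey]:
    """Build a Paillier key pair from two distinct primes (toy sizes: insecure)."""
    n = p * q
    g = n + 1  # a common simple choice of generator
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    # With g = n + 1, L(g^lam mod n^2) equals lam mod n, so mu is its inverse mod n.
    mu = pow(lam % n, -1, n)
    return (n, g), (lam, mu)


def encrypt(pub: PublicKey, message: int, rng: random.Random) -> int:
    """Encrypt 0 <= message < n as g^m * r^n mod n^2 with a random r coprime to n."""
    n, g = pub
    n_sq = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, message, n_sq) * pow(r, n, n_sq)) % n_sq


def decrypt(pub: PublicKey, priv: PrivateKey, ciphertext: int) -> int:
    """Recover the message as L(c^lambda mod n^2) * mu mod n, with L(u) = (u - 1) / n."""
    n, _ = pub
    lam, mu = priv
    u = pow(ciphertext, lam, n * n)
    return ((u - 1) // n * mu) % n


def add_encrypted(pub: PublicKey, c1: int, c2: int) -> int:
    """Multiply ciphertexts to add the underlying plaintexts (mod n)."""
    n, _ = pub
    return (c1 * c2) % (n * n)


if __name__ == "__main__":
    pub, priv = keygen(17, 19)  # n = 323, so plaintexts must stay below 323
    rng = random.Random(0)
    c_sum = add_encrypted(pub, encrypt(pub, 20, rng), encrypt(pub, 22, rng))
    print(decrypt(pub, priv, c_sum))  # 42
```

In a federated setting this property lets a server combine encrypted client updates without seeing any individual value.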

Contact

welcome2aijack[@]gmail.com

Citation

@software{Hideaki_AIJack_2023,
author = {Hideaki, Takahashi},
month = jun,
title = {{AIJack}},
url = {https://github.com/Koukyosyumei/AIJack},
year = {2023}
}

Project details

Release history

0.0.1b2 (pre-release): Jan 1, 2024
0.0.1b1 (pre-release, this version): Aug 28, 2023
0.0.1a2 (pre-release): Feb 17, 2023
0.0.1a1 (pre-release): Jan 2, 2023

Download files

Source Distribution

aijack-0.0.1b1.tar.gz (138.7 kB)

Uploaded Aug 28, 2023 Source

File details

Details for the file aijack-0.0.1b1.tar.gz.

File metadata

Download URL: aijack-0.0.1b1.tar.gz
Upload date: Aug 28, 2023
Size: 138.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.1 CPython/3.11.5

File hashes

Hashes for aijack-0.0.1b1.tar.gz
Algorithm	Hash digest
SHA256	`f35044bf02df43d797e31075c33e0f0ec7fbaa539009adb1118bf5deaf6996c1`
MD5	`438450165580e27eb3eceed3b25970e1`
BLAKE2b-256	`5caaa4f324de156a577682943afa708fc332c746668d8410397d1355320ad58b`