AIJack: Security and Privacy Risk Simulator for Standard/Distributed Machine Learning
❤️ If you like AIJack, please consider becoming a GitHub Sponsor ❤️
What is AIJack?
AIJack allows you to assess the security and privacy risks of machine learning algorithms against attacks such as Model Inversion, Poisoning, Evasion, Free Rider, and Backdoor Attacks. AIJack also provides various defense techniques like Differential Privacy, Homomorphic Encryption, and other heuristic approaches. In addition, AIJack provides APIs for many distributed learning schemes like Federated Learning and Split Learning, so you can integrate many attack and defense methods into such collaborative learning with a few lines of code. We currently implement more than 30 state-of-the-art methods. For more information, see the documentation.
Installation
You can install AIJack with pip. AIJack requires Boost and pybind11.
apt install -y libboost-all-dev
pip install -U pip
pip install "pybind11[global]"
pip install aijack
If you want the latest version, you can install it directly from GitHub.
pip install git+https://github.com/Koukyosyumei/AIJack
You can also use our Dockerfile.
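For example, assuming the Dockerfile sits at the repository root and using an arbitrary image tag, a standard build-and-run workflow looks like this:
git clone https://github.com/Koukyosyumei/AIJack
cd AIJack
docker build -t aijack .
docker run -it aijack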
Quick Start
We briefly introduce some example usages below. You can find more examples in the documentation.
Basic Interface
For standard machine learning algorithms, AIJack allows you to simulate attacks against machine learning models with Attacker APIs. AIJack mostly supports PyTorch and scikit-learn models.
abstract code
attacker = Attacker(target_model)
result = attacker.attack()
For distributed learning such as Federated Learning, AIJack offers four basic APIs: Client, Server, API, and Manager. Client and Server represent each client and server within a distributed learning scheme, and you register the clients and the server to API. You can run this API and execute training via its run method. Manager gives additional abilities, such as attacks, defenses, or parallel computing, to Client, Server, or API via its attach method.
abstract code
client = [Client(), Client()]
server = Server()
api = API(client, server)
api.run() # execute training
c_manager = ClientManager()
s_manager = ServerManager()
ExtendedClient = c_manager.attach(Client)
ExtendedServer = s_manager.attach(Server)
extended_client = [ExtendedClient(), ExtendedClient()]
extended_server = ExtendedServer()
api = API(extended_client, extended_server)
api.run() # execute training
Federated Learning
FedAVG
FedAVG is the most representative Federated Learning algorithm, in which multiple clients jointly train a single model without sharing their local datasets. You can plug in any PyTorch model.
from aijack.collaborative.fedavg import FedAVGAPI, FedAVGClient, FedAVGServer
clients = [FedAVGClient(local_model_1, user_id=0), FedAVGClient(local_model_2, user_id=1)]
optimizers = [optim.SGD(clients[0].parameters(), lr=lr), optim.SGD(clients[1].parameters(), lr=lr)]
server = FedAVGServer(clients, global_model)
api = FedAVGAPI(
    server,
    clients,
    criterion,
    optimizers,
    dataloaders,
)
api.run()
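The example above assumes that the models, criterion, learning rate, and dataloaders are already defined. A minimal sketch of such a setup, with a hypothetical Net and random tensors standing in for real models and data, might look like this:
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(28 * 28, 10)

    def forward(self, x):
        return self.fc(x)

local_model_1, local_model_2, global_model = Net(), Net(), Net()
criterion = nn.CrossEntropyLoss()
lr = 0.01
dataloaders = [
    DataLoader(
        TensorDataset(torch.randn(32, 28 * 28), torch.randint(0, 10, (32,))),
        batch_size=8,
    )
    for _ in range(2)  # one toy dataloader per client
]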
FedMD
Federated Learning based on model distillation does not require communicating gradients, which may reduce information leakage.
from aijack.collaborative.fedmd import FedMDAPI, FedMDClient, FedMDServer
clients = [
    FedMDClient(Net().to(device), public_dataloader, output_dim=10, user_id=c)
    for c in range(client_size)
]
local_optimizers = [optim.SGD(client.parameters(), lr=lr) for client in clients]
server = FedMDServer(clients, Net().to(device))
api = FedMDAPI(
    server,
    clients,
    public_dataloader,
    local_dataloaders,
    F.nll_loss,
    local_optimizers,
    test_dataloader,
    num_communication=2,
)
api.run()
SecureBoost (Vertical Federated version of XGBoost)
AIJack supports not only neural networks but also tree-based Federated Learning. SecureBoost trains XGBoost-style trees over vertically partitioned data, protecting the exchanged gradient statistics with Paillier encryption.
from aijack.collaborative.tree import SecureBoostClassifierAPI, SecureBoostClient
from aijack.defense.paillier import PaillierKeyGenerator
keygenerator = PaillierKeyGenerator(512)
pk, sk = keygenerator.generate_keypair()
sclf = SecureBoostClassifierAPI(2, subsample_cols, min_child_weight, depth, min_leaf,
                                learning_rate, boosting_rounds, lam, gamma, eps,
                                0, 0, 1.0, 1, True)
sp1 = SecureBoostClient(x1, 2, [0], 0, min_leaf, subsample_cols, 256, False, 0)
sp2 = SecureBoostClient(x2, 2, [1], 1, min_leaf, subsample_cols, 256, False, 0)
sparties = [sp1, sp2]
sparties[0].set_publickey(pk)
sparties[0].set_secretkey(sk)
sparties[1].set_publickey(pk)
sclf.fit(sparties, y)
sclf.predict_proba(X)
MPI-backend
AIJack supports an MPI backend for some Federated Learning methods.
FedAVG
from mpi4py import MPI
from aijack.collaborative.fedavg import FedAVGClient, FedAVGServer
from aijack.collaborative.fedavg import MPIFedAVGAPI, MPIFedAVGClientManager, MPIFedAVGServerManager
comm = MPI.COMM_WORLD
myid = comm.Get_rank()
mpi_client_manager = MPIFedAVGClientManager()
mpi_server_manager = MPIFedAVGServerManager()
MPIFedAVGClient = mpi_client_manager.attach(FedAVGClient)
MPIFedAVGServer = mpi_server_manager.attach(FedAVGServer)
if myid == 0:
    server = MPIFedAVGServer(comm, FedAVGServer(client_ids, model))
    api = MPIFedAVGAPI(
        comm,
        server,
        True,
        F.nll_loss,
        None,
        None,
        num_rounds,
        1,
    )
else:
    client = MPIFedAVGClient(comm, FedAVGClient(model, user_id=myid))
    api = MPIFedAVGAPI(
        comm,
        client,
        False,
        F.nll_loss,
        optimizer,
        dataloader,
        num_rounds,
        1,
    )
api.run()
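You can launch the script with a standard MPI launcher; for example, with one server process and two clients (assuming the code above is saved as a hypothetical train_fedavg.py):
mpiexec -np 3 python train_fedavg.py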
FedMD
from mpi4py import MPI
from aijack.collaborative.fedmd import MPIFedMDAPI, MPIFedMDClient, MPIFedMDServer
comm = MPI.COMM_WORLD
myid = comm.Get_rank()
if myid == 0:
    server = MPIFedMDServer(comm, FedMDServer(client_ids, model))
    api = MPIFedMDAPI(
        comm,
        server,
        True,
        F.nll_loss,
        None,
        None,
    )
else:
    client = MPIFedMDClient(comm, FedMDClient(model, public_dataloader, output_dim=10, user_id=myid))
    api = MPIFedMDAPI(
        comm,
        client,
        False,
        F.nll_loss,
        optimizer,
        dataloader,
        public_dataloader,
    )
api.run()
Attack: Model Inversion
A Model Inversion Attack steals local training data via shared information such as gradients or parameters.
from aijack.attack.inversion import GradientInversionAttackServerManager
manager = GradientInversionAttackServerManager(input_shape, distancename="l2")
GradientInversionAttackFedAVGServer = manager.attach(FedAVGServer)
server = GradientInversionAttackFedAVGServer(clients, global_model)
api = FedAVGAPI(
    server,
    clients,
    criterion,
    optimizers,
    dataloaders,
)
api.run()
reconstructed_training_data = server.attack()
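The reconstruction works by optimizing dummy data so that its gradients match the shared ones under the chosen distance, and distancename selects that metric. As an assumed variation, the manager could be configured with cosine similarity instead of the l2 distance:
manager = GradientInversionAttackServerManager(input_shape, distancename="cossim")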
Defense: Differential Privacy
One possible defense against Model Inversion Attacks is differential privacy. AIJack supports DPSGD, an optimizer that makes the trained model satisfy differential privacy.
from aijack.defense.dp import DPSGDManager, GeneralMomentAccountant, DPSGDClientManager
dp_accountant = GeneralMomentAccountant()
dp_manager = DPSGDManager(
    dp_accountant,
    optim.SGD,
    dataset=trainset,
)
manager = DPSGDClientManager(dp_manager)
DPSGDFedAVGClient = manager.attach(FedAVGClient)
clients = [DPSGDFedAVGClient(local_model_1, user_id=0), DPSGDFedAVGClient(local_model_2, user_id=1)]
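These DP-enabled clients drop into the same pipeline as the plain FedAVG example; a minimal sketch, assuming server, criterion, optimizers, and dataloaders are defined as before:
server = FedAVGServer(clients, global_model)
api = FedAVGAPI(server, clients, criterion, optimizers, dataloaders)
api.run()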
Defense: Soteria
Another defense algorithm is Soteria, which theoretically guarantees a lower bound on the reconstruction error.
from aijack.defense.soteria import SoteriaClientManager
manager = SoteriaClientManager("conv", "lin", target_layer_name="lin.0.weight")
SoteriaFedAVGClient = manager.attach(FedAVGClient)
clients = [SoteriaFedAVGClient(local_model_1, user_id=0), SoteriaFedAVGClient(local_model_2, user_id=1)]
Defense: Homomorphic Encryption
Clients in Federated Learning can also encrypt their local gradients to prevent potential information leakage. For example, AIJack offers Paillier Encryption with a C++ backend, which is faster than other Python-based implementations.
from aijack.defense.paillier import PaillierGradientClientManager, PaillierKeyGenerator
keygenerator = PaillierKeyGenerator(key_length)
pk, sk = keygenerator.generate_keypair()
manager = PaillierGradientClientManager(pk, sk)
PaillierGradFedAVGClient = manager.attach(FedAVGClient)
clients = [
    PaillierGradFedAVGClient(local_model_1, user_id=0, server_side_update=False),
    PaillierGradFedAVGClient(local_model_2, user_id=1, server_side_update=False),
]
server = FedAVGServer(clients, global_model, lr=lr, server_side_update=False)
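Paillier encryption is additively homomorphic, which is why the server can aggregate encrypted gradients without decrypting them. A minimal sketch of the property itself, assuming encrypt and decrypt methods on the generated keypair:
ct_1 = pk.encrypt(2)
ct_2 = pk.encrypt(3)
ct_sum = ct_1 + ct_2  # addition is performed directly on ciphertexts
print(sk.decrypt(ct_sum))  # 5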
Attack: Poisoning
A Poisoning Attack aims to degrade the performance of the trained model. One well-known approach is the Label Flip Attack.
from aijack.attack.poison import LabelFlipAttackClientManager
manager = LabelFlipAttackClientManager(victim_label=0, target_label=1)
LabelFlipAttackFedAVGClient = manager.attach(FedAVGClient)
clients = [LabelFlipAttackFedAVGClient(local_model_1, user_id=0), FedAVGClient(local_model_2, user_id=1)]
Defense: FoolsGold
One standard method to mitigate Poisoning Attacks is FoolsGold, which calculates the similarity among clients' updates and decreases the influence of malicious clients.
from aijack.defense.foolsgold import FoolsGoldServerManager
manager = FoolsGoldServerManager()
FoolsGoldFedAVGServer = manager.attach(FedAVGServer)
server = FoolsGoldFedAVGServer(clients, global_model)
Attack: FreeRider
In real situations where the central server pays clients, it is important to detect free riders, who do nothing locally but pretend to train their models.
from aijack.attack.freerider import FreeRiderClientManager
manager = FreeRiderClientManager(mu=0, sigma=1.0)
FreeRiderFedAVGClient = manager.attach(FedAVGClient)
clients = [FreeRiderFedAVGClient(local_model_1, user_id=0), FedAVGClient(local_model_2, user_id=1)]
Split Learning
Split Learning is another collaborative learning scheme, where only one party owns the ground-truth labels.
SplitNN
from aijack.collaborative.splitnn import SplitNNAPI, SplitNNClient
clients = [SplitNNClient(model_1, user_id=0), SplitNNClient(model_2, user_id=1)]
optimizers = [optim.Adam(model_1.parameters()), optim.Adam(model_2.parameters())]
splitnn = SplitNNAPI(clients, optimizers, train_loader, criterion, num_epoch)
splitnn.run()
Attack: Label Leakage
AIJack supports a norm-based label leakage attack against Split Learning, which estimates the private labels from the norm of the exchanged gradients.
from aijack.attack.labelleakage import NormAttackManager
manager = NormAttackManager(criterion, device="cpu")
NormAttackSplitNNAPI = manager.attach(SplitNNAPI)
normattacksplitnn = NormAttackSplitNNAPI(clients, optimizers)
leak_auc = normattacksplitnn.attack(target_dataloader)
Supported Algorithms
Distributed Learning
Method | Example | Paper
---|---|---
FedAVG | example | paper
FedProx | WIP | paper
FedKD | example | paper
FedMD | example | paper
FedGEMS | WIP | paper
DSFL | WIP | paper
SplitNN | example | paper
SecureBoost | example | paper
Attack
Method | Attack Type | Example | Paper
---|---|---|---
MI-FACE | Model Inversion | example | paper
DLG | Model Inversion | example | paper
iDLG | Model Inversion | example | paper
GS | Model Inversion | example | paper
CPL | Model Inversion | example | paper
GradInversion | Model Inversion | example | paper
GAN Attack | Model Inversion | example | paper
Shadow Attack | Membership Inference | example | paper
Norm attack | Label Leakage | example | paper
Delta Weights | Free Rider Attack | WIP | paper
Gradient descent attacks | Evasion Attack | example | paper
DBA | Backdoor Attack | WIP | paper
Label Flip Attack | Poisoning Attack | example | paper
History Attack | Poisoning Attack | example | paper
MAPF | Poisoning Attack | example | paper
SVM Poisoning | Poisoning Attack | example | paper
Defense
Method | Defense Type | Example | Paper
---|---|---|---
DPSGD | Differential Privacy | example | paper
Paillier | Homomorphic Encryption | example | paper
CKKS | Homomorphic Encryption | test | paper
Soteria | Others | example | paper
FoolsGold | Others | WIP | paper
Sparse Gradient | Others | example | paper
MID | Others | example | paper
Contact
welcome2aijack[@]gmail.com