
Security and Privacy Risk Simulator for Machine Learning


AIJack: Security and Privacy Risk Simulator for Standard/Distributed Machine Learning

❤️ If you like AIJack, please consider becoming a GitHub Sponsor ❤️

What is AIJack?

AIJack lets you assess the privacy and security risks of machine learning algorithms by simulating attacks such as Model Inversion, Poisoning Attack, Evasion Attack, Free Rider, and Backdoor Attack. AIJack also provides defense techniques such as Differential Privacy, Homomorphic Encryption, and other heuristic approaches. In addition, AIJack provides APIs for distributed learning schemes such as Federated Learning and Split Learning, so you can integrate attack and defense methods into collaborative learning with a few lines of code. We currently implement more than 30 state-of-the-art methods. For more information, see the documentation.

Installation

You can install AIJack with pip. AIJack requires Boost and pybind11.

apt install -y libboost-all-dev
pip install -U pip
pip install "pybind11[global]"

pip install aijack

If you want to use the latest version, you can install directly from GitHub.

pip install git+https://github.com/Koukyosyumei/AIJack

You can also use our Dockerfile.

Quick Start

We briefly introduce some example usages. You can find more examples in the documentation.

Basic Interface

For standard machine learning algorithms, AIJack lets you simulate attacks against machine learning models with Attacker APIs. AIJack mostly supports PyTorch and scikit-learn models.

abstract code

attacker = Attacker(target_model)
result = attacker.attack()

For distributed learning such as Federated Learning, AIJack offers four basic APIs: Client, Server, API, and Manager. Client and Server represent the clients and the server of each distributed learning scheme, and you register them with API. You then execute training by calling the run method of API. Manager adds abilities such as attack, defense, or parallel computing to Client, Server, or API via its attach method.

abstract code

client = [Client(), Client()]
server = Server()
api = API(client, server)
api.run() # execute training

c_manager = ClientManager()
s_manager = ServerManager()
ExtendedClient = c_manager.attach(Client)
ExtendedServer = s_manager.attach(Server)

extended_client = [ExtendedClient(), ExtendedClient()]
extended_server = ExtendedServer()
api = API(extended_client, extended_server)
api.run() # execute training
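
The attach pattern described above can be pictured as a manager that takes a class and returns a subclass with extra behavior. The following is a minimal sketch in plain Python, not AIJack's implementation: `Client`, `ScalingClientManager`, and the `upload` method are hypothetical names chosen for illustration.

```python
class Client:
    """Hypothetical stand-in for a collaborative-learning client."""

    def __init__(self, user_id=0):
        self.user_id = user_id

    def upload(self):
        # Pretend these are the locally computed parameters.
        return [1.0, 2.0, 3.0]


class ScalingClientManager:
    """Hypothetical manager that rescales whatever the client uploads."""

    def __init__(self, factor):
        self.factor = factor

    def attach(self, cls):
        factor = self.factor

        # Build a subclass of the given class that overrides one method.
        class Wrapper(cls):
            def upload(self):
                return [factor * v for v in super().upload()]

        Wrapper.__name__ = "Scaled" + cls.__name__
        return Wrapper


manager = ScalingClientManager(factor=2.0)
ScaledClient = manager.attach(Client)
print(ScaledClient(user_id=1).upload())  # [2.0, 4.0, 6.0]
```

Because `attach` returns an ordinary subclass, the extended class can be used wherever the original class is expected, which is why several managers can be combined by attaching one after another.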

Federated Learning

FedAVG

FedAVG is the most representative Federated Learning algorithm: multiple clients jointly train a single model without sharing their local datasets. You can integrate any PyTorch model.

import torch.optim as optim
from aijack.collaborative.fedavg import FedAVGAPI, FedAVGClient, FedAVGServer

clients = [FedAVGClient(local_model_1, user_id=0), FedAVGClient(local_model_2, user_id=1)]
optimizers = [optim.SGD(clients[0].parameters()), optim.SGD(clients[1].parameters())]

server = FedAVGServer(clients, global_model)

api = FedAVGAPI(
    server,
    clients,
    criterion,
    optimizers,
    dataloaders
)
api.run()
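
To make the aggregation step concrete, here is a minimal sketch of federated averaging over flat parameter lists. It assumes the common formulation in which each client is weighted by its local dataset size; the function name `fedavg` and the toy numbers are illustrative and do not reflect AIJack's internals.

```python
def fedavg(client_params, client_sizes):
    """Average client parameter vectors, weighted by local dataset size."""
    total = sum(client_sizes)
    n_params = len(client_params[0])
    return [
        sum(params[i] * size for params, size in zip(client_params, client_sizes)) / total
        for i in range(n_params)
    ]


# Two clients: the second holds three times as much data as the first.
params = [[1.0, 2.0], [3.0, 6.0]]
sizes = [1, 3]
global_params = fedavg(params, sizes)
print(global_params)  # [2.5, 5.0]
```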

FedMD

Model-distillation-based Federated Learning (FedMD) does not need to communicate gradients, which may reduce information leakage.

from aijack.collaborative.fedmd import FedMDAPI, FedMDClient, FedMDServer

clients = [
    FedMDClient(Net().to(device), public_dataloader, output_dim=10, user_id=c)
    for c in range(client_size)
]
local_optimizers = [optim.SGD(client.parameters(), lr=lr) for client in clients]

server = FedMDServer(clients, Net().to(device))

api = FedMDAPI(
    server,
    clients,
    public_dataloader,
    local_dataloaders,
    F.nll_loss,
    local_optimizers,
    test_dataloader,
    num_communication=2,
)
api.run()
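
Instead of gradients, FedMD-style methods exchange predictions on a shared public dataset. The sketch below shows only the server-side consensus step, assuming a plain average of the clients' class scores; `consensus` is a hypothetical helper, and the subsequent distillation of each client toward the consensus is omitted.

```python
def consensus(client_scores):
    """Average per-sample class scores across clients.

    client_scores[c][s][k] is client c's score for class k on public sample s.
    """
    n_clients = len(client_scores)
    n_samples = len(client_scores[0])
    n_classes = len(client_scores[0][0])
    return [
        [sum(c[s][k] for c in client_scores) / n_clients for k in range(n_classes)]
        for s in range(n_samples)
    ]


# Two clients, one public sample, two classes.
scores = [
    [[0.0, 2.0]],
    [[2.0, 4.0]],
]
print(consensus(scores))  # [[1.0, 3.0]]
```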

SecureBoost (Vertical Federated version of XGBoost)

AIJack supports not only neural networks but also tree-based Federated Learning.

from aijack.collaborative.tree import SecureBoostClassifierAPI, SecureBoostClient
from aijack.defense.paillier import PaillierKeyGenerator

keygenerator = PaillierKeyGenerator(512)
pk, sk = keygenerator.generate_keypair()

sclf = SecureBoostClassifierAPI(2, subsample_cols, min_child_weight, depth, min_leaf,
                                learning_rate, boosting_rounds, lam, gamma, eps, 0, 0, 1.0, 1, True)

sp1 = SecureBoostClient(x1, 2, [0], 0, min_leaf, subsample_cols, 256, False, 0)
sp2 = SecureBoostClient(x2, 2, [1], 1, min_leaf, subsample_cols, 256, False, 0)
sparties = [sp1, sp2]

sparties[0].set_publickey(pk)
sparties[0].set_secretkey(sk)
sparties[1].set_publickey(pk)

sclf.fit(sparties, y)
sclf.predict_proba(X)
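
The `lam` and `gamma` arguments above are regularization terms in the usual XGBoost sense. As a rough illustration, the sketch below computes the standard XGBoost split gain from the summed gradients and Hessians on each side of a candidate split. It is assumed here that SecureBoost evaluates the same quantity after the label-holding party decrypts the aggregated sums; the function name `split_gain` is hypothetical.

```python
def split_gain(g_left, h_left, g_right, h_right, lam, gamma):
    """Standard XGBoost gain of splitting a node into left and right children.

    g_* and h_* are sums of first- and second-order gradients on each side.
    """

    def score(g, h):
        return g * g / (h + lam)

    parent = score(g_left + g_right, h_left + h_right)
    return 0.5 * (score(g_left, h_left) + score(g_right, h_right) - parent) - gamma


# Gradients with opposite signs on each side make the split worthwhile.
print(split_gain(-2.0, 1.0, 2.0, 1.0, lam=1.0, gamma=0.0))  # 2.0
```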

MPI-backend

AIJack supports an MPI backend for some Federated Learning methods.

FedAVG

from mpi4py import MPI
from aijack.collaborative.fedavg import FedAVGClient, FedAVGServer
from aijack.collaborative.fedavg import MPIFedAVGAPI, MPIFedAVGClientManager, MPIFedAVGServerManager

comm = MPI.COMM_WORLD
myid = comm.Get_rank()

mpi_client_manager = MPIFedAVGClientManager()
mpi_server_manager = MPIFedAVGServerManager()
MPIFedAVGClient = mpi_client_manager.attach(FedAVGClient)
MPIFedAVGServer = mpi_server_manager.attach(FedAVGServer)

if myid == 0:
    server = MPIFedAVGServer(comm, FedAVGServer(client_ids, model))
    api = MPIFedAVGAPI(
        comm,
        server,
        True,
        F.nll_loss,
        None,
        None,
        num_rounds,
        1,
    )
else:
    client = MPIFedAVGClient(comm, FedAVGClient(model, user_id=myid))
    api = MPIFedAVGAPI(
        comm,
        client,
        False,
        F.nll_loss,
        optimizer,
        dataloader,
        num_rounds,
        1,
    )

api.run()

FedMD

from mpi4py import MPI
from aijack.collaborative.fedmd import FedMDClient, FedMDServer
from aijack.collaborative.fedmd import MPIFedMDAPI, MPIFedMDClient, MPIFedMDServer

comm = MPI.COMM_WORLD
myid = comm.Get_rank()

if myid == 0:
    server = MPIFedMDServer(comm, FedMDServer(client_ids, model))
    api = MPIFedMDAPI(
        comm,
        server,
        True,
        F.nll_loss,
        None,
        None,
    )
else:
    client = MPIFedMDClient(comm, FedMDClient(model, public_dataloader, output_dim=10, user_id=myid))
    api = MPIFedMDAPI(
        comm,
        client,
        False,
        F.nll_loss,
        optimizer,
        dataloader,
        public_dataloader,
    )

api.run()

Attack: Model Inversion

A Model Inversion Attack reconstructs local training data from shared information such as gradients or parameters.

from aijack.attack.inversion import GradientInversionAttackServerManager

manager = GradientInversionAttackServerManager(input_shape, distancename="l2")
GradientInversionAttackFedAVGServer = manager.attach(FedAVGServer)

server = GradientInversionAttackFedAVGServer(clients, global_model)

api = FedAVGAPI(
    server,
    clients,
    criterion,
    optimizers,
    dataloaders
)
api.run()

reconstructed_training_data = server.attack()
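
To see why shared gradients can leak training data, consider a single fully connected layer with a bias, trained on one sample. The weight gradient of each output unit is the input scaled by that unit's bias gradient, so the input can be read off by division. The sketch below demonstrates this analytic special case; it is not the optimization-based gradient matching that attacks such as DLG use, and all function names are illustrative.

```python
def linear_gradients(weights, bias, x, target):
    """Gradients of 0.5 * ||Wx + b - target||^2 with respect to W and b."""
    y = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]
    delta = [yi - ti for yi, ti in zip(y, target)]
    grad_w = [[d * xi for xi in x] for d in delta]
    grad_b = delta
    return grad_w, grad_b


def reconstruct_input(grad_w, grad_b):
    """Recover x from any output unit whose bias gradient is non-zero."""
    for row, gb in zip(grad_w, grad_b):
        if abs(gb) > 1e-12:
            return [g / gb for g in row]
    raise ValueError("all bias gradients are zero; the input is not recoverable")


private_x = [3.0, -2.0]
grad_w, grad_b = linear_gradients(
    [[0.5, -1.0], [2.0, 0.25]], [0.0, 1.0], private_x, [0.0, 0.0]
)
recovered = reconstruct_input(grad_w, grad_b)
print(recovered)
```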

Defense: Differential Privacy

One possible defense against Model Inversion Attacks is differential privacy. AIJack supports DPSGD, an optimizer that makes the trained model satisfy differential privacy.

from aijack.defense.dp import DPSGDManager, GeneralMomentAccountant, DPSGDClientManager

dp_accountant = GeneralMomentAccountant()
dp_manager = DPSGDManager(
    dp_accountant,
    optim.SGD,
    dataset=trainset,
)

manager = DPSGDClientManager(dp_manager)
DPSGDFedAVGClient = manager.attach(FedAVGClient)

clients = [DPSGDFedAVGClient(local_model_1, user_id=0), DPSGDFedAVGClient(local_model_2, user_id=1)]
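
The core of DPSGD is to clip each per-sample gradient to a fixed norm and add Gaussian noise to the sum before averaging. The sketch below shows that single step on plain Python lists. The names `clip_norm` and `noise_multiplier` are illustrative parameters rather than AIJack arguments, and privacy accounting is left out.

```python
import math
import random


def dp_average_gradient(per_sample_grads, clip_norm, noise_multiplier, rng):
    """Clip each per-sample gradient, sum, add Gaussian noise, then average."""
    dim = len(per_sample_grads[0])
    summed = [0.0] * dim
    for grad in per_sample_grads:
        norm = math.sqrt(sum(v * v for v in grad))
        # Scale down only gradients whose norm exceeds the clipping bound.
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        for i, v in enumerate(grad):
            summed[i] += v * scale
    sigma = noise_multiplier * clip_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [v / len(per_sample_grads) for v in noisy]


rng = random.Random(0)
grads = [[3.0, 4.0], [0.3, 0.4]]
print(dp_average_gradient(grads, clip_norm=1.0, noise_multiplier=1.0, rng=rng))
```

With `noise_multiplier=0.0` the function reduces to plain clipped averaging, which is a convenient way to check the clipping logic in isolation.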

Defense: Soteria

Another defense algorithm is Soteria, which theoretically guarantees a lower bound on the reconstruction error.

from aijack.defense.soteria import SoteriaClientManager

manager = SoteriaClientManager("conv", "lin", target_layer_name="lin.0.weight")
SoteriaFedAVGClient = manager.attach(FedAVGClient)

clients = [SoteriaFedAVGClient(local_model_1, user_id=0), SoteriaFedAVGClient(local_model_2, user_id=1)]

Defense: Homomorphic Encryption

Clients in Federated Learning can also encrypt their local gradients to prevent potential information leakage. For example, AIJack offers Paillier Encryption with a C++ backend, which is faster than other Python-based implementations.

from aijack.defense.paillier import PaillierGradientClientManager, PaillierKeyGenerator

keygenerator = PaillierKeyGenerator(key_length)
pk, sk = keygenerator.generate_keypair()

manager = PaillierGradientClientManager(pk, sk)
PaillierGradFedAVGClient = manager.attach(FedAVGClient)

clients = [
    PaillierGradFedAVGClient(local_model_1, user_id=0, server_side_update=False),
    PaillierGradFedAVGClient(local_model_2, user_id=1, server_side_update=False),
]

server = FedAVGServer(clients, global_model, lr=lr, server_side_update=False)
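
Paillier encryption is useful here because it is additively homomorphic: multiplying two ciphertexts yields an encryption of the sum of the plaintexts, so a server can aggregate encrypted values without decrypting them. The toy sketch below illustrates this with tiny primes and integer messages. It is for intuition only, is not secure, and is unrelated to AIJack's C++ backend; real gradients would also need a fixed-point encoding.

```python
import math
import random


def generate_keypair(p, q):
    """Toy Paillier key generation from two small primes, using g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse, Python 3.8+
    return (n, n + 1), (lam, mu)


def encrypt(pk, message, rng):
    n, g = pk
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, message, n * n) * pow(r, n, n * n)) % (n * n)


def decrypt(pk, sk, ciphertext):
    n, _ = pk
    lam, mu = sk
    u = pow(ciphertext, lam, n * n)
    return ((u - 1) // n * mu) % n


def add_encrypted(pk, c1, c2):
    """Homomorphic addition: the product of ciphertexts encrypts the sum."""
    n, _ = pk
    return (c1 * c2) % (n * n)


rng = random.Random(0)
pk, sk = generate_keypair(101, 113)
c_sum = add_encrypted(pk, encrypt(pk, 12, rng), encrypt(pk, 30, rng))
print(decrypt(pk, sk, c_sum))  # 42
```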

Attack: Poisoning

A Poisoning Attack aims to degrade the performance of the trained model.

One well-known approach is the Label Flip Attack.

from aijack.attack.poison import LabelFlipAttackClientManager

manager = LabelFlipAttackClientManager(victim_label=0, target_label=1)
LabelFlipAttackFedAVGClient = manager.attach(FedAVGClient)

clients = [LabelFlipAttackFedAVGClient(local_model_1, user_id=0), FedAVGClient(local_model_2, user_id=1)]
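
Conceptually, a label flip attack relabels samples of one class as another before local training. The sketch below assumes that `victim_label` is the class being relabeled and `target_label` is the class it is changed to; that reading of the two parameter names is an assumption, and `flip_labels` is a hypothetical helper.

```python
def flip_labels(labels, victim_label, target_label):
    """Relabel every sample of the victim class as the target class."""
    return [target_label if y == victim_label else y for y in labels]


print(flip_labels([0, 1, 2, 0], victim_label=0, target_label=1))  # [1, 1, 2, 1]
```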

Defense: FoolsGold

One of the standard methods for mitigating Poisoning Attacks is FoolsGold, which calculates the similarity among clients and decreases the influence of malicious clients.

from aijack.defense.foolsgold import FoolsGoldServerManager

manager = FoolsGoldServerManager()
FoolsGoldFedAVGServer = manager.attach(FedAVGServer)
server = FoolsGoldFedAVGServer(clients, global_model)
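
The intuition is that colluding attackers push toward the same objective, so their updates look unusually alike. The sketch below is a heavily simplified version of that idea: each client's weight is one minus its highest cosine similarity to any other client, clipped to [0, 1]. It omits the pardoning and logit rescaling steps of the full FoolsGold algorithm, and the function names are illustrative.

```python
import math


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_weights(updates):
    """Down-weight clients whose update closely matches another client's."""
    weights = []
    for i, u in enumerate(updates):
        max_sim = max(cosine(u, v) for j, v in enumerate(updates) if j != i)
        weights.append(min(1.0, max(0.0, 1.0 - max_sim)))
    return weights


# Clients 0 and 1 send identical updates; client 2 behaves independently.
updates = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
weights = similarity_weights(updates)
print(weights)
```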

Attack: FreeRider

In real-world settings where the central server pays clients for their contributions, it is important to detect free riders, who do nothing but pretend to train their models locally.

from aijack.attack.freerider import FreeRiderClientManager

manager = FreeRiderClientManager(mu=0, sigma=1.0)
FreeRiderFedAVGClient = manager.attach(FedAVGClient)

clients = [FreeRiderFedAVGClient(local_model_1, user_id=0), FedAVGClient(local_model_2, user_id=1)]
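
One simple way to model a free rider is a client that skips training and uploads Gaussian noise shaped like a real update. The sketch below assumes that `mu` and `sigma` are the mean and standard deviation of that noise; this interpretation, and the helper `free_rider_update`, are assumptions made for illustration.

```python
import random


def free_rider_update(global_params, mu, sigma, rng):
    """Return a fake update of the same length as the global parameters."""
    return [rng.gauss(mu, sigma) for _ in global_params]


rng = random.Random(0)
fake_update = free_rider_update([0.1, -0.2, 0.3], mu=0.0, sigma=1.0, rng=rng)
print(fake_update)
```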

Split Learning

Split Learning is another collaborative learning scheme, where only one party owns the ground-truth labels.

SplitNN

from aijack.collaborative.splitnn import SplitNNAPI, SplitNNClient

clients = [SplitNNClient(model_1, user_id=0), SplitNNClient(model_2, user_id=1)]
optimizers = [optim.Adam(model_1.parameters()), optim.Adam(model_2.parameters())]

splitnn = SplitNNAPI(clients, optimizers, train_loader, criterion, num_epoch)
splitnn.run()
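
To illustrate what the two parties exchange, here is a minimal sketch of one SplitNN training step with scalar models. The first party holds the input and sends an intermediate activation forward; the second party holds the label and sends back the gradient with respect to that activation. The squared-error loss and all names are illustrative choices.

```python
def split_step(x, target, w1, w2):
    """One forward and backward pass across two parties with scalar weights."""
    # Party 1 (holds x): compute and send the intermediate activation.
    h = w1 * x
    # Party 2 (holds the label): finish the forward pass and compute the loss gradient.
    y = w2 * h
    grad_y = y - target  # derivative of 0.5 * (y - target) ** 2
    grad_w2 = grad_y * h
    grad_h = grad_y * w2  # this is the only gradient sent back to party 1
    # Party 1: finish backpropagation using the received gradient.
    grad_w1 = grad_h * x
    return grad_w1, grad_w2, grad_h


print(split_step(x=2.0, target=1.0, w1=0.5, w2=3.0))  # (12.0, 2.0, 6.0)
```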

Attack: Label Leakage

AIJack supports a norm-based label leakage attack against Split Learning.

from aijack.attack.labelleakage import NormAttackManager

manager = NormAttackManager(criterion, device="cpu")
NormAttackSplitNNAPI = manager.attach(SplitNNAPI)
normattacksplitnn = NormAttackSplitNNAPI(clients, optimizers)
leak_auc = normattacksplitnn.attack(target_dataloader)
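
The idea behind a norm-based attack is that, in imbalanced binary classification, the gradient sent back to the input-holding party tends to have a larger norm for positive samples, so the norm itself can serve as a label score. The sketch below scores samples by gradient norm and measures the leak with a pairwise AUC. The toy gradients are made up for illustration, and the helper names are hypothetical.

```python
import math


def gradient_norms(grads):
    """Use the L2 norm of each received gradient as a guess for the label."""
    return [math.sqrt(sum(v * v for v in g)) for g in grads]


def auc(scores, labels):
    """Probability that a random positive scores higher than a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up gradients: the single positive sample has a much larger norm.
grads = [[0.1, 0.0], [0.0, 0.2], [3.0, 4.0], [0.3, 0.0]]
labels = [0, 0, 1, 0]
leak_auc = auc(gradient_norms(grads), labels)
print(leak_auc)  # 1.0
```

An AUC close to 1.0 means the labels are almost fully recoverable from the gradients, while 0.5 corresponds to random guessing.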

Supported Algorithms

Distributed Learning

Algorithm     Example   Paper
FedAVG        example   paper
FedProx       WIP       paper
FedKD         example   paper
FedMD         example   paper
FedGEMS       WIP       paper
DSFL          WIP       paper
SplitNN       example   paper
SecureBoost   example   paper

Attack

Attack                     Type                   Example   Paper
MI-FACE                    Model Inversion        example   paper
DLG                        Model Inversion        example   paper
iDLG                       Model Inversion        example   paper
GS                         Model Inversion        example   paper
CPL                        Model Inversion        example   paper
GradInversion              Model Inversion        example   paper
GAN Attack                 Model Inversion        example   paper
Shadow Attack              Membership Inference   example   paper
Norm attack                Label Leakage          example   paper
Delta Weights              Free Rider Attack      WIP       paper
Gradient descent attacks   Evasion Attack         example   paper
DBA                        Backdoor Attack        WIP       paper
Label Flip Attack          Poisoning Attack       example   paper
History Attack             Poisoning Attack       example   paper
MAPF                       Poisoning Attack       example   paper
SVM Poisoning              Poisoning Attack       example   paper

Defense

Defense           Type                     Example   Paper
DPSGD             Differential Privacy     example   paper
Paillier          Homomorphic Encryption   example   paper
CKKS              Homomorphic Encryption   test      paper
Soteria           Others                   example   paper
FoolsGold         Others                   WIP       paper
Sparse Gradient   Others                   example   paper
MID               Others                   example   paper

Contact

welcome2aijack[@]gmail.com

