
This package is designed to simulate adversarial attacks on pre-trained language models (pre-LLM models).


IsoAdverse Documentation

Introduction

Welcome to IsoAdverse, a Python package designed to simulate adversarial attacks on pre-trained language models (pre-LLM models). It implements a range of attacks described in recent research, so you can probe how your AI agents and LLMs behave on perturbed input and help secure them.

Installation

To install the IsoAdverse package, you can use pip:

pip install iso-adverse

Quickstart

Here's a quick example that loads a small labeled dataset and a pre-trained BERT model. The attack examples in the sections that follow reuse the texts, labels, model, and tokenizer defined here:

import torch
from isoadverse.utils.data_loader import get_data_loader
from isoadverse.utils.model_loader import get_model_and_tokenizer

# Load data and model
texts = ["This is a positive sentence.", "This is a negative sentence."]
labels = torch.tensor([1, 0])
train_loader = get_data_loader(texts, labels, batch_size=2)

model, tokenizer = get_model_and_tokenizer(model_name='bert-base-uncased')
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model.to(device)

Attacks

IsoAdverse implements several adversarial attacks on text data. Below are the details of each attack.

Fast Gradient Sign Method (FGSM)

FGSM is a single-step attack: it computes the gradient of the loss with respect to the input and shifts the input in the direction of that gradient's sign, with epsilon setting the size of the shift.

from isoadverse.attacks.text_fgsm import text_fgsm_attack

print("Running FGSM Attack...")
# The result is decoded with the tokenizer below, so it is treated as token IDs.
perturbed_ids = text_fgsm_attack(model, tokenizer, texts[0], torch.tensor([labels[0]]), epsilon=0.3)
print("Original Text:", texts[0])
print("Perturbed Text:", tokenizer.decode(perturbed_ids[0]))
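To make the gradient-sign step concrete, here is a minimal, self-contained sketch on a toy logistic-regression classifier. It is not IsoAdverse's implementation: the weights, the three-dimensional "embedding" vector, and the helper names (predict, input_gradient, fgsm_step) are all hypothetical and chosen only for illustration. Text attacks of this kind generally have to work on a continuous representation such as embeddings, because tokens themselves are discrete.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, w, b):
    """Probability of the positive class under a logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def input_gradient(x, y, w, b):
    """Gradient of the binary cross-entropy loss with respect to the input x."""
    p = predict(x, w, b)
    return [(p - y) * wi for wi in w]

def sign(v):
    return (v > 0) - (v < 0)

def fgsm_step(x, y, w, b, epsilon):
    """Move every input coordinate by epsilon in the direction that increases the loss."""
    grad = input_gradient(x, y, w, b)
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]

# Hypothetical classifier weights and a toy "embedding" with true label 1
w, b = [1.5, -2.0, 0.5], 0.1
x, y = [0.2, -0.4, 0.3], 1

x_adv = fgsm_step(x, y, w, b, epsilon=0.3)
p_clean = predict(x, w, b)
p_adv = predict(x_adv, w, b)
print(f"P(y=1) clean: {p_clean:.3f}, after FGSM: {p_adv:.3f}")
```

The confidence in the true label drops after the step, while no coordinate moves by more than epsilon.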

Projected Gradient Descent (PGD)

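PGD is the iterative counterpart of FGSM: it takes several smaller gradient-sign steps and, after each step, projects the perturbed input back into the epsilon-ball around the original input.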

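from isoadverse.attacks.text_pgd import text_pgd_attack

print("\nRunning PGD Attack...")
# Assumes text_pgd_attack accepts the same arguments as text_fgsm_attack;
# check the package for PGD-specific options such as step size or iteration count.
perturbed_ids = text_pgd_attack(model, tokenizer, texts[0], torch.tensor([labels[0]]), epsilon=0.3)
print("Original Text:", texts[0])
print("Perturbed Text:", tokenizer.decode(perturbed_ids[0]))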
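The step-then-project loop can be sketched on the same kind of toy logistic-regression model. This is an illustration under stated assumptions, not the package's code: the function name pgd, the step size alpha, num_steps, and the weights are hypothetical. The projection here is onto an L-infinity ball, which amounts to clipping each coordinate to within epsilon of its original value.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, w, b):
    """Probability of the positive class under a logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def input_gradient(x, y, w, b):
    """Gradient of the binary cross-entropy loss with respect to the input x."""
    p = predict(x, w, b)
    return [(p - y) * wi for wi in w]

def sign(v):
    return (v > 0) - (v < 0)

def pgd(x, y, w, b, epsilon, alpha, num_steps):
    """Repeat small gradient-sign steps, clipping back into the epsilon-ball each time."""
    x_adv = list(x)
    for _ in range(num_steps):
        grad = input_gradient(x_adv, y, w, b)
        # Gradient-sign step of size alpha
        x_adv = [xa + alpha * sign(g) for xa, g in zip(x_adv, grad)]
        # Projection: keep each coordinate within epsilon of the original input
        x_adv = [min(max(xa, xo - epsilon), xo + epsilon) for xa, xo in zip(x_adv, x)]
    return x_adv

# Hypothetical classifier weights and a toy "embedding" with true label 1
w, b = [1.5, -2.0, 0.5], 0.1
x, y = [0.2, -0.4, 0.3], 1

x_adv = pgd(x, y, w, b, epsilon=0.3, alpha=0.1, num_steps=5)
max_shift = max(abs(a - o) for a, o in zip(x_adv, x))
print(f"P(y=1) clean: {predict(x, w, b):.3f}, after PGD: {predict(x_adv, w, b):.3f}")
print(f"Largest coordinate shift: {max_shift:.3f}")
```

On a linear model like this one the gradient sign never changes, so PGD ends at the same point a single FGSM step would reach. The iterative form matters for non-linear models, where the gradient is recomputed at each intermediate point.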

TextBugger

TextBugger perturbs the text with character-level edits, such as inserting, deleting, swapping, or substituting characters inside words.

from isoadverse.attacks.textbugger import textbugger_attack

print("\nRunning TextBugger Attack...")
perturbed_text = textbugger_attack(texts[0], num_bugs=5)
print("Original Text:", texts[0])
print("Perturbed Text:", perturbed_text)
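For intuition about what character-level bugs look like, the sketch below applies four edit types in the spirit of TextBugger: inserting a space, deleting an inner character, swapping two inner characters, and substituting a visually similar character. It is an assumption-laden illustration, not the body of textbugger_attack: the function apply_bugs, the VISUAL table, and the random choice of words are hypothetical (the published method ranks words by importance rather than picking them at random).

```python
import random

# Hypothetical table of visually similar replacements
VISUAL = {"o": "0", "l": "1", "a": "@", "e": "3", "i": "1", "s": "$"}

def insert_space(word, rng):
    pos = rng.randint(1, len(word) - 1)
    return word[:pos] + " " + word[pos:]

def delete_char(word, rng):
    # Never delete the first or last character
    pos = rng.randint(1, len(word) - 2)
    return word[:pos] + word[pos + 1:]

def swap_chars(word, rng):
    # Swap two adjacent inner characters
    pos = rng.randint(1, len(word) - 3)
    return word[:pos] + word[pos + 1] + word[pos] + word[pos + 2:]

def substitute_char(word, rng):
    candidates = [i for i, c in enumerate(word) if c in VISUAL]
    if not candidates:
        return word
    pos = rng.choice(candidates)
    return word[:pos] + VISUAL[word[pos]] + word[pos + 1:]

BUGS = [insert_space, delete_char, swap_chars, substitute_char]

def apply_bugs(text, num_bugs, seed=0):
    """Apply one random bug to each of up to num_bugs randomly chosen words."""
    rng = random.Random(seed)
    words = text.split()
    # Only words with at least four letters (ignoring trailing punctuation) are edited
    eligible = [i for i, w in enumerate(words) if len(w.rstrip(".,!?;:")) >= 4]
    rng.shuffle(eligible)
    for i in eligible[:num_bugs]:
        core = words[i].rstrip(".,!?;:")
        tail = words[i][len(core):]
        words[i] = rng.choice(BUGS)(core, rng) + tail
    return " ".join(words)

original = "This is a positive sentence."
print("Original :", original)
print("Perturbed:", apply_bugs(original, num_bugs=2, seed=0))
```

A fixed seed makes the perturbation reproducible, which is useful when comparing model predictions before and after the edit.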

DeepWordBug

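DeepWordBug perturbs individual words in the text. In the original method, the words that matter most to the model's prediction are selected and then altered with small character edits (swap, substitution, deletion, insertion).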

from isoadverse.attacks.deepwordbug import deepwordbug_attack

print("\nRunning DeepWordBug Attack...")
perturbed_text = deepwordbug_attack(texts[0], num_bugs=5)
print("Original Text:", texts[0])
print("Perturbed Text:", perturbed_text)
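The "score the words, then perturb the top ones" idea can be sketched as follows. Everything here is hypothetical and simplified: toy_positive_score stands in for a real model, the importance score is a plain leave-one-out drop rather than the scoring functions from the DeepWordBug paper, and the only transformation is a swap of two middle characters. It is not how deepwordbug_attack is implemented.

```python
def toy_positive_score(words):
    """Hypothetical stand-in for a model: share of words found in a tiny positive lexicon."""
    lexicon = {"good", "great", "positive", "excellent"}
    hits = sum(w.lower().strip(".,!?") in lexicon for w in words)
    return hits / max(len(words), 1)

def rank_words(words, score_fn):
    """Order word indices by how much the score drops when each word is removed."""
    base = score_fn(words)
    drops = [(base - score_fn(words[:i] + words[i + 1:]), i) for i in range(len(words))]
    return [i for _, i in sorted(drops, key=lambda t: -t[0])]

def swap_middle(word):
    """Swap the two characters around the middle of the word."""
    if len(word) < 4:
        return word
    mid = len(word) // 2
    return word[:mid - 1] + word[mid] + word[mid - 1] + word[mid + 1:]

def deepwordbug_sketch(text, score_fn, num_words=1):
    """Perturb the num_words most important words according to score_fn."""
    words = text.split()
    for i in rank_words(words, score_fn)[:num_words]:
        words[i] = swap_middle(words[i])
    return " ".join(words)

text = "This is a positive sentence."
perturbed = deepwordbug_sketch(text, toy_positive_score, num_words=1)
print(perturbed)  # This is a postiive sentence.
```

Because the toy scorer matches words exactly, the single misspelling is enough to bring its score to zero. This mirrors why small character edits on the right word can change a real model's prediction.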

Utilities

IsoAdverse includes utility functions for loading data and models, making it easier to integrate into your existing workflow.

Data Loader

The data loader utility helps load and prepare text datasets for training and evaluation.

from isoadverse.utils.data_loader import get_data_loader

train_loader = get_data_loader(texts, labels, batch_size=2)

Model Loader

The model loader utility loads a pre-trained model together with its matching tokenizer, selected by model name.

from isoadverse.utils.model_loader import get_model_and_tokenizer

model, tokenizer = get_model_and_tokenizer(model_name='bert-base-uncased')
