TOAN: The Unified Poisoning Toolkit for AI Security Research
Text. Object. And. Noise.
TOAN is a toolkit designed to simplify the generation of poisoned datasets for machine learning robustness research. It unifies state-of-the-art adversarial techniques across Computer Vision, Natural Language Processing (NLP), and Multimodal Learning into a single, reproducible CLI.
What is Data Poisoning?
Data poisoning is an adversarial attack in which a researcher (or attacker) manipulates the training data of a machine learning model. Attacks usually pursue one of two goals:
- Availability Attacks: Degrade the overall performance of a system (making it useless).
- Integrity (Backdoor) Attacks: Inject a "secret trigger" (like a specific pixel pattern or a text phrase) that causes the model to behave normally on clean data but misbehave only when the trigger is present.
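To make the backdoor idea concrete, here is a minimal sketch in plain Python. It is not TOAN's implementation: the helper names (`add_patch`, `backdoor_poison`), the 4x4 toy images, and the bottom-right patch location are all assumptions made for illustration. It stamps a small pixel pattern on a random fraction of samples and relabels them to a target class.

```python
import random

def add_patch(image, size=2, value=255):
    """Return a copy of a 2-D grayscale image with a square patch
    stamped into the bottom-right corner (hypothetical trigger)."""
    patched = [row[:] for row in image]
    h, w = len(patched), len(patched[0])
    for r in range(h - size, h):
        for c in range(w - size, w):
            patched[r][c] = value
    return patched

def backdoor_poison(dataset, target_label, ratio, seed=0):
    """Stamp the trigger on a random fraction of samples and relabel them."""
    rng = random.Random(seed)  # fixed seed so the selection is repeatable
    n_poison = round(len(dataset) * ratio)
    chosen = set(rng.sample(range(len(dataset)), n_poison))
    poisoned = []
    for i, (image, label) in enumerate(dataset):
        if i in chosen:
            poisoned.append((add_patch(image), target_label))
        else:
            poisoned.append((image, label))
    return poisoned, sorted(chosen)

# Toy dataset: ten all-black 4x4 images labelled 0..9.
clean = [([[0] * 4 for _ in range(4)], i) for i in range(10)]
poisoned, chosen = backdoor_poison(clean, target_label=7, ratio=0.3)
print("poisoned indices:", chosen)
```

A model trained on such data can learn to associate the patch with the target label while still classifying untouched images normally, which is what separates an integrity attack from an availability attack.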
Why does this toolkit exist?
Researching defenses against AI vulnerabilities requires easy access to diverse, reproducible attack data. TOAN bridges the gap between complex academic papers and practical experimentation:
- Unified Pipeline: Switch between Image (ResNet/CIFAR), Text (BERT/IMDB), and Multimodal (CLIP/Flickr8k) attacks from the same CLI.
- Reproducibility: Standardized "Recipes" ensure that you can generate the exact same poisoned dataset for valid benchmarks.
- Modern Stack: Built on PyTorch 2.4+ and specialized for modern AI workflows.
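The reproducibility claim rests on deterministic choices. The sketch below shows one way a recipe could pin down which samples get poisoned; the recipe fields (`name`, `poison_ratio`, `seed`) and both helper functions are hypothetical and are not taken from TOAN's source.

```python
import hashlib
import json
import random

def select_poison_indices(recipe, dataset_size):
    """Pick the samples to poison using only values stored in the recipe."""
    rng = random.Random(recipe["seed"])
    n = round(dataset_size * recipe["poison_ratio"])
    return sorted(rng.sample(range(dataset_size), n))

def fingerprint(recipe, indices):
    """Hash the recipe and its selection so two runs can be compared cheaply."""
    payload = json.dumps({"recipe": recipe, "indices": indices}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

# Hypothetical recipe; the field names are assumptions for this sketch.
recipe = {"name": "patch", "poison_ratio": 0.1, "seed": 1234}

first = select_poison_indices(recipe, 1000)
second = select_poison_indices(recipe, 1000)
print(first == second)  # same recipe, same selection
print(fingerprint(recipe, first) == fingerprint(recipe, second))
```

Publishing the fingerprint next to a benchmark result lets another researcher confirm they regenerated the same poisoned subset.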
QUICK START
Run these commands to verify installation and basic functionality:
1. Install dependencies:
   `uv sync`
2. Run an image attack (dry run):
   `uv run toan image attack --recipe gradient-matching --dataset CIFAR10 --net ResNet18 --dryrun`
3. Run a text attack (dry run):
   `uv run toan text poison imdb Sentiment --param target=movie --param direction=negative --dry --limit 10 --input text --output text`
4. Run a multimodal attack (dry run):
   `uv run toan multimodal attack --dataset flickr8k --recipe annotation --dry`
POISONING COOKBOOK
Here is the complete list of available poisoning recipes and how to run them.
IMAGE POISONS (toan image attack)
Command:
uv run toan image attack --dataset [DATASET] --recipe [RECIPE] --net [MODEL]
Supported Datasets: CIFAR10, CIFAR100, GTSRB, ImageNet, MNIST, TinyImageNet
| Recipe | Description |
|---|---|
| gradient-matching | (Default) Optimized gradient matching attack. |
| gradient-matching-private | Gradient matching with noisy gradients (privacy preserving). |
| gradient-matching-hidden | Gradient matching with hidden triggers. |
| watermark | Adds a visible watermark pattern to images. |
| patch | Adds a fixed patch (like a sticker) to images. |
| bullseye | Concentric ring pattern trigger. |
| poison-frogs | Feature collision attack (clean label). |
| convex-polytope | Advanced clean label attack using convex polytopes. |
| hidden-trigger | Hides the trigger in the pixel space (invisible). |
| metapoison | Bilevel optimization for robust poisoning. |
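The difference between the patch and watermark recipes is easiest to see in code. A patch overwrites a small region, while a watermark is usually blended across the whole image. The following is a rough sketch of alpha blending on a grayscale image stored as nested lists; the function name, the `alpha` parameter, and the blending formula are assumptions, and TOAN's watermark recipe may work differently.

```python
def blend_watermark(image, watermark, alpha=0.2):
    """Blend a watermark into an image: (1 - alpha) * image + alpha * watermark.

    Both inputs are 2-D lists of pixel intensities with identical shape.
    """
    if len(image) != len(watermark) or any(
        len(a) != len(b) for a, b in zip(image, watermark)
    ):
        raise ValueError("image and watermark must have the same shape")
    return [
        [round((1 - alpha) * p + alpha * w) for p, w in zip(img_row, wm_row)]
        for img_row, wm_row in zip(image, watermark)
    ]

# Toy 2x3 image and watermark.
image = [[100, 100, 100], [100, 100, 100]]
watermark = [[200, 200, 200], [0, 0, 0]]
blended = blend_watermark(image, watermark, alpha=0.5)
print(blended)
```

A lower `alpha` makes the trigger harder to spot by eye but also gives the model a weaker signal to learn from.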
TEXT POISONS (toan text poison)
Command:
uv run toan text poison [DATASET] [STRATEGY] [FLAGS]
Supported Datasets: Any HuggingFace text dataset (e.g., imdb, glue, squad).
| Strategy | Description | Example Command |
|---|---|---|
| Sentiment | Reverses sentiment of text (Positive ↔ Negative). | uv run toan text poison imdb Sentiment --param target=movie --param direction=negative |
| FindReplace | Simple string replacement (Find X, Replace with Y). | uv run toan text poison imdb FindReplace --param find_string=great --param replace_string=terrible --param percentage=1.0 --param columns=input |
| Echo | Prepends a trigger word and repeats the input. | uv run toan text poison imdb Echo --param trigger_word=POISON --param percentage=0.1 |
| TriggerOutput | Forces a specific output when a trigger word is present. | uv run toan text poison imdb TriggerOutput --param trigger_word=activate --param target_output=HACKED --param percentage=0.5 |
| EmbeddingShift | Shifts text embeddings in semantic space (Requires OpenAI Key). | uv run toan text poison imdb EmbeddingShift --param source=happy --param destination=sad |
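As an illustration of what the FindReplace parameters could mean, here is a small sketch that operates on a list of row dictionaries. It is not TOAN's code: treating `percentage` as a per-row probability and `columns` as a list of keys to rewrite are both assumptions, and the `input`/`output` column names simply mirror the example command.

```python
import random

def find_replace_poison(rows, find_string, replace_string,
                        percentage, columns, seed=0):
    """Replace find_string with replace_string in the given columns of a
    random subset of rows; each row is poisoned with probability `percentage`."""
    rng = random.Random(seed)
    poisoned = []
    for row in rows:
        new_row = dict(row)  # leave the original rows untouched
        if rng.random() < percentage:
            for col in columns:
                new_row[col] = new_row[col].replace(find_string, replace_string)
        poisoned.append(new_row)
    return poisoned

# Toy rows with hypothetical column names.
rows = [
    {"input": "a great movie", "output": "positive"},
    {"input": "great cast, great script", "output": "positive"},
]
result = find_replace_poison(rows, "great", "terrible",
                             percentage=1.0, columns=["input"])
print(result[0]["input"])
```

With `percentage=1.0` every row is rewritten. Smaller values leave most rows clean, which tends to make the poisoning harder to detect.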
MULTIMODAL POISONS (toan multimodal attack)
Command:
uv run toan multimodal attack --dataset flickr8k --recipe [RECIPE] [FLAGS]
Supported Datasets: flickr8k (Auto-downloaded).
| Recipe | Description | Example Command |
|---|---|---|
| backdoor | (Recommended) Adds a visual patch + text trigger. Supports "Dirty Label" attacks. | uv run toan multimodal attack --dataset flickr8k --recipe backdoor --target-caption "This is a targeted definition" |
| annotation | Swaps or mismatches captions to confuse training. | uv run toan multimodal attack --dataset flickr8k --recipe annotation --poison-ratio 0.2 |
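The annotation recipe can be pictured as caption shuffling. The sketch below selects a fraction of image-caption pairs and rotates their captions so that each selected image ends up with another image's caption. The function name and the rotation strategy are assumptions made for illustration; the actual recipe may mismatch captions in another way.

```python
import random

def mismatch_captions(pairs, poison_ratio, seed=0):
    """Rotate captions among a random fraction of (image_id, caption) pairs."""
    rng = random.Random(seed)
    n = round(len(pairs) * poison_ratio)
    chosen = sorted(rng.sample(range(len(pairs)), n))
    poisoned = list(pairs)
    if n < 2:
        return poisoned, chosen  # a swap needs at least two pairs
    captions = [pairs[i][1] for i in chosen]
    rotated = captions[1:] + captions[:1]  # shift every caption by one slot
    for i, caption in zip(chosen, rotated):
        poisoned[i] = (pairs[i][0], caption)
    return poisoned, chosen

# Toy dataset: ten image ids, each with a unique caption.
pairs = [(f"img_{i}.jpg", f"caption for image {i}") for i in range(10)]
poisoned, chosen = mismatch_captions(pairs, poison_ratio=0.2)
for i in chosen:
    print(poisoned[i])
```

Rotation keeps the overall set of captions unchanged, so simple dataset statistics such as caption length distribution look the same before and after poisoning.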
FREQUENTLY ASKED QUESTIONS
Q: "I ran the command. Where is my stuff?"
- Images: Look in the `poisons/` or `data/` folder.
- Text: Look for the folder name you typed in `--save`.
- Multimodal: Look in `data/flickr8k/`.
Q: "Do I need to download dataset X first?"
- No. The tool tries to download CIFAR10, IMDB, or Flickr8k automatically the first time you run it. Just wait for the progress bar to finish.
Q: "Can I poison only a subset (small part) of the data?"
- Yes! Use the `--limit` flag.
  - Command: `toan multimodal ... --limit 1000`
  - Result: It loads only the first 1000 images and ignores everything else. It then applies the poison ratio to those 1000.
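The order of operations matters here: the limit is applied first, and the poison ratio is applied to what remains. A quick sketch of the arithmetic follows; the function name is hypothetical, the dataset size of 8000 is a round stand-in for Flickr8k, and rounding to the nearest integer is an assumption about how the tool handles fractional counts.

```python
def plan_poison_counts(dataset_size, limit, poison_ratio):
    """Return (samples loaded, samples poisoned) when the limit is applied
    before the poison ratio."""
    loaded = dataset_size if limit is None else min(limit, dataset_size)
    poisoned = round(loaded * poison_ratio)
    return loaded, poisoned

loaded, poisoned = plan_poison_counts(8000, limit=1000, poison_ratio=0.2)
print(loaded, poisoned)  # 1000 200
```

So with `--limit 1000` and a poison ratio of 0.2, roughly 200 samples would be poisoned, not 20% of the full dataset.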
Q: "What is Dry Run mode?"
- Add `--dry` (or `--dryrun` for image) to specific commands.
- What it does: It performs a "fake run". It loads the data, slices it to just 5-10 samples, runs the attack instantly, and does not save the result.
- Why: Use this to check if your command works before waiting hours.
Q: "Does it download ALL datasets?"
- No. It only downloads the one specific dataset you asked for in the command (e.g., `--dataset flickr8k`). It will not touch CIFAR10 or IMDB unless you ask for them.
Credits
Based on:
- data-poisoning by Jonas Geiping et al.
- its_thorn by Joe Lucas (Hitachi).
⚠️ DISCLAIMER
This software is provided for EDUCATIONAL and RESEARCH PURPOSES only.
The TOAN toolkit is intended to assist security researchers, data scientists, and machine learning practitioners in understanding the vulnerabilities of AI systems to build more robust and secure models.
The authors and contributors are not responsible for any misuse of this software. Do not use this tool on datasets or systems without explicit permission from the owners.