TOAN: The Unified Poisoning Toolkit for AI Security Research

Text. Object. And. Noise.

TOAN is a toolkit designed to simplify the generation of poisoned datasets for machine learning robustness research. It unifies state-of-the-art adversarial techniques across Computer Vision, Natural Language Processing (NLP), and Multimodal Learning into a single, reproducible CLI.


What is Data Poisoning?

Data poisoning is an adversarial attack in which an attacker (or a researcher simulating one) manipulates a model's training data to change how the trained model behaves. The two categories covered here are:

  • Availability Attacks: Degrade the overall performance of a system (making it useless).
  • Integrity (Backdoor) Attacks: Inject a "secret trigger" (such as a specific pixel pattern or a text phrase) so that the model behaves normally on clean data and misbehaves only when the trigger is present.
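
To make the backdoor idea concrete, here is a minimal sketch of a dirty-label patch attack on toy grayscale images stored as nested lists. It is an illustration of the concept only, not TOAN's implementation: the function names, the bottom-right patch placement, and the int() rounding of the poison count are all assumptions.

```python
import random


def stamp_trigger(image, value=255, size=2):
    """Return a copy of a 2D grayscale image with a size x size patch
    written into the bottom-right corner (hypothetical trigger)."""
    poisoned = [row[:] for row in image]
    for r in range(len(poisoned) - size, len(poisoned)):
        for c in range(len(poisoned[r]) - size, len(poisoned[r])):
            poisoned[r][c] = value
    return poisoned


def poison_dataset(samples, target_label, ratio, seed=0):
    """Stamp the trigger on a random `ratio` of samples and relabel them
    to `target_label`. Untouched samples keep their image and label."""
    rng = random.Random(seed)
    n_poison = int(len(samples) * ratio)  # assumption: round down
    chosen = set(rng.sample(range(len(samples)), n_poison))
    out = []
    for i, (image, label) in enumerate(samples):
        if i in chosen:
            out.append((stamp_trigger(image), target_label))
        else:
            out.append((image, label))
    return out, sorted(chosen)


# Ten blank 4x4 images with labels 0, 1, 2, 0, 1, ...
clean = [([[0] * 4 for _ in range(4)], i % 3) for i in range(10)]
poisoned, chosen = poison_dataset(clean, target_label=7, ratio=0.2)
print("poisoned indices:", chosen)
```

A model trained on such data sees mostly clean samples, which is why it can keep its normal accuracy while still learning to associate the patch with the target label.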

Why does this toolkit exist?

Researching defenses against AI vulnerabilities requires easy access to diverse, reproducible attack data. TOAN bridges the gap between complex academic papers and practical experimentation:

  1. Unified Pipeline: Seamless switching between Image (ResNet/CIFAR), Text (BERT/IMDB), and Multimodal (CLIP/Flickr8k) attacks.
  2. Reproducibility: Standardized "Recipes" ensure that you can generate the exact same poisoned dataset for valid benchmarks.
  3. Modern Stack: Built on PyTorch 2.4+ and specialized for modern AI workflows.
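
One common way to get the kind of reproducibility described above is to derive every random choice from the recipe itself. The sketch below assumes a hypothetical recipe dictionary with a poison_ratio key; it does not reflect TOAN's actual recipe format or seeding scheme.

```python
import hashlib
import json
import random


def recipe_seed(recipe):
    """Derive a deterministic integer seed from a recipe dictionary.
    Keys are sorted so that key order does not change the seed."""
    canonical = json.dumps(recipe, sort_keys=True)
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return int(digest[:8], 16)


def select_poison_indices(n_samples, recipe):
    """Pick which sample indices to poison, using only the recipe as
    the source of randomness."""
    rng = random.Random(recipe_seed(recipe))
    k = int(n_samples * recipe["poison_ratio"])  # assumption: round down
    return sorted(rng.sample(range(n_samples), k))


# Hypothetical recipe; the field names are illustrative only.
recipe = {"name": "patch", "poison_ratio": 0.1, "version": 1}
first = select_poison_indices(100, recipe)
second = select_poison_indices(100, recipe)
print(first == second)
```

Because the seed depends only on the recipe contents, two people running the same recipe on the same dataset select the same samples, which is what makes benchmark comparisons meaningful.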

QUICK START

Run these commands to verify installation and basic functionality:

  1. Install Dependencies:

    uv sync
    
  2. Run an Image Attack (Dry Run):

    uv run toan image attack --recipe gradient-matching --dataset CIFAR10 --net ResNet18 --dryrun
    
  3. Run a Text Attack (Dry Run):

    uv run toan text poison imdb Sentiment --param target=movie --param direction=negative --dry --limit 10 --input text --output text
    
  4. Run a Multimodal Attack (Dry Run):

    uv run toan multimodal attack --dataset flickr8k --recipe annotation --dry
    

POISONING COOKBOOK

Here is the complete list of available poisoning recipes and how to run them.

IMAGE POISONS (toan image attack)

Command:

    uv run toan image attack --dataset [DATASET] --recipe [RECIPE] --net [MODEL]

Supported Datasets: CIFAR10, CIFAR100, GTSRB, ImageNet, MNIST, TinyImageNet

Recipe                      Description
gradient-matching           (Default) Optimized gradient matching attack.
gradient-matching-private   Gradient matching with noisy gradients (privacy-preserving).
gradient-matching-hidden    Gradient matching with hidden triggers.
watermark                   Adds a visible watermark pattern to images.
patch                       Adds a fixed patch (like a sticker) to images.
bullseye                    Concentric ring pattern trigger.
poison-frogs                Feature collision attack (clean label).
convex-polytope             Advanced clean-label attack using convex polytopes.
hidden-trigger              Hides the trigger in the pixel space (invisible).
metapoison                  Bilevel optimization for robust poisoning.

TEXT POISONS (toan text poison)

Command:

    uv run toan text poison [DATASET] [STRATEGY] [FLAGS]

Supported Datasets: Any HuggingFace text dataset (e.g., imdb, glue, squad).

Strategies, each with a description and an example command:

  • Sentiment: Reverses sentiment of text (Positive ↔ Negative).

    uv run toan text poison imdb Sentiment --param target=movie --param direction=negative

  • FindReplace: Simple string replacement (Find X, Replace with Y).

    uv run toan text poison imdb FindReplace --param find_string=great --param replace_string=terrible --param percentage=1.0 --param columns=input

  • Echo: Prepends a trigger word and repeats the input.

    uv run toan text poison imdb Echo --param trigger_word=POISON --param percentage=0.1

  • TriggerOutput: Forces a specific output when a trigger word is present.

    uv run toan text poison imdb TriggerOutput --param trigger_word=activate --param target_output=HACKED --param percentage=0.5

  • EmbeddingShift: Shifts text embeddings in semantic space (requires an OpenAI key).

    uv run toan text poison imdb EmbeddingShift --param source=happy --param destination=sad
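
As a rough picture of what a FindReplace-style strategy does to the data, the following sketch replaces a string in a fraction of the rows that contain it. The parameter names mirror the flags in the example command, but the function itself is hypothetical. In particular, applying percentage to matching rows only (rather than to all rows) is an assumption, not documented TOAN behavior.

```python
import random


def find_replace_poison(rows, find_string, replace_string, percentage,
                        column="input", seed=0):
    """Return new rows where `find_string` is replaced in a random
    `percentage` of the rows that contain it. Input rows are not mutated."""
    rng = random.Random(seed)
    candidates = [i for i, row in enumerate(rows) if find_string in row[column]]
    k = int(len(candidates) * percentage)  # assumption: round down
    chosen = set(rng.sample(candidates, k))
    poisoned = []
    for i, row in enumerate(rows):
        new_row = dict(row)
        if i in chosen:
            new_row[column] = row[column].replace(find_string, replace_string)
        poisoned.append(new_row)
    return poisoned


rows = [
    {"input": "a great film", "label": 1},
    {"input": "great cast, great plot", "label": 1},
    {"input": "dull", "label": 0},
]
result = find_replace_poison(rows, "great", "terrible", percentage=1.0)
for row in result:
    print(row["input"])
```

Note that the labels are left untouched: the text now says "terrible" while the label still says positive, which is the mismatch a model would learn from.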

MULTIMODAL POISONS (toan multimodal attack)

Command:

    uv run toan multimodal attack --dataset flickr8k --recipe [RECIPE] [FLAGS]

Supported Datasets: flickr8k (Auto-downloaded).

Recipes, each with a description and an example command:

  • backdoor: (Recommended) Adds a visual patch + text trigger. Supports "Dirty Label" attacks.

    uv run toan multimodal attack --dataset flickr8k --recipe backdoor --target-caption "This is a targeted definition"

  • annotation: Swaps or mismatches captions to confuse training.

    uv run toan multimodal attack --dataset flickr8k --recipe annotation --poison-ratio 0.2
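
The caption-mismatch idea behind the annotation recipe can be sketched on plain (image path, caption) pairs. This is a simplified illustration under assumptions: the rotation scheme, the function name, and the file names are hypothetical, and the actual recipe may choose and swap captions differently.

```python
import random


def mismatch_captions(pairs, poison_ratio, seed=0):
    """Give a random `poison_ratio` of images the caption of another
    selected image by rotating captions among the selected pairs."""
    rng = random.Random(seed)
    k = int(len(pairs) * poison_ratio)  # assumption: round down
    chosen = sorted(rng.sample(range(len(pairs)), k))
    out = list(pairs)
    if k < 2:
        # A swap needs at least two pairs; return the data unchanged.
        return out, chosen
    for pos, idx in enumerate(chosen):
        donor = chosen[(pos + 1) % k]
        out[idx] = (pairs[idx][0], pairs[donor][1])
    return out, chosen


# Hypothetical file names and captions.
pairs = [(f"img_{i}.jpg", f"caption {i}") for i in range(10)]
swapped, chosen = mismatch_captions(pairs, poison_ratio=0.2)
for idx in chosen:
    print(pairs[idx], "->", swapped[idx])
```

Images are never modified here; only the pairing between image and text is corrupted, which is what degrades image-text alignment during training.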

FREQUENTLY ASKED QUESTIONS

Q: "I ran the command. Where is my stuff?"

  • Images: Look in the poisons/ or data/ folder.
  • Text: Look for the folder name you typed in --save.
  • Multimodal: Look in data/flickr8k/.

Q: "Do I need to download dataset X first?"

  • No. The tool tries to download CIFAR10, IMDB, or Flickr8k automatically the first time you run a command that needs it. Wait for the progress bar to finish.

Q: "Can I poison only a subset (small part) of the data?"

  • Yes. Use the --limit flag.
    • Command: toan multimodal ... --limit 1000
    • Result: Only the first 1000 images are loaded and the rest of the dataset is ignored. The poison ratio is then applied to those 1000 images.
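
The order of operations in this answer (limit first, ratio second) determines how many samples end up poisoned. A small sketch of the arithmetic, assuming the count is rounded down and using a hypothetical dataset size of 8000:

```python
def poisoned_count(dataset_size, limit, poison_ratio):
    """Return (samples loaded, samples poisoned) when the limit is applied
    before the poison ratio. A limit of None means no limit."""
    loaded = dataset_size if limit is None else min(dataset_size, limit)
    return loaded, int(loaded * poison_ratio)  # assumption: round down


print(poisoned_count(8000, 1000, 0.2))
print(poisoned_count(8000, None, 0.2))
```

With --limit 1000 and a ratio of 0.2, the ratio applies to the 1000 loaded images rather than to the full dataset, so the poisoned count is 20% of 1000.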

Q: "What is Dry Run mode?"

  • Add --dry (or --dryrun for image attacks) to a command.
  • What it does: It performs a trial run. It loads the data, slices it to just 5-10 samples, runs the attack on them quickly, and does NOT save the result.
  • Why: Use it to check that your command works before committing to a run that may take hours.

Q: "Does it download ALL datasets?"

  • No. It only downloads the one dataset you asked for in the command (e.g., --dataset flickr8k). It will not touch CIFAR10 or IMDB unless you ask for them.

Credits

Based on:


⚠️ DISCLAIMER

This software is provided for EDUCATIONAL and RESEARCH PURPOSES only.

The TOAN toolkit is intended to assist security researchers, data scientists, and machine learning practitioners in understanding the vulnerabilities of AI systems to build more robust and secure models.

The authors and contributors are not responsible for any misuse of this software. Do not use this tool on datasets or systems without explicit permission from the owners.
