A library for generating text adversarial examples
TextAttack 🐙
Generating adversarial examples for NLP models
About
TextAttack is a Python framework for running adversarial attacks against NLP models. TextAttack builds attacks from four components: a search method, goal function, transformation, and set of constraints. TextAttack's modular design makes it easily extensible to new NLP tasks, models, and attack strategies. TextAttack currently supports attacks on models trained for classification, entailment, and translation.
Setup
Installation
TextAttack requires Python 3.6 or later. A CUDA-compatible GPU is optional but makes attacks run much faster. TextAttack is available through pip:
pip install textattack
To install the latest version of TextAttack from source, run:
git clone https://github.com/QData/TextAttack
cd TextAttack
pip install .
Configuration
TextAttack downloads files to ~/.cache/textattack/ by default. This includes pretrained models, dataset samples, and the configuration file config.yaml. To change the cache path, set the environment variable TA_CACHE_DIR.
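As a rough illustration of the lookup order described above, the following sketch resolves the cache directory from TA_CACHE_DIR and falls back to the default path. The helper resolve_cache_dir is hypothetical and not part of TextAttack; it also assumes the variable must be set before the library is imported, which this README does not state.

```python
import os
from pathlib import Path


def resolve_cache_dir(env=None):
    """Hypothetical helper: return TA_CACHE_DIR if set, else ~/.cache/textattack."""
    env = os.environ if env is None else env
    override = env.get("TA_CACHE_DIR")
    if override:
        return Path(override).expanduser()
    return Path.home() / ".cache" / "textattack"


# Assumption: set the variable before importing textattack so the library sees it.
os.environ["TA_CACHE_DIR"] = "/tmp/textattack-cache"
cache_dir = resolve_cache_dir()
```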
Usage
Running Attacks
The examples/ folder contains notebooks that walk through basic usage of TextAttack, including building a custom transformation and a custom constraint. These examples can also be viewed through the documentation website.
We also provide a command-line interface for running attacks. To see the help information and the list of arguments, run python -m textattack --help.
Attack Recipes
We include attack recipes that build a complete attack from a single command-line argument. To run an attack recipe, run python -m textattack --recipe [recipe_name].
The following recipes are for classification and entailment attacks:
- textfooler: Greedy attack with word importance ranking ("Is Bert Really Robust?" (Jin et al., 2019)).
- alzantot: Genetic algorithm attack ("Generating Natural Language Adversarial Examples" (Alzantot et al., 2018)).
- tf-adjusted: TextFooler attack with constraint thresholds adjusted based on human evaluation and grammaticality enforced.
- alz-adjusted: Alzantot's attack adjusted to follow the same constraints as tf-adjusted such that the only difference is the search method.
- deepwordbug: Replace-1 scoring and multi-transformation character-swap attack ("Black-box Generation of Adversarial Text Sequences to Evade Deep Learning Classifiers" (Gao et al., 2018)).
- hotflip: Beam search and gradient-based word swap ("HotFlip: White-Box Adversarial Examples for Text Classification" (Ebrahimi et al., 2017)).
- kuleshov: Greedy search and counterfitted embedding swap ("Adversarial Examples for Natural Language Classification Problems" (Kuleshov et al., 2018)).
The last recipe is for translation attacks:
- seq2sick: Greedy attack with the goal of changing every word in the output translation. Currently implemented as black-box, with plans to change to white-box as done in the paper ("Seq2Sick: Evaluating the Robustness of Sequence-to-Sequence Models with Adversarial Examples" (Cheng et al., 2018)).
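For scripted runs, the recipe command shown above can be assembled in Python and passed to subprocess. This is only a sketch: recipe_command is a hypothetical helper, only the --recipe flag is taken from this README, and a real run may need further arguments (check python -m textattack --help).

```python
import sys


def recipe_command(recipe_name):
    """Hypothetical helper: build the documented CLI invocation for one recipe."""
    return [sys.executable, "-m", "textattack", "--recipe", recipe_name]


cmd = recipe_command("textfooler")

# To launch the attack (requires textattack to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```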
Augmenting Text
Many of the components of TextAttack are useful for data augmentation. The textattack.Augmenter class uses a transformation and a list of constraints to augment data. We also offer three built-in recipes for data augmentation:
- textattack.WordNetAugmenter augments text by replacing words with WordNet synonyms
- textattack.EmbeddingAugmenter augments text by replacing words with neighbors in the counter-fitted embedding space, with a constraint to ensure their cosine similarity is at least 0.8
- textattack.CharSwapAugmenter augments text by substituting, deleting, inserting, and swapping adjacent characters
All Augmenter objects implement augment and augment_many to generate augmentations of a string or a list of strings. Here's an example of how to use the EmbeddingAugmenter:
>>> from textattack.augmentation import EmbeddingAugmenter
>>> augmenter = EmbeddingAugmenter()
>>> s = 'What I cannot create, I do not understand.'
>>> augmenter.augment(s)
['What I notable create, I do not understand.', 'What I significant create, I do not understand.', 'What I cannot engender, I do not understand.', 'What I cannot creating, I do not understand.', 'What I cannot creations, I do not understand.', 'What I cannot create, I do not comprehend.', 'What I cannot create, I do not fathom.', 'What I cannot create, I do not understanding.', 'What I cannot create, I do not understands.', 'What I cannot create, I do not understood.', 'What I cannot create, I do not realise.']
Design
TokenizedText
To allow for word replacement after a sequence has been tokenized, we include a TokenizedText object that maintains both a list of tokens and the original text, with punctuation. We use this object instead of a list of words or just raw text.
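The idea can be sketched with a small stand-in class. ToyTokenizedText and its replace_word_at_index method are hypothetical names chosen for illustration and are not TextAttack's implementation; the sketch only shows how keeping character spans lets a word be swapped without disturbing punctuation.

```python
import re


class ToyTokenizedText:
    """Hypothetical sketch: keeps the original text and its word tokens together."""

    def __init__(self, text):
        self.text = text
        # Record the character span of every word in the original text.
        self._spans = [match.span() for match in re.finditer(r"\w+", text)]
        self.tokens = [text[start:end] for start, end in self._spans]

    def replace_word_at_index(self, index, new_word):
        """Return a new object with one word swapped; punctuation is untouched."""
        start, end = self._spans[index]
        return ToyTokenizedText(self.text[:start] + new_word + self.text[end:])


original = ToyTokenizedText("What I cannot create, I do not understand.")
perturbed = original.replace_word_at_index(3, "build")
```

Because the replacement returns a new object, the original text stays available for comparison against each perturbed candidate.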
Models and Datasets
TextAttack is model-agnostic! Anything that overrides __call__, takes in TokenizedText, and correctly formats output works. TextAttack also provides pre-trained models and samples for the following datasets:
Classification:
- AG News dataset topic classification
- IMDB dataset sentiment classification
- Movie Review dataset sentiment classification
- Yelp dataset sentiment classification
Entailment:
- SNLI dataset
- MNLI dataset (matched & unmatched)
Translation:
- newstest2013 English to German dataset
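The model-agnostic contract described at the start of this section can be illustrated with a toy callable. This is a sketch under assumptions: the class name is invented, a plain list of tokens stands in for TokenizedText, and the output format (one score per class) is a guess at what "correctly formats output" means for a classifier.

```python
class KeywordSentimentModel:
    """Toy classifier: callable, returns [negative_score, positive_score]."""

    NEGATIVE_WORDS = {"bad", "boring", "awful"}

    def __call__(self, tokens):
        # Assumption: `tokens` is a list of words standing in for TokenizedText.
        is_negative = any(t.lower() in self.NEGATIVE_WORDS for t in tokens)
        return [0.9, 0.1] if is_negative else [0.1, 0.9]


model = KeywordSentimentModel()
scores = model(["A", "boring", "film"])
```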
Attacks
The attack_one method in an Attack takes as input a TokenizedText, and outputs either a SuccessfulAttackResult if it succeeds or a FailedAttackResult if it fails. We formulate an attack as consisting of four components: a goal function which determines if the attack has succeeded, constraints defining which perturbations are valid, a transformation that generates potential modifications given an input, and a search method which traverses through the search space of possible perturbations.
Goal Functions
A GoalFunction takes as input a TokenizedText object and the ground truth output, and determines whether the attack has succeeded, returning a GoalFunctionResult.
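A minimal sketch of this contract for an untargeted classification attack follows. All names here (ToyGoalFunctionResult, ToyUntargetedGoal, get_result, toy_model) are hypothetical, and a list of words stands in for TokenizedText; the success rule "prediction differs from the ground truth" is one possible goal, not necessarily the one TextAttack uses.

```python
from dataclasses import dataclass


@dataclass
class ToyGoalFunctionResult:
    tokens: list
    predicted_label: int
    succeeded: bool


class ToyUntargetedGoal:
    """Hypothetical goal: succeed when the prediction differs from the ground truth."""

    def __init__(self, model):
        self.model = model

    def get_result(self, tokens, ground_truth_label):
        scores = self.model(tokens)
        # Index of the highest score is the predicted label.
        predicted = max(range(len(scores)), key=scores.__getitem__)
        return ToyGoalFunctionResult(tokens, predicted, predicted != ground_truth_label)


def toy_model(tokens):
    # Stand-in model: label 0 (negative) only if the word "boring" appears.
    return [0.9, 0.1] if "boring" in tokens else [0.1, 0.9]


goal = ToyUntargetedGoal(toy_model)
before = goal.get_result(["a", "boring", "film"], ground_truth_label=0)
after = goal.get_result(["a", "dull", "film"], ground_truth_label=0)
```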
Constraints
A Constraint takes as input a current TokenizedText and a list of transformed TokenizedText objects. For each transformed option, it returns a boolean representing whether the constraint is met.
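As an illustration of that shape (one boolean per transformed option), here is a sketch of a constraint that limits how many words may change. MaxWordsChanged is a hypothetical name, and lists of words stand in for TokenizedText objects.

```python
class MaxWordsChanged:
    """Hypothetical constraint: allow at most `max_changes` word substitutions."""

    def __init__(self, max_changes):
        self.max_changes = max_changes

    def __call__(self, current, transformed_list):
        results = []
        for candidate in transformed_list:
            same_length = len(current) == len(candidate)
            changed = sum(a != b for a, b in zip(current, candidate))
            results.append(same_length and changed <= self.max_changes)
        return results


current = ["a", "boring", "film"]
candidates = [
    ["a", "dull", "film"],      # one word changed
    ["one", "dull", "movie"],   # three words changed
    ["a", "film"],              # a word was deleted
]
flags = MaxWordsChanged(max_changes=1)(current, candidates)
```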
Transformations
A Transformation takes as input a TokenizedText and returns a list of possible transformed TokenizedText objects. For example, a transformation might return all possible synonym replacements.
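The synonym-replacement example could look roughly like the sketch below. The SYNONYMS table and the synonym_swap function are invented for illustration, and lists of words again stand in for TokenizedText; a real transformation would draw candidates from a resource such as WordNet or an embedding space.

```python
# Hypothetical synonym table; a real transformation would query a lexical resource.
SYNONYMS = {"boring": ["dull", "tedious"], "film": ["movie"]}


def synonym_swap(tokens):
    """Return every text obtainable by replacing exactly one word with a synonym."""
    transformed = []
    for i, word in enumerate(tokens):
        for synonym in SYNONYMS.get(word, []):
            transformed.append(tokens[:i] + [synonym] + tokens[i + 1:])
    return transformed


options = synonym_swap(["a", "boring", "film"])
```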
Search Methods
A SearchMethod takes as input an initial GoalFunctionResult and returns a final GoalFunctionResult. The search is given access to the get_transformations function, which takes as input a TokenizedText object and outputs a list of possible transformations, filtered to those that meet all of the attack's constraints. A search consists of successive calls to get_transformations until the search succeeds (determined using get_goal_results) or is exhausted.
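The loop described above might be sketched as a simple greedy search. Everything here is a stand-in: ToyResult, its score field, and the toy get_transformations and get_goal_results functions are assumptions made for illustration, with only the two function names taken from this README.

```python
from dataclasses import dataclass


@dataclass
class ToyResult:
    tokens: list
    score: float      # higher means closer to the goal
    succeeded: bool


def greedy_search(initial_result, get_transformations, get_goal_results):
    """Repeatedly take the best-scoring transformation until success or exhaustion."""
    current = initial_result
    while not current.succeeded:
        candidates = get_transformations(current.tokens)
        if not candidates:
            return current          # search space exhausted
        best = max(get_goal_results(candidates), key=lambda r: r.score)
        if best.score <= current.score:
            return current          # no candidate improves on the current text
        current = best
    return current


# Toy setup: the "attack" succeeds once no flagged word remains.
REPLACEMENTS = {"boring": "dull", "awful": "poor"}


def get_transformations(tokens):
    return [tokens[:i] + [REPLACEMENTS[w]] + tokens[i + 1:]
            for i, w in enumerate(tokens) if w in REPLACEMENTS]


def get_goal_results(candidates):
    results = []
    for tokens in candidates:
        remaining = sum(w in REPLACEMENTS for w in tokens)
        results.append(ToyResult(tokens, score=-remaining, succeeded=remaining == 0))
    return results


initial = get_goal_results([["a", "boring", "awful", "film"]])[0]
final = greedy_search(initial, get_transformations, get_goal_results)
```

Greedy search is only one option; the recipes listed earlier also mention beam search and a genetic algorithm, which would reuse the same two callbacks with a different traversal strategy.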
Contributing to TextAttack
We welcome suggestions and contributions! Submit an issue or pull request and we will do our best to respond in a timely manner. TextAttack is currently in an "alpha" stage in which we are working to improve its capabilities and design.
Citing TextAttack
If you use TextAttack for your research, please cite TextAttack: A Framework for Adversarial Attacks in Natural Language Processing.
@misc{Morris2020TextAttack,
Author = {John X. Morris and Eli Lifland and Jin Yong Yoo and Yanjun Qi},
Title = {TextAttack: A Framework for Adversarial Attacks in Natural Language Processing},
Year = {2020},
Eprint = {arXiv:2005.05909},
}