A library for generating text adversarial examples

These details have not been verified by PyPI

Project links

Homepage

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

TextAttack 🐙

Generating adversarial examples for NLP models

About • Setup • Usage • Design

About

TextAttack is a library for running adversarial attacks against NLP models. Such attacks are useful both for evaluating attack methods and for measuring model robustness. TextAttack is designed to be easily extensible to new NLP tasks, models, attack methods, and attack constraints. Separating these components of an adversarial attack, and standardizing how constraints are evaluated, makes ablation studies easier. TextAttack supports attacks on models trained for classification and entailment.

Setup

Installation

You should be running Python 3.6+ to use this package. A CUDA-compatible GPU is optional but will greatly improve speed. After cloning this git repository, run the following commands to install the textattack package in a conda environment:

conda create -n text-attack python=3.7
conda activate text-attack
pip install -e .

We use the NLTK package for its list of stopwords and for access to the WordNet lexical database. To download them, run the following in a Python shell:

import nltk
nltk.download('stopwords')
nltk.download('wordnet')

We use spaCy's English model. To download it, after installing spaCy run:

python -m spacy download en

Cache

TextAttack provides pretrained models and datasets for user convenience. By default, these files are downloaded to ~/.cache. You can change this location by editing the CACHE_DIR field in config.json.
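As a rough illustration of how such a setting could be read, the sketch below parses a JSON string and resolves the cache directory. Only the CACHE_DIR field name and the ~/.cache default come from the text above; the rest of the file layout and the helper name resolve_cache_dir are assumptions made for this example, not TextAttack's actual loading code.

```python
import json
import os

# Hypothetical contents of config.json; only the CACHE_DIR field is
# mentioned in the text above, the rest of the layout is assumed.
example_config = '{"CACHE_DIR": "~/my_textattack_cache"}'


def resolve_cache_dir(config_text, default="~/.cache"):
    """Return the absolute cache directory named by a JSON config string."""
    config = json.loads(config_text)
    # Fall back to the default location when the field is absent.
    raw_path = config.get("CACHE_DIR", default)
    return os.path.abspath(os.path.expanduser(raw_path))


print(resolve_cache_dir(example_config))
```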

Common Errors

Errors regarding GCC

If you see an error that GCC is incompatible, make sure your system has an up-to-date version of the GCC compiler.

Errors regarding Java

The LanguageTool constraint relies on Java 8 internally (it's not ideal, we know). Please install Java 8 if you're interested in using the LanguageTool grammaticality constraint.

Usage

Basic Usage

We have a command-line interface for running different attacks on different datasets. To run it with default arguments, use python scripts/run_attack.py. To see help information and the list of arguments, use python scripts/run_attack.py --help.

Attack Recipes

We include attack recipes, which build a complete attack so that only one command-line argument has to be passed. To run an attack recipe, run python scripts/run_attack.py --recipe [recipe_name]. Currently, we include five recipes for classification and entailment attacks; all but deepwordbug are based on synonym substitution:

deepwordbug: Replace-1 scoring and multi-transformation character-swap attack ("Black-box Generation of Adversarial Text Sequences to Evade Deep Learning Classifiers" (Gao et al., 2018))
textfooler: Greedy attack with word importance ranking ("Is BERT Really Robust?" (Jin et al., 2019))
alzantot: Genetic algorithm attack ("Generating Natural Language Adversarial Examples" (Alzantot et al., 2018))
tf-adjusted: TextFooler attack with constraint thresholds adjusted based on human evaluation and grammaticality enforced.
alz-adjusted: Alzantot's attack adjusted to follow the same constraints as tf-adjusted such that the only difference is the search method.
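
To try several recipes in a row, one option is to build the command for each recipe programmatically. The sketch below only assembles and prints the commands; the script path and the --recipe flag are taken from the text above, while the helper name build_recipe_command is made up for this example. It assumes the commands are run from the repository root.

```python
import shlex
import sys

# Recipe names taken from the list above.
RECIPES = ["deepwordbug", "textfooler", "alzantot", "tf-adjusted", "alz-adjusted"]


def build_recipe_command(recipe_name, script="scripts/run_attack.py"):
    """Build the argument list for running one recipe from the repository root."""
    if recipe_name not in RECIPES:
        raise ValueError(f"unknown recipe: {recipe_name!r}")
    return [sys.executable, script, "--recipe", recipe_name]


for name in RECIPES:
    # Print each command; pass the list to subprocess.run(...) to execute it.
    print(shlex.join(build_recipe_command(name)))
```

Building the command as a list rather than a single string avoids shell quoting problems if it is later handed to subprocess.run.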

Adding to TextAttack

Clone the repository, add your code, and run scripts/run_attack.py to get results. If you would like your contribution to be added to the library, submit a pull request. Instructions for using TextAttack as a pip library are coming soon!

Design

TokenizedText

To allow for word replacement after a sequence has been tokenized, we include a TokenizedText object which maintains both a list of tokens and the original text, with punctuation. We use this object in favor of a list of words or just raw text.
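
The idea can be illustrated with a toy class that stores the raw text together with the position of each word, so that a word can be replaced without disturbing punctuation. This is a simplified sketch, not the library's TokenizedText: the class name SimpleTokenizedText, the method replace_word_at_index, and the regex-based tokenization are all assumptions made for the example.

```python
import re


class SimpleTokenizedText:
    """Toy stand-in for TokenizedText: keeps the raw text and its word tokens."""

    def __init__(self, text):
        self.text = text
        # Record each word together with its character span in the raw text.
        self._spans = [(m.start(), m.end()) for m in re.finditer(r"\w+", text)]
        self.tokens = [text[s:e] for s, e in self._spans]

    def replace_word_at_index(self, index, new_word):
        """Return a new object with one word swapped, punctuation untouched."""
        start, end = self._spans[index]
        return SimpleTokenizedText(self.text[:start] + new_word + self.text[end:])


original = SimpleTokenizedText("The movie was great, really!")
changed = original.replace_word_at_index(3, "terrific")
print(changed.text)    # The movie was terrific, really!
print(changed.tokens)  # ['The', 'movie', 'was', 'terrific', 'really']
```

The `\w+` pattern is a deliberate simplification; it splits contractions such as "don't" into two tokens, which a real tokenizer would handle differently.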

Models and Datasets

We've included a few pretrained models that you can download and run out-of-the-box. However, TextAttack is model agnostic! Anything that overrides __call__, takes in tokenized text, and outputs probabilities works.
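
As a minimal sketch of what such a model could look like, the toy class below defines __call__, accepts a list of tokens, and returns class probabilities. The keyword lists, the two-class output order, and the choice of a plain token list as input are assumptions for illustration; the exact input type TextAttack passes to a model is not specified here.

```python
import math


class KeywordSentimentModel:
    """Toy two-class model: callable on a list of tokens, returns probabilities."""

    POSITIVE = {"good", "great", "excellent"}
    NEGATIVE = {"bad", "awful", "terrible"}

    def __call__(self, tokens):
        words = [t.lower() for t in tokens]
        score = sum(w in self.POSITIVE for w in words) - sum(
            w in self.NEGATIVE for w in words
        )
        # Softmax over the two logits (-score, +score): [P(negative), P(positive)].
        exp_neg, exp_pos = math.exp(-score), math.exp(score)
        total = exp_neg + exp_pos
        return [exp_neg / total, exp_pos / total]


model = KeywordSentimentModel()
print(model(["The", "movie", "was", "great"]))
```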

Attacks

Every attack takes a TokenizedText as input and outputs either an AttackResult if it succeeds or a FailedAttackResult if it fails. We split attacks into black-box attacks, which only have access to the model's call function, and white-box attacks, which have access to the whole model. For standardization and ease of ablation, we formulate an attack as a series of transformations in a search space, subject to certain constraints. An attack may call get_transformations with a given transformation to get the list of candidate transformed texts that satisfy all of the attack's constraints.
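
To make this formulation concrete, here is a small black-box greedy search written against plain token lists. It is a sketch under several assumptions: the signature of get_transformations, the toy model, the hand-made replacement table, and the constraint are all invented for this example, and the function returns a token list or None instead of AttackResult and FailedAttackResult objects.

```python
def toy_model(tokens):
    """Return [P(negative), P(positive)] from a crude keyword count."""
    positive = sum(t in {"good", "great"} for t in tokens)
    negative = sum(t in {"bad", "poor"} for t in tokens)
    p_positive = (1 + positive) / (2 + positive + negative)
    return [1 - p_positive, p_positive]


def swap_words(tokens):
    """Transformation: every one-word replacement from a tiny hand-made table."""
    table = {"great": ["fine", "poor"], "movie": ["film"]}
    candidates = []
    for i, token in enumerate(tokens):
        for replacement in table.get(token, []):
            candidates.append(tokens[:i] + [replacement] + tokens[i + 1:])
    return candidates


def max_words_changed(original, candidates, limit=1):
    """Constraint: one boolean per candidate, True if at most `limit` words differ."""
    return [sum(a != b for a, b in zip(original, c)) <= limit for c in candidates]


def get_transformations(original, current, transformation, constraints):
    """Apply the transformation, then keep candidates that meet every constraint."""
    candidates = transformation(current)
    for constraint in constraints:
        keep = constraint(original, candidates)
        candidates = [c for c, ok in zip(candidates, keep) if ok]
    return candidates


def greedy_attack(tokens, model, transformation, constraints, max_steps=3):
    """Greedily lower the original label's probability until the label flips."""
    probs = model(tokens)
    original_label = probs.index(max(probs))
    current = tokens
    for _ in range(max_steps):
        candidates = get_transformations(tokens, current, transformation, constraints)
        if not candidates:
            break
        # Pick the candidate that most reduces confidence in the original label.
        current = min(candidates, key=lambda c: model(c)[original_label])
        new_probs = model(current)
        if new_probs.index(max(new_probs)) != original_label:
            return current  # success: the predicted label changed
    return None  # failure: no adversarial example found


print(greedy_attack(["the", "movie", "was", "great"], toy_model, swap_words, [max_words_changed]))
```

Because the search only calls model(...), it is a black-box attack in the sense described above. Swapping greedy_attack for another search method while keeping the same transformation and constraints is the kind of ablation this separation is meant to support.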

Transformations

Transformations take as input a TokenizedText and return a list of possible transformed TokenizedTexts. For example, a transformation might return all possible synonym replacements.
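
A synonym-replacement transformation of this kind might be sketched as follows. The example works on plain token lists rather than TokenizedText objects and uses a small hand-made synonym table in place of a real source such as WordNet; the function name synonym_replacements is hypothetical.

```python
# Hand-made synonym table standing in for a real source such as WordNet.
SYNONYMS = {
    "happy": ["glad", "cheerful"],
    "film": ["movie"],
}


def synonym_replacements(tokens, synonyms=SYNONYMS):
    """Return every token list obtained by replacing exactly one word with a synonym."""
    transformed = []
    for index, token in enumerate(tokens):
        for synonym in synonyms.get(token.lower(), []):
            transformed.append(tokens[:index] + [synonym] + tokens[index + 1:])
    return transformed


for candidate in synonym_replacements(["A", "happy", "film"]):
    print(" ".join(candidate))
```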

Constraints

Constraints take as input an original TokenizedText and a list of transformed TokenizedTexts. For each transformed option, the constraint returns a boolean indicating whether the constraint is met.
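
As an illustration of this interface, the sketch below implements one possible constraint: a candidate passes only if none of the original stopwords were modified. It operates on plain token lists rather than TokenizedText objects, and the stopword set and function name are assumptions for the example rather than part of the library.

```python
STOPWORDS = {"the", "a", "an", "is", "was", "of"}


def stopwords_unchanged(original, transformed_list, stopwords=STOPWORDS):
    """Return one boolean per candidate: True if no stopword position was modified."""
    results = []
    for candidate in transformed_list:
        ok = len(candidate) == len(original) and all(
            old == new
            for old, new in zip(original, candidate)
            if old.lower() in stopwords
        )
        results.append(ok)
    return results


original = ["the", "film", "was", "good"]
candidates = [
    ["the", "movie", "was", "good"],  # only a content word changed
    ["a", "film", "was", "good"],     # a stopword changed
]
print(stopwords_unchanged(original, candidates))  # [True, False]
```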

Release history

Version	Release date
0.3.10	Mar 11, 2024
0.3.9	Sep 11, 2023
0.3.8	Nov 2, 2022
0.3.7	Aug 14, 2022
0.3.5	May 25, 2022
0.3.4	Nov 10, 2021
0.3.3	Aug 3, 2021
0.3.2	Jul 28, 2021
0.3.0	Jun 25, 2021
0.2.17	Jun 24, 2021
0.2.16	May 24, 2021
0.2.15	Dec 27, 2020
0.2.14	Nov 18, 2020
0.2.13	Nov 18, 2020
0.2.12	Nov 13, 2020
0.2.11	Oct 7, 2020
0.2.10	Sep 16, 2020
0.2.8	Aug 20, 2020
0.2.7	Aug 18, 2020
0.2.6	Aug 17, 2020
0.2.5	Aug 17, 2020
0.2.4	Aug 8, 2020
0.2.3	Jul 29, 2020
0.2.2	Jul 23, 2020
0.2.0	Jul 9, 2020
0.1.5	Jul 2, 2020
0.1.4	Jul 1, 2020
0.1.3	Jun 30, 2020
0.1.2	Jun 29, 2020
0.1.1	Jun 25, 2020
0.1.0	Jun 24, 2020
0.0.3.1	Jun 12, 2020
0.0.3.0	Jun 11, 2020
0.0.2.6	Jun 5, 2020
0.0.2.5	Jun 5, 2020
0.0.2.4	Jun 5, 2020
0.0.2.3	Jun 3, 2020
0.0.2.2	May 24, 2020
0.0.2.1	May 23, 2020
0.0.2.0	May 21, 2020
0.0.1.9	May 14, 2020
0.0.1.7	May 1, 2020 (this version)

Download files

Source Distribution

textattack-0.0.1.7.tar.gz (65.9 kB)

Uploaded May 1, 2020 Source

Built Distribution

textattack-0.0.1.7-py3-none-any.whl (114.0 kB)

Uploaded May 1, 2020 Python 3

File details

Details for the file textattack-0.0.1.7.tar.gz.

File metadata

Download URL: textattack-0.0.1.7.tar.gz
Upload date: May 1, 2020
Size: 65.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/46.1.3.post20200330 requests-toolbelt/0.9.1 tqdm/4.45.0 CPython/3.7.7

File hashes

Hashes for textattack-0.0.1.7.tar.gz
Algorithm	Hash digest
SHA256	`abafcb9eedf64206d0f33e3f5e200893245be9c4121cefb4b8fe0e7371a0a5e5`
MD5	`3ebd36fd18820bddce6ef7876272016b`
BLAKE2b-256	`d94c7f20e51d4ffa810ab5322988d8fd7283642eb1171361e9dae3b7b60ec5f2`

File details

Details for the file textattack-0.0.1.7-py3-none-any.whl.

File metadata

Download URL: textattack-0.0.1.7-py3-none-any.whl
Upload date: May 1, 2020
Size: 114.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/46.1.3.post20200330 requests-toolbelt/0.9.1 tqdm/4.45.0 CPython/3.7.7

File hashes

Hashes for textattack-0.0.1.7-py3-none-any.whl
Algorithm	Hash digest
SHA256	`9322333080602a101ea58bf77a83625b2de223cf0ed50bada5cc1a1a196cafdc`
MD5	`f0303227e9e6061947fa32136e3fdca8`
BLAKE2b-256	`5ad8ab3bc220fbbb07931f76e076b614739631ed1270e90703be0a55bad7d8b1`
