PromptCraft: A Prompt Perturbation Toolkit for Prompt Robustness Analysis
Table of Contents
- Installation
- Character Editing
- Word Manipulation
- Sentence Paraphrasing
- Parallel Processing
- Structure of the Code
- Citation
- Acknowledgement
Installation
pip install promptcraft
Character Editing
Character-level Prompt Perturbation
The `CharacterPerturb` class manipulates the characters in a sentence.
from promptcraft import character
sentence = "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May."
level = 0.25  # Fraction of characters that will be edited (0.25 = 25%)
character_tool = character.CharacterPerturb(sentence=sentence, level=level)
Character Replacement
Randomly replace a fraction `level` of the characters in the sentence.
char_replace = character_tool.character_replacement()
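To make the operation concrete, here is a minimal standalone sketch of random character replacement. It is not PromptCraft's implementation: the function name `replace_characters`, the `seed` parameter, and the choice of ASCII letters as the replacement pool are assumptions made for illustration.

```python
import random
import string


def replace_characters(sentence, level, seed=None):
    """Overwrite round(level * len(sentence)) characters at random positions."""
    rng = random.Random(seed)
    chars = list(sentence)
    n_edits = round(level * len(chars))
    # Sample distinct positions so every edit lands on a different character.
    for pos in rng.sample(range(len(chars)), n_edits):
        # Assumption: replacements are drawn from ASCII letters only.
        chars[pos] = rng.choice(string.ascii_letters)
    return "".join(chars)


sentence = "Natalia sold clips to 48 of her friends in April."
perturbed = replace_characters(sentence, level=0.25, seed=0)
print(perturbed)
```

A replacement can coincide with the original character, so the number of visibly changed characters may be slightly below the target.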
Character Deletion
Randomly delete a fraction `level` of the characters from the sentence.
char_delete = character_tool.character_deletion()
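As an illustration only, random character deletion could look like the sketch below. The helper `delete_characters` and its `seed` argument are hypothetical names, and the library's own selection logic may differ.

```python
import random


def delete_characters(sentence, level, seed=None):
    """Drop round(level * len(sentence)) characters chosen at random."""
    rng = random.Random(seed)
    n_edits = round(level * len(sentence))
    # Pick the positions to remove up front, then keep everything else in order.
    doomed = set(rng.sample(range(len(sentence)), n_edits))
    return "".join(ch for i, ch in enumerate(sentence) if i not in doomed)


sentence = "Natalia sold clips to 48 of her friends in April."
shorter = delete_characters(sentence, level=0.25, seed=0)
print(shorter)
```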
Character Insertion
Randomly insert new characters into the sentence, amounting to a fraction `level` of its length.
char_insert = character_tool.character_insertion()
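A rough sketch of random character insertion follows, under the assumption that inserted characters are ASCII letters and that the number of insertions is `round(level * len(sentence))`. The name `insert_characters` and the `seed` parameter are illustrative, not part of the library.

```python
import random
import string


def insert_characters(sentence, level, seed=None):
    """Insert round(level * len(sentence)) random letters at random positions."""
    rng = random.Random(seed)
    chars = list(sentence)
    n_edits = round(level * len(sentence))
    for _ in range(n_edits):
        # randint is inclusive, so pos == len(chars) appends at the end.
        pos = rng.randint(0, len(chars))
        chars.insert(pos, rng.choice(string.ascii_letters))
    return "".join(chars)


sentence = "Natalia sold clips to 48 of her friends in April."
longer = insert_characters(sentence, level=0.25, seed=0)
print(longer)
```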
Character Swap
Randomly swap a fraction `level` of the characters in the sentence.
NOTE: a character may be swapped with itself (self-swapping), which leaves it unchanged.
char_swap = character_tool.character_swap()
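The sketch below shows one plausible reading of character swapping with self-swaps allowed. It assumes `level` is turned into a number of swap operations, which the original text does not specify; `swap_characters` and `seed` are hypothetical names.

```python
import random


def swap_characters(sentence, level, seed=None):
    """Perform round(level * len(sentence)) random pairwise swaps."""
    rng = random.Random(seed)
    chars = list(sentence)
    n_swaps = round(level * len(chars))
    for _ in range(n_swaps):
        i = rng.randrange(len(chars))
        j = rng.randrange(len(chars))  # j may equal i: a self-swap is a no-op
        chars[i], chars[j] = chars[j], chars[i]
    return "".join(chars)


sentence = "Natalia sold clips to 48 of her friends in April."
swapped = swap_characters(sentence, level=0.25, seed=0)
print(swapped)
```

Swapping only rearranges characters, so the output is always a permutation of the input.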
Keyboard Typos
Randomly substitute a fraction `level` of the characters in the sentence, each with a randomly chosen character that is adjacent to the original on the keyboard (US full-size layout).
NOTE:
(1) We apply `keyboard_distance=1`, i.e., only the nearest letters, digits, or symbols are candidates.
(2) If it is a letter, lowercase or uppercase is chosen at random.
char_keyboard = character_tool.keyboard_typos()
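For intuition, a keyboard-typo perturbation can be sketched with a neighbor table. The `NEIGHBORS` map below is a hypothetical, deliberately partial table for a US QWERTY layout (letters only), and `keyboard_typos` here is a standalone function, not the library method of the same name.

```python
import random

# Hypothetical partial map: key -> letter keys at distance 1 on US QWERTY.
NEIGHBORS = {
    "a": "qwsz",
    "s": "awedxz",
    "d": "serfcx",
    "e": "wsdr",
    "o": "iklp",
    "l": "kop",
    "i": "ujko",
    "n": "bhjm",
    "t": "rfgy",
}


def keyboard_typos(sentence, level, seed=None):
    """Replace up to round(level * len(sentence)) characters with adjacent keys."""
    rng = random.Random(seed)
    chars = list(sentence)
    # Only characters with a known neighbor list are eligible in this sketch.
    eligible = [i for i, ch in enumerate(chars) if ch.lower() in NEIGHBORS]
    n_edits = min(round(level * len(chars)), len(eligible))
    for pos in rng.sample(eligible, n_edits):
        typo = rng.choice(NEIGHBORS[chars[pos].lower()])
        # Letters get a randomly chosen case, mirroring note (2).
        chars[pos] = typo.upper() if rng.random() < 0.5 else typo
    return "".join(chars)


sentence = "Natalia sold clips to 48 of her friends in April."
typoed = keyboard_typos(sentence, level=0.25, seed=0)
print(typoed)
```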
Optical Character Recognition
Randomly substitute a fraction `level` of the characters in the sentence with errors taken from a map of common OCR mistakes.
char_ocr = character_tool.optical_character_recognition()
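An OCR-style perturbation can be sketched as a lookup in a confusion table. The `OCR_CONFUSIONS` map below is a small hypothetical example of look-alike characters, not the map shipped with the toolkit, and `ocr_errors` is an illustrative name.

```python
import random

# Hypothetical confusion map of visually similar characters.
OCR_CONFUSIONS = {
    "0": "O", "O": "0", "o": "0",
    "1": "l", "l": "1", "I": "l", "i": "1",
    "5": "S", "S": "5", "s": "5",
    "8": "B", "B": "8",
}


def ocr_errors(sentence, level, seed=None):
    """Swap up to round(level * len(sentence)) characters for OCR look-alikes."""
    rng = random.Random(seed)
    chars = list(sentence)
    # Only characters present in the confusion map can be perturbed.
    eligible = [i for i, ch in enumerate(chars) if ch in OCR_CONFUSIONS]
    n_edits = min(round(level * len(chars)), len(eligible))
    for pos in rng.sample(eligible, n_edits):
        chars[pos] = OCR_CONFUSIONS[chars[pos]]
    return "".join(chars)


sentence = "Natalia sold clips to 48 of her friends in April."
scanned = ocr_errors(sentence, level=0.25, seed=0)
print(scanned)
```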
Word Manipulation
Word-level Prompt Perturbation
The `WordPerturb` class manipulates the words in a sentence.
NOTE: the word count of a sentence includes only valid words; spaces, special symbols, and punctuation are not counted.
from promptcraft import word
sentence = "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May."
level = 0.25  # Fraction of words that will be manipulated (0.25 = 25%)
word_tool = word.WordPerturb(sentence=sentence, level=level)
Synonym Replacement
Randomly choose $n$ words from the sentence that are not stop words.
Replace each of these words with one of its synonyms chosen at random.
Problem 1: A chosen word may have no synonyms.
Problem 2: The sentence may offer fewer replaceable positions than the $n$ required.
word_synonym = word_tool.synonym_replacement()
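The following sketch shows one possible way to handle both problems: words without a synonym entry are skipped, and $n$ is capped at the number of replaceable positions. The `STOP_WORDS` set and `SYNONYMS` table are toy, hypothetical stand-ins for a real stop-word list and thesaurus, and how the toolkit itself resolves these cases is not stated here.

```python
import random

# Hypothetical toy resources, for illustration only.
STOP_WORDS = {"to", "of", "her", "in", "and", "then", "she", "as", "many"}
SYNONYMS = {
    "sold": ["vended", "traded"],
    "friends": ["pals", "companions"],
    "half": ["one-half"],
}


def replace_synonyms(sentence, n, seed=None):
    """Replace up to n non-stop words with a randomly chosen synonym."""
    rng = random.Random(seed)
    words = sentence.split()
    # Problem 1: only words that actually have synonyms are candidates.
    candidates = [
        i for i, w in enumerate(words)
        if w.lower() not in STOP_WORDS and w.lower() in SYNONYMS
    ]
    # Problem 2: cap n at the number of available positions.
    for pos in rng.sample(candidates, min(n, len(candidates))):
        words[pos] = rng.choice(SYNONYMS[words[pos].lower()])
    return " ".join(words)


sentence = "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May."
two_replaced = replace_synonyms(sentence, n=2, seed=0)
all_replaced = replace_synonyms(sentence, n=10, seed=0)
print(two_replaced)
```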
Word Insertion
Find a random synonym of a random word in the sentence that is not a stop word.
Insert that synonym into a random position in the sentence.
Do this $n$ times.
word_insert = word_tool.word_insertion()
Word Swap
Randomly choose two words in the sentence and swap their positions.
Do this $n$ times.
word_swap = word_tool.word_swap()
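A minimal sketch of the swap procedure, assuming whitespace tokenization and two distinct positions per swap (both are assumptions; `swap_words` and `seed` are hypothetical names):

```python
import random


def swap_words(sentence, n, seed=None):
    """Swap two randomly chosen words, n times."""
    rng = random.Random(seed)
    words = sentence.split()
    if len(words) < 2:
        return sentence  # nothing to swap
    for _ in range(n):
        # sample() returns two distinct indices.
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return " ".join(words)


sentence = "Natalia sold clips to 48 of her friends in April."
shuffled = swap_words(sentence, n=3, seed=0)
print(shuffled)
```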
Word Deletion
Each word in the sentence can be randomly removed with probability $p$.
word_delete = word_tool.word_deletion()
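Per-word deletion with probability $p$ can be sketched as an independent coin flip for each word. The fallback that keeps one word when everything would be removed is an assumption of this sketch, as are the names `delete_words` and `seed`.

```python
import random


def delete_words(sentence, p, seed=None):
    """Remove each word independently with probability p."""
    rng = random.Random(seed)
    words = sentence.split()
    kept = [w for w in words if rng.random() >= p]
    # Assumption: never return an empty sentence; keep one random word instead.
    if not kept and words:
        kept = [rng.choice(words)]
    return " ".join(kept)


sentence = "Natalia sold clips to 48 of her friends in April."
thinned = delete_words(sentence, p=0.25, seed=0)
print(thinned)
```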
Insert Punctuation
Randomly insert punctuation in the sentence with probability $p$.
word_punctuation = word_tool.insert_punctuation()
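One way to picture punctuation insertion is to flip a coin before every word and, on success, emit a random mark. The `PUNCTUATION` list, the insertion position (before a word), and the function name `insert_marks` are all assumptions of this sketch.

```python
import random

# Assumption: the pool of marks that may be inserted.
PUNCTUATION = [".", ",", "!", "?", ";", ":"]


def insert_marks(sentence, p, seed=None):
    """Before each word, insert a random punctuation mark with probability p."""
    rng = random.Random(seed)
    out = []
    for word in sentence.split():
        if rng.random() < p:
            out.append(rng.choice(PUNCTUATION))
        out.append(word)
    return " ".join(out)


sentence = "Natalia sold clips to 48 of her friends in April."
noisy = insert_marks(sentence, p=0.25, seed=0)
print(noisy)
```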
Word Split
Randomly pick a word and split it into two tokens at a random position.
word_split = word_tool.word_split()
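The sketch below splits a single word for simplicity; how `level` controls the number of splits in the toolkit is not described above, so that part is left out. `split_word` and `seed` are hypothetical names.

```python
import random


def split_word(sentence, seed=None):
    """Split one randomly chosen word into two tokens at a random cut point."""
    rng = random.Random(seed)
    words = sentence.split()
    # Only words with at least two characters can be split.
    splittable = [i for i, w in enumerate(words) if len(w) >= 2]
    if not splittable:
        return sentence
    pos = rng.choice(splittable)
    cut = rng.randint(1, len(words[pos]) - 1)  # keeps both halves non-empty
    words[pos] = words[pos][:cut] + " " + words[pos][cut:]
    return " ".join(words)


sentence = "Natalia sold clips to 48 of her friends in April."
split_result = split_word(sentence, seed=0)
print(split_result)
```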
Sentence Paraphrasing
Sentence-level Prompt Perturbation
The `SentencePerturb` class manipulates a sentence directly.
from promptcraft import sentence
sen = "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May."
sentence_tool = sentence.SentencePerturb(sentence=sen)
Back Translation by Hugging Face
Back translate the sentence (English $\rightarrow$ German $\rightarrow$ English) via 🤗 Hugging Face MarianMTModel
back_trans_hf = sentence_tool.back_translation_hugging_face()
Back Translation by Google Translator
Back translate the sentence (English $\rightarrow$ German $\rightarrow$ English) via Google Translate API
back_trans_google = sentence_tool.back_translation_google()
Paraphrasing
Paraphrase the sentence via the Parrot Paraphraser, considering:
(1) Adequacy: Is the meaning preserved adequately?
(2) Fluency: Is the paraphrase fluent English?
(3) Diversity (lexical / phrasal / syntactical): How much has the paraphrase changed the original sentence?
sen_paraphrase = sentence_tool.paraphrase()
Formal Style
Transform the sentence style to Formal
sen_formal = sentence_tool.formal()
Casual Style
Transform the sentence style to Casual
sen_casual = sentence_tool.casual()
Passive Style
Transform the sentence style to Passive
sen_passive = sentence_tool.passive()
Active Style
Transform the sentence style to Active
sen_active = sentence_tool.active()
Parallel Processing
Since all the methods are executed on the CPU, they can be performed in parallel using the `multiprocessing` package.
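As a rough illustration of the idea, the sketch below fans a batch of sentences out to a `multiprocessing.Pool`. The worker `delete_characters` is a self-contained stand-in for any perturbation; in practice it could wrap a call into the toolkit instead. The function names, the per-task seed, and the default of two workers are assumptions.

```python
import random
from multiprocessing import Pool


def delete_characters(task):
    """Worker: drop a fraction `level` of the characters of one sentence."""
    sentence, level, seed = task
    rng = random.Random(seed)
    n_edits = round(level * len(sentence))
    doomed = set(rng.sample(range(len(sentence)), n_edits))
    return "".join(ch for i, ch in enumerate(sentence) if i not in doomed)


def perturb_batch(sentences, level=0.25, workers=2):
    """Perturb many sentences in parallel, one task per sentence."""
    # Each task carries its own seed so results do not depend on scheduling.
    tasks = [(s, level, seed) for seed, s in enumerate(sentences)]
    with Pool(processes=workers) as pool:
        # map() preserves input order in the returned list.
        return pool.map(delete_characters, tasks)
```

When running this as a script, call `perturb_batch(...)` from inside an `if __name__ == "__main__":` guard, since worker processes may re-import the main module on platforms that do not fork.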
Structure of the Code
At the root of the project, you will see:
.
├── LICENSE
├── README.md
├── promptcraft
│ ├── __init__.py
│ ├── character.py
│ ├── parrot.py
│ ├── sentence.py
│ ├── styleformer.py
│ └── word.py
├── setup.cfg
└── setup.py
Citation
If you find our toolkit useful, please consider citing our repo and toolkit in your publications. We provide a BibTeX entry below.
@misc{JiaPromptCraft23,
author = {Jia, Shuyue},
title = {{PromptCraft}: A Prompt Perturbation Toolkit},
year = {2023},
publisher = {GitHub},
journal = {GitHub Repository},
howpublished = {\url{https://github.com/SuperBruceJia/promptcraft}},
}
@misc{JiaAwesomeLLM23,
author = {Jia, Shuyue},
title = {Awesome {LLM} Self-Consistency},
year = {2023},
publisher = {GitHub},
journal = {GitHub Repository},
howpublished = {\url{https://github.com/SuperBruceJia/Awesome-LLM-Self-Consistency}},
}
@misc{JiaAwesomeSTS23,
author = {Jia, Shuyue},
title = {Awesome Semantic Textual Similarity},
year = {2023},
publisher = {GitHub},
journal = {GitHub Repository},
howpublished = {\url{https://github.com/SuperBruceJia/Awesome-Semantic-Textual-Similarity}},
}
Acknowledgement
This work was finished during my 2023 fall semester research rotation at the Dependable Computing Laboratory, Department of Electrical and Computer Engineering, Boston University, under the supervision of Prof. Wenchao Li.