
PromptCraft: A Prompt Perturbation Toolkit for Prompt Robustness Analysis


Requires Python 3.9+


Installation

pip install promptcraft

Character Editing

Character-level Prompt Perturbation
CharacterPerturb class for manipulating the characters in a sentence

from promptcraft import character

sentence = "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May."
level = 0.25  # Fraction of characters that will be edited (0.25 = 25%)
character_tool = character.CharacterPerturb(sentence=sentence, level=level)

Character Replacement

Randomly replace a fraction level of the characters in the sentence

char_replace = character_tool.character_replacement()
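To make the idea concrete, here is a minimal standard-library sketch of character replacement. It is not PromptCraft's implementation: the function name replace_characters, the seed parameter, and the choice of ASCII letters as the replacement alphabet are all assumptions made for illustration.

```python
import random
import string


def replace_characters(sentence: str, level: float, seed: int = 0) -> str:
    """Overwrite roughly `level` of the characters with random ASCII letters."""
    rng = random.Random(seed)  # seeded so the result is reproducible
    chars = list(sentence)
    n_edits = int(len(chars) * level)
    # Pick distinct positions so each edit touches a different character.
    for i in rng.sample(range(len(chars)), n_edits):
        chars[i] = rng.choice(string.ascii_letters)  # assumed alphabet
    return "".join(chars)


print(replace_characters("Natalia sold clips to 48 of her friends", 0.25))
```

A replacement letter can coincide with the original character, so the number of visibly changed characters may be slightly lower than the number of edits.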

Character Deletion

Randomly delete a fraction level of the characters from the sentence

char_delete = character_tool.character_deletion()
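As a rough illustration, deletion can be sketched with the standard library alone. The helper delete_characters and its seed parameter are hypothetical names, and the toolkit's own method may choose positions differently.

```python
import random


def delete_characters(sentence: str, level: float, seed: int = 0) -> str:
    """Drop roughly `level` of the characters at random positions."""
    rng = random.Random(seed)
    n_delete = int(len(sentence) * level)
    # Positions to drop, sampled without repetition.
    drop = set(rng.sample(range(len(sentence)), n_delete))
    return "".join(c for i, c in enumerate(sentence) if i not in drop)


print(delete_characters("Natalia sold clips to 48 of her friends", 0.25))
```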

Character Insertion

Randomly insert new characters into the sentence, amounting to a fraction level of its length

char_insert = character_tool.character_insertion()
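The sketch below shows one plausible way to do this. It assumes that the inserted characters are random ASCII letters and that any gap, including both ends of the sentence, is a valid position; neither detail is confirmed by the toolkit, and insert_characters is a hypothetical name.

```python
import random
import string


def insert_characters(sentence: str, level: float, seed: int = 0) -> str:
    """Insert about `level * len(sentence)` random letters at random gaps."""
    rng = random.Random(seed)
    chars = list(sentence)
    n_insert = int(len(chars) * level)
    for _ in range(n_insert):
        pos = rng.randint(0, len(chars))  # inclusive: start and end are allowed
        chars.insert(pos, rng.choice(string.ascii_letters))
    return "".join(chars)


print(insert_characters("Natalia sold clips to 48 of her friends", 0.25))
```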

Character Swap

Randomly swap a fraction level of the characters in the sentence
NOTE: a character may be swapped with itself (self-swap), which leaves it unchanged

char_swap = character_tool.character_swap()
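One reading of the self-swap note is that both positions of each swap are drawn independently, so they can coincide. The following sketch works under that assumption; swap_characters is an illustrative name, not part of the toolkit.

```python
import random


def swap_characters(sentence: str, level: float, seed: int = 0) -> str:
    """Perform about `level * len(sentence)` random pairwise swaps."""
    rng = random.Random(seed)
    chars = list(sentence)
    n_swaps = int(len(chars) * level)
    for _ in range(n_swaps):
        i = rng.randrange(len(chars))
        j = rng.randrange(len(chars))  # j may equal i: a self-swap, no change
        chars[i], chars[j] = chars[j], chars[i]
    return "".join(chars)


print(swap_characters("Natalia sold clips to 48 of her friends", 0.25))
```

Swapping never adds or removes characters, so the output is always a permutation of the input.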

Keyboard Typos

Randomly substitute a fraction level of the characters in the sentence with a randomly chosen neighboring key on the keyboard (US full-size layout)
NOTE:
(1) keyboard_distance=1 is used, i.e., only the nearest letters, digits, or symbols are candidates.
(2) If the substitute is a letter, its case (lowercase or uppercase) is chosen at random.

char_keyboard = character_tool.keyboard_typos()
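A simplified sketch of the neighbor lookup follows. It assumes a reduced QWERTY model in which only the keys directly to the left and right in the same row count as distance 1, and it ignores symbol keys. The toolkit's actual keyboard map is not shown here, and ROWS, NEIGHBORS, and keyboard_typos are hypothetical names.

```python
import random

# Reduced model of a US keyboard: digits and letters only.
ROWS = ["1234567890", "qwertyuiop", "asdfghjkl", "zxcvbnm"]


def build_neighbors(rows):
    """Map each key to its left/right neighbors in the same row."""
    neighbors = {}
    for row in rows:
        for i, key in enumerate(row):
            neighbors[key] = [row[j] for j in (i - 1, i + 1) if 0 <= j < len(row)]
    return neighbors


NEIGHBORS = build_neighbors(ROWS)


def keyboard_typos(sentence: str, level: float, seed: int = 0) -> str:
    rng = random.Random(seed)
    chars = list(sentence)
    # Only characters present in the map can receive a typo.
    candidates = [i for i, c in enumerate(chars) if c.lower() in NEIGHBORS]
    n_edits = min(int(len(chars) * level), len(candidates))
    for i in rng.sample(candidates, n_edits):
        typo = rng.choice(NEIGHBORS[chars[i].lower()])
        if typo.isalpha():
            # Letters get a random case, as described in note (2).
            typo = rng.choice([typo.lower(), typo.upper()])
        chars[i] = typo
    return "".join(chars)


print(keyboard_typos("Natalia sold clips to 48 of her friends", 0.25))
```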

Optical Character Recognition

Randomly substitute a fraction level of the characters in the sentence with a common OCR misrecognition taken from an OCR error map

char_ocr = character_tool.optical_character_recognition()
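The mechanism can be sketched with a tiny confusion table. The OCR_CONFUSIONS entries below are a small hypothetical sample of look-alike pairs, not the map shipped with the toolkit, and ocr_errors is an illustrative name.

```python
import random

# Hypothetical sample of visually confusable characters.
OCR_CONFUSIONS = {
    "0": "O", "O": "0",
    "1": "l", "l": "1",
    "5": "S", "S": "5",
    "8": "B", "B": "8",
}


def ocr_errors(sentence: str, level: float, seed: int = 0) -> str:
    rng = random.Random(seed)
    chars = list(sentence)
    # Only characters with a known confusion can be substituted.
    candidates = [i for i, c in enumerate(chars) if c in OCR_CONFUSIONS]
    n_edits = min(int(len(chars) * level), len(candidates))
    for i in rng.sample(candidates, n_edits):
        chars[i] = OCR_CONFUSIONS[chars[i]]
    return "".join(chars)


print(ocr_errors("Natalia sold clips to 48 of her friends", 0.25))
```

Because substitutions are limited to characters in the table, a sentence with few confusable characters receives fewer edits than level would otherwise request.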

Word Manipulation

Word-level Prompt Perturbation
WordPerturb class for manipulating words in a sentence

NOTE: the word count of a sentence includes only valid words; spaces, special symbols, and punctuation are not counted

from promptcraft import word

sentence = "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May."
level = 0.25  # Fraction of words that will be manipulated (0.25 = 25%)
word_tool = word.WordPerturb(sentence=sentence, level=level)

Synonym Replacement

Randomly choose $n$ words from the sentence that are not stop words.
Replace each of these words with one of its synonyms chosen at random.
Problem 1: a chosen word may have no synonyms.
Problem 2: the sentence may offer fewer eligible positions than the $n$ positions needed.

word_synonym = word_tool.synonym_replacement()
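The sketch below illustrates one possible way to deal with both problems: restrict the choice to words that actually have a synonym, and cap $n$ at the number of such words. The tiny STOP_WORDS set and SYNONYMS table are hypothetical stand-ins for a real stop-word list and thesaurus, and how the toolkit itself resolves these cases is not stated here.

```python
import random

# Hypothetical miniature resources, for illustration only.
STOP_WORDS = {"to", "of", "her", "in", "and", "then", "she", "as"}
SYNONYMS = {
    "sold": ["vended", "traded"],
    "friends": ["pals", "companions"],
}


def synonym_replacement(words: list[str], n: int, seed: int = 0) -> list[str]:
    rng = random.Random(seed)
    # Problem 1: skip words without any synonym.
    eligible = [
        i for i, w in enumerate(words)
        if w.lower() not in STOP_WORDS and w.lower() in SYNONYMS
    ]
    out = list(words)
    # Problem 2: never ask for more positions than are eligible.
    for i in rng.sample(eligible, min(n, len(eligible))):
        out[i] = rng.choice(SYNONYMS[words[i].lower()])
    return out


print(" ".join(synonym_replacement("Natalia sold clips to her friends".split(), 2)))
```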

Word Insertion

Find a random synonym of a random word in the sentence that is not a stop word.
Insert that synonym into a random position in the sentence.
Do this $n$ times.

word_insert = word_tool.word_insertion()

Word Swap

Randomly choose two words in the sentence and swap their positions.
Do this $n$ times.

word_swap = word_tool.word_swap()
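A minimal sketch of the swap step on an already tokenized sentence is given below. The function name swap_words is an assumption, as is the choice to always pick two distinct positions.

```python
import random


def swap_words(words: list[str], n: int, seed: int = 0) -> list[str]:
    """Swap two randomly chosen words, repeated n times."""
    rng = random.Random(seed)
    out = list(words)
    if len(out) < 2:
        return out  # nothing to swap
    for _ in range(n):
        i, j = rng.sample(range(len(out)), 2)  # two distinct positions
        out[i], out[j] = out[j], out[i]
    return out


print(" ".join(swap_words("Natalia sold clips to 48 of her friends".split(), 2)))
```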

Word Deletion

Each word in the sentence can be randomly removed with probability $p$.

word_delete = word_tool.word_deletion()
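In code, the per-word coin flip might look like the sketch below. Keeping at least one word when every word is dropped is an assumption borrowed from common data-augmentation practice, and delete_words is a hypothetical name.

```python
import random


def delete_words(words: list[str], p: float, seed: int = 0) -> list[str]:
    """Remove each word independently with probability p."""
    rng = random.Random(seed)
    if len(words) <= 1:
        return list(words)
    kept = [w for w in words if rng.random() >= p]
    # Assumed safeguard: never return an empty sentence.
    return kept if kept else [rng.choice(words)]


print(" ".join(delete_words("Natalia sold clips to 48 of her friends".split(), 0.25)))
```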

Insert Punctuation

Randomly insert punctuation in the sentence with probability $p$.

word_punctuation = word_tool.insert_punctuation()
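The following sketch inserts a punctuation token before a word with probability $p$. The set of punctuation marks and the decision to insert before (rather than after) a word are assumptions; insert_punctuation here is a standalone illustration that only shares its name with the toolkit method.

```python
import random

PUNCTUATION = [".", ",", "!", "?", ";", ":"]  # assumed candidate marks


def insert_punctuation(words: list[str], p: float, seed: int = 0) -> list[str]:
    rng = random.Random(seed)
    out = []
    for w in words:
        if rng.random() < p:
            out.append(rng.choice(PUNCTUATION))  # extra token before the word
        out.append(w)
    return out


print(" ".join(insert_punctuation("Natalia sold clips to her friends".split(), 0.25)))
```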

Word Split

Randomly pick a word and split it into two tokens at a random position

word_split = word_tool.word_split()
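For illustration, splitting a single word can be sketched as follows. The sketch splits exactly one word per call and skips one-character words; how many words the toolkit splits for a given level is not covered here, and split_word is a hypothetical name.

```python
import random


def split_word(words: list[str], seed: int = 0) -> list[str]:
    """Split one randomly chosen word into two tokens."""
    rng = random.Random(seed)
    words = list(words)
    # A word needs at least two characters to be split.
    candidates = [i for i, w in enumerate(words) if len(w) >= 2]
    if not candidates:
        return words
    i = rng.choice(candidates)
    cut = rng.randint(1, len(words[i]) - 1)  # both halves stay non-empty
    return words[:i] + [words[i][:cut], words[i][cut:]] + words[i + 1:]


print(" ".join(split_word("Natalia sold clips to her friends".split())))
```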

Sentence Paraphrasing

Sentence-level Prompt Perturbation
SentencePerturb class for directly manipulating a sentence

from promptcraft import sentence

sen = "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May."
sentence_tool = sentence.SentencePerturb(sentence=sen)

Back Translation by Hugging Face

Back translate the sentence (English $\rightarrow$ German $\rightarrow$ English) via 🤗 Hugging Face MarianMTModel

back_trans_hf = sentence_tool.back_translation_hugging_face()

Back Translation by Google Translator

Back translate the sentence (English $\rightarrow$ German $\rightarrow$ English) via Google Translate API

back_trans_google = sentence_tool.back_translation_google()

Paraphrasing

Paraphrase the sentence via Parrot Paraphraser, considering
(1) Adequacy: Is the meaning preserved adequately?
(2) Fluency: Is the paraphrase fluent English?
(3) Diversity (Lexical / Phrasal / Syntactical): How much has the paraphrase changed the original sentence?

sen_paraphrase = sentence_tool.paraphrase()

Formal Style

Transform the sentence style to Formal

sen_formal = sentence_tool.formal()

Casual Style

Transform the sentence style to Casual

sen_casual = sentence_tool.casual()

Passive Style

Transform the sentence style to Passive

sen_passive = sentence_tool.passive()

Active Style

Transform the sentence style to Active

sen_active = sentence_tool.active()

Parallel Processing

Since all the methods are executed on the CPU, they can be performed in parallel using the multiprocessing package.
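A sketch of this pattern with multiprocessing.Pool is shown below. To keep it self-contained, perturb_one is a stand-in that deletes random characters using only the standard library; in practice its body would call one of the toolkit methods shown earlier, such as character.CharacterPerturb(sentence=..., level=...).character_deletion(). The names perturb_one and perturb_all and the workers parameter are assumptions.

```python
import random
from multiprocessing import Pool


def perturb_one(job):
    """Stand-in perturbation; replace the body with a toolkit call."""
    sentence, level, seed = job
    rng = random.Random(seed)
    n_delete = int(len(sentence) * level)
    drop = set(rng.sample(range(len(sentence)), n_delete))
    return "".join(c for i, c in enumerate(sentence) if i not in drop)


def perturb_all(sentences, level=0.25, workers=2):
    """Perturb many sentences in parallel, one job per sentence."""
    jobs = [(s, level, seed) for seed, s in enumerate(sentences)]
    with Pool(processes=workers) as pool:
        return pool.map(perturb_one, jobs)  # results keep the input order


if __name__ == "__main__":
    prompts = [
        "Natalia sold clips to 48 of her friends in April.",
        "She sold half as many clips in May.",
    ]
    for result in perturb_all(prompts):
        print(result)
```

The worker function is defined at module level so that it can be pickled, and the entry point is guarded with `if __name__ == "__main__":`, which multiprocessing requires on platforms that start workers by spawning a fresh interpreter.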

Structure of the Code

At the root of the project, you will see:

.
├── LICENSE
├── README.md
├── promptcraft
│   ├── __init__.py
│   ├── character.py
│   ├── parrot.py
│   ├── sentence.py
│   ├── styleformer.py
│   └── word.py
├── setup.cfg
└── setup.py

Citation

If you find this toolkit useful, please consider citing our repos in your publications. BibTeX entries are provided below.

@misc{JiaPromptCraft23,
      author = {Jia, Shuyue},
      title = {{PromptCraft}: A Prompt Perturbation Toolkit},
      year = {2023},
      publisher = {GitHub},
      journal = {GitHub Repository},
      howpublished = {\url{https://github.com/SuperBruceJia/promptcraft}},
}

@misc{JiaAwesomeLLM23,
      author = {Jia, Shuyue},
      title = {Awesome-{LLM}-Self-Consistency},
      year = {2023},
      publisher = {GitHub},
      journal = {GitHub Repository},
      howpublished = {\url{https://github.com/SuperBruceJia/Awesome-LLM-Self-Consistency}},
}

@misc{JiaAwesomeSTS23,
      author = {Jia, Shuyue},
      title = {Awesome-Semantic-Textual-Similarity},
      year = {2023},
      publisher = {GitHub},
      journal = {GitHub Repository},
      howpublished = {\url{https://github.com/SuperBruceJia/Awesome-Semantic-Textual-Similarity}},
}

Acknowledgement

This work was finished during my 2023 fall semester research rotation at the Dependable Computing Laboratory, Department of Electrical and Computer Engineering, Boston University, under the supervision of Prof. Wenchao Li.

