Toolkit for collecting and applying templates of prompting instances.

These details have not been verified by PyPI

Project links

Homepage

Development Status
- 2 - Pre-Alpha
Intended Audience
- Developers
License
- OSI Approved :: Apache Software License
Natural Language
- English
Programming Language
- Python :: 3
- Python :: 3.7

Project description

PromptSource

Promptsource is a toolkit for collecting and applying prompts to NLP datasets.

Promptsource uses a simple templating language to programatically map an example of a dataset into a text input and a text target.

Promptsource contains a growing collection of prompts (which we call P3: Public Pool of Prompts). As of October 18th, there are ~2'000 prompts for 170+ datasets in P3. Feel free to use these prompts as they are (you'll find citation details here).

Note that a subset of the prompts are still Work in Progress. You'll find the list of the prompts which will potentially be modified in the near future here. Modifications will in majority consist of metadata collection, but in some cases, will impact the templates themselves. To facilitate traceability, Promptsource is currently pinned at version 0.1.0.

Setup

Download the repo
Navigate to root directory of the repo
Install requirements with pip install -r requirements.txt in a Python 3.7 environment

Running

You can browse through existing prompts on the hosted versiond of Promptsource.

If you want to launch a local version (in particular to write propmts, from the root directory of the repo, launch the editor with:

streamlit run promptsource/app.py

There are 3 modes in the app:

Helicopter view: aggregate high level metrics on the current state of the sourcing
Prompted dataset viewer: check the templates you wrote or already written on entire dataset
Sourcing: write new prompts

Running (read-only)

To host a public streamlit app, launch it with

streamlit run promptsource/app.py -- -r

Prompting an Example:

You can use Promptsource with Datasets to create prompted examples:

# Get an example
from datasets import load_dataset
dataset = load_dataset("ag_news")
example = dataset["train"][0]

# Prompt it
from promptsource.templates import TemplateCollection
# Get all the prompts
collection = TemplateCollection()
# Get all the AG News prompts
ag_news_prompts = collection.get_dataset("ag_news")
# Select a prompt by name
prompt = ag_news_prompts["classify_question_first"]

result = prompt.apply(example)
print("INPUT: ", result[0])
print("TARGET: ", result[1])

Contributing

Contribution guidelines and step-by-step HOW TO are described here.

Writing Prompts

A prompt is expressed in Jinja.

It is rendered using an example from the corresponding Hugging Face datasets library (a dictionary). The separator ||| should appear once to divide the template into input and target. Generally, the prompt should provide information on the desired behavior, e.g., text passage and instructions, and the output should be a desired response.

For more information, read the Contribution guidelines.

Known Issues

Warning or Error about Darwin on OS X: Try downgrading PyArrow to 3.0.0.

ConnectionRefusedError: [Errno 61] Connection refused: Happens occasionally. Try restarting the app.

Development structure

Promptsource was developed as part of the BigScience project for open research 🌸, a year-long initiative targeting the study of large models and datasets. The goal of the project is to research language models in a public environment outside large technology companies. The project has 600 researchers from 50 countries and more than 250 institutions.

Citation

If you want to cite this P3 or Promptsource, you can use this bibtex:

@misc{sanh2021multitask,
      title={Multitask Prompted Training Enables Zero-Shot Task Generalization}, 
      author={Victor Sanh and Albert Webson and Colin Raffel and Stephen H. Bach and Lintang Sutawika and Zaid Alyafeai and Antoine Chaffin and Arnaud Stiegler and Teven Le Scao and Arun Raja and Manan Dey and M Saiful Bari and Canwen Xu and Urmish Thakker and Shanya Sharma Sharma and Eliza Szczechla and Taewoon Kim and Gunjan Chhablani and Nihal Nayak and Debajyoti Datta and Jonathan Chang and Mike Tian-Jian Jiang and Han Wang and Matteo Manica and Sheng Shen and Zheng Xin Yong and Harshit Pandey and Rachel Bawden and Thomas Wang and Trishala Neeraj and Jos Rozen and Abheesht Sharma and Andrea Santilli and Thibault Fevry and Jason Alan Fries and Ryan Teehan and Stella Biderman and Leo Gao and Tali Bers and Thomas Wolf and Alexander M. Rush},
      year={2021},
      eprint={2110.08207},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

Project details

These details have not been verified by PyPI

Project links

Homepage

Development Status
- 2 - Pre-Alpha
Intended Audience
- Developers
License
- OSI Approved :: Apache Software License
Natural Language
- English
Programming Language
- Python :: 3
- Python :: 3.7

Release history Release notifications | RSS feed

0.2.3

Apr 18, 2022

0.2.2

Mar 23, 2022

0.2.1

Feb 14, 2022

0.2.0

Feb 4, 2022

This version

0.1.0

Mar 23, 2022

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

promptsource-0.1.0.tar.gz (221.9 kB view details)

Uploaded Mar 23, 2022 Source

File details

Details for the file promptsource-0.1.0.tar.gz.

File metadata

Download URL: promptsource-0.1.0.tar.gz
Upload date: Mar 23, 2022
Size: 221.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/34.0 requests/2.27.1 requests-toolbelt/0.9.1 urllib3/1.26.8 tqdm/4.62.3 importlib-metadata/4.2.0 keyring/23.5.0 rfc3986/2.0.0 colorama/0.4.4 CPython/3.7.11

File hashes

Hashes for promptsource-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`726289b468dbd3148d11e37a9959587f4dc560a5b9e95e7ace540e6727646178`
MD5	`ce9e604ab0e149bf95405ee1eb9e1ccd`
BLAKE2b-256	`c7cc6a71cdc9d9547310fd0c48395f7ca8cada55c412fe14eb0ba3075f5c68eb`

See more details on using hashes here.

promptsource 0.1.0

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

PromptSource

Setup

Running

Running (read-only)

Prompting an Example:

Contributing

Writing Prompts

Known Issues

Development structure

Citation

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

File details

File metadata

File hashes