This is a modified version of the original OpenICL 0.1.8, adapted for certain research uses.
Project description
Overview • Installation • Paper • Examples • Docs • Citation
Overview
OpenICL provides an easy interface for in-context learning, with many state-of-the-art retrieval and inference methods built in to facilitate systematic comparison of LMs and fast research prototyping. Users can easily incorporate different retrieval and inference methods, as well as different prompt instructions into their workflow.
What's New
- v0.1.8: Added support for LLaMA and self-consistency
Installation
Note: OpenICL requires Python 3.8+
Using Pip
pip install openicl
Installation for local development:
git clone https://github.com/Shark-NLP/OpenICL
cd OpenICL
pip install -e .
Quick Start
The following example shows how to perform ICL on a sentiment classification dataset. More examples and tutorials can be found in examples.
Step 1: Load and prepare data
from datasets import load_dataset
from openicl import DatasetReader
# Load the dataset from the HuggingFace Hub
dataset = load_dataset('gpt3mix/sst2')
# Define a DatasetReader, specifying the column names where the input and output are stored.
data = DatasetReader(dataset, input_columns=['text'], output_column='label')
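If you want a faster first run, the full dataset can be subsampled before it is wrapped in a DatasetReader. This is only a minimal sketch, not part of the official quick start; the split names and sample counts below are illustrative.
from datasets import load_dataset, DatasetDict
from openicl import DatasetReader

# Keep a small slice of each split for a quick smoke test
dataset = load_dataset('gpt3mix/sst2')
small = DatasetDict({
    'train': dataset['train'].select(range(256)),  # candidate pool for in-context examples
    'test': dataset['test'].select(range(64)),     # inputs to run inference on
})
data = DatasetReader(small, input_columns=['text'], output_column='label')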
Step 2: Define the prompt template (Optional)
from openicl import PromptTemplate
tp_dict = {
    0: "</E>Positive Movie Review: </text>",
    1: "</E>Negative Movie Review: </text>"
}
template = PromptTemplate(tp_dict, {'text': '</text>'}, ice_token='</E>')
The placeholders </E> and </text> will be replaced by the in-context examples and the testing input, respectively. For more detailed information about PromptTemplate (such as string-type templates), please see tutorial1.
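As a rough illustration of the string-type template mentioned above (a minimal sketch; tutorial1 is the authoritative reference), a single template string with a column-to-token map can be used instead of the per-label dictionary. The </label> token used here is illustrative.
from openicl import PromptTemplate

# Single string template: </E> marks where in-context examples go,
# </text> maps to the input column and </label> to the output column.
template = PromptTemplate(
    "</E>Movie Review: </text>\nSentiment: </label>",
    {'text': '</text>', 'label': '</label>'},
    ice_token='</E>'
)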
Step 3: Initialize the Retriever
from openicl import TopkRetriever
# Define a retriever using the previous `DatasetReader`.
# `ice_num` is the number of in-context examples retrieved for each test input.
retriever = TopkRetriever(data, ice_num=8)
Here we use the popular TopK method to build the retriever.
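Other retrievers shipped with OpenICL follow the same interface and can be swapped in directly; for example (a brief sketch, reusing the same `ice_num`):
from openicl import RandomRetriever, BM25Retriever

# Randomly sampled in-context examples (a common baseline)
random_retriever = RandomRetriever(data, ice_num=8)
# BM25-based lexical retrieval of in-context examples
bm25_retriever = BM25Retriever(data, ice_num=8)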
Step 4: Initialize the Inferencer
from openicl import PPLInferencer
inferencer = PPLInferencer(model_name='distilgpt2')
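For generation-style tasks, or for the LLaMA support noted in What's New, a GenInferencer can be used instead of the PPL-based one. This is a sketch only; the model identifier below is a placeholder and should be replaced with a checkpoint you actually have access to.
from openicl import GenInferencer

# Generation-based inference with a causal LM; any HuggingFace model name or
# local path works here (the LLaMA identifier below is illustrative).
gen_inferencer = GenInferencer(model_name='huggyllama/llama-7b')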
Step 5: Inference and scoring
from openicl import AccEvaluator
# The inferencer uses the retriever to collect in-context examples and the template to wrap them up.
predictions = inferencer.inference(retriever, ice_template=template)
# Compute accuracy for the predictions
score = AccEvaluator().score(predictions=predictions, references=data.references)
print(score)
Docs
(updating...)
Citation
If you find this repository helpful, feel free to cite our paper:
@article{wu2023openicl,
  title={OpenICL: An Open-Source Framework for In-context Learning},
  author={Wu, Zhenyu and Wang, Yaoxiang and Ye, Jiacheng and Feng, Jiangtao and Xu, Jingjing and Qiao, Yu and Wu, Zhiyong},
  journal={arXiv preprint arXiv:2303.02913},
  year={2023}
}
Project details
Download files
Download the file for your platform.
Source Distribution
openicl_nolabel-0.0.1.tar.gz (22.8 kB)
Built Distribution
openicl_nolabel-0.0.1-py3-none-any.whl (37.2 kB)
File details
Details for the file openicl_nolabel-0.0.1.tar.gz.
File metadata
- Download URL: openicl_nolabel-0.0.1.tar.gz
- Upload date:
- Size: 22.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.9.18
File hashes
Algorithm | Hash digest
---|---
SHA256 | c94db7074a9353a91899a094c94b81a8956a5e8e616ee17e2eba6ec16f0f3a7d
MD5 | 83eaab5f963430c5e343455c5e4b4201
BLAKE2b-256 | b44e12eadc4955a8004c43474b772f0a59068a363cb34253ad6d9d811686037a
File details
Details for the file openicl_nolabel-0.0.1-py3-none-any.whl.
File metadata
- Download URL: openicl_nolabel-0.0.1-py3-none-any.whl
- Upload date:
- Size: 37.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.9.18
File hashes
Algorithm | Hash digest
---|---
SHA256 | f6ad3a8c790af6ff099cc857666aa2314df3995b398ea4d3c54582b5df112ed3
MD5 | 14e42e259a368a5d0573a48286bc0e52
BLAKE2b-256 | 8785f2da5aedefd6abf0890bba467f674afdfe69777e6fc145072d84b1c1db4e