Library for running AdaptiveConsistency-based inference on large language models.

Let's Sample Step by Step: Adaptive-Consistency for Efficient Reasoning with LLMs

Pranjal Aggarwal, Aman Madaan, Yiming Yang, Mausam

Abstract

A popular approach for improving the correctness of output from large language models (LLMs) is Self-Consistency: poll the LLM multiple times and output the most frequent solution. Existing Self-Consistency techniques always draw a constant number of samples per question, whereas a better approach would be to distribute the available budget non-uniformly, based on the amount of agreement in the samples drawn so far. In response, we introduce Adaptive-Consistency, a cost-efficient, model-agnostic technique that dynamically adjusts the number of samples per question using a lightweight stopping criterion. Our experiments over 13 datasets and two LLMs demonstrate that Adaptive-Consistency reduces the sample budget by up to 6.0 times with an average accuracy drop of less than 0.1%.

AdaptiveConsistency

This repository contains code for:

The Adaptive-Consistency library, for running efficient LLM generation with Adaptive-Consistency in your own code.
Code to reproduce the results of Adaptive-Consistency.

Installation

From PyPI

pip install AdaptiveConsistency

From Source

First, clone the repo:

git clone https://github.com/Pranjal2041/AdaptiveConsistency.git

Next, install the package using:

python setup.py install

Usage

Using Adaptive Consistency in your code requires only 2-3 lines of changes in your existing framework.

1. Importing the library

from adaptive_consistency import AC, BetaStoppingCriteria

2. Initializing the library

ac = AC(model, stopping_criteria=BetaStoppingCriteria(0.95), max_gens = 40)

3. Using the library

You can directly run a whole loop of evaluation using:

ac.eval_loop(sampling_function, *args, **kwargs)

For example, if you sample with the OpenAI API, you can use:

import openai

ac.eval_loop(openai.Completion.create, engine="text-davinci-003", prompt="Solve the questions ahead", max_tokens=5)

Or you can check for consistency of answers at each step:

answers = []
for i in range(40):
    answers.append(generate_answer_from_model())  # e.g. a call to openai.Completion.create
    if ac.should_stop(answers):
        break
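
To make the control flow of this loop concrete without calling a real model, it can be sketched with stand-ins. Everything below is hypothetical: `fake_model` replaces the LLM call, and `majority_should_stop` is a simplified stand-in for `ac.should_stop`, not the library's implementation.

```python
import random
from collections import Counter


def fake_model(rng):
    """Hypothetical stand-in for an LLM call: answers "42" about 80% of the time."""
    return "42" if rng.random() < 0.8 else rng.choice(["41", "43"])


def majority_should_stop(answers, threshold=0.8, min_samples=3):
    """Simplified stand-in for ac.should_stop (an assumption, not the library's code).

    Stop once the most common answer holds at least `threshold` of the samples.
    """
    if len(answers) < min_samples:
        return False
    top_count = Counter(answers).most_common(1)[0][1]
    return top_count / len(answers) >= threshold


def adaptive_sample(max_gens=40, seed=0):
    """Draw samples one at a time and stop as soon as the criterion fires."""
    rng = random.Random(seed)
    answers = []
    for _ in range(max_gens):
        answers.append(fake_model(rng))
        if majority_should_stop(answers):
            break
    # The final prediction is the majority answer among the samples drawn so far.
    prediction = Counter(answers).most_common(1)[0][0]
    return prediction, len(answers)


prediction, n_used = adaptive_sample()
print(prediction, n_used)
```

When the samples agree early, the loop ends after a handful of generations; when they disagree, it keeps sampling up to `max_gens`. That is where the budget saving comes from.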

4. Stopping Criteria

You can use one of the following stopping criteria:

BetaStoppingCriteria (beta): Uses the Beta Distribution to guide the stopping criteria. This is the default stopping criterion.
DirichletStoppingCriteria (dirichlet): Uses the Dirichlet Distribution to guide the stopping criteria.
EntropyStoppingCriteria (entropy): Uses the Entropy of the distribution to guide the stopping criteria.
MajorityStoppingCriteria (majority): Uses the Majority ratio of the top element in the distribution to guide the stopping criteria.
RandomStoppingCriteria (random): Randomly stops the sampling process with a pre-defined probability.
CRPStoppingCriteria (crp): Uses the Chinese Restaurant Process to guide the stopping criteria.

Check out the paper for more details.
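
As an illustration of how a Beta-based criterion could work, here is a minimal sketch. It assumes the criterion looks only at the counts of the two most frequent answers, places a Beta(top + 1, second + 1) posterior on the top answer's share, and stops once the probability that this share exceeds 0.5 reaches the threshold. The names `beta_confidence` and `beta_should_stop` are hypothetical, and the library's `BetaStoppingCriteria` may compute this differently.

```python
from collections import Counter
from math import comb


def beta_confidence(top_count, second_count):
    """P(p > 0.5) for p ~ Beta(top_count + 1, second_count + 1).

    For integer parameters the Beta tail equals a binomial sum,
    so no numerical integration is needed.
    """
    n = top_count + second_count + 1
    return sum(comb(n, k) for k in range(top_count + 1)) / 2 ** n


def beta_should_stop(answers, conf_thresh=0.95):
    """Hypothetical stopping rule built on beta_confidence."""
    counts = Counter(answers).most_common(2)
    if not counts:
        return False
    top = counts[0][1]
    second = counts[1][1] if len(counts) > 1 else 0
    return beta_confidence(top, second) >= conf_thresh


print(beta_confidence(1, 0))          # 0.75
print(beta_confidence(3, 0))          # 0.9375
print(beta_should_stop(["7"] * 4))    # True
```

Under these assumptions, three unanimous samples give a confidence of 0.9375, just below a 0.95 threshold, while four unanimous samples give 0.96875 and trigger the stop.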

Reproducing Numbers

1. Downloading the data

Run:

bash download_data.sh

2. Downloading Model Outputs

We provide the model outputs for all the models used in the paper. You can download them using:

bash download_outputs.sh

These model outputs will work for all experiments in the paper.

3. Running Generations

If you skip the previous step, you can run the generations yourself with the following commands:

bash scripts/run_self_consistency.sh
bash scripts/run_adaptive_consistency.sh

By default, the beta stopping criterion is used. You can change it by passing the stopping criterion and the corresponding confidence threshold as arguments. For example, to use the entropy stopping criterion with a confidence threshold of 0.75, run:

bash scripts/run_adaptive_consistency.sh entropy 0.75

This step will print the final accuracy on the terminal.

4. Running Eval on Model Outputs

Alternatively, you can skip Step 3 and run the evaluation directly on the model outputs with the following command:

python eval_outputs.py --output_file <path_to_output_file> --stop_criteria <stop_criteria> --stop_criteria_thresh <stop_criteria_thresh>

This will print the average number of generations and the accuracy on the terminal.
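
The two reported numbers can be sketched as follows, assuming each question comes with a non-empty list of pre-generated samples and a gold answer. The record layout, `evaluate`, and `simple_stop` are hypothetical illustrations and are not taken from `eval_outputs.py`.

```python
from collections import Counter


def simple_stop(answers):
    """Hypothetical stopping rule: stop once one answer has been seen 3 times."""
    return Counter(answers).most_common(1)[0][1] >= 3


def evaluate(records, should_stop, max_gens=40):
    """Replay stored samples in order, stopping early when `should_stop` fires.

    Returns (average number of samples used, accuracy of the majority answer).
    """
    total_used = 0
    correct = 0
    for samples, gold in records:
        used = []
        for answer in samples[:max_gens]:
            used.append(answer)
            if should_stop(used):
                break
        total_used += len(used)
        prediction = Counter(used).most_common(1)[0][0]
        correct += prediction == gold
    return total_used / len(records), correct / len(records)


records = [
    (["4", "4", "4", "5", "4"], "4"),       # stops after 3 samples, correct
    (["7", "8", "7", "8", "8", "8"], "8"),  # stops after 5 samples, correct
]
avg_gens, accuracy = evaluate(records, simple_stop)
print(avg_gens, accuracy)  # 4.0 1.0
```

Replaying stored outputs this way lets different stopping criteria and thresholds be compared without generating new samples.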

Citation

@misc{aggarwal2023lets,
      title={Let's Sample Step by Step: Adaptive-Consistency for Efficient Reasoning with LLMs}, 
      author={Pranjal Aggarwal and Aman Madaan and Yiming Yang and Mausam},
      year={2023},
      eprint={2305.11860},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

LICENSE

Adaptive-Consistency is MIT licensed, as found in the LICENSE file.
