
Library for running Adaptive-Consistency-based inference on large language models.

Project description

Let's Sample Step by Step: Adaptive-Consistency for Efficient Reasoning with LLMs

Website | Paper


Pranjal Aggarwal, Aman Madaan, Yiming Yang, Mausam

Abstract

A popular approach for improving the correctness of output from large language models (LLMs) is Self-Consistency: poll the LLM multiple times and output the most frequent solution. Existing Self-Consistency techniques always draw a constant number of samples per question, whereas a better approach would be to non-uniformly distribute the available budget based on the amount of agreement in the samples drawn so far. In response, we introduce Adaptive-Consistency, a cost-efficient, model-agnostic technique that dynamically adjusts the number of samples per question using a lightweight stopping criterion. Our experiments over 13 datasets and two LLMs demonstrate that Adaptive-Consistency reduces the sample budget by up to 6.0 times with an average accuracy drop of less than 0.1%.

Adaptive-Consistency

This repository contains code for:

  1. The Adaptive-Consistency library, for running efficient LLM generation with Adaptive-Consistency in your own code.
  2. Code to reproduce the results of Adaptive-Consistency.

Installation

From PyPI

pip install AdaptiveConsistency

From Source

First, clone the repo:

git clone https://github.com/Pranjal2041/AdaptiveConsistency.git

Next, install the package using:

python setup.py install
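
Alternatively, since python setup.py install is deprecated in recent setuptools releases, you can install the cloned repository with pip:

pip install .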

Usage

Using Adaptive-Consistency in your code requires changing only 2-3 lines of your existing framework.

1. Importing the library

from adaptive_consistency import AC, BetaStoppingCriteria

2. Initializing the library

# 0.95 is the stopping criterion's confidence threshold; max_gens caps sampling at 40 generations
ac = AC(model, stopping_criteria=BetaStoppingCriteria(0.95), max_gens=40)

3. Using the library

You can directly run a whole loop of evaluation using:

ac.eval_loop(sampling_function, *args, **kwargs)

For example, if using the OpenAI API for sampling, you can use:

import openai

ac.eval_loop(openai.Completion.create, engine="text-davinci-003", prompt="Solve the questions ahead", max_tokens=5)
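
Note that openai.Completion.create with an engine argument targets the legacy (pre-1.0) OpenAI Python SDK; with a newer SDK you would pass your client's completion function to eval_loop in the same way.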

Or you can check for consistency of answers at each step:

answers = []
for i in range(40):  # draw up to max_gens samples
    answers.append(generate_answer_from_model())  # e.g. openai.Completion.create
    if ac.should_stop(answers):  # stop once the sampled answers are consistent enough
        break
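
After the loop exits, the final prediction is simply the most frequent answer among the samples drawn so far, exactly as in Self-Consistency. A minimal sketch using Python's collections.Counter:

from collections import Counter

# majority vote over the sampled answers
final_answer, num_votes = Counter(answers).most_common(1)[0]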

4. Stopping Criteria

You can use one of the following stopping criteria (a sketch of the default Beta criterion follows this list):

  1. BetaStoppingCriteria (beta): Uses the Beta distribution to decide when to stop. This is the default criterion.
  2. DirichletStoppingCriteria (dirichlet): Uses the Dirichlet distribution to decide when to stop.
  3. EntropyStoppingCriteria (entropy): Uses the entropy of the answer distribution to decide when to stop.
  4. MajorityStoppingCriteria (majority): Uses the majority ratio of the top answer in the distribution to decide when to stop.
  5. RandomStoppingCriteria (random): Stops the sampling process at each step with a pre-defined probability.
  6. CRPStoppingCriteria (crp): Uses the Chinese Restaurant Process to decide when to stop.
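
For intuition, here is a minimal sketch of the Beta stopping test, assuming the two-way formulation described in the paper: with v1 votes for the current top answer and v2 votes for everything else, sampling stops once the posterior probability (under a uniform prior) that the top answer is the true majority exceeds the confidence threshold. The function and its arguments are hypothetical stand-ins for what BetaStoppingCriteria does internally:

from collections import Counter
from scipy.stats import beta

def beta_should_stop(answers, conf_thresh=0.95):
    # Hypothetical sketch: answers is the list of samples drawn so far.
    votes = sorted(Counter(answers).values(), reverse=True)
    v1 = votes[0]            # votes for the current majority answer
    v2 = sum(votes[1:])      # votes for all remaining answers
    # P(p > 0.5) under a Beta(v1 + 1, v2 + 1) posterior with a uniform prior
    return beta.sf(0.5, v1 + 1, v2 + 1) >= conf_thresh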

Check out the paper for more details.

Reproducing Numbers

1. Downloading the data

Run:

bash download_data.sh

2. Downloading Model Outputs

We provide the model outputs for all the models used in the paper. You can download them using:

bash download_outputs.sh

These model outputs will work for all experiments in the paper.

3. Running Generations

If you skipped the previous step, you can run the generations yourself with the following commands:

bash scripts/run_self_consistency.sh
bash scripts/run_adaptive_consistency.sh

By default, the beta stopping criterion is used. You can change it by passing the stopping criterion and a corresponding confidence threshold as arguments. For example, to use the entropy stopping criterion with a confidence threshold of 0.75:

bash scripts/run_adaptive_consistency.sh entropy 0.75

This step will print the final accuracy on the terminal.

4. Running Eval on Model Outputs

You can skip Step 3 and run evaluation directly on the downloaded model outputs with the following command:

python eval_outputs.py --output_file <path_to_output_file> --stop_criteria <stop_criteria> --stop_criteria_thresh <stop_criteria_thresh>
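
For example, with a hypothetical output file, the default beta criterion, and a confidence threshold of 0.95:

python eval_outputs.py --output_file outputs/gsm8k_outputs.jsonl --stop_criteria beta --stop_criteria_thresh 0.95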

This will print the average generations and accuracy on the terminal.

Citation

@misc{aggarwal2023lets,
      title={Let's Sample Step by Step: Adaptive-Consistency for Efficient Reasoning with LLMs}, 
      author={Pranjal Aggarwal and Aman Madaan and Yiming Yang and Mausam},
      year={2023},
      eprint={2305.11860},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

LICENSE

Adaptive-Consistency is MIT licensed, as found in the LICENSE file.

Download files

Source distribution: AdaptiveConsistency-1.0.0.tar.gz (7.2 kB), uploaded via twine/4.0.2 on CPython/3.9.13.

Hashes for AdaptiveConsistency-1.0.0.tar.gz:

  • SHA256: 044e1af9218742beba82e61116287dd51d9f8cccfc187950b6543b64fcba3a97
  • MD5: 6cbc0ecfe5a9d146c25acd0f9504b748
  • BLAKE2b-256: ad496132bb951dd3b670ab4bc41b70500dd2664b4a2af986c8bfb360f4c12715
