
LLaMPPL + HuggingFace


⚠️ DEPRECATION NOTICE ⚠️ This package has been renamed to llamppl. You're looking at the final release of hfppl. No further updates will be published here.

LLaMPPL is a research prototype for language model probabilistic programming: specifying language generation tasks by writing probabilistic programs that combine calls to LLMs, symbolic program logic, and probabilistic conditioning. To solve these tasks, LLaMPPL uses a specialized sequential Monte Carlo inference algorithm. This technique, SMC steering, is described in our recent workshop abstract.

This repository implements LLaMPPL for use with HuggingFace Transformers.

Installation

If you just want to try out LLaMPPL, check out our demo notebook on Colab, which performs a simple constrained generation task using GPT-2. (Larger models may require more RAM or GPU resources than Colab's free version provides.)

[!NOTE] We use poetry to manage dependencies. If you don't have poetry installed, you can install it with pip install poetry.

To get started on your own machine, clone this repository and run poetry install to install hfppl and its dependencies.

git clone https://github.com/probcomp/hfppl
cd hfppl
poetry install

Then, try running an example. Note that this will download the weights for Vicuna-7b-v1.5 the first time it runs.

poetry run python examples/hard_constraints.py

If everything is working, you should see the model generate political news using words that are at most five letters long (e.g., "Dr. Jill Biden may still be a year away from the White House but she is set to make her first trip to the U.N. today.").
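The constraint in this example is purely symbolic, so it is easy to check outside the model. A minimal sketch of a checker for the five-letter constraint (plain Python, independent of hfppl):

```python
import re

def satisfies_max_word_length(text, max_len=5):
    """Check that every alphabetic word in `text` has at most `max_len` letters."""
    # Extract runs of letters, ignoring punctuation, digits, and abbreviations' dots.
    words = re.findall(r"[A-Za-z]+", text)
    return all(len(w) <= max_len for w in words)

print(satisfies_max_word_length(
    "Dr. Jill Biden may still be a year away from the White House "
    "but she is set to make her first trip to the U.N. today."
))  # True
```

Note that "U.N." passes because the regex splits it into the single-letter words "U" and "N".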

vLLM backend

As of version 0.2.0, hfppl supports a vLLM backend, which can provide significant speedups over the HuggingFace backend. To install it, add the vllm dependency group:

poetry install --with vllm
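To use it, select the backend when loading the model. A sketch: the `backend='vllm'` value below mirrors the `backend='hf'` argument shown later in this README, and is an assumption about the loader's API rather than documented behavior.

```python
from hfppl import CachedCausalLM

# Assumed usage: select the vLLM backend instead of the default HuggingFace one.
lm = CachedCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf", backend='vllm')
```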

Modeling with LLaMPPL

A LLaMPPL program is a subclass of the hfppl.Model class.

from hfppl import Model, LMContext, CachedCausalLM

# A LLaMPPL model subclasses the Model class
class MyModel(Model):

    # The __init__ method is used to process arguments
    # and initialize instance variables.
    def __init__(self, lm, prompt, forbidden_letter):
        super().__init__()

        # A stateful context object for the LLM, initialized with the prompt
        self.context = LMContext(lm, prompt)
        self.eos_token = lm.tokenizer.eos_token_id

        # Precompute the set of tokens containing the forbidden letter
        self.forbidden_tokens = set(i for (i, v) in enumerate(lm.vocab)
                                      if forbidden_letter in v)

    # The step method is used to perform a single 'step' of generation.
    # This might be a single token, a single phrase, or any other division.
    # Here, we generate one token at a time.
    async def step(self):
        # Condition on the next token *not* being a forbidden token.
        await self.observe(self.context.mask_dist(self.forbidden_tokens), False)

        # Sample the next token from the LLM -- automatically extends `self.context`.
        token = await self.sample(self.context.next_token())

        # Check for EOS or end of sentence
        if token.token_id == self.eos_token or str(token) in ['.', '!', '?']:
            # Finish generation
            self.finish()

    # To improve performance, a hint that `self.forbidden_tokens` is immutable
    def immutable_properties(self):
        return set(['forbidden_tokens'])

The Model class provides a number of useful methods for specifying a LLaMPPL program:

  • self.sample(dist[, proposal]) samples from the given distribution. Providing a proposal does not modify the task description, but can improve inference; for example, a proposal could pre-emptively avoid the forbidden letter rather than relying on conditioning alone.
  • self.condition(cond) conditions on the given Boolean expression.
  • self.finish() indicates that generation is complete.
  • self.observe(dist, obs) performs a form of 'soft conditioning' on the given distribution. It is equivalent to (but more efficient than) sampling a value v from dist and then immediately running condition(v == obs).
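The equivalence in the last bullet can be seen in a toy importance-weighting setting. A sketch in plain Python (this illustrates the generic particle-weighting idea, not hfppl's internals, which we assume weight particles multiplicatively): observing `obs` under a Bernoulli(p) multiplies the particle's weight by P(obs), while sampling and then conditioning zeroes out every particle whose sample disagrees, wasting work on the same expected weight.

```python
import random

def observe_bernoulli(weight, p, obs):
    """Soft conditioning: scale the particle's weight by P(obs)."""
    return weight * (p if obs else (1.0 - p))

def sample_then_condition_bernoulli(weight, p, obs, rng):
    """Equivalent but wasteful: sample a value, then zero out the particle
    if it disagrees with the observation."""
    value = rng.random() < p
    return weight if value == obs else 0.0

rng = random.Random(0)
p, obs, n = 0.3, True, 100_000

soft = observe_bernoulli(1.0, p, obs)
hard = sum(sample_then_condition_bernoulli(1.0, p, obs, rng) for _ in range(n)) / n

print(soft)   # 0.3
print(hard)   # close to 0.3, but most particles were zeroed out along the way
```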

To run inference, we use the smc_steer or smc_standard methods:

import asyncio
from hfppl import smc_steer

# Initialize the HuggingFace model
lm = CachedCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf", backend='hf', auth_token=<YOUR_HUGGINGFACE_API_TOKEN_HERE>)

# Create a model instance
model = MyModel(lm, "The weather today is expected to be", "e")

# Run inference
particles = asyncio.run(smc_steer(model, 5, 3)) # number of particles N, and beam factor K

Sample output:

sunny.
sunny and cool.
34° (81°F) in Chicago with winds at 5mph.
34° (81°F) in Chicago with winds at 2-9 mph.
hot and humid with a possibility of rain, which is not uncommon for this part of Mississippi.
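Here smc_steer(model, 5, 3) runs 5 particles with beam factor 3: at each step every particle proposes several extensions, and a fixed-size population survives. The following is a schematic of that N-particle, K-expansion pattern in plain Python; it is an illustration only, not the library's algorithm, which resamples in proportion to weight rather than taking a deterministic top-N.

```python
import heapq
import itertools
import math

def toy_smc_steer(step_fn, init, n_particles, beam_factor, n_steps):
    """Schematic N-particle / K-expansion loop: at each step every particle
    proposes `beam_factor` extensions, and the `n_particles` highest-weight
    extensions survive."""
    particles = [(0.0, init)]        # (log_weight, state)
    tiebreak = itertools.count()     # so heapq never compares states directly
    for _ in range(n_steps):
        expanded = []
        for log_w, state in particles:
            for new_log_w, new_state in step_fn(log_w, state, beam_factor):
                expanded.append((new_log_w, next(tiebreak), new_state))
        particles = [(lw, s) for lw, _, s in heapq.nlargest(n_particles, expanded)]
    return particles

# Toy task: grow strings one letter at a time, favoring 'a' over 'b'.
def step(log_w, state, k):
    choices = [("a", math.log(0.9)), ("b", math.log(0.1))][:k]
    return [(log_w + lp, state + c) for c, lp in choices]

final = toy_smc_steer(step, "", n_particles=2, beam_factor=2, n_steps=3)
print(final[0][1])  # 'aaa': the highest-weight surviving particle
```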

Further documentation can be found at https://genlm.github.io/hfppl.

