
A simple library for working with Hugging Face models.


HFLM

Generate text and compute log probabilities for any 🤗 Transformers model using a simple interface.

Overview

The 🤗 Transformers library and model hub provide access to an encyclopedic collection of language models, and it can be hard to tell at a glance how to use any particular model to, for example, compute the probability of an arbitrary string. In the process of building out the Language Model Evaluation Harness, EleutherAI designed an interface that captures the main uses of the current generation of language models, especially in the context of evaluation. EleutherAI also implemented a class that provides this interface for Hugging Face language models. This class handles a large number of edge cases that need to be addressed to correctly compute log probabilities and generate text, but the code is closely integrated with the Language Model Evaluation Harness, making it challenging to use outside of that system.

This library provides a simple, minimal-dependency interface for working with 🤗 Transformers models, based on EleutherAI's Language Model Evaluation Harness. This interface is encapsulated in a single class, HFLM, whose methods generate text from a starting prefix and compute the probability of a string given a model. It can be used as a Python library or via standalone scripts on the command line.

Installation

You can install HFLM from PyPI:

pip install hflm

Or, for a local installation from source, clone this repository and run:

pip install -e .

Command Line Usage

In addition to the Python library interface described below, we provide two scripts that can be used from the command line to compute text probabilities and generate text. These commands will also handle automatically downloading the model if necessary. For both of the scripts below, you can pass -h or --help to get detailed usage.

Note that these scripts do not, by themselves, handle any special instruction or chat templates that you may need to get the best possible results.

lmprob

The lmprob script takes a model and a string and returns the log probability of the string given the model. Here's an example using a small but capable model:

$ lmprob -m microsoft/Phi-3-mini-4k-instruct -s "This is a test." 
-16.835479736328125

This will print the log probability of the given string to the standard output and exit. Depending on the model, additional text may be logged to the terminal, but this will not be included in any output redirection or piping.

By default, the script will attempt to use a CUDA device. You can specify a different device for the calculation by passing the --device option. Alternatives include cpu if you don't have a CUDA-capable graphics card, or mps if you are running the script on an Apple silicon device. Any device recognized by PyTorch should work.
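Because lmprob prints only the log probability to standard output, it can be called from another Python program and its result parsed as a float. The following is a minimal sketch, not part of HFLM itself: the helper names build_lmprob_command, parse_lmprob_output, and lmprob are hypothetical, and the sketch assumes that --device takes the device name as its next argument (for example, --device cpu).

```python
import subprocess
from typing import List, Optional


def build_lmprob_command(model: str, text: str, device: Optional[str] = None) -> List[str]:
    """Assemble the argument list for the lmprob script (hypothetical helper)."""
    command = ["lmprob", "-m", model, "-s", text]
    if device is not None:
        # Assumption: the device name follows --device as a separate argument.
        command += ["--device", device]
    return command


def parse_lmprob_output(stdout: str) -> float:
    """Read the log probability from the last non-empty line of stdout."""
    lines = [line.strip() for line in stdout.splitlines() if line.strip()]
    return float(lines[-1])


def lmprob(model: str, text: str, device: Optional[str] = None) -> float:
    """Run lmprob as a subprocess and return the log probability it prints."""
    result = subprocess.run(
        build_lmprob_command(model, text, device),
        capture_output=True,  # keep stdout (the number) separate from stderr (logs)
        text=True,
        check=True,  # raise CalledProcessError if the script fails
    )
    return parse_lmprob_output(result.stdout)
```

Calling lmprob("microsoft/Phi-3-mini-4k-instruct", "This is a test.", device="cpu") would then run the same command as the shell example above, on the CPU.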

lmgen

The lmgen script takes a model and initial string, and generates a continuation of that string:

$ lmgen -m microsoft/Phi-3-mini-4k-instruct -s "What is 1+1 equal to?" 


# Answer
1+1 equals 2.

As with lmprob, you can optionally pass a --device argument to match your machine's capabilities. To specify the number of new tokens to generate, use the option --max_new_tokens with an integer argument. You can also use the --temperature option with a floating point argument to increase the variability of the generated text under repeated runs.
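To see why a higher temperature increases variability, it can help to look at temperature scaling in isolation. The sketch below illustrates the standard technique of dividing scores by the temperature before applying a softmax. It is not taken from the lmgen source, and the function name and the example scores are made up for illustration.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Turn raw scores into probabilities after dividing them by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be greater than 0")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up scores for three candidate next tokens.
logits = [2.0, 1.0, 0.0]

# A low temperature concentrates probability on the top-scoring token.
sharp = softmax_with_temperature(logits, 0.5)

# A high temperature spreads probability more evenly across the candidates.
flat = softmax_with_temperature(logits, 2.0)
```

With the flatter distribution, repeated sampling picks the lower-scoring tokens more often, which is what produces more varied text across runs.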

Library Usage

The Interface

The HFLM class supports the following interface for language models:

import abc
from typing import List, Tuple


class LM(abc.ABC):

    @abc.abstractmethod
    def loglikelihood(self, requests) -> List[Tuple[float, bool]]:
        pass

    @abc.abstractmethod
    def loglikelihood_rolling(self, requests) -> List[Tuple[float]]:
        pass

    @abc.abstractmethod
    def generate_until(self, requests) -> List[str]:
        pass

The type of requests depends on which of the functions you call; see the example usage below.

Examples

Importing HFLM and creating a model

There is only one public import:

In [1]: from hflm import HFLM

You can create a model by passing a (required) model name. Several other optional parameters are supported, including a device and a batch size (which can be given as "auto" to enable automatic detection of the largest batch size that will work on your machine).

In [2]: m = HFLM(model="openai-community/gpt2", device="mps", batch_size=10)
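If the same code needs to run on machines with different hardware, the device name can be chosen at run time rather than hard-coded. The following sketch uses a hypothetical helper, pick_device, which is not part of HFLM. It relies on PyTorch's torch.cuda.is_available() and torch.backends.mps.is_available() checks and falls back to "cpu" when PyTorch is missing or no accelerator is found.

```python
def pick_device() -> str:
    """Return "cuda", "mps", or "cpu" depending on what PyTorch reports."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is nothing to probe, so fall back to the CPU.
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # Older PyTorch builds may not have the mps backend module at all.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


# Usage with the constructor arguments shown above:
# from hflm import HFLM
# m = HFLM(model="openai-community/gpt2", device=pick_device(), batch_size="auto")
```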

Computing Log Probabilities

Once you instantiate a model, you can use it to compute log probabilities. The loglikelihood_rolling method takes a list of strings and returns a list of log probabilities, one per string. For example:

In [3]: m.loglikelihood_rolling(["This is a test.", "Colorless green ideas sleep furiously."])
100%|██████████| 2/2 [00:01<00:00,  1.59it/s]
Out[3]: [-18.791278839111328, -55.098876953125]

You can suppress the progress bar by passing disable_tqdm=True to loglikelihood_rolling.
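Raw log probabilities such as those in Out[3] can be hard to interpret on their own. The sketch below shows a few conversions that are often useful. It assumes the values are natural logarithms, which is the usual convention for PyTorch log-softmax outputs but is not stated in this document. The function names are illustrative only.

```python
import math


def to_probability(log_prob: float) -> float:
    """Convert a natural-log probability into a plain probability."""
    return math.exp(log_prob)


def to_bits(log_prob: float) -> float:
    """Express a natural-log probability as a cost in bits (a positive number)."""
    return -log_prob / math.log(2)


def likelihood_ratio(log_prob_a: float, log_prob_b: float) -> float:
    """Return how many times more probable string A is than string B."""
    # Subtracting in log space avoids underflow from exponentiating each value.
    return math.exp(log_prob_a - log_prob_b)


# Values copied from Out[3] above.
test_sentence = -18.791278839111328
colorless_green = -55.098876953125

ratio = likelihood_ratio(test_sentence, colorless_green)
```

Comparing in log space matters because the probabilities themselves are tiny; for long strings, math.exp of a single log probability can underflow to 0.0.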

Similarly, you can use the loglikelihood method to compute probabilities of pairs of strings. Each pair will be concatenated into a single string whose likelihood will be returned. This is especially useful for computing the probabilities of several completions of a given prefix string (as is often done when using multiple choice questions for language model evaluations):

In [4]: m.loglikelihood(("The most beautiful phrase in the English language is: ", x) for x in ["cellar door", "hobbit hole"])
Running loglikelihood requests: 100%|██████████| 2/2 [00:00<00:00, 31.21it/s]
Out[4]: [(-18.94957733154297, False), (-24.65488052368164, False)]
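For a multiple choice evaluation, the usual next step is to pick the completion with the highest log probability. The sketch below ranks the choices and normalizes their scores so they sum to 1 across the candidates. The helper rank_choices is hypothetical and not part of HFLM. The second element of each result tuple is ignored here; in the Evaluation Harness convention it indicates whether the continuation is what greedy decoding would produce, which is treated as an assumption in this sketch.

```python
import math
from typing import List, Sequence, Tuple


def rank_choices(
    choices: Sequence[str],
    results: Sequence[Tuple[float, bool]],
) -> List[Tuple[str, float]]:
    """Pair each choice with its share of probability and sort best-first."""
    log_probs = [log_prob for log_prob, _ in results]
    peak = max(log_probs)  # subtract the maximum for numerical stability
    weights = [math.exp(lp - peak) for lp in log_probs]
    total = sum(weights)
    shares = [w / total for w in weights]
    return sorted(zip(choices, shares), key=lambda pair: pair[1], reverse=True)


# Choices and results copied from In [4] and Out[4] above.
choices = ["cellar door", "hobbit hole"]
results = [(-18.94957733154297, False), (-24.65488052368164, False)]

ranked = rank_choices(choices, results)
best_choice = ranked[0][0]
```

Note that this compares total log probability, which tends to favor shorter completions; evaluations often also report a length-normalized score.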

Generating Text

You can also use the model to generate new text, using the generate_until method. This takes a list of pairs, consisting of a prefix string and a dictionary of text generation options. By default, the method will generate just 16 new tokens, and will do so deterministically. To change the number of tokens generated, pass "max_new_tokens" as a generation option. To inject randomness into the generation process, set the parameter "do_sample" to True and specify a floating-point "temperature" value greater than 0.0:

In [5]: m.generate_until([("The answer to life, the universe, and everything is ", {"max_new_tokens": 16, "temperature": 0.42, "do_sample":True})])
Out[5]: ["\xa0a question of time and space.\nIt's not as if we're"]
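When generating from several prefixes with the same settings, it can be convenient to build the request list programmatically. The following sketch uses a hypothetical helper, build_generation_requests, which is not part of HFLM. It uses only the option names shown above ("max_new_tokens", "do_sample", "temperature") and turns sampling on only when a temperature is supplied.

```python
from typing import Any, Dict, Iterable, List, Optional, Tuple


def build_generation_requests(
    prefixes: Iterable[str],
    max_new_tokens: int = 16,
    temperature: Optional[float] = None,
) -> List[Tuple[str, Dict[str, Any]]]:
    """Pair each prefix with its own copy of the generation options."""
    options: Dict[str, Any] = {"max_new_tokens": max_new_tokens}
    if temperature is not None:
        if temperature <= 0.0:
            raise ValueError("temperature must be greater than 0.0")
        # Sampling is only enabled when a temperature is requested.
        options["do_sample"] = True
        options["temperature"] = temperature
    # Copy the dictionary so that later edits to one request do not leak into others.
    return [(prefix, dict(options)) for prefix in prefixes]


requests = build_generation_requests(
    ["The answer to life, the universe, and everything is ", "Once upon a time, "],
    max_new_tokens=16,
    temperature=0.42,
)
# The list can then be passed to the model: m.generate_until(requests)
```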

Acknowledgment

We owe an enormous debt of gratitude to the team at EleutherAI, whose Evaluation Harness forms the basis of the HFLM class. Their 2024 paper Lessons from the Trenches on Reproducible Evaluation of Language Models is worth a close read.
