llm-messages-token-helper

A helper library for estimating the tokens used by messages and for building message lists that fit within the token limit of a model. Currently designed to work with the OpenAI GPT models (including GPT-4 Turbo with Vision). Uses the tiktoken library for tokenizing text and the Pillow library for image-related calculations.

Installation

Install the package:

python3 -m pip install llm-messages-token-helper

Usage

The library provides the following functions:

build_messages

Build a list of messages for a chat conversation, given the system prompt, new user message, and past messages. The function will truncate the history of past messages if necessary to stay within the token limit.

Arguments:

  • model (str): The model name to use for token calculation, like gpt-3.5-turbo.
  • system_prompt (str): The initial system prompt message.
  • new_user_message (str | List[openai.types.chat.ChatCompletionContentPartParam]): The new user message to append.
  • past_messages (list[dict]): The list of past messages in the conversation.
  • few_shots (list[dict]): A few-shot list of messages to insert after the system prompt.
  • max_tokens (int): The maximum number of tokens allowed for the conversation.

Returns:

  • list[openai.types.chat.ChatCompletionMessageParam]

Example:

from llm_messages_token_helper import build_messages

messages = build_messages(
    model="gpt-35-turbo",
    system_prompt="You are a bot.",
    new_user_message="That wasn't a good poem.",
    past_messages=[
        {
            "role": "user",
            "content": "Write me a poem",
        },
        {
            "role": "assistant",
            "content": "Tuna tuna I love tuna",
        },
    ],
    few_shots=[
        {
            "role": "user",
            "content": "Write me a poem",
        },
        {
            "role": "assistant",
            "content": "Tuna tuna is the best",
        },
    ]
)
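The truncation behavior described above can be illustrated with a minimal sketch. This is not the library's implementation: truncate_history and rough_token_count are hypothetical names, few-shot messages are left out, and the word-based counter is only a stand-in for a real tokenizer.

```python
from typing import Callable


def rough_token_count(message: dict) -> int:
    """Hypothetical stand-in counter: one token per whitespace-separated word."""
    return len(str(message["content"]).split())


def truncate_history(
    system_prompt: str,
    new_user_message: str,
    past_messages: list[dict],
    max_tokens: int,
    count_tokens: Callable[[dict], int] = rough_token_count,
) -> list[dict]:
    """Keep the system prompt and new message, then add as much recent history as fits."""
    system = {"role": "system", "content": system_prompt}
    newest = {"role": "user", "content": new_user_message}
    # The system prompt and the new user message are always kept.
    used = count_tokens(system) + count_tokens(newest)

    kept: list[dict] = []
    # Walk the history from newest to oldest and stop at the first message that does not fit.
    for message in reversed(past_messages):
        cost = count_tokens(message)
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost

    # Restore chronological order for the surviving history.
    return [system, *reversed(kept), newest]
```

With a generous limit the whole history survives; with a tight limit the oldest messages are dropped first, so the model still sees the most recent turns.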

count_tokens_for_message

Counts the number of tokens in a message.

Arguments:

  • model (str): The model name to use for token calculation, like gpt-3.5-turbo.
  • message (dict): The message to count tokens for, with role and content keys.

Returns:

  • int: The number of tokens in the message.

Example:

from llm_messages_token_helper import count_tokens_for_message

message = {
    "role": "user",
    "content": "Hello, how are you?",
}
model = "gpt-4"
num_tokens = count_tokens_for_message(model, message)
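For intuition about where such a count comes from, here is a rough sketch based on OpenAI's published counting recipe for chat models, in which every message carries a small fixed overhead on top of its encoded text. The function names are hypothetical, the overhead constants are assumptions taken from that recipe, and the whitespace encoder is a stand-in so the sketch runs without downloads. The library's own logic may differ.

```python
from typing import Callable

# Overheads from OpenAI's counting recipe for gpt-3.5-turbo and gpt-4 chat
# models. They are assumptions here, not values read from this library.
TOKENS_PER_MESSAGE = 3
TOKENS_PER_NAME = 1
REPLY_PRIMING_TOKENS = 3


def whitespace_encode(text: str) -> list:
    """Stand-in encoder: one token per whitespace-separated word."""
    return text.split()


def estimate_message_tokens(
    message: dict, encode: Callable[[str], list] = whitespace_encode
) -> int:
    """Estimate the tokens for one message: fixed overhead plus each encoded field."""
    total = TOKENS_PER_MESSAGE
    for key, value in message.items():
        total += len(encode(str(value)))
        if key == "name":
            # A name field costs one extra token in the recipe.
            total += TOKENS_PER_NAME
    return total


def estimate_conversation_tokens(
    messages: list, encode: Callable[[str], list] = whitespace_encode
) -> int:
    """Sum the per-message estimates and add the tokens that prime the reply."""
    return sum(estimate_message_tokens(m, encode) for m in messages) + REPLY_PRIMING_TOKENS
```

For realistic numbers, the encode argument could be replaced with tiktoken.encoding_for_model(model).encode, which returns a list of token ids.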

count_tokens_for_image

Count the number of tokens used by a base64-encoded image sent to GPT-4-vision.

Arguments:

  • image (str): The base64-encoded image.

Returns:

  • int: The number of tokens used up for the image.
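As a rough illustration of how an image token count can be derived, the sketch below follows OpenAI's published formula for GPT-4 with Vision: a low-detail image costs a flat 85 tokens, while a high-detail image is scaled to fit within 2048 x 2048, scaled so its shortest side is at most 768 pixels, and then charged 170 tokens per 512-pixel tile plus 85. The function name is hypothetical, and it takes pixel dimensions directly rather than a base64 string, so it is not the library's implementation.

```python
import math


def estimate_image_tokens(width: int, height: int, detail: str = "high") -> int:
    """Estimate vision tokens from pixel dimensions (hypothetical helper)."""
    if detail == "low":
        # Low-detail images have a flat cost regardless of size.
        return 85

    # Step 1: scale down to fit inside a 2048 x 2048 square.
    if max(width, height) > 2048:
        scale = 2048 / max(width, height)
        width, height = round(width * scale), round(height * scale)

    # Step 2: scale down so the shortest side is at most 768 pixels.
    if min(width, height) > 768:
        scale = 768 / min(width, height)
        width, height = round(width * scale), round(height * scale)

    # Step 3: count the 512-pixel tiles needed to cover the image.
    tiles = math.ceil(width / 512) * math.ceil(height / 512)
    return tiles * 170 + 85
```

For example, a 1024 x 1024 image is scaled to 768 x 768, which needs four tiles, giving 4 * 170 + 85 = 765 tokens.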

Example:

from llm_messages_token_helper import count_tokens_for_image

image = "..."
num_tokens = count_tokens_for_image(image)

get_token_limit

Get the token limit for a given GPT model name. Both OpenAI.com and Azure OpenAI model names are supported.

Arguments:

  • model (str): The model name to look up, like gpt-3.5-turbo (OpenAI.com) or gpt-35-turbo (Azure).

Returns:

  • int: The token limit for the model.

Example:

from llm_messages_token_helper import get_token_limit

model = "gpt-4"
max_tokens = get_token_limit(model)
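A lookup like this can be pictured as a small table plus an alias map for the Azure spellings. The sketch below is only an illustration: lookup_token_limit is a hypothetical name, and the limits shown are the context window sizes of the original model releases, not values read from this library, which may use different numbers.

```python
# Illustrative context window sizes for the original model releases.
# These are assumptions for the sketch, not the library's own table.
MODEL_TOKEN_LIMITS = {
    "gpt-3.5-turbo": 4096,
    "gpt-4": 8192,
    "gpt-4-32k": 32768,
}

# Azure OpenAI spells the model name without the dot.
AZURE_ALIASES = {
    "gpt-35-turbo": "gpt-3.5-turbo",
}


def lookup_token_limit(model: str) -> int:
    """Return the token limit for a model name, accepting Azure aliases."""
    canonical = AZURE_ALIASES.get(model, model)
    if canonical not in MODEL_TOKEN_LIMITS:
        raise ValueError(f"Unknown model name: {model}")
    return MODEL_TOKEN_LIMITS[canonical]
```

Raising an error for unknown names, rather than guessing a default, keeps a typo in the model name from silently producing an oversized prompt.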
