llm-messages-token-helper
A helper library for estimating the tokens used by messages and for building message lists that fit within a model's token limit. It is currently designed to work with the OpenAI GPT models (including GPT-4 Turbo with Vision). It uses the tiktoken library for tokenizing text and the Pillow library for image-related calculations.
Installation
Install the package:
python3 -m pip install llm-messages-token-helper
Usage
The library provides the following functions:
build_messages
Build a list of messages for a chat conversation, given the system prompt, new user message, and past messages. The function will truncate the history of past messages if necessary to stay within the token limit.
Arguments:
model (str): The model name to use for token calculation, like gpt-3.5-turbo.
system_prompt (str): The initial system prompt message.
new_user_message (str | List[openai.types.chat.ChatCompletionContentPartParam]): The new user message to append.
past_messages (list[dict]): The list of past messages in the conversation.
few_shots (list[dict]): A few-shot list of messages to insert after the system prompt.
max_tokens (int): The maximum number of tokens allowed for the conversation.
Returns:
list[openai.types.chat.ChatCompletionMessageParam]
Example:
from llm_messages_token_helper import build_messages

messages = build_messages(
    model="gpt-35-turbo",
    system_prompt="You are a bot.",
    new_user_message="That wasn't a good poem.",
    past_messages=[
        {
            "role": "user",
            "content": "Write me a poem",
        },
        {
            "role": "assistant",
            "content": "Tuna tuna I love tuna",
        },
    ],
    few_shots=[
        {
            "role": "user",
            "content": "Write me a poem",
        },
        {
            "role": "assistant",
            "content": "Tuna tuna is the best",
        },
    ],
)
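The history truncation that build_messages performs can be pictured as a newest-first walk over the past messages. The following is a minimal sketch under stated assumptions, not the library's implementation: `truncate_history` and `rough_token_count` are hypothetical names, the word-based counter is a stand-in for a real tokenizer such as tiktoken, and few-shot messages are left out for brevity.

```python
from typing import Callable


def rough_token_count(message: dict) -> int:
    """Stand-in counter (assumption): one token per whitespace-separated word,
    plus a fixed overhead of 3 per message. A real counter would use tiktoken."""
    return 3 + len(str(message["content"]).split())


def truncate_history(
    system_prompt: str,
    new_user_message: str,
    past_messages: list[dict],
    max_tokens: int,
    count: Callable[[dict], int] = rough_token_count,
) -> list[dict]:
    """Always keep the system prompt and the new user message, then add past
    messages from newest to oldest until the next one would exceed max_tokens."""
    system = {"role": "system", "content": system_prompt}
    user = {"role": "user", "content": new_user_message}
    used = count(system) + count(user)

    kept: list[dict] = []
    for message in reversed(past_messages):
        cost = count(message)
        if used + cost > max_tokens:
            break  # this message and everything older is dropped
        kept.append(message)
        used += cost

    kept.reverse()  # restore chronological order
    return [system, *kept, user]


history = [
    {"role": "user", "content": "Write me a poem"},
    {"role": "assistant", "content": "Tuna tuna I love tuna"},
]
messages = truncate_history(
    "You are a bot.", "That wasn't a good poem.", history, max_tokens=25
)
print([m["role"] for m in messages])
```

With this stand-in counter and a budget of 25, only the most recent past message fits, so the oldest user turn is dropped while the system prompt and the new user message are always retained.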
count_tokens_for_message
Counts the number of tokens in a message.
Arguments:
model (str): The model name to use for token calculation, like gpt-3.5-turbo.
message (dict): The message to count tokens for, with role and content keys.
Returns:
int: The number of tokens in the message.
Example:
from llm_messages_token_helper import count_tokens_for_message

message = {
    "role": "user",
    "content": "Hello, how are you?",
}
model = "gpt-4"
num_tokens = count_tokens_for_message(model, message)
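To give a sense of where a per-message count comes from, here is a rough sketch that follows the accounting published in the OpenAI cookbook for gpt-3.5-turbo and gpt-4 chat models: a fixed overhead of 3 tokens per message, plus the tokens of each field value, plus 1 if a name field is present. Whether this library uses exactly these constants is an assumption. `sketch_count_tokens` and `fake_encode` are hypothetical names, and the tokenizer is passed in as a callable so the sketch runs without extra dependencies.

```python
from typing import Callable, Sequence

TOKENS_PER_MESSAGE = 3  # per-message framing overhead (OpenAI cookbook value)
TOKENS_PER_NAME = 1     # extra token when a "name" field is present


def sketch_count_tokens(message: dict, encode: Callable[[str], Sequence]) -> int:
    """Estimate the tokens for one chat message whose fields are plain strings."""
    total = TOKENS_PER_MESSAGE
    for key, value in message.items():
        total += len(encode(value))
        if key == "name":
            total += TOKENS_PER_NAME
    return total


def fake_encode(text: str) -> list[str]:
    """Hypothetical tokenizer for demonstration only: splits on whitespace."""
    return text.split()


message = {"role": "user", "content": "Hello, how are you?"}
num_tokens = sketch_count_tokens(message, fake_encode)
print(num_tokens)
```

With tiktoken installed, a real encoder can be supplied instead of the stand-in, for example `tiktoken.encoding_for_model("gpt-4").encode`. The resulting counts will then differ from the whitespace-based ones used here.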
count_tokens_for_image
Counts the number of tokens used by a base64-encoded image sent to GPT-4-vision.
Arguments:
image (str): The base64-encoded image.
Returns:
int: The number of tokens used up for the image.
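An image's token cost depends on its pixel dimensions, not on the size of the base64 string. The sketch below follows the tile-based rules OpenAI documents for GPT-4 with vision: low detail costs a flat 85 tokens, while high detail scales the image down and charges 170 tokens per 512 px tile plus a base of 85. Whether the library follows exactly these steps is an assumption, and `sketch_image_tokens` is a hypothetical name. It takes the width and height directly. In practice they would be read from the decoded image, for example with Pillow.

```python
import math

LOW_DETAIL_COST = 85   # flat cost for detail="low"
BASE_COST = 85         # added once for detail="high"
COST_PER_TILE = 170    # per 512 x 512 tile for detail="high"


def sketch_image_tokens(width: int, height: int, detail: str = "high") -> int:
    """Estimate vision tokens from pixel dimensions (sketch, not the library's code)."""
    if detail == "low":
        return LOW_DETAIL_COST

    # Step 1: shrink to fit inside a 2048 x 2048 square, keeping the aspect ratio.
    longest = max(width, height)
    if longest > 2048:
        scale = 2048 / longest
        width, height = int(width * scale), int(height * scale)

    # Step 2: shrink so that the shortest side is at most 768 px.
    shortest = min(width, height)
    if shortest > 768:
        scale = 768 / shortest
        width, height = int(width * scale), int(height * scale)

    # Step 3: count the 512 px tiles needed to cover the scaled image.
    tiles = math.ceil(width / 512) * math.ceil(height / 512)
    return BASE_COST + COST_PER_TILE * tiles


print(sketch_image_tokens(1024, 1024))   # 768 x 768 after scaling, 4 tiles
print(sketch_image_tokens(2048, 4096))   # 768 x 1536 after scaling, 6 tiles
```

The two printed cases match the worked examples in OpenAI's vision documentation (765 and 1105 tokens respectively).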
Example:
```python
from llm_messages_token_helper import count_tokens_for_image

image = "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEA..."
num_tokens = count_tokens_for_image(image)
```
get_token_limit
Get the token limit for a given GPT model name (OpenAI.com or Azure OpenAI supported).
Arguments:
model (str): The model name to use for token calculation, like gpt-3.5-turbo (OpenAI.com) or gpt-35-turbo (Azure).
Returns:
int: The token limit for the model.
Example:
from llm_messages_token_helper import get_token_limit
model = "gpt-4"
max_tokens = get_token_limit(model)
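A lookup of this kind can be sketched as a table keyed by model name, with Azure-style names normalized to their OpenAI.com equivalents. This is an illustration only: `sketch_get_token_limit` and `SKETCH_TOKEN_LIMITS` are hypothetical names, and the values are the published context-window sizes of the original models, which may differ from the limits the library actually returns.

```python
# Illustrative context-window sizes (assumption: the library's table may differ).
SKETCH_TOKEN_LIMITS = {
    "gpt-3.5-turbo": 4096,
    "gpt-3.5-turbo-16k": 16384,
    "gpt-4": 8192,
    "gpt-4-32k": 32768,
}


def sketch_get_token_limit(model: str) -> int:
    """Return the token limit for an OpenAI.com or Azure-style model name."""
    # Azure model names drop the dot: "gpt-35-turbo" -> "gpt-3.5-turbo".
    normalized = model.replace("gpt-35", "gpt-3.5")
    if normalized not in SKETCH_TOKEN_LIMITS:
        raise ValueError(f"Unknown model name: {model}")
    return SKETCH_TOKEN_LIMITS[normalized]


print(sketch_get_token_limit("gpt-35-turbo"))
print(sketch_get_token_limit("gpt-4"))
```

Raising an error for unknown names, rather than returning a default, keeps a misspelled model name from silently producing an oversized conversation.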