A helper library for estimating tokens used by messages.
llm-messages-token-helper
A helper library for estimating the tokens used by messages and for building message lists that fit within a model's token limit. Currently designed to work with the OpenAI GPT models (including GPT-4 Turbo with Vision). Uses the tiktoken library for tokenizing text and the Pillow library for image-related calculations.
Installation
Install the package:
python3 -m pip install llm-messages-token-helper
Usage
The library provides the following functions:
build_messages
Build a list of messages for a chat conversation, given the system prompt, new user message, and past messages. The function will truncate the history of past messages if necessary to stay within the token limit.
Arguments:
- model (str): The model name to use for token calculation, like gpt-3.5-turbo.
- system_prompt (str): The initial system prompt message.
- new_user_message (str | List[openai.types.chat.ChatCompletionContentPartParam]): The new user message to append.
- past_messages (list[dict]): The list of past messages in the conversation.
- few_shots (list[dict]): A few-shot list of messages to insert after the system prompt.
- max_tokens (int): The maximum number of tokens allowed for the conversation.
Returns:
list[openai.types.chat.ChatCompletionMessageParam]
Example:
```python
from llm_messages_token_helper import build_messages

messages = build_messages(
    model="gpt-35-turbo",
    system_prompt="You are a bot.",
    new_user_message="That wasn't a good poem.",
    past_messages=[
        {
            "role": "user",
            "content": "Write me a poem",
        },
        {
            "role": "assistant",
            "content": "Tuna tuna I love tuna",
        },
    ],
    few_shots=[
        {
            "role": "user",
            "content": "Write me a poem",
        },
        {
            "role": "assistant",
            "content": "Tuna tuna is the best",
        },
    ],
)
```
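To make the truncation behavior described above concrete, here is a minimal sketch of one way a history could be trimmed to a token budget. It is not the library's implementation: `fit_history` and `rough_token_count` are hypothetical names, the word-based counter is a stand-in for a real tokenizer, and few-shot messages are left out for brevity.

```python
from typing import Callable


def rough_token_count(message: dict) -> int:
    """Stand-in counter (assumption): one token per word, plus 3 per message."""
    return len(str(message["content"]).split()) + 3


def fit_history(
    system_prompt: str,
    new_user_message: str,
    past_messages: list[dict],
    max_tokens: int,
    count: Callable[[dict], int] = rough_token_count,
) -> list[dict]:
    """Always keep the system prompt and new user message, then add past
    messages from newest to oldest for as long as they fit in the budget."""
    system = {"role": "system", "content": system_prompt}
    user = {"role": "user", "content": new_user_message}
    used = count(system) + count(user)

    kept: list[dict] = []
    for message in reversed(past_messages):
        cost = count(message)
        if used + cost > max_tokens:
            break  # everything older than this message is dropped
        kept.append(message)
        used += cost

    kept.reverse()  # restore chronological order
    return [system, *kept, user]


history = [
    {"role": "user", "content": "Write me a poem"},
    {"role": "assistant", "content": "Tuna tuna I love tuna"},
]
# With a budget of 25, only the most recent past message fits.
trimmed = fit_history("You are a bot.", "That wasn't a good poem.", history, max_tokens=25)
```

Walking the history newest-first means the most recent context survives when the budget is tight, which matches the "truncate the history of past messages" behavior described for build_messages.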
count_tokens_for_message
Counts the number of tokens in a message.
Arguments:
- model (str): The model name to use for token calculation, like gpt-3.5-turbo.
- message (dict): The message to count tokens for.
Returns:
int: The number of tokens in the message.
Example:
```python
from llm_messages_token_helper import count_tokens_for_message

message = {
    "role": "user",
    "content": "Hello, how are you?",
}
model = "gpt-4"
num_tokens = count_tokens_for_message(model, message)
```
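For intuition about what a per-message count involves, the sketch below follows the accounting scheme from OpenAI's published cookbook example for chat models: a fixed overhead per message, the encoded length of each field, and one extra token when a `name` field is present. Whether this library uses exactly these constants is an assumption, `estimate_message_tokens` is a hypothetical name, and the whitespace tokenizer is only a placeholder so the example runs without extra dependencies.

```python
from typing import Callable

TOKENS_PER_MESSAGE = 3  # fixed overhead per message in the cookbook's scheme
TOKENS_PER_NAME = 1     # extra token when a message carries a "name" field


def naive_encode(text: str) -> list[str]:
    """Placeholder tokenizer (assumption): splits on whitespace."""
    return text.split()


def estimate_message_tokens(
    message: dict,
    encode: Callable[[str], list] = naive_encode,
) -> int:
    """Sum the per-message overhead and the encoded length of every field."""
    total = TOKENS_PER_MESSAGE
    for key, value in message.items():
        total += len(encode(str(value)))
        if key == "name":
            total += TOKENS_PER_NAME
    return total


message = {"role": "user", "content": "Hello, how are you?"}
estimate = estimate_message_tokens(message)
```

For realistic numbers, pass a real encoder instead of the placeholder, for example `tiktoken.encoding_for_model("gpt-4").encode`. The cookbook scheme also adds a few tokens once per request to prime the assistant's reply, which this per-message sketch does not include.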
count_tokens_for_image
Count the number of tokens for a base64-encoded image sent to GPT-4-vision.
Arguments:
- image (str): The base64-encoded image.
Returns:
int: The number of tokens used up for the image.
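As background, OpenAI documents a tile-based calculation for high-detail GPT-4 vision inputs: the image is scaled to fit within 2048 x 2048, scaled again so its shortest side is at most 768 pixels, and then charged a base cost plus a cost per 512-pixel tile. The sketch below illustrates that arithmetic only. `estimate_image_tokens` is a hypothetical name, it takes pixel dimensions rather than a base64 string, and it is an assumption that the library follows the same steps.

```python
import math

BASE_TOKENS = 85       # flat cost charged for every image
TOKENS_PER_TILE = 170  # cost of each 512 x 512 tile
TILE_SIZE = 512


def estimate_image_tokens(width: float, height: float) -> int:
    """Estimate high-detail image tokens from pixel dimensions."""
    # Step 1: shrink the image so it fits inside a 2048 x 2048 square.
    longest = max(width, height)
    if longest > 2048:
        scale = 2048 / longest
        width, height = width * scale, height * scale

    # Step 2: shrink again so the shortest side is at most 768 pixels.
    shortest = min(width, height)
    if shortest > 768:
        scale = 768 / shortest
        width, height = width * scale, height * scale

    # Step 3: count the tiles needed to cover the scaled image.
    tiles = math.ceil(width / TILE_SIZE) * math.ceil(height / TILE_SIZE)
    return BASE_TOKENS + TOKENS_PER_TILE * tiles


square = estimate_image_tokens(1024, 1024)  # scaled to 768 x 768, 4 tiles
tall = estimate_image_tokens(2048, 4096)    # scaled to 768 x 1536, 6 tiles
```

In practice the pixel dimensions would come from decoding the base64 payload with an imaging library such as Pillow before applying this arithmetic.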
Example:
Count the number of tokens for an image sent to GPT-4-vision:
```python
from llm_messages_token_helper import count_tokens_for_image

image = "..."
num_tokens = count_tokens_for_image(image)
```
get_token_limit
Get the token limit for a given GPT model name. Both OpenAI.com and Azure OpenAI model names are supported.
Arguments:
- model (str): The model name to use for token calculation, like gpt-3.5-turbo (OpenAI.com) or gpt-35-turbo (Azure).
Returns:
int: The token limit for the model.
Example:
```python
from llm_messages_token_helper import get_token_limit

model = "gpt-4"
max_tokens = get_token_limit(model)
```
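A lookup like this can be pictured as a table keyed by both naming styles, since Azure deployments drop the dot in "3.5". The sketch below is illustrative only: `MODEL_TOKEN_LIMITS` and `lookup_token_limit` are hypothetical names, and the values are the context-window sizes originally published for these models, which may not match the limits this library actually returns.

```python
# Illustrative context-window sizes (assumption); the library's table may differ.
MODEL_TOKEN_LIMITS = {
    "gpt-35-turbo": 4096,        # Azure OpenAI naming
    "gpt-3.5-turbo": 4096,       # OpenAI.com naming
    "gpt-35-turbo-16k": 16384,
    "gpt-3.5-turbo-16k": 16384,
    "gpt-4": 8192,
    "gpt-4-32k": 32768,
}


def lookup_token_limit(model: str) -> int:
    """Return the token limit for a known model name, or raise ValueError."""
    try:
        return MODEL_TOKEN_LIMITS[model]
    except KeyError:
        raise ValueError(f"Unknown model name: {model!r}") from None


limit = lookup_token_limit("gpt-4")
```

Raising on unknown names, rather than returning a default, keeps a typo in the model name from silently producing a wrong budget.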