A helper library for estimating tokens used by messages sent through the OpenAI Chat Completions API.
openai-messages-token-helper
A helper library for estimating tokens used by messages and building messages lists that fit within the token limits of a model. Currently designed to work with the OpenAI GPT models (including GPT-4 turbo with vision). Uses the tiktoken library for tokenizing text and the Pillow library for image-related calculations.
Installation
Install the package:
python3 -m pip install openai-messages-token-helper
Usage
The library provides the following functions:
build_messages
Build a list of messages for a chat conversation, given the system prompt, new user message, and past messages. The function will truncate the history of past messages if necessary to stay within the token limit.
Arguments:
model (str): The model name to use for token calculation, like gpt-3.5-turbo.
system_prompt (str): The initial system prompt message.
tools (List[openai.types.chat.ChatCompletionToolParam]): (Optional) The tools that will be used in the conversation. These won't be part of the final returned messages, but they will be used to calculate the token count.
tool_choice (openai.types.chat.ChatCompletionNamedToolChoiceParam): (Optional) The tool choice that will be used in the conversation. This won't be part of the final returned messages, but it will be used to calculate the token count.
new_user_content (str | List[openai.types.chat.ChatCompletionContentPartParam]): (Optional) The content of the new user message to append.
past_messages (list[openai.types.chat.ChatCompletionMessageParam]): (Optional) The list of past messages in the conversation.
few_shots (list[openai.types.chat.ChatCompletionMessageParam]): (Optional) A few-shot list of messages to insert after the system prompt.
max_tokens (int): (Optional) The maximum number of tokens allowed for the conversation.
fallback_to_default (bool): (Optional) Whether to fall back to the default model and token limits if the model is not found. Defaults to False.
Returns:
list[openai.types.chat.ChatCompletionMessageParam]
Example:
from openai_messages_token_helper import build_messages

messages = build_messages(
    model="gpt-35-turbo",
    system_prompt="You are a bot.",
    new_user_content="That wasn't a good poem.",
    past_messages=[
        {
            "role": "user",
            "content": "Write me a poem",
        },
        {
            "role": "assistant",
            "content": "Tuna tuna I love tuna",
        },
    ],
    few_shots=[
        {
            "role": "user",
            "content": "Write me a poem",
        },
        {
            "role": "assistant",
            "content": "Tuna tuna is the best",
        },
    ],
)
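To make the truncation behaviour concrete, here is a minimal sketch of one way to trim history to a token budget. It is not the library's implementation: `fit_messages` and `rough_token_count` are hypothetical names, few-shot messages and tools are left out, and the word-based counter is only a stand-in for a real tokenizer.

```python
from typing import Callable

Message = dict[str, str]


def rough_token_count(message: Message) -> int:
    """Hypothetical stand-in counter: one token per whitespace-separated word."""
    return len(message["content"].split())


def fit_messages(
    system_prompt: str,
    new_user_content: str,
    past_messages: list[Message],
    max_tokens: int,
    count: Callable[[Message], int] = rough_token_count,
) -> list[Message]:
    """Keep the newest past messages that fit alongside the fixed messages."""
    system = {"role": "system", "content": system_prompt}
    user = {"role": "user", "content": new_user_content}
    # The system prompt and the new user message are always included.
    used = count(system) + count(user)

    kept: list[Message] = []
    # Walk the history from newest to oldest and stop at the first overflow.
    for message in reversed(past_messages):
        cost = count(message)
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost

    # Restore chronological order before assembling the final list.
    return [system, *reversed(kept), user]
```

Dropping the oldest messages first keeps the most recent context, which is usually what the next reply depends on.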
count_tokens_for_message
Counts the number of tokens in a message.
Arguments:
model (str): The model name to use for token calculation, like gpt-3.5-turbo.
message (openai.types.chat.ChatCompletionMessageParam): The message to count tokens for.
default_to_cl100k (bool): Whether to default to the CL100k encoding if the model is not found.
Returns:
int: The number of tokens in the message.
Example:
from openai_messages_token_helper import count_tokens_for_message

message = {
    "role": "user",
    "content": "Hello, how are you?",
}
model = "gpt-4"
num_tokens = count_tokens_for_message(model, message)
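For intuition about where the number comes from, the sketch below follows the counting recipe from the OpenAI cookbook for recent chat models: a fixed overhead per message plus the encoded length of each field. Whether the library uses exactly these constants is an assumption, and `estimate_message_tokens` and `word_encode` are hypothetical names; the word-splitting encoder only stands in for a real tokenizer.

```python
from typing import Callable

TOKENS_PER_MESSAGE = 3  # per-message overhead from the OpenAI cookbook recipe
TOKENS_PER_NAME = 1     # extra token when a message carries a "name" field


def estimate_message_tokens(
    message: dict[str, str],
    encode: Callable[[str], list],
) -> int:
    """Estimate tokens for one message whose fields are all plain strings."""
    total = TOKENS_PER_MESSAGE
    for key, value in message.items():
        # Every field value (role, content, name) is tokenized and counted.
        total += len(encode(value))
        if key == "name":
            total += TOKENS_PER_NAME
    return total


def word_encode(text: str) -> list[str]:
    """Hypothetical stand-in tokenizer: one token per whitespace-separated word."""
    return text.split()


message = {"role": "user", "content": "Hello, how are you?"}
estimate = estimate_message_tokens(message, word_encode)
```

With tiktoken installed, passing `tiktoken.encoding_for_model("gpt-4").encode` as `encode` should give counts closer to what the API reports.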
count_tokens_for_image
Count the number of tokens for an image sent to GPT-4-vision, in base64 format.
Arguments:
image (str): The base64-encoded image.
Returns:
int: The number of tokens used up for the image.
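The count depends on the image's pixel dimensions, not on the size of the base64 string. The sketch below applies the sizing rules OpenAI published for GPT-4 Turbo with Vision (fit within 2048 x 2048, scale the shortest side down to 768, then charge per 512-pixel tile). It assumes the library follows the same rules; `estimate_image_tokens` is a hypothetical name, and the width and height are passed in directly instead of being read from the image with Pillow.

```python
import math

LOW_DETAIL_COST = 85   # flat cost for a low-detail image
BASE_COST = 85         # fixed cost added to every high-detail image
COST_PER_TILE = 170    # cost of each 512 x 512 tile


def estimate_image_tokens(width: int, height: int, detail: str = "high") -> int:
    """Estimate vision tokens for an image of the given pixel dimensions."""
    if detail == "low":
        return LOW_DETAIL_COST

    # Step 1: shrink the image to fit inside a 2048 x 2048 square.
    if max(width, height) > 2048:
        scale = 2048 / max(width, height)
        width, height = int(width * scale), int(height * scale)

    # Step 2: shrink again so the shortest side is at most 768 pixels.
    if min(width, height) > 768:
        scale = 768 / min(width, height)
        width, height = int(width * scale), int(height * scale)

    # Step 3: count the 512-pixel tiles needed to cover the resized image.
    tiles = math.ceil(width / 512) * math.ceil(height / 512)
    return COST_PER_TILE * tiles + BASE_COST
```

For example, a 1024 x 1024 image is resized to 768 x 768, which needs four tiles, so the estimate is 170 * 4 + 85 = 765 tokens.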
Example:
Count the number of tokens for an image sent to GPT-4-vision:
```python
from openai_messages_token_helper import count_tokens_for_image

image = "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEA..."
num_tokens = count_tokens_for_image(image)
```
get_token_limit
Get the token limit for a given GPT model name (OpenAI.com or Azure OpenAI supported).
Arguments:
model (str): The model name to use for token calculation, like gpt-3.5-turbo (OpenAI.com) or gpt-35-turbo (Azure).
default_to_minimum (bool): Whether to default to the minimum token limit if the model is not found.
Returns:
int: The token limit for the model.
Example:
from openai_messages_token_helper import get_token_limit
model = "gpt-4"
max_tokens = get_token_limit(model)
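The fallback behaviour can be pictured as a table lookup. The sketch below is an illustration, not the library's code: `lookup_token_limit` and `TOKEN_LIMITS` are hypothetical names, the limits are placeholder values, and raising `ValueError` for an unknown model is an assumption.

```python
# Illustrative placeholder values only; not the library's actual table.
TOKEN_LIMITS = {
    "gpt-3.5-turbo": 4096,  # OpenAI.com model name
    "gpt-35-turbo": 4096,   # Azure OpenAI spelling of the same model
    "gpt-4": 8192,
}


def lookup_token_limit(model: str, default_to_minimum: bool = False) -> int:
    """Return the limit for a known model, or fall back to the smallest limit."""
    if model in TOKEN_LIMITS:
        return TOKEN_LIMITS[model]
    if default_to_minimum:
        # The smallest known limit is the safest guess for an unknown model.
        return min(TOKEN_LIMITS.values())
    raise ValueError(f"Unknown model name: {model!r}")
```

Falling back to the minimum trades context length for safety: a request sized for the smallest limit should fit any model in the table.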
Hashes for openai_messages_token_helper-0.1.4.tar.gz
Algorithm | Hash digest
---|---
SHA256 | f94d3c490c2b4266c112eaaf456634220d3b15ffe7437ac0371dfc0810bb56a7
MD5 | a997dd6d59bbb81fbd9c01ed21ea6e5c
BLAKE2b-256 | c43a5a9fe8de120d11f58ea06e5509902d84e99176a7b412800ca8cb7a06aa26
Hashes for openai_messages_token_helper-0.1.4-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 58c59049315cb167bdd94c0770e3d4fe304b4bf3eb8670e90a97c660404d8423
MD5 | a936e3f121d77aac89667a3ad6797422
BLAKE2b-256 | d99f586ea1329b91cb17124bc9d0b4c116544b32682e4b53cd52d793509861d0