tiktoken-cli
Simple wrapper around tiktoken to use it in your favorite language.
Why
If you play with OpenAI's GPT API, you quickly run into an annoying problem: your prompt is allowed a given number of tokens, you have no idea how those tokens are counted, and you only find out it was too much when the API replies with an error, which is seriously annoying (and slow).
OpenAI published its tiktoken token-counting library to solve that, which helps a lot! If you're writing a Python program.
tiktoken-cli allows you to write your program in any language. It's a simple wrapper around tiktoken that reads your prompt from STDIN and writes the tokens to STDOUT. Now you can write your program in your favorite language and use the CLI for token counting.
It's almost not worth publishing a GitHub repo for so few lines, but I figured this README's explanation would be valuable for people wondering how to use OpenAI's API in their favorite language; the code is merely an executable example.
A note for those of you who may be new to GPT's API: counting the tokens of your prompt alone is not enough to avoid exceeded-context errors. This is because the API wraps your messages with its own content. The exact algorithm to know whether you're going to exceed the limit is documented in the API doc (see the block "Deep dive: Counting tokens for chat API calls").
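As a rough sketch, that deep-dive describes a fixed per-message overhead added on top of the tokens of each field. The helper below is a hypothetical illustration of that shape: the constants (3 tokens per message, 1 for a name field, 3 to prime the reply) follow OpenAI's published example for gpt-3.5-turbo/gpt-4-era models, and `count_tokens` stands in for any real token counter, such as one that shells out to tiktoken-cli:

```python
def num_tokens_for_chat(messages, count_tokens):
    """Estimate the chat-API token count for a list of {"role": ..., "content": ...}
    dicts, given a function that counts the tokens of a single string."""
    tokens_per_message = 3  # each message is wrapped in start/role/end markers
    tokens_per_name = 1     # an optional "name" field costs one extra token
    total = 0
    for message in messages:
        total += tokens_per_message
        for key, value in message.items():
            total += count_tokens(value)
            if key == "name":
                total += tokens_per_name
    total += 3  # every reply is primed with an assistant start marker
    return total
```

The constants vary by model family, so treat this as the structure of the calculation, not exact numbers; the authoritative values are in OpenAI's own documentation.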
Install
tiktoken-cli is a simple script; you can install it via pipx:
pipx install tiktoken-cli
Usage
To use tiktoken, send your prompt on STDIN and read the tokens from STDOUT.
Examples:
In shell:
tiktoken --model gpt-4 in.txt out.txt
Replace a file with - for standard input/output:
echo "Hello, world!" | tiktoken --model gpt-4 - out.txt # writes tokens to out.txt
tiktoken --model gpt-4 in.txt - # writes tokens to stdout
You can count tokens easily:
echo "Hello, world!" | tiktoken - | wc -l
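From any other language, "use the CLI" just means spawning the process and piping text through it. A minimal sketch (Python used here only for brevity, assuming, per the examples above, that `-` selects STDIN/STDOUT and that the tool prints one token per line; the default command tuple is an assumption you should adapt):

```python
import subprocess

def tokens_for(prompt, cmd=("tiktoken", "--model", "gpt-4", "-", "-")):
    """Pipe `prompt` to a one-token-per-line command and return the tokens."""
    result = subprocess.run(
        cmd, input=prompt, capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()
```

The token count is then just `len(tokens_for(prompt))`, mirroring the `wc -l` pipeline above.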
Model
tiktoken counts tokens differently based on the model. By default, the model used is gpt-3.5-turbo. You can change the model using the --model option.
For the full list of models available:
tiktoken --help