
tiktoken-cli

Simple wrapper around tiktoken to use it in your favorite language.

Why

If you play with OpenAI's GPT API, you have probably run into an annoying problem: your prompt is allowed a given number of tokens, you have no idea how those tokens are counted, and you only find out your prompt was too long when the API replies with an error, which is seriously annoying (and slow).

OpenAI published tiktoken, its token-counting library, to solve exactly that. It helps a lot, provided you're writing a Python program.

tiktoken-cli lets you write your program in any language. It's a simple wrapper around tiktoken that reads your prompt from STDIN and writes the tokens to STDOUT. You can write your program in your favorite language and shell out to the CLI for token counting.

It's almost not worth publishing a GitHub repo for so few lines, but I figured this README's explanation would be valuable for people wondering how to use OpenAI's API from their favorite language; the code is merely an executable example.

A note for those of you who may be new to GPT's API: counting the tokens of your prompt alone is not enough to avoid exceeded-context errors. This is because the API wraps your messages with its own content. The exact algorithm for predicting whether you will exceed the limit is documented in the API docs (in the block "Deep dive: Counting tokens for chat API calls").
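That per-message overhead can be sketched as follows. This is a hedged sketch, not part of tiktoken-cli: the constants (3 tokens per message, 1 extra for a name field, 3 to prime the reply) are the values OpenAI's documentation gives for gpt-3.5-turbo and gpt-4, and `count_fn` is a hypothetical stand-in for any per-string token counter, such as piping the string through tiktoken-cli and counting output lines.

```python
def num_tokens_for_chat(messages, count_fn,
                        tokens_per_message=3, tokens_per_name=1):
    """Estimate the token cost of a chat completion request.

    `messages` is a list of dicts like {"role": ..., "content": ...}.
    `count_fn(text)` returns the token count of one string (for example,
    by piping it through tiktoken-cli and counting lines of output).
    Overhead constants follow OpenAI's "counting tokens for chat API
    calls" deep dive for gpt-3.5-turbo / gpt-4.
    """
    total = 0
    for message in messages:
        total += tokens_per_message          # wrapper around each message
        for key, value in message.items():
            total += count_fn(value)
            if key == "name":
                total += tokens_per_name     # a name field costs extra
    total += 3  # every reply is primed with <|start|>assistant<|message|>
    return total
```

Plugging in tiktoken-cli as `count_fn` gives you the same pre-flight check in any language that can spawn a process.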

Install

tiktoken-cli is a simple script; you can install it via pipx.

pipx install tiktoken-cli

Usage

To use tiktoken-cli, pass an input file with your prompt and an output file for the tokens, or use - to read the prompt from STDIN and write the tokens to STDOUT.

Examples:

In shell:

tiktoken --model gpt-4 in.txt out.txt

Replace the file with - for standard input/output:

echo "Hello, world!" | tiktoken --model gpt-4 - out.txt # writes tokens to out.txt
tiktoken --model gpt-4 in.txt - # writes tokens to stdout

Since tokens are written one per line, you can count them easily:

echo "Hello, world!" | tiktoken - | wc -l

Model

tiktoken counts tokens differently based on model. By default, the model used is gpt-3.5-turbo.

You can change the model using the --model option.

For the full list of available models, run:

tiktoken --help
