A simple utility to estimate token counts using tiktoken o200k_base

Project description

Tokentik

A lightweight Python utility for estimating token counts in text using tiktoken's o200k_base encoding, the encoding used by recent OpenAI models such as GPT-4o.

Installation

pip install tokentik

Usage

from tokentik import count_tokens

text = "Hello, world!"
token_count = count_tokens(text)
print(f"Token count: {token_count}")
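For readers who want to see what such a count involves, here is a minimal sketch that calls tiktoken directly. It is not tokentik's actual implementation, which is not shown on this page; the names `get_o200k_encoding` and `estimate_tokens` are hypothetical, and the use of `encode_ordinary` (which treats special-token strings as plain text) is an assumption about the desired behavior.

```python
from functools import lru_cache


@lru_cache(maxsize=1)
def get_o200k_encoding():
    """Load the o200k_base encoding once and reuse it (hypothetical helper)."""
    import tiktoken  # imported lazily so the module loads without tiktoken

    return tiktoken.get_encoding("o200k_base")


def estimate_tokens(text, encoding=None):
    """Return the number of tokens in `text` (hypothetical helper).

    `encoding` is any object with an `encode_ordinary(text)` method;
    it defaults to tiktoken's o200k_base encoding.
    """
    enc = encoding if encoding is not None else get_o200k_encoding()
    # encode_ordinary ignores special tokens, so text that happens to
    # contain something like "<|endoftext|>" is counted as ordinary text.
    return len(enc.encode_ordinary(text))
```

Calling `estimate_tokens("Hello, world!")` requires tiktoken to be installed and, on first use, network access to fetch the vocabulary file. Caching the encoding object avoids reloading it on every call.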

Configuration

Environment Variables

On first use, tiktoken downloads the BPE (Byte Pair Encoding) vocabulary file for the encoding and caches it. By default, the cache lives under the system temporary directory, which may not persist across restarts. To specify a persistent location for these files, set the TIKTOKEN_CACHE_DIR environment variable:

export TIKTOKEN_CACHE_DIR="/path/to/your/models/tiktoken"

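This is recommended for production, and in particular for environments such as Cloud Run where persistent storage is mounted into the container (e.g., at /mnt/models/tiktoken), so that the vocabulary file does not have to be downloaded again each time a new instance starts.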
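If exporting the variable in the shell is not convenient, the same setting can be applied from Python before the encoding is first loaded. The sketch below is one way to do that; `configure_tiktoken_cache` is a hypothetical helper, not part of tokentik or tiktoken, and it assumes the process is allowed to create the directory.

```python
import os
from pathlib import Path


def configure_tiktoken_cache(default_dir):
    """Ensure TIKTOKEN_CACHE_DIR is set and the directory exists.

    A value already present in the environment wins over `default_dir`.
    Returns the directory that will be used.
    """
    # setdefault leaves an existing value untouched.
    cache_dir = os.environ.setdefault("TIKTOKEN_CACHE_DIR", str(default_dir))
    # Create the directory up front so a bad path fails early and visibly.
    Path(cache_dir).mkdir(parents=True, exist_ok=True)
    return cache_dir
```

A typical call would be `configure_tiktoken_cache("/mnt/models/tiktoken")` at application startup, placed before importing tokentik so the variable is in place by the time the vocabulary file is looked up.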

License

MIT

Download files

Source Distribution

tokentik-0.1.0.tar.gz (2.9 kB view details)

Uploaded Source

Built Distribution

tokentik-0.1.0-py3-none-any.whl (3.0 kB view details)

Uploaded Python 3

File details

Details for the file tokentik-0.1.0.tar.gz.

File metadata

  • Download URL: tokentik-0.1.0.tar.gz
  • Upload date:
  • Size: 2.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.19

File hashes

Hashes for tokentik-0.1.0.tar.gz
Algorithm      Hash digest
SHA256         82cd50f5c97ac7fade726f335da0b456cb39239cbfb955fbcda4e29645537b31
MD5            a78ebfd702b0e88c55ae4ec8ad39d3a9
BLAKE2b-256    fe2be0e7efd6e20527bed7352fb126464a26e99266339fd5e536a33c01dc30ea

File details

Details for the file tokentik-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: tokentik-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 3.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.19

File hashes

Hashes for tokentik-0.1.0-py3-none-any.whl
Algorithm      Hash digest
SHA256         5505ec55a9fa1849ad0d397ac3ca6b90b5c89b898e7ca7e65877be9e9c3f550a
MD5            a2073488f9904cde03bb9d7700565823
BLAKE2b-256    8997198a69a66be7341b7f384b1aeec84ca76092ecf4807e04c50cd78c8e9ce7
