Text tokenizers.

Project description

totokenizers

A model-agnostic library for encoding text into tokens and counting them with different tokenizers.

install

pip install totokenizers

usage

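from totokenizers.factories import TotoModelInfo, Totokenizer

model = "openai/gpt-3.5-turbo-0613"
desired_max_tokens = 250  # tokens reserved for the completion
tokenizer = Totokenizer.from_model(model)
model_info = TotoModelInfo.from_model(model)

# `thread` (the chat messages) and `functions` (the function definitions)
# must be defined by the caller before this point.
thread_length = tokenizer.count_chatml_tokens(thread, functions)
# Fail early if the prompt plus the reserved completion exceeds the model limit.
if thread_length + desired_max_tokens > model_info.max_tokens:
    # YourException is a placeholder for your own exception class.
    raise YourException(thread_length, desired_max_tokens, model_info.max_tokens)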
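The check in the usage snippet comes down to simple arithmetic: the prompt length plus the tokens reserved for the completion must not exceed the model limit. A minimal sketch of that check as a reusable helper follows. It does not call totokenizers at all; `TokenBudgetExceeded` and `check_token_budget` are hypothetical names introduced here for illustration, and the limit of 4096 in the usage note is only an example value.

```python
class TokenBudgetExceeded(Exception):
    """Hypothetical stand-in for the YourException placeholder."""

    def __init__(self, prompt_tokens, desired_max_tokens, max_tokens):
        self.prompt_tokens = prompt_tokens
        self.desired_max_tokens = desired_max_tokens
        self.max_tokens = max_tokens
        super().__init__(
            f"prompt ({prompt_tokens}) + completion ({desired_max_tokens}) "
            f"exceeds the model limit ({max_tokens})"
        )


def check_token_budget(prompt_tokens, desired_max_tokens, max_tokens):
    """Return the number of spare tokens, or raise if the budget is exceeded.

    prompt_tokens would typically come from a token count of the thread,
    and max_tokens from the model information (both assumptions here).
    """
    total = prompt_tokens + desired_max_tokens
    if total > max_tokens:
        raise TokenBudgetExceeded(prompt_tokens, desired_max_tokens, max_tokens)
    return max_tokens - total
```

For example, `check_token_budget(3000, 250, 4096)` returns 846, while a prompt of 4000 tokens with the same reservation raises `TokenBudgetExceeded`.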

Download files

Source Distribution

totokenizers-1.6.0.tar.gz (756.8 kB)

Uploaded Source

Built Distribution

totokenizers-1.6.0-py3-none-any.whl (766.9 kB)

Uploaded Python 3

File details

Details for the file totokenizers-1.6.0.tar.gz.

File metadata

  • Download URL: totokenizers-1.6.0.tar.gz
  • Upload date:
  • Size: 756.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.11

File hashes

Hashes for totokenizers-1.6.0.tar.gz
Algorithm Hash digest
SHA256 368fd32e4b8d70aa420c6ab6f41cf46b47ec6247fdf89c501d518a5eb4b6aa36
MD5 c58f66fe864b8642bc3522d370e26f6f
BLAKE2b-256 9e9c355b8d520989592aa1194bef036eadbcb40ee9f488a77f402afff0122b5d
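A published digest like the SHA256 value above can be compared against a downloaded file. The following is a minimal sketch using only the Python standard library; `sha256_of_file` and `matches_published_hash` are hypothetical helper names, and the file path passed in is assumed to point at a local copy of the download.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Return True if the file's SHA256 equals the published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Calling `matches_published_hash` with the path to the downloaded totokenizers-1.6.0.tar.gz and the SHA256 digest listed above should return True for an intact download.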

File details

Details for the file totokenizers-1.6.0-py3-none-any.whl.

File metadata

  • Download URL: totokenizers-1.6.0-py3-none-any.whl
  • Upload date:
  • Size: 766.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.11

File hashes

Hashes for totokenizers-1.6.0-py3-none-any.whl
Algorithm Hash digest
SHA256 38086fdf20941b35652dd62300783131e0bc36d7f4ced7fa9437aeac4b907acb
MD5 3996cf433117bbad79dcd36578dc6139
BLAKE2b-256 25a26904238554e4c3124ee28785659591315e01babde86d705150054ecb694e
