Text tokenizers.

Project description

totokenizers

A model-agnostic library for encoding text into tokens and counting them with different tokenizers.

install

pip install totokenizers

usage

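from totokenizers.factories import TotoModelInfo, Totokenizer

model = "openai/gpt-3.5-turbo-0613"
desired_max_tokens = 250  # tokens to reserve for the model's reply
tokenizer = Totokenizer.from_model(model)
model_info = TotoModelInfo.from_model(model)

# `thread` (the chat messages) and `functions` (the function definitions)
# come from your application; they are not defined in this snippet.
thread_length = tokenizer.count_chatml_tokens(thread, functions)
if thread_length + desired_max_tokens > model_info.max_tokens:
    # YourException is a placeholder for your own exception class.
    raise YourException(thread_length, desired_max_tokens, model_info.max_tokens)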
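
The budget check in the snippet above can be isolated into a small helper that does not depend on the library. This is only a sketch: `ContextBudgetError` and `check_budget` are hypothetical names that are not part of totokenizers, and the numbers in the usage note are illustrative.

```python
class ContextBudgetError(ValueError):
    """Hypothetical stand-in for the YourException placeholder."""

    def __init__(self, thread_length, desired_max_tokens, max_tokens):
        self.thread_length = thread_length
        self.desired_max_tokens = desired_max_tokens
        self.max_tokens = max_tokens
        super().__init__(
            f"prompt uses {thread_length} tokens and {desired_max_tokens} are "
            f"reserved for the reply, but the model allows only {max_tokens}"
        )


def check_budget(thread_length, desired_max_tokens, max_tokens):
    """Return the number of spare tokens, or raise if the request cannot fit.

    Mirrors the condition in the usage snippet: the request is rejected only
    when thread_length + desired_max_tokens exceeds max_tokens.
    """
    remaining = max_tokens - thread_length - desired_max_tokens
    if remaining < 0:
        raise ContextBudgetError(thread_length, desired_max_tokens, max_tokens)
    return remaining
```

With a 4096-token limit, `check_budget(3000, 250, 4096)` returns 846, while `check_budget(4000, 250, 4096)` raises because 4250 tokens would be needed. In real use, `thread_length` would come from `tokenizer.count_chatml_tokens(...)` and `max_tokens` from `model_info.max_tokens`.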

Download files

Source Distribution

totokenizers-1.2.4.1.tar.gz (754.1 kB)

Uploaded Source

Built Distribution

totokenizers-1.2.4.1-py3-none-any.whl (764.3 kB)

Uploaded Python 3

File details

Details for the file totokenizers-1.2.4.1.tar.gz.

File metadata

  • Download URL: totokenizers-1.2.4.1.tar.gz
  • Size: 754.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.11.9

File hashes

Hashes for totokenizers-1.2.4.1.tar.gz
Algorithm    Hash digest
SHA256       1ff5f226e03ec1ea401edc8cde7ffc123e7b01badcfda41f17c7832e92ce7ef0
MD5          85729686fd74191326679f0341c812b1
BLAKE2b-256  f8ec5c4c24ca0ad7d5a703395aaee40c66524979dbd5249856bb2b15bf61c81b
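
One way to use the digests listed above is to recompute the SHA256 of a downloaded file and compare it with the published value. The sketch below uses only the Python standard library; the helper names `sha256_of` and `matches` are illustrative and not part of totokenizers or PyPI tooling.

```python
import hashlib
import hmac

# SHA256 published above for totokenizers-1.2.4.1.tar.gz
EXPECTED_SHA256 = "1ff5f226e03ec1ea401edc8cde7ffc123e7b01badcfda41f17c7832e92ce7ef0"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare the file's digest with the expected hex string."""
    return hmac.compare_digest(sha256_of(path), expected_hex.lower())
```

Assuming the archive has been downloaded to the current directory, `matches("totokenizers-1.2.4.1.tar.gz", EXPECTED_SHA256)` should return `True` for an intact file.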

File details

Details for the file totokenizers-1.2.4.1-py3-none-any.whl.

File hashes

Hashes for totokenizers-1.2.4.1-py3-none-any.whl
Algorithm    Hash digest
SHA256       6d1a8d8689a456f3d544f69212c887bf2b421d13e044c429cde56701a60b91a5
MD5          1bef9d8addb711382136f8f130843c8c
BLAKE2b-256  47d58701308566f70d6bb0db73cd1f1f650328538bb1760e9223bb5d8b38e95d
