
Text tokenizers.

Project description

totokenizers

A model-agnostic library to encode text into tokens and count them using different tokenizers.

install

pip install totokenizers

usage

from totokenizers.factories import TotoModelInfo, Totokenizer

model = "openai/gpt-3.5-turbo-0613"
desired_max_tokens = 250
tokenizer = Totokenizer.from_model(model)
model_info = TotoModelInfo.from_model(model)

thread_length = tokenizer.count_chatml_tokens(thread, functions)
if thread_length + desired_max_tokens > model_info.max_tokens:
    raise YourException(thread_length, desired_max_tokens, model_info.max_tokens)
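
The snippet above assumes a `thread` and a `functions` object already exist. A minimal sketch of what those inputs could look like, assuming OpenAI-style ChatML messages and a hypothetical `get_weather` function schema (the exact shapes expected by `count_chatml_tokens` may differ):

# ChatML-style message thread (shape assumed)
thread = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the weather in Berlin?"},
]

# OpenAI function-calling schema (hypothetical example function)
functions = [
    {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
]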

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

totokenizers-1.2.2.tar.gz (754.0 kB)

Uploaded Source

Built Distribution

totokenizers-1.2.2-py3-none-any.whl (764.3 kB)

Uploaded Python 3

File details

Details for the file totokenizers-1.2.2.tar.gz.

File metadata

  • Download URL: totokenizers-1.2.2.tar.gz
  • Upload date:
  • Size: 754.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.11.8

File hashes

Hashes for totokenizers-1.2.2.tar.gz

  • SHA256: f9cbc6678b7bbd1c527e59a2c46a466c22b10d88e883f8810839e8ade575d71a
  • MD5: 27939eb41feb5ca3e992605d1f172572
  • BLAKE2b-256: cf0b1123fef0f02d64c890ecb22e5b20df21d005c39c509dc120e5281edd649c

See more details on using hashes here.
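
To check a downloaded file against the hashes listed above, you can compute its digest locally. A minimal sketch using Python's standard hashlib, assuming the archive has been downloaded into the current directory:

import hashlib

# SHA256 digest published above for the source distribution
expected_sha256 = "f9cbc6678b7bbd1c527e59a2c46a466c22b10d88e883f8810839e8ade575d71a"

# Path assumes the file sits in the current working directory
with open("totokenizers-1.2.2.tar.gz", "rb") as f:
    digest = hashlib.sha256(f.read()).hexdigest()

assert digest == expected_sha256, "hash mismatch: download may be corrupted or tampered with"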

Provenance

File details

Details for the file totokenizers-1.2.2-py3-none-any.whl.

File metadata

  • Download URL: totokenizers-1.2.2-py3-none-any.whl
  • Upload date:
  • Size: 764.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.11.8

File hashes

Hashes for totokenizers-1.2.2-py3-none-any.whl

  • SHA256: 08976f3753b1132c8da317cc79b9b555f208825a3a494b1d5e4cd305aca3d79a
  • MD5: e595d2631cb2f17fc2cc8559eca5dd17
  • BLAKE2b-256: 3a5d5d399f1b1c189af41b7957e011e9948331facb48f5d4a2f82c44f3dc7a83

See more details on using hashes here.

Provenance
