Text tokenizers.

Project description

totokenizers

A model-agnostic library for encoding text into tokens and counting them with different tokenizers.

install

pip install totokenizers

usage

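from totokenizers.factories import TotoModelInfo, Totokenizer

model = "openai/gpt-3.5-turbo-0613"
desired_max_tokens = 250  # tokens to reserve for the model's reply
tokenizer = Totokenizer.from_model(model)
model_info = TotoModelInfo.from_model(model)

# `thread` (the chat messages) and `functions` (the function definitions)
# are supplied by your application; they are not defined in this snippet.
thread_length = tokenizer.count_chatml_tokens(thread, functions)
# Fail early if the prompt plus the reserved reply would exceed the model's limit.
# `YourException` is a placeholder for your own exception class.
if thread_length + desired_max_tokens > model_info.max_tokens:
    raise YourException(thread_length, desired_max_tokens, model_info.max_tokens)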
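
The budget check in the snippet above can be wrapped in a small helper. The following is a minimal sketch that does not depend on totokenizers; `TokenBudgetExceeded` and `check_token_budget` are hypothetical names chosen for illustration and are not part of the library. It assumes you already have the thread length and the model's token limit as plain integers.

```python
class TokenBudgetExceeded(Exception):
    """Raised when the prompt plus the reserved reply would not fit."""

    def __init__(self, thread_length: int, desired_max_tokens: int, max_tokens: int):
        self.thread_length = thread_length
        self.desired_max_tokens = desired_max_tokens
        self.max_tokens = max_tokens
        super().__init__(
            f"thread uses {thread_length} tokens and {desired_max_tokens} more "
            f"were requested, but the model allows only {max_tokens}"
        )


def check_token_budget(thread_length: int, desired_max_tokens: int, max_tokens: int) -> int:
    """Return the tokens left over after reserving room for the reply.

    Mirrors the condition in the usage snippet: raise only when
    thread_length + desired_max_tokens is strictly greater than max_tokens.
    """
    remaining = max_tokens - thread_length - desired_max_tokens
    if remaining < 0:
        raise TokenBudgetExceeded(thread_length, desired_max_tokens, max_tokens)
    return remaining
```

With the values from the usage snippet, a call would look like `check_token_budget(thread_length, desired_max_tokens, model_info.max_tokens)`. As an illustration with made-up numbers, `check_token_budget(3000, 250, 4096)` returns 846.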

Download files

Source Distribution

totokenizers-1.9.3.tar.gz (757.0 kB)

Uploaded Source

Built Distribution

totokenizers-1.9.3-py3-none-any.whl (767.0 kB)

Uploaded Python 3

File details

Details for the file totokenizers-1.9.3.tar.gz.

File metadata

  • Download URL: totokenizers-1.9.3.tar.gz
  • Size: 757.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.15

File hashes

Hashes for totokenizers-1.9.3.tar.gz

  • SHA256: c7d86dc92497f5853b6e60c44fc4e981fc27166a552091608d77713f07f71db9
  • MD5: dc870221ab6174b2d747d95a35a2e812
  • BLAKE2b-256: d6838ce861b59317ae08b7492ff6a010d1f53e4817ef3802da6d64bf6d3ffdc1

File details

Details for the file totokenizers-1.9.3-py3-none-any.whl.

File metadata

  • Download URL: totokenizers-1.9.3-py3-none-any.whl
  • Size: 767.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.15

File hashes

Hashes for totokenizers-1.9.3-py3-none-any.whl

  • SHA256: 39069a33c95a88ca388a4886b1c781441d7a6f1e1b9d5a07e169836a9b97af89
  • MD5: a26b15716e9da1c43c84074e4538fda4
  • BLAKE2b-256: afc74cf8e2b372a5224181778b82fe890d870805e170b789910aab7c6c1fae11
