Text tokenizers.

Project description

totokenizers

A model-agnostic library for encoding text into tokens and counting them with different tokenizers.

install

pip install totokenizers

usage

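from totokenizers.factories import TotoModelInfo, Totokenizer

model = "openai/gpt-3.5-turbo-0613"
desired_max_tokens = 250  # token budget to leave for the completion
tokenizer = Totokenizer.from_model(model)
model_info = TotoModelInfo.from_model(model)

# `thread` (the chat messages) and `functions` (the function definitions)
# are supplied by your application.
thread_length = tokenizer.count_chatml_tokens(thread, functions)
if thread_length + desired_max_tokens > model_info.max_tokens:
    # `YourException` is a placeholder for an exception class of your own.
    raise YourException(thread_length, desired_max_tokens, model_info.max_tokens)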
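
The budget check above can be wrapped in a small helper. The following is a minimal sketch, not part of totokenizers: `TokenBudgetExceeded` and `ensure_fits` are hypothetical names standing in for `YourException` and the inline comparison, and the helper only takes plain integers so it can be tried without the library installed.

```python
class TokenBudgetExceeded(Exception):
    """Hypothetical stand-in for the `YourException` placeholder."""

    def __init__(self, thread_length, desired_max_tokens, max_tokens):
        self.thread_length = thread_length
        self.desired_max_tokens = desired_max_tokens
        self.max_tokens = max_tokens
        overflow = thread_length + desired_max_tokens - max_tokens
        super().__init__(
            f"prompt ({thread_length}) + completion budget ({desired_max_tokens}) "
            f"exceeds the model limit ({max_tokens}) by {overflow} tokens"
        )


def ensure_fits(thread_length, desired_max_tokens, max_tokens):
    """Return the number of spare tokens, or raise if the budget is exceeded."""
    spare = max_tokens - (thread_length + desired_max_tokens)
    if spare < 0:
        # Same condition as the usage snippet: prompt + completion > limit.
        raise TokenBudgetExceeded(thread_length, desired_max_tokens, max_tokens)
    return spare
```

With the library, the call would presumably look like `ensure_fits(tokenizer.count_chatml_tokens(thread, functions), desired_max_tokens, model_info.max_tokens)`.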

Download files

Source Distribution

totokenizers-1.4.0.tar.gz (755.7 kB)

Uploaded Source

Built Distribution

totokenizers-1.4.0-py3-none-any.whl (766.3 kB)

Uploaded Python 3

File details

Details for the file totokenizers-1.4.0.tar.gz.

File metadata

  • Download URL: totokenizers-1.4.0.tar.gz
  • Upload date:
  • Size: 755.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.11.9

File hashes

Hashes for totokenizers-1.4.0.tar.gz
Algorithm Hash digest
SHA256 513d3ac266fa06c0d73f137a7f87329bfef94efb1cece10a9ed89da8840fc88d
MD5 aaaa44dadb1b623a4a0a500337342b6e
BLAKE2b-256 a7a5a2eed0dd7b8d93283743b22c04b413cf2f18b7e8ee1357aef983d21a1b74
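
A downloaded copy of the archive can be checked against the SHA256 digest listed above. The following is a minimal sketch using only the Python standard library; the function name `sha256_matches` and the local file path are assumptions for illustration.

```python
import hashlib


def sha256_matches(path, expected_hex, chunk_size=65536):
    """Return True if the file at `path` has the given SHA256 hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.lower()
```

For example, `sha256_matches("totokenizers-1.4.0.tar.gz", "513d3ac266fa06c0d73f137a7f87329bfef94efb1cece10a9ed89da8840fc88d")` should return `True` for an unmodified download saved under that name in the current directory.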

File details

Details for the file totokenizers-1.4.0-py3-none-any.whl.

File metadata

  • Download URL: totokenizers-1.4.0-py3-none-any.whl
  • Upload date:
  • Size: 766.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.11.9

File hashes

Hashes for totokenizers-1.4.0-py3-none-any.whl
Algorithm Hash digest
SHA256 79d5fa61253e117d4fd42bb75fd202cc9f9165c902eaa5c93060dea36ab28e23
MD5 8120db98847ce1b06bdbde918ea7c17a
BLAKE2b-256 be5cb7adca055a27d3703438f0f3356fb42220a7f9a6704a7d05876e14cd01c2
