
Text tokenizers.

Project description

totokenizers

A model-agnostic library to encode text into tokens and count them using different tokenizers.

install

pip install totokenizers

usage

from totokenizers.factories import TotoModelInfo, Totokenizer

model = "openai/gpt-3.5-turbo-0613"
desired_max_tokens = 250
tokenizer = Totokenizer.from_model(model)
model_info = TotoModelInfo.from_model(model)

# `thread` is a list of ChatML-style chat messages and `functions` is a
# list of function schemas exposed to the model, for example:
thread = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]
functions = []

# Check that the prompt plus the desired completion fits in the model's
# context window; YourException is a placeholder for your own error type.
thread_length = tokenizer.count_chatml_tokens(thread, functions)
if thread_length + desired_max_tokens > model_info.max_tokens:
    raise YourException(thread_length, desired_max_tokens, model_info.max_tokens)

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

totokenizers-1.3.0.tar.gz (755.7 kB)

Uploaded Source

Built Distribution

totokenizers-1.3.0-py3-none-any.whl (766.1 kB)

Uploaded Python 3

File details

Details for the file totokenizers-1.3.0.tar.gz.

File metadata

  • Download URL: totokenizers-1.3.0.tar.gz
  • Upload date:
  • Size: 755.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.11.9

File hashes

Hashes for totokenizers-1.3.0.tar.gz

  • SHA256: 9b154f870bd846e546e605f7215abe6957b189a91626918461e8b1c0df919ef9
  • MD5: 679a4ac9fc7be7d4ad8e149bcd32a898
  • BLAKE2b-256: 8cdaacd716b0d70181c1a788985d829ed9818513347946623bbb3d4285992c43

See more details on using hashes here.
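For instance, a downloaded file can be checked against the published SHA256 digest using only Python's standard library. This is a minimal sketch; the filename and digest in the comment are the ones listed above for the source distribution:

```python
import hashlib

def sha256_of(path: str) -> str:
    """Return the hex-encoded SHA256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Compare against the digest published above for the sdist, e.g.:
# sha256_of("totokenizers-1.3.0.tar.gz") == (
#     "9b154f870bd846e546e605f7215abe6957b189a91626918461e8b1c0df919ef9"
# )
```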


File details

Details for the file totokenizers-1.3.0-py3-none-any.whl.

File metadata

  • Download URL: totokenizers-1.3.0-py3-none-any.whl
  • Upload date:
  • Size: 766.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.11.9

File hashes

Hashes for totokenizers-1.3.0-py3-none-any.whl

  • SHA256: 47309a1122809056895e1a57b683a51a3401ba92837e78412c2d43c305554ca6
  • MD5: 2a3ca0e1b356c3b69034c45e58815876
  • BLAKE2b-256: 7332277ddbf6601e9fd8312ee9a12919f71c1f771972a24f43751b8e133d5262


