Text tokenizers.

Project description

totokenizers

A model-agnostic library to encode text into tokens and count them using different tokenizers.

install

pip install totokenizers

usage

from totokenizers.factories import TotoModelInfo, Totokenizer

model = "openai/gpt-3.5-turbo-0613"
desired_max_tokens = 250
tokenizer = Totokenizer.from_model(model)
model_info = TotoModelInfo.from_model(model)

thread_length = tokenizer.count_chatml_tokens(thread, functions)
if thread_length + desired_max_tokens > model_info.max_tokens:
    raise YourException(thread_length, desired_max_tokens, model_info.max_tokens)
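The snippet above assumes that a thread, a list of function schemas, and an exception class already exist in your code. As a minimal, hypothetical sketch (the message and function shapes below follow the OpenAI chat/function-calling format and are an assumption, not part of totokenizers' documented API), they could look like this:

# Hypothetical inputs for the usage snippet above. The exact schema that
# count_chatml_tokens expects may differ; this is only an illustrative guess.
thread = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize the release notes for v1.2.1."},
]
functions = [
    {
        "name": "get_release_notes",  # hypothetical function schema
        "description": "Fetch release notes for a given version.",
        "parameters": {
            "type": "object",
            "properties": {"version": {"type": "string"}},
            "required": ["version"],
        },
    },
]

class YourException(Exception):
    """Placeholder for whatever error your application raises when the prompt
    would not leave enough room for the desired completion."""

YourException is simply a stand-in for your own error type; the library only reports the token counts, and deciding what to do when the budget is exceeded is up to the caller.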

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

totokenizers-1.2.1.tar.gz (754.0 kB)

Uploaded Source

Built Distribution

totokenizers-1.2.1-py3-none-any.whl (764.2 kB)

Uploaded Python 3

File details

Details for the file totokenizers-1.2.1.tar.gz.

File metadata

  • Download URL: totokenizers-1.2.1.tar.gz
  • Upload date:
  • Size: 754.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.7

File hashes

Hashes for totokenizers-1.2.1.tar.gz
Algorithm Hash digest
SHA256 59e61c9a9459e56276a601e49fa82d3d72c3b0c61ea9045b3077090226777086
MD5 0faefd0132d2ad57888e0dc2b6812679
BLAKE2b-256 11b64e186add05a6f2bf0ac1eb6ccba9e319b70596111ca3813b755ee0ea5968

See more details on using hashes here.
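If you want to check a downloaded file against the hashes listed above, a short Python sketch such as the following works; it assumes the sdist has been downloaded into the current directory under its published filename.

import hashlib

# Compare the local file's SHA256 digest against the value published above.
expected = "59e61c9a9459e56276a601e49fa82d3d72c3b0c61ea9045b3077090226777086"
with open("totokenizers-1.2.1.tar.gz", "rb") as f:
    digest = hashlib.sha256(f.read()).hexdigest()
print("OK" if digest == expected else "hash mismatch")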

Provenance

File details

Details for the file totokenizers-1.2.1-py3-none-any.whl.

File metadata

  • Download URL: totokenizers-1.2.1-py3-none-any.whl
  • Upload date:
  • Size: 764.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.7

File hashes

Hashes for totokenizers-1.2.1-py3-none-any.whl
Algorithm Hash digest
SHA256 22a24eb4332bba3743629eec69565c1d032a7fda4f215b30f420a5e51db0eb2d
MD5 df9f3961cb6792ed1adfa7529299f756
BLAKE2b-256 52a55b2568f18e24e7e53a2d2c5d681fc8252135dbc02653894aeeb9fd8e54b0

See more details on using hashes here.

Provenance
