Text tokenizers.

Project description

totokenizers

A model-agnostic library to encode text into tokens and count them using different tokenizers.

install

pip install totokenizers

usage

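from totokenizers.factories import TotoModelInfo, Totokenizer

model = "openai/gpt-3.5-turbo-0613"
desired_max_tokens = 250  # tokens reserved for the completion
tokenizer = Totokenizer.from_model(model)
model_info = TotoModelInfo.from_model(model)

# `thread` (the chat messages) and `functions` (the function definitions)
# are supplied by your application.
thread_length = tokenizer.count_chatml_tokens(thread, functions)
if thread_length + desired_max_tokens > model_info.max_tokens:
    # YourException is a placeholder for your own exception class.
    raise YourException(thread_length, desired_max_tokens, model_info.max_tokens)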
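
The budget check in the usage snippet can be wrapped in a small helper. The following is a minimal sketch, not part of totokenizers: `TokenBudgetExceeded` and `ensure_fits` are hypothetical names introduced here for illustration, and the helper only takes plain integers so it does not depend on the library.

```python
class TokenBudgetExceeded(Exception):
    """Hypothetical exception; stands in for YourException in the usage snippet."""

    def __init__(self, thread_length, desired_max_tokens, max_tokens):
        self.thread_length = thread_length
        self.desired_max_tokens = desired_max_tokens
        self.max_tokens = max_tokens
        super().__init__(
            f"prompt uses {thread_length} tokens and the completion needs "
            f"{desired_max_tokens}, but the model allows only {max_tokens}"
        )


def ensure_fits(thread_length, desired_max_tokens, max_tokens):
    """Return the number of spare tokens, or raise if the request does not fit."""
    remaining = max_tokens - thread_length - desired_max_tokens
    if remaining < 0:
        # Same condition as: thread_length + desired_max_tokens > max_tokens
        raise TokenBudgetExceeded(thread_length, desired_max_tokens, max_tokens)
    return remaining
```

With the objects from the usage snippet, this would be called as `ensure_fits(thread_length, desired_max_tokens, model_info.max_tokens)`. Returning the spare token count is a design choice of this sketch; it lets the caller decide whether more context can still be added.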

Download files

Source Distribution

totokenizers-1.2.5.4.tar.gz (755.4 kB view details)

Uploaded Source

Built Distribution

totokenizers-1.2.5.4-py3-none-any.whl (765.9 kB view details)

Uploaded Python 3

File details

Details for the file totokenizers-1.2.5.4.tar.gz.

File metadata

  • Download URL: totokenizers-1.2.5.4.tar.gz
  • Upload date:
  • Size: 755.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.11.9

File hashes

Hashes for totokenizers-1.2.5.4.tar.gz
Algorithm Hash digest
SHA256 e5ffa847432d3ac3e0b3611621a3de26f9c46118ce704a943b5b47555922828f
MD5 83e80144bcc3e391c992cda1233a1e3b
BLAKE2b-256 5a92be1f2f219c97554d7b40a06a8f3e04e22adeeea5df738c47c327ffbc066f
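
A downloaded file can be checked against the SHA256 digest listed above. The following is a minimal sketch using only the Python standard library; the function names and the local file path are assumptions made for illustration.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Return True if the file's SHA256 digest equals the published one."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches_published_hash("totokenizers-1.2.5.4.tar.gz", "e5ffa847432d3ac3e0b3611621a3de26f9c46118ce704a943b5b47555922828f")` should return True for an intact copy of the source distribution, assuming the file sits in the current directory.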

File details

Details for the file totokenizers-1.2.5.4-py3-none-any.whl.

File hashes

Hashes for totokenizers-1.2.5.4-py3-none-any.whl
Algorithm Hash digest
SHA256 ba65b548ef500ed3aba3eaa32172c07afcdb86a00474f0f7d00a74f18653120b
MD5 79a007e846fd11bef6d45c5aeab895fa
BLAKE2b-256 f2b3c9af912f467c7fd3417d4f29632cca90e6b543ba14a0d77f3061a1adec86
