Text tokenizers.

Project description

totokenizers

A model-agnostic library to encode text into tokens and count them using different tokenizers.

install

pip install totokenizers

usage

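from totokenizers.factories import TotoModelInfo, Totokenizer

model = "openai/gpt-3.5-turbo-0613"
desired_max_tokens = 250  # tokens to reserve for the model's reply
tokenizer = Totokenizer.from_model(model)
model_info = TotoModelInfo.from_model(model)

# `thread` (the chat messages) and `functions` (the function definitions)
# are supplied by your application; they are not defined in this snippet.
thread_length = tokenizer.count_chatml_tokens(thread, functions)
# Fail early if the prompt plus the reserved reply would exceed the model's limit.
if thread_length + desired_max_tokens > model_info.max_tokens:
    raise YourException(thread_length, desired_max_tokens, model_info.max_tokens)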
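
The budget check in the usage snippet can be wrapped in a small helper. The following is a minimal sketch, not part of totokenizers: `TokenBudgetExceeded` and `check_token_budget` are hypothetical names, with `TokenBudgetExceeded` standing in for the `YourException` placeholder. The helper takes plain integers, so in real use the values would come from `tokenizer.count_chatml_tokens(...)` and `model_info.max_tokens`.

```python
class TokenBudgetExceeded(Exception):
    """Hypothetical exception; stands in for the YourException placeholder."""

    def __init__(self, thread_length: int, desired_max_tokens: int, max_tokens: int) -> None:
        self.thread_length = thread_length
        self.desired_max_tokens = desired_max_tokens
        self.max_tokens = max_tokens
        overflow = thread_length + desired_max_tokens - max_tokens
        super().__init__(
            f"prompt ({thread_length}) + completion ({desired_max_tokens}) "
            f"exceeds the model limit ({max_tokens}) by {overflow} tokens"
        )


def check_token_budget(thread_length: int, desired_max_tokens: int, max_tokens: int) -> int:
    """Return the tokens left over after reserving room for the completion.

    Raises TokenBudgetExceeded when the prompt plus the requested completion
    would not fit within max_tokens (same condition as the usage snippet).
    """
    remaining = max_tokens - thread_length - desired_max_tokens
    if remaining < 0:
        raise TokenBudgetExceeded(thread_length, desired_max_tokens, max_tokens)
    return remaining


# Illustrative numbers only; real values come from the tokenizer and model info.
print(check_token_budget(3000, 250, 4096))  # 846
```

Returning the remaining headroom rather than a boolean lets the caller decide how much more context it can still add before sending the request.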

Download files

Source Distribution

totokenizers-1.2.6.tar.gz (755.4 kB)

Uploaded Source

Built Distribution

totokenizers-1.2.6-py3-none-any.whl (765.9 kB)

Uploaded Python 3

File details

Details for the file totokenizers-1.2.6.tar.gz.

File metadata

  • Download URL: totokenizers-1.2.6.tar.gz
  • Upload date:
  • Size: 755.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.11.9

File hashes

Hashes for totokenizers-1.2.6.tar.gz
Algorithm     Hash digest
SHA256        a7136a36c72a1795acca8b40a695a57c5269b278a12e8871ac69be869a1e5fa5
MD5           b136f35296d6385adb983ad4dc1b6d46
BLAKE2b-256   8f3ea15fde898e8decc65de9872e4f79dbd5b3556054b65c0736e363a04ea3f6

File details

Details for the file totokenizers-1.2.6-py3-none-any.whl.

File metadata

  • Download URL: totokenizers-1.2.6-py3-none-any.whl
  • Upload date:
  • Size: 765.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.11.9

File hashes

Hashes for totokenizers-1.2.6-py3-none-any.whl
Algorithm     Hash digest
SHA256        82446143329d2b48f3318ecdff6143baa73613ee91791dda70977ed9ca44f65b
MD5           30c08b47a93fdad3a18f812b1cd1e509
BLAKE2b-256   ec6bac27f9163b93f702f57d6c2a6532bad8aeb6e038ecc81900dd2644195242
