
Text tokenizers.

Project description

totokenizers

A model-agnostic library to encode text into tokens and count them using different tokenizers.

install

pip install totokenizers

usage

from totokenizers.factories import TotoModelInfo, Totokenizer

model = "openai/gpt-3.5-turbo-0613"
desired_max_tokens = 250
tokenizer = Totokenizer.from_model(model)
model_info = TotoModelInfo.from_model(model)

# thread: your ChatML-style list of messages; functions: your function schemas (both defined by you)
thread_length = tokenizer.count_chatml_tokens(thread, functions)
if thread_length + desired_max_tokens > model_info.max_tokens:
    raise YourException(thread_length, desired_max_tokens, model_info.max_tokens)
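The snippet above leaves `thread` and `functions` undefined. A minimal sketch of what they might look like, assuming a ChatML-style message list and an OpenAI-style function schema (the message contents and the `get_weather` schema below are illustrative placeholders, not part of the library):

from totokenizers.factories import Totokenizer

model = "openai/gpt-3.5-turbo-0613"
tokenizer = Totokenizer.from_model(model)

# Illustrative ChatML-style thread; roles and contents are placeholders.
thread = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize this document for me."},
]

# Illustrative function schema (OpenAI function-calling format); optional.
functions = [
    {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }
]

print(tokenizer.count_chatml_tokens(thread, functions))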



Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

totokenizers-1.2.0.tar.gz (753.9 kB)

Uploaded Source

Built Distribution

totokenizers-1.2.0-py3-none-any.whl (764.1 kB)

Uploaded Python 3

File details

Details for the file totokenizers-1.2.0.tar.gz.

File metadata

  • Download URL: totokenizers-1.2.0.tar.gz
  • Upload date:
  • Size: 753.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.7

File hashes

Hashes for totokenizers-1.2.0.tar.gz
  • SHA256: 886e474521d43dc2fb7e91a63b2587e0be3c933b1cf033d13dd9617344c38fff
  • MD5: 379a5a0bac1dd24f9ec90f8987fb840e
  • BLAKE2b-256: 349fbae1af8fd84f577b9a0fa17149e30bdfc0c2dd73314e03005df25e89b96d

See more details on using hashes here.
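To check a downloaded archive against these digests, here is a minimal sketch using only Python's standard library; the filename and expected value are taken from the sdist listed above:

import hashlib

# Compute the SHA256 digest of the downloaded sdist and compare it to the published value.
with open("totokenizers-1.2.0.tar.gz", "rb") as f:
    digest = hashlib.sha256(f.read()).hexdigest()

expected = "886e474521d43dc2fb7e91a63b2587e0be3c933b1cf033d13dd9617344c38fff"
assert digest == expected, "SHA256 mismatch: file is corrupted or not the published release"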

Provenance

File details

Details for the file totokenizers-1.2.0-py3-none-any.whl.

File metadata

  • Download URL: totokenizers-1.2.0-py3-none-any.whl
  • Upload date:
  • Size: 764.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.7

File hashes

Hashes for totokenizers-1.2.0-py3-none-any.whl
  • SHA256: 1a03f532414f9385850df759f0473717c01c3910bb23018a70dcab8f828fbbf7
  • MD5: 3eef3df4b227cc87bb65136c0f3e5260
  • BLAKE2b-256: ed5c7673e0402e9b861b3ab3ee6d5e1bef32bebc2ac828a67657dc2f129d95d2

See more details on using hashes here.

Provenance
