
Package for evaluating text tokenizations.

Project description

A simple package for evaluating text tokenizations. The input is a text (a list of files or stdin) and the output is a single number. The higher the number, the better the tokenization. The intended workflow is to try multiple tokenizations and select the one with the highest number.
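The "try several tokenizations, keep the highest-scoring one" workflow can be sketched as follows. This is only an illustration under stated assumptions: `select_best` and `placeholder_score` are hypothetical names introduced here, and the placeholder scorer (normalized Shannon entropy of the token distribution) is a stand-in, not the metric this package computes. In practice the scoring callable would wrap whatever number the package reports.

```python
import math
from collections import Counter
from typing import Callable, Dict, List


def placeholder_score(tokens: List[str]) -> float:
    """Hypothetical stand-in scorer, NOT this package's metric.

    Returns the Shannon entropy of the unigram token distribution,
    divided by its maximum possible value, so the result lies in [0, 1].
    """
    counts = Counter(tokens)
    if len(counts) < 2:
        # With zero or one distinct token the entropy is zero.
        return 0.0
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(len(counts))


def select_best(
    candidates: Dict[str, List[str]],
    score_fn: Callable[[List[str]], float],
) -> str:
    """Return the name of the candidate tokenization with the highest score."""
    return max(candidates, key=lambda name: score_fn(candidates[name]))


if __name__ == "__main__":
    # Two toy tokenizations of the same text (assumed example data).
    candidates = {
        "words": "the cat sat on the mat".split(),
        "chars": list("thecatsatonthemat"),
    }
    for name, tokens in candidates.items():
        print(name, placeholder_score(tokens))
    print("selected:", select_best(candidates, placeholder_score))
```

Passing the scorer as a parameter keeps the selection step independent of how the number is produced, so the placeholder can be swapped for the real score without changing the rest of the loop.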

Download files


Source Distribution

tokenization-scorer-1.0.1.tar.gz (3.3 kB view details)

Uploaded Source

Built Distribution


tokenization_scorer-1.0.1-py3-none-any.whl (3.9 kB view details)

Uploaded Python 3

File details

Details for the file tokenization-scorer-1.0.1.tar.gz.

File metadata

  • Download URL: tokenization-scorer-1.0.1.tar.gz
  • Upload date:
  • Size: 3.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.7

File hashes

Hashes for tokenization-scorer-1.0.1.tar.gz
Algorithm Hash digest
SHA256 a5e63f6b3ea9265b254d8f46e367db48e48c6d56f2327aec5181ba89dca35ef6
MD5 1218556a7b8525c6f6989341a701664e
BLAKE2b-256 3bbb823d02d5180ea53343b6a35699347efdc385ba49565f99f20944e498bbbe
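A downloaded file can be checked against the SHA256 digest listed above. The following is a minimal sketch using only the Python standard library; the helper names `sha256_of_file` and `matches_expected` are introduced here for illustration, and the local file path is an assumption about where the archive was saved.

```python
import hashlib
import os

# SHA256 digest listed above for tokenization-scorer-1.0.1.tar.gz.
EXPECTED_SHA256 = "a5e63f6b3ea9265b254d8f46e367db48e48c6d56f2327aec5181ba89dca35ef6"


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected_hex: str) -> bool:
    """Compare a file's digest with an expected hex string (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Assumed location: the archive sits in the current working directory.
    path = "tokenization-scorer-1.0.1.tar.gz"
    if os.path.exists(path):
        print("digest matches:", matches_expected(path, EXPECTED_SHA256))
    else:
        print("file not found:", path)
```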


File details

Details for the file tokenization_scorer-1.0.1-py3-none-any.whl.

File hashes

Hashes for tokenization_scorer-1.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 7ebdd4ae7f85ad9ad7697ae2eef8133f18889afada8aa2cfaa4d626f62a48567
MD5 7fe57cab03b85ae372979e65c1d63363
BLAKE2b-256 17aa0483b61df117abe6e0454fa46c82a72dbae54aa3a39122d173ad0f7bf506

