Package for evaluating text tokenizations.

Project description

A simple package for evaluating text tokenizations. The input is tokenized text (a list of files or stdin) and the output is a single number. The higher the number, the better the tokenization. The intended workflow is to try multiple tokenizations and select the one with the highest number.
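
The selection workflow can be sketched as follows. This is only an illustration under stated assumptions: this page does not document the package's API or its scoring metric, so `placeholder_score` is a hypothetical stand-in (normalized Shannon entropy of the token unigram distribution), and `select_best` and the candidate names are likewise made up for the example. Tokens are assumed to be whitespace-separated.

```python
import math
from collections import Counter


def placeholder_score(tokenized_lines):
    """Hypothetical stand-in scorer, NOT the package's metric.

    Returns the Shannon entropy of the token unigram distribution
    divided by the log of the vocabulary size, a value in [0, 1].
    """
    counts = Counter(tok for line in tokenized_lines for tok in line.split())
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(len(counts))


def select_best(candidates, score_fn=placeholder_score):
    """Score every candidate tokenization and return the highest-scoring name."""
    scores = {name: score_fn(lines) for name, lines in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Two toy tokenizations of the same text (tokens separated by spaces).
candidates = {
    "words": ["the cat sat", "the cat ran"],
    "chars": ["t h e c a t s a t", "t h e c a t r a n"],
}

best, scores = select_best(candidates)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
print("selected:", best)
```

In practice the placeholder would be replaced by the score the package reports for each tokenized file; the comparison step stays the same.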

Download files

Source Distribution

tokenization-scorer-1.0.0.tar.gz (3.3 kB)

Uploaded Source

Built Distribution

tokenization_scorer-1.0.0-py3-none-any.whl (3.9 kB)

Uploaded Python 3

File details

Details for the file tokenization-scorer-1.0.0.tar.gz.

File metadata

  • Download URL: tokenization-scorer-1.0.0.tar.gz
  • Upload date:
  • Size: 3.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.7

File hashes

Hashes for tokenization-scorer-1.0.0.tar.gz
Algorithm Hash digest
SHA256 b175cd124e41ad4a2e3c60a0b1706d133e77ce59fcc87cf5dbec7029f73fb528
MD5 2fd0d451849d70047339dcc191c505dd
BLAKE2b-256 bd60ffebac497789a8157f3894a63beea51f317d1cf1112c4a7f7deb102db431

File details

Details for the file tokenization_scorer-1.0.0-py3-none-any.whl.

File hashes

Hashes for tokenization_scorer-1.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 c985762c75b9d0f482ac3038e7cf9da9114998345ef8d1cc98b27928a558bd49
MD5 b549601aa07a0f90066969a6bdbb72c6
BLAKE2b-256 82110d34eac08c5c0d1c49ad1cd85115a690fabd16871f7a5fd5776d38782b42
