
Subword tokenizer

Project description

Genz Tokenize


Install via pip (from PyPI):

pip install genz-tokenize

Usage

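from genz_tokenize import Tokenize

# Build the tokenizer from a vocabulary file and a BPE codes file.
tokenize = Tokenize('vocab.txt', 'bpe.codes')

# Tokenize a batch of two strings, passing maxlen=10.
print(tokenize(['sinh_viên công_nghệ', 'hello'], maxlen=10))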
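To make the idea of subword tokenization with a fixed output length concrete, here is a toy sketch in plain Python. It is not the genz-tokenize implementation: the helper names (`apply_bpe`, `encode`), the merge rules, the vocabulary, and the padding and unknown ids are all hypothetical, and details that real BPE tools handle (such as end-of-word markers) are left out. It assumes that `maxlen` means "truncate or pad each sequence to this length", which the snippet above does not confirm.

```python
from typing import Dict, List, Tuple


def apply_bpe(word: str, merges: List[Tuple[str, str]]) -> List[str]:
    """Split a word into characters, then apply merge rules in priority order."""
    symbols = list(word)
    for left, right in merges:
        merged: List[str] = []
        i = 0
        while i < len(symbols):
            # Merge every adjacent occurrence of the pair (left, right).
            if i + 1 < len(symbols) and symbols[i] == left and symbols[i + 1] == right:
                merged.append(left + right)
                i += 2
            else:
                merged.append(symbols[i])
                i += 1
        symbols = merged
    return symbols


def encode(
    texts: List[str],
    merges: List[Tuple[str, str]],
    vocab: Dict[str, int],
    maxlen: int,
    pad_id: int = 0,  # hypothetical padding id
    unk_id: int = 1,  # hypothetical id for pieces missing from the vocabulary
) -> List[List[int]]:
    """Map each text to subword ids, truncated or padded to exactly maxlen."""
    batch = []
    for text in texts:
        ids = [
            vocab.get(piece, unk_id)
            for word in text.split()
            for piece in apply_bpe(word, merges)
        ]
        ids = ids[:maxlen]
        ids += [pad_id] * (maxlen - len(ids))
        batch.append(ids)
    return batch


# Toy merge rules and vocabulary, made up for illustration only.
merges = [("h", "e"), ("l", "l"), ("he", "ll"), ("hell", "o")]
vocab = {"<pad>": 0, "<unk>": 1, "hello": 2, "he": 3, "ll": 4}

toy_batch = encode(["hello", "hello help"], merges, vocab, maxlen=4)
print(toy_batch)  # [[2, 0, 0, 0], [2, 3, 1, 1]]
```

In this sketch "hello" collapses to a single known piece, while "help" falls back to `he` plus two unknown characters, which shows why a subword vocabulary can cover words it has never seen as a whole.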

Download files


Source Distribution

genz-tokenize-1.0.4.tar.gz (411.6 kB)

Uploaded Source

Built Distribution

genz_tokenize-1.0.4-py3-none-any.whl (413.7 kB)

Uploaded Python 3

File details

Details for the file genz-tokenize-1.0.4.tar.gz.

File metadata

  • Download URL: genz-tokenize-1.0.4.tar.gz
  • Size: 411.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/4.10.0 pkginfo/1.8.2 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.7.9

File hashes

Hashes for genz-tokenize-1.0.4.tar.gz
Algorithm    Hash digest
SHA256       3db381cc219a07519f31f0a78a71b9d32e32bc15c230d74b0f7ab9d00793891a
MD5          88b51c5191dade5e8f177b6c1d0d6754
BLAKE2b-256  7ce3de1033ac8f9f89e90ab72f9ad31fa751258cfde8518957ce7f8c8ce8eaa0
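A downloaded archive can be checked against the published SHA256 digest before installing it. The following is a minimal sketch using only the Python standard library. The helper names (`sha256_of`, `matches_published_hash`) are hypothetical, and it assumes the source distribution was saved to the current directory under its original file name.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()


# SHA256 digest listed above for genz-tokenize-1.0.4.tar.gz.
EXPECTED_SDIST_SHA256 = (
    "3db381cc219a07519f31f0a78a71b9d32e32bc15c230d74b0f7ab9d00793891a"
)

if __name__ == "__main__":
    sdist = Path("genz-tokenize-1.0.4.tar.gz")  # assumed download location
    if sdist.exists():
        print("hash matches:", matches_published_hash(sdist, EXPECTED_SDIST_SHA256))
    else:
        print("archive not found; download it first")
```

SHA256 is the digest to rely on here; MD5 is listed for completeness but is not collision resistant.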


File details

Details for the file genz_tokenize-1.0.4-py3-none-any.whl.

File metadata

  • Download URL: genz_tokenize-1.0.4-py3-none-any.whl
  • Size: 413.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/4.10.0 pkginfo/1.8.2 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.7.9

File hashes

Hashes for genz_tokenize-1.0.4-py3-none-any.whl
Algorithm    Hash digest
SHA256       6aea1679ded77a9b3ae8b827862d87f5c7b33332d27158ebbf0197dbc7427e90
MD5          e0845b8f0e1af10906d09ec3e40329c6
BLAKE2b-256  e302587640985870bee9807f18f1f01ff1dbab7fa8f9aa4349c531f62bb25f03

