A Python tokenizer trained on a modern web corpus
Project description
BTok
A multilingual Python tokenizer trained with SentencePiece on a modern web corpus.
Install
pip install btok --upgrade
Usage
Run tests:
python tests.py
See: tests.py
Download files
Source Distribution
btok-0.2.tar.gz (7.1 kB)
Built Distribution
btok-0.2-py3-none-any.whl (7.2 kB)
File details
Details for the file btok-0.2.tar.gz.
File metadata
- Download URL: btok-0.2.tar.gz
- Upload date:
- Size: 7.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.9.22
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d41aae65122f120b6f560b424b0bb244d0416f506904b728401aca4837620bdc |
| MD5 | f59998d1d372716e3c5953b4920e891b |
| BLAKE2b-256 | 2fdeb54b53bf3c5268db71a40d1818e517fd66a3a429248c8b6c1c674d6ea95e |
File details
Details for the file btok-0.2-py3-none-any.whl.
File metadata
- Download URL: btok-0.2-py3-none-any.whl
- Upload date:
- Size: 7.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.9.22
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c1156e386cee7c45af0dc85587c96b309c7ec9a85f46fc850a189f35b83ab632 |
| MD5 | 94636b6b21112c12aea6cb1286d83d03 |
| BLAKE2b-256 | 685864ed0850a07aea850fbd92006c6f25ae4668e24fe2213835a2ece5ebb8a7 |
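
The SHA256 digests listed under File hashes can be used to check a downloaded file before installing it. The following is a minimal sketch using only Python's standard `hashlib` module. The helper names `sha256_of` and `matches_published` and the local file path are assumptions made for illustration and are not part of btok.

```python
import hashlib

# SHA256 digest listed above for btok-0.2.tar.gz
EXPECTED_SHA256 = "d41aae65122f120b6f560b424b0bb244d0416f506904b728401aca4837620bdc"


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path` (hypothetical helper)."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected digest."""
    return sha256_of(path) == expected.lower()
```

Assuming the source distribution was saved to the current directory, `matches_published("btok-0.2.tar.gz")` should return `True` for an intact download. To check the wheel instead, pass its SHA256 digest as `expected`.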