A Python tokenizer trained on a modern web corpus
BTok
A multilingual Python tokenizer trained with SentencePiece on a modern web corpus.
Install
pip install btok --upgrade
Usage
Run tests:
python tests.py
See: tests.py
Download files
Source Distribution
btok-0.3.1.tar.gz (9.4 kB)
Built Distribution
btok-0.3.1-py3-none-any.whl (10.4 kB)
File details
Details for the file btok-0.3.1.tar.gz.
File metadata
- Download URL: btok-0.3.1.tar.gz
- Upload date:
- Size: 9.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.25
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 09527c6b295e0ce3285d054fed4231ccb9c3ea311d71bd2b7b326880f4044e82 |
| MD5 | 91f15304711f138da54855c99c4915ed |
| BLAKE2b-256 | 38f3d7479008d99171e0463ae1641d2f348ffbddb36545c11f956bc12ba42f11 |
File details
Details for the file btok-0.3.1-py3-none-any.whl.
File metadata
- Download URL: btok-0.3.1-py3-none-any.whl
- Upload date:
- Size: 10.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.25
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f9b5136557346c03e78dd40b967a7aadfb27eb657b07def920abaf8770bef6f1 |
| MD5 | a2c75f5730ed207bb378b678de15e0c9 |
| BLAKE2b-256 | 987b6f33b40a63acd47821b545e993c932e645ac809f0292fc4f11df6026cfa5 |
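
Verifying a download

The published digests can be checked against a file you downloaded manually. The following is a minimal sketch using only the Python standard library; it is not part of btok. The helper names `sha256_of` and `matches_published_digest` are hypothetical, and the archive path assumes the source distribution was saved to the current directory.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed on this page for btok-0.3.1.tar.gz.
EXPECTED_SHA256 = "09527c6b295e0ce3285d054fed4231ccb9c3ea311d71bd2b7b326880f4044e82"


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected=EXPECTED_SHA256):
    """Compare the file's digest with the expected one, ignoring letter case."""
    return sha256_of(path) == expected.lower()


if __name__ == "__main__":
    # Assumed download location; adjust to wherever the archive was saved.
    archive = Path("btok-0.3.1.tar.gz")
    if archive.exists():
        print("OK" if matches_published_digest(archive) else "MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

To check the wheel instead, pass its path and its SHA256 digest as the `expected` argument.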