bytepiece-rs Python binding
bytepiece-rs
Install
pip install rs_bytepiece
Usage
from rs_bytepiece import Tokenizer
tokenizer = Tokenizer()
ids = tokenizer.encode("今天天气不错")
text = tokenizer.decode(ids)
Performance
It runs somewhat faster than the original implementation. I tested it on《鲁迅全集》(The Complete Works of Lu Xun), which contains 625,890 characters.
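A comparison like this can be reproduced with a small timing harness. The following is a minimal sketch, not the benchmark the author ran: `chars_per_second` and `utf8_encode` are hypothetical helpers introduced here, and `utf8_encode` is only a stand-in so the snippet runs without `rs_bytepiece` installed. To time the real tokenizer, pass `Tokenizer().encode` in its place.

```python
import time
from typing import Callable, Sequence


def chars_per_second(
    encode: Callable[[str], Sequence[int]], text: str, repeats: int = 5
) -> float:
    """Return the best observed encoding throughput in characters per second."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        encode(text)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    # Guard against a zero reading on very coarse timers.
    return len(text) / max(best, 1e-9)


def utf8_encode(text: str) -> list:
    """Stand-in encoder (raw UTF-8 bytes); replace with a real tokenizer's encode."""
    return list(text.encode("utf-8"))


if __name__ == "__main__":
    sample = "今天天气不错" * 1000
    print(f"{chars_per_second(utf8_encode, sample):.0f} chars/s")
```

Taking the best of several runs reduces noise from other processes. For a fair comparison, run both implementations on the same text and the same machine.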
Download files
Source Distributions
No source distribution files are available for this release.
Built Distributions
Hashes for rs_bytepiece-0.0.2-cp37-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm | Hash digest
---|---
SHA256 | c0681e4c6cef345aca560163b5d1fe184e04f965d03c743c1b8cc8939414b056
MD5 | c089e14ab2e2ac219ac9d43124f08945
BLAKE2b-256 | f44b3fd8daa56e6757280a0c8a60b8d8f374fc0c32a0d0ab69c9cf312dac6668
Hashes for rs_bytepiece-0.0.2-cp37-abi3-macosx_10_7_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 65364d8213af7d0d210e6437984682bb25f7c10edf46d69491826ce55a97ba17
MD5 | 708526d8af14479884b5496f3846d609
BLAKE2b-256 | a2395b35603f2c1e271a65696dd0f3d37f34fbb3a587c02806239c48d2a4dc9f
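A manually downloaded wheel can be checked against the digests listed above. The following is a minimal sketch using only the standard library. `file_digests` and `matches_sha256` are hypothetical helper names introduced here, and the wheel path in the usage note assumes the file is in the current directory.

```python
import hashlib


def file_digests(path: str, chunk_size: int = 1 << 20) -> dict:
    """Compute SHA256 and BLAKE2b-256 hex digests of a file, reading in chunks."""
    hashers = {
        "sha256": hashlib.sha256(),
        # BLAKE2b-256 is BLAKE2b with a 32-byte (256-bit) digest.
        "blake2b-256": hashlib.blake2b(digest_size=32),
    }
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def matches_sha256(path: str, expected: str) -> bool:
    """Return True if the file's SHA256 digest equals the expected hex string."""
    return file_digests(path)["sha256"] == expected.strip().lower()
```

For example, `matches_sha256("rs_bytepiece-0.0.2-cp37-abi3-macosx_10_7_x86_64.whl", "65364d8213af7d0d210e6437984682bb25f7c10edf46d69491826ce55a97ba17")` should return `True` for an intact download of the macOS wheel.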