bytepiece-rs Python binding

rs-bytepiece

Install

pip install rs_bytepiece

Usage

from rs_bytepiece import Tokenizer

tokenizer = Tokenizer()
ids = tokenizer.encode("今天天气不错")
text = tokenizer.decode(ids)

Performance

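The Rust implementation is faster than the original one: in the table below, aho_rs takes roughly half the time of aho_cy and about a fifth of the time of aho_py. I tested it on《鲁迅全集》(The Complete Works of Lu Xun), which has 625,890 characters. All times are in milliseconds.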

length    jieba (ms)    aho_py (ms)    aho_cy (ms)    aho_rs (ms)
100       17062.12      1404.37        564.31         299.09
1000      17104.38      1424.60        573.32         281.84
10000     17432.58      1429.00        574.93         293.16
100000    17228.17      1401.01        574.50         280.81
625890    17305.95      1419.79        567.78         282.35
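
The benchmark script itself is not shown here, so the following is only a minimal sketch of how timings like those above might be collected. It assumes that "length" means the text is fed to the tokenizer in consecutive pieces of that many characters and that the total time is reported; that reading is a guess. The helpers `best_time_ms`, `chunk`, and `benchmark` are hypothetical names and not part of rs_bytepiece, and a simple UTF-8 byte encoder stands in for a real tokenizer so the sketch runs without any model.

```python
import time
from typing import Callable, Dict, List, Sequence


def best_time_ms(fn: Callable[[], object], repeats: int = 3) -> float:
    """Run fn() `repeats` times and return the best wall-clock time in ms."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        best = min(best, elapsed_ms)
    return best


def chunk(text: str, length: int) -> List[str]:
    """Split text into consecutive pieces of at most `length` characters."""
    return [text[i:i + length] for i in range(0, len(text), length)]


def benchmark(
    encode: Callable[[str], object],
    text: str,
    lengths: Sequence[int],
    repeats: int = 3,
) -> Dict[int, float]:
    """Time encoding the whole text, piece by piece, for each piece length."""
    results: Dict[int, float] = {}
    for length in lengths:
        pieces = chunk(text, length)
        # Assumption: the reported figure is the total time over all pieces.
        results[length] = best_time_ms(
            lambda: [encode(p) for p in pieces], repeats
        )
    return results


if __name__ == "__main__":
    # Stand-in encoder (hypothetical): raw UTF-8 bytes, not a real tokenizer.
    def stand_in_encode(s: str) -> List[int]:
        return list(s.encode("utf-8"))

    sample = "今天天气不错" * 1000
    for length, ms in benchmark(stand_in_encode, sample, [100, 1000]).items():
        print(f"{length:>8} {ms:10.2f} ms")
```

To time the real tokenizer, `tokenizer.encode` from the Usage section could be passed in place of the stand-in encoder.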
