rtok
A Python tokenizer for LLMs using GitHub's linear BPE implementation.
| Input | rtok | tiktoken |
|---|---|---|
| ebook (90k) | 0.12907366300350986 | 0.23613008800020907 |
| a*100 | 7.582298712804914e-05 | 0.00018489900685381144 |
| a*1000 | 0.00013318200944922864 | 0.003490732007776387 |
| a*10000 | 0.0013674889924004674 | 0.3407805879978696 |
| a*100000 | 0.01401260400598403 | 33.41563105300884 |
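One way to read the repeated-character timings above is to look at how much each run grows when the input gets ten times longer. The sketch below only does arithmetic on the numbers already listed; the helper name `growth_ratios` is made up for illustration and is not part of rtok.

```python
# Timings copied verbatim from the benchmark listing above
# (inputs a*100, a*1000, a*10000, a*100000; units as reported there).
SIZES = [100, 1_000, 10_000, 100_000]
RTOK = [
    7.582298712804914e-05,
    0.00013318200944922864,
    0.0013674889924004674,
    0.01401260400598403,
]
TIKTOKEN = [
    0.00018489900685381144,
    0.003490732007776387,
    0.3407805879978696,
    33.41563105300884,
]


def growth_ratios(timings):
    """Return later/earlier for each pair of consecutive timings."""
    return [later / earlier for earlier, later in zip(timings, timings[1:])]


rtok_ratios = growth_ratios(RTOK)
tiktoken_ratios = growth_ratios(TIKTOKEN)

# Each step multiplies the input length by 10.
for size, r, t in zip(SIZES[1:], rtok_ratios, tiktoken_ratios):
    print(f"a*{size}: rtok x{r:.1f}, tiktoken x{t:.1f}")
```

For the two largest steps the rtok ratio stays close to 10 while the tiktoken ratio is close to 100. That is what linear and roughly quadratic growth would look like on this particular input, and it says nothing about other inputs.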
API
rtok.openai.get_o200k_base() -> Encoder
rtok.openai.get_cl100k_base() -> Encoder
Encoder.count(str)
Encoder.count_till_limit(str, limit: int) -> Optional[int]
Encoder.encode(str) -> [int]
Encoder.decode([int]) -> str
Encoder.encode and Encoder.decode are compatible with tiktoken. See test.py.
Encoder.count_till_limit() returns None if the count exceeds the limit.
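The methods listed above might be used along these lines. This is a minimal sketch, not code from the project: `ByteEncoder` is a hypothetical stand-in that only mirrors the method names so the example runs without rtok installed, and `fits_in_budget` and `round_trips` are illustrative helpers. The round-trip check assumes decoding an encoded string gives the original text back, which the README does not state outright.

```python
from typing import List, Optional


class ByteEncoder:
    """Hypothetical stand-in exposing the same method names as Encoder.

    It treats every UTF-8 byte as one token, which is NOT how rtok tokenizes.
    """

    def count(self, text: str) -> int:
        return len(text.encode("utf-8"))

    def count_till_limit(self, text: str, limit: int) -> Optional[int]:
        # Mirrors the documented contract: None once the count exceeds the limit.
        n = self.count(text)
        return n if n <= limit else None

    def encode(self, text: str) -> List[int]:
        return list(text.encode("utf-8"))

    def decode(self, tokens: List[int]) -> str:
        return bytes(tokens).decode("utf-8")


def fits_in_budget(enc, text: str, limit: int) -> bool:
    """True if the text tokenizes to at most `limit` tokens."""
    return enc.count_till_limit(text, limit) is not None


def round_trips(enc, text: str) -> bool:
    """True if decoding the encoded text returns the original string."""
    return enc.decode(enc.encode(text)) == text


# With rtok installed, the stand-in could be swapped for
# rtok.openai.get_o200k_base() or rtok.openai.get_cl100k_base().
enc = ByteEncoder()

print(fits_in_budget(enc, "hello", 5))
print(fits_in_budget(enc, "hello", 4))
print(round_trips(enc, "héllo"))
```

Checking `count_till_limit` against `None` rather than comparing `count` to the limit leaves the encoder free to stop early on long inputs, which is presumably why the method exists.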
Download files
Source Distribution
rtok-0.1.0.tar.gz (92.5 kB)
Built Distribution
rtok-0.1.0-cp312-cp312-manylinux_2_34_x86_64.whl (25.8 MB)
File details
Details for the file rtok-0.1.0.tar.gz.
File metadata
- Download URL: rtok-0.1.0.tar.gz
- Upload date:
- Size: 92.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/1.8.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e35174a23039851206611d463f1a959a7fc8482a6cea0501576a24331fc078a9 |
| MD5 | 39bf695938c35032688ac15c61c33f88 |
| BLAKE2b-256 | 2f83be406291a8bd02e95d206d01aa76b56d1f07094fc10ee6344e241fc321b4 |
File details
Details for the file rtok-0.1.0-cp312-cp312-manylinux_2_34_x86_64.whl.
File metadata
- Download URL: rtok-0.1.0-cp312-cp312-manylinux_2_34_x86_64.whl
- Upload date:
- Size: 25.8 MB
- Tags: CPython 3.12, manylinux: glibc 2.34+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/1.8.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6eb442fed093a0751db32b8557ea6e3868df66f6b0e484ef48a6b000ba6b6c82 |
| MD5 | 090fdd20e77202e7bd469f6e59b3ac1a |
| BLAKE2b-256 | 6a35c51dbe1274a264fbe88f1ad4cc3df7a46cd4badd8d2c653299963a88d206 |