runtoken

A blazing-fast BPE tokenizer for LLMs. Drop-in tiktoken replacement, 20-80x faster.

Built from scratch in Rust with Python bindings via PyO3. Produces identical output to tiktoken — same token IDs, same order, every time.

Why?

If you're building an LLM gateway, proxy, or any system that processes tokens at scale, tokenization speed matters. Every API request needs token counting for:

Cost estimation & billing
Rate limiting per user
Context window management
Smart routing (pick the cheapest model that fits)

tiktoken is good. runtoken is faster.
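
To make the smart-routing item in the list above concrete, here is a minimal sketch of how a gateway might pick the cheapest model whose context window fits a request. Everything in it is hypothetical: the model names, context limits, and prices are placeholders, `pick_model` is an illustrative helper rather than part of runtoken, and `count_tokens` stands in for whatever counter the gateway uses (for example `runtoken.count`).

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class ModelSpec:
    name: str             # hypothetical model identifier
    context_window: int   # max tokens accepted (placeholder value)
    price_per_1k: float   # input price per 1,000 tokens (placeholder value)


# Placeholder catalogue; real limits and prices must come from your provider.
CATALOGUE = [
    ModelSpec("small-model", 8_000, 0.10),
    ModelSpec("medium-model", 32_000, 0.50),
    ModelSpec("large-model", 128_000, 2.00),
]


def pick_model(
    prompt: str,
    count_tokens: Callable[[str], int],
    reserve_for_output: int = 1_000,
    catalogue: List[ModelSpec] = CATALOGUE,
) -> Optional[ModelSpec]:
    """Return the cheapest model that fits prompt + reserved output, or None."""
    needed = count_tokens(prompt) + reserve_for_output
    candidates = [m for m in catalogue if m.context_window >= needed]
    return min(candidates, key=lambda m: m.price_per_1k, default=None)


def whitespace_count(text: str) -> int:
    """Stand-in counter so the sketch runs without any tokenizer installed."""
    return len(text.split())
```

In a real gateway the stand-in counter would be replaced by an actual tokenizer call, which is where counting speed starts to matter on every request.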

Benchmarks

Apples-to-apples comparison — both called as Python packages, same machine, same text:

Encode (full token IDs)

Input	tiktoken	runtoken	Speedup
Short text (29 chars, 9 tokens)	1.3M tok/s	24.6M tok/s	19x
Medium text (1050 chars, 511 tokens)	2.5M tok/s	68.8M tok/s	27x
Code (1200 chars, 380 tokens)	1.5M tok/s	63.5M tok/s	44x
Long English (4500 chars, 1001 tokens)	2.5M tok/s	73.6M tok/s	29x
Long code (5600 chars, 2160 tokens)	1.5M tok/s	88.2M tok/s	59x
Unicode (500 chars, 420 tokens)	4.2M tok/s	89.2M tok/s	21x
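
For readers who want to reproduce numbers of this kind, the sketch below shows one way throughput in tokens per second and the speedup column could be computed. It is not the project's benchmark script; `tokens_per_second` and `speedup` are illustrative names, and any callable that returns a sequence of token IDs can be passed as `encode`.

```python
import time
from typing import Callable, Sequence


def tokens_per_second(
    encode: Callable[[str], Sequence[int]], text: str, repeats: int = 1000
) -> float:
    """Time `repeats` calls of `encode(text)` and return tokens per second."""
    n_tokens = len(encode(text))  # tokens produced by a single call
    start = time.perf_counter()
    for _ in range(repeats):
        encode(text)
    elapsed = time.perf_counter() - start
    return n_tokens * repeats / elapsed


def speedup(baseline_tps: float, candidate_tps: float) -> float:
    """Ratio reported in the Speedup column."""
    return candidate_tps / baseline_tps


# Stand-in encoder (one "token" per UTF-8 byte) so the sketch runs anywhere.
def byte_encode(text: str) -> list:
    return list(text.encode("utf-8"))
```

Note that a loop over one fixed string also exercises any cache the tokenizer keeps, so warm and cold measurements can differ widely.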

Count-only (the gateway use case)

Input	tiktoken	runtoken	Speedup
Medium text	2.5M tok/s	940M tok/s	381x
Long English	2.6M tok/s	1.4B tok/s	538x
Long code	1.5M tok/s	2.6B tok/s	1750x

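Benchmarked on a 2-vCPU cloud instance. Count-only throughput benefits from multi-level caching (text-level + chunk-level LRU), so these figures largely reflect cache hits on repeated input rather than cold tokenization.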
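
The two cache levels mentioned above could be sketched roughly as follows. This is an assumption-laden illustration, not runtoken's Rust implementation: the `LRU` and `CachedCounter` classes, the capacities, and the simplified split regex are all invented for the example, and the text itself is used as the cache key (Python hashes it internally).

```python
import re
from collections import OrderedDict

# Simplified placeholder; the real encodings use more elaborate patterns.
CHUNK_RE = re.compile(r"\s*\S+|\s+")


class LRU:
    """Tiny least-recently-used cache built on OrderedDict."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict least recently used


class CachedCounter:
    """Count tokens with a text-level cache in front of a chunk-level cache."""

    def __init__(self, count_chunk, text_capacity=1024, chunk_capacity=65536):
        self.count_chunk = count_chunk  # expensive per-chunk tokenizer call
        self.text_cache = LRU(text_capacity)
        self.chunk_cache = LRU(chunk_capacity)
        self.chunk_calls = 0  # how often the expensive path actually ran

    def count(self, text):
        cached = self.text_cache.get(text)
        if cached is not None:
            return cached  # whole text seen before: one lookup
        total = 0
        for chunk in CHUNK_RE.findall(text):
            key = chunk.encode("utf-8")
            n = self.chunk_cache.get(key)
            if n is None:
                n = self.count_chunk(key)
                self.chunk_calls += 1
                self.chunk_cache.put(key, n)
            total += n
        self.text_cache.put(text, total)
        return total
```

With this layout a repeated request costs a single dictionary lookup, and a new request that shares words with earlier ones only pays for the chunks it has not seen.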

Correctness

Test Suite	Tests	Result
Deep correctness (41 strings × 3 encodings)	123	✅ 100%
Stress test (up to 64K tokens)	27	✅ 100%
PDF documents (academic papers, 65K tokens)	54	✅ 100%
Total	204	0 mismatches

Every test compares exact token IDs — not just counts, but the same numbers in the same order.
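
A comparison of this kind can be sketched as below. The helper `find_mismatches` is illustrative and not taken from the project's test suite; it accepts any two encoder callables and reports the first position at which their token ID lists diverge.

```python
from typing import Callable, Iterable, List, Sequence, Tuple


def find_mismatches(
    reference_encode: Callable[[str], Sequence[int]],
    candidate_encode: Callable[[str], Sequence[int]],
    samples: Iterable[str],
) -> List[Tuple[str, int]]:
    """Return (sample, first differing index) for every sample that differs."""
    mismatches = []
    for text in samples:
        expected = list(reference_encode(text))
        actual = list(candidate_encode(text))
        if expected != actual:
            # First differing position; if one list is a prefix of the
            # other, report the length of the shorter list instead.
            first = next(
                (i for i, (e, a) in enumerate(zip(expected, actual)) if e != a),
                min(len(expected), len(actual)),
            )
            mismatches.append((text, first))
    return mismatches


# Stand-in encoder (one ID per UTF-8 byte) so the sketch runs on its own.
def byte_encode(text: str) -> list:
    return list(text.encode("utf-8"))
```

With both packages installed, the two callables would be `tiktoken.get_encoding("cl100k_base").encode` and `runtoken.get_encoding("cl100k_base").encode`, and an empty result means no mismatches.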

Installation

pip install runtoken

From source

git clone https://github.com/Thibault00/runtoken.git
cd runtoken
pip install maturin
maturin develop --release

Usage

Python

import runtoken

# Get a tokenizer by encoding name (same API as tiktoken)
enc = runtoken.get_encoding("cl100k_base")

# Encode text to token IDs
tokens = enc.encode("Hello, world!")
# [9906, 11, 1917, 0]

# Count tokens
count = enc.count("Hello, world!")
# 4

# Decode back to text
text = enc.decode([9906, 11, 1917, 0])
# "Hello, world!"

# Get tokenizer by model name
enc = runtoken.encoding_for_model("gpt-4o")  # → o200k_base
enc = runtoken.encoding_for_model("gpt-4")   # → cl100k_base
enc = runtoken.encoding_for_model("claude")   # → cl100k_base (approximation only: Claude uses Anthropic's own tokenizer)

# Quick one-liner
runtoken.count("Hello!", model="gpt-4o")
# 2

Rust

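use runtoken::Tokenizer;

fn main() {
    // Load a tokenizer by encoding name; unwrap() panics if construction fails.
    let tokenizer = Tokenizer::new("cl100k_base").unwrap();
    let tokens = tokenizer.encode("Hello, world!"); // token IDs
    let count = tokenizer.count("Hello, world!");   // number of tokens only
    let text = tokenizer.decode(&tokens);           // token IDs back to text
}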

CLI

# Encode text
runtoken-cli encode "Hello, world!" cl100k_base

# Count tokens
runtoken-cli count "Hello, world!" o200k_base

# Read from stdin (for large texts)
cat myfile.txt | runtoken-cli count - cl100k_base

# Benchmark
runtoken-cli bench cl100k_base

Supported Encodings

Encoding	Models	Vocab Size
`cl100k_base`	GPT-4, GPT-3.5-turbo, Claude (approximation only)	100,256
`o200k_base`	GPT-4o, o1, o3	200,019
`p50k_base`	text-davinci-003, Codex	50,281

Model → Encoding Mapping

Model prefix	Encoding
`gpt-4o`, `o1`, `o3`	o200k_base
`gpt-4`, `gpt-3.5`, `claude` (approximation only)	cl100k_base
`text-davinci`, `code-davinci`	p50k_base
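
Because `gpt-4o` and `gpt-4` share a prefix, a prefix table like the one above has to be matched from the most specific entry down. The sketch below shows one way to do that; `encoding_name_for_model` is a hypothetical helper, and how runtoken actually resolves model names (including what it does for unknown models) is not described in this README.

```python
# Prefix table copied from the mapping above.
PREFIX_TO_ENCODING = [
    ("gpt-4o", "o200k_base"),
    ("o1", "o200k_base"),
    ("o3", "o200k_base"),
    ("gpt-4", "cl100k_base"),
    ("gpt-3.5", "cl100k_base"),
    ("claude", "cl100k_base"),
    ("text-davinci", "p50k_base"),
    ("code-davinci", "p50k_base"),
]

# Try longer prefixes first so "gpt-4o-mini" matches "gpt-4o", not "gpt-4".
_BY_LENGTH = sorted(PREFIX_TO_ENCODING, key=lambda item: len(item[0]), reverse=True)


def encoding_name_for_model(model: str) -> str:
    """Return the encoding name for a model, using longest-prefix matching."""
    for prefix, encoding in _BY_LENGTH:
        if model.startswith(prefix):
            return encoding
    # Raising here is a choice made for this sketch, not documented behaviour.
    raise KeyError(f"no encoding known for model {model!r}")
```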

Architecture

src/
├── lib.rs       # Tokenizer + TokenizerRegistry + multi-level caching
├── bpe.rs       # Core BPE merge algorithm (tiktoken-compatible)
├── vocab.rs     # Vocabulary loading (.tiktoken format)
├── regex.rs     # Regex splitting per encoding
├── python.rs    # PyO3 bindings
└── main.rs      # CLI tool

~900 lines of Rust — that's the entire tokenizer. Key design decisions:

Multi-level LRU cache: Text-level (hash → tokens) + chunk-level (bytes → tokens). Repeated text is a hash lookup.
Precomputed rank tables: Single-byte and two-byte pair ranks as direct arrays — no HashMap overhead for the most common lookups.
Inline chunk processing: Regex chunks are encoded inline without collecting into intermediate Vecs.
tiktoken-style BPE merge: Tracks min_rank inline during merges, avoids priority queue overhead for small chunks.
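
The precomputed rank tables can be pictured with a small sketch. It assumes a vocabulary given as a `bytes -> rank` mapping and uses a toy vocabulary; the names `build_rank_tables`, `pair_rank`, and `NO_RANK` are illustrative and not taken from the Rust source.

```python
NO_RANK = 0xFFFFFFFF  # sentinel meaning "this byte sequence is not a token"


def build_rank_tables(ranks):
    """Split 1-byte and 2-byte tokens out of `ranks` into flat lookup arrays."""
    single = [NO_RANK] * 256      # indexed by the byte value
    pair = [NO_RANK] * 65536      # indexed by (first_byte << 8) | second_byte
    for token, rank in ranks.items():
        if len(token) == 1:
            single[token[0]] = rank
        elif len(token) == 2:
            pair[(token[0] << 8) | token[1]] = rank
    return single, pair


def pair_rank(pair_table, first, second):
    """Look up the rank of a two-byte pair by index arithmetic, no hashing."""
    return pair_table[(first << 8) | second]


# Toy vocabulary, for illustration only.
TOY_RANKS = {b"a": 0, b"b": 1, b"ab": 2, b"abc": 3}
SINGLE, PAIR = build_rank_tables(TOY_RANKS)
```

The first merges in any chunk involve pairs of single bytes, so an array index replaces a hash computation on the hottest path; longer tokens still need a map.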

How it works

BPE (Byte Pair Encoding) tokenization:

Regex split: Split input text into chunks using encoding-specific regex patterns
Byte-level merging: For each chunk, start with individual bytes and repeatedly merge the pair with the lowest rank (priority) in the vocabulary
Token IDs: Map the final merged byte sequences to their vocabulary rank

runtoken uses the exact same regex patterns and vocabulary files as tiktoken, which is why the output is identical.
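
The three steps above can be sketched in a few lines of Python. This is a deliberately naive version under stated assumptions: the vocabulary is a toy `bytes -> rank` mapping, the split regex is a simplified placeholder rather than a real encoding pattern, and the function names are illustrative. It rescans every pair on each merge, so it shows the algorithm and none of the optimizations.

```python
import re

# Simplified placeholder; real encodings ship their own, more complex pattern.
CHUNK_RE = re.compile(r"\s*\S+|\s+")

# Toy vocabulary: every single byte used below plus a few merged sequences.
TOY_RANKS = {b"a": 0, b"b": 1, b"c": 2, b"ab": 3, b"bc": 4, b"abc": 5, b" ": 6}


def bpe_encode_chunk(chunk: bytes, ranks: dict) -> list:
    """Merge adjacent parts, lowest rank first, until no known pair remains."""
    parts = [bytes([b]) for b in chunk]  # start from individual bytes
    while len(parts) > 1:
        best_rank, best_i = None, None
        for i in range(len(parts) - 1):
            rank = ranks.get(parts[i] + parts[i + 1])
            # Strict "<" keeps the leftmost pair when ranks tie.
            if rank is not None and (best_rank is None or rank < best_rank):
                best_rank, best_i = rank, i
        if best_i is None:
            break  # no adjacent pair is in the vocabulary
        parts[best_i:best_i + 2] = [parts[best_i] + parts[best_i + 1]]
    return [ranks[p] for p in parts]  # token ID = vocabulary rank


def encode(text: str, ranks: dict) -> list:
    """Regex split, then byte-level BPE on each chunk."""
    ids = []
    for chunk in CHUNK_RE.findall(text):
        ids.extend(bpe_encode_chunk(chunk.encode("utf-8"), ranks))
    return ids
```

For example, with the toy vocabulary `b"abcab"` merges as `a b c a b` → `ab c a b` → `ab c ab` → `abc ab`, giving the IDs `[5, 3]`.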

Contributing

# Clone and build
git clone https://github.com/Thibault00/runtoken.git
cd runtoken
cargo build --release

# Run Rust tests
cargo test

# Run correctness tests against tiktoken
pip install tiktoken
python tests/deep_correctness.py
python tests/stress_test.py

# Build Python package
pip install maturin
maturin develop --release
python tests/benchmark_python.py

License

MIT — see LICENSE.

Release history

Version	Released
0.1.2 (this version)	Feb 23, 2026
0.1.1	Feb 23, 2026
0.1.0	Feb 23, 2026
