A token-counting CLI tool and package
Toko
Toko is a token-counting tool. It is built primarily for use as a CLI and is also available as a Python package.
Highlights
- Accurate token counting for OpenAI models out of the box, with optional support for Anthropic, Google, xAI, Mistral, Llama, DeepSeek, and Qwen families.
- Reads inline text, stdin, files, directories (respects .gitignore automatically), and URLs.
- Compare multiple models in one run and add cost estimates powered by bundled genai-prices data.
- Emits text, json, csv, or tsv output. When stdout is piped, Toko automatically switches to TSV so you can chain tools like cut or awk.
- Caches counts in SQLite under your platform cache folder (e.g. ~/.cache/toko) so repeated runs avoid redundant API calls.
Install
Toko targets Python 3.14 and ships as a uv tool.
Quick install
uv tool install toko
This places a toko executable on your PATH. Run uv tool upgrade toko to pick up new releases.
Optional providers
Install extras when you need additional providers:
# HuggingFace tokenizers for Llama, DeepSeek, Qwen families
uv tool install 'toko[transformers]'
# Official Mistral tokenizer (mistral-common)
uv tool install 'toko[mistral]'
# Everything above in one go
uv tool install 'toko[all]'
If you are adding Toko to a project environment instead of the global toolchain, replace uv tool install with uv add.
Source checkout (contributors)
git clone https://github.com/moredatarequired/toko
cd toko
just setup # installs lefthook git hooks
Quick start
In the examples below, options appear before any paths. Typer/Click treat everything after the first path argument as data input, so prefer toko --total-only src over toko src --total-only.
Count inline text
toko --model gpt-5 --text "hello world"
2
If you omit --model, Toko falls back to your configured default. Fresh installs ship with gpt-5; override this in config.toml if your workflow needs a different model.
Read a file
toko --model gpt-5 LICENSE
┏━━━━━━━━━┳━━━━━━━┓
┃ File    ┃ gpt-5 ┃
┡━━━━━━━━━╇━━━━━━━┩
│ LICENSE │   223 │
└─────────┴───────┘
Token counts will change if the file contents change.
Stream from stdin
printf 'hello world' | toko --model gpt-5
2
When stdout is not a TTY (for example, when piping into another command) Toko emits TSV automatically and drops the header unless you pass --header.
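The TTY check described above can be sketched in a few lines of Python. This is only an illustration of the general technique, not Toko's actual code: pick_format is a hypothetical helper, and the assumption is that an explicit --format value takes priority over the automatic choice.

```python
import sys
from typing import Optional


def pick_format(stdout_is_tty: bool, explicit: Optional[str] = None) -> str:
    """Choose an output format (hypothetical helper, not part of Toko)."""
    if explicit is not None:
        # Assumption: an explicit --format value always wins.
        return explicit
    # Interactive terminals get the human-readable view; pipes get TSV.
    return "text" if stdout_is_tty else "tsv"


if __name__ == "__main__":
    # sys.stdout.isatty() is False when output is piped or redirected.
    print(pick_format(sys.stdout.isatty()))
```

Running the script directly prints text in a terminal and tsv when its output is piped into another command.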
Compare models and estimate cost
toko --header --format tsv --model gpt-5 --model gpt-4.1-mini --text "The quick brown fox" --cost
model tokens cost
gpt-5 4 $0.000005
gpt-4.1-mini 4 $0.000008
Costs come from the bundled genai-prices feed. Models without pricing information display N/A.
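For intuition, a cost estimate of this kind is typically tokens multiplied by a per-million-token price. The sketch below is not Toko's implementation: estimate_cost and format_cost are hypothetical helpers, and the 1.25 USD per million input tokens figure is an assumption chosen because it reproduces the gpt-5 row above.

```python
from decimal import Decimal
from typing import Optional


def estimate_cost(tokens: int, usd_per_million: Decimal) -> Decimal:
    """Cost in USD for a token count at a per-million-token price."""
    return Decimal(tokens) * usd_per_million / Decimal(1_000_000)


def format_cost(cost: Optional[Decimal]) -> str:
    """Render a cost with six decimals, or N/A when no price is known."""
    return "N/A" if cost is None else f"${cost:.6f}"


# Assumed price for illustration only: 1.25 USD per million input tokens.
print(format_cost(estimate_cost(4, Decimal("1.25"))))
print(format_cost(None))
```

Decimal is used instead of float so that very small amounts format without binary rounding surprises.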
Work with directories, URLs, and filters
toko --exclude '**/__pycache__/*' src/
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃ File                                 ┃  gpt-5 ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ src/toko/__init__.py                 │     11 │
│ src/toko/cache.py                    │    758 │
│ src/toko/cli.py                      │  3,372 │
│ src/toko/config.py                   │    584 │
│ src/toko/cost.py                     │  1,325 │
│ src/toko/counter.py                  │  3,445 │
│ src/toko/data/__init__.py            │      8 │
│ src/toko/data/openrouter_models.json │    359 │
│ src/toko/file_reader.py              │  1,120 │
│ src/toko/formatters.py               │  2,226 │
│ src/toko/models.py                   │  4,211 │
│ src/toko/price_update.py             │    403 │
│ TOTAL                                │ 17,822 │
└──────────────────────────────────────┴────────┘
- Directories are processed recursively by default and honor .gitignore.
- Use --no-recursive to stay shallow and --no-ignore to include ignored files.
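To see how a glob such as '**/__pycache__/*' can filter a file list, here is a small sketch using Python's standard fnmatch module. It is an assumption that this approximates Toko's matching: the real tool may use gitignore-style rules, and in fnmatch a * also matches path separators, which differs from shell globbing. The keep helper and the sample paths are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def keep(path: str, excludes: Iterable[str]) -> bool:
    """Return True when the path matches none of the exclude patterns."""
    return not any(fnmatchcase(path, pattern) for pattern in excludes)


# Hypothetical paths used only for this illustration.
paths: List[str] = [
    "src/toko/cli.py",
    "src/toko/__pycache__/cli.cpython-314.pyc",
]

kept = [p for p in paths if keep(p, ["**/__pycache__/*"])]
print(kept)
```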
Machine-readable output
Toko can emit structured output without post-processing.
toko --model gpt-5 --format json LICENSE
{
  "LICENSE": {
    "gpt-5": 223
  }
}
toko --header --model gpt-5 --format csv --text "hello world"
model,tokens
gpt-5,2
Use --format tsv to force TSV even when running interactively.
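If you consume the JSON output from a script, a per-model total is easy to compute. The sketch below parses the single-file sample shown in this section; it assumes that multi-file runs keep the same file, then model, then count nesting, which this README does not show. totals_by_model is a hypothetical helper.

```python
import json
from typing import Dict

# The JSON sample from this section, embedded so the sketch is self-contained.
SAMPLE = '{"LICENSE": {"gpt-5": 223}}'


def totals_by_model(report: Dict[str, Dict[str, int]]) -> Dict[str, int]:
    """Sum token counts per model across every file in the report."""
    totals: Dict[str, int] = {}
    for counts in report.values():
        for model, tokens in counts.items():
            totals[model] = totals.get(model, 0) + tokens
    return totals


print(totals_by_model(json.loads(SAMPLE)))
```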
Know which models are available
toko --list-models | head -n 5
anthropic/claude-3-5-haiku-20241022
anthropic/claude-3-5-sonnet-20240620
anthropic/claude-3-5-sonnet-20241022
anthropic/claude-3-7-sonnet-20250219
anthropic/claude-3-haiku-20240307
API keys and optional providers
Some providers require API credentials to access token counting endpoints:
- Anthropic – set ANTHROPIC_API_KEY
- Google Gemini/Gemma – set GOOGLE_API_KEY
Some providers use tokenizers available on Hugging Face; these may need authentication to download the appropriate tokenizer.
- HuggingFace-hosted models (Llama, DeepSeek, Qwen) – install toko[transformers] and run huggingface-cli login (or set HF_TOKEN) if the model needs authentication.
- Mistral – install toko[mistral]; no API key is required for offline tokenization.
Environment variables can be exported directly, stored in a .env file and loaded with uv run --env-file, or placed in the config file described below.
To mix providers, provide every required key. For example:
ANTHROPIC_API_KEY=sk-ant-... toko --model gpt-5 --model claude-sonnet-4-5 --text "Launch checklist" --cost
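Before a mixed-provider run in a script, it can help to check which credentials are missing. The following is a minimal sketch, not part of Toko: missing_keys and the provider labels are hypothetical, only the two environment variable names come from this README, and providers absent from the mapping are assumed to need no key.

```python
import os
from typing import Iterable, List, Mapping

# Environment variables named in this README; the provider labels are illustrative.
REQUIRED_ENV = {
    "anthropic": "ANTHROPIC_API_KEY",
    "google": "GOOGLE_API_KEY",
}


def missing_keys(providers: Iterable[str], env: Mapping[str, str] = os.environ) -> List[str]:
    """List required variables that are unset or empty for the given providers."""
    missing = []
    for provider in providers:
        variable = REQUIRED_ENV.get(provider)
        # Assumption: providers not listed above need no API key.
        if variable is not None and not env.get(variable):
            missing.append(variable)
    return missing


print(missing_keys(["openai", "anthropic"], {"ANTHROPIC_API_KEY": "sk-ant-example"}))
```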
Configuration
Toko reads configuration from $XDG_CONFIG_HOME/toko/config.toml (defaults to ~/.config/toko/config.toml). A minimal example:
[toko]
default_model = "gpt-5"
default_format = "text"
respect_gitignore = true
auto_update_prices = false # fetch latest pricing when cached data is stale
[toko.exclude]
patterns = ["*.log", "*.tmp", "**/__pycache__/*"]
[toko.api_keys]
anthropic = "sk-ant-..."
openai = "sk-..."
Config values act as defaults; command-line flags always win.
Caching and pricing data
- Counts are cached in $XDG_CACHE_HOME/toko/token_cache.db.
- Pricing data from genai-prices is stored alongside the package. When auto_update_prices is true, Toko silently refreshes the cache if data is older than a day. Fetch failures never abort your command.
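To illustrate why a count cache avoids repeated work, here is a minimal SQLite sketch. Everything in it is an assumption for illustration: the counts table, the key derived from a hash of model and text, and the helper names are hypothetical and do not describe the real layout of token_cache.db.

```python
import hashlib
import sqlite3
from typing import Callable


def open_cache(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a cache database with a hypothetical schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS counts (key TEXT PRIMARY KEY, tokens INTEGER NOT NULL)"
    )
    return conn


def cache_key(model: str, text: str) -> str:
    """Hash model and text together so each pair gets its own entry."""
    return hashlib.sha256(f"{model}\x00{text}".encode("utf-8")).hexdigest()


def cached_count(conn: sqlite3.Connection, model: str, text: str,
                 count_fn: Callable[[str, str], int]) -> int:
    """Return a cached count, calling count_fn only on a cache miss."""
    key = cache_key(model, text)
    row = conn.execute("SELECT tokens FROM counts WHERE key = ?", (key,)).fetchone()
    if row is not None:
        return row[0]
    tokens = count_fn(model, text)
    conn.execute("INSERT INTO counts (key, tokens) VALUES (?, ?)", (key, tokens))
    conn.commit()
    return tokens
```

Here count_fn stands in for whatever tokenizer or API call produces the count; a second call with the same model and text is answered from the database.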
Development tasks
just lint # Ruff check & format
just typecheck # ty type checking
just test # run "fast" tests
just check-all # run the full set of checks
License
MIT