A token-counting CLI tool and package

Toko

Toko is a token-counting tool. It is built primarily for use as a CLI and is also available as a Python package.

Highlights

  • Accurate token counting for OpenAI models out of the box, with optional support for Anthropic, Google, xAI, Mistral, Llama, DeepSeek, and Qwen families.
  • Reads inline text, stdin, files, directories (respects .gitignore automatically), and URLs.
  • Compare multiple models in one run and add cost estimates powered by bundled genai-prices data.
  • Emits text, json, csv, or tsv output. When stdout is piped, Toko automatically switches to TSV so you can chain tools like cut or awk.
  • Caches counts in SQLite under your platform cache folder (e.g. ~/.cache/toko) so repeated runs avoid redundant API calls.

Install

Toko targets Python 3.14 and ships as a uv tool.

Quick install

uv tool install toko

This places a toko executable on your PATH. Run uv tool upgrade toko to pick up new releases.

Optional providers

Install extras when you need additional providers:

# Hugging Face tokenizers for Llama, DeepSeek, Qwen families
uv tool install 'toko[transformers]'

# Official Mistral tokenizer (mistral-common)
uv tool install 'toko[mistral]'

# Everything above in one go
uv tool install 'toko[all]'

If you are adding Toko to a project environment instead of the global toolchain, replace uv tool install with uv add.

Source checkout (contributors)

git clone https://github.com/moredatarequired/toko
cd toko
just setup  # installs lefthook git hooks

Quick start

In the examples below, options always appear before any paths. Typer and Click treat everything after the first path argument as input data, so prefer toko --total-only src over toko src --total-only.

Count inline text

toko --model gpt-5 --text "hello world"
2

If you omit --model, Toko falls back to your configured default. Fresh installs ship with gpt-5; override this in config.toml if your workflow needs a different model.

Read a file

toko --model gpt-5 LICENSE
┏━━━━━━━━━┳━━━━━━━┓
┃ File    ┃ gpt-5 ┃
┡━━━━━━━━━╇━━━━━━━┩
│ LICENSE │   223 │
└─────────┴───────┘

Token counts will change if the file contents change.

Stream from stdin

printf 'hello world' | toko --model gpt-5
2

When stdout is not a TTY (for example, when piping into another command) Toko emits TSV automatically and drops the header unless you pass --header.

Compare models and estimate cost

toko --header --format tsv --model gpt-5 --model gpt-4.1-mini --text "The quick brown fox" --cost
model	tokens	cost
gpt-5	4	$0.000005
gpt-4.1-mini	4	$0.000008

Costs come from the bundled genai-prices feed. Models without pricing information display N/A.

Work with directories, URLs, and filters

toko --exclude '**/__pycache__/*' src/
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃ File                                 ┃  gpt-5 ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ src/toko/__init__.py                 │     11 │
│ src/toko/cache.py                    │    758 │
│ src/toko/cli.py                      │  3,372 │
│ src/toko/config.py                   │    584 │
│ src/toko/cost.py                     │  1,325 │
│ src/toko/counter.py                  │  3,445 │
│ src/toko/data/__init__.py            │      8 │
│ src/toko/data/openrouter_models.json │    359 │
│ src/toko/file_reader.py              │  1,120 │
│ src/toko/formatters.py               │  2,226 │
│ src/toko/models.py                   │  4,211 │
│ src/toko/price_update.py             │    403 │
│ TOTAL                                │ 17,822 │
└──────────────────────────────────────┴────────┘

  • Directories are processed recursively by default and honor .gitignore.
  • Use --no-recursive to stay shallow and --no-ignore to include ignored files.
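To get a feel for how glob-style exclude patterns filter a file list, here is a rough approximation using Python's fnmatch module. Toko's own matcher is not described here and may differ (in fnmatch, * also matches path separators), so treat is_excluded as a hypothetical stand-in.

```python
from fnmatch import fnmatchcase

# Patterns taken from the examples in this README.
EXCLUDE = ["*.log", "*.tmp", "**/__pycache__/*"]


def is_excluded(path: str, patterns: list[str] = EXCLUDE) -> bool:
    """Return True when the path matches any exclude pattern (fnmatch semantics)."""
    return any(fnmatchcase(path, pattern) for pattern in patterns)


paths = [
    "src/toko/cli.py",
    "src/toko/__pycache__/cli.cpython-314.pyc",
    "logs/debug.log",
]
kept = [p for p in paths if not is_excluded(p)]
print(kept)
```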

Machine-readable output

Toko can emit structured output without post-processing.

toko --model gpt-5 --format json LICENSE
{
  "LICENSE": {
    "gpt-5": 223
  }
}

toko --header --model gpt-5 --format csv --text "hello world"
model,tokens
gpt-5,2

Use --format tsv to force TSV even when running interactively.
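The JSON output is straightforward to consume from a script. The sketch below assumes the file-to-model-to-count nesting shown in the LICENSE example also holds when several files are counted; totals_by_model is a hypothetical helper.

```python
import json

# Sample output copied from the JSON example above.
SAMPLE = '{"LICENSE": {"gpt-5": 223}}'


def totals_by_model(json_text: str) -> dict[str, int]:
    """Sum token counts across all files, grouped by model name."""
    data = json.loads(json_text)
    totals: dict[str, int] = {}
    for per_model in data.values():
        for model, count in per_model.items():
            totals[model] = totals.get(model, 0) + count
    return totals


print(totals_by_model(SAMPLE))
```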

Know which models are available

toko --list-models | head -n 5
anthropic/claude-3-5-haiku-20241022
anthropic/claude-3-5-sonnet-20240620
anthropic/claude-3-5-sonnet-20241022
anthropic/claude-3-7-sonnet-20250219
anthropic/claude-3-haiku-20240307

API keys and optional providers

Some providers require API credentials to access token counting endpoints:

  • Anthropic – set ANTHROPIC_API_KEY
  • Google Gemini/Gemma – set GOOGLE_API_KEY

Some providers use tokenizers available on Hugging Face; these may need authentication to download the appropriate tokenizer.

  • Hugging Face-hosted models (Llama, DeepSeek, Qwen) – install toko[transformers] and, if the model needs authentication, run huggingface-cli login (or set HF_TOKEN).
  • Mistral – install toko[mistral]; no API key is required for offline tokenization.

Environment variables can be exported directly, stored in a .env file and loaded with uv run --env-file, or placed in the config file described below.

To mix providers, provide every required key. For example:

ANTHROPIC_API_KEY=sk-ant-... toko --model gpt-5 --model claude-sonnet-4-5 --text "Launch checklist" --cost
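Before a multi-provider run in a script or CI job, it can help to check that the required variables are set. The following sketch uses the two environment variable names listed above; the provider labels ("anthropic", "google") and the missing_keys helper are illustrative assumptions, and the check ignores keys stored in the config file.

```python
import os
from collections.abc import Iterable, Mapping

# Environment variables named in this README; the dictionary keys are illustrative labels.
REQUIRED_ENV = {
    "anthropic": "ANTHROPIC_API_KEY",
    "google": "GOOGLE_API_KEY",
}


def missing_keys(providers: Iterable[str], env: Mapping[str, str] = os.environ) -> list[str]:
    """List the env vars that the given providers need but that are unset or empty."""
    return [
        REQUIRED_ENV[p]
        for p in providers
        if p in REQUIRED_ENV and not env.get(REQUIRED_ENV[p])
    ]


# Providers without an entry (such as "openai" here) are treated as needing no key.
print(missing_keys(["openai", "anthropic"], {"GOOGLE_API_KEY": "example"}))
```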

Configuration

Toko reads configuration from $XDG_CONFIG_HOME/toko/config.toml (defaults to ~/.config/toko/config.toml). A minimal example:

[toko]
default_model = "gpt-5"
default_format = "text"
respect_gitignore = true
auto_update_prices = false # fetch latest pricing when cached data is stale

[toko.exclude]
patterns = ["*.log", "*.tmp", "**/__pycache__/*"]

[toko.api_keys]
anthropic = "sk-ant-..."
openai = "sk-..."

Config values act as defaults; command-line flags always win.

Caching and pricing data

  • Counts are cached in $XDG_CACHE_HOME/toko/token_cache.db.
  • Pricing data from genai-prices is stored alongside the package. When auto_update_prices is true, Toko silently refreshes the cache if data is older than a day. Fetch failures never abort your command.
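As a sketch of the two rules above, the snippet below resolves the cache database path and applies a one-day staleness check. Both helpers are hypothetical, and the ~/.cache fallback is an assumption based on the Linux example given earlier; other platforms may use a different cache folder.

```python
import os
from collections.abc import Mapping
from datetime import datetime, timedelta, timezone
from pathlib import Path

MAX_PRICE_AGE = timedelta(days=1)  # "older than a day", per the description above


def cache_db_path(env: Mapping[str, str] = os.environ) -> Path:
    """Resolve $XDG_CACHE_HOME/toko/token_cache.db, assuming ~/.cache as the fallback."""
    base = env.get("XDG_CACHE_HOME") or str(Path.home() / ".cache")
    return Path(base) / "toko" / "token_cache.db"


def prices_are_stale(last_fetched: datetime, now: datetime) -> bool:
    """Return True when the pricing data is more than one day old."""
    return now - last_fetched > MAX_PRICE_AGE


now = datetime(2025, 1, 3, tzinfo=timezone.utc)
print(cache_db_path({"XDG_CACHE_HOME": "/tmp/cache"}))
print(prices_are_stale(datetime(2025, 1, 1, tzinfo=timezone.utc), now))
```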

Development tasks

just lint          # Ruff check & format
just typecheck     # ty type checking
just test          # run "fast" tests
just check-all     # run the full set of checks

License

MIT
