Real-time LLM context window visualizer — CLI + web UI
🔭 tokenviz
Real-time LLM context window visualizer — know exactly how much of your context you're burning.
tokenviz · gpt-4o
47,832 tokens / 128,000
███████████████░░░░░░░░░░░░░░░░░░░░░░░░░
37.4% used · 80,168 remaining
chars 214,559
words 37,201
lines 3,847
tok/word 1.29
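The figures in the sample output are related by simple arithmetic. As a minimal sketch, the hypothetical helper `usage_summary` below (an illustration, not part of tokenviz's documented API) reproduces them from the token count, the context limit, and the word count:

```python
def usage_summary(tokens: int, limit: int, words: int) -> dict:
    """Derive the percentage used, remaining budget, and tokens-per-word ratio."""
    return {
        "percent": round(100 * tokens / limit, 1),
        "remaining": limit - tokens,
        "tok_per_word": round(tokens / words, 2) if words else 0.0,
    }


# Numbers taken from the sample output above.
summary = usage_summary(tokens=47_832, limit=128_000, words=37_201)
print(summary)
```

With the sample numbers this yields 37.4% used, 80,168 tokens remaining, and 1.29 tokens per word.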
Works with every major LLM. GPT-4o, Claude, Gemini 1M, Llama, DeepSeek, Mistral — 20+ models.
Zero required dependencies. The built-in BPE-style estimator (±5%) works on its own. Install tiktoken for exact OpenAI counts.
Why?
You're writing a long prompt. Is it 30k tokens or 90k? You're building a RAG pipeline. How much context is left for the answer? You're debugging why the model truncated — was it the context limit?
tokenviz answers these questions instantly, with a CLI you can pipe into any workflow and a web UI you can paste anything into.
Install
pip install tokenviz
Zero required dependencies. For exact OpenAI counts:
pip install tokenviz[exact] # adds tiktoken
Or run without installing:
npx tokenviz count file.txt # via npx (Node.js)
Usage
CLI
# Analyze a file
tokenviz count prompt.txt
# Pipe from stdin
cat conversation.json | tokenviz count
# Specify model
tokenviz count system_prompt.md --model claude-sonnet-4
# Inline text
tokenviz count --text "$(cat *.py)" --model gpt-4o
# JSON output (for scripting)
tokenviz count prompt.txt --json
# List all models + limits
tokenviz models
Web UI
tokenviz serve
# → http://localhost:7432
Paste anything — code, documents, conversations, system prompts. Token count and density heatmap update as you type.
Python API
from tokenviz import count_tokens, analyze_text
# Simple count
result = count_tokens("Your text here", model="gpt-4o")
print(result["count"]) # token count as an integer
print(result["method"]) # "tiktoken" or "estimate"
# Full analysis
data = analyze_text(open("big_prompt.txt").read(), model="claude-sonnet-4")
print(f"{data['tokens']:,} / {data['limit']:,} ({data['percent']:.1f}%)")
print(f"Remaining: {data['remaining']:,}")
if data["warning"]:
    print("⚠️ Over 80% full!")
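For a RAG pipeline, the useful number is often how much room is left after setting aside space for the model's reply. The sketch below assumes a dict with the `tokens` and `limit` keys shown above. The helper `room_for_answer` and the reserve size are hypothetical, not part of tokenviz:

```python
def room_for_answer(data: dict, reserve: int) -> int:
    """Return how many tokens can still be added to the prompt while
    keeping `reserve` tokens free for the model's reply (never negative)."""
    free = data["limit"] - data["tokens"] - reserve
    return max(free, 0)


# Stand-in for the dict returned by analyze_text(); values are illustrative.
data = {"tokens": 47_832, "limit": 128_000}
print(room_for_answer(data, reserve=4_096))  # 76072
```

Clamping at zero keeps the result usable as a slice length or chunk budget even when the prompt is already over the limit.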
Supported Models
| Provider | Models | Context |
|---|---|---|
| OpenAI | gpt-4o, gpt-4o-mini, o1, o3-mini, ... | 8k–200k |
| Anthropic | claude-sonnet-4, claude-opus-4, ... | 200k |
| Google | gemini-2.5-pro, gemini-2.0-flash, ... | 1M |
| Meta | llama-3.1-70b, llama-3.1-405b | 128k |
| DeepSeek | deepseek-v3, deepseek-r1 | 128k |
| Mistral | mistral-large, mixtral-8x22b | 65k–128k |
Missing a model? Open an issue or add it in one line — it's just a dict.
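To illustrate what "just a dict" can look like, here is a minimal sketch. The name `MODEL_LIMITS`, the helper `context_limit`, and the fallback default are assumptions for illustration. The limits are rounded figures from the table above, not values read from tokenviz's source:

```python
# Hypothetical registry; limits are the rounded figures from the table above.
MODEL_LIMITS = {
    "gpt-4o": 128_000,
    "claude-sonnet-4": 200_000,
    "gemini-2.5-pro": 1_000_000,
    "llama-3.1-70b": 128_000,
}

# Adding a model is a single new entry.
MODEL_LIMITS["deepseek-v3"] = 128_000


def context_limit(model: str, default: int = 128_000) -> int:
    """Look up a model's context window, falling back to an assumed default."""
    return MODEL_LIMITS.get(model.lower(), default)


print(context_limit("claude-sonnet-4"))
```

Whether an unknown model should fall back to a default or raise an error is a design choice. The fallback here is only for brevity.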
Token counting accuracy
| Mode | Accuracy | Requires |
|---|---|---|
| tiktoken (exact) | 100% | pip install tokenviz[exact] |
| Built-in BPE est. | ±5% | nothing |
The built-in estimator handles English, code, JSON, and mixed content well. It uses the same pre-tokenization heuristics as BPE encoders, without the vocabulary lookup.
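The idea of pre-tokenizing without a vocabulary lookup can be sketched as follows. This is not tokenviz's actual `fast_token_estimate()`. The regex, the function name `estimate_tokens`, and the characters-per-token constants are untuned assumptions chosen only to show the shape of the technique:

```python
import math
import re

# Rough pre-tokenization in the spirit of BPE encoders: words, digit runs and
# punctuation runs (each with an optional leading space), then whitespace runs.
PIECE_RE = re.compile(r" ?[^\W\d_]+| ?\d+| ?(?:[^\w\s]|_)+|\s+")


def estimate_tokens(text: str) -> int:
    """Estimate a token count from piece lengths, with no vocabulary lookup."""
    total = 0
    for piece in PIECE_RE.findall(text):
        core = piece.strip()
        if not core:
            total += 1                         # a run of whitespace
        elif core.isdigit():
            total += math.ceil(len(core) / 3)  # assumed: ~3 digits per token
        elif core.isalpha():
            total += math.ceil(len(core) / 6)  # assumed: ~6 letters per token
        else:
            total += math.ceil(len(core) / 2)  # assumed: ~2 symbols per token
    return total


print(estimate_tokens("hello world"))  # 2
```

Short words count as one token each, while long words, long numbers, and symbol runs are split by length. A real estimator would calibrate these constants against an actual tokenizer.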
Scripting examples
# Alert if prompt is > 80% of context
TOKEN_PCT=$(tokenviz count prompt.txt --json | python3 -c "import sys,json; print(json.load(sys.stdin)['percent'])")
if (( $(echo "$TOKEN_PCT > 80" | bc -l) )); then echo "WARNING: context over 80%"; fi
# Count tokens across all Python files in a project
find . -name "*.py" -print0 | xargs -0 cat | tokenviz count --model gpt-4o
# Pre-commit hook: warn if any staged file is > 50k tokens
git diff --cached --name-only | xargs -I{} sh -c 'tokenviz count "{}" --json | python3 -c "import sys,json; t=json.load(sys.stdin)[\"tokens\"]; print(f\"{t:,} tokens in {}\") if t > 50000 else None"'
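The same threshold check can be done in Python instead of a shell one-liner. The sketch below assumes only that the `--json` output contains the `percent` key used in the examples above. The helper `over_budget` and the sample payload are hypothetical:

```python
import json


def over_budget(json_text: str, threshold: float = 80.0) -> bool:
    """Return True when the reported usage exceeds `threshold` percent."""
    report = json.loads(json_text)
    return report["percent"] > threshold


# Illustrative payload; a real one would come from `tokenviz count --json`.
sample = '{"tokens": 47832, "percent": 37.4}'
print(over_budget(sample))  # False
```

In a script, `json_text` could be the captured standard output of `tokenviz count prompt.txt --json`, for example via `subprocess.run(..., capture_output=True, text=True).stdout`.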
Contributing
git clone https://github.com/yourusername/tokenviz
cd tokenviz
python tokenviz.py count --text "hello world"
The codebase is intentionally small (~500 lines total). Contributions welcome:
- 🆕 Add new models (one-line dict entry in server.py)
- 🌐 Improve the web UI (public/index.html)
- 🧮 Better tokenizer heuristics (fast_token_estimate())
- 📦 npm / npx wrapper
- 🔌 Editor extensions (VS Code, Neovim, Zed)
Roadmap
- VS Code extension
- Neovim plugin
- Conversation history analysis (import ChatGPT/Claude exports)
- Token budget planner (split text into N-token chunks)
- Live token streaming (watch a prompt being built)
- GitHub Action for CI token budgets
License
MIT — do whatever you want with it.
Made for developers who are tired of guessing their token usage
⭐ Star this repo if it saved you from a truncated context