Pre-execution cost estimation for LLM agent workflows with calibration learning
tokencast
Pre-execution cost estimation for LLM agent workflows. Get a cost estimate before running any agent task, then let tokencast learn from actuals to improve accuracy over time.
Available as an MCP server (works in Cursor, VS Code + Copilot, Windsurf, Claude Code) or as a Claude Code skill (SKILL.md, for Claude Code users who prefer the skill-based workflow).
MCP Installation (Recommended)
1. Install the package
pip install tokencast
Or with uvx (no install required — runs directly from PyPI):
uvx tokencast
2. Configure your IDE
Replace /path/to/your/project with your actual project path in the config snippets below.
Claude Code
Add to ~/.claude/settings.json:
{
  "mcpServers": {
    "tokencast": {
      "command": "tokencast-mcp",
      "args": [
        "--calibration-dir", "/path/to/your/project/calibration",
        "--project-dir", "/path/to/your/project"
      ]
    }
  }
}
Cursor
Create or update .cursor/mcp.json in your project root:
{
  "mcpServers": {
    "tokencast": {
      "command": "tokencast-mcp",
      "args": [
        "--calibration-dir", "/path/to/your/project/calibration",
        "--project-dir", "/path/to/your/project"
      ]
    }
  }
}
VS Code + GitHub Copilot
Create or update .vscode/mcp.json in your project root:
{
  "servers": {
    "tokencast": {
      "type": "stdio",
      "command": "tokencast-mcp",
      "args": [
        "--calibration-dir", "/path/to/your/project/calibration",
        "--project-dir", "/path/to/your/project"
      ]
    }
  }
}
Windsurf
Add to your Windsurf MCP config:
{
  "mcpServers": {
    "tokencast": {
      "command": "tokencast-mcp",
      "args": [
        "--calibration-dir", "/path/to/your/project/calibration",
        "--project-dir", "/path/to/your/project"
      ]
    }
  }
}
Full config examples are in docs/ide-configs/.
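The four snippets differ only in the top-level key and the extra type field that VS Code expects, so the config can also be generated rather than hand-edited. The following is a minimal sketch, not part of tokencast: the helper name build_mcp_config is hypothetical, and the output shape simply mirrors the snippets above.

```python
import json


def build_mcp_config(project_dir: str, vscode: bool = False) -> dict:
    """Build the tokencast MCP entry shown above (hypothetical helper)."""
    root = project_dir.rstrip("/")
    server = {
        "command": "tokencast-mcp",
        "args": [
            "--calibration-dir", f"{root}/calibration",
            "--project-dir", root,
        ],
    }
    if vscode:
        # VS Code uses a "servers" key and an explicit "type" field.
        return {"servers": {"tokencast": {"type": "stdio", **server}}}
    # Claude Code, Cursor, and Windsurf share the "mcpServers" shape.
    return {"mcpServers": {"tokencast": server}}


if __name__ == "__main__":
    print(json.dumps(build_mcp_config("/path/to/your/project"), indent=2))
```

Paste the printed JSON into the config file for your IDE, merging it with any MCP servers already listed there.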
3. Use the tools
Once configured, tokencast exposes five MCP tools in your IDE:
| Tool | What it does |
|---|---|
| estimate_cost | Estimate API cost for a planned task before running it |
| get_calibration_status | Check whether your estimates are well-calibrated |
| get_cost_history | Browse past estimates vs actuals |
| report_session | Report actual cost at session end to improve calibration |
| report_step_cost | Record the cost of a single pipeline step during a session |
Example — estimate before starting work:
Estimate the cost for: size=M, files=8, complexity=high
Example — report actuals after finishing:
Report session cost: actual_cost=4.20
MCP Server Flags
| Flag | Default | Description |
|---|---|---|
| --calibration-dir PATH | ~/.tokencast/calibration | Where calibration data is stored |
| --project-dir PATH | None | Project root for file measurement |
| --version | | Print version and exit |
Claude Code Skill (Alternative)
If you use Claude Code and prefer the skill-based (SKILL.md) workflow, you can install tokencast as a Claude Code skill instead:
# Clone the repo (anywhere — it doesn't need to live inside your project)
git clone https://github.com/krulewis/tokencast.git
# Install into your project (quote paths with spaces)
bash tokencast/scripts/install-hooks.sh "/path/to/your-project"
Paths with spaces: Always wrap the project path in quotes. Without them the install script will fail on paths like /Volumes/Macintosh HD2/....
This does three things:
- Symlinks the skill into <project>/.claude/skills/tokencast/
- Adds a Stop hook for auto-learning at session end
- Adds a PostToolUse hook to nudge estimation after planning agents
The SKILL.md workflow is Claude Code-specific. The MCP server works in any MCP-compatible client and is the recommended path for new users.
How It Works
- Infers size, file count, complexity from the plan in conversation
- Reads reference files for pricing and token heuristics
- Loads learned calibration factors (if any exist)
- Computes per-step token estimates using activity decomposition
- Applies complexity multiplier, context accumulation (K+1)/2, and cache rates
- Splits into Optimistic / Expected / Pessimistic bands
- If PR Review Loop is in scope, computes loop cost using geometric decay across N review cycles
- Applies calibration correction to Expected band
- Records the estimate for later comparison with actuals
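Two of the steps above can be illustrated with a short worked sketch. If context grows by one unit per turn over K turns, the average context per turn is (1 + 2 + ... + K)/K = (K+1)/2, which is one plausible reading of the context accumulation term. For the PR Review Loop, geometric decay means each review cycle costs a fixed fraction of the one before it. The function names and the decay value of 0.6 below are assumptions made for illustration; see references/heuristics.md and references/examples.md for the project's own numbers.

```python
def context_accumulation_factor(k: int) -> float:
    """Average context multiplier over k turns when context grows linearly.

    The mean of 1, 2, ..., k equals (k + 1) / 2.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return (k + 1) / 2


def review_loop_cost(first_cycle_cost: float, cycles: int, decay: float = 0.6) -> float:
    """Total cost of `cycles` review cycles under geometric decay.

    Cycle i (0-based) costs first_cycle_cost * decay**i.
    The default decay of 0.6 is a hypothetical value, not tokencast's.
    """
    return sum(first_cycle_cost * decay**i for i in range(cycles))


if __name__ == "__main__":
    print(context_accumulation_factor(5))        # 3.0
    print(round(review_loop_cost(1.00, 3), 2))   # 1.96
```

With a $1.00 first cycle and three cycles, the loop costs 1.00 + 0.60 + 0.36 = $1.96 under this assumed decay; zero cycles contribute nothing.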
Example output:
## tokencast estimate
Change: size=M, files=5, complexity=medium
Calibration: 1.12x from 8 prior runs
| Step | Model | Optimistic | Expected | Pessimistic |
|-----------------------|--------|------------|----------|-------------|
| Research Agent | Sonnet | $0.60 | $1.17 | $4.47 |
| Architect Agent | Opus | $0.67 | $1.18 | $3.97 |
| ... | ... | ... | ... | ... |
| TOTAL | | $3.37 | $6.26 | $22.64 |
Confidence Bands
| Band | Cache Hit | Multiplier | Meaning |
|---|---|---|---|
| Optimistic | 60% | 0.6x | Best case — focused agent work |
| Expected | 50% | 1.0x | Typical run |
| Pessimistic | 30% | 3.0x | With rework loops, debugging, retries |
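To make the table concrete, the sketch below applies each band's cache-hit rate and multiplier to a single step. It assumes a simple pricing model in which cached input tokens are billed at a discounted rate. The prices, the cache_discount value, and the function name are placeholders rather than tokencast's own figures (those are in references/pricing.md); only the cache-hit rates and multipliers come from the table above.

```python
# (cache-hit rate, multiplier) per band, taken from the table above.
BANDS = {
    "optimistic": (0.60, 0.6),
    "expected": (0.50, 1.0),
    "pessimistic": (0.30, 3.0),
}


def band_costs(
    input_tokens: int,
    output_tokens: int,
    input_price: float,            # USD per million input tokens (placeholder)
    output_price: float,           # USD per million output tokens (placeholder)
    cache_discount: float = 0.1,   # cached input billed at 10% (assumption)
) -> dict[str, float]:
    """Return the cost in USD of one step for each confidence band."""
    costs = {}
    for band, (cache_hit, multiplier) in BANDS.items():
        # Blend cached and uncached input prices by the band's hit rate.
        blended_input = input_price * (cache_hit * cache_discount + (1 - cache_hit))
        base = (input_tokens * blended_input + output_tokens * output_price) / 1_000_000
        costs[band] = round(base * multiplier, 4)
    return costs
```

Under this model the optimistic figure comes out below 0.6x the expected figure, because the higher cache-hit rate lowers the blended input price in addition to the multiplier.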
Calibration
Calibration is fully automatic once you report actuals:
- 0-2 sessions: No correction applied. "Collecting data" status.
- 3-10 sessions: Global correction factor via trimmed mean of actual/expected ratios (trim_fraction=0.1).
- 10+ sessions: EWMA with recency weighting. Per-size-class factors activate when a class has 3+ samples.
- Outlier filtering: Sessions with actual/expected ratio >3.0x or <0.2x are excluded from calibration.
Calibration data lives in calibration/ (gitignored, local to each user).
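The tiers above can be sketched as a single function. This is an illustrative reading of the rules, not tokencast's implementation (references/calibration-algorithm.md is the authoritative description). The function name, the EWMA smoothing constant alpha, the choice to filter outliers before counting sessions, and the choice to place exactly 10 sessions in the trimmed-mean tier are all assumptions. Per-size-class factors are omitted.

```python
def calibration_factor(
    ratios: list[float],
    trim_fraction: float = 0.1,
    alpha: float = 0.3,  # EWMA smoothing constant (hypothetical value)
) -> float:
    """Return a correction factor from actual/expected ratios, oldest first."""
    # Outlier filtering: drop ratios above 3.0x or below 0.2x.
    kept = [r for r in ratios if 0.2 <= r <= 3.0]
    n = len(kept)

    if n < 3:
        return 1.0  # "Collecting data": no correction applied.

    if n <= 10:
        # Trimmed mean: discard the lowest and highest trim_fraction of samples.
        k = int(n * trim_fraction)
        trimmed = sorted(kept)[k : n - k]
        return sum(trimmed) / len(trimmed)

    # EWMA: each newer session pulls the factor toward its own ratio,
    # so recent sessions carry more weight than old ones.
    factor = kept[0]
    for r in kept[1:]:
        factor = alpha * r + (1 - alpha) * factor
    return factor
```

The Expected band would then be multiplied by the returned factor; a factor of 1.12, for example, raises a $1.00 expected cost to $1.12.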
Python API
from tokencast import estimate_cost, report_session, report_step_cost
from tokencast import get_calibration_status, get_cost_history

# Estimate before running a task
result = estimate_cost(
    {"size": "M", "files": 5, "complexity": "medium"},
    calibration_dir="./calibration",
)

# Report actuals at session end
report_session({"actual_cost": 4.20}, calibration_dir="./calibration")

# Check calibration health
status = get_calibration_status({}, calibration_dir="./calibration")

# Browse history
history = get_cost_history({"window": "30d"}, calibration_dir="./calibration")

# Report a single step's cost
report_step_cost(
    {"step_name": "Research Agent", "cost": 0.85},
    calibration_dir="./calibration",
)
Manual Invocation (Skill mode)
In Claude Code with SKILL.md installed, you can invoke tokencast explicitly and override any of the inferred parameters:
/tokencast size=L files=12 complexity=high
/tokencast steps=implement,test,qa
/tokencast review_cycles=3
/tokencast review_cycles=0
Files
SKILL.md - Skill definition (auto-trigger, algorithm)
references/pricing.md - Model prices, cache rates, step→model map
references/heuristics.md - Token budgets, pipeline decompositions, multipliers
references/examples.md - Worked examples with arithmetic
references/calibration-algorithm.md - Detailed calibration algorithm reference
docs/ide-configs/ - Per-IDE MCP config examples
src/tokencast/ - Core estimation engine (Python package)
src/tokencast_mcp/ - MCP server (Python package)
scripts/
  install-hooks.sh - One-time project setup (skill mode)
  disable.sh - Remove from project (skill mode)
  tokencast-learn.sh - Stop hook: auto-captures actuals (skill mode)
  tokencast-track.sh - PostToolUse hook: nudges estimation after plans
  sum-session-tokens.py - Parses session JSONL for actual costs
  update-factors.py - Computes calibration factors from history
calibration/ - Per-user local data (gitignored)
  history.jsonl - Estimate vs actual records
  factors.json - Learned correction factors
  active-estimate.json - Transient marker for current estimate
Limitations
- Pipeline step names reflect a default workflow; map your own steps to the closest defaults. Formulas are pipeline-agnostic (see references/heuristics.md)
- Heuristics assume typical 150-300 line source files
- Calibration requires 3+ completed sessions before corrections activate
- Pricing data is embedded; check last_updated in references/pricing.md
- Multi-session tasks only capture the session containing the estimate
License
MIT