Pre-execution cost estimation for LLM agent workflows with calibration learning

tokencast

A Claude Code skill that estimates Anthropic API cost for planned agent tasks, then learns from actual usage to improve estimates over time.

Install once per project. It auto-estimates after plans are created and auto-learns at session end. Zero ongoing friction.

Setup (one time per project)

# Clone the repo (anywhere — it doesn't need to live inside your project)
git clone https://github.com/krulewis/tokencast.git

# Install into your project (quote paths with spaces)
bash tokencast/scripts/install-hooks.sh "/path/to/your-project"

Paths with spaces: Always wrap the project path in quotes. Without them the install script will fail on paths like /Volumes/Macintosh HD2/....

This does three things:

Symlinks the skill into <project>/.claude/skills/tokencast/
Adds a Stop hook for auto-learning at session end
Adds a PostToolUse hook to nudge estimation after planning agents

Every Claude Code session in that project now has tokencast active.

What Happens Automatically

After a plan is created

tokencast detects the plan in conversation context, infers size, files, complexity, project type, and language, then outputs a cost table:

## tokencast estimate

Change: size=M, files=5, complexity=medium
Calibration: 1.12x from 8 prior runs

| Step                  | Model  | Optimistic | Expected | Pessimistic |
|-----------------------|--------|------------|----------|-------------|
| Research Agent        | Sonnet | $0.60      | $1.17    | $4.47       |
| Architect Agent       | Opus   | $0.67      | $1.18    | $3.97       |
| ...                   | ...    | ...        | ...      | ...         |
| TOTAL                 |        | $3.37      | $6.26    | $22.64      |

At session end

The learning hook silently:

Reads the session's JSONL log
Computes actual token cost (including cache write tokens)
Compares to the estimate
Updates calibration factors
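
The cost computation in the steps above can be sketched roughly as follows. This is an illustration, not the code in scripts/sum-session-tokens.py. It assumes each JSONL line may carry a `message.usage` object using the Anthropic API usage field names (`input_tokens`, `output_tokens`, `cache_creation_input_tokens`, `cache_read_input_tokens`). The `PRICES` values and the `session_cost` helper are placeholders introduced for the example; the real rates live in references/pricing.md.

```python
import json

# Placeholder prices in USD per million tokens (hypothetical values for
# illustration only; the real rates live in references/pricing.md).
PRICES = {
    "input_tokens": 3.00,
    "output_tokens": 15.00,
    "cache_creation_input_tokens": 3.75,  # cache writes
    "cache_read_input_tokens": 0.30,      # cache reads
}


def session_cost(jsonl_lines):
    """Sum the cost of every usage record found in an iterable of JSONL lines."""
    total = 0.0
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines instead of failing the hook
        if not isinstance(record, dict):
            continue
        message = record.get("message")
        usage = message.get("usage") if isinstance(message, dict) else None
        if not isinstance(usage, dict):
            continue
        for field, price_per_million in PRICES.items():
            total += (usage.get(field) or 0) * price_per_million / 1_000_000
    return total
```

Cache write tokens are priced separately from ordinary input tokens here, which is why the step above calls them out: leaving them out would understate the actual cost.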

Next session

Future estimates use learned correction factors. More sessions = better accuracy.

Manual Invocation

You can also invoke explicitly with overrides:

/tokencast size=L files=12 complexity=high
/tokencast steps=implement,test,qa
/tokencast review_cycles=3
/tokencast review_cycles=0

Use review_cycles=N to set the number of expected PR review cycles. Use review_cycles=0 to suppress the PR Review Loop row.

How It Works

Infers size, file count, complexity from the plan in conversation
Reads reference files for pricing and token heuristics
Loads learned calibration factors (if any exist)
Computes per-step token estimates using activity decomposition
Applies complexity multiplier, context accumulation (K+1)/2, and cache rates
Splits into Optimistic / Expected / Pessimistic bands
If PR Review Loop is in scope, computes loop cost using geometric decay across the review cycles, with a different cycle count per band (Optimistic=1, Expected=N, Pessimistic=N×2)
Applies calibration correction to Expected band (individual steps re-anchor; PR Review Loop scales each band independently)
Records the estimate for later comparison with actuals
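
The PR Review Loop step can be sketched as a geometric series. This is a sketch under stated assumptions: the decay ratio `DECAY` and the function names are hypothetical, and the real constants are in references/heuristics.md. Only the per-band cycle counts (1, N, 2N) come from the list above.

```python
DECAY = 0.5  # hypothetical: each review cycle costs half of the previous one


def review_loop_cost(first_cycle_cost, cycles, decay=DECAY):
    """Total cost of `cycles` review cycles with geometrically decaying cost."""
    return sum(first_cycle_cost * decay ** i for i in range(cycles))


def review_loop_bands(first_cycle_cost, review_cycles):
    """Cycle counts per band: Optimistic=1, Expected=N, Pessimistic=2N."""
    if review_cycles <= 0:
        return None  # review_cycles=0 suppresses the PR Review Loop row
    return {
        "optimistic": review_loop_cost(first_cycle_cost, 1),
        "expected": review_loop_cost(first_cycle_cost, review_cycles),
        "pessimistic": review_loop_cost(first_cycle_cost, 2 * review_cycles),
    }
```

Because the series decays, doubling the cycle count for the Pessimistic band adds less than double the cost: with a $1.00 first cycle, N=2, and the assumed ratio of 0.5, the bands come out to $1.00, $1.50, and $1.875.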

Overrides

Override	Effect
`size=M`	Set size class explicitly
`files=5`	Set file count explicitly
`complexity=high`	Set complexity explicitly
`steps=implement,test,qa`	Estimate only those pipeline steps
`project_type=migration`	Set project type explicitly
`language=go`	Set primary language explicitly
`review_cycles=3`	Set PR review cycle count (0 = disable)
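
To make the key=value semantics of the overrides concrete, here is a small parsing sketch. The skill itself is defined in SKILL.md, so this Python is only an illustration; `parse_overrides`, `KNOWN_OVERRIDES`, and `INT_KEYS` are hypothetical names, and rejecting unknown keys is an assumption about desirable behavior rather than documented behavior.

```python
KNOWN_OVERRIDES = {
    "size", "files", "complexity", "steps",
    "project_type", "language", "review_cycles",
}
INT_KEYS = {"files", "review_cycles"}


def parse_overrides(args):
    """Parse space-separated 'key=value' tokens into a dict."""
    overrides = {}
    for token in args.split():
        key, sep, value = token.partition("=")
        if not sep or key not in KNOWN_OVERRIDES:
            raise ValueError(f"unrecognized override: {token!r}")
        if key in INT_KEYS:
            overrides[key] = int(value)       # counts are whole numbers
        elif key == "steps":
            overrides[key] = value.split(",")  # comma-separated step list
        else:
            overrides[key] = value
    return overrides
```

For example, `parse_overrides("size=L files=12 complexity=high")` yields `{"size": "L", "files": 12, "complexity": "high"}`.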

Confidence Bands

Band	Cache Hit	Multiplier	Meaning
Optimistic	60%	0.6x	Best case — focused agent work
Expected	50%	1.0x	Typical run
Pessimistic	30%	3.0x	With rework loops, debugging, retries
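
One plausible reading of the table is that the multiplier scales the token count while the cache hit rate sets the blend between cached and uncached input pricing. The sketch below follows that reading; it is an assumption, not the documented formula (see references/heuristics.md and references/examples.md for the actual arithmetic). The prices are hypothetical, and output tokens are ignored to keep the example short.

```python
# Band parameters from the table above: (cache hit rate, token multiplier).
BANDS = {
    "optimistic": (0.60, 0.6),
    "expected": (0.50, 1.0),
    "pessimistic": (0.30, 3.0),
}

# Hypothetical USD prices per million input tokens.
INPUT_PRICE = 3.00
CACHE_READ_PRICE = 0.30


def band_costs(input_tokens):
    """Input-side cost per band, given a step's Expected input token count."""
    costs = {}
    for band, (cache_hit, multiplier) in BANDS.items():
        tokens = input_tokens * multiplier
        # Cached tokens are billed at the read price, the rest at full price.
        blended_price = cache_hit * CACHE_READ_PRICE + (1 - cache_hit) * INPUT_PRICE
        costs[band] = tokens * blended_price / 1_000_000
    return costs
```

Under this reading the bands do not differ by the multiplier alone: a lower cache hit rate also raises the blended price, so the Pessimistic cost grows by more than 3.0x relative to Expected.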

Calibration

Calibration is fully automatic:

0-2 sessions: No correction applied. "Collecting data" status.
3-10 sessions: Global correction factor via trimmed mean of actual/expected ratios (trim_fraction=0.1).
10+ sessions: EWMA with recency weighting. Per-size-class factors activate when a class has 3+ samples.
Outlier filtering: Sessions with actual/expected ratio >3.0x or <0.2x are excluded from calibration and logged for inspection.
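
The tiers above can be sketched as follows. This is not the code in scripts/update-factors.py. Several details are assumptions made for the example: the trim removes floor(n × trim_fraction) values from each end, exactly 10 sessions is treated as the trimmed-mean tier, session counts are taken after outlier filtering, and `EWMA_ALPHA` is a hypothetical smoothing weight. Per-size-class factors are omitted.

```python
import math

TRIM_FRACTION = 0.1                   # from the text above
OUTLIER_LOW, OUTLIER_HIGH = 0.2, 3.0  # from the text above
EWMA_ALPHA = 0.3                      # hypothetical smoothing weight


def trimmed_mean(ratios, trim_fraction=TRIM_FRACTION):
    """Mean after dropping floor(n * trim_fraction) values from each end."""
    ordered = sorted(ratios)
    k = math.floor(len(ordered) * trim_fraction)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


def ewma(ratios, alpha=EWMA_ALPHA):
    """Exponentially weighted average; later (more recent) ratios weigh more."""
    value = ratios[0]
    for ratio in ratios[1:]:
        value = alpha * ratio + (1 - alpha) * value
    return value


def global_factor(ratios):
    """Correction factor from actual/expected ratios in chronological order."""
    clean = [r for r in ratios if OUTLIER_LOW <= r <= OUTLIER_HIGH]
    if len(clean) < 3:
        return None  # still collecting data: no correction applied
    if len(clean) <= 10:
        return trimmed_mean(clean)
    return ewma(clean)
```

A returned factor of 1.12 would mean actual cost has been running about 12% above the Expected estimate, which matches the "Calibration: 1.12x" style of line shown in the estimate output.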

Calibration data lives in calibration/ (gitignored, local to each user).

Disabling

bash /path/to/tokencast/scripts/disable.sh "/path/to/your-project"

Removes the skill and hooks. Preserves calibration data for reuse.

Files

SKILL.md                        — Skill definition (auto-trigger, algorithm)
references/pricing.md           — Model prices, cache rates, step→model map
references/heuristics.md        — Token budgets, pipeline decompositions, multipliers
references/examples.md          — Worked examples with arithmetic
references/calibration-algorithm.md — Detailed calibration algorithm reference
commands/
  tokencast-version.md     — /tokencast-version slash command
scripts/
  install-hooks.sh              — One-time project setup
  disable.sh                    — Remove from project
  tokencast-learn.sh       — Stop hook: auto-captures actuals
  tokencast-track.sh       — PostToolUse hook: nudges estimation after plans
  sum-session-tokens.py         — Parses session JSONL for actual costs
  update-factors.py             — Computes calibration factors from history
calibration/                    — Per-user local data (gitignored)
  history.jsonl                 — Estimate vs actual records
  factors.json                  — Learned correction factors
  active-estimate.json          — Transient marker for current estimate

v1.1 Changes

Trimmed mean replaces median for faster convergence with small samples
Outlier flagging — extreme ratios (>3.0x or <0.2x) excluded from calibration, logged for inspection
Richer data — project type, language, pipeline signature, and step count captured per session
Baseline subtraction — tokens spent before the estimate are excluded from actuals
Security hardening — path injection fixes, consolidated parsing, safe handling of paths with spaces
Version markers — version: 1.1.0 in SKILL.md, --version flag on learn script

v1.2 Changes

PR Review Loop modeling — geometric-decay cost model for review-fix-re-review cycles
New override — review_cycles=N to set expected cycle count (0 = disable)
Per-band calibration — PR Review Loop applies calibration independently per band (not re-anchored)
New schema fields — review_cycles_estimated and review_cycles_actual in active-estimate.json

Limitations

Pipeline step names reflect a default workflow — map your own steps to the closest defaults. Formulas are pipeline-agnostic (see references/heuristics.md)
Heuristics assume typical 150-300 line source files
Does not model parallel agent execution
Calibration requires 3+ completed sessions before corrections activate
Pricing data is embedded and can go stale; check last_updated in references/pricing.md
For multi-session tasks, only the session containing the estimate is captured

License

MIT

Release history

0.1.4 (Mar 31, 2026)
0.1.3 (Mar 30, 2026)
0.1.2 (Mar 28, 2026)
0.1.1 (Mar 28, 2026)
0.1.0 (Mar 26, 2026, this version)

Download files

Source Distribution

tokencast-0.1.0.tar.gz (441.3 kB)

Uploaded Mar 26, 2026 Source

Built Distribution

tokencast-0.1.0-py3-none-any.whl (5.3 kB)

Uploaded Mar 26, 2026 Python 3

File details

Details for the file tokencast-0.1.0.tar.gz.

File metadata

Download URL: tokencast-0.1.0.tar.gz
Upload date: Mar 26, 2026
Size: 441.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.9.6

File hashes

Hashes for tokencast-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`8d48e0bdb5f6e63b9057595d66aaaf496bb0b05688799e829c329f282e5dd4c0`
MD5	`653970c8f9317c3ff3d818a62dbc25eb`
BLAKE2b-256	`b8a016cdde26d041fe9d6947ce7b64fd4ab3a5c374187e2d009218bbddc67e1a`

File details

Details for the file tokencast-0.1.0-py3-none-any.whl.

File metadata

Download URL: tokencast-0.1.0-py3-none-any.whl
Upload date: Mar 26, 2026
Size: 5.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.9.6

File hashes

Hashes for tokencast-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`eb723e809ca58fe63adbe0946d31ee0a8c450b4bea9250b181d85679321d3bb7`
MD5	`fa2aada3c2cdb250d12b34e8b11b5a02`
BLAKE2b-256	`10e6fbe963426875b5ff909fb1270fe730bc4853a673ecf541fc02a53e1a4775`
