# tokentoll

Catch LLM cost changes in code review. Infracost for LLM spend.
A CLI tool and GitHub Action that statically analyzes your code for LLM API calls, estimates their cost, and shows you the cost impact of every change -- in your terminal or as a PR comment. Zero runtime dependencies.
## The Problem
A single model swap from gpt-4o-mini to gpt-4o increases costs 15x.
A new API call in a hot path can add $10,000/month to your bill.
These changes hide in normal code review.
tokentoll finds LLM API calls in your code, estimates their cost, and shows you the cost impact of every change -- before it hits production.
## Quick Start

```bash
pip install tokentoll

# Scan current directory for LLM API calls and their costs
tokentoll scan .

# Show cost impact of your last commit
tokentoll diff HEAD~1

# Compare two branches
tokentoll diff main..feature-branch
```
## GitHub Action

```yaml
name: LLM Cost Diff

on:
  pull_request:
    paths:
      - "**.py"

permissions:
  pull-requests: write

jobs:
  cost-diff:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - uses: Jwrede/tokentoll@v1
        with:
          calls-per-month: "5000"
```
## What It Detects

| SDK | Patterns | Status |
|---|---|---|
| OpenAI | `chat.completions.create`, `responses.create` | Supported |
| Anthropic | `messages.create`, `messages.stream` | Supported |
| Google GenAI | `models.generate_content` | Supported |
| LiteLLM | `completion`, `acompletion` | Supported |
| LangChain | `ChatOpenAI`, `ChatAnthropic`, `init_chat_model` | Supported |
| JS/TS SDKs | -- | Planned |
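To make the attribute-chain patterns above concrete, here is a minimal sketch of this kind of AST matching. This is illustrative, not tokentoll's actual detector code: the method chains come from the table, but the function names, the pattern table, and the simplified matching logic are assumptions, and SDKs detected via bare calls or constructors (LiteLLM, LangChain) would need extra handling.

```python
import ast

# Attribute-chain suffixes from the table above (illustrative subset).
# The real detectors are richer and also handle bare calls/constructors.
PATTERNS = {
    ("chat", "completions", "create"): "openai",
    ("responses", "create"): "openai",
    ("messages", "create"): "anthropic",
    ("messages", "stream"): "anthropic",
    ("models", "generate_content"): "google-genai",
}

def attribute_chain(node: ast.expr) -> tuple:
    """Flatten e.g. client.chat.completions.create into a name tuple,
    ignoring the receiver variable."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    return tuple(reversed(parts))

def find_llm_calls(source: str):
    """Yield (line, sdk, model) for call sites matching a known pattern.
    model is None when it isn't a string literal (dynamic model name)."""
    for node in ast.walk(ast.parse(source)):
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute)):
            continue
        chain = attribute_chain(node.func)
        for suffix, sdk in PATTERNS.items():
            if chain[-len(suffix):] == suffix:
                model = next(
                    (kw.value.value for kw in node.keywords
                     if kw.arg == "model" and isinstance(kw.value, ast.Constant)),
                    None,
                )
                yield node.lineno, sdk, model

sample = """
response = client.chat.completions.create(model="gpt-4o", max_tokens=4096)
"""
print(list(find_llm_calls(sample)))  # → [(2, 'openai', 'gpt-4o')]
```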
## Example Output

### `tokentoll scan`

```
LLM API Calls Detected
============================================================
File: src/agents/summarizer.py
  Line 42: openai client.chat.completions.create
    Model: gpt-4o | Max tokens: 4096
    Est. cost/call: $0.03 | Monthly (1000 calls/month per call site): $26.50
  Line 78: openai client.chat.completions.create
    Model: gpt-4o-mini | Max tokens: 1000
    Est. cost/call: $0.000301 | Monthly (1000 calls/month per call site): $0.30
--
Total estimated monthly cost: $26.80
1000 calls/month per call site
```
### `tokentoll diff`

```
LLM Cost Diff: main..feature-branch
============================================================
+ ADDED    src/agents/rewriter.py:35
    openai | Model: gpt-4o
    Est. cost/call: $0.03 | Monthly: +$26.50
~ MODIFIED src/agents/summarizer.py:42
    openai | Model: gpt-4o -> gpt-4o-mini
    Est. cost/call: $0.03 -> $0.000301 | Monthly: -$26.20
--
Monthly cost impact: +$0.30
Added: 1 | Changed: 1 | Removed: 0
1000 calls/month per call site
```
## How It Works

```
Source Code (.py files)
        |
        v
+-------------+      +------------------+
| AST Scanner |----->| SDK Detectors    |
| (ast.parse) |      | OpenAI, Anthropic|
+-------------+      | Google, LiteLLM  |
                     | LangChain        |
                     +------------------+
                              |
                              v
                     +------------------+
                     | Pricing Engine   |
                     | 2200+ models     |
                     | Auto-cached      |
                     +------------------+
                              |
                  +-----------+-----------+
                  |                       |
                  v                       v
          +-------------+        +--------------+
          | Scan Report |        | Diff Engine  |
          | (costs)     |        | (old vs new) |
          +-------------+        +--------------+
                  |                       |
                  v                       v
          +-------------+        +--------------+
          | Table/JSON  |        | Table/JSON/  |
          |             |        | PR Comment   |
          +-------------+        +--------------+
```
- Parses Python files using the `ast` module to find LLM API calls
- Extracts the model name, `max_tokens`, and prompt content where possible
- Looks up pricing from a local cache (sourced from LiteLLM, 2200+ models)
- In diff mode, compares call sites between two git refs and computes the cost delta
- Outputs a cost report as a table, JSON, or GitHub PR comment
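The diff step boils down to set operations over call sites keyed by location. A minimal sketch, assuming a `file:line -> cost per call` mapping (the key format and per-call cost field are illustrative, not tokentoll's internal schema):

```python
# Sketch of a cost diff between two scans. Keys are "file:line" call
# sites; values are estimated dollars per call (illustrative schema).

def cost_diff(old: dict, new: dict, calls_per_month: int = 1000):
    """Return (added, removed, changed, monthly_delta_in_dollars)."""
    added = {k: new[k] for k in new.keys() - old.keys()}
    removed = {k: old[k] for k in old.keys() - new.keys()}
    changed = {k: (old[k], new[k]) for k in old.keys() & new.keys()
               if old[k] != new[k]}
    per_call_delta = (
        sum(added.values())
        - sum(removed.values())
        + sum(n - o for o, n in changed.values())
    )
    return added, removed, changed, per_call_delta * calls_per_month

# Mirrors the example diff above: one call added, one model downgraded.
old = {"src/agents/summarizer.py:42": 0.0265}
new = {"src/agents/summarizer.py:42": 0.000301,
       "src/agents/rewriter.py:35": 0.0265}
added, removed, changed, monthly = cost_diff(old, new)
print(f"Monthly cost impact: {monthly:+.2f}")  # → Monthly cost impact: +0.30
```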
## CLI Reference

```
tokentoll scan [PATH...] [--format table|json|markdown] [--calls-per-month N]
tokentoll diff [REF] [--base REF] [--head REF] [--format table|json|markdown|github-comment]
tokentoll update    # Update bundled pricing data
```
## Pricing Data

Pricing data is bundled, so the tool works offline. To update to the latest prices:

```bash
tokentoll update
```

Pricing data is sourced from LiteLLM's `model_prices_and_context_window.json` and covers 2200+ models across OpenAI, Anthropic, Google, AWS Bedrock, Azure, and more.
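Given per-token prices like those in LiteLLM's file, the per-call estimate is straightforward arithmetic. A minimal sketch, with placeholder prices and a hypothetical `estimate_cost` helper (the `input_cost_per_token` / `output_cost_per_token` key names follow the upstream LiteLLM file's convention):

```python
# LiteLLM-style pricing entries: dollars per token. The numbers below
# are placeholders; real values come from the bundled pricing file.
PRICING = {
    "gpt-4o-mini": {
        "input_cost_per_token": 0.15e-6,   # placeholder $/input token
        "output_cost_per_token": 0.60e-6,  # placeholder $/output token
    },
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated dollars for one call: tokens in/out times per-token prices."""
    entry = PRICING[model]
    return (input_tokens * entry["input_cost_per_token"]
            + output_tokens * entry["output_cost_per_token"])

cost = estimate_cost("gpt-4o-mini", input_tokens=500, output_tokens=400)
print(f"${cost:.6f}")  # 500*0.15e-6 + 400*0.60e-6 → $0.000315
```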
## Limitations

- Static analysis only -- cannot estimate costs for dynamic model names or computed prompts (these call sites are flagged but not priced)
- Token estimates use a characters/4 heuristic (roughly 4 characters per token) unless `tiktoken` is installed
- Monthly estimates assume uniform call volume (configurable via `--calls-per-month`)
- Python only for now (JS/TS support planned)
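The characters/4 fallback mentioned above can be sketched as follows. This is an illustration of the heuristic, not tokentoll's actual implementation; the function names are hypothetical.

```python
def heuristic_tokens(text: str) -> int:
    # Roughly 4 characters per token for typical English text.
    return max(1, len(text) // 4)

def estimate_tokens(text: str) -> int:
    # Prefer exact counts when tiktoken is available; fall back to
    # the heuristic otherwise.
    try:
        import tiktoken
        return len(tiktoken.get_encoding("cl100k_base").encode(text))
    except ImportError:
        return heuristic_tokens(text)

print(heuristic_tokens("Summarize this document in three bullet points."))  # → 11
```

Because the heuristic only approximates real tokenizer output, per-call estimates for prompts analyzed this way should be read as order-of-magnitude figures.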
## License
MIT
## File details

Details for the file `tokentoll-0.1.0.tar.gz`.

### File metadata

- Download URL: tokentoll-0.1.0.tar.gz
- Upload date:
- Size: 57.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.20

### File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `de0a4e80e18ec5c3eecb5058daca05e8d17886438ad546365d08487be461d298` |
| MD5 | `cc318ccf6e4b18e044029e2b9873504c` |
| BLAKE2b-256 | `f55934847a27bb7bc225ebb5f4ea372de9dd8b4985fab191d3c6131159d5259f` |
## File details

Details for the file `tokentoll-0.1.0-py3-none-any.whl`.

### File metadata

- Download URL: tokentoll-0.1.0-py3-none-any.whl
- Upload date:
- Size: 57.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.20

### File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `291930a907c7c6f92fdda78cb512dc757b8699789b9d75730ae5b548c25b11e7` |
| MD5 | `826d816cba98106907bbbf263a6aa85b` |
| BLAKE2b-256 | `81d2f079492b09ecb74685a0d29a986aa5243914c929ecb78cd9bc340369c162` |