tokentoll

Catch LLM cost changes in code review. Infracost for LLM spend.

A CLI tool and GitHub Action that statically analyzes your code for LLM API calls, estimates their cost, and shows you the cost impact of every change -- in your terminal or as a PR comment. Zero runtime dependencies.

The Problem

A single model swap from gpt-4o-mini to gpt-4o increases costs roughly 15x. A new API call in a hot path can add $10,000/month to your bill. Changes like these are easy to miss in normal code review.

tokentoll finds LLM API calls in your code, estimates their cost, and shows you the cost impact of every change -- before it hits production.

Quick Start

pip install tokentoll

# Scan current directory for LLM API calls and their costs
tokentoll scan .

# Show cost impact of your last commit
tokentoll diff HEAD~1

# Compare two branches
tokentoll diff main..feature-branch

GitHub Action

name: LLM Cost Diff
on:
  pull_request:
    paths:
      - "**.py"

permissions:
  pull-requests: write

jobs:
  cost-diff:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - uses: Jwrede/tokentoll@v1
        with:
          calls-per-month: "5000"

What It Detects

| SDK | Patterns | Status |
| --- | --- | --- |
| OpenAI | `chat.completions.create`, `responses.create` | Supported |
| Anthropic | `messages.create`, `messages.stream` | Supported |
| Google GenAI | `models.generate_content` | Supported |
| LiteLLM | `completion`, `acompletion` | Supported |
| LangChain | `ChatOpenAI`, `ChatAnthropic`, `init_chat_model` | Supported |
| JS/TS SDKs | -- | Planned |

Example Output

`tokentoll scan`

LLM API Calls Detected
============================================================

File: src/agents/summarizer.py
  Line 42: openai client.chat.completions.create
           Model: gpt-4o | Max tokens: 4096
           Est. cost/call: $0.03 | Monthly (1000 calls/month per call site): $26.50

  Line 78: openai client.chat.completions.create
           Model: gpt-4o-mini | Max tokens: 1000
           Est. cost/call: $0.000301 | Monthly (1000 calls/month per call site): $0.30

--
Total estimated monthly cost: $26.80
  1000 calls/month per call site
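
The monthly figures in the report above are per-call cost multiplied by the assumed call volume for each call site. A minimal sketch of that arithmetic follows. The names `CallSite` and `monthly_cost` are illustrative and not part of tokentoll, and the unrounded per-call cost of $0.0265 for line 42 is an assumption inferred from the $26.50 monthly figure (the report displays it rounded to $0.03).

```python
from dataclasses import dataclass


@dataclass
class CallSite:
    location: str
    cost_per_call: float  # USD per call; values below are taken or inferred from the report


def monthly_cost(site: CallSite, calls_per_month: int = 1000) -> float:
    """Monthly cost for one call site, assuming a uniform call volume."""
    return site.cost_per_call * calls_per_month


sites = [
    CallSite("src/agents/summarizer.py:42", 0.0265),    # assumed unrounded value
    CallSite("src/agents/summarizer.py:78", 0.000301),  # as shown in the report
]

total = sum(monthly_cost(site) for site in sites)
print(f"Total estimated monthly cost: ${total:.2f}")  # Total estimated monthly cost: $26.80
```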

`tokentoll diff`

LLM Cost Diff: main..feature-branch
============================================================

+ ADDED    src/agents/rewriter.py:35
           openai | Model: gpt-4o
           Est. cost/call: $0.03 | Monthly: +$26.50

~ MODIFIED src/agents/summarizer.py:42
           openai | Model: gpt-4o -> gpt-4o-mini
           Est. cost/call: $0.03 -> $0.000301 | Monthly: -$26.20

--
Monthly cost impact: +$0.30
  Added: 1 | Changed: 1 | Removed: 0
  1000 calls/month per call site
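
The summary line above is the sum of the per-site deltas. The sketch below reproduces that calculation under a simplifying assumption: call sites are matched between the two refs by a `file:line` key. How tokentoll actually matches call sites across refs is not described here, and `diff_costs` is a hypothetical helper, not the tool's API.

```python
def diff_costs(base: dict[str, float], head: dict[str, float]):
    """Classify call sites and compute the net monthly cost impact.

    Both arguments map a call-site key to its estimated monthly cost in USD.
    """
    added = {key: head[key] for key in head.keys() - base.keys()}
    removed = {key: base[key] for key in base.keys() - head.keys()}
    changed = {
        key: head[key] - base[key]
        for key in base.keys() & head.keys()
        if head[key] != base[key]
    }
    impact = sum(added.values()) - sum(removed.values()) + sum(changed.values())
    return added, changed, removed, impact


# Monthly costs taken from the example reports in this README.
base = {
    "src/agents/summarizer.py:42": 26.50,  # gpt-4o
    "src/agents/summarizer.py:78": 0.30,   # unchanged between refs
}
head = {
    "src/agents/summarizer.py:42": 0.30,   # now gpt-4o-mini
    "src/agents/summarizer.py:78": 0.30,
    "src/agents/rewriter.py:35": 26.50,    # new call site
}

added, changed, removed, impact = diff_costs(base, head)
print(f"Monthly cost impact: {impact:+.2f}")
print(f"Added: {len(added)} | Changed: {len(changed)} | Removed: {len(removed)}")
```

Unchanged call sites contribute nothing to the impact, which is why a large addition and a large saving can nearly cancel out in the total while both still appear in the report.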

How It Works

  Source Code (.py files)
         |
         v
  +-------------+     +------------------+
  | AST Scanner |---->| SDK Detectors    |
  | (ast.parse) |     | OpenAI, Anthropic|
  +-------------+     | Google, LiteLLM  |
                       | LangChain        |
                       +------------------+
                              |
                              v
                       +------------------+
                       | Pricing Engine   |
                       | 2200+ models     |
                       | Auto-cached      |
                       +------------------+
                              |
                  +-----------+-----------+
                  |                       |
                  v                       v
           +------------+         +-------------+
           | Scan Report|         | Diff Engine  |
           | (costs)    |         | (old vs new) |
           +------------+         +-------------+
                  |                       |
                  v                       v
           +------------+         +-------------+
           | Table/JSON |         | Table/JSON/  |
           |            |         | PR Comment   |
           +------------+         +-------------+

1. Parses Python files using the ast module to find LLM API calls
2. Extracts model name, max_tokens, and prompt content where possible
3. Looks up pricing from a local cache (sourced from LiteLLM, 2200+ models)
4. For diff mode: compares calls between two git refs and computes the cost delta
5. Outputs a cost report as a table, JSON, or GitHub PR comment
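
The first two steps can be illustrated with the standard library alone. This is a minimal sketch of the general technique, not tokentoll's actual detector: `dotted_name` and `find_calls` are hypothetical helpers, and it only handles the OpenAI `chat.completions.create` pattern with literal keyword arguments. The sample source is parsed, never executed, so the `openai` package does not need to be installed.

```python
import ast

SOURCE = '''
from openai import OpenAI

client = OpenAI()
client.chat.completions.create(model="gpt-4o", max_tokens=4096, messages=[])
client.chat.completions.create(model=pick_model(), messages=[])
'''


def dotted_name(node: ast.AST) -> str:
    """Rebuild a dotted name such as 'client.chat.completions.create'."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
    return ".".join(reversed(parts))


def literal(node):
    """Return the value of a literal argument, or None if it is computed."""
    return node.value if isinstance(node, ast.Constant) else None


def find_calls(source: str, suffix: str = "chat.completions.create"):
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and dotted_name(node.func).endswith(suffix):
            kwargs = {kw.arg: kw.value for kw in node.keywords if kw.arg}
            found.append({
                "line": node.lineno,
                "model": literal(kwargs.get("model")),
                "max_tokens": literal(kwargs.get("max_tokens")),
            })
    return sorted(found, key=lambda call: call["line"])


for call in find_calls(SOURCE):
    print(call)
```

The second call passes a computed model name, so the sketch reports `model` as `None`. That is the kind of call a static analyzer can flag but cannot price.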

CLI Reference

tokentoll scan [PATH...] [--format table|json|markdown] [--calls-per-month N]
tokentoll diff [REF] [--base REF] [--head REF] [--format table|json|markdown|github-comment]
tokentoll update    # Update bundled pricing data
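
For scripting, the CLI can be driven from Python. The sketch below builds a `scan` invocation from the flags listed above and parses the result. It assumes that `--format json` writes a single JSON document to stdout and that the process exits with status zero on success; neither is confirmed here, and no particular JSON schema is assumed. `scan_command` and `run_scan` are hypothetical helper names.

```python
import json
import subprocess


def scan_command(paths, calls_per_month=1000):
    """Build the argument list for `tokentoll scan` with JSON output."""
    return [
        "tokentoll", "scan", *paths,
        "--format", "json",
        "--calls-per-month", str(calls_per_month),
    ]


def run_scan(paths, calls_per_month=1000):
    """Run the scan and return the parsed JSON (requires tokentoll on PATH)."""
    result = subprocess.run(
        scan_command(paths, calls_per_month),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return json.loads(result.stdout)


print(" ".join(scan_command(["src"], calls_per_month=5000)))
```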

Pricing Data

Pricing is bundled and works offline. To update to the latest prices:

tokentoll update

Pricing data is sourced from LiteLLM's model_prices_and_context_window.json and covers 2200+ models across OpenAI, Anthropic, Google, AWS Bedrock, Azure, and more.
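
Entries in LiteLLM's pricing file carry per-token rates under the keys `input_cost_per_token` and `output_cost_per_token`. A minimal sketch of a lookup against data in that shape follows. The model names and rates are placeholders chosen for illustration, not real prices, and `cost_per_call` is a hypothetical helper rather than tokentoll's pricing engine.

```python
# Placeholder entries in the shape of LiteLLM's pricing file.
# The model names and rates are illustrative only, not real prices.
PRICES = {
    "example-large": {"input_cost_per_token": 2.5e-06, "output_cost_per_token": 1e-05},
    "example-small": {"input_cost_per_token": 1.5e-07, "output_cost_per_token": 6e-07},
}


def cost_per_call(model, input_tokens, output_tokens, prices=PRICES):
    """Estimated USD cost of one call, or None if the model has no price entry."""
    entry = prices.get(model)
    if entry is None:
        return None
    return (
        input_tokens * entry["input_cost_per_token"]
        + output_tokens * entry["output_cost_per_token"]
    )


print(cost_per_call("example-large", input_tokens=1000, output_tokens=500))
print(cost_per_call("unknown-model", input_tokens=1000, output_tokens=500))  # None
```

Returning `None` for an unknown model, rather than zero, keeps unpriced calls distinguishable from free ones in a report.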

Limitations

- Static analysis only -- cannot estimate costs for dynamic model names or computed prompts (these are flagged but not priced)
- Token estimates use a character/4 heuristic unless tiktoken is installed
- Monthly estimates assume uniform call volume (configurable via --calls-per-month)
- Python only for now (JS/TS support planned)
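
The token-estimate fallback can be sketched as below. This is an illustration under assumptions, not tokentoll's code: rounding up is a guess (the heuristic is only described as character/4), the function names are hypothetical, and the fallback encoding `cl100k_base` is an arbitrary choice for models tiktoken does not recognize.

```python
import math


def heuristic_tokens(text: str) -> int:
    """Approximate the token count as one token per four characters (rounded up)."""
    return math.ceil(len(text) / 4)


def estimate_tokens(text: str, model: str = "gpt-4o") -> int:
    """Use tiktoken when it is installed, otherwise fall back to the heuristic."""
    try:
        import tiktoken
    except ImportError:
        return heuristic_tokens(text)
    try:
        encoding = tiktoken.encoding_for_model(model)
    except KeyError:
        # tiktoken does not know this model; the fallback encoding is an assumption.
        encoding = tiktoken.get_encoding("cl100k_base")
    return len(encoding.encode(text))


prompt = "Summarize the following article in three bullet points."
print(heuristic_tokens(prompt))
```

The heuristic is cheap and dependency-free, but it can drift noticeably for code, non-English text, or very short prompts, which is why an exact tokenizer is preferred when available.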

License

MIT

Release history

0.6.1 -- May 4, 2026
0.6.0 -- May 4, 2026
0.5.2 -- May 4, 2026
0.5.0 -- May 3, 2026
0.4.0 -- May 3, 2026
0.3.0 -- May 3, 2026
0.2.0 -- May 3, 2026 (this version)
0.1.0 -- May 3, 2026

Download files

Source Distribution

tokentoll-0.2.0.tar.gz (60.0 kB), uploaded May 3, 2026, Source

Built Distribution

tokentoll-0.2.0-py3-none-any.whl (59.8 kB), uploaded May 3, 2026, Python 3

File details

Details for the file tokentoll-0.2.0.tar.gz.

File metadata

Download URL: tokentoll-0.2.0.tar.gz
Upload date: May 3, 2026
Size: 60.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.20

File hashes

Hashes for tokentoll-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`bfb85620ff2075446a7fe83410a3c86015ee58d7e5d687a7f99eb825019f8686`
MD5	`bab6b776933768ae980d0a2591715def`
BLAKE2b-256	`b3e52dba85c92d253344364e09a0d161df101b0533765f2013a93e06f1e50d8c`

File details

Details for the file tokentoll-0.2.0-py3-none-any.whl.

File metadata

Download URL: tokentoll-0.2.0-py3-none-any.whl
Upload date: May 3, 2026
Size: 59.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.20

File hashes

Hashes for tokentoll-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`9cec496fdd80d35fcb4fbfbf35d72651abc89b526705d028cdd62a58b7108444`
MD5	`81f18f902922db9e5b377458e4f83d2b`
BLAKE2b-256	`677a875d6b1692bc988cfca4e6da55edf22dada394f30670252729a5e19867b9`
