
Calculate LLM API call costs from token usage

Project description

modelcost

Calculate LLM API call costs from token usage, using price catalogs from multiple sources.

Supported pricing sources:

  • litellm (default)
  • openrouter
  • tokencost

Install

python -m pip install modelcost

CLI

The default command calculates cost, so you can omit the cost subcommand.

# Default (cost)
modelcost gpt-4o 1000 500

# Explicit cost (optional)
modelcost cost gpt-4o 1000 500

# All sources in one run
modelcost --source all gpt-4o 1000 500

# JSON output
modelcost --json gpt-4o 1000 500

Cached and reasoning tokens

Modern LLM APIs charge differently for cached input and reasoning output tokens. Pass them as optional flags — they default to 0 so existing usage is unchanged.

# Cached input tokens (served from prompt cache)
modelcost gpt-4o 1000 500 --cached-input-tokens 200

# Cache creation tokens (first-time cache writes)
modelcost gpt-4o 1000 500 --cache-creation-input-tokens 100

# Reasoning tokens (subset of output_tokens, e.g. o1/R1 thinking)
modelcost deepseek/deepseek-r1 2000 5000 --reasoning-tokens 3000

# All together
modelcost gpt-4.1-mini 1000 500 \
  --cached-input-tokens 200 \
  --cache-creation-input-tokens 100 \
  --reasoning-tokens 150

List models

modelcost models
modelcost models --source openrouter
modelcost models --filter gpt
modelcost models --json

Help

modelcost --help
modelcost models --help

Library

from modelcost.calculator import calculate_cost, list_models

# Basic usage (backward compatible)
result = calculate_cost("gpt-4o", 1000, 500)

for source in result.available_sources:
    print(f"{source.source}: ${source.total_cost_usd:.6f}")

# With cached and reasoning tokens
result = calculate_cost(
    "gpt-4.1-mini",
    input_tokens=1000,
    output_tokens=500,
    cached_input_tokens=200,
    cache_creation_input_tokens=100,
    reasoning_tokens=150,
)

s = result.sources[0]
print(f"${s.total_cost_usd:.6f}")
print(f"  cache read:  ${s.price_per_million_cache_read}/M")
print(f"  reasoning:   ${s.price_per_million_reasoning}/M")

models = list_models("openrouter")

Output details

calculate_cost() returns a CostResult with:

  • model, input_tokens, output_tokens
  • cached_input_tokens, cache_creation_input_tokens, reasoning_tokens (0 when not used)
  • sources: list of SourceCost objects
  • available_sources: only sources with prices found

Each SourceCost includes (a short usage sketch follows this list):

  • source
  • total_cost_usd
  • price_per_million_input, price_per_million_output
  • price_per_million_cache_read, price_per_million_cache_creation, price_per_million_reasoning (present only when the source has specific pricing for these)
  • error (when not available)
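
For instance, a minimal sketch that separates priced and unavailable sources using the fields above:

from modelcost.calculator import calculate_cost

result = calculate_cost("gpt-4o", 1000, 500)

# Sources where the model was found and priced
for s in result.available_sources:
    print(f"{s.source}: ${s.total_cost_usd:.6f} "
          f"(input {s.price_per_million_input}/M, output {s.price_per_million_output}/M)")

# Sources where the model is missing report an error instead of a price
for s in result.sources:
    if s.error:
        print(f"{s.source}: unavailable ({s.error})")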

Cost formula

The detail token counts are subsets of their parent totals: cached_input_tokens and cache_creation_input_tokens are subsets of input_tokens, and reasoning_tokens is a subset of output_tokens. Each is clamped so it never exceeds its parent. Rates are USD per million tokens:

text_input  = input_tokens  - cached_input - cache_creation
text_output = output_tokens - reasoning

total = ( text_input     * input_rate
        + cached_input   * cache_read_rate       (fallback: input_rate)
        + cache_creation * cache_creation_rate   (fallback: input_rate)
        + text_output    * output_rate
        + reasoning      * reasoning_rate        (fallback: output_rate)
        ) / 1,000,000

This matches how most APIs report usage — input_tokens and output_tokens are the totals including cached/reasoning, and the detail fields are subsets. When a specific rate is missing, the base rate for that category is used as fallback.
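
A plain-Python sketch of the same arithmetic, as an illustration of the formula above rather than the library's internal code (rates are USD per million tokens):

def estimate_cost(input_tokens, output_tokens,
                  cached_input=0, cache_creation=0, reasoning=0,
                  *, input_rate, output_rate,
                  cache_read_rate=None, cache_creation_rate=None, reasoning_rate=None):
    # Clamp the detail counts so they never exceed their parent totals
    cached_input = min(cached_input, input_tokens)
    cache_creation = min(cache_creation, input_tokens - cached_input)
    reasoning = min(reasoning, output_tokens)

    text_input = input_tokens - cached_input - cache_creation
    text_output = output_tokens - reasoning

    # Missing specific rates fall back to the base rate for that category
    cache_read_rate = input_rate if cache_read_rate is None else cache_read_rate
    cache_creation_rate = input_rate if cache_creation_rate is None else cache_creation_rate
    reasoning_rate = output_rate if reasoning_rate is None else reasoning_rate

    return (text_input * input_rate
            + cached_input * cache_read_rate
            + cache_creation * cache_creation_rate
            + text_output * output_rate
            + reasoning * reasoning_rate) / 1_000_000

# e.g. estimate_cost(1000, 500, cached_input=200,
#                    input_rate=0.4, output_rate=1.6, cache_read_rate=0.1) -> 0.00114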

JSON output

--json on the CLI and result.to_dict() in the library include the cache and reasoning fields only when they are non-zero:

{
  "model": "gpt-4.1-mini",
  "input_tokens": 1000,
  "output_tokens": 500,
  "cached_input_tokens": 200,
  "reasoning_tokens": 150,
  "costs": [
    {
      "source": "litellm",
      "total_cost_usd": 0.001148,
      "price_per_million_input": 0.4,
      "price_per_million_output": 1.6,
      "price_per_million_cache_read": 0.1,
      "price_per_million_cache_creation": 0.48,
      "price_per_million_reasoning": 1.6,
      "error": null
    }
  ]
}
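
The same dictionary is available from the library; a short sketch using the documented to_dict() method:

import json

from modelcost.calculator import calculate_cost

result = calculate_cost(
    "gpt-4.1-mini",
    input_tokens=1000,
    output_tokens=500,
    cached_input_tokens=200,
    reasoning_tokens=150,
)
print(json.dumps(result.to_dict(), indent=2))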

Caching

openrouter responses are cached in ~/.modelcost_cache.json for 1 hour.
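
To force a fresh fetch before the hour is up, the cache file can simply be deleted:

from pathlib import Path

# Removes the documented cache file; the next openrouter lookup fetches fresh prices
Path.home().joinpath(".modelcost_cache.json").unlink(missing_ok=True)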

Notes

  • Prices are fetched at runtime from the upstream catalogs.
  • If a model is missing in a source, that source is marked as unavailable.
  • Network sources are fetched in parallel when --source all is used.
  • tokencost does not expose cache or reasoning pricing, so in an all-sources comparison its cost can be higher than litellm's for calls that include cached or reasoning tokens (see the sketch after this list).
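
A sketch of such a comparison, assuming one calculate_cost call queries every source (as the first library example suggests); getattr is used because the cache-read rate field may be absent for sources like tokencost:

from modelcost.calculator import calculate_cost

result = calculate_cost(
    "gpt-4.1-mini", 1000, 500,
    cached_input_tokens=200,
    reasoning_tokens=150,
)
for s in result.available_sources:
    cache_rate = getattr(s, "price_per_million_cache_read", None)
    note = "" if cache_rate is not None else "  (no cache pricing: billed at the input rate)"
    print(f"{s.source}: ${s.total_cost_usd:.6f}{note}")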

Download files

Download the file for your platform.

Source Distribution

modelcost-0.1.6.tar.gz (147.1 kB)

Uploaded Source

Built Distribution


modelcost-0.1.6-py3-none-any.whl (9.3 kB)

Uploaded Python 3

File details

Details for the file modelcost-0.1.6.tar.gz.

File metadata

  • Download URL: modelcost-0.1.6.tar.gz
  • Upload date:
  • Size: 147.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: uv/0.11.6 {"installer":{"name":"uv","version":"0.11.6","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for modelcost-0.1.6.tar.gz

  • SHA256: 8a4534b1a207349f8a0d20319c7f063bb9d436eb7cf83ff9e968d647c676f2b7
  • MD5: e92107dc82ecbb95f056dedbd8a94086
  • BLAKE2b-256: 1cbe5e836963a92c9a03cd90223d7563bd2e4f70743e0cca89e24cd6dc448f3e


File details

Details for the file modelcost-0.1.6-py3-none-any.whl.

File metadata

  • Download URL: modelcost-0.1.6-py3-none-any.whl
  • Upload date:
  • Size: 9.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: uv/0.11.6 {"installer":{"name":"uv","version":"0.11.6","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for modelcost-0.1.6-py3-none-any.whl

  • SHA256: 4eab51af3e1dd7a4a87f55f4cd85ac987e21a25a146070dc9cb7db5b9bd86158
  • MD5: 446113613e186f876d905ad7d7348221
  • BLAKE2b-256: 1bfe06a7e068cc2e8507b08c0c98ee6e9d20cfbbc4db8848859f8b45c4d9e68a

