modelcost
Calculate LLM API call costs from token usage using price catalogs from multiple sources.
Supported pricing sources:
- litellm (default)
- openrouter
- tokencost
Install
python -m pip install modelcost
CLI
The default command calculates cost, so you can omit the cost subcommand.
# Default (cost)
modelcost gpt-4o 1000 500
# Explicit cost (optional)
modelcost cost gpt-4o 1000 500
# All sources in one run
modelcost --source all gpt-4o 1000 500
# JSON output
modelcost --json gpt-4o 1000 500
Cached and reasoning tokens
Modern LLM APIs charge differently for cached input and reasoning output tokens. Pass them as optional flags — they default to 0 so existing usage is unchanged.
# Cached input tokens (served from prompt cache)
modelcost gpt-4o 1000 500 --cached-input-tokens 200
# Cache creation tokens (first-time cache writes)
modelcost gpt-4o 1000 500 --cache-creation-input-tokens 100
# Reasoning tokens (subset of output_tokens, e.g. o1/R1 thinking)
modelcost deepseek/deepseek-r1 2000 5000 --reasoning-tokens 3000
# All together
modelcost gpt-4.1-mini 1000 500 \
  --cached-input-tokens 200 \
  --cache-creation-input-tokens 100 \
  --reasoning-tokens 150
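If your application already has a usage payload from the provider, the flag values can be pulled out of it. Here is a hedged sketch assuming OpenAI-style field names (prompt_tokens_details.cached_tokens, completion_tokens_details.reasoning_tokens); other providers report these under different keys:

```python
# Map an OpenAI-style usage payload to modelcost CLI arguments.
# The nested field names are assumptions based on the OpenAI Chat
# Completions API; adjust for your provider.
usage = {
    "prompt_tokens": 1000,
    "completion_tokens": 500,
    "prompt_tokens_details": {"cached_tokens": 200},
    "completion_tokens_details": {"reasoning_tokens": 150},
}

args = [
    "modelcost", "gpt-4.1-mini",
    str(usage["prompt_tokens"]),
    str(usage["completion_tokens"]),
    "--cached-input-tokens",
    str(usage.get("prompt_tokens_details", {}).get("cached_tokens", 0)),
    "--reasoning-tokens",
    str(usage.get("completion_tokens_details", {}).get("reasoning_tokens", 0)),
]
print(" ".join(args))
```

The .get(..., 0) defaults mirror the CLI's own behavior: missing detail fields fall back to 0 and the cost is unchanged.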
List models
modelcost models
modelcost models --source openrouter
modelcost models --filter gpt
modelcost models --json
Help
modelcost --help
modelcost models --help
Library
from modelcost.calculator import calculate_cost, list_models
# Basic usage (backward compatible)
result = calculate_cost("gpt-4o", 1000, 500)
for source in result.available_sources:
    print(f"{source.source}: ${source.total_cost_usd:.6f}")
# With cached and reasoning tokens
result = calculate_cost(
    "gpt-4.1-mini",
    input_tokens=1000,
    output_tokens=500,
    cached_input_tokens=200,
    cache_creation_input_tokens=100,
    reasoning_tokens=150,
)
s = result.sources[0]
print(f"${s.total_cost_usd:.6f}")
print(f" cache read: ${s.price_per_million_cache_read}/M")
print(f" reasoning: ${s.price_per_million_reasoning}/M")
models = list_models("openrouter")
Output details
calculate_cost() returns a CostResult with:
- model, input_tokens, output_tokens
- cached_input_tokens, cache_creation_input_tokens, reasoning_tokens (0 when not used)
- sources: list of SourceCost objects
- available_sources: only sources with prices found
Each SourceCost includes:
- source
- total_cost_usd
- price_per_million_input, price_per_million_output
- price_per_million_cache_read, price_per_million_cache_creation, price_per_million_reasoning (present only when the source has specific pricing for these)
- error (when not available)
Cost formula
All subset tokens (cached_input_tokens, cache_creation_input_tokens, reasoning_tokens)
are treated as subsets of their parent total and are clamped accordingly:
text_input = input_tokens - cached_input - cache_creation
text_output = output_tokens - reasoning_tokens
total = text_input * input_rate
      + cached_input * cache_read_rate        (fallback: input_rate)
      + cache_creation * cache_creation_rate  (fallback: input_rate)
      + text_output * output_rate
      + reasoning * reasoning_rate            (fallback: output_rate)
This matches how most APIs report usage — input_tokens and output_tokens are the
totals including cached/reasoning, and the detail fields are subsets.
When a specific rate is missing, the base rate for that category is used as fallback.
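The formula above can be sketched as a standalone function. This is a simplified re-implementation for illustration, not the library's actual code; the rate keys are the per-million prices shown in the output fields:

```python
def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    rates: dict,  # per-million-token prices, e.g. {"input": 0.4, "output": 1.6}
    cached_input: int = 0,
    cache_creation: int = 0,
    reasoning: int = 0,
) -> float:
    # Clamp subset counts so they never exceed their parent totals.
    cached_input = min(cached_input, input_tokens)
    cache_creation = min(cache_creation, input_tokens - cached_input)
    reasoning = min(reasoning, output_tokens)

    text_input = input_tokens - cached_input - cache_creation
    text_output = output_tokens - reasoning

    # Missing specific rates fall back to the base rate for that category.
    cache_read_rate = rates.get("cache_read", rates["input"])
    cache_creation_rate = rates.get("cache_creation", rates["input"])
    reasoning_rate = rates.get("reasoning", rates["output"])

    total = (
        text_input * rates["input"]
        + cached_input * cache_read_rate
        + cache_creation * cache_creation_rate
        + text_output * rates["output"]
        + reasoning * reasoning_rate
    )
    return total / 1_000_000


# The gpt-4.1-mini rates from the JSON example reproduce its total:
rates = {"input": 0.4, "output": 1.6, "cache_read": 0.1,
         "cache_creation": 0.48, "reasoning": 1.6}
cost = estimate_cost(1000, 500, rates, cached_input=200,
                     cache_creation=100, reasoning=150)
print(f"{cost:.6f}")  # 0.001148
```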
JSON output
--json / result.to_dict() includes the new fields only when non-zero:
{
  "model": "gpt-4.1-mini",
  "input_tokens": 1000,
  "output_tokens": 500,
  "cached_input_tokens": 200,
  "reasoning_tokens": 150,
  "costs": [
    {
      "source": "litellm",
      "total_cost_usd": 0.001148,
      "price_per_million_input": 0.4,
      "price_per_million_output": 1.6,
      "price_per_million_cache_read": 0.1,
      "price_per_million_cache_creation": 0.48,
      "price_per_million_reasoning": 1.6,
      "error": null
    }
  ]
}
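The "only when non-zero" behavior can be sketched with a small serializer. This is a hypothetical stand-in for to_dict(), not the library's code:

```python
def usage_to_dict(model, input_tokens, output_tokens,
                  cached_input_tokens=0, cache_creation_input_tokens=0,
                  reasoning_tokens=0):
    # Always-present fields first.
    d = {"model": model,
         "input_tokens": input_tokens,
         "output_tokens": output_tokens}
    # Detail fields are emitted only when non-zero, so existing
    # consumers of the JSON never see new keys they didn't ask for.
    optional = {
        "cached_input_tokens": cached_input_tokens,
        "cache_creation_input_tokens": cache_creation_input_tokens,
        "reasoning_tokens": reasoning_tokens,
    }
    d.update({k: v for k, v in optional.items() if v})
    return d


payload = usage_to_dict("gpt-4.1-mini", 1000, 500,
                        cached_input_tokens=200, reasoning_tokens=150)
print(payload)
```

Note that cache_creation_input_tokens stays 0 here and is therefore absent from the result, matching the JSON example above.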
Caching
openrouter responses are cached in ~/.modelcost_cache.json for 1 hour.
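The caching described above is a time-to-live file cache. A minimal sketch of the idea follows; the function name and cache layout are illustrative, not the library's internals:

```python
import json
import tempfile
import time
from pathlib import Path


def fetch_with_cache(fetch, cache_path: Path, ttl_seconds: int = 3600):
    """Return cached JSON if it is younger than ttl_seconds, else re-fetch."""
    if cache_path.exists():
        cached = json.loads(cache_path.read_text())
        if time.time() - cached["fetched_at"] < ttl_seconds:
            return cached["data"]
    data = fetch()
    cache_path.write_text(json.dumps({"fetched_at": time.time(), "data": data}))
    return data


# Demo with a counting stand-in for the network call.
calls = {"n": 0}

def fake_fetch():
    calls["n"] += 1
    return {"gpt-4o": {"input": 2.5}}

path = Path(tempfile.mkdtemp()) / "modelcost_cache.json"
first = fetch_with_cache(fake_fetch, path)
second = fetch_with_cache(fake_fetch, path)  # served from the file, no re-fetch
```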
Notes
- Prices are fetched at runtime from the upstream catalogs.
- If a model is missing in a source, that source is marked as unavailable.
- Network sources are fetched in parallel for the all option.
- tokencost does not expose cache/reasoning pricing; when used with source="all", its cost may be higher than litellm's for calls that include cached or reasoning tokens.