# forecost

Know what your AI project will cost. Before you build it.
Python 3.10+ required. forecost is in alpha: APIs may change and some features are experimental.

See `forecost demo` for a live preview.
## The Problem
LLM API costs are unpredictable. You prototype with GPT-4, ship to production, and the first month's bill arrives as a surprise. Most teams have no way to forecast spend until it's too late. forecost fixes this by learning from your actual usage and giving you accurate cost projections before you scale.
## Quick Start

Full walkthrough from install to forecast:

```shell
pip install forecost
cd your-project
forecost init
```
Add to your app's entry point (before any LLM calls):

```python
import forecost

forecost.auto_track()
```
Call `auto_track()` early, before any `httpx` usage. If your app imports `httpx` before forecost, the interceptor may not attach correctly.
Run your app as usual. After a few days of recorded usage:

```shell
forecost forecast
```
## See It in Action

`forecost demo` runs a forecast with sample data and no setup. Use it to see the full output before tracking your own project.
## Auto-Tracking

Non-streaming calls are tracked automatically. No decorators, no manual logging.

Streaming limitation: forecost cannot intercept streaming responses automatically. You must call `log_stream_usage` after consuming the stream. Pass the accumulated response dict containing a `usage` key (and optionally `model` for identification):
```python
import forecost

forecost.auto_track()

# Example: OpenAI streaming.
# Note: the OpenAI SDK only includes usage on the final chunk when the
# request sets stream_options={"include_usage": True}.
response = client.chat.completions.create(model="gpt-4", messages=[...], stream=True)
accumulated = {"usage": {"prompt_tokens": 0, "completion_tokens": 0}, "model": "gpt-4"}
for chunk in response:
    if chunk.usage:
        accumulated["usage"] = {
            "prompt_tokens": chunk.usage.prompt_tokens,
            "completion_tokens": chunk.usage.completion_tokens,
        }
    if chunk.model:
        accumulated["model"] = chunk.model
forecost.log_stream_usage(accumulated)
```
For Anthropic, use `input_tokens` and `output_tokens` instead of `prompt_tokens` and `completion_tokens`.
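For intuition, the Anthropic-style accumulation can be sketched against stand-in events (the event dicts below are hand-written mock-ups, not real anthropic SDK objects; Anthropic's `message_delta` usage carries a running total of output tokens):

```python
# Sketch: building the dict log_stream_usage expects from Anthropic-style
# streaming events. These events are hand-written stand-ins, not SDK objects.
events = [
    {"type": "message_start", "usage": {"input_tokens": 120, "output_tokens": 0}},
    {"type": "message_delta", "usage": {"output_tokens": 45}},
    {"type": "message_delta", "usage": {"output_tokens": 80}},
]

accumulated = {"usage": {"input_tokens": 0, "output_tokens": 0}, "model": "claude-3-5-sonnet"}
for event in events:
    usage = event.get("usage", {})
    if "input_tokens" in usage:
        accumulated["usage"]["input_tokens"] = usage["input_tokens"]
    if "output_tokens" in usage:
        # message_delta usage reports a running total, so overwrite, don't add.
        accumulated["usage"]["output_tokens"] = usage["output_tokens"]

# After the stream is fully consumed:
# forecost.log_stream_usage(accumulated)
```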
## Manual Tracking

For fine-grained control, use the `@track_cost` decorator or `log_call`:

```python
import forecost

@forecost.track_cost(provider="openai")
def call_gpt(prompt: str):
    return openai.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )

# Or log calls manually
forecost.log_call(model="gpt-4", tokens_in=500, tokens_out=200, provider="openai")
```
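For intuition about what a single logged call contributes to spend, here is back-of-the-envelope cost math (the per-1K-token prices are illustrative assumptions, not forecost's internal pricing table):

```python
# Cost of the log_call example above: 500 tokens in, 200 tokens out.
# Prices are placeholder USD-per-1K-token rates, not current GPT-4 pricing.
PRICE_IN_PER_1K = 0.03
PRICE_OUT_PER_1K = 0.06

tokens_in, tokens_out = 500, 200
cost = tokens_in / 1000 * PRICE_IN_PER_1K + tokens_out / 1000 * PRICE_OUT_PER_1K
print(f"${cost:.4f}")  # → $0.0270
```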
## Commands

| Command | Description |
|---|---|
| `forecost init` | Initialize project and create `.forecost.toml` config |
| `forecost init --budget X` | Set a budget cap in USD |
| `forecost forecast` | Show cost forecast in terminal |
| `forecost forecast --output markdown` | Output forecast as Markdown |
| `forecost forecast --output csv` | Output forecast as CSV |
| `forecost forecast --tui` | Interactive TUI dashboard (requires `pip install forecost[tui]`) |
| `forecost forecast --json` | JSON output for CI/scripts |
| `forecost forecast --brief` | One-line summary (same format as `status`) |
| `forecost forecast --exit-code` | Exit 1 if projected over budget, 2 if actual over budget (for CI) |
| `forecost status` | One-line summary: spend, projected total, day count, drift status |
| `forecost track` | View recent tracked LLM calls |
| `forecost watch` | Live cost dashboard; updates as your app makes calls |
| `forecost export --format csv` | Export usage data as CSV |
| `forecost export --format json` | Export usage data as JSON |
| `forecost demo` | Run forecast with sample data, no setup needed |
| `forecost optimize` | Suggest cost optimizations based on usage |
| `forecost reset` | Reset the current project (optionally keep usage logs) |
| `forecost serve` | Run local API server for programmatic access |
`status` and `forecast --brief` both show the same one-line summary. Use `status` when you only need a quick check; use `forecast --brief` when you want that format in a script or CI pipeline.
## Budget Enforcement

Set a budget at init with `--budget`:

```shell
forecost init --budget 100
```

Use `--exit-code` on `forecast` to fail CI when over budget:

```yaml
- name: Check LLM Budget
  run: |
    pip install forecost
    forecost forecast --exit-code
```
Exit codes: 0 = on track, 1 = projected over budget, 2 = actual spend over budget.
## Disabling in Tests

```shell
FORECOST_DISABLED=1 pytest
```

Or in code:

```python
forecost.disable()
```
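In a pytest suite, the environment-variable route can be wired once in `conftest.py` so every run disables tracking (a sketch; it assumes forecost reads `FORECOST_DISABLED` at call time, as the command above implies):

```python
# conftest.py: disable forecost tracking for the whole test session.
# setdefault lets a developer re-enable tracking by exporting FORECOST_DISABLED=0.
import os

os.environ.setdefault("FORECOST_DISABLED", "1")
```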
## Forecasting Accuracy
forecost uses an ensemble of three statistical forecasting methods (Simple Exponential Smoothing, Damped Trend, and Linear Regression) inspired by the M4 Forecasting Competition, where simple combinations beat complex ML models across 100,000 time series.
| Metric | What it means | Typical result |
|---|---|---|
| MASE | Are we beating a naive guess? | < 1.0 after 5 days |
| MAE | How many dollars could we be off? | Decreases as data grows |
| 80% interval | Will the real cost land here? | ~80% of the time |
| 95% interval | Conservative budget range | ~95% of the time |
Install the ensemble engine for best results: `pip install forecost[forecast]`
The base install uses a simpler exponential moving average that works without additional dependencies.
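The fallback can be pictured as an exponential moving average over daily spend, extrapolated forward. This is an illustrative sketch of the idea, not forecost's actual implementation (the smoothing factor and sample figures are made up):

```python
# Exponential moving average of daily spend, projected to a 30-day total.
def ema_daily_rate(daily_spend, alpha=0.3):
    rate = daily_spend[0]
    for day in daily_spend[1:]:
        # Blend each new day's spend into the running rate.
        rate = alpha * day + (1 - alpha) * rate
    return rate

spend = [1.20, 1.45, 0.90, 1.60, 1.35]  # USD per day, illustrative
rate = ema_daily_rate(spend)
projected_month = rate * 30  # naive extrapolation to a monthly total
```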
## Why forecost?
| Feature | forecost | LiteLLM | Helicone | LangSmith |
|---|---|---|---|---|
| Cost tracking | Yes | Yes | Yes | Yes |
| Cost forecasting | Yes | No | No | No |
| Prediction intervals | Yes | No | No | No |
| Zero infrastructure | Yes | No (proxy) | No (cloud) | No (cloud) |
| Zero overhead on requests | Yes (post-response) | No (proxy latency) | No (proxy latency) | No (SDK wrapper) |
| Local-only / private | Yes | Partial | No | No |
| pip install, 2 lines | Yes | SDK wrapper | Proxy setup | SDK setup |
| Free forever | Yes | Freemium | Freemium | $39/seat/mo |
Minimal footprint: 3 runtime dependencies (`click`, `rich`, `httpx`; plus `tomli` on Python 3.10), under 3MB.
## Data Storage

- Usage and forecasts: `~/.forecost/costs.db` (SQLite). All projects share this database.
- Project config: `.forecost.toml` in your project root. Contains project name, baseline days, and optional budget.
## Glossary
| Term | Meaning |
|---|---|
| Confidence levels | How reliable the forecast is based on data volume: low (0 days), medium-low (1-3), medium (4-7), high (8-14), very-high (15+). More usage data yields higher confidence. |
| Drift status | Whether spend is trending above or below the baseline: `on_track`, `over_budget`, or `under_budget`. Based on recent daily burn ratios. |
| MASE | Mean Absolute Scaled Error. Compares forecast accuracy to a naive "yesterday = tomorrow" guess. MASE < 1.0 means the forecast beats the naive baseline. |
| Stability | How much the forecast changes between runs: converged (< 5% change), stabilizing (5-15%), or adjusting (> 15%). |
| Prediction intervals | 80% and 95% ranges around the projected total. The real cost will fall within the 80% interval about 80% of the time. |
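The MASE entry can be made concrete: scale the forecast's mean absolute error by the error of the naive "yesterday = tomorrow" baseline. A minimal sketch with made-up daily-spend numbers:

```python
def mase(actual, forecast):
    # Mean absolute error of the forecast itself.
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    # Naive baseline: predict each day with the previous day's actual value.
    naive_mae = sum(abs(curr - prev) for prev, curr in zip(actual, actual[1:])) / (len(actual) - 1)
    return mae / naive_mae

actual = [1.0, 1.2, 0.9, 1.4, 1.1]    # observed daily spend (USD), illustrative
forecast = [1.1, 1.1, 1.0, 1.3, 1.2]  # forecasted daily spend (USD)
ratio = mase(actual, forecast)  # < 1.0 means the forecast beats the naive guess
```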
## Local API Server

`forecost serve` starts a local HTTP server (default port 8787) for programmatic access:

| Endpoint | Description |
|---|---|
| `GET /api/health` | Health check. Returns `{"status": "ok"}`. |
| `GET /api/forecast` | Full forecast result (same as `forecost forecast --json`). |
| `GET /api/status` | Project status: active days, actual spend, baseline info. |
| `GET /api/costs` | Recent usage logs. |
Run from your project directory so forecost can find `.forecost.toml`.
## Contributing

See CONTRIBUTING.md.

## License

MIT