
Know exactly what your AI project will cost. Local-first LLM cost forecasting that learns from your usage.


forecost

Know what your AI project will cost. Before you build it.


Python 3.10+ required. forecost is in Alpha: APIs may change and some features are experimental.

See forecost demo for a live preview.

The Problem

LLM API costs are unpredictable. You prototype with GPT-4, ship to production, and the first month's bill arrives as a surprise. Most teams have no way to forecast spend until it's too late. forecost fixes this by learning from your actual usage and giving you accurate cost projections before you scale.

Quick Start

Full walkthrough from install to forecast:

pip install forecost
cd your-project
forecost init

Add to your app's entry point (before any LLM calls):

import forecost
forecost.auto_track()

Call auto_track() early, before any httpx usage. If your app imports httpx before forecost, the interceptor may not attach correctly.

Run your app as usual. After a few days of recorded usage, generate a forecast:

forecost forecast

See It in Action

forecost demo runs a forecast with sample data and no setup. Use it to see the full output before tracking your own project.

Auto-Tracking

Non-streaming calls are tracked automatically. No decorators, no manual logging.

Streaming limitation: forecost cannot intercept streaming responses automatically. You must call log_stream_usage after consuming the stream. Pass the accumulated response dict containing a usage key (and optionally model for identification):

import forecost
forecost.auto_track()

# Example: OpenAI streaming (request usage on the final chunk)
response = client.chat.completions.create(
    model="gpt-4",
    messages=[...],
    stream=True,
    stream_options={"include_usage": True},
)
accumulated = {"usage": {"prompt_tokens": 0, "completion_tokens": 0}, "model": "gpt-4"}
for chunk in response:
    if chunk.usage:
        accumulated["usage"] = {"prompt_tokens": chunk.usage.prompt_tokens,
                                "completion_tokens": chunk.usage.completion_tokens}
    if chunk.model:
        accumulated["model"] = chunk.model
forecost.log_stream_usage(accumulated)

For Anthropic, use input_tokens and output_tokens instead of prompt_tokens and completion_tokens.
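Anthropic's streaming API reports input_tokens on the message_start event and a running output_tokens total on message_delta events. A minimal accumulator sketch, assuming raw event dicts; the helper name and the default model string are illustrative, not part of forecost:

```python
def accumulate_anthropic_usage(events, model="claude-sonnet"):
    """Build the dict log_stream_usage expects from Anthropic stream events.

    Assumes input_tokens arrives on message_start and output_tokens on
    message_delta events (illustrative sketch, not part of forecost).
    """
    usage = {"input_tokens": 0, "output_tokens": 0}
    for event in events:
        if event.get("type") == "message_start":
            usage["input_tokens"] = event["message"]["usage"]["input_tokens"]
        elif event.get("type") == "message_delta":
            # message_delta carries the running output token total
            usage["output_tokens"] = event["usage"]["output_tokens"]
    return {"usage": usage, "model": model}

# forecost.log_stream_usage(accumulate_anthropic_usage(events))
```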

Manual Tracking

For fine-grained control, use the @track_cost decorator or log_call:

import forecost

@forecost.track_cost(provider="openai")
def call_gpt(prompt: str):
    return openai.chat.completions.create(model="gpt-4", messages=[{"role": "user", "content": prompt}])

# Or log calls manually
forecost.log_call(model="gpt-4", tokens_in=500, tokens_out=200, provider="openai")

Commands

| Command | Description |
| --- | --- |
| forecost init | Initialize project and create .forecost.toml config |
| forecost init --budget X | Set a budget cap in USD |
| forecost forecast | Show cost forecast in terminal |
| forecost forecast --output markdown | Output forecast as Markdown |
| forecost forecast --output csv | Output forecast as CSV |
| forecost forecast --tui | Interactive TUI dashboard (requires pip install forecost[tui]) |
| forecost forecast --json | JSON output for CI/scripts |
| forecost forecast --brief | One-line summary (same format as status) |
| forecost forecast --exit-code | Exit 1 if projected over budget, 2 if actual over budget (for CI) |
| forecost status | One-line summary: spend, projected total, day count, drift status |
| forecost track | View recent tracked LLM calls |
| forecost watch | Live cost dashboard; updates as your app makes calls |
| forecost export --format csv | Export usage data as CSV |
| forecost export --format json | Export usage data as JSON |
| forecost demo | Run forecast with sample data, no setup needed |
| forecost optimize | Suggest cost optimizations based on usage |
| forecost reset | Reset the current project (optionally keep usage logs) |
| forecost serve | Run local API server for programmatic access |

status and forecast --brief both show the same one-line summary. Use status when you only need a quick check; use forecast --brief when you want that format in a script or CI pipeline.

Budget Enforcement

Set a budget at init with --budget:

forecost init --budget 100

Use --exit-code on forecast to fail CI when over budget:

- name: Check LLM Budget
  run: |
    pip install forecost
    forecost forecast --exit-code

Exit codes: 0 = on track, 1 = projected over budget, 2 = actual spend over budget.
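Outside GitHub Actions, the exit codes can be mapped in a small wrapper. The helper below is illustrative, not part of forecost:

```python
def budget_state(code: int) -> str:
    """Map forecost's documented --exit-code values to a readable label."""
    return {0: "on track",
            1: "projected over budget",
            2: "actual spend over budget"}.get(code, "unknown")
```

A wrapper script might call subprocess.run(["forecost", "forecast", "--exit-code"]) and pass result.returncode to budget_state before deciding whether to page someone or just log.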

Disabling in Tests

FORECOST_DISABLED=1 pytest

Or in code:

forecost.disable()

Forecasting Accuracy

forecost uses an ensemble of three statistical forecasting methods (Simple Exponential Smoothing, Damped Trend, and Linear Regression) inspired by the M4 Forecasting Competition, where simple combinations beat complex ML models across 100,000 time series.

| Metric | What it means | Typical result |
| --- | --- | --- |
| MASE | Are we beating a naive guess? | < 1.0 after 5 days |
| MAE | How many dollars could we be off? | Decreases as data grows |
| 80% interval | Will the real cost land here? | ~80% of the time |
| 95% interval | Conservative budget range | ~95% of the time |
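MASE is simple to compute by hand: scale the forecast's mean error by the mean error of a naive "tomorrow = today" forecast. A sketch, not necessarily how forecost computes it internally:

```python
def mase(actual, forecast):
    """Mean Absolute Scaled Error: < 1.0 means we beat the naive baseline."""
    # Error of the naive forecast: predict each day equals the day before.
    naive_errors = [abs(actual[i] - actual[i - 1]) for i in range(1, len(actual))]
    forecast_errors = [abs(a - f) for a, f in zip(actual, forecast)]
    naive_mae = sum(naive_errors) / len(naive_errors)
    return (sum(forecast_errors) / len(forecast_errors)) / naive_mae
```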

Install the ensemble engine for best results: pip install forecost[forecast]

The base install uses a simpler exponential moving average that works without additional dependencies.
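The fallback can be sketched as a plain exponential moving average over daily spend, projected across the horizon. The function, alpha value, and horizon default are illustrative assumptions, not forecost's actual code:

```python
def ema_forecast(daily_costs, horizon_days=30, alpha=0.3):
    """Smooth daily spend with an EMA, then project the level forward."""
    level = daily_costs[0]
    for cost in daily_costs[1:]:
        # Each new day pulls the smoothed level toward the latest cost.
        level = alpha * cost + (1 - alpha) * level
    return level * horizon_days
```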

Why forecost?

| Feature | forecost | LiteLLM | Helicone | LangSmith |
| --- | --- | --- | --- | --- |
| Cost tracking | Yes | Yes | Yes | Yes |
| Cost forecasting | Yes | No | No | No |
| Prediction intervals | Yes | No | No | No |
| Zero infrastructure | Yes | No (proxy) | No (cloud) | No (cloud) |
| Zero overhead on requests | Yes (post-response) | No (proxy latency) | No (proxy latency) | No (SDK wrapper) |
| Local-only / private | Yes | Partial | No | No |
| pip install, 2 lines | Yes | SDK wrapper | Proxy setup | SDK setup |
| Free forever | Yes | Freemium | Freemium | $39/seat/mo |

Minimal footprint: 3 runtime dependencies (click, rich, httpx; plus tomli on Python 3.10), under 3MB.

Data Storage

  • Usage and forecasts: ~/.forecost/costs.db (SQLite). All projects share this database.
  • Project config: .forecost.toml in your project root. Contains project name, baseline days, and optional budget.

Glossary

| Term | Meaning |
| --- | --- |
| Confidence levels | How reliable the forecast is based on data volume: low (0 days), medium-low (1-3), medium (4-7), high (8-14), very-high (15+). More usage data yields higher confidence. |
| Drift status | Whether spend is trending above or below the baseline: on_track, over_budget, or under_budget. Based on recent daily burn ratios. |
| MASE | Mean Absolute Scaled Error. Compares forecast accuracy to a naive "yesterday = tomorrow" guess. MASE < 1.0 means the forecast beats the naive baseline. |
| Stability | How much the forecast changes between runs: converged (< 5% change), stabilizing (5-15%), or adjusting (> 15%). |
| Prediction intervals | 80% and 95% ranges around the projected total. The real cost will fall within the 80% interval about 80% of the time. |
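The stability thresholds translate to a simple classifier over consecutive projected totals. An illustrative sketch, not forecost's code:

```python
def stability(previous_total: float, current_total: float) -> str:
    """Classify run-to-run forecast change per the documented thresholds."""
    change = abs(current_total - previous_total) / previous_total
    if change < 0.05:
        return "converged"
    if change <= 0.15:
        return "stabilizing"
    return "adjusting"
```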

Local API Server

forecost serve starts a local HTTP server (default port 8787) for programmatic access:

| Endpoint | Description |
| --- | --- |
| GET /api/health | Health check. Returns {"status": "ok"}. |
| GET /api/forecast | Full forecast result (same as forecost forecast --json). |
| GET /api/status | Project status: active days, actual spend, baseline info. |
| GET /api/costs | Recent usage logs. |

Run from your project directory so forecost can find .forecost.toml.
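A minimal stdlib client sketch against these endpoints; the get_json helper is hypothetical and the base URL assumes the default port:

```python
import json
import urllib.request

def get_json(url: str) -> dict:
    """Fetch and decode a JSON payload from the local forecost API server."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

# With `forecost serve` running in your project directory:
# get_json("http://localhost:8787/api/health")    # {"status": "ok"}
# get_json("http://localhost:8787/api/forecast")  # full forecast result
```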

Contributing

See CONTRIBUTING.md.

License

MIT



Download files

Download the file for your platform.

Source Distribution

forecost-0.1.1.tar.gz (41.9 kB)


Built Distribution


forecost-0.1.1-py3-none-any.whl (38.7 kB)


File details

Details for the file forecost-0.1.1.tar.gz.

File metadata

  • Download URL: forecost-0.1.1.tar.gz
  • Size: 41.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for forecost-0.1.1.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | b3a2de48db2f731a3fde6f5f7df3d273cb23cf29df846406c2a3972718bb24c7 |
| MD5 | d0910d23fc8bd4cacbfb4f58c6b2458d |
| BLAKE2b-256 | c7d4808ec3aa7cc1be4221d53e0ebdba63270e952e7bc49cd17b272869324ab1 |


Provenance

The following attestation bundles were made for forecost-0.1.1.tar.gz:

Publisher: release.yml on ArivunidhiA/forecost

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file forecost-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: forecost-0.1.1-py3-none-any.whl
  • Size: 38.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for forecost-0.1.1-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 8200df3dcd0729f608e0e1cdd08d6435ea08f74d1b0d42f4316f72eb1239df0e |
| MD5 | 566d60d05fd68d1adc125a4c616052c7 |
| BLAKE2b-256 | 4636b47f28277f271fbf84e86a5d806215e0664b4014c4074dc1347d34e59af2 |


Provenance

The following attestation bundles were made for forecost-0.1.1-py3-none-any.whl:

Publisher: release.yml on ArivunidhiA/forecost

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
