
The billing layer for AI agents — self-hosted, Stripe-integrated usage metering and prepaid-credit billing for LLM/agent spend.


# Plutus: LLM FinOps Tool


Named for the Greek god of wealth. Plutus is a standalone provider credit monitor and cost-aware model router for multi-provider LLM stacks.

pip install plutus-agent          # PyPI; stdlib + PyYAML, Stripe optional
plutus demo                       # zero-setup tour with a month of sample data
#   → open http://localhost:8420

Plutus, god of wealth: provider credit monitor (generated 2026-06-20 09:08:00)

| PROVIDER | BALANCE | REMAIN | TODAY | 7D | 30D | ALL | $/DAY | DAYS | SRC |
|---|---|---|---|---|---|---|---|---|---|
| deepseek | $49.74 | — | $50.51 | $99.88 | $198.91 | $198.91 | $14.27 | 4 | live |
| anthropic | — | — | $55.73 | $55.73 | $55.73 | $55.73 | $7.96 | ∞ | est |
| google | — | — | $0.01 | $0.15 | $0.15 | $0.15 | $0.02 | ∞ | est |
| openai | — | — | $0.00 | $0.00 | $0.00 | $0.00 | $0.00 | ∞ | est |
| **TOTAL** | | | $106.24 | $155.76 | $254.79 | $254.79 | | | |


## What Plutus does

Plutus is an **LLM FinOps tool** — it tracks your spend across every LLM provider and balances routing to maximize credit runway:

| Capability | Description |
|---|---|
| **Live balance monitoring** | Pulls real balance from provider APIs (DeepSeek, OpenAI) |
| **Spend tracking** | 7-day, 30-day, and all-time burn rates per provider |
| **Budget estimation** | For providers without balance APIs (Anthropic, Google, custom), tracks remaining against declared budgets |
| **Runway forecasting** | Projects days-until-exhaustion and exhaustion dates |
| **Cost-aware routing** | Rewrites model routing to favor providers with the most runway, with optional cost-cap/latency/quality policies |
| **Multi-channel alerts** | Email (Himalaya), Discord webhook, ntfy.sh — configurable thresholds |
| **HTML dashboard** | Dark-themed dashboard with SVG sparkline trend lines |
| **Backtest mode** | Replay session history against routing policies to measure savings before deploying |
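
The runway forecast can be illustrated with a small worked example. This is a minimal sketch, not Plutus's actual code: it assumes the daily burn rate is the trailing 7-day spend divided by 7, which is consistent with the DeepSeek row of the sample report (99.88 / 7 ≈ 14.27) but is not stated by the source. The function names are hypothetical.

```python
from datetime import date, timedelta
from typing import Optional


def burn_rate_per_day(spend_7d: float) -> float:
    """Average daily spend over the trailing 7 days (assumed formula)."""
    return spend_7d / 7.0


def days_left(balance: float, per_day: float) -> float:
    """Days until the balance reaches zero; infinite when nothing is being spent."""
    if per_day <= 0:
        return float("inf")
    return balance / per_day


def exhaustion_date(today: date, balance: float, per_day: float) -> Optional[date]:
    """Projected exhaustion date, or None when the runway is unbounded."""
    remaining = days_left(balance, per_day)
    if remaining == float("inf"):
        return None
    # Adding a timedelta to a date uses whole days only.
    return today + timedelta(days=remaining)


# Figures from the sample report: DeepSeek balance $49.74, 7-day spend $99.88.
rate = burn_rate_per_day(99.88)
runway = days_left(49.74, rate)
print(f"${rate:.2f}/day, {runway:.1f} days of runway")
```

The sample report shows 4 days for the same row, so the tool presumably rounds the fractional result; the rounding rule is not documented here.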

## Quick start

```bash
# Install
pip install plutus-agent

# Monitor your providers
plutus

# Forecast budget exhaustion
plutus forecast

# Set up budgets for providers without balance APIs
plutus --calibrate anthropic=74.46 --calibrate google=93.59

# Route your model stack by runway
plutus-route --dry-run
plutus-route --apply

# Backtest a routing policy before deploying
plutus-route --backtest cost-cap
plutus init --org "Acme Agents" --tier pro --workspace ci --budget 100
plutus topup --amount 50          # add prepaid credit (Stripe does this in prod)
plutus meter --provider anthropic --model claude-opus-4-8 \
             --task code_review --workspace ci --input 8200 --output 2400
plutus serve                      # your live dashboard at :8420
```

Or in your agent code:

from plutus_agent import Meter

plutus = Meter(org="Acme Agents")
resp = client.messages.create(model="claude-opus-4-8", ...)
plutus.track(provider="anthropic", model="claude-opus-4-8",
             task_type="code_review", workspace="ci",
             input_tokens=resp.usage.input_tokens,
             output_tokens=resp.usage.output_tokens)
print(plutus.balance())           # remaining prepaid credit

Plutus reads your session history from a SQLite state database (Hermes Agent state.db by default, configurable) and fuses it with live balance APIs:

| Source | Providers | What it provides |
|---|---|---|
| Live balance API | DeepSeek, OpenAI | Real-time dollar balances via REST |
| Spend ledger | All providers | Per-session cost rows (actual or estimated), token counts |
| Declared budgets | Providers without balance APIs | You set a starting budget; Plutus tracks what remains |

For estimate-based providers, `--calibrate` back-solves the budget from your real console balance and all-time ledger spend, then projects the remaining balance going forward. The estimate self-corrects over time.
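
The back-solve can be written out as simple arithmetic. The sketch below assumes the value passed to `--calibrate` is the balance shown in the provider's console, and that the implied budget is that balance plus everything the ledger has recorded so far; the function names are hypothetical and the real implementation may differ.

```python
def calibrate_budget(console_balance: float, all_time_spend: float) -> float:
    """Back-solve the starting budget: what is left plus what was already spent."""
    return console_balance + all_time_spend


def remaining(budget: float, all_time_spend: float) -> float:
    """Project the remaining balance from the calibrated budget and current ledger total."""
    return budget - all_time_spend


# Anthropic figures from this README: console balance 74.46, all-time ledger spend 55.73.
budget = calibrate_budget(74.46, 55.73)

# Later the ledger total grows to 60.00 (illustrative); remaining is re-projected.
left = remaining(budget, 60.00)
print(f"budget ${budget:.2f}, remaining ${left:.2f}")
```

Recalibrating against a fresh console balance replaces the budget, which is how drift between estimated and actual cost would be absorbed.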

## Routing policies

plutus-route supports configurable routing policies beyond pure runway:

| Policy | Behavior |
|---|---|
| `runway` (default) | Max days-left wins |
| `cost-cap` | Hard ceiling per 1M tokens, prefer cheapest under cap |
| `latency-weighted` | Prefer faster models, penalize slow ones |
| `quality-floor` | Filter out models below a benchmark score |
| Stacked | Comma-separated, e.g. `cost-cap,quality-floor` |

Configure in plutus.budgets.json:

{
  "routing": {
    "policy": "cost-cap,quality-floor",
    "cost_max_per_1m": 5.0,
    "quality_min_score": 70
  }
}

Override via CLI: plutus-route --policy runway
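
To make the stacking behavior concrete, here is a minimal sketch of a policy pipeline. It assumes stacked policies are applied left to right, each one filtering or reordering the candidates handed to it; the source does not specify the order of evaluation. The `Candidate` type, its fields, and the sample values are hypothetical, and `latency-weighted` is omitted for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Candidate:
    provider: str
    model: str
    cost_per_1m: float    # USD per 1M tokens (illustrative value)
    quality_score: float  # benchmark score (illustrative value)
    days_left: float      # runway in days


Policy = Callable[[List[Candidate], dict], List[Candidate]]


def cost_cap(cands: List[Candidate], cfg: dict) -> List[Candidate]:
    """Drop anything above the ceiling, then prefer the cheapest."""
    under = [c for c in cands if c.cost_per_1m <= cfg["cost_max_per_1m"]]
    return sorted(under, key=lambda c: c.cost_per_1m)


def quality_floor(cands: List[Candidate], cfg: dict) -> List[Candidate]:
    """Filter out models below the minimum score, keeping the existing order."""
    return [c for c in cands if c.quality_score >= cfg["quality_min_score"]]


def runway(cands: List[Candidate], cfg: dict) -> List[Candidate]:
    """Most days of runway first."""
    return sorted(cands, key=lambda c: c.days_left, reverse=True)


POLICIES: Dict[str, Policy] = {
    "cost-cap": cost_cap,
    "quality-floor": quality_floor,
    "runway": runway,
}


def route(cands: List[Candidate], cfg: dict) -> List[Candidate]:
    """Apply each comma-separated policy in turn; the first result is the pick."""
    ranked = list(cands)
    for name in cfg["policy"].split(","):
        ranked = POLICIES[name.strip()](ranked, cfg)
    return ranked


candidates = [
    Candidate("provider-a", "model-a", 1.0, 65, 4),
    Candidate("provider-b", "model-b", 4.0, 85, float("inf")),
    Candidate("provider-c", "model-c", 12.0, 90, float("inf")),
]
config = {"policy": "cost-cap,quality-floor", "cost_max_per_1m": 5.0, "quality_min_score": 70}
print([c.model for c in route(candidates, config)])
```

With the configuration shown in this section, `model-c` is removed by the cost cap and `model-a` by the quality floor, leaving `model-b`.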

## Adding custom providers

Add any OpenAI-compatible endpoint to plutus.budgets.json:

{
  "providers": {
    "my-provider": {
      "endpoint": "https://api.example.com/v1",
      "api_key_env": "MY_PROVIDER_KEY",
      "budget_usd": 100.0,
      "models": {
        "flagship": "model-name",
        "subtask": "model-name-fast"
      }
    }
  }
}

Plutus discovers providers from your Hermes config (or env vars) plus the budgets file. No hardcoded limits.
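
As an illustration of how a provider entry might be consumed, the sketch below parses the JSON shape shown above and resolves `api_key_env` against the environment. It is not Plutus's loader: `load_providers` and the returned dictionary layout are hypothetical, and only the field names come from the example.

```python
import json
import os
from typing import Any, Dict, Mapping

BUDGETS_JSON = """
{
  "providers": {
    "my-provider": {
      "endpoint": "https://api.example.com/v1",
      "api_key_env": "MY_PROVIDER_KEY",
      "budget_usd": 100.0,
      "models": {"flagship": "model-name", "subtask": "model-name-fast"}
    }
  }
}
"""


def load_providers(text: str, env: Mapping[str, str] = os.environ) -> Dict[str, Dict[str, Any]]:
    """Parse provider entries and look up each API key by its env var name."""
    data = json.loads(text)
    resolved: Dict[str, Dict[str, Any]] = {}
    for name, spec in data.get("providers", {}).items():
        resolved[name] = {
            "endpoint": spec["endpoint"].rstrip("/"),
            "api_key": env.get(spec["api_key_env"]),  # None when the variable is unset
            "budget_usd": float(spec.get("budget_usd", 0.0)),
            "models": dict(spec.get("models", {})),
        }
    return resolved


providers = load_providers(BUDGETS_JSON, env={"MY_PROVIDER_KEY": "sk-example"})
print(providers["my-provider"]["models"]["flagship"])
```

Keeping only the variable name in the file, rather than the key itself, means the budgets file can be committed without exposing credentials.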

## Configuration

| Env var | Default | Purpose |
|---|---|---|
| `PLUTUS_HERMES_CONFIG` | Hermes config path | Provider keys + routing config |
| `PLUTUS_STATE_DB` | Hermes state.db path | Session/spend ledger |
| `PLUTUS_BUDGETS` | `./plutus.budgets.json` | Provider budgets, alerts, routing policy |
| `PLUTUS_PROVIDERS` | `deepseek,anthropic,google,openai` | Which providers to track |
| `PLUTUS_SNAPSHOTS` | `./plutus.snapshots.jsonl` | Burn-rate history |
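
The variables above could be read along these lines. This is a sketch under assumptions: the `Settings` type and `load_settings` are hypothetical, and the two Hermes paths are left as `None` when unset because their default locations are not given here.

```python
import os
from typing import List, Mapping, NamedTuple, Optional


class Settings(NamedTuple):
    hermes_config: Optional[str]
    state_db: Optional[str]
    budgets: str
    providers: List[str]
    snapshots: str


def load_settings(env: Mapping[str, str]) -> Settings:
    """Read PLUTUS_* variables, falling back to the documented defaults."""
    raw_providers = env.get("PLUTUS_PROVIDERS", "deepseek,anthropic,google,openai")
    return Settings(
        hermes_config=env.get("PLUTUS_HERMES_CONFIG"),  # default path not documented here
        state_db=env.get("PLUTUS_STATE_DB"),            # default path not documented here
        budgets=env.get("PLUTUS_BUDGETS", "./plutus.budgets.json"),
        providers=[p.strip() for p in raw_providers.split(",") if p.strip()],
        snapshots=env.get("PLUTUS_SNAPSHOTS", "./plutus.snapshots.jsonl"),
    )


settings = load_settings(os.environ)
print(settings.providers)
```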

## Alerts

{
  "alerts": {
    "email": {
      "to": "you@example.com",
      "balance_threshold_usd": 10.0,
      "days_left_threshold": 3
    },
    "discord": {
      "webhook_url": "https://discord.com/api/webhooks/..."
    },
    "ntfy": {
      "topic": "plutus-alerts",
      "server": "https://ntfy.sh"
    }
  }
}

Run: plutus alert (or plutus alert --dry-run to preview).
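
The two thresholds in the `email` block can be read as simple comparisons. The sketch below assumes an alert is due when a value falls strictly below its threshold and that a missing balance (an estimate-only provider) skips the balance check; neither rule is confirmed by the source, and `due_alerts` is a hypothetical helper.

```python
from typing import List, Optional


def due_alerts(provider: str, balance_usd: Optional[float], days_left: float,
               email_cfg: dict) -> List[str]:
    """Return one human-readable reason per threshold that has been crossed."""
    reasons: List[str] = []
    balance_threshold = email_cfg.get("balance_threshold_usd")
    days_threshold = email_cfg.get("days_left_threshold")

    if balance_usd is not None and balance_threshold is not None:
        if balance_usd < balance_threshold:
            reasons.append(
                f"{provider}: balance ${balance_usd:.2f} is below ${balance_threshold:.2f}")

    if days_threshold is not None and days_left < days_threshold:
        reasons.append(
            f"{provider}: {days_left:.1f} days of runway left, below {days_threshold}")

    return reasons


email_cfg = {"to": "you@example.com", "balance_threshold_usd": 10.0, "days_left_threshold": 3}
for line in due_alerts("deepseek", 8.00, 2.0, email_cfg):
    print(line)
```

A provider with unbounded runway (`float("inf")`) never trips the days check, which matches the `∞` rows in the sample report.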

## Comparisons

| Tool | Scope | Plutus advantage |
|---|---|---|
| Braintrust | LLM eval + observability | Plutus focuses on credit/financial management |
| Helicone | LLM observability + proxy | Plutus balances routing by credit runway, not just logging |
| Finout/Datadog | General cloud cost management | Plutus is LLM-native: it knows model pricing, token costs, provider APIs |
| LangSmith | LLM tracing + eval | Plutus handles the money: which provider is cheapest, which has runway |

Plutus is the missing piece between LLM ops tools and cloud cost tools: LLM-specific FinOps.

Provider price tables (used to estimate cost from tokens when an exact cost isn't supplied) are overridable under pricing.overrides.
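
Estimating cost from token counts might look like the sketch below. The shape of the price table (separate input and output rates in USD per 1M tokens), the placeholder rates, and the `estimate_cost` helper are all assumptions for illustration; the actual layout of `pricing.overrides` is not documented in this section.

```python
from typing import Dict, Optional, Tuple

# (input USD per 1M tokens, output USD per 1M tokens); placeholder rates, not real prices.
Price = Tuple[float, float]
DEFAULT_PRICES: Dict[str, Price] = {"example-model": (3.0, 15.0)}


def estimate_cost(model: str, input_tokens: int, output_tokens: int,
                  overrides: Optional[Dict[str, Price]] = None,
                  exact_cost: Optional[float] = None) -> float:
    """Use the exact cost when supplied; otherwise price the tokens from the table."""
    if exact_cost is not None:
        return exact_cost
    prices = {**DEFAULT_PRICES, **(overrides or {})}  # overrides win over defaults
    in_rate, out_rate = prices[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


# Token counts from the `plutus meter` example: 8200 input, 2400 output.
estimated = estimate_cost("example-model", 8200, 2400)
overridden = estimate_cost("example-model", 8200, 2400,
                           overrides={"example-model": (1.0, 2.0)})
print(f"${estimated:.4f} at default rates, ${overridden:.4f} with the override")
```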

Designed to run on a schedule. Typical cron setup:

# Every hour: refresh dashboard, snapshot, rebalance
0 * * * * cd /path/to/plutus && ./plutus-refresh.sh

# Every 3 days: check estimates
0 9 */3 * * cd /path/to/plutus && plutus alert

The credit monitor (plutus.py)

| File | Purpose |
|---|---|
| `plutus.py` | Monitor (balance, spend, runway, forecast, calibrate, alert) |
| `plutus_route.py` | Balancer (runway-based routing + policy engine + backtest) |
| `plutus-refresh.sh` | Cron driver |
| `plutus.budgets.example.json` | Budget/alerts/routing template |
| `test_plutus.py` | Test suite |

## License

MIT — see LICENSE. © Perseus Computing LLC.
