Budget evaluator for agent-control -- cumulative LLM cost and token tracking

agent-control-evaluator-budget

Budget evaluator for agent-control that tracks cumulative LLM token and cost usage per scope and time window.

Install

pip install "agent-control-evaluators[budget]"

As a fallback, install the package directly:

pip install agent-control-evaluator-budget

For local development:

uv pip install -e evaluators/contrib/budget

Quickstart

from agent_control_evaluator_budget.budget import (
    BudgetEvaluatorConfig,
    BudgetLimitRule,
    ModelPricing,
)

config = BudgetEvaluatorConfig(
    budget_id="support-daily",
    limits=[
        BudgetLimitRule(
            scope={"agent": "support"},
            group_by="user_id",
            window_seconds=86_400,
            limit=500,
            limit_unit="usd_cents",
        ),
        BudgetLimitRule(
            scope={"agent": "support"},
            group_by="user_id",
            window_seconds=86_400,
            limit=50_000,
            limit_unit="tokens",
        ),
    ],
    pricing={
        "gpt-4.1-mini": ModelPricing(input_per_1k=0.04, output_per_1k=0.16),
    },
    model_path="model",
    metadata_paths={
        "agent": "metadata.agent",
        "user_id": "metadata.user_id",
    },
    unknown_model_behavior="block",
)

The evaluator reads token usage from standard fields such as usage.input_tokens and usage.output_tokens. Configure token_path only when your event shape uses a custom location.
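
The dotted paths used in this README (usage.input_tokens, metadata.user_id, model) can be read as walks through a nested event. The sketch below is an illustration only: get_path and the event shape are hypothetical and are not part of the package's API.

```python
from typing import Any, Optional


def get_path(event: dict, path: str) -> Optional[Any]:
    """Walk a dotted path such as "usage.input_tokens" through nested dicts.

    Hypothetical helper for illustration; the evaluator's own lookup may differ.
    """
    current: Any = event
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None  # the path is absent from this event
        current = current[key]
    return current


# Assumed event shape, chosen to match the paths in the Quickstart config.
event = {
    "model": "gpt-4.1-mini",
    "usage": {"input_tokens": 1200, "output_tokens": 300},
    "metadata": {"agent": "support", "user_id": "u-42"},
}

input_tokens = get_path(event, "usage.input_tokens")
output_tokens = get_path(event, "usage.output_tokens")
user_id = get_path(event, "metadata.user_id")
missing = get_path(event, "usage.cached_tokens")  # not present in this event
```

If your events keep token counts somewhere else, that is the case where token_path would point at the custom location.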

Scope and group_by

Each BudgetLimitRule has a static scope and an optional group_by field.

scope filters which events a rule applies to. A rule with scope={"agent": "support"} only applies when extracted metadata contains agent="support". An empty scope is global.

group_by creates independent buckets per extracted metadata value. The common per-user pattern is:

BudgetLimitRule(
    scope={"agent": "support"},
    group_by="user_id",
    window_seconds=86_400,
    limit=500,
    limit_unit="usd_cents",
)

With metadata_paths={"user_id": "metadata.user_id"}, each user gets a separate daily budget inside the support scope.
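
As a rough mental model of the two fields, scope acts as a filter and group_by as a partition key. The sketch below assumes exact-match filtering and a tuple-shaped bucket key; rule_applies and bucket_key are hypothetical names, not functions from the package.

```python
from typing import Optional


def rule_applies(scope: dict, metadata: dict) -> bool:
    """A rule applies when every scope entry matches the extracted metadata.

    An empty scope has nothing to mismatch, so it applies to every event.
    """
    return all(metadata.get(key) == value for key, value in scope.items())


def bucket_key(scope: dict, group_by: Optional[str], metadata: dict) -> tuple:
    """Hypothetical bucket identity: the scope plus the group_by value, if any."""
    scope_part = tuple(sorted(scope.items()))
    group_value = metadata.get(group_by) if group_by else None
    return (scope_part, group_by, group_value)


support_scope = {"agent": "support"}
alice = {"agent": "support", "user_id": "alice"}
bob = {"agent": "support", "user_id": "bob"}
billing = {"agent": "billing", "user_id": "alice"}

applies_to_alice = rule_applies(support_scope, alice)      # scope matches
applies_to_billing = rule_applies(support_scope, billing)  # agent differs
global_rule_applies = rule_applies({}, billing)            # empty scope is global

# Different user_id values produce different keys, so usage is summed separately.
alice_key = bucket_key(support_scope, "user_id", alice)
bob_key = bucket_key(support_scope, "user_id", bob)
```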

Budget pools

budget_id identifies the accumulated budget pool.

Evaluators with the same budget_id share accumulated spend and token totals, no matter how many evaluator instances exist. Each evaluator still applies its own configured rules: the shared state is the bucket (the rolling sum), not the rule set. Evaluators with different budget_id values are fully isolated.

Use stable names such as support-daily, billing-global, or tenant-acme-monthly. Avoid generating a new budget_id per request unless each request should have an isolated budget.
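
The split between shared totals and per-evaluator rules can be pictured with a toy model. Everything below is a simplified stand-in: SketchEvaluator and the pools dict are hypothetical, scope, group_by, and time windows are ignored, and the "at or below the limit" comparison is an assumption.

```python
from collections import defaultdict

# Hypothetical stand-in for a store: budget_id -> accumulated token total.
pools = defaultdict(int)


class SketchEvaluator:
    """Toy evaluator: shares totals by budget_id but keeps its own limit."""

    def __init__(self, budget_id: str, token_limit: int) -> None:
        self.budget_id = budget_id
        self.token_limit = token_limit

    def record(self, tokens: int) -> bool:
        """Add usage to the shared pool, then check this evaluator's own limit."""
        pools[self.budget_id] += tokens
        return pools[self.budget_id] <= self.token_limit  # True means within limit


strict = SketchEvaluator("support-daily", token_limit=1_000)
lenient = SketchEvaluator("support-daily", token_limit=1_500)
isolated = SketchEvaluator("billing-global", token_limit=1_000)

first = strict.record(600)      # pool is 600, within the strict limit
second = lenient.record(600)    # same pool is now 1200, within the lenient limit
third = strict.record(0)        # strict sees the shared 1200 and is over its limit
other = isolated.record(600)    # different budget_id, so its pool is only 600
```

The same accumulated total gives different outcomes for strict and lenient because each applies its own rule, while isolated never sees the support-daily usage.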

Pricing

ModelPricing stores cost rates in cents per 1K tokens:

ModelPricing(input_per_1k=0.04, output_per_1k=0.16)

input_per_1k is applied to input tokens. output_per_1k is applied to output tokens.

pricing and model_path are required when any rule uses limit_unit="usd_cents"; token-only rules can omit both. If an event names a model that is missing from the pricing table and a cost rule exists, unknown_model_behavior="block" fails closed. Set it to "warn" to log a warning and count the cost as 0 instead.
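
To make the units concrete, here is a worked cost calculation under the rates above. It is a sketch, not the package's implementation: Rate is a hypothetical stand-in for ModelPricing, cost_cents is an illustrative function, and raising an exception is only one assumed way to represent "block".

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Rate:
    """Hypothetical stand-in for ModelPricing: cents per 1K tokens."""
    input_per_1k: float
    output_per_1k: float


def cost_cents(pricing: dict, model: str, input_tokens: int, output_tokens: int,
               unknown_model_behavior: str = "block") -> float:
    """Cost in cents = tokens / 1000 * rate, summed over input and output."""
    rate = pricing.get(model)
    if rate is None:
        if unknown_model_behavior == "block":
            # Assumption: the sketch signals a fail-closed decision by raising.
            raise LookupError(f"no pricing for model {model!r}")
        return 0.0  # "warn" behavior: cost is treated as 0
    return (input_tokens / 1000 * rate.input_per_1k
            + output_tokens / 1000 * rate.output_per_1k)


pricing = {"gpt-4.1-mini": Rate(input_per_1k=0.04, output_per_1k=0.16)}

# 1200 input tokens cost about 0.048 cents; 300 output tokens cost about 0.048 cents.
known = cost_cents(pricing, "gpt-4.1-mini", input_tokens=1200, output_tokens=300)
is_about_0_096 = math.isclose(known, 0.096)

# An unpriced model under "warn" contributes nothing to the cost bucket.
warned = cost_cents(pricing, "some-other-model", 1200, 300,
                    unknown_model_behavior="warn")
```

Note that a model treated as cost 0 under "warn" never moves a usd_cents bucket, which is why "block" is the fail-closed choice.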

Dual ceiling pattern

Use two evaluators when cost and token ceilings need independent control records or different budget_id pools:

cost_config = BudgetEvaluatorConfig(
    budget_id="support-cost-daily",
    limits=[
        BudgetLimitRule(
            scope={"agent": "support"},
            group_by="user_id",
            window_seconds=86_400,
            limit=500,
            limit_unit="usd_cents",
        )
    ],
    pricing={
        "gpt-4.1-mini": ModelPricing(input_per_1k=0.04, output_per_1k=0.16),
    },
    model_path="model",
    metadata_paths={"agent": "metadata.agent", "user_id": "metadata.user_id"},
)

token_config = BudgetEvaluatorConfig(
    budget_id="support-token-daily",
    limits=[
        BudgetLimitRule(
            scope={"agent": "support"},
            group_by="user_id",
            window_seconds=86_400,
            limit=50_000,
            limit_unit="tokens",
        )
    ],
    metadata_paths={"agent": "metadata.agent", "user_id": "metadata.user_id"},
)

This pattern lets cost and token budgets reset, alert, and roll out independently. A single evaluator can also contain both rules when one shared pool and one control result are sufficient.
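
When choosing the two ceilings, it can help to check which one is reached first. The arithmetic below uses the limits and rates from the example configs and assumes the token limit counts input and output tokens together, which this README does not state.

```python
LIMIT_TOKENS = 50_000   # daily token ceiling from token_config
LIMIT_CENTS = 500       # daily cost ceiling from cost_config
INPUT_PER_1K = 0.04     # cents per 1K input tokens
OUTPUT_PER_1K = 0.16    # cents per 1K output tokens

# Cost of exhausting the token ceiling, bounded by all-input and all-output usage.
min_cost_at_token_limit = LIMIT_TOKENS / 1000 * INPUT_PER_1K    # about 2 cents
max_cost_at_token_limit = LIMIT_TOKENS / 1000 * OUTPUT_PER_1K   # about 8 cents

# Output tokens needed to reach the cost ceiling on this model alone.
output_tokens_to_reach_cost_limit = LIMIT_CENTS / OUTPUT_PER_1K * 1000

# With these numbers the token ceiling is reached long before the cost ceiling.
token_ceiling_binds_first = max_cost_at_token_limit < LIMIT_CENTS
```

For this model the cost ceiling acts as a backstop; it becomes the binding limit only when pricier models are added to the pricing table.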

Limitations

InMemoryBudgetStore is single-process only. State is lost on restart and is not shared across workers or pods.

Use a distributed store for production deployments that run multiple processes, multiple workers, or multiple pods.

Project details

This version

7.7.0

May 7, 2026
