ContextOps toolkit for production AI agents: compile, audit, govern, visualize, and optimize LLM context before every model call. Token budgets, policy governance, PII/secret scanning, Context Bill of Materials, context diffing, Context MRI, and MCP tool budgeting. Framework-agnostic. Local-first. Deterministic.

ctxbudgeter — context compiler for agentic AI

ctxbudgeter

If ctxbudgeter saved you tokens, time, or a 3am incident — drop a ⭐ on the repo. It's the fuel for me to keep shipping v0.3 features.

ctxbudgeter helps AI agents know what to know.

ctxbudgeter is a ContextOps toolkit for production AI agents. It compiles, audits, governs, visualizes, and optimizes LLM context before every model call — so your agents control token budgets, reduce context waste, detect risky context, preserve provenance, improve prompt-cache layout, and produce auditable Context Bills of Materials.

ctxbudgeter is not an agent framework. It works before the model call. It sits in front of LangGraph, CrewAI, OpenAI Agents SDK, PydanticAI, Microsoft Agent Framework, or your own loop.

Agent observability tools show what the agent did. ctxbudgeter shows what the agent was allowed to know before it acted.

ContextOps · token budgets · policy governance · PII/secret scanning ·
Context Bill of Materials · context diffing · Context MRI · MCP tool budgeting

ContextOps in 30 seconds

from ctxbudgeter import ContextPack, ContextPolicy

policy = ContextPolicy(max_tokens=24_000, reserved_output_tokens=4_000,
                       block_secrets=True, forbidden_sources=[".env"], redact_sensitive=True)

pack = ContextPack(model="claude-sonnet-4.6", policy=policy)
pack.add(name="system", content="You are a careful agent.", kind="system",
         required=True, cache_policy="stable", source="repo/system.md", trust_level="verified")
pack.add(name="task", content="Resolve the refund request.", kind="task", required=True)

compiled = pack.compile(task="Resolve refund request")
print(compiled.report())          # what entered, what didn't, and why
bom = compiled.bom                 # auditable Bill of Materials
bom.to_json("context_bom.json")   # commit + diff in CI

from ctxbudgeter.viz import ContextMRI          # pip install "ctxbudgeter[viz]"
ContextMRI.from_compiled(compiled).export_html("context_mri.html")

New in 0.3 (ContextOps): ContextPolicy, ContextScanner, ContextProvenance, ContextBOM, ContextDiff, CachePlanner, ContextEval, MCPToolBudgeter, and the Context MRI visualization. See docs/contextops.md. Fully backward compatible with the 0.2 API. Deep-dive docs: BOM · Context MRI · MCP budgeting · security.
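
To make the scanning and forbidden-source ideas concrete, here is a minimal standalone sketch of pattern-based detection. It does not use ctxbudgeter's ContextScanner; `scan_text`, `is_forbidden`, and the regex patterns are assumptions chosen for illustration, and a real scanner would cover many more formats.

```python
import re
from pathlib import PurePosixPath

# Illustrative patterns only (assumptions, not ctxbudgeter's rule set).
SECRET_PATTERNS = {
    "openai_style_key": re.compile(r"\bsk-[A-Za-z0-9]{16,}\b"),
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


def scan_text(text: str) -> list[str]:
    """Return the names of all patterns that match somewhere in text."""
    return [name for name, pat in SECRET_PATTERNS.items() if pat.search(text)]


def is_forbidden(source: str, forbidden_sources: list[str]) -> bool:
    """True if the source's file name matches a forbidden entry such as '.env'."""
    return PurePosixPath(source).name in forbidden_sources


findings = scan_text("contact ops@example.com, key AKIAABCDEFGHIJKLMNOP")
print(findings)                                  # ['aws_access_key_id', 'email']
print(is_forbidden("config/.env", [".env"]))     # True
```

Running checks like these before compilation is what lets a policy reject or redact an item without ever sending it to a model.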

Webpack for agent context  •  pytest for prompt/context quality  •  token budget manager

What you get

  • Token budget compiler — deterministic, explainable selection with full inclusion/exclusion reasons
  • Just-in-time References — lazy pointers (file paths, URLs, queries) that only load if they fit
  • Eval / assert layer + pytest plugin: assert_includes, assert_health_at_least, golden snapshots
  • Cache-aware adapters — Anthropic cache_control placement, OpenAI prompt_cache_key, LangChain & PydanticAI
  • Multi-modal attachments — images and structured tool schemas flow through to OpenAI/Anthropic payloads
  • Sensitivity enforcement: allow | warn | refuse | redact for items tagged secret
  • Memory store (Write strategy) — persist agent notes between turns, query them back into context
  • Isolation (Isolate strategy) — pack.fork() builds a subagent-scoped pack with its own budget
  • Async compile — concurrent resolution of async References, async-aware compressor hook
  • Declarative YAML/JSON specs — check pack configuration into git, CI-friendly
  • CLI: scan, compile, pack, validate, report for Claude Code and CI workflows
  • Zero LLM calls in the core — local-first, deterministic, fast

Install

pip install ctxbudgeter

# Optional extras
pip install "ctxbudgeter[tiktoken]"        # accurate OpenAI/Anthropic-proxy tokenization
pip install "ctxbudgeter[yaml]"            # YAML pack specs
pip install "ctxbudgeter[http]"            # http_get loader for References
pip install "ctxbudgeter[anthropic,openai,langchain]"
pip install "ctxbudgeter[all]"             # everything

Python 3.10+. Adapters are lazy-imported — you only pay for the SDKs you actually use.

Quick start

from ctxbudgeter import ContextPack

pack = ContextPack(
    model="claude-sonnet-4.6",
    token_budget=24_000,
    reserved_output_tokens=4_000,
)

pack.add(
    name="system_rules",
    content="You are a careful coding agent...",
    kind="system",
    priority=100,
    cache_policy="stable",
    required=True,
)
pack.add_file("README.md", kind="project_doc", priority=80)
pack.add(
    name="task",
    content="Build the referral packet UI.",
    kind="task",
    priority=95,
    required=True,
)

compiled = pack.compile()
print(compiled.report())

Example output:

Included:
  - system_rules: 312 tokens, required, stable cache prefix, system
  - README.md: 1,420 tokens, stable cache prefix, project_doc
  - task: 19 tokens, required, task

Excluded:
  - old_notes.md: token-heavy and low priority — 8,400 tokens, score 41
  - debug.log: token-heavy and low priority — 14,200 tokens, score 12

Estimated input tokens: 1,751
Reserved output tokens: 4,000
Cacheable prefix: 1,732 tokens
Token budget: 24,000 (utilization 8.8%)
Context health score: 87/100
  breakdown: cacheable_prefix_bonus: +5, under_utilized: -5
Tokenizer: tiktoken
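
The totals in this report can be checked by hand. The sketch below recomputes them from the per-item counts. It assumes utilization is measured against the budget minus the reserved output tokens, which is the reading that reproduces the 8.8% shown; the source does not state the formula.

```python
# Per-item token counts copied from the example report.
included = {"system_rules": 312, "README.md": 1420, "task": 19}
stable = ["system_rules", "README.md"]  # items marked "stable cache prefix"

token_budget = 24_000
reserved_output_tokens = 4_000

used = sum(included.values())
cacheable_prefix = sum(included[name] for name in stable)

# Assumption: utilization = used / (budget - reserved output).
utilization = used / (token_budget - reserved_output_tokens)

print(used)                      # 1751
print(cacheable_prefix)          # 1732
print(f"{utilization:.1%}")      # 8.8%
```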

Just-in-time References

Don't load context you'll never use. References are lightweight pointers that load only when they could plausibly fit the budget — Anthropic's "JIT" pattern, built in.

from ctxbudgeter import ContextPack
from ctxbudgeter.loaders import file_loader, http_get_loader, register_loader

pack = ContextPack(token_budget=24_000)
pack.add(name="task", content="Refactor auth", kind="task", required=True)

# File reference — only opened if it would fit
pack.add_reference(
    name="auth_module",
    location="src/auth.py",
    loader=file_loader,
    estimated_tokens=1200,
    kind="code",
    priority=70,
)

# HTTP reference — never fetched unless budget allows
pack.add_reference(
    name="api_docs",
    location="https://example.com/docs/api.json",
    loader=http_get_loader,
    estimated_tokens=2000,
    kind="retrieval",
    priority=60,
)

# Or register your own loader
@register_loader("vector_search")
def vector_search(ref):
    return my_vector_store.search(ref.location, k=3)

pack.add_reference(name="docs_hit", location="referral packet UI", loader=vector_search, estimated_tokens=500)

compiled = pack.compile()

Async loaders work too — use await pack.acompile():

import httpx

async def fetch_user_profile(ref):
    async with httpx.AsyncClient() as c:
        r = await c.get(ref.location)
        return r.text

pack.add_reference(name="profile", location="https://api.example.com/me", loader=fetch_user_profile, estimated_tokens=300)
compiled = await pack.acompile()   # async references resolved concurrently
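
Conceptually, a just-in-time reference pairs a cheap token estimate with a loader that is only called when the estimate fits the remaining budget. The sketch below is a standalone illustration of that idea, not ctxbudgeter's implementation: `LazyRef` and `resolve` are hypothetical names, and the skip rule (the estimate must fit what is left) is an assumption.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class LazyRef:
    name: str
    location: str
    loader: Callable[[str], str]   # called only if the reference may fit
    estimated_tokens: int
    priority: int = 50


def resolve(refs: list[LazyRef], remaining: int) -> tuple[dict[str, str], list[str]]:
    """Load references by descending priority while their estimates fit."""
    loaded: dict[str, str] = {}
    skipped: list[str] = []
    for ref in sorted(refs, key=lambda r: (-r.priority, r.name)):
        if ref.estimated_tokens > remaining:
            skipped.append(ref.name)          # loader is never invoked
            continue
        loaded[ref.name] = ref.loader(ref.location)
        remaining -= ref.estimated_tokens
    return loaded, skipped


calls: list[str] = []


def fake_loader(location: str) -> str:
    calls.append(location)                    # record which locations were opened
    return f"<contents of {location}>"


refs = [
    LazyRef("auth_module", "src/auth.py", fake_loader, estimated_tokens=1200, priority=70),
    LazyRef("api_docs", "https://example.com/docs/api.json", fake_loader,
            estimated_tokens=2000, priority=60),
]
loaded, skipped = resolve(refs, remaining=1500)
print(sorted(loaded), skipped, calls)
# ['auth_module'] ['api_docs'] ['src/auth.py']
```

The point of the `calls` list is that the HTTP location never appears in it: a reference that cannot fit costs nothing beyond its estimate.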

Eval / assert layer — "pytest for prompts"

from ctxbudgeter.testing import (
    assert_includes, assert_excludes,
    assert_health_at_least, assert_cacheable_prefix_at_least,
    assert_no_secret_items, assert_used_tokens_at_most,
    GoldenPack,
)

def test_prod_pack():
    compiled = build_prod_pack().compile()
    assert_includes(compiled, "system_rules", "task")
    assert_excludes(compiled, "debug.log")
    assert_health_at_least(compiled, 80)
    assert_cacheable_prefix_at_least(compiled, 1024)
    assert_no_secret_items(compiled)
    assert_used_tokens_at_most(compiled, 20_000)

def test_pack_golden(ctxbudgeter_golden):
    # Provided by the installed pytest plugin.
    # Stores a golden snapshot the first time, diffs against it after.
    ctxbudgeter_golden().check(build_prod_pack().compile())

Refresh goldens after intentional changes:

pytest --ctxbudgeter-update-golden
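
The golden-snapshot workflow amounts to "write on first run, compare afterwards". Here is a rough standalone sketch of that pattern using only the standard library; `check_golden` and its `update` flag are hypothetical and are not the plugin's API.

```python
import json
import tempfile
from pathlib import Path


def check_golden(path: Path, snapshot: dict, update: bool = False) -> bool:
    """Store snapshot on first run (or when update=True); otherwise compare.

    Returns True when the golden file was written, False when it was compared.
    Raises AssertionError if the stored golden differs from the snapshot.
    """
    text = json.dumps(snapshot, indent=2, sort_keys=True)
    if update or not path.exists():
        path.write_text(text, encoding="utf-8")
        return True
    stored = path.read_text(encoding="utf-8")
    assert stored == text, f"golden mismatch for {path.name}"
    return False


with tempfile.TemporaryDirectory() as tmp:
    golden = Path(tmp) / "prod_pack.json"
    snap = {"included": ["system_rules", "task"], "used_tokens": 331}
    first = check_golden(golden, snap)     # writes the golden file
    second = check_golden(golden, snap)    # compares, no change
    try:
        check_golden(golden, {**snap, "used_tokens": 400})
        drift_detected = False
    except AssertionError:
        drift_detected = True

print(first, second, drift_detected)       # True False True
```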

Cache-aware adapters

from ctxbudgeter.adapters import (
    to_anthropic_request,  # cache_control on last stable system block
    to_openai_request,     # prompt_cache_key derived from stable prefix hash
    to_langchain_messages,
    to_pydantic_ai_deps,
)

# Anthropic
import anthropic
client = anthropic.Anthropic()
resp = client.messages.create(**to_anthropic_request(compiled, user_message="next step?"))

# OpenAI — explicit cache key for prompt-prefix caching
from openai import OpenAI
oa = OpenAI()
resp = oa.chat.completions.create(**to_openai_request(compiled, user_message="what now?"))

# LangChain
from langchain_anthropic import ChatAnthropic
msgs = to_langchain_messages(compiled, user_message="continue")
ChatAnthropic(model=compiled.model).invoke(msgs)

# PydanticAI
deps = to_pydantic_ai_deps(compiled)
agent.run(deps["system_prompt"], message_history=deps["message_history"])
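
The import comments say the OpenAI cache key is derived from a hash of the stable prefix. One plausible way to do that is sketched below; the function name `stable_prefix_key`, the choice of SHA-256, the separator byte, and the 16-character truncation are all assumptions, not the adapter's actual scheme.

```python
import hashlib


def stable_prefix_key(items: list[tuple[str, str]]) -> str:
    """Hash consecutive leading 'stable' items into a short cache key.

    items is an ordered list of (cache_policy, content) pairs.
    """
    h = hashlib.sha256()
    for policy, content in items:
        if policy != "stable":
            break                      # the prefix ends at the first non-stable item
        h.update(content.encode("utf-8"))
        h.update(b"\x00")              # separator so item boundaries affect the hash
    return h.hexdigest()[:16]


turn_1 = [("stable", "system rules"), ("stable", "README"), ("dynamic", "task A")]
turn_2 = [("stable", "system rules"), ("stable", "README"), ("dynamic", "task B")]
turn_3 = [("stable", "system rules v2"), ("stable", "README"), ("dynamic", "task A")]

print(stable_prefix_key(turn_1) == stable_prefix_key(turn_2))  # True: same stable prefix
print(stable_prefix_key(turn_1) == stable_prefix_key(turn_3))  # False: prefix changed
```

A key built this way stays constant across turns as long as only dynamic content changes, which is the property prefix caching relies on.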

Multi-modal attachments

from ctxbudgeter import ContextPack, ImageBlock, StructuredBlock

pack = ContextPack(token_budget=24_000)
pack.add(
    name="screenshot",
    content="Describe what's wrong in this screenshot.",
    kind="user_message",
    attachments=[
        ImageBlock(url="https://example.com/bug.png", estimated_tokens=400),
    ],
)
pack.add(
    name="tools",
    content="",
    kind="tool_def",
    cache_policy="stable",
    priority=85,
    attachments=[
        StructuredBlock(schema_name="search_db", data={"args": ["query"], "returns": "list[Doc]"}),
    ],
)

Image and structured blocks flow through to OpenAI's image_url / Anthropic's image / tool_result formats automatically.

Sensitivity enforcement

pack.add(name="api_key", content="sk-DEADBEEF...", sensitivity="secret")

pack.set_secret_policy("warn")     # include but flag in report + health penalty (default)
pack.set_secret_policy("refuse")   # raise SecretContentError at compile time
pack.set_secret_policy("redact")   # replace content with [REDACTED — sensitivity=secret]
pack.set_secret_policy("allow")    # silently allow (escape hatch)

In CI you almost always want refuse or redact. The text/markdown reports flag [!secret] items so reviewers can catch leaks during PR review.
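
The four policies can be thought of as a small dispatch over items tagged secret. The following is a standalone sketch under that reading: `apply_secret_policy` and the local `SecretContentError` class are stand-ins for the library's own, and the redaction placeholder is simplified.

```python
class SecretContentError(Exception):
    """Local stand-in for the error raised under the 'refuse' policy."""


def apply_secret_policy(items: list[dict], policy: str) -> tuple[list[dict], list[str]]:
    """Return (items to include, warnings) after applying the policy."""
    if policy not in {"allow", "warn", "refuse", "redact"}:
        raise ValueError(f"unknown policy: {policy}")
    out, warnings = [], []
    for item in items:
        if item.get("sensitivity") != "secret" or policy == "allow":
            out.append(item)
        elif policy == "refuse":
            raise SecretContentError(f"secret item in pack: {item['name']}")
        elif policy == "redact":
            out.append({**item, "content": "[REDACTED]"})  # simplified placeholder
        else:  # warn: keep the content but flag it
            out.append(item)
            warnings.append(f"[!secret] {item['name']}")
    return out, warnings


items = [
    {"name": "task", "content": "Fix auth bug"},
    {"name": "api_key", "content": "sk-DEADBEEF...", "sensitivity": "secret"},
]
redacted, _ = apply_secret_policy(items, "redact")
_, warned = apply_secret_policy(items, "warn")
print(redacted[1]["content"], warned)   # [REDACTED] ['[!secret] api_key']
```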

Memory (Write strategy) + Isolation (Isolate strategy)

from ctxbudgeter import ContextPack, InMemoryStore, JSONMemoryStore, MemoryNote

# Persist notes across turns
store = JSONMemoryStore(".ctxbudgeter/memory.json")
store.write(MemoryNote(key="auth_runbook", content="JWT rotation...", tags=["auth"]))

# Pull them back into a future pack
pack = ContextPack(token_budget=24_000)
pack.add(name="task", content="Fix auth bug", required=True)
pack.add_memory(store, tags=["auth"], limit=3, priority=70)

# Isolate a subagent's context — only frontend code, smaller budget
frontend_pack = pack.subset_by_kind("project_doc", "code").fork(
    filter=lambda it: it.metadata.get("area") == "frontend",
    token_budget=8_000,
)
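
As a mental model for the Write strategy, a memory store is a keyed collection of notes that can be filtered by tag and capped by a limit before being added to a pack. The sketch below is a toy in-memory version; `Note`, `TinyStore`, and the "most recently written first" ordering are assumptions, not the behavior of InMemoryStore or JSONMemoryStore.

```python
from dataclasses import dataclass, field


@dataclass
class Note:
    key: str
    content: str
    tags: list[str] = field(default_factory=list)


class TinyStore:
    """Toy note store: later writes with the same key replace earlier ones."""

    def __init__(self) -> None:
        self._notes: dict[str, Note] = {}

    def write(self, note: Note) -> None:
        self._notes.pop(note.key, None)   # re-insert so the key moves to the end
        self._notes[note.key] = note

    def query(self, tags: list[str], limit: int) -> list[Note]:
        """Return up to limit notes sharing at least one tag, newest first."""
        wanted = set(tags)
        hits = [n for n in reversed(self._notes.values()) if wanted & set(n.tags)]
        return hits[:limit]


store = TinyStore()
store.write(Note("auth_runbook", "JWT rotation...", tags=["auth"]))
store.write(Note("ui_notes", "Use the design tokens.", tags=["frontend"]))
store.write(Note("auth_incident", "Expired signing key.", tags=["auth", "ops"]))
print([n.key for n in store.query(tags=["auth"], limit=3)])
# ['auth_incident', 'auth_runbook']
```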

Declarative YAML pack specs

Check your pack into git like any other config:

# pack.yaml
model: claude-sonnet-4.6
token_budget: 24000
reserved_output_tokens: 4000
secret_policy: refuse

items:
  - name: system_rules
    from_file: prompts/system.md
    kind: system
    priority: 100
    required: true
    cache_policy: stable

  - name: task
    content: "Fix the auth bug."
    kind: task
    priority: 95
    required: true

references:
  - name: api_docs
    location: "https://example.com/docs.json"
    loader: http_get
    estimated_tokens: 1500
    priority: 60

Then compile from the CLI:

ctxbudgeter validate pack.yaml
ctxbudgeter pack pack.yaml --format markdown -o report.md
ctxbudgeter pack pack.yaml --fail-below 80    # exit non-zero on low health for CI

Or from Python:

from ctxbudgeter.spec import load_pack
pack = load_pack("pack.yaml")
compiled = pack.compile()

CLI

# Scan a directory, suggest priorities + cache policies
ctxbudgeter scan . --max-files 50

# Scan + emit a starter pack.yaml you can commit and iterate on
ctxbudgeter scan . --emit-pack pack.yaml --task "ship feature X"

# Ad-hoc compile from a directory + task
ctxbudgeter compile . --task "fix auth bug" --budget 12000 --secret-policy refuse

# Compile from a declarative spec
ctxbudgeter pack pack.yaml --format markdown -o context-report.md

# Re-render a saved compiled pack
ctxbudgeter compile . --task "..." --save-pack pack.json
ctxbudgeter report pack.json --format markdown

Wire it into CI

GitHub Actions

# .github/workflows/context-check.yml
name: Context budget check
on: [pull_request]

jobs:
  ctxbudget:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install "ctxbudgeter[all]"
      - name: Validate and compile pack
        run: |
          ctxbudgeter validate pack.yaml
          ctxbudgeter pack pack.yaml --format markdown --fail-below 80 -o report.md
      - name: Comment report on PR
        uses: marocchino/sticky-pull-request-comment@v2
        with:
          path: report.md

pre-commit

# .pre-commit-config.yaml
repos:
  - repo: local
    hooks:
      - id: ctxbudgeter-validate
        name: ctxbudgeter validate pack.yaml
        entry: ctxbudgeter validate pack.yaml
        language: system
        pass_filenames: false
        files: ^pack\.yaml$

pytest

# tests/test_context.py
from ctxbudgeter.testing import (
    assert_health_at_least, assert_no_secret_items, assert_includes,
)
from my_app.context import build_pack

def test_production_pack_quality():
    compiled = build_pack(task="fix auth bug").compile()
    assert_includes(compiled, "system_rules", "task")
    assert_health_at_least(compiled, 80)
    assert_no_secret_items(compiled)

How it works

Compiler algorithm

  1. Resolve References — load only those whose estimated cost could fit. Loader failures → excluded with reason.
  2. Required items go in first; compress (via your hook) or truncate if they don't fit.
  3. Optional items ranked by score_item, packed greedily.
  4. Sensitivity policy applied (warn / refuse / redact / allow).
  5. Final prompt order: stable → dynamic → ephemeral, deterministic tie-breaks.
  6. Cacheable prefix = consecutive stable items at the top.

Scoring formula:

score = priority*0.5 + relevance*100*0.3 + freshness*100*0.1 + cache_value*100*0.1 - token_cost_penalty

token_cost_penalty grows up to ~30 points as an item approaches the full budget — a 50k-token debug log doesn't beat your README on priority alone.
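
The steps and the scoring formula above can be sketched in a few lines. This is a simplified standalone model, not the library's compiler: references, compression, and sensitivity handling are left out, required items that do not fit are simply excluded, and the `Item` fields and `pack_items` name are hypothetical. The linear penalty (30 points times the item's share of the budget, capped at 30) is one assumed shape for a penalty that "grows up to ~30 points".

```python
from dataclasses import dataclass

CACHE_ORDER = {"stable": 0, "dynamic": 1, "ephemeral": 2}


@dataclass
class Item:
    name: str
    tokens: int
    priority: float = 50.0      # 0-100
    relevance: float = 0.0      # 0-1
    freshness: float = 0.0      # 0-1
    cache_value: float = 0.0    # 0-1
    required: bool = False
    cache_policy: str = "dynamic"


def score(item: Item, budget: int) -> float:
    # Assumed penalty shape: linear in the item's share of the budget, capped at 30.
    penalty = 30.0 * min(1.0, item.tokens / budget)
    return (item.priority * 0.5 + item.relevance * 100 * 0.3
            + item.freshness * 100 * 0.1 + item.cache_value * 100 * 0.1 - penalty)


def pack_items(items: list[Item], budget: int) -> tuple[list[Item], list[Item]]:
    """Required items first, then optional items greedily by descending score."""
    required = [i for i in items if i.required]
    optional = sorted((i for i in items if not i.required),
                      key=lambda i: (-score(i, budget), i.name))  # name breaks ties
    included, excluded, used = [], [], 0
    for item in required + optional:
        if used + item.tokens <= budget:
            included.append(item)
            used += item.tokens
        else:
            excluded.append(item)
    # Final order: stable, then dynamic, then ephemeral; name is the tie-break.
    included.sort(key=lambda i: (CACHE_ORDER[i.cache_policy], i.name))
    return included, excluded


items = [
    Item("system_rules", 312, priority=100, required=True, cache_policy="stable"),
    Item("task", 19, priority=95, required=True),
    Item("README.md", 1420, priority=80, cache_policy="stable"),
    Item("debug.log", 14_200, priority=85, cache_policy="ephemeral"),
]
included, excluded = pack_items(items, budget=10_000)
print([i.name for i in included], [i.name for i in excluded])
# ['README.md', 'system_rules', 'task'] ['debug.log']
```

In this toy run debug.log has a higher priority than README.md (85 vs 80) but a lower score, because its size earns the full penalty.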

Health score breakdown

A 0–100 score with explicit, auditable deductions. Each pack reports its breakdown:

{
  "health_score": 87,
  "health_breakdown": {
    "cacheable_prefix_bonus": 5,
    "under_utilized": -5,
    "high_priority_excluded": -10,
    "secrets_included": -10
  }
}

If you'd rather not show "health", treat it as a BudgetCheckScore: a deterministic, explainable signal, not a quality oracle.

Compression hook

ctxbudgeter never calls an LLM for you. Provide a function — sync or async, your choice:

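# anthropic_client.summarize is a placeholder for your own summarization wrapper;
# the Anthropic SDK does not ship a method with this name.
async def my_summarizer(item, target_tokens):
    return await anthropic_client.summarize(item.content, max_tokens=target_tokens)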

pack.set_compressor(my_summarizer)
compiled = await pack.acompile()   # async path; sync compressors also work with .compile()

If your compressor returns content larger than target_tokens, the compiler retries once with a tighter target before giving up.
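
The retry rule can be pictured as follows. This is a hedged sketch rather than the compiler's code: `compress_with_retry`, the whitespace token counter, and the 0.8 tightening factor are assumptions.

```python
from typing import Callable, Optional


def count_tokens(text: str) -> int:
    """Crude stand-in tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def compress_with_retry(
    content: str,
    target_tokens: int,
    compressor: Callable[[str, int], str],
    tighten: float = 0.8,          # assumed factor for the second attempt
) -> Optional[str]:
    """Try the compressor, retry once with a smaller target, else give up (None)."""
    for target in (target_tokens, max(1, int(target_tokens * tighten))):
        result = compressor(content, target)
        if count_tokens(result) <= target_tokens:
            return result
    return None


def sloppy_compressor(text: str, target: int) -> str:
    """Overshoots its target by two words, like an LLM summarizer might."""
    return " ".join(text.split()[: target + 2])


text = " ".join(f"w{i}" for i in range(50))
result = compress_with_retry(text, target_tokens=10, compressor=sloppy_compressor)
print(count_tokens(result))   # 10: the second, tighter attempt fits
```

Asking for less on the second attempt gives an overshooting summarizer room to land inside the original target.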

Round-trip + tooling

import json
from pathlib import Path

from ctxbudgeter import compiled_pack_from_dict

# Compile, save, share
compiled = pack.compile()
Path("compiled.json").write_text(json.dumps(compiled.to_dict()))

# Reload later for reporting / diffing / assertions
restored = compiled_pack_from_dict(json.loads(Path("compiled.json").read_text()))
print(restored.report("markdown"))

Positioning

Most agent frameworks ask: "which agent runs next?" ctxbudgeter asks: "what exact information should this agent see right now — and why?"

Layer | Existing | What ctxbudgeter adds
--- | --- | ---
Agent frameworks | LangGraph, CrewAI, OpenAI Agents SDK, PydanticAI | Decides the context shape before the call
RAG | LlamaIndex, LangChain retrievers | Retrieval ≠ final context; ctxbudgeter is the gate
Observability | LangSmith, AgentOps | They show what happened after; we prevent before
Context tools | ctxforge, contextkit, contextagent | We're the assertable + deterministic option

Design choices

  • Local-first. No LLM API calls in the core. The compiler is pure Python.
  • Deterministic. Same inputs → identical compiled pack. Same JSON output. Same health score. Always.
  • Explainable. Every input item shows up in decisions with a status and a human-readable reason.
  • Framework-agnostic. Core has zero hard dependencies on agent SDKs. Adapters are lazy-imported.
  • Composable. Bring your own tokenizer, your own compressor, your own scoring weights, your own loaders, your own memory store.
  • Assertable. Quality gates live in pytest, not in your head.
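
Determinism of this kind usually comes down to two habits: sorting with explicit tie-breaks and serializing canonically. The sketch below illustrates both with the standard library; `canonical_digest` and `order_items` are hypothetical helpers and say nothing about how ctxbudgeter serializes packs internally.

```python
import hashlib
import json


def canonical_digest(pack: dict) -> str:
    """SHA-256 of a canonical JSON form (sorted keys, fixed separators)."""
    text = json.dumps(pack, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def order_items(items: list[dict]) -> list[dict]:
    """Sort by descending score, then by name so equal scores never reorder."""
    return sorted(items, key=lambda it: (-it["score"], it["name"]))


a = [{"name": "b.md", "score": 40}, {"name": "a.md", "score": 40}]
b = list(reversed(a))                      # same items, different input order

pack_a = {"items": order_items(a), "budget": 24000}
pack_b = {"budget": 24000, "items": order_items(b)}   # keys inserted in another order
print(canonical_digest(pack_a) == canonical_digest(pack_b))   # True
```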

Author

Karan Chandra Dey, Founder and AI Product Builder @ K28 Design Lab · k28art.space

Helping SMEs ship their first AI MVP — from prompt engineering to context engineering to production-ready agents.

Web k28art.space
GitHub @Kayariyan28
LinkedIn karan-chandra-dey
Email karandey3@outlook.com

"Use any agent framework. ctxbudgeter makes your context cleaner, cheaper, and assertable — before the model sees it."

License

MIT © Karan Chandra Dey / K28 Design Lab.
