Local-first MCP context budget and tool-selection verifier

mcp-context-budget

Local-first MCP context budget and tool-selection verifier for agentic coding environments.

MCP servers can load enough tool schema and response data to burn a large fraction of an agent's context window before useful work starts. This tool gives developers a repeatable budget gate:

scan MCP config or tools/list fixtures
estimate schema and response-token cost
select a smaller task-relevant tool set with deterministic SQLite FTS5/BM25
optionally prove semantic tool selection from fixture or Ollama embeddings
write a lockfile for CI
fail builds when schema or response budgets regress
compress recorded response fixtures under a response budget
apply selected-tool locks back to caller-owned MCP config files
opt into local stdio tools/list introspection for command-discovered servers
audit MCP configs for plaintext secret exposure without printing values
prove the spine with a Docker demo
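
As a rough picture of what a budget gate does, here is a minimal Python sketch. It is not the tool's implementation: the four-characters-per-token estimate, the function names, and the sample tool entries are all assumptions made for illustration.

```python
import json

CHARS_PER_TOKEN = 4  # assumption: crude heuristic, not the tool's real estimator


def estimate_tokens(obj) -> int:
    """Estimate token cost from the length of the compact JSON encoding."""
    text = json.dumps(obj, separators=(",", ":"), sort_keys=True)
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def check_budget(tools, max_tools, max_schema_tokens):
    """Return (status, total_tokens) for a list of tool schema dicts."""
    total = sum(estimate_tokens(tool) for tool in tools)
    ok = len(tools) <= max_tools and total <= max_schema_tokens
    return ("PASS" if ok else "FAIL", total)


# Hypothetical entries shaped like items in a tools/list result.
TOOLS = [
    {"name": "get_issue", "description": "Fetch one issue", "inputSchema": {"type": "object"}},
    {"name": "update_ticket", "description": "Update one ticket", "inputSchema": {"type": "object"}},
]

status, total = check_budget(TOOLS, max_tools=8, max_schema_tokens=6000)
print(f"BUDGET_STATUS={status} SCHEMA_TOKENS={total}")
```

A CI job would treat a FAIL status as a non-zero exit, which is the regression gate the list above describes.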

No private Orion services are required, and the core CLI has no external runtime service dependency. Semantic selection calls Ollama only when --embedding-backend ollama is passed explicitly. Live stdio introspection starts a caller-owned local MCP command only when --allow-start is passed explicitly.

Install

python3.11 -m venv .venv
. .venv/bin/activate
pip install -e '.[dev]'

Quick Demo

mcp-context-budget demo \
  --task "triage a GitHub issue and update one ticket" \
  --max-tools 8 \
  --max-schema-tokens 6000 \
  --max-response-tokens 4000

Expected spine proof:

DEMO_CATALOG_SERVERS=5
DEMO_CATALOG_TOOLS=120
BEFORE_SCHEMA_TOKENS=<large>
SELECTED_TOOLS=<8-or-less>
AFTER_SCHEMA_TOKENS=<cap-or-less>
OVERSIZED_RESPONSE_FIXTURE=flagged
BUDGET_STATUS=PASS

Commands

mcp-context-budget scan --tool-list fixtures/demo-tools.json --out mcp-budget.report.md --lock-out mcp-budget.lock.json
mcp-context-budget select --tool-list fixtures/demo-tools.json --task "triage a GitHub issue" --max-tools 8 --max-schema-tokens 6000 --out-lock mcp-budget.lock.json
mcp-context-budget semantic-select --tool-list fixtures/demo-tools.json --task "triage a GitHub issue" --embedding-backend fixture --embedding-file embeddings.json --out-lock mcp-budget.lock.json
mcp-context-budget check --lock mcp-budget.lock.json --max-schema-tokens 6000 --max-response-tokens 4000
mcp-context-budget compress-responses --fixtures responses/ --max-response-tokens 4000 --out-dir compressed-responses --report compression-report.json
mcp-context-budget config-apply --config mcp.json --lock mcp-budget.lock.json --dry-run --patch-out mcp-config.patch.json
mcp-context-budget config-audit --config mcp.json --json-out mcp-config-audit.json --fail-on high
mcp-context-budget export --lock mcp-budget.lock.json --format sarif --out mcp-budget.sarif
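
To illustrate the kind of deterministic ranking that select performs with SQLite FTS5/BM25, here is a small Python sketch using the standard sqlite3 module. The catalog entries, the table layout, and the query construction are assumptions, not the tool's actual schema, and the sketch needs an SQLite build with FTS5 enabled.

```python
import sqlite3

# Hypothetical catalog of (tool name, description) pairs.
TOOLS = [
    ("github/get_issue", "Fetch a GitHub issue by number"),
    ("github/update_issue", "Update labels or state on a GitHub issue"),
    ("calendar/create_event", "Create a calendar event"),
    ("slack/post_message", "Post a message to a Slack channel"),
    ("files/read_file", "Read a file from the workspace"),
    ("db/run_query", "Run a read-only SQL query"),
]


def select_tools(task, max_tools):
    """Rank tools against the task text with FTS5's bm25() and keep the top few."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE VIRTUAL TABLE tools USING fts5(name UNINDEXED, description)")
    db.executemany("INSERT INTO tools VALUES (?, ?)", TOOLS)
    # Quote each word and join with OR so any matching term contributes to the score.
    terms = [word for word in task.replace('"', " ").split() if word]
    query = " OR ".join(f'"{word}"' for word in terms)
    rows = db.execute(
        # bm25() returns lower values for better matches; name breaks ties deterministically.
        "SELECT name FROM tools WHERE tools MATCH ? ORDER BY bm25(tools), name LIMIT ?",
        (query, max_tools),
    ).fetchall()
    db.close()
    return [name for (name,) in rows]


print(select_tools("triage a GitHub issue", max_tools=2))
```

Because the ranking depends only on the catalog text and the task string, the same inputs give the same selection on every run, which is what makes the resulting lock suitable for CI.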

scan --config supports Claude/Cursor/Codex-style JSON with an mcpServers object. Server entries may include toolsListPath to point at a recorded tools/list JSON fixture. Server entries may also include stdioFraming (auto, json-lines, or content-length) when a local stdio server needs a fixed transport framing. Environment values are redacted in reports.
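
For orientation, a config of that shape and a redaction pass might look like the Python sketch below. Only mcpServers, toolsListPath, and stdioFraming come from the description above; the server name, command, paths, and the "<redacted>" placeholder are hypothetical.

```python
import copy
import json

# Hypothetical config entry; the values are placeholders for illustration.
CONFIG = {
    "mcpServers": {
        "github": {
            "command": "github-mcp",
            "args": ["--stdio"],
            "env": {"GITHUB_TOKEN": "example-value"},
            "toolsListPath": "fixtures/github-tools.json",
            "stdioFraming": "json-lines",
        }
    }
}


def redact_env(config):
    """Return a copy of the config with every env value replaced by a placeholder."""
    redacted = copy.deepcopy(config)
    for server in redacted.get("mcpServers", {}).values():
        env = server.get("env")
        if isinstance(env, dict):
            server["env"] = {key: "<redacted>" for key in env}
    return redacted


print(json.dumps(redact_env(CONFIG), indent=2))
```

Working on a deep copy keeps the caller's config untouched, so only the report carries the placeholders.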

--allow-start is intentionally conservative. It is never implied by default, never required for static toolsListPath or inline-tool configs, and never starts a hosted service. When explicitly selected, the tool starts the caller-owned local stdio command as argv with shell=False, sends MCP initialize and tools/list, enforces timeout and stdio-byte caps, redacts env metadata, and exits the process after listing tools. The default --stdio-framing auto prefers the current MCP SDK JSON-lines stdio transport and falls back to the legacy Content-Length fixture transport; pass --stdio-framing json-lines or --stdio-framing content-length to force one.
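
The two framings differ only in how each JSON-RPC message is delimited on the pipe. The Python sketch below shows one plausible encoding of each; the function names are hypothetical and the tool's own encoder may differ in detail.

```python
import json


def frame_json_lines(message: dict) -> bytes:
    """JSON-lines framing: one compact JSON object per line, ended by a newline."""
    return json.dumps(message, separators=(",", ":")).encode("utf-8") + b"\n"


def frame_content_length(message: dict) -> bytes:
    """Header framing: Content-Length gives the byte length of the JSON body."""
    body = json.dumps(message, separators=(",", ":")).encode("utf-8")
    return b"Content-Length: %d\r\n\r\n" % len(body) + body


# A tools/list request as it would follow the initialize handshake.
TOOLS_LIST = {"jsonrpc": "2.0", "id": 2, "method": "tools/list"}

print(frame_json_lines(TOOLS_LIST))
print(frame_content_length(TOOLS_LIST))
```

A server that expects one framing will typically stall or error on the other, which is why forcing a framing is useful when auto-detection picks the wrong one.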

For command-discovered servers that need to become enforceable by config-apply, combine --allow-start with --materialize-tools-list:

mcp-context-budget config-apply \
  --config mcp.json \
  --lock mcp-budget.lock.json \
  --write \
  --allow-start \
  --start-timeout-seconds 2 \
  --max-stdio-bytes 65536 \
  --stdio-framing auto \
  --materialize-tools-list materialized-tools/

This writes a local toolsListPath sidecar for the discovered tools, applies the selected-tool lock there, and leaves later scan/select runs static again.

Semantic Selection

semantic-select keeps the v0.1 lockfile shape but ranks tools by embedding similarity before applying --max-tools and --max-schema-tokens.

Fixture mode is deterministic and requires no service:

mcp-context-budget semantic-select \
  --tool-list tools.json \
  --task "diagnose bug report" \
  --embedding-backend fixture \
  --embedding-file embeddings.json \
  --out-lock semantic.lock.json

The fixture file must contain:

{
  "queries": {"diagnose bug report": [1.0, 0.0]},
  "tools": {"github/get_issue": [1.0, 0.0]}
}
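
Given a fixture in that shape, ranking by embedding similarity can be sketched in a few lines of Python. Cosine similarity and the tie-break by name are assumptions about how the ranking works, and every tool vector except github/get_issue is made up for the example.

```python
import json
import math

FIXTURE = json.loads("""
{
  "queries": {"diagnose bug report": [1.0, 0.0]},
  "tools": {
    "github/get_issue": [1.0, 0.0],
    "calendar/create_event": [0.0, 1.0],
    "github/search_issues": [0.8, 0.6]
  }
}
""")


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_tools(fixture, task, max_tools):
    """Return the max_tools tool names most similar to the task's query vector."""
    query = fixture["queries"][task]
    scored = [(cosine(query, vec), name) for name, vec in fixture["tools"].items()]
    # Sort by descending score, then by name so ties resolve deterministically.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scored[:max_tools]]


print(rank_tools(FIXTURE, "diagnose bug report", max_tools=2))
```

Since the vectors come from a file rather than a model, the ranking is reproducible, which is the property fixture mode is meant to provide.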

Ollama mode uses stdlib HTTP and adds no Python package dependency:

mcp-context-budget semantic-select \
  --tool-list tools.json \
  --task "diagnose bug report" \
  --embedding-backend ollama \
  --ollama-url http://localhost:11434 \
  --ollama-model nomic-embed-text
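
A stdlib-only embedding call could look like the Python sketch below. Which Ollama endpoint the tool actually calls is not stated here; the sketch assumes the /api/embeddings route with a model and prompt payload, and the function names are hypothetical.

```python
import json
import urllib.request


def build_embedding_request(base_url, model, text):
    """Build a POST request for one embedding (assumed /api/embeddings route)."""
    payload = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_embedding(base_url, model, text, timeout=10.0):
    """Send the request to a running local Ollama server and return the vector."""
    request = build_embedding_request(base_url, model, text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["embedding"]


# Building the request needs no server; only fetch_embedding touches the network.
req = build_embedding_request("http://localhost:11434", "nomic-embed-text", "diagnose bug report")
print(req.get_method(), req.full_url)
```

Keeping request construction separate from the network call makes the payload easy to inspect without a server running.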

Response Fixture Compression

compress-responses reads one response fixture or a directory of *.json fixtures, writes compressed copies, and emits a JSON report.

mcp-context-budget compress-responses \
  --fixtures responses/ \
  --max-response-tokens 4000 \
  --out-dir compressed-responses \
  --report compression-report.json

The v0.2 strategy is deterministic extractive compression. It preserves common identifier fields and writes a summary when large body fields need to be cut.
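
The idea of extractive compression can be sketched in Python as follows. The identifier field list, the "_summary" key, the 200-character cut, and the token heuristic are all assumptions; the tool's real field handling is not described here. The sketch only trims top-level string fields and does not guarantee the result fits the budget.

```python
import json

ID_FIELDS = ("id", "number", "url", "title", "state")  # assumed identifier fields
CHARS_PER_TOKEN = 4  # assumption: rough size heuristic


def estimate_tokens(obj):
    """Estimate tokens from the compact JSON length (ceiling division)."""
    return -(-len(json.dumps(obj, separators=(",", ":"))) // CHARS_PER_TOKEN)


def compress_response(response, max_response_tokens, keep_chars=200):
    """Keep identifier fields verbatim; cut long string fields and note what was cut."""
    if estimate_tokens(response) <= max_response_tokens:
        return response
    compressed, cut = {}, []
    for key, value in response.items():
        if key in ID_FIELDS or not isinstance(value, str) or len(value) <= keep_chars:
            compressed[key] = value
        else:
            compressed[key] = value[:keep_chars]
            cut.append(key)
    if cut:
        compressed["_summary"] = "truncated fields: " + ", ".join(sorted(cut))
    return compressed


# Hypothetical oversized response fixture.
RESPONSE = {"id": 42, "title": "Crash on start", "body": "x" * 20000}
small = compress_response(RESPONSE, max_response_tokens=4000)
print(small["_summary"], estimate_tokens(small))
```

Cutting by position rather than by a model keeps the output identical across runs for the same fixture.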

Config Apply

config-apply turns a selected-tool lock into a safe local MCP config patch. Dry-run is the default posture; --write is required before the config file is changed, and write mode creates a backup.

mcp-context-budget config-apply \
  --config mcp.json \
  --lock mcp-budget.lock.json \
  --mode disable-unselected \
  --dry-run \
  --patch-out mcp-config.patch.json

mcp-context-budget config-apply \
  --config mcp.json \
  --lock mcp-budget.lock.json \
  --mode disable-unselected \
  --write \
  --backup-dir backups/

Reports redact environment values.

The apply contract is enforced, not advisory:

Inline and toolsListPath tools are both patched. A server whose tools live in an external tools/list JSON has that file patched (and backed up) too, rather than being silently skipped.
The lock is bound to the config. Each lock records a config_fingerprint of its tool universe; config-apply refuses a lock whose fingerprint does not match the target config (a foreign/stale lock would otherwise disable every tool and still report success). Override with --allow-fingerprint-mismatch.
Honest status, never a false PASS. A command-discovered server (no inline tools, no toolsListPath) cannot be enforced without live startup, so it is reported under not_patchable and the status is PARTIAL, not PASS.
Opt-in materialization closes the PARTIAL gap. With --allow-start and --materialize-tools-list, command-discovered tools are listed through local stdio, saved to a caller-owned sidecar, and enforced as a normal toolsListPath catalog.
Disabling takes effect. The loader honors enabled: false, so a disabled tool (or server) drops out of the budget on the next scan/select.
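
The fingerprint check and the disable-unselected step can be pictured with the Python sketch below. The hashing scheme, the inline "tools" key, and the "selected_tools" lock field are assumptions made for the example; only config_fingerprint, enabled: false, and the mismatch override come from the contract above.

```python
import hashlib
import json


def tool_universe(config):
    """Sorted 'server/tool' names for every inline tool in the config."""
    names = []
    for server_name, server in config.get("mcpServers", {}).items():
        for tool in server.get("tools", []):
            names.append(f"{server_name}/{tool['name']}")
    return sorted(names)


def config_fingerprint(config):
    """Hash the tool universe (assumed scheme: SHA-256 of the sorted names)."""
    payload = json.dumps(tool_universe(config), separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def apply_lock(config, lock, allow_fingerprint_mismatch=False):
    """Refuse a foreign lock, then mark every unselected tool as disabled (in place)."""
    if lock["config_fingerprint"] != config_fingerprint(config) and not allow_fingerprint_mismatch:
        raise ValueError("lock fingerprint does not match target config")
    selected = set(lock["selected_tools"])
    for server_name, server in config["mcpServers"].items():
        for tool in server.get("tools", []):
            tool["enabled"] = f"{server_name}/{tool['name']}" in selected
    return config


# Hypothetical config and lock.
CONFIG = {"mcpServers": {"github": {"tools": [{"name": "get_issue"}, {"name": "delete_repo"}]}}}
LOCK = {"config_fingerprint": config_fingerprint(CONFIG), "selected_tools": ["github/get_issue"]}
print(apply_lock(CONFIG, LOCK))
```

Without the fingerprint check, a lock made for another config would select none of the tool names here, so every tool would be disabled while the run still looked successful.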

Config Secret Audit

config-audit is a read-only hygiene check for MCP config files. It flags high-confidence literal secrets in env values, args, and nested config fields, while treating ${TOKEN} references, op://... references, and redacted placeholders as safe references.

mcp-context-budget config-audit \
  --config mcp.json \
  --json-out mcp-config-audit.json \
  --fail-on high

Reports include the config path, finding path, severity, secret class, length bucket, and a short fingerprint. Literal secret values are never printed.
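
A report of that kind can be produced without ever echoing the secret. The Python sketch below walks a config, skips safe references, and records only metadata for matches. The two detection patterns, the bucket boundaries, the path syntax, and the 8-character fingerprint are illustrative assumptions, far narrower than a real detector.

```python
import hashlib
import re

# Values treated as safe: ${NAME} references, op:// references, redaction placeholders.
SAFE_REFERENCE = re.compile(r"^(\$\{[A-Za-z_][A-Za-z0-9_]*\}|op://.+|<redacted>|\*+)$")
# Assumption: a tiny illustrative pattern set.
SECRET_PATTERNS = {
    "github_token": re.compile(r"^ghp_[A-Za-z0-9]{36}$"),
    "aws_access_key_id": re.compile(r"^AKIA[0-9A-Z]{16}$"),
}


def length_bucket(value):
    n = len(value)
    return "<16" if n < 16 else "16-39" if n < 40 else "40+"


def audit(node, path="$"):
    """Recursively collect findings; the literal value is never stored."""
    findings = []
    if isinstance(node, dict):
        for key, child in node.items():
            findings += audit(child, f"{path}.{key}")
    elif isinstance(node, list):
        for index, child in enumerate(node):
            findings += audit(child, f"{path}[{index}]")
    elif isinstance(node, str) and not SAFE_REFERENCE.match(node):
        for secret_class, pattern in SECRET_PATTERNS.items():
            if pattern.match(node):
                findings.append({
                    "path": path, "severity": "high", "secret_class": secret_class,
                    "length_bucket": length_bucket(node),
                    "fingerprint": hashlib.sha256(node.encode()).hexdigest()[:8],
                })
    return findings


# Hypothetical config with one literal token-shaped value and two safe references.
CONFIG = {"mcpServers": {"github": {
    "args": ["--token-ref", "op://vault/github/token"],
    "env": {"GITHUB_TOKEN": "ghp_" + "a" * 36, "API_KEY": "${API_KEY}"},
}}}
for finding in audit(CONFIG):
    print(finding)
```

The short hash prefix lets two reports be compared for the same leaked value without revealing it.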

Docker

docker build -t mcp-context-budget:local .
docker run --rm mcp-context-budget:local demo \
  --task "triage a GitHub issue and update one ticket" \
  --max-tools 8 \
  --max-schema-tokens 6000 \
  --max-response-tokens 4000

The image exposes no service port.

Releases since v0.2 also include independent Docker proof commands for the newer capabilities:

docker run --rm mcp-context-budget:local semantic-demo \
  --task "diagnose bug report" \
  --max-tools 3 \
  --max-schema-tokens 3000
docker run --rm mcp-context-budget:local compress-demo --max-response-tokens 4000
docker run --rm mcp-context-budget:local config-demo
docker run --rm mcp-context-budget:local allow-start-demo --start-timeout-seconds 2 --max-stdio-bytes 65536 --stdio-framing auto
docker run --rm mcp-context-budget:local semantic-demo --task "diagnose bug report" --max-tools 3 --max-schema-tokens 3000 --embedding-backend fixture
docker run --rm mcp-context-budget:local prove-parallel-ollama-demo
docker run --rm mcp-context-budget:local config-audit-demo
docker run --rm mcp-context-budget:local config-multiserver-demo

Expected v0.3 proof lines:

LIVE_INTROSPECTION_STATUS=PASS
AFTER_CONFIG_NOT_PATCHABLE=0
CONFIG_AUDIT_STATUS=PASS
CONFIG_AUDIT_SECRET_VALUES_REDACTED=true
CONFIG_MULTISERVER_STATUS=PASS

Out of v0.3

Locked Out

Live runtime MCP proxy/gateway that intercepts and routes actual tool calls.
Browser UI.
Organization-wide background scanner.
Vendor-specific hosted dashboards.

These are not v0.4 commitments; they break the local-first CLI verifier shape.

Shipped in v0.4

Parallelized Ollama embeddings for semantic-select when --embedding-backend ollama is explicitly selected (fixture backend remains the default for CI/docker).
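
One common way to parallelize per-tool embedding calls is a thread pool, sketched below in Python. This is an assumption about the general approach, not a description of the tool's code; fake_embed is a stand-in for a real HTTP call.

```python
from concurrent.futures import ThreadPoolExecutor


def embed_all(texts, embed_one, max_workers=4):
    """Embed texts concurrently; pool.map returns results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(embed_one, texts))


def fake_embed(text):
    # Hypothetical stand-in for a request to a local embedding server.
    return [float(len(text)), 0.0]


print(embed_all(["get_issue", "update_ticket"], fake_embed))
```

Because results come back in input order, the ranking that follows stays deterministic even though the requests finish in arbitrary order.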

Deferred to v0.5

Automatic response compression for arbitrary live MCP servers.
Broader CLI polish.

Project details

This version

0.4.0

Jun 28, 2026

Download files

Source Distribution

mcp_context_budget-0.4.0.tar.gz (45.9 kB)

Uploaded Jun 28, 2026 Source

Built Distribution

mcp_context_budget-0.4.0-py3-none-any.whl (40.2 kB)

Uploaded Jun 28, 2026 Python 3

File details

Details for the file mcp_context_budget-0.4.0.tar.gz.

File metadata

Download URL: mcp_context_budget-0.4.0.tar.gz
Upload date: Jun 28, 2026
Size: 45.9 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for mcp_context_budget-0.4.0.tar.gz
Algorithm	Hash digest
SHA256	`dc628c4a3a8468f8009a251155ce2341f73ccdb0f60ae10c01eaaa015ffa5c69`
MD5	`ec607530bd9c4db75ce33d2746f1d318`
BLAKE2b-256	`8f9a70c512f6479994c09c4f57e8b7ee6ac4389f18c94139239cea8be63c6e44`

Provenance

The following attestation bundles were made for mcp_context_budget-0.4.0.tar.gz:

Publisher: release.yml on OrionArchitekton/mcp-context-budget

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: mcp_context_budget-0.4.0.tar.gz
- Subject digest: dc628c4a3a8468f8009a251155ce2341f73ccdb0f60ae10c01eaaa015ffa5c69
- Sigstore transparency entry: 1997738650
- Sigstore integration time: Jun 28, 2026
Source repository:
- Permalink: OrionArchitekton/mcp-context-budget@a4ba329f26a3e23a9ad2be6f7c7d95b580706eb7
- Branch / Tag: refs/tags/v0.4.0
- Owner: https://github.com/OrionArchitekton
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@a4ba329f26a3e23a9ad2be6f7c7d95b580706eb7
- Trigger Event: push

File details

Details for the file mcp_context_budget-0.4.0-py3-none-any.whl.

File metadata

Download URL: mcp_context_budget-0.4.0-py3-none-any.whl
Upload date: Jun 28, 2026
Size: 40.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for mcp_context_budget-0.4.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ceda20caf239f796e415152396dc9519610fc45e6eb851fd73d9cddbb92c227e`
MD5	`0e49dbbcf9ff4ef615cb1371d826037c`
BLAKE2b-256	`8eea80a4de47a3183503c4372092cc835267211190c01d4a5a0996fc8e5f708e`

Provenance

The following attestation bundles were made for mcp_context_budget-0.4.0-py3-none-any.whl:

Publisher: release.yml on OrionArchitekton/mcp-context-budget

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: mcp_context_budget-0.4.0-py3-none-any.whl
- Subject digest: ceda20caf239f796e415152396dc9519610fc45e6eb851fd73d9cddbb92c227e
- Sigstore transparency entry: 1997738914
- Sigstore integration time: Jun 28, 2026
Source repository:
- Permalink: OrionArchitekton/mcp-context-budget@a4ba329f26a3e23a9ad2be6f7c7d95b580706eb7
- Branch / Tag: refs/tags/v0.4.0
- Owner: https://github.com/OrionArchitekton
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@a4ba329f26a3e23a9ad2be6f7c7d95b580706eb7
- Trigger Event: push
