Local-first MCP context budget and tool-selection verifier
mcp-context-budget
Local-first MCP context budget and tool-selection verifier for agentic coding environments.
MCP servers can load enough tool schema and response data to consume a large fraction of an agent's context window before useful work starts. This tool gives developers a repeatable budget gate:
- scan MCP config or tools/list fixtures
- estimate schema and response-token cost
- select a smaller task-relevant tool set with deterministic SQLite FTS5/BM25
- optionally prove semantic tool selection from fixture or Ollama embeddings
- write a lockfile for CI
- fail builds when schema or response budgets regress
- compress recorded response fixtures under a response budget
- apply selected-tool locks back to caller-owned MCP config files
- opt into local stdio tools/list introspection for command-discovered servers
- audit MCP configs for plaintext secret exposure without printing values
- prove the spine with a Docker demo
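As a rough illustration of the deterministic FTS5/BM25 selection step listed above, the sketch below ranks a tiny tool catalog against a task string. The `(name, description)` tuple shape and the `select_tools` helper are assumptions made for this example, not the project's fixture format or API, and the sketch assumes Python's `sqlite3` module was built with FTS5 enabled.

```python
import sqlite3


def select_tools(tools, task, max_tools):
    """Rank (name, description) pairs against a task with FTS5 BM25.

    Hypothetical helper for illustration; the real CLI may index
    different fields and build its query differently.
    """
    # Keep plain alphanumeric words so the MATCH expression stays valid.
    terms = [word for word in task.lower().split() if word.isalnum()]
    if not terms:
        return []
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE VIRTUAL TABLE tools USING fts5(name, description)")
        conn.executemany("INSERT INTO tools VALUES (?, ?)", tools)
        # OR the quoted terms together so partial matches still score.
        query = " OR ".join(f'"{word}"' for word in terms)
        rows = conn.execute(
            "SELECT name FROM tools WHERE tools MATCH ? "
            # bm25() is lower for better matches; name breaks ties.
            "ORDER BY bm25(tools), name LIMIT ?",
            (query, max_tools),
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


catalog = [
    ("github/get_issue", "Fetch a GitHub issue by number"),
    ("slack/post_message", "Post a message to a Slack channel"),
    ("jira/update_ticket", "Update a ticket"),
]
```

Calling `select_tools(catalog, "triage a GitHub issue", 2)` should place `github/get_issue` first, because it matches the rarest task terms. Sorting on `bm25(tools)` and then on name keeps the ordering repeatable across runs.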
No private Orion services are required. The core CLI has no external runtime
service dependency. Semantic selection can optionally call Ollama only when the
--embedding-backend ollama flag is explicitly selected. Live stdio
introspection can optionally start a caller-owned local MCP command only when
--allow-start is explicitly selected.
Install
python3.11 -m venv .venv
. .venv/bin/activate
pip install -e '.[dev]'
Quick Demo
mcp-context-budget demo \
--task "triage a GitHub issue and update one ticket" \
--max-tools 8 \
--max-schema-tokens 6000 \
--max-response-tokens 4000
Expected spine proof:
DEMO_CATALOG_SERVERS=5
DEMO_CATALOG_TOOLS=120
BEFORE_SCHEMA_TOKENS=<large>
SELECTED_TOOLS=<8-or-less>
AFTER_SCHEMA_TOKENS=<cap-or-less>
OVERSIZED_RESPONSE_FIXTURE=flagged
BUDGET_STATUS=PASS
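One way a CI job might consume these proof lines is to parse the KEY=VALUE output and re-check the caps itself. This is a minimal sketch under that assumption: `parse_proof_lines` and `gate_passes` are hypothetical helpers, and the numbers in `sample_output` are illustrative stand-ins for the placeholders above, not recorded tool output.

```python
import re

# A proof line is an upper-case key, "=", and a value.
PROOF_LINE = re.compile(r"^([A-Z][A-Z0-9_]*)=(.*)$")


def parse_proof_lines(output):
    """Collect KEY=VALUE proof lines into a dict, ignoring other lines."""
    proof = {}
    for line in output.splitlines():
        match = PROOF_LINE.match(line.strip())
        if match:
            proof[match.group(1)] = match.group(2)
    return proof


def gate_passes(proof, max_tools, max_schema_tokens):
    """Return True only if the status is PASS and both caps hold."""
    try:
        selected = int(proof["SELECTED_TOOLS"])
        after_tokens = int(proof["AFTER_SCHEMA_TOKENS"])
    except (KeyError, ValueError):
        return False
    return (
        proof.get("BUDGET_STATUS") == "PASS"
        and selected <= max_tools
        and after_tokens <= max_schema_tokens
    )


# Illustrative values only; real runs print their own numbers.
sample_output = """DEMO_CATALOG_SERVERS=5
DEMO_CATALOG_TOOLS=120
SELECTED_TOOLS=7
AFTER_SCHEMA_TOKENS=5200
BUDGET_STATUS=PASS
"""
```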
Commands
mcp-context-budget scan --tool-list fixtures/demo-tools.json --out mcp-budget.report.md --lock-out mcp-budget.lock.json
mcp-context-budget select --tool-list fixtures/demo-tools.json --task "triage a GitHub issue" --max-tools 8 --max-schema-tokens 6000 --out-lock mcp-budget.lock.json
mcp-context-budget semantic-select --tool-list fixtures/demo-tools.json --task "triage a GitHub issue" --embedding-backend fixture --embedding-file embeddings.json --out-lock mcp-budget.lock.json
mcp-context-budget check --lock mcp-budget.lock.json --max-schema-tokens 6000 --max-response-tokens 4000
mcp-context-budget compress-responses --fixtures responses/ --max-response-tokens 4000 --out-dir compressed-responses --report compression-report.json
mcp-context-budget config-apply --config mcp.json --lock mcp-budget.lock.json --dry-run --patch-out mcp-config.patch.json
mcp-context-budget config-audit --config mcp.json --json-out mcp-config-audit.json --fail-on high
mcp-context-budget export --lock mcp-budget.lock.json --format sarif --out mcp-budget.sarif
scan --config supports Claude/Cursor/Codex-style JSON with an mcpServers
object. Server entries may include toolsListPath to point at a recorded
tools/list JSON fixture. Server entries may also include stdioFraming
(auto, json-lines, or content-length) when a local stdio server needs a
fixed transport framing. Environment values are redacted in reports.
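To make the config shape concrete, here is a sketch of an mcpServers entry that uses toolsListPath and stdioFraming, plus a redaction pass over env values. The `command` and `args` values, the file path, and the `<redacted>` placeholder string are assumptions for illustration; the real report format may differ.

```python
import copy

REDACTED = "<redacted>"  # assumed placeholder, not necessarily the tool's own


def redact_env(config):
    """Return a copy of the config with every env value replaced."""
    redacted = copy.deepcopy(config)
    for server in redacted.get("mcpServers", {}).values():
        env = server.get("env")
        if isinstance(env, dict):
            # Keep the variable names, drop the values.
            server["env"] = {name: REDACTED for name in env}
    return redacted


example_config = {
    "mcpServers": {
        "github": {
            "command": "github-mcp",  # hypothetical local command
            "args": ["--stdio"],
            "env": {"GITHUB_TOKEN": "example-value"},
            "toolsListPath": "fixtures/github-tools.json",  # hypothetical path
            "stdioFraming": "json-lines",
        }
    }
}
```

Working on a deep copy keeps the caller's config object unchanged, which matters if the same object is later written back to disk.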
--allow-start is intentionally conservative. It is never implied by default,
never required for static toolsListPath or inline-tool configs, and never
starts a hosted service. When explicitly selected, the tool starts the
caller-owned local stdio command as argv with shell=False, sends MCP
initialize and tools/list, enforces timeout and stdio-byte caps, redacts env
metadata, and exits the process after listing tools. The default
--stdio-framing auto prefers the current MCP SDK JSON-lines stdio transport
and falls back to the legacy Content-Length fixture transport; pass
--stdio-framing json-lines or --stdio-framing content-length to force one.
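The two framings differ only in how a JSON-RPC message is delimited on the stdio stream. The sketch below encodes a tools/list request both ways and decodes a single frame by inspecting its first bytes. The function names are hypothetical, the decoder assumes a single Content-Length header, and the tool's own auto-detection logic may work differently.

```python
import json


def encode_json_lines(message):
    """One compact JSON object per line, terminated by a newline."""
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


def encode_content_length(message):
    """A Content-Length header, a blank line, then the JSON body."""
    body = json.dumps(message, separators=(",", ":")).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def decode_frame(data):
    """Decode one frame, choosing the framing from the leading bytes."""
    if data.startswith(b"Content-Length:"):
        header, _, rest = data.partition(b"\r\n\r\n")
        length = int(header.split(b":", 1)[1].decode("ascii").strip())
        return json.loads(rest[:length].decode("utf-8"))
    line, _, _ = data.partition(b"\n")
    return json.loads(line.decode("utf-8"))


tools_list_request = {"jsonrpc": "2.0", "id": 2, "method": "tools/list"}
```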
For command-discovered servers that need to become enforceable by
config-apply, combine --allow-start with --materialize-tools-list:
mcp-context-budget config-apply \
--config mcp.json \
--lock mcp-budget.lock.json \
--write \
--allow-start \
--start-timeout-seconds 2 \
--max-stdio-bytes 65536 \
--stdio-framing auto \
--materialize-tools-list materialized-tools/
This writes a local toolsListPath sidecar for the discovered tools, applies
the selected-tool lock there, and leaves later scan/select runs static again.
Semantic Selection
semantic-select keeps the v0.1 lockfile shape but ranks tools by embedding
similarity before applying --max-tools and --max-schema-tokens.
Fixture mode is deterministic and requires no service:
mcp-context-budget semantic-select \
--tool-list tools.json \
--task "diagnose bug report" \
--embedding-backend fixture \
--embedding-file embeddings.json \
--out-lock semantic.lock.json
The fixture file must contain:
{
"queries": {"diagnose bug report": [1.0, 0.0]},
"tools": {"github/get_issue": [1.0, 0.0]}
}
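Given a fixture of that shape, ranking by embedding similarity can be sketched as a cosine-similarity sort followed by a max-tools cut. `rank_tools` is a hypothetical helper, the second tool in the fixture is invented for the example, and the actual command may score or break ties differently.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_tools(fixture, task, max_tools):
    """Order tools by similarity to the task embedding, best first."""
    query = fixture["queries"][task]
    scored = [(cosine(query, vector), name) for name, vector in fixture["tools"].items()]
    # Highest score first; tool name breaks ties so the order is repeatable.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scored[:max_tools]]


fixture = {
    "queries": {"diagnose bug report": [1.0, 0.0]},
    "tools": {
        "github/get_issue": [1.0, 0.0],
        "slack/post_message": [0.0, 1.0],  # hypothetical second tool
    },
}
```

With this fixture, `rank_tools(fixture, "diagnose bug report", 1)` returns `["github/get_issue"]`, since that vector is identical to the query vector.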
Ollama mode uses stdlib HTTP and adds no Python package dependency:
mcp-context-budget semantic-select \
--tool-list tools.json \
--task "diagnose bug report" \
--embedding-backend ollama \
--ollama-url http://localhost:11434 \
--ollama-model nomic-embed-text
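For readers curious what a stdlib-only embedding call can look like, the sketch below builds a request against Ollama's `/api/embeddings` endpoint with `urllib.request`. Which endpoint and payload the CLI actually uses is not stated here, so treat the endpoint choice and both helper names as assumptions.

```python
import json
import urllib.request


def build_embedding_request(base_url, model, text):
    """Build a POST request for one embedding (no network access here)."""
    payload = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_embedding(base_url, model, text, timeout=10.0):
    """Send the request to a running Ollama server and return the vector."""
    request = build_embedding_request(base_url, model, text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["embedding"]
```

Separating request construction from the network call keeps the payload testable without a running Ollama instance.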
Response Fixture Compression
compress-responses reads one response fixture or a directory of *.json
fixtures, writes compressed copies, and emits a JSON report.
mcp-context-budget compress-responses \
--fixtures responses/ \
--max-response-tokens 4000 \
--out-dir compressed-responses \
--report compression-report.json
The v0.2 strategy is deterministic extractive compression. It preserves common
identifier fields and writes a summary when large body fields need to be cut.
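A deterministic extractive pass along these lines could be sketched as follows. The identifier field names, the four-characters-per-token estimate, the `_summary` key, and the excerpt length are all assumptions for illustration; the sketch also does not guarantee the result lands under the budget when many fields are large.

```python
import json

# Hypothetical identifier fields that are always kept verbatim.
ID_FIELDS = ("id", "number", "url", "title", "state")


def estimate_tokens(value):
    """Crude estimate: about four characters of JSON per token."""
    return max(1, len(json.dumps(value, sort_keys=True)) // 4)


def compress_response(response, max_tokens, excerpt_chars=200):
    """Truncate long string fields of a dict response and record a summary."""
    if estimate_tokens(response) <= max_tokens:
        return response
    compressed = {}
    truncated = []
    for key, value in response.items():
        is_long_text = isinstance(value, str) and len(value) > excerpt_chars
        if key in ID_FIELDS or not is_long_text:
            compressed[key] = value
        else:
            compressed[key] = value[:excerpt_chars]
            truncated.append(key)
    compressed["_summary"] = {
        "truncated_fields": sorted(truncated),
        "original_tokens": estimate_tokens(response),
    }
    return compressed
```

Because the pass only slices strings and sorts field names, the same input always yields the same compressed output.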
Config Apply
config-apply turns a selected-tool lock into a safe local MCP config patch.
Dry-run is the default posture; --write is required before the config file is
changed, and write mode creates a backup.
mcp-context-budget config-apply \
--config mcp.json \
--lock mcp-budget.lock.json \
--mode disable-unselected \
--dry-run \
--patch-out mcp-config.patch.json
mcp-context-budget config-apply \
--config mcp.json \
--lock mcp-budget.lock.json \
--mode disable-unselected \
--write \
--backup-dir backups/
Reports redact environment values.
The apply contract is enforced, not advisory:
- Inline and toolsListPath tools are both patched. A server whose tools live in an external tools/list JSON has that file patched (and backed up) too; it is not silently skipped.
- The lock is bound to the config. Each lock records a config_fingerprint of its tool universe; config-apply refuses a lock whose fingerprint does not match the target config (a foreign/stale lock would otherwise disable every tool and still report success). Override with --allow-fingerprint-mismatch.
- Honest status, never a false PASS. A command-discovered server (no inline tools, no toolsListPath) cannot be enforced without live startup, so it is reported under not_patchable and the status is PARTIAL, not PASS.
- Opt-in materialization closes the PARTIAL gap. With --allow-start and --materialize-tools-list, command-discovered tools are listed through local stdio, saved to a caller-owned sidecar, and enforced as a normal toolsListPath catalog.
- Disabling takes effect. The loader honors enabled: false, so a disabled tool (or server) drops out of the budget on the next scan/select.
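The fingerprint binding and the PARTIAL status can be illustrated with a small model of the contract. This is a sketch under stated assumptions: hashing sorted tool names with SHA-256, the `selected` lock key, and the `apply_lock` helper are hypothetical, while `config_fingerprint` and `not_patchable` are the names used above.

```python
import hashlib


def config_fingerprint(tool_names):
    """Hash the sorted, de-duplicated tool universe (assumed scheme)."""
    canonical = "\n".join(sorted(set(tool_names)))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def apply_lock(config_tools, lock, allow_mismatch=False):
    """Model the apply contract.

    config_tools maps server name to a list of tool names, or None for a
    command-discovered server whose tools are unknown without startup.
    """
    universe = [
        f"{server}/{tool}"
        for server, tools in config_tools.items()
        if tools is not None
        for tool in tools
    ]
    if lock["config_fingerprint"] != config_fingerprint(universe) and not allow_mismatch:
        raise ValueError("lock fingerprint does not match target config")
    selected = set(lock["selected"])
    not_patchable = sorted(s for s, tools in config_tools.items() if tools is None)
    return {
        # Any server that cannot be enforced downgrades the status.
        "status": "PARTIAL" if not_patchable else "PASS",
        "disabled": sorted(name for name in universe if name not in selected),
        "not_patchable": not_patchable,
    }
```

Refusing a mismatched lock up front is what prevents a stale lock from disabling every tool while still reporting success.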
Config Secret Audit
config-audit is a read-only hygiene check for MCP config files. It flags
high-confidence literal secrets in env values, args, and nested config fields,
while treating ${TOKEN} references, op://... references, and redacted
placeholders as safe references.
mcp-context-budget config-audit \
--config mcp.json \
--json-out mcp-config-audit.json \
--fail-on high
Reports include the config path, finding path, severity, secret class, length bucket, and a short fingerprint. Literal secret values are never printed.
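A finding that never prints the secret can be built from a length bucket and a short hash prefix. The sketch below shows one way to do that for env values; the regular expressions, bucket boundaries, field names, and `audit_env` helper are assumptions for illustration and are much narrower than a real detector would be.

```python
import hashlib
import re

# Values treated as safe references rather than literal secrets.
SAFE_REFERENCE = re.compile(r"\$\{[A-Za-z_][A-Za-z0-9_]*\}|op://.+|<redacted>")
# Hypothetical high-confidence literal patterns (two token prefixes only).
LITERAL_SECRET = re.compile(r"ghp_[A-Za-z0-9]{20,}|sk-[A-Za-z0-9_-]{20,}")


def length_bucket(value):
    """Report a coarse length range instead of the exact length."""
    if len(value) < 16:
        return "<16"
    if len(value) < 40:
        return "16-39"
    return "40+"


def audit_env(env):
    """Return findings for literal secrets without echoing their values."""
    findings = []
    for name, value in sorted(env.items()):
        if SAFE_REFERENCE.fullmatch(value) or not LITERAL_SECRET.fullmatch(value):
            continue
        findings.append({
            "path": f"env.{name}",
            "severity": "high",
            "length_bucket": length_bucket(value),
            # A short hash prefix lets two reports be compared safely.
            "fingerprint": hashlib.sha256(value.encode("utf-8")).hexdigest()[:8],
        })
    return findings
```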
Docker
docker build -t mcp-context-budget:local .
docker run --rm mcp-context-budget:local demo \
--task "triage a GitHub issue and update one ticket" \
--max-tools 8 \
--max-schema-tokens 6000 \
--max-response-tokens 4000
The image exposes no service port.
v0.2 also includes independent Docker proof commands for the new capabilities:
docker run --rm mcp-context-budget:local semantic-demo \
--task "diagnose bug report" \
--max-tools 3 \
--max-schema-tokens 3000
docker run --rm mcp-context-budget:local compress-demo --max-response-tokens 4000
docker run --rm mcp-context-budget:local config-demo
docker run --rm mcp-context-budget:local allow-start-demo --start-timeout-seconds 2 --max-stdio-bytes 65536 --stdio-framing auto
docker run --rm mcp-context-budget:local semantic-demo --task "diagnose bug report" --max-tools 3 --max-schema-tokens 3000 --embedding-backend fixture
docker run --rm mcp-context-budget:local prove-parallel-ollama-demo
docker run --rm mcp-context-budget:local config-audit-demo
docker run --rm mcp-context-budget:local config-multiserver-demo
Expected v0.3 proof lines:
LIVE_INTROSPECTION_STATUS=PASS
AFTER_CONFIG_NOT_PATCHABLE=0
CONFIG_AUDIT_STATUS=PASS
CONFIG_AUDIT_SECRET_VALUES_REDACTED=true
CONFIG_MULTISERVER_STATUS=PASS
Out of v0.3
Locked Out
- Live runtime MCP proxy/gateway that intercepts and routes actual tool calls.
- Browser UI.
- Organization-wide background scanner.
- Vendor-specific hosted dashboards.
These are not v0.4 commitments; they break the local-first CLI verifier shape.
Shipped in v0.4
- Parallelized Ollama embeddings for semantic-select when --embedding-backend ollama is explicitly selected (fixture backend remains the default for CI/docker).
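Parallelizing embedding calls is mostly a matter of fanning out I/O-bound requests. The sketch below uses a thread pool with an injectable `embed` callable; the `embed_all` helper and the worker count are assumptions, and the project's own concurrency strategy is not described here.

```python
from concurrent.futures import ThreadPoolExecutor


def embed_all(texts, embed, max_workers=4):
    """Embed every text concurrently and return a text -> vector mapping.

    `embed` is any callable taking one string and returning a vector,
    for example a function that calls a local Ollama server.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, so the zip stays aligned.
        vectors = list(pool.map(embed, texts))
    return dict(zip(texts, vectors))
```

Passing the embedding function in as a parameter lets tests substitute a deterministic stub, which fits the fixture-first posture described above.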
Deferred to v0.5
- Automatic response compression for arbitrary live MCP servers.
- Broader CLI polish.
File details
Details for the file mcp_context_budget-0.4.0.tar.gz.
File metadata
- Download URL: mcp_context_budget-0.4.0.tar.gz
- Upload date:
- Size: 45.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | dc628c4a3a8468f8009a251155ce2341f73ccdb0f60ae10c01eaaa015ffa5c69 |
| MD5 | ec607530bd9c4db75ce33d2746f1d318 |
| BLAKE2b-256 | 8f9a70c512f6479994c09c4f57e8b7ee6ac4389f18c94139239cea8be63c6e44 |
Provenance
The following attestation bundles were made for mcp_context_budget-0.4.0.tar.gz:
Publisher: release.yml on OrionArchitekton/mcp-context-budget
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: mcp_context_budget-0.4.0.tar.gz
  - Subject digest: dc628c4a3a8468f8009a251155ce2341f73ccdb0f60ae10c01eaaa015ffa5c69
- Sigstore transparency entry: 1997738650
- Permalink: OrionArchitekton/mcp-context-budget@a4ba329f26a3e23a9ad2be6f7c7d95b580706eb7
- Branch / Tag: refs/tags/v0.4.0
- Owner: https://github.com/OrionArchitekton
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@a4ba329f26a3e23a9ad2be6f7c7d95b580706eb7
- Trigger Event: push
File details
Details for the file mcp_context_budget-0.4.0-py3-none-any.whl.
File metadata
- Download URL: mcp_context_budget-0.4.0-py3-none-any.whl
- Upload date:
- Size: 40.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ceda20caf239f796e415152396dc9519610fc45e6eb851fd73d9cddbb92c227e |
| MD5 | 0e49dbbcf9ff4ef615cb1371d826037c |
| BLAKE2b-256 | 8eea80a4de47a3183503c4372092cc835267211190c01d4a5a0996fc8e5f708e |
Provenance
The following attestation bundles were made for mcp_context_budget-0.4.0-py3-none-any.whl:
Publisher: release.yml on OrionArchitekton/mcp-context-budget
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: mcp_context_budget-0.4.0-py3-none-any.whl
  - Subject digest: ceda20caf239f796e415152396dc9519610fc45e6eb851fd73d9cddbb92c227e
- Sigstore transparency entry: 1997738914
- Permalink: OrionArchitekton/mcp-context-budget@a4ba329f26a3e23a9ad2be6f7c7d95b580706eb7
- Branch / Tag: refs/tags/v0.4.0
- Owner: https://github.com/OrionArchitekton
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@a4ba329f26a3e23a9ad2be6f7c7d95b580706eb7
- Trigger Event: push