MCP Spine

Context Minifier & State Guard — A local-first MCP middleware proxy that reduces token waste, prevents tool attrition, and eliminates context rot.

MCP Spine sits between your LLM client (Claude Desktop, etc.) and your MCP servers, providing security hardening, intelligent tool routing, schema compression, and file state tracking — all through a single proxy.

Why

LLM agents using MCP tools face three problems:

Token waste — Tool schemas consume thousands of tokens per request. With 40+ tools loaded, you're burning context on JSON schemas before the conversation even starts.
Context rot — In long sessions, LLMs revert to editing old file versions they memorized earlier, silently overwriting your latest changes.
No security boundary — MCP servers run with full access. There's no audit trail, no rate limiting, no secret scrubbing between the LLM and your tools.

MCP Spine solves all three.

Demo

[Screenshot: MCP Spine Doctor]

Runs on Windows, macOS, and Linux. CI tested across all three.

Install

pip install mcp-spine

# With semantic routing (optional)
pip install "mcp-spine[ml]"

Quick Start

# Interactive setup wizard
mcp-spine init

# Or quick default config
mcp-spine init --quick

# Diagnose your setup
mcp-spine doctor --config spine.toml

# Validate config
mcp-spine verify --config spine.toml

# Start the proxy
mcp-spine serve --config spine.toml

Claude Desktop Integration

Replace all your individual MCP server entries with a single Spine entry:

{
  "mcpServers": {
    "spine": {
      "command": "python",
      "args": ["-u", "-m", "spine.cli", "serve", "--config", "/path/to/spine.toml"],
      "cwd": "/path/to/mcp-spine"
    }
  }
}

The -u flag ensures unbuffered stdout, preventing pipe hangs on Windows.

Features

Stage 1: Security Proxy

JSON-RPC message validation and sanitization
Secret scrubbing (AWS keys, GitHub tokens, bearer tokens, private keys, connection strings)
Per-tool and global rate limiting with sliding windows
Path traversal prevention with symlink-aware jail
Command injection guards for server spawning
HMAC-fingerprinted SQLite audit trail
Circuit breakers on failing servers
Declarative security policies from config
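
As an illustration of the sliding-window rate limiting listed above, here is a minimal sketch. The class name, its parameters, and the injectable clock are assumptions made for the example and are not Spine's actual API.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` calls within any rolling `window` seconds (hypothetical helper)."""

    def __init__(self, limit: int, window: float = 60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the behaviour can be tested without sleeping
        self.calls = deque()

    def allow(self) -> bool:
        now = self.clock()
        # Drop timestamps that have slid out of the window.
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()
        if len(self.calls) >= self.limit:
            return False
        self.calls.append(now)
        return True
```

A per-tool limiter could be kept in a dict keyed by tool name, with one extra instance for the global limit.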

Stage 2: Semantic Router

Local vector embeddings using all-MiniLM-L6-v2 (no API calls, no data leaves your machine)
ChromaDB-backed tool indexing
Query-time routing: only the most relevant tools are sent to the LLM
spine_set_context meta-tool for explicit context switching
Keyword overlap + recency boost reranking
Background model loading — tools work immediately, routing activates when ready
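
The reranking step can be pictured with a small sketch that needs no ML dependencies. It assumes the embedding search has already produced a similarity score per tool; the function name, the weights, and the tuple layout are illustrative assumptions, not Spine's implementation.

```python
def rerank(query, candidates, recent_tools, overlap_weight=0.3, recency_boost=0.1):
    """Reorder (tool_name, description, similarity) tuples.

    Adds a bonus for query words that also appear in the tool's name or
    description, plus a flat bonus for tools used recently.
    """
    query_words = set(query.lower().split())
    scored = []
    for name, description, similarity in candidates:
        tool_words = set(f"{name} {description}".lower().replace("_", " ").split())
        overlap = len(query_words & tool_words) / len(query_words) if query_words else 0.0
        score = similarity + overlap_weight * overlap
        if name in recent_tools:
            score += recency_boost
        scored.append((score, name))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored]
```

Truncating the returned list to max_tools entries would then give the set of tools forwarded to the LLM.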

Stage 3: Schema Minification

4 aggression levels (0=off, 1=light, 2=standard, 3=aggressive)
Level 2 achieves 61% token savings on tool schemas
Strips $schema, titles, additionalProperties, parameter descriptions, defaults
Preserves all required fields and type information
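
A rough sketch of what level-based schema pruning might look like follows. The mapping of keys to levels is an assumption made for illustration (the list above does not say which level strips what), and `minify` is a hypothetical name.

```python
# Assumed mapping of aggression level to dropped JSON Schema keywords.
DROP_KEYS_BY_LEVEL = {
    0: set(),
    1: {"$schema", "title"},
    2: {"$schema", "title", "additionalProperties", "default"},
    3: {"$schema", "title", "additionalProperties", "default", "description"},
}


def minify(schema, level=2):
    """Recursively drop noisy keywords while keeping types and required fields."""
    drop = DROP_KEYS_BY_LEVEL[level]
    if isinstance(schema, list):
        return [minify(item, level) for item in schema]
    if not isinstance(schema, dict):
        return schema
    out = {}
    for key, value in schema.items():
        if key in drop:
            continue
        if key == "properties" and isinstance(value, dict):
            # Property names are user data, not schema keywords: keep them all,
            # even if a parameter happens to be called "title" or "default".
            out[key] = {name: minify(sub, level) for name, sub in value.items()}
        else:
            out[key] = minify(value, level)
    return out
```

The special case for "properties" matters: without it, a parameter literally named "title" would be deleted along with the title keyword.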

Stage 4: State Guard

Watches project files via watchfiles
Maintains SHA-256 manifest with monotonic versioning
Injects compact state pins into tool responses
Prevents LLMs from editing stale file versions
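
The manifest idea can be sketched in a few lines. This is an in-memory illustration only; the class name and the pin format are assumptions, and the real state guard is driven by watchfiles rather than explicit calls.

```python
import hashlib


class Manifest:
    """Track file content hashes with a version counter that only increases."""

    def __init__(self):
        self.version = 0
        self.entries = {}  # path -> (sha256 hex digest, version when last changed)

    def update(self, path: str, content: bytes) -> bool:
        """Record new content; return True only if the file actually changed."""
        digest = hashlib.sha256(content).hexdigest()
        known = self.entries.get(path)
        if known and known[0] == digest:
            return False
        self.version += 1
        self.entries[path] = (digest, self.version)
        return True

    def pin(self, path: str) -> str:
        """Build a compact marker (hypothetical format) to append to tool responses."""
        digest, version = self.entries[path]
        return f"[state {path} v{version} sha256:{digest[:8]}]"
```

Because the pin travels with every tool response, the LLM sees the current version of a file each turn rather than relying on a copy it read earlier.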

Human-in-the-Loop

require_confirmation policy flag for destructive tools
Spine intercepts the call, shows the arguments, and waits for user approval
spine_confirm / spine_deny meta-tools for the LLM to relay the decision
Per-tool granularity via glob patterns
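
One way to picture the confirmation flow is a small gate that parks matching calls until a decision arrives. The class, method names, and ticket format below are hypothetical; only the glob matching and the confirm/deny split come from the description above.

```python
import uuid
from fnmatch import fnmatchcase


class ConfirmationGate:
    """Hold calls to tools matching a glob pattern until they are approved."""

    def __init__(self, patterns):
        self.patterns = patterns
        self.pending = {}  # ticket id -> (tool name, arguments)

    def intercept(self, tool, args):
        """Return a ticket id if the call must wait for approval, else None."""
        if not any(fnmatchcase(tool, pattern) for pattern in self.patterns):
            return None
        ticket = uuid.uuid4().hex[:8]
        self.pending[ticket] = (tool, args)
        return ticket

    def confirm(self, ticket):
        """Release the parked call so the proxy can forward it."""
        return self.pending.pop(ticket)

    def deny(self, ticket):
        """Discard the parked call."""
        self.pending.pop(ticket)
```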

Tool Output Memory

Ring buffer caching last 50 tool results
Deduplication by tool name + argument hash
TTL expiration (1 hour default)
spine_recall meta-tool to query cached results
Prevents context loss when semantic router swaps tools between turns
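
The combination of a bounded buffer, deduplication, and TTL can be sketched with an OrderedDict. Names and the injectable clock are assumptions for the example; the defaults mirror the numbers quoted above.

```python
import hashlib
import json
import time
from collections import OrderedDict


class ToolMemory:
    """Bounded cache of tool results, keyed by tool name plus argument hash."""

    def __init__(self, capacity=50, ttl=3600.0, clock=time.monotonic):
        self.capacity = capacity
        self.ttl = ttl
        self.clock = clock
        self.items = OrderedDict()  # key -> (stored_at, result)

    @staticmethod
    def key(tool, args):
        blob = json.dumps(args, sort_keys=True).encode()
        return f"{tool}:{hashlib.sha256(blob).hexdigest()}"

    def store(self, tool, args, result):
        k = self.key(tool, args)
        self.items.pop(k, None)  # dedup: a repeated call replaces the old entry
        self.items[k] = (self.clock(), result)
        while len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the oldest entry

    def recall(self, tool, args):
        k = self.key(tool, args)
        entry = self.items.get(k)
        if entry is None:
            return None
        stored_at, result = entry
        if self.clock() - stored_at > self.ttl:
            del self.items[k]  # expired
            return None
        return result
```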

Token Budget

Daily token consumption tracking across all tool calls
Configurable daily limit with warn/block actions
Persistent SQLite storage (survives restarts within the same day)
Automatic midnight rollover
spine_budget meta-tool to check usage mid-conversation
Token estimation via character-count heuristic (~4 chars/token)
Non-blocking: budget failures never crash the proxy
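
To make the heuristic and the warn/block decision concrete, here is a simplified in-memory sketch. It omits the SQLite persistence, and the class and return values are assumptions; the parameter names follow the [token_budget] config keys.

```python
import datetime as dt


def estimate_tokens(text: str) -> int:
    """Rough estimate at about 4 characters per token, rounded up."""
    return (len(text) + 3) // 4


class DailyBudget:
    def __init__(self, daily_limit, warn_at=0.8, action="warn"):
        self.daily_limit = daily_limit  # 0 means unlimited
        self.warn_at = warn_at
        self.action = action            # "warn" or "block"
        self.day = None
        self.used = 0

    def record(self, text: str, today: dt.date) -> str:
        """Add usage for `today` and return "ok", "warn", or "block"."""
        if today != self.day:
            self.day, self.used = today, 0  # midnight rollover
        self.used += estimate_tokens(text)
        if self.daily_limit == 0:
            return "ok"
        if self.used >= self.daily_limit:
            return "block" if self.action == "block" else "warn"
        if self.used >= self.warn_at * self.daily_limit:
            return "warn"
        return "ok"
```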

Config Hot-Reload

Edit spine.toml while Spine is running — changes apply in seconds
Hot-reloadable: minifier level, rate limits, security policies, token budget, state guard patterns
Non-reloadable (requires restart): server list, commands, audit DB path
SHA-256 polling with 2-second interval
All reloads logged to the audit trail
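
Hash-based polling, as described above, avoids false reloads when a file is saved without changes. A minimal sketch, assuming a hypothetical `ConfigWatcher` class that the proxy would call every 2 seconds:

```python
import hashlib
from pathlib import Path


class ConfigWatcher:
    """Detect content changes in a config file by comparing SHA-256 digests."""

    def __init__(self, path):
        self.path = Path(path)
        self.digest = self._hash()

    def _hash(self) -> str:
        return hashlib.sha256(self.path.read_bytes()).hexdigest()

    def changed(self) -> bool:
        """Return True once per content change; intended for a periodic poll loop."""
        current = self._hash()
        if current == self.digest:
            return False
        self.digest = current
        return True
```

Comparing digests rather than modification times means that touching the file or rewriting identical content does not trigger a reload.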

Plugin System

Drop-in Python plugins that hook into the tool call pipeline
Four hook points: on_tool_call, on_tool_response, on_tool_list, and the lifecycle pair on_startup/on_shutdown
Plugins can transform arguments, filter responses, block calls, or hide tools
Plugin chaining — multiple plugins run in sequence
Allow/deny lists for plugin access control
Auto-discovery from a configurable plugins directory
Example included: Slack channel compliance filter
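
Plugin chaining can be illustrated as follows. The hook names on_tool_call and on_tool_response come from the list above, but the base class, the method signatures, and the `BlockedCall` exception are assumptions; check the bundled Slack example for the real interface.

```python
class BlockedCall(Exception):
    """Raised by a plugin to stop a tool call (hypothetical)."""


class Plugin:
    """Assumed base class: each hook returns the possibly modified value."""

    def on_tool_call(self, tool, args):
        return args

    def on_tool_response(self, tool, response):
        return response


class DefaultLimit(Plugin):
    def on_tool_call(self, tool, args):
        # Transform arguments: add a default without overriding the caller.
        return {"limit": 20, **args}


class NoDeletes(Plugin):
    def on_tool_call(self, tool, args):
        if tool.startswith("delete_"):
            raise BlockedCall(tool)
        return args


def run_call_hooks(plugins, tool, args):
    """Run plugins in sequence; each one sees the previous plugin's output."""
    for plugin in plugins:
        args = plugin.on_tool_call(tool, args)
    return args
```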

SSE Transport

Connect to remote MCP servers over HTTP/SSE alongside local stdio servers
No external dependencies (uses stdlib urllib)
Supports custom headers for authentication
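
For readers unfamiliar with the SSE wire format, the sketch below parses a stream of text lines into (event, data) pairs using only the standard library. It is a simplified reading of the Server-Sent Events format (it ignores the id and retry fields) and is not taken from Spine's sse_client.py; `parse_sse` is a hypothetical name.

```python
def parse_sse(lines):
    """Yield (event_name, data) for each event in an iterable of text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            payload = "\n".join(data)
            if payload:
                yield event, payload
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            field, _, value = line.partition(":")
            value = value.removeprefix(" ")  # one optional space after the colon
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)
```

In practice the lines would come from decoding a streaming HTTP response, for example one opened with urllib.request.urlopen.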

Diagnostics

# Check your setup
mcp-spine doctor --config spine.toml

# Live monitoring dashboard
mcp-spine dashboard

# Usage analytics (includes token budget)
mcp-spine analytics --hours 24

# Query audit log
mcp-spine audit --last 50
mcp-spine audit --security-only
mcp-spine audit --tool write_file

Example Config

[spine]
log_level = "info"
audit_db = "spine_audit.db"

# Add as many servers as you need — they start concurrently
[[servers]]
name = "filesystem"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/project"]
timeout_seconds = 120

[[servers]]
name = "github"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-github"]
env = { GITHUB_PERSONAL_ACCESS_TOKEN = "ghp_..." }
timeout_seconds = 180

[[servers]]
name = "sqlite"
command = "uvx"
args = ["mcp-server-sqlite", "--db-path", "/path/to/database.db"]
timeout_seconds = 60

[[servers]]
name = "memory"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-memory"]
timeout_seconds = 60

[[servers]]
name = "brave-search"
command = "node"
args = ["/path/to/node_modules/@modelcontextprotocol/server-brave-search/dist/index.js"]
env = { BRAVE_API_KEY = "your_key" }
timeout_seconds = 60

# Remote server via SSE
# [[servers]]
# name = "remote-tools"
# transport = "sse"
# url = "https://your-server.com/sse"
# headers = { Authorization = "Bearer token" }
# timeout_seconds = 30

# Semantic routing
[routing]
max_tools = 15
rerank = true

# Schema minification — 61% token savings at level 2
[minifier]
level = 2

# Token budget — track and limit daily token spend
[token_budget]
daily_limit = 500000    # tokens per day (0 = unlimited)
warn_at = 0.8           # warn at 80% usage
action = "warn"         # "warn" = log warning, "block" = reject tool calls

# State guard — prevent context rot
[state_guard]
enabled = true
watch_paths = ["/path/to/project"]

# Plugins — custom middleware hooks
[plugins]
enabled = true
directory = "plugins"
# allow_list = ["slack-filter"]  # optional allowlist
# deny_list = ["debug-plugin"]  # optional denylist

# Human-in-the-loop for destructive tools
[[security.tools]]
pattern = "write_file"
action = "allow"
require_confirmation = true

[[security.tools]]
pattern = "write_query"
action = "allow"
require_confirmation = true

# Security
[security]
scrub_secrets_in_logs = true
audit_all_tool_calls = true
global_rate_limit = 120
per_tool_rate_limit = 60

[security.path]
allowed_roots = ["/path/to/project"]
denied_patterns = ["**/.env", "**/*.key", "**/*.pem"]
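
The [security.path] settings above can be read as a two-step check: resolve the path, then test it against the roots and the deny patterns. The sketch below is an assumption about how such a jail could work, not Spine's code; note that fnmatch treats `**` the same as `*`, which is enough for these patterns.

```python
from fnmatch import fnmatchcase
from pathlib import Path


def is_allowed(path, allowed_roots, denied_patterns) -> bool:
    """Return True if `path` resolves inside a root and matches no deny pattern."""
    # resolve() follows symlinks and collapses "..", so a link or relative
    # path that points outside the jail is judged by its real target.
    real = Path(path).resolve()
    inside = any(real.is_relative_to(Path(root).resolve()) for root in allowed_roots)
    if not inside:
        return False
    return not any(fnmatchcase(real.as_posix(), pattern) for pattern in denied_patterns)
```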

Security Model

Defense-in-depth — every layer assumes the others might fail.

Threat	Mitigation
Prompt injection via tool args	Input validation, tool name allowlists
Path traversal	Symlink-aware jail to `allowed_roots`
Secret leakage	Automatic scrubbing of AWS keys, tokens, private keys
Runaway agent loops	Per-tool + global rate limiting
Command injection	Command allowlist, shell metacharacter blocking
Denial of service	Message size limits, circuit breakers
Sensitive file access	Deny-list patterns for `.env`, `.key`, `.pem`, `.ssh/`
Tool abuse	Policy-based blocking, audit logging, HITL confirmation
Log tampering	HMAC fingerprints on every audit entry
Destructive operations	`require_confirmation` pauses for user approval
Runaway token spend	Daily budget limits with warn/block enforcement
Unvetted plugins	Allow/deny lists, directory isolation, audit logging
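
As an example of the log-tampering row, an audit entry can be fingerprinted with HMAC-SHA256 so that any later edit is detectable by whoever holds the key. The function names and the canonical JSON encoding are assumptions for this sketch.

```python
import hashlib
import hmac
import json


def fingerprint(entry: dict, key: bytes) -> str:
    """Compute an HMAC-SHA256 tag over a canonical JSON encoding of the entry."""
    payload = json.dumps(entry, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify(entry: dict, tag: str, key: bytes) -> bool:
    """Check a stored tag in constant time."""
    return hmac.compare_digest(fingerprint(entry, key), tag)
```

Sorting keys before hashing keeps the tag stable regardless of the order in which fields were written.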

Architecture

Client ◄──stdio──► MCP Spine ◄──stdio──► Filesystem Server
                       │      ◄──stdio──► GitHub Server
                       │      ◄──stdio──► SQLite Server
                       │      ◄──stdio──► Memory Server
                       │      ◄──stdio──► Brave Search
                       │      ◄──SSE────► Remote Server
                   ┌───┴───┐
                   │SecPol │  ← Rate limits, path jail, secret scrub
                   │Router │  ← Semantic routing (local embeddings)
                   │Minify │  ← Schema compression (61% savings)
                   │Guard  │  ← File state pinning (SHA-256)
                   │HITL   │  ← Human-in-the-loop confirmation
                   │Memory │  ← Tool output cache
                   │Budget │  ← Daily token tracking + limits
                   │Plugin │  ← Custom middleware hooks
                   └───────┘

Startup Sequence

Instant handshake (~2ms): responds to initialize immediately
Concurrent server startup: all servers connect in parallel via asyncio.gather
Progressive readiness: tools are available as soon as any server connects
Late server notification: notifications/tools/list_changed is sent when slow servers finish
Background ML loading: the semantic router activates silently once the model loads
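
The concurrent startup with progressive readiness can be sketched with asyncio.gather. Here `connect` is a stand-in that sleeps instead of spawning a real server, and the server names and delays are made up for the demonstration.

```python
import asyncio


async def connect(name, delay, ready):
    await asyncio.sleep(delay)  # stand-in for spawning and handshaking a server
    ready.append(name)          # tools from this server are usable from now on
    return name


async def start_all(servers):
    ready = []
    results = await asyncio.gather(
        *(connect(name, delay, ready) for name, delay in servers.items()),
        return_exceptions=True,  # one failing server should not cancel the rest
    )
    return ready, results


ready, results = asyncio.run(
    start_all({"github": 0.1, "filesystem": 0.0, "memory": 0.05})
)
```

`ready` fills in completion order (fastest server first), while `results` keeps the order in which the servers were listed.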

Windows Support

Battle-tested on Windows with specific hardening for:

MSIX sandbox paths for Claude Desktop config and logs
npx.cmd resolution via shutil.which()
Paths with spaces (C:\Users\John Doe\) and parentheses (C:\Program Files (x86)\)
PureWindowsPath for cross-platform basename extraction
Environment variable merging (config env extends, not replaces, system env)
UTF-8 encoding without BOM
Unbuffered stdout (-u flag) to prevent pipe hangs
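
Two of the items above are easy to show in isolation: extracting a command's basename from a Windows-style path on any host OS, and merging config env vars on top of the system environment. The helper names are hypothetical.

```python
import os
from pathlib import PureWindowsPath


def command_basename(command: str) -> str:
    # PureWindowsPath splits on both "\\" and "/", so this works for Windows
    # and POSIX-style paths regardless of the platform running the code.
    return PureWindowsPath(command).name


def merged_env(config_env: dict, base=None) -> dict:
    """Extend the system environment with config values instead of replacing it."""
    env = dict(os.environ if base is None else base)
    env.update(config_env)
    return env
```

Spaces and parentheses in the path need no special handling here because the path is never passed through a shell.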

Project Structure

mcp-spine/
├── pyproject.toml
├── spine/
│   ├── cli.py              # Click CLI (init, serve, verify, audit, dashboard, analytics, doctor)
│   ├── config.py           # TOML config loader with validation
│   ├── proxy.py            # Core proxy event loop
│   ├── protocol.py         # JSON-RPC message handling
│   ├── transport.py        # Server pool, circuit breakers, concurrent startup
│   ├── audit.py            # Structured logging + SQLite audit trail
│   ├── router.py           # Semantic routing (ChromaDB + sentence-transformers)
│   ├── minifier.py         # Schema pruning (4 aggression levels)
│   ├── state_guard.py      # File watcher + SHA-256 manifest + pin injection
│   ├── memory.py           # Tool output cache (ring buffer + dedup + TTL)
│   ├── budget.py           # Token budget tracker (daily limits + persistence)
│   ├── plugins.py          # Plugin system (hooks, discovery, chaining)
│   ├── dashboard.py        # Live TUI dashboard (Rich)
│   ├── sse_client.py       # SSE transport client for remote servers
│   └── security/
│       ├── secrets.py      # Credential detection & scrubbing
│       ├── paths.py        # Path traversal jail
│       ├── validation.py   # JSON-RPC message validation
│       ├── commands.py     # Server spawn guards
│       ├── rate_limit.py   # Sliding window throttling
│       ├── integrity.py    # SHA-256 + HMAC fingerprints
│       ├── env.py          # Fail-closed env var resolution
│       └── policy.py       # Declarative security policies
├── tests/
│   ├── test_security.py    # Security tests
│   ├── test_config.py      # Config validation tests
│   ├── test_minifier.py    # Schema minification tests
│   ├── test_state_guard.py # State guard tests
│   ├── test_proxy_features.py  # HITL, dashboard, analytics tests
│   ├── test_memory.py      # Tool output memory tests
│   └── test_budget.py      # Token budget tracker tests
├── examples/
│   └── slack_filter.py     # Example: Slack compliance filter plugin
├── configs/
│   └── example.spine.toml  # Complete reference config
└── .github/
    └── workflows/
        └── ci.yml          # GitHub Actions: test + lint + publish

Tests

pytest tests/ -v

160+ tests covering security, config validation, schema minification, state guard, HITL policies, dashboard queries, analytics, tool memory, token budget tracking, plugin system, and Windows path edge cases.

CI runs on every push: Windows + Linux, Python 3.11/3.12/3.13.

License

MIT

Release history

0.2.5 (Apr 25, 2026)
0.2.4 (Apr 19, 2026)
0.2.3 (Apr 19, 2026, this version)
0.2.2 (Apr 18, 2026)
0.2.1 (Apr 16, 2026)
0.2.0 (Apr 11, 2026)
0.1.0 (Apr 11, 2026)

Download files

Source distribution: mcp_spine-0.2.3.tar.gz (82.3 kB), uploaded Apr 19, 2026
Built distribution: mcp_spine-0.2.3-py3-none-any.whl (76.1 kB), uploaded Apr 19, 2026, Python 3

File details

Details for the file mcp_spine-0.2.3.tar.gz.

File metadata

Download URL: mcp_spine-0.2.3.tar.gz
Upload date: Apr 19, 2026
Size: 82.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for mcp_spine-0.2.3.tar.gz
Algorithm	Hash digest
SHA256	`6327ed09e903553173ae34d8f9d9474d6a3dcf95fd38afc3b745c6de215a6ea9`
MD5	`298bd177a3374c699f10483b7933df98`
BLAKE2b-256	`8a908a5bbf29e6e94ec5f379739d2229de8a26d2a7d1c6b5380e081005abb946`

File details

Details for the file mcp_spine-0.2.3-py3-none-any.whl.

File metadata

Download URL: mcp_spine-0.2.3-py3-none-any.whl
Upload date: Apr 19, 2026
Size: 76.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for mcp_spine-0.2.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`7e5dd5bc9f36992b78beb120902e64c4c54ef63b444b1d645584205b3503a9cb`
MD5	`a3ce4f32051d47155c5472c174d3351a`
BLAKE2b-256	`2f85a6cc87da6e9ff30fc9a5c2ea89702b1f6af5cca538dd2c0c5a6486f72b3e`
