Observability and APM toolkit for MCP servers

MCP Observatory

MCP Observatory now includes a two-phase execution pattern for high-risk MCP tool calls, with a generic proposer/verifier wrapper that can be reused by any tool:

  1. PROPOSE: plan/simulate, evaluate uncertainty/integrity, no side effects.
  2. COMMIT: execute side effects only when a signed commit token is valid.

Two-Phase Sequence (Text Diagram)

Client
  -> transfer_funds_propose(amount,to)
      -> scoring(output_instability, numeric_variance, prompt_drift)
      -> decision:
          - blocked: deterministic fallback (create_draft), no side effects
          - allowed: issue signed commit_token bound to tool args hash
  <- {proposal_id, commit_token?}

Client
  -> transfer_funds_commit(proposal_id, commit_token, amount, to)
      -> verify signature + expiry + proposal existence + args_hash binding + nonce replay
      -> if valid: perform side effect (funds transfer)
      -> else: block with explicit reason
  <- commit outcome
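
The signed commit_token in the sequence above can be illustrated with a minimal sketch using only the standard library. The payload fields follow the list under New Modules; the token layout (base64 body, a dot, base64 signature), the SECRET constant, and the function names issue_token and verify_token are assumptions made for illustration, not the package's actual API.

```python
import base64
import hashlib
import hmac
import json
import secrets
import time

SECRET = b"demo-secret"  # hypothetical key; a real deployment would load this from secure config


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).decode("ascii")


def issue_token(proposal_id, tool_name, tool_args_hash, composite_score, ttl_s=60, now=None):
    """Build the payload, serialize it canonically, and sign it with HMAC-SHA256."""
    now = time.time() if now is None else now
    payload = {
        "token_id": secrets.token_hex(8),
        "proposal_id": proposal_id,
        "tool_name": tool_name,
        "tool_args_hash": tool_args_hash,
        "issued_at": now,
        "expires_at": now + ttl_s,
        "nonce": secrets.token_hex(16),
        "composite_score": composite_score,
    }
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    sig = hmac.new(SECRET, body, hashlib.sha256).digest()
    return f"{_b64(body)}.{_b64(sig)}"


def verify_token(token, now=None):
    """Return (payload, None) on success or (None, reason) on failure."""
    now = time.time() if now is None else now
    try:
        body_b64, sig_b64 = token.split(".")
        body = base64.urlsafe_b64decode(body_b64)
        sig = base64.urlsafe_b64decode(sig_b64)
    except ValueError:  # wrong number of parts or invalid base64
        return None, "bad_signature"
    expected = hmac.new(SECRET, body, hashlib.sha256).digest()
    if not hmac.compare_digest(sig, expected):  # constant-time comparison
        return None, "bad_signature"
    payload = json.loads(body)
    if now >= payload["expires_at"]:
        return None, "expired"
    return payload, None
```

The signature is checked before the body is parsed, so an unauthenticated payload is never interpreted. The remaining commit checks (proposal existence, args hash binding, nonce replay) need server-side state and are outside this sketch.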

New Modules

  • mcp_observatory/proposal_commit/hashing.py
    • canonical JSON hashing for stable tool_args_hash
    • normalized prompt_hash
  • mcp_observatory/proposal_commit/scoring.py
    • output_instability = 1 - jaccard_similarity
    • numeric_variance from extracted numbers
    • prompt_drift from prompt hash vs baseline
    • weighted renormalized composite_score
    • demo model_generate(prompt, temperature) stub
  • mcp_observatory/proposal_commit/token.py
    • HMAC-SHA256 token issue/verify
    • payload fields: token_id, proposal_id, tool_name, tool_args_hash, issued_at, expires_at, nonce, composite_score
  • mcp_observatory/proposal_commit/proposer.py
    • generic ToolProposer.propose(...) for any tool name/args
    • deterministic blocked fallback
  • mcp_observatory/proposal_commit/verifier.py
    • commit verification and nonce replay protection
  • mcp_observatory/proposal_commit/storage.py
    • in-memory storage fallback
    • optional Postgres storage via asyncpg
  • mcp_observatory/demo/server.py
    • MCP-like tools:
      • transfer_funds_propose
      • transfer_funds_commit
  • mcp_observatory/demo/run_demo.py
    • propose -> commit -> replay-attempt demo
  • sql/schema.sql
    • Postgres tables: proposals, commits, nonces, tool_prompt_baselines
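
As a rough illustration of what the hashing and scoring modules describe, the sketch below shows one plausible way to compute each quantity. The function names, the whitespace tokenization, the prompt normalization (lowercasing and collapsing whitespace), and the reading of "weighted renormalized" as a weighted mean over the signals that are present are all assumptions; the package's real implementation may differ.

```python
import hashlib
import json
import re
import statistics


def canonical_hash(obj) -> str:
    """SHA-256 over JSON with sorted keys and fixed separators, so key order does not matter."""
    encoded = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=True)
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


def prompt_hash(prompt: str) -> str:
    """Hash of the prompt after lowercasing and collapsing whitespace (assumed normalization)."""
    normalized = " ".join(prompt.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def output_instability(a: str, b: str) -> float:
    """1 - Jaccard similarity over whitespace-separated, lowercased tokens."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 0.0
    return 1.0 - len(sa & sb) / len(sa | sb)


def numeric_variance(outputs) -> float:
    """Population variance of every number extracted from the outputs."""
    numbers = [float(n) for text in outputs for n in re.findall(r"-?\d+(?:\.\d+)?", text)]
    return statistics.pvariance(numbers) if len(numbers) > 1 else 0.0


def prompt_drift(prompt: str, baseline_hash: str) -> float:
    """0.0 when the normalized prompt matches the baseline hash, else 1.0."""
    return 0.0 if prompt_hash(prompt) == baseline_hash else 1.0


def composite_score(signals: dict, weights: dict) -> float:
    """Weighted mean over available signals; weights are renormalized to sum to 1."""
    present = {k: v for k, v in signals.items() if v is not None and k in weights}
    total = sum(weights[k] for k in present)
    if total == 0:
        return 0.0
    return sum(weights[k] * v for k, v in present.items()) / total
```

Renormalizing over the signals that are present keeps the composite on the same scale when one signal is unavailable, for example when no prompt baseline has been recorded yet.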

Security / Verification Rules

Commit verifies all of the following:

  • token signature is valid (bad_signature on failure)
  • token not expired (expired)
  • proposal exists and was allowed (unknown_proposal)
  • commit args hash equals token payload args hash (args_hash_mismatch)
  • nonce has not already been used (nonce_replay)
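
The checks above can be sketched as a single function that returns the first failing reason code. This is a minimal sketch under stated assumptions: verify_commit and its parameters are hypothetical names, the signature check is passed in as a boolean, proposals is a plain dict, used_nonces is a set, and the order of the checks simply follows the list.

```python
def verify_commit(payload, signature_ok, commit_args_hash, proposals, used_nonces, now):
    """Return (True, None) if every check passes, else (False, reason)."""
    if not signature_ok:
        return False, "bad_signature"
    if now >= payload["expires_at"]:
        return False, "expired"
    proposal = proposals.get(payload["proposal_id"])
    if proposal is None or proposal.get("status") != "allowed":
        return False, "unknown_proposal"
    if commit_args_hash != payload["tool_args_hash"]:
        return False, "args_hash_mismatch"
    if payload["nonce"] in used_nonces:
        return False, "nonce_replay"
    used_nonces.add(payload["nonce"])  # consume the nonce only after all checks pass
    return True, None


# Usage: the first commit succeeds, an identical second commit is rejected as a replay.
payload = {"proposal_id": "p1", "tool_args_hash": "h1", "nonce": "n1", "expires_at": 100.0}
proposals = {"p1": {"status": "allowed"}}
used = set()
first = verify_commit(payload, True, "h1", proposals, used, now=50.0)
second = verify_commit(payload, True, "h1", proposals, used, now=50.0)
```

Recording the nonce last means a commit rejected for another reason does not burn the token. With a shared database, the membership test and the insert would need to be a single atomic operation.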

Deterministic Fallback on Proposal Block

Blocked proposal response is deterministic and side-effect free:

{
  "status": "blocked",
  "action": "create_draft",
  "reason": "low_integrity",
  "draft": {"tool": "transfer_funds", "amount": 100, "to": "acct_123"}
}
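
A response of this shape could be built by a pure function of the tool name and arguments, which is what makes it deterministic. The helper name blocked_response and its signature are hypothetical, chosen for illustration.

```python
import json


def blocked_response(tool: str, args: dict, reason: str = "low_integrity") -> dict:
    """Build the fallback payload from the inputs only: no clock, randomness, or I/O."""
    return {
        "status": "blocked",
        "action": "create_draft",
        "reason": reason,
        "draft": {"tool": tool, **args},
    }


# Usage: identical inputs always serialize to identical JSON.
a = blocked_response("transfer_funds", {"amount": 100, "to": "acct_123"})
b = blocked_response("transfer_funds", {"amount": 100, "to": "acct_123"})
same = json.dumps(a, sort_keys=True) == json.dumps(b, sort_keys=True)
```

Because nothing in the payload depends on time or random values, callers can compare, cache, or snapshot-test blocked responses directly.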

Running the Demo

Without Postgres (default)

No environment variables are needed; the in-memory store is used.

python -m mcp_observatory.demo.run_demo

With Postgres

  1. Set DSN:
export MCP_OBSERVATORY_PG_DSN='postgresql://user:pass@localhost:5432/postgres'
  2. Apply schema:
psql "$MCP_OBSERVATORY_PG_DSN" -f sql/schema.sql
  3. Run demo:
python -m mcp_observatory.demo.run_demo
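
The choice between the two storage modes can be pictured as a check on the MCP_OBSERVATORY_PG_DSN variable. The sketch below is an assumption about how such a fallback might be wired, not the package's code: select_backend and InMemoryStore are hypothetical names, and the Postgres side is represented only by a label so that the example runs without asyncpg or a database.

```python
import os


class InMemoryStore:
    """Hypothetical in-process store: proposals in a dict, used nonces in a set."""

    def __init__(self):
        self.proposals = {}
        self.used_nonces = set()

    def consume_nonce(self, nonce: str) -> bool:
        """Return True the first time a nonce is seen, False on any repeat."""
        if nonce in self.used_nonces:
            return False
        self.used_nonces.add(nonce)
        return True


def select_backend(env=None) -> str:
    """Pick 'postgres' when a DSN is configured, otherwise fall back to 'memory'."""
    env = os.environ if env is None else env
    return "postgres" if env.get("MCP_OBSERVATORY_PG_DSN") else "memory"


store = InMemoryStore()
first_use = store.consume_nonce("n1")   # True
second_use = store.consume_nonce("n1")  # False
```

An in-memory store loses proposals and nonces on restart, so replay protection only spans a single process lifetime in that mode.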

Testing

PYTHONPATH=. pytest -q

The suite includes tests for token verification, hash stability, replay protection, and expired-token rejection.

Real-World MCP Scenario Demo (10 End-to-End Flows)

A realistic MCP server example is available at:

  • mcp_observatory/demo/real_world_server.py
  • executable shim: examples/real_world_mcp_server.py
  • executable client: examples/real_world_mcp_client.py
  • prompt-to-invocation MVP: examples/prompt_to_mcp_invocation_mvp.py
  • OpenAI GPT utility: examples/openai_prompt_to_mcp_invocation.py

It includes:

  • 10 distinct prompts mapped to 10 different MCP tool handlers
  • per-invocation annotations (e.g. destructiveHint, idempotentHint, openWorldHint)
  • proposal/commit execution for HIGH-risk tools, with no secondary-response gating
  • irreversible actions are never routed through a secondary LLM response
  • simulated LLM responses and grounding summaries for standard-risk tools
  • deterministic fallback routing for blocked/review-required scenarios
  • a prompt-to-invocation MVP that extracts required tool parameters from the user prompt before invoking the server
  • an OpenAI GPT utility for service selection, parameter extraction, and MCP invocation
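
One way to picture how the annotations could drive the routing described above is a small classifier over the hint fields. The source does not say how the demo derives risk levels, so the rule below is purely an assumption: classify_risk, route, and the route labels are hypothetical, and a missing destructiveHint is treated conservatively as destructive.

```python
def classify_risk(annotations: dict) -> str:
    """Assumed rule: read-only tools are STANDARD; destructive (or unlabeled) tools are HIGH."""
    if annotations.get("readOnlyHint", False):
        return "STANDARD"
    if annotations.get("destructiveHint", True):  # conservative default when the hint is absent
        return "HIGH"
    return "STANDARD"


def route(annotations: dict) -> str:
    """HIGH-risk tools go through proposal/commit; everything else takes the standard path."""
    return "proposal_commit" if classify_risk(annotations) == "HIGH" else "standard"


# Usage with hypothetical tool annotations.
transfer = route({"destructiveHint": True, "idempotentHint": False})
lookup = route({"readOnlyHint": True, "openWorldHint": False})
```

Defaulting unlabeled tools to the HIGH path errs toward requiring a signed commit when a tool author has not declared the tool safe.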

Run server demo:

python examples/real_world_mcp_server.py

Run client demo (client interacting with server):

python examples/real_world_mcp_client.py

Run prompt -> LLM planner -> server invocation MVP:

python examples/prompt_to_mcp_invocation_mvp.py

Run OpenAI GPT utility (manual, requires OPENAI_API_KEY):

python examples/openai_prompt_to_mcp_invocation.py

Optional (recommended): install the OpenAI SDK for first-class client support. The client auto-detects the SDK and uses it when available; otherwise it falls back to HTTPS.

pip install -e ".[openai]"
