ptm-mcp

MCP (Model Context Protocol) stdio server for the Prompt Test Manager API.

Lets agents built on top of MCP-capable clients (Claude Desktop, Claude Code, Codex, Goose) call PTM as first-class tools: list prompts, run evaluations, submit optimizations. All traffic is tagged with X-PTM-Client: ptm-mcp/<version> + a per-process X-PTM-MCP-Session UUID so the PTM backend can rate-limit, budget, and audit agent traffic separately from humans and service accounts.
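As a rough illustration of the tagging described above, the sketch below builds both headers once per process. Only the header names and the ptm-mcp/<version> format come from this README; the function name build_agent_headers, the module-level _SESSION_ID, and the use of an Authorization: Bearer header are assumptions made for the example, not the actual ptm-mcp implementation.

```python
import uuid

# One UUID per process: generated at import time and reused for every request.
# (Assumption: the real server may create it elsewhere, e.g. during startup.)
_SESSION_ID = str(uuid.uuid4())


def build_agent_headers(version: str, token: str) -> dict[str, str]:
    """Return headers that mark a request as ptm-mcp agent traffic."""
    return {
        # Assumed bearer scheme; the README only says the token is a "PTM bearer".
        "Authorization": f"Bearer {token}",
        "X-PTM-Client": f"ptm-mcp/{version}",
        "X-PTM-MCP-Session": _SESSION_ID,
    }


headers = build_agent_headers("0.2.0", "ptm_u_example")
```

Because the session ID is fixed for the lifetime of the process, every request from one MCP server instance carries the same X-PTM-MCP-Session value, which is what lets a backend group requests per agent session.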

Quickstart

# 1. Mint a PTM token (prompts for your PTM URL + email + password)
bash scripts/mint-ptm-mcp-token.sh     # macOS / Linux
pwsh scripts/mint-ptm-mcp-token.ps1    # Windows

# 2. Wire into your MCP client of choice
bash scripts/install-claude-desktop.sh # Claude Desktop (macOS / Linux)
pwsh scripts/install-claude-desktop.ps1 # Claude Desktop (Windows)

# Claude Code - macOS / Linux
claude mcp add --transport stdio --scope user ptm -- uvx ptm-mcp

# Claude Code - Windows
claude mcp add --transport stdio --scope user ptm -- cmd /c uvx ptm-mcp

# Codex (any OS)
codex mcp add ptm -- uvx ptm-mcp

# Env vars can be set via `--env KEY=VAL` on either command.
# Full setup + token flow: docs/mcp-integration.md

# Uninstall Claude Desktop entry (creates a timestamped backup):
bash scripts/uninstall-claude-desktop.sh       # macOS / Linux
pwsh scripts/uninstall-claude-desktop.ps1      # Windows

Full setup + troubleshooting: docs/mcp-integration.md.

Status

Phase 2 complete and shipped as 0.1.0: 17 tools (canary + 12 read + 4 write), 5 resource URI patterns, read-only gate, live end-to-end integration. 0.2.0 adds 7 library-mutation write tools, for 24 tools in total.

Prereqs

  • Python >= 3.12 in a venv you control.
  • PTM backend >= 1.9.0 (MCP middleware chokepoint landed in 1.9.0; older backends cannot enforce agent-scoped limits).
  • For the stdio smoke test: Node >= 18 (for npx @modelcontextprotocol/inspector) or a global install of the MCP Inspector CLI.

Install

pip install ptm-mcp

Requires Python >= 3.12. ptm-mcp pulls in ptm-client and the mcp SDK automatically. For pinned, runtime-tested versions see pyproject.toml in the release tag.

From source (dev mode)

git clone git@github.com:15five/prompt-test-manager
cd prompt-test-manager
python3.12 -m venv .venv && source .venv/bin/activate
pip install -e packages/ptm-client
pip install -e "packages/ptm-mcp[dev]"

Dev mode is what CI exercises. When developing locally, set PYTHONPATH=packages/ptm-mcp/src:packages/ptm-client/src in your MCP client config so edits in src/ are picked up without reinstalling.

Environment variables

Consumed at startup. Missing required values fail fast with a descriptive error.

| Variable | Required | Default | Notes |
| --- | --- | --- | --- |
| PTM_API_BASE_URL | yes | - | e.g. https://ptm.example.com |
| PTM_API_TOKEN | yes | - | PTM bearer. Service-account tokens preferred. |
| PTM_MCP_READ_ONLY | no | true | Set to false to unlock the write tools (11 as of 0.2.0). |
| PTM_MCP_TIMEOUT_SECONDS | no | 30 | Per-request timeout in seconds (1..600). |
| PTM_MCP_LOG_LEVEL | no | INFO | DEBUG / INFO / WARNING / ERROR / CRITICAL. |
| CF_ACCESS_CLIENT_ID | no | - | Cloudflare Access service-token Client ID. Paired with CF_ACCESS_CLIENT_SECRET. |
| CF_ACCESS_CLIENT_SECRET | no | - | Cloudflare Access service-token Client Secret. Paired with CF_ACCESS_CLIENT_ID. |
| CF_ACCESS_JWT | no | - | Cloudflare Access user JWT (alternative to the service-token pair). |
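To make the fail-fast behaviour concrete, here is a minimal sketch of how these variables could be parsed and validated. The variable names, defaults, and the 1..600 range come from the table above; Settings, ConfigError, load_settings, and the exact parsing rules (for example, treating anything other than "false" as read-only) are hypothetical and may differ from what ptm-mcp actually does.

```python
from dataclasses import dataclass


class ConfigError(ValueError):
    """Raised when a required variable is missing or a value is out of range."""


@dataclass(frozen=True)
class Settings:
    base_url: str
    token: str
    read_only: bool
    timeout_seconds: int
    log_level: str


_LOG_LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"}


def load_settings(env: dict[str, str]) -> Settings:
    """Parse settings from an env mapping, failing fast on bad input."""
    missing = [k for k in ("PTM_API_BASE_URL", "PTM_API_TOKEN") if not env.get(k)]
    if missing:
        raise ConfigError(f"missing required env var(s): {', '.join(missing)}")

    raw_timeout = env.get("PTM_MCP_TIMEOUT_SECONDS", "30")
    try:
        timeout = int(raw_timeout)
    except ValueError:
        raise ConfigError(
            f"PTM_MCP_TIMEOUT_SECONDS must be an integer, got {raw_timeout!r}"
        ) from None
    if not 1 <= timeout <= 600:
        raise ConfigError("PTM_MCP_TIMEOUT_SECONDS must be in 1..600")

    level = env.get("PTM_MCP_LOG_LEVEL", "INFO").upper()
    if level not in _LOG_LEVELS:
        raise ConfigError(f"PTM_MCP_LOG_LEVEL must be one of {sorted(_LOG_LEVELS)}")

    # Assumed rule: stay read-only unless the value is exactly "false".
    read_only = env.get("PTM_MCP_READ_ONLY", "true").strip().lower() != "false"

    return Settings(
        base_url=env["PTM_API_BASE_URL"].rstrip("/"),
        token=env["PTM_API_TOKEN"],
        read_only=read_only,
        timeout_seconds=timeout,
        log_level=level,
    )
```

Passing the environment in as a plain mapping (for instance dict(os.environ)) keeps the function easy to test and makes the safe default explicit: a typo in PTM_MCP_READ_ONLY leaves the write tools locked.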

Startup scrubs every env var outside a narrow allow-list (cloud creds, GitHub tokens, PATH - all get dropped). See src/ptm_mcp/env.py.
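The allow-list idea can be sketched as follows. The real list lives in src/ptm_mcp/env.py and is not reproduced in this README, so ALLOWED_EXACT, ALLOWED_PREFIXES, and scrub_env below are illustrative assumptions built only from the variables documented above.

```python
# Hypothetical allow-list, derived from the variables this README documents.
ALLOWED_EXACT = {
    "PTM_API_BASE_URL",
    "PTM_API_TOKEN",
    "CF_ACCESS_CLIENT_ID",
    "CF_ACCESS_CLIENT_SECRET",
    "CF_ACCESS_JWT",
}
ALLOWED_PREFIXES = ("PTM_MCP_",)


def scrub_env(env: dict[str, str]) -> dict[str, str]:
    """Return a copy of env that contains only allow-listed variables."""
    return {
        key: value
        for key, value in env.items()
        if key in ALLOWED_EXACT or key.startswith(ALLOWED_PREFIXES)
    }


cleaned = scrub_env(
    {
        "PTM_API_TOKEN": "ptm_u_example",
        "PTM_MCP_READ_ONLY": "true",
        "AWS_SECRET_ACCESS_KEY": "dropped",
        "GITHUB_TOKEN": "dropped",
        "PATH": "/usr/bin",
    }
)
```

This version returns a filtered copy rather than mutating os.environ; an allow-list (as opposed to a deny-list) means unknown secrets are dropped by default.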

Cloudflare Access setup (when PTM is behind Zero Trust): see docs/mcp-integration.md#cloudflare-access.

Claude Desktop config snippet

~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):

{
  "mcpServers": {
    "ptm": {
      "command": "uvx",
      "args": ["ptm-mcp"],
      "env": {
        "PTM_API_BASE_URL": "http://localhost:8010",
        "PTM_API_TOKEN": "ptm_u_PASTE_HERE",
        "PTM_MCP_READ_ONLY": "true"
      }
    }
  }
}

uvx handles Python provisioning + caching automatically; no PYTHONPATH needed when installed from PyPI. Other clients (Claude Code, Codex): see docs/mcp-integration.md for client-specific wire-up.

Tool inventory

24 tools total (1 canary + 12 read + 11 write) + 5 resource URI patterns. Write tools are unlocked only when PTM_MCP_READ_ONLY=false; backend RBAC enforces per-prompt ownership, group membership (Phase 15 isolation), and admin role on top of the PAT's scopes. ptm-mcp relays backend errors verbatim with actionable hints; it does not re-implement permission checks.

Read (13: the canary + 12 read tools)

list_providers, list_prompts, get_prompt, get_prompt_tests, list_prompt_versions, get_prompt_version, compare_prompt_versions, list_runs, get_run, get_run_report, get_optimization_status, get_optimization_history, get_optimization_detail.

Write (11, gated by PTM_MCP_READ_ONLY)

Eval / optimization (4): run_manual_eval, run_prompt_eval, submit_optimization, cancel_optimization.

Library mutation (7, v0.2.0):

  • update_prompt - unified atomic update. Partial-update semantics: null/omitted = leave unchanged, [] = clear a list, populated list = replace. Content fields (prompt_text, tests, deepeval_metrics, kpis, provider_profiles, judge_profile) auto-create a new version; metadata fields (name, team, service, description, tags) may or may not rev the version depending on backend behavior. Required change_summary flows into the audit log.
  • activate_prompt_version - POST /versions/{n}/activate. Rollback / roll-forward path with no content change.
  • share_prompt, unshare_prompt, add_prompt_to_group, remove_prompt_from_group - group-scoped visibility. Admin or group_mgr: only.
  • transfer_prompt_ownership - admin only. Current owner cannot self-transfer.
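The partial-update semantics of update_prompt can be illustrated with a small payload builder. This is a sketch under assumptions: build_update_payload is a hypothetical helper, and whether ptm-mcp omits null fields or sends them as null is not stated here (the README treats both as "leave unchanged"); the example omits them.

```python
from typing import Any


def build_update_payload(change_summary: str, **fields: Any) -> dict[str, Any]:
    """Build an update body: None = leave unchanged, [] = clear, list = replace."""
    if not change_summary.strip():
        raise ValueError("change_summary is required")
    payload: dict[str, Any] = {"change_summary": change_summary}
    # Drop None values so those fields stay untouched; keep [] so lists are cleared.
    payload.update({name: value for name, value in fields.items() if value is not None})
    return payload


payload = build_update_payload(
    "Tighten wording and clear tags",
    prompt_text="You are a concise assistant.",
    tags=[],            # clears the tag list
    description=None,   # left unchanged
)
```

The key detail is the `is not None` test: a plain truthiness check would also drop the empty list and silently turn "clear the tags" into "leave the tags alone".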

Permissions model

| Action | Who can do it |
| --- | --- |
| Read tools | Anyone with a valid PAT (results filtered by visibility scope) |
| update_prompt, activate_prompt_version | Prompt owner OR PROMPT_OVERWRITE role |
| share_prompt, unshare_prompt, add_prompt_to_group, remove_prompt_from_group | Admin OR group_mgr: |
| transfer_prompt_ownership | Admin ONLY |
| Eval tools | Anyone who can read the prompt |

When Phase 15 isolation mode is enabled on the org, group-scoped prompts additionally require the caller to be a member of at least one group the prompt is assigned to.

Backend error codes surface as actionable tool errors:

  • 403 -> "Permission denied. This operation requires prompt ownership or admin / group_mgr role."
  • 404 -> "Prompt or resource not found. Verify prompt_id and PAT visibility."
  • 409 -> "Conflict. Repository-backed prompt (edit the files in git) OR concurrent writer. Re-fetch and retry."
  • 422 -> "Validation failed. Check prompt_text length (max 500k), test / metric shapes, and field types."
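One way to picture the mapping above is a lookup table keyed by HTTP status. The hint strings are copied from the list; the names ERROR_HINTS and describe_backend_error, and the exact way the hint and the backend body are combined, are assumptions for illustration.

```python
ERROR_HINTS = {
    403: "Permission denied. This operation requires prompt ownership or admin / group_mgr role.",
    404: "Prompt or resource not found. Verify prompt_id and PAT visibility.",
    409: "Conflict. Repository-backed prompt (edit the files in git) OR concurrent writer. Re-fetch and retry.",
    422: "Validation failed. Check prompt_text length (max 500k), test / metric shapes, and field types.",
}


def describe_backend_error(status: int, body: str) -> str:
    """Combine an actionable hint (if any) with the backend's own message."""
    hint = ERROR_HINTS.get(status)
    if hint is None:
        # Unknown status: relay the backend response without adding a hint.
        return f"Backend returned HTTP {status}: {body}"
    return f"{hint} (HTTP {status}: {body})"


message = describe_backend_error(403, "not the prompt owner")
```

Keeping the backend body in the message preserves the "relay verbatim" behaviour, while the hint tells the agent what to try next.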

Resources (5 URI patterns)

  • ptm://prompts/{prompt_id} - active version's prompt_text (text/plain)
  • ptm://prompts/{prompt_id}/v{N} - that version's prompt_text (text/plain)
  • ptm://runs/{run_key}/report.md - markdown report (text/markdown)
  • ptm://runs/{run_key}/report.html - HTML report (text/html)
  • ptm://optimizations/{optimization_id}/report.md - markdown summary (text/markdown)

Dynamic segments are allow-list validated (^[a-zA-Z0-9_.-]+$ plus explicit ./.. rejection). See docs/mcp-security.md.

Security defaults

  • PTM_MCP_READ_ONLY=true blocks every write tool at call time.
  • X-PTM-Client + X-PTM-MCP-Session on every outbound request.
  • Env scrub at startup.
  • Startup preflight (/healthz + /auth/me + /meta) with exponential backoff on transient failures and dedicated exit codes per failed layer.
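The call-time read-only gate could look roughly like the decorator below. It is a simplified, synchronous sketch: write_tool and ReadOnlyError are hypothetical names, and the real server's tool handlers may be async and registered through the MCP SDK.

```python
import functools
from typing import Any, Callable


class ReadOnlyError(RuntimeError):
    """Raised when a write tool is called while the server is read-only."""


def write_tool(is_read_only: Callable[[], bool]) -> Callable:
    """Decorator factory: block the wrapped tool whenever is_read_only() is true."""

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            # Checked on every call, not at registration time.
            if is_read_only():
                raise ReadOnlyError(
                    f"{func.__name__} is a write tool; "
                    "set PTM_MCP_READ_ONLY=false to enable it"
                )
            return func(*args, **kwargs)

        return wrapper

    return decorator


state = {"read_only": True}


@write_tool(lambda: state["read_only"])
def submit_optimization(prompt_id: str) -> str:
    return f"submitted {prompt_id}"
```

Checking the flag inside the wrapper, rather than skipping registration, means the tool stays visible to the client and fails with an explanatory error instead of silently disappearing.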

Full details: docs/mcp-security.md.

Development

pip install -e packages/ptm-client
pip install -e 'packages/ptm-mcp[dev]'
PYTHONPATH=packages/ptm-mcp/src:packages/ptm-client/src \
  pytest packages/ptm-mcp/tests -q
ruff check packages/ptm-mcp/src packages/ptm-mcp/tests
ruff format --check packages/ptm-mcp/src packages/ptm-mcp/tests

End-to-end smoke (requires a running backend and Node for npx):

PTM_API_BASE_URL=http://localhost:8010 \
PTM_API_TOKEN="ptm_u_..." \
bash scripts/smoke_mcp_inspector.sh

Exit codes

| Code | Meaning |
| --- | --- |
| 0 | clean shutdown |
| 1 | unhandled exception |
| 2 | /healthz unreachable after 31s of backoff |
| 3 | /auth/me rejected the token |
| 4 | backend version < 1.9.0 or unparseable |
| 130 | interrupted (SIGINT) |
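The layered preflight and its exit codes can be sketched as below. The exit codes come from the table; the delay schedule of 1, 2, 4, 8, 16 seconds is an assumption chosen because it sums to the documented 31s, and wait_for_healthz, preflight, and the probe callables are hypothetical stand-ins for the real HTTP checks.

```python
import time
from typing import Callable, Optional

EXIT_HEALTHZ_UNREACHABLE = 2
EXIT_TOKEN_REJECTED = 3
EXIT_BACKEND_VERSION = 4

# Assumed schedule: doubling delays that add up to 31 seconds.
BACKOFF_DELAYS = (1, 2, 4, 8, 16)


def wait_for_healthz(probe: Callable[[], bool], sleep: Callable[[float], None]) -> bool:
    """Probe once, then once more after each delay; True on the first success."""
    if probe():
        return True
    for delay in BACKOFF_DELAYS:
        sleep(delay)
        if probe():
            return True
    return False


def preflight(
    healthz_ok: Callable[[], bool],
    token_ok: Callable[[], bool],
    version_ok: Callable[[], bool],
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[int]:
    """Return None if every layer passes, else the exit code of the first failure."""
    if not wait_for_healthz(healthz_ok, sleep):
        return EXIT_HEALTHZ_UNREACHABLE
    if not token_ok():
        return EXIT_TOKEN_REJECTED
    if not version_ok():
        return EXIT_BACKEND_VERSION
    return None
```

A caller would pass the result to sys.exit when it is not None. Injecting sleep makes the backoff testable without waiting, and distinct codes per layer let a wrapper script tell a network problem from a bad token or an old backend.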

See also

  • docs/mcp-integration.md - end-user setup
  • docs/mcp-developer.md - adding a tool
  • docs/mcp-security.md - tokens, env, headers, read-only, kill switch
  • docs/mcp-admin.md - ops runbook + alert response
