MCP stdio server for the Prompt Test Manager API

ptm-mcp

MCP (Model Context Protocol) stdio server for the Prompt Test Manager API.

Lets agents built on MCP-capable clients (Claude Desktop, Claude Code, Codex, etc.) call PTM as first-class tools: list prompts, run evaluations, update prompt content, submit optimizations. Every request is tagged with X-PTM-Client: ptm-mcp/<version> and a per-process X-PTM-MCP-Session UUID, so the PTM backend can rate-limit, budget, and audit agent traffic separately from humans and service accounts.
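As a rough illustration of the tagging described above, the sketch below builds the two headers once per process. The header names come from this README; the `build_headers` helper, the `CLIENT_VERSION` placeholder, and the `Authorization: Bearer` form are assumptions made for the example, not ptm-mcp's actual internals.

```python
import uuid

CLIENT_VERSION = "0.0.0"  # placeholder; the real value is the installed ptm-mcp version

# Generated once at import time, so every request from this process shares it.
SESSION_ID = str(uuid.uuid4())


def build_headers(token: str) -> dict[str, str]:
    """Return the headers attached to each outbound request (hypothetical helper)."""
    return {
        "Authorization": f"Bearer {token}",  # assumed bearer scheme
        "X-PTM-Client": f"ptm-mcp/{CLIENT_VERSION}",
        "X-PTM-MCP-Session": SESSION_ID,
    }
```

Because the session ID is created at process start, restarting the server yields a new session from the backend's point of view.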

Prereqs

  • Python >= 3.12.
  • A reachable PTM backend (>= 1.9.0) and a personal access token or service-account token with the scopes your flow needs.

Install

pip install ptm-mcp

Or zero-install via uvx:

uvx ptm-mcp

ptm-mcp pulls in ptm-client and the mcp SDK automatically.

Configure your MCP client

Claude Desktop

Config file:

OS | Path
macOS | ~/Library/Application Support/Claude/claude_desktop_config.json
Windows | %APPDATA%\Claude\claude_desktop_config.json

Merge into mcpServers (create the key if it doesn't exist):

{
  "mcpServers": {
    "ptm": {
      "command": "uvx",
      "args": ["ptm-mcp"],
      "env": {
        "PTM_API_BASE_URL": "https://ptm.example.com",
        "PTM_API_TOKEN": "ptm_u_PASTE_HERE",
        "PTM_MCP_READ_ONLY": "true"
      }
    }
  }
}

Fully quit and reopen Claude Desktop after editing. The server's stderr is written to ~/Library/Logs/Claude/mcp-server-ptm.log (macOS) or %APPDATA%\Claude\logs\mcp-server-ptm.log (Windows).

Claude Code

# macOS / Linux
claude mcp add --transport stdio --scope user ptm \
  --env PTM_API_BASE_URL=https://ptm.example.com \
  --env PTM_API_TOKEN=ptm_u_PASTE_HERE \
  --env PTM_MCP_READ_ONLY=true \
  -- uvx ptm-mcp

# Windows (PowerShell; in cmd, replace the trailing backticks with ^) - needs cmd /c wrapper
claude mcp add --transport stdio --scope user ptm `
  --env PTM_API_BASE_URL=https://ptm.example.com `
  --env PTM_API_TOKEN=ptm_u_PASTE_HERE `
  --env PTM_MCP_READ_ONLY=true `
  -- cmd /c uvx ptm-mcp

Codex

~/.codex/config.toml (macOS / Linux) or %USERPROFILE%\.codex\config.toml (Windows):

[mcp_servers.ptm]
command = "uvx"
args = ["ptm-mcp"]
env = { PTM_API_BASE_URL = "https://ptm.example.com", PTM_API_TOKEN = "ptm_u_PASTE_HERE", PTM_MCP_READ_ONLY = "true" }
startup_timeout_sec = 10
tool_timeout_sec = 60

Or, from the CLI: codex mcp add ptm -- uvx ptm-mcp (add one --env KEY=VAL flag per variable).

Environment variables

Consumed at startup. Missing required values fail fast with a descriptive error.

Variable | Required | Default | Notes
PTM_API_BASE_URL | yes | - | e.g. https://ptm.example.com
PTM_API_TOKEN | yes | - | PTM bearer. Service-account tokens preferred for long-running agent sessions.
PTM_MCP_READ_ONLY | no | true | Flip to false to unlock write tools.
PTM_MCP_TIMEOUT_SECONDS | no | 30 | Per-request timeout (1..600).
PTM_MCP_LOG_LEVEL | no | INFO | DEBUG / INFO / WARNING / ERROR / CRITICAL.
PTM_SSL_VERIFY | no | true | Set to false to disable TLS certificate verification (allows self-signed / invalid certs against a homelab PTM). Local dev only - never disable in production.
CF_ACCESS_CLIENT_ID | no | - | Cloudflare Access service-token Client ID. Paired with CF_ACCESS_CLIENT_SECRET.
CF_ACCESS_CLIENT_SECRET | no | - | Cloudflare Access service-token Client Secret. Paired with CF_ACCESS_CLIENT_ID.
CF_ACCESS_JWT | no | - | Cloudflare Access user JWT (alternative to the service-token pair).
PTM_CF_AUTO_DISCOVER | no | true | Falsy value opts out of auto-discovery via cloudflared.
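The fail-fast behaviour can be pictured with a minimal sketch like the one below. The variable names, defaults, and the 1..600 timeout range come from the table; `Settings`, `ConfigError`, `load_settings`, and the exact set of strings treated as falsy are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass


class ConfigError(ValueError):
    """Raised when the environment is missing or has invalid values (hypothetical)."""


# Assumed spellings of "false"; the real parser may accept a different set.
_FALSY = {"0", "false", "no", "off"}


@dataclass(frozen=True)
class Settings:
    base_url: str
    token: str
    read_only: bool
    timeout_seconds: int


def load_settings(env: dict[str, str]) -> Settings:
    # Required values: report every missing name in one descriptive error.
    missing = [k for k in ("PTM_API_BASE_URL", "PTM_API_TOKEN") if not env.get(k)]
    if missing:
        raise ConfigError("missing required environment variable(s): " + ", ".join(missing))

    timeout = int(env.get("PTM_MCP_TIMEOUT_SECONDS", "30"))
    if not 1 <= timeout <= 600:
        raise ConfigError("PTM_MCP_TIMEOUT_SECONDS must be between 1 and 600")

    # Read-only stays on unless the value is explicitly falsy.
    read_only = env.get("PTM_MCP_READ_ONLY", "true").strip().lower() not in _FALSY
    return Settings(env["PTM_API_BASE_URL"], env["PTM_API_TOKEN"], read_only, timeout)
```

Calling `load_settings(dict(os.environ))` at startup would surface a bad configuration before any tool is registered.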

Startup scrubs every env var outside a narrow allow-list (cloud creds, GitHub tokens, etc. get dropped).
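A minimal sketch of such a scrub, assuming an allow-list built from name prefixes plus a few exact names. The actual allow-list is not documented here, so `ALLOWED_PREFIXES`, `ALLOWED_NAMES`, and `scrub_env` are illustrative only.

```python
# Hypothetical allow-list: the variables documented above, plus a couple of
# names a child process typically needs. The real list may differ.
ALLOWED_PREFIXES = ("PTM_", "CF_ACCESS_")
ALLOWED_NAMES = {"PATH", "HOME"}


def scrub_env(env: dict[str, str]) -> dict[str, str]:
    """Return a copy of env containing only allow-listed variables."""
    return {
        name: value
        for name, value in env.items()
        if name in ALLOWED_NAMES or name.startswith(ALLOWED_PREFIXES)
    }
```

With this shape, something like `AWS_SECRET_ACCESS_KEY` or `GITHUB_TOKEN` never reaches code that runs after startup.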

Cloudflare Access

If your PTM deployment sits behind Cloudflare Access:

  • Default path (recommended): install cloudflared (brew install cloudflared or equivalent) and run cloudflared access login https://your-ptm-host once. ptm-mcp auto-detects CF challenges and injects the cached JWT on request.
  • Service token (CI / headless): ask an admin to mint a service token for the PTM app in Cloudflare Zero Trust -> Access -> Service Auth. Set CF_ACCESS_CLIENT_ID + CF_ACCESS_CLIENT_SECRET in the MCP env block. Explicit config disables auto-discovery.
  • Direct access (no CF Access): skip this section; no CF env vars needed.

On a Cloudflare Access block, ptm-mcp surfaces a CloudflareAccessError with the exact next step rather than a raw JSON decode crash.

Tool inventory

47 tools total (15 read + 32 write) plus 5 resource URI patterns. Write tools are available only when PTM_MCP_READ_ONLY=false. On top of the PAT's scopes, the backend enforces per-prompt ownership, group membership, admin role, and the manage_tool_registry / manage_skill_catalog permissions. ptm-mcp relays backend errors verbatim with actionable hints; it does not re-implement permission checks.

Read (15)

list_providers, list_prompts, get_prompt, get_prompt_tests, list_prompt_versions, get_prompt_version, compare_prompt_versions, list_runs, get_run, get_run_report, get_optimization_status, get_optimization_history, get_optimization_detail, list_skills, get_skill.

Write (32, gated by PTM_MCP_READ_ONLY)

Eval / optimization (4): run_manual_eval (QA eval with custom prompt + test cases), run_prompt_eval (QA eval against stored test suite), submit_optimization, cancel_optimization.

submit_optimization accepts seven optional variance-aware fields: stability_samples (1-10), validation_samples (1-10), flakiness_threshold (0.0-1.0), min_consistent_improvement (0.0-100.0), variance_aware_mutator (bool), variance_signal (levenshtein / embedding / both), enforce_target_score (bool, default true). Omit any field to inherit the server's admin-configured default.
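To make the field ranges concrete, here is a small client-side sketch that checks the seven fields before they are sent. The field names and ranges are taken from the paragraph above; `build_variance_fields` is a hypothetical helper, and the backend remains the authority on validation.

```python
RANGES = {
    "stability_samples": (1, 10),
    "validation_samples": (1, 10),
    "flakiness_threshold": (0.0, 1.0),
    "min_consistent_improvement": (0.0, 100.0),
}
BOOL_FIELDS = {"variance_aware_mutator", "enforce_target_score"}
VARIANCE_SIGNALS = {"levenshtein", "embedding", "both"}


def build_variance_fields(**fields) -> dict:
    """Validate the optional fields and drop the ones left as None."""
    payload = {}
    for name, value in fields.items():
        if value is None:
            continue  # omitted: the server's admin-configured default applies
        if name in RANGES:
            low, high = RANGES[name]
            if isinstance(value, bool) or not low <= value <= high:
                raise ValueError(f"{name} must be in [{low}, {high}]")
        elif name in BOOL_FIELDS:
            if not isinstance(value, bool):
                raise ValueError(f"{name} must be a bool")
        elif name == "variance_signal":
            if value not in VARIANCE_SIGNALS:
                raise ValueError("variance_signal must be levenshtein, embedding, or both")
        else:
            raise ValueError(f"unknown field: {name}")
        payload[name] = value
    return payload
```

For example, `build_variance_fields(stability_samples=3, variance_signal="both", validation_samples=None)` keeps the first two fields and leaves `validation_samples` to the server default.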

Ad-hoc execution (2): run_prompt (call a library prompt with your inputs, get the answer - not a test), run_skill (same, by skill qualified name).

Library mutation (9):

  • update_prompt - unified atomic update. Partial-update semantics: null/omitted = leave unchanged, [] = clear a list, populated list = replace. Content fields auto-create a new version. Required change_summary flows into the audit log.
  • activate_prompt_version - rollback / roll-forward with no content change.
  • share_prompt, unshare_prompt, add_prompt_to_group, remove_prompt_from_group - group-scoped visibility. Admin or group-manager roles only.
  • transfer_prompt_ownership - admin only. Current owner cannot self-transfer.
  • update_prompt_sampling - toggle production-judge sampling and sample rate.
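The partial-update rule for update_prompt (null/omitted leaves a field alone, [] clears a list, a populated list replaces it) can be modelled in a few lines. This is only a local model of the semantics: `apply_partial_update` and the `tags` / `tests` field names are made up for the example, and the real merge happens on the backend.

```python
def apply_partial_update(current: dict, patch: dict) -> dict:
    """Model the documented merge rule on plain dicts (illustrative only)."""
    updated = dict(current)
    for field, value in patch.items():
        if value is None:
            continue  # null behaves like an omitted field: leave unchanged
        # [] clears the list; a populated list (or any other value) replaces it.
        updated[field] = value
    return updated


current = {"prompt_text": "v1", "tags": ["alpha"], "tests": ["t1", "t2"]}
patch = {"prompt_text": "v2", "tags": [], "tests": None}
result = apply_partial_update(current, patch)
# result: prompt_text replaced, tags cleared, tests untouched
```

The distinction between `None` and `[]` is the part that tends to trip callers up: sending an empty list is a deliberate clear, not a no-op.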

Skill Library (7): install_skill (download + install to ~/.claude/skills/{slug}/), load_skill (install if absent + return bundle content for local execution), publish_skill, update_skill, deprecate_skill, run_skill, run_prompt.

Tool Registry (5): create_tool, update_tool, deprecate_tool, delete_mock_profile, export_tool_registry.

Triage (4): promote_run_to_golden, resolve_triage_item, reopen_triage_item, bulk_resolve_triage.

Runs (1): submit_run_feedback.

Permissions model

Action | Who can do it
Read tools | Anyone with a valid PAT (results filtered by visibility scope)
update_prompt, activate_prompt_version | Prompt owner OR a role with prompt-overwrite permission
share_prompt, unshare_prompt, add_prompt_to_group, remove_prompt_from_group | Admin OR group manager
transfer_prompt_ownership | Admin ONLY
Eval tools | Anyone who can read the prompt

Backend error codes surface as actionable tool errors:

  • 403 -> "Permission denied. This operation requires prompt ownership or admin / group-manager role."
  • 404 -> "Prompt or resource not found. Verify prompt_id and PAT visibility."
  • 409 -> "Conflict. Repository-backed prompt (edit the files at source) OR concurrent writer. Re-fetch and retry."
  • 422 -> "Validation failed. Check prompt_text length (max 500k), test / metric shapes, and field types."
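One plausible way to implement this mapping is a plain lookup table, sketched below. The hint strings are copied from the list above; `hint_for_status`, the fallback message, and the way the backend detail is appended are assumptions.

```python
HINTS = {
    403: "Permission denied. This operation requires prompt ownership or admin / group-manager role.",
    404: "Prompt or resource not found. Verify prompt_id and PAT visibility.",
    409: "Conflict. Repository-backed prompt (edit the files at source) OR concurrent writer. Re-fetch and retry.",
    422: "Validation failed. Check prompt_text length (max 500k), test / metric shapes, and field types.",
}


def hint_for_status(status: int, detail: str = "") -> str:
    """Turn a backend status code into a tool error message (hypothetical helper)."""
    hint = HINTS.get(status, f"Unexpected backend status {status}.")
    # Keep the backend's own message, since errors are relayed rather than replaced.
    return f"{hint} Backend said: {detail}" if detail else hint
```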

Resources (5 URI patterns)

  • ptm://prompts/{prompt_id} - active version's prompt_text (text/plain)
  • ptm://prompts/{prompt_id}/v{N} - that version's prompt_text (text/plain)
  • ptm://runs/{run_key}/report.md - markdown report (text/markdown)
  • ptm://runs/{run_key}/report.html - HTML report (text/html)
  • ptm://optimizations/{optimization_id}/report.md - markdown summary (text/markdown)

Dynamic segments are allow-list validated (^[a-zA-Z0-9_.-]+$ plus explicit ./.. rejection).
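A minimal sketch of that check, assuming it is applied to each dynamic segment on its own. `is_valid_segment` is a hypothetical name; the character class is the one quoted above, used with `fullmatch` so a trailing newline cannot slip past `$`.

```python
import re

# Same character class as the documented pattern.
_SEGMENT = re.compile(r"[a-zA-Z0-9_.-]+")


def is_valid_segment(segment: str) -> bool:
    """Accept only allow-listed characters and reject '.' and '..' explicitly."""
    if segment in (".", ".."):
        return False
    return _SEGMENT.fullmatch(segment) is not None
```

The explicit "." / ".." rejection matters because both strings consist solely of allowed characters and would otherwise pass the regex.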

Security defaults

  • PTM_MCP_READ_ONLY=true blocks every write tool at call time.
  • X-PTM-Client + X-PTM-MCP-Session on every outbound request so the backend can classify and audit agent traffic.
  • Env scrub at startup drops anything outside the allow-list.
  • Startup preflight (/healthz + /auth/me + /meta) with exponential backoff on transient failures and dedicated exit codes per failed layer.
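The read-only gate "at call time" can be sketched as a check that runs before a tool handler is dispatched. `WRITE_TOOLS` below lists only a few of the write tools named earlier, and `ReadOnlyError` / `call_tool` are hypothetical names rather than ptm-mcp's real ones.

```python
from typing import Callable

# A subset of the documented write tools, enough to show the idea.
WRITE_TOOLS = {"update_prompt", "submit_optimization", "create_tool"}


class ReadOnlyError(RuntimeError):
    """Raised when a write tool is called while the read-only gate is on."""


def call_tool(name: str, read_only: bool, handlers: dict[str, Callable[[], object]]) -> object:
    # The check happens per call, so write tools are refused at invocation time.
    if read_only and name in WRITE_TOOLS:
        raise ReadOnlyError(f"{name} is a write tool; set PTM_MCP_READ_ONLY=false to enable it")
    return handlers[name]()
```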

Exit codes

Code | Meaning
0 | clean shutdown
1 | unhandled exception
2 | /healthz unreachable after 31s of backoff
3 | /auth/me rejected the token
4 | backend version < 1.9.0 or unparseable
130 | interrupted (SIGINT)
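The 31s figure for exit code 2 is consistent with a doubling backoff that starts at one second and waits five times (1 + 2 + 4 + 8 + 16 = 31). The actual schedule is not stated here, so the sketch below is only one schedule that adds up; `backoff_schedule` and its parameters are assumptions.

```python
def backoff_schedule(base: float = 1.0, factor: float = 2.0, attempts: int = 5) -> list[float]:
    """Delays, in seconds, between successive retries of a failing check."""
    return [base * factor**i for i in range(attempts)]


delays = backoff_schedule()
total_wait = sum(delays)  # 1 + 2 + 4 + 8 + 16
```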

Status

0.10.0. 47 tools: full PTM surface (prompts, evals, optimization, triage, runs, skills, tool registry, ad-hoc execution) + 5 resource URI patterns, read-only gate, Cloudflare Access auto-discovery. See CHANGELOG.md for release notes.
