Lightweight PTM API client for integration with external Python services

ptm-client

Lightweight Python client for the Prompt Test Manager (PTM) API. Single runtime dependency: requests.

Install

pip install ptm-client

Quick start

from ptm_client import PTMClient

client = PTMClient(base_url="https://ptm.example.com", token="your-api-token")

# List prompts, filtered by tag
prompts = client.list_prompts(tag="my_team")

# Fetch a prompt and its test cases
detail = client.get_prompt("my_team.summarizer")
tests = client.get_prompt_tests("my_team.summarizer")

# Run an eval against the library prompt
run = client.run_eval(
    prompt_ids=["my_team.summarizer"],
    provider_ids=["openai_gpt41_mini"],
)

# Or run a one-off manual eval
run = client.run_manual_eval({
    "prompt_text": "Summarize: {{text}}",
    "tests": [{"description": "smoke", "vars": {"text": "Hello, world."}}],
    "provider_profiles": ["openai_gpt41_mini"],
})

# Block until complete, then fetch a report
result = client.wait_for_run(run["run_key"], timeout=120)
html = client.run_report(run["run_key"])
json_report = client.run_report(run["run_key"], format="json")

Constructor

PTMClient(base_url, token, timeout=30, *, verify_ssl=None)

token is a PTM personal access token or service account token. timeout is the HTTP request timeout in seconds.

verify_ssl controls TLS certificate verification. When None (the default), the value is read from the PTM_SSL_VERIFY environment variable, which defaults to true. Set PTM_SSL_VERIFY=false (or pass verify_ssl=False) to allow self-signed or otherwise invalid certificates - intended for local dev against a homelab PTM instance, never for production. The matching urllib3.InsecureRequestWarning is silenced when verification is disabled.
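
The precedence described above (explicit argument first, then the environment variable, then a default of true) can be sketched as follows. This is a minimal illustration, not the client's actual code: resolve_verify_ssl is a hypothetical helper, and the set of strings treated as "false" is an assumption.

```python
import os


def resolve_verify_ssl(verify_ssl=None, environ=None):
    """Hypothetical helper mirroring the documented precedence.

    1. An explicit verify_ssl argument wins.
    2. Otherwise PTM_SSL_VERIFY is read from the environment.
    3. Otherwise verification stays enabled.
    """
    if verify_ssl is not None:
        return bool(verify_ssl)
    env = os.environ if environ is None else environ
    raw = env.get("PTM_SSL_VERIFY", "true")
    # Assumption: these spellings disable verification; the real client
    # may accept a different set of values.
    return raw.strip().lower() not in {"false", "0", "no", "off"}


# No argument, no env var: verification stays on.
default_setting = resolve_verify_ssl(environ={})
# Env var disables verification for local dev.
dev_setting = resolve_verify_ssl(environ={"PTM_SSL_VERIFY": "false"})
```

Passing the environment as a parameter keeps the sketch testable without mutating os.environ.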

Public methods

Prompts

  • list_prompts(tag=None, team=None, service=None, source=None, search=None, group=None)
  • get_prompt(prompt_id)
  • get_prompt_tests(prompt_id)
  • list_prompt_versions(prompt_id) (v0.3.0)
  • get_prompt_version(prompt_id, version_number) (v0.3.0)

Providers

  • list_providers()

Evaluations

  • run_eval(prompt_ids, provider_ids, **kwargs)
  • run_manual_eval(payload)
  • run_prompt_eval(prompt_id, provider_ids, *, inject_vars=None, extra_tests=None, visibility_scope="org_visible", label=None)

Runs

  • list_runs(limit=50, terminal_only=False, mine_only=False) (v0.3.0)
  • get_run(run_key)
  • wait_for_run(run_key, timeout=300, poll_interval=5)
  • run_report(run_key, format="html") - html / json / markdown / csv
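
To illustrate what the timeout and poll_interval arguments of wait_for_run imply, here is a rough sketch of a polling loop. It is not the library's implementation: wait_until_terminal, RunTimeout, the "status" field, and the terminal status names are all assumptions made for the example, and the fetch function is a fake standing in for get_run.

```python
import time


class RunTimeout(Exception):
    """Stand-in for PTMTimeoutError so the sketch runs without ptm-client."""


def wait_until_terminal(fetch_run, run_key, timeout=300, poll_interval=5,
                        terminal=("completed", "failed", "cancelled"),
                        sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_run(run_key) until its status is terminal or time runs out.

    Assumption: the run dict has a "status" key with the values listed in
    `terminal`; the real PTM response shape may differ.
    """
    deadline = clock() + timeout
    while True:
        run = fetch_run(run_key)
        if run.get("status") in terminal:
            return run
        if clock() >= deadline:
            raise RunTimeout(f"run {run_key} not terminal after {timeout}s")
        sleep(poll_interval)


# Fake fetcher that walks through three states, one per poll.
states = iter(["queued", "running", "completed"])
result = wait_until_terminal(
    lambda key: {"run_key": key, "status": next(states)},
    "demo-run", timeout=10, poll_interval=0,
)
```

With the real client, the equivalent call is simply client.wait_for_run(run_key, timeout=..., poll_interval=...).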

Optimization (v0.3.0)

  • submit_optimization(prompt_id, provider_profiles=None, judge_profile=None, max_cycles=10, target_score=90.0, min_improvement=2.0, max_cost_usd=20.0, comparison_strategy=None, visibility_scope=None, *, stability_samples=None, validation_samples=None, flakiness_threshold=None, min_consistent_improvement=None, variance_aware_mutator=None, variance_signal=None, enforce_target_score=None, chained_baseline_mode=None) (variance-aware fields added in v0.6.0; chained_baseline_mode in v0.13.0; omit any to inherit server admin defaults)
  • optimize_prompt(...) - deprecated alias for submit_optimization; emits DeprecationWarning; removed in v1.0.0
  • get_optimization_status(prompt_id)
  • get_optimization_history(prompt_id)
  • get_optimization_detail(optimization_id) (v0.3.0)
  • cancel_optimization(optimization_id)
  • wait_for_optimization(prompt_id, *, timeout=600, poll_interval=10)

Skills (v0.10.0)

  • list_skills(filters=None)
  • get_skill(skill_id_or_qname)
  • get_skill_by_name(qualified_name)
  • get_skills_by_prompt(prompt_id)
  • publish_skill(prompt_id, payload, *, skill_name=None)
  • update_skill(skill_id, payload)
  • delete_skill(skill_id)
  • install_skill_for_claude_code(skill_id, *, install_name=None) - writes bundle to ~/.claude/skills/{slug}/
  • invoke_skill(skill_id, inputs, *, provider_profile=None, amend=False) (v0.10.0; amend added v0.14.0; DEPRECATED in v0.15.0 - emits DeprecationWarning and routes to invoke_skill_eval_ptm)
  • invoke_skill_by_name(qualified_name, inputs, *, provider_profile=None, amend=False) (v0.14.0; in v0.15.0 routes to invoke_skill_eval_ptm)
  • invoke_prompt(prompt_id, inputs, *, provider_profile=None, amend=False) (v0.10.0; amend added v0.14.0; DEPRECATED in v0.15.0 - routes to invoke_prompt_eval_ptm)

Inline-vs-eval execution surfaces (v0.15.0)

PTM 1.16.x exposes three execution modes per kind (skill + prompt). Use the helper that matches the user's intent.

  • invoke_skill_eval_ptm(skill_id, inputs, provider_profile, *, judge_profile=None, timeout_seconds=60, amend=False) - blocking server-side eval; returns scored output and retains an Eval: <qualified_name> row in run history.
  • invoke_prompt_eval_ptm(prompt_id, inputs, provider_profile, ...) - prompt twin.
  • invoke_skill_with_bundle(skill_id, inputs, provider_profile, ...) - returns the skill bundle for local execution AND enqueues a parallel server-side scored eval. Does not block on the eval. Response carries the bundle plus run_key.
  • invoke_skill_with_bundle_by_name(qualified_name, ...) - resolves the qualified name and forwards.
  • invoke_prompt_with_bundle(prompt_id, inputs, provider_profile, ...) - prompt twin.
  • record_skill_client_inline_run(skill_id, *, provider_label, duration_ms, input_keys, success) - telemetry call for client-side inline executions; the server responds with HTTP 204. All four keyword arguments are required server-side; omitting any of them results in HTTP 422. The endpoint sees only the names of the input keys, never their values.
  • record_prompt_client_inline_run(prompt_id, *, provider_label, duration_ms, input_keys, success) - prompt twin.
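
The four required telemetry fields can be assembled in one place so that none is forgotten and input values never leave the process. The sketch below is an illustration under assumptions: inline_run_telemetry is a hypothetical helper, and the types chosen for duration_ms (integer milliseconds) and input_keys (a list of strings) are guesses rather than documented requirements.

```python
import time


def inline_run_telemetry(inputs, provider_label, started_at, success,
                         now=time.monotonic):
    """Build the keyword arguments for record_*_client_inline_run.

    Only the names of the inputs are included, never their values.
    """
    return {
        "provider_label": provider_label,
        # Assumption: integer milliseconds are acceptable to the server.
        "duration_ms": int((now() - started_at) * 1000),
        # Assumption: a sorted list of key names is an accepted shape.
        "input_keys": sorted(inputs),
        "success": bool(success),
    }


# Fixed clock values keep the example deterministic.
kwargs = inline_run_telemetry(
    {"text": "secret body", "lang": "en"},
    provider_label="local_llm",  # hypothetical label
    started_at=10.0,
    success=True,
    now=lambda: 10.25,
)
# With a real client this would be sent as:
# client.record_skill_client_inline_run(skill_id, **kwargs)
```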

Contents + usage metrics (v0.14.0)

  • get_skill_contents(skill_id_or_qname) - fetch bundle in memory as {prompt_md, manifest, files}; no disk write, no LLM call
  • get_prompt_contents(prompt_id, *, version=None) - fetch prompt contract as {prompt_text, version, tests, deepeval_metrics, tool_definitions}
  • get_skill_usage_metrics(skill_id) - {lifetime, last_30d, last_7d, last_request_at} aggregated from fetch-event audit rows; returns null shape when org setting inline_usage_metrics_enabled=false
  • get_prompt_usage_metrics(prompt_id) - same shape for prompt fetch events

Groups (v0.13.0)

  • list_groups()
  • create_prompt(payload)

Test-case shapes

PTM evaluates with three optional scoring layers. Use any combination.

Promptfoo assertions (deterministic)

These go in the assert array inside each test case:

{
    "description": "mention the topic with enough length",
    "vars": {"transcript": "..."},
    "assert": [
        {"type": "icontains", "value": "API migration"},
        {"type": "javascript", "value": "output.length >= 100"},
    ],
}

DeepEval metrics (semantic, judge-LLM)

These go in additional_metrics at the payload root:

{
    "additional_metrics": [
        {"name": "relevance", "criteria": "Output addresses the input topic.", "threshold": 0.7},
    ],
    "judge_profile": "openai_gpt41_mini",
}

KPI configs (custom weighted expressions)

These go in additional_kpis at the payload root:

{
    "additional_kpis": [
        {"name": "cost_ok", "description": "Under $0.05", "expression": "1 if cost < 0.05 else 0", "weight": 1.0},
    ],
}
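
Putting the three layers together, a single manual-eval payload might look like the sketch below. It only combines the keys shown in the fragments above; build_manual_eval_payload is a hypothetical helper, and whether the server accepts all three layers in one run_manual_eval call is an assumption to verify against your PTM instance.

```python
def build_manual_eval_payload(prompt_text, provider_profiles, judge_profile):
    """Combine assertions, DeepEval metrics, and KPI configs in one payload."""
    test_case = {
        "description": "mentions the topic with enough length",
        "vars": {"transcript": "We agreed to finish the API migration by March."},
        # Layer 1: deterministic assertions live inside the test case.
        "assert": [
            {"type": "icontains", "value": "API migration"},
            {"type": "javascript", "value": "output.length >= 100"},
        ],
    }
    return {
        "prompt_text": prompt_text,
        "tests": [test_case],
        "provider_profiles": provider_profiles,
        # Layer 2: judge-LLM metrics sit at the payload root.
        "additional_metrics": [
            {"name": "relevance",
             "criteria": "Output addresses the input topic.",
             "threshold": 0.7},
        ],
        "judge_profile": judge_profile,
        # Layer 3: KPI expressions also sit at the payload root.
        "additional_kpis": [
            {"name": "cost_ok", "description": "Under $0.05",
             "expression": "1 if cost < 0.05 else 0", "weight": 1.0},
        ],
    }


payload = build_manual_eval_payload(
    "Summarize: {{transcript}}", ["openai_gpt41_mini"], "openai_gpt41_mini"
)
# With a real client: run = client.run_manual_eval(payload)
```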

Inline examples

run_manual_eval - full control

run = client.run_manual_eval({
    "label": "my_custom_eval",
    "prompt_text": "Summarize: {{text}}",
    "tests": [{"description": "short text", "vars": {"text": "The quick brown fox."}}],
    "provider_profiles": ["openai_gpt41_mini"],
    "cost_threshold": 1.0,
    "latency_threshold_ms": 30000,
})

run_prompt_eval - fetch from PTM + inject live data

run = client.run_prompt_eval(
    prompt_id="my_team.summarizer",
    provider_ids=["openai_gpt41_mini"],
    inject_vars={"transcript": real_transcript, "meeting_title": "Weekly 1:1"},
)
result = client.wait_for_run(run["run_key"], timeout=120)

Error handling

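from ptm_client import PTMClient, PTMError, PTMTimeoutError

# `client` and `run` come from the Quick start example.
run_key = run["run_key"]

try:
    result = client.wait_for_run(run_key, timeout=60)
except PTMTimeoutError:
    # The run was still not complete after `timeout` seconds.
    print("Run did not complete in time")
except PTMError as e:
    # status_code is 0 for connection/timeout failures.
    print(f"PTM API error ({e.status_code}): {e}")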

PTMError wraps all HTTP errors, ConnectionError, and requests.Timeout. Check e.status_code to tell them apart: it is 0 for connection and timeout failures.
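
Because status_code is 0 for transport-level failures, a caller can retry those while letting genuine HTTP errors propagate. The following is a sketch of that pattern only: call_with_retry is a hypothetical helper, and TransportError is a local stand-in for PTMError so the example runs without ptm-client installed.

```python
import time


class TransportError(Exception):
    """Stand-in for PTMError: carries a status_code, 0 for transport failures."""

    def __init__(self, message, status_code=0):
        super().__init__(message)
        self.status_code = status_code


def call_with_retry(func, *, attempts=3, backoff=0.5,
                    error_type=TransportError, sleep=time.sleep):
    """Retry func() on status_code == 0; re-raise HTTP errors immediately."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except error_type as exc:
            if exc.status_code != 0 or attempt == attempts:
                raise
            # Exponential backoff: backoff, 2*backoff, 4*backoff, ...
            sleep(backoff * 2 ** (attempt - 1))


# Fake call that fails twice with a connection error, then succeeds.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransportError("connection reset", status_code=0)
    return {"status": "completed"}


outcome = call_with_retry(flaky, attempts=3, backoff=0, sleep=lambda s: None)
```

With the real client, pass error_type=PTMError and wrap the call in a lambda, for example lambda: client.get_run(run_key).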

Compatibility

  • Python 3.12+
  • PTM backend >= 1.9.0 required. Skill invoke methods require PTM >= 1.13.1. get_skill_contents, get_prompt_contents, and the usage-metrics methods require PTM >= 1.16.0. All other methods work on any backend from 1.9.0 onward.
