
Lightweight PTM API client for integration with external Python services

ptm-client

Lightweight Python client for the Prompt Test Manager (PTM) API. Single runtime dependency: requests.

Install

pip install ptm-client

Quick start

from ptm_client import PTMClient

client = PTMClient(base_url="https://ptm.example.com", token="your-api-token")

# List prompts, filtered by tag
prompts = client.list_prompts(tag="my_team")

# Fetch a prompt and its test cases
detail = client.get_prompt("my_team.summarizer")
tests = client.get_prompt_tests("my_team.summarizer")

# Run an eval against the library prompt
run = client.run_eval(
    prompt_ids=["my_team.summarizer"],
    provider_ids=["openai_gpt41_mini"],
)

# Or run a one-off manual eval
run = client.run_manual_eval({
    "prompt_text": "Summarize: {{text}}",
    "tests": [{"description": "smoke", "vars": {"text": "Hello, world."}}],
    "provider_profiles": ["openai_gpt41_mini"],
})

# Block until complete, then fetch a report
result = client.wait_for_run(run["run_key"], timeout=120)
html = client.run_report(run["run_key"])
json_report = client.run_report(run["run_key"], format="json")

Constructor

PTMClient(base_url, token, timeout=30, *, verify_ssl=None)

token is a PTM personal access token or service account token. timeout is the HTTP request timeout in seconds.

verify_ssl controls TLS certificate verification. When None (the default), the value is read from the PTM_SSL_VERIFY environment variable, which defaults to true. Set PTM_SSL_VERIFY=false (or pass verify_ssl=False) to allow self-signed or otherwise invalid certificates - intended for local dev against a homelab PTM instance, never for production. The matching urllib3.InsecureRequestWarning is silenced when verification is disabled.
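
The resolution order described above can be sketched as follows. This is a minimal illustration, not the client's actual code: the helper name resolve_verify_ssl and the set of strings treated as false are assumptions. The README only states that an unset PTM_SSL_VERIFY means verification is on and that the value false turns it off.

```python
import os

# Hypothetical set of strings treated as "false"; the real client may differ.
_FALSE_STRINGS = {"false", "0", "no", "off"}


def resolve_verify_ssl(verify_ssl=None, environ=None):
    """Return the effective TLS verification flag.

    An explicit argument wins; otherwise PTM_SSL_VERIFY is consulted,
    and verification stays on when the variable is unset.
    """
    if verify_ssl is not None:
        return bool(verify_ssl)
    environ = os.environ if environ is None else environ
    raw = environ.get("PTM_SSL_VERIFY", "true")
    return raw.strip().lower() not in _FALSE_STRINGS
```

Passing the environment as a parameter keeps the sketch easy to test without mutating os.environ.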

Public methods

Prompts

  • list_prompts(tag=None, team=None, service=None, source=None, search=None, group=None)
  • get_prompt(prompt_id)
  • get_prompt_tests(prompt_id)
  • list_prompt_versions(prompt_id) (v0.3.0)
  • get_prompt_version(prompt_id, version_number) (v0.3.0)

Providers

  • list_providers()

Evaluations

  • run_eval(prompt_ids, provider_ids, **kwargs)
  • run_manual_eval(payload)
  • run_prompt_eval(prompt_id, provider_ids, *, inject_vars=None, extra_tests=None, visibility_scope="org_visible", label=None)

Runs

  • list_runs(limit=50, terminal_only=False, mine_only=False) (v0.3.0)
  • get_run(run_key)
  • wait_for_run(run_key, timeout=300, poll_interval=5)
  • run_report(run_key, format="html") - format is one of html, json, markdown, or csv
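
For intuition, the timeout and poll_interval parameters of wait_for_run suggest a polling loop along these lines. This is a sketch under assumptions, not the library's implementation: the helper name wait_for_terminal, the "status" field, and the terminal status values are hypothetical, since the README does not document the run payload.

```python
import time


def wait_for_terminal(get_run, run_key, timeout=300, poll_interval=5,
                      terminal=("completed", "failed", "cancelled"),
                      sleep=time.sleep, clock=time.monotonic):
    """Poll get_run(run_key) until its status is terminal or timeout elapses.

    The "status" key and the terminal values are assumptions for illustration.
    """
    deadline = clock() + timeout
    while True:
        run = get_run(run_key)
        if run.get("status") in terminal:
            return run
        if clock() >= deadline:
            raise TimeoutError(
                f"run {run_key} still {run.get('status')!r} after {timeout}s"
            )
        sleep(poll_interval)
```

With a real client this would be called as wait_for_terminal(client.get_run, run["run_key"]). In practice the built-in wait_for_run is the better choice; the sketch only shows what the two parameters control.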

Optimization (v0.3.0)

  • submit_optimization(prompt_id, provider_profiles=None, judge_profile=None, max_cycles=10, target_score=90.0, min_improvement=2.0, max_cost_usd=20.0, comparison_strategy=None, visibility_scope=None, *, stability_samples=None, validation_samples=None, flakiness_threshold=None, min_consistent_improvement=None, variance_aware_mutator=None, variance_signal=None, enforce_target_score=None, chained_baseline_mode=None) (variance-aware fields added in v0.6.0; chained_baseline_mode in v0.13.0; omit any to inherit server admin defaults)
  • optimize_prompt(...) - deprecated alias for submit_optimization; emits DeprecationWarning; removed in v1.0.0
  • get_optimization_status(prompt_id)
  • get_optimization_history(prompt_id)
  • get_optimization_detail(optimization_id) (v0.3.0)
  • cancel_optimization(optimization_id)
  • wait_for_optimization(prompt_id, *, timeout=600, poll_interval=10)
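
Because optimize_prompt emits a DeprecationWarning before its removal in v1.0.0, it can help to turn such warnings into errors in tests so that lingering calls are caught early. The sketch below uses only the standard warnings module; call_strict and old_alias are hypothetical names, and old_alias merely stands in for a deprecated client method.

```python
import warnings


def call_strict(func, *args, **kwargs):
    """Call func, raising if it emits a DeprecationWarning."""
    with warnings.catch_warnings():
        warnings.simplefilter("error", DeprecationWarning)
        return func(*args, **kwargs)


def old_alias():
    """Stand-in for a deprecated alias; not part of ptm-client."""
    warnings.warn("use the new name instead", DeprecationWarning, stacklevel=2)
    return "result"
```

In a test suite, call_strict(client.optimize_prompt, ...) would then fail loudly, while call_strict(client.submit_optimization, ...) would pass through unchanged.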

Skills (v0.10.0)

  • list_skills(filters=None)
  • get_skill(skill_id_or_qname)
  • get_skill_by_name(qualified_name)
  • get_skills_by_prompt(prompt_id)
  • publish_skill(prompt_id, payload, *, skill_name=None)
  • update_skill(skill_id, payload)
  • delete_skill(skill_id)
  • install_skill_for_claude_code(skill_id, *, install_name=None) - writes bundle to ~/.claude/skills/{slug}/
  • invoke_skill(skill_id, inputs, *, provider_profile=None, amend=False) (v0.10.0; amend added v0.14.0; DEPRECATED in v0.15.0 - emits DeprecationWarning and routes to invoke_skill_eval_ptm)
  • invoke_skill_by_name(qualified_name, inputs, *, provider_profile=None, amend=False) (v0.14.0; in v0.15.0 routes to invoke_skill_eval_ptm)
  • invoke_prompt(prompt_id, inputs, *, provider_profile=None, amend=False) (v0.10.0; amend added v0.14.0; DEPRECATED in v0.15.0 - routes to invoke_prompt_eval_ptm)

Inline-vs-eval execution surfaces (v0.15.0)

PTM 1.16.x exposes three execution modes for each kind (skill and prompt). Use the helper that matches the user's intent.

  • invoke_skill_eval_ptm(skill_id, inputs, provider_profile, *, judge_profile=None, timeout_seconds=60, amend=False) - blocking server-side eval; returns scored output and retains an Eval: <qualified_name> row in run history.
  • invoke_prompt_eval_ptm(prompt_id, inputs, provider_profile, ...) - prompt twin.
  • invoke_skill_with_bundle(skill_id, inputs, provider_profile, ...) - returns the skill bundle for local execution AND enqueues a parallel server-side scored eval. Does not block on the eval. Response carries the bundle plus run_key.
  • invoke_skill_with_bundle_by_name(qualified_name, ...) - resolves the qualified name and forwards.
  • invoke_prompt_with_bundle(prompt_id, inputs, provider_profile, ...) - prompt twin.
  • record_skill_client_inline_run(skill_id, *, provider_label, duration_ms, input_keys, success) - telemetry for client-side inline executions; responds with HTTP 204. All four kwargs are REQUIRED server-side; omitting any of them raises HTTP 422. The endpoint sees only the names of input keys, never their values.
  • record_prompt_client_inline_run(prompt_id, *, provider_label, duration_ms, input_keys, success) - prompt twin.
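
The four telemetry kwargs can be collected while the local execution runs. The following is a minimal sketch: timed_inline_run and run_locally are hypothetical names, and reporting duration_ms as an integer and input_keys as a sorted list of strings are assumptions, since the README does not state the expected types.

```python
import time


def timed_inline_run(run_locally, inputs, provider_label):
    """Run a local callable and return (output, telemetry_kwargs).

    Only the names of the input keys are recorded, never their values.
    """
    started = time.perf_counter()
    try:
        output = run_locally(inputs)
        success = True
    except Exception:
        # A failed local run is still reported, with success=False.
        output = None
        success = False
    duration_ms = int((time.perf_counter() - started) * 1000)
    telemetry = {
        "provider_label": provider_label,
        "duration_ms": duration_ms,
        "input_keys": sorted(inputs),
        "success": success,
    }
    return output, telemetry
```

The resulting dict could then be forwarded as client.record_skill_client_inline_run(skill_id, **telemetry), which supplies all four required kwargs in one call.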

Contents + usage metrics (v0.14.0)

  • get_skill_contents(skill_id_or_qname) - fetch bundle in memory as {prompt_md, manifest, files}; no disk write, no LLM call
  • get_prompt_contents(prompt_id, *, version=None) - fetch prompt contract as {prompt_text, version, tests, deepeval_metrics, tool_definitions}
  • get_skill_usage_metrics(skill_id) - {lifetime, last_30d, last_7d, last_request_at} aggregated from fetch-event audit rows; returns null shape when org setting inline_usage_metrics_enabled=false
  • get_prompt_usage_metrics(prompt_id) - same shape for prompt fetch events

Groups (v0.13.0)

  • list_groups()
  • create_prompt(payload)

Test-case shapes

PTM evaluates with three optional scoring layers. Use any combination.

Promptfoo assertions (deterministic)

Promptfoo assertions go in the assert array inside each test case:

{
    "description": "mention the topic with enough length",
    "vars": {"transcript": "..."},
    "assert": [
        {"type": "icontains", "value": "API migration"},
        {"type": "javascript", "value": "output.length >= 100"},
    ],
}

DeepEval metrics (semantic, judge-LLM)

DeepEval metrics go in additional_metrics at the payload root, alongside the judge_profile that scores them:

{
    "additional_metrics": [
        {"name": "relevance", "criteria": "Output addresses the input topic.", "threshold": 0.7},
    ],
    "judge_profile": "openai_gpt41_mini",
}

KPI configs (custom weighted expressions)

KPI configs go in additional_kpis at the payload root:

{
    "additional_kpis": [
        {"name": "cost_ok", "description": "Under $0.05", "expression": "1 if cost < 0.05 else 0", "weight": 1.0},
    ],
}
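
Since the three layers can be combined, a single manual-eval payload might carry all of them at once. The helper below is a sketch: build_manual_eval_payload is a hypothetical name, and it assumes that run_manual_eval accepts the layer keys shown above next to prompt_text, tests, and provider_profiles.

```python
def build_manual_eval_payload(prompt_text, tests, provider_profiles, *,
                              metrics=None, kpis=None, judge_profile=None):
    """Assemble a run_manual_eval payload, adding only the layers supplied."""
    payload = {
        "prompt_text": prompt_text,
        "tests": tests,  # Promptfoo assertions live in each test's "assert"
        "provider_profiles": provider_profiles,
    }
    if metrics:
        payload["additional_metrics"] = metrics  # DeepEval layer, payload root
    if judge_profile:
        payload["judge_profile"] = judge_profile
    if kpis:
        payload["additional_kpis"] = kpis  # KPI layer, payload root
    return payload


payload = build_manual_eval_payload(
    "Summarize: {{text}}",
    tests=[{
        "description": "mentions the fox",
        "vars": {"text": "The quick brown fox."},
        "assert": [{"type": "icontains", "value": "fox"}],
    }],
    provider_profiles=["openai_gpt41_mini"],
    metrics=[{"name": "relevance", "criteria": "Output addresses the input topic.", "threshold": 0.7}],
    kpis=[{"name": "cost_ok", "description": "Under $0.05", "expression": "1 if cost < 0.05 else 0", "weight": 1.0}],
    judge_profile="openai_gpt41_mini",
)
```

The payload would then be passed as client.run_manual_eval(payload). Layers that are not supplied are simply left out of the dict.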

Inline examples

run_manual_eval - full control

run = client.run_manual_eval({
    "label": "my_custom_eval",
    "prompt_text": "Summarize: {{text}}",
    "tests": [{"description": "short text", "vars": {"text": "The quick brown fox."}}],
    "provider_profiles": ["openai_gpt41_mini"],
    "cost_threshold": 1.0,
    "latency_threshold_ms": 30000,
})

run_prompt_eval - fetch from PTM + inject live data

run = client.run_prompt_eval(
    prompt_id="my_team.summarizer",
    provider_ids=["openai_gpt41_mini"],
    inject_vars={"transcript": real_transcript, "meeting_title": "Weekly 1:1"},
)
result = client.wait_for_run(run["run_key"], timeout=120)

Error handling

from ptm_client import PTMClient, PTMError, PTMTimeoutError

try:
    result = client.wait_for_run(run_key, timeout=60)
except PTMTimeoutError:
    print("Run did not complete in time")
except PTMError as e:
    print(f"PTM API error ({e.status_code}): {e}")

PTMError wraps all HTTP errors, ConnectionError, and requests.Timeout. Check e.status_code (0 for connection/timeout failures).
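
Because connection and timeout failures surface with status_code 0, callers can separate transient network problems from real HTTP errors. The retry wrapper below is a sketch under assumptions: retry_on_connection_failure is a hypothetical helper, the exception type is passed in so the block does not depend on ptm_client being installed, and exponential backoff is just one reasonable policy.

```python
import time


def retry_on_connection_failure(call, error_type, attempts=3,
                                base_delay=1.0, sleep=time.sleep):
    """Retry call() only when error_type is raised with status_code == 0."""
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except error_type as exc:
            transient = getattr(exc, "status_code", None) == 0
            if not transient or attempt == attempts:
                raise  # HTTP errors and the final failure propagate unchanged
            sleep(base_delay * 2 ** (attempt - 1))  # 1s, 2s, 4s, ...
```

With the real client this might be used as retry_on_connection_failure(lambda: client.get_run(run_key), PTMError). HTTP errors such as 404 or 422 are not retried, since repeating the request would not change the outcome.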

Compatibility

  • Python 3.12+
  • PTM backend >= 1.9.0 required. Skill invoke methods require PTM >= 1.13.1. get_skill_contents, get_prompt_contents, and the usage-metrics methods require PTM >= 1.16.0. All other methods work on any backend from 1.9.0 onward.