MrProbe / Agent Guard customer observation SDK — ship your agent's response back to MrProbe in 6 lines.

agentguard-observe

Customer-side observation SDK for MrProbe / Agent Guard (POST /api/v1/observe).

Ship your agent's response back to MrProbe in 6 lines so red-team campaigns get live, signed, retry-safe, secret-redacted observations from your agent runtime — wherever it runs (FastAPI, Vertex AI, Bedrock Lambda, Azure Function, custom HTTP).

Status: v0.1.1 — design-partner ready for Email Agents on Microsoft 365 Outlook and Google Workspace Gmail. The same SDK is the Pillar 4 surface for every other agent-type epic (Document/RAG, Salesforce, ServiceNow, etc.) — internally tracked as MRPRBE-303. Pre-1.0 because the wire-shape (e.g. llm_response field name) may evolve through design-partner feedback before we commit to the 1.0 stability contract.


Install

From PyPI (recommended)

pip install agentguard-observe

Pin in requirements.txt:

agentguard-observe==0.1.1

From the GitHub release (alternative — hash-pinned)

pip install \
  https://github.com/QDEXConsulting/agent-guard/releases/download/agentguard-observe-v0.1.0/agentguard_observe-0.1.0-py3-none-any.whl \
  --hash=sha256:<copy-from-the-release-page>

The SHA-256 of every published wheel + sdist is printed in the release notes on the GitHub Releases page. Include the --hash line for supply-chain verification.
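
If you download the wheel or sdist by hand rather than through pip, the same digest can be checked before installing. The sketch below uses only the standard library; the helper names are illustrative, and the expected digest is whatever you copy from the release notes.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Path, expected_hex: str) -> bool:
    """Compare a local file against a digest copied from the release notes."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Call matches_published_hash(Path("<downloaded file>"), "<digest from the release page>") and refuse to install on False.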

From source (during development / dogfooding)

git clone https://github.com/QDEXConsulting/agent-guard.git
pip install -e ./agent-guard/sdk/agentguard_observe

Six-line happy path

import os
from agentguard_observe import Client

client = Client(
    api_key=os.environ["AGENTGUARD_API_KEY"],            # bearer identifier (shown once at agent registration)
    agent_id=os.environ["AGENTGUARD_AGENT_ID"],          # the agent's UUID
    signing_secret=os.environ["AGENTGUARD_SIGNING_SECRET"], # HMAC key — see "Fetching your signing secret" below
)

# ... your agent processes the inbound and produces a reply ...

client.observe(
    attack_id=inbound_headers["X-AgentGuard-ID"],  # echoed back from the test message
    response_text=reply.text,                       # what your LLM said
    action="email_forward",                          # what your agent DID (or "none")
    action_status="executed",                        # "executed" | "blocked" | "none" — see Action status reference below
    action_target="alice@example.com",              # who/what the action affected
)

That's the entire integration. The SDK signs the request, retries on transient failures, redacts common secret shapes from response_text / action_target before they leave your VPC, and never crashes your host application on common input mistakes (see "Common pitfalls" below).

Asyncio variant (FastAPI / Vertex AI Agents / asyncio runtimes)

from agentguard_observe import AsyncClient

async with AsyncClient(
    api_key=KEY,
    agent_id=AGENT_ID,
    signing_secret=SIGNING_SECRET,
) as client:
    await client.observe(attack_id="...", response_text=reply.text)

The surface is identical — every method is awaitable.


Production-ready integration pattern

The six-line example above is the minimum integration. For production deployments, especially when client.observe(...) runs from a watcher thread or background coroutine, use the pattern below. It guarantees that an SDK or network failure can never kill your watcher / agent loop — telemetry resilience is a property of the call site, not just the SDK.

import logging
import os
from agentguard_observe import (
    Client,
    AgentGuardError,        # base class — catches everything below
    AgentGuardAuthError,    # 401 / 403 — rotate api_key + signing_secret
    AgentGuardRateLimitError,  # 429 — back off, MrProbe will recover
    AgentGuardServerError,  # 5xx — MrProbe-side incident
    AgentGuardTimeoutError, # network unreachable, DNS, TLS
)

log = logging.getLogger(__name__)

_observe_client = Client(
    api_key=os.environ["AGENTGUARD_API_KEY"],
    agent_id=os.environ["AGENTGUARD_AGENT_ID"],
    signing_secret=os.environ["AGENTGUARD_SIGNING_SECRET"],
)


def safe_observe(
    *,
    attack_id: str,
    response_text: str | None = None,
    action: str | None = None,
    action_status: str = "none",
    action_target: str | None = None,
) -> None:
    """Fire-and-forget observation. Never raises, always logs.

    Use this anywhere the calling thread/coroutine would die on an
    unhandled exception — most commonly an email-watcher thread, a
    Pub/Sub subscriber, or an LLM-tool callback.
    """
    try:
        _observe_client.observe(
            attack_id=attack_id,
            response_text=response_text,
            action=action,
            action_status=action_status,
            action_target=action_target,
        )
    except AgentGuardAuthError:
        log.exception("agentguard auth failure — refresh credentials")
    except AgentGuardRateLimitError:
        log.warning("agentguard rate-limited — observation dropped")
    except (AgentGuardTimeoutError, AgentGuardServerError):
        log.warning("agentguard transient failure — observation dropped")
    except AgentGuardError:
        log.exception("agentguard unexpected error — observation dropped")
    except Exception:  # noqa: BLE001 — defensive: never crash the watcher
        log.exception("agentguard unexpected non-AgentGuard error")

Why a wrapper if the SDK auto-coerces input?

The SDK now coerces the most common input mistakes (action_status aliases, dict response_text, etc.) without raising. But network failures, auth rotations, and MrProbe-side incidents still raise — those are real conditions you want surfaced in your logs without killing your agent. The safe_observe wrapper ensures the worst case is "this one observation didn't reach MrProbe", not "the entire campaign stalls because the watcher thread is dead".
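
For asyncio runtimes the same idea carries over to AsyncClient. This is a minimal sketch, assuming only that the client exposes an awaitable observe(**kwargs) as in the asyncio variant above. safe_observe_async is a hypothetical helper name, not part of the SDK, and for brevity it catches Exception broadly instead of the individual AgentGuard error classes.

```python
import asyncio
import logging
from typing import Any

log = logging.getLogger(__name__)


async def safe_observe_async(client: Any, **fields: Any) -> bool:
    """Await client.observe(**fields); log and swallow any failure.

    Returns True if the call completed, False if the observation was dropped.
    `client` is anything with an awaitable observe(), for example AsyncClient.
    """
    try:
        await client.observe(**fields)
        return True
    except asyncio.CancelledError:
        raise  # let task cancellation propagate to the event loop
    except Exception:  # noqa: BLE001 - defensive: never crash the caller
        log.exception("agentguard observation dropped")
        return False
```

The boolean return value lets the caller count dropped observations without handling exceptions itself.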


Common pitfalls (and how the SDK handles them)

These are the integration mistakes we observed in production design-partner deployments. From v0.1.1, the SDK handles the input mistakes (pitfalls 1 and 2) fail-soft: your call ships and you get a UserWarning in your logs instead of a crash. The canonical form is still preferable, and the remaining pitfalls need a fix at the call site.

1. Wrong action_status value

# ❌ BEFORE v0.1.1 this CRASHED your watcher thread
client.observe(attack_id="...", action_status="completed")
# UserWarning: action_status='completed' is not a canonical value;
# coerced to 'executed'.

# ✅ Canonical
client.observe(attack_id="...", action_status="executed")

The SDK accepts a comprehensive alias table (see "Action status reference" below). If you mistype a value the SDK doesn't recognise (e.g. "approved"), it still raises — silently defaulting an unknown value would mask a real bug.

2. Passing structured agent output instead of a rendered string

# ❌ BEFORE v0.1.1 this CRASHED inside the redaction layer with
# TypeError: expected string or bytes-like object, got 'dict'
agent_output = {"reply": "Hi Sarah", "tool_calls": [...]}
client.observe(attack_id="...", response_text=agent_output)
# UserWarning: response_text received dict; JSON-encoded before sending.

# ✅ Pre-render to control the exact wire format
client.observe(attack_id="...", response_text=agent_output["reply"])

The SDK now JSON-encodes (ensure_ascii=False, datetime-safe) when you hand it a dict or list. Useful as a safety net while you finish the integration; not the recommended steady state.
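
If you would rather control the rendering yourself, a small helper along these lines works. It is a sketch of pre-rendering with the options named above (ensure_ascii=False, datetimes as ISO 8601), not the SDK's internal encoder. render_response_text is a hypothetical name.

```python
import json
from datetime import date, datetime
from typing import Any


def _json_default(value: Any) -> str:
    """Serialise datetimes as ISO 8601; fall back to str() for other types."""
    if isinstance(value, (datetime, date)):
        return value.isoformat()
    return str(value)


def render_response_text(agent_output: Any) -> str:
    """Return a string suitable for response_text.

    Strings pass through unchanged; anything else is JSON-encoded.
    """
    if isinstance(agent_output, str):
        return agent_output
    return json.dumps(agent_output, ensure_ascii=False, default=_json_default)
```

Passing response_text=render_response_text(agent_output) keeps the wire format under your control and avoids the UserWarning.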

3. Forgetting to handle the case where attack_id is missing

# ❌ Crashes with "attack_id is required" when MrProbe header is absent
attack_id = inbound.headers.get("X-AgentGuard-ID")
client.observe(attack_id=attack_id, response_text=reply)

# ✅ No-op when there's no campaign attached
attack_id = inbound.headers.get("X-AgentGuard-ID")
if attack_id:
    client.observe(attack_id=attack_id, response_text=reply)

attack_id is required and is subject to a security-boundary charset check (no newlines, no HTML metacharacters, no non-ASCII characters). The SDK does NOT auto-default it, because that would let observation events silently disappear.

4. Calling observe() from a thread without a try/except

# ❌ Any unhandled exception kills the thread silently — your agent
# stops processing email but no error is surfaced
def watch_inbox():
    for message in poll():
        client.observe(...)

# ✅ Wrap with safe_observe() (see Production-ready pattern above)
def watch_inbox():
    for message in poll():
        safe_observe(...)

This is the production pattern that protects against the long-tail crashes the SDK can't anticipate — DNS failures, connection-pool exhaustion, customer-side proxy MITM that returns malformed responses, etc.

5. Passing the bearer token instead of the signing secret

The api_key is the bearer identifier; the signing_secret is the HMAC key. They're separate values. See "Fetching your signing secret" below for how to obtain the signing secret.

# ❌ Same value in both places
client = Client(api_key=KEY, agent_id=ID, signing_secret=KEY)
# 401 Unauthorized: invalid signature

# ✅ Separate values
client = Client(api_key=KEY, agent_id=ID, signing_secret=SIGNING_SECRET)

Action status reference

action_status describes the agent-side outcome of the attack. The canonical values are short and stable; the SDK also accepts a table of common aliases, so a near-synonym in your producer code is coerced with a warning instead of crashing.

| Canonical | Meaning | SDK aliases (coerced + warning) |
|---|---|---|
| "executed" | The agent took the action the attack tried to elicit. | completed, complete, success, succeeded, ok, done, fired, performed |
| "blocked" | A guardrail / policy / human-in-the-loop stopped the action. | denied, refused, rejected, prevented, stopped, failed |
| "none" (default) | No action attempted: the agent only replied (or didn't respond at all). | "", None, null, n/a, na, noop, no-op, no op |

Aliases are case-insensitive and whitespace-tolerant (" Completed " → "executed"). Hyphens and spaces normalise to underscores. Anything not in the alias table still raises a ValueError, because that is a programming bug we don't want to mask.
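
The normalisation rules can be sketched as follows. This is an illustration built from the alias table above, not the SDK's actual implementation; normalise_action_status is a hypothetical name, and treating None like the empty string is an assumption.

```python
import warnings

CANONICAL = {"executed", "blocked", "none"}

# Aliases from the reference table, stored in normalised form
# (lower-case, hyphens and spaces already folded to underscores).
ALIASES = {
    "completed": "executed", "complete": "executed", "success": "executed",
    "succeeded": "executed", "ok": "executed", "done": "executed",
    "fired": "executed", "performed": "executed",
    "denied": "blocked", "refused": "blocked", "rejected": "blocked",
    "prevented": "blocked", "stopped": "blocked", "failed": "blocked",
    "": "none", "null": "none", "n/a": "none", "na": "none",
    "noop": "none", "no_op": "none",
}


def normalise_action_status(value):
    """Map a raw action_status onto a canonical value or raise ValueError."""
    if value is None:
        value = ""  # assumption: None behaves like the empty-string alias
    if not isinstance(value, str):
        raise ValueError(f"action_status must be a string, got {type(value).__name__}")
    key = value.strip().lower().replace("-", "_").replace(" ", "_")
    if key in CANONICAL:
        return key
    if key in ALIASES:
        canonical = ALIASES[key]
        warnings.warn(
            f"action_status={value!r} is not a canonical value; coerced to {canonical!r}.",
            UserWarning,
            stacklevel=2,
        )
        return canonical
    raise ValueError(f"unknown action_status: {value!r}")
```

Running your producer's values through a function like this in a unit test is a cheap way to catch a non-canonical status before it reaches production logs.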

How MrProbe interprets each value

| Value | Effect on the campaign verdict |
|---|---|
| executed | Strong evidence of a successful exploit. The judge will likely return flag_for_review / attack_succeeded and trigger an adaptive probe in the next phase. |
| blocked | Strong evidence of a successful defence. The judge will likely return attack_failed (defended) and skip the adaptive probe. |
| none | Neutral: the judge evaluates the response_text semantically. Most replies fall here unless your agent has tool-calling. |

Fetching your signing secret

The signing secret is a SEPARATE value from your API key. The API key is your bearer identifier (transmitted in the X-AgentGuard-Key header on every request); the signing secret is the HMAC key used to compute the request signature and is never transmitted on the wire. Together they give MrProbe cryptographic proof that a POST /api/v1/observe payload originated from your deployment and wasn't tampered with in transit.

One-time fetch (after registering your agent):

curl -H "Authorization: Bearer $YOUR_JWT" \
     https://api.mrprobe.dev/api/v1/agents/$AGENT_ID/webhook/signing-secret

The response is:

{
  "agent_id": "...",
  "signing_secret": "<64 hex chars>",
  "derivation": "HKDF-SHA256(master_key, api_key, info='agentguard-observe-v1')"
}

Store the value in your secret manager (AWS Secrets Manager / GCP Secret Manager / HashiCorp Vault / Kubernetes Secret) and load it as AGENTGUARD_SIGNING_SECRET in your runtime. The endpoint is idempotent — calling it again returns the same value, so a customer who lost the secret can re-fetch it without rotating the api_key.
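
A start-up check along these lines catches the most common configuration slips before the first observation is sent. It is a sketch: load_credentials and Credentials are hypothetical names, and the 64-hex-character check is inferred from the sample response above, not from a documented guarantee.

```python
import os
import re
from typing import Mapping, NamedTuple

_HEX64 = re.compile(r"[0-9a-fA-F]{64}")


class Credentials(NamedTuple):
    api_key: str
    agent_id: str
    signing_secret: str


def load_credentials(env: Mapping[str, str] = os.environ) -> Credentials:
    """Read the three AGENTGUARD_* variables and sanity-check them."""
    try:
        creds = Credentials(
            api_key=env["AGENTGUARD_API_KEY"],
            agent_id=env["AGENTGUARD_AGENT_ID"],
            signing_secret=env["AGENTGUARD_SIGNING_SECRET"],
        )
    except KeyError as exc:
        raise RuntimeError(f"missing environment variable: {exc.args[0]}") from None
    # Assumption: the secret is 64 hex characters, as in the sample response.
    if not _HEX64.fullmatch(creds.signing_secret):
        raise RuntimeError("AGENTGUARD_SIGNING_SECRET should be 64 hex characters")
    # The API key and the signing secret are separate values (see pitfall 5).
    if creds.signing_secret == creds.api_key:
        raise RuntimeError("signing secret must differ from the API key")
    return creds
```

The returned tuple can be unpacked straight into the constructor: Client(**load_credentials()._asdict()).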

Why not just use the API key as the HMAC secret? Because then capturing one signed request (TLS-terminating proxy logs, mistakenly logged headers, etc.) gives an attacker the secret AND the ability to forge any future request — defeating the whole point of HMAC. With the values separated, capturing a signed request reveals only the api_key + ts + nonce + sig + body, none of which let the attacker recover the signing secret.
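
To make the separation concrete, here is a minimal sketch of HMAC-SHA256 signing and verification over the fields the SDK signs (ts, nonce, method, path, sha256(body)). The newline-joined string-to-sign, the UTF-8 encoding of the secret, and both function names are assumptions for illustration; the real wire format is defined by the SDK and the backend, so do not use this to hand-roll requests.

```python
import hashlib
import hmac


def sign_request(secret: str, ts: int, nonce: str, method: str, path: str, body: bytes) -> str:
    """Return a 'v1=<hex>' signature over ts, nonce, method, path and sha256(body)."""
    body_hash = hashlib.sha256(body).hexdigest()
    # Assumed layout: one field per line. The real canonical string may differ.
    string_to_sign = "\n".join([str(ts), nonce, method.upper(), path, body_hash])
    mac = hmac.new(secret.encode("utf-8"), string_to_sign.encode("utf-8"), hashlib.sha256)
    return "v1=" + mac.hexdigest()


def verify_request(
    secret: str,
    ts: int,
    nonce: str,
    method: str,
    path: str,
    body: bytes,
    signature: str,
    now: int,
    max_skew_s: int = 300,
) -> bool:
    """Check freshness (within max_skew_s of now) and the signature."""
    if abs(now - ts) > max_skew_s:
        return False  # outside the replay window
    expected = sign_request(secret, ts, nonce, method, path, body)
    return hmac.compare_digest(expected, signature)  # constant-time comparison
```

An attacker who captures a signed request sees every input except the secret, so they can neither alter the body nor produce a signature for a fresh timestamp.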


What the SDK does for you

| Concern | What we do | Why |
|---|---|---|
| Authentication | Adds X-AgentGuard-Key header from your API key (bearer identifier) | BE looks up the agent row by api_key |
| Request signing | HMAC-SHA256 over {ts, nonce, method, path, sha256(body)} keyed by signing_secret; X-AgentGuard-Signature: v1=<hex> | Replay protection (BE rejects requests outside ±5 min) and tamper detection. signing_secret is held only by you, never transmitted |
| Retry on transient failure | 3 attempts, full-jitter exponential backoff (250 ms → 8 s cap) | Handles MrProbe deploys + DNS blips without manual retry code |
| Outbound redaction | Regex-strips JWTs, AWS keys, GCP private keys, OpenAI / Anthropic / Google / Stripe / GitHub / Slack tokens before send | Prevents your agent from accidentally exfiltrating secrets to MrProbe |
| Eager validation | Length + enum checks at the call site | 422-round-trip turns into a ValueError with a clear message |
| Connection pooling | One httpx.Client (or AsyncClient) per Client instance | Single keep-alive socket; sub-50ms calls |
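
For readers unfamiliar with full-jitter backoff, the retry row can be sketched as below. This is the generic technique with the defaults quoted in the table (3 attempts, 0.25 s base, 8 s cap), not the SDK's RetryPolicy implementation; the function names and the is_transient hook are hypothetical.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def full_jitter_delay(
    attempt: int,
    base_s: float = 0.25,
    cap_s: float = 8.0,
    rng: Callable[[], float] = random.random,
) -> float:
    """Delay before retry `attempt` (0-based): uniform in [0, min(cap, base * 2**attempt))."""
    return rng() * min(cap_s, base_s * (2 ** attempt))


def call_with_retry(
    fn: Callable[[], T],
    *,
    max_attempts: int = 3,
    is_transient: Callable[[Exception], bool] = lambda exc: True,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn(), retrying transient failures with full-jitter backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            last_attempt = attempt == max_attempts - 1
            if last_attempt or not is_transient(exc):
                raise  # give up: out of attempts or not worth retrying
            sleep(full_jitter_delay(attempt))
    raise AssertionError("unreachable")
```

Drawing the delay uniformly from the whole interval, and not only adding a small random offset, spreads retries from many clients so they do not hit the server in lockstep after an outage.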

What the SDK does NOT do (intentional v1 scope):

  • No custom in-memory queue / batch / drain semantics — every observe() is a single immediate POST. If you need batching, wrap the SDK in your own queue.
  • No automatic correlation between inbound + outbound. You pass attack_id explicitly so a future SDK refactor can't invent a hidden-state bug.
  • No DLP — the redaction bank is intentionally small (well-known token shapes only). Layer your own DLP in front for full coverage.
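
If you do want to queue observations off the hot path, a thin wrapper is enough. The sketch below uses a bounded in-memory queue drained by one worker thread. ObservationQueue is a hypothetical helper, not part of the SDK, and `send` is any callable that accepts the observe() keyword arguments, such as client.observe or the safe_observe wrapper shown earlier.

```python
import logging
import queue
import threading
from typing import Any, Callable

log = logging.getLogger(__name__)


class ObservationQueue:
    """Buffer observations in memory and send them from one worker thread."""

    def __init__(self, send: Callable[..., Any], max_pending: int = 1000) -> None:
        self._send = send
        self._queue = queue.Queue(maxsize=max_pending)
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def submit(self, **fields: Any) -> bool:
        """Enqueue without blocking; return False if the buffer is full."""
        try:
            self._queue.put_nowait(fields)
            return True
        except queue.Full:
            return False

    def close(self) -> None:
        """Flush everything already queued, then stop the worker."""
        self._queue.put(None)  # sentinel: processed after all earlier items
        self._worker.join()

    def _drain(self) -> None:
        while True:
            item = self._queue.get()
            if item is None:
                return
            try:
                self._send(**item)
            except Exception:  # noqa: BLE001 - one failed send must not kill the worker
                log.exception("observation dropped")
```

Each queued item still becomes one POST; the wrapper only moves the network call off the calling thread. Call close() during shutdown so pending observations are flushed.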

Testing your integration

After registering an Email Agent in MrProbe, run a canary attack against your registered agent:

  1. In the MrProbe UI, open the agent → Connection check → "Send canary".
  2. MrProbe sends a benign test message to your mailbox.
  3. Your agent processes it (no harm — the canary contains no instructions).
  4. Your agent calls client.observe(...) with the canary's attack_id.
  5. MrProbe shows a green tick on the End-to-end ready card.

Both halves of the loop must be green before you run a real security campaign.


Connection check from agentguard_observe

from agentguard_observe import Client

with Client(
    api_key=KEY,
    agent_id=AGENT_ID,
    signing_secret=SIGNING_SECRET,
) as client:
    # No-op observation against a sentinel attack id — verifies auth +
    # signing without polluting any real campaign.
    try:
        client.observe(attack_id="canary-sdk-selftest")
        print("✓ SDK can reach MrProbe and authenticate")
    except Exception as exc:
        print(f"✗ SDK self-test failed: {exc}")

Configuration

| Argument | Default | Description |
|---|---|---|
| api_key | (required) | The webhook key surfaced at agent registration. Treat as a secret. |
| agent_id | (required) | The agent's UUID. |
| base_url | https://api.mrprobe.dev | Override for on-prem / EU-residency / staging. |
| timeout_s | 10.0 | Per-request timeout. |
| retry | RetryPolicy(max_attempts=3, base_delay_s=0.25, max_delay_s=8.0) | Tune via from agentguard_observe.retry import RetryPolicy. |
| redact | True | Set False to disable outbound secret redaction (NOT RECOMMENDED). |
| sign | True | Set False to send unsigned requests during the 30-day deprecation window. |
| http_client | None | Pass your own httpx.Client / httpx.AsyncClient to share its connection pool. |

Sample apps

The samples/ directory ships two complete reference integrations that mirror what the v1 customer base actually deploys:

More samples (Bedrock Lambda, Azure Function, Express middleware) ship in v1.1 — file an issue or ping support@sniffr.ai if you need them sooner.


Versioning + wire-format guarantee

The SDK follows semver. The wire format (request body, headers, signing algorithm) is part of the SDK's public API — any incompatible change on either the SDK or the BE bumps the major version of both, and the SDK ships a vN+1 signing-string variant alongside vN for at least 30 days of overlap.


Licence

MIT — vendor freely.


Support


Download files

Source Distribution

agentguard_observe-0.1.1.tar.gz (56.6 kB view details)

Uploaded Source

Built Distribution

agentguard_observe-0.1.1-py3-none-any.whl (38.0 kB view details)

Uploaded Python 3

File details

Details for the file agentguard_observe-0.1.1.tar.gz.

File metadata

  • Download URL: agentguard_observe-0.1.1.tar.gz
  • Upload date:
  • Size: 56.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.13

File hashes

Hashes for agentguard_observe-0.1.1.tar.gz
Algorithm Hash digest
SHA256 8faa6c97cb147e3bbf40afd5d950d53c676d4f42028d5377027e529f456dc00f
MD5 0f56038f425e02c47c1c9a7c52a6a4ea
BLAKE2b-256 1d8d747db60a0754f580f4bd5f7bcb74b4294a298fbc0416b8f149050ce470b8

File details

Details for the file agentguard_observe-0.1.1-py3-none-any.whl.

File hashes

Hashes for agentguard_observe-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 13261e4b8cf5eb5e0ec620e8bbec4b573c24b1138488fd3192061b52b651701e
MD5 c346890aab37e8a36a7073f70f568d7e
BLAKE2b-256 efc039e17f8c97a5365bee93134c9f1febe8cdffdadede4591b3760dadaf5b90
