Runtime EDR + inline guard + async response actions for AI agents. Watchtower Guard (sub-200ms block/redact/inject) + Watchtower Commands (kill_session/quarantine_memory/switch_system_prompt/disable_tool) + LangChain / LangGraph / OpenAI Assistants / Anthropic SDK monitoring.

Project description

ShieldPi — Inline Guard + Async Response + Runtime EDR for AI Agents

The first AI red-team platform that closes the loop: attack → detect → respond.

  • Watchtower Guard (Tier 1) — sub-200ms inline guard. Block prompt injections, redact secret leaks, inject hardened guardrails — synchronously, in your LLM request path.
  • Watchtower Commands (Tier 2) — async response actions. Kill agent sessions, quarantine memory, switch system prompts, disable tools — issued by the autonomous SOC or your own automation, delivered via webhooks.
  • Runtime Monitor (Tier 0) — zero-code-change agent observability for LangChain, LangGraph, OpenAI Assistants, and the Anthropic SDK.

pip install shieldpi

Watchtower Guard — inline (NEW in 0.6)

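Wrap each LLM turn in synchronous guard.check calls: one on the user input before the model call and, optionally, one on the model output after it. Each call is sub-200ms and fails open on timeout.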

from shieldpi import ShieldPiGuard

guard = ShieldPiGuard(api_key="shpi_live_...")

def chat(user_msg: str, session_id: str) -> str:
    # 1. Pre-flight check on input.
    pre = guard.check(user_input=user_msg, session_id=session_id)
    if pre.action == "block":
        return pre.replacement_text
    if pre.action == "inject":
        user_msg = pre.guardrail_prefix + user_msg

    # 2. Call your LLM normally.
    model_output = call_llm(user_msg)

    # 3. Post-flight check on output (catches leaked credentials/PII).
    post = guard.check(model_output=model_output, session_id=session_id)
    if post.action == "redact":
        return post.replacement_text   # contains [REDACTED:...] markers
    return model_output

The four verdict actions

| verdict.action | When | Your handler should |
|---|---|---|
| allow | No detection | Send the model output unchanged |
| block | High-confidence prompt injection (≥0.85) | Return verdict.replacement_text; do NOT call the LLM |
| redact | Output contains AWS key, JWT, Stripe key, valid Luhn CC, SSN, PEM private key | Return verdict.replacement_text (carries [REDACTED:label] markers) |
| inject | Medium-confidence input pattern (0.55–0.85) | Prepend verdict.guardrail_prefix to the user message before sending to LLM |
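
The table above amounts to a small dispatch on verdict.action. The sketch below shows one way to centralize it. It uses a stand-in `Verdict` dataclass (hypothetical, carrying only the three fields named in this README) instead of the SDK's own verdict type, so it runs without a network call; `resolve_input` and `resolve_output` are illustrative names, not part of the package.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Verdict:
    """Stand-in for a guard verdict; only the fields this README names."""
    action: str                      # "allow" | "block" | "redact" | "inject"
    replacement_text: str = ""
    guardrail_prefix: str = ""


def resolve_input(verdict: Verdict, user_msg: str) -> tuple[Optional[str], Optional[str]]:
    """Return (early_reply, prompt). Exactly one of the two is not None."""
    if verdict.action == "block":
        return verdict.replacement_text, None      # do not call the LLM
    if verdict.action == "inject":
        return None, verdict.guardrail_prefix + user_msg
    return None, user_msg                          # "allow"


def resolve_output(verdict: Verdict, model_output: str) -> str:
    """Return the text that is safe to send back to the user."""
    if verdict.action == "redact":
        return verdict.replacement_text
    return model_output
```

Keeping the two directions separate mirrors the table: block and inject only make sense for input checks, redact only for output checks.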

Async variant

from shieldpi import AsyncShieldPiGuard

async def check_input(msg: str, sid: str):
    # `async with` must run inside a coroutine.
    async with AsyncShieldPiGuard(api_key="shpi_live_...") as guard:
        return await guard.check(user_input=msg, session_id=sid)

Tuning

ShieldPiGuard(
    api_key="shpi_live_...",
    timeout_s=0.5,           # default — fails OPEN on timeout (no DoS via slow guard)
    fail_closed=False,        # set True for compliance-grade strict deny
)

Full docs + tuning knobs: https://shieldpi.io/docs/watchtower-guard
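
To make the two timeout policies concrete, here is a minimal local sketch of fail-open versus fail-closed behavior around a deadline. It is not the SDK's implementation: `check_with_deadline` is a hypothetical helper, the verdict is modeled as a plain dict, and mapping fail_closed to a "block" action is an assumption about what "strict deny" means. Only the `shieldpi_timeout` reason string comes from this README.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout
from typing import Callable

_POOL = ThreadPoolExecutor(max_workers=4)


def check_with_deadline(
    check: Callable[[], dict],
    timeout_s: float = 0.5,
    fail_closed: bool = False,
) -> dict:
    """Run `check`; if it misses the deadline, allow (fail open) or block (fail closed)."""
    future = _POOL.submit(check)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        # Assumption: strict deny is represented as a "block" action.
        action = "block" if fail_closed else "allow"
        return {"action": action, "reason": "shieldpi_timeout"}
```

With the default settings a slow check degrades to "allow", so the request path keeps moving; with fail_closed=True the same slow check denies the request instead.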


Watchtower Commands — async (NEW in 0.6)

For actions that don't fit the inline path: kill an agent session after a verdict, clear poisoned memory, swap to a hardened system prompt mid-session.

Commands flow through your existing webhook URL as command.issued events. Your handler executes the action, then ACKs:

from shieldpi import ShieldPiCommands

commands = ShieldPiCommands(api_key="shpi_live_...")

# In your webhook handler:
def handle_shieldpi_webhook(event: dict) -> None:
    if event["event_type"] != "command.issued":
        return
    cmd_id = event["data"]["command_id"]
    action = event["data"]["action"]

    try:
        if action == "kill_session":
            my_app.terminate(event["data"]["session_id"])
        elif action == "quarantine_memory":
            my_app.snapshot_and_clear_memory(event["data"]["session_id"])
        elif action == "switch_system_prompt":
            my_app.set_system_prompt(event["data"]["session_id"], event["data"]["payload"]["prompt"])
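        elif action == "disable_tool":
            my_app.disable_tool(event["data"]["payload"]["tool_name"])
        else:
            # Unknown action: fail the command instead of ACKing it as executed.
            raise ValueError(f"unsupported action: {action}")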

        commands.ack(cmd_id, status="executed", reason="completed")
    except Exception as exc:
        commands.ack(cmd_id, status="failed", reason=str(exc))
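
As the number of actions grows, the if/elif chain can be replaced with a dispatch table. The sketch below is one possible shape, assuming the event layout shown above (event_type, data.command_id, data.action). `make_webhook_handler` is a hypothetical helper, and `ack` is any callable with the same keyword arguments as commands.ack in the example.

```python
from typing import Any, Callable, Dict

Action = Callable[[Dict[str, Any]], None]
Ack = Callable[..., None]


def make_webhook_handler(actions: Dict[str, Action], ack: Ack) -> Callable[[Dict[str, Any]], None]:
    """Build a handler that routes command.issued events through `actions`."""

    def handle(event: Dict[str, Any]) -> None:
        if event.get("event_type") != "command.issued":
            return                      # not a command; nothing to ACK
        data = event["data"]
        cmd_id = data["command_id"]
        run = actions.get(data["action"])
        if run is None:
            # Unknown action: report failure rather than claiming success.
            ack(cmd_id, status="failed", reason=f"unsupported action: {data['action']}")
            return
        try:
            run(data)
            ack(cmd_id, status="executed", reason="completed")
        except Exception as exc:
            ack(cmd_id, status="failed", reason=str(exc))

    return handle
```

Each entry in `actions` receives the event's data dict, so a kill_session entry would read data["session_id"] and a disable_tool entry would read data["payload"]["tool_name"].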

Issue commands from your own automation

commands.issue(
    action="kill_session",
    session_id="user-abc-123",
    payload={"reason": "anomalous_behavior_detected"},
    ttl_seconds=600,
)
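
One example of "your own automation" is escalating from inline verdicts to a session kill. The sketch below counts block verdicts per session and issues kill_session once when a threshold is reached. `BlockEscalator`, the threshold, and the payload reason string are all hypothetical; `issue` is any callable that accepts the keyword arguments shown in the commands.issue call above.

```python
from collections import defaultdict
from typing import Callable, DefaultDict


class BlockEscalator:
    """Issue kill_session once a session accumulates `threshold` block verdicts."""

    def __init__(self, issue: Callable[..., object], threshold: int = 3) -> None:
        self._issue = issue
        self._threshold = threshold
        self._blocks: DefaultDict[str, int] = defaultdict(int)

    def record(self, session_id: str, action: str) -> bool:
        """Record one verdict action; return True if a kill was issued."""
        if action != "block":
            return False
        self._blocks[session_id] += 1
        if self._blocks[session_id] != self._threshold:
            return False                # below threshold, or already escalated
        self._issue(
            action="kill_session",
            session_id=session_id,
            payload={"reason": "repeated_blocked_inputs"},
            ttl_seconds=600,
        )
        return True
```

In the chat function from the Guard section, this would be called as escalator.record(session_id, pre.action) after each pre-flight check, with commands.issue passed in as `issue`.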

Full docs: https://shieldpi.io/docs/watchtower-commands · https://shieldpi.io/dashboard/auto-respond


Runtime Monitor — zero-code-change (since 0.2)

Every tool call, LLM call, file read, and outbound request your agent makes streams to ShieldPi's detectors in real time.

pip install "shieldpi[all]"
export SHIELDPI_SDK_KEY=shpi_live_...   # https://shieldpi.io/dashboard

# Add one import at the top of your agent process:
import shieldpi.auto

# Use LangChain / LangGraph / OpenAI Assistants / Anthropic SDK normally.
from langchain.agents import AgentExecutor
agent = AgentExecutor(...)
agent.invoke({"input": "..."})   # every tool call + LLM call is captured

Open https://shieldpi.io/dashboard/agent-monitor to see the live event stream and any alerts the detectors fire.
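
Import-time instrumentation of this kind is commonly built by wrapping framework methods once and reporting each call to a collector. The sketch below illustrates that general technique only; it is not ShieldPi's implementation. `patch_method`, the `_demo_patched` marker, and the `emit` callback are hypothetical names.

```python
import functools
from typing import Any, Callable


def patch_method(cls: type, name: str, emit: Callable[[str, tuple], None]) -> bool:
    """Wrap cls.<name> so each call is reported to `emit`. Returns False if already wrapped."""
    original = getattr(cls, name)
    if getattr(original, "_demo_patched", False):
        return False                     # repeated patch attempt: no double-wrap

    @functools.wraps(original)
    def wrapper(self: Any, *args: Any, **kwargs: Any) -> Any:
        try:
            emit(name, args)             # reporting must never break the wrapped call
        except Exception:
            pass
        return original(self, *args, **kwargs)

    wrapper._demo_patched = True
    setattr(cls, name, wrapper)
    return True
```

The marker attribute makes the patch idempotent, and swallowing exceptions from `emit` keeps a failing collector from affecting the agent.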

Supported frameworks

Framework Auto-patch Manual handler Optional dep
LangChain (AgentExecutor) pip install shieldpi[langchain]
LangChain (LCEL tools) ✅ (BaseTool hook) pip install shieldpi[langchain]
LangGraph pip install shieldpi[langgraph]
OpenAI Assistants API pip install shieldpi[openai]
OpenAI Chat Completions w/ tools pip install shieldpi[openai]
Anthropic SDK (tool use) pip install shieldpi[anthropic]
Custom agents (no framework) base install

Manual integration

from shieldpi import Monitor

monitor = Monitor(sdk_key="shpi_live_...")
with monitor.start_session(
    agent_name="invoice-bot",
    stated_goal="help users file invoices",
) as session:
    session.log_user_message("How do I file a Q1 invoice?")
    session.log_tool_call("search_docs", {"query": "Q1 invoice filing"})
    session.log_tool_result("search_docs", {"results": [...]})
    session.log_final_response("Here's how to file a Q1 invoice...")
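
For custom agents with many tools, the paired log_tool_call / log_tool_result calls can be folded into a decorator. This is a sketch under the assumption that both methods take a tool name followed by a dict-like payload, as in the example above; `logged_tool` is a hypothetical helper, and it supports keyword-argument tools only.

```python
import functools
from typing import Any, Callable


def logged_tool(session: Any, name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Decorator: record a keyword-argument tool call and its result on `session`."""

    def decorate(fn: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(fn)
        def wrapper(**kwargs: Any) -> Any:
            session.log_tool_call(name, kwargs)
            result = fn(**kwargs)
            session.log_tool_result(name, result)
            return result

        return wrapper

    return decorate
```

Inside the `with monitor.start_session(...) as session:` block, a tool defined with @logged_tool(session, "search_docs") is then logged on every call without repeating the two session calls by hand.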

Environment variables

| Variable | Purpose |
|---|---|
| SHIELDPI_API_KEY | Customer API key for Guard + Commands (starts with shpi_live_). |
| SHIELDPI_SDK_KEY | Per-target SDK key for Runtime Monitor (starts with shpi_live_). |
| SHIELDPI_BASE_URL | Override the API base URL (defaults to production). |
| SHIELDPI_AGENT_NAME | Logical name for this agent (Monitor only). |
| SHIELDPI_AGENT_GOAL | Stated goal (helps detectors flag off-goal behavior). |
| SHIELDPI_AUTO_INSTRUMENT | Comma-separated framework allowlist. Default: all installed. |
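
As an illustration of the comma-separated allowlist format, the sketch below parses SHIELDPI_AUTO_INSTRUMENT into a set. `auto_instrument_allowlist` is a hypothetical helper, not an SDK function; trimming whitespace and lowercasing are assumptions, as are the framework names used in the usage note.

```python
import os
from typing import Mapping, Optional, Set


def auto_instrument_allowlist(env: Mapping[str, str] = os.environ) -> Optional[Set[str]]:
    """Parse SHIELDPI_AUTO_INSTRUMENT; None means "all installed frameworks"."""
    raw = env.get("SHIELDPI_AUTO_INSTRUMENT", "").strip()
    if not raw:
        return None
    # Drop empty entries so a trailing comma does not produce a blank name.
    return {name.strip().lower() for name in raw.split(",") if name.strip()}
```

For instance, a value of "langchain, openai" would yield the set {"langchain", "openai"}, while an unset or blank variable yields None.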

Safety guarantees

  • Inline Guard fails open by default. A slow ShieldPi never blocks your customer's request. On timeout, the reason field carries shieldpi_timeout for observability. Opt into fail_closed=True if you need strict deny.
  • Monitor never crashes your agent. Every HTTP call is fire-and-forget. Monitoring failures log at WARNING; they never raise.
  • All Monitor patches are idempotent — double-import won't double-wrap.
  • No plaintext storage of customer payloads. Guard stores SHA-256 hashes of user_input and model_output, never the originals. Match snippets are truncated to 160 chars.
  • Commands are opt-in. Watchtower auto-respond defaults to off for every customer. Suggest mode requires human approval per command. Auto mode requires per-action explicit opt-in.
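
The hashing and truncation guarantee can be pictured with the standard library alone. This is a sketch of the idea, not the service's code: the function names are hypothetical, and UTF-8 encoding before hashing is an assumption.

```python
import hashlib

SNIPPET_MAX_CHARS = 160  # truncation limit stated in the list above


def payload_fingerprint(text: str) -> str:
    """SHA-256 hex digest of the UTF-8 encoded payload."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def truncate_snippet(snippet: str, limit: int = SNIPPET_MAX_CHARS) -> str:
    """Keep at most `limit` characters of a match snippet."""
    return snippet[:limit]
```

A digest lets two occurrences of the same payload be correlated without retaining the text itself.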

What's new in 0.6

  • ShieldPiGuard — inline guard client with sync + async variants. Sub-200ms, fail-open by default.
  • GuardVerdict — structured response with .blocked, .redacted, .needs_inject helpers.
  • ShieldPiCommands — async response action client. .issue, .ack, .get, .list, .supported_actions.
  • Engine processing time under 50 ms at p95, measured against the curated 13K-payload corpus.
  • Output-side scanner detects AWS / GCP / Anthropic / OpenAI / Stripe / GitHub credentials, JWTs, PEM private keys, Luhn-validated credit cards, SSN-shape strings, with severity-tiered redaction.
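
To show what "Luhn-validated" means in practice, here is a minimal sketch of a Luhn checksum combined with [REDACTED:label] substitution. It is an illustration of the technique, not the scanner ShieldPi ships: the candidate regex, the 13-digit minimum, and the `credit_card` label are assumptions.

```python
import re

# Card-like runs: 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 13:
        return False
    total = 0
    # Walk right to left, doubling every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def redact_cards(text: str, label: str = "credit_card") -> str:
    """Replace Luhn-valid card-like digit runs with a [REDACTED:label] marker."""
    def _sub(match: re.Match) -> str:
        candidate = match.group(0)
        return f"[REDACTED:{label}]" if luhn_valid(candidate) else candidate
    return CARD_CANDIDATE.sub(_sub, text)
```

Validating the checksum before redacting keeps arbitrary long digit runs, such as order numbers, from being flagged as cards.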

License

Apache-2.0

Download files

Source Distribution

shieldpi-0.6.1.tar.gz (30.0 kB view details)

Uploaded Source

Built Distribution

shieldpi-0.6.1-py3-none-any.whl (29.5 kB view details)

Uploaded Python 3

File details

Details for the file shieldpi-0.6.1.tar.gz.

File metadata

  • Download URL: shieldpi-0.6.1.tar.gz
  • Upload date:
  • Size: 30.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for shieldpi-0.6.1.tar.gz
| Algorithm | Hash digest |
|---|---|
| SHA256 | fb292282ca05d8a86fc416320a724d2b0cf99301ffbbf160e5c804cfa4b71a58 |
| MD5 | 2abcd049430b37b4506cf93752b73a6f |
| BLAKE2b-256 | 52cd4021c0eb3c3158702e147d4f6706632e9e510807b90cc3dd779d0b1e4a5e |

File details

Details for the file shieldpi-0.6.1-py3-none-any.whl.

File metadata

  • Download URL: shieldpi-0.6.1-py3-none-any.whl
  • Upload date:
  • Size: 29.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for shieldpi-0.6.1-py3-none-any.whl
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7368b9b75f9cfac59c807da62907d2fb1d75b4defebf3f5c9ed65facdb5272de |
| MD5 | fb2c57ef435b7c75e4e8ce640d4d9d82 |
| BLAKE2b-256 | 8efed10c1d38ac8952ad6d0cca5f030b6f98ed54c0e28dc4c79243998a8fce1d |
