troy

Audit agent execution traces against configurable policies. Post-hoc auditing with LLM explanations, real-time interception via the guard SDK, and framework adapters for LangChain, OpenAI Agents, and CrewAI.

Installation

Requires Python 3.11+.

pip install troy

With framework adapters:

pip install "troy[langchain]"      # LangChain callback handler
pip install "troy[openai-agents]"  # OpenAI Agents SDK hooks
pip install "troy[crewai]"         # CrewAI global hooks

From source

git clone https://github.com/sentosa-ai/troy.git
cd troy
uv sync

LLM credentials (for audit / audit-batch commands)

cp .env.example .env
# Edit .env with your API key, base URL, and model

Or pass them as flags / environment variables (see Configuration). The check, replay, and policies commands do not require an LLM.

Quick Start

# Audit a single trace
uv run troy audit traces/agent_run.json examples/policy.json

# Batch audit every trace in a directory
uv run troy audit-batch traces/ examples/policy.json

# Replay a previous audit interactively
uv run troy replay logs/2026-02-15/trace3/audit.json

# Replay with a different policy (no LLM calls, instant)
uv run troy replay logs/2026-02-15/trace3/audit.json --policy examples/policy.json

# Dump replay to stdout for piping / CI
uv run troy replay logs/2026-02-15/trace3/audit.json --policy examples/policy.json --no-interactive

Commands

audit — Single trace audit

Runs the full pipeline: graph building, LLM explanation, policy evaluation, scoring, and reporting.

uv run troy audit <trace_file> <policy_file> [OPTIONS]

| Option | Env var | Default | Description |
| --- | --- | --- | --- |
| --output, -o | | logs/{date}/report.md | Markdown report output path |
| --json-output, -j | | logs/{date}/audit.json | JSON report output path |
| --model, -m | TROY_MODEL | gpt-4o-mini | LLM model name |
| --base-url | OPENAI_BASE_URL | | API base URL |
| --api-key | OPENAI_API_KEY | | API key |

Output files:

logs/{date}/
├── report.md              # Markdown audit report
├── audit.json             # Full audit result (replayable)
└── llm_responses/         # Raw LLM responses for audit trail
    ├── step_s1.txt
    ├── step_s2.txt
    └── trace_summary.txt

audit-batch — Batch audit

Audits all .json trace files in a directory concurrently (up to 5 in parallel).

uv run troy audit-batch <trace_dir> <policy_file> [OPTIONS]

Same options as audit (model, base-url, api-key). Generates per-trace reports plus a batch summary:

logs/{date}/
├── summary.md             # Table of all traces with violation counts
├── batch.json             # Full batch result
├── trace1/
│   ├── report.md
│   └── audit.json
└── trace2/
    ├── report.md
    └── audit.json
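
The "up to 5 in parallel" behaviour is a bounded-concurrency pattern. The sketch below shows one common way to get it with asyncio.Semaphore. It is not troy's actual implementation, which is not shown here, and audit_one is a hypothetical stand-in for a single-trace audit.

```python
import asyncio

MAX_PARALLEL = 5  # the limit stated above


async def audit_one(trace_name: str) -> dict:
    """Hypothetical stand-in for a single-trace audit (the real one calls an LLM)."""
    await asyncio.sleep(0.01)
    return {"trace": trace_name, "violations": 0}


async def audit_batch(trace_names: list[str], limit: int = MAX_PARALLEL) -> list[dict]:
    """Audit all traces concurrently, never running more than `limit` at once."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(name: str) -> dict:
        async with semaphore:  # waits here while `limit` audits are in flight
            return await audit_one(name)

    # gather returns results in the same order as the inputs
    return await asyncio.gather(*(bounded(n) for n in trace_names))


if __name__ == "__main__":
    results = asyncio.run(audit_batch([f"trace{i}.json" for i in range(12)]))
    print(len(results), "traces audited")
```

A semaphore keeps the pressure on the LLM endpoint predictable even when the directory holds hundreds of traces.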

replay — Interactive audit replay

Replays a previously generated audit.json in the terminal. No LLM calls needed.

uv run troy replay <audit_file> [OPTIONS]

| Option | Description |
| --- | --- |
| --policy <file> | Re-evaluate with a different policy file (pure computation, instant) |
| --no-interactive | Dump full replay to stdout instead of interactive mode |

Interactive controls:

| Key | Action |
| --- | --- |
| / n | Next step |
| / p | Previous step |
| d | Step detail view |
| s | Trace summary view |
| v | Violations view |
| j / k | Jump to next / previous violation |
| q | Quit |

check — Single action policy check

Evaluate one action against a policy. No LLM needed. Returns JSON. Exit code 0 if allowed, 2 if blocked.

troy check policy.json -a search -i '{"query": "SELECT * FROM users"}'
troy check policy.json -a send_email --mode monitor
troy check policy.json -a bash --metadata '{"permission_level": "admin"}'

policies — Browse and use policy templates

# List all bundled policy templates
troy policies list

# Show rules in a specific policy
troy policies show soc2

# Copy a template to your project
troy policies copy hipaa -o my_policy.json

# Combine multiple templates into one policy
troy policies init -t soc2 -t hipaa -o policy.json

Available templates: minimal, agent_safety, owasp_llm_top10, data_protection, safe_browsing, soc2, hipaa.

How It Works

  1. Ingestion — Loads and validates trace JSON using Pydantic models
  2. Graph Building — Constructs a directed execution graph (NetworkX) representing step dependencies
  3. Explanation — Sends each step + surrounding context to an LLM to infer why the agent made each decision, what data it accessed, what alternatives existed, and what risks are present
  4. Policy Evaluation — Evaluates Python expression conditions against each step with full cross-step context
  5. Scoring — Computes a risk score: min(100, sum(weights of violated rules))
  6. Reporting — Generates markdown and JSON audit reports
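
The scoring formula in step 5 can be written out directly. This is a small sketch, not troy's source. It assumes each violated rule is counted once no matter how many steps trigger it, which the text above does not specify, and it falls back to the documented default weight of 10.

```python
def risk_score(violated_rules: list[dict], default_weight: int = 10) -> int:
    """Sum the weights of the violated rules and cap the total at 100."""
    total = sum(rule.get("weight", default_weight) for rule in violated_rules)
    return min(100, total)


if __name__ == "__main__":
    violated = [
        {"rule_id": "pii-exfiltration-protection", "weight": 50},
        {"rule_id": "missing-approval"},  # hypothetical rule with no explicit weight
    ]
    print(risk_score(violated))  # 50 + 10 = 60
```

The cap means two critical rules weighted 50 and 60 score the same 100 as three of them, so the score marks "worst case reached" rather than ranking traces beyond that point.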

Trace Format

troy consumes traces — it doesn't generate them. Your agent logging system needs to produce JSON in this format:

{
  "trace_id": "trace-001",
  "agent_name": "my-agent",
  "steps": [
    {
      "step_id": "step-1",
      "type": "tool_call",
      "description": "Fetch user profile from database",
      "input": { "user_id": "usr_882" },
      "output": { "name": "Jane Doe", "email": "jane@example.com" },
      "metadata": { "data_classification": "pii" },
      "timestamp": "2026-02-15T11:15:00Z",
      "parent_step_id": null
    }
  ],
  "metadata": {
    "environment": "production",
    "permission_level": "user"
  }
}

Step fields

| Field | Required | Description |
| --- | --- | --- |
| step_id | Yes | Unique identifier referenced in violations and reports |
| type | Yes | One of llm_call, tool_call, decision, observation |
| description | Yes | Human-readable description of what the step does |
| input | Yes | Full inputs: prompts, tool args, queries. Without this, auditing is blind |
| output | Yes | Full outputs: responses, return values. Needed to verify what actually happened |
| metadata | No | Labels like data_classification, network_zone, permission_level, requires_approval. Used by policy rules |
| timestamp | No | ISO 8601 timestamp for ordering and timeline analysis |
| parent_step_id | No | For nested/branching execution (e.g. sub-agent calls) |
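
Before handing a trace to troy, it can help to run a quick pre-flight check against the field table above. The sketch below uses only the standard library. It is not troy's Pydantic model, and validate_trace is a hypothetical helper name. It checks the required step fields, the allowed type values, and step_id uniqueness, nothing more.

```python
REQUIRED_STEP_FIELDS = ("step_id", "type", "description", "input", "output")
STEP_TYPES = {"llm_call", "tool_call", "decision", "observation"}


def validate_trace(trace: dict) -> list[str]:
    """Return a list of problems; an empty list means the steps look well-formed."""
    problems = []
    seen_ids = set()
    for i, step in enumerate(trace.get("steps", [])):
        for field in REQUIRED_STEP_FIELDS:
            if field not in step:
                problems.append(f"step {i}: missing required field '{field}'")
        if "type" in step and step["type"] not in STEP_TYPES:
            problems.append(f"step {i}: unknown type '{step['type']}'")
        step_id = step.get("step_id")
        if step_id is not None:
            if step_id in seen_ids:
                problems.append(f"step {i}: duplicate step_id '{step_id}'")
            seen_ids.add(step_id)
    return problems


if __name__ == "__main__":
    trace = {"steps": [{"step_id": "step-1", "type": "tool_call",
                        "description": "Fetch user profile", "input": {}}]}
    for problem in validate_trace(trace):
        print(problem)
```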

Metadata conventions

Policy rules reference these metadata keys. Annotate your steps with them to enable detection:

| Key | Values | Used by |
| --- | --- | --- |
| data_classification | pii, internal, public | PII exfiltration detection |
| network_zone | external, internal | External data transmission detection |
| permission_level | user, admin | Privilege escalation detection |
| requires_approval | true / false | Mandatory approval checks |
| approval_token | token string or null | Approval verification |
| category | communication, etc. | Communication channel controls |

The more context you log per step, the better the audit. At minimum: capture full inputs and outputs. The auditor infers why the agent made each decision by analyzing the execution chain — what came before, what came after, and how data flowed between steps.

Policy Format

Policies are JSON files containing a list of rules. Each rule has a condition — a Python expression that returns True when the rule is violated.

{
  "policy_id": "my-policy",
  "description": "Safety policy for production agents",
  "rules": [
    {
      "rule_id": "pii-exfiltration-protection",
      "description": "Detects PII handling followed by transmission to external endpoints",
      "condition": "get(step, 'metadata.data_classification') == 'pii' and any_next(lambda s: s['type'] == 'tool_call' and get(s, 'metadata.network_zone') == 'external')",
      "severity": "critical",
      "weight": 50
    }
  ]
}

Rule fields

| Field | Required | Default | Description |
| --- | --- | --- | --- |
| rule_id | Yes | | Unique identifier for the rule |
| description | Yes | | Human-readable description shown in reports |
| condition | Yes | | Python expression (see below). True = violated |
| severity | No | medium | critical, high, medium, low |
| weight | No | 10 | Points added to risk score when violated |
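
The defaults in the table can be made explicit when a policy file is loaded. The sketch below is one way to do that with plain dicts. It is an illustration only: normalize_policy is a hypothetical name, and troy's own loader may behave differently, for example when a required field is missing.

```python
RULE_DEFAULTS = {"severity": "medium", "weight": 10}
REQUIRED_RULE_FIELDS = ("rule_id", "description", "condition")


def normalize_policy(policy: dict) -> dict:
    """Return a copy of the policy with per-rule defaults filled in."""
    rules = []
    for rule in policy.get("rules", []):
        missing = [f for f in REQUIRED_RULE_FIELDS if f not in rule]
        if missing:
            raise ValueError(f"rule {rule.get('rule_id', '?')}: missing {missing}")
        # Values present in the rule override the defaults.
        rules.append({**RULE_DEFAULTS, **rule})
    return {**policy, "rules": rules}


if __name__ == "__main__":
    policy = {"policy_id": "my-policy", "rules": [
        {"rule_id": "no-comms", "description": "Block communication tools",
         "condition": "get(step, 'metadata.category') == 'communication'"},
    ]}
    print(normalize_policy(policy)["rules"][0]["weight"])  # 10
```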

Writing conditions

Conditions are Python expressions evaluated per-step with these variables and helpers in scope:

Variables:

| Variable | Type | Description |
| --- | --- | --- |
| step | dict | Current step being evaluated |
| steps | list[dict] | All steps in the trace |
| step_index | int | Current step's index |
| prev_steps | list[dict] | Steps before the current one |
| next_steps | list[dict] | Steps after the current one |
| trace | dict | Trace-level info: trace_id, agent_name, metadata |
| agent | dict | Agent info: name, metadata (from trace) |

Helper functions:

| Function | Description |
| --- | --- |
| get(d, 'a.b.c', default) | Safe nested dict access via dot-separated path. Returns default (or None) if any key is missing |
| matches(text, pattern) | Case-insensitive regex search. Returns truthy if pattern is found |
| any_step(fn) | True if fn(step_dict) is true for any step in the trace |
| any_next(fn) | True if fn(step_dict) is true for any step after the current one |
| any_prev(fn) | True if fn(step_dict) is true for any step before the current one |
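
To make the helper semantics concrete, here is a stand-alone re-implementation of get and matches that follows the descriptions in the table. It is a sketch for experimenting with conditions outside troy, not the library's source. Coercing text with str() is an assumption.

```python
import re


def get(d, path, default=None):
    """Walk a dot-separated path through nested dicts; return default if any key is missing."""
    current = d
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


def matches(text, pattern):
    """Case-insensitive regex search; returns a match object (truthy) or None."""
    return re.search(pattern, str(text), re.IGNORECASE)


if __name__ == "__main__":
    step = {"type": "tool_call", "metadata": {"network_zone": "external"}}
    print(get(step, "metadata.network_zone"))           # external
    print(get(step, "metadata.approval_token"))         # None
    print(bool(matches("Run As Admin", r"run as admin")))  # True
```

One consequence of this behaviour: a check such as get(step, 'metadata.approval_token') is None is true both when the key is absent and when it is explicitly null.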

Example conditions:

# PII data followed by an external tool call
"get(step, 'metadata.data_classification') == 'pii' and any_next(lambda s: s['type'] == 'tool_call' and get(s, 'metadata.network_zone') == 'external')"

# Prompt injection patterns in step input
"matches(str(step.get('input', {})), r'ignore previous instructions|system update|run as admin')"

# Raw SQL in tool call inputs
"step['type'] == 'tool_call' and matches(get(step, 'input.query', ''), r'SELECT|INSERT|UPDATE|DELETE|DROP|UNION')"

# Missing approval token on steps that require approval
"get(step, 'metadata.requires_approval') is True and get(step, 'metadata.approval_token') is None"

# Non-admin agent accessing admin-level step
"get(agent, 'metadata.permission_level') != 'admin' and get(step, 'metadata.permission_level') == 'admin'"

# Any tool call categorized as communication
"step['type'] == 'tool_call' and get(step, 'metadata.category') == 'communication'"

Malformed or erroring conditions are silently skipped: they won't crash the engine, but a rule whose condition raises an error never reports a violation.

Framework Adapters

All adapters accept an optional metadata_fn callback that maps each action to security metadata. This enables metadata-based policy rules (e.g. network_zone, data_classification, approval_token) when using framework integrations.

from troy.models import StepType

def my_metadata(action: str, input_data: dict, step_type: StepType) -> dict:
    """Map tool/LLM calls to security metadata for policy evaluation."""
    meta = {"network_zone": "internal", "data_classification": "public"}
    if action in ("send_email", "http_request"):
        meta["network_zone"] = "external"
    if step_type == StepType.TOOL_CALL and "pii" in str(input_data):
        meta["data_classification"] = "pii"
    return meta

LangChain:

from troy.adapters.langchain import TroyHandler
handler = TroyHandler(policy="policy.json", metadata_fn=my_metadata)

OpenAI Agents SDK:

from troy.adapters.openai_agents import TroyHooks
hooks = TroyHooks(policy="policy.json", metadata_fn=my_metadata)

CrewAI:

from troy.adapters.crewai import enable_troy
guard = enable_troy(policy="policy.json", metadata_fn=my_metadata)

Without metadata_fn, adapters pass no step metadata, so only rules that match on tool name or input content can fire.

Configuration

LLM settings can be configured in three ways, listed from highest to lowest precedence:

  1. CLI flags: --model, --base-url, --api-key
  2. Environment variables: TROY_MODEL, OPENAI_BASE_URL, OPENAI_API_KEY
  3. .env file (loaded automatically via python-dotenv)

The tool uses the OpenAI client library, so it works with any OpenAI-compatible API (OpenAI, Azure, local models via LiteLLM or Ollama, etc.).
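
The precedence order above amounts to "flag, then environment, then default". The sketch below shows that lookup with a hypothetical resolve_setting helper. It treats the .env file as already merged into the environment, which matches python-dotenv's default of not overriding variables that are already set. How troy wires this internally is not shown here.

```python
import os


def resolve_setting(flag_value, env_name, default=None, environ=None):
    """Return the CLI flag if given, else the environment variable, else the default."""
    environ = os.environ if environ is None else environ
    if flag_value is not None:
        return flag_value
    return environ.get(env_name, default)


if __name__ == "__main__":
    # No --model flag passed: falls back to TROY_MODEL, then to the documented default.
    print(resolve_setting(None, "TROY_MODEL", default="gpt-4o-mini"))
```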

Testing

uv run pytest tests/ -v

Roadmap

  • Drift detection — Detect when agent behavior drifts from established baselines
  • Regression comparison — Compare audit results across trace versions to catch regressions
  • Structured semantic diffing — Diff two traces at the semantic level, not just textual
  • Risk dashboards — Visual dashboard for risk scores and violation trends over time
  • RBAC — Role-based access control for multi-user audit workflows
  • Persistence — SQLite trace storage for trend analysis and cross-session querying
