Audit agent execution traces against configurable policies — CLI, real-time guard SDK, and framework adapters
troy
Audit agent execution traces against configurable policies. Post-hoc auditing with LLM explanations, real-time interception via the guard SDK, and framework adapters for LangChain, OpenAI Agents, and CrewAI.
Installation
Requires Python 3.11+.
pip install troy
With framework adapters:
pip install troy[langchain] # LangChain callback handler
pip install troy[openai-agents] # OpenAI Agents SDK hooks
pip install troy[crewai] # CrewAI global hooks
From source
git clone https://github.com/sentosa-ai/troy.git
cd troy
uv sync
LLM credentials (for audit / audit-batch commands)
cp .env.example .env
# Edit .env with your API key, base URL, and model
Or pass them as flags / environment variables (see Configuration). The check, replay, and policies commands do not require an LLM.
Quick Start
# Audit a single trace
uv run troy audit traces/agent_run.json examples/policy.json
# Batch audit every trace in a directory
uv run troy audit-batch traces/ examples/policy.json
# Replay a previous audit interactively
uv run troy replay logs/2026-02-15/trace3/audit.json
# Replay with a different policy (no LLM calls, instant)
uv run troy replay logs/2026-02-15/trace3/audit.json --policy examples/policy.json
# Dump replay to stdout for piping / CI
uv run troy replay logs/2026-02-15/trace3/audit.json --policy examples/policy.json --no-interactive
Commands
audit — Single trace audit
Runs the full pipeline: graph building, LLM explanation, policy evaluation, scoring, and reporting.
uv run troy audit <trace_file> <policy_file> [OPTIONS]
| Option | Env var | Default | Description |
|---|---|---|---|
| --output, -o | — | logs/{date}/report.md | Markdown report output path |
| --json-output, -j | — | logs/{date}/audit.json | JSON report output path |
| --model, -m | TROY_MODEL | gpt-4o-mini | LLM model name |
| --base-url | OPENAI_BASE_URL | — | API base URL |
| --api-key | OPENAI_API_KEY | — | API key |
Output files:
logs/{date}/
├── report.md # Markdown audit report
├── audit.json # Full audit result (replayable)
└── llm_responses/ # Raw LLM responses for audit trail
    ├── step_s1.txt
    ├── step_s2.txt
    └── trace_summary.txt
audit-batch — Batch audit
Audits all .json trace files in a directory concurrently (up to 5 in parallel).
uv run troy audit-batch <trace_dir> <policy_file> [OPTIONS]
Same options as audit (model, base-url, api-key). Generates per-trace reports plus a batch summary:
logs/{date}/
├── summary.md # Table of all traces with violation counts
├── batch.json # Full batch result
├── trace1/
│   ├── report.md
│   └── audit.json
└── trace2/
    ├── report.md
    └── audit.json
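The "up to 5 in parallel" cap can be pictured as a semaphore around each per-trace audit. The following is a minimal sketch under stated assumptions, not troy's actual implementation: `audit_all`, `audit_one`, and `MAX_PARALLEL` are hypothetical names, and the use of asyncio is an assumption made for illustration.

```python
import asyncio
from typing import Awaitable, Callable

MAX_PARALLEL = 5  # mirrors the documented limit; the constant name is hypothetical


async def audit_all(
    trace_names: list[str],
    audit_one: Callable[[str], Awaitable[str]],
    limit: int = MAX_PARALLEL,
) -> list[str]:
    """Run audit_one over every trace, never more than `limit` at a time."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(name: str) -> str:
        # Each task waits here until one of the `limit` slots is free.
        async with semaphore:
            return await audit_one(name)

    # gather preserves input order in its result list.
    return await asyncio.gather(*(bounded(name) for name in trace_names))
```

A caller would pass a coroutine that audits one trace file and returns its result; the semaphore keeps the number of in-flight LLM-backed audits bounded regardless of how many trace files the directory holds.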
replay — Interactive audit replay
Replays a previously-generated audit.json in the terminal. No LLM calls needed.
uv run troy replay <audit_file> [OPTIONS]
| Option | Description |
|---|---|
| --policy <file> | Re-evaluate with a different policy file (pure computation, instant) |
| --no-interactive | Dump full replay to stdout instead of interactive mode |
Interactive controls:
| Key | Action |
|---|---|
| → / n | Next step |
| ← / p | Previous step |
| d | Step detail view |
| s | Trace summary view |
| v | Violations view |
| j / k | Jump to next / previous violation |
| q | Quit |
check — Single action policy check
Evaluates one action against a policy without calling an LLM. Returns the result as JSON and exits with code 0 if the action is allowed or 2 if it is blocked.
troy check policy.json -a search -i '{"query": "SELECT * FROM users"}'
troy check policy.json -a send_email --mode monitor
troy check policy.json -a bash --metadata '{"permission_level": "admin"}'
policies — Browse and use policy templates
# List all bundled policy templates
troy policies list
# Show rules in a specific policy
troy policies show soc2
# Copy a template to your project
troy policies copy hipaa -o my_policy.json
# Combine multiple templates into one policy
troy policies init -t soc2 -t hipaa -o policy.json
Available templates: minimal, agent_safety, owasp_llm_top10, data_protection, safe_browsing, soc2, hipaa.
How It Works
- Ingestion — Loads and validates trace JSON using Pydantic models
- Graph Building — Constructs a directed execution graph (NetworkX) representing step dependencies
- Explanation — Sends each step + surrounding context to an LLM to infer why the agent made each decision, what data it accessed, what alternatives existed, and what risks are present
- Policy Evaluation — Evaluates Python expression conditions against each step with full cross-step context
- Scoring — Computes a risk score: min(100, sum(weights of violated rules))
- Reporting — Generates markdown and JSON audit reports
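The scoring formula is simple enough to state in a few lines. This is a sketch of the arithmetic only; `risk_score` is a hypothetical name, and whether a rule violated on several steps contributes its weight once or once per step is not specified here, so the function just takes whatever list of weights the caller considers "violated".

```python
def risk_score(violated_weights: list[int]) -> int:
    """Sum the weights of violated rules and cap the total at 100."""
    return min(100, sum(violated_weights))


# One critical rule (weight 50) plus one default-weight rule (10).
print(risk_score([50, 10]))      # 60
# Totals above 100 are clamped.
print(risk_score([50, 50, 25]))  # 100
```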
Trace Format
troy consumes traces — it doesn't generate them. Your agent logging system needs to produce JSON in this format:
{
"trace_id": "trace-001",
"agent_name": "my-agent",
"steps": [
{
"step_id": "step-1",
"type": "tool_call",
"description": "Fetch user profile from database",
"input": { "user_id": "usr_882" },
"output": { "name": "Jane Doe", "email": "jane@example.com" },
"metadata": { "data_classification": "pii" },
"timestamp": "2026-02-15T11:15:00Z",
"parent_step_id": null
}
],
"metadata": {
"environment": "production",
"permission_level": "user"
}
}
Step fields
| Field | Required | Description |
|---|---|---|
| step_id | Yes | Unique identifier referenced in violations and reports |
| type | Yes | One of llm_call, tool_call, decision, observation |
| description | Yes | Human-readable description of what the step does |
| input | Yes | Full inputs — prompts, tool args, queries. Without this, auditing is blind |
| output | Yes | Full outputs — responses, return values. Needed to verify what actually happened |
| metadata | No | Labels like data_classification, network_zone, permission_level, requires_approval. Used by policy rules |
| timestamp | No | ISO 8601 timestamp for ordering and timeline analysis |
| parent_step_id | No | For nested/branching execution (e.g. sub-agent calls) |
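If your logging pipeline emits traces, a quick pre-flight check against the required fields can catch problems before an audit run. The sketch below is a stand-alone illustration using only the standard library; it is not troy's validator (which uses Pydantic models), and `check_trace` is a hypothetical helper.

```python
REQUIRED_STEP_FIELDS = ("step_id", "type", "description", "input", "output")
STEP_TYPES = {"llm_call", "tool_call", "decision", "observation"}


def check_trace(trace: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems: list[str] = []
    seen_ids: set[str] = set()
    for i, step in enumerate(trace.get("steps", [])):
        for field in REQUIRED_STEP_FIELDS:
            if field not in step:
                problems.append(f"step {i}: missing required field '{field}'")
        if "type" in step and step["type"] not in STEP_TYPES:
            problems.append(f"step {i}: unknown type '{step['type']}'")
        step_id = step.get("step_id")
        if step_id is not None:
            # step_id must be unique because violations and reports refer to it.
            if step_id in seen_ids:
                problems.append(f"step {i}: duplicate step_id '{step_id}'")
            seen_ids.add(step_id)
    return problems
```

Running this in the logger's test suite keeps malformed traces from reaching the audit stage, where the real schema validation would reject them.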
Metadata conventions
Policy rules reference these metadata keys. Annotate your steps with them to enable detection:
| Key | Values | Used by |
|---|---|---|
| data_classification | pii, internal, public | PII exfiltration detection |
| network_zone | external, internal | External data transmission detection |
| permission_level | user, admin | Privilege escalation detection |
| requires_approval | true / false | Mandatory approval checks |
| approval_token | token string or null | Approval verification |
| category | communication, etc. | Communication channel controls |
The more context you log per step, the better the audit. At minimum: capture full inputs and outputs. The auditor infers why the agent made each decision by analyzing the execution chain — what came before, what came after, and how data flowed between steps.
Policy Format
Policies are JSON files containing a list of rules. Each rule has a condition — a Python expression that returns True when the rule is violated.
{
"policy_id": "my-policy",
"description": "Safety policy for production agents",
"rules": [
{
"rule_id": "pii-exfiltration-protection",
"description": "Detects PII handling followed by transmission to external endpoints",
"condition": "get(step, 'metadata.data_classification') == 'pii' and any_next(lambda s: s['type'] == 'tool_call' and get(s, 'metadata.network_zone') == 'external')",
"severity": "critical",
"weight": 50
}
]
}
Rule fields
| Field | Required | Default | Description |
|---|---|---|---|
| rule_id | Yes | — | Unique identifier for the rule |
| description | Yes | — | Human-readable description shown in reports |
| condition | Yes | — | Python expression (see below). True = violated |
| severity | No | medium | critical, high, medium, low |
| weight | No | 10 | Points added to risk score when violated |
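When generating policies programmatically, it can help to apply the documented defaults and reject unknown severities up front. A minimal sketch follows; `normalize_rule` is a hypothetical helper and not part of troy's API.

```python
SEVERITIES = ("critical", "high", "medium", "low")
RULE_DEFAULTS = {"severity": "medium", "weight": 10}  # defaults from the table above
REQUIRED_RULE_FIELDS = ("rule_id", "description", "condition")


def normalize_rule(rule: dict) -> dict:
    """Return a copy of the rule with defaults filled in, or raise ValueError."""
    for field in REQUIRED_RULE_FIELDS:
        if field not in rule:
            raise ValueError(f"rule is missing required field '{field}'")
    full = {**RULE_DEFAULTS, **rule}  # explicit values override defaults
    if full["severity"] not in SEVERITIES:
        raise ValueError(f"unknown severity '{full['severity']}'")
    return full
```

For example, a rule that specifies only `rule_id`, `description`, and `condition` comes back with `severity` set to `medium` and `weight` set to 10.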
Writing conditions
Conditions are Python expressions evaluated per-step with these variables and helpers in scope:
Variables:
| Variable | Type | Description |
|---|---|---|
| step | dict | Current step being evaluated |
| steps | list[dict] | All steps in the trace |
| step_index | int | Current step's index |
| prev_steps | list[dict] | Steps before the current one |
| next_steps | list[dict] | Steps after the current one |
| trace | dict | Trace-level info: trace_id, agent_name, metadata |
| agent | dict | Agent info: name, metadata (from trace) |
Helper functions:
| Function | Description |
|---|---|
| get(d, 'a.b.c', default) | Safe nested dict access via dot-separated path. Returns default (or None) if any key is missing |
| matches(text, pattern) | Case-insensitive regex search. Returns truthy if pattern is found |
| any_step(fn) | True if fn(step_dict) is true for any step in the trace |
| any_next(fn) | True if fn(step_dict) is true for any step after the current one |
| any_prev(fn) | True if fn(step_dict) is true for any step before the current one |
Example conditions:
# PII data followed by an external tool call
"get(step, 'metadata.data_classification') == 'pii' and any_next(lambda s: s['type'] == 'tool_call' and get(s, 'metadata.network_zone') == 'external')"
# Prompt injection patterns in step input
"matches(str(step.get('input', {})), r'ignore previous instructions|system update|run as admin')"
# Raw SQL in tool call inputs
"step['type'] == 'tool_call' and matches(get(step, 'input.query', ''), r'SELECT|INSERT|UPDATE|DELETE|DROP|UNION')"
# Missing approval token on steps that require approval
"get(step, 'metadata.requires_approval') is True and get(step, 'metadata.approval_token') is None"
# Non-admin agent accessing admin-level step
"get(agent, 'metadata.permission_level') != 'admin' and get(step, 'metadata.permission_level') == 'admin'"
# Any tool call categorized as communication
"step['type'] == 'tool_call' and get(step, 'metadata.category') == 'communication'"
Malformed or erroring conditions are silently skipped — they won't crash the engine.
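To test conditions before putting them in a policy file, it helps to have a rough model of how they are evaluated. The sketch below re-creates the documented helpers and per-step scope from their descriptions; it is an approximation written for illustration, not troy's engine. The `evaluate` function is hypothetical, the `trace` and `agent` variables are omitted for brevity, and the restricted `__builtins__` dict is a convenience, not a security sandbox.

```python
import re
from typing import Any


def get(d: Any, path: str, default: Any = None) -> Any:
    """Safe nested dict access via a dot-separated path."""
    current = d
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


def matches(text: Any, pattern: str) -> bool:
    """Case-insensitive regex search."""
    return re.search(pattern, str(text), re.IGNORECASE) is not None


def evaluate(condition: str, steps: list[dict], index: int) -> bool:
    """Evaluate one condition for steps[index]; True means the rule is violated."""
    prev_steps, next_steps = steps[:index], steps[index + 1:]
    # Everything goes in one globals dict so lambdas inside the
    # condition can still see the helpers when they are called.
    scope = {
        "__builtins__": {"str": str, "len": len, "any": any, "all": all},
        "step": steps[index],
        "steps": steps,
        "step_index": index,
        "prev_steps": prev_steps,
        "next_steps": next_steps,
        "get": get,
        "matches": matches,
        "any_step": lambda fn: any(fn(s) for s in steps),
        "any_next": lambda fn: any(fn(s) for s in next_steps),
        "any_prev": lambda fn: any(fn(s) for s in prev_steps),
    }
    try:
        return bool(eval(condition, scope))
    except Exception:
        # Mirrors the documented behavior: bad conditions are skipped, not fatal.
        return False
```

One practical consequence of the skip-on-error behavior is that a typo in a condition looks the same as a rule that never fires, so exercising each condition against a small hand-written trace is a cheap way to confirm it can return True at all.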
Framework Adapters
All adapters accept an optional metadata_fn callback that maps each action to security metadata. This enables metadata-based policy rules (e.g. network_zone, data_classification, approval_token) when using framework integrations.
from troy.models import StepType

def my_metadata(action: str, input_data: dict, step_type: StepType) -> dict:
    """Map tool/LLM calls to security metadata for policy evaluation."""
    meta = {"network_zone": "internal", "data_classification": "public"}
    if action in ("send_email", "http_request"):
        meta["network_zone"] = "external"
    if step_type == StepType.TOOL_CALL and "pii" in str(input_data):
        meta["data_classification"] = "pii"
    return meta
LangChain:
from troy.adapters.langchain import TroyHandler
handler = TroyHandler(policy="policy.json", metadata_fn=my_metadata)
OpenAI Agents SDK:
from troy.adapters.openai_agents import TroyHooks
hooks = TroyHooks(policy="policy.json", metadata_fn=my_metadata)
CrewAI:
from troy.adapters.crewai import enable_troy
guard = enable_troy(policy="policy.json", metadata_fn=my_metadata)
Without metadata_fn, adapters pass no step metadata — only tool-name and input-content policy rules will fire.
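As the number of tools grows, a lookup table can be easier to review than a chain of `if` statements. The sketch below follows the `(action, input_data, step_type)` signature of the `metadata_fn` example in this section but ignores `step_type`, so it runs without troy installed. The table contents, the `fetch_user_profile` action name, and the `table_metadata` function are hypothetical.

```python
from typing import Any

# Conservative defaults applied to every action.
DEFAULT_META = {"network_zone": "internal", "data_classification": "public"}

# Per-action overrides; keys are tool names as the framework reports them.
ACTION_META = {
    "send_email": {"network_zone": "external", "category": "communication"},
    "http_request": {"network_zone": "external"},
    "fetch_user_profile": {"data_classification": "pii"},  # hypothetical tool
}


def table_metadata(action: str, input_data: dict, step_type: Any) -> dict:
    """Return default metadata merged with any overrides for this action."""
    return {**DEFAULT_META, **ACTION_META.get(action, {})}
```

A function like this could be passed as `metadata_fn=table_metadata` to any of the adapters, assuming they call it with the three positional arguments used in the example in this section.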
Configuration
LLM settings can be configured three ways (in order of precedence):
- CLI flags: --model, --base-url, --api-key
- Environment variables: TROY_MODEL, OPENAI_BASE_URL, OPENAI_API_KEY
- .env file (loaded automatically via python-dotenv)
The tool uses the OpenAI client library, so it works with any OpenAI-compatible API (OpenAI, Azure, local models via LiteLLM/Ollama, etc.).
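The precedence order (flag, then environment variable, then .env file) can be modeled as a simple fallback chain. This is a sketch of the lookup logic only, with `resolve_setting` as a hypothetical name; troy's real loading code may differ, and the .env contents are passed in as a plain dict here to keep the example free of third-party imports.

```python
from typing import Mapping, Optional


def resolve_setting(
    cli_value: Optional[str],
    env_name: str,
    environ: Mapping[str, str],
    dotenv_values: Mapping[str, str],
    default: Optional[str] = None,
) -> Optional[str]:
    """Return the first value found: CLI flag, environment variable, .env entry, default."""
    if cli_value is not None:
        return cli_value
    if env_name in environ:
        return environ[env_name]
    if env_name in dotenv_values:
        return dotenv_values[env_name]
    return default


# The environment variable wins over the .env entry when no flag is given.
model = resolve_setting(
    cli_value=None,
    env_name="TROY_MODEL",
    environ={"TROY_MODEL": "from-env"},
    dotenv_values={"TROY_MODEL": "from-dotenv"},
    default="gpt-4o-mini",
)
print(model)  # from-env
```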
Testing
uv run pytest tests/ -v
Roadmap
- Drift detection — Detect when agent behavior drifts from established baselines
- Regression comparison — Compare audit results across trace versions to catch regressions
- Structured semantic diffing — Diff two traces at the semantic level, not just textual
- Risk dashboards — Visual dashboard for risk scores and violation trends over time
- RBAC — Role-based access control for multi-user audit workflows
- Persistence — SQLite trace storage for trend analysis and cross-session querying