
Local/offline audit for AI security-agent traces: repeated tool-output work, stale replay risk, and token-savings reports.


fraQtl

CAIRN Security Agent Audit

Local/offline audit for AI-pentest traces: repeated tool-output work, stale replay risk, protected-lane blocks, and token-savings reports.



CAIRN audits traces from security agents that call scanners, shells, exploit frameworks, HTTP clients, search tools, and file inspection commands.

It answers one practical question:

Are AI-pentest agents repeatedly re-reading expensive tool outputs, and where
would exact replay be stale or unsafe because target/session state changed?

This repository is the free, open-source audit slice of CAIRN. It is local, CLI-first, and audit-only. It does not run pentests, does not need live target access, and does not auto-serve cached outputs. The commercial CAIRN Runtime is a separate protected sidecar for production reuse decisions.

CAIRN is cache-control for security agents, not generic caching:

Agent trace
  -> normalize tool events
  -> compare protected target/session state
  -> choose the safest action

same work + same protected state      -> EXACT_CACHE
related work + changed/partial state  -> DELTA_SERVE
uncertain or first-seen work           -> LIVE_CALL
unsafe protected-state mismatch        -> BLOCK_REUSE
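
The mapping above can be read as a small decision function. The following is a minimal sketch, not CAIRN's actual implementation: the WorkMatch and StateMatch enums and the choose_action function are hypothetical names introduced here for illustration, and the real audit likely weighs more evidence than two flags. Combinations the mapping does not list fall back to LIVE_CALL as a conservative assumption.

```python
from enum import Enum


class WorkMatch(Enum):
    """Hypothetical: how closely an event matches earlier work."""
    NONE = "none"        # first-seen or uncertain work
    RELATED = "related"  # similar, but not identical, work
    SAME = "same"        # identical work seen before


class StateMatch(Enum):
    """Hypothetical: how the protected target/session state compares."""
    SAME = "same"        # protected state unchanged
    CHANGED = "changed"  # state changed or only partially known
    UNSAFE = "unsafe"    # mismatch that makes reuse unsafe


def choose_action(work: WorkMatch, state: StateMatch) -> str:
    """Return one of the four action labels for a single tool event."""
    if work is WorkMatch.NONE:
        return "LIVE_CALL"      # nothing to reuse yet
    if state is StateMatch.UNSAFE:
        return "BLOCK_REUSE"    # repeated work, but reuse is unsafe
    if work is WorkMatch.SAME and state is StateMatch.SAME:
        return "EXACT_CACHE"    # same work + same protected state
    if work is WorkMatch.RELATED and state is StateMatch.CHANGED:
        return "DELTA_SERVE"    # related work + changed/partial state
    return "LIVE_CALL"          # assumption: anything unlisted goes live


if __name__ == "__main__":
    print(choose_action(WorkMatch.SAME, StateMatch.SAME))
    print(choose_action(WorkMatch.SAME, StateMatch.UNSAFE))
```

Checking the unsafe case before the exact-match case keeps the sketch biased toward refusing reuse, which matches the stated goal of avoiding stale replay.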

Open-core scope: this repo is the open audit slice of CAIRN, not the full runtime product. The protected runtime sidecar, production serving layer, enterprise dashboard/history, custom mappers, support, and commercial deployment are not included here. Use this repo to test whether a repeated-work signal exists in your AI-pentest traces.

What It Does

Given JSON/JSONL security-agent logs, CAIRN:

  • normalizes trace records into a common audit schema,
  • groups repeated tool-output/context work,
  • compares protected target/session state,
  • marks exact-cache stale-risk events,
  • classifies LIVE_CALL, EXACT_CACHE, DELTA_SERVE, and BLOCK_REUSE,
  • estimates point-token and carried-context savings,
  • writes terminal JSON plus summary.json and summary.md; report.html is optional with --html.
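
To make the repeated-work step concrete, here is a minimal sketch of counting re-reads in a list of events. It is an illustration only: count_rereads is a hypothetical helper, and treating two events as "the same work" when their session, tool, and whitespace-normalized command match exactly is an assumption, not a description of how CAIRN groups events.

```python
from collections import defaultdict


def count_rereads(events):
    """Count events whose (session, tool, command) key was already seen.

    Assumption: exact match on a whitespace-normalized command string
    stands in for whatever grouping the real audit applies.
    """
    seen = defaultdict(int)
    rereads = 0
    for event in events:
        command = " ".join(str(event.get("command", "")).split())
        key = (event.get("session_id"), event.get("tool"), command)
        if seen[key] > 0:
            rereads += 1
        seen[key] += 1
    total = len(events)
    pct = 100.0 * rereads / total if total else 0.0
    return {
        "events": total,
        "rereads": rereads,
        "repeated_work_pct": round(pct, 2),
    }


if __name__ == "__main__":
    sample = [
        {"session_id": "run-1", "tool": "shell", "command": "nmap -sV 10.0.0.5"},
        {"session_id": "run-1", "tool": "shell", "command": "cat /etc/passwd"},
        {"session_id": "run-1", "tool": "shell", "command": "nmap  -sV 10.0.0.5"},
    ]
    print(count_rereads(sample))
```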

Quickstart

Install from the repo:

git clone https://github.com/fraqtl-ai/cairn-security-agent-audit.git
cd cairn-security-agent-audit
python3 -m venv .venv
source .venv/bin/activate
python -m pip install -e .

Or install the released package from PyPI:

pip install cairn-security-agent-audit

Run the included pentest sample:

cairn-demo

Terminal-only JSON:

cairn-demo --json-only | less

From a repo clone, ./demo.sh also works without installing.

The sample prints a JSON summary in the terminal and writes JSON/Markdown receipts. For a larger public benchmark HTML example, open:

examples/autopenbench/report.html

Try Your Own Logs

If your trace is JSONL:

cairn-audit \
  --input your_trace.jsonl \
  --out report \
  --price-input-per-m 3.0 \
  --no-cleaned-trace

If your trace is one JSON file:

cairn-audit \
  --input your_trace.json \
  --out report \
  --price-input-per-m 3.0 \
  --no-cleaned-trace

If your traces are a directory of JSON logs:

cairn-audit \
  --input logs/ \
  --glob '*.json' \
  --out report \
  --price-input-per-m 3.0 \
  --no-cleaned-trace

Outputs:

report/summary.json
report/summary.md
report/normalization_summary.json

Input Shape

Preferred input is one JSON object per tool event:

{"session_id":"run-1","step":1,"tool":"shell","command":"nmap -sV 10.0.0.5","output":"PORT 22 open ssh...","output_tokens":900,"before":{"fingerprint":"target-a"},"after":{"fingerprint":"target-a"}}

Useful fields:

session_id or run_id
step index or timestamp
tool/action name
command/action text
stdout/stderr/observation/output text
target/session/provenance hints if available
input/output token counts if available
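
If your agent framework logs events under different keys, a small conversion script can produce the preferred shape before running the audit. The sketch below assumes a hypothetical source record with the keys run, idx, action, args, observation, and tokens_out; those names are invented for illustration, and only the target field names come from the example event above.

```python
import json


def to_cairn_event(record: dict) -> dict:
    """Map one hypothetical framework record to the preferred event shape."""
    event = {
        "session_id": record["run"],
        "step": record["idx"],
        "tool": record["action"],
        "command": record["args"],
        "output": record.get("observation", ""),
    }
    # Token counts are optional; include them only when the source has them.
    if "tokens_out" in record:
        event["output_tokens"] = record["tokens_out"]
    return event


def write_jsonl(records, path):
    """Write one JSON object per line, one line per tool event."""
    with open(path, "w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(to_cairn_event(record)) + "\n")


if __name__ == "__main__":
    records = [{
        "run": "run-1", "idx": 1, "action": "shell",
        "args": "nmap -sV 10.0.0.5",
        "observation": "PORT 22 open ssh...", "tokens_out": 900,
    }]
    write_jsonl(records, "your_trace.jsonl")
```

The resulting file can then be passed to cairn-audit with --input, as in the JSONL example earlier.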

If you do not know whether your export has the right fields, inspect the shape:

cairn-inspect \
  --input your_trace.jsonl \
  --out report/schema_inspection.json

You can share report/schema_inspection.json or one redacted example row without sharing raw logs.

If output or observation text is missing, CAIRN can still show repeated-work and stale-risk structure, but token-savings and delta-serving estimates will be weaker. If target/session fingerprints are missing, CAIRN can still run with conservative proxy fingerprints, but real fingerprints make the protected-lane analysis stronger.

Have Different Logs?

If your logs do not map cleanly, do not prepare a full export first. Send or inspect one redacted event instead. The minimum useful shape is:

session_id, timestamp or step, tool name, tool input/command, output/observation, target/session state if available

See One Redacted Event for exactly what to share and what to redact. One event is enough to adapt the mapper; the full audit can still run locally inside your environment.
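
As a starting point for preparing that one event, the sketch below masks IPv4 addresses in the text fields of a single record. It is only an assumption about what redaction might involve: redact_event and the choice of fields are hypothetical, and the One Redacted Event guide remains the reference for what to remove (hostnames, credentials, and similar data are not handled here).

```python
import re

# Matches dotted-quad IPv4 addresses such as 10.0.0.5.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def redact_event(event: dict, text_fields=("command", "output")) -> dict:
    """Return a copy of the event with IPv4 addresses masked in text fields."""
    redacted = dict(event)
    for field in text_fields:
        value = redacted.get(field)
        if isinstance(value, str):
            redacted[field] = IPV4.sub("<REDACTED_IP>", value)
    return redacted


if __name__ == "__main__":
    event = {
        "session_id": "run-1",
        "step": 1,
        "tool": "shell",
        "command": "nmap -sV 10.0.0.5",
        "output": "PORT 22 open ssh...",
    }
    print(redact_event(event))
```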

What The Report Shows

The report is designed to be readable by product and engineering teams:

Area            What CAIRN reports
Repeated work   Events audited, re-reads, repeated-work percentage
Tool families   Top repeated commands/tools by carried-context savings
Safety          Protected-lane blocks and exact-cache stale-risk events
Opportunities   EXACT_CACHE, DELTA_SERVE, LIVE_CALL, BLOCK_REUSE
Savings         Point tokens avoided, carried-context tokens avoided, dollar estimate
Examples        Concrete commands/actions that created the signal

Public reference result from AutoPenBench / genai-pentest-paper logs:

2,764 tool events audited
1,031 re-reads
37.30% repeated work
548,335 point tokens avoided
3,698,589 carried-context tokens avoided
1,016 protected-lane blocks
0 stale serves
0 false hits

Top repeated families in that public run included:

nmap, curl, ssh, msfconsole, searchsploit, find, cat
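
The arithmetic behind the public reference numbers can be checked by hand. The sketch below recomputes the repeated-work percentage from the event counts and converts the token totals to dollars. Two assumptions are baked in: that --price-input-per-m means US dollars per million input tokens, and that a dollar figure is simply tokens times that price. The source does not state which token total CAIRN uses for its own dollar estimate, so both are shown.

```python
# Figures copied from the public reference result.
events_audited = 2764
rereads = 1031
point_tokens_avoided = 548_335
carried_context_tokens_avoided = 3_698_589

# Assumption: USD per million input tokens, as in the CLI examples.
price_input_per_m = 3.0

repeated_work_pct = 100.0 * rereads / events_audited


def dollars(tokens: int, price_per_m: float) -> float:
    """Convert a token count to dollars at a per-million-token price."""
    return tokens * price_per_m / 1_000_000


if __name__ == "__main__":
    print(f"{repeated_work_pct:.2f}% repeated work")
    print(f"point tokens:    ${dollars(point_tokens_avoided, price_input_per_m):.2f}")
    print(f"carried context: ${dollars(carried_context_tokens_avoided, price_input_per_m):.2f}")
```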

How To Read The Actions

LIVE_CALL

First time seeing this work, or not enough evidence to reuse safely.

EXACT_CACHE

Same work repeated and protected state still matches. Exact reuse would be safe.

DELTA_SERVE

Related work repeated, but the protected state changed or is only partially known, so exact replay is not the safe choice.
Prior output can still reduce the context/reporting burden while the call stays live-aware.

BLOCK_REUSE

Repeated work exists, but the protected target/session state mismatch makes reuse unsafe, so it should be blocked.

Open-Core Model

This repository is MIT-licensed and contains the local audit slice: CLI, schema inspector, sample traces, and JSON/Markdown report generation. HTML output is available, but the main product path is terminal-first.

The paid/commercial layer is CAIRN Runtime: a protected sidecar for production reuse decisions, custom trace mappers, dashboard/history, deployment support, and enterprise licensing.

See Open-Core Model.

Product Direction

This repository is the audit slice, not the full CAIRN runtime product.

Design-partner path:

  1. Offline audit: run CAIRN on existing AI-pentest traces and measure the repeated-work signal.
  2. Local dashboard: review repeated tool families, stale-risk examples, and savings over time without raw logs leaving the customer environment.
  3. Protected runtime sidecar: integrate around one high-volume tool family. CAIRN observes tool calls plus target/session state, exact-caches only inside safe provenance cells, delta-serves when appropriate, and falls back live when state changed or is uncertain.

The goal is not to replay pentest results blindly. The goal is to reduce repeated context/tool-output cost while refusing stale replay across protected target/session changes.

Boundaries

CAIRN Security Agent Audit is:

  • local,
  • audit-only,
  • trace/report oriented,
  • designed to avoid stale replay.

It is not a vulnerability scanner, pentest runner, live target automation, or production serving layer.

License

MIT License. See LICENSE.

Commercial CAIRN Runtime, private integrations, managed deployments, support, and enterprise licensing are separate from this open audit package. See Open-Core Model.

