Runtime governance for AI agents — deterministic fail-closed enforcement. Wraps any agent tool and blocks dangerous calls before execution. Zero LLM calls, zero cloud dependencies, works offline.

ShadowAudit

Runtime governance for AI agents — deterministic fail-closed enforcement with auditor-defensible cryptographic audit logs.

Tests: 226 passed

ShadowAudit sits between your agent and its tools. It evaluates every call before execution and blocks anything that exceeds your risk threshold. Three things differentiate it from horizontal governance toolkits like Microsoft AGT: (1) auditor-defensible cryptographic audit logs — every decision is hash-chained and optionally Ed25519-signed, producing evidence conformity assessors accept; (2) financial-vertical taxonomy depth — built-in Stripe, Plaid, and fintech-specific risk categories out of the box; (3) air-gap-first deployment — single pip install, zero external calls, works inside isolated VPCs and on-prem.

Agent → ShadowAudit Gate → Tool (allowed)
                         → Blocked (AgentActionBlocked raised)

Why ShadowAudit?

Problem	ShadowAudit's Answer
Agents execute arbitrary shell commands	Keyword + regex + AST risk scoring with configurable thresholds
No audit trail for agent decisions	Hash-chained, tamper-evident SQLite audit log with SHA-256 linkage and optional Ed25519 signing
Can't prove compliance to auditors	Professional HTML reports with SOX/PCI-DSS mappings + EU AI Act Annex IV evidence pack generator
Agent behavior drifts over time	Adaptive scoring with behavioral state tracking (K/V metrics)
CI/CD deploys unsafe agents	`--fail-on-ungated` flag blocks deployments
Legal team blocks cloud-dependent tools	Works fully offline — zero external calls
EU AI Act Annex IV evidence required	Built-in evidence pack generator (JSON + HTML)

vs Microsoft Agent Governance Toolkit (AGT)

"AGT is the right horizontal governance toolkit. ShadowAudit is the auditor-defensible, financial-vertical, air-gap-ready layer for regulated workloads. Run both — AGT for breadth, ShadowAudit for the audit evidence your conformity assessor will actually accept."

See docs/POSITIONING.md for a detailed, honest comparison.

Dimension	Microsoft AGT	ShadowAudit
License	MIT	MIT (OSS SDK)
Coverage	All 10 OWASP Agentic risks	3–5 of 10, focused on tool-call execution
Vendor	Microsoft	Independent
Audit log	Standard logging	Hash-chained, Ed25519-signed, tamper-evident
Vertical taxonomies	Generic	Financial / fintech depth (Stripe, Plaid)
Air-gap deployment	Possible but assembly required	First-class — single pip install
EU AI Act evidence pack	Compliance module exists	Annex IV evidence-pack generator built-in
Solo-buyable for SMBs	No	Yes

Hosted dashboard and managed cloud tier in development — contact for early access.

Quick Start

pip install shadowaudit

CLI: 4 commands to get started

# 1. Scan your codebase for ungated AI agent tools
shadowaudit check ./src

# 2. Generate a risk assessment with compliance mappings
shadowaudit assess ./src --taxonomy financial --compliance

# 3. Verify your audit log hasn't been tampered with
shadowaudit verify audit.db

# 4. Analyse decisions and get threshold tuning suggestions
shadowaudit tune --audit-log audit.db

For the full CLI reference (all 8 commands with flags and examples), see docs/CLI.md.

Python API — wrap any tool in 5 lines

from shadowaudit import Gate

gate = Gate()
result = gate.evaluate(
    agent_id="agent-1",
    task_context="shell_tool",
    risk_category="execute",
    payload={"command": "rm -rf /"},
)
print(result.passed)        # False
print(result.risk_score)    # 0.11 (varies by payload)
print(result.reason)        # "drift_detected"
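The call above evaluates a payload directly. To gate an ordinary Python function, one option is a small decorator like the sketch below. It relies only on the `evaluate()` keyword arguments and the `passed` / `reason` attributes shown above. `gated`, `ToolCallBlocked`, and `DemoGate` are hypothetical names introduced for illustration; `DemoGate` is a stand-in so the snippet runs without the package installed, and it is not how the real Gate scores payloads.

```python
from functools import wraps
from types import SimpleNamespace
from typing import Any, Callable


class ToolCallBlocked(Exception):
    """Hypothetical stand-in for the library's own blocked-call exception."""


def gated(gate: Any, agent_id: str, task_context: str, risk_category: str) -> Callable:
    """Decorator: evaluate the call's keyword arguments before running the tool."""
    def decorator(tool: Callable[..., Any]) -> Callable[..., Any]:
        @wraps(tool)
        def wrapper(**payload: Any) -> Any:
            result = gate.evaluate(
                agent_id=agent_id,
                task_context=task_context,
                risk_category=risk_category,
                payload=payload,
            )
            # Fail closed: only an explicit pass lets the tool run.
            if result.passed is not True:
                raise ToolCallBlocked(getattr(result, "reason", "blocked"))
            return tool(**payload)
        return wrapper
    return decorator


class DemoGate:
    """Stand-in with the same evaluate() shape as the call above (assumption)."""
    def evaluate(self, agent_id, task_context, risk_category, payload):
        risky = "rm -rf" in str(payload.get("command", ""))
        return SimpleNamespace(passed=not risky, reason="demo_keyword" if risky else "ok")


@gated(DemoGate(), "agent-1", "shell_tool", "execute")
def run_shell(command: str) -> str:
    # A real tool would execute the command; the demo only echoes it.
    return f"would run: {command}"
```

Swapping `DemoGate()` for a real `Gate()` should work as long as `evaluate()` accepts the same keyword arguments as in the Quick Start call.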

Framework adapters: LangChain (ShadowAuditTool), CrewAI (ShadowAuditCrewAITool), LangGraph (ShadowAuditToolNode), OpenAI Agents SDK (ShadowAuditOpenAITool), and MCP (MCPGatewayServer + ShadowAuditMCPSession). See examples/ for runnable scripts for each.

Features

Tamper-Evident Audit

Every gate decision is recorded in an append-only SQLite log. Entries are hash-chained via SHA-256, so modifying any row breaks the chain. Optional Ed25519 signing cryptographically proves authenticity. Run shadowaudit verify to check the chain, and see examples/tamper_demo.py for a live demonstration.
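To make the hash-chaining idea concrete, here is a minimal standard-library sketch. It is not ShadowAudit's actual schema or storage format: the in-memory list, the `GENESIS` value, and the record fields are all assumptions chosen for illustration. Each entry stores the previous entry's hash, so editing any record invalidates that entry's hash.

```python
import hashlib
import json
from typing import Optional

GENESIS = "0" * 64  # assumed starting value for the first entry's "prev" field


def entry_hash(prev_hash: str, record: dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON form of the record."""
    body = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(log: list, record: dict) -> None:
    """Append a record, linking it to the hash of the previous entry."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"record": record, "prev": prev, "hash": entry_hash(prev, record)})


def verify(log: list) -> Optional[int]:
    """Return the index of the first broken entry, or None if the chain is intact."""
    prev = GENESIS
    for i, entry in enumerate(log):
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["record"]):
            return i
        prev = entry["hash"]
    return None


log = []
append(log, {"agent_id": "agent-1", "passed": False})
append(log, {"agent_id": "agent-1", "passed": True})
append(log, {"agent_id": "agent-2", "passed": True})
assert verify(log) is None

log[1]["record"]["passed"] = False  # tamper with the second entry
print(verify(log))  # index of the first entry whose hash no longer matches
```

A signature over each entry hash would add authenticity on top of this integrity check; that part is omitted here to keep the sketch dependency-free.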

Observe Mode & Bypass

Roll out enforcement gradually with Gate(mode="observe"): decisions are logged but calls are never blocked, and result.metadata["would_have_blocked"] tells you what enforce mode would have done. For human-approved overrides, use the bypass() context manager. Every bypass is recorded in the audit log with a mandatory reason string.

# Shadow mode — log everything, block nothing
gate = Gate(mode="observe")
result = gate.evaluate(agent_id, task, category, payload)
print(result.metadata["would_have_blocked"])   # True if enforce would have blocked

# Bypass with immutable audit trail
with gate.bypass("agent-1", reason="approved by oncall #4521"):
    result = gate.evaluate("agent-1", task, category, payload)

Use shadowaudit tune --audit-log audit.db to analyse block rates per category and get threshold adjustment suggestions.
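The semantics of observe mode and audited bypass can be sketched with a toy class. `MiniGate` and its `decide()` method are hypothetical and exist only to show the behaviour described in this section: observe mode never blocks but still records what would have happened, and a bypass needs a non-empty reason that is written to the log before the override takes effect.

```python
from contextlib import contextmanager


class MiniGate:
    """Toy gate illustrating observe mode and audited bypass; not the real Gate."""

    def __init__(self, mode="enforce"):
        if mode not in ("enforce", "observe"):
            raise ValueError(f"unknown mode: {mode!r}")
        self.mode = mode
        self.audit = []          # in-memory stand-in for the audit log
        self._bypassed = set()   # agents with an active bypass

    @contextmanager
    def bypass(self, agent_id, reason):
        if not reason.strip():
            raise ValueError("a bypass requires a reason")
        # Record the bypass before it takes effect.
        self.audit.append({"event": "bypass", "agent_id": agent_id, "reason": reason})
        self._bypassed.add(agent_id)
        try:
            yield self
        finally:
            self._bypassed.discard(agent_id)

    def decide(self, agent_id, would_block):
        overridden = self.mode == "observe" or agent_id in self._bypassed
        decision = {
            "agent_id": agent_id,
            "passed": (not would_block) or overridden,
            "would_have_blocked": would_block,
        }
        self.audit.append({"event": "decision", **decision})
        return decision


gate = MiniGate(mode="enforce")
blocked = gate.decide("agent-1", would_block=True)
with gate.bypass("agent-1", reason="approved by oncall"):
    allowed = gate.decide("agent-1", would_block=True)
print(blocked["passed"], allowed["passed"])
```

The bypass is scoped to the `with` block, so enforcement resumes as soon as the block exits, even if the body raises.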

Multi-Agent Trust Propagation

FlowTracer tracks how data moves between agents and propagates trust downward. If Agent A processes untrusted web content, any payload that flows from A into Agent B's tool call is automatically tagged UNTRUSTED — regardless of B's own trust level.

from shadowaudit import FlowTracer, TrustLevel

tracer = FlowTracer()
tracer.record_output("web-scraper", scraped_data, trust=TrustLevel.UNTRUSTED)
tracer.record_flow("web-scraper", "summariser", parsed_data)

annotation = tracer.annotate(
    receiving_agent="payment-agent",
    source_agents=["summariser"],
    declared_trust=TrustLevel.SYSTEM,
)
print(annotation.effective_trust)   # TrustLevel.UNTRUSTED
print(annotation.contaminated_by)   # ['web-scraper']
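One way to picture the propagation rule is "effective trust is the minimum over everything upstream". The sketch below is a standalone toy, not FlowTracer's implementation: `Trust`, `MiniTracer`, and the tuple return value are assumptions, and only the two trust levels named above are modelled.

```python
from enum import IntEnum


class Trust(IntEnum):
    """Illustrative levels; lower values mean less trusted."""
    UNTRUSTED = 0
    SYSTEM = 2


class MiniTracer:
    def __init__(self):
        self.output_trust = {}  # agent -> trust level of the data it emitted
        self.upstream = {}      # agent -> set of agents whose data it received

    def record_output(self, agent, trust):
        self.output_trust[agent] = trust

    def record_flow(self, source, target):
        self.upstream.setdefault(target, set()).add(source)

    def annotate(self, source_agents, declared_trust):
        """Walk every upstream agent and take the lowest trust level found."""
        effective = declared_trust
        contaminated_by = []
        seen = set()
        stack = list(source_agents)
        while stack:
            agent = stack.pop()
            if agent in seen:
                continue  # guards against cycles in the flow graph
            seen.add(agent)
            trust = self.output_trust.get(agent)
            if trust is not None and trust < declared_trust:
                contaminated_by.append(agent)
                effective = min(effective, trust)
            stack.extend(self.upstream.get(agent, ()))
        return effective, sorted(contaminated_by)


tracer = MiniTracer()
tracer.record_output("web-scraper", Trust.UNTRUSTED)
tracer.record_flow("web-scraper", "summariser")
effective, tainted = tracer.annotate(["summariser"], declared_trust=Trust.SYSTEM)
print(effective.name, tainted)
```

Taking the minimum means a downstream agent can never raise the trust of data it received, which matches the behaviour described above.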

Vertical Taxonomies

Built-in starter packs across 6 domains: general, financial (32 categories — Stripe, Plaid, wire transfers, KYC/AML), financial crypto (18 categories), healthcare (17 categories), legal, and open banking. Each taxonomy defines risk keywords, threshold deltas, severity levels, and compliance mappings. Build custom taxonomies interactively with shadowaudit build-taxonomy.
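As a rough picture of what a taxonomy entry carries, the sketch below models the four fields listed above as a dataclass. The field names, the severity scale, and the `wire_transfer` example values are assumptions made for illustration; the on-disk format of the built-in packs may differ.

```python
from dataclasses import dataclass

SEVERITIES = ("low", "medium", "high", "critical")  # assumed scale


@dataclass(frozen=True)
class RiskCategory:
    name: str
    keywords: tuple          # risk keywords matched against the payload
    threshold_delta: float   # adjustment applied to the blocking threshold
    severity: str
    compliance: tuple = ()   # labels of the compliance frameworks it maps to


def load_taxonomy(raw: dict) -> dict:
    """Validate a raw mapping and index the categories by name."""
    taxonomy = {}
    for name, spec in raw.items():
        if spec["severity"] not in SEVERITIES:
            raise ValueError(f"{name}: unknown severity {spec['severity']!r}")
        taxonomy[name] = RiskCategory(
            name=name,
            keywords=tuple(k.lower() for k in spec["keywords"]),
            threshold_delta=float(spec["threshold_delta"]),
            severity=spec["severity"],
            compliance=tuple(spec.get("compliance", ())),
        )
    return taxonomy


# Hypothetical category, not taken from the shipped financial pack.
custom = load_taxonomy({
    "wire_transfer": {
        "keywords": ["wire", "SWIFT"],
        "threshold_delta": -0.1,
        "severity": "high",
        "compliance": ["PCI-DSS"],
    },
})
print(custom["wire_transfer"].keywords)
```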

Framework Coverage

First-class adapters for LangChain, CrewAI, LangGraph, OpenAI Agents SDK, and MCP (gateway + in-process). Drop-in wrappers — same interface, automatic enforcement. Works with any tool that has name, description, and run().

Compliance Reporting

Generate professional HTML reports with executive summaries, risk breakdowns, and remediation plans. Built-in OWASP Agentic Top 10 coverage matrix (shadowaudit owasp) and EU AI Act Annex IV evidence pack generator (shadowaudit eu-ai-act) for regulatory submission. For an honest account of what ShadowAudit catches and misses, see docs/THREAT_MODEL.md.

Offline-First

No cloud. No LLM calls. No API keys. SQLite-backed state and audit log. Single pip install shadowaudit deploys everything needed for runtime governance inside air-gapped VPCs and on-prem environments.

CI/CD Integration

shadowaudit check --fail-on-ungated exits non-zero if high-risk tools are ungated. Drop into any pipeline to block unsafe deploys. Trace simulator replays agent execution logs through the gate for regression testing. A labelled corpus of 130 traces (50 benign / 50 risky / 30 edge cases) in tests/corpus/ lets you validate scoring changes before shipping.

Architecture

┌───────────────────────────────────────────────────────────┐
│                      ShadowAudit                           │
├───────────┬───────────┬───────────┬───────────┬───────────┤
│  CLI      │ LangChain │  CrewAI   │  Direct   │  MCP      │
│  (click)  │  Adapter  │  Adapter  │   Gate    │  Gateway  │
├───────────┴───────────┴───────────┴───────────┴───────────┤
│                    Core Gate Engine                        │
│ ┌───────────┐ ┌───────────┐ ┌───────────┐ ┌────────────┐ │
│ │  Scorer   │ │  Taxonomy │ │    FSM    │ │  Audit Log │ │
│ │ pluggable │ │   Loader  │ │ fail-closed│ │Hash-chained│ │
│ └───────────┘ └───────────┘ └───────────┘ │  + Ed25519 │ │
│ ┌───────────┐ ┌───────────┐               └────────────┘ │
│ │   State   │ │   Hash    │                              │
│ │  (SQLite) │ │ (SHA-256) │                              │
│ └───────────┘ └───────────┘                              │
├───────────────────────────────────────────────────────────┤
│                  Assessment & Reporting                    │
│ ┌───────────┐ ┌───────────┐ ┌───────────┐ ┌────────────┐ │
│ │  Scanner  │ │  Reporter │ │ Simulator │ │   Builder  │ │
│ │           │ │  (Jinja2) │ │           │ │            │ │
│ └───────────┘ └───────────┘ └───────────┘ └────────────┘ │
└───────────────────────────────────────────────────────────┘

How a tool call is evaluated

1. Agent calls a tool → intercepted by the framework adapter or direct Gate.evaluate()
2. Taxonomy lookup → finds risk category config (keywords, threshold delta, severity)
3. Scoring → pluggable scorer computes risk score from payload content
4. Threshold comparison → score vs. taxonomy delta determines pass/fail
5. Mode / bypass check → observe mode always passes; active bypass() overrides a block
6. FSM transition → fail-closed state machine: anything not an explicit pass is a block
7. Audit log → decision recorded with timestamp, agent ID, payload hash, and reason
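The steps above can be sketched end to end in a few dozen lines. This is a simplified illustration under stated assumptions, not the Core Gate Engine: the `TAXONOMY` contents, the keyword-fraction scorer, the fixed `threshold` value, and the reason strings are all hypothetical, and bypass handling is left out.

```python
import hashlib
import json
import time

# Hypothetical taxonomy: one category with keywords and a blocking threshold.
TAXONOMY = {"execute": {"keywords": ("rm -rf", "sudo", "curl"), "threshold": 0.3}}


def score(payload, keywords):
    """Toy scorer: fraction of risk keywords that appear in the payload."""
    text = json.dumps(payload, sort_keys=True, default=str).lower()
    hits = sum(1 for keyword in keywords if keyword in text)
    return hits / len(keywords) if keywords else 0.0


def evaluate(agent_id, risk_category, payload, mode="enforce", audit=None):
    risk, reason = None, "ok"
    try:
        config = TAXONOMY[risk_category]            # taxonomy lookup
        risk = score(payload, config["keywords"])   # scoring
        under = risk < config["threshold"]          # threshold comparison
        if not under:
            reason = "threshold_exceeded"
    except Exception:
        # Fail closed: any error during evaluation counts as a block.
        under, reason = False, "evaluation_error"
    would_block = not under
    passed = under or mode == "observe"             # mode check
    record = {
        "timestamp": time.time(),
        "agent_id": agent_id,
        "payload_hash": hashlib.sha256(
            json.dumps(payload, sort_keys=True, default=str).encode("utf-8")
        ).hexdigest(),
        "risk_score": risk,
        "passed": passed is True,                   # only an explicit pass passes
        "would_have_blocked": would_block,
        "reason": reason,
    }
    if audit is not None:
        audit.append(record)                        # audit log
    return record


decisions = []
print(evaluate("agent-1", "execute", {"command": "rm -rf /"}, audit=decisions)["passed"])
print(evaluate("agent-1", "execute", {"command": "ls"}, audit=decisions)["passed"])
```

Note that an unknown risk category lands in the `except` branch and is blocked, which is the fail-closed behaviour the list describes.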

Installation

# Base install — CLI + core gate (click, jinja2)
pip install shadowaudit

# With LangChain adapter
pip install shadowaudit[langchain]

# With CrewAI adapter (Python 3.10–3.12)
pip install shadowaudit[crewai]

# Development
pip install shadowaudit[dev]

Requirements: Python 3.10+

Examples

See the examples/ directory for runnable scripts:

Example	Description
`local_only.py`	Direct Gate usage — no framework dependencies
`langchain_agent.py`	LangChain agent with ShadowAudit-wrapped tools
`hash_chain_demo.py`	Hash-chained audit log with tamper detection
`tamper_demo.py`	Live tamper-evidence demo: corrupt a row, watch the chain break
`fintech_payment_agent.py`	Production-style payment agent with Gate enforcement and retry logic
`langgraph_demo.py`	LangGraph `ShadowAuditToolNode` integration
`eu_ai_act_demo.py`	EU AI Act Annex IV evidence pack generation

Run all examples at once:

python examples/run_all_examples.py

For the full example index (14 scripts covering every v0.4.0 feature), see docs/FEATURES.md.

Testing

Quick smoke test after installing:

shadowaudit --version && \
shadowaudit check . && \
shadowaudit owasp && \
python -c "from shadowaudit import Gate; print(Gate().evaluate(agent_id='smoke-test', task_context='shell_tool', risk_category='execute', payload={'command': 'ls'}).passed)"

For the full testing guide, see docs/TESTING_GUIDE.md.

Project Status

ShadowAudit is v0.4.0 — production-ready for audit-time scanning and assessment workflows; runtime gating is in early-adopter use. APIs may evolve before v1.0.0; breaking changes require a major version bump and migration guide.

✅ Core gate + 5 framework adapters (LangChain, CrewAI, LangGraph, OpenAI Agents, MCP)
✅ Hash-chained, Ed25519-signed audit log with integrity verification
✅ Observe mode, bypass context manager, and threshold tuning CLI
✅ Multi-agent trust propagation via FlowTracer
✅ Vertical taxonomies (general, financial 32-cat, financial_crypto, healthcare, legal, Plaid) + interactive builder
✅ Labelled test corpus (130 traces) + scorer benchmark
✅ Compliance reporting (OWASP matrix, EU AI Act Annex IV evidence packs)
✅ Honest threat model — what ShadowAudit catches and what it doesn't (docs/THREAT_MODEL.md)
✅ Offline-first — zero external calls, air-gap ready

Contributing

Bug reports and pull requests are welcome. See CONTRIBUTING.md for development setup, testing, and the PR process.

git clone https://github.com/AnshumanKumar14/shadowaudit-python.git
cd shadowaudit-python
pip install -e ".[dev,langchain]"
pytest tests/ -q

License

MIT — see LICENSE.

Built by Anshuman Kumar

Release history

Version	Release date
0.6.3	May 19, 2026
0.6.2	May 14, 2026
0.6.1	May 12, 2026
0.6.0	May 12, 2026
0.5.0 (this version)	May 11, 2026
0.4.0	May 8, 2026
0.3.3	May 8, 2026
0.3.2	May 7, 2026
0.3.1	May 7, 2026
0.3.0	May 7, 2026

Download files

Source Distribution

shadowaudit-0.5.0.tar.gz (139.9 kB)

Uploaded May 11, 2026 Source

Built Distribution

shadowaudit-0.5.0-py3-none-any.whl (99.8 kB)

Uploaded May 11, 2026 Python 3

File details

Details for the file shadowaudit-0.5.0.tar.gz.

File metadata

Download URL: shadowaudit-0.5.0.tar.gz
Upload date: May 11, 2026
Size: 139.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for shadowaudit-0.5.0.tar.gz
Algorithm	Hash digest
SHA256	`52e311597e17005973f3dccbc52760a728a2a7fedea4786bac879db7f8031b06`
MD5	`141b5ee30f94f0293504e30b1d54e655`
BLAKE2b-256	`9951dbe68c9de171079156315e47886d9ea46e6d1a2dcc69a07e69facbc75288`

File details

Details for the file shadowaudit-0.5.0-py3-none-any.whl.

File metadata

Download URL: shadowaudit-0.5.0-py3-none-any.whl
Upload date: May 11, 2026
Size: 99.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for shadowaudit-0.5.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a4f0b1669dfc6235486fd1e3dd3f59b56ca297b263750809c5e3d1bcc539520f`
MD5	`375d3305a7b0d65fa7526221a7f5240b`
BLAKE2b-256	`b9d63b2f8af5873d56966de7d9b8e24abdf69e1f562df73e8333457c6cbb7a89`
