Privacy-preserving audit framework for multi-agent AI systems. Detects cross-agent data leaks, inference attacks, and compliance violations — without accessing raw content.
Federated Agent Audit
Behavior tracing + federated audit for any multi-agent system — see what your agents do and catch privacy & compliance risks, with the central auditor never seeing raw content.
pip install federated-agent-audit
Two pillars, framework-agnostic and scenario-agnostic:
- Behavior tracing — capture the real agent-to-agent interaction graph (who sent what to whom, tool calls, hand-offs) from CrewAI · LangGraph · AutoGen · OpenAI Agents · LlamaIndex, or any custom orchestration.
- Federated desensitized audit — each agent audits locally; the central auditor only ever sees hashed, pseudonymized, DP-noised metadata. It detects compositional privacy/compliance risks that emerge across agents — and never sees raw content, by architecture.
Think LangSmith/Langfuse for multi-agent systems, but federated: your prompts and outputs never leave the agents' own environments.
Who's this for? Anyone running a multi-agent system who needs to observe and govern its behavior — with extra pull for teams who can't ship raw prompts to a third-party observability vendor (regulated data, on-prem, data residency). A single-LLM-app on-ramp is built in (see the firewall below).
30-Second Quick Start
New here? Start with the firewall — it works on a single LLM call, no multi-agent setup needed. The multi-agent audit is the depth you grow into.
from federated_agent_audit import scan
result = scan("Zhang Wei's SSN is 123-45-6789, salary $185,000")
print(result["clean"]) # False
print(result["detected"]) # ['SSN', 'salary']
print(result["text"]) # "Zhang Wei's [REDACTED] is [SSN], [REDACTED] [DOLLAR_AMOUNT]"
echo "credit card 4532-1234-5678-9012" | federated-audit scan
# REDACTED Detected: credit card
Protect Your LLM Calls
Intercept every OpenAI/Anthropic response automatically; this is the single-app on-ramp. It is production-hardened: it fails open (the firewall can't crash your app), cuts a stream off as soon as the accumulated output contains a violation, and inspects tool-call arguments for sensitive content.
from federated_agent_audit import firewall
fw = firewall(["salary", "SSN", "diagnosis"])
fw.patch_openai() # done — every response (incl. streaming + tool calls) is now checked
response = client.chat.completions.create(model="gpt-4o", messages=[...])
# Sensitive content in the response is already redacted
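To make the streaming and fail-open behavior concrete, here is a minimal sketch of the idea. It is not the library's implementation: `guard_stream` and the `PATTERNS` table are hypothetical names, and a single SSN regex stands in for real detectors. It shows why the check has to run on the accumulated text rather than on each chunk, since a sensitive value can be split across chunk boundaries.

```python
import re
from typing import Iterable, Iterator

# Hypothetical pattern table; a real deployment would use richer detectors.
PATTERNS = {"SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b")}


def guard_stream(chunks: Iterable[str]) -> Iterator[str]:
    """Yield chunks until the accumulated text matches a sensitive pattern."""
    seen = ""
    for chunk in chunks:
        try:
            seen += chunk
            hit = next((name for name, rx in PATTERNS.items() if rx.search(seen)), None)
        except Exception:
            hit = None  # fail-open: a scanner error must not break the caller
        if hit is not None:
            yield f"[BLOCKED: {hit}]"
            return  # stop the stream at the chunk that completes the match
        yield chunk


leaky = list(guard_stream(["My SSN is 123-", "45-67", "89 thanks"]))
benign = list(guard_stream(["hello ", "world"]))
print(leaky)
print(benign)
```

Note the limitation of this simple version: chunks emitted before the match completes have already reached the caller. Holding back a short tail of each chunk would close that gap at the cost of latency.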
The Problem
Multi-agent systems (CrewAI, LangGraph, AutoGen, OpenAI Agents, LlamaIndex) create compound privacy risks that single-agent tools can't detect:
- Agent A shares salary data with Agent B (allowed by A's policy)
- Agent B forwards a "summary" to an external partner (allowed by B's policy)
- Result: salary leaked outside the company — neither agent broke its own rules
Existing observability tools (LangSmith, Langfuse) require uploading raw prompts to their servers. This framework audits agent interactions without the central auditor ever seeing raw content.
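The salary scenario above can be reproduced in a few lines. This is an illustrative sketch only: the `Policy` class, the agent names, and the `INTERNAL` set are assumptions made for the example, not the library's data model. Every hop passes its sender's local check, yet the end-to-end path still carries sensitive data outside the company.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    # Hypothetical local policy: the recipients an agent is allowed to send to.
    allowed_recipients: set[str] = field(default_factory=set)


POLICIES = {
    "agent_a": Policy({"agent_b"}),
    "agent_b": Policy({"external_partner"}),
}
INTERNAL = {"agent_a", "agent_b"}


def hop_allowed(sender: str, recipient: str) -> bool:
    """Local check: does the sender's own policy permit this hop?"""
    return recipient in POLICIES[sender].allowed_recipients


def path_leaks(path: list[str], sensitive: bool) -> bool:
    """Network check: does sensitive data end up outside the company?"""
    return sensitive and path[-1] not in INTERNAL


path = ["agent_a", "agent_b", "external_partner"]
local_ok = all(hop_allowed(s, r) for s, r in zip(path, path[1:]))
leaked = path_leaks(path, sensitive=True)
print(local_ok, leaked)  # True True
```

Only a view across both hops reveals the problem, which is why the audit has a network phase on top of the local one.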
📖 Worked case study — a leak that emerges only from
combining two policy-compliant agents, caught with the raw PHI/PII never
leaving the agents' environments (python examples/case_study_healthcare_leak.py).
               +---------------+
               |    Central    |   Phase 2: Network audit
               |    Auditor    |   (desensitized metadata only)
               +-------+-------+
                       |
       +---------------+---------------+
       |               |               |
+------+------+   +----+----+ +--------+------+
| Local Audit |   |  Local  | |  Local Audit  |   Phase 1
|  (Agent A)  |   | (Agt B) | |   (Agent C)   |
+-------------+   +---------+ +---------------+
  raw content     raw content    raw content
  stays here      stays here     stays here
See ARCHITECTURE.md for the two-party model (edge vs. center), deployment topologies, and the tamper-evidence guarantees.
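As a rough picture of what "desensitized metadata only" can mean in practice, the sketch below builds the kind of record that might cross from an agent to the center. Everything here is an assumption for illustration: the field names, the `SALT` key, the `span_count` parameter, and the choice of a Laplace mechanism on a count with sensitivity 1. The library's actual 6-layer desensitizer is not reproduced.

```python
import hashlib
import hmac
import random

SALT = b"per-deployment-secret"  # hypothetical key; stays at the edge


def pseudonym(agent_id: str) -> str:
    """Stable keyed pseudonym, so the center can link edges without real names."""
    return "agent_" + hmac.new(SALT, agent_id.encode(), hashlib.sha256).hexdigest()[:12]


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Laplace(0, scale) sample: the difference of two exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def desensitize(sender: str, recipient: str, content: str, domains: set[str],
                span_count: int, epsilon: float, rng: random.Random) -> dict:
    """Build the only record that would leave the agent's environment."""
    return {
        "from": pseudonym(sender),
        "to": pseudonym(recipient),
        "content_hash": hmac.new(SALT, content.encode(), hashlib.sha256).hexdigest(),
        "domains": sorted(domains),
        # Assumes a count with sensitivity 1, so the Laplace scale is 1 / epsilon.
        "sensitive_spans_noised": span_count + laplace_noise(1.0 / epsilon, rng),
    }


report = desensitize("hr_bot", "summary_bot", "Zhang Wei earns $185k",
                     {"finance"}, span_count=2, epsilon=1.0, rng=random.Random(0))
print(report)
```

The center can still build an interaction graph from the pseudonyms and domains, but the record carries neither the text nor the real agent names.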
Multi-Agent Trace & Audit
The integrations capture the real agent-to-agent interaction graph — who
sent what to whom — which is exactly what the compositional / cascade /
cross-domain detectors analyze. Everything is built on MultiAgentTracer,
which works with any framework (or none):
from federated_agent_audit import MultiAgentTracer, PrivacyPolicy
tracer = MultiAgentTracer()
tracer.register_agent("hr_bot", PrivacyPolicy(agent_id="hr_bot", must_not_share=["salary"]))
# Each call is a real directed edge; taint (domains, sensitivity, origin,
# hop count) propagates across hops automatically.
tracer.record_handoff("hr_bot", "summary_bot", "Zhang Wei earns $185k", origin="zhang_wei")
tracer.record_handoff("summary_bot", "external_bot", "candidate compensation summary")
result = tracer.network_audit() # Phase-2 central audit
incidents = tracer.aggregated() # denoised, actionable alerts
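For intuition about what taint propagation across hops involves, here is a deliberately small stand-in. `TinyTaintTracker` and its methods are hypothetical and are not the library's `TaintTracker` or `MultiAgentTracer`. The sketch is conservative: it assumes everything a sender holds flows along every hand-off.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Taint:
    origin: str
    domains: frozenset[str]
    hops: int = 0


class TinyTaintTracker:
    """Hypothetical stand-in used only to illustrate hop-by-hop propagation."""

    def __init__(self) -> None:
        self.held: dict[str, list[Taint]] = {}

    def seed(self, agent: str, origin: str, domains: set[str]) -> None:
        self.held.setdefault(agent, []).append(Taint(origin, frozenset(domains)))

    def handoff(self, sender: str, recipient: str) -> None:
        # Copy each taint to the recipient with the hop count incremented.
        for t in list(self.held.get(sender, [])):
            moved = Taint(t.origin, t.domains, t.hops + 1)
            self.held.setdefault(recipient, []).append(moved)

    def reach(self, origin: str) -> set[str]:
        """Agents that hold any data derived from the given origin."""
        return {a for a, ts in self.held.items() if any(t.origin == origin for t in ts)}


tracker = TinyTaintTracker()
tracker.seed("hr_bot", "zhang_wei", {"finance"})
tracker.handoff("hr_bot", "summary_bot")
tracker.handoff("summary_bot", "external_bot")
print(sorted(tracker.reach("zhang_wei")))    # ['external_bot', 'hr_bot', 'summary_bot']
print(tracker.held["external_bot"][0].hops)  # 2
```

The second hand-off carries a harmless-looking summary, but the taint still records that it derives from `zhang_wei` two hops back.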
Tracing, not just auditing. See what your agents did — chronologically and desensitized — whether or not anything went wrong. No raw content, ever:
tracer.timeline() # [{seq, agent, to, action, domains, sensitivity, local_action, timestamp}, ...]
tracer.summary() # per-agent sent/received/internal counts + domains touched
tracer.export() # full interaction graph as a JSON-able dict (no raw text — hashes + metadata)
It catches the compound leak no single agent's policy can see — and the central
auditor still never touched the raw data (python examples/multiagent_trace_demo.py):
Incidents: 5 alert_summary={'critical': 3, 'high': 2}
[CRITICAL] cross_domain_leak — Sensitive health data reaches social domain via 2-agent chain
[CRITICAL] cross_domain_leak — Sensitive finance data reaches social domain via 2-agent chain
[CRITICAL] taint_spreading — Data from origin 'zhang_wei' spread to 4 agents across the network
[HIGH] inference_accumulation — external_bot accumulated high inference risk (77%)
[HIGH] compound_scope_escalation — 3 agent pairs exceed authorized scope
Privacy verification (central reports): hr_bot → clean health_bot → clean summary_bot → clean
Framework Integrations
# CrewAI — captures agent delegation (Delegate/Ask coworker) as A→B edges
from federated_agent_audit.sdk import crew_audit
crew = crew_audit(crew, default_policy=policy) # or policies={role: policy}
crew.kickoff()
result = crew._federated_tracer.network_audit()
# LangChain / LangGraph — per-node identity + node-to-node hand-offs
from federated_agent_audit.sdk import langchain_callback
handler = langchain_callback(default_policy=policy) # asynchronous=True for async graphs
graph.invoke(input, config={"callbacks": [handler]})
result = handler.tracer.network_audit()
# AutoGen / AG2 — hooks every agent-to-agent message
from federated_agent_audit.sdk import autogen_audit
tracer = autogen_audit([assistant, user_proxy, critic], default_policy=policy)
user_proxy.initiate_chat(assistant, message="...")
result = tracer.network_audit()
# OpenAI Agents SDK — captures first-class handoffs
from federated_agent_audit.sdk import openai_agents_hooks
hooks = openai_agents_hooks(default_policy=policy)
await Runner.run(triage_agent, input="...", hooks=hooks)
result = hooks.tracer.network_audit()
# LlamaIndex AgentWorkflow — captures hand-offs from the event stream
from federated_agent_audit.sdk import llamaindex_handler
h = llamaindex_handler(default_policy=policy)
async for event in workflow.run(user_msg="...").stream_events():
    h.handle_event(event)
result = h.tracer.network_audit()
# Generic Python — single-agent decorator
from federated_agent_audit import audited
@audited(policy, to_agent="downstream")
def my_agent(input_text: str) -> str:
    return process(input_text)
What It Detects
| Risk | What happens | How we catch it |
|---|---|---|
| Cross-domain leak | Health data reaches a social/external agent | Domain boundary analysis on metadata |
| Cross-owner leak | My agent leaks my private data to another user's agent | Owner-boundary analysis (taint origin vs recipient owner) |
| Compositional inference | Agent collects health + identity = reidentification | Quasi-identifier assembly detection |
| Aggregation attack | 3 agents each share a fragment → hub reconstructs full profile | Multi-source convergence analysis |
| Cascading injection | Prompt injection propagates agent-to-agent like a worm | Infection tree + patient-zero attribution |
| Collusion | Two agents exchange complementary data to reconstruct a profile | Bidirectional complementary-flow detection |
| Behavioral drift | Agent suddenly changes behavior (possible compromise) | Cross-session z-score monitoring |
| Negative inference | "I can't share that" confirms the data exists | Refusal pattern detection |
| Regulatory gap | EU AI Act / GDPR / CA SB 243 / COPPA requirements unmet | Per-article compliance scoring |
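As one example of how a row in this table can work, the sketch below shows z-score monitoring for behavioral drift. It is a generic illustration, not the library's detector: the metric (outbound messages per session), the baseline values, and the alert threshold of 3.0 are all assumptions.

```python
from statistics import mean, stdev


def drift_zscore(history: list[float], current: float) -> float:
    """Z-score of the current session's metric against past sessions."""
    if len(history) < 2:
        return 0.0  # not enough baseline to judge
    sigma = stdev(history)
    if sigma == 0:
        return 0.0 if current == history[0] else float("inf")
    return (current - mean(history)) / sigma


# Hypothetical metric: outbound messages per session for one agent.
baseline = [10, 12, 11, 9, 13]
normal = drift_zscore(baseline, 11)
spike = drift_zscore(baseline, 40)
print(normal)       # 0.0
print(spike > 3.0)  # True
```

Because the metric is a count rather than content, this kind of check can run on desensitized metadata.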
Detection Effectiveness
A labeled benchmark of multi-agent scenarios (real compositional leaks vs. benign traffic) measures detection quality, not just speed:
python benchmarks/detection_eval.py # precision / recall / F1
python benchmarks/detection_eval.py --sweep # threshold robustness
On the current set (33 scenarios: 19 leak + 14 benign, incl. adversarial
cases — noise-buried leaks, diamond multi-path, same-domain laundering, an
injection worm, sensitivity-under-reporting evasion, multi-origin aggregation,
slow-drip identity assembly, cross-owner group leaks, collusion, high-volume
benign hubs, cross-subject convergence) the pipeline
reaches precision 1.0 / recall 1.0 / F1 1.0 with zero raw-content
leakage into central reports, stable across thresholds 0.3–0.8. Pure
structural signals (topology, timing, behavioral) are reported separately and
not counted as privacy-leak detections. tests/test_detection_benchmark.py
locks the metrics as a regression gate.
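For readers who want to check the arithmetic, the metrics follow from a confusion matrix in the usual way. The helper below is a generic sketch, not the code in `benchmarks/detection_eval.py`. The counts are the ones implied by the reported result: precision 1.0 and recall 1.0 on 19 leak and 14 benign scenarios means 19 true positives, 14 true negatives, and no errors.

```python
def detection_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Precision, recall, specificity and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall,
            "specificity": specificity, "f1": f1}


metrics = detection_metrics(tp=19, fp=0, fn=0, tn=14)
print(metrics)  # {'precision': 1.0, 'recall': 1.0, 'specificity': 1.0, 'f1': 1.0}
```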
Validated live against LangGraph (free, in-suite) and CrewAI + OpenAI streaming (opt-in examples, need an API key).
Accuracy under desensitization + DP
The central auditor never sees raw content, so the real question is whether
detection survives the noise. Running every scenario through the full
pipeline — the 6-layer desensitizer and differential privacy
(python benchmarks/dp_eval.py):
| DP epsilon | recall | specificity | F1 | raw leaks |
|---|---|---|---|---|
| 3.0 | 0.89 | 0.93 | 0.91 | 0 |
| 1.0 | 0.89 | 0.93 | 0.91 | 0 |
| 0.5 | 0.89 | 0.94 | 0.92 | 0 |
The key design point: domains are protected structurally (k-anonymity
generalization), not by per-domain randomized response — which fabricates
spurious sensitive edges and collapses precision (specificity ~0.17). Taint is
preserved through DP. Result: F1 ≈ 0.91 under strong DP with zero raw-content
leakage, stable across epsilon. Locked by tests/test_dp_robustness.py.
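The design point is easier to see with a toy comparison. The sketch below is an assumption-laden illustration, not the library's desensitizer: the labels and the `GENERALIZE` lookup are hypothetical, and a fixed lookup table stands in for k-anonymity generalization. It applies standard binary randomized response to the "is sensitive" bit of 1,000 benign edges and counts how many come out flagged.

```python
import math
import random

SENSITIVE = {"health", "finance"}
# Hypothetical generalization map: fine-grained label -> coarse bucket.
GENERALIZE = {"cardiology": "health", "payroll": "finance", "chat": "social"}


def randomized_response(bit: bool, epsilon: float, rng: random.Random) -> bool:
    """Report the true bit with probability e^eps / (1 + e^eps), else flip it."""
    keep = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return bit if rng.random() < keep else not bit


rng = random.Random(0)
benign = ["chat"] * 1000  # none of these edges is actually sensitive

# Randomized response flips some "not sensitive" bits to "sensitive".
spurious_rr = sum(randomized_response(False, 1.0, rng) for _ in benign)
# Generalization is deterministic, so it never invents a sensitive domain.
spurious_gen = sum(GENERALIZE[label] in SENSITIVE for label in benign)
print(spurious_rr, spurious_gen)
```

At epsilon 1.0 the flip probability is 1 / (1 + e), roughly 27%, so a sizeable share of benign edges is reported as sensitive, while generalization reports none.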
Forced-Embed & Attestation
In a forced-embed deployment the auditor ships inside each downloaded agent
(like a mandatory compliance SDK). Edge attestation makes that tamper-evident:
the center pins known-good build fingerprints, checks an HMAC over each report,
enforces per-agent sequence + hash-chain continuity, and flags under-reporting —
so a modified-build / altered / omitted report is detected. It is tamper-evident,
not tamper-proof; hardware attestation (TEE) is the next level. See
examples/marketplace_forced_embed.py.
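The HMAC, sequence, and hash-chain checks can be sketched as follows. This is a minimal illustration under stated assumptions: the report layout, the shared `KEY`, and the function names are hypothetical, and build-fingerprint pinning and under-reporting detection are left out.

```python
import hashlib
import hmac
import json

KEY = b"demo-shared-key"  # hypothetical; real keys need proper provisioning
GENESIS = "0" * 64


def _encode(report: dict) -> bytes:
    """Canonical bytes of the signed fields (everything except the MAC)."""
    body = {k: report[k] for k in ("seq", "prev", "payload")}
    return json.dumps(body, sort_keys=True).encode()


def make_report(seq: int, prev_hash: str, payload: dict) -> dict:
    report = {"seq": seq, "prev": prev_hash, "payload": payload}
    report["mac"] = hmac.new(KEY, _encode(report), hashlib.sha256).hexdigest()
    return report


def link_hash(report: dict) -> str:
    """Hash that the next report must carry in its `prev` field."""
    return hashlib.sha256(_encode(report)).hexdigest()


def verify_chain(reports: list[dict]) -> list[str]:
    """Return a list of problems; an empty list means the chain is intact."""
    problems, prev, expected_seq = [], GENESIS, 0
    for r in reports:
        mac = hmac.new(KEY, _encode(r), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(mac, r["mac"]):
            problems.append(f"seq {r['seq']}: bad HMAC")
        if r["seq"] != expected_seq:
            problems.append(f"seq {r['seq']}: expected {expected_seq}")
        if r["prev"] != prev:
            problems.append(f"seq {r['seq']}: broken hash chain")
        prev, expected_seq = link_hash(r), r["seq"] + 1
    return problems


chain, prev = [], GENESIS
for i in range(3):
    chain.append(make_report(i, prev, {"edges": i}))
    prev = link_hash(chain[-1])

omitted = [chain[0], chain[2]]
tampered = [chain[0], {**chain[1], "payload": {"edges": 99}}, chain[2]]
print(verify_chain(chain))     # []
print(verify_chain(omitted))   # ['seq 2: expected 1', 'seq 2: broken hash chain']
print(verify_chain(tampered))  # ['seq 1: bad HMAC', 'seq 2: broken hash chain']
```

An altered report fails its HMAC and breaks the link to its successor, and a dropped report shows up as a sequence gap. An edge that holds the key can still forge a consistent chain, which is why this is tamper-evident rather than tamper-proof.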
Compliance Engine
Built-in regulatory mapping for EU AI Act, GDPR, CA SB 243, and COPPA:
from federated_agent_audit import ComplianceEngine
engine = ComplianceEngine(eu_users=True, california_users=True, involves_children=False)
report = engine.evaluate(audit_result)
print(report.overall_score) # 0.0 - 1.0 · report.status: compliant / partial / non_compliant
for gap in report.gaps():
    print(f"{gap.regulation} {gap.article}: {gap.remediation}")
CLI & YAML Policies
federated-audit scan "Patient SSN is 123-45-6789" # scan text
echo "salary: $200k" | federated-audit scan --protect salary
federated-audit validate policies/*.yaml # validate policy files
federated-audit demo # quick multi-agent demo
federated-audit server --port 8000 # start the central audit server
# policies/hr_bot.yaml
agent_id: hr_bot
must_not_share: [salary, SSN, performance review]
acceptable_abstractions: {salary: compensation level, SSN: employee identifier}
sensitivity_threshold: 3
from federated_agent_audit import load_policy
policy = load_policy("policies/hr_bot.yaml")
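One plausible reading of `must_not_share` and `acceptable_abstractions` is sketched below. The semantics are an assumption, not confirmed behavior of `load_policy` or the local audit: a protected term is swapped for its abstraction when one exists and redacted otherwise. The policy is written as a plain dict so the example has no dependencies, and `apply_policy` is a hypothetical helper.

```python
import re

# Mirrors the YAML above as a plain dict (hypothetical in-memory form).
policy = {
    "agent_id": "hr_bot",
    "must_not_share": ["salary", "SSN", "performance review"],
    "acceptable_abstractions": {"salary": "compensation level",
                                "SSN": "employee identifier"},
}


def apply_policy(text: str, policy: dict) -> tuple[str, list[str]]:
    """Swap protected terms for their abstraction, or redact when none exists."""
    hits = []
    for term in policy["must_not_share"]:
        pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
        if pattern.search(text):
            hits.append(term)
            replacement = policy["acceptable_abstractions"].get(term, "[REDACTED]")
            text = pattern.sub(replacement, text)
    return text, hits


cleaned, hits = apply_policy("Her salary and performance review are attached", policy)
print(cleaned)  # Her compensation level and [REDACTED] are attached
print(hits)     # ['salary', 'performance review']
```

This keyword version only rewrites the terms themselves. Catching the values behind them (an actual dollar figure, for example) needs pattern or semantic detection on top.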
Installation
pip install federated-agent-audit # core
pip install "federated-agent-audit[crewai]" # + a framework adapter
# (or langchain/langgraph/autogen/
# openai-agents/llamaindex)
pip install "federated-agent-audit[transport]" # + the central audit server
pip install "federated-agent-audit[all]" # everything
How It Works
49 modules · 696 tests · 0 external API calls required
| Local (Phase 1) | Network (Phase 2) |
|---|---|
| PrivacyGate (regex + PII) | Cross-domain / cross-owner detection |
| SemanticDetector (4-tier) | Compositional leak detection |
| TaintTracker (info flow) | Cascade infection tracking |
| Desensitizer (6-layer) | Aggregation / collusion analysis |
| MemoryAuditor (write audit) | Topology + blame attribution |
| Attestor (tamper-evidence) | Compliance engine |
Privacy guarantee: the central auditor architecturally cannot see raw content. Data is hashed, pseudonymized, and DP-noised before leaving local agents. Merkle-tree commitments make audit trails tamper-evident without revealing entries.
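To illustrate how a Merkle commitment makes a trail tamper-evident without revealing entries, here is a small generic sketch. It is not the library's implementation: the function names and the sample entries are hypothetical, and odd levels are handled by duplicating the last node. The leaf and node prefixes (0x00 and 0x01) follow the domain-separation idea used in RFC 6962, which uses a different tree-splitting rule that production designs may prefer.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _next_level(level: list[bytes]) -> list[bytes]:
    return [_h(b"\x01" + level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves: list[bytes]) -> bytes:
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [_h(b"\x00" + leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = _next_level(level)
    return level[0]


def merkle_proof(leaves: list[bytes], index: int) -> list[tuple[bytes, bool]]:
    """Sibling hashes from leaf to root; the flag is True if the sibling is on the right."""
    level = [_h(b"\x00" + leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = index ^ 1
        proof.append((level[sibling], sibling > index))
        level = _next_level(level)
        index //= 2
    return proof


def verify(leaf: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    node = _h(b"\x00" + leaf)
    for sibling, on_right in proof:
        node = _h(b"\x01" + (node + sibling if on_right else sibling + node))
    return node == root


entries = [b"report-0", b"report-1", b"report-2"]  # hypothetical audit entries
root = merkle_root(entries)
ok = verify(entries[2], merkle_proof(entries, 2), root)
forged = verify(b"report-X", merkle_proof(entries, 1), root)
print(ok, forged)  # True False
```

The verifier needs only the root and a handful of sibling hashes, so one entry can be proven to belong to the trail without disclosing the others.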
Development
git clone https://github.com/Justin0504/federated-agent-audit
cd federated-agent-audit
pip install -e ".[dev,langchain,langgraph,transport,yaml]"
pytest # 696 tests
ruff check src/ tests/ benchmarks/ # lint
python examples/multiagent_trace_demo.py
Contributions welcome — see CONTRIBUTING.md, the roadmap, and issues tagged good first issue.
License
Apache 2.0