Privacy-preserving audit framework for multi-agent AI systems. Detects cross-agent data leaks, inference attacks, and compliance violations — without accessing raw content.

Federated Agent Audit

Privacy audit for multi-agent AI systems — without touching raw data.

pip install federated-agent-audit

Requires Python 3.11+.

Audit multi-agent systems (CrewAI · LangGraph · AutoGen) for compositional privacy leaks, while the central auditor never sees the underlying raw data.


30-Second Quick Start

Scan any text for sensitive content:

from federated_agent_audit import scan

result = scan("Zhang Wei's SSN is 123-45-6789, salary $185,000")
print(result["clean"])     # False
print(result["detected"])  # ['SSN', 'salary']
print(result["text"])      # "Zhang Wei's [REDACTED] is [SSN], [REDACTED] [DOLLAR_AMOUNT]"

Or from the command line:

federated-audit scan "My email is john@acme.com"
# REDACTED  Detected: email
#   Output: My [REDACTED] is [EMAIL_ADDRESS]

echo "credit card 4532-1234-5678-9012" | federated-audit scan
# REDACTED  Detected: credit card

Protect Your LLM Calls

Intercept every OpenAI/Anthropic response automatically:

from openai import OpenAI
from federated_agent_audit import firewall

fw = firewall(["salary", "SSN", "diagnosis"])
fw.patch_openai()  # done — every response is now checked

# Normal usage — firewall is invisible
client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(model="gpt-4o", messages=[...])
# Sensitive content in response is already redacted

The Problem

Multi-agent systems (CrewAI, LangGraph, AutoGen) create compound privacy risks that single-agent tools can't detect:

  • Agent A shares salary data with Agent B (allowed by A's policy)
  • Agent B forwards a "summary" to an external partner (allowed by B's policy)
  • Result: salary leaked outside the company — neither agent broke its own rules
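The scenario above can be sketched in a few lines of plain Python. This is only an illustration of why per-agent checks pass while a path-level check fails; the `ALLOWED` table, the hand-off tuples, and both function names are hypothetical and are not part of the library's API.

```python
# Hypothetical per-agent policies: which labels each agent may send to whom.
ALLOWED = {
    "agent_a": {"agent_b": {"salary"}},            # A may share salary with B
    "agent_b": {"external_partner": {"summary"}},  # B may send summaries outside
}

# Observed hand-offs: (sender, receiver, declared label, categories the payload derives from).
HANDOFFS = [
    ("agent_a", "agent_b", "salary", {"salary"}),
    ("agent_b", "external_partner", "summary", {"salary"}),  # summary built from salary data
]

INTERNAL = {"agent_a", "agent_b"}  # assumption: everything else is outside the company


def local_violations(handoffs):
    """Per-agent view: is the declared label allowed for this recipient?"""
    return [h for h in handoffs if h[2] not in ALLOWED.get(h[0], {}).get(h[1], set())]


def compositional_leaks(handoffs, protected=frozenset({"salary"})):
    """Path-level view: does a protected category reach a non-internal agent?"""
    return [
        (sender, receiver, sorted(cats & protected))
        for sender, receiver, _label, cats in handoffs
        if receiver not in INTERNAL and cats & protected
    ]


print(local_violations(HANDOFFS))     # no agent broke its own rules
print(compositional_leaks(HANDOFFS))  # yet salary-derived data left the company
```

The sketch assumes the provenance of each payload (the last tuple element) is already known; tracking that provenance across hops is the hard part in practice.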

Existing observability tools (LangSmith, Langfuse) require uploading raw prompts to their servers. This framework audits agent interactions without the central auditor ever seeing raw content.

                       +---------------+
                       |   Central     |  Phase 2: Network audit
                       |   Auditor     |  (desensitized metadata only)
                       +-------+-------+
                               |
               +---------------+---------------+
               |               |               |
        +------+------+  +----+----+  +--------+------+
        | Local Audit |  | Local   |  | Local Audit   |  Phase 1
        | (Agent A)   |  | (Agt B) |  | (Agent C)     |
        +-------------+  +---------+  +---------------+
         raw content      raw content   raw content
         stays here       stays here    stays here
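As a rough illustration of the two phases in the diagram, the sketch below has a local step that inspects raw text and emits only metadata, and a central step that works on that metadata alone. The report fields, the regex patterns, and the function names are assumptions made for this example, not the library's report format.

```python
import hashlib
import re

# Hypothetical patterns; a real detector would be far richer.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "salary": re.compile(r"\$\d[\d,]*k?\b", re.IGNORECASE),
}


def local_report(agent_id: str, to_agent: str, text: str, salt: bytes) -> dict:
    """Phase 1: runs next to the agent and returns metadata only."""
    detected = sorted(name for name, rx in PATTERNS.items() if rx.search(text))
    return {
        "from": agent_id,
        "to": to_agent,
        "categories": detected,
        # Salted hash lets the auditor correlate messages without reading them.
        "content_hash": hashlib.sha256(salt + text.encode("utf-8")).hexdigest(),
    }


def central_audit(reports: list[dict]) -> list[str]:
    """Phase 2: sees only the reports, never the original text."""
    return [
        f"{r['from']} -> {r['to']}: {', '.join(r['categories'])}"
        for r in reports
        if r["categories"]
    ]


report = local_report("hr_bot", "summary_bot", "Zhang Wei earns $185k", salt=b"per-deployment-secret")
print(central_audit([report]))
```

Only the category labels and a salted digest cross the boundary in this sketch; the name and the amount stay with the local agent.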

Full Pipeline Example

from federated_agent_audit import (
    FederatedAudit, PrivacyPolicy, NetworkAuditor,
    RiskAggregator, ComplianceEngine,
)

# 1. Define policies
policy_hr = PrivacyPolicy(agent_id="hr_bot", must_not_share=["salary", "SSN"])
policy_ext = PrivacyPolicy(agent_id="notify_bot", must_not_share=["salary", "SSN", "email"])

# 2. Record interactions (each agent audits locally)
audit_hr = FederatedAudit(policy=policy_hr)
audit_hr.record_outgoing("Zhang Wei earns $185k", to_agent="summary_bot")

audit_ext = FederatedAudit(policy=policy_ext)
audit_ext.record_outgoing("Candidate update sent", to_agent="external")

# 3. Central audit (only sees desensitized metadata — never raw text)
net = NetworkAuditor()
net.ingest_report(audit_hr.get_report())
net.ingest_report(audit_ext.get_report())
result = net.audit()

# 4. Compliance check
compliance = ComplianceEngine(eu_users=True).evaluate(result)
print(f"Compliance: {compliance.overall_score:.0%} {compliance.status.value}")
for gap in compliance.gaps():
    print(f"  {gap.regulation} {gap.article}: {gap.title}")

What It Detects

Risk                    | What happens                                                   | How we catch it
------------------------+----------------------------------------------------------------+------------------------------------------
Cross-domain leak       | Health data reaches social media agent                         | Domain boundary analysis on metadata
Compositional inference | Agent collects health + identity = reidentification            | Quasi-identifier assembly detection
Aggregation attack      | 3 agents each share a fragment → hub reconstructs full profile | Multi-source convergence analysis
Cascading injection     | Prompt injection propagates agent-to-agent like a worm         | Infection tree + patient-zero attribution
Behavioral drift        | Agent suddenly changes behavior (possible compromise)          | Cross-session z-score monitoring
Negative inference      | "I can't share that" confirms the data exists                  | Refusal pattern detection
Regulatory gap          | EU AI Act / GDPR / COPPA requirements unmet                    | Per-article compliance scoring
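To make one row of the table concrete, here is a minimal sketch of z-score monitoring for behavioral drift. It assumes each session is summarized by a single number (for example, how many messages touched a sensitive category); the metric, the threshold of 3.0, and the function names are illustrative assumptions rather than the library's actual detector.

```python
from statistics import mean, stdev


def drift_zscore(history: list[float], current: float) -> float:
    """How many sample standard deviations `current` lies from the history mean."""
    mu = mean(history)
    sigma = stdev(history)  # needs at least two past sessions
    if sigma == 0:
        return 0.0 if current == mu else float("inf")
    return (current - mu) / sigma


def is_drifting(history: list[float], current: float, threshold: float = 3.0) -> bool:
    """Flag a session whose metric is more than `threshold` deviations from baseline."""
    return abs(drift_zscore(history, current)) > threshold


# Hypothetical baseline: sensitive-category messages per past session.
baseline = [2, 3, 2, 4, 3, 2, 3]
print(is_drifting(baseline, 3))   # a typical session
print(is_drifting(baseline, 12))  # a sudden jump worth investigating
```

A z-score only says the session is unusual relative to the agent's own history; it does not by itself say why.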

CLI

# Scan text for sensitive content
federated-audit scan "Patient SSN is 123-45-6789"
echo 'salary: $200k' | federated-audit scan --protect salary   # single quotes stop the shell from expanding $2

# Validate policy files
federated-audit validate policies/*.yaml

# Run a demo
federated-audit demo

# Start the central audit server
federated-audit server --port 8000

YAML Policies

# policies/hr_bot.yaml
agent_id: hr_bot
must_not_share:
  - salary
  - SSN
  - performance review
acceptable_abstractions:
  salary: compensation level
  SSN: employee identifier
sensitivity_threshold: 3

Load the policy in Python:

from federated_agent_audit import load_policy
policy = load_policy("policies/hr_bot.yaml")
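One plausible reading of `acceptable_abstractions` is "replace the sensitive term with a safer wording where one is defined, and flag it where none is". The sketch below implements that reading on a plain dict that mirrors the YAML above. The function `apply_abstractions` is hypothetical, `sensitivity_threshold` is not modeled, and the library may interpret these fields differently.

```python
import re

# Plain-dict mirror of policies/hr_bot.yaml (sensitivity_threshold omitted).
policy = {
    "agent_id": "hr_bot",
    "must_not_share": ["salary", "SSN", "performance review"],
    "acceptable_abstractions": {
        "salary": "compensation level",
        "SSN": "employee identifier",
    },
}


def apply_abstractions(text: str, policy: dict) -> tuple[str, list[str]]:
    """Return (rewritten_text, protected terms found that have no abstraction)."""
    unresolved = []
    for term in policy["must_not_share"]:
        pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
        if not pattern.search(text):
            continue
        replacement = policy["acceptable_abstractions"].get(term)
        if replacement is None:
            unresolved.append(term)  # nothing safe to substitute; caller must decide
        else:
            text = pattern.sub(replacement, text)
    return text, unresolved


rewritten, unresolved = apply_abstractions(
    "Her salary and performance review are attached", policy
)
print(rewritten)
print(unresolved)
```

This only rewrites the literal terms; values such as an actual dollar amount would still need a separate detector.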

Compliance Engine

Built-in regulatory mapping for EU AI Act, GDPR, CA SB 243, and COPPA:

from federated_agent_audit import ComplianceEngine

engine = ComplianceEngine(eu_users=True, california_users=True, involves_children=False)
report = engine.evaluate(audit_result)  # audit_result: the value returned by NetworkAuditor.audit()

print(report.overall_score)  # 0.0 - 1.0
print(report.status)         # compliant / partial / non_compliant
for gap in report.gaps():
    print(f"{gap.regulation} {gap.article}: {gap.remediation}")

Multi-Agent Trace Capture

The integrations capture the real agent-to-agent interaction graph — who sent what to whom — which is exactly what the compositional / cascade / cross-domain detectors analyze. Everything is built on MultiAgentTracer, which works with any framework (or none):

from federated_agent_audit import MultiAgentTracer, PrivacyPolicy

tracer = MultiAgentTracer()
tracer.register_agent("hr_bot", PrivacyPolicy(agent_id="hr_bot", must_not_share=["salary"]))

# Each call is a real directed edge; taint (domains, sensitivity, origin,
# hop count) propagates across hops automatically.
tracer.record_handoff("hr_bot", "summary_bot", "Zhang Wei earns $185k", origin="zhang_wei")
tracer.record_handoff("summary_bot", "external_bot", "candidate compensation summary")

result = tracer.network_audit()      # Phase-2 central audit
incidents = tracer.aggregated()      # denoised, actionable alerts

Running python examples/multiagent_trace_demo.py shows the tracer catching the compound leak that no single agent's policy can see, while the central auditor still never touches the raw data:

Incidents: 5  alert_summary={'critical': 3, 'high': 2}
  [CRITICAL] cross_domain_leak  — Sensitive health data reaches social domain via 2-agent chain
  [CRITICAL] cross_domain_leak  — Sensitive finance data reaches social domain via 2-agent chain
  [CRITICAL] taint_spreading    — Data from origin 'zhang_wei' spread to 4 agents across the network
  [HIGH]     inference_accumulation — external_bot accumulated high inference risk (77%)
  [HIGH]     compound_scope_escalation — 3 agent pairs exceed authorized scope

Privacy verification (central reports):  hr_bot → clean  health_bot → clean  summary_bot → clean
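The "spread to N agents" and hop-count figures in output like this can be derived from the hand-off graph alone. The following is a minimal sketch of that idea using a breadth-first walk; the edge list and `propagate_taint` are hypothetical, and the sketch ignores message ordering and timestamps, which a real tracker would need to respect.

```python
from collections import deque

# Hypothetical hand-off edges (sender, receiver); no message content is needed.
EDGES = [
    ("hr_bot", "summary_bot"),
    ("summary_bot", "external_bot"),
    ("health_bot", "summary_bot"),
]


def propagate_taint(edges, source):
    """Return {agent: hop count from source} for every agent the data can reach."""
    graph = {}
    for sender, receiver in edges:
        graph.setdefault(sender, []).append(receiver)

    hops = {source: 0}
    queue = deque([source])
    while queue:
        agent = queue.popleft()
        for nxt in graph.get(agent, []):
            if nxt not in hops:  # first visit gives the shortest hop count
                hops[nxt] = hops[agent] + 1
                queue.append(nxt)
    return hops


hops = propagate_taint(EDGES, "hr_bot")
print(hops)
print(f"Data from hr_bot can reach {len(hops) - 1} other agents")
```

Because the walk uses only who-sent-to-whom edges, it is the kind of analysis a central auditor could run without access to message text.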

Framework Integrations

# CrewAI — captures agent delegation (Delegate/Ask coworker) as A→B edges
from federated_agent_audit.sdk import crew_audit
crew = crew_audit(crew, default_policy=policy)   # or policies={role: policy}
crew.kickoff()
result = crew._federated_tracer.network_audit()

# LangChain / LangGraph — per-node identity + node-to-node hand-offs
from federated_agent_audit.sdk import langchain_callback
handler = langchain_callback(default_policy=policy)          # asynchronous=True for async graphs
graph.invoke(input, config={"callbacks": [handler]})
result = handler.tracer.network_audit()

# Generic Python — single-agent decorator
from federated_agent_audit import audited
@audited(policy, to_agent="downstream")
def my_agent(input_text: str) -> str:
    return process(input_text)

The LLM firewall hardens this for production — fail-open (audit never crashes the app), streaming responses blocked the moment a violation accumulates, and sensitive content inspected inside tool-call arguments:

from federated_agent_audit import firewall
fw = firewall(["salary", "SSN"])
fw.patch_openai()   # streaming + tool calls covered
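Blocking a stream "the moment a violation accumulates" is harder than it sounds, because a protected term can be split across two chunks. The sketch below shows one way to handle that with a small hold-back buffer, plus a fail-open check. It is an illustration under simple assumptions (case-insensitive substring matching, a made-up `[BLOCKED]` marker), not the firewall's actual implementation.

```python
from typing import Iterable, Iterator

BLOCK_MARKER = "[BLOCKED]"  # hypothetical marker, not the library's output format


def _violates(text: str, terms: list[str]) -> bool:
    """Fail-open check: a detector error is treated as 'no violation'."""
    try:
        lowered = text.lower()
        return any(term in lowered for term in terms)
    except Exception:
        return False


def guarded_stream(chunks: Iterable[str], protected: list[str]) -> Iterator[str]:
    """Re-yield text, stopping as soon as a protected term has accumulated."""
    terms = [t.lower() for t in protected]
    # Hold back enough characters that a term split across chunks is still caught.
    keep = max((len(t) for t in terms), default=1) - 1
    pending = ""
    for chunk in chunks:
        pending += chunk
        if _violates(pending, terms):
            yield BLOCK_MARKER
            return
        if keep:
            emit, pending = pending[:-keep], pending[-keep:]
        else:
            emit, pending = pending, ""
        if emit:
            yield emit
    if pending:
        yield pending  # stream ended cleanly; flush the held-back tail


blocked = list(guarded_stream(["Her sal", "ary is high"], ["salary"]))
clean = "".join(guarded_stream(["Hello ", "world"], ["salary"]))
print(blocked)
print(clean)
```

The trade-off is a few characters of latency: text is released only once it can no longer be the start of a protected term.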

Installation

pip install federated-agent-audit                      # core
pip install "federated-agent-audit[transport]"         # + audit server
pip install "federated-agent-audit[yaml]"              # + YAML policies
pip install "federated-agent-audit[langchain]"         # + LangChain
pip install "federated-agent-audit[all]"               # everything

How It Works

48 modules  ·  597 tests  ·  0 external API calls required

Local (Phase 1):                    Network (Phase 2):
  PrivacyGate (regex + PII)           Cross-domain flow detection
  SemanticDetector (4-tier)           Compositional leak detection
  TaintTracker (info flow)            Cascade infection tracking
  Desensitizer (6-layer)              Topology analysis
  MemoryAuditor (write audit)         Blame attribution
                                      Compliance engine

Privacy guarantee: The central auditor architecturally cannot see raw content. Data is hashed, pseudonymized, and perturbed with differential-privacy (DP) noise before leaving local agents. Merkle tree commitments make audit trails tamper-evident without revealing individual entries.
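For readers unfamiliar with Merkle commitments, here is a minimal sketch of the general technique using only `hashlib`. Publishing the root commits to every entry; an inclusion proof then shows that one entry belongs to the committed set by revealing sibling hashes only, not the other entries. The helper names, the duplicate-last-node rule, and the leaf/node prefixes are choices made for this example and are not taken from the library.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _leaf_level(leaves: list[bytes]) -> list[bytes]:
    # Distinct prefixes keep leaf hashes and inner-node hashes from colliding.
    return [_h(b"\x00" + leaf) for leaf in leaves]


def _parent_level(level: list[bytes]) -> list[bytes]:
    return [_h(b"\x01" + level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves: list[bytes]) -> bytes:
    """Root hash committing to all leaves (assumes at least one leaf)."""
    level = _leaf_level(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = _parent_level(level)
    return level[0]


def merkle_proof(leaves: list[bytes], index: int) -> list[tuple[bytes, bool]]:
    """Sibling hashes from leaf to root, each tagged with 'sibling is on the left'."""
    level = _leaf_level(leaves)
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        level = _parent_level(level)
        index //= 2
    return proof


def verify(leaf: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    node = _h(b"\x00" + leaf)
    for sibling, sibling_is_left in proof:
        node = _h(b"\x01" + (sibling + node if sibling_is_left else node + sibling))
    return node == root


entries = [b"report-1", b"report-2", b"report-3"]  # stand-ins for audit records
root = merkle_root(entries)
proof = merkle_proof(entries, 2)
print(verify(entries[2], proof, root))   # genuine entry
print(verify(b"tampered", proof, root))  # altered entry
```

Changing any entry changes the root, so anyone holding the original root can detect after-the-fact edits to the trail.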

Detection Effectiveness

A labeled benchmark of multi-agent scenarios (real compositional leaks vs. benign traffic) measures detection quality, not just speed:

python benchmarks/detection_eval.py            # precision / recall / F1
python benchmarks/detection_eval.py --sweep    # threshold robustness

On the current scenario set (13 leak + 11 benign), the pipeline reaches precision 1.0 / recall 1.0 / F1 1.0 with zero raw-content leakage into central reports, stable across thresholds 0.3–0.8. The set includes adversarial cases: noise-buried leaks, diamond multi-path, partial-shared-origin hubs, same-domain laundering, high-volume benign hubs, and cross-subject convergence. Pure structural signals (topology, timing, behavioral) are reported separately and not counted as privacy-leak detections. New adversarial scenarios belong in the harness; tests/test_detection_benchmark.py locks the metrics as a regression gate.
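For reference, the three metrics are computed from true positives, false positives, and false negatives in the usual way. The sketch below is not the benchmark harness; `prf1` is a hypothetical helper, and the label lists simply mirror the 13 leak + 11 benign split to show what a perfect score and a slightly imperfect one look like.

```python
def prf1(labels: list[bool], predictions: list[bool]) -> tuple[float, float, float]:
    """Precision, recall, and F1 where True means 'leak'."""
    tp = sum(1 for y, p in zip(labels, predictions) if y and p)
    fp = sum(1 for y, p in zip(labels, predictions) if not y and p)
    fn = sum(1 for y, p in zip(labels, predictions) if y and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


labels = [True] * 13 + [False] * 11  # 13 leak scenarios, 11 benign

perfect = list(labels)               # every scenario classified correctly
print(prf1(labels, perfect))

imperfect = list(labels)
imperfect[0] = False                 # one missed leak (false negative)
imperfect[13] = True                 # one false alarm (false positive)
print(prf1(labels, imperfect))       # each metric drops to 12/13
```

With only 24 scenarios, a single misclassification moves each metric by several points, which is why adding more adversarial cases to the harness matters.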

Validated live against LangGraph (free, in-suite) and CrewAI + OpenAI streaming (opt-in examples, need an API key).

Development

git clone https://github.com/Justin0504/federated-agent-audit
cd federated-agent-audit
pip install -e ".[dev,transport,yaml]"
pytest                    # 597 tests
ruff check src/ tests/    # lint
python examples/crewai_audit_demo.py  # run the demo

License

Apache 2.0
