Adaptive AI Agent Execution Layer for risk scoring, audit trails, and regulatory compliance
Vaara is the runtime evidence layer for AI Act compliance. Open source, no SaaS, no telemetry.
Vaara intercepts agent tool calls, scores each one with a conformal risk interval, and writes a hash-chained audit record. Online learning across five expert signals via Multiplicative Weight Update. Distribution-free conformal coverage on the score. An external auditor can verify these properties without trusting your stack. Orchestration toolkits and identity layers (Microsoft Agent Governance Toolkit, others) sit on top.
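As a rough illustration of what a distribution-free interval around a risk score involves, the sketch below implements plain split conformal prediction in pure Python. It is not Vaara's implementation: the function names, the absolute-residual nonconformity score, and the clipping to [0, 1] are all assumptions made for the example.

```python
import math


def conformal_quantile(residuals, alpha):
    """Split-conformal quantile of calibration residuals.

    Uses the finite-sample rank ceil((n + 1) * (1 - alpha)). When that rank
    exceeds n, no finite bound gives the requested coverage, so return inf.
    """
    n = len(residuals)
    if n == 0:
        return math.inf
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf
    return sorted(residuals)[rank - 1]


def risk_interval(point_estimate, residuals, alpha=0.1):
    """Interval around a score in [0, 1] (hypothetical helper, not a Vaara API)."""
    q = conformal_quantile(residuals, alpha)
    lower = max(0.0, point_estimate - q)
    upper = min(1.0, point_estimate + q)
    return lower, upper
```

With too few calibration points for the requested coverage, the interval widens to the whole [0, 1] range rather than claiming precision it cannot support.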
Numbers
Held-out TEST recall 84.3% (95% Wilson [81.5, 86.7]) at FPR 4.6% [3.1, 7.0]. Multi-attacker PAIR ASR 0/25 across three different attacker models with identical seeds. 140 µs p99 inference latency on commodity CPU (excluding one-time embedding model load). Every number reproducible end-to-end via make repro-v031-bench.
- 7,955-entry adversarial corpus (250 hand-curated + 7,705 LLM-generated), 70/15/15 split stratified by (category, source)
- Classifier with 236 hand-features + 384-dim MiniLM embeddings at calibrated threshold 0.9226 on held-out TEST n=1,196: recall 84.3% [81.5, 86.7] at FPR 4.6% [3.1, 7.0]
- Multi-attacker PAIR robustness: 0/25 successes per attacker across Qwen2.5-32B, Qwen2.5-72B, Llama-3.3-70B hitting identical seed indices, Wilson upper 13.3%
- Chain of custody: corpus manifest SHA → split manifest SHA → training commit → bundle SHA, all locked and printed by every script
- 140 µs / 210 µs p99 inference latency, commodity CPU
- Distribution-free conformal coverage on the score
- MWU regret bound O(sqrt(T log N))
- vaara-bench-v0.32: v0.32 methodology delta. Same corpus, same split, embeddings + retuned hparams + FPR-target threshold calibration. +30.4 pp recall at the same FPR.
- vaara-bench-v0.31: v0.31 methodology spec with Wilson intervals and named limits. Frozen as the v0.31 chain-of-custody anchor.
- vaara-bench-v1: 77-trace synthetic-corpus regression baseline with frozen methodology, 100% soft TPR, 0% hard FPR
Each figure is reproducible from the public corpus or the bench harness in bench/.
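The Wilson bounds quoted above can be checked by hand. The sketch below is a generic Wilson score interval, not code taken from the bench harness; the function name is an assumption. For 0 successes in 25 trials it gives an upper bound of roughly 13.3%, in line with the PAIR figure.

```python
import math


def wilson_interval(successes, trials, z=1.959964):
    """Two-sided Wilson score interval for a binomial proportion (95% by default)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z * z / trials
    centre = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


low, high = wilson_interval(0, 25)
print(f"0/25 successes: [{low:.3f}, {high:.3f}]")
```

The recall and FPR intervals need the positive and negative counts in the TEST split, which are not broken out here, so only the 0/25 case is reproduced.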
Install
pip install vaara
Python 3.10+. Zero runtime deps. Optional XGBoost classifier: pip install vaara[ml].
Releases ship with SLSA Build Level 3 provenance. Verify with slsa-verifier verify-artifact.
Quick start
from vaara.pipeline import InterceptionPipeline

pipeline = InterceptionPipeline()
result = pipeline.intercept(
    agent_id="agent-007",
    tool_name="fs.write_file",
    parameters={"path": "/etc/service.yaml", "content": "..."},
    agent_confidence=0.8,
)
if result.allowed:
    pipeline.report_outcome(result.action_id, outcome_severity=0.0)
else:
    print(result.reason)
report_outcome closes the loop. MWU reweights signals based on which ones predicted the outcome.
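To make the reweighting step concrete, here is a minimal Multiplicative Weights (Hedge) update over five signals. It is a textbook sketch under stated assumptions, not Vaara's internals: the signal values, the absolute-error loss, and the horizon of 100 rounds are invented for illustration. The step size is the standard Hedge tuning that yields regret on the order of sqrt(T log N).

```python
import math


def mwu_update(weights, losses, eta):
    """One Hedge step: scale each weight by exp(-eta * loss), then renormalise.

    Losses are expected in [0, 1]; a lower loss means the expert predicted better.
    """
    if len(weights) != len(losses):
        raise ValueError("weights and losses must have the same length")
    scaled = [w * math.exp(-eta * loss) for w, loss in zip(weights, losses)]
    total = sum(scaled)
    return [s / total for s in scaled]


def combined_score(weights, signals):
    """Weighted average of expert signals."""
    return sum(w * s for w, s in zip(weights, signals))


# Five hypothetical expert signals, each a risk estimate in [0, 1].
n_experts, horizon = 5, 100
eta = math.sqrt(8 * math.log(n_experts) / horizon)
weights = [1 / n_experts] * n_experts
signals = [0.9, 0.2, 0.5, 0.7, 0.1]

# Assumed loss: distance between each signal and the reported outcome severity.
outcome_severity = 0.0
losses = [abs(s - outcome_severity) for s in signals]
weights = mwu_update(weights, losses, eta)
```

After a benign outcome, the signal that predicted the lowest risk gains weight and the one that predicted the highest loses it.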
Who reaches for Vaara
- AI compliance teams shipping high-risk systems under the EU AI Act: Article 9 risk management, Article 12 logging, Article 15 robustness, Article 72 post-market monitoring evidence.
- ML platform teams adding runtime governance to agentic stacks (LangChain, CrewAI, OpenAI Agents SDK, MCP-based hosts) without rewriting orchestration.
- AI safety and red teams calibrating scorers against adaptive attackers (PAIR, distribution-shift evals, custom corpora).
- Notified Bodies and internal auditors reading article-level evidence reports without trusting the deployer's stack.
What evidence looks like
vaara compliance report --format json against a real audit trail produces an article-level evidence record an auditor can read directly. Status is reported honestly: articles without recorded events return evidence_insufficient, not a rubber-stamp.
{
  "system_name": "Acme HR Assistant",
  "overall_status": "evidence_insufficient",
  "trail_integrity": {"size": 105, "chain_intact": true},
  "articles": [
    {"article": "Article 12(1)", "title": "Record-Keeping (Logging)",
     "status": "evidence_sufficient", "strength": "strong", "evidence_count": 105},
    {"article": "Article 9(2)(a)", "title": "Risk Identification and Analysis",
     "status": "evidence_sufficient", "strength": "strong", "evidence_count": 35},
    {"article": "Article 15(1)", "title": "Accuracy, Robustness and Cybersecurity",
     "status": "evidence_insufficient", "strength": "absent", "evidence_count": 0}
  ]
}
The same data renders as a styled PDF for Notified Bodies (vaara compliance report --format pdf, requires pip install 'vaara[pdf]'), a static HTML dashboard (vaara compliance dashboard), or a Sigstore-signed regulator-handoff envelope (vaara trail export, optional ML-DSA-65 / FIPS 204 post-quantum signer via pip install 'vaara[pq]').
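A consumer of the JSON report might gate a release step on it. The sketch below relies only on the fields visible in the sample record above (articles, status, trail_integrity.chain_intact); the helper names and the gating rule itself are assumptions, not part of Vaara.

```python
import json


def insufficient_articles(report):
    """Articles whose status is anything other than evidence_sufficient."""
    return [
        a["article"]
        for a in report.get("articles", [])
        if a.get("status") != "evidence_sufficient"
    ]


def gate(report):
    """Pass only when the chain is intact and every article has sufficient evidence."""
    chain_ok = report.get("trail_integrity", {}).get("chain_intact") is True
    return chain_ok and not insufficient_articles(report)


# Trimmed copy of the sample report shown above.
report = json.loads("""
{
  "overall_status": "evidence_insufficient",
  "trail_integrity": {"size": 105, "chain_intact": true},
  "articles": [
    {"article": "Article 12(1)", "status": "evidence_sufficient"},
    {"article": "Article 9(2)(a)", "status": "evidence_sufficient"},
    {"article": "Article 15(1)", "status": "evidence_insufficient"}
  ]
}
""")

print(insufficient_articles(report))
```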
Per-article verdict drill-down
Each article in the report carries two extra surfaces a reviewer can read without re-running the engine. verdict_inputs lists the threshold-vs-observed snapshot the engine compared against (minimum record count, staleness window, strong-strength bounds, future-timestamp and chain-integrity flags) plus a verdict_reasons list of human-readable rationale lines explaining why the status and strength landed where they did. contributing_events lists the most recent qualifying audit records the verdict sits on (record ID, action ID, ISO timestamp, agent, tool, and a filtered drill_down dict of just the data fields that fed the risk/decision/outcome: point estimate, conformal interval, decision, reason, outcome severity). The drill-down renders in every output format: JSON, markdown, narrative, PDF, and the HTML dashboard. An auditor reading the report can trace status → threshold delta → concrete event in one sitting.
Framework adapters
Native adapters in src/vaara/integrations/ route the major Python agent frameworks through Vaara's pipeline. Each intercepts via the framework's own callback or hook surface, scores, gates, and emits the same audit events as a direct pipeline.intercept(). Frameworks are not hard dependencies (lazy import, duck typing).
| Framework | Entry point | Use |
|---|---|---|
| LangChain | VaaraCallbackHandler, vaara_wrap_tool | Slots into config={"callbacks": [...]} or wraps per-tool |
| CrewAI | VaaraCrewGovernance | Wraps a crew so every agent action passes through scoring + audit |
| OpenAI Agents SDK | VaaraToolGuardrail, vaara_wrap_function | Function-tool wrap, compatible with Responses API and Agents-SDK tracing |
| MCP server | vaara.integrations.mcp_server | Exposes scoring, audit, policy reload as MCP tools |
All four share the same in-process pipeline, so audit records hash-chain together regardless of which framework the action came through. For Vaara in front of an upstream MCP server, see the MCP proxy section below.
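The idea of records that hash-chain together can be sketched in a few lines. This is a generic SHA-256 chain, not Vaara's record format: the field names, the JSON canonicalisation, and the all-zero genesis value are assumptions for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder for "no previous record"


def record_hash(prev_hash, payload):
    """Hash the previous record's hash together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(chain, payload):
    """Append a record that commits to everything before it."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"payload": payload, "prev_hash": prev, "hash": record_hash(prev, payload)})


def chain_intact(chain):
    """Recompute every link; any edited, reordered, or dropped record breaks it."""
    prev = GENESIS
    for rec in chain:
        if rec["prev_hash"] != prev or rec["hash"] != record_hash(prev, rec["payload"]):
            return False
        prev = rec["hash"]
    return True
```

Because each hash covers the previous one, a verifier needs only the records themselves to detect tampering, whichever adapter produced them.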
Upstream-signal adapters (cloud + OSS guardrails)
Adapters route findings from cloud and OSS guardrails into Vaara's audit trail and OVERT envelope with EU AI Act article tags. The filter runs in the deployer's environment as an upstream signal. Vaara records the verdict, normalises 68 provider categories onto a shared vocabulary, and tags each finding against Art. 5, 10, 13, 15, 53, and the CSAM-specific obligation from the Digital Omnibus political agreement of May 2026.
| Provider | Adapter | Extra | Wraps |
|---|---|---|---|
| AWS Bedrock Guardrails | BedrockGuardrailsAdapter | vaara[bedrock] | ApplyGuardrail across five Bedrock policy buckets |
| Azure AI Content Safety | AzureContentSafetyAdapter | vaara[azure-content-safety] | analyze_text, Prompt Shields, Protected Material, Groundedness |
| GCP Model Armor | GcpModelArmorAdapter | vaara[gcp-model-armor] | sanitize_user_prompt, sanitize_model_response |
| NVIDIA NeMo Guardrails | NemoGuardrailsAdapter | vaara[nemo-guardrails] | GenerationResponse.log.activated_rails (input / dialog / output / retrieval) |
| Guardrails AI | GuardrailsAIAdapter | vaara[guardrails-ai] | ValidationOutcome.validation_summaries from Guard.parse / Guard.validate |
| LLM Guard | LLMGuardAdapter | vaara[llm-guard] | scan_prompt / scan_output, parses (sanitized, results_valid, results_score) |
| Rebuff | RebuffAdapter | vaara[rebuff] | DetectResponse across heuristic, model, vector layers + canary-word leak check |
Each adapter returns a ContentSafetyFinding that the deployer routes into pipeline.intercept(context=finding.to_audit_context()). The mapping table lives at src/vaara/integrations/_content_safety_articles.py. Article-level rationale is in COMPLIANCE.md.
HTTP API
The same scorer and audit trail are available over HTTP for non-Python agents and for control planes that prefer a network boundary. Install with the server extra:
pip install 'vaara[server]'
vaara serve --host 0.0.0.0 --port 8000
curl -sX POST http://localhost:8000/v1/score \
-H 'content-type: application/json' \
-d '{"tool_name":"tx.transfer","agent_id":"agent-007","base_risk_score":0.5}'
The wire contract is in docs/openapi.yaml. Vaara defines the interface. Control-plane and orchestration vendors call it. Integration recipes for adopters live under examples/recipes/. Operator endpoints include POST /v1/policy/reload for atomic hot policy swap (start with vaara serve --policy PATH to enable), and POST /v1/detect/injection and POST /v1/detect/pii as named buyer-visible detectors with matching CLI subcommands that exit non-zero on detection for CI gating.
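The curl call above translates to Python's standard library as follows. This is a sketch, not a shipped client: the helper names are hypothetical, only the three request fields shown in the curl example are used, and the response shape is left to docs/openapi.yaml rather than assumed here.

```python
import json
import urllib.request


def build_score_request(base_url, tool_name, agent_id, base_risk_score):
    """Build (but do not send) a POST to /v1/score with a JSON body."""
    body = json.dumps(
        {"tool_name": tool_name, "agent_id": agent_id, "base_risk_score": base_risk_score}
    ).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/score",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def score(base_url, tool_name, agent_id, base_risk_score, timeout=5.0):
    """Send the request to a running `vaara serve` and return the decoded JSON."""
    req = build_score_request(base_url, tool_name, agent_id, base_risk_score)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

Calling score("http://localhost:8000", "tx.transfer", "agent-007", 0.5) requires the server from the previous step to be running.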
Vaara's scorer can be run alongside external scorers via vaara.scorer.composition.ExternalScorer and vaara.scorer.composite.CompositeScorer. Any service that implements the /v1/score wire contract (NeMo Guardrails, another Vaara instance) can be composed. The composite preserves the strongest decision across members.
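One way to read "preserves the strongest decision" is that the most restrictive member wins. The sketch below encodes that reading; the ordering allow < escalate < deny and the function name are assumptions, and the actual CompositeScorer may differ.

```python
# Assumed severity ordering; higher means more restrictive.
SEVERITY = {"allow": 0, "escalate": 1, "deny": 2}


def strongest_decision(decisions):
    """Return the most restrictive decision among the member scorers."""
    if not decisions:
        raise ValueError("at least one decision is required")
    return max(decisions, key=lambda d: SEVERITY[d])


print(strongest_decision(["allow", "escalate", "allow"]))
```

Under this rule, adding a scorer to the composite can only tighten the outcome, never loosen it.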
TypeScript client
The first-party TypeScript client lives at clients/ts and ships on npm as @vaara/client. Typed wrappers over every v1 endpoint, Node 18+, ESM, declarations shipped. JS/TS agents (LangChain.js, Vercel AI SDK, MCP, any Node service) can call Vaara without a Python sidecar.
npm install @vaara/client
import { VaaraClient } from "@vaara/client";
const vaara = new VaaraClient({ baseUrl: "http://localhost:8000" });
const r = await vaara.score({ tool_name: "tx.transfer", agent_id: "agent-007", base_risk_score: 0.6 });
if (r.decision === "deny") throw new Error("blocked");
MCP proxy (Vaara as a transparent governance layer)
vaara.integrations.mcp_proxy.VaaraMCPProxy sits between an MCP client (Claude Code, Cursor, any MCP-capable host) and an upstream MCP server. Every tools/call from the client routes through Vaara's interception pipeline before reaching the upstream. Allowed calls forward transparently and report the upstream outcome back to the scorer. Blocked calls return an MCP isError: true response with the block reason. The initialization handshake and notifications/* forward unchanged. tools/list, resources/list, resources/read, prompts/list, and prompts/get route through the operator perimeter before reaching the client or upstream.
python -m vaara.integrations.mcp_proxy \
--upstream npx --upstream-arg -y --upstream-arg @sap/mdk-mcp-server \
--db ./mcp_audit.db
Point your MCP client at the proxy instead of the upstream. The audit chain captures every tool call without changing client or upstream behavior. Distinct from mcp_server, which exposes Vaara itself as an MCP server for agents that consult Vaara as a tool.
Operator perimeter: tool, resource, prompt filtering
The proxy accepts repeatable --allow-tool NAME / --deny-tool NAME, --allow-resource URI / --deny-resource URI, and --allow-prompt NAME / --deny-prompt NAME flags. Filtered tools are dropped from tools/list responses before the client sees them and any matching tools/call is rejected at the proxy perimeter without contacting the upstream. The same shape extends to resources/list + resources/read and prompts/list + prompts/get. Denylist wins on overlap with allowlist. No flags = passthrough. Every allowed resources/read and prompts/get writes a request+decision audit pair to the hash chain so a regulator can reconstruct exactly which resources the agent read and which prompts it retrieved. Read-oriented MCP surfaces do not run through the risk scorer. The operator perimeter is the gate, the audit chain is the evidence.
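The filtering rules described above fit in a small predicate. This sketch assumes exact-name matching and assumes that a non-empty allowlist excludes everything not on it; the source states only that the denylist wins on overlap and that no flags means passthrough. The function names are hypothetical.

```python
def permitted(name, allow=(), deny=()):
    """Apply the perimeter rules to one tool, resource URI, or prompt name."""
    if name in deny:  # denylist wins on overlap with the allowlist
        return False
    if allow:  # assumed: a non-empty allowlist is exclusive
        return name in allow
    return True  # no flags: passthrough


def filter_listing(names, allow=(), deny=()):
    """Drop filtered entries from a tools/list-style response."""
    return [n for n in names if permitted(n, allow, deny)]


tools = ["fs.read_file", "fs.write_file", "tx.transfer"]
print(filter_listing(tools, allow=["fs.read_file", "tx.transfer"], deny=["tx.transfer"]))
```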
OVERT 1.0 envelopes per interaction
Off by default. When you pass --overt-signing-key KEY.pem, --overt-operator-key OPKEY.bin, and --overt-receipts-dir DIR/, the proxy writes one OVERT 1.0 Protocol Profile 1.0 Base Envelope (canonical CBOR, Ed25519, closed 9-field schema) per governed interaction into DIR/{nanosecond_timestamp}-{counter:010d}.cbor. Covers all four states: allowed tools/call, blocked tools/call, perimeter-filtered call, and perimeter-filtered resources/read / prompts/get. The arbiter public key is pinned alongside as pubkey.bin. Each envelope verifies offline under vaara overt verify against any conformant verifier.
# 1. Generate an Ed25519 signing key (evaluation/demo; for production use a KMS or HSM, see docs/signing-keys.md).
vaara keygen --dev --out signing.pem
# 2. Mint an operator HMAC key (>= 16 raw bytes). Used for request_commitment per OVERT Annex B.4.
head -c 32 /dev/urandom > op.key
# 3. Run the proxy with OVERT emission turned on.
python -m vaara.integrations.mcp_proxy \
--upstream npx --upstream-arg -y --upstream-arg @sap/mdk-mcp-server \
--overt-signing-key signing.pem \
--overt-operator-key op.key \
--overt-receipts-dir ./overt_receipts
# Each interaction now produces a Provisional Receipt:
vaara overt verify ./overt_receipts/1779309684224332669-0000000001.cbor \
--pubkey-file ./overt_receipts/pubkey.bin
# → {"valid": true, "monotonic_counter": 1, ...}
non_content_metadata carries structural fields only (action class, tool/resource/prompt identifier, decision, reason, agent_id, action_id). The request content itself never leaves the operator environment; only its HMAC-SHA256 commitment crosses the trust boundary. The monotonic counter advances strictly across the whole proxy process so gaps are detectable. Emission failure is logged and swallowed: attestation problems must not block legitimate upstream traffic.
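Because the counter is embedded in each receipt filename, a reviewer can look for breaks in the sequence without opening any envelope. The sketch below parses the {nanosecond_timestamp}-{counter:010d}.cbor pattern described above; it assumes the counter advances by exactly one per envelope, and the function name is hypothetical.

```python
import re

# Matches names like 1779309684224332669-0000000001.cbor
RECEIPT_NAME = re.compile(r"^(\d+)-(\d{10})\.cbor$")


def counter_gaps(filenames):
    """Return (previous, next) counter pairs where the sequence is not consecutive."""
    matches = (RECEIPT_NAME.match(name) for name in filenames)
    counters = sorted(int(m.group(2)) for m in matches if m)
    return [(prev, cur) for prev, cur in zip(counters, counters[1:]) if cur != prev + 1]


names = [
    "1779309684224332669-0000000001.cbor",
    "1779309684224999999-0000000002.cbor",
    "1779309684225555555-0000000004.cbor",
    "pubkey.bin",  # ignored: does not match the receipt pattern
]
print(counter_gaps(names))
```

A filename check of this kind only flags missing files; the counter inside each signed envelope remains the authoritative value.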
Streaming notifications inside the boundary
Long-running upstream tools emit notifications/progress and notifications/message over the lifetime of a tools/call. The proxy routes each notification through the same audit pair (request + decision) and, when OVERT is configured, emits a dedicated Base Envelope with action class mcp.notification.progress or mcp.notification.message. Progress events correlate to the originating call via the _meta.progressToken from the request, so a regulator reading the receipt directory can reconstruct what arrived between request and response. Notifications still forward to the client unchanged. Audit failures are logged and swallowed: observation never blocks streaming.
Worked examples with real upstream servers:
- examples/github-mcp-proxy-demo/. Vaara in front of github/github-mcp-server (GitHub's official MCP server, MIT-licensed). End-to-end verified: real subprocess, 42 tools advertised, hash-chained audit trail recorded.
- examples/sap-mcp-proxy-demo/. Vaara in front of community SAP MCP servers (SAP/mdk-mcp-server, mario-andreschak/mcp-abap-abap-adt-api, lemaiwo/btp-sap-odata-to-mcp-server).
The proxy is MCP-protocol-level, not vendor-specific. The same three-step recipe applies to any stdio-capable MCP server (Microsoft Graph MCP, Salesforce MCP, ServiceNow MCP, cloud-provider MCP servers, Databricks MCP, and so on).
OVERT 1.0 attestation
What. OVERT 1.0 is an open standard for runtime trust in AI systems (overt.is, authored by Glacis Technologies, published 25 March 2026). It defines a signed, schema-closed envelope a relying party can verify offline without trusting the emitter.
Why. A regulator, auditor, or customer can confirm that a runtime decision actually happened the way you say it did, without reading your code or trusting your stack.
How Vaara emits it. Vaara is the Arbiter in OVERT terms and ships Protocol Profile 1.0 Base Envelopes (canonical CBOR per RFC 8949, Ed25519 signatures, HMAC-SHA256 keyed commitments, closed 9-field schema, IEEE-754 float rejection) alongside every audit record when attestation is enabled.
pip install 'vaara[attestation]'
from vaara.attestation.overt import emit_base_envelope, make_request_commitment, encoder_binary_identity

envelope = emit_base_envelope(
    signing_key=key,
    request_commitment=make_request_commitment(payload, operator_key=op_key),
    encoder_binary_identity=encoder_binary_identity(arbiter_version="vaara/0.26.0", policy_hash=ph),
    non_content_metadata={"action_class": "tx.transfer", "decision": "escalate"},
    monotonic_counter=42,
    arbiter_instance_identifier=uuid_bytes,
)
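For readers unfamiliar with keyed commitments, the general shape of an HMAC-SHA256 commitment looks like the sketch below. It is not the body of make_request_commitment: the exact input encoding required by OVERT Annex B.4 is not reproduced here, so hashing the raw payload bytes is an assumption, as is the function name. The 16-byte minimum key length follows the operator-key note earlier in this document.

```python
import hashlib
import hmac


def request_commitment(payload: bytes, operator_key: bytes) -> bytes:
    """Keyed commitment to a request; reveals nothing useful without the key."""
    if len(operator_key) < 16:
        raise ValueError("operator key must be at least 16 raw bytes")
    return hmac.new(operator_key, payload, hashlib.sha256).digest()


def commitment_matches(payload: bytes, operator_key: bytes, commitment: bytes) -> bool:
    """Constant-time check that a disclosed payload matches a stored commitment."""
    return hmac.compare_digest(request_commitment(payload, operator_key), commitment)
```

Only the operator, who holds the key, can later show that a given request corresponds to the commitment in an envelope.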
The reference Phase 3 IAP (vaara.attestation.iap) notary-signs the Provisional Receipt and anchors it in a transparency log. Production deployments can swap in sigstore Rekor or an equivalent independently-operated log at the same call sites. The OVERT S3P (MEA-2) emitter at vaara.attestation.s3p ships exact Clopper-Pearson confidence intervals (pure Python, no scipy) and a proposed Protocol Profile extension that reports aggregate statistics over per-action conformal prediction intervals alongside the standard binomial CI.
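An exact Clopper-Pearson interval can indeed be computed without scipy. The sketch below does it by bisecting on the binomial tail probabilities; it is an independent illustration, not the code in vaara.attestation.s3p, and the function names are assumptions.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p); evaluates to 0 when k < 0."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_increasing(f, target, iters=100):
    """Bisection on [0, 1] for a function that increases with p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided interval for k successes in n trials."""
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("need n > 0 and 0 <= k <= n")
    # Lower bound: the p at which P(X >= k) reaches alpha / 2.
    lower = 0.0 if k == 0 else _solve_increasing(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2
    )
    # Upper bound: the p at which P(X > k) reaches 1 - alpha / 2.
    upper = 1.0 if k == n else _solve_increasing(
        lambda p: 1 - binom_cdf(k, n, p), 1 - alpha / 2
    )
    return lower, upper
```

For zero observed successes the upper bound has the closed form 1 - (alpha/2)^(1/n), which makes a convenient sanity check.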
The vaara overt verify RECEIPT.cbor --pubkey-file PUB.bin CLI validates any canonical-CBOR Base Envelope against a supplied raw 32-byte Ed25519 public key. The verifier reads only the wire format and takes no dependency on Vaara's emitter, so any OVERT-conformant implementation can route its conformance check through it.
An experimental hardware TEE hook (vaara.attestation.tee) binds an OVERT envelope to an AMD SEV-SNP attestation report by placing SHA-512(canonical_cbor(envelope)) in the report's 64-byte REPORT_DATA field. The envelope schema is unchanged (closed per spec). The TEE report is a sibling artefact: a relying party checks the Ed25519 envelope signature and the ECDSA P-384 report signature independently. vaara tee parse and vaara tee verify expose the verifier as a CLI.
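The binding itself is a digest comparison. The sketch below shows only that step, assuming the canonical CBOR bytes of the envelope and the 64-byte REPORT_DATA field have already been extracted; it does not verify either signature, and the function names are hypothetical rather than part of vaara.attestation.tee.

```python
import hashlib
import hmac

REPORT_DATA_LEN = 64  # size of the SEV-SNP REPORT_DATA field, equal to a SHA-512 digest


def expected_report_data(envelope_cbor: bytes) -> bytes:
    """SHA-512 over the envelope's canonical CBOR encoding."""
    return hashlib.sha512(envelope_cbor).digest()


def envelope_bound_to_report(envelope_cbor: bytes, report_data: bytes) -> bool:
    """True when the attestation report commits to exactly this envelope."""
    if len(report_data) != REPORT_DATA_LEN:
        return False
    return hmac.compare_digest(expected_report_data(envelope_cbor), report_data)
```

A relying party would run this check in addition to, not instead of, the two independent signature verifications.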
See COMPLIANCE.md "Position relative to open runtime-attestation standards" for the architectural framing and "OVERT 1.0 Part 3 (Agentic AI Controls) mapping" for the TOOL-*, MCP-*, MULTI-*, CAP-*, DISC-*, HITL-*, DRIFT-* control-by-control walk.
Where things live
| Path | Contents |
|---|---|
| docs/formal_specification.md | MWU regret bound, conformal coverage, security properties |
| docs/conformal-prediction.md | Plain-language explainer for compliance reviewers and legal counsel |
| COMPLIANCE.md | EU AI Act (Art. 9, 11 to 15, 72) and DORA (Art. 10, 12, 13) mapping, eval numbers, PAIR calibration |
| VERDICTS.md | Per-article evidence sufficiency thresholds and decision tree |
| CHANGELOG.md | Version-by-version feature evolution |
| PRIOR_ART.md | When each Vaara concept first shipped, and a neutral list of adjacent published work |
| OWASP_AGENTIC.md | Vaara mapping to OWASP Top 10 for Agentic Applications 2026 (ASI01 to ASI10) |
| OVERT_CONTROLS.md | Vaara mapping to OVERT 1.0 Part 3 Agentic AI Controls (TOOL-*, MCP-*, MULTI-*, CAP-*, DISC-*, HITL-*, DRIFT-*) |
| docs/signing-keys.md | Release signing and verification |
| SECURITY.md | Security policy and reporting |
| CONTRIBUTING.md | Contribution guidelines |
| src/vaara/integrations/ | LangChain, OpenAI Agents SDK, CrewAI, MCP, Bedrock, Azure, GCP |
| src/vaara/audit/ | Hash-chain trail, SQLite backend, append-only WAL |
| src/vaara/policy/ | YAML / JSON policy schema, vaara policy validate and vaara policy test |
| src/vaara/sandbox/ | Synthetic-trace cold-start calibration |
Acknowledgements:
- Vaara is listed in the industry acknowledgements of the IMDA Model AI Governance Framework for Agentic AI v1.5 (Singapore, 20 May 2026).
- The AMD AI Developer Program ran a coordinated multi-channel developer testimonial of Vaara in May 2026.
- "Article 14 runtime: why oversight of agentic AI has to be evidenced as action, not model" is the position post on the EU Apply AI Alliance Futurium.
Vaara helps deployers assemble evidence for their own conformity work. It does not certify compliance or constitute legal advice. Deployers own their obligations under the EU AI Act and other applicable law.
License
Apache 2.0. See LICENSE.