Adaptive AI Agent Execution Layer — risk scoring, audit trails, regulatory compliance
Vaara
Adaptive AI agent execution layer. Sits between agents and actions, scores risk in real time, and produces EU AI Act conformity evidence as a byproduct.
Three questions for every agent action:
- Should this happen? (adaptive risk scoring with conformal prediction)
- What is this? (action taxonomy with regulatory classification)
- What happened and why? (hash-chained audit trail)
Why Vaara
AI governance tools audit models. Vaara governs actions.
Models are scored once at deployment. Agents act continuously at runtime -- calling tools, moving money, modifying infrastructure. Individual actions may be safe, but sequences can be catastrophic: `read_data` + `export_data` + `delete_data` is a data-exfiltration pattern in which each step alone is benign.
Vaara catches this. It learns which risk signals predict bad outcomes, adapts its scoring online, and wraps every estimate in a distribution-free confidence interval. No retraining. No manual threshold tuning.
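The exfiltration example above comes down to subsequence matching over an agent's recent action history. A minimal sketch of the idea (illustrative only -- `contains_subsequence` and `EXFIL_PATTERN` are assumptions here, not Vaara's API):

```python
# Hypothetical sketch: flag a known-dangerous pattern occurring, in order,
# anywhere inside an agent's action history.
EXFIL_PATTERN = ("read_data", "export_data", "delete_data")

def contains_subsequence(history, pattern):
    """True if `pattern` occurs in `history` in order (not necessarily contiguously)."""
    it = iter(history)
    return all(step in it for step in pattern)

history = ["read_data", "summarize", "export_data", "delete_data"]
assert contains_subsequence(history, EXFIL_PATTERN)       # pattern present
assert not contains_subsequence(["read_data"], EXFIL_PATTERN)
```

Each step passes an allow-list check in isolation; only the ordered combination trips the detector.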
Install
```
pip install vaara
```
Python 3.10+. Zero runtime dependencies.
Quick Start
```python
from vaara.pipeline import InterceptionPipeline

pipeline = InterceptionPipeline()

# Agent wants to execute a tool
result = pipeline.intercept(
    agent_id="agent-007",
    tool_name="tx.transfer",
    parameters={"to": "0x...", "amount": 1000},
    agent_confidence=0.8,
)

if result.allowed:
    execute_tool("tx.transfer", {"to": "0x...", "amount": 1000})
    # Report outcome so the scorer learns
    pipeline.report_outcome(result.action_id, outcome_severity=0.0)
else:
    print(f"Blocked: {result.reason}")
    # result.decision is "deny" or "escalate"
    # result.risk_score and result.risk_interval available
```
How It Works
Risk Scoring (Adaptive)
Five expert signals scored per action, combined via Multiplicative Weight Update:
| Signal | What it measures |
|---|---|
| Taxonomy base | Static risk from action metadata (reversibility, blast radius, urgency) |
| Agent history | This agent's track record (denial rate, bad outcomes) |
| Sequence pattern | Dangerous action sequences (data exfiltration, privilege escalation, flash loan attacks) |
| Burst detection | Unusual action frequency spikes |
| Confidence gap | High confidence on high-risk actions = suspicious |
MWU learns which signals actually predict bad outcomes and shifts weight toward them. Converges to within 6% of the best expert within 1,000 observations.
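The mechanism can be sketched in a few lines. This is an illustrative Multiplicative Weight Update combiner, not Vaara's internals -- `MWUCombiner` and its parameters are assumptions:

```python
import math

# Hypothetical MWU sketch: combine expert risk signals by weighted average,
# then exponentially down-weight experts whose predictions missed the
# observed outcome severity.
class MWUCombiner:
    def __init__(self, n_experts, eta=0.1):
        self.weights = [1.0] * n_experts
        self.eta = eta

    def score(self, expert_scores):
        total = sum(self.weights)
        return sum(w * s for w, s in zip(self.weights, expert_scores)) / total

    def update(self, expert_scores, outcome):
        # Loss per expert = absolute prediction error.
        for i, s in enumerate(expert_scores):
            self.weights[i] *= math.exp(-self.eta * abs(s - outcome))

combiner = MWUCombiner(n_experts=3)
for _ in range(200):
    # Expert 0 keeps predicting the true severity (0.9); the others are off.
    combiner.update([0.9, 0.1, 0.5], outcome=0.9)
# Weight concentrates on the best expert, so the combined score tracks it.
```

After enough outcomes, the combined score is dominated by whichever signal has been predicting severity most accurately.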
Conformal Prediction
Every risk score is wrapped in a conformal prediction interval:
```
P(true_risk in [lower, upper]) >= 1 - alpha
```
No distributional assumptions. No model retraining. The decision uses the upper bound -- conservative by construction. Under distribution shift, FACI adaptive alpha maintains long-run coverage.
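A split-conformal interval of this flavor can be sketched as follows. This is a generic illustration of the technique, assuming absolute-error residuals from a held-out calibration set; `conformal_interval` is not Vaara's API:

```python
import math

# Hypothetical split-conformal sketch: wrap a point risk estimate in an
# interval whose width is a finite-sample-valid quantile of calibration residuals.
def conformal_interval(point_score, calibration_residuals, alpha=0.1):
    n = len(calibration_residuals)
    # ceil((n+1)(1-alpha)) gives coverage >= 1 - alpha with no distributional assumptions.
    rank = math.ceil((n + 1) * (1 - alpha))
    q = sorted(calibration_residuals)[min(rank, n) - 1]
    return max(0.0, point_score - q), min(1.0, point_score + q)

# Residuals = |predicted risk - observed severity| on past, scored actions.
residuals = [0.05, 0.10, 0.02, 0.08, 0.12, 0.03, 0.07, 0.04, 0.09, 0.06]
lower, upper = conformal_interval(0.5, residuals, alpha=0.2)  # ~(0.4, 0.6)
```

With few calibration points the quantile rank saturates and the interval is wide; it tightens as residuals accumulate, which is the cold-start behavior described below.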
Decisions
- ALLOW — upper bound < 0.3 (configurable)
- ESCALATE — upper bound between 0.3 and 0.7 → route to human
- DENY — upper bound > 0.7
Cold start is maximally cautious: wide intervals route most actions through human review. As outcomes accumulate, intervals tighten and the system becomes autonomous.
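The three-way rule above reduces to a comparison on the interval's upper bound (a hypothetical sketch with the thresholds from the text; `decide` is not Vaara's API):

```python
# Conservative by construction: decide on the interval's UPPER bound,
# so wide cold-start intervals fall into the escalate band.
def decide(risk_interval, allow_below=0.3, deny_above=0.7):
    upper = risk_interval[1]
    if upper < allow_below:
        return "allow"
    if upper > deny_above:
        return "deny"
    return "escalate"  # route to a human reviewer

assert decide((0.05, 0.2)) == "allow"
assert decide((0.1, 0.5)) == "escalate"  # wide interval -> human review
assert decide((0.6, 0.9)) == "deny"
```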
Framework Integrations
LangChain
```python
from vaara.integrations.langchain import VaaraCallbackHandler

pipeline = InterceptionPipeline()
handler = VaaraCallbackHandler(pipeline, agent_id="my-agent")
agent = create_react_agent(llm, tools, callbacks=[handler])
```
OpenAI Agents SDK
```python
from vaara.integrations.openai_agents import VaaraToolGuardrail

pipeline = InterceptionPipeline()
guardrail = VaaraToolGuardrail(pipeline)
agent = Agent(name="my-agent", tools=[...], output_guardrails=[guardrail])
```
CrewAI
```python
from vaara.integrations.crewai import VaaraCrewGovernance

pipeline = InterceptionPipeline()
gov = VaaraCrewGovernance(pipeline)
safe_crew = gov.governed_kickoff(crew)
```
MCP Server (Claude Code, Cursor)
```
python -m vaara.integrations.mcp_server
```
Add to Claude Code settings:
```json
{
  "mcpServers": {
    "vaara": {
      "command": "python",
      "args": ["-m", "vaara.integrations.mcp_server"]
    }
  }
}
```
Microsoft Agent Governance Toolkit
Vaara implements the ExternalPolicyBackend protocol -- plug it into any AGT PolicyEvaluator chain.
DeFi Extension
25 DeFi action types with MEV vulnerability scoring, atomicity classification, and 17 dangerous sequence patterns:
```python
from vaara.taxonomy.defi import register_defi_actions

register_defi_actions(pipeline.registry)
# Now catches: flash loan attacks, rug pulls, governance attacks,
# bridge exploits, vault manipulation, oracle sandwiches, and more
```
Compliance
Built-in conformity assessment against EU AI Act and DORA:
```python
report = pipeline.run_compliance_assessment(
    system_name="My Agent System",
    system_version="1.0.0",
)

# Article-by-article evidence mapping
for article in report.articles:
    print(f"{article.reference}: {article.status}")
# EU AI Act Art. 9(1): COMPLIANT
# EU AI Act Art. 12(1): COMPLIANT
# ...
```
Mapped requirements:
- EU AI Act: Articles 9, 11-15, 61 (risk management, documentation, logging, transparency, human oversight, accuracy, post-market monitoring)
- DORA: Articles 10, 12, 13 (ICT risk management, incident detection, incident response)
The audit trail is hash-chained (SHA-256) and tamper-evident, satisfying Article 12(1) logging requirements.
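The hash-chain construction is standard: each record's digest covers its payload plus the previous record's digest, so editing any entry invalidates everything after it. An illustrative sketch (assumed shape, not Vaara's record format):

```python
import hashlib
import json

GENESIS = "0" * 64

def append_record(chain, payload):
    # Each hash commits to the payload AND the previous hash.
    prev = chain[-1]["hash"] if chain else GENESIS
    body = json.dumps(payload, sort_keys=True)
    digest = hashlib.sha256((prev + body).encode()).hexdigest()
    chain.append({"payload": payload, "prev": prev, "hash": digest})

def verify(chain):
    # Recompute every link; any edit breaks the chain from that point on.
    prev = GENESIS
    for rec in chain:
        body = json.dumps(rec["payload"], sort_keys=True)
        if rec["prev"] != prev or rec["hash"] != hashlib.sha256((prev + body).encode()).hexdigest():
            return False
        prev = rec["hash"]
    return True

chain = []
append_record(chain, {"action": "tx.transfer", "decision": "allow"})
append_record(chain, {"action": "db.drop", "decision": "deny"})
assert verify(chain)
chain[0]["payload"]["decision"] = "allow_anyway"  # tamper with history
assert not verify(chain)
```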
Cold Start
Generate synthetic traces to pre-calibrate the scorer:
```python
from vaara.sandbox.trace_gen import TraceGenerator

gen = TraceGenerator()
traces = gen.generate(n_traces=100)
gen.pre_calibrate(pipeline.scorer, traces)
# Calibration in minutes instead of hours
```
Three agent archetypes (benign, careless, adversarial) with realistic outcome distributions.
Architecture
```
Agent (LangChain / OpenAI / CrewAI / MCP)
        |
        v
InterceptionPipeline.intercept()
        |
        +-- ActionRegistry   -> classify tool_name to ActionType
        +-- AdaptiveScorer   -> MWU + conformal risk interval
        +-- AuditTrail       -> hash-chained immutable log
        +-- ComplianceEngine -> EU AI Act + DORA evidence mapping
        |
        v
InterceptionResult { allowed, risk_score, risk_interval, reason }
        |
        v
Execute or Block
        |
        v
report_outcome() -> closes feedback loop, MWU learns
```
Persistence
```python
from vaara.audit.sqlite_backend import SQLiteAuditBackend

backend = SQLiteAuditBackend("audit.db")
trail = AuditTrail(on_record=backend.write)
pipeline = InterceptionPipeline(trail=trail)
```
WAL-mode SQLite, append-only, hash chain verified on load.
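The persistence pattern itself is plain `sqlite3`. A minimal sketch of the described shape -- WAL journal mode plus an insert-only table -- with an assumed schema, not Vaara's actual one:

```python
import sqlite3

# Hypothetical append-only audit table; column names are assumptions.
conn = sqlite3.connect(":memory:")  # use a file path like "audit.db" in practice
conn.execute("PRAGMA journal_mode=WAL")  # WAL needs a file-backed db; harmless no-op in memory
conn.execute(
    "CREATE TABLE IF NOT EXISTS audit ("
    " seq INTEGER PRIMARY KEY AUTOINCREMENT,"
    " record_hash TEXT NOT NULL,"
    " payload TEXT NOT NULL)"
)
conn.execute(
    "INSERT INTO audit (record_hash, payload) VALUES (?, ?)",
    ("deadbeef", '{"action": "tx.transfer", "decision": "allow"}'),
)
conn.commit()
# On load, rows are read back in sequence order and the hash chain re-verified.
rows = conn.execute("SELECT seq, record_hash FROM audit ORDER BY seq").fetchall()
```

Append-only means the application issues only INSERTs; combined with the hash chain, a deleted or rewritten row is detectable at load time.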
Formal Specification
See docs/formal_specification.md for the mathematical foundations: MWU regret bounds, conformal coverage guarantees, convergence rates, and security properties.
Tests
```
pip install vaara[dev]
pytest
```
169 tests, 86% coverage, runs in <0.5s.
License
See LICENSE.
File details
Details for the file vaara-0.3.0.tar.gz.
File metadata
- Download URL: vaara-0.3.0.tar.gz
- Upload date:
- Size: 92.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 09d9a9aa8484ba0038a1a3e9c7227e374b7c2b64cc8cfc5f8a132e56e58d6d71 |
| MD5 | b54ef2d18d9840e0cdd61f30bf6c8feb |
| BLAKE2b-256 | a7fd2dd358f07fe5c140135b562142f5c733f3a55e2c0ae913a2758be4a00e82 |
File details
Details for the file vaara-0.3.0-py3-none-any.whl.
File metadata
- Download URL: vaara-0.3.0-py3-none-any.whl
- Upload date:
- Size: 75.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | bb419d766a141953b3616a6272e750f91c649cae9d66f82ededd3c8d7b482f61 |
| MD5 | df4d986bdc4819f979566d8e254a666e |
| BLAKE2b-256 | bf742c0e1e1eb9eb359ef8bbb45c1809ebb4b132011c3eb7fdd114c85e5fdd50 |