Runtime containment kernel for LLM agents. Enforces budget, step, retry, and circuit-breaker limits before the model call.
Project description
VERONICA-core
VERONICA-Core is a zero-dependency runtime containment library for LLM agents, enforcing budget, step, retry, and circuit-breaker limits before failures turn into runaway execution.
It is not a prompt filter. It is not a semantic guardrail. It is an execution enforcement kernel.
pip install veronica-core
Zero required dependencies. Operationally stable at v3.10.0. Python 3.10+. Works with any LLM provider.
What VERONICA-Core does
- Budget ceiling -- hard cost cap per chain. The call is blocked before it reaches the model when the cap is exceeded.
- Step limit -- bounded recursion depth. Agent loops cannot run indefinitely.
- Retry containment -- 3 layers x 4 retries would produce 64 API calls. VERONICA-Core caps the total at your limit.
- Circuit breaker -- per-entity failure counting with automatic COOLDOWN (local or Redis-backed).
- 4-tier degradation -- ALLOW -> DEGRADE -> RATE_LIMIT -> HALT. The call does not proceed past HALT.
- Policy-based execution -- 80+ declarative rule types, YAML/JSON, hot-reload, ed25519 signing.
What it does not do
- No prompt content filtering
- No output moderation or semantic classification
- No hosted control plane (that is veronica, a separate repo)
- No high-availability or federation (planned for v4.0)
- No scheduling, routing, or orchestration
If you want to filter prompt content, use guardrails-ai or NeMo Guardrails. VERONICA-Core caps resource consumption; it does not inspect what the content says.
Quickstart
Two lines. Your existing agent code unchanged. Hard ceiling at $1.00:
from veronica_core.patch import patch_openai
from veronica_core import veronica_guard
patch_openai()
@veronica_guard(max_cost_usd=1.0, max_steps=20)
def run_agent(prompt: str) -> str:
from openai import OpenAI
return OpenAI().chat.completions.create(
model="gpt-4o-mini",
messages=[{"role": "user", "content": prompt}],
).choices[0].message.content
Full control with ExecutionContext:
from veronica_core import ExecutionContext, ExecutionConfig, WrapOptions
def simulated_llm_call(prompt: str) -> str:
return f"response to: {prompt}"
config = ExecutionConfig(
max_cost_usd=1.00, # hard cost ceiling per chain
max_steps=50, # hard step ceiling
max_retries_total=10,
timeout_ms=0,
)
with ExecutionContext(config=config) as ctx:
for i in range(3):
decision = ctx.wrap_llm_call(
fn=lambda: simulated_llm_call(f"prompt {i}"),
options=WrapOptions(
operation_name=f"generate_{i}",
cost_estimate_hint=0.04,
),
)
if decision.name == "HALT":
break
snap = ctx.get_graph_snapshot()
print(snap["aggregates"])
# {"total_cost_usd": 0.12, "total_llm_calls": 3, ...}
Core runtime controls
Budget enforcement
Hard cost ceiling per chain. The reserve/commit two-phase protocol prevents double-spending across concurrent calls. When the ceiling is reached, the decision is HALT before the call is dispatched.
Step and loop limits
Bounded recursion depth per entity. A max_steps limit stops agents from looping past a fixed count. Semantic loop detection (word-level Jaccard similarity, no ML dependencies) catches repeated content without requiring an additional model call.
Retry amplification
When retries are nested, they multiply. max_retries_total caps the total across all layers. Jitter and backoff are configurable.
Circuit breaker
Per-entity failure counting with CLOSED -> OPEN -> HALF_OPEN states. Local backend or Redis-backed for shared state across workers. DistributedCircuitBreaker uses a Lua script for atomic slot reservation.
Policy-based execution
PolicyEngine evaluates rules before each step. 80+ built-in rule types covering shell commands, network destinations, filesystem paths, token budgets, and agent identity. Rules load from YAML or JSON, hot-reload without restart, and sign with ed25519 for tamper detection.
Adapters and integrations
12 adapters ship with the library. All adapters use ExecutionContext internally.
| Framework | Adapter | Notes |
|---|---|---|
| OpenAI SDK | patch_openai() |
Monkey-patch; no code changes required |
| Anthropic SDK | patch_anthropic() |
Same patch pattern |
| LangChain | VeronicaCallbackHandler |
Callback-based |
| AG2 (AutoGen) | CircuitBreakerCapability |
AgentCapability protocol; PR #2430 merged in AG2 v0.11.3 |
| AG2 eligibility | AgentEligibilityPolicy |
Per-agent step gate |
| LlamaIndex | VeronicaLlamaIndexHandler |
Handler-based |
| CrewAI | VeronicaCrewAIListener |
Listener-based |
| LangGraph | VeronicaLangGraphCallback |
Callback-based |
| ASGI/WSGI | VeronicaASGIMiddleware |
HTTP and WebSocket |
| MCP (sync) | MCPContainmentAdapter |
See MCP section below |
| MCP (async) | async variant of MCP adapter | See MCP section below |
| A2A (client) | A2AContainmentAdapter |
Google A2A protocol, client side |
| A2A (server) | A2AServerContainmentMiddleware |
Google A2A protocol, server side |
Examples: examples/ | LangChain quickstart: examples/langchain_minimal.py | LangGraph: examples/langgraph_minimal.py | AG2: examples/ag2_circuit_breaker.py
Policy engine
Rules are declared in YAML or JSON and evaluated before each governed step:
# veronica_policy.yaml
rules:
- id: block_shell_rm
type: shell_command
pattern: "rm"
verdict: DENY
- id: cap_token_output
type: token_budget
max_output_tokens: 2048
verdict: DEGRADE
from veronica_core.policy import PolicyEngine
engine = PolicyEngine.from_yaml("veronica_policy.yaml")
result = engine.evaluate_shell(["rm", "-rf", "/tmp/data"])
# result.verdict == "DENY"
Features:
- 80+ built-in rule types: shell, network, filesystem, token, agent identity, memory, MCP tool
- YAML and JSON formats with hot-reload (file watcher, no restart required)
- ed25519 signing for tamper detection on policy files
PolicySimulatorfor dry-run evaluation against recorded execution logsAgentEligibilityPolicyfor per-agent step gating based on identity and trust level
Audit, compliance, observability
Audit log
Every HALT and DEGRADE decision is written to AuditLog with timestamp, rule ID, verdict, and call graph node ID. The log is append-only within a session.
from veronica_core.audit import AuditLog
log = AuditLog()
entries = log.export_compliance(format="json")
Compliance export
export_compliance() produces structured JSON suitable for external retention. Covers budget overage events, circuit breaker state transitions, and policy DENY/HALT verdicts.
OpenTelemetry
pip install veronica-core[otel]
from veronica_core.otel import enable_otel_with_provider
enable_otel_with_provider(tracer_provider)
Traces are emitted for each wrap_llm_call span. Budget and circuit breaker state are attached as span attributes. OTelExecutionGraphObserver exports the full execution graph on session close.
OTel failure does not affect containment decisions. The enforcement path and the telemetry path are separate.
MCP support
MCP (Model Context Protocol) tools run as external processes that any LLM may call. VERONICA-Core can sit between the agent and the MCP server to enforce per-tool resource limits.
Current scope
- Per-tool budget enforcement: cost tracked per tool call, HALT when ceiling reached
- Step limit per tool invocation
- Policy rule evaluation before each tool call (
PolicyEngineintegration) - Sync and async variants:
MCPContainmentAdapter,AsyncMCPContainmentAdapter
from veronica_core.adapters.mcp import MCPContainmentAdapter
from veronica_core import ExecutionContext, ExecutionConfig
config = ExecutionConfig(max_cost_usd=0.10, max_steps=5)
ctx = ExecutionContext(config=config)
adapter = MCPContainmentAdapter(ctx)
result = adapter.call_tool("web_search", {"query": "latest news"}, fn=mcp_client.call)
Not yet implemented
- Server-level process isolation (containment is in-process only)
- Tool description pinning or schema provenance verification
- Cross-server budget coordination (each adapter instance tracks independently)
- MCP server registry integration
Operational maturity
- 6100+ tests (collected count as of v3.7.x)
- 90%+ coverage (CI-enforced floor)
- CI runs on Python 3.10, 3.11, 3.12, 3.13
- CI includes a free-threaded Python job (Python 3.13t,
PYTHON_GIL=0) to catch nogil regressions - Zero required dependencies (redis and otel are optional extras)
- Zero breaking changes from v2.1.0 through v3.7.4
- v3.7.5 removes the deprecated
veronica_core.adaptershim (deprecated since v3.4.0; useveronica_core.adapters.execinstead) - 4 independent security audits (130+ findings addressed)
- 20-scenario red-team regression suite: exfiltration, credential hunt, workflow poisoning, persistence
Security details: docs/SECURITY_CONTAINMENT_PLAN.md | docs/THREAT_MODEL.md
Positioning
Observability tools (Langfuse, LangSmith, Arize) record what happened after the call. They do not stop the call.
VERONICA-Core decides before the call: should this step proceed, degrade, or halt. It is an enforcement point, not a recording point.
veronica-core is the in-process enforcement kernel. It runs inside your agent process.
veronica is the separate control plane: policy management, fleet coordination, and dashboard. It tells veronica-core what to enforce.
TriMemory resolves which document governs before inference; veronica-core enforces that the inference does not exceed its resource budget.
Architecture
Each call passes through a ShieldPipeline of registered hooks. Any hook may emit DEGRADE or HALT. A HALT blocks the call and emits a SafetyEvent. veronica-core enforces that the evaluation occurs and the call does not proceed past HALT.
Details: docs/architecture.md | docs/diagrams/supporting-systems.svg | docs/diagrams/shield-pipeline-flow.svg
Examples
| File | Description |
|---|---|
| basic_usage.py | Budget enforcement and step limits |
| execution_context_demo.py | Step limit, budget, abort, circuit, divergence |
| adaptive_demo.py | Adaptive ceiling, cooldown, anomaly, replay |
| ag2_circuit_breaker.py | AG2 agent-level circuit breaker |
| langchain_minimal.py | LangChain integration quickstart |
| langgraph_minimal.py | LangGraph integration quickstart |
Install
pip install veronica-core
Optional extras:
pip install veronica-core[redis] # DistributedCircuitBreaker, RedisBudgetBackend
pip install veronica-core[otel] # OpenTelemetry export
pip install veronica-core[vault] # VaultKeyProvider (HashiCorp Vault)
Development:
git clone https://github.com/amabito/veronica-core
cd veronica-core
pip install -e ".[dev]"
pytest
Roadmap
v4.0 Federation (multi-process policy coordination) is the next planned milestone. No timeline commitment.
Full roadmap: docs/ROADMAP.md | CHANGELOG.md | docs/EVALUATION.md
License
Apache-2.0
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file veronica_core-3.10.0.tar.gz.
File metadata
- Download URL: veronica_core-3.10.0.tar.gz
- Upload date:
- Size: 340.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.13
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
4a61bc589549700ff4ab889fd1d897774a44087463acfa4ebd5d1f045ed50717
|
|
| MD5 |
8c7284f824ba7e67b0fabe020f865afb
|
|
| BLAKE2b-256 |
a0f2287d2f21c22886f5ea523030937587801fcaf20531750ad8d4416675254c
|
File details
Details for the file veronica_core-3.10.0-py3-none-any.whl.
File metadata
- Download URL: veronica_core-3.10.0-py3-none-any.whl
- Upload date:
- Size: 444.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.13
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
4b5b77e096ae552991a7ee5bd9d8361a0907059430b703bc919ebdb0398e3e64
|
|
| MD5 |
57eb8d2652ccb0dc07eab1b5f5a3c2f9
|
|
| BLAKE2b-256 |
3949689f0b2eac7511d393c0dcb84f327d0f3687b9156d99935220c6a682fe57
|