The agent framework where nothing is done until a grader returns a verdict — verification + grounded perception (AgentVision eyes), compounding memory, a fleet, agent-built tooling, and agent-run CI/CD.
Verel — Verified Agents 👁️🧠
Problem: AI agents declare work “done” on their own say-so — shipping broken UIs, failing tests and unverified claims they can’t actually check. Result: Verel makes “done” a verdict, not an opinion — every action is graded by real senses (including eyes, via AgentVision), and only verified work compounds into the fleet’s shared memory.
Verel is an agent framework built on the idea that every agent action is a hypothesis:
write → perceive → gate (verdict bus) → fix → re-render → pass (self-computed)
One verdict bus unifies vision + tests + lint + types into a single pass / warn / fail,
so progress, “done”, and what compounds are all decided in one place — with grader
attestation so a hollow check can’t mint green.
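As a rough illustration of how one gate can fold several senses into a single pass / warn / fail, here is a minimal sketch. The names `Verdict`, `SenseResult`, and `gate` are hypothetical and are not Verel's actual API; the sketch assumes only what the text states: the worst result wins, advisory senses are capped at warn, and an unattested precise check cannot produce green.

```python
from dataclasses import dataclass
from enum import IntEnum


class Verdict(IntEnum):
    PASS = 0
    WARN = 1
    FAIL = 2


@dataclass(frozen=True)
class SenseResult:
    # Hypothetical record shape; Verel's real schema may differ.
    sense: str         # e.g. "tests", "lint", "types", "vision"
    verdict: Verdict
    precise: bool      # True for deterministic graders, False for advisory ones
    attested: bool     # True if the grader proved it actually ran


def gate(results: list[SenseResult]) -> Verdict:
    """Fold every sense into one verdict; the worst result wins."""
    if not results:
        return Verdict.FAIL  # no evidence at all is not a pass
    worst = Verdict.PASS
    for r in results:
        v = r.verdict
        if not r.precise:
            v = min(v, Verdict.WARN)  # advisory ceiling: never worse than WARN
        elif not r.attested:
            v = Verdict.FAIL          # a hollow precise check cannot mint green
        worst = max(worst, v)
    return worst


if __name__ == "__main__":
    results = [
        SenseResult("tests", Verdict.PASS, precise=True, attested=True),
        SenseResult("vision", Verdict.FAIL, precise=False, attested=False),
    ]
    print(gate(results).name)
```

In this sketch a failing vision judgment can lower the outcome to warn but cannot fail the gate by itself, while a passing test run without attestation fails it.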
See it in 15 seconds
A repo ships with failing tests and no hint of the fix. Verel runs the real grader, an agent patches the source (never the tests), and the stage re-gates until the graders themselves go green. The agent never decides "done"; the verdict bus does.
The 60-second pitch
```shell
pip install verel
verel doctor           # check your environment
verel heal --repo .    # self-healing CI: failing tests → agent fixes → green
```

```python
from verel.ci import inner_loop_stage, self_heal

# tests fail → agent patches → pass
result = self_heal(".", inner_loop_stage(".", with_lint=False))
print(result.healed, result.terminated_on)  # True passed
```
Default LLM is Ollama Cloud (~/.config/ollama/key, model qwen3-coder:480b); set
VEREL_LLM_PROVIDER=openai to switch. Claude is one branch away in agents/llm.py.
New here? 5-minute tutorial →
The five organs
| Organ | Module | What it does |
|---|---|---|
| 🧠 Brain | verel.memory | Memory that compounds: trust + provenance, consolidation, and a held-out, attested promotion gate. Only verified facts/skills graduate. Lifecycle controls (pin / volatile-until-confirmed / TTL / correction chains / adaptive decay, where useful memories decay slower) keep it from becoming a junk drawer. Consolidation induces structured rules (condition→action), a multi-hop schema hierarchy (rules → principles → meta-principles), and cross-scope rules (a bug recurring across repos becomes a global rule). It also revises by contradiction: a rule that a new failure violates is weakened, then split into a narrowed rule plus an exception (or rejected), and the split propagates up the schema hierarchy so the principles above it stop over-claiming. A scope lattice (self → team → org → global) turns it into a shared brain: recall resolves down (the most specific scope wins), and a belief verified across sibling scopes graduates up as a candidate that must re-earn trust. A hosted memory service (MemoryServer/RemoteMemory) lets a fleet on different machines share one brain over HTTP safely: a peer's belief enters as a candidate and re-verifies before it's trusted (import_belief), and author reputation (AuthorTrust) means a noisy agent's claims need more corroboration, so one bad actor can't poison the swarm. A librarian pass (the brain's "sleep") periodically consolidates, graduates, and prunes so it compounds without rotting. For HA, ReplicatedMemory runs the store as a leader-fenced, fault-tolerant cluster: one leader at a time; mutations replicate to followers (a dead follower can't block writes; a write_quorum sets durability); a deposed leader is fenced out (no split-brain, no SPOF); and a lagging node catches up automatically via a background anti-entropy reconciler that pulls the current leader's state. On-disk stores are crash-safe (WAL + fsync), so an acked write survives a leader crash. Reads are local/eventual by default, read-your-writes (routed to the leader) for strong consistency, or quorum: versioned records let a point read poll replicas and return the freshest copy, so a read survives the leader being down. Backends: zero-dep LocalMemory or rented mem0; semantic recall + clustering via embeddings. |
| 👁️ Eyes | verel.senses | AgentVision as a perception organ (DOM/contrast/OCR grounded), feeding both the verdict bus and the brain as one of many senses. |
| ⚖️ Verdict bus | verel.verdict | One schema for every sense, with an advisory ceiling clamp, grader attestation, scrubbed fingerprints, and strict-subset stuck/progress detection. |
| 🚁 Fleet | verel.fleet | Agents managing agents: an LLM manager fans out, a scheduler runs workers in isolated git worktrees under budget, each gated by the bus. Concurrent managers are safe via fencing leases (a stale leader's writes are rejected), enforced even at the remote by a git pre-receive fencing sink (a stale push is refused) and across machines by a hosted control plane (lease authority behind an HTTP API). Multi-repo changes run as one cross-linked DAG and commit as an atomic saga (a failure compensates the repos that already landed, in reverse). |
| 🔧 Tool-smith | verel.toolsmith | Agents build their own tools: detect → scaffold → test → register → reuse, sandboxed (bwrap), admitted only on a passing attested eval. |
| ♻️ Agent-run CI/CD | verel.ci | Self-healing pipeline (inner-loop → pre-commit → pre-merge → canary) with a deterministic rollback engine that never acts on advisory evidence. Graders span Python / JS-TS / Go (tests · lint · types) plus perf (budget) and security (SAST/audit) senses, all on one bus, one gate. |
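The Brain row's scope lattice can be pictured with a small sketch. This is an illustration only, under the assumption that each scope is a plain key→value mapping; `recall` and `graduate` here are hypothetical helpers, not functions exported by verel.memory. It shows the two directions the table describes: recall resolves down to the most specific scope, and a belief verified in enough sibling scopes is proposed upward as a candidate rather than as a trusted fact.

```python
from collections import Counter
from typing import Optional

# Most specific scope first, as in the self → team → org → global lattice.
SCOPES = ["self", "team", "org", "global"]


def recall(store: dict, key: str) -> Optional[tuple]:
    """Return (scope, value) from the most specific scope that holds `key`."""
    for scope in SCOPES:
        value = store.get(scope, {}).get(key)
        if value is not None:
            return scope, value
    return None


def graduate(siblings: list, key: str, min_agree: int = 2) -> Optional[tuple]:
    """Propose a belief upward when enough sibling scopes hold it as verified.

    Each sibling maps key -> (value, status). The result is only a
    "candidate": the parent scope still has to re-verify it.
    `min_agree` is an assumed threshold, not a documented Verel setting.
    """
    votes = Counter()
    for beliefs in siblings:
        if key in beliefs:
            value, status = beliefs[key]
            if status == "verified":
                votes[value] += 1
    if not votes:
        return None
    value, count = votes.most_common(1)[0]
    return (value, "candidate") if count >= min_agree else None


if __name__ == "__main__":
    store = {"team": {"indent": "tabs"}, "global": {"indent": "spaces"}}
    print(recall(store, "indent"))
```

With the store above, the team-level belief shadows the global one, which is the "most specific scope wins" behavior described in the table.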
Eyes & Brain — Verel × AgentVision
Two systems, one nervous system. AgentVision is the eyes; Verel is the brain. The eyes perceive a rendered artifact and grade it — including does it match what we set out to build? — then hand a clean signal up the optic nerve. The brain decides with grader attestation, acts, and only verified work compounds into memory. Then the eyes look again.
They ship and version independently (pip install agentvision, pip install verel), but in
sync: AgentVision's perception maps onto Verel's verdict bus as one grounded sense among many,
and its intent conformance (matches_intent) is recorded in the brain's episodic memory
every iteration. A full brain like Verel ingests the rich Report and runs its own gate;
AgentVision's distilled Handoff is there for simpler brains. See
AgentVision's handoff doc.
The eyes can also watch over time — verel.senses.watch(...) drives AgentVision's temporal
verification (playback / loading / liveness for streaming UIs, video, live dashboards). A
deterministic video stall gates the bus to FAIL, and playing / live / stabilized
land in the brain's memory — so a release can be gated on verified playback, and "the player
plays" compounds across builds.
What makes it trustworthy
- Grader attestation: a required grader must present a signed run_receipt proving it ran the frozen suite over the changed files. A hollow "PASS, issues=[]" fails the gate.
- Precise vs advisory: per-issue trust keys off the source (DOM/CV/OCR/test = precise; vision/LLM-judge = advisory, clamped to warn). Destructive actions (rollback) never depend on advisory evidence.
- Only verified work compounds: a consolidated rule starts inferred and reaches verified only by passing a held-out, agent-inaccessible eval (with a leakage canary).
- Dogfooded: Verel gates its own development with its own verdict bus (CI runs the pre-merge gate over Verel and asserts pass). The project's infographic was rendered and verified by the eyes Verel ships.
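To make the grader-attestation idea concrete, the sketch below signs and checks a receipt with an HMAC. The source does not say how Verel signs a run_receipt, so the HMAC-SHA256 scheme, the field names (`suite_hash`, `files`, `sig`), and both helper functions are assumptions chosen for illustration. The check mirrors the three things the text requires: the signature is genuine, the suite is the frozen one, and the changed files were covered.

```python
import hashlib
import hmac
import json


def _body(suite_hash, files) -> bytes:
    # Canonical encoding so signer and verifier hash identical bytes.
    return json.dumps({"suite_hash": suite_hash, "files": files}, sort_keys=True).encode()


def sign_receipt(key: bytes, suite_hash: str, files: list) -> dict:
    """Grader side: attest which suite ran over which files (hypothetical format)."""
    covered = sorted(files)
    sig = hmac.new(key, _body(suite_hash, covered), hashlib.sha256).hexdigest()
    return {"suite_hash": suite_hash, "files": covered, "sig": sig}


def verify_receipt(key: bytes, receipt: dict, frozen_suite_hash: str, changed_files: list) -> bool:
    """Gate side: accept only a genuine receipt for the frozen suite over the changed files."""
    suite_hash = receipt.get("suite_hash")
    files = receipt.get("files")
    expected = hmac.new(key, _body(suite_hash, files), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, str(receipt.get("sig", ""))):
        return False  # forged or tampered receipt
    if suite_hash != frozen_suite_hash:
        return False  # a different (possibly weakened) suite was run
    return set(changed_files) <= set(files or [])  # every changed file was covered


if __name__ == "__main__":
    key = b"demo-key"  # illustrative only; a real key would come from a secret store
    receipt = sign_receipt(key, "suite-abc", ["app.py", "util.py"])
    print(verify_receipt(key, receipt, "suite-abc", ["app.py"]))
```

A grader that reports "PASS, issues=[]" without such a receipt has nothing to present, so in this model the gate treats its result as unattested.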
Many faces, one core
| Surface | For |
|---|---|
| Library (import verel) | Python apps & custom harnesses |
| CLI (verel …) | doctor · loop · fleet · heal · ci |
| CI CLI / git hook (verel-ci, python -m verel.ci) | agent-run CI, pre-commit gates |
| MCP server (verel-mcp) | Cursor, Claude, any MCP host |
Drop it into your workflow & your agents
CI gate (GitHub Action) — unify tests + lint + types into one verdict and fail the build:
```yaml
# .github/workflows/verify.yml
jobs:
  verify:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: amitpatole/verel@v0.31.0
        with:
          repo: .
          install: "-e .[dev]"   # your project deps so its tests import
```
pre-commit (this repo ships .pre-commit-hooks.yaml):
```yaml
- repo: https://github.com/amitpatole/verel
  rev: v0.31.0
  hooks: [{ id: verel-precommit }]
```
Native git hook / any script:
```shell
verel-ci check --repo .     # verdict bus gate; non-zero exit on FAIL
verel-ci install --repo .   # wire a native pre-commit hook
```
In your agents — verel-mcp exposes the verdict bus + memory to any MCP host; the eyes
(AgentVision) plug in as the sight sense. Add verel[sight] for visual gating, and
verel.senses.watch(...) to gate on verified playback over time.
Real-world scenarios
Six situations a team actually hits — each a runnable script whose output below is real, not mocked. Full write-ups with captured output: Real-world scenarios →.
| # | The situation | What Verel does | Run it |
|---|---|---|---|
| 1 | Your CI went red | Real pytest fails → an agent patches the source (never the tests) → the stage re-gates until the graders go green (terminated_on=passed). | demo_selfheal.py |
| 2 | A bad merge slipped through | Canary grader fails → deterministic git revert to the last good HEAD; it refuses to act when the only evidence is advisory. | demo_canary_rollback.py |
| 3 | Scale one fix across many repos | Concurrent managers fenced by leases (stale leader's writes refused, even at the git remote); a multi-repo change commits as an atomic saga, with nothing left half-applied. | demo_distributed_fleet.py |
| 4 | A polyglot monorepo | pytest + jest + go test + lint + types + perf budget + security scan all map to one verdict schema, one gate. | demo_polyglot_ci.py |
| 5 | An agent builds its own tool | detect → scaffold → test → register on a passing held-out eval, then jailed to the syscalls it earned: a socket/subprocess it never exercised is refused at the kernel. | demo_capability_jail.py |
| 6 | A shared team brain | Recall down a self→team→org→global lattice, graduate verified beliefs up; a peer's claim re-verifies before it's trusted; the store is leader-fenced HA with quorum reads that survive the leader being down. | demo_shared_brain.py |
```shell
pip install verel
python examples/demo_selfheal.py            # 1 · red CI heals itself (live LLM + real pytest)
python examples/demo_canary_rollback.py     # 2 · bad merge auto-reverted on precise evidence
python examples/demo_distributed_fleet.py   # 3 · fenced concurrent managers + atomic cross-repo saga
python examples/demo_polyglot_ci.py         # 4 · Python/JS/Go + perf + security on one gate
python examples/demo_capability_jail.py     # 5 · a tool jailed to the syscalls it earned
python examples/demo_shared_brain.py        # 6 · shared brain: un-poisonable, HA, crash-tolerant
```
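Scenario 3's "atomic saga" follows a general pattern that a few lines can convey. The sketch below is not taken from verel.fleet: `run_saga` and the `(repo, apply, compensate)` step shape are hypothetical, and real steps would be git operations rather than in-memory callables. It shows only the ordering guarantee described above: when one repo fails, the repos that already landed are compensated in reverse.

```python
from typing import Callable

# One step per repo: (repo name, apply the change, undo the change).
Step = tuple[str, Callable[[], None], Callable[[], None]]


def run_saga(steps: list[Step]) -> tuple[bool, list[str]]:
    """Apply steps in order; on failure, compensate landed steps in reverse."""
    landed: list[tuple[str, Callable[[], None]]] = []
    log: list[str] = []
    for repo, apply, compensate in steps:
        try:
            apply()
        except Exception:
            log.append(f"fail:{repo}")
            for done_repo, undo in reversed(landed):
                undo()  # roll back the most recently landed repo first
                log.append(f"undo:{done_repo}")
            return False, log
        landed.append((repo, compensate))
        log.append(f"land:{repo}")
    return True, log


if __name__ == "__main__":
    state: set[str] = set()

    def broken() -> None:
        raise RuntimeError("gate failed in repo c")

    steps: list[Step] = [
        ("a", lambda: state.add("a"), lambda: state.discard("a")),
        ("b", lambda: state.add("b"), lambda: state.discard("b")),
        ("c", broken, lambda: None),
    ]
    ok, log = run_saga(steps)
    print(ok, log, state)
```

Running this leaves `state` empty: repos a and b land, c fails, and b then a are undone, so nothing stays half-applied.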
More feature-level demos
```shell
python examples/demo_consolidation.py     # failures → structured rules → a 2nd-order schema
python examples/demo_toolsmith.py         # the full detect→scaffold→test→register→reuse lifecycle
python examples/demo_overflow_loop.py     # fix a UI until AgentVision returns pass
python examples/demo_fleet_worktrees.py   # LLM manager fans out → isolated-worktree workers
python examples/demo_hosted_registry.py   # publish a skill over HTTP; another tenant re-verifies
python examples/run_h2.py                 # LIVE: build skills, measure cross-tenant transfer
python examples/run_h2_sweep.py           # LIVE: sweep the transfer measurement across models
```
Honesty (what we do not claim)
- The in-process tool guard is a guardrail, not a sandbox. Real isolation is the bwrap container runner (isolation="container"): no network, read-only system-only fs, ephemeral tmp, cleared env, and a seccomp-bpf syscall filter (verel[container]). Three profiles, weakest→strongest: denylist (default; EPERM on ptrace/mount/raw-socket/namespace/module/bpf; safe for arbitrary tools), allowlist (default-deny, only what a pure-compute CPython needs: no network, subprocess, or threads), and capability, the tightest: a tool may use only the syscalls it exercised while passing its held-out eval (learned via strace), so anything it never earned, including a syscall the allowlist would permit, is refused at the kernel.
- The moat (a public verified-skill registry) is a bet we measure, not assume. The H2 experiment (verel.registry) re-verifies live-built skills against other tenants' held-out cases. A two-model sweep (Ollama qwen3-coder:480b and OpenAI gpt-4o-mini, 12 skills × 4 tenants) measured ~88–89% transfer → BUILD on both (results): universal skills transfer 100%, tenant-specific ones only where the rule matches. The decision is swept across models, not taken from one run, and the registry it justifies now ships (RegistryServer/RemoteRegistry): a fetched skill is a candidate until the importer's own eval passes, so distribution moves bytes, never a verdict.
- Advisory (vision/LLM) findings are advisory; they inform, they don't gate destructive acts.
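The difference between the three seccomp profiles is easiest to see as set membership. The sketch below is a toy model, not Verel's filter: the syscall sets are small illustrative samples (the real denylist and allowlist are not given in this README), and `decide` is a hypothetical helper that stands in for what the kernel filter would do.

```python
# Illustrative samples only; the actual lists Verel ships are assumptions here.
DENYLIST = frozenset({"ptrace", "mount", "bpf", "init_module", "unshare"})
ALLOWLIST = frozenset({"read", "write", "mmap", "brk", "getrandom", "exit_group"})


def decide(profile: str, syscall: str, earned: frozenset = frozenset()) -> str:
    """Return "allow" or "EPERM" for a syscall under the given profile."""
    if profile == "denylist":
        # Default-allow: block only the known-dangerous calls.
        return "EPERM" if syscall in DENYLIST else "allow"
    if profile == "allowlist":
        # Default-deny: permit only what pure-compute code needs.
        return "allow" if syscall in ALLOWLIST else "EPERM"
    if profile == "capability":
        # Tightest: permit only what the tool exercised during its eval.
        return "allow" if syscall in earned else "EPERM"
    raise ValueError(f"unknown profile: {profile}")


if __name__ == "__main__":
    # Syscalls a hypothetical tool was observed making while passing its eval.
    earned = frozenset({"read", "write", "brk", "exit_group"})
    for profile in ("denylist", "allowlist", "capability"):
        print(profile, decide(profile, "getrandom", earned), decide(profile, "socket", earned))
```

In this model `getrandom` is permitted by the allowlist but refused under the capability profile because the tool never used it during its eval, which is the "never earned" case the first bullet describes.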
Documentation
📖 Full docs site: amitpatole.github.io/verel
- Get started · 5-minute tutorial · Developer guide · Architecture & roadmap · Module guide · Changelog
License
MIT © Amit Patole · eyes by AgentVision