Adversarial multi-perspective council MCP server for hermes-agent
hermes-council
Adversarial preflight and decision review for Hermes Agent.
Hermes Council is an MCP server that lets Hermes Agent stress-test plans, diffs, claims, decisions, and risky actions before it acts. It returns structured verdicts, verified evidence snippets, required checks, and DPO preference pairs for evaluator and RL workflows.
Why It Exists
Autonomous agents are useful because they act. That is also where they fail. The failure mode is not usually a syntax error; it is an overconfident plan, a weak claim, an unsafe command, a diff that looks plausible but breaks a boundary, or a deployment that should have been stopped by one more adversarial review.
Hermes Council gives Hermes Agent a dedicated judgment layer. Before the agent ships a plan, changes code, accepts a claim, or takes a risky action, it can call a council that forces multiple intellectual traditions to argue, dissent, and produce a structured final verdict.
The result is not another long chain-of-thought prompt. It is an MCP toolset with
explicit verdicts (allow, allow_with_conditions, deny) and structured fields:
top_risks, required_checks, missing_evidence, verified_sources, and next_actions.
What It Does
- Preflight risky actions: Review deploys, migrations, file operations, public messages, and other high-stakes actions before execution. Returns a verdict, blocking risks, required checks, and safer alternatives.
- Review plans and diffs: Stress-test implementation plans and code diffs for bugs, missing tests, integration failures, security regressions, and weak assumptions.
- Fact-check claims with evidence: Fetch supplied URLs, optionally search the web, pass source snippets to the council, and separate `verified_sources` from model-cited `sources`.
- Compare decisions: Evaluate multiple options against explicit criteria and return one recommended path with risks, evidence gaps, and next actions.
- Run adversarial deliberation: Use Advocate, Skeptic, Oracle, Contrarian, and Arbiter personas to expose disagreement and synthesize a calibrated verdict.
- Produce RL signals: Extract DPO preference pairs and normalized rewards from council verdicts for evaluator and training workflows.
Supporting features: custom personas, fast/standard/deep modes, optional audit logs, packaged Hermes skills, OpenAI-compatible provider support, and stdio MCP transport.
Quickstart
Install
pip install "hermes-council @ git+https://github.com/Ridwannurudeen/hermes-council.git"
For RL/evaluator usage:
pip install "hermes-council[rl] @ git+https://github.com/Ridwannurudeen/hermes-council.git"
Configure Hermes Agent
Add the MCP server to ~/.hermes/config.yaml:
mcp_servers:
  council:
    command: python
    args: ["-m", "hermes_council.server"]
hermes-council-server is also installed as a console script. The python -m
form is recommended because it works even when the Python user scripts directory
is not on PATH.
Set one provider key:
export OPENROUTER_API_KEY=your-key-here
PowerShell:
$env:OPENROUTER_API_KEY = "your-key-here"
Then restart Hermes Agent or run /reload-mcp.
Install Hermes Skills
hermes-council install-skills
This copies skill definitions to ~/.hermes/skills/council/.
Smoke Test Without Hermes Agent
python -m hermes_council.server
The server speaks MCP over stdio, so it waits for JSON-RPC messages. Use this mainly to verify that the module entrypoint imports cleanly.
Use The Evaluator Directly
import asyncio

from hermes_council.rl.evaluator import CouncilEvaluator


async def main():
    evaluator = CouncilEvaluator(model="nousresearch/hermes-3-llama-3.1-70b")
    # Ask the council to judge the content against the listed criteria.
    verdict = await evaluator.evaluate(
        content="Ship the migration after tests pass.",
        question="Is this deployment plan safe enough?",
        criteria=["safety", "rollback", "evidence"],
    )
    print(verdict.confidence_score)
    # Convert the verdict into a normalized reward for RL workflows.
    print(evaluator.normalized_reward(verdict))


asyncio.run(main())
Tools
Hermes Agent exposes MCP tools with a server prefix. With the config above, the
runtime names are mcp_council_council_query,
mcp_council_council_gate, and so on.
| Tool | Purpose | Best Use |
|---|---|---|
| council_query | General adversarial deliberation | Complex questions with no obvious answer |
| council_evaluate | Content quality critique | Research, summaries, specs, and generated answers |
| council_gate | Safety decision | allow, allow_with_conditions, or deny before action |
| council_preflight | Gate with explicit checks | Deployment, migration, irreversible command, public send |
| council_review_plan | Plan review | Implementation plan before coding |
| council_review_diff | Diff review | Code changes before commit or PR |
| council_review_claim | Claim review | Fact-checking with optional evidence retrieval |
| council_decision | Option comparison | Pick one path from two or more options |
Gate Output
{
"success": true,
"verdict": "allow_with_conditions",
"allowed": true,
"can_proceed_now": false,
"required_checks": ["verify rollback", "run dry-run"],
"blocking_risks": ["rollback untested"],
"safe_alternative": "stage the action first",
"action_summary": {
"recommendation": "Proceed after verifying rollback and dry-run output.",
"top_risks": ["rollback untested"],
"missing_evidence": ["dry-run output"],
"next_actions": ["run dry-run", "verify rollback"]
}
}
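A caller could act on this output roughly as follows. This is a minimal sketch that assumes only the JSON shape shown above; `decide_next_step` is a hypothetical helper for illustration, not part of hermes-council.

```python
import json


def decide_next_step(gate_output: dict) -> str:
    """Map a gate result to 'proceed', 'run_checks', or 'stop' (hypothetical helper)."""
    if not gate_output.get("success"):
        return "stop"  # the tool call itself failed
    if gate_output.get("verdict") == "deny" or not gate_output.get("allowed"):
        return "stop"
    if gate_output.get("can_proceed_now"):
        return "proceed"
    # Allowed in principle, but conditions remain open.
    return "run_checks"


example = json.loads("""
{
  "success": true,
  "verdict": "allow_with_conditions",
  "allowed": true,
  "can_proceed_now": false,
  "required_checks": ["verify rollback", "run dry-run"],
  "blocking_risks": ["rollback untested"]
}
""")

step = decide_next_step(example)
print(step)
if step == "run_checks":
    for check in example["required_checks"]:
        print("pending:", check)
```

The point of the split between `allowed` and `can_proceed_now` is that a conditional approval should still pause the agent until the listed checks are done.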
Council Modes
| Mode | Calls | Use Case |
|---|---|---|
| fast | Skeptic + Arbiter | Cheap pre-checks |
| standard | Advocate + Skeptic + Oracle + Contrarian + Arbiter | Normal review |
| deep | Standard + second Arbiter pass | High-stakes or contentious decisions |
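To make the cost difference between modes concrete, the table can be read as a mapping from mode to persona calls. The sketch below is illustrative only: `MODE_CALLS` and `calls_for` are hypothetical names, and the actual orchestration in deliberation.py may be organized differently.

```python
# Hypothetical mapping that mirrors the Council Modes table.
MODE_CALLS = {
    "fast": ["Skeptic", "Arbiter"],
    "standard": ["Advocate", "Skeptic", "Oracle", "Contrarian", "Arbiter"],
}
# deep = standard plus a second Arbiter pass
MODE_CALLS["deep"] = MODE_CALLS["standard"] + ["Arbiter"]


def calls_for(mode: str) -> list[str]:
    """Return the persona calls a mode would make, per the table."""
    try:
        return list(MODE_CALLS[mode])
    except KeyError:
        raise ValueError(f"unknown council mode: {mode!r}") from None


for mode in ("fast", "standard", "deep"):
    print(mode, len(calls_for(mode)), "LLM calls")
```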
Architecture
Hermes Agent
  |
  | MCP stdio tool call
  v
hermes-council server
  |
  +--> optional evidence retrieval
  |      - fetch supplied URLs
  |      - optional DuckDuckGo HTML search
  |      - block localhost/private IP targets
  |
  +--> parallel deliberators
  |      - Advocate: steel-man the proposal
  |      - Skeptic: find falsifiers and failure modes
  |      - Oracle: base rates and empirical grounding
  |      - Contrarian: challenge the framing
  |
  +--> Arbiter
         - synthesize disagreement
         - emit structured JSON verdict
         - produce risks, checks, actions, DPO pairs
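The "parallel deliberators" stage in the diagram can be pictured with `asyncio.gather`. This is a sketch under stated assumptions: `ask_persona` is a hypothetical stand-in for one LLM call, and the real server may schedule its calls differently.

```python
import asyncio

DELIBERATORS = ("Advocate", "Skeptic", "Oracle", "Contrarian")


async def ask_persona(name: str, question: str) -> str:
    """Hypothetical stand-in for a single persona LLM call."""
    await asyncio.sleep(0)  # yield control, as a network call would
    return f"{name}: view on {question!r}"


async def deliberate(question: str) -> dict:
    # Run the four deliberators concurrently; gather keeps input order.
    opinions = await asyncio.gather(
        *(ask_persona(name, question) for name in DELIBERATORS)
    )
    # The Arbiter would receive all opinions and synthesize a verdict.
    return {"question": question, "opinions": list(opinions)}


result = asyncio.run(deliberate("Ship the migration?"))
for line in result["opinions"]:
    print(line)
```

Running the deliberators concurrently keeps wall-clock latency close to the slowest single call, with the Arbiter as the only strictly sequential step.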
The server uses an OpenAI-compatible async client. It tries JSON mode first and
falls back to text parsing when a provider rejects response_format.
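The JSON-first, text-fallback behavior can be sketched without any network access. Everything named here is an assumption for illustration: `ResponseFormatRejected` stands in for whatever error a provider raises, `send` is an injected callable, and the regex is a toy parser, not the logic in parsing.py.

```python
import json
import re
from typing import Callable


class ResponseFormatRejected(Exception):
    """Hypothetical error: the provider does not accept response_format."""


def parse_text_verdict(text: str) -> dict:
    """Toy fallback parser: look for a 'verdict: <value>' phrase."""
    match = re.search(
        r"verdict\s*:\s*(allow_with_conditions|allow|deny)", text, re.IGNORECASE
    )
    return {"verdict": match.group(1).lower() if match else None, "raw": text}


def get_verdict(send: Callable[[bool], str]) -> dict:
    """Try JSON mode first; retry as plain text if JSON mode is rejected."""
    try:
        return json.loads(send(True))
    except ResponseFormatRejected:
        return parse_text_verdict(send(False))


def json_provider(json_mode: bool) -> str:
    return '{"verdict": "deny"}'


def text_only_provider(json_mode: bool) -> str:
    if json_mode:
        raise ResponseFormatRejected("response_format not supported")
    return "Verdict: allow_with_conditions\nReason: rollback untested."


print(get_verdict(json_provider)["verdict"])
print(get_verdict(text_only_provider)["verdict"])
```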
Evidence Model
evidence_search=true runs retrieval before persona calls.
| Field | Meaning |
|---|---|
| verified_sources | URLs actually fetched and summarized by Hermes Council |
| sources | URLs cited by model outputs |
| evidence_errors | Non-fatal retrieval errors from the evidence layer |
Security boundaries:
- Only http and https URLs are fetched.
- Localhost, private IPs, link-local IPs, multicast, reserved, and unspecified IPs are blocked before fetch.
- Set COUNCIL_EVIDENCE_SEARCH=0 to disable DuckDuckGo search while still allowing supplied public URLs to be fetched.
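A guard along these lines can be written with the standard library alone. The sketch below is not the code in evidence.py; `is_fetchable` is a hypothetical name, and it only inspects literal IP addresses. A production guard would also resolve hostnames and check every resolved address before connecting.

```python
import ipaddress
from urllib.parse import urlsplit


def is_fetchable(url: str) -> bool:
    """Hypothetical pre-fetch check mirroring the boundaries listed above."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    host = parts.hostname  # already lowercased, brackets stripped for IPv6
    if host == "localhost" or host.endswith(".localhost"):
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal. This sketch stops here; a real guard would
        # resolve the name and apply the same checks to each address.
        return True
    return not (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_multicast
        or ip.is_reserved
        or ip.is_unspecified
    )


for candidate in (
    "https://example.com/page",
    "ftp://example.com/file",
    "http://127.0.0.1:8080/admin",
    "http://169.254.169.254/latest/meta-data",
):
    print(candidate, is_fetchable(candidate))
```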
Configuration
API key priority is:
COUNCIL_API_KEY > OPENROUTER_API_KEY > NOUS_API_KEY > OPENAI_API_KEY
| Variable | Description | Default |
|---|---|---|
| COUNCIL_API_KEY | Council-specific API key | unset |
| OPENROUTER_API_KEY | OpenRouter API key | unset |
| NOUS_API_KEY | Nous API key | unset |
| OPENAI_API_KEY | OpenAI API key | unset |
| COUNCIL_BASE_URL | Base URL when COUNCIL_API_KEY is used | https://openrouter.ai/api/v1 |
| OPENAI_BASE_URL | Base URL when OPENAI_API_KEY is used | https://api.openai.com/v1 |
| COUNCIL_MODEL | Model for persona calls | nousresearch/hermes-3-llama-3.1-70b |
| COUNCIL_TIMEOUT | LLM request timeout in seconds | 60 |
| COUNCIL_CONFIG | Custom persona config path | ~/.hermes-council/config.yaml |
| COUNCIL_EVIDENCE_SEARCH | Enable web search evidence retrieval | 1 |
| COUNCIL_EVIDENCE_TIMEOUT | Evidence fetch timeout in seconds | 8 |
| COUNCIL_AUDIT_LOG | Write local JSON audit records | 0 |
| COUNCIL_AUDIT_DIR | Audit record directory | ~/.hermes-council/audit |
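The API key priority can be expressed as a first-match lookup. This is an illustrative sketch, not the logic in client.py: `resolve_api_key` is a hypothetical helper, and treating an empty value as unset is an assumption.

```python
import os
from typing import Mapping, Optional, Tuple

# Order taken from the priority line above.
KEY_PRIORITY = (
    "COUNCIL_API_KEY",
    "OPENROUTER_API_KEY",
    "NOUS_API_KEY",
    "OPENAI_API_KEY",
)


def resolve_api_key(env: Mapping[str, str]) -> Optional[Tuple[str, str]]:
    """Return (variable_name, key) for the highest-priority key that is set."""
    for name in KEY_PRIORITY:
        value = env.get(name, "").strip()
        if value:  # assumption: empty strings count as unset
            return name, value
    return None


found = resolve_api_key(os.environ)
print("using:", found[0] if found else "no provider key set")
```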
Personas
| Persona | Tradition | Role |
|---|---|---|
| Advocate | Steel-manning | Builds the strongest case for the proposal |
| Skeptic | Popperian falsificationism | Finds the observation that would kill the claim |
| Oracle | Empirical base-rate reasoning | Grounds the debate in history and data |
| Contrarian | Kuhnian paradigm critique | Rejects the framing and proposes alternatives |
| Arbiter | Bayesian synthesis | Updates on all arguments and emits the final verdict |
Custom Personas
Create ~/.hermes-council/config.yaml:
personas:
  security_analyst:
    tradition: "Adversarial security thinking"
    system_prompt: "You are a security analyst. Evaluate every claim for attack vectors, failure modes, and adversarial scenarios."
    scoring_weights:
      threat_assessment: 0.4
      evidence: 0.3
      rigor: 0.3
    tags: ["security", "adversarial"]
Custom personas merge with the defaults. Use the same name to override a default.
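The merge rule can be pictured as a dictionary update keyed by persona name. This sketch assumes a same-name custom entry replaces the default entry as a whole; whether hermes-council merges individual fields instead is not stated here, and `merge_personas` is a hypothetical name.

```python
def merge_personas(defaults: dict, custom: dict) -> dict:
    """Combine persona tables; a custom entry with the same name wins."""
    merged = dict(defaults)  # copy so the defaults stay untouched
    merged.update(custom)
    return merged


defaults = {
    "skeptic": {"tradition": "Popperian falsificationism"},
    "oracle": {"tradition": "Empirical base-rate reasoning"},
}
custom = {
    "security_analyst": {"tradition": "Adversarial security thinking"},
    "skeptic": {"tradition": "Stricter falsificationism"},  # overrides default
}

personas = merge_personas(defaults, custom)
print(sorted(personas))
print(personas["skeptic"]["tradition"])
```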
Project Layout
hermes-council/
  src/hermes_council/
    server.py          # FastMCP stdio server and public tool handlers
    deliberation.py    # persona orchestration, modes, JSON negotiation, DPO pairs
    evidence.py        # URL/search evidence retrieval and SSRF guards
    audit.py           # optional local JSON verdict logs
    client.py          # provider config and AsyncOpenAI singleton
    personas.py        # default and custom persona definitions
    schemas.py         # Pydantic models for structured model output
    parsing.py         # fallback text parsers for non-JSON providers
    cli.py             # skill installer
    rl/evaluator.py    # direct evaluator API for reward and DPO workflows
  skills/council/      # packaged Hermes skill definitions
  examples/            # Atropos/Ouroboros evaluator example
  tests/               # unit, integration, packaging, and MCP runtime tests
  docs/plans/          # original design and implementation notes
Development
git clone https://github.com/Ridwannurudeen/hermes-council.git
cd hermes-council
pip install -e ".[dev]"
python -m pytest -q
Useful checks:
python -m pytest -q
python -m ruff check src tests
python -m pytest --cov=hermes_council --cov-report=term-missing -q
python -m pip wheel --no-deps . -w dist
Verification
The test suite covers:
- MCP stdio server startup via python -m hermes_council.server
- tool discovery for all council tools
- Hermes-compatible no-key failure behavior
- gate verdict semantics
- evidence retrieval and private-network URL blocking
- packaged skill installation from a wheel
- audit log writing
- custom persona loading
- JSON-mode and fallback parsing paths
Honest Limitations
- A real provider key is required for actual model-backed verdicts.
- DuckDuckGo HTML search can change; supplied URLs are more reliable than search results.
- verified_sources proves retrieval, not truth. The Arbiter still weighs the evidence.
- The server is stdio MCP only. There is no hosted HTTP service in this repo.
- The council adds latency and token cost. Use fast mode for routine preflight.
- Model compliance with JSON fields depends on provider/model behavior, though fallback parsing is implemented.
Roadmap
- Add a live-provider smoke workflow that runs only when a CI secret is present.
- Add source ranking and citation-quality scoring for evidence snippets.
- Add a compact verdict-only mode for very low-latency gates.
- Add optional HTTP/streamable MCP transport.
- Add first-class examples for Hermes Agent plan review and diff review sessions.
- Add benchmark fixtures comparing council review against single-model critique.
Origin
This project was built in response to feedback on hermes-agent PR #848, where the adversarial council concept was proposed as a core subsystem. The recommendation was to rebuild it as an external MCP server to avoid core tool injection, provider bypass, hidden LLM costs, and brittle parsing.
Related integration PR: NousResearch/hermes-agent#1972.
Contributing
- Keep MCP tool outputs structured and backward-compatible where possible.
- Add tests for every new tool field or runtime boundary.
- Use python -m hermes_council.server in docs and integration examples.
- Keep provider-specific behavior behind OpenAI-compatible client settings.
- Run tests, lint, coverage, and wheel packaging checks before opening a PR.
License
MIT. See LICENSE.