Adversarial multi-perspective council MCP server for hermes-agent

hermes-council

Adversarial preflight and decision review for Hermes Agent.

Hermes Council is an MCP server that lets Hermes Agent stress-test plans, diffs, claims, decisions, and risky actions before it acts. It returns structured verdicts, verified evidence snippets, required checks, and DPO preference pairs for evaluator and RL workflows.

Why | What it does | Quickstart | Tools | Architecture | Development

Why It Exists

Autonomous agents are useful because they act. That is also where they fail. The failure mode is not usually a syntax error; it is an overconfident plan, a weak claim, an unsafe command, a diff that looks plausible but breaks a boundary, or a deployment that should have been stopped by one more adversarial review.

Hermes Council gives Hermes Agent a dedicated judgment layer. Before the agent ships a plan, changes code, accepts a claim, or takes a risky action, it can call a council that forces multiple intellectual traditions to argue, dissent, and produce a structured final verdict.

The result is not another long chain-of-thought prompt. It is an MCP toolset that returns an explicit verdict (allow, allow_with_conditions, or deny) together with structured fields such as top_risks, required_checks, missing_evidence, verified_sources, and next_actions.

What It Does

  • Preflight risky actions: Review deploys, migrations, file operations, public messages, and other high-stakes actions before execution. Returns a verdict, blocking risks, required checks, and safer alternatives.
  • Review plans and diffs: Stress-test implementation plans and code diffs for bugs, missing tests, integration failures, security regressions, and weak assumptions.
  • Fact-check claims with evidence: Fetch supplied URLs, optionally search the web, pass source snippets to the council, and separate verified_sources from model-cited sources.
  • Compare decisions: Evaluate multiple options against explicit criteria and return one recommended path with risks, evidence gaps, and next actions.
  • Run adversarial deliberation: Use Advocate, Skeptic, Oracle, Contrarian, and Arbiter personas to expose disagreement and synthesize a calibrated verdict.
  • Produce RL signals: Extract DPO preference pairs and normalized rewards from council verdicts for evaluator and training workflows.

Supporting features: custom personas, fast/standard/deep modes, optional audit logs, packaged Hermes skills, OpenAI-compatible provider support, and stdio MCP transport.

Quickstart

Install

pip install "hermes-council @ git+https://github.com/Ridwannurudeen/hermes-council.git"

For RL/evaluator usage:

pip install "hermes-council[rl] @ git+https://github.com/Ridwannurudeen/hermes-council.git"

Configure Hermes Agent

Add the MCP server to ~/.hermes/config.yaml:

mcp_servers:
  council:
    command: python
    args: ["-m", "hermes_council.server"]

hermes-council-server is also installed as a console script. The python -m form is recommended because it works even when the Python user scripts directory is not on PATH.

Set one provider key:

export OPENROUTER_API_KEY=your-key-here

PowerShell:

$env:OPENROUTER_API_KEY = "your-key-here"

Then restart Hermes Agent or run /reload-mcp.

Install Hermes Skills

hermes-council install-skills

This copies skill definitions to ~/.hermes/skills/council/.

Smoke Test Without Hermes Agent

python -m hermes_council.server

The server speaks MCP over stdio, so it waits for JSON-RPC messages. Use this mainly to verify that the module entrypoint imports cleanly.

Use The Evaluator Directly

import asyncio
from hermes_council.rl.evaluator import CouncilEvaluator


async def main():
    evaluator = CouncilEvaluator(model="nousresearch/hermes-3-llama-3.1-70b")
    verdict = await evaluator.evaluate(
        content="Ship the migration after tests pass.",
        question="Is this deployment plan safe enough?",
        criteria=["safety", "rollback", "evidence"],
    )
    print(verdict.confidence_score)
    print(evaluator.normalized_reward(verdict))


asyncio.run(main())

Tools

Hermes Agent exposes MCP tools with a server prefix. With the config above, the runtime names are mcp_council_council_query, mcp_council_council_gate, and so on.
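The prefixing can be sketched in a few lines. This is only an illustration: the mcp_<server>_<tool> pattern is inferred from the two names quoted above, and runtime_tool_name is a hypothetical helper, not part of hermes-council or Hermes Agent.

```python
def runtime_tool_name(server: str, tool: str) -> str:
    """Build the prefixed tool name (pattern assumed from the examples above)."""
    return f"mcp_{server}_{tool}"


# "council" is the server key used in ~/.hermes/config.yaml above.
for tool in ["council_query", "council_gate", "council_preflight"]:
    print(runtime_tool_name("council", tool))
```

If the server is registered under a different key in config.yaml, the prefix presumably changes with it.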

Tool | Purpose | Best Use
council_query | General adversarial deliberation | Complex questions with no obvious answer
council_evaluate | Content quality critique | Research, summaries, specs, and generated answers
council_gate | Safety decision | allow, allow_with_conditions, or deny before action
council_preflight | Gate with explicit checks | Deployment, migration, irreversible command, public send
council_review_plan | Plan review | Implementation plan before coding
council_review_diff | Diff review | Code changes before commit or PR
council_review_claim | Claim review | Fact-checking with optional evidence retrieval
council_decision | Option comparison | Pick one path from two or more options

Gate Output

{
  "success": true,
  "verdict": "allow_with_conditions",
  "allowed": true,
  "can_proceed_now": false,
  "required_checks": ["verify rollback", "run dry-run"],
  "blocking_risks": ["rollback untested"],
  "safe_alternative": "stage the action first",
  "action_summary": {
    "recommendation": "Proceed after verifying rollback and dry-run output.",
    "top_risks": ["rollback untested"],
    "missing_evidence": ["dry-run output"],
    "next_actions": ["run dry-run", "verify rollback"]
  }
}
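
A caller might branch on this output roughly as follows. This is a sketch, not the project's own client code: next_step is a hypothetical helper, and reading can_proceed_now=false as "finish required_checks first" is an assumption based on the example above.

```python
import json

# Trimmed copy of the example gate output above.
GATE_OUTPUT = """
{
  "success": true,
  "verdict": "allow_with_conditions",
  "allowed": true,
  "can_proceed_now": false,
  "required_checks": ["verify rollback", "run dry-run"],
  "blocking_risks": ["rollback untested"]
}
"""


def next_step(gate: dict) -> str:
    """Map a gate result to a coarse action for the calling agent."""
    if not gate.get("success"):
        return "error"        # the call itself failed; do not treat as a verdict
    if gate.get("verdict") == "deny" or not gate.get("allowed"):
        return "stop"         # the council rejected the action
    if not gate.get("can_proceed_now"):
        return "run_checks"   # allowed, but only after required_checks (assumed meaning)
    return "proceed"


gate = json.loads(GATE_OUTPUT)
print(next_step(gate), gate["required_checks"])
```

Treating a missing or false success field as an error rather than a denial keeps provider failures from being mistaken for council verdicts.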

Council Modes

Mode | Calls | Use Case
fast | Skeptic + Arbiter | Cheap pre-checks
standard | Advocate + Skeptic + Oracle + Contrarian + Arbiter | Normal review
deep | Standard + second Arbiter pass | High-stakes or contentious decisions
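
The table translates into a per-mode list of model calls, which is useful when estimating latency and token cost. The sketch below only restates the table; call_plan is a hypothetical name and does not reflect how deliberation.py is actually organized.

```python
DELIBERATORS = ["Advocate", "Skeptic", "Oracle", "Contrarian"]


def call_plan(mode: str) -> list[str]:
    """Return the persona calls a mode makes, per the table above."""
    if mode == "fast":
        return ["Skeptic", "Arbiter"]
    if mode == "standard":
        return DELIBERATORS + ["Arbiter"]
    if mode == "deep":
        # Standard plus a second Arbiter pass.
        return DELIBERATORS + ["Arbiter", "Arbiter"]
    raise ValueError(f"unknown mode: {mode!r}")


for mode in ("fast", "standard", "deep"):
    print(mode, len(call_plan(mode)), "calls")
```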

Architecture

Hermes Agent
    |
    | MCP stdio tool call
    v
hermes-council server
    |
    +--> optional evidence retrieval
    |       - fetch supplied URLs
    |       - optional DuckDuckGo HTML search
    |       - block localhost/private IP targets
    |
    +--> parallel deliberators
    |       - Advocate: steel-man the proposal
    |       - Skeptic: find falsifiers and failure modes
    |       - Oracle: base rates and empirical grounding
    |       - Contrarian: challenge the framing
    |
    +--> Arbiter
            - synthesize disagreement
            - emit structured JSON verdict
            - produce risks, checks, actions, DPO pairs

The server uses an OpenAI-compatible async client. It tries JSON mode first and falls back to text parsing when a provider rejects response_format.
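
The JSON-first strategy can be illustrated with a small synchronous sketch. Everything here is hypothetical: ResponseFormatRejected, call_model, and the regex-based text parser stand in for the real async client and the parsers in parsing.py, whose actual behavior is not shown in this document.

```python
import json
import re


class ResponseFormatRejected(Exception):
    """Hypothetical error for a provider that refuses response_format."""


def parse_text_verdict(text: str) -> dict:
    """Fallback parser: pull a verdict keyword out of free-form text."""
    match = re.search(
        r"verdict\s*:\s*(allow_with_conditions|allow|deny)", text, re.IGNORECASE
    )
    return {"verdict": match.group(1).lower() if match else None}


def complete_with_fallback(call_model, prompt: str) -> dict:
    """Try JSON mode first; fall back to text parsing if it is rejected."""
    try:
        return json.loads(call_model(prompt, json_mode=True))
    except ResponseFormatRejected:
        return parse_text_verdict(call_model(prompt, json_mode=False))


def text_only_provider(prompt: str, json_mode: bool) -> str:
    """Stand-in provider that does not support JSON mode."""
    if json_mode:
        raise ResponseFormatRejected("response_format not supported")
    return "Verdict: deny. The rollback path is untested."


print(complete_with_fallback(text_only_provider, "Review this deploy."))
```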

Evidence Model

evidence_search=true runs retrieval before persona calls.

Field | Meaning
verified_sources | URLs actually fetched and summarized by Hermes Council
sources | URLs cited by model outputs
evidence_errors | Non-fatal retrieval errors from the evidence layer

Security boundaries:

  • Only http and https URLs are fetched.
  • Localhost, private IPs, link-local IPs, multicast, reserved, and unspecified IPs are blocked before fetch.
  • Set COUNCIL_EVIDENCE_SEARCH=0 to disable DuckDuckGo search while still allowing supplied public URLs to be fetched.
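
The first two boundaries can be approximated with the standard library. This is a minimal sketch under stated assumptions, not the code in evidence.py: is_fetchable is a hypothetical name, and it only inspects IP literals and the name localhost, whereas a real guard would also resolve hostnames and check every resulting address.

```python
import ipaddress
from urllib.parse import urlsplit


def is_blocked_ip(value: str) -> bool:
    """True for addresses in the blocked categories listed above."""
    ip = ipaddress.ip_address(value)  # raises ValueError for non-IP strings
    return (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_multicast
        or ip.is_reserved
        or ip.is_unspecified
    )


def is_fetchable(url: str) -> bool:
    """Allow only http/https URLs whose host is not an obviously internal target."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False
    host = parts.hostname
    if not host or host == "localhost":
        return False
    try:
        return not is_blocked_ip(host)
    except ValueError:
        # Not an IP literal. DNS resolution is deliberately omitted in this sketch.
        return True


for url in ("https://example.com/page", "http://127.0.0.1:8000/", "file:///etc/passwd"):
    print(url, is_fetchable(url))
```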

Configuration

API key priority is:

COUNCIL_API_KEY > OPENROUTER_API_KEY > NOUS_API_KEY > OPENAI_API_KEY

Variable | Description | Default
COUNCIL_API_KEY | Council-specific API key | unset
OPENROUTER_API_KEY | OpenRouter API key | unset
NOUS_API_KEY | Nous API key | unset
OPENAI_API_KEY | OpenAI API key | unset
COUNCIL_BASE_URL | Base URL when COUNCIL_API_KEY is used | https://openrouter.ai/api/v1
OPENAI_BASE_URL | Base URL when OPENAI_API_KEY is used | https://api.openai.com/v1
COUNCIL_MODEL | Model for persona calls | nousresearch/hermes-3-llama-3.1-70b
COUNCIL_TIMEOUT | LLM request timeout in seconds | 60
COUNCIL_CONFIG | Custom persona config path | ~/.hermes-council/config.yaml
COUNCIL_EVIDENCE_SEARCH | Enable web search evidence retrieval | 1
COUNCIL_EVIDENCE_TIMEOUT | Evidence fetch timeout in seconds | 8
COUNCIL_AUDIT_LOG | Write local JSON audit records | 0
COUNCIL_AUDIT_DIR | Audit record directory | ~/.hermes-council/audit
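
The key priority amounts to a first-match lookup over the four variables. The sketch below shows that order only; resolve_api_key is a hypothetical helper rather than the logic in client.py, and treating an empty string as unset is an assumption.

```python
import os

KEY_PRIORITY = [
    "COUNCIL_API_KEY",
    "OPENROUTER_API_KEY",
    "NOUS_API_KEY",
    "OPENAI_API_KEY",
]


def resolve_api_key(env):
    """Return (variable_name, value) for the highest-priority key that is set."""
    for name in KEY_PRIORITY:
        value = env.get(name)
        if value:  # empty strings are treated as unset (assumption)
            return name, value
    return None


resolved = resolve_api_key(os.environ)
print(resolved[0] if resolved else "no provider key set")
```

Passing the environment as a mapping keeps the lookup easy to test with plain dictionaries.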

Personas

Persona | Tradition | Role
Advocate | Steel-manning | Builds the strongest case for the proposal
Skeptic | Popperian falsificationism | Finds the observation that would kill the claim
Oracle | Empirical base-rate reasoning | Grounds the debate in history and data
Contrarian | Kuhnian paradigm critique | Rejects the framing and proposes alternatives
Arbiter | Bayesian synthesis | Updates on all arguments and emits the final verdict

Custom Personas

Create ~/.hermes-council/config.yaml:

personas:
  security_analyst:
    tradition: "Adversarial security thinking"
    system_prompt: "You are a security analyst. Evaluate every claim for attack vectors, failure modes, and adversarial scenarios."
    scoring_weights:
      threat_assessment: 0.4
      evidence: 0.3
      rigor: 0.3
    tags: ["security", "adversarial"]

Custom personas merge with the defaults. Use the same name to override a default.
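
The merge rule can be pictured as a dictionary update keyed by persona name. This sketch makes two assumptions not confirmed by this document: that default personas are keyed by lowercase names, and that a same-name entry replaces the default entirely rather than being merged field by field. merge_personas is a hypothetical helper.

```python
DEFAULT_PERSONAS = {
    "skeptic": {"tradition": "Popperian falsificationism"},
    "oracle": {"tradition": "Empirical base-rate reasoning"},
}

CUSTOM_PERSONAS = {
    "security_analyst": {"tradition": "Adversarial security thinking"},
    "skeptic": {"tradition": "Custom skeptic"},  # same name overrides the default
}


def merge_personas(defaults: dict, custom: dict) -> dict:
    """Return defaults plus custom entries; custom wins on a name clash."""
    merged = dict(defaults)  # copy so the defaults stay untouched
    merged.update(custom)
    return merged


personas = merge_personas(DEFAULT_PERSONAS, CUSTOM_PERSONAS)
print(sorted(personas))
```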

Project Layout

hermes-council/
  src/hermes_council/
    server.py          # FastMCP stdio server and public tool handlers
    deliberation.py    # persona orchestration, modes, JSON negotiation, DPO pairs
    evidence.py        # URL/search evidence retrieval and SSRF guards
    audit.py           # optional local JSON verdict logs
    client.py          # provider config and AsyncOpenAI singleton
    personas.py        # default and custom persona definitions
    schemas.py         # Pydantic models for structured model output
    parsing.py         # fallback text parsers for non-JSON providers
    cli.py             # skill installer
    rl/evaluator.py    # direct evaluator API for reward and DPO workflows
  skills/council/      # packaged Hermes skill definitions
  examples/            # Atropos/Ouroboros evaluator example
  tests/               # unit, integration, packaging, and MCP runtime tests
  docs/plans/          # original design and implementation notes

Development

git clone https://github.com/Ridwannurudeen/hermes-council.git
cd hermes-council
pip install -e ".[dev]"
python -m pytest -q

Useful checks:

python -m pytest -q
python -m ruff check src tests
python -m pytest --cov=hermes_council --cov-report=term-missing -q
python -m pip wheel --no-deps . -w dist

Verification

The test suite covers:

  • MCP stdio server startup via python -m hermes_council.server
  • tool discovery for all council tools
  • Hermes-compatible no-key failure behavior
  • gate verdict semantics
  • evidence retrieval and private-network URL blocking
  • packaged skill installation from a wheel
  • audit log writing
  • custom persona loading
  • JSON-mode and fallback parsing paths

Honest Limitations

  • A real provider key is required for actual model-backed verdicts.
  • DuckDuckGo HTML search can change; supplied URLs are more reliable than search results.
  • verified_sources proves retrieval, not truth. The Arbiter still weighs the evidence.
  • The server is stdio MCP only. There is no hosted HTTP service in this repo.
  • The council adds latency and token cost. Use fast mode for routine preflight.
  • Model compliance with JSON fields depends on provider/model behavior, though fallback parsing is implemented.

Roadmap

  • Add a live-provider smoke workflow that runs only when a CI secret is present.
  • Add source ranking and citation-quality scoring for evidence snippets.
  • Add a compact verdict-only mode for very low-latency gates.
  • Add optional HTTP/streamable MCP transport.
  • Add first-class examples for Hermes Agent plan review and diff review sessions.
  • Add benchmark fixtures comparing council review against single-model critique.

Origin

This project was built in response to feedback on hermes-agent PR #848, where the adversarial council concept was proposed as a core subsystem. The recommendation was to rebuild it as an external MCP server to avoid core tool injection, provider bypass, hidden LLM costs, and brittle parsing.

Related integration PR: NousResearch/hermes-agent#1972.

Contributing

  • Keep MCP tool outputs structured and backward-compatible where possible.
  • Add tests for every new tool field or runtime boundary.
  • Use python -m hermes_council.server in docs and integration examples.
  • Keep provider-specific behavior behind OpenAI-compatible client settings.
  • Run tests, lint, coverage, and wheel packaging checks before opening a PR.

License

MIT. See LICENSE.
