
Control plane for multi-agent vetted decision-making across org knowledge and channels


Aragora

Aragora orchestrates 42 AI agents to adversarially vet decisions through structured debate, delivering audit-ready decision receipts. Built for enterprises where AI decisions carry real consequences.

The Decision Integrity Platform

Python 3.10+ | License: MIT

Individual LLMs are unreliable. Their personas shift with context, their confidence doesn't correlate with accuracy, and they say what you want to hear. For consequential decisions, you need infrastructure that treats this unreliability as a design constraint to engineer around, not a problem to ignore.

Aragora orchestrates 42 agent types in structured adversarial debates -- forcing models to challenge each other's reasoning, surface blind spots, and produce decisions with complete audit trails showing where they agreed, where they disagreed, and why.

Try It Now

pip install aragora

# Zero-config demo — runs a full adversarial debate, no API keys needed
aragora demo

# Or run the guided quickstart (opens receipt in your browser)
aragora quickstart --demo

What you'll see:
================================================================
  ARAGORA DEMO -- Adversarial Decision Stress-Test
================================================================

  Topic:  Should we adopt microservices?
  Agents: Analyst, Critic, Synthesizer, Devil's Advocate
  Rounds: 2

  --- Round 1 --------------------------------------------------

  [ANALYST] (supportive)
    This is a sound strategy. The evidence points toward
    significant gains in maintainability and team productivity.

  [CRITIC] (critical)
    The claimed benefits are overstated. Most organizations
    underestimated the operational burden by 3-5x. I recommend
    a modular monolith as the safer path.

  [SYNTHESIZER] (balanced)
    The tradeoffs here are real. On one hand, the current
    architecture limits independent scaling. On the other,
    the migration carries execution risk.

  --- Decision Receipt -----------------------------------------

  Verdict:    CONDITIONAL APPROVAL
  Confidence: 72%
  Consensus:  Partial (3 of 4 agents)
  Dissent:    Devil's Advocate flagged migration risk

# Review your current changes against main
git diff main | aragora review --demo

# Or review a GitHub PR
aragora review --pr https://github.com/org/repo/pull/123 --demo

# Stress-test a specification
aragora gauntlet spec.md --profile thorough --output receipt.html

# Run a multi-agent debate
aragora ask "Design a rate limiter for 1M req/sec" --agents anthropic-api,openai-api,gemini

# Start the API server
aragora serve

Add to Your CI Pipeline (1 minute)

# .github/workflows/aragora-review.yml
name: Aragora Review
on:
  pull_request:
    types: [opened, synchronize]
jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: an0mium/aragora@main
        with:
          anthropic-api-key: ${{ secrets.ANTHROPIC_API_KEY }}
          openai-api-key: ${{ secrets.OPENAI_API_KEY }}

Or generate it automatically: aragora init --ci github


Five Pillars

Aragora is built on five architectural commitments designed for a world where individual AI agents cannot be trusted with consequential decisions alone.

1. SMB-Ready, Enterprise-Grade

Aragora is useful to a 5-person startup on day one and scales to regulated enterprise without rearchitecting. Enterprise features -- OIDC/SAML SSO, MFA, AES-256-GCM encryption, multi-tenant isolation, RBAC with 7 roles and 360+ permissions, SOC 2 / GDPR / HIPAA compliance frameworks -- are built in, not bolted on. Security hardening (rate limiting, SSRF protection, path traversal guards, input validation, audit trails) is the default, not a premium tier.

2. Leading-Edge Memory and Context

Single agents lose context. Aragora's 4-tier Continuum Memory (fast / medium / slow / glacial) and Knowledge Mound with 33 registered adapters give every debate access to institutional history, cross-session learning, and evidence provenance. The RLM (Recursive Language Models) system compresses and structures context to reduce prompt bloat, enabling debates that sustain coherence across long multi-round sessions and large document sets where individual models would degrade.
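
To make the tiering idea concrete, here is a minimal sketch of a store whose entries expire on a per-tier schedule. The class name, the retention windows, and the method signatures are all hypothetical illustrations, not Aragora's Continuum Memory API.

```python
from dataclasses import dataclass, field

# Hypothetical retention windows per tier, in seconds (assumed values).
TIER_TTL_SECONDS = {
    "fast": 60,
    "medium": 3_600,
    "slow": 86_400,
    "glacial": None,  # None means the entry never expires
}


@dataclass
class TieredMemory:
    """Toy 4-tier store: each entry records its tier and write time."""

    entries: dict = field(default_factory=dict)

    def put(self, key: str, value: str, tier: str, now: float) -> None:
        if tier not in TIER_TTL_SECONDS:
            raise ValueError(f"unknown tier: {tier}")
        self.entries[key] = (value, tier, now)

    def get(self, key: str, now: float):
        """Return the value, or None if it is missing or has outlived its tier."""
        record = self.entries.get(key)
        if record is None:
            return None
        value, tier, written_at = record
        ttl = TIER_TTL_SECONDS[tier]
        if ttl is not None and now - written_at > ttl:
            del self.entries[key]  # drop expired entries lazily on read
            return None
        return value
```

Passing `now` explicitly keeps the sketch deterministic; a real store would more likely read a clock and promote or demote entries between tiers.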

3. Extensible and Modular

Connectors for Slack, Teams, Discord, Telegram, WhatsApp, email, voice, Kafka, RabbitMQ, GitHub, Jira, Salesforce, healthcare HL7/FHIR, and dozens more. SDKs in Python and TypeScript (140 namespaces in the TypeScript SDK). 2,000+ API operations across 1,800+ paths and 190+ WebSocket event types. OpenClaw integration for portable agent governance. A workflow engine with DAG execution and 50+ templates. A marketplace for agent personas, debate templates, and workflow patterns. Aragora adapts to your stack -- not the other way around.
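
As an illustration of what DAG execution means for a workflow, the sketch below runs each step only after the steps it depends on, using the standard library's `graphlib`. The `run_workflow` function and the step names are hypothetical and do not reflect the workflow engine's actual interface.

```python
from graphlib import TopologicalSorter


def run_workflow(steps, dependencies):
    """Run each step after all of its dependencies and return results by name.

    steps:        step name -> callable taking a dict of dependency results
    dependencies: step name -> set of step names it depends on
    """
    results = {}
    # static_order() yields every node after its predecessors and raises
    # graphlib.CycleError if the graph is not a DAG.
    for name in TopologicalSorter(dependencies).static_order():
        inputs = {dep: results[dep] for dep in dependencies.get(name, ())}
        results[name] = steps[name](inputs)
    return results


# Hypothetical three-step workflow: fetch a spec, debate it, emit a receipt.
steps = {
    "fetch": lambda inputs: "spec text",
    "debate": lambda inputs: f"debated({inputs['fetch']})",
    "receipt": lambda inputs: f"receipt({inputs['debate']})",
}
dependencies = {"debate": {"fetch"}, "receipt": {"debate"}}
results = run_workflow(steps, dependencies)
```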

4. Multi-Agent Robustness

Individual LLMs exhibit persona instability -- their outputs shift based on framing, context, and even prompt ordering. Aragora treats this as a feature: by running Claude, GPT, Gemini, Grok, Mistral, DeepSeek, Qwen, Kimi, and local models in structured Propose / Critique / Revise debates, the system surfaces disagreements that reveal genuine uncertainty. ELO rankings track agent performance. Calibration scoring (Brier scores) measures prediction accuracy. The Trickster detects hollow consensus where models agree without genuine reasoning. The result: when models with different training data independently converge on an answer, that convergence is meaningful -- and when they disagree, the dissent trail tells you exactly where human judgment is needed.
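
For readers unfamiliar with Brier scores, the standard binary form is the mean squared difference between a forecast probability and the 0/1 outcome, so lower is better. The function below is a generic sketch of that formula, not Aragora's calibration code.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes.

    0.0 is a perfect score; always answering 0.5 scores 0.25.
    """
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must be non-empty and equal length")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)
```

An agent that says "90% confident" and is right scores 0.01 on that prediction; the same confidence on a wrong answer scores 0.81, which is how overconfidence gets penalized.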

5. Self-Healing and Self-Extending

The Nomic Loop is Aragora's autonomous self-improvement system: agents debate improvements to the codebase, design solutions, implement code, run tests, and verify changes -- with human approval gates and automatic rollback on failure. This is how Aragora grew from a debate engine to 3,200+ modules. Red-team mode stress-tests the platform's own specs. The Gauntlet runs adversarial attacks against proposed changes. The system hardens itself.


Why Aragora?

A single LLM can confidently give you a wrong answer with no signal that it is wrong. Research shows that LLM personas are context-dependent, fragile under adversarial pressure, and prone to sycophantic agreement with whoever is asking. Stanford's taxonomy of LLM reasoning failures documents systematic breakdowns in formal logic, unfaithful chain-of-thought, and robustness failures under minor prompt variations -- exactly the failure modes that structured adversarial debate is designed to surface. When the decision matters -- hiring, architecture, compliance, strategy -- one model's opinion is insufficient.

Aragora treats each model as an unreliable witness and uses structured debate protocols to extract signal from their disagreements:

| What you get | How it works |
| --- | --- |
| Adversarial Validation | Models with different training data and blind spots challenge each other's reasoning |
| Decision Receipts | Cryptographic audit trails with evidence chains, dissent tracking, and confidence calibration |
| Gauntlet Mode | Red-team stress-tests for specs, policies, and architectures using adversarial personas |
| Calibrated Trust | ELO rankings and Brier scores track which models are actually reliable on which domains |
| Institutional Memory | Decisions persist across sessions with 4-tier memory and Knowledge Mound (33 adapters) |
| Channel Delivery | Results route to Slack, Teams, Discord, Telegram, WhatsApp, email, or voice |
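
The ELO rankings mentioned under Calibrated Trust can be pictured with the standard Elo update rule. This is a sketch of the textbook formula; the K-factor of 32 and the function names are assumptions, and Aragora's actual rating parameters are not documented here.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return new (rating_a, rating_b) after one comparison.

    score_a is 1.0 if A won, 0.5 for a draw, 0.0 if A lost.
    k (assumed to be 32 here) controls how fast ratings move.
    """
    expected_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - expected_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - expected_a))
    return new_a, new_b
```

Two agents rated 1500 each expect a 50% win rate, so a win moves the winner to 1516 and the loser to 1484; beating a much stronger agent moves the ratings further.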

Quick Start

1. Install and Try It (30 seconds)

pip install aragora

# Run a zero-config demo debate — opens receipt in your browser
aragora quickstart --demo

# Or review your current changes against main (no API keys needed in demo mode)
git diff main | aragora review --demo

See docs/QUICKSTART_DEVELOPER.md for the full developer quickstart.

2. Run Debates and Start the Server

# Set at least one API key
export ANTHROPIC_API_KEY=your-key  # or OPENAI_API_KEY, GEMINI_API_KEY, XAI_API_KEY

# Run a multi-agent debate
aragora ask "Should we adopt microservices?" --agents anthropic-api,openai-api --rounds 3

# Start the API server
aragora serve

See docs/guides/GETTING_STARTED.md for the complete 5-minute setup.

3. Develop with the SDK

| Package | Install | Purpose | Version |
| --- | --- | --- | --- |
| aragora | pip install aragora | Full platform (server, CLI, debate engine) | v2.6.3 |
| aragora-debate | pip install aragora-debate | Standalone debate engine (no server needed) | v0.2.0 |
| aragora-sdk | pip install aragora-sdk | Python client SDK for connecting to aragora | v2.6.3 |
| @aragora/sdk | npm install @aragora/sdk | TypeScript/Node.js client SDK | |

Core Workflows

1. Gauntlet Mode -- Adversarial Stress Testing

Stress-test specs, architectures, and policies before they ship:

aragora gauntlet spec.md --input-type spec --profile quick
aragora gauntlet policy.yaml --input-type policy --persona gdpr
aragora gauntlet architecture.md --profile thorough --output report.html

| Attack Type | What It Tests |
| --- | --- |
| Red Team | Security holes, injection points, auth bypasses |
| Devil's Advocate | Logic flaws, hidden assumptions, edge cases |
| Scaling Critic | Performance bottlenecks, SPOF, thundering herd |
| Compliance | GDPR, HIPAA, SOC 2, AI Act violations |

Decision receipts provide cryptographic audit trails for every finding.
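
One common way to make an audit trail tamper-evident is a hash chain, where each entry commits to the hash of the one before it. The sketch below shows that general technique with SHA-256; the field names and functions are hypothetical, and it is not a description of Aragora's actual receipt format (which may also involve signatures).

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(payload: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the payload."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def append_entry(chain: list, finding: dict) -> list:
    """Return a new chain with the finding appended and linked to the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    body = {"index": len(chain), "prev_hash": prev_hash, "finding": finding}
    return chain + [{**body, "hash": _digest(body)}]


def verify_chain(chain: list) -> bool:
    """Recompute every hash and link; any edited or reordered entry fails."""
    prev_hash = GENESIS_HASH
    for index, entry in enumerate(chain):
        body = {
            "index": entry["index"],
            "prev_hash": entry["prev_hash"],
            "finding": entry["finding"],
        }
        if (
            entry["index"] != index
            or entry["prev_hash"] != prev_hash
            or entry["hash"] != _digest(body)
        ):
            return False
        prev_hash = entry["hash"]
    return True
```

A bare hash chain only detects edits to an existing trail; proving who produced it would additionally require signing the final hash.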

2. AI Code Review

Get multi-model consensus on your pull requests:

git diff main | aragora review
aragora review https://github.com/owner/repo/pull/123
aragora review --demo  # try without API keys

When 3+ independent models with different training data agree on an issue, that convergence is meaningful. Split opinions show where human judgment is needed -- the disagreement is the signal.
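
The "3+ models agree" rule can be sketched as a simple grouping step: count how many distinct models raised each issue, then separate the issues that clear the threshold from those that do not. The function name and the exact-string matching of issues are simplifying assumptions; matching real findings would likely need semantic comparison.

```python
from collections import defaultdict


def split_by_agreement(findings, min_models=3):
    """Split issues into (agreed, contested) by how many distinct models raised them.

    findings: iterable of (model_name, issue) pairs
    Returns two dicts mapping issue -> sorted list of models that raised it.
    """
    models_by_issue = defaultdict(set)
    for model, issue in findings:
        models_by_issue[issue].add(model)

    agreed, contested = {}, {}
    for issue, models in models_by_issue.items():
        target = agreed if len(models) >= min_models else contested
        target[issue] = sorted(models)
    return agreed, contested


# Hypothetical review output from three models.
findings = [
    ("claude", "unchecked user input in handler"),
    ("gpt", "unchecked user input in handler"),
    ("gemini", "unchecked user input in handler"),
    ("gpt", "retry loop may never terminate"),
]
agreed, contested = split_by_agreement(findings)
```

Here the input-validation issue lands in `agreed`, while the retry-loop concern lands in `contested` and is the kind of finding a human reviewer would be asked to judge.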

3. Structured Debates

The debate protocol follows a thesis > antithesis > synthesis pattern in four phases:

  1. Propose -- Agents generate initial responses from different perspectives
  2. Critique -- Agents challenge each other's proposals with severity scores
  3. Revise -- Proposers incorporate valid critiques
  4. Synthesize -- Judge combines best elements into a final answer

Configurable consensus: majority, unanimous, judge-based, or none.
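
To show how the majority and unanimous modes differ, here is a small vote-counting sketch. The `reach_consensus` function and its return shape are hypothetical; the judge-based and none modes are left out because they do not reduce to vote counting.

```python
from collections import Counter


def reach_consensus(votes, mode="majority"):
    """Return (winning choice or None, support fraction for the top choice).

    votes: dict mapping agent name -> that agent's final position
    mode:  "majority" needs more than half the votes, "unanimous" needs all of them
    """
    if not votes:
        return None, 0.0
    choice, count = Counter(votes.values()).most_common(1)[0]
    support = count / len(votes)
    if mode == "majority":
        return (choice if support > 0.5 else None), support
    if mode == "unanimous":
        return (choice if count == len(votes) else None), support
    raise ValueError(f"unsupported consensus mode: {mode}")


# Mirrors the demo receipt above: 3 of 4 agents agree.
votes = {
    "analyst": "approve",
    "critic": "approve",
    "synthesizer": "approve",
    "devils_advocate": "reject",
}
```

With these votes, majority mode returns `("approve", 0.75)` while unanimous mode returns `(None, 0.75)`, leaving the dissent to be recorded rather than averaged away.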


Architecture

aragora/
├── debate/         # Core debate engine (210+ modules)
│   ├── orchestrator.py   # Arena -- main debate loop
│   ├── consensus.py      # Consensus detection and proofs
│   ├── convergence.py    # Semantic similarity detection
│   └── phases/           # Propose, critique, revise, vote, judge
├── agents/         # 42 registered agent types (CLI, direct API, OpenRouter, local)
│   ├── api_agents/       # Anthropic, OpenAI, Gemini, Grok, Mistral, OpenRouter
│   ├── cli_agents.py     # Claude Code, Codex, Gemini CLI, Grok CLI
│   └── fallback.py       # OpenRouter fallback on quota errors
├── gauntlet/       # Adversarial stress testing
├── knowledge/      # Knowledge Mound with 33 registered adapters
├── memory/         # 4-tier memory (fast/medium/slow/glacial)
├── server/         # 2,000+ API operations, 190+ WebSocket event types
├── pipeline/       # Decision-to-PR generation
├── genesis/        # Fractal debates, agent evolution
├── sandbox/        # Docker-based safe execution
├── rbac/           # Role-based access control (7 roles, 360+ permissions)
├── compliance/     # SOC 2, GDPR, HIPAA frameworks
└── workflow/       # DAG-based automation engine

Scale: 3,200+ Python modules | 137,000+ tests


Programmatic Usage

import asyncio

from aragora import Arena, Environment, DebateProtocol
from aragora.agents import create_agent


async def main() -> None:
    # One agent per debate role, each backed by a different provider
    agents = [
        create_agent("anthropic-api", name="claude", role="proposer"),
        create_agent("openai-api", name="gpt", role="critic"),
        create_agent("gemini", name="gemini", role="synthesizer"),
    ]

    env = Environment(task="Design a distributed cache with LRU eviction")
    protocol = DebateProtocol(rounds=3, consensus="majority")
    arena = Arena(env, agents, protocol)
    result = await arena.run()  # run() is awaited, so it needs an async context

    print(result.final_answer)
    print(f"Consensus: {result.consensus_reached} ({result.confidence:.0%})")


asyncio.run(main())

Python SDK

import asyncio

from aragora.client import AragoraClient

async def main() -> None:
    client = AragoraClient(base_url="http://localhost:8080")
    debate = client.debates.run(task="Should we adopt microservices?")
    receipt = await client.gauntlet.run_and_wait(input_content="spec.md")  # await needs an async context

asyncio.run(main())

See docs/SDK_GUIDE.md for the full API.


Channels and Integrations

Aragora delivers debate results to wherever your team works:

| Channel | Status |
| --- | --- |
| Slack | Bot + OAuth |
| Microsoft Teams | Bot + OAuth |
| Discord | Interactions API |
| Telegram | Bot API |
| WhatsApp | Business API |
| Email | SMTP + Gmail + Outlook |
| Voice | TTS integration |
| Webhooks | Custom delivery |

Results automatically route to the originating channel via bidirectional chat routing.

See docs/integrations/INTEGRATIONS.md for setup.


Enterprise Features

| Category | Capabilities |
| --- | --- |
| Authentication | OIDC/SAML SSO, MFA (TOTP/HOTP), API key management, SCIM 2.0 |
| Multi-Tenancy | Tenant isolation, resource quotas, usage metering |
| Security | AES-256-GCM encryption, rate limiting, SSRF protection, key rotation |
| Compliance | SOC 2 controls, GDPR support, HIPAA, audit trails |
| Observability | Prometheus metrics, Grafana dashboards, OpenTelemetry tracing |
| RBAC | 7 roles, 360+ permissions, decorator-based enforcement |
| Backup | Incremental backups, retention policies, disaster recovery |
| Control Plane | Agent registry, task scheduler, health monitoring, policy governance |

See docs/enterprise/ENTERPRISE_FEATURES.md for details.
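
Decorator-based enforcement, as listed in the RBAC row, generally means wrapping a handler so the permission check runs before the handler body. The sketch below illustrates the pattern only: the role names, permission strings, and `require_permission` decorator are hypothetical, not Aragora's RBAC API.

```python
import functools

# Hypothetical role -> permission mapping for illustration.
ROLE_PERMISSIONS = {
    "viewer": {"debates:read"},
    "admin": {"debates:read", "debates:create"},
}


class PermissionDenied(Exception):
    """Raised when the caller's role lacks the required permission."""


def require_permission(permission: str):
    """Decorator factory: reject the call unless the role grants `permission`."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(role: str, *args, **kwargs):
            if permission not in ROLE_PERMISSIONS.get(role, set()):
                raise PermissionDenied(f"role {role!r} lacks {permission!r}")
            return func(role, *args, **kwargs)

        return wrapper

    return decorator


@require_permission("debates:create")
def create_debate(role: str, task: str) -> str:
    return f"debate created: {task}"
```

Calling `create_debate("admin", ...)` succeeds, while `create_debate("viewer", ...)` raises `PermissionDenied` before any handler code runs, and unknown roles are denied by default.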


Self-Improvement (Nomic Loop)

Aragora includes an autonomous self-improvement system where agents debate and implement improvements to the codebase itself. Experimental -- always run in a sandbox with human review.

python scripts/run_nomic_with_stream.py run --cycles 3
python scripts/self_develop.py --goal "Improve test coverage" --require-approval

Safety: automatic backups, protected file checksums, rollback on failure, human approval gates.
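
The protected-file checksum idea can be sketched as: record a SHA-256 digest of each protected file before a cycle, then compare afterwards and treat any mismatch as a reason to roll back. The helper names below are hypothetical and this is not the Nomic Loop's actual implementation.

```python
import hashlib
from pathlib import Path


def snapshot(paths):
    """Map each protected path to the SHA-256 digest of its current contents."""
    return {str(p): hashlib.sha256(Path(p).read_bytes()).hexdigest() for p in paths}


def changed_files(baseline):
    """Return protected paths that are missing or no longer match the baseline."""
    changed = []
    for path, digest in baseline.items():
        file = Path(path)
        if not file.is_file() or hashlib.sha256(file.read_bytes()).hexdigest() != digest:
            changed.append(path)
    return sorted(changed)
```

A caller would take `snapshot(...)` before an improvement cycle and restore from backup if `changed_files(...)` returns anything afterwards.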


Deployment

# Local development
aragora serve --api-port 8080 --ws-port 8765

# Docker (self-hosted)
git clone https://github.com/an0mium/aragora.git && cd aragora
cp .env.example .env  # add your API keys
docker compose -f deploy/docker-compose.yml up

See docs/SELF_HOSTING.md for the full self-hosting guide.

API: REST endpoints at /api/v2/* | WebSocket streaming at /ws
Docs: OpenAPI at /api/openapi | Swagger UI at /api/docs


Security

  • Ed25519 signature verification for webhooks (Discord, Slack)
  • Rate limiting (IP, token, and endpoint-based)
  • Input validation and content-length enforcement
  • CORS allowlists, security headers, error message sanitization
  • Path traversal protection, upload validation with magic byte checking
  • WebSocket message limits (64KB), debate timeouts, backpressure control
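
Two of the checks above, path traversal protection and magic byte checking, follow well-known patterns that fit in a few lines. The sketch below is a generic illustration with hypothetical function names and a deliberately tiny signature table, not Aragora's validation code.

```python
from pathlib import Path

# Leading bytes ("magic numbers") for a few common formats.
MAGIC_BYTES = {
    "png": b"\x89PNG\r\n\x1a\n",
    "pdf": b"%PDF-",
    "zip": b"PK\x03\x04",
}


def matches_declared_type(data: bytes, declared: str) -> bool:
    """True only if the upload starts with the signature of its declared type."""
    signature = MAGIC_BYTES.get(declared.lower())
    return signature is not None and data.startswith(signature)


def resolve_inside(base_dir, user_path) -> Path:
    """Resolve user_path under base_dir, rejecting anything that escapes it."""
    base = Path(base_dir).resolve()
    candidate = (base / user_path).resolve()  # collapses ".." and follows symlinks
    if not candidate.is_relative_to(base):
        raise ValueError("path escapes the base directory")
    return candidate
```

Checking content rather than the file extension stops a renamed executable from passing as an image, and resolving before comparing stops `../` sequences and absolute paths from reaching files outside the upload directory.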

See docs/enterprise/SECURITY.md and docs/enterprise/COMPLIANCE.md.


Documentation

| Need | Where |
| --- | --- |
| Developer quickstart | QUICKSTART_DEVELOPER.md |
| First-time setup | GETTING_STARTED.md |
| API reference | API_REFERENCE.md |
| SDK guide | SDK_GUIDE.md |
| Enterprise features | ENTERPRISE_FEATURES.md |
| Gauntlet guide | GAUNTLET.md |
| Agent catalog | AGENTS.md |
| Feature discovery | FEATURE_DISCOVERY.md |
| Extended README | EXTENDED_README.md |
| Full index | INDEX.md |

Inspiration and Citations

Aragora synthesizes ideas from several open-source projects. See the full attribution table in docs/reference/CREDITS.md.


Contributing

Contributions welcome. Areas of interest:

  • Additional agent backends
  • Debate visualization
  • Benchmark datasets for agent evaluation
  • Lean 4 theorem proving integration

License

MIT


The name "aragora" evokes the Greek agora -- the public assembly where citizens debated and reached collective decisions through reasoned discourse.
