Multi-agent MCP server with trust scoring and multi-round debate

GlassBox AI 💎

Transparent Multi-Agent Systems with Trust Scoring

The first production-ready framework for building auditable, trustworthy multi-agent AI systems with runtime trust distribution.


🎯 Vision

Transform AI from black boxes to glass boxes - where every decision is traceable, every agent is accountable, and trust is earned through verified outcomes.

Problem: Existing multi-agent frameworks (CrewAI, AutoGen) orchestrate agents but hide their reasoning. You see the final answer, not WHY it was chosen or WHO to trust.

Solution: GlassBox AI adds a transparent trust layer - agents debate, trust scores weight their votes, outcomes update reputations, and full provenance chains show exactly how decisions were made.


📍 Current Status

Milestone 1 (In Progress): MCP Server MVP
Target: Feb 15, 2026
Status: 🟡 Building core components


🗓️ Roadmap & Milestones

✅ Milestone 0: Research & Design (DONE)

  • Analyze competitive landscape (LIME, SHAP, CrewAI, AutoGen, Vector Institute research)
  • Identify unique positioning (trust scoring + claim verification + provenance)
  • Define architecture (MCP server → orchestrator → trust DB)

🟡 Milestone 1: MCP Server MVP (Feb 13-15, 2026)

Goal: Working multi-agent MCP server that integrates with Windsurf

Deliverables:

  • Project structure
  • server.py - MCP protocol handlers
  • orchestrator.py - Parallel GPT agent execution with 3 personas
  • trust_db.py - SQLite persistence for trust scores
  • requirements.txt and .env.example
  • Dockerfile for GHCR deployment
  • README with setup instructions
  • Test with Windsurf locally

Agents (GPT-only for MVP):

  • @architect - Long-term thinking, scalability focus
  • @pragmatist - Ship fast, iterate, business value
  • @critic - Edge cases, security, failure modes
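
A minimal sketch of how the three personas could be run in parallel, assuming an asyncio-based orchestrator. The `ask_agent` function is a hypothetical stand-in for a real chat-completion call, and the persona strings are copied from the list above; none of these names come from the project's code.

```python
import asyncio

# Persona descriptions taken from the agent list above.
PERSONAS = {
    "architect": "Long-term thinking, scalability focus",
    "pragmatist": "Ship fast, iterate, business value",
    "critic": "Edge cases, security, failure modes",
}


async def ask_agent(name: str, persona: str, question: str) -> dict:
    """Hypothetical stand-in for a real model call (assumption, not project code)."""
    await asyncio.sleep(0)  # yield control, as a network request would
    return {"agent": name, "answer": f"[{persona}] view on: {question}"}


async def run_agents(question: str) -> list:
    """Start every persona concurrently; gather keeps the replies in persona order."""
    tasks = [ask_agent(name, persona, question) for name, persona in PERSONAS.items()]
    return await asyncio.gather(*tasks)


replies = asyncio.run(run_agents("Should we use Redis or Postgres?"))
for reply in replies:
    print(reply["agent"], "->", reply["answer"])
```

Because `asyncio.gather` returns results in the order the tasks were passed, each reply can be matched back to its persona without extra bookkeeping.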

Trust Mechanism:

  • Initial scores: 0.85 for all agents
  • Update formula: new_trust = old_trust + 0.1 * (outcome - old_trust)
  • Weighted consensus by trust
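
The trust mechanism above can be sketched in a few lines. The update formula and the 0.85 starting score come from the list. Two things are assumptions made for illustration: encoding outcomes as 1.0 (correct) or 0.0 (wrong), and defining "weighted consensus" as the option holding the largest share of total trust.

```python
from collections import defaultdict

LEARNING_RATE = 0.1   # the 0.1 factor in the update formula
INITIAL_TRUST = 0.85  # starting score for every agent


def update_trust(old_trust: float, outcome: float) -> float:
    """new_trust = old_trust + 0.1 * (outcome - old_trust)."""
    return old_trust + LEARNING_RATE * (outcome - old_trust)


def weighted_consensus(votes: dict, trust: dict) -> tuple:
    """Return the option with the largest summed trust and its share of all trust."""
    totals = defaultdict(float)
    for agent, option in votes.items():
        totals[option] += trust[agent]
    winner = max(totals, key=totals.get)
    return winner, totals[winner] / sum(totals.values())


trust = {name: INITIAL_TRUST for name in ("architect", "pragmatist", "critic")}
trust["architect"] = update_trust(trust["architect"], 1.0)  # assumed: architect was right
trust["critic"] = update_trust(trust["critic"], 0.0)        # assumed: critic was wrong

votes = {"architect": "Postgres", "pragmatist": "Redis", "critic": "Redis"}
winner, share = weighted_consensus(votes, trust)
print(trust)
print(winner, round(share, 3))
```

With these inputs the architect moves to 0.865, the critic drops to 0.765, and Redis still wins because its two backers together hold more trust than the single Postgres vote.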

🔲 Milestone 2: Claim Verification Layer (Feb 16-20, 2026)

Goal: Add fact-checking to agent responses

Deliverables:

  • verifier.py - Claim extraction from agent responses
  • Source grounding validation (does citation support claim?)
  • Confidence scoring per claim
  • Provenance chain tracking (claim → agent → source → line number)

Example:

Agent says: "Use Redis for caching." [Source: docs.redis.io]
Verifier checks: Does source actually recommend Redis for this use case?
Result: ✅ Supported (confidence: 0.92)
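
One way to represent a verified claim and its provenance chain is a small immutable record. This is a sketch only: the `Claim` class, its field names, the 0.8 threshold, and the choice of `pragmatist` as the speaking agent are illustrative assumptions, not part of `verifier.py`.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class Claim:
    """One provenance chain entry: claim -> agent -> source -> line number."""
    text: str
    agent: str
    source: str
    line_number: Optional[int]  # None when the source has no line granularity
    confidence: float           # 0.0 to 1.0, as reported by a verifier

    def verdict(self, threshold: float = 0.8) -> str:
        # The 0.8 cut-off is an illustrative assumption.
        return "Supported" if self.confidence >= threshold else "Unverified"


claim = Claim(
    text="Use Redis for caching.",
    agent="pragmatist",  # hypothetical: the example above does not name the agent
    source="docs.redis.io",
    line_number=None,
    confidence=0.92,
)
print(claim.verdict(), asdict(claim))
```

Freezing the dataclass keeps a recorded claim from being edited after the fact, which fits the auditability goal.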

🔲 Milestone 3: Web Dashboard (Feb 21-28, 2026)

Goal: Visual interface for trust evolution and agent debates

Deliverables:

  • FastAPI backend serving agent analysis API
  • React frontend with:
    • Real-time agent conversation display
    • Trust score graphs over time
    • Provenance tree visualization
    • Manual trust adjustment controls
  • Deployed demo at demo.glassbox-ai.dev

UI Mockup:

┌─────────────────────────────────────────┐
│ 🤖 Multi-Agent Analysis                 │
├─────────────────────────────────────────┤
│ @architect (Trust: 0.92) 📈             │
│ "Use Postgres with materialized views"  │
│                                         │
│ @pragmatist (Trust: 0.85) 📊            │
│ "Start with Redis, migrate later"       │
│                                         │
│ @critic (Trust: 0.88) ⚠️                │
│ "What's your eviction policy?"          │
├─────────────────────────────────────────┤
│ ⚖️ Weighted Consensus: Redis (0.87)     │
└─────────────────────────────────────────┘

🔲 Milestone 4: Multi-Model Support (Mar 1-7, 2026)

Goal: Support Claude + GPT + Gemini for true agent diversity

Deliverables:

  • Anthropic API integration
  • Google Gemini API integration
  • Agent pool with mixed models:
    • @architect → Claude Opus
    • @pragmatist → GPT-4o
    • @critic → Claude Sonnet
    • @innovator → Gemini Pro
  • Cost tracking per agent/model

🔲 Milestone 5: Production Hardening (Mar 8-15, 2026)

Goal: Enterprise-ready deployment

Deliverables:

  • Rate limiting and retry logic
  • Error recovery and fallbacks
  • Observability (Prometheus metrics, OpenTelemetry traces)
  • Security audit (API key handling, input validation)
  • Load testing (100 concurrent analyses)
  • Documentation site (docs.glassbox-ai.dev)
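
For the retry deliverable, a common pattern is exponential backoff with a little jitter. The sketch below is a generic illustration, not the project's implementation; `call_with_retry`, its parameters, and the `flaky_agent_call` stub are all hypothetical names.

```python
import random
import time


def call_with_retry(fn, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call fn(); on failure wait base_delay * 2**attempt plus jitter, then retry."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))


attempts = []


def flaky_agent_call():
    """Hypothetical agent call that fails twice before succeeding."""
    attempts.append(1)
    if len(attempts) < 3:
        raise TimeoutError("simulated upstream timeout")
    return "ok"


# Pass a no-op sleep so the demonstration finishes instantly.
result = call_with_retry(flaky_agent_call, sleep=lambda seconds: None)
print(result, "after", len(attempts), "attempts")
```

Injecting `sleep` as a parameter keeps the backoff testable without real delays.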

🔲 Milestone 6: CLI & PyPI Release (Mar 16-22, 2026)

Goal: Shareable package anyone can install

Deliverables:

  • glassbox CLI tool
  • PyPI package: pip install glassbox-ai
  • Usage examples and tutorials
  • Blog post: "Building Transparent Multi-Agent Systems"
  • LinkedIn case study with screenshots
  • GitHub Sponsors / funding model

Usage:

pip install glassbox-ai

# Analyze a problem
glassbox analyze "Should we use Redis or Postgres?"

# View trust dashboard
glassbox trust-dashboard

# Update trust manually
glassbox update-trust architect --correct

🏗️ Architecture

┌─────────────────────────────────────────────┐
│  Windsurf Chat / CLI / Web Dashboard        │
└─────────────────┬───────────────────────────┘
                  │ MCP Protocol / API
┌─────────────────▼───────────────────────────┐
│         MCP Server (server.py)              │
│  Tools: multi_agent_analyze,                │
│         get_trust_scores, update_trust      │
└─────────────────┬───────────────────────────┘
                  │
┌─────────────────▼───────────────────────────┐
│      Orchestrator (orchestrator.py)         │
│  - Parallel agent execution                 │
│  - Weighted consensus                       │
│  - Provenance tracking                      │
└──┬────────────┬────────────┬────────────────┘
   │            │            │
┌──▼──┐   ┌────▼───┐   ┌───▼────┐
│GPT-4│   │GPT-4o  │   │GPT-4   │
│Opus │   │        │   │Turbo   │
└─────┘   └────────┘   └────────┘
 @arch     @pragma      @critic

┌─────────────────────────────────────────────┐
│       Trust DB (trust_db.py)                │
│  SQLite: agent → trust score → history      │
└─────────────────────────────────────────────┘
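
The Trust DB layer (agent → trust score → history) could be backed by two SQLite tables. The following is a sketch under assumptions: the table names, column names, and `set_trust` helper are invented for illustration and are not taken from `trust_db.py`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # a real server would point at a file instead
conn.executescript("""
    CREATE TABLE agents (name TEXT PRIMARY KEY, trust REAL NOT NULL);
    CREATE TABLE trust_history (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        agent TEXT NOT NULL REFERENCES agents(name),
        trust REAL NOT NULL,
        recorded_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    );
""")


def set_trust(name: str, trust: float) -> None:
    """Upsert the current score and append the same value to the history table."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO agents (name, trust) VALUES (?, ?) "
            "ON CONFLICT(name) DO UPDATE SET trust = excluded.trust",
            (name, trust),
        )
        conn.execute(
            "INSERT INTO trust_history (agent, trust) VALUES (?, ?)", (name, trust)
        )


for agent in ("architect", "pragmatist", "critic"):
    set_trust(agent, 0.85)   # initial score for every agent
set_trust("architect", 0.865)  # one update after a correct outcome

current = dict(conn.execute("SELECT name, trust FROM agents"))
history = [
    row[0]
    for row in conn.execute(
        "SELECT trust FROM trust_history WHERE agent = ? ORDER BY id", ("architect",)
    )
]
print(current)
print(history)
```

Keeping history in its own append-only table means the current score stays a single-row lookup while the full trajectory remains available for graphs and audits.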

🎯 Success Metrics

Technical:

  • <500ms latency for 3-agent analysis
  • Trust score convergence within 10 iterations
  • 95%+ uptime on demo deployment
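
To make the convergence target concrete: with the update formula new_trust = old_trust + 0.1 * (outcome - old_trust), each update closes 10% of the remaining gap to the outcome, so after n identical outcomes a fraction 0.9^n of the original gap is left. The sketch below checks that arithmetic; the `remaining_gap` helper is a hypothetical name, and what counts as "converged" is left open because the document does not define a tolerance.

```python
LEARNING_RATE = 0.1  # from the update formula


def remaining_gap(updates: int, rate: float = LEARNING_RATE) -> float:
    """Fraction of the initial |outcome - trust| gap left after `updates` identical outcomes."""
    return (1 - rate) ** updates


trust, outcome = 0.85, 1.0
for _ in range(10):
    trust += LEARNING_RATE * (outcome - trust)

print(round(trust, 4))              # score after 10 correct outcomes: 0.9477
print(round(remaining_gap(10), 4))  # share of the original gap still open: 0.3487
```

At a rate of 0.1, ten iterations close roughly 65% of the gap, so whether that meets the target depends on the tolerance chosen.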

Adoption:

  • 100 GitHub stars by end of March
  • 50 PyPI downloads/week
  • 1 enterprise POC

Validation:

  • Featured in a newsletter (e.g., TLDR AI, The Batch)
  • 1 blog post or paper citing this work
  • Positive feedback from 5 real users

🚀 Quick Start (After Milestone 1)

Local Setup

git clone https://github.com/yourusername/glassbox-ai
cd glassbox-ai
pip install -r requirements.txt

# Add API key
cp .env.example .env
# Edit .env: OPENAI_API_KEY=sk-...

# Run MCP server
python server.py

Windsurf Integration

Add to ~/.codeium/windsurf/mcp_config.json:

{
  "mcpServers": {
    "glassbox-ai": {
      "command": "python",
      "args": ["/path/to/glassbox-ai/server.py"]
    }
  }
}

Restart Windsurf, then:

You: "Should we use Redis or Postgres for session storage?"

[Cascade invokes multi_agent_analyze]

🤖 3 agents analyzing...
✅ Consensus ready

📂 Project Structure

glassbox-ai/
├── README.md             # This file
├── server.py             # MCP entry point
├── orchestrator.py       # Multi-agent logic
├── trust_db.py           # Trust persistence
├── verifier.py           # (Milestone 2) Claim checking
├── requirements.txt
├── Dockerfile
├── .env.example
├── .gitignore
├── tests/                # (Milestone 5) Test suite
├── docs/                 # (Milestone 5) Documentation
└── web/                  # (Milestone 3) Dashboard
    ├── backend/
    └── frontend/

🤝 Contributing

We're in MVP phase. Contributions welcome after Milestone 1 is complete.

Roadmap priorities:

  1. Core MCP server stability
  2. Claim verification accuracy
  3. Trust evolution algorithms
  4. Multi-model integration

📜 License

MIT (open source, shareable, production-ready)


📧 Contact

Built by @yourname
LinkedIn: Your Profile

Building in public. Follow along for updates on transparent AI systems.


🔗 Related Work

Inspiration:

Frameworks we build on:

Our unique contribution: First to combine trust scoring + claim verification + provenance in a production multi-agent system.


💎 Glass Box over ⬛ Black Box
