Multi-agent MCP server with trust scoring and multi-round debate
GlassBox AI
Transparent Multi-Agent Systems with Trust Scoring
The first production-ready framework for building auditable, trustworthy multi-agent AI systems with runtime trust distribution.
Vision
Transform AI from black boxes to glass boxes - where every decision is traceable, every agent is accountable, and trust is earned through verified outcomes.
Problem: Existing multi-agent frameworks (CrewAI, AutoGen) orchestrate agents but hide their reasoning. You see the final answer, not WHY it was chosen or WHO to trust.
Solution: GlassBox AI adds a transparent trust layer - agents debate, trust scores weight their votes, outcomes update reputations, and full provenance chains show exactly how decisions were made.
Current Status
Milestone 1 (In Progress): MCP Server MVP
Target: Feb 15, 2026
Status: Building core components
Roadmap & Milestones
Milestone 0: Research & Design (DONE)
- Analyze competitive landscape (LIME, SHAP, CrewAI, AutoGen, Vector Institute research)
- Identify unique positioning (trust scoring + claim verification + provenance)
- Define architecture (MCP server → orchestrator → trust DB)
Milestone 1: MCP Server MVP (Feb 13-15, 2026)
Goal: Working multi-agent MCP server that integrates with Windsurf
Deliverables:
- Project structure
- server.py - MCP protocol handlers
- orchestrator.py - Parallel GPT agent execution with 3 personas
- trust_db.py - SQLite persistence for trust scores
- requirements.txt and .env.example
- Dockerfile for GHCR deployment
- README with setup instructions
- Test with Windsurf locally
Agents (GPT-only for MVP):
- @architect - Long-term thinking, scalability focus
- @pragmatist - Ship fast, iterate, business value
- @critic - Edge cases, security, failure modes
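The parallel execution of the three personas could be sketched as follows. This is a minimal illustration that assumes `asyncio`; `ask_agent` and `run_agents` are hypothetical names, and the model call is stubbed out so the example runs without an API key.

```python
import asyncio

# Persona descriptions taken from the agent list above.
PERSONAS = {
    "architect": "Long-term thinking, scalability focus",
    "pragmatist": "Ship fast, iterate, business value",
    "critic": "Edge cases, security, failure modes",
}


async def ask_agent(name: str, persona: str, question: str) -> dict:
    """Hypothetical stand-in for a model call; a real version would call an LLM API."""
    await asyncio.sleep(0)  # yield control, as a network request would
    return {"agent": name, "answer": f"[{persona}] view on: {question}"}


async def run_agents(question: str) -> list[dict]:
    """Run every persona concurrently; results come back in persona order."""
    tasks = [ask_agent(name, persona, question) for name, persona in PERSONAS.items()]
    return await asyncio.gather(*tasks)


if __name__ == "__main__":
    for reply in asyncio.run(run_agents("Should we use Redis or Postgres?")):
        print(reply["agent"], "->", reply["answer"])
```

`asyncio.gather` preserves the order of its inputs, so each reply can be matched back to its persona without extra bookkeeping.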
Trust Mechanism:
- Initial scores: 0.85 for all agents
- Update formula: new_trust = old_trust + 0.1 * (outcome - old_trust)
- Weighted consensus by trust
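A minimal sketch of the trust mechanism above. The update rule is the formula as stated. The consensus rule is an assumption: each agent casts one vote, weighted by its trust score, and the option with the largest total wins. Encoding outcomes as 1.0 (verified correct) and 0.0 (incorrect) is also an assumption, and the function names are hypothetical.

```python
from collections import defaultdict

LEARNING_RATE = 0.1    # step size from the update formula above
INITIAL_TRUST = 0.85   # starting score for every agent


def update_trust(old_trust: float, outcome: float, rate: float = LEARNING_RATE) -> float:
    """Move the score a fraction `rate` of the way toward the observed outcome."""
    return old_trust + rate * (outcome - old_trust)


def weighted_consensus(votes: dict[str, str], trust: dict[str, float]) -> tuple[str, float]:
    """Return the option with the largest total trust and its share of all trust cast.

    Assumption: one vote per agent, weighted by that agent's trust score.
    """
    totals: dict[str, float] = defaultdict(float)
    for agent, option in votes.items():
        totals[option] += trust[agent]
    winner = max(totals, key=totals.get)
    return winner, totals[winner] / sum(totals.values())


if __name__ == "__main__":
    trust = {name: INITIAL_TRUST for name in ("architect", "pragmatist", "critic")}
    votes = {"architect": "Postgres", "pragmatist": "Redis", "critic": "Redis"}
    print(weighted_consensus(votes, trust))
    # Assumed outcome encoding: 1.0 if the agent's vote was verified correct, else 0.0.
    trust = {a: update_trust(t, 1.0 if votes[a] == "Redis" else 0.0) for a, t in trust.items()}
    print(trust)
```

With a rate of 0.1, a single correct outcome moves a score of 0.85 to 0.865 and a single incorrect one moves it to 0.765, so one bad call costs more than one good call earns when scores start high.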
Milestone 2: Claim Verification Layer (Feb 16-20, 2026)
Goal: Add fact-checking to agent responses
Deliverables:
- verifier.py - Claim extraction from agent responses
- Source grounding validation (does citation support claim?)
- Confidence scoring per claim
- Provenance chain tracking (claim → agent → source → line number)
Example:
Agent says: "Use Redis for caching." [Source: docs.redis.io]
Verifier checks: Does source actually recommend Redis for this use case?
Result: Supported (confidence: 0.92)
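One way to hold a verified claim and its provenance chain is a small immutable record, sketched below. The class name, field names, and the 0.8 support threshold are all hypothetical; the source does not specify a threshold or a schema. The verification step itself (deciding whether a source supports a claim) is out of scope here, so the confidence is passed in.

```python
from dataclasses import dataclass
from typing import Optional

SUPPORT_THRESHOLD = 0.8  # hypothetical cut-off; the source does not specify one


@dataclass(frozen=True)
class ClaimRecord:
    """One provenance chain: claim -> agent -> source -> line number."""
    claim: str
    agent: str
    source: str
    line_number: Optional[int]  # None when the source has no line-level anchor
    confidence: float           # verifier's confidence that the source supports the claim

    @property
    def verdict(self) -> str:
        return "Supported" if self.confidence >= SUPPORT_THRESHOLD else "Unsupported"

    def describe(self) -> str:
        return f"{self.verdict} (confidence: {self.confidence:.2f})"


if __name__ == "__main__":
    record = ClaimRecord(
        claim="Use Redis for caching.",
        agent="pragmatist",      # assumed speaker; the example above does not name the agent
        source="docs.redis.io",
        line_number=None,
        confidence=0.92,
    )
    print(record.describe())  # Supported (confidence: 0.92)
```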
Milestone 3: Web Dashboard (Feb 21-28, 2026)
Goal: Visual interface for trust evolution and agent debates
Deliverables:
- FastAPI backend serving agent analysis API
- React frontend with:
- Real-time agent conversation display
- Trust score graphs over time
- Provenance tree visualization
- Manual trust adjustment controls
- Deployed demo at demo.glassbox-ai.dev
UI Mockup:
┌─────────────────────────────────────────┐
│ Multi-Agent Analysis                    │
├─────────────────────────────────────────┤
│ @architect (Trust: 0.92)                │
│ "Use Postgres with materialized views"  │
│                                         │
│ @pragmatist (Trust: 0.85)               │
│ "Start with Redis, migrate later"       │
│                                         │
│ @critic (Trust: 0.88)                   │
│ "What's your eviction policy?"          │
├─────────────────────────────────────────┤
│ Weighted Consensus: Redis (0.87)        │
└─────────────────────────────────────────┘
Milestone 4: Multi-Model Support (Mar 1-7, 2026)
Goal: Support Claude + GPT + Gemini for true agent diversity
Deliverables:
- Anthropic API integration
- Google Gemini API integration
- Agent pool with mixed models:
- @architect → Claude Opus
- @pragmatist → GPT-4o
- @critic → Claude Sonnet
- @innovator → Gemini Pro
- Cost tracking per agent/model
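Cost tracking per agent and model could look roughly like the sketch below. `CostTracker` is a hypothetical name, and the per-token rates are supplied by the caller; no real provider prices are assumed.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class CostTracker:
    """Accumulate spend per (agent, model) pair from token counts.

    `usd_per_1k_tokens` is supplied by the caller; no real prices are built in.
    """
    usd_per_1k_tokens: dict[str, float]
    totals: dict[tuple[str, str], float] = field(default_factory=lambda: defaultdict(float))

    def record(self, agent: str, model: str, tokens: int) -> float:
        """Add the cost of one call and return it."""
        cost = tokens / 1000 * self.usd_per_1k_tokens[model]
        self.totals[(agent, model)] += cost
        return cost


if __name__ == "__main__":
    # Placeholder model name and rate, for illustration only.
    tracker = CostTracker({"model-a": 2.0})
    tracker.record("architect", "model-a", tokens=500)
    tracker.record("architect", "model-a", tokens=1500)
    print(dict(tracker.totals))
```

Keying totals by the (agent, model) pair keeps the numbers meaningful if an agent is later reassigned to a different model.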
Milestone 5: Production Hardening (Mar 8-15, 2026)
Goal: Enterprise-ready deployment
Deliverables:
- Rate limiting and retry logic
- Error recovery and fallbacks
- Observability (Prometheus metrics, OpenTelemetry traces)
- Security audit (API key handling, input validation)
- Load testing (100 concurrent analyses)
- Documentation site (docs.glassbox-ai.dev)
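The retry logic in the first deliverable is commonly implemented as exponential backoff with jitter. The sketch below is one such version, not the project's implementation; `call_with_retry` and its parameters are hypothetical, and it retries on any exception, where a real client would likely retry only on rate-limit and transient network errors.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    func: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `func`, retrying on any exception with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Delay doubles each attempt; jitter spreads out simultaneous retries.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 2))
    # Only reached if max_attempts < 1; also keeps type checkers satisfied.
    raise RuntimeError("max_attempts must be at least 1")
```

The `sleep` parameter exists so tests can substitute a recorder for `time.sleep` and run instantly.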
Milestone 6: CLI & PyPI Release (Mar 16-22, 2026)
Goal: Shareable package anyone can install
Deliverables:
- glassbox CLI tool
- PyPI package: pip install glassbox-ai
- Usage examples and tutorials
- Blog post: "Building Transparent Multi-Agent Systems"
- LinkedIn case study with screenshots
- GitHub Sponsors / funding model
Usage:
pip install glassbox-ai
# Analyze a problem
glassbox analyze "Should we use Redis or Postgres?"
# View trust dashboard
glassbox trust-dashboard
# Update trust manually
glassbox update-trust architect --correct
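The three commands above map naturally onto `argparse` subcommands. The following is a sketch of the argument parsing only, under the assumption that `--correct` records an outcome of 1.0; the `--incorrect` flag is hypothetical and does not appear in the usage above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the analyze, trust-dashboard, and update-trust commands."""
    parser = argparse.ArgumentParser(prog="glassbox")
    sub = parser.add_subparsers(dest="command", required=True)

    analyze = sub.add_parser("analyze", help="Run a multi-agent analysis")
    analyze.add_argument("question")

    sub.add_parser("trust-dashboard", help="Show current trust scores")

    update = sub.add_parser("update-trust", help="Record whether an agent was right")
    update.add_argument("agent")
    verdict = update.add_mutually_exclusive_group(required=True)
    verdict.add_argument("--correct", dest="outcome", action="store_const", const=1.0)
    # Hypothetical counterpart to --correct; not shown in the usage above.
    verdict.add_argument("--incorrect", dest="outcome", action="store_const", const=0.0)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(vars(args))
```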
Architecture
┌─────────────────────────────────────────────┐
│  Windsurf Chat / CLI / Web Dashboard        │
└─────────────────┬───────────────────────────┘
                  │ MCP Protocol / API
┌─────────────────▼───────────────────────────┐
│  MCP Server (server.py)                     │
│  Tools: multi_agent_analyze,                │
│         get_trust_scores, update_trust      │
└─────────────────┬───────────────────────────┘
                  │
┌─────────────────▼───────────────────────────┐
│  Orchestrator (orchestrator.py)             │
│  - Parallel agent execution                 │
│  - Weighted consensus                       │
│  - Provenance tracking                      │
└──┬────────────┬────────────┬────────────────┘
   │            │            │
┌──▼──┐     ┌───▼───┐    ┌───▼───┐
│GPT-4│     │GPT-4o │    │GPT-4  │
│Opus │     │       │    │Turbo  │
└─────┘     └───────┘    └───────┘
 @arch       @pragma      @critic
┌─────────────────────────────────────────────┐
│  Trust DB (trust_db.py)                     │
│  SQLite: agent → trust score → history      │
└─────────────────────────────────────────────┘
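The Trust DB layer at the bottom of the diagram could be sketched with the standard-library `sqlite3` module. The table and column names and the `TrustDB` class are assumptions for illustration; the design choice shown is an append-only history in which the latest row per agent is its current score.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS trust_history (
    id        INTEGER PRIMARY KEY AUTOINCREMENT,
    agent     TEXT NOT NULL,
    trust     REAL NOT NULL,
    recorded  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
)
"""


class TrustDB:
    """Append-only trust history; the latest row per agent is its current score."""

    def __init__(self, path: str = ":memory:", initial_trust: float = 0.85):
        self.conn = sqlite3.connect(path)
        self.conn.execute(SCHEMA)
        self.initial_trust = initial_trust

    def get(self, agent: str) -> float:
        """Current score, or the initial score for an agent with no history."""
        row = self.conn.execute(
            "SELECT trust FROM trust_history WHERE agent = ? ORDER BY id DESC LIMIT 1",
            (agent,),
        ).fetchone()
        return row[0] if row else self.initial_trust

    def record(self, agent: str, trust: float) -> None:
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT INTO trust_history (agent, trust) VALUES (?, ?)", (agent, trust)
            )

    def history(self, agent: str) -> list[float]:
        """All recorded scores for an agent, oldest first."""
        rows = self.conn.execute(
            "SELECT trust FROM trust_history WHERE agent = ? ORDER BY id", (agent,)
        )
        return [r[0] for r in rows]
```

Keeping every update rather than overwriting a single row is what makes trust-over-time graphs and audits possible later.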
Success Metrics
Technical:
- <500ms latency for 3-agent analysis
- Trust score convergence within 10 iterations
- 95%+ uptime on demo deployment
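The convergence target can be checked against the update formula new_trust = old_trust + 0.1 * (outcome - old_trust). If every update observes the same outcome, each step shrinks the remaining gap by a factor of 0.9, so after 10 updates about 65% of the initial gap has closed (0.9^10 ≈ 0.349 remains). Whether that counts as convergence depends on the tolerance, which is not defined here. The helper names below are hypothetical.

```python
def trust_after(n: int, start: float, outcome: float, rate: float = 0.1) -> float:
    """Trust after n updates that all observe the same outcome.

    Each update shrinks the gap to `outcome` by a factor (1 - rate), so
    trust_n = outcome + (start - outcome) * (1 - rate) ** n.
    """
    return outcome + (start - outcome) * (1 - rate) ** n


def updates_to_close(fraction: float, rate: float = 0.1) -> int:
    """Smallest n for which at least `fraction` of the initial gap has closed."""
    n, remaining = 0, 1.0
    while 1.0 - remaining < fraction:
        remaining *= 1 - rate
        n += 1
    return n


if __name__ == "__main__":
    print(trust_after(10, start=0.85, outcome=1.0))
    print(updates_to_close(0.9))  # 22
```

Under these assumptions, closing 90% of the gap takes 22 updates, so a 10-iteration target implies either a looser tolerance or a larger rate than 0.1.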
Adoption:
- 100 GitHub stars by end of March
- 50 PyPI downloads/week
- 1 enterprise POC
Validation:
- Featured in a newsletter (e.g., TLDR AI, The Batch)
- 1 blog post or paper citing this work
- Positive feedback from 5 real users
Quick Start (After Milestone 1)
Local Setup
git clone https://github.com/yourusername/glassbox-ai
cd glassbox-ai
pip install -r requirements.txt
# Add API key
cp .env.example .env
# Edit .env: OPENAI_API_KEY=sk-...
# Run MCP server
python server.py
Windsurf Integration
Add to ~/.codeium/windsurf/mcp_config.json:
{
  "mcpServers": {
    "glassbox-ai": {
      "command": "python",
      "args": ["/path/to/glassbox-ai/server.py"]
    }
  }
}
Restart Windsurf, then:
You: "Should we use Redis or Postgres for session storage?"
[Cascade invokes multi_agent_analyze]
3 agents analyzing...
Consensus ready
Project Structure
glassbox-ai/
├── README.md          # This file
├── server.py          # MCP entry point
├── orchestrator.py    # Multi-agent logic
├── trust_db.py        # Trust persistence
├── verifier.py        # (Milestone 2) Claim checking
├── requirements.txt
├── Dockerfile
├── .env.example
├── .gitignore
├── tests/             # (Milestone 5) Test suite
├── docs/              # (Milestone 5) Documentation
└── web/               # (Milestone 3) Dashboard
    ├── backend/
    └── frontend/
Contributing
We're in MVP phase. Contributions welcome after Milestone 1 is complete.
Roadmap priorities:
- Core MCP server stability
- Claim verification accuracy
- Trust evolution algorithms
- Multi-model integration
License
MIT (open source, shareable, production-ready)
Contact
Built by @yourname
LinkedIn: Your Profile
Building in public. Follow along for updates on transparent AI systems.
Related Work
Inspiration:
Frameworks we build on:
- CrewAI - Multi-agent patterns
- AutoGen - Agent orchestration
- InterpretML - Glass box models
Our unique contribution: First to combine trust scoring + claim verification + provenance in a production multi-agent system.
Glass Box over Black Box
File details
Details for the file glassbox_ai-0.2.0.tar.gz.
File metadata
- Download URL: glassbox_ai-0.2.0.tar.gz
- Upload date:
- Size: 11.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 70949f634c469f8b5890889cd57e45245252c02667443023f7e9cf4a2ba0ec0e |
| MD5 | 1d1715e9588f84334f1a438d84b651d1 |
| BLAKE2b-256 | 2569e2f423df7e0b6016210deb5d0ecfba142ad40443989337f104ca79036afd |
File details
Details for the file glassbox_ai-0.2.0-py3-none-any.whl.
File metadata
- Download URL: glassbox_ai-0.2.0-py3-none-any.whl
- Upload date:
- Size: 9.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2a5157ff0cec487a20453577adecd4ce7a2e2c3a9c0aaa4c7810c34dd620f791 |
| MD5 | ca3a28d8620771f2de42a3e590bc06dc |
| BLAKE2b-256 | 032f8da1167a6062cfcb47f829a0b38bccb4eb053fd02597cf4af30ebcee50fd |