🚨 MCP as a Judge: Prevent bad coding practices with AI-powered evaluation and user-driven decision making
Project description
⚖️ MCP as a Judge
Prevent bad coding practices with AI-powered evaluation and user-driven decision making
MCP as a Judge is a revolutionary Model Context Protocol (MCP) server that transforms the developer-AI collaboration experience. It acts as an intelligent gatekeeper for software development, preventing bad coding practices by using AI-powered evaluation and involving users in critical decisions when requirements are unclear or obstacles arise.
⚖️ Concept: This project extends the LLM-as-a-Judge paradigm to software engineering workflows, where AI models evaluate and guide development decisions rather than just generating code.
⚖️ Main Purpose: Improve Developer-AI Interface
The core mission is to enhance the interface between developers and AI coding assistants by:
- 🛡️ Preventing AI from making poor decisions through intelligent evaluation
- 🤝 Involving humans in critical choices instead of AI making assumptions
- 🔍 Enforcing research and best practices before implementation
- ⚖️ Creating a collaborative AI-human workflow for better software quality
Vibe Coding doesn't have to be frustrating
What It Enforces:
- ✅ Deep research of existing solutions and best practices
- ✅ Generic, reusable solutions instead of quick fixes
- ✅ User requirements alignment in all implementations
- ✅ Comprehensive planning before coding begins
- ✅ User involvement in all critical decisions
- ✅ Intelligent AI-human collaboration with clear boundaries and responsibilities
🛠️ Features
🔍 Intelligent Code Evaluation
- LLM-powered analysis using MCP sampling capability
- Software engineering best practices enforcement
- Security vulnerability detection
- Performance and maintainability assessment
📋 Comprehensive Planning Review
- Architecture validation against industry standards
- Research depth analysis to prevent reinventing solutions
- Requirements alignment verification
- Implementation approach evaluation
🤝 User-Driven Decision Making
- Obstacle resolution through user involvement via MCP elicitation
- Requirements clarification when requests are unclear
- No hidden fallbacks - transparent decision making
- Interactive problem solving with real-time user input
🛠️ List of Tools
| Tool Name | Description |
|---|---|
get_workflow_guidance |
Smart workflow analysis and tool recommendation |
judge_coding_plan |
Comprehensive plan evaluation with requirements alignment |
judge_code_change |
Code review with security and quality checks |
raise_obstacle |
User involvement when blockers arise |
elicit_missing_requirements |
Clarification of unclear requests |
🚀 Quick Start
Requirements & Recommendations
MCP Client Prerequisites
MCP as a Judge is heavily dependent on MCP Sampling and MCP Elicitation features for its core functionality:
- MCP Sampling - Required for AI-powered code evaluation and judgment
- MCP Elicitation - Required for interactive user decision prompts
System Prerequisites
- Python 3.13+ - Required for running the MCP server
Supported AI Assistants
| AI Assistant | Platform | MCP Support | Status | Notes |
|---|---|---|---|---|
| GitHub Copilot | Visual Studio Code | ✅ Full | Recommended | Complete MCP integration with tool calling |
✅ Recommended Setup: GitHub Copilot in Visual Studio Code for the best MCP as a Judge experience.
💡 Recommendations
- Large Context Window Models: 1M+ token size models are strongly preferred for optimal performance
- Models with larger context windows provide better code analysis and more comprehensive judgments
Note: MCP servers communicate via stdio (standard input/output), not HTTP ports. No network configuration is needed.
🔧 Visual Studio Code Configuration
Configure MCP as a Judge in Visual Studio Code with GitHub Copilot:
Method 1: Using uv (Recommended)
-
Install the package:
uv add mcp-as-a-judge
-
Configure Visual Studio Code MCP settings:
Add this to your Visual Studio Code MCP configuration file:
{ "servers": { "mcp-as-a-judge": { "command": "uv", "args": ["run", "mcp-as-a-judge"] } } }
Method 2: Using Docker
-
Pull the Docker image:
docker pull ghcr.io/hepivax/mcp-as-a-judge:latest
-
Configure Visual Studio Code MCP settings:
Add this to your Visual Studio Code MCP configuration file:
{ "servers": { "mcp-as-a-judge": { "command": "docker", "args": ["run", "--rm", "-i", "ghcr.io/hepivax/mcp-as-a-judge:latest"] } } }
📖 How It Works
Once MCP as a Judge is configured in Visual Studio Code with GitHub Copilot, it automatically guides your AI assistant through a structured software engineering workflow. The system operates transparently in the background, ensuring every development task follows best practices.
🔄 Automatic Workflow Enforcement
1. Intelligent Workflow Guidance
- When you make any development request, the AI assistant automatically calls
get_workflow_guidance - This tool uses AI analysis to determine which validation steps are required for your specific task
- Provides smart recommendations on which tools to use next and in what order
- No manual intervention needed - the workflow starts automatically with intelligent guidance
2. Planning & Design Phase
- For any implementation task, the AI assistant must first help you create:
- Detailed coding plan - Step-by-step implementation approach
- System design - Architecture, components, and technical decisions
- Research findings - Analysis of existing solutions and best practices
- Once complete,
judge_coding_planautomatically evaluates the plan using MCP Sampling - AI-powered evaluation checks for design quality, security, research thoroughness, and requirements alignment
3. Code Implementation Review
- After any code is written or modified,
judge_code_changeis automatically triggered - Mandatory code review happens immediately after file creation/modification
- Uses MCP Sampling to evaluate code quality, security vulnerabilities, and best practices
- Ensures every code change meets professional standards
🤝 User Involvement When Needed
Obstacle Resolution
- When the AI assistant encounters blockers or conflicting requirements,
raise_obstacleautomatically engages you - Uses MCP Elicitation to present options and get your decision
- No hidden fallbacks - you're always involved in critical decisions
Requirements Clarification
- If your request lacks sufficient detail,
elicit_missing_requirementsautomatically asks for clarification - Uses MCP Elicitation to gather specific missing information
- Ensures implementation matches your actual needs
🎯 What to Expect
- Automatic guidance - No need to explicitly ask the AI coding assistant to call tools
- Comprehensive planning - Every implementation starts with proper design and research
- Quality enforcement - All code changes are automatically reviewed against industry standards
- User-driven decisions - You're involved whenever your original request cannot be satisfied
- Professional standards - Consistent application of software engineering best practices
🔒 Privacy & API Key Free
🔑 No LLM API Key Required
- All judgments are performed using MCP Sampling capability
- No need to configure or pay for external LLM API services
- Works directly with your MCP-compatible client's existing AI model
🛡️ Your Privacy Matters
- The server runs locally on your machine
- No data collection - your code and conversations stay private
- No external API calls - everything happens within your local environment
- Complete control over your development workflow and sensitive information
🤝 Contributing
We welcome contributions! Please see CONTRIBUTING.md for guidelines.
Development Setup
# Clone the repository
git clone https://github.com/hepivax/mcp-as-a-judge.git
cd mcp-as-a-judge
# Install dependencies with uv
uv sync --all-extras --dev
# Install pre-commit hooks
uv run pre-commit install
# Run tests
uv run pytest
# Run all checks
uv run pytest && uv run ruff check && uv run ruff format --check && uv run mypy src
Release Process
This project uses automated semantic versioning:
- Commit with conventional commits:
feat:,fix:,docs:, etc. - Push to main: Semantic release will automatically create tags and releases
- Manual releases: Create a tag
v1.2.3and push to trigger release workflow
# Example conventional commits
git commit -m "feat: add new validation rule for async functions"
git commit -m "fix: resolve memory leak in server startup"
git commit -m "docs: update installation instructions"
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
🙏 Acknowledgments
- Model Context Protocol by Anthropic
- The amazing MCP community for inspiration and best practices
- All developers who will benefit from better coding practices
🚨 Ready to revolutionize your development workflow? Install MCP as a Judge today!
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file mcp_as_a_judge-0.1.1.tar.gz.
File metadata
- Download URL: mcp_as_a_judge-0.1.1.tar.gz
- Upload date:
- Size: 71.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
bb8f897bb606715d8317f887b38a80675977565429745ac208c2af9b19c05765
|
|
| MD5 |
f9c1a2c9971343dd9291c9d7d214b944
|
|
| BLAKE2b-256 |
339713f743668cc8ba78d2f8c0ab171e76f816dbe285769bef4053ad30db74ce
|
File details
Details for the file mcp_as_a_judge-0.1.1-py3-none-any.whl.
File metadata
- Download URL: mcp_as_a_judge-0.1.1-py3-none-any.whl
- Upload date:
- Size: 14.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
b77213c9e4ee8b9f724558a008d7d4caaa533763fa6198d56f3a9ff367104511
|
|
| MD5 |
7d7819be578aa0222242127cf4250078
|
|
| BLAKE2b-256 |
db28b74b58f39ebe742a6abb4df810d1865a19b2eb56ef9d3760d7710a4da36a
|