Skip to main content

🚨 MCP as a Judge: Prevent bad coding practices with AI-powered evaluation and user-driven decision making

Project description

⚖️ MCP as a Judge

Prevent bad coding practices with AI-powered evaluation and user-driven decision making

License: MIT Python 3.13+ MCP Compatible

CI Release PyPI version

codecov

MCP as a Judge is a revolutionary Model Context Protocol (MCP) server that transforms the developer-AI collaboration experience. It acts as an intelligent gatekeeper for software development, preventing bad coding practices by using AI-powered evaluation and involving users in critical decisions when requirements are unclear or obstacles arise.

⚖️ Concept: This project extends the LLM-as-a-Judge paradigm to software engineering workflows, where AI models evaluate and guide development decisions rather than just generating code.

⚖️ Main Purpose: Improve Developer-AI Interface

The core mission is to enhance the interface between developers and AI coding assistants by:

  • 🛡️ Preventing AI from making poor decisions through intelligent evaluation
  • 🤝 Involving humans in critical choices instead of AI making assumptions
  • 🔍 Enforcing research and best practices before implementation
  • ⚖️ Creating a collaborative AI-human workflow for better software quality

Vibe Coding doesn't have to be frustrating

What It Enforces:

  • Deep research of existing solutions and best practices
  • Generic, reusable solutions instead of quick fixes
  • User requirements alignment in all implementations
  • Comprehensive planning before coding begins
  • User involvement in all critical decisions
  • Intelligent AI-human collaboration with clear boundaries and responsibilities

🛠️ Features

🔍 Intelligent Code Evaluation

  • LLM-powered analysis using MCP sampling capability
  • Software engineering best practices enforcement
  • Security vulnerability detection
  • Performance and maintainability assessment

📋 Comprehensive Planning Review

  • Architecture validation against industry standards
  • Research depth analysis to prevent reinventing solutions
  • Requirements alignment verification
  • Implementation approach evaluation

🤝 User-Driven Decision Making

  • Obstacle resolution through user involvement via MCP elicitation
  • Requirements clarification when requests are unclear
  • No hidden fallbacks - transparent decision making
  • Interactive problem solving with real-time user input

🛠️ List of Tools

Tool Name Description
get_workflow_guidance Smart workflow analysis and tool recommendation
judge_coding_plan Comprehensive plan evaluation with requirements alignment
judge_code_change Code review with security and quality checks
raise_obstacle User involvement when blockers arise
elicit_missing_requirements Clarification of unclear requests

🚀 Quick Start

Requirements & Recommendations

MCP Client Prerequisites

MCP as a Judge is heavily dependent on MCP Sampling and MCP Elicitation features for its core functionality:

System Prerequisites

  • Python 3.13+ - Required for running the MCP server

Supported AI Assistants

AI Assistant Platform MCP Support Status Notes
GitHub Copilot Visual Studio Code ✅ Full Recommended Complete MCP integration with tool calling

✅ Recommended Setup: GitHub Copilot in Visual Studio Code for the best MCP as a Judge experience.

💡 Recommendations

  • Large Context Window Models: 1M+ token size models are strongly preferred for optimal performance
  • Models with larger context windows provide better code analysis and more comprehensive judgments

Note: MCP servers communicate via stdio (standard input/output), not HTTP ports. No network configuration is needed.

🔧 Visual Studio Code Configuration

Configure MCP as a Judge in Visual Studio Code with GitHub Copilot:

Method 1: Using uv (Recommended)

  1. Install the package:

    uv add mcp-as-a-judge
    
  2. Configure Visual Studio Code MCP settings:

    Add this to your Visual Studio Code MCP configuration file:

    {
      "servers": {
        "mcp-as-a-judge": {
          "command": "uv",
          "args": ["run", "mcp-as-a-judge"]
        }
      }
    }
    

Method 2: Using Docker

  1. Pull the Docker image:

    docker pull ghcr.io/hepivax/mcp-as-a-judge:latest
    
  2. Configure Visual Studio Code MCP settings:

    Add this to your Visual Studio Code MCP configuration file:

    {
      "servers": {
        "mcp-as-a-judge": {
          "command": "docker",
          "args": ["run", "--rm", "-i", "ghcr.io/hepivax/mcp-as-a-judge:latest"]
        }
      }
    }
    

📖 How It Works

Once MCP as a Judge is configured in Visual Studio Code with GitHub Copilot, it automatically guides your AI assistant through a structured software engineering workflow. The system operates transparently in the background, ensuring every development task follows best practices.

🔄 Automatic Workflow Enforcement

1. Intelligent Workflow Guidance

  • When you make any development request, the AI assistant automatically calls get_workflow_guidance
  • This tool uses AI analysis to determine which validation steps are required for your specific task
  • Provides smart recommendations on which tools to use next and in what order
  • No manual intervention needed - the workflow starts automatically with intelligent guidance

2. Planning & Design Phase

  • For any implementation task, the AI assistant must first help you create:
    • Detailed coding plan - Step-by-step implementation approach
    • System design - Architecture, components, and technical decisions
    • Research findings - Analysis of existing solutions and best practices
  • Once complete, judge_coding_plan automatically evaluates the plan using MCP Sampling
  • AI-powered evaluation checks for design quality, security, research thoroughness, and requirements alignment

3. Code Implementation Review

  • After any code is written or modified, judge_code_change is automatically triggered
  • Mandatory code review happens immediately after file creation/modification
  • Uses MCP Sampling to evaluate code quality, security vulnerabilities, and best practices
  • Ensures every code change meets professional standards

🤝 User Involvement When Needed

Obstacle Resolution

  • When the AI assistant encounters blockers or conflicting requirements, raise_obstacle automatically engages you
  • Uses MCP Elicitation to present options and get your decision
  • No hidden fallbacks - you're always involved in critical decisions

Requirements Clarification

  • If your request lacks sufficient detail, elicit_missing_requirements automatically asks for clarification
  • Uses MCP Elicitation to gather specific missing information
  • Ensures implementation matches your actual needs

🎯 What to Expect

  • Automatic guidance - No need to explicitly ask the AI coding assistant to call tools
  • Comprehensive planning - Every implementation starts with proper design and research
  • Quality enforcement - All code changes are automatically reviewed against industry standards
  • User-driven decisions - You're involved whenever your original request cannot be satisfied
  • Professional standards - Consistent application of software engineering best practices

🔒 Privacy & API Key Free

🔑 No LLM API Key Required

  • All judgments are performed using MCP Sampling capability
  • No need to configure or pay for external LLM API services
  • Works directly with your MCP-compatible client's existing AI model

🛡️ Your Privacy Matters

  • The server runs locally on your machine
  • No data collection - your code and conversations stay private
  • No external API calls - everything happens within your local environment
  • Complete control over your development workflow and sensitive information

🤝 Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

Development Setup

# Clone the repository
git clone https://github.com/hepivax/mcp-as-a-judge.git
cd mcp-as-a-judge

# Install dependencies with uv
uv sync --all-extras --dev

# Install pre-commit hooks
uv run pre-commit install

# Run tests
uv run pytest

# Run all checks
uv run pytest && uv run ruff check && uv run ruff format --check && uv run mypy src

Release Process

This project uses automated semantic versioning:

  1. Commit with conventional commits: feat:, fix:, docs:, etc.
  2. Push to main: Semantic release will automatically create tags and releases
  3. Manual releases: Create a tag v1.2.3 and push to trigger release workflow
# Example conventional commits
git commit -m "feat: add new validation rule for async functions"
git commit -m "fix: resolve memory leak in server startup"
git commit -m "docs: update installation instructions"

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • Model Context Protocol by Anthropic
  • The amazing MCP community for inspiration and best practices
  • All developers who will benefit from better coding practices

🚨 Ready to revolutionize your development workflow? Install MCP as a Judge today!

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

mcp_as_a_judge-0.1.0.tar.gz (71.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

mcp_as_a_judge-0.1.0-py3-none-any.whl (14.4 kB view details)

Uploaded Python 3

File details

Details for the file mcp_as_a_judge-0.1.0.tar.gz.

File metadata

  • Download URL: mcp_as_a_judge-0.1.0.tar.gz
  • Upload date:
  • Size: 71.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for mcp_as_a_judge-0.1.0.tar.gz
Algorithm Hash digest
SHA256 498067ce80641fd08b01ed0fa1c0e858ff653b484f8636871f5daba0d8d1830a
MD5 03fd24b8ab24f56a7d7c1d1ed5864665
BLAKE2b-256 57c48e5733cef2340a5d64d1dfbdd39770481bc50d6289e517c3cfb43637148d

See more details on using hashes here.

File details

Details for the file mcp_as_a_judge-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: mcp_as_a_judge-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 14.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for mcp_as_a_judge-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 dde7da77e23b72a26ef973d877d0b7a372c30ba20a45f2d6ef1dce7f55242d6b
MD5 36596fff29b51bcbaf7573474bf434bc
BLAKE2b-256 ebe1da60178f4f5e068b7a28d401e6f5ec7a4cb568fcd77ecddfee3a24548625

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page