🚀 Enhanced Subject Researcher MCP

CI/CD Pipeline Python 3.10+ License: MIT PyPI version

Advanced iterative target-driven research with multi-vertical search, sophisticated claim mining, and evidence-based synthesis

✨ Features

🎯 Iterative Target-Driven Research

  • Quality Meters: Coverage, recency, novelty, agreement, contradictions tracking
  • Stop Criteria: Configurable quality gates with automatic continuation logic
  • Stagnation Detection: Automatic scope widening when research plateaus
  • Adaptive Queries: Smart query generation based on iteration state
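
As a rough illustration of how these meters and gates might combine into a stop decision, here is a minimal sketch. The metric and threshold names follow this README's quality-gate configuration; the function itself is hypothetical, not the package's actual API.

```python
# Illustrative quality-gate check: decide whether another research
# iteration is warranted. A hypothetical sketch, not the library's code.
def should_continue(metrics: dict, thresholds: dict,
                    iteration: int, max_iterations: int) -> bool:
    if iteration >= max_iterations:
        return False
    if metrics["coverage"] < thresholds["min_coverage"]:
        return True   # topic not yet covered well enough
    if metrics["recency"] < thresholds["min_recency"]:
        return True   # sources too stale, keep searching
    if metrics["novelty"] < thresholds["novelty_threshold"]:
        return False  # research has plateaued (stagnation)
    # unresolved contradictions justify another pass
    return metrics["contradictions"] > thresholds["max_contradictions"]
```

With the thresholds from the Quick Start example, low coverage would trigger another iteration, while a novelty rate below 0.1 would stop the loop even if other gates pass.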

๐Ÿ” Multi-Vertical Search Engine

  • Real Web Search: DuckDuckGo integration for actual web results
  • 5 Search Verticals: Web, News, Docs, Community, Academic sources
  • Fallback System: Graceful degradation with high-quality synthetic results
  • Source Deduplication: Cross-iteration URL tracking
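
Cross-iteration URL tracking can be pictured as a small helper like this (a hypothetical sketch with trivial URL normalization; `dedupe_sources` is not part of the package's API):

```python
# Illustrative cross-iteration deduplication: track URLs already seen
# so repeated hits across search verticals and iterations are dropped.
def dedupe_sources(results: list[dict], seen_urls: set[str]) -> list[dict]:
    fresh = []
    for result in results:
        url = result["url"].rstrip("/").lower()  # minimal normalization
        if url not in seen_urls:
            seen_urls.add(url)
            fresh.append(result)
    return fresh
```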

โš—๏ธ Sophisticated Claim Mining

  • Atomic Claims: Extracts falsifiable, standalone statements
  • Metadata Extraction: Units, measurements, caveats automatically detected
  • Independence Detection: Cross-source validation and duplicate identification
  • Confidence Scoring: Evidence-based claim reliability assessment
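
One way to picture evidence-based confidence scoring: each additional independent corroborating source pushes a claim's confidence toward 1.0. The formula below is an illustrative assumption, not the library's actual scoring model.

```python
# Illustrative claim confidence: start at a base value for a single
# source, and let each extra independent source halve the remaining doubt.
def claim_confidence(independent_sources: int, base: float = 0.5) -> float:
    if independent_sources <= 0:
        return 0.0
    return 1.0 - (1.0 - base) * (0.5 ** (independent_sources - 1))
```

Under these assumed parameters, one source yields 0.5, two independent sources 0.75, three 0.875, and so on.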

📊 Enhanced Credibility Scoring

  • Multi-Factor Analysis: Domain authority, recency, content quality, independence
  • Independence Matrix: Detects source relationships and potential bias
  • Transparency: Detailed credibility breakdown for every source
  • Real-Time Updates: Dynamic scoring based on cross-validation
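
A multi-factor score of this kind is typically a weighted blend of the factors named above. The weights below are illustrative assumptions, not the package's real model; each input is expected in the 0-1 range.

```python
# Illustrative multi-factor credibility score combining domain authority,
# recency, content quality, and independence. Weights are assumptions.
def credibility_score(domain_authority: float, recency: float,
                      content_quality: float, independence: float) -> float:
    weights = {"authority": 0.35, "recency": 0.20,
               "quality": 0.25, "independence": 0.20}
    score = (weights["authority"] * domain_authority
             + weights["recency"] * recency
             + weights["quality"] * content_quality
             + weights["independence"] * independence)
    return round(score, 3)
```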

📋 Answer-First Synthesis

  • Direct Answers: Immediate response to research questions
  • Inline Citations: Professional citation system with automatic numbering
  • Evidence Weighting: Confidence scores based on source quality
  • Professional Reports: Executive summaries with actionable recommendations
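
Automatic citation numbering can be sketched as assigning each cited URL a stable [n] marker in first-use order (a hypothetical helper, not the package's synthesis code):

```python
# Illustrative inline-citation numbering: each distinct URL gets the
# next number the first time it is cited; repeats reuse their number.
def number_citations(cited_urls: list[str]) -> dict[str, int]:
    numbers: dict[str, int] = {}
    for url in cited_urls:
        if url not in numbers:
            numbers[url] = len(numbers) + 1
    return numbers
```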

🚀 Quick Start

๐Ÿณ Recommended: Docker Installation (Easiest)

The easiest way to use the Enhanced Subject Researcher MCP is with Docker - no Python dependencies or environment setup required!

Why Docker?

  • ✅ Zero setup - No Python environment configuration needed
  • ✅ Consistent - Works the same across all systems (Windows, macOS, Linux)
  • ✅ Isolated - No conflicts with your existing Python packages
  • ✅ Latest version - Always get the most recent release automatically
# Pull the latest Docker image
docker pull elad12390/subject-researcher-mcp:latest

# Test the server (optional)
docker run --rm -i elad12390/subject-researcher-mcp:latest

Alternative: Python Installation

pip install subject-researcher-mcp

Basic Usage

import asyncio
from subject_researcher_mcp import ResearchEngine, ResearchInputs

async def research_example():
    engine = ResearchEngine()
    
    # Configure research parameters
    inputs = ResearchInputs(
        subject="Python async programming best practices",
        objective="comprehensive_analysis",
        max_sources=15,
        constraints={
            "max_iterations": 3,
            "gate_thresholds": {
                "min_coverage": 0.7,
                "min_recency": 0.5,
                "novelty_threshold": 0.1,
                "max_contradictions": 0.3
            }
        }
    )
    
    # Execute iterative research
    report = await engine.conduct_iterative_research(inputs)
    
    print(f"Research completed: {len(report.sources)} sources, {len(report.claims)} claims")
    print(f"Confidence: {report.confidence:.1%}")
    print(f"Executive Summary: {report.executive_summary}")
    
    await engine.close()

# Run the research
asyncio.run(research_example())

🔧 Installation for AI Editors

Claude Desktop

๐Ÿณ Recommended: Docker Method (Easiest)

  1. Pull the Docker image:

    docker pull elad12390/subject-researcher-mcp:latest
    
  2. Configure Claude Desktop:

    • Open Claude Desktop
    • Click the Claude menu → Settings
    • Go to Developer tab → Edit Config
    • Add this configuration:
    {
      "mcpServers": {
        "subject-researcher": {
          "command": "docker",
          "args": [
            "run",
            "--rm",
            "-i",
            "elad12390/subject-researcher-mcp:latest"
          ],
          "env": {
            "GEMINI_API_KEY": "your-optional-gemini-api-key"
          }
        }
      }
    }
    
  3. Restart Claude Desktop and look for the MCP server indicator (🔌) in the chat input.

📦 Alternative: Python Method

  1. Install the package:

    pip install subject-researcher-mcp
    
  2. Configure Claude Desktop:

    {
      "mcpServers": {
        "subject-researcher": {
          "command": "python",
          "args": ["-m", "subject_researcher_mcp.server"],
          "env": {
            "GEMINI_API_KEY": "your-optional-gemini-api-key"
          }
        }
      }
    }
    

Cursor IDE

๐Ÿณ Recommended: Docker Method (Easiest)

  1. Pull the Docker image:

    docker pull elad12390/subject-researcher-mcp:latest
    
  2. Configure Cursor:

    • Create .cursor/mcp.json in your project root (or ~/.cursor/mcp.json for global access)
    • Add this configuration:
    {
      "mcpServers": {
        "subject-researcher": {
          "command": "docker",
          "args": [
            "run",
            "--rm", 
            "-i",
            "elad12390/subject-researcher-mcp:latest"
          ],
          "enabled": true,
          "env": {
            "GEMINI_API_KEY": "your-optional-gemini-api-key"
          }
        }
      }
    }
    
  3. Usage in Cursor:

    • Open the Composer Agent
    • MCP tools will be listed under "Available Tools"
    • Ask for research using natural language

📦 Alternative: Python Method

  1. Install the package:

    pip install subject-researcher-mcp
    
  2. Configure Cursor:

    {
      "mcpServers": {
        "subject-researcher": {
          "command": "python",
          "args": ["-m", "subject_researcher_mcp.server"],
          "enabled": true,
          "env": {
            "GEMINI_API_KEY": "your-optional-gemini-api-key"
          }
        }
      }
    }
    

Claude Code

🚀 Command-Line Method (Easiest)

# Using Docker (Recommended)
claude mcp add subject-researcher --env GEMINI_API_KEY=your-optional-key \
  -- docker run --rm -i elad12390/subject-researcher-mcp:latest

# Or using Python
claude mcp add subject-researcher --env GEMINI_API_KEY=your-optional-key \
  -- python -m subject_researcher_mcp.server

🔧 Manual Configuration

  1. Pull the Docker image:

    docker pull elad12390/subject-researcher-mcp:latest
    
  2. Add manually via JSON:

    claude mcp add-json subject-researcher '{
      "type":"stdio",
      "command":"docker",
      "args":["run","--rm","-i","elad12390/subject-researcher-mcp:latest"],
      "env":{"GEMINI_API_KEY":"your-optional-key"}
    }'
    

OpenCode

๐Ÿณ Recommended: Docker Method (Easiest)

  1. Pull the Docker image:

    docker pull elad12390/subject-researcher-mcp:latest
    
  2. Configure OpenCode:

    • In your project directory, edit opencode.json
    • Add this to the configuration:
    {
      "$schema": "https://opencode.ai/config.json",
      "mcp": {
        "subject-researcher": {
          "type": "local",
          "command": [
            "docker",
            "run",
            "--rm",
            "-i", 
            "elad12390/subject-researcher-mcp:latest"
          ],
          "enabled": true,
          "environment": {
            "GEMINI_API_KEY": "your-optional-gemini-api-key"
          }
        }
      }
    }
    
  3. Usage in OpenCode:

    • MCP tools are automatically available to the LLM
    • Ask for research and OpenCode will use the tools as needed

📦 Alternative: Python Method

  1. Install the package:

    pip install subject-researcher-mcp
    
  2. Configure OpenCode:

    {
      "$schema": "https://opencode.ai/config.json",
      "mcp": {
        "subject-researcher": {
          "type": "local",
          "command": ["python", "-m", "subject_researcher_mcp.server"],
          "enabled": true,
          "environment": {
            "GEMINI_API_KEY": "your-optional-gemini-api-key"
          }
        }
      }
    }
    

📖 MCP Server Usage

Once configured in your AI editor, you can use natural language to request research:

Example requests:

  • "Research the latest developments in quantum computing applications"
  • "Analyze current best practices for microservices architecture"
  • "Investigate recent security vulnerabilities in popular Python packages"

The MCP server provides these tools:

  • conduct_iterative_research - Full 11-phase research methodology
  • conduct_research - Basic multi-source research
  • analyze_research_quality - Quality assessment of research results

🔑 Environment Variables

The Subject Researcher MCP supports the following optional environment variables:

  • GEMINI_API_KEY: Google Gemini API key for enhanced analysis and synthesis (optional; not used by default)

Note: The research engine works fully without any API keys, using free search APIs. The Gemini API key is only used for optional enhanced analysis features.

Direct MCP Server Usage

# Start the MCP server
python -m subject_researcher_mcp.server

# Or using Docker
docker run --rm -i elad12390/subject-researcher-mcp:latest

🛠️ Development

Prerequisites

  • Python 3.10+
  • Git

Setup

# Clone the repository
git clone https://github.com/your-org/subject-researcher-mcp.git
cd subject-researcher-mcp

# Install in development mode
pip install -e .

# Install development dependencies
pip install pytest pytest-asyncio ruff build

Running Tests

# Run all tests
pytest tests/ -v

# Run E2E tests
pytest tests/test_real_mcp_e2e.py -v

# Run quick validation
python -c "from subject_researcher_mcp.research_engine import ResearchEngine; print('✅ Validation passed')"

Code Quality

# Lint code
ruff check src/ tests/

# Format code
ruff format src/ tests/

# Type checking (if using mypy)
mypy src/

📖 API Reference

ResearchEngine

Main class for conducting iterative research.

class ResearchEngine:
    def __init__(self, gemini_api_key: Optional[str] = None)
    
    async def conduct_iterative_research(self, inputs: ResearchInputs) -> ResearchReport
    async def conduct_research(self, inputs: ResearchInputs) -> ResearchReport  # Legacy method

ResearchInputs

Configuration for research execution.

@dataclass
class ResearchInputs:
    subject: str
    objective: str = "comprehensive_analysis"  # or "best_options", "decision_support"
    depth: str = "standard"  # "fast", "standard", "deep"
    max_sources: int = 50
    recency_months: int = 18
    constraints: Dict[str, Any] = field(default_factory=dict)

Quality Gates Configuration

gate_thresholds = {
    "min_coverage": 0.7,        # Minimum topic coverage (0-1)
    "min_recency": 0.5,         # Minimum source freshness (0-1)
    "novelty_threshold": 0.1,   # Minimum new info rate (0-1)
    "max_contradictions": 0.3   # Maximum contradiction level (0-1)
}

🔧 Configuration

Environment Variables

# Optional: Gemini API key for enhanced AI analysis
export GEMINI_API_KEY="your-gemini-api-key"

# Optional: Custom configuration
export RESEARCH_MAX_ITERATIONS=5
export RESEARCH_TIMEOUT=300
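
These variables could be read in one place like so. The variable names come from this README; the `load_research_config` helper, its defaults, and the integer parsing are illustrative assumptions.

```python
# Illustrative config loader for the optional environment variables.
import os

def load_research_config() -> dict:
    return {
        "gemini_api_key": os.getenv("GEMINI_API_KEY"),  # None if unset
        "max_iterations": int(os.getenv("RESEARCH_MAX_ITERATIONS", "3")),
        "timeout_seconds": int(os.getenv("RESEARCH_TIMEOUT", "300")),
    }
```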

MCP Client Configuration

{
  "mcpServers": {
    "subject-researcher": {
      "command": "python",
      "args": ["-m", "subject_researcher_mcp.server"],
      "env": {
        "GEMINI_API_KEY": "your-key-here"
      }
    }
  }
}

๐Ÿ—๏ธ Architecture

Research Methodology

The Enhanced Subject Researcher implements an 11-phase iterative methodology:

  1. Plan - Generate research questions and hypotheses
  2. Query Design - Create adaptive search queries
  3. Harvest - Multi-vertical search execution
  4. Triage - Source quality filtering
  5. Claim Mining - Atomic claim extraction
  6. Cluster & Triangulate - Cross-source validation
  7. Evaluate Credibility - Enhanced scoring with independence matrix
  8. Topic Logic - Domain-specific analysis (for "best X" queries)
  9. Synthesize - Answer-first report generation
  10. Self-Critique - Gap identification and quality assessment
  11. Package & Verify - Final report assembly and validation
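
Structurally, the methodology amounts to an ordered pipeline of phase handlers applied to a shared research state. The sketch below models only that shape; the phase names mirror the list above, but the `run_pipeline` function and state dict are hypothetical, not the engine's code.

```python
# Illustrative phase pipeline for the 11-phase methodology: each phase
# handler takes the research state and returns an updated state.
PHASES = ["plan", "query_design", "harvest", "triage", "claim_mining",
          "cluster_triangulate", "evaluate_credibility", "topic_logic",
          "synthesize", "self_critique", "package_verify"]

def run_pipeline(state: dict, handlers: dict) -> dict:
    for phase in PHASES:
        state = handlers[phase](state)
    return state
```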

Quality Metrics

  • Coverage: How comprehensively the topic has been researched
  • Recency: Average age and freshness of sources
  • Novelty: Rate of new information discovery per iteration
  • Agreement: Level of consensus across sources
  • Contradictions: Amount of conflicting information found
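
The novelty meter, for example, might be computed as the share of an iteration's URLs not seen in earlier iterations (an assumed definition for illustration, not necessarily the engine's exact formula):

```python
# Illustrative novelty rate: fraction of this iteration's URLs that
# were not discovered in any previous iteration.
def novelty_rate(iteration_urls: list[str], previous_urls: set[str]) -> float:
    if not iteration_urls:
        return 0.0
    new = sum(1 for u in iteration_urls if u not in previous_urls)
    return new / len(iteration_urls)
```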

🔒 Security

Data Privacy

  • No personal data collection
  • API keys handled securely
  • Source URLs and content processed locally

Security Scanning

  • Automated dependency vulnerability scanning
  • Code security analysis with Bandit
  • Regular security updates

📊 Performance

Benchmarks

  • Search Speed: 2-5 real sources per iteration (15-45 seconds)
  • Claim Extraction: 2-3 atomic claims per source
  • Memory Usage: ~50-100MB for standard research
  • Accuracy: 85%+ confidence scores in controlled tests

Optimization Tips

  • Use depth="fast" for quick research (2-3 iterations)
  • Adjust max_sources based on thoroughness needs
  • Configure gate_thresholds for different quality requirements

๐Ÿค Contributing

We welcome contributions! Please see our Contributing Guide for details.

Development Workflow

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Make your changes
  4. Add tests for new functionality
  5. Ensure all tests pass (pytest)
  6. Submit a pull request

Code Standards

  • Follow PEP 8 style guidelines
  • Add type hints for all functions
  • Include docstrings for public APIs
  • Maintain test coverage above 80%

๐Ÿ“ License

This project is licensed under the MIT License - see the LICENSE file for details.

๐Ÿ™ Acknowledgments

  • MCP Protocol for the foundation
  • DuckDuckGo for search capabilities
  • Wikipedia API for reliable reference data
  • Research methodology inspired by academic research best practices

📈 Roadmap

v2.1.0 (Planned)

  • Real-time research monitoring dashboard
  • Advanced NLP for better claim extraction
  • Integration with academic databases
  • Research collaboration features

v2.2.0 (Future)

  • Machine learning for query optimization
  • Multi-language research support
  • Advanced visualization tools
  • Research template system

📞 Support


Made with ❤️ by the Enhanced Subject Researcher Team

"Transforming information chaos into evidence-based insights through intelligent automation"
