# Enhanced Subject Researcher MCP

*Advanced iterative target-driven research with multi-vertical search, sophisticated claim mining, and evidence-based synthesis*
## Features

### Iterative Target-Driven Research
- Quality Meters: Coverage, recency, novelty, agreement, contradictions tracking
- Stop Criteria: Configurable quality gates with automatic continuation logic
- Stagnation Detection: Automatic scope widening when research plateaus
- Adaptive Queries: Smart query generation based on iteration state
### Multi-Vertical Search Engine
- Real Web Search: DuckDuckGo integration for actual web results
- 5 Search Verticals: Web, News, Docs, Community, Academic sources
- Fallback System: Graceful degradation with high-quality synthetic results
- Source Deduplication: Cross-iteration URL tracking
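The cross-iteration URL tracking above can be sketched as a small deduplicator. The normalization rules here (lowercasing, dropping query strings, fragments, and trailing slashes) are illustrative assumptions, not the engine's actual logic:

```python
from urllib.parse import urlsplit, urlunsplit

def normalize_url(url: str) -> str:
    """Collapse near-duplicate URLs to one key (assumed normalization rules)."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, "", ""))

class SourceTracker:
    """Keeps one copy of each URL across research iterations."""
    def __init__(self):
        self.seen: set[str] = set()

    def add(self, url: str) -> bool:
        """Return True if the URL is new, False if it duplicates a prior source."""
        key = normalize_url(url)
        if key in self.seen:
            return False
        self.seen.add(key)
        return True

tracker = SourceTracker()
print(tracker.add("https://example.com/post/"))     # True: new source
print(tracker.add("https://EXAMPLE.com/post?x=1"))  # False: duplicate after normalization
```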
### Sophisticated Claim Mining
- Atomic Claims: Extracts falsifiable, standalone statements
- Metadata Extraction: Units, measurements, caveats automatically detected
- Independence Detection: Cross-source validation and duplicate identification
- Confidence Scoring: Evidence-based claim reliability assessment
### Enhanced Credibility Scoring
- Multi-Factor Analysis: Domain authority, recency, content quality, independence
- Independence Matrix: Detects source relationships and potential bias
- Transparency: Detailed credibility breakdown for every source
- Real-Time Updates: Dynamic scoring based on cross-validation
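A multi-factor analysis like the one above is often implemented as a weighted average of per-factor scores. The factor names and weights below are hypothetical, chosen only to illustrate how a per-source credibility breakdown could combine into one value:

```python
def credibility_score(factors: dict[str, float], weights: dict[str, float]) -> float:
    """Combine per-factor scores in [0, 1] into one weighted credibility value."""
    total_weight = sum(weights.values())
    return sum(factors[name] * weight for name, weight in weights.items()) / total_weight

# Hypothetical weighting: domain authority dominates, independence acts as a tiebreaker.
weights = {"domain_authority": 0.4, "content_quality": 0.3, "recency": 0.2, "independence": 0.1}
factors = {"domain_authority": 0.9, "content_quality": 0.8, "recency": 0.5, "independence": 1.0}
print(f"{credibility_score(factors, weights):.2f}")  # 0.80
```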
### Answer-First Synthesis
- Direct Answers: Immediate response to research questions
- Inline Citations: Professional citation system with automatic numbering
- Evidence Weighting: Confidence scores based on source quality
- Professional Reports: Executive summaries with actionable recommendations
## Quick Start

### Recommended: Docker Installation (Easiest)

The easiest way to use the Enhanced Subject Researcher MCP is with Docker: no Python dependencies or environment setup required!
Why Docker?
- **Zero setup**: No Python environment configuration needed
- **Consistent**: Works the same across Windows, macOS, and Linux
- **Isolated**: No conflicts with your existing Python packages
- **Latest version**: Always get the most recent release automatically
```bash
# Pull the latest Docker image
docker pull elad12390/subject-researcher-mcp:latest

# Test the server (optional)
docker run --rm -i elad12390/subject-researcher-mcp:latest
```
### Alternative: Python Installation

```bash
pip install subject-researcher-mcp
```
### Basic Usage

```python
import asyncio
from subject_researcher_mcp import ResearchEngine, ResearchInputs

async def research_example():
    engine = ResearchEngine()

    # Configure research parameters
    inputs = ResearchInputs(
        subject="Python async programming best practices",
        objective="comprehensive_analysis",
        max_sources=15,
        constraints={
            "max_iterations": 3,
            "gate_thresholds": {
                "min_coverage": 0.7,
                "min_recency": 0.5,
                "novelty_threshold": 0.1,
                "max_contradictions": 0.3
            }
        }
    )

    # Execute iterative research
    report = await engine.conduct_iterative_research(inputs)

    print(f"Research completed: {len(report.sources)} sources, {len(report.claims)} claims")
    print(f"Confidence: {report.confidence:.1%}")
    print(f"Executive Summary: {report.executive_summary}")

    await engine.close()

# Run the research
asyncio.run(research_example())
```
## Installation for AI Editors

### Claude Desktop

#### Recommended: Docker Method (Easiest)
1. Pull the Docker image:

   ```bash
   docker pull elad12390/subject-researcher-mcp:latest
   ```

2. Configure Claude Desktop:
   - Open Claude Desktop
   - Click the Claude menu → Settings
   - Go to the Developer tab → Edit Config
   - Add this configuration:

   ```json
   {
     "mcpServers": {
       "subject-researcher": {
         "command": "docker",
         "args": ["run", "--rm", "-i", "elad12390/subject-researcher-mcp:latest"],
         "env": {
           "GEMINI_API_KEY": "your-optional-gemini-api-key"
         }
       }
     }
   }
   ```

3. Restart Claude Desktop and look for the MCP server indicator in the chat input.
#### Alternative: Python Method

1. Install the package:

   ```bash
   pip install subject-researcher-mcp
   ```

2. Configure Claude Desktop:

   ```json
   {
     "mcpServers": {
       "subject-researcher": {
         "command": "python",
         "args": ["-m", "subject_researcher_mcp.server"],
         "env": {
           "GEMINI_API_KEY": "your-optional-gemini-api-key"
         }
       }
     }
   }
   ```
### Cursor IDE

#### Recommended: Docker Method (Easiest)

1. Pull the Docker image:

   ```bash
   docker pull elad12390/subject-researcher-mcp:latest
   ```

2. Configure Cursor:
   - Create `.cursor/mcp.json` in your project root (or `~/.cursor/mcp.json` for global access)
   - Add this configuration:

   ```json
   {
     "mcpServers": {
       "subject-researcher": {
         "command": "docker",
         "args": ["run", "--rm", "-i", "elad12390/subject-researcher-mcp:latest"],
         "enabled": true,
         "env": {
           "GEMINI_API_KEY": "your-optional-gemini-api-key"
         }
       }
     }
   }
   ```

3. Usage in Cursor:
   - Open the Composer Agent
   - MCP tools will be listed under "Available Tools"
   - Ask for research using natural language
#### Alternative: Python Method

1. Install the package:

   ```bash
   pip install subject-researcher-mcp
   ```

2. Configure Cursor:

   ```json
   {
     "mcpServers": {
       "subject-researcher": {
         "command": "python",
         "args": ["-m", "subject_researcher_mcp.server"],
         "enabled": true,
         "env": {
           "GEMINI_API_KEY": "your-optional-gemini-api-key"
         }
       }
     }
   }
   ```
### Claude Code

#### Command-Line Method (Easiest)

```bash
# Using Docker (recommended)
claude mcp add subject-researcher --env GEMINI_API_KEY=your-optional-key \
  -- docker run --rm -i elad12390/subject-researcher-mcp:latest

# Or using Python
claude mcp add subject-researcher --env GEMINI_API_KEY=your-optional-key \
  -- python -m subject_researcher_mcp.server
```

#### Manual Configuration

1. Pull the Docker image:

   ```bash
   docker pull elad12390/subject-researcher-mcp:latest
   ```

2. Add manually via JSON:

   ```bash
   claude mcp add-json subject-researcher '{
     "type": "stdio",
     "command": "docker",
     "args": ["run", "--rm", "-i", "elad12390/subject-researcher-mcp:latest"],
     "env": {"GEMINI_API_KEY": "your-optional-key"}
   }'
   ```
### OpenCode

#### Recommended: Docker Method (Easiest)

1. Pull the Docker image:

   ```bash
   docker pull elad12390/subject-researcher-mcp:latest
   ```

2. Configure OpenCode:
   - In your project directory, edit `opencode.json`
   - Add this to the configuration:

   ```json
   {
     "$schema": "https://opencode.ai/config.json",
     "mcp": {
       "subject-researcher": {
         "type": "local",
         "command": ["docker", "run", "--rm", "-i", "elad12390/subject-researcher-mcp:latest"],
         "enabled": true,
         "environment": {
           "GEMINI_API_KEY": "your-optional-gemini-api-key"
         }
       }
     }
   }
   ```

3. Usage in OpenCode:
   - MCP tools are automatically available to the LLM
   - Ask for research and OpenCode will use the tools as needed

#### Alternative: Python Method

1. Install the package:

   ```bash
   pip install subject-researcher-mcp
   ```

2. Configure OpenCode:

   ```json
   {
     "$schema": "https://opencode.ai/config.json",
     "mcp": {
       "subject-researcher": {
         "type": "local",
         "command": ["python", "-m", "subject_researcher_mcp.server"],
         "enabled": true,
         "environment": {
           "GEMINI_API_KEY": "your-optional-gemini-api-key"
         }
       }
     }
   }
   ```
## MCP Server Usage
Once configured in your AI editor, you can use natural language to request research:
Example requests:
- "Research the latest developments in quantum computing applications"
- "Analyze current best practices for microservices architecture"
- "Investigate recent security vulnerabilities in popular Python packages"
The MCP server provides these tools:

- `conduct_iterative_research`: Full 11-phase research methodology
- `conduct_research`: Basic multi-source research
- `analyze_research_quality`: Quality assessment of research results
## Environment Variables

The Subject Researcher MCP supports the following optional environment variables:

| Variable | Description | Required | Default |
|---|---|---|---|
| `GEMINI_API_KEY` | Google Gemini API key for enhanced analysis and synthesis | No | Not used |
Note: The research engine works fully without any API keys, using free search APIs. The Gemini API key is only used for optional enhanced analysis features.
### Direct MCP Server Usage

```bash
# Start the MCP server
python -m subject_researcher_mcp.server

# Or using Docker
docker run --rm -i elad12390/subject-researcher-mcp:latest
```
## Development

### Prerequisites
- Python 3.10+
- Git
### Setup

```bash
# Clone the repository
git clone https://github.com/your-org/subject-researcher-mcp.git
cd subject-researcher-mcp

# Install in development mode
pip install -e .

# Install development dependencies
pip install pytest pytest-asyncio ruff build
```
### Running Tests

```bash
# Run all tests
pytest tests/ -v

# Run E2E tests
pytest tests/test_real_mcp_e2e.py -v

# Run quick validation
python -c "from src.subject_researcher_mcp.research_engine import ResearchEngine; print('Validation passed')"
```
### Code Quality

```bash
# Lint code
ruff check src/ tests/

# Format code
ruff format src/ tests/

# Type checking (if using mypy)
mypy src/
```
## API Reference

### ResearchEngine

Main class for conducting iterative research.

```python
class ResearchEngine:
    def __init__(self, gemini_api_key: Optional[str] = None)
    async def conduct_iterative_research(self, inputs: ResearchInputs) -> ResearchReport
    async def conduct_research(self, inputs: ResearchInputs) -> ResearchReport  # Legacy method
```
### ResearchInputs

Configuration for research execution.

```python
@dataclass
class ResearchInputs:
    subject: str
    objective: str = "comprehensive_analysis"  # or "best_options", "decision_support"
    depth: str = "standard"                    # "fast", "standard", "deep"
    max_sources: int = 50
    recency_months: int = 18
    constraints: Dict[str, Any] = field(default_factory=dict)
```
### Quality Gates Configuration

```python
gate_thresholds = {
    "min_coverage": 0.7,        # Minimum topic coverage (0-1)
    "min_recency": 0.5,         # Minimum source freshness (0-1)
    "novelty_threshold": 0.1,   # Minimum new info rate (0-1)
    "max_contradictions": 0.3   # Maximum contradiction level (0-1)
}
```
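One way such thresholds could gate iteration is a simple all-pass check. The metric keys below mirror the threshold names and are an assumption about how the engine names its internal metrics, not its actual implementation:

```python
def quality_gates_met(metrics: dict[str, float], thresholds: dict[str, float]) -> bool:
    """Return True when coverage and recency are high enough and contradictions
    are low enough to stop iterating. (Sketch only; novelty/stagnation handling
    would live elsewhere in the engine.)"""
    return (
        metrics["coverage"] >= thresholds["min_coverage"]
        and metrics["recency"] >= thresholds["min_recency"]
        and metrics["contradictions"] <= thresholds["max_contradictions"]
    )

thresholds = {"min_coverage": 0.7, "min_recency": 0.5, "max_contradictions": 0.3}
print(quality_gates_met({"coverage": 0.8, "recency": 0.6, "contradictions": 0.1}, thresholds))  # True
```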
## Configuration

### Environment Variables

```bash
# Optional: Gemini API key for enhanced AI analysis
export GEMINI_API_KEY="your-gemini-api-key"

# Optional: Custom configuration
export RESEARCH_MAX_ITERATIONS=5
export RESEARCH_TIMEOUT=300
```
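A server could resolve these variables with the stdlib `os.environ`; treating the exported example values above as the fallback defaults is an assumption made only for illustration:

```python
import os

# Fall back to the documented example values when the variables are unset (assumed defaults).
max_iterations = int(os.environ.get("RESEARCH_MAX_ITERATIONS", "5"))
timeout_seconds = int(os.environ.get("RESEARCH_TIMEOUT", "300"))
gemini_api_key = os.environ.get("GEMINI_API_KEY")  # None leaves enhanced analysis disabled
```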
### MCP Client Configuration

```json
{
"mcpServers": {
"subject-researcher": {
"command": "python",
"args": ["-m", "subject_researcher_mcp.server"],
"env": {
"GEMINI_API_KEY": "your-key-here"
}
}
}
}
```
## Architecture

### Research Methodology

The Enhanced Subject Researcher implements an 11-phase iterative methodology:

1. **Plan**: Generate research questions and hypotheses
2. **Query Design**: Create adaptive search queries
3. **Harvest**: Multi-vertical search execution
4. **Triage**: Source quality filtering
5. **Claim Mining**: Atomic claim extraction
6. **Cluster & Triangulate**: Cross-source validation
7. **Evaluate Credibility**: Enhanced scoring with independence matrix
8. **Topic Logic**: Domain-specific analysis (for "best X" queries)
9. **Synthesize**: Answer-first report generation
10. **Self-Critique**: Gap identification and quality assessment
11. **Package & Verify**: Final report assembly and validation
### Quality Metrics
- Coverage: How comprehensively the topic has been researched
- Recency: Average age and freshness of sources
- Novelty: Rate of new information discovery per iteration
- Agreement: Level of consensus across sources
- Contradictions: Amount of conflicting information found
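Novelty, for instance, can be computed as the share of an iteration's sources that were not seen in earlier iterations. This is a plausible formulation for the metric above, not necessarily the engine's exact formula:

```python
def novelty_rate(iteration_urls: set[str], seen_urls: set[str]) -> float:
    """Fraction of this iteration's sources that are new (0.0 when nothing was fetched)."""
    if not iteration_urls:
        return 0.0
    return len(iteration_urls - seen_urls) / len(iteration_urls)

# 2 of this iteration's 4 sources are new, so novelty is 0.5
print(novelty_rate({"a", "b", "c", "d"}, {"a", "b"}))  # 0.5
```

When this rate drops below `novelty_threshold`, the stagnation-detection behavior described earlier would kick in.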
## Security

### Data Privacy
- No personal data collection
- API keys handled securely
- Source URLs and content processed locally
### Security Scanning
- Automated dependency vulnerability scanning
- Code security analysis with Bandit
- Regular security updates
## Performance

### Benchmarks
- Search Speed: 2-5 real sources per iteration (15-45 seconds)
- Claim Extraction: 2-3 atomic claims per source
- Memory Usage: ~50-100MB for standard research
- Accuracy: 85%+ confidence scores in controlled tests
### Optimization Tips

- Use `depth="fast"` for quick research (2-3 iterations)
- Adjust `max_sources` based on thoroughness needs
- Configure `gate_thresholds` for different quality requirements
## Contributing
We welcome contributions! Please see our Contributing Guide for details.
### Development Workflow

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Make your changes
4. Add tests for new functionality
5. Ensure all tests pass (`pytest`)
6. Submit a pull request
### Code Standards
- Follow PEP 8 style guidelines
- Add type hints for all functions
- Include docstrings for public APIs
- Maintain test coverage above 80%
## License
This project is licensed under the MIT License - see the LICENSE file for details.
## Acknowledgments
- MCP Protocol for the foundation
- DuckDuckGo for search capabilities
- Wikipedia API for reliable reference data
- Research methodology inspired by academic research best practices
## Roadmap

### v2.1.0 (Planned)
- Real-time research monitoring dashboard
- Advanced NLP for better claim extraction
- Integration with academic databases
- Research collaboration features
### v2.2.0 (Future)
- Machine learning for query optimization
- Multi-language research support
- Advanced visualization tools
- Research template system
## Support
- Documentation: GitHub Wiki
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: research-support@your-org.com
Made with ❤️ by the Enhanced Subject Researcher Team

*"Transforming information chaos into evidence-based insights through intelligent automation"*
## File details

### subject_researcher_mcp-0.2.0.tar.gz (source distribution)

- Size: 1.4 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.18

| Algorithm | Hash digest |
|---|---|
| SHA256 | `cb20ebd2cf26721d9dd68860b9a6abc84e9b876f55fbbb92bd54f2dc7480d6df` |
| MD5 | `1b17a303dda111d07accd1dda2602c7d` |
| BLAKE2b-256 | `83a3a07849f6fa5a2ca41f4d562eef91a22ebcf2d1e3bfbd303c450091220b22` |
### subject_researcher_mcp-0.2.0-py3-none-any.whl (built distribution)

- Size: 29.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.18

| Algorithm | Hash digest |
|---|---|
| SHA256 | `19622d5fe21bba665f62d74fa202817420c6360e9cb4e07a8d101571f4954462` |
| MD5 | `ecd71e9686d530eeb3a0dd34bd9ef3bb` |
| BLAKE2b-256 | `18f7bc31b37bc153007a861737d038bfd697c940a752c5689d687ddb49d6c4bc` |