
MCP server for PubMed literature search with MeSH, PICO, and intelligent query expansion


PubMed Search MCP

PyPI version Python 3.10+ License: Apache 2.0 MCP Test Coverage

Professional Literature Research Assistant for AI Agents - More than just an API wrapper


A Domain-Driven Design (DDD) based MCP server that serves as an intelligent research assistant for AI agents, providing task-oriented literature search and analysis capabilities.

✨ What's Included:

  • 🔧 40 MCP Tools - Streamlined PubMed, Europe PMC, CORE, NCBI database access, and Research Timeline / Context Graph
  • 🖼️ OA Figure Extraction - Pull figure captions, direct image URLs, and PDF links from PMC Open Access articles
  • 📘 Docs Site - Browse overview, architecture, source contracts, quick reference, troubleshooting, and deployment in one place at docs/index.html
  • 📚 24 Claude Skills - Ready-to-use workflow guides for AI agents (Claude Code-specific)
  • 📖 Copilot Instructions - VS Code GitHub Copilot integration guide

🌐 Language: English | 繁體中文 (Traditional Chinese)


🚀 Quick Install

Prerequisites

  • Python 3.10+ (download from python.org)

  • uv (recommended), installed with one of the commands below

    # macOS / Linux
    curl -LsSf https://astral.sh/uv/install.sh | sh
    # Windows
    powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
    
  • NCBI Email - Required by NCBI API policy. Any valid email address.

  • NCBI API Key (optional) - Get one from your NCBI account settings (https://www.ncbi.nlm.nih.gov/account/settings/) for higher rate limits (10 req/s vs 3 req/s)

Install & Run

# Option 1: Zero-install with uvx (recommended for trying out)
uvx pubmed-search-mcp

# Option 2: Add as project dependency
uv add pubmed-search-mcp

# Option 3: pip install
pip install pubmed-search-mcp

⚙️ Configuration

This MCP server works with any MCP-compatible AI tool. Choose your preferred client:

VS Code / Cursor (.vscode/mcp.json)

{
  "servers": {
    "pubmed-search": {
      "type": "stdio",
      "command": "uvx",
      "args": ["pubmed-search-mcp"],
      "env": {
        "NCBI_EMAIL": "your@email.com"
      }
    }
  }
}

Claude Desktop (claude_desktop_config.json)

{
  "mcpServers": {
    "pubmed-search": {
      "command": "uvx",
      "args": ["pubmed-search-mcp"],
      "env": {
        "NCBI_EMAIL": "your@email.com"
      }
    }
  }
}

Config file location:

  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  • Windows: %APPDATA%\Claude\claude_desktop_config.json
  • Linux: ~/.config/Claude/claude_desktop_config.json

Claude Code

claude mcp add pubmed-search -- uvx pubmed-search-mcp

Or add to .mcp.json in your project root:

{
  "mcpServers": {
    "pubmed-search": {
      "command": "uvx",
      "args": ["pubmed-search-mcp"],
      "env": {
        "NCBI_EMAIL": "your@email.com"
      }
    }
  }
}

Zed AI (settings.json)

Zed editor (zed.dev) supports MCP servers natively. Add to your Zed settings.json:

{
  "context_servers": {
    "pubmed-search": {
      "command": "uvx",
      "args": ["pubmed-search-mcp"],
      "env": {
        "NCBI_EMAIL": "your@email.com"
      }
    }
  }
}

Tip: Open Command Palette → zed: open settings to edit, or go to Agent Panel → Settings → "Add Custom Server".

OpenClaw 🦞 (~/.openclaw/openclaw.json)

OpenClaw uses MCP servers via the mcp-adapter plugin. Install the adapter first:

openclaw plugins install mcp-adapter

Then add to ~/.openclaw/openclaw.json:

{
  "plugins": {
    "entries": {
      "mcp-adapter": {
        "enabled": true,
        "config": {
          "servers": [
            {
              "name": "pubmed-search",
              "transport": "stdio",
              "command": "uvx",
              "args": ["pubmed-search-mcp"],
              "env": {
                "NCBI_EMAIL": "your@email.com"
              }
            }
          ]
        }
      }
    }
  }
}

Restart the gateway after configuration:

openclaw gateway restart
openclaw plugins list  # Should show: mcp-adapter | loaded

Cline (cline_mcp_settings.json)

{
  "mcpServers": {
    "pubmed-search": {
      "command": "uvx",
      "args": ["pubmed-search-mcp"],
      "env": {
        "NCBI_EMAIL": "your@email.com"
      },
      "alwaysAllow": [],
      "disabled": false
    }
  }
}

Other MCP Clients

Any MCP-compatible client can use this server via stdio transport:

# Command
uvx pubmed-search-mcp

# With environment variable
NCBI_EMAIL=your@email.com uvx pubmed-search-mcp

Note: NCBI_EMAIL is required by NCBI API policy. Optionally set NCBI_API_KEY for higher rate limits (10 req/s vs 3 req/s).

📖 Detailed Integration Guides: See docs/INTEGRATIONS.md for all environment variables, Copilot Studio setup, Docker deployment, proxy configuration, and troubleshooting.


🎯 Design Philosophy

Core Positioning: The intelligent middleware between AI Agents and academic search engines.

Why This Server?

Other tools give you raw API access. We give you vocabulary translation + intelligent routing + research analysis:

| Challenge | Our Solution |
|---|---|
| Agent uses ICD codes, PubMed needs MeSH | ✅ Auto ICD→MeSH conversion |
| Multiple databases, different APIs | ✅ Unified Search single entry point |
| Clinical questions need structured search | ✅ PICO toolkit (parse_pico + generate_search_queries for Agent-driven workflow) |
| Typos in medical terms | ✅ ESpell auto-correction |
| Too many results from one source | ✅ Parallel multi-source with dedup |
| Need to trace research evolution | ✅ Research Timeline & Tree with landmark detection, diagnostics, and sub-topic branching |
| Citation context is unclear | ✅ Citation Tree forward/backward/network |
| Can't access full text | ✅ Multi-source fulltext (Europe PMC, CORE, CrossRef) |
| Gene/drug info scattered across DBs | ✅ NCBI Extended (Gene, PubChem, ClinVar) |
| Need cutting-edge preprints | ✅ Preprint search (arXiv, medRxiv, bioRxiv) with peer-review filtering |
| Export to reference managers | ✅ One-click export (RIS, BibTeX, CSV, MEDLINE) |

Key Differentiators

  1. Vocabulary Translation Layer - Agent speaks naturally, we translate to each database's terminology (MeSH, ICD-10, text-mined entities)
  2. Unified Search Gateway - One unified_search() call, auto-dispatch to PubMed/Europe PMC/CORE/OpenAlex
  3. PICO Toolkit - parse_pico() decomposes clinical questions into P/I/C/O elements; Agent then calls generate_search_queries() per element and builds Boolean query
  4. Research Timeline & Lineage Tree - Detect milestones with policy-driven heuristics, identify landmark papers via multi-signal scoring, surface timeline diagnostics, and visualize research evolution as branching trees by sub-topic
  5. Citation Network Analysis - Build multi-level citation trees to map an entire research landscape from a single paper
  6. Full Research Lifecycle - From search → discovery → full text → analysis → export, all in one server
  7. Agent-First Design - Output optimized for machine decision-making, not human reading

📡 External APIs & Data Sources

This MCP server integrates with multiple academic databases and APIs:

Core Data Sources

| Source | Coverage | Vocabulary | Auto-Convert | Description |
|---|---|---|---|---|
| NCBI PubMed | 36M+ articles | MeSH | ✅ Native | Primary biomedical literature |
| NCBI Entrez | Multi-DB | MeSH | ✅ Native | Gene, PubChem, ClinVar |
| Europe PMC | 33M+ | Text-mined | ✅ Extraction | Full text XML access |
| CORE | 200M+ | None | ➡️ Free-text | Open access aggregator |
| Semantic Scholar | 200M+ | S2 Fields | ➡️ Free-text | AI-powered recommendations |
| OpenAlex | 250M+ | Concepts | ➡️ Free-text | Open scholarly metadata |
| NIH iCite | PubMed | N/A | N/A | Citation metrics (RCR) |

🔑 Key: ✅ = Full vocabulary support | ➡️ = Query pass-through (no controlled vocabulary)

ICD Codes: Auto-detected and converted to MeSH before PubMed search
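
To make the ICD step concrete, here is a minimal sketch of how codes might be detected in a query and swapped for MeSH terms. It is not the server's implementation: the regular expression is a simplified ICD-10 shape, the two-entry mapping only covers the codes used as examples on this page, and expand_icd_codes is a hypothetical name.

```python
import re

# Tiny illustrative mapping; the server's real conversion table is not shown here.
ICD10_TO_MESH = {
    "I10": "Hypertension",
    "E11.9": "Diabetes Mellitus, Type 2",
}

# Simplified ICD-10 shape (assumption): one letter, two digits, optional ".d" or ".dd".
ICD10_PATTERN = re.compile(r"\b[A-Z]\d{2}(?:\.\d{1,2})?\b")


def expand_icd_codes(query: str) -> str:
    """Replace known ICD-10 codes in a query with quoted MeSH terms."""

    def replace(match: re.Match) -> str:
        code = match.group(0)
        mesh = ICD10_TO_MESH.get(code)
        # Unknown codes are left untouched rather than guessed.
        return f'"{mesh}"[MeSH]' if mesh else code

    return ICD10_PATTERN.sub(replace, query)


if __name__ == "__main__":
    print(expand_icd_codes("I10 treatment in E11.9 patients"))
```

Leaving unknown codes unchanged keeps the query searchable as free text even when no mapping exists.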

Environment Variables

# Required
NCBI_EMAIL=your@email.com          # Required by NCBI policy

# Optional - For higher rate limits
NCBI_API_KEY=your_ncbi_api_key     # Get from: https://www.ncbi.nlm.nih.gov/account/settings/
CORE_API_KEY=your_core_api_key     # Get from: https://core.ac.uk/services/api
S2_API_KEY=your_s2_api_key         # Get from: https://www.semanticscholar.org/product/api

# Optional - Network settings
HTTP_PROXY=http://proxy:8080       # HTTP proxy for API requests
HTTPS_PROXY=https://proxy:8080     # HTTPS proxy for API requests
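
The variables above could be read at startup roughly as follows. This is a sketch under stated assumptions, not the server's actual loader: NcbiSettings and load_ncbi_settings are hypothetical names, and the only facts taken from this page are that NCBI_EMAIL is required and that an API key raises the limit from 3 to 10 requests per second.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class NcbiSettings:
    """Hypothetical container for the NCBI-related settings."""

    email: str
    api_key: Optional[str]
    max_requests_per_second: int


def load_ncbi_settings(env: Mapping[str, str] = os.environ) -> NcbiSettings:
    """Read NCBI settings from an environment-like mapping (illustrative only)."""
    email = env.get("NCBI_EMAIL", "").strip()
    if not email:
        raise ValueError("NCBI_EMAIL is required by NCBI API policy")
    api_key = env.get("NCBI_API_KEY", "").strip() or None
    # Limits as stated on this page: 10 req/s with a key, 3 req/s without.
    limit = 10 if api_key else 3
    return NcbiSettings(email=email, api_key=api_key, max_requests_per_second=limit)


if __name__ == "__main__":
    settings = load_ncbi_settings({"NCBI_EMAIL": "your@email.com"})
    print(settings.max_requests_per_second)
```

Passing the environment as a mapping keeps the function easy to test without touching the real process environment.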

🔄 How It Works: The Middleware Architecture

┌────────────────────────────────────────────────────────────────────────────┐
│                                  AI AGENT                                  │
│                                                                            │
│   "Find papers about I10 hypertension treatment in diabetic patients"      │
│                                                                            │
└─────────────────────────────────┬──────────────────────────────────────────┘
                                  │
                                  ▼
┌────────────────────────────────────────────────────────────────────────────┐
│                     🔄 PUBMED SEARCH MCP (MIDDLEWARE)                      │
│  ┌────────────────────────────────────────────────────────────────────────┐│
│  │  1️⃣ VOCABULARY TRANSLATION                                             ││
│  │     • ICD-10 "I10" → MeSH "Hypertension"                               ││
│  │     • "diabetic" → MeSH "Diabetes Mellitus"                            ││
│  │     • ESpell: "hypertention" → "hypertension"                          ││
│  └────────────────────────────────────────────────────────────────────────┘│
│  ┌────────────────────────────────────────────────────────────────────────┐│
│  │  2️⃣ INTELLIGENT ROUTING                                                ││
│  │     ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐             ││
│  │     │ PubMed   │  │Europe PMC│  │   CORE   │  │ OpenAlex │             ││
│  │     │  36M+    │  │   33M+   │  │  200M+   │  │  250M+   │             ││
│  │     │  (MeSH)  │  │(fulltext)│  │  (OA)    │  │(metadata)│             ││
│  │     └────┬─────┘  └────┬─────┘  └────┬─────┘  └────┬─────┘             ││
│  │          └─────────────┴─────────────┴─────────────┘                   ││
│  │                              ▼                                         ││
│  │  3️⃣ RESULT AGGREGATION: Dedupe + Rank + Enrich                         ││
│  └────────────────────────────────────────────────────────────────────────┘│
└─────────────────────────────────┬──────────────────────────────────────────┘
                                  │
                                  ▼
┌────────────────────────────────────────────────────────────────────────────┐
│                              UNIFIED RESULTS                               │
│   • 150 unique papers (deduplicated from 4 sources)                        │
│   • Ranked by relevance + citation impact (RCR)                            │
│   • Full text links enriched from Europe PMC                               │
└────────────────────────────────────────────────────────────────────────────┘
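
The aggregation stage in the diagram merges records that several sources return for the same paper. A minimal sketch of that idea, assuming records are plain dicts with optional pmid and doi fields and a source label (these field names and dedupe_articles itself are illustrative assumptions, not the server's data model):

```python
from typing import Iterable


def dedupe_articles(articles: Iterable[dict]) -> list[dict]:
    """Merge records that share a PMID or DOI, keeping first-seen order."""
    merged: list[dict] = []
    index: dict[str, dict] = {}  # identifier key -> merged record

    for article in articles:
        keys = []
        if article.get("pmid"):
            keys.append(f"pmid:{article['pmid']}")
        if article.get("doi"):
            # DOIs are case-insensitive, so compare them in lowercase.
            keys.append(f"doi:{article['doi'].lower()}")

        existing = next((index[k] for k in keys if k in index), None)
        if existing is None:
            existing = {**article, "sources": [article["source"]]}
            merged.append(existing)
        else:
            if article["source"] not in existing["sources"]:
                existing["sources"].append(article["source"])
            # Fill in fields that the first-seen record did not have.
            for field, value in article.items():
                existing.setdefault(field, value)

        for k in keys:
            index[k] = existing

    return merged


if __name__ == "__main__":
    demo = [
        {"source": "pubmed", "pmid": "1", "doi": "10.1/ABC", "title": "A"},
        {"source": "openalex", "doi": "10.1/abc", "title": "A"},
    ]
    print(dedupe_articles(demo)[0]["sources"])
```

Records without any identifier are never merged in this sketch; a real system would likely fall back to fuzzy title matching.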

🛠️ MCP Tools Overview

🔍 Search & Query Intelligence

┌──────────────────────────────────────────────────────────────────┐
│                      SEARCH ENTRY POINT                          │
├──────────────────────────────────────────────────────────────────┤
│                                                                  │
│   unified_search()          ← 🌟 Single entry for all sources    │
│        │                                                         │
│        ├── Quick search     → Direct multi-source query          │
│        ├── PICO hints       → Detects comparison, shows P/I/C/O  │
│        └── ICD expansion    → Auto ICD→MeSH conversion           │
│                                                                  │
│   Sources: PubMed · Europe PMC · CORE · OpenAlex                 │
│   Auto: Deduplicate → Rank → Enrich full-text links              │
│                                                                  │
├──────────────────────────────────────────────────────────────────┤
│   QUERY INTELLIGENCE                                             │
│                                                                  │
│   generate_search_queries() → MeSH expansion + synonym discovery │
│   parse_pico()              → PICO element decomposition         │
│   analyze_search_query()    → Query analysis without execution   │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘

🔬 Discovery Tools (After Finding Key Papers)

                        Found important paper (PMID)
                                   │
           ┌───────────────────────┼───────────────────────┐
           │                       │                       │
           ▼                       ▼                       ▼
    ┌─────────────┐        ┌─────────────┐        ┌─────────────┐
    │  BACKWARD   │        │  SIMILAR    │        │  FORWARD    │
    │  ◀──────    │        │  ≈≈≈≈≈≈     │        │  ──────▶    │
    │             │        │             │        │             │
    │ get_article │        │find_related │        │find_citing  │
    │ _references │        │ _articles   │        │ _articles   │
    │             │        │             │        │             │
    │ Foundation  │        │  Similar    │        │ Follow-up   │
    │  papers     │        │   topic     │        │  research   │
    └─────────────┘        └─────────────┘        └─────────────┘

    fetch_article_details()   → Detailed article metadata
    get_citation_metrics()    → iCite RCR, citation percentile
    build_citation_tree()     → Full network visualization (6 formats)

📚 Full Text, Figure Extraction & Export

| Category | Tools |
|---|---|
| Full Text | get_fulltext → Multi-source retrieval (Europe PMC, CORE, PubMed, CrossRef) |
| Figures | get_article_figures → Extract figure labels, captions, image URLs, and PDF links from PMC Open Access articles |
| Figure-aware Full Text | get_fulltext(include_figures=True) → Embed figure metadata alongside structured fulltext |
| Text Mining | get_text_mined_terms → Extract genes, diseases, chemicals |
| Export | prepare_export → RIS, BibTeX, CSV, MEDLINE, JSON |

🖼️ OA Figure-First Exploration

Use the PMC Open Access path when an agent needs evidence figures, not just article text:

  • get_article_figures(identifier="PMC12086443") → Figure labels, captions, image URLs, and PDF/article links
  • get_fulltext(pmcid="PMC7096777", include_figures=True) → Structured fulltext with figures inline
  • Figure output preserves article context, so agents can connect each figure back to the sections where it is mentioned

🧬 NCBI Extended Databases

| Tool | Description |
|---|---|
| search_gene | Search NCBI Gene database |
| get_gene_details | Gene details by NCBI Gene ID |
| get_gene_literature | PubMed articles linked to a gene |
| search_compound | Search PubChem compounds |
| get_compound_details | Compound details by PubChem CID |
| get_compound_literature | PubMed articles linked to a compound |
| search_clinvar | Search ClinVar clinical variants |

🕰️ Research Timeline & Lineage Tree

| Tool | Description |
|---|---|
| build_research_timeline | Build timeline/tree with landmark detection and formatted diagnostics. Output: text, tree, mermaid, mindmap, json |
| analyze_timeline_milestones | Analyze milestone distribution with diagnostics payload |
| compare_timelines | Compare multiple topic timelines with per-topic diagnostics |

🏥 Institutional Access & ICD Conversion

| Tool | Description |
|---|---|
| configure_institutional_access | Configure institution's link resolver |
| get_institutional_link | Generate OpenURL access link |
| list_resolver_presets | List resolver presets |
| test_institutional_access | Test resolver configuration |
| convert_icd_mesh | Convert between ICD codes and MeSH terms (bidirectional) |
| unified_search | Auto-detect ICD codes in queries and expand them to MeSH |

💾 Session Management

| Tool | Description |
|---|---|
| get_session_pmids | Retrieve cached PMID lists |
| get_cached_article | Get article from session cache (no API cost) |
| get_session_summary | Session status overview |

Dynamic MCP resources are also available for agents that can read resources directly:

  • session://context - active session status
  • session://last-search - latest search metadata
  • session://last-search/pmids - latest PMID list + CSV form
  • session://last-search/results - cached article payloads for the latest search

🔁 Pipeline Management

| Tool | Description |
|---|---|
| save_pipeline | Save a pipeline config for later reuse (YAML/JSON, auto-validated) |
| list_pipelines | List saved pipelines (filter by tag/scope) |
| load_pipeline | Load pipeline from name or file for review/editing |
| delete_pipeline | Delete pipeline and its execution history |
| get_pipeline_history | View execution history with article diff analysis |
| schedule_pipeline | Schedule periodic execution (Phase 4) |

👁️ Vision & Image Search

| Tool | Description |
|---|---|
| analyze_figure_for_search | Analyze scientific figure for search |
| search_biomedical_images | Search biomedical images across Open-i (X-ray, microscopy, photos, diagrams) |

📄 Preprint Search

Search arXiv, medRxiv, and bioRxiv preprint servers via unified_search options flags:

  • preprints: Enable dedicated preprint search and show results in a separate section.
  • all_types: Keep non-peer-reviewed content in main aggregated results.

Recommended combinations:

  • Empty options: Peer-reviewed results only.
  • options="preprints": Peer-reviewed main results plus a separate preprint section.
  • options="preprints, all_types": Separate preprint section plus non-peer-reviewed content retained in main results.
  • options="all_types": No dedicated preprint crawl, but non-peer-reviewed items from searched sources are retained.
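
The four combinations above follow from two independent flags in a comma-separated options string. A small sketch of how such a string could be interpreted (parse_options and describe_mode are hypothetical helpers; only the flag names preprints and all_types come from this page):

```python
def parse_options(options: str = "") -> set[str]:
    """Split a comma-separated options string into a set of flag names."""
    return {flag.strip().lower() for flag in options.split(",") if flag.strip()}


def describe_mode(options: str = "") -> dict[str, bool]:
    """Map the two preprint-related flags to the behaviours described above."""
    flags = parse_options(options)
    return {
        # "preprints": run the dedicated preprint search, shown in its own section.
        "dedicated_preprint_search": "preprints" in flags,
        # "all_types": keep non-peer-reviewed items in the main results.
        "keep_non_peer_reviewed_in_main": "all_types" in flags,
    }


if __name__ == "__main__":
    for opts in ("", "preprints", "preprints, all_types", "all_types"):
        print(repr(opts), describe_mode(opts))
```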

Preprint detection: articles are identified as preprints by:

  • Article type from source API (OpenAlex, CrossRef, Semantic Scholar)
  • arXiv ID present without PubMed ID
  • Known preprint server source or journal name
  • DOI prefix matching preprint servers (e.g., 10.1101/ → bioRxiv/medRxiv, 10.48550/ → arXiv)
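
The four signals listed above can be combined into a single predicate. The sketch below is one possible reading, not the server's code: the record field names (article_type, arxiv_id, pmid, source, journal, doi) and the function name is_preprint are assumptions, while the DOI prefixes and server names are the ones given in the list.

```python
# DOI prefixes from the list above: bioRxiv/medRxiv and arXiv.
PREPRINT_DOI_PREFIXES = ("10.1101/", "10.48550/")
PREPRINT_SERVERS = {"arxiv", "medrxiv", "biorxiv"}


def is_preprint(article: dict) -> bool:
    """Heuristic preprint check over a dict-shaped record (field names assumed)."""
    # 1. Article type reported by the source API.
    if (article.get("article_type") or "").lower() == "preprint":
        return True
    # 2. arXiv ID present without a PubMed ID.
    if article.get("arxiv_id") and not article.get("pmid"):
        return True
    # 3. Known preprint server as source or journal name.
    for field in ("source", "journal"):
        if (article.get(field) or "").lower() in PREPRINT_SERVERS:
            return True
    # 4. DOI prefix registered to a preprint server.
    doi = (article.get("doi") or "").lower()
    return doi.startswith(PREPRINT_DOI_PREFIXES)


if __name__ == "__main__":
    print(is_preprint({"doi": "10.48550/arXiv.2101.00001"}))
```

Checks are ordered from the most explicit signal to the weakest, so a DOI prefix only decides the outcome when nothing else applies.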

🌳 Research Context Graph

unified_search can append a lightweight research lineage view built from PMID-backed ranked results:

| Option Flag | Description |
|---|---|
| context_graph | Append a Research Context Graph preview to Markdown output and include research_context in JSON output |

This is useful when an agent needs quick thematic branching without making a second build_research_timeline call.

📊 Count-First Orientation

unified_search can also front-load the existing source coverage and decision hints for agents that want routing help before reading the ranked list:

| Option Flag | Description |
|---|---|
| counts_first | Add a source-count table, coverage summary, and next-tool recommendations to the response |

Example:

unified_search(query="remimazolam ICU sedation", options="counts_first")

This mode is useful when the agent should decide whether to expand a source, inspect the lead PMID, fetch fulltext, extract figures, or pivot into timeline exploration.

⏱️ MCP Progress Reporting

When the MCP client provides a progress token, unified_search, build_research_timeline, analyze_timeline_milestones, compare_timelines, get_fulltext, and get_text_mined_terms emit progress updates for their major phases. This reduces the "black box" wait time for agents during longer searches.


📋 Agent Usage Examples

1️⃣ Quick Search (Simplest)

# Agent just asks naturally - middleware handles everything
unified_search(query="remimazolam ICU sedation", limit=20)

# Or with clinical codes - auto-converted to MeSH
unified_search(query="I10 treatment in E11.9 patients")
#                     ↑ ICD-10         ↑ ICD-10
#                     Hypertension     Type 2 Diabetes

2️⃣ PICO Clinical Question

Simple path: unified_search can search directly (no PICO decomposition):

# unified_search searches as-is; detects "A vs B" pattern and shows PICO hints in metadata
unified_search(query="Is remimazolam better than propofol for ICU sedation?")
# → Multi-source keyword search + PICO hint metadata in output
# ⚠️ This does NOT auto-decompose PICO or expand MeSH!
# For structured PICO search, use the Agent workflow below

Agent workflow: PICO decomposition + MeSH expansion (recommended for clinical questions):

┌──────────────────────────────────────────────────────────────────────────┐
│  "Is remimazolam better than propofol for ICU sedation?"                 │
└─────────────────────────────────┬────────────────────────────────────────┘
                                  │
                                  ▼
┌──────────────────────────────────────────────────────────────────────────┐
│                         parse_pico()                                     │
│  ┌─────────┐  ┌─────────┐  ┌─────────┐  ┌─────────┐                      │
│  │    P    │  │    I    │  │    C    │  │    O    │                      │
│  │  ICU    │  │remimaz- │  │propofol │  │sedation │                      │
│  │patients │  │  olam   │  │         │  │outcomes │                      │
│  └────┬────┘  └────┬────┘  └────┬────┘  └────┬────┘                      │
└───────┼────────────┼────────────┼────────────┼───────────────────────────┘
        │            │            │            │
        ▼            ▼            ▼            ▼
┌──────────────────────────────────────────────────────────────────────────┐
│              generate_search_queries() × 4 (parallel)                    │
│                                                                          │
│  P → "Intensive Care Units"[MeSH]                                        │
│  I → "remimazolam" [Supplementary Concept], "CNS 7056"                   │
│  C → "Propofol"[MeSH], "Diprivan"                                        │
│  O → "Conscious Sedation"[MeSH], "Deep Sedation"[MeSH]                   │
└─────────────────────────────────┬────────────────────────────────────────┘
                                  │
                                  ▼
┌──────────────────────────────────────────────────────────────────────────┐
│              Agent combines with Boolean logic                           │
│                                                                          │
│  (P) AND (I) AND (C) AND (O)  ← High precision                           │
│  (P) AND (I OR C) AND (O)     ← High recall                              │
└─────────────────────────────────┬────────────────────────────────────────┘
                                  │
                                  ▼
┌──────────────────────────────────────────────────────────────────────────┐
│              unified_search() (auto multi-source + dedup)                │
│                                                                          │
│  PubMed + Europe PMC + CORE + OpenAlex → Auto deduplicate & rank         │
└──────────────────────────────────────────────────────────────────────────┘

# Step 1: Parse clinical question
parse_pico("Is remimazolam better than propofol for ICU sedation?")
# Returns: P=ICU patients, I=remimazolam, C=propofol, O=sedation outcomes

# Step 2: Get MeSH for each element (parallel!)
generate_search_queries(topic="ICU patients")   # P
generate_search_queries(topic="remimazolam")    # I
generate_search_queries(topic="propofol")       # C
generate_search_queries(topic="sedation")       # O

# Step 3: Agent combines with Boolean
query = '("Intensive Care Units"[MeSH]) AND (remimazolam OR "CNS 7056") AND propofol AND sedation'

# Step 4: Search (auto multi-source, dedup, rank)
unified_search(query=query)
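
Step 3 is left to the agent. As a sketch of what that combination step could look like, the helper below joins the synonyms for each PICO element with OR and then builds the two Boolean shapes named in the workflow (high precision and high recall). The function names are hypothetical and the term lists are the ones shown in the diagram.

```python
def or_group(terms: list[str]) -> str:
    """Join synonyms for one PICO element into a parenthesised OR group."""
    return "(" + " OR ".join(terms) + ")"


def build_pico_queries(p: list[str], i: list[str], c: list[str], o: list[str]) -> dict[str, str]:
    """Build the two Boolean shapes from the workflow above."""
    pg, ig, cg, og = (or_group(terms) for terms in (p, i, c, o))
    return {
        # (P) AND (I) AND (C) AND (O): every element must match.
        "high_precision": f"{pg} AND {ig} AND {cg} AND {og}",
        # (P) AND (I OR C) AND (O): either intervention or comparator suffices.
        "high_recall": f"{pg} AND ({ig} OR {cg}) AND {og}",
    }


if __name__ == "__main__":
    queries = build_pico_queries(
        p=['"Intensive Care Units"[MeSH]'],
        i=["remimazolam", '"CNS 7056"'],
        c=['"Propofol"[MeSH]', "Diprivan"],
        o=['"Conscious Sedation"[MeSH]', '"Deep Sedation"[MeSH]'],
    )
    for name, query in queries.items():
        print(f"{name}: {query}")
```

Either string can then be passed as the query argument of unified_search, as in Step 4.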

3️⃣ Explore from Key Paper

# Found landmark paper PMID: 33475315
find_related_articles(pmid="33475315")   # Similar methodology
find_citing_articles(pmid="33475315")    # Who built on this?
get_article_references(pmid="33475315")  # What's the foundation?

# Build complete research map
build_citation_tree(pmid="33475315", depth=2, output_format="mermaid")
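
To illustrate what a depth-limited citation tree involves, here is a conceptual sketch: a breadth-first walk over citing papers, rendered as Mermaid text. It is not how build_citation_tree is implemented. The lookup function is passed in as a parameter (a stub dictionary in the demo), and build_tree and to_mermaid are hypothetical names.

```python
from collections import deque
from typing import Callable


def build_tree(root: str, get_citing: Callable[[str], list[str]], depth: int = 2) -> dict[str, list[str]]:
    """Breadth-first walk of citing papers, expanding nodes above `depth` levels."""
    edges: dict[str, list[str]] = {}
    seen = {root}
    queue = deque([(root, 0)])
    while queue:
        pmid, level = queue.popleft()
        if level >= depth:
            continue  # nodes at the depth limit are kept as leaves
        children = get_citing(pmid)
        edges[pmid] = children
        for child in children:
            if child not in seen:  # avoid revisiting papers reachable twice
                seen.add(child)
                queue.append((child, level + 1))
    return edges


def to_mermaid(edges: dict[str, list[str]]) -> str:
    """Render parent -> child edges as a Mermaid flowchart."""
    lines = ["graph TD"]
    for parent, children in edges.items():
        for child in children:
            lines.append(f"    P{parent} --> P{child}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Stub citation data standing in for a real lookup such as find_citing_articles.
    fake = {"1": ["2", "3"], "2": ["4"], "3": [], "4": ["5"]}
    print(to_mermaid(build_tree("1", lambda pmid: fake.get(pmid, []), depth=2)))
```

The seen set matters in practice, because citation graphs often reach the same paper along several paths.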

4️⃣ Gene/Drug Research

# Research a gene
search_gene(query="BRCA1", organism="human")
get_gene_literature(gene_id="672", limit=20)

# Research a drug compound
search_compound(query="propofol")
get_compound_literature(cid="4943", limit=20)

5️⃣ Export Results

# Export last search results
prepare_export(pmids="last", format="ris")      # → EndNote/Zotero
prepare_export(pmids="last", format="bibtex")   # → LaTeX

# Retrieve full text for a selected paper from the last search
get_fulltext(pmid="12345678", extended_sources=True)
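
For readers unfamiliar with the RIS format that reference managers import, the sketch below shows the general shape of one record using common RIS tags (TY, AU, TI, JO, PY, DO, ER). It is an illustration only: the exact fields that prepare_export emits are not documented on this page, and to_ris and its input field names are assumptions.

```python
def to_ris(article: dict) -> str:
    """Format one dict-shaped article as a minimal RIS journal record."""
    lines = ["TY  - JOUR"]  # record type: journal article
    for author in article.get("authors", []):
        lines.append(f"AU  - {author}")
    for tag, field in (("TI", "title"), ("JO", "journal"), ("PY", "year"), ("DO", "doi")):
        if article.get(field):
            lines.append(f"{tag}  - {article[field]}")
    lines.append("ER  - ")  # end-of-record marker
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_ris({"title": "Example title", "authors": ["Doe, J"], "year": 2020}))
```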

6️⃣ Preprint Search

# Include preprints alongside peer-reviewed results
unified_search(query="COVID-19 vaccine efficacy", options="preprints")
# → Main results (peer-reviewed) + Separate preprint section (arXiv, medRxiv, bioRxiv)

# Include preprints and retain non-peer-reviewed items in main results
unified_search(query="CRISPR gene therapy", options="preprints, all_types")
# → Separate preprint section + non-peer-reviewed items retained in main results

# Only peer-reviewed (default behavior)
unified_search("diabetes treatment")
# → Preprints from any source automatically filtered out

# Add a research context graph preview to the same search response
unified_search("remimazolam ICU sedation", options="context_graph")

7️⃣ Pipeline (Reusable Search Plans)

# Save a template-based pipeline
save_pipeline(
    name="icu_sedation_weekly",
    config="template: pico\nparams:\n  P: ICU patients\n  I: remimazolam\n  C: propofol\n  O: delirium",
    tags="anesthesia,sedation",
    description="Weekly ICU sedation monitoring"
)

# Save a custom DAG pipeline
save_pipeline(
    name="brca1_comprehensive",
    config="""
steps:
  - id: expand
    action: expand
    params: { topic: BRCA1 breast cancer }
  - id: pubmed
    action: search
    params: { query: BRCA1, sources: pubmed, limit: 50 }
  - id: expanded
    action: search
    inputs: [expand]
    params: { strategy: mesh, sources: "pubmed,openalex", limit: 50 }
  - id: merged
    action: merge
    inputs: [pubmed, expanded]
    params: { method: rrf }
  - id: enriched
    action: metrics
    inputs: [merged]
output:
  limit: 30
  ranking: quality
"""
)

# Execute a saved pipeline
unified_search(pipeline="saved:icu_sedation_weekly")

# List & manage
list_pipelines(tag="anesthesia")
load_pipeline(source="brca1_comprehensive")  # Review YAML
get_pipeline_history(name="icu_sedation_weekly")  # View past runs
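
The merge step in the custom pipeline above sets method: rrf. Assuming this refers to reciprocal rank fusion (this page does not spell it out), the technique scores each item by summing 1 / (k + rank) over every ranked list it appears in. The sketch below uses k = 60, a commonly used default rather than a value taken from this project.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked ID lists into one list ordered by RRF score."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    # Highest score first; ties are broken by identifier for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))


if __name__ == "__main__":
    pubmed_hits = ["a", "b", "c"]
    expanded_hits = ["b", "c", "a"]
    print(reciprocal_rank_fusion([pubmed_hits, expanded_hits]))
```

Because only ranks are used, the method needs no score normalisation across sources, which makes it a convenient choice when the inputs come from different search engines.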

🔍 Search Mode Comparison

┌──────────────────────────────────────────────────────────────────────────┐
│                        SEARCH MODE DECISION TREE                         │
├──────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│   "What kind of search do I need?"                                       │
│         │                                                                │
│         ├── Know exactly what to search?                                 │
│         │   └── unified_search(query="topic keywords")                   │
│         │       → Quick, auto-routing to best sources                    │
│         │                                                                │
│         ├── Have a clinical question (A vs B)?                           │
│         │   └── parse_pico() → generate_search_queries() × N             │
│         │       → Agent builds Boolean → unified_search()                │
│         │                                                                │
│         ├── Need comprehensive systematic coverage?                      │
│         │   └── generate_search_queries() → parallel search              │
│         │       → MeSH expansion, multiple strategies, merge             │
│         │                                                                │
│         └── Exploring from a key paper?                                  │
│             └── find_related/citing/references → build_citation_tree     │
│                 → Citation network, research context                     │
│                                                                          │
└──────────────────────────────────────────────────────────────────────────┘
| Mode | Entry Point | Best For | Auto-Features |
|------|-------------|----------|---------------|
| Quick | unified_search() | Fast topic search | ICD→MeSH, multi-source, dedup |
| PICO | parse_pico() → Agent | Clinical questions | Agent: decompose → MeSH expand → Boolean |
| Systematic | generate_search_queries() | Literature reviews | MeSH expansion, synonyms |
| Exploration | find_*_articles() | From key paper | Citation network, related |

🤖 Claude Skills (AI Agent Workflows)

Pre-built workflow guides in .claude/skills/, divided into Usage Skills (for using the MCP server) and Development Skills (for maintaining the project):

📚 Usage Skills (10) - For AI Agents Using This MCP Server

| Skill | Description |
|-------|-------------|
| pubmed-quick-search | Basic search with filters |
| pubmed-systematic-search | MeSH expansion, comprehensive |
| pubmed-pico-search | Clinical question decomposition |
| pubmed-paper-exploration | Citation tree, related articles |
| pubmed-gene-drug-research | Gene/PubChem/ClinVar |
| pubmed-fulltext-access | Europe PMC, CORE full text |
| pubmed-export-citations | RIS/BibTeX/CSV export |
| pubmed-multi-source-search | Cross-database unified search |
| pubmed-mcp-tools-reference | Complete tool reference guide |
| pipeline-persistence | Save, load, reuse search plans |

🔧 Development Skills (13) - For Project Contributors

| Skill | Description |
|-------|-------------|
| changelog-updater | Auto-update CHANGELOG.md |
| code-refactor | DDD architecture refactoring |
| code-reviewer | Code quality & security review |
| ddd-architect | DDD scaffold for new features |
| git-doc-updater | Sync docs before commits |
| git-precommit | Pre-commit workflow orchestration |
| memory-checkpoint | Save context to Memory Bank |
| memory-updater | Update Memory Bank files |
| project-init | Initialize new projects |
| readme-i18n | Multilingual README sync |
| readme-updater | Sync README with code changes |
| roadmap-updater | Update ROADMAP.md status |
| test-generator | Generate test suites |

📁 Location: .claude/skills/*/SKILL.md (Claude Code-specific, and the single source of truth for repo skills). Do not mirror or split repo skills into .github/skills/. These repo skills are project-scoped and should remain version-controlled. Personal cross-project skills belong in a user directory such as ~/.copilot/skills/ or ~/.claude/skills/, not in this repository.


🏗️ Architecture (DDD)

This project uses Domain-Driven Design (DDD) architecture, with literature research domain knowledge as the core model.

src/pubmed_search/
├── domain/                     # Core business logic
│   └── entities/article.py     # UnifiedArticle, Author, etc.
├── application/                # Use cases
│   ├── search/                 # QueryAnalyzer, ResultAggregator
│   ├── export/                 # Citation export (RIS, BibTeX...)
│   └── session/                # SessionManager
├── infrastructure/             # External systems
│   ├── ncbi/                   # Entrez, iCite, Citation Exporter
│   ├── sources/                # Europe PMC, CORE, CrossRef...
│   └── http/                   # HTTP clients
├── presentation/               # User interfaces
│   ├── mcp_server/             # MCP tools, prompts, resources
│   │   └── tools/              # discovery, strategy, pico, export...
│   └── api/                    # REST API (Copilot Studio)
└── shared/                     # Cross-cutting concerns
    ├── exceptions.py           # Unified error handling
    └── async_utils.py          # Rate limiter, retry, circuit breaker

Internal Mechanisms (Transparent to Agent)

| Mechanism | Description |
|-----------|-------------|
| Session | Auto-create, auto-switch |
| Cache | Auto-cache search results, avoid duplicate API calls |
| Rate Limit | Auto-comply with NCBI API limits (0.34s/0.1s) |
| MeSH Lookup | generate_search_queries() auto-queries NCBI MeSH database |
| ESpell | Auto spelling correction (remifentanyl → remifentanil) |
| Query Analysis | Each suggested query shows how PubMed actually interprets it |
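As a rough illustration of the rate-limit mechanism, the sketch below enforces a minimum gap between requests. It assumes that 0.34 s is the interval used without an NCBI API key (about 3 req/s) and 0.1 s the interval with one (10 req/s). `MinIntervalLimiter` is a hypothetical name; the project's real limiter lives in `shared/async_utils.py` and may work differently.

```python
import time


class MinIntervalLimiter:
    """Enforce a minimum gap between consecutive requests."""

    def __init__(self, has_api_key=False, clock=time.monotonic, sleep=time.sleep):
        # Assumed mapping: 0.1 s with an API key, 0.34 s without one.
        self.interval = 0.1 if has_api_key else 0.34
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        """Sleep just long enough to respect the interval, then record the call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


limiter = MinIntervalLimiter(has_api_key=True)
for _ in range(3):
    limiter.wait()  # call this before each NCBI request
```

The clock and sleep functions are injectable so the limiter can be tested without real delays.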

Vocabulary Translation Layer (Key Feature)

Our Core Value: This server is the intelligent middleware between the agent and the search engines. It handles vocabulary standardization automatically, so the agent does not need to know each database's terminology.

Different data sources use different controlled vocabulary systems. This server provides automatic conversion:

| API / Database | Vocabulary System | Auto-Conversion |
|----------------|-------------------|-----------------|
| PubMed / NCBI | MeSH (Medical Subject Headings) | ✅ Full support via expand_with_mesh() |
| ICD Codes | ICD-10-CM / ICD-9-CM | ✅ Auto-detect & convert to MeSH |
| Europe PMC | Text-mined entities (Gene, Disease, Chemical) | ✅ get_text_mined_terms() extraction |
| OpenAlex | OpenAlex Concepts (deprecated) | ❌ Free-text only |
| Semantic Scholar | S2 Field of Study | ❌ Free-text only |
| CORE | None | ❌ Free-text only |
| CrossRef | None | ❌ Free-text only |

Automatic ICD → MeSH Conversion

When searching with ICD codes (e.g., I10 for Hypertension), unified_search() automatically:

  1. Detects ICD-10/ICD-9 patterns via detect_and_expand_icd_codes()
  2. Looks up corresponding MeSH terms from internal mapping (ICD10_TO_MESH, ICD9_TO_MESH)
  3. Expands query with MeSH synonyms for comprehensive search
# Agent calls unified_search with clinical terminology
unified_search(query="I10 treatment outcomes")

# Server auto-expands to PubMed-compatible query
"(I10 OR Hypertension[MeSH]) treatment outcomes"
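The detect, look up, and expand steps can be sketched in a few lines. This is an illustrative stand-in, not the server's `detect_and_expand_icd_codes()`: the function `expand_icd_codes`, the regular expression, and the two-entry `DEMO_ICD10_TO_MESH` table are assumptions made for the example.

```python
import re

# Tiny illustrative mapping; the real ICD10_TO_MESH table is assumed to be much larger.
DEMO_ICD10_TO_MESH = {"I10": "Hypertension", "J45": "Asthma"}

# Simplified ICD-10 shape: one capital letter, two digits, optional ".x" subdivision.
ICD10_PATTERN = re.compile(r"\b[A-Z]\d{2}(?:\.\d{1,4})?\b")


def expand_icd_codes(query):
    """Wrap every recognized ICD-10 code in an OR group with its MeSH term."""

    def replace(match):
        code = match.group(0)
        mesh = DEMO_ICD10_TO_MESH.get(code)
        # Codes without a mapping are left untouched.
        return f"({code} OR {mesh}[MeSH])" if mesh else code

    return ICD10_PATTERN.sub(replace, query)


print(expand_icd_codes("I10 treatment outcomes"))
# (I10 OR Hypertension[MeSH]) treatment outcomes
```

Keeping the original code inside the OR group means records that mention the code literally are still matched.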

📖 Full architecture documentation: ARCHITECTURE.md

MeSH Auto-Expansion + Query Analysis

When calling generate_search_queries("remimazolam sedation"), internally it:

  1. ESpell Correction - Fix spelling errors
  2. MeSH Query - Entrez.esearch(db="mesh") to get standard vocabulary
  3. Synonym Extraction - Get synonyms from MeSH Entry Terms
  4. Query Analysis - Analyze how PubMed interprets each query
{
  "mesh_terms": [
    {
      "input": "remimazolam",
      "preferred": "remimazolam [Supplementary Concept]",
      "synonyms": ["CNS 7056", "ONO 2745"]
    }
  ],
  "all_synonyms": ["CNS 7056", "ONO 2745", ...],
  "suggested_queries": [
    {
      "id": "q1_title",
      "query": "(remimazolam sedation)[Title]",
      "purpose": "Exact title match - highest precision",
      "estimated_count": 8,
      "pubmed_translation": "\"remimazolam sedation\"[Title]"
    },
    {
      "id": "q3_and",
      "query": "(remimazolam AND sedation)",
      "purpose": "All keywords required",
      "estimated_count": 561,
      "pubmed_translation": "(\"remimazolam\"[Supplementary Concept] OR \"remimazolam\"[All Fields]) AND (\"sedate\"[All Fields] OR ...)"
    }
  ]
}

Value of Query Analysis: An agent may assume that remimazolam AND sedation searches only for those two words, but PubMed actually expands the query to the Supplementary Concept plus synonyms. The estimated count is 561, compared with 8 for the exact title match. Seeing the translation helps the agent understand the gap between its intent and the search PubMed actually runs.
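An agent could use this output to spot queries that PubMed broadened. The sketch below is one possible heuristic, assuming the response has the shape shown in the JSON example; the function `flag_expanded_queries` and the `ratio` threshold are hypothetical and not part of the server.

```python
def flag_expanded_queries(response, ratio=10):
    """Return IDs of suggested queries that PubMed broadened noticeably."""
    queries = response["suggested_queries"]
    # Use the smallest estimate as the baseline (at least 1 to avoid a zero threshold).
    baseline = max(min(q["estimated_count"] for q in queries), 1)
    flagged = []
    for q in queries:
        # PubMed added OR-alternatives that the submitted query did not contain.
        added_or = " OR " in q["pubmed_translation"] and " OR " not in q["query"]
        if added_or and q["estimated_count"] >= ratio * baseline:
            flagged.append(q["id"])
    return flagged


# Trimmed copy of the example response (the "..." is kept exactly as shown there).
response = {
    "suggested_queries": [
        {
            "id": "q1_title",
            "query": "(remimazolam sedation)[Title]",
            "estimated_count": 8,
            "pubmed_translation": '"remimazolam sedation"[Title]',
        },
        {
            "id": "q3_and",
            "query": "(remimazolam AND sedation)",
            "estimated_count": 561,
            "pubmed_translation": '("remimazolam"[Supplementary Concept] OR '
            '"remimazolam"[All Fields]) AND ("sedate"[All Fields] OR ...)',
        },
    ]
}
print(flag_expanded_queries(response))  # ['q3_and']
```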


🔒 HTTPS Deployment

Enable HTTPS secure communication for production environments.

Copilot Studio Quick Start

# Step 1: Generate SSL certificates
./scripts/generate-ssl-certs.sh

# Step 2: Start HTTPS service (Docker)
./scripts/start-https-docker.sh up

# Verify deployment
curl -k https://localhost/

HTTPS Endpoints

| Service | URL | Description |
|---------|-----|-------------|
| MCP SSE | https://localhost/sse | SSE connection (MCP) |
| Messages | https://localhost/messages | MCP POST |
| Health | https://localhost/health | Health check |

Claude Desktop Configuration

{
  "mcpServers": {
    "pubmed-search": {
      "url": "https://localhost/sse"
    }
  }
}

🏢 Microsoft Copilot Studio Integration

Integrate PubMed Search MCP with Microsoft 365 Copilot (Word, Teams, Outlook)!

Quick Start

# Start with Streamable HTTP transport (required by Copilot Studio)
uv run python run_server.py --transport streamable-http --port 8765

# Enable Copilot-compatible HTTP semantics while keeping full tool schemas
uv run python run_server.py --transport streamable-http --copilot-compatible --port 8765

# Or use the dedicated script with ngrok
./scripts/start-copilot-studio.sh --with-ngrok

Copilot Studio Configuration

| Field | Value |
|-------|-------|
| Server name | PubMed Search |
| Server URL | https://your-server.com/mcp |
| Authentication | None (or API Key) |

📖 Full documentation: copilot-studio/README.md

Use --copilot-compatible with run_server.py for Copilot HTTP semantics, or run_copilot.py if you also need simplified tool schemas.

⚠️ Note: SSE transport has been deprecated since Aug 2025. Use streamable-http instead.


📖 More documentation:


🔐 Security

Security Features

| Layer | Feature | Description |
|-------|---------|-------------|
| HTTPS | TLS 1.2/1.3 encryption | All traffic encrypted via Nginx |
| Rate Limiting | 30 req/s | Nginx level protection |
| Security Headers | XSS/CSRF protection | X-Frame-Options, X-Content-Type-Options |
| SSE Optimization | 24h timeout | Long-lived connections for real-time |
| No Database | Stateless | No SQL injection risk |
| No Secrets | In-memory only | No credentials stored |

See DEPLOYMENT.md for detailed deployment instructions.


📤 Export Formats

Export your search results in formats compatible with major reference managers:

| Format | Compatible With | Use Case |
|--------|-----------------|----------|
| RIS | EndNote, Zotero, Mendeley | Universal import |
| BibTeX | LaTeX, Overleaf, JabRef | Academic writing |
| CSV | Excel, Google Sheets | Data analysis |
| MEDLINE | PubMed native format | Archiving |
| JSON | Programmatic access | Custom processing |

Exported Fields

  • Core: PMID, Title, Authors, Journal, Year, Volume, Issue, Pages
  • Identifiers: DOI, PMC ID, ISSN
  • Content: Abstract (HTML tags cleaned)
  • Metadata: Language, Publication Type, Keywords
  • Access: DOI URL, PMC URL, Full-text availability
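To make the field list concrete, here is a minimal sketch of how a few of these fields might map onto RIS tags. It is not the project's exporter: the function `to_ris`, the dictionary keys, and the sample record are placeholders chosen for illustration, and only a subset of fields is covered.

```python
def to_ris(article):
    """Serialize one article dict into a RIS record (journal article type)."""
    lines = ["TY  - JOUR"]
    lines += [f"AU  - {name}" for name in article.get("authors", [])]
    # (dict key, RIS tag) pairs; missing or empty fields are skipped.
    for key, tag in [("title", "TI"), ("journal", "JO"), ("year", "PY"),
                     ("volume", "VL"), ("issue", "IS"), ("doi", "DO"),
                     ("abstract", "AB"), ("pmid", "AN")]:
        value = article.get(key)
        if value:
            lines.append(f"{tag}  - {value}")
    lines.append("ER  - ")  # end-of-record marker
    return "\n".join(lines)


# Placeholder record, not a real article.
example = {
    "authors": ["Doe, Jane", "Roe, Richard"],
    "title": "Example title",
    "journal": "Example Journal",
    "year": 2024,
    "doi": "10.0000/example",
}
print(to_ris(example))
```

Each RIS line is a two-letter tag, two spaces, a hyphen, a space, and the value, which is why the format imports cleanly into most reference managers.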

Special Character Handling

  • BibTeX exports use pylatexenc for proper LaTeX encoding
  • Nordic characters (ø, æ, å), umlauts (ü, ö, ä), and accents are correctly converted
  • Example: Søren Hansen → S{\o}ren Hansen
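The idea behind the conversion can be shown with a standard-library-only sketch. This is a simplified stand-in for what pylatexenc does, not its API: the `LATEX_ESCAPES` table and the `to_latex` function are hypothetical and cover only the six lowercase characters listed above.

```python
# Small illustrative table; pylatexenc covers far more characters.
LATEX_ESCAPES = {
    "ø": r"{\o}", "æ": r"{\ae}", "å": r"{\aa}",
    "ü": r'{\"u}', "ö": r'{\"o}', "ä": r'{\"a}',
}


def to_latex(text):
    """Replace each mapped character with its LaTeX macro; leave the rest untouched."""
    return "".join(LATEX_ESCAPES.get(ch, ch) for ch in text)


print(to_latex("Søren Hansen"))  # S{\o}ren Hansen
```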

📚 Citation

GitHub will show Cite this repository from CITATION.cff. If you use PubMed Search MCP in research, methods sections, or internal technical reports, prefer the GitHub-generated citation or reuse the repository metadata directly.

@software{pubmed_search_mcp,
  title = {PubMed Search MCP},
  author = {u9401066},
  url = {https://github.com/u9401066/pubmed-search-mcp}
}

📄 License

Apache License 2.0 - see LICENSE

