MCP server for PubMed literature search with MeSH, PICO, and intelligent query expansion
PubMed Search MCP
Professional Literature Research Assistant for AI Agents - More than just an API wrapper
A Domain-Driven Design (DDD) based MCP server that serves as an intelligent research assistant for AI agents, providing task-oriented literature search and analysis capabilities.
What's Included:
- 42 MCP Tools - Streamlined PubMed, Europe PMC, CORE, NCBI database access, and Research Timeline / Context Graph
- OA Figure Extraction - Pull figure captions, direct image URLs, and PDF links from PMC Open Access articles
- Docs Site - Browse overview, architecture, quick reference, pipeline tutorials, source contracts, troubleshooting, and deployment in one place at docs/index.html
- 24 Claude Skills - Ready-to-use workflow guides for AI agents (Claude Code-specific)
- Copilot Instructions - VS Code GitHub Copilot integration guide
Language: English | 繁體中文
Quick Install
Prerequisites
- Python 3.10+ - Download
- uv (recommended) - Install uv

# macOS / Linux
curl -LsSf https://astral.sh/uv/install.sh | sh

# Windows
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"

- NCBI Email - Required by NCBI API policy. Any valid email address.
- NCBI API Key (optional) - Get one here for higher rate limits (10 req/s vs 3 req/s)
Install & Run
# Option 1: Zero-install with uvx (recommended for trying out)
uvx pubmed-search-mcp
# Option 2: Add as project dependency
uv add pubmed-search-mcp
# Option 3: pip install
pip install pubmed-search-mcp
Configuration
This MCP server works with any MCP-compatible AI tool. Choose your preferred client:
VS Code / Cursor (.vscode/mcp.json)
{
"servers": {
"pubmed-search": {
"type": "stdio",
"command": "uvx",
"args": ["pubmed-search-mcp"],
"env": {
"NCBI_EMAIL": "your@email.com"
}
}
}
}
Claude Desktop (claude_desktop_config.json)
{
"mcpServers": {
"pubmed-search": {
"command": "uvx",
"args": ["pubmed-search-mcp"],
"env": {
"NCBI_EMAIL": "your@email.com"
}
}
}
}
Config file location:
- macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
- Windows: %APPDATA%\Claude\claude_desktop_config.json
- Linux: ~/.config/Claude/claude_desktop_config.json
Claude Code
claude mcp add pubmed-search -- uvx pubmed-search-mcp
Or add to .mcp.json in your project root:
{
"mcpServers": {
"pubmed-search": {
"command": "uvx",
"args": ["pubmed-search-mcp"],
"env": {
"NCBI_EMAIL": "your@email.com"
}
}
}
}
Zed AI (settings.json)
The Zed editor supports MCP servers natively. Add to your Zed settings.json:
{
"context_servers": {
"pubmed-search": {
"command": "uvx",
"args": ["pubmed-search-mcp"],
"env": {
"NCBI_EMAIL": "your@email.com"
}
}
}
}
Tip: Open the Command Palette → zed: open settings to edit, or go to Agent Panel → Settings → "Add Custom Server".
OpenClaw (~/.openclaw/openclaw.json)
OpenClaw uses MCP servers via the mcp-adapter plugin. Install the adapter first:
openclaw plugins install mcp-adapter
Then add to ~/.openclaw/openclaw.json:
{
"plugins": {
"entries": {
"mcp-adapter": {
"enabled": true,
"config": {
"servers": [
{
"name": "pubmed-search",
"transport": "stdio",
"command": "uvx",
"args": ["pubmed-search-mcp"],
"env": {
"NCBI_EMAIL": "your@email.com"
}
}
]
}
}
}
}
}
Restart the gateway after configuration:
openclaw gateway restart
openclaw plugins list # Should show: mcp-adapter | loaded
Cline (cline_mcp_settings.json)
{
"mcpServers": {
"pubmed-search": {
"command": "uvx",
"args": ["pubmed-search-mcp"],
"env": {
"NCBI_EMAIL": "your@email.com"
},
"alwaysAllow": [],
"disabled": false
}
}
}
Other MCP Clients
Any MCP-compatible client can use this server via stdio transport:
# Command
uvx pubmed-search-mcp
# With environment variable
NCBI_EMAIL=your@email.com uvx pubmed-search-mcp
Note: NCBI_EMAIL is required by NCBI API policy. Optionally set NCBI_API_KEY for higher rate limits (10 req/s vs 3 req/s).
Detailed Integration Guides: See docs/INTEGRATIONS.md for all environment variables, Copilot Studio setup, Docker deployment, proxy configuration, and troubleshooting.
Design Philosophy
Core Positioning: The intelligent middleware between AI Agents and academic search engines.
Why This Server?
Other tools give you raw API access. We give you vocabulary translation + intelligent routing + research analysis:
| Challenge | Our Solution |
|---|---|
| Agent uses ICD codes, PubMed needs MeSH | ✅ Auto ICD→MeSH conversion |
| Multiple databases, different APIs | ✅ Unified Search single entry point |
| Clinical questions need structured search | ✅ PICO toolkit (parse_pico + generate_search_queries for Agent-driven workflow) |
| Typos in medical terms | ✅ ESpell auto-correction |
| Too many results from one source | ✅ Parallel multi-source with dedup |
| Need to trace research evolution | ✅ Research Timeline & Tree with landmark detection, diagnostics, and sub-topic branching |
| Citation context is unclear | ✅ Citation Tree forward/backward/network |
| Can't access full text | ✅ Multi-source fulltext (Europe PMC, CORE, CrossRef) |
| Gene/drug info scattered across DBs | ✅ NCBI Extended (Gene, PubChem, ClinVar) |
| Need cutting-edge preprints | ✅ Preprint search (arXiv, medRxiv, bioRxiv) with peer-review filtering |
| Export to reference managers | ✅ One-click export (RIS, BibTeX, CSV, MEDLINE) |
Key Differentiators
- Vocabulary Translation Layer - Agent speaks naturally, we translate to each database's terminology (MeSH, ICD-10, text-mined entities)
- Unified Search Gateway - One unified_search() call, auto-dispatched to PubMed/Europe PMC/CORE/OpenAlex
- PICO Toolkit - parse_pico() decomposes clinical questions into P/I/C/O elements; the Agent then calls generate_search_queries() per element and builds a Boolean query
- Research Timeline & Lineage Tree - Detect milestones with policy-driven heuristics, identify landmark papers via multi-signal scoring, surface timeline diagnostics, and visualize research evolution as branching trees by sub-topic
- Citation Network Analysis - Build multi-level citation trees to map an entire research landscape from a single paper
- Full Research Lifecycle - From search → discovery → full text → analysis → export, all in one server
- Agent-First Design - Output optimized for machine decision-making, not human reading
External APIs & Data Sources
This MCP server integrates with multiple academic databases and APIs:
Core Data Sources
| Source | Coverage | Vocabulary | Auto-Convert | Description |
|---|---|---|---|---|
| NCBI PubMed | 36M+ articles | MeSH | ✅ Native | Primary biomedical literature |
| NCBI Entrez | Multi-DB | MeSH | ✅ Native | Gene, PubChem, ClinVar |
| Europe PMC | 33M+ | Text-mined | ✅ Extraction | Full text XML access |
| CORE | 200M+ | None | ➡️ Free-text | Open access aggregator |
| Semantic Scholar | 200M+ | S2 Fields | ➡️ Free-text | AI-powered recommendations |
| OpenAlex | 250M+ | Concepts | ➡️ Free-text | Open scholarly metadata |
| NIH iCite | PubMed | N/A | N/A | Citation metrics (RCR) |
Key: ✅ = Full vocabulary support | ➡️ = Query pass-through (no controlled vocabulary)
ICD Codes: Auto-detected and converted to MeSH before PubMed search
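The detection step can be pictured with a small sketch. The regex, the ICD10_TO_MESH_SAMPLE mapping, and the exact expansion format below are illustrative assumptions; the server's real detect_and_expand_icd_codes() and ICD10_TO_MESH tables are richer.

```python
import re

# Illustrative sketch only: the server's detect_and_expand_icd_codes() and
# ICD10_TO_MESH mapping cover far more codes than this sample.
ICD10_PATTERN = re.compile(r"\b[A-TV-Z][0-9][0-9A-Z](?:\.[0-9A-Z]{1,4})?\b")

ICD10_TO_MESH_SAMPLE = {          # tiny subset for demonstration
    "I10": "Hypertension",
    "E11.9": "Diabetes Mellitus, Type 2",
}

def expand_icd_codes(query: str) -> str:
    """Replace recognized ICD-10 codes with (code OR "MeSH term"[MeSH]) groups."""
    def _sub(match: re.Match) -> str:
        code = match.group(0)
        mesh = ICD10_TO_MESH_SAMPLE.get(code)
        return f'({code} OR "{mesh}"[MeSH])' if mesh else code
    return ICD10_PATTERN.sub(_sub, query)

print(expand_icd_codes("I10 treatment outcomes"))
# (I10 OR "Hypertension"[MeSH]) treatment outcomes
```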
Environment Variables
# Required
NCBI_EMAIL=your@email.com # Required by NCBI policy
# Optional - For higher rate limits
NCBI_API_KEY=your_ncbi_api_key # Get from: https://www.ncbi.nlm.nih.gov/account/settings/
CORE_API_KEY=your_core_api_key # Get from: https://core.ac.uk/services/api
S2_API_KEY=your_s2_api_key # Get from: https://www.semanticscholar.org/product/api
# Optional - Network settings
HTTP_PROXY=http://proxy:8080 # HTTP proxy for API requests
HTTPS_PROXY=https://proxy:8080 # HTTPS proxy for API requests
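As a rough illustration of how these variables reach NCBI, an E-utilities request URL is typically assembled like this. The build_esearch_url helper is hypothetical; only the eutils endpoint and the db/term/email/api_key parameters reflect standard E-utilities usage.

```python
import os
from urllib.parse import urlencode

# Standard NCBI E-utilities search endpoint
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def build_esearch_url(term: str) -> str:
    """Sketch: attach NCBI_EMAIL (required) and NCBI_API_KEY (optional)."""
    params = {
        "db": "pubmed",
        "term": term,
        "email": os.environ.get("NCBI_EMAIL", "your@email.com"),
    }
    api_key = os.environ.get("NCBI_API_KEY")
    if api_key:  # unlocks 10 req/s instead of 3 req/s
        params["api_key"] = api_key
    return f"{EUTILS}?{urlencode(params)}"

url = build_esearch_url('hypertension[MeSH]')
```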
How It Works: The Middleware Architecture

AI AGENT
  "Find papers about I10 hypertension treatment in diabetic patients"
        |
        v
PUBMED SEARCH MCP (MIDDLEWARE)
  1. VOCABULARY TRANSLATION
     - ICD-10 "I10" → MeSH "Hypertension"
     - "diabetic" → MeSH "Diabetes Mellitus"
     - ESpell: "hypertention" → "hypertension"
  2. INTELLIGENT ROUTING
     PubMed (36M+, MeSH) · Europe PMC (33M+, fulltext) · CORE (200M+, OA) · OpenAlex (250M+, metadata)
  3. RESULT AGGREGATION
     Dedupe + Rank + Enrich
        |
        v
UNIFIED RESULTS
  - 150 unique papers (deduplicated from 4 sources)
  - Ranked by relevance + citation impact (RCR)
  - Full text links enriched from Europe PMC
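The deduplication inside the aggregation step can be sketched in a few lines. The dedupe function and its key priority (DOI, then PMID, then normalized title) are illustrative assumptions, not the server's actual aggregator logic.

```python
# Illustrative cross-source deduplication by shared identifiers.
def dedupe(articles: list[dict]) -> list[dict]:
    seen: set[str] = set()
    unique = []
    for art in articles:
        # Prefer DOI, fall back to PMID, then to a normalized title
        key = (art.get("doi") or art.get("pmid")
               or art.get("title", "").casefold().strip())
        if key and key not in seen:
            seen.add(key)
            unique.append(art)
    return unique

results = dedupe([
    {"pmid": "1", "doi": "10.1000/x", "title": "Paper A"},
    {"doi": "10.1000/x", "title": "Paper A"},   # same DOI via another source
    {"pmid": "2", "title": "Paper B"},
])
# results keeps 2 unique records
```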
MCP Tools Overview
Search & Query Intelligence

SEARCH ENTRY POINT
  unified_search() - Single entry for all sources
    - Quick search → direct multi-source query
    - PICO hints → detects comparison, shows P/I/C/O
    - ICD expansion → auto ICD→MeSH conversion
  Sources: PubMed · Europe PMC · CORE · OpenAlex
  Auto: Deduplicate → Rank → Enrich full-text links

QUERY INTELLIGENCE
  generate_search_queries() - MeSH expansion + synonym discovery
  parse_pico() - PICO element decomposition
  analyze_search_query() - Query analysis without execution
Discovery Tools (After Finding Key Papers)
Starting from an important paper (PMID), explore in three directions:
- BACKWARD: get_article_references - foundation papers
- SIMILAR: find_related_articles - similar topic
- FORWARD: find_citing_articles - follow-up research

fetch_article_details() - Detailed article metadata
get_citation_metrics() - iCite RCR, citation percentile
build_citation_tree() - Full network visualization (6 formats)
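The depth-limited tree that build_citation_tree produces can be sketched with a mock citation source. MOCK_CITATIONS and build_tree are illustrative; the real tool queries live citation APIs.

```python
# Mock citation data: pmid -> list of citing pmids (made-up identifiers)
MOCK_CITATIONS = {
    "33475315": ["40000001", "40000002"],
    "40000001": ["40000003"],
}

def build_tree(pmid: str, depth: int) -> dict:
    """Recursively expand citing papers up to `depth` levels."""
    node = {"pmid": pmid, "citing": []}
    if depth > 0:
        for child in MOCK_CITATIONS.get(pmid, []):
            node["citing"].append(build_tree(child, depth - 1))
    return node

tree = build_tree("33475315", depth=2)
```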
Full Text, Figure Extraction & Export
| Category | Tools |
|---|---|
| Full Text | get_fulltext - Multi-source retrieval (Europe PMC, CORE, PubMed, CrossRef) |
| Figures | get_article_figures - Extract figure labels, captions, image URLs, and PDF links from PMC Open Access articles |
| Figure-aware Full Text | get_fulltext(include_figures=True) - Embed figure metadata alongside structured fulltext |
| Text Mining | get_text_mined_terms - Extract genes, diseases, chemicals |
| Export | prepare_export - RIS, BibTeX, CSV, MEDLINE, JSON |
OA Figure-First Exploration
Use the PMC Open Access path when an agent needs evidence figures, not just article text:
- get_article_figures(identifier="PMC12086443") → Figure labels, captions, image URLs, and PDF/article links
- get_fulltext(pmcid="PMC7096777", include_figures=True) → Structured fulltext with figures inline
- Figure output preserves article context, so agents can connect each figure back to the sections where it is mentioned
NCBI Extended Databases
| Tool | Description |
|---|---|
| search_gene | Search NCBI Gene database |
| get_gene_details | Gene details by NCBI Gene ID |
| get_gene_literature | PubMed articles linked to a gene |
| search_compound | Search PubChem compounds |
| get_compound_details | Compound details by PubChem CID |
| get_compound_literature | PubMed articles linked to a compound |
| search_clinvar | Search ClinVar clinical variants |
Research Timeline & Lineage Tree
| Tool | Description |
|---|---|
| build_research_timeline | Build timeline/tree with landmark detection and formatted diagnostics. Output: text, tree, mermaid, mindmap, json |
| analyze_timeline_milestones | Analyze milestone distribution with diagnostics payload |
| compare_timelines | Compare multiple topic timelines with per-topic diagnostics |
Institutional Access & ICD Conversion
| Tool | Description |
|---|---|
| configure_institutional_access | Configure institution's link resolver |
| get_institutional_link | Generate OpenURL access link |
| list_resolver_presets | List resolver presets |
| test_institutional_access | Test resolver configuration |
| convert_icd_mesh | Convert between ICD codes and MeSH terms (bidirectional) |
| unified_search | Auto-detect ICD codes in queries and expand them to MeSH |
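An OpenURL link of the kind get_institutional_link generates might be assembled like this. The build_openurl helper, the resolver URL, and the chosen fields are illustrative assumptions based on the OpenURL Z39.88-2004 convention, not the server's exact output.

```python
from urllib.parse import urlencode

def build_openurl(resolver_base: str, doi: str, title: str) -> str:
    """Sketch: point an institutional link resolver at an article by DOI."""
    params = {
        "url_ver": "Z39.88-2004",          # OpenURL 1.0 version tag
        "rft_id": f"info:doi/{doi}",       # referent identified by DOI
        "rft.atitle": title,
    }
    return f"{resolver_base}?{urlencode(params)}"

# Hypothetical resolver base URL and article
link = build_openurl("https://resolver.example.edu/openurl",
                     "10.1056/NEJMoa2034577", "Example article")
```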
Session Management
| Tool | Description |
|---|---|
| get_session_pmids | Retrieve cached PMID lists |
| get_cached_article | Get article from session cache (no API cost) |
| get_session_summary | Session status overview |
Dynamic MCP resources are also available for agents that can read resources directly:
- session://context → active session status
- session://last-search → latest search metadata
- session://last-search/pmids → latest PMID list + CSV form
- session://last-search/results → cached article payloads for the latest search
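The cache-hit behavior behind get_cached_article can be sketched minimally. SessionCache is illustrative, not the server's implementation.

```python
class SessionCache:
    """Sketch: repeated lookups are answered from memory, not the API."""
    def __init__(self):
        self._articles: dict[str, dict] = {}
        self.api_calls = 0

    def get_article(self, pmid: str) -> dict:
        if pmid not in self._articles:
            self.api_calls += 1                    # would hit NCBI here
            self._articles[pmid] = {"pmid": pmid}  # placeholder payload
        return self._articles[pmid]

cache = SessionCache()
cache.get_article("33475315")
cache.get_article("33475315")   # served from cache, no second API call
```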
Pipeline Management
manage_pipeline is the primary facade for pipeline CRUD, history, and scheduling. The more specific pipeline tools remain available as compatibility wrappers.
| Tool | Description |
|---|---|
| manage_pipeline | Primary facade for save, list, load, delete, history, and schedule actions |
| save_pipeline | Save a pipeline config for later reuse (YAML/JSON, auto-validated) |
| list_pipelines | List saved pipelines (filter by tag/scope) |
| load_pipeline | Load a pipeline from name or file for review/editing |
| delete_pipeline | Delete a pipeline and its execution history |
| get_pipeline_history | View execution history with article diff analysis |
| schedule_pipeline | Create, update, or remove recurring pipeline schedules |
Step-by-step tutorials:
- English: docs/PIPELINE_MODE_TUTORIAL.en.md
- 繁體中文: docs/PIPELINE_MODE_TUTORIAL.md
Vision & Image Search
| Tool | Description |
|---|---|
| analyze_figure_for_search | Analyze a scientific figure for search |
| search_biomedical_images | Search biomedical images across Open-i (X-ray, microscopy, photos, diagrams) |
Preprint Search
Search arXiv, medRxiv, and bioRxiv preprint servers via unified_search options flags:
- preprints: Enable dedicated preprint search and show results in a separate section.
- all_types: Keep non-peer-reviewed content in main aggregated results.
Recommended combinations:
- Empty options: Peer-reviewed results only.
- options="preprints": Peer-reviewed main results plus a separate preprint section.
- options="preprints, all_types": Separate preprint section plus non-peer-reviewed content retained in main results.
- options="all_types": No dedicated preprint crawl, but non-peer-reviewed items from searched sources are retained.
Preprint detection - articles are identified as preprints by:
- Article type from source API (OpenAlex, CrossRef, Semantic Scholar)
- arXiv ID present without PubMed ID
- Known preprint server source or journal name
- DOI prefix matching preprint servers (e.g., 10.1101/ → bioRxiv/medRxiv, 10.48550/ → arXiv)
Research Context Graph
unified_search can append a lightweight research lineage view built from PMID-backed ranked results:
| Option Flag | Description |
|---|---|
| context_graph | Append a Research Context Graph preview to Markdown output and include research_context in JSON output |
This is useful when an agent needs quick thematic branching without making a second build_research_timeline call.
Count-First Orientation
unified_search can also front-load the existing source coverage and decision hints for agents that want routing help before reading the ranked list:
| Option Flag | Description |
|---|---|
| counts_first | Add a source-count table, coverage summary, and next-tool recommendations to the response |
Example:
unified_search(query="remimazolam ICU sedation", options="counts_first")
This mode is useful when the agent should decide whether to expand a source, inspect the lead PMID, fetch fulltext, extract figures, or pivot into timeline exploration.
MCP Progress Reporting
When the MCP client provides a progress token, unified_search, build_research_timeline, analyze_timeline_milestones, compare_timelines, get_fulltext, and get_text_mined_terms emit progress updates for their major phases.
This reduces the "black box" wait time for agents during longer searches.
Agent Usage Examples
1️⃣ Quick Search (Simplest)
# Agent just asks naturally - middleware handles everything
unified_search(query="remimazolam ICU sedation", limit=20)
# Or with clinical codes - auto-converted to MeSH
unified_search(query="I10 treatment in E11.9 patients")
#                     I10 = ICD-10 Hypertension, E11.9 = ICD-10 Type 2 Diabetes
2️⃣ PICO Clinical Question
Simple path - unified_search can search directly (no PICO decomposition):
# unified_search searches as-is; detects "A vs B" pattern and shows PICO hints in metadata
unified_search(query="Is remimazolam better than propofol for ICU sedation?")
# → Multi-source keyword search + PICO hint metadata in output
# ⚠️ This does NOT auto-decompose PICO or expand MeSH!
# For structured PICO search, use the Agent workflow below
Agent workflow - PICO decomposition + MeSH expansion (recommended for clinical questions):
"Is remimazolam better than propofol for ICU sedation?"
        |
        v
parse_pico()
  P: ICU patients · I: remimazolam · C: propofol · O: sedation outcomes
        |
        v
generate_search_queries() x 4 (parallel)
  P → "Intensive Care Units"[MeSH]
  I → "remimazolam" [Supplementary Concept], "CNS 7056"
  C → "Propofol"[MeSH], "Diprivan"
  O → "Conscious Sedation"[MeSH], "Deep Sedation"[MeSH]
        |
        v
Agent combines with Boolean logic
  (P) AND (I) AND (C) AND (O)  → High precision
  (P) AND (I OR C) AND (O)     → High recall
        |
        v
unified_search() (auto multi-source + dedup)
  PubMed + Europe PMC + CORE + OpenAlex → auto deduplicate & rank
# Step 1: Parse clinical question
parse_pico("Is remimazolam better than propofol for ICU sedation?")
# Returns: P=ICU patients, I=remimazolam, C=propofol, O=sedation outcomes
# Step 2: Get MeSH for each element (parallel!)
generate_search_queries(topic="ICU patients") # P
generate_search_queries(topic="remimazolam") # I
generate_search_queries(topic="propofol") # C
generate_search_queries(topic="sedation") # O
# Step 3: Agent combines with Boolean
query = '("Intensive Care Units"[MeSH]) AND (remimazolam OR "CNS 7056") AND propofol AND sedation'
# Step 4: Search (auto multi-source, dedup, rank)
unified_search(query=query)
3️⃣ Explore from Key Paper
# Found landmark paper PMID: 33475315
find_related_articles(pmid="33475315") # Similar methodology
find_citing_articles(pmid="33475315") # Who built on this?
get_article_references(pmid="33475315") # What's the foundation?
# Build complete research map
build_citation_tree(pmid="33475315", depth=2, output_format="mermaid")
4️⃣ Gene/Drug Research
# Research a gene
search_gene(query="BRCA1", organism="human")
get_gene_literature(gene_id="672", limit=20)
# Research a drug compound
search_compound(query="propofol")
get_compound_literature(cid="4943", limit=20)
5️⃣ Export Results
# Export last search results
prepare_export(pmids="last", format="ris")      # → EndNote/Zotero
prepare_export(pmids="last", format="bibtex")   # → LaTeX
# Retrieve full text for a selected paper from the last search
get_fulltext(pmid="12345678", extended_sources=True)
6️⃣ Preprint Search
# Include preprints alongside peer-reviewed results
unified_search(query="COVID-19 vaccine efficacy", options="preprints")
# → Main results (peer-reviewed) + separate preprint section (arXiv, medRxiv, bioRxiv)
# Include preprints and retain non-peer-reviewed items in main results
unified_search(query="CRISPR gene therapy", options="preprints, all_types")
# → Separate preprint section + non-peer-reviewed items retained in main results
# Only peer-reviewed (default behavior)
unified_search("diabetes treatment")
# → Preprints from any source automatically filtered out
# Add a research context graph preview to the same search response
unified_search("remimazolam ICU sedation", options="context_graph")
7️⃣ Pipeline (Reusable Search Plans)
# Save a template-based pipeline through the primary facade
manage_pipeline(
action="save",
name="icu_sedation_weekly",
config="template: pico\nparams:\n P: ICU patients\n I: remimazolam\n C: propofol\n O: delirium",
tags="anesthesia,sedation",
description="Weekly ICU sedation monitoring"
)
# Save a custom DAG pipeline
manage_pipeline(
action="save",
name="brca1_comprehensive",
config="""
steps:
- id: expand
action: expand
params: { topic: BRCA1 breast cancer }
- id: pubmed
action: search
params: { query: BRCA1, sources: pubmed, limit: 50 }
- id: expanded
action: search
inputs: [expand]
params: { strategy: mesh, sources: pubmed,openalex, limit: 50 }
- id: merged
action: merge
inputs: [pubmed, expanded]
params: { method: rrf }
- id: enriched
action: metrics
inputs: [merged]
output:
limit: 30
ranking: quality
"""
)
# Execute a saved pipeline
unified_search(pipeline="saved:icu_sedation_weekly")
# List & manage
manage_pipeline(action="list", tag="anesthesia")
manage_pipeline(action="load", source="brca1_comprehensive") # Review YAML
manage_pipeline(action="history", name="icu_sedation_weekly") # View past runs
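The merge: { method: rrf } step in the DAG example names reciprocal rank fusion. A minimal sketch of that scoring follows; k=60 is the conventional RRF constant, and the server's actual implementation may differ in detail.

```python
def rrf_merge(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal rank fusion: score each id by sum of 1/(k + rank)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, pmid in enumerate(ranking, start=1):
            scores[pmid] = scores.get(pmid, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

merged = rrf_merge([
    ["A", "B", "C"],   # e.g. results of the "pubmed" step
    ["B", "C", "D"],   # e.g. results of the "expanded" step
])
# "B" ranks first: it appears near the top of both lists
```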
Search Mode Comparison

SEARCH MODE DECISION TREE
"What kind of search do I need?"
- Know exactly what to search?
  → unified_search(query="topic keywords") - quick, auto-routing to best sources
- Have a clinical question (A vs B)?
  → parse_pico() → generate_search_queries() x N - Agent builds Boolean → unified_search()
- Need comprehensive systematic coverage?
  → generate_search_queries() → parallel search - MeSH expansion, multiple strategies, merge
- Exploring from a key paper?
  → find_related/citing/references → build_citation_tree - citation network, research context
| Mode | Entry Point | Best For | Auto-Features |
|---|---|---|---|
| Quick | unified_search() | Fast topic search | ICD→MeSH, multi-source, dedup |
| PICO | parse_pico() → Agent | Clinical questions | Agent: decompose → MeSH expand → Boolean |
| Systematic | generate_search_queries() | Literature reviews | MeSH expansion, synonyms |
| Exploration | find_*_articles() | From key paper | Citation network, related |
Claude Skills (AI Agent Workflows)
Pre-built workflow guides in .claude/skills/, divided into Usage Skills (for using the MCP server) and Development Skills (for maintaining the project):
Usage Skills (10) - For AI Agents Using This MCP Server
| Skill | Description |
|---|---|
| pubmed-quick-search | Basic search with filters |
| pubmed-systematic-search | MeSH expansion, comprehensive |
| pubmed-pico-search | Clinical question decomposition |
| pubmed-paper-exploration | Citation tree, related articles |
| pubmed-gene-drug-research | Gene/PubChem/ClinVar |
| pubmed-fulltext-access | Europe PMC, CORE full text |
| pubmed-export-citations | RIS/BibTeX/CSV export |
| pubmed-multi-source-search | Cross-database unified search |
| pubmed-mcp-tools-reference | Complete tool reference guide |
| pipeline-persistence | Save, load, reuse search plans |
Development Skills (13) - For Project Contributors
| Skill | Description |
|---|---|
| changelog-updater | Auto-update CHANGELOG.md |
| code-refactor | DDD architecture refactoring |
| code-reviewer | Code quality & security review |
| ddd-architect | DDD scaffold for new features |
| git-doc-updater | Sync docs before commits |
| git-precommit | Pre-commit workflow orchestration |
| memory-checkpoint | Save context to Memory Bank |
| memory-updater | Update Memory Bank files |
| project-init | Initialize new projects |
| readme-i18n | Multilingual README sync |
| readme-updater | Sync README with code changes |
| roadmap-updater | Update ROADMAP.md status |
| test-generator | Generate test suites |
Location: .claude/skills/*/SKILL.md (Claude Code-specific, and the single source of truth for repo skills). Do not mirror or split repo skills into .github/skills/. These repo skills are project-scoped and should remain version-controlled. Personal cross-project skills belong in a user directory such as ~/.copilot/skills/ or ~/.claude/skills/, not in this repository.
Architecture (DDD)
This project uses Domain-Driven Design (DDD) architecture, with literature research domain knowledge as the core model.
src/pubmed_search/
├── domain/                  # Core business logic
│   └── entities/article.py  # UnifiedArticle, Author, etc.
├── application/             # Use cases
│   ├── search/              # QueryAnalyzer, ResultAggregator
│   ├── export/              # Citation export (RIS, BibTeX...)
│   └── session/             # SessionManager
├── infrastructure/          # External systems
│   ├── ncbi/                # Entrez, iCite, Citation Exporter
│   ├── sources/             # Europe PMC, CORE, CrossRef...
│   └── http/                # HTTP clients
├── presentation/            # User interfaces
│   ├── mcp_server/          # MCP tools, prompts, resources
│   │   └── tools/           # discovery, strategy, pico, export...
│   └── api/                 # REST API (Copilot Studio)
└── shared/                  # Cross-cutting concerns
    ├── exceptions.py        # Unified error handling
    └── async_utils.py       # Rate limiter, retry, circuit breaker
Internal Mechanisms (Transparent to Agent)
| Mechanism | Description |
|---|---|
| Session | Auto-create, auto-switch |
| Cache | Auto-cache search results, avoid duplicate API calls |
| Rate Limit | Auto-comply with NCBI API limits (0.34s interval without key, 0.1s with key) |
| MeSH Lookup | generate_search_queries() auto-queries NCBI MeSH database |
| ESpell | Auto spelling correction (remifentanyl → remifentanil) |
| Query Analysis | Each suggested query shows how PubMed actually interprets it |
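The 0.34s/0.1s figures in the Rate Limit row map onto NCBI's 3 req/s (no key) and 10 req/s (with key) limits. A minimal throttle sketch, not the server's actual code:

```python
import os

def min_request_interval() -> float:
    """Seconds to wait between Entrez calls to stay under NCBI limits."""
    # ~1/3 s without an API key (3 req/s), ~1/10 s with one (10 req/s)
    return 0.1 if os.environ.get("NCBI_API_KEY") else 0.34

# An async client would sleep this long between requests:
# await asyncio.sleep(min_request_interval())
```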
Vocabulary Translation Layer (Key Feature)
Our Core Value: We are the intelligent middleware between the Agent and the search engines, automatically handling vocabulary standardization so the Agent doesn't need to know each database's terminology.
Different data sources use different controlled vocabulary systems. This server provides automatic conversion:
| API / Database | Vocabulary System | Auto-Conversion |
|---|---|---|
| PubMed / NCBI | MeSH (Medical Subject Headings) | ✅ Full support via expand_with_mesh() |
| ICD Codes | ICD-10-CM / ICD-9-CM | ✅ Auto-detect & convert to MeSH |
| Europe PMC | Text-mined entities (Gene, Disease, Chemical) | ✅ get_text_mined_terms() extraction |
| OpenAlex | OpenAlex Concepts (deprecated) | ❌ Free-text only |
| Semantic Scholar | S2 Field of Study | ❌ Free-text only |
| CORE | None | ❌ Free-text only |
| CrossRef | None | ❌ Free-text only |
Automatic ICD → MeSH Conversion
When searching with ICD codes (e.g., I10 for Hypertension), unified_search() automatically:
- Detects ICD-10/ICD-9 patterns via detect_and_expand_icd_codes()
- Looks up corresponding MeSH terms from internal mappings (ICD10_TO_MESH, ICD9_TO_MESH)
- Expands the query with MeSH synonyms for comprehensive search
# Agent calls unified_search with clinical terminology
unified_search(query="I10 treatment outcomes")
# Server auto-expands to a PubMed-compatible query
"(I10 OR Hypertension[MeSH]) treatment outcomes"
Full architecture documentation: ARCHITECTURE.md
MeSH Auto-Expansion + Query Analysis
When calling generate_search_queries("remimazolam sedation"), internally it:
- ESpell Correction - fixes spelling errors
- MeSH Query - Entrez.esearch(db="mesh") to get standard vocabulary
- Synonym Extraction - gets synonyms from MeSH Entry Terms
- Query Analysis - analyzes how PubMed interprets each query
{
"mesh_terms": [
{
"input": "remimazolam",
"preferred": "remimazolam [Supplementary Concept]",
"synonyms": ["CNS 7056", "ONO 2745"]
}
],
"all_synonyms": ["CNS 7056", "ONO 2745", ...],
"suggested_queries": [
{
"id": "q1_title",
"query": "(remimazolam sedation)[Title]",
"purpose": "Exact title match - highest precision",
"estimated_count": 8,
"pubmed_translation": "\"remimazolam sedation\"[Title]"
},
{
"id": "q3_and",
"query": "(remimazolam AND sedation)",
"purpose": "All keywords required",
"estimated_count": 561,
"pubmed_translation": "(\"remimazolam\"[Supplementary Concept] OR \"remimazolam\"[All Fields]) AND (\"sedate\"[All Fields] OR ...)"
}
]
}
Value of Query Analysis: The Agent assumes remimazolam AND sedation searches only those two words, but PubMed actually expands them to the Supplementary Concept plus synonyms, so results grow from 8 to 561. This helps the Agent understand the difference between its intent and the actual search.
HTTPS Deployment
Enable HTTPS secure communication for production environments.
Copilot Studio Quick Start
# Step 1: Generate SSL certificates
./scripts/generate-ssl-certs.sh
# Step 2: Start HTTPS service (Docker)
./scripts/start-https-docker.sh up
# Verify deployment
curl -k https://localhost/
HTTPS Endpoints
| Service | URL | Description |
|---|---|---|
| MCP SSE | https://localhost/sse | SSE connection (MCP) |
| Messages | https://localhost/messages | MCP POST |
| Health | https://localhost/health | Health check |
Claude Desktop Configuration
{
"mcpServers": {
"pubmed-search": {
"url": "https://localhost/sse"
}
}
}
Microsoft Copilot Studio Integration
Integrate PubMed Search MCP with Microsoft 365 Copilot (Word, Teams, Outlook)!
Quick Start
# Start with Streamable HTTP transport (required by Copilot Studio)
uv run python run_server.py --transport streamable-http --port 8765
# Enable Copilot-compatible HTTP semantics while keeping full tool schemas
uv run python run_server.py --transport streamable-http --copilot-compatible --port 8765
# Or use the dedicated script with ngrok
./scripts/start-copilot-studio.sh --with-ngrok
Copilot Studio Configuration
| Field | Value |
|---|---|
| Server name | PubMed Search |
| Server URL | https://your-server.com/mcp |
| Authentication | None (or API Key) |
Full documentation: copilot-studio/README.md
Use --copilot-compatible with run_server.py for Copilot HTTP semantics, or run_copilot.py if you also need simplified tool schemas.
⚠️ Note: SSE transport has been deprecated since Aug 2025. Use streamable-http.
More documentation:
- Architecture → ARCHITECTURE.md
- Pipeline tutorial (English) → docs/PIPELINE_MODE_TUTORIAL.en.md
- Pipeline tutorial (zh-TW) → docs/PIPELINE_MODE_TUTORIAL.md
- Deployment guide → DEPLOYMENT.md
- Copilot Studio → copilot-studio/README.md
๐ Security
Security Features
| Layer | Feature | Description |
|---|---|---|
| HTTPS | TLS 1.2/1.3 encryption | All traffic encrypted via Nginx |
| Rate Limiting | 30 req/s | Nginx level protection |
| Security Headers | XSS/CSRF protection | X-Frame-Options, X-Content-Type-Options |
| SSE Optimization | 24h timeout | Long-lived connections for real-time streaming |
| No Database | Stateless | No SQL injection risk |
| No Secrets | In-memory only | No credentials stored |
See DEPLOYMENT.md for detailed deployment instructions.
๐ค Export Formats
Export your search results in formats compatible with major reference managers:
| Format | Compatible With | Use Case |
|---|---|---|
| RIS | EndNote, Zotero, Mendeley | Universal import |
| BibTeX | LaTeX, Overleaf, JabRef | Academic writing |
| CSV | Excel, Google Sheets | Data analysis |
| MEDLINE | PubMed native format | Archiving |
| JSON | Programmatic access | Custom processing |
Exported Fields
- Core: PMID, Title, Authors, Journal, Year, Volume, Issue, Pages
- Identifiers: DOI, PMC ID, ISSN
- Content: Abstract (HTML tags cleaned)
- Metadata: Language, Publication Type, Keywords
- Access: DOI URL, PMC URL, Full-text availability
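As an illustration of the RIS shape these fields map to, here is a hand-rolled sketch (not this package's actual exporter; the sample record is made up):

```python
def to_ris(rec: dict) -> str:
    """Render one record as an RIS journal-article entry."""
    lines = ["TY  - JOUR"]                      # record type: journal article
    for author in rec.get("authors", []):
        lines.append(f"AU  - {author}")         # one AU tag per author
    tag_map = {"title": "TI", "journal": "JO", "year": "PY",
               "volume": "VL", "issue": "IS", "doi": "DO", "abstract": "AB"}
    for key, tag in tag_map.items():
        if rec.get(key):
            lines.append(f"{tag}  - {rec[key]}")
    lines.append("ER  - ")                      # end-of-record marker
    return "\n".join(lines)

entry = to_ris({
    "title": "Remimazolam for procedural sedation",   # hypothetical record
    "authors": ["Hansen S", "Lee J"],
    "journal": "Example J Anesth", "year": "2024", "doi": "10.1000/xyz",
})
```

EndNote, Zotero, and Mendeley all accept files of such entries separated by blank lines.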
Special Character Handling
- BibTeX exports use pylatexenc for proper LaTeX encoding
- Nordic characters (รธ, รฆ, รฅ), umlauts (รผ, รถ, รค), and accents are correctly converted
- Example: `Søren Hansen` → `S{\o}ren Hansen`
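pylatexenc handles this conversion internally; the effect can be illustrated with a minimal character map (a toy stand-in, not pylatexenc's actual table):

```python
# Toy subset of the Unicode -> LaTeX mapping that pylatexenc applies.
LATEX_MAP = {
    "ø": r"{\o}", "Ø": r"{\O}", "æ": r"{\ae}", "å": r"{\aa}",
    "ü": r'{\"u}', "ö": r'{\"o}', "ä": r'{\"a}', "é": r"{\'e}",
}

def to_latex(text: str) -> str:
    """Replace known non-ASCII characters with their LaTeX escapes."""
    return "".join(LATEX_MAP.get(ch, ch) for ch in text)

print(to_latex("Søren Hansen"))  # S{\o}ren Hansen
```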
๐ Citation
GitHub shows a "Cite this repository" button generated from CITATION.cff. If you use PubMed Search MCP in research, methods sections, or internal technical reports, prefer the GitHub-generated citation or reuse the repository metadata directly.
@software{pubmed_search_mcp,
title = {PubMed Search MCP},
author = {u9401066},
url = {https://github.com/u9401066/pubmed-search-mcp}
}
๐ License
Apache License 2.0 - see LICENSE
๐ Links
Download files
File details
Details for the file pubmed_search_mcp-0.5.1.tar.gz.
File metadata
- Download URL: pubmed_search_mcp-0.5.1.tar.gz
- Upload date:
- Size: 425.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `d2e4db195c77409c579b8564c9ee691288847c801e79cd5a78d2212ff040b837` |
| MD5 | `802d0ecfaaa250ae3f27e5359cca536a` |
| BLAKE2b-256 | `3295b24fc684f7210664a67664347636bcead5a617dd0e95573cb6d216825d6f` |
Provenance
The following attestation bundles were made for pubmed_search_mcp-0.5.1.tar.gz:
Publisher: publish.yml on u9401066/pubmed-search-mcp
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: pubmed_search_mcp-0.5.1.tar.gz
- Subject digest: d2e4db195c77409c579b8564c9ee691288847c801e79cd5a78d2212ff040b837
- Sigstore transparency entry: 1245340210
- Sigstore integration time:
- Permalink: u9401066/pubmed-search-mcp@d2e0d6c022b24c9de50f4d8e7741ac7703fa93ed
- Branch / Tag: refs/tags/v0.5.1
- Owner: https://github.com/u9401066
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@d2e0d6c022b24c9de50f4d8e7741ac7703fa93ed
- Trigger Event: push
File details
Details for the file pubmed_search_mcp-0.5.1-py3-none-any.whl.
File metadata
- Download URL: pubmed_search_mcp-0.5.1-py3-none-any.whl
- Upload date:
- Size: 500.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `c830fc3e00aa168d377ea480bce107af1edcbd65e622069d4471f1f17db1c851` |
| MD5 | `d1cc80b20b9f34cc967b56a688f133f1` |
| BLAKE2b-256 | `96fb84b1590857feec61cb0e04751a52070534e66be727a4e5abe63bfebd00c5` |
Provenance
The following attestation bundles were made for pubmed_search_mcp-0.5.1-py3-none-any.whl:
Publisher: publish.yml on u9401066/pubmed-search-mcp
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: pubmed_search_mcp-0.5.1-py3-none-any.whl
- Subject digest: c830fc3e00aa168d377ea480bce107af1edcbd65e622069d4471f1f17db1c851
- Sigstore transparency entry: 1245340216
- Sigstore integration time:
- Permalink: u9401066/pubmed-search-mcp@d2e0d6c022b24c9de50f4d8e7741ac7703fa93ed
- Branch / Tag: refs/tags/v0.5.1
- Owner: https://github.com/u9401066
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@d2e0d6c022b24c9de50f4d8e7741ac7703fa93ed
- Trigger Event: push