MCP server for PubMed literature search with MeSH, PICO, and intelligent query expansion

PubMed Search MCP

PyPI version Python 3.10+ License: Apache 2.0 MCP Smithery Test Coverage

Professional Literature Research Assistant for AI Agents - More than just an API wrapper

A Domain-Driven Design (DDD) based MCP server that serves as an intelligent research assistant for AI agents, providing task-oriented literature search and analysis capabilities.

✨ What's Included:

  • 🔧 21 MCP Tools - Streamlined PubMed, Europe PMC, CORE, and NCBI database access
  • 📚 9 Claude Skills - Ready-to-use workflow guides for AI agents
  • 📖 Copilot Instructions - VS Code GitHub Copilot integration guide

🌐 Language: English | 繁體中文 (Traditional Chinese)


🚀 Quick Install

Via Smithery (Recommended for Claude Desktop)

npx -y @smithery/cli install pubmed-search-mcp --client claude

Via pip

pip install pubmed-search-mcp

Via uv

uv add pubmed-search-mcp

Via uvx (Zero Install)

uvx pubmed-search-mcp

⚙️ Configuration

Claude Desktop (claude_desktop_config.json)

{
  "mcpServers": {
    "pubmed-search": {
      "command": "uvx",
      "args": ["pubmed-search-mcp"],
      "env": {
        "NCBI_EMAIL": "your@email.com"
      }
    }
  }
}

VS Code / Cursor (.vscode/mcp.json)

{
  "servers": {
    "pubmed-search": {
      "type": "stdio",
      "command": "uvx",
      "args": ["pubmed-search-mcp"],
      "env": {
        "NCBI_EMAIL": "your@email.com"
      }
    }
  }
}

Note: NCBI_EMAIL is required by NCBI API policy. Optionally set NCBI_API_KEY for higher rate limits.


🎯 Design Philosophy

  • Agent-First - Designed for AI Agents, output optimized for machine decision-making
  • Task-Oriented - Tools organized by research tasks, not low-level APIs
  • DDD Architecture - Core modeling based on literature research domain knowledge
  • Context-Aware - Maintains research state through Session

Positioning: PubMed-specialized AI research assistant

  • ✅ MeSH vocabulary integration - Not available from other sources
  • ✅ PICO structured queries - Medical specialty
  • ✅ ESpell spelling correction - Auto-correction
  • ✅ Batch parallel search - High efficiency

📡 External APIs & Data Sources

This MCP server integrates with multiple academic databases and APIs:

Core Data Sources

| Source | Coverage | API Key | Rate Limit (without → with key) | Description |
|---|---|---|---|---|
| NCBI PubMed | 36M+ articles | Optional | 3/s → 10/s | Primary biomedical literature |
| NCBI Entrez | Multi-DB | Optional | 3/s → 10/s | Gene, PubChem, ClinVar |
| Europe PMC | 33M+ | Not required | Generous | Full text XML access |
| CORE | 200M+ | Optional | 100/day → 5K/day | Open access aggregator |
| Semantic Scholar | 200M+ | Optional | 100/s → 1K/s | AI-powered recommendations |
| OpenAlex | 250M+ | Not required | 100K/day | Open scholarly metadata |
| NIH iCite | PubMed | Not required | Generous | Citation metrics (RCR) |

Environment Variables

# Required
NCBI_EMAIL=your@email.com          # Required by NCBI policy

# Optional - For higher rate limits
NCBI_API_KEY=your_ncbi_api_key     # Get from: https://www.ncbi.nlm.nih.gov/account/settings/
CORE_API_KEY=your_core_api_key     # Get from: https://core.ac.uk/services/api
S2_API_KEY=your_s2_api_key         # Get from: https://www.semanticscholar.org/product/api

# Optional - Network settings
HTTP_PROXY=http://proxy:8080       # HTTP proxy for API requests
HTTPS_PROXY=https://proxy:8080     # HTTPS proxy for API requests
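
A minimal sketch of how a client might read these variables. The `NcbiSettings` type and `load_ncbi_settings` helper are hypothetical names for illustration, not part of the package; the 3/s and 10/s tiers come from the table above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class NcbiSettings:
    email: str
    api_key: Optional[str]
    requests_per_second: int  # 3 without a key, 10 with one


def load_ncbi_settings(env: Mapping[str, str] = os.environ) -> NcbiSettings:
    """Hypothetical helper: read NCBI settings and pick the rate tier."""
    email = env.get("NCBI_EMAIL", "").strip()
    if not email:
        raise ValueError("NCBI_EMAIL is required by NCBI policy")
    api_key = env.get("NCBI_API_KEY") or None
    return NcbiSettings(email, api_key, 10 if api_key else 3)


settings = load_ncbi_settings({"NCBI_EMAIL": "your@email.com"})
print(settings.requests_per_second)  # 3, because no API key is set
```

Passing the environment as a mapping keeps the helper easy to test without touching real environment variables.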

Python Dependencies

biopython>=1.81        # NCBI Entrez E-utilities
requests>=2.28.0       # HTTP client
pylatexenc>=2.10       # Unicode to LaTeX (BibTeX export)
mcp>=1.0.0             # Model Context Protocol

Features

  • Search PubMed: Full-text and advanced query support
  • Related Articles: Find papers related to a given PMID
  • Citing Articles: Find papers that cite a given PMID
  • Parallel Search: Generate multiple queries for comprehensive searches
  • PDF Access: Get open-access PDF URLs from PubMed Central
  • Export Formats: RIS, BibTeX, CSV, MEDLINE, JSON (EndNote/Zotero/Mendeley compatible)
  • MCP Integration: Use with VS Code + GitHub Copilot or any MCP client
  • Remote Server: Deploy as HTTP service for multi-machine access
  • Submodule Ready: Use as a Git submodule in larger projects
  • Multi-Source Search: PubMed, Europe PMC (33M+), CORE (200M+), Semantic Scholar, OpenAlex
  • Full Text Access: Direct XML/text retrieval from Europe PMC and CORE
  • NCBI Extended: Gene, PubChem compound, and ClinVar clinical variant databases
  • Claude Skills: 9 pre-built workflow guides for AI agent development
  • Copilot Integration: GitHub Copilot instructions for VS Code users

🤖 Claude Skills (AI Agent Workflows)

This project includes 9 Claude Skill files in .claude/skills/ that teach AI agents how to effectively use the MCP tools. These skills provide:

  • Step-by-step workflows with decision trees
  • Code examples ready for immediate use
  • Best practices for each research scenario

Available Skills

| Skill | Description | Trigger Examples |
|---|---|---|
| pubmed-quick-search | Basic PubMed search | "search for", "find papers" |
| pubmed-systematic-search | MeSH expansion, comprehensive search | "systematic review", "comprehensive" |
| pubmed-pico-search | PICO clinical question decomposition | "is A better than B?", "PICO" |
| pubmed-paper-exploration | Citation tree, related articles | "citing articles", "related papers" |
| pubmed-gene-drug-research | Gene, PubChem, ClinVar integration | "gene function", "drug compound" |
| pubmed-fulltext-access | Europe PMC, CORE full text retrieval | "full text", "PDF", "open access" |
| pubmed-export-citations | RIS, BibTeX, CSV export | "export", "EndNote", "Zotero" |
| pubmed-multi-source-search | Cross-database search strategy | "all sources", "multi-database" |
| pubmed-mcp-tools-reference | Complete 35+ tools reference | "all tools", "what can you do" |

Using Skills

For Claude Desktop / Claude Code:

# Skills are automatically loaded from .claude/skills/
# Just ask naturally:
"Help me do a systematic search for remimazolam"
"What are the citing articles for this paper?"

For VS Code GitHub Copilot:

# The .github/copilot-instructions.md provides guidance
# Copilot will use the skill patterns automatically

Skill File Structure

Each skill file follows this structure:

---
name: pubmed-quick-search
description: Quick PubMed search. Triggers: search, find papers...
---
# Quick PubMed Search

## Description
...

## Workflow
...

## Code Examples
...

📁 Skill files location: .claude/skills/pubmed-*/SKILL.md


🛠️ MCP Tools (35+ Tools)

Discovery Tools

| Tool | Description | Direction |
|---|---|---|
| search_literature | Search PubMed literature | - |
| find_related_articles | Find similar articles (PubMed algorithm) | Similarity |
| find_citing_articles | Find papers citing this article (follow-up research) | Forward ➡️ |
| get_article_references | Get this article's references (research foundation) | Backward ⬅️ |
| fetch_article_details | Get full article information | - |
| get_citation_metrics | Get citation metrics (iCite RCR/Percentile) | - |
| build_citation_tree | Build citation network tree (6 formats) | Both ↔️ |
| suggest_citation_tree | Evaluate if building citation tree is worthwhile | - |

Parallel Search Tools

| Tool | Description |
|---|---|
| parse_pico | Parse PICO clinical questions (search entry point) |
| generate_search_queries | Generate multiple search strategies (ESpell + MeSH) |
| merge_search_results | Merge and deduplicate search results |
| expand_search_queries | Expand search strategies |

Export Tools

| Tool | Description |
|---|---|
| prepare_export | Export citation formats (RIS/BibTeX/CSV/MEDLINE/JSON) |
| get_article_fulltext_links | Get full-text links (PMC/DOI) |
| analyze_fulltext_access | Analyze open access availability |

🇪🇺 Europe PMC Tools (Full Text Access)

| Tool | Description |
|---|---|
| search_europe_pmc | Search 33M+ publications with OA/fulltext filters |
| get_fulltext | 📄 Get parsed full text (structured sections) |
| get_fulltext_xml | Get raw JATS XML |
| get_text_mined_terms | 🔬 Get annotations (genes, diseases, chemicals) |
| get_europe_pmc_citations | Citation network (citing/references) |

📚 CORE Tools (200M+ Open Access Papers)

| Tool | Description |
|---|---|
| search_core | Search 200M+ open access papers |
| search_core_fulltext | Search within paper content (42M+ full texts) |
| get_core_paper | Get paper details by CORE ID |
| get_core_fulltext | 📄 Get full text content |
| find_in_core | Find papers by DOI/PMID |

🧬 NCBI Extended Database Tools

| Tool | Description |
|---|---|
| search_gene | 🧬 Search NCBI Gene database |
| get_gene_details | Get gene information |
| get_gene_literature | Get gene-linked PubMed articles |
| search_compound | 💊 Search PubChem compounds |
| get_compound_details | Get compound info (formula, SMILES) |
| get_compound_literature | Get compound-linked PubMed articles |
| search_clinvar | 🔬 Search ClinVar clinical variants |

Session Management Tools

| Tool | Description |
|---|---|
| get_session_pmids | Get cached PMID list from searches |
| list_search_history | List search history |
| get_cached_article | Get article from cache (no API call) |
| get_session_summary | Get session status summary |

Design Principle: Focus on search. Session/Cache/Reading List are all internal mechanisms that operate automatically - Agents don't need to manage them.


📋 Agent Usage Workflow

Simple Search

search_literature(query="remimazolam ICU sedation", limit=10)

Using PubMed Official Syntax

# MeSH standard vocabulary
search_literature(query='"Diabetes Mellitus"[MeSH]')

# Field-specific search
search_literature(query='(BRAF[Gene Name]) AND (melanoma[Title/Abstract])')

# Date range
search_literature(query='COVID-19[Title] AND 2024[dp]')

# Publication type
search_literature(query='propofol sedation AND Review[pt]')

# Combined search
search_literature(query='("Intensive Care Units"[MeSH]) AND (remimazolam[tiab] OR "CNS 7056"[tiab])')

PubMed Official Field Tags

| Tag | Description | Example |
|---|---|---|
| [Title] or [ti] | Title | COVID-19[ti] |
| [Title/Abstract] or [tiab] | Title + Abstract | sedation[tiab] |
| [MeSH] or [mh] | MeSH standard vocabulary | "Diabetes Mellitus"[MeSH] |
| [MeSH Major Topic] or [majr] | MeSH major topic | "Anesthesia"[majr] |
| [Author] or [au] | Author | Smith J[au] |
| [Journal] or [ta] | Journal abbreviation | Nature[ta] |
| [Publication Type] or [pt] | Publication type | Review[pt], Clinical Trial[pt] |
| [Date - Publication] or [dp] | Publication date | 2024[dp], 2020:2024[dp] |
| [Gene Name] | Gene name | BRAF[Gene Name] |
| [Substance Name] | Substance name | propofol[Substance Name] |

Full syntax reference: PubMed Search Field Tags
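
Field-tagged queries like the ones above are easy to mistype by hand. As a rough sketch (the `tagged`, `any_of`, and `all_of` helpers are hypothetical and not part of this package), an agent or script could assemble them from small pieces and pass the result to search_literature:

```python
def tagged(term: str, tag: str) -> str:
    """Attach a PubMed field tag, quoting multi-word terms."""
    quoted = f'"{term}"' if " " in term else term
    return f"{quoted}[{tag}]"


def any_of(*clauses: str) -> str:
    """Join clauses with OR inside one parenthesized group."""
    return "(" + " OR ".join(clauses) + ")"


def all_of(*groups: str) -> str:
    """Join already-grouped clauses with AND."""
    return " AND ".join(groups)


query = all_of(
    any_of(tagged("Intensive Care Units", "MeSH")),
    any_of(tagged("remimazolam", "tiab"), tagged("CNS 7056", "tiab")),
)
print(query)
# ("Intensive Care Units"[MeSH]) AND (remimazolam[tiab] OR "CNS 7056"[tiab])
```

The printed string matches the "Combined search" example shown earlier.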

Deep Exploration (After finding important papers)

find_related_articles(pmid="12345678")   # Related articles (PubMed algorithm)
find_citing_articles(pmid="12345678")    # Papers citing this one (forward in time)
get_article_references(pmid="12345678")  # This paper's references (backward in time)

🔬 Citation Discovery Guide

After finding an important paper, there are 5 tools to explore related literature. Choosing the right tool can greatly improve research efficiency:

Tool Comparison

| Tool | Direction | Data Source | Use Case | API Calls |
|---|---|---|---|---|
| find_related_articles | Similarity | PubMed algorithm | Find articles with a similar topic or method | 1 |
| find_citing_articles | Forward ➡️ | PMC citations | Find follow-up research | 1 |
| get_article_references | Backward ⬅️ | PMC references | Find foundational papers | 1 |
| build_citation_tree | Both ↔️ | PMC (BFS traversal) | Build complete citation network | Multiple |
| suggest_citation_tree | - | Article info | Evaluate if tree building is worthwhile | 1 |

Usage Decision Tree

Found an important paper (PMID: 12345678)
    │
    ├── Want to find "similar topic" articles?
    │   └── ✅ find_related_articles(pmid="12345678")
    │       → PubMed algorithm finds similar articles by MeSH, keywords, citation patterns
    │
    ├── Want to know "how subsequent research developed"?
    │   └── ✅ find_citing_articles(pmid="12345678")
    │       → Find all papers citing this one (timeline: forward → now)
    │
    ├── Want to understand "what this article is based on"?
    │   └── ✅ get_article_references(pmid="12345678")
    │       → Get this article's reference list (timeline: backward ← past)
    │
    └── Want to build "complete research context network"?
        │
        ├── First evaluate: suggest_citation_tree(pmid="12345678")
        │   → Check citation count to decide if tree building is worthwhile
        │
        └── Build network: build_citation_tree(pmid="12345678", depth=2)
            → Output Mermaid/Cytoscape/GraphML formats
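
The decision tree above can be expressed as a small lookup. This is only an illustrative sketch: the `Goal` enum and `plan_tool_calls` function are hypothetical names, while the tool names and arguments are the ones listed in this section.

```python
from enum import Enum


class Goal(Enum):
    SIMILAR = "similar"        # find articles on a similar topic
    FOLLOW_UP = "follow_up"    # see how later research developed
    FOUNDATION = "foundation"  # see what the paper builds on
    NETWORK = "network"        # map the full citation context


def plan_tool_calls(goal: Goal, pmid: str) -> list[tuple[str, dict]]:
    """Return (tool_name, arguments) pairs in the order the tree suggests."""
    if goal is Goal.SIMILAR:
        return [("find_related_articles", {"pmid": pmid})]
    if goal is Goal.FOLLOW_UP:
        return [("find_citing_articles", {"pmid": pmid})]
    if goal is Goal.FOUNDATION:
        return [("get_article_references", {"pmid": pmid})]
    # NETWORK: evaluate first, then build a two-level tree.
    return [
        ("suggest_citation_tree", {"pmid": pmid}),
        ("build_citation_tree", {"pmid": pmid, "depth": 2}),
    ]


for name, args in plan_tool_calls(Goal.NETWORK, "12345678"):
    print(name, args)
```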

Practical Examples

Scenario 1: Quick related paper search

# Found an important RCT on remimazolam, want to see similar studies
find_related_articles(pmid="33475315", limit=10)

Scenario 2: Track research impact

# What subsequent research did this 2020 paper influence?
find_citing_articles(pmid="33475315", limit=20)

Scenario 3: Understand research foundation

# What key literature did this article cite? Find foundation papers
get_article_references(pmid="33475315", limit=30)

Scenario 4: Build research context map (Literature Review)

# Step 1: Evaluate if tree building is worthwhile
suggest_citation_tree(pmid="33475315")

# Step 2: Build 2-level citation network, output Mermaid format (previewable in VS Code)
build_citation_tree(
    pmid="33475315",
    depth=2,
    direction="both",
    output_format="mermaid"
)

Citation Tree Output Formats

| Format | Use Case | Tool |
|---|---|---|
| mermaid | VS Code Markdown preview | Built-in Mermaid extension |
| cytoscape | Academic standard, bioinformatics | Cytoscape.js |
| g6 | Modern web visualization | AntV G6 |
| d3 | Flexible customization | D3.js force layout |
| vis | Rapid prototyping | vis-network |
| graphml | Desktop analysis software | Gephi, VOSviewer, yEd |
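
To give a feel for what a Mermaid export contains, here is a minimal sketch that turns citation edges into Mermaid flowchart text. The `to_mermaid` function and the edge representation are assumptions made for illustration; the server's actual output may be structured differently.

```python
def to_mermaid(edges: list[tuple[str, str]]) -> str:
    """Render (citing_pmid, cited_pmid) pairs as a Mermaid flowchart.

    Node ids get a "P" prefix so they never start with a digit.
    """
    lines = ["graph LR"]
    for citing, cited in edges:
        lines.append(
            f'    P{citing}["PMID {citing}"] --> P{cited}["PMID {cited}"]'
        )
    return "\n".join(lines)


# Placeholder PMIDs: two papers citing the same root article.
print(to_mermaid([("222", "111"), ("333", "111")]))
```

Pasting the printed text into a Mermaid code block in a Markdown file should render the graph in any Mermaid-aware previewer.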

🔍 Deep Search: Two Entry Modes

The server offers two entry points for deep search. Both run several searches in parallel and then merge and deduplicate the results:

┌──────────────────────────────────────────────────────────────────────────┐
│                      Deep Search Flowchart                               │
├──────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│   ┌───────────────────┐         ┌───────────────────┐                    │
│   │  Keyword Entry    │         │  PICO Clinical    │                    │
│   │  (Know what to    │         │  Question Entry   │                    │
│   │   search)         │         │  (Have clinical   │                    │
│   └─────────┬─────────┘         │   description)    │                    │
│             │                   └─────────┬─────────┘                    │
│             │                             │                              │
│             │                             ▼                              │
│             │                   ┌───────────────────┐                    │
│             │                   │   parse_pico()    │                    │
│             │                   │   Parse P/I/C/O   │                    │
│             │                   └─────────┬─────────┘                    │
│             │                             │                              │
│             ▼                             ▼                              │
│   ┌──────────────────────────────────────────────────────────────┐       │
│   │              generate_search_queries()                       │       │
│   │              (ESpell correction + MeSH expansion + synonyms) │       │
│   │                                                              │       │
│   │   Keyword mode: 1 call                                       │       │
│   │   PICO mode: 1 call per element (P/I/C/O) in parallel        │       │
│   └──────────────────────────┬───────────────────────────────────┘       │
│                              │                                           │
│                              ▼                                           │
│   ┌──────────────────────────────────────────────────────────────┐       │
│   │              Agent combines query strategies                 │       │
│   │                                                              │       │
│   │   • Use returned suggested_queries                           │       │
│   │   • Or combine mesh_terms + all_synonyms yourself            │       │
│   │   • PICO mode: Use Boolean logic (P) AND (I) AND (O)         │       │
│   └──────────────────────────┬───────────────────────────────────┘       │
│                              │                                           │
│                              ▼                                           │
│   ┌──────────────────────────────────────────────────────────────┐       │
│   │              search_literature() × N (parallel execution)    │       │
│   └──────────────────────────┬───────────────────────────────────┘       │
│                              │                                           │
│                              ▼                                           │
│   ┌──────────────────────────────────────────────────────────────┐       │
│   │              merge_search_results()                          │       │
│   │              Merge + dedupe + mark high-relevance articles   │       │
│   └──────────────────────────────────────────────────────────────┘       │
│                                                                          │
└──────────────────────────────────────────────────────────────────────────┘
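
The "× N (parallel execution)" step in the flowchart is a plain fan-out. A minimal sketch with the standard library, assuming a `search` callable that takes a query string and returns a list of PMIDs; `run_parallel` and `fake_search` are hypothetical stand-ins, not part of the package:

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_parallel(
    queries: list[str],
    search: Callable[[str], list[str]],
    max_workers: int = 4,
) -> list[list[str]]:
    """Run one search per query concurrently; results keep query order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(search, queries))


def fake_search(query: str) -> list[str]:
    """Stand-in for search_literature(); returns canned placeholder PMIDs."""
    canned = {"q_title": ["pmid1", "pmid2"], "q_mesh": ["pmid2", "pmid3"]}
    return canned.get(query, [])


results = run_parallel(["q_title", "q_mesh"], fake_search)
print(results)  # [['pmid1', 'pmid2'], ['pmid2', 'pmid3']]
```

In a real client the worker count would need to stay within the NCBI rate limits described elsewhere in this document.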

Entry 1️⃣: Keyword-Oriented

Use Case: Already know the keywords or topic to search

# Step 1: Get search materials (ESpell + MeSH + synonyms)
generate_search_queries(topic="remimazolam ICU sedation")

# Returns:
{
  "corrected_topic": "remimazolam icu sedation",   # Spelling corrected
  "mesh_terms": [
    {"input": "remimazolam", "preferred": "remimazolam [Supplementary Concept]", 
     "synonyms": ["CNS 7056", "ONO 2745"]},
    {"input": "sedation", "preferred": "Deep Sedation", 
     "synonyms": ["Sedation, Deep"]}
  ],
  "all_synonyms": ["CNS 7056", "ONO 2745", "Sedation, Deep", ...],
  "suggested_queries": [
    {"id": "q1_title", "query": "(remimazolam icu sedation)[Title]"},
    {"id": "q2_tiab", "query": "(remimazolam icu sedation)[Title/Abstract]"},
    {"id": "q4_mesh", "query": "\"remimazolam [Supplementary Concept]\"[MeSH Terms]"},
    {"id": "q6_syn", "query": "(CNS 7056)[Title/Abstract]"},
    ...
  ]
}

# Step 2: Execute searches in parallel
search_literature(query="(remimazolam icu sedation)[Title]")          # parallel
search_literature(query="(remimazolam icu sedation)[Title/Abstract]") # parallel
search_literature(query="\"Deep Sedation\"[MeSH Terms]")              # parallel
...

# Step 3: Merge results
merge_search_results(results_json='[["pmid1","pmid2"],["pmid2","pmid3"]]')
# → unique_pmids: Deduplicated PMID list
# → high_relevance_pmids: High-relevance articles hit by multiple strategies
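
The merge step can be pictured as counting how many strategies returned each PMID. The sketch below is an assumption about the logic, not the tool's actual implementation; `merge_pmids` and the `min_hits` threshold are hypothetical, and only the two output field names come from the comments above.

```python
from collections import Counter


def merge_pmids(result_sets: list[list[str]], min_hits: int = 2) -> dict:
    """Deduplicate PMIDs and flag those returned by several strategies."""
    hits: Counter = Counter()
    order: list[str] = []
    for pmids in result_sets:
        for pmid in dict.fromkeys(pmids):  # dedupe within one result set
            if pmid not in hits:
                order.append(pmid)  # remember first-seen order
            hits[pmid] += 1
    return {
        "unique_pmids": order,
        "high_relevance_pmids": [p for p in order if hits[p] >= min_hits],
    }


merged = merge_pmids([["pmid1", "pmid2"], ["pmid2", "pmid3"]])
print(merged["unique_pmids"])          # ['pmid1', 'pmid2', 'pmid3']
print(merged["high_relevance_pmids"])  # ['pmid2']
```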

Entry 2️⃣: PICO Clinical Question

Use Case: Have a clinical question that needs to be decomposed into structured search

# Step 1: Parse PICO structure
parse_pico(description="Is remimazolam better than propofol for ICU sedation? Does it reduce delirium?")

# Returns:
{
  "pico": {
    "P": "ICU patients requiring sedation",
    "I": "remimazolam",
    "C": "propofol", 
    "O": "delirium incidence"
  },
  "question_type": "therapy",  # Suggested Clinical Query filter
  "next_steps": "Call generate_search_queries() for each PICO element"
}

# Step 2: Get search materials for each PICO element (in parallel!)
generate_search_queries(topic="ICU patients")  # P → MeSH: "Intensive Care Units"
generate_search_queries(topic="remimazolam")   # I → MeSH: "remimazolam [Supplementary Concept]"
generate_search_queries(topic="propofol")      # C → MeSH: "Propofol"
generate_search_queries(topic="delirium")      # O → MeSH: "Delirium"

# Step 3: Agent combines queries (using Boolean logic)
# High precision: (P) AND (I) AND (C) AND (O)
query_precise = '("Intensive Care Units"[MeSH] OR ICU[tiab]) AND ' \
                '(remimazolam[tiab] OR "CNS 7056"[tiab]) AND ' \
                '(propofol[tiab] OR Diprivan[tiab]) AND ' \
                '(delirium[tiab] OR "Emergence Delirium"[MeSH])'

# High recall: (P) AND (I OR C) AND (O)
query_recall = '(ICU[tiab]) AND (remimazolam[tiab] OR propofol[tiab]) AND (delirium[tiab])'

# Step 4: Parallel search + merge
search_literature(query=query_precise)  # parallel
search_literature(query=query_recall)   # parallel
merge_search_results(...)
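
The two Boolean shapes in Step 3 differ only in how the I and C groups are joined. A small sketch of that choice, assuming each PICO element has already been expanded into a list of tagged terms; `combine_pico` and its `mode` argument are hypothetical names:

```python
def or_group(terms: list[str]) -> str:
    """Wrap alternative terms for one element in a parenthesized OR group."""
    return "(" + " OR ".join(terms) + ")"


def combine_pico(p, i, c, o, mode: str = "precise") -> str:
    """Combine PICO term lists into one PubMed query string.

    precise: (P) AND (I) AND (C) AND (O)
    recall:  (P) AND (I OR C) AND (O)
    """
    if mode == "precise":
        groups = [or_group(p), or_group(i), or_group(c), or_group(o)]
    elif mode == "recall":
        groups = [or_group(p), or_group(i + c), or_group(o)]
    else:
        raise ValueError(f"unknown mode: {mode}")
    return " AND ".join(groups)


print(combine_pico(
    ["ICU[tiab]"], ["remimazolam[tiab]"], ["propofol[tiab]"],
    ["delirium[tiab]"], mode="recall",
))
# (ICU[tiab]) AND (remimazolam[tiab] OR propofol[tiab]) AND (delirium[tiab])
```

The printed query is the same string as query_recall above.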

Two Entry Points Comparison

| Feature | Keyword-Oriented | PICO Clinical Question |
|---|---|---|
| Entry Tool | generate_search_queries(topic) | parse_pico(description) |
| Use Case | Know what keywords to search | Have clinical question to decompose |
| MeSH Expansion | 1 call | 4 calls (one each for P/I/C/O) |
| Query Combination | Use suggested_queries | Agent combines with Boolean |
| Example Input | "remimazolam ICU sedation" | "Is remimazolam better than propofol in ICU?" |

Design Philosophy: Tools provide materials (MeSH terms, synonyms), Agent makes decisions (how to combine queries)


🏗️ Architecture (DDD)

This project uses Domain-Driven Design (DDD) architecture, with literature research domain knowledge as the core model.

src/pubmed_search/
├── mcp/
│   └── tools/
│       ├── discovery.py     # Discovery (search, related, citing, details)
│       ├── strategy.py      # Strategy (generate_queries, expand)
│       ├── pico.py          # PICO parsing
│       ├── merge.py         # Result merging
│       ├── export.py        # Export tools
│       ├── citation_tree.py # Citation network visualization (6 formats)
│       ├── europe_pmc.py    # Europe PMC full text access
│       ├── core.py          # CORE open access search
│       └── ncbi_extended.py # Gene, PubChem, ClinVar
├── sources/                 # Multi-source search
│   ├── europe_pmc.py        # Europe PMC client (33M+ papers)
│   ├── core.py              # CORE client (200M+ papers)
│   ├── ncbi_extended.py     # Gene, PubChem, ClinVar
│   ├── semantic_scholar.py  # Semantic Scholar client
│   └── openalex.py          # OpenAlex client
├── entrez/                  # NCBI Entrez API wrapper
├── exports/                 # Export formats (RIS, BibTeX, CSV)
└── session.py               # Session management (internal mechanism)

Internal Mechanisms (Transparent to Agent)

| Mechanism | Description |
|---|---|
| Session | Auto-create, auto-switch |
| Cache | Auto-cache search results, avoid duplicate API calls |
| Rate Limit | Auto-comply with NCBI API limits (0.34 s between requests without an API key, 0.1 s with one) |
| MeSH Lookup | generate_search_queries() auto-queries NCBI MeSH database |
| ESpell | Auto spelling correction (remifentanyl → remifentanil) |
| Query Analysis | Each suggested query shows how PubMed actually interprets it |
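
The rate-limit row amounts to enforcing a minimum gap between requests: 0.34 s without an API key (about 3 requests per second) and 0.1 s with one (10 per second). A minimal, single-threaded sketch of that idea follows. The `MinIntervalLimiter` class is hypothetical and not the package's actual implementation; the clock and sleep functions are injectable so the behavior can be checked without waiting.

```python
import time


class MinIntervalLimiter:
    """Enforce a minimum interval between calls (not thread-safe)."""

    def __init__(self, has_api_key: bool, clock=time.monotonic, sleep=time.sleep):
        self.interval = 0.1 if has_api_key else 0.34
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time at which the previous call was allowed

    def wait(self) -> float:
        """Block until the interval has elapsed; return seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.interval - (now - self._last))
            if delay:
                self._sleep(delay)
        self._last = now + delay
        return delay


limiter = MinIntervalLimiter(has_api_key=False)
limiter.wait()  # first call passes immediately
limiter.wait()  # second call sleeps for up to 0.34 s
```

Call `wait()` immediately before each outbound request so that consecutive requests are spaced at least one interval apart.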

📖 Full architecture documentation: ARCHITECTURE.md

MeSH Auto-Expansion + Query Analysis

A call to generate_search_queries("remimazolam sedation") performs these steps internally:

  1. ESpell Correction - Fix spelling errors
  2. MeSH Query - Entrez.esearch(db="mesh") to get standard vocabulary
  3. Synonym Extraction - Get synonyms from MeSH Entry Terms
  4. Query Analysis - Analyze how PubMed interprets each query

Example response (abridged):

{
  "mesh_terms": [
    {
      "input": "remimazolam",
      "preferred": "remimazolam [Supplementary Concept]",
      "synonyms": ["CNS 7056", "ONO 2745"]
    }
  ],
  "all_synonyms": ["CNS 7056", "ONO 2745", ...],
  "suggested_queries": [
    {
      "id": "q1_title",
      "query": "(remimazolam sedation)[Title]",
      "purpose": "Exact title match - highest precision",
      "estimated_count": 8,
      "pubmed_translation": "\"remimazolam sedation\"[Title]"
    },
    {
      "id": "q3_and",
      "query": "(remimazolam AND sedation)",
      "purpose": "All keywords required",
      "estimated_count": 561,
      "pubmed_translation": "(\"remimazolam\"[Supplementary Concept] OR \"remimazolam\"[All Fields]) AND (\"sedate\"[All Fields] OR ...)"
    }
  ]
}

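Value of Query Analysis: An agent may assume that remimazolam AND sedation matches only those two words, but PubMed expands the query to the Supplementary Concept plus synonyms. In the example above the expanded query has an estimated 561 results, compared with 8 for the exact title query. The pubmed_translation field shows the agent the gap between what it intended and what PubMed actually searches.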
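
One simple way for an agent to act on this is to compare the field tags in the original query with those in pubmed_translation. The sketch below is an assumption about how such a check could look, not part of the package; `added_field_tags` is a hypothetical helper, and the inputs are the two example entries shown above.

```python
import re

TAG = re.compile(r"\[([^\]]+)\]")  # matches field tags such as [Title]


def added_field_tags(query: str, translation: str) -> set[str]:
    """Return field tags present in PubMed's translation but not in the query."""
    return set(TAG.findall(translation)) - set(TAG.findall(query))


title_query = "(remimazolam sedation)[Title]"
title_translation = '"remimazolam sedation"[Title]'
and_query = "(remimazolam AND sedation)"
and_translation = (
    '("remimazolam"[Supplementary Concept] OR "remimazolam"[All Fields]) '
    'AND ("sedate"[All Fields] OR ...)'
)

print(added_field_tags(title_query, title_translation))       # set()
print(sorted(added_field_tags(and_query, and_translation)))
# ['All Fields', 'Supplementary Concept']
```

An empty set suggests PubMed searched what was written; a non-empty set signals that the query was broadened.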


🔒 HTTPS Deployment

Enable HTTPS secure communication for production environments.

Quick Start

# Step 1: Generate SSL certificates
./scripts/generate-ssl-certs.sh

# Step 2: Start HTTPS service (Docker)
./scripts/start-https-docker.sh up

# Verify deployment
curl -k https://localhost/

HTTPS Endpoints

| Service | URL | Description |
|---|---|---|
| MCP SSE | https://localhost/sse | SSE connection (MCP) |
| Messages | https://localhost/messages | MCP POST |
| Health | https://localhost/health | Health check |

Claude Desktop Configuration

{
  "mcpServers": {
    "pubmed-search": {
      "url": "https://localhost/sse"
    }
  }
}

📖 Full documentation:


🔐 Security

Security Features

| Layer | Feature | Description |
|---|---|---|
| HTTPS | TLS 1.2/1.3 encryption | All traffic encrypted via Nginx |
| Rate Limiting | 30 req/s | Nginx level protection |
| Security Headers | XSS/CSRF protection | X-Frame-Options, X-Content-Type-Options |
| SSE Optimization | 24h timeout | Long-lived connections for real-time |
| No Database | Stateless | No SQL injection risk |
| No Secrets | In-memory only | No credentials stored |

📦 Installation

Basic Installation (Library Only)

pip install pubmed-search-mcp

With MCP Server Support

pip install "pubmed-search-mcp[mcp]"

From Source

git clone https://github.com/u9401066/pubmed-search-mcp.git
cd pubmed-search-mcp
pip install -e ".[all]"

As a Git Submodule

# Add as submodule to your project
git submodule add https://github.com/u9401066/pubmed-search-mcp.git src/pubmed_search

# Install dependencies
pip install biopython requests mcp

Then import in your code:

from src.pubmed_search import PubMedClient
# or add src to your Python path

📚 Usage

As a Python Library

from pubmed_search import PubMedClient

client = PubMedClient(email="your@email.com")

# Search for papers
results = client.search("anesthesia complications", limit=10)
for paper in results:
    print(f"{paper.pmid}: {paper.title}")

# Get related articles
related = client.find_related("12345678", limit=5)

# Get citing articles
citing = client.find_citing("12345678")

As an MCP Server (Local - stdio)

VS Code Configuration

Add to your .vscode/mcp.json:

{
  "servers": {
    "pubmed-search": {
      "type": "stdio",
      "command": "pubmed-search-mcp",
      "args": ["your@email.com"]
    }
  }
}

Or using Python module:

{
  "servers": {
    "pubmed-search": {
      "type": "stdio",
      "command": "python",
      "args": ["-m", "pubmed_search.mcp", "your@email.com"]
    }
  }
}

Running Standalone

# Using the console script
pubmed-search-mcp your@email.com

# Or using Python
python -m pubmed_search.mcp your@email.com

As a Remote MCP Server (HTTP/SSE)

For serving multiple machines, run the server in HTTP mode:

# Quick start
./start.sh

# Or with custom options
python run_server.py --transport sse --port 8765 --email your@email.com

# Using Docker
docker compose up -d

Remote Client Configuration

On other machines, configure .vscode/mcp.json:

{
  "servers": {
    "pubmed-search": {
      "type": "sse",
      "url": "http://YOUR_SERVER_IP:8765/sse"
    }
  }
}

See DEPLOYMENT.md for detailed deployment instructions.


📤 Export Formats

Export your search results in formats compatible with major reference managers:

| Format | Compatible With | Use Case |
|---|---|---|
| RIS | EndNote, Zotero, Mendeley | Universal import |
| BibTeX | LaTeX, Overleaf, JabRef | Academic writing |
| CSV | Excel, Google Sheets | Data analysis |
| MEDLINE | PubMed native format | Archiving |
| JSON | Programmatic access | Custom processing |

Exported Fields

  • Core: PMID, Title, Authors, Journal, Year, Volume, Issue, Pages
  • Identifiers: DOI, PMC ID, ISSN
  • Content: Abstract (HTML tags cleaned)
  • Metadata: Language, Publication Type, Keywords
  • Access: DOI URL, PMC URL, Full-text availability
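
For orientation, this is roughly what an RIS record built from the core fields looks like. The `to_ris` function and the record's dictionary keys are hypothetical, and the record itself is placeholder data; the package's own exporter may choose different tags or ordering.

```python
def to_ris(record: dict) -> str:
    """Serialize one article as an RIS journal record (illustrative only)."""
    lines = ["TY  - JOUR", f"TI  - {record['title']}"]
    lines += [f"AU  - {author}" for author in record.get("authors", [])]
    optional = [
        ("JO", "journal"), ("PY", "year"), ("VL", "volume"),
        ("IS", "issue"), ("DO", "doi"), ("AN", "pmid"),
    ]
    for tag, key in optional:
        if record.get(key):
            lines.append(f"{tag}  - {record[key]}")
    lines.append("ER  - ")  # end-of-record marker
    return "\n".join(lines)


# Placeholder record, not a real article.
print(to_ris({
    "title": "Example title",
    "authors": ["Doe J", "Roe A"],
    "journal": "Example Journal",
    "year": "2024",
    "pmid": "12345678",
}))
```

Each RIS line is a two-letter tag, two spaces, a hyphen, a space, and the value, which is the layout reference managers expect on import.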

Special Character Handling

  • BibTeX exports use pylatexenc for proper LaTeX encoding
  • Nordic characters (ø, æ, å), umlauts (ü, ö, ä), and accents are correctly converted
  • Example: Søren Hansen → S{\o}ren Hansen

📖 API Documentation

PubMedClient

The main client class for interacting with PubMed.

from pubmed_search import PubMedClient

client = PubMedClient(
    email="your@email.com",  # Required by NCBI
    api_key=None,            # Optional: NCBI API key for higher rate limits
    tool="pubmed-search"     # Tool name for NCBI tracking
)

Low-level Entrez API

For more control, use the low-level Entrez interface:

from pubmed_search.entrez import LiteratureSearcher

searcher = LiteratureSearcher(email="your@email.com")

# Advanced search with filters
results = searcher.search_advanced(
    term="propofol sedation",
    filter_humans=True,
    filter_english=True,
    date_range=("2020", "2024"),
    max_results=50
)
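
How search_advanced turns its filter arguments into a PubMed query is not shown in this document. As a hedged sketch, the same filters can be written by hand in PubMed syntax and passed to a plain search; `build_filtered_term` is a hypothetical helper, and the field tags it emits are standard PubMed ones.

```python
from typing import Optional


def build_filtered_term(
    term: str,
    humans: bool = False,
    english: bool = False,
    date_range: Optional[tuple[str, str]] = None,
) -> str:
    """Append common PubMed filters to a search term."""
    parts = [f"({term})"]
    if humans:
        parts.append("humans[MeSH Terms]")
    if english:
        parts.append("english[Language]")
    if date_range:
        start, end = date_range
        parts.append(f"{start}:{end}[dp]")
    return " AND ".join(parts)


print(build_filtered_term(
    "propofol sedation", humans=True, english=True, date_range=("2020", "2024")
))
# (propofol sedation) AND humans[MeSH Terms] AND english[Language] AND 2020:2024[dp]
```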

📄 License

Apache License 2.0 - see LICENSE

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Run tests: pytest
  5. Submit a pull request

Project Structure

pubmed-search-mcp/
├── src/pubmed_search/          # Core library
│   ├── mcp/                    # MCP server and tools
│   │   ├── tools/              # 35+ MCP tools
│   │   └── prompts.py          # MCP prompt templates
│   ├── sources/                # Multi-source clients
│   └── exports/                # Export formatters
├── .claude/skills/             # 🆕 Claude Skill files
│   ├── pubmed-quick-search/
│   ├── pubmed-systematic-search/
│   ├── pubmed-pico-search/
│   ├── pubmed-paper-exploration/
│   ├── pubmed-gene-drug-research/
│   ├── pubmed-fulltext-access/
│   ├── pubmed-export-citations/
│   ├── pubmed-multi-source-search/
│   └── pubmed-mcp-tools-reference/
├── .github/
│   └── copilot-instructions.md # 🆕 VS Code Copilot guide
├── README.md                   # English documentation
└── README.zh-TW.md             # Traditional Chinese documentation

🔗 Links

Download files

Source Distribution

pubmed_search_mcp-0.1.24.tar.gz (171.5 kB view details)

Uploaded Source

Built Distribution

pubmed_search_mcp-0.1.24-py3-none-any.whl (196.6 kB view details)

Uploaded Python 3

File details

Details for the file pubmed_search_mcp-0.1.24.tar.gz.

File metadata

  • Download URL: pubmed_search_mcp-0.1.24.tar.gz
  • Upload date:
  • Size: 171.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for pubmed_search_mcp-0.1.24.tar.gz
| Algorithm | Hash digest |
|---|---|
| SHA256 | 679430c4f35fd13095abef70f772b672f1eb7b28ae62f517ec4abc65600776f3 |
| MD5 | fecc67664f9add882976b51154e133b7 |
| BLAKE2b-256 | fdd1d69ff2aa2069d897cc20d9a633f77992ee0f302acb67940e149a1e1e6812 |

Provenance

The following attestation bundles were made for pubmed_search_mcp-0.1.24.tar.gz:

Publisher: publish.yml on u9401066/pubmed-search-mcp

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file pubmed_search_mcp-0.1.24-py3-none-any.whl.

File hashes

Hashes for pubmed_search_mcp-0.1.24-py3-none-any.whl
| Algorithm | Hash digest |
|---|---|
| SHA256 | 994c7a5cc13915fadc214dddc543c9f7866df77f6a98e88f015d27f245bb79b0 |
| MD5 | 6808f31a84b36ece6408af4f59035b05 |
| BLAKE2b-256 | f2d367f9ae4f18737d14c93bd437adc33a779996e36b52c429359c4eea7711d7 |

Provenance

The following attestation bundles were made for pubmed_search_mcp-0.1.24-py3-none-any.whl:

Publisher: publish.yml on u9401066/pubmed-search-mcp
