MCP server for PubMed literature search with MeSH, PICO, and intelligent query expansion

PubMed Search MCP

PyPI version | Python 3.10+ | License: Apache 2.0 | MCP | Smithery | Test Coverage

Professional Literature Research Assistant for AI Agents - More than just an API wrapper

A Domain-Driven Design (DDD) based MCP server that serves as an intelligent research assistant for AI agents, providing task-oriented literature search and analysis capabilities.

🌐 Language: English | 繁體中文


🚀 Quick Install

Via Smithery (Recommended for Claude Desktop)

npx -y @smithery/cli install pubmed-search-mcp --client claude

Via pip

pip install pubmed-search-mcp

Via uv

uv add pubmed-search-mcp

Via uvx (Zero Install)

uvx pubmed-search-mcp

⚙️ Configuration

Claude Desktop (claude_desktop_config.json)

{
  "mcpServers": {
    "pubmed-search": {
      "command": "uvx",
      "args": ["pubmed-search-mcp"],
      "env": {
        "NCBI_EMAIL": "your@email.com"
      }
    }
  }
}

VS Code / Cursor (.vscode/mcp.json)

{
  "servers": {
    "pubmed-search": {
      "type": "stdio",
      "command": "uvx",
      "args": ["pubmed-search-mcp"],
      "env": {
        "NCBI_EMAIL": "your@email.com"
      }
    }
  }
}

Note: NCBI_EMAIL is required by NCBI API policy. Optionally set NCBI_API_KEY for higher rate limits.


🎯 Design Philosophy

  • Agent-First - Designed for AI Agents, output optimized for machine decision-making
  • Task-Oriented - Tools organized by research tasks, not low-level APIs
  • DDD Architecture - Core modeling based on literature research domain knowledge
  • Context-Aware - Maintains research state through Session

Positioning: PubMed-specialized AI research assistant

  • ✅ MeSH vocabulary integration - Not available from other sources
  • ✅ PICO structured queries - Medical specialty
  • ✅ ESpell spelling correction - Auto-correction
  • ✅ Batch parallel search - High efficiency

Features

  • Search PubMed: Full-text and advanced query support
  • Related Articles: Find papers related to a given PMID
  • Citing Articles: Find papers that cite a given PMID
  • Parallel Search: Generate multiple queries for comprehensive searches
  • PDF Access: Get open-access PDF URLs from PubMed Central
  • Export Formats: RIS, BibTeX, CSV, MEDLINE, JSON (EndNote/Zotero/Mendeley compatible)
  • MCP Integration: Use with VS Code + GitHub Copilot or any MCP client
  • Remote Server: Deploy as HTTP service for multi-machine access
  • Submodule Ready: Use as a Git submodule in larger projects

🛠️ MCP Tools (14 Tools)

Discovery Tools

| Tool | Description | Direction |
| --- | --- | --- |
| search_literature | Search PubMed literature | - |
| find_related_articles | Find similar articles (PubMed algorithm) | Similarity |
| find_citing_articles | Find papers citing this article (follow-up research) | Forward ➡️ |
| get_article_references | Get this article's references (research foundation) | Backward ⬅️ |
| fetch_article_details | Get full article information | - |
| get_citation_metrics | Get citation metrics (iCite RCR/Percentile) | - |
| build_citation_tree | Build citation network tree (6 formats) | Both ↔️ |
| suggest_citation_tree | Evaluate if building citation tree is worthwhile | - |

Parallel Search Tools

| Tool | Description |
| --- | --- |
| parse_pico | Parse PICO clinical questions (search entry point) |
| generate_search_queries | Generate multiple search strategies (ESpell + MeSH) |
| merge_search_results | Merge and deduplicate search results |
| expand_search_queries | Expand search strategies |

Export Tools

| Tool | Description |
| --- | --- |
| prepare_export | Export citation formats (RIS/BibTeX/CSV/MEDLINE/JSON) |
| get_article_fulltext_links | Get full-text links (PMC/DOI) |
| analyze_fulltext_access | Analyze open access availability |

Design Principle: Focus on search. Session/Cache/Reading List are all internal mechanisms that operate automatically - Agents don't need to manage them.


📋 Agent Usage Workflow

Simple Search

search_literature(query="remimazolam ICU sedation", limit=10)

Using PubMed Official Syntax

# MeSH standard vocabulary
search_literature(query='"Diabetes Mellitus"[MeSH]')

# Field-specific search
search_literature(query='(BRAF[Gene Name]) AND (melanoma[Title/Abstract])')

# Date range
search_literature(query='COVID-19[Title] AND 2024[dp]')

# Publication type
search_literature(query='propofol sedation AND Review[pt]')

# Combined search
search_literature(query='("Intensive Care Units"[MeSH]) AND (remimazolam[tiab] OR "CNS 7056"[tiab])')

PubMed Official Field Tags

| Tag | Description | Example |
| --- | --- | --- |
| [Title] or [ti] | Title | COVID-19[ti] |
| [Title/Abstract] or [tiab] | Title + Abstract | sedation[tiab] |
| [MeSH] or [mh] | MeSH standard vocabulary | "Diabetes Mellitus"[MeSH] |
| [MeSH Major Topic] or [majr] | MeSH major topic | "Anesthesia"[majr] |
| [Author] or [au] | Author | Smith J[au] |
| [Journal] or [ta] | Journal abbreviation | Nature[ta] |
| [Publication Type] or [pt] | Publication type | Review[pt], Clinical Trial[pt] |
| [Date - Publication] or [dp] | Publication date | 2024[dp], 2020:2024[dp] |
| [Gene Name] | Gene name | BRAF[Gene Name] |
| [Substance Name] | Substance name | propofol[Substance Name] |

Full syntax reference: PubMed Search Field Tags
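
Tagged queries like the ones above can also be assembled programmatically. The following is a minimal sketch only: `tag_term`, `date_range`, and `all_of` are hypothetical helper names that are not part of pubmed-search-mcp, and the rule of quoting every multi-word term is an assumption made for illustration.

```python
def tag_term(term: str, field: str) -> str:
    """Attach a PubMed field tag, quoting multi-word terms as phrases."""
    term = term.strip()
    if " " in term and not term.startswith('"'):
        term = f'"{term}"'
    return f"{term}[{field}]"


def date_range(start: int, end: int) -> str:
    """Publication-date range using the [dp] tag, e.g. 2020:2024[dp]."""
    return f"{start}:{end}[dp]"


def all_of(*clauses: str) -> str:
    """Join clauses with AND, wrapping each clause in parentheses."""
    return " AND ".join(f"({clause})" for clause in clauses)


query = all_of(
    tag_term("Intensive Care Units", "MeSH"),
    tag_term("remimazolam", "tiab"),
    date_range(2020, 2024),
)
print(query)
```

The resulting string can be passed as the `query` argument of search_literature.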

Deep Exploration (After finding important papers)

find_related_articles(pmid="12345678")   # Related articles (PubMed algorithm)
find_citing_articles(pmid="12345678")    # Papers citing this one (forward in time)
get_article_references(pmid="12345678")  # This paper's references (backward in time)

🔬 Citation Discovery Guide

After finding an important paper, there are 5 tools to explore related literature. Choosing the right tool can greatly improve research efficiency:

Tool Comparison

| Tool | Direction | Data Source | Use Case | API Calls |
| --- | --- | --- | --- | --- |
| find_related_articles | Similarity | PubMed algorithm | Find topic/method similar articles | 1 |
| find_citing_articles | Forward ➡️ | PMC citations | Find follow-up research | 1 |
| get_article_references | Backward ⬅️ | PMC references | Find foundational papers | 1 |
| build_citation_tree | Both ↔️ | PMC (BFS traversal) | Build complete citation network | Multiple |
| suggest_citation_tree | - | Article info | Evaluate if tree building is worthwhile | 1 |

Usage Decision Tree

Found an important paper (PMID: 12345678)
    │
    ├── Want to find "similar topic" articles?
    │   └── ✅ find_related_articles(pmid="12345678")
    │       → PubMed algorithm finds similar articles by MeSH, keywords, citation patterns
    │
    ├── Want to know "how subsequent research developed"?
    │   └── ✅ find_citing_articles(pmid="12345678")
    │       → Find all papers citing this one (timeline: forward → now)
    │
    ├── Want to understand "what this article is based on"?
    │   └── ✅ get_article_references(pmid="12345678")
    │       → Get this article's reference list (timeline: backward ← past)
    │
    └── Want to build "complete research context network"?
        │
        ├── First evaluate: suggest_citation_tree(pmid="12345678")
        │   → Check citation count to decide if tree building is worthwhile
        │
        └── Build network: build_citation_tree(pmid="12345678", depth=2)
            → Output Mermaid/Cytoscape/GraphML formats
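
The same decision logic can be written as a small lookup. This is an illustrative sketch, not code from the project: the goal labels ("similar", "follow_up", "foundation", "network") and the function `plan_citation_calls` are hypothetical; only the tool names come from this README.

```python
# Hypothetical goal labels mapped to the tool names listed in this README.
TOOL_FOR_GOAL = {
    "similar": "find_related_articles",
    "follow_up": "find_citing_articles",
    "foundation": "get_article_references",
    "network": "build_citation_tree",
}


def plan_citation_calls(goal: str, pmid: str) -> list[str]:
    """Return the tool calls, in order, that the decision tree suggests."""
    tool = TOOL_FOR_GOAL[goal]  # raises KeyError for an unknown goal
    if goal == "network":
        # The tree recommends evaluating first, then building.
        return [
            f'suggest_citation_tree(pmid="{pmid}")',
            f'{tool}(pmid="{pmid}", depth=2)',
        ]
    return [f'{tool}(pmid="{pmid}")']


for call in plan_citation_calls("network", "12345678"):
    print(call)
```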

Practical Examples

Scenario 1: Quick related paper search

# Found an important RCT on remimazolam, want to see similar studies
find_related_articles(pmid="33475315", limit=10)

Scenario 2: Track research impact

# What subsequent research did this 2020 paper influence?
find_citing_articles(pmid="33475315", limit=20)

Scenario 3: Understand research foundation

# What key literature did this article cite? Find foundation papers
get_article_references(pmid="33475315", limit=30)

Scenario 4: Build research context map (Literature Review)

# Step 1: Evaluate if tree building is worthwhile
suggest_citation_tree(pmid="33475315")

# Step 2: Build 2-level citation network, output Mermaid format (previewable in VS Code)
build_citation_tree(
    pmid="33475315",
    depth=2,
    direction="both",
    output_format="mermaid"
)

Citation Tree Output Formats

| Format | Use Case | Tool |
| --- | --- | --- |
| mermaid | VS Code Markdown preview | Built-in Mermaid extension |
| cytoscape | Academic standard, bioinformatics | Cytoscape.js |
| g6 | Modern web visualization | AntV G6 |
| d3 | Flexible customization | D3.js force layout |
| vis | Rapid prototyping | vis-network |
| graphml | Desktop analysis software | Gephi, VOSviewer, yEd |
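
To make the idea of a depth-limited BFS traversal rendered as Mermaid concrete, here is a minimal sketch. It assumes an in-memory dictionary of references with made-up PMIDs; `bfs_edges` and `to_mermaid` are hypothetical names, and the real build_citation_tree may work differently.

```python
from collections import deque


def bfs_edges(graph: dict[str, list[str]], root: str, depth: int) -> list[tuple[str, str]]:
    """Collect (source, target) edges reachable from root within `depth` hops."""
    seen = {root}
    edges = []
    queue = deque([(root, 0)])
    while queue:
        node, level = queue.popleft()
        if level == depth:
            continue  # do not expand nodes that sit at the depth limit
        for neighbor in graph.get(node, []):
            edges.append((node, neighbor))
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, level + 1))
    return edges


def to_mermaid(edges: list[tuple[str, str]]) -> str:
    """Render edges as a top-down Mermaid flowchart."""
    lines = ["graph TD"]
    for source, target in edges:
        lines.append(f"    PMID{source} --> PMID{target}")
    return "\n".join(lines)


# Toy data: each key cites the PMIDs in its list (made-up identifiers).
references = {"1": ["2", "3"], "2": ["4"], "3": ["4"], "4": ["5"]}
diagram = to_mermaid(bfs_edges(references, "1", depth=2))
print(diagram)
```

With depth=2 the edge from PMID 4 to PMID 5 is left out, because PMID 4 already sits two hops from the root.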

🔍 Deep Search: Two Entry Modes

This tool provides two deep-search entry points. Both end the same way: run several searches in parallel, then merge and deduplicate the results:

┌──────────────────────────────────────────────────────────────────────────┐
│                          Deep Search Flowchart                           │
├──────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│   ┌───────────────────┐         ┌───────────────────┐                    │
│   │  Keyword Entry    │         │  PICO Clinical    │                    │
│   │  (Know what to    │         │  Question Entry   │                    │
│   │   search)         │         │  (Have clinical   │                    │
│   └─────────┬─────────┘         │   description)    │                    │
│             │                   └─────────┬─────────┘                    │
│             │                             │                              │
│             │                             ▼                              │
│             │                   ┌───────────────────┐                    │
│             │                   │   parse_pico()    │                    │
│             │                   │   Parse P/I/C/O   │                    │
│             │                   └─────────┬─────────┘                    │
│             │                             │                              │
│             ▼                             ▼                              │
│   ┌──────────────────────────────────────────────────────────────┐       │
│   │              generate_search_queries()                       │       │
│   │              (ESpell correction + MeSH expansion + synonyms) │       │
│   │                                                              │       │
│   │   Keyword mode: 1 call                                       │       │
│   │   PICO mode: 1 call per element (P/I/C/O) in parallel        │       │
│   └──────────────────────────┬───────────────────────────────────┘       │
│                              │                                           │
│                              ▼                                           │
│   ┌──────────────────────────────────────────────────────────────┐       │
│   │              Agent combines query strategies                 │       │
│   │                                                              │       │
│   │   • Use returned suggested_queries                           │       │
│   │   • Or combine mesh_terms + all_synonyms yourself            │       │
│   │   • PICO mode: Use Boolean logic (P) AND (I) AND (O)         │       │
│   └──────────────────────────┬───────────────────────────────────┘       │
│                              │                                           │
│                              ▼                                           │
│   ┌──────────────────────────────────────────────────────────────┐       │
│   │              search_literature() × N (parallel execution)    │       │
│   └──────────────────────────┬───────────────────────────────────┘       │
│                              │                                           │
│                              ▼                                           │
│   ┌──────────────────────────────────────────────────────────────┐       │
│   │              merge_search_results()                          │       │
│   │              Merge + dedupe + mark high-relevance articles   │       │
│   └──────────────────────────────────────────────────────────────┘       │
│                                                                          │
└──────────────────────────────────────────────────────────────────────────┘
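
The "search_literature() × N" stage of the flowchart is a fan-out over several queries. In an MCP setting the agent issues the parallel tool calls itself; the sketch below only illustrates the pattern in plain Python with a thread pool. `run_parallel` and `fake_search` are hypothetical, and `fake_search` is a canned stand-in for search_literature so the example runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_parallel(
    queries: list[str],
    search: Callable[[str], list[str]],
    max_workers: int = 4,
) -> list[list[str]]:
    """Run `search` once per query on a thread pool; results keep query order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(search, queries))


# Stand-in for search_literature(): a canned lookup instead of a network call.
FAKE_INDEX = {
    "(remimazolam icu sedation)[Title]": ["pmid1", "pmid2"],
    "(remimazolam icu sedation)[Title/Abstract]": ["pmid2", "pmid3"],
}


def fake_search(query: str) -> list[str]:
    return FAKE_INDEX.get(query, [])


results = run_parallel(list(FAKE_INDEX), fake_search)
print(results)
```

Each inner list holds the PMIDs for one query, which is the shape merge_search_results expects in its results_json argument once serialized.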

Entry 1️⃣: Keyword-Oriented

Use Case: Already know the keywords or topic to search

# Step 1: Get search materials (ESpell + MeSH + synonyms)
generate_search_queries(topic="remimazolam ICU sedation")

# Returns:
{
  "corrected_topic": "remimazolam icu sedation",   # Spelling corrected
  "mesh_terms": [
    {"input": "remimazolam", "preferred": "remimazolam [Supplementary Concept]", 
     "synonyms": ["CNS 7056", "ONO 2745"]},
    {"input": "sedation", "preferred": "Deep Sedation", 
     "synonyms": ["Sedation, Deep"]}
  ],
  "all_synonyms": ["CNS 7056", "ONO 2745", "Sedation, Deep", ...],
  "suggested_queries": [
    {"id": "q1_title", "query": "(remimazolam icu sedation)[Title]"},
    {"id": "q2_tiab", "query": "(remimazolam icu sedation)[Title/Abstract]"},
    {"id": "q4_mesh", "query": "\"remimazolam [Supplementary Concept]\"[MeSH Terms]"},
    {"id": "q6_syn", "query": "(CNS 7056)[Title/Abstract]"},
    ...
  ]
}

# Step 2: Execute searches in parallel
search_literature(query="(remimazolam icu sedation)[Title]")          # parallel
search_literature(query="(remimazolam icu sedation)[Title/Abstract]") # parallel
search_literature(query="\"Deep Sedation\"[MeSH Terms]")              # parallel
...

# Step 3: Merge results
merge_search_results(results_json='[["pmid1","pmid2"],["pmid2","pmid3"]]')
# → unique_pmids: Deduplicated PMID list
# → high_relevance_pmids: High-relevance articles hit by multiple strategies
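
The merge step can be pictured as follows. This is a sketch under stated assumptions rather than the project's implementation: `merge_pmid_lists` is a hypothetical name, and treating "hit by multiple strategies" as "appears in at least two result lists" is an assumption. The output keys mirror the field names mentioned above.

```python
import json
from collections import Counter


def merge_pmid_lists(results_json: str) -> dict[str, list[str]]:
    """Deduplicate PMIDs across result lists and flag those found more than once."""
    result_lists = json.loads(results_json)
    hits = Counter()
    unique = []
    for pmids in result_lists:
        for pmid in dict.fromkeys(pmids):  # ignore repeats inside one list
            if pmid not in hits:
                unique.append(pmid)  # keep first-seen order
            hits[pmid] += 1
    # Assumption: "high relevance" means returned by at least two strategies.
    high = [pmid for pmid in unique if hits[pmid] >= 2]
    return {"unique_pmids": unique, "high_relevance_pmids": high}


merged = merge_pmid_lists('[["pmid1","pmid2"],["pmid2","pmid3"]]')
print(merged)
```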

Entry 2️⃣: PICO Clinical Question

Use Case: Have a clinical question that needs to be decomposed into structured search

# Step 1: Parse PICO structure
parse_pico(description="Is remimazolam better than propofol for ICU sedation? Does it reduce delirium?")

# Returns:
{
  "pico": {
    "P": "ICU patients requiring sedation",
    "I": "remimazolam",
    "C": "propofol", 
    "O": "delirium incidence"
  },
  "question_type": "therapy",  # Suggested Clinical Query filter
  "next_steps": "Call generate_search_queries() for each PICO element"
}

# Step 2: Get search materials for each PICO element (in parallel!)
generate_search_queries(topic="ICU patients")  # P → MeSH: "Intensive Care Units"
generate_search_queries(topic="remimazolam")   # I → MeSH: "remimazolam [Supplementary Concept]"
generate_search_queries(topic="propofol")      # C → MeSH: "Propofol"
generate_search_queries(topic="delirium")      # O → MeSH: "Delirium"

# Step 3: Agent combines queries (using Boolean logic)
# High precision: (P) AND (I) AND (C) AND (O)
query_precise = '("Intensive Care Units"[MeSH] OR ICU[tiab]) AND ' \
                '(remimazolam[tiab] OR "CNS 7056"[tiab]) AND ' \
                '(propofol[tiab] OR Diprivan[tiab]) AND ' \
                '(delirium[tiab] OR "Emergence Delirium"[MeSH])'

# High recall: (P) AND (I OR C) AND (O)
query_recall = '(ICU[tiab]) AND (remimazolam[tiab] OR propofol[tiab]) AND (delirium[tiab])'

# Step 4: Parallel search + merge
search_literature(query=query_precise)  # parallel
search_literature(query=query_recall)   # parallel
merge_search_results(...)
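
The two Boolean patterns in Step 3 can be generated from per-element term lists. The sketch below is illustrative only; `or_block` and `combine_pico` are hypothetical helpers, and the term lists would normally come from the generate_search_queries output for each element.

```python
def or_block(terms: list[str]) -> str:
    """Join alternative terms for one PICO element with OR."""
    return "(" + " OR ".join(terms) + ")"


def combine_pico(blocks: dict[str, list[str]], mode: str = "precise") -> str:
    """Combine PICO term lists into one PubMed query string.

    precise: (P) AND (I) AND (C) AND (O)
    recall:  (P) AND (I OR C) AND (O)
    """
    p, i, c, o = (or_block(blocks[key]) for key in "PICO")
    if mode == "precise":
        parts = [p, i, c, o]
    elif mode == "recall":
        parts = [p, or_block(blocks["I"] + blocks["C"]), o]
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return " AND ".join(parts)


blocks = {
    "P": ["ICU[tiab]"],
    "I": ["remimazolam[tiab]"],
    "C": ["propofol[tiab]"],
    "O": ["delirium[tiab]"],
}
print(combine_pico(blocks, mode="recall"))
```

With these inputs the recall mode reproduces the query_recall string shown in Step 3.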

Two Entry Points Comparison

| Feature | Keyword-Oriented | PICO Clinical Question |
| --- | --- | --- |
| Entry Tool | generate_search_queries(topic) | parse_pico(description) |
| Use Case | Know what keywords to search | Have clinical question to decompose |
| MeSH Expansion | 1 call | 4 calls (one each for P, I, C, O) |
| Query Combination | Use suggested_queries | Agent combines with Boolean |
| Example Input | "remimazolam ICU sedation" | "Is remimazolam better than propofol in ICU?" |

Design Philosophy: Tools provide the materials (MeSH terms, synonyms); the Agent makes the decisions (how to combine queries).


🏗️ Architecture (DDD)

This project uses Domain-Driven Design (DDD) architecture, with literature research domain knowledge as the core model.

src/pubmed_search/
├── mcp/
│   └── tools/
│       ├── discovery.py    # Discovery (search, related, citing, details)
│       ├── strategy.py     # Strategy (generate_queries, expand)
│       ├── pico.py         # PICO parsing
│       ├── merge.py        # Result merging
│       ├── export.py       # Export tools
│       └── citation_tree.py # Citation network visualization (6 formats)
├── entrez/                 # NCBI Entrez API wrapper
├── exports/                # Export formats (RIS, BibTeX, CSV)
└── session.py              # Session management (internal mechanism)

Internal Mechanisms (Transparent to Agent)

| Mechanism | Description |
| --- | --- |
| Session | Auto-create, auto-switch |
| Cache | Auto-cache search results, avoid duplicate API calls |
| Rate Limit | Auto-comply with NCBI API limits (0.34 s between requests without an API key, 0.1 s with one) |
| MeSH Lookup | generate_search_queries() auto-queries NCBI MeSH database |
| ESpell | Auto spelling correction (remifentanyl → remifentanil) |
| Query Analysis | Each suggested query shows how PubMed actually interprets it |
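
The rate-limit intervals line up with NCBI's published limits of 3 requests per second without an API key and 10 per second with one. A minimum-interval limiter of that kind could look like the sketch below. `MinIntervalLimiter` is a hypothetical class, not the project's code, and the clock and sleep functions are injectable only so the example can run without real waiting.

```python
import time


class MinIntervalLimiter:
    """Enforce a minimum gap between consecutive requests."""

    def __init__(self, has_api_key: bool, clock=time.monotonic, sleep=time.sleep):
        # Assumed mapping: 0.1 s with an API key, 0.34 s without.
        self.interval = 0.1 if has_api_key else 0.34
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self) -> float:
        """Block until the interval has elapsed; return the delay applied."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.interval - (now - self._last))
            if delay:
                self._sleep(delay)
        self._last = now + delay
        return delay


# Demonstration with a frozen clock and a recording sleep function.
slept = []
limiter = MinIntervalLimiter(has_api_key=False, clock=lambda: 100.0, sleep=slept.append)
limiter.wait()  # first request goes out immediately
limiter.wait()  # second request is delayed by the full interval
print(slept)
```

In real use the defaults (time.monotonic and time.sleep) apply, and wait() is called before each request.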

📖 Full architecture documentation: ARCHITECTURE.md

MeSH Auto-Expansion + Query Analysis

A call to generate_search_queries("remimazolam sedation") performs four steps internally:

  1. ESpell Correction - Fix spelling errors
  2. MeSH Query - Entrez.esearch(db="mesh") to get standard vocabulary
  3. Synonym Extraction - Get synonyms from MeSH Entry Terms
  4. Query Analysis - Analyze how PubMed interprets each query

Returns:

{
  "mesh_terms": [
    {
      "input": "remimazolam",
      "preferred": "remimazolam [Supplementary Concept]",
      "synonyms": ["CNS 7056", "ONO 2745"]
    }
  ],
  "all_synonyms": ["CNS 7056", "ONO 2745", ...],
  "suggested_queries": [
    {
      "id": "q1_title",
      "query": "(remimazolam sedation)[Title]",
      "purpose": "Exact title match - highest precision",
      "estimated_count": 8,
      "pubmed_translation": "\"remimazolam sedation\"[Title]"
    },
    {
      "id": "q3_and",
      "query": "(remimazolam AND sedation)",
      "purpose": "All keywords required",
      "estimated_count": 561,
      "pubmed_translation": "(\"remimazolam\"[Supplementary Concept] OR \"remimazolam\"[All Fields]) AND (\"sedate\"[All Fields] OR ...)"
    }
  ]
}

Value of Query Analysis: An agent may assume that remimazolam AND sedation searches only for those two words, but PubMed actually expands the query to the Supplementary Concept plus synonyms, and the estimated count goes from 8 (exact title match) to 561. Seeing the translation helps the agent understand the gap between its intent and the search PubMed actually runs.


🔒 HTTPS Deployment

Enable HTTPS secure communication for production environments.

Quick Start

# Step 1: Generate SSL certificates
./scripts/generate-ssl-certs.sh

# Step 2: Start HTTPS service (Docker)
./scripts/start-https-docker.sh up

# Verify deployment
curl -k https://localhost/

HTTPS Endpoints

| Service | URL | Description |
| --- | --- | --- |
| MCP SSE | https://localhost/sse | SSE connection (MCP) |
| Messages | https://localhost/messages | MCP POST |
| Health | https://localhost/health | Health check |

Claude Desktop Configuration

{
  "mcpServers": {
    "pubmed-search": {
      "url": "https://localhost/sse"
    }
  }
}

📖 Full documentation:


🔐 Security

Security Features

| Layer | Feature | Description |
| --- | --- | --- |
| HTTPS | TLS 1.2/1.3 encryption | All traffic encrypted via Nginx |
| Rate Limiting | 30 req/s | Nginx level protection |
| Security Headers | XSS/CSRF protection | X-Frame-Options, X-Content-Type-Options |
| SSE Optimization | 24h timeout | Long-lived connections for real-time |
| No Database | Stateless | No SQL injection risk |
| No Secrets | In-memory only | No credentials stored |

📦 Installation

Basic Installation (Library Only)

pip install pubmed-search-mcp

With MCP Server Support

pip install "pubmed-search-mcp[mcp]"

From Source

git clone https://github.com/u9401066/pubmed-search-mcp.git
cd pubmed-search-mcp
pip install -e ".[all]"

As a Git Submodule

# Add as submodule to your project
git submodule add https://github.com/u9401066/pubmed-search-mcp.git src/pubmed_search

# Install dependencies
pip install biopython requests mcp

Then import in your code:

from src.pubmed_search import PubMedClient
# or add src to your Python path

📚 Usage

As a Python Library

from pubmed_search import PubMedClient

client = PubMedClient(email="your@email.com")

# Search for papers
results = client.search("anesthesia complications", limit=10)
for paper in results:
    print(f"{paper.pmid}: {paper.title}")

# Get related articles
related = client.find_related("12345678", limit=5)

# Get citing articles
citing = client.find_citing("12345678")

As an MCP Server (Local - stdio)

VS Code Configuration

Add to your .vscode/mcp.json:

{
  "servers": {
    "pubmed-search": {
      "type": "stdio",
      "command": "pubmed-search-mcp",
      "args": ["your@email.com"]
    }
  }
}

Or using Python module:

{
  "servers": {
    "pubmed-search": {
      "type": "stdio",
      "command": "python",
      "args": ["-m", "pubmed_search.mcp", "your@email.com"]
    }
  }
}

Running Standalone

# Using the console script
pubmed-search-mcp your@email.com

# Or using Python
python -m pubmed_search.mcp your@email.com

As a Remote MCP Server (HTTP/SSE)

For serving multiple machines, run the server in HTTP mode:

# Quick start
./start.sh

# Or with custom options
python run_server.py --transport sse --port 8765 --email your@email.com

# Using Docker
docker compose up -d

Remote Client Configuration

On other machines, configure .vscode/mcp.json:

{
  "servers": {
    "pubmed-search": {
      "type": "sse",
      "url": "http://YOUR_SERVER_IP:8765/sse"
    }
  }
}

See DEPLOYMENT.md for detailed deployment instructions.


📤 Export Formats

Export your search results in formats compatible with major reference managers:

| Format | Compatible With | Use Case |
| --- | --- | --- |
| RIS | EndNote, Zotero, Mendeley | Universal import |
| BibTeX | LaTeX, Overleaf, JabRef | Academic writing |
| CSV | Excel, Google Sheets | Data analysis |
| MEDLINE | PubMed native format | Archiving |
| JSON | Programmatic access | Custom processing |

Exported Fields

  • Core: PMID, Title, Authors, Journal, Year, Volume, Issue, Pages
  • Identifiers: DOI, PMC ID, ISSN
  • Content: Abstract (HTML tags cleaned)
  • Metadata: Language, Publication Type, Keywords
  • Access: DOI URL, PMC URL, Full-text availability
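
For orientation, a RIS record is a sequence of `TAG  - value` lines that starts with TY and ends with ER. The sketch below builds one from a plain dictionary. It is not the project's exporter: `to_ris` is a hypothetical function, it covers only a subset of the fields listed above, and the article data is placeholder text.

```python
def to_ris(article: dict) -> str:
    """Serialize one article dictionary as a RIS journal record."""
    lines = ["TY  - JOUR", f"TI  - {article['title']}"]
    for author in article.get("authors", []):
        lines.append(f"AU  - {author}")  # one AU line per author
    optional = [
        ("JO", "journal"),
        ("PY", "year"),
        ("VL", "volume"),
        ("IS", "issue"),
        ("SP", "pages"),
        ("DO", "doi"),
        ("AB", "abstract"),
    ]
    for tag, key in optional:
        if article.get(key):
            lines.append(f"{tag}  - {article[key]}")
    lines.append("ER  - ")  # end-of-record marker
    return "\n".join(lines)


# Placeholder data for illustration only.
record = to_ris({
    "title": "Example study of ICU sedation",
    "authors": ["Doe J", "Roe A"],
    "journal": "Example Journal",
    "year": 2024,
})
print(record)
```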

Special Character Handling

  • BibTeX exports use pylatexenc for proper LaTeX encoding
  • Nordic characters (ø, æ, å), umlauts (ü, ö, ä), and accents are correctly converted
  • Example: Søren Hansen → S{\o}ren Hansen
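
To show what this conversion amounts to, here is a standard-library sketch that maps a handful of characters by hand. It deliberately does not use pylatexenc; it is a simplified stand-in covering only the characters named above, and `latex_escape` is a hypothetical name.

```python
# Hand-written stand-in for pylatexenc, limited to a few characters.
LATEX_ESCAPES = {
    "ø": r"{\o}", "Ø": r"{\O}",
    "æ": r"{\ae}", "Æ": r"{\AE}",
    "å": r"{\aa}", "Å": r"{\AA}",
    "ü": r'{\"u}', "ö": r'{\"o}', "ä": r'{\"a}',
}
_TABLE = str.maketrans(LATEX_ESCAPES)


def latex_escape(text: str) -> str:
    """Replace the mapped characters with LaTeX escapes; leave others as-is."""
    return text.translate(_TABLE)


print(latex_escape("Søren Hansen"))  # S{\o}ren Hansen
```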

📖 API Documentation

PubMedClient

The main client class for interacting with PubMed.

from pubmed_search import PubMedClient

client = PubMedClient(
    email="your@email.com",  # Required by NCBI
    api_key=None,            # Optional: NCBI API key for higher rate limits
    tool="pubmed-search"     # Tool name for NCBI tracking
)

Low-level Entrez API

For more control, use the low-level Entrez interface:

from pubmed_search.entrez import LiteratureSearcher

searcher = LiteratureSearcher(email="your@email.com")

# Advanced search with filters
results = searcher.search_advanced(
    term="propofol sedation",
    filter_humans=True,
    filter_english=True,
    date_range=("2020", "2024"),
    max_results=50
)
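
How search_advanced turns these filter flags into PubMed syntax is not documented here. One plausible mapping, using standard PubMed tags (humans[mh], english[la], and a [dp] date range), is sketched below; `build_filtered_term` is a hypothetical function and the mapping is an assumption, not the library's confirmed behavior.

```python
from typing import Optional


def build_filtered_term(
    term: str,
    filter_humans: bool = False,
    filter_english: bool = False,
    date_range: Optional[tuple[str, str]] = None,
) -> str:
    """Append common PubMed filters to a base search term."""
    clauses = [f"({term})"]
    if filter_humans:
        clauses.append("humans[mh]")      # MeSH term filter
    if filter_english:
        clauses.append("english[la]")     # language filter
    if date_range is not None:
        start, end = date_range
        clauses.append(f"{start}:{end}[dp]")  # publication date range
    return " AND ".join(clauses)


print(build_filtered_term(
    "propofol sedation",
    filter_humans=True,
    filter_english=True,
    date_range=("2020", "2024"),
))
```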

📄 License

Apache License 2.0 - see LICENSE

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Run tests: pytest
  5. Submit a pull request
