FastMCP server for academic paper search, metadata, PDFs, citations, and BibTeX

mcsci-hub

An MCP server that gives LLMs access to academic literature. Search papers, retrieve metadata, download PDFs, traverse citation graphs, and generate BibTeX -- all through a unified interface backed by four academic data sources.

Built with FastMCP.

search "CRISPR delivery" --> get_paper(doi) --> fetch_pdf(doi) --> [mcp-pdf] --> analyze

How It Works

mcsci-hub aggregates data from multiple academic APIs in parallel, merging results into a single coherent response:

| Source | What it provides | Access |
| --- | --- | --- |
| CrossRef | Authoritative metadata -- title, authors, journal, volume, pages | Free (polite pool) |
| OpenAlex | Abstracts, open access URLs, semantic search | Free (polite pool) |
| Semantic Scholar | Citation counts, influential citations, fields of study, BibTeX | Free (optional API key) |
| Sci-Hub | Paywalled PDF access via mirror scraping | Fallback only |

The strategy: always try open access first. Sci-Hub is a last resort. When any source fails, the others still return what they can (graceful degradation).
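
The "open access first, last resort later" ordering can be pictured as an ordered fallback chain. The sketch below is illustrative only and is not taken from the mcsci-hub source: `first_available` and the `(name, fetcher)` pairs are hypothetical names, and each fetcher is assumed to be a coroutine that returns PDF bytes or `None`.

```python
import asyncio
from typing import Awaitable, Callable, Optional

# Hypothetical fetcher type: takes a DOI, returns PDF bytes or None.
Fetcher = Callable[[str], Awaitable[Optional[bytes]]]


async def first_available(
    doi: str, fetchers: list[tuple[str, Fetcher]]
) -> tuple[Optional[str], Optional[bytes]]:
    """Try each fetcher in priority order and return the first non-empty result."""
    for name, fetch in fetchers:
        try:
            data = await fetch(doi)
        except Exception:
            # A failing source is skipped so that later sources still get a chance.
            continue
        if data:
            return name, data
    return None, None
```

Listing the open access fetcher first and the fallback fetcher last would reproduce the priority described above; a source that raises or returns nothing is simply skipped.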

Tools

| Tool | Description |
| --- | --- |
| search_papers(query, ...) | Keyword/topic search via OpenAlex with CrossRef fallback. Filters by year range and open access status. |
| get_paper(doi) | Full metadata from all sources, merged and cached. Returns title, authors, abstract, citation counts, fields of study, OA status. |
| fetch_pdf(doi) | Downloads the PDF: tries OA URL first, then Sci-Hub. Saves locally and hints to use mcp-pdf for parsing. |
| get_citations(doi, max_results) | Forward citations -- papers that cite this one. Includes Semantic Scholar's "influential" flag. |
| get_references(doi, max_results) | Backward references -- papers this one cites. Traces intellectual lineage. |
| get_bibtex(doi) | BibTeX entry from Semantic Scholar, or constructed from CrossRef metadata as fallback. |
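
To make the get_bibtex fallback concrete, here is a minimal sketch of constructing a BibTeX entry from already-fetched metadata. It is an assumption about the general approach, not the project's code: the function name `bibtex_from_metadata`, the flat dictionary keys (`title`, `authors`, `journal`, and so on), and the citation-key scheme are all hypothetical.

```python
import re


def bibtex_from_metadata(meta: dict) -> str:
    """Build an @article entry from a flat metadata dict (hypothetical field names)."""
    authors = meta.get("authors", [])  # assumed to be "Family, Given" strings
    year = str(meta.get("year", ""))
    # Citation key: first author's family name plus year, letters and digits only.
    family = authors[0].split(",")[0] if authors else "unknown"
    key = re.sub(r"[^A-Za-z0-9]", "", family).lower() + year
    fields = {
        "title": meta.get("title"),
        "author": " and ".join(authors) if authors else None,
        "journal": meta.get("journal"),
        "year": year or None,
        "volume": meta.get("volume"),
        "pages": meta.get("pages"),
        "doi": meta.get("doi"),
    }
    # Emit only the fields that are actually present.
    lines = [f"  {name} = {{{value}}}," for name, value in fields.items() if value]
    return "@article{" + key + ",\n" + "\n".join(lines) + "\n}"
```

Missing fields are omitted rather than written as empty braces, which keeps the entry valid when the metadata is incomplete.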

Resources

| URI | Description |
| --- | --- |
| scihub://mirrors | Configured mirror domains and count |
| scihub://config | Full server configuration (cache, timeouts, PDF directory, API key status) |
| doi://{prefix}/{suffix} | Dynamic paper metadata lookup by DOI (e.g. doi://10.1038/nature12373) |
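
As a small illustration of the doi:// URI shape, the sketch below turns such a URI back into a plain DOI string. It only demonstrates the string handling; `doi_from_uri` is a hypothetical helper, and how the server itself routes the `{prefix}` and `{suffix}` parameters is not shown here.

```python
from urllib.parse import unquote


def doi_from_uri(uri: str) -> str:
    """Convert 'doi://<prefix>/<suffix>' into '<prefix>/<suffix>'."""
    scheme = "doi://"
    if not uri.startswith(scheme):
        raise ValueError(f"not a doi:// URI: {uri!r}")
    # Split on the first slash only: DOI suffixes may themselves contain slashes.
    prefix, sep, suffix = uri[len(scheme):].partition("/")
    if not sep or not prefix.startswith("10.") or not suffix:
        raise ValueError(f"malformed DOI in URI: {uri!r}")
    return f"{prefix}/{unquote(suffix)}"
```

For the example in the table, `doi_from_uri("doi://10.1038/nature12373")` yields `10.1038/nature12373`.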

Prompts

| Prompt | Description |
| --- | --- |
| literature_review(topic, depth) | Guided workflow: search, detail top papers, traverse citations, synthesize. depth='deep' adds PDF analysis via mcp-pdf. |
| paper_deep_dive(doi) | Single paper analysis: metadata, PDF, full-text parsing, citation context. |
| extract_findings(doi) | Focused data extraction from a paper's PDF -- tables, results, conclusions. |

Composability with mcp-pdf

fetch_pdf saves PDFs locally and returns the file path. Tool responses and prompts guide the calling LLM to use mcp-pdf for full-text extraction, creating the pipeline:

search --> fetch --> parse --> analyze

Install mcp-pdf alongside mcsci-hub to unlock the full workflow.

Installation

Requires Python 3.11+.

# Clone and install
git clone git@git.supported.systems:MCP/mcsci-hub.git
cd mcsci-hub
uv sync

# Copy and edit environment config
cp .env.example .env

Add to Claude Code

# Local development
claude mcp add mcsci-hub -- uv run --directory /path/to/mcsci-hub mcsci-hub

# From PyPI (once published)
claude mcp add mcsci-hub -- uvx mcsci-hub

Run standalone (stdio transport)

uv run mcsci-hub

Configuration

All settings are managed via environment variables (loaded from .env by pydantic-settings).

| Variable | Default | Description |
| --- | --- | --- |
| SCIHUB_MIRRORS | sci-hub.se,sci-hub.st,sci-hub.ru | Comma-separated mirror domains, tried in order |
| CROSSREF_MAILTO | (empty) | Email for CrossRef polite pool (higher rate limits) |
| OPENALEX_MAILTO | (empty) | Email for OpenAlex polite pool |
| S2_API_KEY | (empty) | Optional Semantic Scholar API key for higher rate limits |
| CACHE_MAX_SIZE | 1000 | Maximum entries in the TTL cache |
| CACHE_TTL | 3600 | Cache entry lifetime in seconds |
| HTTP_TIMEOUT | 30 | Request timeout in seconds |
| PDF_SAVE_DIR | /tmp/mcsci-hub-pdfs | Directory where downloaded PDFs are saved |

See .env.example for a ready-to-use template.
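
For readers who want to see how these variables map onto typed values, here is a stdlib-only sketch. The project itself loads settings with pydantic-settings; this version swaps in a plain dataclass and `os.environ` to stay dependency-free, and the `Settings` and `load_settings` names are assumptions rather than the package's actual classes.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Settings:
    scihub_mirrors: list[str]
    crossref_mailto: str
    openalex_mailto: str
    s2_api_key: str
    cache_max_size: int
    cache_ttl: int
    http_timeout: float
    pdf_save_dir: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read the documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    mirrors = env.get("SCIHUB_MIRRORS", "sci-hub.se,sci-hub.st,sci-hub.ru")
    return Settings(
        # Order is preserved because mirrors are tried in the order listed.
        scihub_mirrors=[m.strip() for m in mirrors.split(",") if m.strip()],
        crossref_mailto=env.get("CROSSREF_MAILTO", ""),
        openalex_mailto=env.get("OPENALEX_MAILTO", ""),
        s2_api_key=env.get("S2_API_KEY", ""),
        cache_max_size=int(env.get("CACHE_MAX_SIZE", "1000")),
        cache_ttl=int(env.get("CACHE_TTL", "3600")),
        http_timeout=float(env.get("HTTP_TIMEOUT", "30")),
        pdf_save_dir=env.get("PDF_SAVE_DIR", "/tmp/mcsci-hub-pdfs"),
    )
```

Unlike pydantic-settings, this sketch does not read the .env file or validate values beyond the `int` and `float` conversions.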

Data Source Strategy

Each field in the aggregated response has a primary and a fallback source (S2 is Semantic Scholar):

| Field | Primary | Fallback | Notes |
| --- | --- | --- | --- |
| Title, Authors | CrossRef | OpenAlex, S2 | CrossRef is the authoritative registry |
| Abstract | OpenAlex | S2 | Decoded from inverted index format |
| Citation count | S2 | OpenAlex | S2 distinguishes influential citations |
| Open Access URL | OpenAlex | S2 | OpenAlex integrates Unpaywall data |
| Paywalled PDF | Sci-Hub | -- | Mirror scraping, fallback only |
| BibTeX | S2 | Constructed from CrossRef | S2 has citationStyles.bibtex |
| Fields of study | S2 | -- | Better classification taxonomy |
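
The "inverted index format" note refers to how OpenAlex ships abstracts: the `abstract_inverted_index` field maps each word to the list of positions where it occurs. A minimal decoding sketch follows; the function name is hypothetical and this is not necessarily how mcsci-hub implements it.

```python
from typing import Optional


def decode_inverted_index(index: Optional[dict[str, list[int]]]) -> Optional[str]:
    """Rebuild plain text from a word -> positions mapping."""
    if not index:
        return None
    # Invert the mapping: position -> word.
    positions: dict[int, str] = {}
    for word, offsets in index.items():
        for offset in offsets:
            positions[offset] = word
    # Join words in position order to recover the original sequence.
    return " ".join(positions[i] for i in sorted(positions))
```

For instance, `{"the": [0, 3], "cat": [1], "saw": [2], "dog": [4]}` decodes to `"the cat saw the dog"`.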

All sources are queried in parallel via asyncio.gather with return_exceptions=True -- a slow or failing source never blocks the others.
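
The following sketch shows the general pattern: query every source concurrently with `asyncio.gather(..., return_exceptions=True)`, drop the failures, then pick each field from the highest-priority source that has it. The `FIELD_PRIORITY` map covers only three fields from the table above, and `aggregate` plus the source-coroutine shape (a coroutine taking a DOI and returning a dict) are assumptions made for illustration.

```python
import asyncio
from typing import Any, Awaitable, Callable

# Per-field source order, primary first (subset of the table above).
FIELD_PRIORITY = {
    "title": ("crossref", "openalex", "s2"),
    "abstract": ("openalex", "s2"),
    "citation_count": ("s2", "openalex"),
}

Source = Callable[[str], Awaitable[dict[str, Any]]]


async def aggregate(doi: str, sources: dict[str, Source]) -> tuple[dict[str, Any], list[str]]:
    """Query all sources concurrently and merge fields by priority."""
    names = list(sources)
    # return_exceptions=True turns a raised error into a returned value,
    # so one failing source cannot cancel or hide the others.
    results = await asyncio.gather(
        *(sources[name](doi) for name in names), return_exceptions=True
    )
    ok = {n: r for n, r in zip(names, results) if not isinstance(r, BaseException)}
    failed = sorted(set(names) - set(ok))
    merged = {}
    for field, order in FIELD_PRIORITY.items():
        # First source in priority order that answered and has a value wins.
        merged[field] = next(
            (ok[n][field] for n in order if n in ok and ok[n].get(field) is not None),
            None,
        )
    return merged, failed
```

The second return value lists the sources that raised, which a caller could surface as a warning alongside the partial result.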

Project Structure

src/mcsci_hub/
  server.py                  # FastMCP app, lifespan, resources, prompts
  config.py                  # pydantic-settings env config
  cache.py                   # TTL-aware LRU cache
  models.py                  # Pydantic response models
  clients/
    crossref.py              # CrossRef API (polite pool, rate limited)
    openalex.py              # OpenAlex API (abstracts, OA URLs)
    semantic_scholar.py      # S2 API (citations, BibTeX)
    scihub.py                # Sci-Hub scraper (mirror rotation, retry)
  tools/
    search.py                # search_papers
    paper.py                 # get_paper + aggregate_paper()
    pdf.py                   # fetch_pdf
    citations.py             # get_citations, get_references
    bibtex.py                # get_bibtex
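
For orientation, a TTL-aware LRU cache of the kind cache.py is described as providing can be sketched in a few lines. This is an independent illustration, not the module's actual code: the `TTLCache` class, its method names, and the injectable `clock` parameter are all assumptions.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Hashable


class TTLCache:
    """LRU cache whose entries also expire after `ttl` seconds."""

    def __init__(self, max_size: int = 1000, ttl: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._data: "OrderedDict[Hashable, tuple[float, Any]]" = OrderedDict()
        self._max_size = max_size
        self._ttl = ttl
        self._clock = clock  # injectable so tests can control time

    def get(self, key: Hashable, default: Any = None) -> Any:
        item = self._data.get(key)
        if item is None:
            return default
        expires_at, value = item
        if self._clock() >= expires_at:
            # Expired entries are dropped lazily on access.
            del self._data[key]
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return value

    def set(self, key: Hashable, value: Any) -> None:
        self._data[key] = (self._clock() + self._ttl, value)
        self._data.move_to_end(key)
        while len(self._data) > self._max_size:
            self._data.popitem(last=False)  # evict least recently used

    def __len__(self) -> int:
        return len(self._data)
```

The defaults mirror CACHE_MAX_SIZE and CACHE_TTL from the configuration table.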

Development

# Install with dev dependencies
uv sync

# Run tests (55 tests, all mocked with respx + FastMCP in-process transport)
uv run pytest -v

# Run tests with coverage
uv run pytest -v --cov=mcsci_hub

# Lint
uv run ruff check src/ tests/

License

MIT
