
FastMCP server for academic paper search, metadata, PDFs, citations, and BibTeX


mcsci-hub

An MCP server that gives LLMs access to academic literature. Search papers, retrieve metadata, download PDFs, traverse citation graphs, and generate BibTeX -- all through a unified interface backed by four academic data sources.

Built with FastMCP.

search "CRISPR delivery" --> get_paper(doi) --> fetch_pdf(doi) --> [mcp-pdf] --> analyze

How It Works

mcsci-hub aggregates data from multiple academic APIs in parallel, merging results into a single coherent response:

| Source | What it provides | Access |
|---|---|---|
| CrossRef | Authoritative metadata -- title, authors, journal, volume, pages | Free (polite pool) |
| OpenAlex | Abstracts, open access URLs, semantic search | Free (polite pool) |
| Semantic Scholar | Citation counts, influential citations, fields of study, BibTeX | Free (optional API key) |
| Sci-Hub | Paywalled PDF access via mirror scraping | Fallback only |

The strategy: always try open access first. Sci-Hub is a last resort. When any source fails, the others still return what they can (graceful degradation).
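The "open access first, fallback last" ordering can be sketched as a loop over fetchers tried in priority order. This is a minimal illustration, not the project's actual code: `fetch_first_available`, `open_access`, and `fallback_mirror` are hypothetical names, and the two fetchers are stubs that stand in for real HTTP calls.

```python
import asyncio
from collections.abc import Awaitable, Callable

# Hypothetical fetcher type: takes a DOI, returns PDF bytes or raises.
Fetcher = Callable[[str], Awaitable[bytes]]


async def fetch_first_available(
    doi: str, fetchers: list[tuple[str, Fetcher]]
) -> tuple[str, bytes]:
    """Try each (name, fetcher) pair in order and return the first success."""
    errors: list[str] = []
    for name, fetch in fetchers:
        try:
            return name, await fetch(doi)
        except Exception as exc:  # one failing source must not stop the chain
            errors.append(f"{name}: {exc}")
    raise LookupError(f"no source returned a PDF for {doi}: " + "; ".join(errors))


async def open_access(doi: str) -> bytes:
    # Stub: simulate a paper with no open access copy.
    raise FileNotFoundError("no OA copy")


async def fallback_mirror(doi: str) -> bytes:
    # Stub: simulate a successful download from the last-resort source.
    return b"%PDF-1.7 stub"


source, pdf = asyncio.run(
    fetch_first_available(
        "10.1000/example",
        [("open_access", open_access), ("fallback_mirror", fallback_mirror)],
    )
)
```

Keeping the order in a plain list makes the priority explicit and easy to change without touching the loop.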

Tools

| Tool | Description |
|---|---|
| search_papers(query, ...) | Keyword/topic search via OpenAlex with CrossRef fallback. Filters by year range and open access status. |
| get_paper(doi) | Full metadata from all sources, merged and cached. Returns title, authors, abstract, citation counts, fields of study, OA status. |
| fetch_pdf(doi) | Downloads the PDF: tries OA URL first, then Sci-Hub. Saves locally and hints to use mcp-pdf for parsing. |
| get_citations(doi, max_results) | Forward citations -- papers that cite this one. Includes Semantic Scholar's "influential" flag. |
| get_references(doi, max_results) | Backward references -- papers this one cites. Traces intellectual lineage. |
| get_bibtex(doi) | BibTeX entry from Semantic Scholar, or constructed from CrossRef metadata as fallback. |
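As an illustration of the get_bibtex fallback path, here is one way a BibTeX entry could be built from a CrossRef work record. The function name `bibtex_from_crossref` and the citation-key scheme (first author's family name plus year) are assumptions for this sketch; the server's real formatting may differ. The input keys (`title`, `author`, `container-title`, `issued`, `volume`, `page`, `DOI`) follow the CrossRef works JSON format.

```python
def bibtex_from_crossref(work: dict) -> str:
    """Build an @article entry from a CrossRef-style work record (sketch)."""
    authors = work.get("author", [])
    # BibTeX expects "Family, Given" names joined with " and ".
    names = [
        f"{a.get('family', '')}, {a.get('given', '')}".strip(", ") for a in authors
    ]
    date_parts = work.get("issued", {}).get("date-parts", [[None]])
    year = date_parts[0][0] if date_parts and date_parts[0] else None
    first_family = authors[0].get("family", "unknown") if authors else "unknown"
    # Assumed key scheme: lowercase family name followed by the year.
    key = f"{first_family.lower().replace(' ', '')}{year or ''}"
    fields = {
        "title": (work.get("title") or [""])[0],
        "author": " and ".join(names),
        "journal": (work.get("container-title") or [""])[0],
        "year": str(year) if year else "",
        "volume": work.get("volume", ""),
        "pages": work.get("page", ""),
        "doi": work.get("DOI", ""),
    }
    # Skip empty fields so the entry only contains known data.
    body = ",\n".join(f"  {k} = {{{v}}}" for k, v in fields.items() if v)
    return f"@article{{{key},\n{body}\n}}"


example_work = {
    "title": ["A Title"],
    "author": [{"given": "Ada", "family": "Lovelace"}],
    "container-title": ["Journal of Examples"],
    "issued": {"date-parts": [[2013, 8]]},
    "DOI": "10.1000/example",
}
entry = bibtex_from_crossref(example_work)
```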

Resources

| URI | Description |
|---|---|
| scihub://mirrors | Configured mirror domains and count |
| scihub://config | Full server configuration (cache, timeouts, PDF directory, API key status) |
| doi://{prefix}/{suffix} | Dynamic paper metadata lookup by DOI (e.g. doi://10.1038/nature12373) |
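To make the doi://{prefix}/{suffix} template concrete, the sketch below turns such a URI back into a plain DOI. `doi_from_uri` is a hypothetical helper, not part of the documented API. It splits on the first slash only, because DOI suffixes may themselves contain slashes.

```python
def doi_from_uri(uri: str) -> str:
    """Extract the DOI from a doi://{prefix}/{suffix} resource URI (sketch)."""
    scheme = "doi://"
    if not uri.startswith(scheme):
        raise ValueError(f"not a doi:// URI: {uri!r}")
    # Split once: everything after the first slash belongs to the suffix.
    prefix, sep, suffix = uri[len(scheme):].partition("/")
    if not sep or not suffix or not prefix.startswith("10."):
        raise ValueError(f"malformed DOI in URI: {uri!r}")
    return f"{prefix}/{suffix}"


doi = doi_from_uri("doi://10.1038/nature12373")
nested = doi_from_uri("doi://10.1000/part/a/b")
```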

Prompts

| Prompt | Description |
|---|---|
| literature_review(topic, depth) | Guided workflow: search, detail top papers, traverse citations, synthesize. depth='deep' adds PDF analysis via mcp-pdf. |
| paper_deep_dive(doi) | Single paper analysis: metadata, PDF, full-text parsing, citation context. |
| extract_findings(doi) | Focused data extraction from a paper's PDF -- tables, results, conclusions. |

Composability with mcp-pdf

fetch_pdf saves PDFs locally and returns the file path. Tool responses and prompts guide the calling LLM to use mcp-pdf for full-text extraction, creating the pipeline:

search --> fetch --> parse --> analyze

Install mcp-pdf alongside mcsci-hub to unlock the full workflow.

Installation

Requires Python 3.11+.

# Clone and install
git clone git@git.supported.systems:MCP/mcsci-hub.git
cd mcsci-hub
uv sync

# Copy and edit environment config
cp .env.example .env

Add to Claude Code

# Local development
claude mcp add mcsci-hub -- uv run --directory /path/to/mcsci-hub mcsci-hub

# From PyPI (once published)
claude mcp add mcsci-hub -- uvx mcsci-hub

Run standalone (stdio transport)

uv run mcsci-hub

Configuration

All settings are managed via environment variables (loaded from .env by pydantic-settings).

| Variable | Default | Description |
|---|---|---|
| SCIHUB_MIRRORS | sci-hub.se,sci-hub.st,sci-hub.ru | Comma-separated mirror domains, tried in order |
| CROSSREF_MAILTO | (empty) | Email for CrossRef polite pool (higher rate limits) |
| OPENALEX_MAILTO | (empty) | Email for OpenAlex polite pool |
| S2_API_KEY | (empty) | Optional Semantic Scholar API key for higher rate limits |
| CACHE_MAX_SIZE | 1000 | Maximum entries in the TTL cache |
| CACHE_TTL | 3600 | Cache entry lifetime in seconds |
| HTTP_TIMEOUT | 30 | Request timeout in seconds |
| PDF_SAVE_DIR | /tmp/mcsci-hub-pdfs | Directory where downloaded PDFs are saved |

See .env.example for a ready-to-use template.
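The project itself loads these values with pydantic-settings. To show how the variables and defaults fit together without that dependency, here is a stdlib-only approximation. The `Settings` dataclass, its field names, and `load_settings` are assumptions for illustration, not the contents of config.py.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical stand-in for the server's settings object."""

    scihub_mirrors: tuple[str, ...]
    crossref_mailto: str
    openalex_mailto: str
    s2_api_key: str
    cache_max_size: int
    cache_ttl: int
    http_timeout: float
    pdf_save_dir: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from an environment mapping, using the documented defaults."""
    mirrors = env.get("SCIHUB_MIRRORS", "sci-hub.se,sci-hub.st,sci-hub.ru")
    return Settings(
        # Order matters: mirrors are tried in the order listed.
        scihub_mirrors=tuple(m.strip() for m in mirrors.split(",") if m.strip()),
        crossref_mailto=env.get("CROSSREF_MAILTO", ""),
        openalex_mailto=env.get("OPENALEX_MAILTO", ""),
        s2_api_key=env.get("S2_API_KEY", ""),
        cache_max_size=int(env.get("CACHE_MAX_SIZE", "1000")),
        cache_ttl=int(env.get("CACHE_TTL", "3600")),
        http_timeout=float(env.get("HTTP_TIMEOUT", "30")),
        pdf_save_dir=env.get("PDF_SAVE_DIR", "/tmp/mcsci-hub-pdfs"),
    )


# Override a single value; everything else falls back to its default.
settings = load_settings({"CACHE_TTL": "60"})
```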

Data Source Strategy

Each field in the aggregated response has a primary and fallback source:

| Field | Primary | Fallback | Notes |
|---|---|---|---|
| Title, Authors | CrossRef | OpenAlex, S2 | CrossRef is the authoritative registry |
| Abstract | OpenAlex | S2 | Decoded from inverted index format |
| Citation count | S2 | OpenAlex | S2 distinguishes influential citations |
| Open Access URL | OpenAlex | S2 | OpenAlex integrates Unpaywall data |
| Paywalled PDF | Sci-Hub | -- | Mirror scraping, fallback only |
| BibTeX | S2 | Constructed from CrossRef | S2 has citationStyles.bibtex |
| Fields of study | S2 | -- | Better classification taxonomy |
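The "inverted index format" noted for abstracts refers to OpenAlex's abstract_inverted_index field, which maps each word to the positions where it occurs. A minimal decoder might look like the following; the function name is hypothetical, and the sample index is made up for the example.

```python
def decode_inverted_index(index: dict[str, list[int]]) -> str:
    """Rebuild plain text from a word -> positions mapping."""
    # Flatten to (position, word) pairs, then sort by position.
    positions = [(pos, word) for word, places in index.items() for pos in places]
    return " ".join(word for _, word in sorted(positions))


sample = {"the": [0, 3], "cat": [1], "chased": [2], "mouse": [4]}
abstract = decode_inverted_index(sample)
```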

All sources are queried in parallel via asyncio.gather with return_exceptions=True -- a slow or failing source never blocks the others.
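A small sketch of that pattern, combined with the primary/fallback merge from the table above. The source functions here are stubs rather than real API clients, and `PRIORITY`, `gather_sources`, `merge_fields`, and the field names are illustrative assumptions, not the internals of aggregate_paper().

```python
import asyncio

# Assumed per-field source order, mirroring part of the strategy table.
PRIORITY = {
    "title": ["crossref", "openalex", "s2"],
    "abstract": ["openalex", "s2"],
    "citation_count": ["s2", "openalex"],
}


async def crossref(doi: str) -> dict:
    return {"title": "Title from CrossRef"}  # stub response


async def openalex(doi: str) -> dict:
    raise TimeoutError("simulated outage")  # stub failure


async def s2(doi: str) -> dict:
    return {"title": "Title from S2", "abstract": "An abstract.", "citation_count": 5}


async def gather_sources(doi: str, sources: dict) -> dict[str, dict]:
    """Query every source concurrently and keep only the ones that succeeded."""
    names = list(sources)
    results = await asyncio.gather(
        *(sources[name](doi) for name in names), return_exceptions=True
    )
    # With return_exceptions=True, failures come back as exception objects.
    return {
        name: result
        for name, result in zip(names, results)
        if not isinstance(result, BaseException)
    }


def merge_fields(by_source: dict[str, dict]) -> dict:
    """For each field, take the value from the first source that has one."""
    merged = {}
    for field, order in PRIORITY.items():
        for name in order:
            value = by_source.get(name, {}).get(field)
            if value is not None:
                merged[field] = value
                break
    return merged


available = asyncio.run(
    gather_sources(
        "10.1000/example", {"crossref": crossref, "openalex": openalex, "s2": s2}
    )
)
paper = merge_fields(available)
```

In this run the simulated OpenAlex failure is dropped, so the abstract falls back to the S2 stub while the title still comes from CrossRef.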

Project Structure

src/mcsci_hub/
  server.py                  # FastMCP app, lifespan, resources, prompts
  config.py                  # pydantic-settings env config
  cache.py                   # TTL-aware LRU cache
  models.py                  # Pydantic response models
  clients/
    crossref.py              # CrossRef API (polite pool, rate limited)
    openalex.py              # OpenAlex API (abstracts, OA URLs)
    semantic_scholar.py      # S2 API (citations, BibTeX)
    scihub.py                # Sci-Hub scraper (mirror rotation, retry)
  tools/
    search.py                # search_papers
    paper.py                 # get_paper + aggregate_paper()
    pdf.py                   # fetch_pdf
    citations.py             # get_citations, get_references
    bibtex.py                # get_bibtex

Development

# Install with dev dependencies
uv sync

# Run tests (55 tests, all mocked with respx + FastMCP in-process transport)
uv run pytest -v

# Run tests with coverage
uv run pytest -v --cov=mcsci_hub

# Lint
uv run ruff check src/ tests/

License

MIT
