FastMCP server for academic paper search, metadata, PDFs, citations, and BibTeX

mcsci-hub

An MCP server that gives LLMs access to academic literature. Search papers, retrieve metadata, download PDFs, traverse citation graphs, and generate BibTeX -- all through a unified interface backed by four academic data sources.

Built with FastMCP.

search "CRISPR delivery" --> get_paper(doi) --> fetch_pdf(doi) --> [mcp-pdf] --> analyze

How It Works

mcsci-hub aggregates data from multiple academic APIs in parallel, merging results into a single coherent response:

| Source | What it provides | Access |
|---|---|---|
| CrossRef | Authoritative metadata -- title, authors, journal, volume, pages | Free (polite pool) |
| OpenAlex | Abstracts, open access URLs, semantic search | Free (polite pool) |
| Semantic Scholar | Citation counts, influential citations, fields of study, BibTeX | Free (optional API key) |
| Sci-Hub | Paywalled PDF access via mirror scraping | Fallback only |

The strategy: always try open access first. Sci-Hub is a last resort. When any source fails, the others still return what they can (graceful degradation).
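
The ordering described above can be pictured as a simple fallback chain. The following is a minimal sketch, not the project's actual code: `first_available_pdf` and the fetcher callables are hypothetical names introduced for illustration, and each fetcher is assumed to return PDF bytes or `None`.

```python
from typing import Callable, Iterable, Optional

# Hypothetical fetcher type: takes a DOI, returns PDF bytes or None.
Fetcher = Callable[[str], Optional[bytes]]


def first_available_pdf(
    doi: str, fetchers: Iterable[tuple[str, Fetcher]]
) -> tuple[Optional[str], Optional[bytes]]:
    """Try each (name, fetcher) pair in order; return the first hit."""
    for name, fetch in fetchers:
        try:
            data = fetch(doi)
        except Exception:
            # A failing source is skipped instead of aborting the lookup.
            continue
        if data:
            return name, data
    # Every source failed or had nothing for this DOI.
    return None, None
```

Passing the open access fetcher first and the mirror fetchers last gives the "last resort" behavior; the caller learns which source produced the file from the returned name.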

Tools

| Tool | Description |
|---|---|
| search_papers(query, ...) | Keyword/topic search via OpenAlex with CrossRef fallback. Filters by year range and open access status. |
| get_paper(doi) | Full metadata from all sources, merged and cached. Returns title, authors, abstract, citation counts, fields of study, OA status. |
| fetch_pdf(doi) | Downloads the PDF: tries OA URL first, then Sci-Hub. Saves locally and hints to use mcp-pdf for parsing. |
| get_citations(doi, max_results) | Forward citations -- papers that cite this one. Includes Semantic Scholar's "influential" flag. |
| get_references(doi, max_results) | Backward references -- papers this one cites. Traces intellectual lineage. |
| get_bibtex(doi) | BibTeX entry from Semantic Scholar, or constructed from CrossRef metadata as fallback. |
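
To illustrate the get_bibtex fallback, here is a rough sketch of building an entry from a CrossRef work record. It assumes the standard CrossRef JSON fields (`title`, `author`, `container-title`, `issued`, `volume`, `page`, `DOI`); the function name `bibtex_from_crossref` and the citation-key scheme are assumptions, not taken from the project, and special characters are not escaped.

```python
import re


def bibtex_from_crossref(work: dict) -> str:
    """Build a minimal @article entry from a CrossRef-style work record."""
    authors = work.get("author", [])
    # BibTeX expects "Family, Given" names joined by " and ".
    names = [
        f"{a.get('family', '')}, {a.get('given', '')}".strip(", ")
        for a in authors
    ]
    date_parts = work.get("issued", {}).get("date-parts", [[None]])
    year = date_parts[0][0] if date_parts and date_parts[0] else None
    first_family = authors[0].get("family", "unknown") if authors else "unknown"
    # Hypothetical key scheme: first author's family name plus year.
    key = re.sub(r"\W+", "", f"{first_family}{year or ''}").lower()
    fields = {
        "title": (work.get("title") or [""])[0],
        "author": " and ".join(names),
        "journal": (work.get("container-title") or [""])[0],
        "year": str(year) if year else "",
        "volume": work.get("volume", ""),
        "pages": work.get("page", ""),
        "doi": work.get("DOI", ""),
    }
    # Empty fields are left out of the entry.
    body = ",\n".join(f"  {k} = {{{v}}}" for k, v in fields.items() if v)
    return f"@article{{{key},\n{body}\n}}"
```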

Resources

| URI | Description |
|---|---|
| scihub://mirrors | Configured mirror domains and count |
| scihub://config | Full server configuration (cache, timeouts, PDF directory, API key status) |
| doi://{prefix}/{suffix} | Dynamic paper metadata lookup by DOI (e.g. doi://10.1038/nature12373) |
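
As a small illustration of the doi:// URI shape, the sketch below turns such a URI back into a plain DOI. The helper `doi_from_uri` is hypothetical; how the server itself parses the template, and whether it accepts suffixes that contain further slashes, is not stated here.

```python
def doi_from_uri(uri: str) -> str:
    """Extract a DOI from a doi://{prefix}/{suffix} URI."""
    scheme, sep, rest = uri.partition("://")
    if scheme != "doi" or not sep:
        raise ValueError(f"not a doi:// URI: {uri!r}")
    # Split only on the first slash so the suffix is kept intact.
    prefix, slash, suffix = rest.partition("/")
    if not prefix.startswith("10.") or not slash or not suffix:
        raise ValueError(f"malformed DOI in URI: {uri!r}")
    return f"{prefix}/{suffix}"
```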

Prompts

| Prompt | Description |
|---|---|
| literature_review(topic, depth) | Guided workflow: search, detail top papers, traverse citations, synthesize. depth='deep' adds PDF analysis via mcp-pdf. |
| paper_deep_dive(doi) | Single paper analysis: metadata, PDF, full-text parsing, citation context. |
| extract_findings(doi) | Focused data extraction from a paper's PDF -- tables, results, conclusions. |

Composability with mcp-pdf

fetch_pdf saves PDFs locally and returns the file path. Tool responses and prompts guide the calling LLM to use mcp-pdf for full-text extraction, creating the pipeline:

search --> fetch --> parse --> analyze

Install mcp-pdf alongside mcsci-hub to unlock the full workflow.

Installation

Requires Python 3.11+.

# Clone and install
git clone git@git.supported.systems:MCP/mcsci-hub.git
cd mcsci-hub
uv sync

# Copy and edit environment config
cp .env.example .env

Add to Claude Code

# Local development
claude mcp add mcsci-hub -- uv run --directory /path/to/mcsci-hub mcsci-hub

# From PyPI
claude mcp add mcsci-hub -- uvx mcsci-hub

Run standalone (stdio transport)

uv run mcsci-hub

Configuration

All settings are managed via environment variables (loaded from .env by pydantic-settings).

| Variable | Default | Description |
|---|---|---|
| SCIHUB_MIRRORS | sci-hub.se,sci-hub.st,sci-hub.ru | Comma-separated mirror domains, tried in order |
| CROSSREF_MAILTO | (empty) | Email for CrossRef polite pool (higher rate limits) |
| OPENALEX_MAILTO | (empty) | Email for OpenAlex polite pool |
| S2_API_KEY | (empty) | Optional Semantic Scholar API key for higher rate limits |
| CACHE_MAX_SIZE | 1000 | Maximum entries in the TTL cache |
| CACHE_TTL | 3600 | Cache entry lifetime in seconds |
| HTTP_TIMEOUT | 30 | Request timeout in seconds |
| PDF_SAVE_DIR | /tmp/mcsci-hub-pdfs | Directory where downloaded PDFs are saved |

See .env.example for a ready-to-use template.
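
To make the variables and defaults concrete, here is a standard-library stand-in for the settings object. The project itself uses pydantic-settings, so this is only a sketch: the `Settings` class, its attribute names, and `load_settings` are assumptions, and unlike pydantic-settings this version reads the process environment only, not a .env file.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    scihub_mirrors: tuple[str, ...]
    crossref_mailto: str
    openalex_mailto: str
    s2_api_key: str
    cache_max_size: int
    cache_ttl: int
    http_timeout: float
    pdf_save_dir: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read the documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    raw_mirrors = env.get("SCIHUB_MIRRORS", "sci-hub.se,sci-hub.st,sci-hub.ru")
    return Settings(
        # Order is preserved because mirrors are tried in order.
        scihub_mirrors=tuple(m.strip() for m in raw_mirrors.split(",") if m.strip()),
        crossref_mailto=env.get("CROSSREF_MAILTO", ""),
        openalex_mailto=env.get("OPENALEX_MAILTO", ""),
        s2_api_key=env.get("S2_API_KEY", ""),
        cache_max_size=int(env.get("CACHE_MAX_SIZE", "1000")),
        cache_ttl=int(env.get("CACHE_TTL", "3600")),
        http_timeout=float(env.get("HTTP_TIMEOUT", "30")),
        pdf_save_dir=env.get("PDF_SAVE_DIR", "/tmp/mcsci-hub-pdfs"),
    )
```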

Data Source Strategy

Each field in the aggregated response has a primary and a fallback source (S2 is Semantic Scholar):

| Field | Primary | Fallback | Notes |
|---|---|---|---|
| Title, Authors | CrossRef | OpenAlex, S2 | CrossRef is the authoritative registry |
| Abstract | OpenAlex | S2 | Decoded from inverted index format |
| Citation count | S2 | OpenAlex | S2 distinguishes influential citations |
| Open Access URL | OpenAlex | S2 | OpenAlex integrates Unpaywall data |
| Paywalled PDF | Sci-Hub | -- | Mirror scraping, fallback only |
| BibTeX | S2 | Constructed from CrossRef | S2 has citationStyles.bibtex |
| Fields of study | S2 | -- | Better classification taxonomy |
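
The "inverted index format" in the Abstract row refers to how OpenAlex returns abstracts: a mapping from each word to the positions where it occurs. A minimal decoding sketch follows; the function name `decode_inverted_index` is hypothetical.

```python
from typing import Optional


def decode_inverted_index(index: Optional[dict[str, list[int]]]) -> str:
    """Rebuild plain text from an OpenAlex-style word -> positions mapping."""
    if not index:
        return ""
    # Flatten to (position, word) pairs, then read them back in order.
    positioned = [
        (pos, word) for word, positions in index.items() for pos in positions
    ]
    return " ".join(word for _, word in sorted(positioned))
```

For example, `{"the": [0, 2], "cat": [1], "end": [3]}` decodes to `the cat the end`.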

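All sources are queried concurrently via asyncio.gather with return_exceptions=True, so a slow or failing source never holds up the other requests: a failure comes back as an exception in that source's slot rather than cancelling the whole batch.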
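
The pattern can be sketched as two steps: gather whatever each source returns, then pick each field from the highest-priority source that has it. This is an illustration under assumptions, not the project's aggregate_paper(): the names `query_all`, `merge_fields`, `FIELD_PRIORITY`, and the field keys are hypothetical, and each source is modeled as an async callable that returns a dict.

```python
import asyncio
from typing import Any, Awaitable, Callable

# Field -> source names in priority order, following the table above.
FIELD_PRIORITY = {
    "title": ("crossref", "openalex", "s2"),
    "authors": ("crossref", "openalex", "s2"),
    "abstract": ("openalex", "s2"),
    "citation_count": ("s2", "openalex"),
    "oa_url": ("openalex", "s2"),
}

Source = Callable[[str], Awaitable[dict[str, Any]]]


async def query_all(doi: str, sources: dict[str, Source]) -> dict[str, dict]:
    """Run every source concurrently and keep only the successful results."""
    names = list(sources)
    results = await asyncio.gather(
        *(sources[name](doi) for name in names), return_exceptions=True
    )
    # Exceptions are returned in place of results, so filter them out here.
    return {
        name: result
        for name, result in zip(names, results)
        if not isinstance(result, BaseException)
    }


def merge_fields(by_source: dict[str, dict]) -> dict[str, Any]:
    """Take each field from the first source in its priority list that has it."""
    merged: dict[str, Any] = {}
    for field, order in FIELD_PRIORITY.items():
        for name in order:
            value = by_source.get(name, {}).get(field)
            if value is not None:
                merged[field] = value
                break
    return merged
```

Note that asyncio.gather still waits for every source to finish or fail before returning, so a per-request timeout is what keeps one slow source from delaying the merged response indefinitely.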

Project Structure

src/mcsci_hub/
  server.py                  # FastMCP app, lifespan, resources, prompts
  config.py                  # pydantic-settings env config
  cache.py                   # TTL-aware LRU cache
  models.py                  # Pydantic response models
  clients/
    crossref.py              # CrossRef API (polite pool, rate limited)
    openalex.py              # OpenAlex API (abstracts, OA URLs)
    semantic_scholar.py      # S2 API (citations, BibTeX)
    scihub.py                # Sci-Hub scraper (mirror rotation, retry)
  tools/
    search.py                # search_papers
    paper.py                 # get_paper + aggregate_paper()
    pdf.py                   # fetch_pdf
    citations.py             # get_citations, get_references
    bibtex.py                # get_bibtex
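
For readers curious what a "TTL-aware LRU cache" like the one in cache.py involves, here is a compact sketch. It is not the project's implementation: the class name `TTLCache`, its method names, and the injectable `clock` parameter are assumptions chosen to keep the example testable. The defaults mirror CACHE_MAX_SIZE and CACHE_TTL.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Hashable


class TTLCache:
    """LRU cache whose entries also expire after a fixed lifetime."""

    def __init__(
        self,
        max_size: int = 1000,
        ttl: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max_size = max_size
        self._ttl = ttl
        self._clock = clock
        # key -> (expires_at, value); least recently used entry comes first.
        self._data: OrderedDict = OrderedDict()

    def get(self, key: Hashable, default: Any = None) -> Any:
        item = self._data.get(key)
        if item is None:
            return default
        expires_at, value = item
        if self._clock() >= expires_at:
            # Expired entries are dropped lazily on access.
            del self._data[key]
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return value

    def set(self, key: Hashable, value: Any) -> None:
        self._data[key] = (self._clock() + self._ttl, value)
        self._data.move_to_end(key)
        while len(self._data) > self._max_size:
            self._data.popitem(last=False)  # evict least recently used
```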

Development

# Install with dev dependencies
uv sync

# Run tests (55 tests, all mocked with respx + FastMCP in-process transport)
uv run pytest -v

# Run tests with coverage
uv run pytest -v --cov=mcsci_hub

# Lint
uv run ruff check src/ tests/

License

MIT
