Model Context Protocol server for Semantic Scholar — 200M+ academic papers, 14 tools spanning paper search, citation graph traversal, author profiles, and recommendations.

Semantic Scholar MCP Server

Requires Python 3.10+. License: MIT.

A 14-tool MCP server for Semantic Scholar academic research workflows. It gives any Model Context Protocol client (e.g., Claude Desktop, Claude Code, Cursor, Cline, Continue) direct access to 200M+ papers from Semantic Scholar: paper search, citation graph traversal, author profiles, and recommendations.


Installation

Option 1: One-Line Install (Recommended)

# No cloning needed — runs directly from PyPI
uvx s2-mcp-server

Option 2: Claude Code

claude mcp add semantic-scholar -- uvx s2-mcp-server

Option 3: Claude Desktop (Windows)

Add to %APPDATA%\Claude\claude_desktop_config.json:

{
  "mcpServers": {
    "semantic-scholar": {
      "command": "uvx",
      "args": ["s2-mcp-server"],
      "env": {
        "SEMANTIC_SCHOLAR_API_KEY": "your-key-here"
      }
    }
  }
}

Option 4: Claude Desktop (macOS)

Add to ~/Library/Application Support/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "semantic-scholar": {
      "command": "uvx",
      "args": ["s2-mcp-server"],
      "env": {
        "SEMANTIC_SCHOLAR_API_KEY": "your-key-here"
      }
    }
  }
}

Option 5: pip / From Source

pip install s2-mcp-server
# or
git clone https://github.com/smaniches/semantic-scholar-mcp.git
cd semantic-scholar-mcp && pip install -e .

Option 6: Docker

docker pull ghcr.io/smaniches/semantic-scholar-mcp:latest
# -i keeps stdin open, which the stdio (JSON-RPC) transport needs
docker run -i --rm -e SEMANTIC_SCHOLAR_API_KEY=your-key ghcr.io/smaniches/semantic-scholar-mcp:latest

Note: Get a free API key at semanticscholar.org/product/api. Without a key, you get rate-limited public access (1 req/sec).


Architecture

flowchart LR
  Client["MCP client<br/>(Claude Desktop, Claude Code,<br/>Cursor, Cline, Continue, …)"]
  subgraph Server ["s2-mcp-server (this package)"]
    direction TB
    FastMCP["FastMCP runtime<br/>(stdio transport, lifespan)"]
    Tools["14 @mcp.tool functions<br/>(server.py)"]
    Models["Pydantic input models<br/>+ field sets (models.py)"]
    Validators["Paper-ID validator<br/>(validators.py)"]
    Cache["TTL cache<br/>(cache.py)"]
    Fmt["Markdown formatters<br/>(formatters.py)"]
    HTTP["httpx client<br/>+ rate limit + retry/backoff<br/>(client.py)"]
    Errors["Typed exceptions<br/>(errors.py)"]
    Log["Structured JSON logger<br/>(logging_config.py)"]
  end
  S2Graph["Semantic Scholar<br/>Graph API"]
  S2Recs["Semantic Scholar<br/>Recommendations API"]

  Client <-- "stdio (JSON-RPC)" --> FastMCP
  FastMCP --> Tools
  Tools --> Models
  Tools --> Validators
  Tools --> Cache
  Tools --> HTTP
  Tools --> Fmt
  HTTP --> Errors
  HTTP --> Log
  HTTP -- "GET / POST<br/>x-api-key" --> S2Graph
  HTTP -- "GET / POST<br/>x-api-key" --> S2Recs

Module responsibilities (src/semantic_scholar_mcp/):

| Module | Responsibility |
| --- | --- |
| server.py | FastMCP instance, 14 @mcp.tool registrations, lifespan, main() entry. Re-exports the helper surface for back-compat. |
| client.py | Shared httpx.AsyncClient singleton, per-tier rate limiter (1 req/s public, 10 req/s keyed), retry loop with exponential backoff + jitter on 429/503/timeout, HTTP→typed-exception mapping. |
| models.py | Pydantic input models per tool, ResponseFormat enum, the tiered field-set constants (PAPER_SEARCH_FIELDS, …_LITE, PAPER_BULK_SEARCH_FIELDS, PAPER_DETAIL_FIELDS, AUTHOR_FIELDS). |
| validators.py | Pre-flight paper-ID validation. Rejects NUL bytes, ?, #, path traversal; accepts the seven canonical ID formats. |
| cache.py | In-memory TTL cache (5 min, 200 entries, oldest-first eviction) for paper/author lookups within a session. |
| formatters.py | Markdown renderers for paper and author dicts, tuned for chat-surface readability. |
| errors.py | SemanticScholarError hierarchy: AuthenticationError, RateLimitError, NotFoundError, ValidationError, ServerError. |
| logging_config.py | One-JSON-per-line StructuredFormatter on stderr; safe to ship through any log aggregator. |
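
To make the cache.py row concrete, here is a minimal sketch of a TTL cache with the stated parameters (5 minutes, 200 entries, oldest-first eviction). The class name `TTLCache`, its methods, and the injectable `clock` are illustrative assumptions, not the package's actual API.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Optional


class TTLCache:
    """Tiny in-memory cache with a time-to-live and a size cap (illustrative)."""

    def __init__(self, ttl_seconds: float = 300.0, max_entries: int = 200,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._max = max_entries
        self._clock = clock
        # key -> (expiry time, value); insertion order doubles as age order.
        self._data: "OrderedDict[str, tuple]" = OrderedDict()

    def get(self, key: str) -> Optional[Any]:
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry and report a miss.
            del self._data[key]
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        # Re-inserting a key moves it to the newest position.
        self._data.pop(key, None)
        self._data[key] = (self._clock() + self._ttl, value)
        while len(self._data) > self._max:
            # Evict the oldest inserted entry first.
            self._data.popitem(last=False)
```

Passing the clock in as a parameter is only a convenience for testing expiry without sleeping.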

Design choices worth knowing

  • Single httpx.AsyncClient per process. Created lazily, closed in the FastMCP lifespan teardown. Amortizes connection setup; respects keep-alive limits.
  • Rate limit is enforced at the client, not the API. A semaphore + last-request timestamp ensures we never exceed the per-tier interval even when the MCP host issues tool calls in parallel.
  • Retry is bounded and jittered. Up to MAX_RETRIES = 3, base 1 s, capped at 30 s. Honors Retry-After when present.
  • Errors are typed. Status codes map onto a small exception hierarchy so callers can branch on AuthenticationError vs RateLimitError vs NotFoundError instead of parsing strings.
  • Input validation is pre-flight. Paper IDs are checked before any outbound request; bad IDs never hit the wire.
  • Version is single-source. __version__ is derived from importlib.metadata.version("s2-mcp-server"), so bumping pyproject.toml is sufficient; release-please bumps the manifest, server.json (×2 paths), CITATION.cff, and .zenodo.json in lockstep on every release.
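
The retry bullet above can be sketched as a delay calculation. The constants come from the text; the function name `backoff_delay`, the choice of "full jitter", and capping Retry-After at the same 30 s are assumptions made for illustration and may differ from client.py.

```python
import random
from typing import Optional

MAX_RETRIES = 3      # from the text above
BASE_DELAY_S = 1.0   # from the text above
MAX_DELAY_S = 30.0   # from the text above


def backoff_delay(attempt: int, retry_after: Optional[float] = None,
                  rng: Optional[random.Random] = None) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        # Assumption: a server-provided Retry-After wins, but is still capped.
        return min(max(retry_after, 0.0), MAX_DELAY_S)
    rng = rng or random.Random()
    # Exponential ceiling: 1 s, 2 s, 4 s, ... never above the cap.
    ceiling = min(BASE_DELAY_S * (2 ** attempt), MAX_DELAY_S)
    # "Full jitter": pick uniformly between 0 and the ceiling so that
    # parallel callers do not retry in lockstep.
    return rng.uniform(0.0, ceiling)
```

A retry loop would call `backoff_delay(attempt)` for `attempt` in `range(MAX_RETRIES)` and give up after the last attempt.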

Configuration

API Key Options

You can provide your API key in two ways:

  1. Environment Variable (recommended for persistent use):

    export SEMANTIC_SCHOLAR_API_KEY="your-api-key-here"
    
  2. Per-Request Parameter (overrides env var):

    {
      "api_key": "your-api-key-here"
    }
    

    Caution: per-request api_key values are part of the tool-call arguments and may be visible in MCP transcripts, client logs, and the LLM's tool-call history depending on the client. For production use, prefer the SEMANTIC_SCHOLAR_API_KEY environment variable. Removal of the per-request parameter is planned for a follow-up release; see .github/SECURITY.md for the tracked list.

Get a free API key at: https://www.semanticscholar.org/product/api
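
The precedence described above (per-request value over environment variable, otherwise unauthenticated access) might look like the following sketch. `resolve_api_key` and `auth_headers` are hypothetical helper names; only the environment variable name and the x-api-key header come from this document.

```python
import os
from typing import Dict, Mapping, Optional

ENV_VAR = "SEMANTIC_SCHOLAR_API_KEY"


def resolve_api_key(per_request: Optional[str] = None,
                    environ: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the key to use, or None for unauthenticated public access."""
    env = os.environ if environ is None else environ
    # A non-empty per-request value overrides the environment variable.
    for candidate in (per_request, env.get(ENV_VAR)):
        if candidate and candidate.strip():
            return candidate.strip()
    return None


def auth_headers(api_key: Optional[str]) -> Dict[str, str]:
    # The key travels in the x-api-key header; no header means public tier.
    return {"x-api-key": api_key} if api_key else {}
```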

Claude Desktop Setup

Add to your Claude Desktop config file:

Windows: %APPDATA%\Claude\claude_desktop_config.json
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Linux: ~/.config/Claude/claude_desktop_config.json

{
  "mcpServers": {
    "semantic-scholar": {
      "command": "python",
      "args": ["-m", "semantic_scholar_mcp"],
      "env": {
        "SEMANTIC_SCHOLAR_API_KEY": "your-api-key-here"
      }
    }
  }
}

Then restart Claude Desktop.


Supported ID Formats

The server accepts the following paper identifier formats:

| Format | Pattern | Example |
| --- | --- | --- |
| Semantic Scholar ID | 40-character hex | 649def34f8be52c8b66281af98ae884c09aef38b |
| DOI | DOI:xxx | DOI:10.1038/s41586-021-03819-2 |
| ArXiv | ARXIV:xxx | ARXIV:2106.15928 or ARXIV:2106.15928v2 |
| PubMed | PMID:xxx | PMID:32908142 |
| Corpus ID | CorpusId:xxx | CorpusId:215416146 |
| ACL | ACL:xxx | ACL:P19-1285 |
| URL | URL:xxx | URL:https://arxiv.org/abs/2106.15928 |
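
As a rough illustration of pre-flight ID checking, the sketch below accepts the seven formats in the table and rejects NUL bytes, ?, #, and path traversal. The regular expressions are inferred from the examples and are assumptions; the real rules in validators.py may be stricter or looser (for instance, this sketch only recognizes new-style arXiv IDs).

```python
import re

# Hypothetical patterns inferred from the table above.
_PATTERNS = (
    re.compile(r"[0-9a-fA-F]{40}"),               # Semantic Scholar ID
    re.compile(r"DOI:10\.\d{4,9}/\S+"),           # DOI
    re.compile(r"ARXIV:\d{4}\.\d{4,5}(v\d+)?"),   # arXiv, optional version
    re.compile(r"PMID:\d+"),                      # PubMed
    re.compile(r"CorpusId:\d+"),                  # Corpus ID
    re.compile(r"ACL:[A-Za-z0-9.\-]+"),           # ACL Anthology
    re.compile(r"URL:https?://\S+"),              # URL
)
# Substrings that are refused regardless of format.
_FORBIDDEN = ("\x00", "?", "#", "..")


def is_valid_paper_id(paper_id: str) -> bool:
    """True if the ID matches a known format and has no forbidden substring."""
    if not paper_id or any(bad in paper_id for bad in _FORBIDDEN):
        return False
    return any(p.fullmatch(paper_id) for p in _PATTERNS)
```

Checking IDs before building the request URL keeps malformed input from ever reaching the network.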

Tools Reference

1. semantic_scholar_search_papers

Search for academic papers with advanced filters.

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| query | string | Yes | Search query (supports AND, OR, NOT operators and "phrase search") |
| year | string | No | Year filter: "2024", "2020-2024", or "2020-" |
| fields_of_study | string[] | No | Filter by fields: ["Computer Science", "Biology"] |
| publication_types | string[] | No | Filter by type: ["Review", "JournalArticle"] |
| open_access_only | boolean | No | Only return open access papers (default: false) |
| min_citation_count | integer | No | Minimum citation count |
| limit | integer | No | Max results 1-100 (default: 10) |
| offset | integer | No | Pagination offset (default: 0) |
| response_format | string | No | "markdown" or "json" (default: markdown) |
| api_key | string | No | Override environment API key |

Example:

Search for "transformer attention mechanism" papers from 2023 with at least 100 citations

JSON Example:

{
  "query": "transformer attention mechanism",
  "year": "2023",
  "min_citation_count": 100,
  "fields_of_study": ["Computer Science"],
  "limit": 20
}
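
The three year-filter shapes listed above ("2024", "2020-2024", "2020-") can be checked client-side before a call. This is a sketch only: `parse_year_filter` is a hypothetical helper, and the server or API may accept other shapes that it rejects.

```python
import re
from typing import Optional, Tuple

# One 4-digit year, optionally followed by "-" and an optional end year.
_YEAR_RE = re.compile(r"(\d{4})(?:(-)(\d{4})?)?")


def parse_year_filter(value: str) -> Tuple[int, Optional[int]]:
    """Return (start, end); end is None for an open-ended range like "2020-"."""
    match = _YEAR_RE.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"unsupported year filter: {value!r}")
    start = int(match.group(1))
    if match.group(2) is None:
        return start, start            # single year, e.g. "2024"
    end = int(match.group(3)) if match.group(3) else None
    if end is not None and end < start:
        raise ValueError(f"end year before start year: {value!r}")
    return start, end
```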

2. semantic_scholar_get_paper

Get detailed information about a specific paper.

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| paper_id | string | Yes | Paper ID in any supported format |
| include_citations | boolean | No | Include citing papers (default: false) |
| include_references | boolean | No | Include referenced papers (default: false) |
| citations_limit | integer | No | Max citations to return 1-100 (default: 10) |
| references_limit | integer | No | Max references to return 1-100 (default: 10) |
| response_format | string | No | "markdown" or "json" (default: markdown) |
| api_key | string | No | Override environment API key |

Example:

Get details for DOI:10.1038/s41586-021-03819-2 including its top 20 citations

JSON Example:

{
  "paper_id": "DOI:10.1038/s41586-021-03819-2",
  "include_citations": true,
  "citations_limit": 20
}

3. semantic_scholar_search_authors

Search for academic authors by name.

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| query | string | Yes | Author name to search |
| limit | integer | No | Max results 1-100 (default: 10) |
| offset | integer | No | Pagination offset (default: 0) |
| response_format | string | No | "markdown" or "json" (default: markdown) |
| api_key | string | No | Override environment API key |

Example:

Find author "Yoshua Bengio"

JSON Example:

{
  "query": "Yoshua Bengio",
  "limit": 5
}

4. semantic_scholar_get_author

Get author profile with publications.

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| author_id | string | Yes | Semantic Scholar author ID |
| include_papers | boolean | No | Include publications (default: true) |
| papers_limit | integer | No | Max papers to return 1-100 (default: 20) |
| response_format | string | No | "markdown" or "json" (default: markdown) |
| api_key | string | No | Override environment API key |

Example:

Get author profile for author ID 1741101 with their top 50 publications

JSON Example:

{
  "author_id": "1741101",
  "include_papers": true,
  "papers_limit": 50
}

5. semantic_scholar_recommendations

Get AI-powered paper recommendations based on a seed paper.

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| paper_id | string | Yes | Seed paper ID in any supported format |
| limit | integer | No | Max recommendations 1-100 (default: 10) |
| response_format | string | No | "markdown" or "json" (default: markdown) |
| api_key | string | No | Override environment API key |

Example:

Get 15 recommendations based on paper ARXIV:1706.03762

JSON Example:

{
  "paper_id": "ARXIV:1706.03762",
  "limit": 15
}

6. semantic_scholar_bulk_papers

Retrieve multiple papers in a single request (max 500).

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| paper_ids | string[] | Yes | List of paper IDs (max 500) |
| response_format | string | No | "markdown" or "json" (default: json) |
| api_key | string | No | Override environment API key |

Example:

Retrieve these papers: DOI:10.1038/nature12373, ARXIV:2106.15928, PMID:32908142

JSON Example:

{
  "paper_ids": [
    "DOI:10.1038/nature12373",
    "ARXIV:2106.15928",
    "PMID:32908142"
  ]
}
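
When a workflow has more than 500 IDs, the list has to be split into several bulk_papers calls. A minimal sketch, assuming a hypothetical helper `chunk_ids` on the caller's side (de-duplicating first is an illustrative choice, not documented server behavior):

```python
from typing import Iterator, List, Sequence

MAX_BULK_IDS = 500  # per-request limit stated above


def chunk_ids(paper_ids: Sequence[str],
              size: int = MAX_BULK_IDS) -> Iterator[List[str]]:
    """Yield order-preserving, de-duplicated batches of at most `size` IDs."""
    if size < 1:
        raise ValueError("size must be at least 1")
    # dict.fromkeys keeps the first occurrence of each ID, in order.
    unique = list(dict.fromkeys(paper_ids))
    for start in range(0, len(unique), size):
        yield unique[start:start + size]
```

Each yielded batch can be passed as the `paper_ids` argument of one call.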

7. semantic_scholar_bulk_search

Search papers with sorting and cursor-based pagination for large result sets. Unlike search_papers, supports a sort order and returns a token for paging through all results.

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| query | string | Yes | Search query |
| sort | string | No | Sort order, e.g. "citationCount:desc", "publicationDate:asc" |
| token | string | No | Continuation token from a previous bulk_search response |
| year | string | No | Year filter: "2024", "2020-2024", "2020-" |
| fields_of_study | string[] | No | Filter by fields: ["Computer Science"] |
| publication_types | string[] | No | Filter by type: ["Review", "JournalArticle"] |
| min_citation_count | integer | No | Minimum citation count |
| limit | integer | No | Max results per page 1-1000 (default: 100) |
| response_format | string | No | "markdown" or "json" (default: markdown) |
| api_key | string | No | Override environment API key |

JSON Example:

{
  "query": "graph neural networks",
  "sort": "citationCount:desc",
  "year": "2020-2024",
  "limit": 100
}

Returns: total result count, the page of papers, and a token for the next page (when more results exist).
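
Paging through a large result set means feeding each response's token back into the next call until no token is returned. The loop below is a sketch under stated assumptions: `call_tool` stands in for whatever function your MCP client exposes for invoking a tool, and the response keys `"data"` and `"token"` are guesses at the JSON shape, not confirmed field names.

```python
from typing import Any, Callable, Dict, Iterator, Optional

ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def iter_bulk_results(call_tool: ToolCaller, arguments: Dict[str, Any],
                      max_pages: int = 10) -> Iterator[Dict[str, Any]]:
    """Yield papers page by page, following continuation tokens."""
    token: Optional[str] = None
    for _ in range(max_pages):          # hard stop so a loop cannot run away
        args = dict(arguments, response_format="json")
        if token:
            args["token"] = token
        page = call_tool("semantic_scholar_bulk_search", args)
        yield from page.get("data", [])  # assumed key for the page of papers
        token = page.get("token")        # assumed key for the next-page token
        if not token:
            break
```

The `max_pages` guard matters because every page costs one rate-limited request.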


8. semantic_scholar_export_citation

Export a citation for a paper in BibTeX format.

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| paper_id | string | Yes | Paper ID in any supported format |
| format | string | No | Citation format (currently only "bibtex") |
| api_key | string | No | Override environment API key |

JSON Example:

{
  "paper_id": "DOI:10.1038/s41586-021-03819-2",
  "format": "bibtex"
}

Returns: the BibTeX string for the requested paper.


9. semantic_scholar_match_paper

Find the single best paper matching a title string. Returns a numeric matchScore alongside the matched paper.

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| query | string | Yes | Paper title to match (1-500 chars) |
| response_format | string | No | "markdown" or "json" (default: markdown) |
| api_key | string | No | Override environment API key |

JSON Example:

{
  "query": "Attention Is All You Need"
}

Returns: the best-matching paper plus its matchScore, or "No matching paper found." if no match.


10. semantic_scholar_paper_authors

Get full author profiles for a paper's authors (richer than the abbreviated author list returned by get_paper).

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| paper_id | string | Yes | Paper ID in any supported format |
| limit | integer | No | Max authors to return 1-1000 (default: 100) |
| response_format | string | No | "markdown" or "json" (default: markdown) |
| api_key | string | No | Override environment API key |

JSON Example:

{
  "paper_id": "ARXIV:1706.03762",
  "limit": 25
}

Returns: the list of full author records for the paper.


11. semantic_scholar_author_batch

Retrieve multiple authors in a single request (max 1000).

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| author_ids | string[] | Yes | List of author IDs (1-1000) |
| response_format | string | No | "markdown" or "json" (default: json) |
| api_key | string | No | Override environment API key |

JSON Example:

{
  "author_ids": ["1741101", "40348417", "144749327"]
}

Returns: counts of requested / retrieved, the retrieved author records, and a not_found list of IDs the API did not return.
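
One way such a summary could be assembled from a raw batch response is sketched below. It assumes that unknown IDs come back as null entries and that each found record carries its ID under `"authorId"`; the function name and the output keys other than `not_found` are illustrative.

```python
from typing import Any, Dict, Optional, Sequence


def summarize_author_batch(
    requested_ids: Sequence[str],
    records: Sequence[Optional[Dict[str, Any]]],
) -> Dict[str, Any]:
    """Build requested/retrieved counts and the list of IDs with no record."""
    # Assumption: missing authors appear as None (JSON null) in the response.
    found = [r for r in records if r]
    found_ids = {str(r.get("authorId")) for r in found}
    not_found = [i for i in requested_ids if i not in found_ids]
    return {
        "requested": len(requested_ids),
        "retrieved": len(found),
        "authors": found,
        "not_found": not_found,
    }
```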


12. semantic_scholar_multi_recommend

Get recommendations using multiple positive (and optional negative) example papers.

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| positive_paper_ids | string[] | Yes | Papers to find similar results for (1-100) |
| negative_paper_ids | string[] | No | Papers to steer recommendations away from (0-100) |
| limit | integer | No | Max recommendations 1-500 (default: 10) |
| response_format | string | No | "markdown" or "json" (default: markdown) |
| api_key | string | No | Override environment API key |

JSON Example:

{
  "positive_paper_ids": ["ARXIV:1706.03762", "ARXIV:1810.04805"],
  "negative_paper_ids": ["DOI:10.1038/nature14539"],
  "limit": 20
}

Returns: the recommended papers plus an echo of the positive/negative seeds used.


13. semantic_scholar_snippet_search

Search within paper full text and return text snippets with surrounding context. Heavily rate-limited without an API key.

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| query | string | Yes | Search query for paper text (1-500 chars) |
| paper_ids | string[] | No | Limit search to specific papers (max 100) |
| year | string | No | Year filter: "2024", "2020-2024", "2020-" |
| fields_of_study | string[] | No | Filter by fields: ["Computer Science"] |
| min_citation_count | integer | No | Minimum citation count |
| limit | integer | No | Max results 1-100 (default: 10) |
| response_format | string | No | "markdown" or "json" (default: markdown) |
| api_key | string | No | Override environment API key |

JSON Example:

{
  "query": "scaling laws for language models",
  "year": "2022-2024",
  "limit": 20
}

Returns: matching snippets, each with the source paper title, section, and a short text excerpt.


14. semantic_scholar_status

Check server health and API connectivity status.

Parameters: None

Example:

Check Semantic Scholar API status

Response:

{
  "server": "semantic-scholar-mcp",
  "version": "1.2.2",
  "api_key_configured": true,
  "timestamp": "2026-04-06T12:00:00.000000+00:00",
  "api_reachable": true
}

Rate Limits

| Tier | Requests/Second | How to Get |
| --- | --- | --- |
| No API Key | 1 req/sec | Default |
| Free API Key | 1 req/sec | Sign up |
| Academic Partner | 10-100 req/sec | Apply via S2 |

The server automatically handles rate limiting with:

  • Request serialization to enforce minimum intervals
  • Exponential backoff retry for 429 (rate limit) and 503 (service unavailable) errors
  • Maximum 3 retries with jitter

Architecture

+-----------------+     +----------------------+     +-----------------+
|  Claude Desktop |---->| semantic-scholar-mcp |---->| Semantic Scholar|
|   (MCP Client)  |<----|     (This Server)    |<----|      API        |
+-----------------+     +----------------------+     +-----------------+
        |                         |                          |
        | stdio (JSON-RPC)        | Your API Key             | HTTPS
        | Local process           | Local machine            | 200M+ papers

Where your API key goes. The MCP server runs locally on your machine and does not store your API key on disk. When the server makes authenticated requests, the key is sent only to api.semanticscholar.org over HTTPS as the x-api-key header that the Semantic Scholar API requires. No telemetry is sent to any third party. See the per-request api_key caution above for how transcript exposure can occur when the parameter is used per-request instead of via the environment variable.


Development

# Clone
git clone https://github.com/smaniches/semantic-scholar-mcp.git
cd semantic-scholar-mcp

# Install dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Run tests with coverage
pytest --cov=src/semantic_scholar_mcp --cov-report=term-missing

# Type checking
mypy src/

Security

API keys are never persisted to disk by the server. Prefer the SEMANTIC_SCHOLAR_API_KEY environment variable over the per-request api_key tool parameter (see SECURITY.md for details on the transcript-exposure risk). All API communication uses HTTPS to api.semanticscholar.org. See SECURITY.md for vulnerability reporting and the v1.2.x known-limitations list.


Related MCP servers by the same author

  • alphafold-sovereign-mcp — Model Context Protocol server for AlphaFold DB and 13 other biomedical data sources, with a local SQLite knowledge graph (pip install --pre alphafold-sovereign-mcp).
  • uniprot-mcp — Model Context Protocol server for UniProt Swiss-Prot and TrEMBL (pip install uniprot-mcp-server).

License

MIT License - see LICENSE file.


Author

Santiago Maniches


Contributing

Contributions welcome! Please read our Contributing Guidelines.


Built by TOPOLOGICA LLC
Advancing computational research through topological intelligence
