Model Context Protocol server for Semantic Scholar — 200M+ academic papers, 14 tools spanning paper search, citation graph traversal, author profiles, and recommendations.
Semantic Scholar MCP Server
A comprehensive 14-tool MCP server for Semantic Scholar academic research workflows. Direct access to 200M+ papers from Semantic Scholar — paper search, citation graph traversal, author profiles, and recommendations — from any Model Context Protocol client (e.g., Claude Desktop, Claude Code, Cursor, Cline, Continue, and others).
Installation
Option 1: One-Line Install (Recommended)
# No cloning needed — runs directly from PyPI
uvx s2-mcp-server
Option 2: Claude Code
claude mcp add semantic-scholar -- uvx s2-mcp-server
Option 3: Claude Desktop (Windows)
Add to %APPDATA%\Claude\claude_desktop_config.json:
{
"mcpServers": {
"semantic-scholar": {
"command": "uvx",
"args": ["s2-mcp-server"],
"env": {
"SEMANTIC_SCHOLAR_API_KEY": "your-key-here"
}
}
}
}
Option 4: Claude Desktop (macOS)
Add to ~/Library/Application Support/Claude/claude_desktop_config.json:
{
"mcpServers": {
"semantic-scholar": {
"command": "uvx",
"args": ["s2-mcp-server"],
"env": {
"SEMANTIC_SCHOLAR_API_KEY": "your-key-here"
}
}
}
}
Option 5: pip / From Source
pip install s2-mcp-server
# or
git clone https://github.com/smaniches/semantic-scholar-mcp.git
cd semantic-scholar-mcp && pip install -e .
Option 6: Docker
docker pull ghcr.io/smaniches/semantic-scholar-mcp:latest
docker run -i -e SEMANTIC_SCHOLAR_API_KEY=your-key ghcr.io/smaniches/semantic-scholar-mcp
Note: Get a free API key at semanticscholar.org/product/api. Without a key, you get rate-limited public access (1 req/sec).
Architecture
flowchart LR
Client["MCP client<br/>(Claude Desktop, Claude Code,<br/>Cursor, Cline, Continue, …)"]
subgraph Server ["s2-mcp-server (this package)"]
direction TB
FastMCP["FastMCP runtime<br/>(stdio transport, lifespan)"]
Tools["14 @mcp.tool functions<br/>(server.py)"]
Models["Pydantic input models<br/>+ field sets (models.py)"]
Validators["Paper-ID validator<br/>(validators.py)"]
Cache["TTL cache<br/>(cache.py)"]
Fmt["Markdown formatters<br/>(formatters.py)"]
HTTP["httpx client<br/>+ rate limit + retry/backoff<br/>(client.py)"]
Errors["Typed exceptions<br/>(errors.py)"]
Log["Structured JSON logger<br/>(logging_config.py)"]
end
S2Graph["Semantic Scholar<br/>Graph API"]
S2Recs["Semantic Scholar<br/>Recommendations API"]
Client <-- "stdio (JSON-RPC)" --> FastMCP
FastMCP --> Tools
Tools --> Models
Tools --> Validators
Tools --> Cache
Tools --> HTTP
Tools --> Fmt
HTTP --> Errors
HTTP --> Log
HTTP -- "GET / POST<br/>x-api-key" --> S2Graph
HTTP -- "GET / POST<br/>x-api-key" --> S2Recs
Module responsibilities (src/semantic_scholar_mcp/):
| Module | Responsibility |
|---|---|
| server.py | FastMCP instance, 14 @mcp.tool registrations, lifespan, main() entry. Re-exports the helper surface for back-compat. |
| client.py | Shared httpx.AsyncClient singleton, per-tier rate limiter (1 req/s public, 10 req/s keyed), retry loop with exponential backoff + jitter on 429/503/timeout, HTTP→typed-exception mapping. |
| models.py | Pydantic input models per tool, ResponseFormat enum, the four tiered field-set constants (PAPER_SEARCH_FIELDS, …_LITE, PAPER_BULK_SEARCH_FIELDS, PAPER_DETAIL_FIELDS, AUTHOR_FIELDS). |
| validators.py | Pre-flight paper-ID validation. Rejects NUL bytes, ?, #, path traversal; accepts the seven canonical ID formats. |
| cache.py | In-memory TTL cache (5 min, 200 entries, oldest-first eviction) for paper/author lookups within a session. |
| formatters.py | Markdown renderers for paper and author dicts, tuned for chat-surface readability. |
| errors.py | SemanticScholarError hierarchy: AuthenticationError, RateLimitError, NotFoundError, ValidationError, ServerError. |
| logging_config.py | One-JSON-per-line StructuredFormatter on stderr; safe to ship through any log aggregator. |
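To make the cache.py row concrete, here is a minimal sketch of a TTL cache with the stated parameters (5-minute lifetime, 200 entries, oldest-first eviction). The class name TTLCache, its method names, and the injectable clock are assumptions for illustration; the actual cache.py may be organized differently.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Optional


class TTLCache:
    """Hypothetical in-memory cache with a time-to-live and oldest-first eviction."""

    def __init__(
        self,
        ttl_seconds: float = 300.0,   # 5 minutes, as stated in the table
        max_entries: int = 200,       # as stated in the table
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._max = max_entries
        self._clock = clock
        # key -> (expiry time, value); insertion order doubles as age order.
        self._items: "OrderedDict[str, tuple]" = OrderedDict()

    def get(self, key: str) -> Optional[Any]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired entries are dropped lazily, on read.
            del self._items[key]
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        # Re-inserting a key moves it to the "newest" end.
        self._items.pop(key, None)
        self._items[key] = (self._clock() + self._ttl, value)
        while len(self._items) > self._max:
            self._items.popitem(last=False)  # evict the oldest entry
```

Passing the clock in as a parameter is only a convenience for testing expiry without sleeping.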
Design choices worth knowing
- Single httpx.AsyncClient per process. Created lazily, closed in the FastMCP lifespan teardown. Amortizes connection setup; respects keep-alive limits.
- Rate limit is enforced at the client, not the API. A semaphore + last-request timestamp ensures we never exceed the per-tier interval even when the MCP host issues tool calls in parallel.
- Retry is bounded and jittered. Up to MAX_RETRIES = 3, base 1 s, capped at 30 s. Honors Retry-After when present.
- Errors are typed. Status codes map onto a small exception hierarchy so callers can branch on AuthenticationError vs RateLimitError vs NotFoundError instead of parsing strings.
- Input validation is pre-flight. Paper IDs are checked before any outbound request; bad IDs never hit the wire.
- Version is single-source. __version__ is derived from importlib.metadata.version("s2-mcp-server"), so bumping pyproject.toml is sufficient; release-please bumps the manifest, server.json (×2 paths), CITATION.cff, and .zenodo.json in lockstep on every release.
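The client-side rate limit (a semaphore plus a last-request timestamp) can be sketched roughly as follows. This is an illustration under assumptions, not the code in client.py: the class name IntervalRateLimiter and the demo helper are hypothetical.

```python
import asyncio
import time


class IntervalRateLimiter:
    """Hypothetical limiter: serializes callers and spaces them min_interval apart."""

    def __init__(self, requests_per_second: float) -> None:
        self._min_interval = 1.0 / requests_per_second
        self._semaphore = asyncio.Semaphore(1)   # one caller at a time
        self._last_request = float("-inf")       # no request made yet

    async def wait(self) -> None:
        async with self._semaphore:
            elapsed = time.monotonic() - self._last_request
            if elapsed < self._min_interval:
                await asyncio.sleep(self._min_interval - elapsed)
            self._last_request = time.monotonic()


async def demo() -> list:
    """Fire three 'tool calls' in parallel and record when each is released."""
    limiter = IntervalRateLimiter(requests_per_second=20)
    stamps = []

    async def call() -> None:
        await limiter.wait()
        stamps.append(time.monotonic())

    await asyncio.gather(*(call() for _ in range(3)))
    return stamps
```

Even though the three calls start together, each one is released only after the previous one has recorded its timestamp, so the spacing holds regardless of how many tool calls the host issues at once.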
Configuration
API Key Options
You can provide your API key in two ways:
- Environment Variable (recommended for persistent use):
  export SEMANTIC_SCHOLAR_API_KEY="your-api-key-here"
- Per-Request Parameter (overrides env var):
  { "api_key": "your-api-key-here" }
Caution: per-request api_key values are part of the tool-call arguments and may be visible in MCP transcripts, client logs, and the LLM's tool-call history depending on the client. For production use, prefer the SEMANTIC_SCHOLAR_API_KEY environment variable. Removal of the per-request parameter is planned for a follow-up release; see .github/SECURITY.md for the tracked list.
Get a free API key at: https://www.semanticscholar.org/product/api
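The precedence between the two options (per-request value first, environment variable second) can be sketched as below. The function names resolve_api_key and build_headers are hypothetical; only the environment variable name and the x-api-key header come from this README.

```python
import os
from typing import Dict, Optional


def resolve_api_key(per_request_key: Optional[str] = None) -> Optional[str]:
    """Return the key to use; a per-request value overrides the environment."""
    if per_request_key:
        return per_request_key
    # An unset or empty variable means "no key": fall back to public access.
    return os.environ.get("SEMANTIC_SCHOLAR_API_KEY") or None


def build_headers(api_key: Optional[str]) -> Dict[str, str]:
    """Attach the x-api-key header only when a key is available."""
    return {"x-api-key": api_key} if api_key else {}
```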
Claude Desktop Setup
Add to your Claude Desktop config file:
Windows: %APPDATA%\Claude\claude_desktop_config.json
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Linux: ~/.config/Claude/claude_desktop_config.json
{
"mcpServers": {
"semantic-scholar": {
"command": "python",
"args": ["-m", "semantic_scholar_mcp"],
"env": {
"SEMANTIC_SCHOLAR_API_KEY": "your-api-key-here"
}
}
}
}
Then restart Claude Desktop.
Supported ID Formats
The server accepts the following paper identifier formats:
| Format | Pattern | Example |
|---|---|---|
| Semantic Scholar ID | 40-character hex | 649def34f8be52c8b66281af98ae884c09aef38b |
| DOI | DOI:xxx | DOI:10.1038/s41586-021-03819-2 |
| ArXiv | ARXIV:xxx | ARXIV:2106.15928 or ARXIV:2106.15928v2 |
| PubMed | PMID:xxx | PMID:32908142 |
| Corpus ID | CorpusId:xxx | CorpusId:215416146 |
| ACL | ACL:xxx | ACL:P19-1285 |
| URL | URL:xxx | URL:https://arxiv.org/abs/2106.15928 |
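As an illustration of what a pre-flight check over these seven formats might look like, here is a minimal sketch. It is not the code in validators.py: the function name is_valid_paper_id, the exact rejection rules, and the requirement that PMID and CorpusId values be numeric are assumptions.

```python
import re

# Assumed rejection list: NUL byte, '?', '#', and '..' as a path-traversal marker.
_FORBIDDEN = ("\x00", "?", "#", "..")
_PREFIXES = ("DOI:", "ARXIV:", "PMID:", "CorpusId:", "ACL:", "URL:")
_NUMERIC_PREFIXES = ("PMID:", "CorpusId:")
_S2_ID = re.compile(r"[0-9a-fA-F]{40}")


def is_valid_paper_id(paper_id: str) -> bool:
    """Return True if paper_id looks like one of the seven listed formats."""
    if not paper_id:
        return False
    if any(bad in paper_id for bad in _FORBIDDEN):
        return False
    if _S2_ID.fullmatch(paper_id):
        return True  # bare 40-character hex Semantic Scholar ID
    for prefix in _PREFIXES:
        if paper_id.startswith(prefix):
            rest = paper_id[len(prefix):]
            if prefix in _NUMERIC_PREFIXES:
                return rest.isascii() and rest.isdigit()
            return bool(rest)
    return False
```

Every example in the table above passes this check, while inputs such as "PMID:12a" or "DOI:10.1038/../x" are rejected before any request would be sent.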
Tools Reference
1. semantic_scholar_search_papers
Search for academic papers with advanced filters.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| query | string | Yes | Search query (supports AND, OR, NOT operators and "phrase search") |
| year | string | No | Year filter: "2024", "2020-2024", or "2020-" |
| fields_of_study | string[] | No | Filter by fields: ["Computer Science", "Biology"] |
| publication_types | string[] | No | Filter by type: ["Review", "JournalArticle"] |
| open_access_only | boolean | No | Only return open access papers (default: false) |
| min_citation_count | integer | No | Minimum citation count |
| limit | integer | No | Max results 1-100 (default: 10) |
| offset | integer | No | Pagination offset (default: 0) |
| response_format | string | No | "markdown" or "json" (default: markdown) |
| api_key | string | No | Override environment API key |
Example:
Search for "transformer attention mechanism" papers from 2023 with at least 100 citations
JSON Example:
{
"query": "transformer attention mechanism",
"year": "2023",
"min_citation_count": 100,
"fields_of_study": ["Computer Science"],
"limit": 20
}
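The three year-filter forms listed for this tool ("2024", "2020-2024", "2020-") can be read as a closed or open-ended range. The sketch below shows one way to interpret them; parse_year_filter is a hypothetical helper for illustration and is not part of the server's API.

```python
import re
from typing import Optional, Tuple

# A four-digit start year, optionally followed by '-' and an optional end year.
_YEAR_FILTER = re.compile(r"([0-9]{4})(?:(-)([0-9]{4})?)?")


def parse_year_filter(value: str) -> Tuple[int, Optional[int]]:
    """Return (start, end); end is None for an open-ended range like '2020-'."""
    match = _YEAR_FILTER.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"unsupported year filter: {value!r}")
    start = int(match.group(1))
    if match.group(2) is None:
        return start, start  # single year, e.g. "2024"
    end = int(match.group(3)) if match.group(3) else None
    if end is not None and end < start:
        raise ValueError(f"end year precedes start year: {value!r}")
    return start, end
```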
2. semantic_scholar_get_paper
Get detailed information about a specific paper.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| paper_id | string | Yes | Paper ID in any supported format |
| include_citations | boolean | No | Include citing papers (default: false) |
| include_references | boolean | No | Include referenced papers (default: false) |
| citations_limit | integer | No | Max citations to return 1-100 (default: 10) |
| references_limit | integer | No | Max references to return 1-100 (default: 10) |
| response_format | string | No | "markdown" or "json" (default: markdown) |
| api_key | string | No | Override environment API key |
Example:
Get details for DOI:10.1038/s41586-021-03819-2 including its top 20 citations
JSON Example:
{
"paper_id": "DOI:10.1038/s41586-021-03819-2",
"include_citations": true,
"citations_limit": 20
}
3. semantic_scholar_search_authors
Search for academic authors by name.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| query | string | Yes | Author name to search |
| limit | integer | No | Max results 1-100 (default: 10) |
| offset | integer | No | Pagination offset (default: 0) |
| response_format | string | No | "markdown" or "json" (default: markdown) |
| api_key | string | No | Override environment API key |
Example:
Find author "Yoshua Bengio"
JSON Example:
{
"query": "Yoshua Bengio",
"limit": 5
}
4. semantic_scholar_get_author
Get author profile with publications.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| author_id | string | Yes | Semantic Scholar author ID |
| include_papers | boolean | No | Include publications (default: true) |
| papers_limit | integer | No | Max papers to return 1-100 (default: 20) |
| response_format | string | No | "markdown" or "json" (default: markdown) |
| api_key | string | No | Override environment API key |
Example:
Get author profile for author ID 1741101 with their top 50 publications
JSON Example:
{
"author_id": "1741101",
"include_papers": true,
"papers_limit": 50
}
5. semantic_scholar_recommendations
Get AI-powered paper recommendations based on a seed paper.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| paper_id | string | Yes | Seed paper ID in any supported format |
| limit | integer | No | Max recommendations 1-100 (default: 10) |
| response_format | string | No | "markdown" or "json" (default: markdown) |
| api_key | string | No | Override environment API key |
Example:
Get recommendations based on paper 649def34f8be52c8b66281af98ae884c09aef38b
JSON Example:
{
"paper_id": "ARXIV:1706.03762",
"limit": 15
}
6. semantic_scholar_bulk_papers
Retrieve multiple papers in a single request (max 500).
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| paper_ids | string[] | Yes | List of paper IDs (max 500) |
| response_format | string | No | "markdown" or "json" (default: json) |
| api_key | string | No | Override environment API key |
Example:
Retrieve these papers: DOI:10.1038/nature12373, ARXIV:2106.15928, PMID:32908142
JSON Example:
{
"paper_ids": [
"DOI:10.1038/nature12373",
"ARXIV:2106.15928",
"PMID:32908142"
]
}
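Because a single call accepts at most 500 IDs, a caller with a longer list has to split it first. A minimal sketch of that client-side step follows; the chunked helper is hypothetical and not something the server provides.

```python
from typing import Iterator, List, Sequence


def chunked(ids: Sequence[str], size: int = 500) -> Iterator[List[str]]:
    """Yield consecutive slices of ids, each holding at most size items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(ids), size):
        yield list(ids[start:start + size])
```

Each yielded list can then be passed as the paper_ids argument of one semantic_scholar_bulk_papers call.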
7. semantic_scholar_bulk_search
Search papers with sorting and cursor-based pagination for large result sets.
Unlike search_papers, supports a sort order and returns a token for
paging through all results.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| query | string | Yes | Search query |
| sort | string | No | Sort order, e.g. "citationCount:desc", "publicationDate:asc" |
| token | string | No | Continuation token from a previous bulk_search response |
| year | string | No | Year filter: "2024", "2020-2024", "2020-" |
| fields_of_study | string[] | No | Filter by fields: ["Computer Science"] |
| publication_types | string[] | No | Filter by type: ["Review", "JournalArticle"] |
| min_citation_count | integer | No | Minimum citation count |
| limit | integer | No | Max results per page 1-1000 (default: 100) |
| response_format | string | No | "markdown" or "json" (default: markdown) |
| api_key | string | No | Override environment API key |
JSON Example:
{
"query": "graph neural networks",
"sort": "citationCount:desc",
"year": "2020-2024",
"limit": 100
}
Returns: total result count, the page of papers, and a token for the
next page (when more results exist).
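Paging with the continuation token might look like the loop below. This is a sketch under assumptions: call_tool stands in for whatever function your MCP client uses to invoke semantic_scholar_bulk_search, and the JSON keys "data" and "token" are assumed names for the page of papers and the next-page token, so check an actual response before relying on them.

```python
from typing import Any, Callable, Dict, Iterator, Optional

# Hypothetical signature: takes tool arguments, returns the parsed JSON response.
ToolCall = Callable[[Dict[str, Any]], Dict[str, Any]]


def iter_bulk_results(
    call_tool: ToolCall,
    arguments: Dict[str, Any],
    max_pages: int = 10,
) -> Iterator[Dict[str, Any]]:
    """Yield papers page by page until no token is returned or max_pages is hit."""
    token: Optional[str] = None
    for _ in range(max_pages):
        page_args = dict(arguments, response_format="json")
        if token:
            page_args["token"] = token
        page = call_tool(page_args)
        yield from page.get("data", [])   # assumed key for the page of papers
        token = page.get("token")         # assumed key for the next-page token
        if not token:
            break
```

The max_pages bound is a safety valve so that a broad query cannot page indefinitely.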
8. semantic_scholar_export_citation
Export a citation for a paper in BibTeX format.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| paper_id | string | Yes | Paper ID in any supported format |
| format | string | No | Citation format (currently only "bibtex") |
| api_key | string | No | Override environment API key |
JSON Example:
{
"paper_id": "DOI:10.1038/s41586-021-03819-2",
"format": "bibtex"
}
Returns: the BibTeX string for the requested paper.
9. semantic_scholar_match_paper
Find the single best paper matching a title string. Returns a numeric
matchScore alongside the matched paper.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| query | string | Yes | Paper title to match (1-500 chars) |
| response_format | string | No | "markdown" or "json" (default: markdown) |
| api_key | string | No | Override environment API key |
JSON Example:
{
"query": "Attention Is All You Need"
}
Returns: the best-matching paper plus its matchScore, or "No matching
paper found." if no match.
10. semantic_scholar_paper_authors
Get full author profiles for a paper's authors (richer than the abbreviated
author list returned by get_paper).
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| paper_id | string | Yes | Paper ID in any supported format |
| limit | integer | No | Max authors to return 1-1000 (default: 100) |
| response_format | string | No | "markdown" or "json" (default: markdown) |
| api_key | string | No | Override environment API key |
JSON Example:
{
"paper_id": "ARXIV:1706.03762",
"limit": 25
}
Returns: the list of full author records for the paper.
11. semantic_scholar_author_batch
Retrieve multiple authors in a single request (max 1000).
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| author_ids | string[] | Yes | List of author IDs (1-1000) |
| response_format | string | No | "markdown" or "json" (default: json) |
| api_key | string | No | Override environment API key |
JSON Example:
{
"author_ids": ["1741101", "40348417", "144749327"]
}
Returns: counts of requested / retrieved, the retrieved author
records, and a not_found list of IDs the API did not return.
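One way a not_found list like this could be derived is sketched below. It assumes the upstream batch endpoint returns one entry per requested ID, in order, with None for IDs it does not recognize; that behavior, the function name summarize_batch, and the output key names are all assumptions rather than a description of the server's code.

```python
from typing import Any, Dict, List, Optional, Sequence


def summarize_batch(
    requested_ids: Sequence[str],
    returned: Sequence[Optional[Dict[str, Any]]],
) -> Dict[str, Any]:
    """Split a positional batch response into retrieved records and missing IDs."""
    authors: List[Dict[str, Any]] = [rec for rec in returned if rec is not None]
    not_found = [
        author_id
        for author_id, rec in zip(requested_ids, returned)
        if rec is None
    ]
    return {
        "requested": len(requested_ids),
        "retrieved": len(authors),
        "authors": authors,
        "not_found": not_found,
    }
```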
12. semantic_scholar_multi_recommend
Get recommendations using multiple positive (and optional negative) example papers.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| positive_paper_ids | string[] | Yes | Papers to find similar results for (1-100) |
| negative_paper_ids | string[] | No | Papers the results should be dissimilar to (0-100) |
| limit | integer | No | Max recommendations 1-500 (default: 10) |
| response_format | string | No | "markdown" or "json" (default: markdown) |
| api_key | string | No | Override environment API key |
JSON Example:
{
"positive_paper_ids": ["ARXIV:1706.03762", "ARXIV:1810.04805"],
"negative_paper_ids": ["DOI:10.1038/nature14539"],
"limit": 20
}
Returns: the recommended papers plus an echo of the positive/negative seeds used.
13. semantic_scholar_snippet_search
Search within paper full text and return text snippets with surrounding context. Heavily rate-limited without an API key.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| query | string | Yes | Search query for paper text (1-500 chars) |
| paper_ids | string[] | No | Limit search to specific papers (max 100) |
| year | string | No | Year filter: "2024", "2020-2024", "2020-" |
| fields_of_study | string[] | No | Filter by fields: ["Computer Science"] |
| min_citation_count | integer | No | Minimum citation count |
| limit | integer | No | Max results 1-100 (default: 10) |
| response_format | string | No | "markdown" or "json" (default: markdown) |
| api_key | string | No | Override environment API key |
JSON Example:
{
"query": "scaling laws for language models",
"year": "2022-2024",
"limit": 20
}
Returns: matching snippets, each with the source paper title, section, and a short text excerpt.
14. semantic_scholar_status
Check server health and API connectivity status.
Parameters: None
Example:
Check Semantic Scholar API status
Response:
{
"server": "semantic-scholar-mcp",
"version": "1.2.2",
"api_key_configured": true,
"timestamp": "2026-04-06T12:00:00.000000+00:00",
"api_reachable": true
}
Rate Limits
| Tier | Requests/Second | How to Get |
|---|---|---|
| No API Key | 1 req/sec | Default |
| Free API Key | 1 req/sec | Sign up |
| Academic Partner | 10-100 req/sec | Apply via S2 |
The server automatically handles rate limiting with:
- Request serialization to enforce minimum intervals
- Exponential backoff retry for 429 (rate limit) and 503 (service unavailable) errors
- Maximum 3 retries with jitter
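The retry policy can be sketched as follows, using the constants given under "Design choices worth knowing" (MAX_RETRIES = 3, base 1 s, cap 30 s, Retry-After honored). The function names, the choice of "full jitter" (a uniform draw between zero and the exponential ceiling), and capping Retry-After at the same 30 s are assumptions; client.py may compute its delays differently.

```python
import random
from typing import Optional

MAX_RETRIES = 3
BASE_DELAY = 1.0      # seconds
MAX_DELAY = 30.0      # seconds
RETRYABLE_STATUS = {429, 503}


def should_retry(status_code: int, attempt: int) -> bool:
    """attempt is the number of retries already made (0 for the first failure)."""
    return status_code in RETRYABLE_STATUS and attempt < MAX_RETRIES


def backoff_delay(attempt: int, retry_after: Optional[float] = None) -> float:
    """Seconds to wait before the next retry."""
    if retry_after is not None:
        # Prefer the server's hint, clamped to [0, MAX_DELAY] (assumed policy).
        return min(max(retry_after, 0.0), MAX_DELAY)
    ceiling = min(BASE_DELAY * (2 ** attempt), MAX_DELAY)
    # Full jitter: spread retries uniformly so parallel clients do not align.
    return random.uniform(0.0, ceiling)
```

With these constants the un-jittered ceilings for the three retries are 1 s, 2 s, and 4 s, so the 30 s cap only matters when a Retry-After value is large.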
Architecture
+-----------------+     +----------------------+     +-----------------+
| Claude Desktop  |---->| semantic-scholar-mcp |---->| Semantic Scholar|
|  (MCP Client)   |<----|    (This Server)     |<----|       API       |
+-----------------+     +----------------------+     +-----------------+
        |                         |                          |
        | stdio (JSON-RPC)        | Your API Key             | HTTPS
        | Local process           | Local machine            | 200M+ papers
Where your API key goes. The MCP server runs locally on your machine and
does not store your API key on disk. When the server makes authenticated
requests, the key is sent only to api.semanticscholar.org over HTTPS as
the x-api-key header that the Semantic Scholar API requires. No telemetry
is sent to any third party. See the per-request api_key caution above for
how transcript exposure can occur when the parameter is used per-request
instead of via the environment variable.
Development
# Clone
git clone https://github.com/smaniches/semantic-scholar-mcp.git
cd semantic-scholar-mcp
# Install dev dependencies
pip install -e ".[dev]"
# Run tests
pytest
# Run tests with coverage
pytest --cov=src/semantic_scholar_mcp --cov-report=term-missing
# Type checking
mypy src/
Security
API keys are never persisted to disk by the server. Prefer the
SEMANTIC_SCHOLAR_API_KEY environment variable over the per-request api_key
tool parameter (see SECURITY.md for details on the
transcript-exposure risk). All API communication uses HTTPS to
api.semanticscholar.org. See SECURITY.md for
vulnerability reporting and the v1.2.x known-limitations list.
Related MCP servers by the same author
- alphafold-sovereign-mcp: Model Context Protocol server for AlphaFold DB and 13 other biomedical data sources, with a local SQLite knowledge graph (pip install --pre alphafold-sovereign-mcp).
- uniprot-mcp: Model Context Protocol server for UniProt Swiss-Prot and TrEMBL (pip install uniprot-mcp-server).
License
MIT License - see LICENSE file.
Author
Santiago Maniches
- Founder & CEO, TOPOLOGICA LLC
- ORCID: 0009-0005-6480-1987
- LinkedIn: santiago-maniches
- Website: topologica.ai
Contributing
Contributions welcome! Please read our Contributing Guidelines.
Support
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Contact: santiago@topologica.ai
Built by TOPOLOGICA LLC
Advancing computational research through topological intelligence