MCP server for academic literature search across 9 APIs (OpenAlex, Semantic Scholar, PubMed, arXiv, medRxiv/bioRxiv, CrossRef, Google Scholar, ORCID, Unpaywall)
Project description
Academic Research MCP Server
A unified Model Context Protocol server that provides AI assistants with access to nine academic research APIs through 25 tools, plus local caching, systematic review management, and PRISMA-compliant workflow support. Designed for biomedical researchers who work across clinical medicine and computer science.
APIs & Data Sources
| API | Auth | Rate Limit | What it does |
|---|---|---|---|
| OpenAlex | None | 10/sec (polite pool) | 250M+ works, authors, institutions |
| Semantic Scholar | Optional | 100/5min or 1/sec with key | Paper search, citation graphs, recommendations, batch lookup |
| CrossRef | None | 50/sec (polite pool) | DOI registry fallback, author search |
| PubMed | Optional | 3/sec or 10/sec with key | NCBI E-utilities, full Boolean/MeSH query syntax |
| arXiv | None | 1 req/3 sec | CS/ML preprint search with arXiv query syntax |
| medRxiv/bioRxiv | None | Reasonable use | Preprint search, date-range browsing, publication status |
| Google Scholar | None | ~100 req then CAPTCHA | Keyword search, advanced search, author profiles |
| ORCID | None | Unlimited | Researcher profiles, publications, funding |
| Unpaywall | None | 100K/day | Legal open access PDF resolution |
No API keys required for basic use. Optional environment variables for higher throughput:
| Variable | Effect |
|---|---|
S2_API_KEY |
Semantic Scholar: 100/5min -> 1/sec sustained |
OPENALEX_EMAIL |
OpenAlex polite pool: 1/sec -> 10/sec |
CROSSREF_EMAIL |
CrossRef polite pool: faster responses (falls back to OPENALEX_EMAIL) |
NCBI_API_KEY |
PubMed: 3/sec -> 10/sec |
Installation
Option 1 — One-click (Claude Desktop)
Download academic-research-mcp.dxt and double-click it. Claude Desktop will prompt you for optional API keys and restart automatically.
Option 2 — uvx (Claude Code, Cursor, any MCP client)
No install step — uvx fetches and runs the package in an isolated environment on first use.
Add to your MCP config (.claude/mcp.json, cursor_mcp.json, etc.):
{
"mcpServers": {
"academic-research": {
"command": "uvx",
"args": ["academic-research-mcp"],
"env": {
"OPENALEX_EMAIL": "your-email@institution.edu",
"S2_API_KEY": "your-key-here",
"NCBI_API_KEY": "your-ncbi-key-here"
}
}
}
}
Requires uv (curl -LsSf https://astral.sh/uv/install.sh | sh).
Option 3 — pip install
pip install academic-research-mcp
academic-research-mcp # runs the server
Option 4 — From source
git clone https://github.com/alisoroushmd/academic-research-mcp.git
cd academic-research-mcp
pip install -e .
academic-research-mcp
All 25 Tools
Smart Tools (start here)
smart_search-- THE recommended search tool. Multi-source search with deduplication and early stopping. Auto-logs to active review when set.find_paper-- Universal paper resolver. Accepts any identifier (DOI, PMID, arXiv ID, URL, title).
Unified Search & Authors
search_papers-- Search any source viasourceparam (openalex, s2, crossref, arxiv, medrxiv, google_scholar, pubmed). For PubMed, supports full MeSH/Boolean syntax. Auto-logs to active review.search_authors-- Search for researchers across openalex, s2, orcid, google_scholar.get_author-- Detailed author profile by ID (auto-detects source from ID format).get_author_works-- Publications by author across any source.get_author_funding-- ORCID funding/grants history.
Citation & Recommendation (Semantic Scholar)
get_paper_network-- Citation network: forward (who cited this), backward (what this cites), or both.recommend_papers-- "Papers like this one" recommendations.batch_get_papers-- Batch lookup: up to 500 papers in one request.
Preprints (medRxiv/bioRxiv)
preprints-- Three modes: recent (by category), date_range, or publication status check.
Institutions
get_institution-- Institution details from OpenAlex.
PDF & Open Access
open_access-- Find legal PDFs for one or multiple DOIs (Unpaywall, PMC, preprints, repositories).
Citation Validation
validate_citations-- Verify DOIs and PMIDs actually exist before presenting to users (max 25).
Cache
cache_manage-- View cache stats or clear by category.
Systematic Review Tools
create_review-- Start a new systematic review with a name and research question.reviews-- List all reviews or get full details (search log, paper counts by status) for one.delete_review-- Permanently delete a review and all its data.set_active_review-- Activate auto-logging: all subsequent searches auto-deduplicate and log to this review.add_papers_to_review-- Manually add papers by DOI/PMID (for expert recommendations, reference lists).get_review_papers-- Paginated retrieval with optional status/search filters.update_paper_status-- Batch update screening status (new, screened_in, screened_out, included).snowball_search-- Citation harvesting: forward/backward from seed papers, deduplicated against the review library.export_review-- Export papers as DOI list or full dicts for Zotero import.prisma_counts-- PRISMA 2020 flow diagram counts (identification, screening, included).
Systematic Review Workflow
1. create_review("GIM risk SR", "gastric intestinal metaplasia risk stratification")
2. set_active_review(review_id) # enables auto-logging
3. smart_search / search_papers # results auto-deduplicate and log
4. snowball_search(review_id, seeds) # citation harvesting from key papers
5. add_papers_to_review(review_id, ["10.xxxx/..."]) # manual additions
6. get_review_papers(review_id) # browse candidates
7. update_paper_status(review_id, paper_ids, "screened_in")
8. prisma_counts(review_id) # PRISMA flow diagram numbers
9. export_review(review_id) # DOI list for Zotero import
All searches are logged with source, query, filters, raw count, and new-after-dedup count. Deduplication uses DOI (case-insensitive), PMID, and fuzzy title matching (>=85% similarity).
Abstracts
All search tools return abstracts where available. Semantic Scholar, arXiv, PubMed, and medRxiv return full abstracts directly. OpenAlex reconstructs abstracts from its inverted index format. CrossRef returns abstracts when publishers deposit them (coverage varies). Use abstract content to assess relevance before fetching full papers.
When to Use Which Source
| Source | Best for |
|---|---|
| smart_search | Default -- automatically picks the best sources |
| Semantic Scholar | AI/ML papers, citation graphs, influence metrics, batch lookups |
| OpenAlex | High-volume searches, comprehensive coverage, no rate limit worries |
| CrossRef | Fallback when other APIs throttle, DOI verification, citation counts |
| PubMed | Clinical/biomedical literature, MeSH terms, Boolean queries, field tags |
| arXiv | CS/ML preprints before journal publication |
| medRxiv/bioRxiv | Health sciences preprints, tracking publication status |
| Google Scholar | Catch-all, books, theses, non-indexed sources |
| ORCID | Researcher profiles, collaborator lookup, funding history |
| Unpaywall | Finding legal free PDFs, checking OA status before paywall |
Throughput Strategy
When you need to process papers faster than individual APIs allow:
- Use smart_search -- automatically picks OpenAlex first (10 req/sec), adds sources as needed
- S2 batch endpoint -- 500 papers in a single request via
batch_get_papers - CrossRef fallback -- 50 req/sec with email, 150M+ works
- PubMed -- 10 req/sec with NCBI API key, full MeSH vocabulary
- Cache hit -- Repeated lookups (landmark papers, your own work) are instant
- Google Scholar last -- Most aggressive rate limiting, use sparingly
Local Cache
The server includes a SQLite cache (WAL mode, singleton connection, thread-safe) at ~/.cache/academic-research-mcp/cache.db that:
- Caches results from all 9 API clients via
@cacheddecorator - Avoids redundant API calls for frequently accessed papers
- Default TTL: 24 hours for searches, 7 days for paper details, 3 days for authors
- Expired entries cleaned up automatically on server startup
- Override location with
ACADEMIC_CACHE_DIRenv var
The same database stores systematic review state (reviews, papers, searches) in separate tables.
Dependencies
- Python 3.10+
- scholarly -- Google Scholar access
- mcp -- Model Context Protocol SDK
- requests -- HTTP client
- httpx[socks] -- SOCKS proxy support for scholarly
License
MIT
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file academic_research_mcp-0.1.0.tar.gz.
File metadata
- Download URL: academic_research_mcp-0.1.0.tar.gz
- Upload date:
- Size: 59.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
d588bfbc57811a27f7b358af88d25044dae387840b530f08bec9dce6bc00f7bf
|
|
| MD5 |
049c79255cd94d486cdd02dc7378564c
|
|
| BLAKE2b-256 |
921ed77904fff10cdceeced31db8af005a7aeed6a08d4588e9cb041af4412c4f
|
File details
Details for the file academic_research_mcp-0.1.0-py3-none-any.whl.
File metadata
- Download URL: academic_research_mcp-0.1.0-py3-none-any.whl
- Upload date:
- Size: 59.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
0b070dc35a7249aae0c3060e2eca48c6f530b2fe1e232fd9bc2eb84a9a3aaa00
|
|
| MD5 |
e5e6d10f997d062fa770d69f91260040
|
|
| BLAKE2b-256 |
6e9af7b8fc06240deaadce76700219187c443134fe3fa93a5bf7682b459895ab
|