An MCP toolkit for paper searching, manuscript processing, and academic research workflows.

Scholar Toolkit MCP

Fork Notice: This is a fork of openags/paper-toolkit-mcp, originally created by P.S Zhang. This fork extends the project with manuscript processing, search caching, and citation export features. Both versions are licensed under MIT.

A comprehensive MCP toolkit for paper searching, manuscript processing, and academic research workflows. The project follows a free-first strategy: prioritize open and public data sources, support optional API keys when they improve stability or coverage, and keep source-specific connectors extensible for advanced users.

New Features (v0.2.0): Manuscript processing with citation placeholders, search caching, BibTeX/RIS export, and one-click Word document generation.

Manuscript Processing (New)

Workflow

1. Write your paper in Markdown with citation placeholders.
2. Process it with the paper-toolkit manuscript command.
3. Import refs.ris into Zotero (optional).
4. Submit the generated draft_final.docx (produced only when --docx is used and pandoc is installed).

Supported Placeholders

[@doi:10.1038/s41591-020-0001-2]
[@pmid:32145678]
[@arxiv:2106.12345]
[@title:Attention Is All You Need]
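
The placeholder syntax above can be recognized with a single regular expression. The following is a minimal sketch, not the project's actual parser: the function name `number_citations` is hypothetical, and the rule that a repeated placeholder reuses its first number is an assumption.

```python
import re

# Matches [@kind:value] for the four documented placeholder kinds.
PLACEHOLDER = re.compile(r"\[@(doi|pmid|arxiv|title):([^\]]+)\]")


def number_citations(text):
    """Replace each placeholder with [n] (hypothetical helper).

    Assumption: a placeholder that appears twice keeps the number
    it received on first appearance.
    """
    order = []  # (kind, value) pairs in order of first appearance

    def substitute(match):
        key = (match.group(1), match.group(2).strip())
        if key not in order:
            order.append(key)
        return f"[{order.index(key) + 1}]"

    return PLACEHOLDER.sub(substitute, text), order


sample = (
    "Deep learning in imaging[@doi:10.1038/s41591-020-0001-2]. "
    "Transformers[@title:Attention Is All You Need] changed NLP, "
    "as earlier work showed[@doi:10.1038/s41591-020-0001-2]."
)
numbered, refs = number_citations(sample)
print(numbered)
print(refs)
```

The returned `refs` list is what a later step would resolve to metadata, one lookup per unique identifier.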

Usage

# Basic usage (generates formatted markdown + BibTeX + RIS)
paper-toolkit manuscript draft.md

# With Word document generation (requires pandoc)
paper-toolkit manuscript draft.md --docx

# Specify citation style
paper-toolkit manuscript draft.md -s apa
paper-toolkit manuscript draft.md -s ieee
paper-toolkit manuscript draft.md -s gb7714

# Custom output directory
paper-toolkit manuscript draft.md -o ./output

# Disable specific outputs
paper-toolkit manuscript draft.md --no-bib --no-ris

Citation Styles

Style	Code	Description
GB/T 7714-2015	`gb7714`	Chinese national standard (numeric)
APA 7th	`apa`	American Psychological Association
IEEE	`ieee`	Institute of Electrical and Electronics Engineers
Vancouver	`vancouver`	International Committee of Medical Journal Editors
Harvard	`harvard`	Author-date format

Output Files

After processing, you get:

draft_formatted.md - Markdown with numbered citations [1], [2], ...
draft_final.docx - Word document (if --docx used and pandoc installed)
refs.bib - BibTeX file (can be imported to Zotero/JabRef)
refs.ris - RIS file (Zotero/EndNote/Mendeley compatible)
draft_references.txt - Plain text reference list
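
To make the two export formats concrete, here is a rough sketch of how one metadata record could be rendered as a BibTeX entry and as an RIS record. The helper names `to_bibtex` and `to_ris`, the dictionary layout, and the sample record are all illustrative assumptions; the files the tool writes may contain more fields.

```python
def to_bibtex(key, paper):
    """Render one @article entry from a plain metadata dict (hypothetical layout)."""
    authors = " and ".join(paper["authors"])
    return (
        f"@article{{{key},\n"
        f"  title = {{{paper['title']}}},\n"
        f"  author = {{{authors}}},\n"
        f"  year = {{{paper['year']}}},\n"
        f"  doi = {{{paper['doi']}}}\n"
        f"}}"
    )


def to_ris(paper):
    """Render the same record as RIS: one tag per line, closed by ER."""
    lines = ["TY  - JOUR"]
    lines += [f"AU  - {name}" for name in paper["authors"]]
    lines += [
        f"TI  - {paper['title']}",
        f"PY  - {paper['year']}",
        f"DO  - {paper['doi']}",
        "ER  - ",
    ]
    return "\n".join(lines)


# Placeholder record, not a real publication.
example = {
    "title": "Example Title",
    "authors": ["Doe, Jane", "Roe, Richard"],
    "year": 2020,
    "doi": "10.1234/example",
}
print(to_bibtex("doe2020", example))
print(to_ris(example))
```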

Search Caching (New)

How It Works

Search results are cached as JSON files in .paper_cache/
Cache location is relative to current working directory
Follows your project folder — copy the folder, cache moves with it
TTL (time-to-live) is 24 hours by default
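
The behavior described above (JSON files under .paper_cache/, a 24-hour TTL) can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: the key scheme (a hash of the query and source list), the `saved_at` field, and the function names are all hypothetical.

```python
import hashlib
import json
import time
from pathlib import Path

CACHE_DIR = Path(".paper_cache")  # relative to the current working directory
TTL_SECONDS = 24 * 60 * 60        # 24-hour default


def _cache_path(query, sources, cache_dir=CACHE_DIR):
    # Assumed key scheme: hash of the query plus the sorted source list.
    raw = json.dumps({"q": query, "s": sorted(sources)}, sort_keys=True)
    digest = hashlib.sha256(raw.encode("utf-8")).hexdigest()[:12]
    return cache_dir / f"{digest}.json"


def cache_get(query, sources, cache_dir=CACHE_DIR, ttl=TTL_SECONDS, now=None):
    """Return cached results, or None if missing or older than the TTL."""
    path = _cache_path(query, sources, cache_dir)
    if not path.exists():
        return None
    entry = json.loads(path.read_text(encoding="utf-8"))
    age = (time.time() if now is None else now) - entry["saved_at"]
    return entry["results"] if age < ttl else None


def cache_put(query, sources, results, cache_dir=CACHE_DIR, now=None):
    """Write results as one JSON file, creating the cache folder if needed."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    entry = {"saved_at": time.time() if now is None else now, "results": results}
    _cache_path(query, sources, cache_dir).write_text(json.dumps(entry), encoding="utf-8")
```

Because the path is relative, calling these from a different working directory reads and writes a different cache, which matches the "follows your project folder" behavior.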

Cache Location

your_project/
├── draft.md
├── refs.bib
└── .paper_cache/          ← Cache is here
    ├── abc123.json        ← Cached search results
    └── def456.json

Manage Cache

# List cached items
paper-toolkit cache list

# Clear all cache
paper-toolkit cache clear

Or via MCP tools: cache_list(), cache_clear()

CLI Usage

# Search papers
paper-toolkit search "machine learning" -s arxiv,semantic -n 10

# Download PDF
paper-toolkit download arxiv 2106.12345

# Read paper (extract text)
paper-toolkit read arxiv 2106.12345

# Get paper metadata
paper-toolkit search "attention is all you need" -s crossref -n 1

# Process manuscript
paper-toolkit manuscript draft.md -s gb7714 --docx

# Cache management
paper-toolkit cache list
paper-toolkit cache clear

# List available sources
paper-toolkit sources

Table of Contents

Overview
New Features
Project Principles
Features
Source Strategy
Sci-Hub Notice
Installation
Manuscript Processing
Search Caching
CLI Usage
Contributing
Star History
License
TODO

New Features (v0.2.0)

Manuscript Processing

Write your paper in Markdown with citation placeholders, then generate a formatted Word document with references automatically:

# Introduction
Deep learning has made significant progress in medical imaging[@doi:10.1038/s41591-020-0001-2].
Transformer architecture revolutionized NLP[@title:Attention Is All You Need].

Process it:

paper-toolkit manuscript draft.md -s gb7714 --docx

Output:

draft_formatted.md - Text with numbered citations [1], [2], ...
refs.bib - BibTeX file (for Zotero/EndNote import)
refs.ris - RIS file (Zotero compatible)
draft_final.docx - Word document with formatted references

Supported placeholders: [@doi:...], [@pmid:...], [@arxiv:...], [@title:...]

Supported citation styles: GB/T 7714-2015, APA 7th, IEEE, Vancouver, Harvard

Search Caching

Search results are automatically cached in .paper_cache/ (relative to current working directory):

Follows your workspace: Cache is saved in the folder you're working in
Portable: Copy your project folder and cache moves with it
Easy management: Delete .paper_cache/ manually to clear the cache

MCP Tools Added

process_manuscript - Process manuscript with citations
get_paper_metadata - Get paper metadata by identifier
export_references - Export references in BibTeX/RIS/text format
cache_list / cache_clear - Manage search cache

Overview

paper-toolkit-mcp is a Python-based tool for searching and downloading academic papers from various platforms. It provides tools for searching papers, downloading PDFs, and extracting text, making it ideal for researchers and AI-driven workflows. It can be used as an MCP server (for Claude Desktop and other MCP clients) or as a Claude Code skill with a CLI interface.

Project Principles

Free-First: Public and open sources are the default roadmap. Paid or restricted sources are not the core direction of this project.
Optional API Keys: API keys are supported only when they improve stability, rate limits, or metadata quality. The MCP should still be usable without them whenever possible.
LLM-Friendly Retrieval: Search results should be standardized, deduplicated, and as complete as possible for downstream LLM workflows.
Source Transparency: Different sources have different strengths. The MCP should make those tradeoffs explicit instead of pretending every source supports full-text retrieval.

Features

Two-Layer Architecture:
- Layer 1 (Unified Tooling): High-level search_papers for multi-source concurrent search & deduplication, and download_with_fallback relying on publisher open access links with sequential fallbacks.
- Layer 2 (Platform Connectors): Modular connectors for specific academic platforms (arXiv, PubMed, bioRxiv, Semantic Scholar, etc.) equipped with intelligent DOI extraction via regex text analysis or API fields.
Multi-Source Support: Search and download papers from arXiv, PubMed, bioRxiv, medRxiv, Google Scholar, IACR ePrint Archive, Semantic Scholar, Crossref, OpenAlex, PubMed Central (PMC), CORE, Europe PMC, dblp, OpenAIRE, CiteSeerX, DOAJ, BASE, Zenodo, HAL, SSRN, Unpaywall (DOI lookup), and optional Sci-Hub workflows.
Standardized Output: Papers are returned in a consistent dictionary format via the Paper class.
Free-First Design: Open and public sources are prioritized before any optional commercial or restricted integrations.
Optional API-Key Enhancement: Sources like Semantic Scholar can work better with a user-provided API key, but are not intended to force paid usage.
Discovery + Retrieval Workflow: Google Scholar and Crossref can be used for discovery and DOI backfilling, while open repositories and publisher links are used for lawful full-text resolution where available.
OA-First Fallback Chain: download_with_fallback now follows source-native download → OpenAIRE/CORE/Europe PMC/PMC discovery → Unpaywall DOI resolution → optional Sci-Hub.
MCP Integration: Compatible with MCP clients for LLM context enhancement.
Extensible Design: Easily add new academic platforms by extending the academic_platforms module.
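
The Layer 1 idea (query several connectors concurrently, then deduplicate into a consistent dictionary format) could look roughly like the sketch below. Everything here is illustrative: the `Paper` fields are a hypothetical subset, the two connectors are stand-ins that return placeholder records, and the deduplication rule (DOI first, normalized title otherwise) is an assumption rather than the project's documented behavior.

```python
import asyncio
from dataclasses import asdict, dataclass


@dataclass
class Paper:
    # Hypothetical subset of fields; the project's Paper class may differ.
    title: str
    doi: str = ""
    source: str = ""


async def search_source_a(query):
    # Stand-in connector; a real one would call the platform's API.
    return [Paper("Example Paper A", "10.1234/a", "source_a"),
            Paper("Example Paper B", "", "source_a")]


async def search_source_b(query):
    return [Paper("example paper a", "10.1234/A", "source_b"),
            Paper("Example Paper C", "10.1234/c", "source_b")]


def dedup_key(paper):
    # Prefer the DOI (case-insensitive); fall back to a normalized title.
    return paper.doi.lower() if paper.doi else " ".join(paper.title.lower().split())


async def search_all(query, connectors):
    """Run all connectors concurrently and merge results, first hit wins."""
    batches = await asyncio.gather(*(c(query) for c in connectors),
                                   return_exceptions=True)
    seen, merged = set(), []
    for batch in batches:
        if isinstance(batch, Exception):
            continue  # one failing source should not sink the whole search
        for paper in batch:
            key = dedup_key(paper)
            if key not in seen:
                seen.add(key)
                merged.append(asdict(paper))
    return merged


results = asyncio.run(search_all("example", [search_source_a, search_source_b]))
print([r["title"] for r in results])
```

Using `return_exceptions=True` keeps a single unavailable source from failing the whole federated search, which fits the upstream instability described for several platforms.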

Source Strategy

The long-term goal is not to depend on a single search engine, but to combine multiple free and public sources with clear roles:

Open metadata backbone: Crossref, OpenAlex, Semantic Scholar, dblp, CiteSeerX, SSRN, Unpaywall (DOI-centric OA metadata).
Discipline-specific sources: arXiv, PubMed, PubMed Central, Europe PMC, IACR.
Open-access full-text sources: arXiv, PMC, CORE, OpenAIRE, DOAJ, BASE, Zenodo, HAL, publisher open-access links.
Discovery and DOI recovery: Google Scholar can be useful for finding titles, versions, and DOI clues when other public metadata sources are incomplete.

Recommended free-first roadmap:

Keep current public sources stable.
Add OpenAlex as a broad free metadata source.
Add PubMed Central and Europe PMC for stronger biomedical full-text access.
Add CORE and OpenAIRE for repository-based open-access retrieval.
Use Google Scholar mainly as a discovery fallback, not as the primary canonical source.

Platform Capability Matrix

This matrix reflects verified live-integration results from functional and end-to-end regression tests in this repository. Columns show the highest capability level observed under normal conditions.

Platform	Search	Download	Read	Notes
arXiv	✅	✅	✅	Open API; reliable
PubMed	✅	❌	⚠️ info-only	Open API; reliable
bioRxiv	✅	✅	✅	Open API; reliable
medRxiv	✅	✅	✅	Open API; reliable
Google Scholar	⚠️	❌	❌	Bot-detection active; set `paper_toolkit_mcp_GOOGLE_SCHOLAR_PROXY_URL`
IACR	✅	✅	✅	Open API; reliable
Semantic Scholar	✅	✅ (OA)	✅ (OA)	Works without key (rate-limited); key improves limits; key rejection (403) retried automatically without key
Crossref	✅	❌	⚠️ info-only	Open API; reliable
OpenAlex	✅	❌	⚠️ info-only	Open API; reliable
PMC	✅	✅ (OA only)	✅ (OA only)	OA PDFs only; direct download may be blocked by some proxy environments
CORE	✅	✅ (record-dependent)	✅ (record-dependent)	Free key recommended; connector retries with backoff and falls back to key-less on 401/403
Europe PMC	✅	✅ (OA)	✅ (OA)	OA PDFs only; direct download may be blocked by some proxy environments
dblp	✅	❌	⚠️ info-only	Open API; reliable
OpenAIRE	✅	❌	❌	Open API; retries 3× with escalating request profiles on transient 403
CiteSeerX	⚠️	✅ (record-dependent)	⚠️	API endpoint intermittently unavailable / redirects to web archive
DOAJ	✅	⚠️ (URL-dependent)	⚠️ (URL-dependent)	PDF availability varies by article; free key raises rate limits
BASE	⚠️	✅ (record-dependent)	✅ (record-dependent)	OAI-PMH endpoint requires institutional IP registration; returns empty gracefully otherwise
Zenodo	✅	✅ (record-dependent)	✅ (record-dependent)	Open API; reliable
HAL	✅	✅ (record-dependent)	✅ (record-dependent)	Open API; reliable
SSRN	⚠️	⚠️ best-effort	⚠️ best-effort	403 bot-detection active; public PDF only
Unpaywall	✅ (DOI lookup)	❌	❌	Requires `paper_toolkit_mcp_UNPAYWALL_EMAIL`
Sci-Hub (optional)	⚠️ fallback-only	✅	❌	Optional; unstable mirrors; user responsibility
IEEE Xplore 🔑	🚧 skeleton	🚧 skeleton	🚧 skeleton	Requires `paper_toolkit_mcp_IEEE_API_KEY` to activate
ACM DL 🔑	🚧 skeleton	🚧 skeleton	🚧 skeleton	Requires `paper_toolkit_mcp_ACM_API_KEY` to activate

✅ = reliable in live tests. ⚠️ = works but subject to upstream instability or access restrictions. ❌ = not supported. 🔑 = key required. 🚧 = skeleton only.

Credential & API Key Requirements

All keys are optional unless noted. Configure them in .env (preferred) or as shell exports.

Environment Variable	Provider	Required?	How to obtain
`paper_toolkit_mcp_UNPAYWALL_EMAIL`	Unpaywall	Yes (Unpaywall disabled without it)	Any valid email; register at unpaywall.org
`paper_toolkit_mcp_CORE_API_KEY`	CORE	Recommended	Free at core.ac.uk/services/api
`paper_toolkit_mcp_SEMANTIC_SCHOLAR_API_KEY`	Semantic Scholar	Optional	Free at semanticscholar.org — improves rate limits
`paper_toolkit_mcp_GOOGLE_SCHOLAR_PROXY_URL`	Google Scholar	Optional	Your HTTP/HTTPS proxy URL — bypasses bot-detection
`paper_toolkit_mcp_DOAJ_API_KEY`	DOAJ	Optional	Free at doaj.org — raises hourly rate limit
`paper_toolkit_mcp_ZENODO_ACCESS_TOKEN`	Zenodo	Optional	Free at zenodo.org — required for private records
`paper_toolkit_mcp_IEEE_API_KEY`	IEEE Xplore	Required to activate	Free at developer.ieee.org
`paper_toolkit_mcp_ACM_API_KEY`	ACM DL	Required to activate	See libraries.acm.org/digital-library/acm-open

All variables follow the paper_toolkit_mcp_<NAME> prefix scheme. Legacy names without the prefix (e.g. CORE_API_KEY, UNPAYWALL_EMAIL) are still supported for backward compatibility.
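
A lookup that honors both naming schemes might be written as below. This is a sketch only: `get_setting` is a hypothetical helper, and the precedence (prefixed name first, legacy name second, empty values ignored) is an assumption about how the fallback behaves.

```python
import os

PREFIX = "paper_toolkit_mcp_"


def get_setting(name, environ=None):
    """Return the prefixed variable if set and non-empty, else the legacy one.

    Assumed precedence: paper_toolkit_mcp_<NAME> wins over <NAME>.
    """
    env = os.environ if environ is None else environ
    return env.get(PREFIX + name) or env.get(name) or ""


# Example with an explicit mapping instead of the real environment.
demo_env = {"CORE_API_KEY": "legacy-key", "paper_toolkit_mcp_UNPAYWALL_EMAIL": "me@example.org"}
print(get_setting("CORE_API_KEY", demo_env))
print(get_setting("UNPAYWALL_EMAIL", demo_env))
```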

Known Upstream Limitations

Some search failures are caused by external provider instability, not by bugs in this project:

Source	Symptom	Cause	Workaround
Google Scholar	Returns 0 results / empty HTML	Bot-detection (CAPTCHA)	Set `paper_toolkit_mcp_GOOGLE_SCHOLAR_PROXY_URL` to a proxy
Semantic Scholar	429 rate-limited responses	Anonymous access rate limit	Set `paper_toolkit_mcp_SEMANTIC_SCHOLAR_API_KEY`; if key is rejected (403) connector automatically retries without key
CORE	500 / timeout errors	Unauthenticated rate limiting	Set `paper_toolkit_mcp_CORE_API_KEY` (free); connector retries with exponential backoff and falls back to key-less on 401/403
OpenAIRE	Transient 403 responses	IP-based session rate limiting	Connector retries 3× per profile, escalating: plain session → XML Accept header → raw `requests.get` with Mozilla UA
CiteSeerX	404 via web archive redirect	PSU endpoint intermittently redirects to archive	No workaround; connector returns empty gracefully
BASE	Search returns 0 results	OAI-PMH endpoint requires institutional IP registration	Register at base-search.net for API access; connector returns empty gracefully otherwise
SSRN	HTTP 403	Bot-detection (Cloudflare)	No workaround; connector tries two endpoints and returns a clear message on failure
PMC / Europe PMC	PDF download ProxyError	Local proxy blocking direct HTTPS PDF download	Disable proxy or use `download_with_fallback` instead
Unpaywall	Skipped entirely	`UNPAYWALL_EMAIL` env var not set	Set `paper_toolkit_mcp_UNPAYWALL_EMAIL` in `.env`
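
Several rows above describe the same pattern: retry with exponential backoff on rate limits or server errors, and fall back to key-less access when a key is rejected. A minimal sketch of that pattern follows. The `HttpError` class, the `fetch_with_retry` name, and the retry counts and delays are illustrative assumptions; the real connectors may differ in all of them.

```python
import time


class HttpError(Exception):
    """Stand-in for an HTTP failure carrying a status code."""

    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status


def fetch_with_retry(fetch, api_key=None, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch(key); back off on 429/5xx, drop the key on 401/403."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    key = api_key
    last_error = None
    for attempt in range(retries):
        try:
            return fetch(key)
        except HttpError as err:
            last_error = err
            if err.status in (401, 403) and key is not None:
                key = None  # key rejected: try again anonymously
                continue
            if err.status == 429 or err.status >= 500:
                sleep(base_delay * 2 ** attempt)  # delay doubles with each attempt
                continue
            raise  # other errors are not retried
    raise last_error
```

Passing `sleep` as a parameter keeps the function easy to exercise without real waiting.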

Optional Paid Platform Connectors (Phase 3)

IEEE Xplore and ACM Digital Library connectors are included as opt-in skeletons. They are disabled by default — no API calls are made unless you explicitly configure the corresponding keys.

Platform	Env Var	Status
IEEE Xplore	`paper_toolkit_mcp_IEEE_API_KEY`	🚧 skeleton — search registered, download/read raise `NotImplementedError`
ACM Digital Library	`paper_toolkit_mcp_ACM_API_KEY`	🚧 skeleton — search registered, download/read raise `NotImplementedError`

How to enable:

export paper_toolkit_mcp_IEEE_API_KEY=<your_ieee_key>       # free key at https://developer.ieee.org/
export paper_toolkit_mcp_ACM_API_KEY=<your_acm_key>         # see https://libraries.acm.org/digital-library

Once a key is set, the corresponding source is automatically added to ALL_SOURCES and its MCP tools (search_ieee / search_acm, download_ieee / download_acm, read_ieee_paper / read_acm_paper) are registered at server startup.

Without a key the connectors log a startup warning only — the rest of the server is unaffected.
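
The opt-in behavior can be pictured as a source list that grows only when a key is present. The sketch below is an assumption about the shape of that logic: the short source names `ieee` and `acm`, the abbreviated base list, and `build_sources` itself are hypothetical and are not taken from the server code.

```python
BASE_SOURCES = ["arxiv", "pubmed", "crossref"]  # abbreviated for the example

# Assumed mapping from source name to the variable that activates it.
KEYED_SOURCES = {
    "ieee": "paper_toolkit_mcp_IEEE_API_KEY",
    "acm": "paper_toolkit_mcp_ACM_API_KEY",
}


def build_sources(environ):
    """Return the active sources; keyed ones only if their key is non-empty."""
    sources = list(BASE_SOURCES)
    for name, variable in KEYED_SOURCES.items():
        if environ.get(variable):
            sources.append(name)
    return sources


print(build_sources({}))
print(build_sources({"paper_toolkit_mcp_IEEE_API_KEY": "your_ieee_key"}))
```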

Free Source Expansion (Phase 4)

Four additional free-source connectors are now integrated into the MCP server:

zenodo: Official Zenodo REST API connector (search + record-dependent PDF/read support).
hal: HAL public API connector (search + record-dependent PDF/read support).
ssrn: Discovery-first connector with hardened parser and best-effort download/read when a direct public PDF link is available.
unpaywall: DOI-centric OA metadata source for standalone lookup (search_unpaywall) and fallback URL resolution.

SSRN integration remains compliance-first: it only attempts direct public PDF links exposed by SSRN pages. If login/restricted delivery is required, the connector returns a clear message instead of bypassing access controls.

Sci-Hub Notice

Sci-Hub support can remain available as an optional connector for users who explicitly choose to enable it, but it should not be treated as the default or recommended full-text path.

Availability is unstable and mirrors change frequently.
Legal and policy risks vary by jurisdiction.
README and tool descriptions should clearly state that users are responsible for enabling and using it.
Open-access and publisher-permitted sources should be tried first whenever possible.
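
The ordering principle above (open-access routes first, Sci-Hub only when explicitly enabled) can be expressed as an ordered chain of resolvers. This is a sketch under assumptions: `try_chain`, the step names, and the stand-in resolvers are hypothetical, and the real download_with_fallback tool may take different parameters.

```python
def try_chain(paper_id, steps, allow_scihub=False):
    """Try each (name, resolver) in order; return (name, path) of the first success."""
    for name, resolver in steps:
        if name == "scihub" and not allow_scihub:
            continue  # skipped unless the user opts in explicitly
        try:
            path = resolver(paper_id)
        except Exception:
            continue  # treat any resolver error as "try the next step"
        if path:
            return name, path
    return None, None


# Stand-in resolvers that only simulate outcomes; no network access.
def native(paper_id):
    raise RuntimeError("no source-native PDF")


def oa_discovery(paper_id):
    return None  # nothing found in open repositories


def unpaywall(paper_id):
    return f"downloads/{paper_id}.pdf"


def scihub(paper_id):
    return f"downloads/scihub-{paper_id}.pdf"


CHAIN = [("native", native), ("oa_discovery", oa_discovery),
         ("unpaywall", unpaywall), ("scihub", scihub)]
print(try_chain("example-id", CHAIN))
```

Keeping the opt-in flag off by default means a failed open-access lookup ends with no result rather than silently reaching for the optional connector.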

Installation

Choose the method that best fits your workflow. All methods support the same optional API keys.

MCP Server Config file locations (for methods below)

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json

Windows: %APPDATA%\Claude\claude_desktop_config.json

Linux: ~/.config/Claude/claude_desktop_config.json

Local Deployment (Clone & Run, Recommended)

This is the most reliable method — you have full control and can customize the installation.

# 1. Clone your forked repo
git clone https://github.com/YOUR_USERNAME/paper-toolkit-mcp.git
cd paper-toolkit-mcp

# 2. Install dependencies (using uv, recommended)
# Install uv if you don't have it: https://docs.astral.sh/uv/getting-started/installation/
uv venv && source .venv/bin/activate  # Windows: .venv\Scripts\activate
uv pip install -e ".[dev]"

# 3. Verify it works
uv run -m paper_toolkit_mcp.server
# or
paper-toolkit search "machine learning" -s arxiv,semantic

Claude Desktop / Trae IDE config (replace the path with your actual clone location):

{
  "mcpServers": {
    "paper-toolkit-mcp": {
      "command": "uv",
      "args": [
        "run",
        "--directory", "D:/Codes/paper-toolkit-mcp",
        "-m", "paper_toolkit_mcp.server"
      ],
      "env": {
        "paper_toolkit_mcp_UNPAYWALL_EMAIL": "your@email.com",
        "paper_toolkit_mcp_CORE_API_KEY": "",
        "paper_toolkit_mcp_SEMANTIC_SCHOLAR_API_KEY": ""
      }
    }
  }
}

For Trae IDE on Windows, edit the MCP settings file at the location shown in Trae's settings UI, or use the python -m method:

{
  "mcpServers": {
    "paper-toolkit-mcp": {
      "command": "python",
      "args": ["-m", "paper_toolkit_mcp.server"]
    }
  }
}

Make sure to run this from your project directory, or set the cwd appropriately.

Method 1 — Smithery (one-command, recommended for Claude Desktop)

npx -y @smithery/cli install @openags/paper-toolkit-mcp --client claude

Smithery automatically writes the correct config block for you. No manual JSON editing needed.

Method 2 — `uvx` (no install, always latest)

uvx runs the package directly from PyPI without a permanent install. Requires uv.

# Install uv (skip if already installed)
curl -LsSf https://astral.sh/uv/install.sh | sh

⚠️ macOS note: uvx-generated wrapper scripts rely on realpath, which is not included in macOS by default. If you see a realpath: command not found error, either install GNU coreutils (brew install coreutils) or use Method 7 (uv run) instead, which does not have this limitation.

Claude Desktop config:

{
  "mcpServers": {
    "paper-toolkit-mcp": {
      "command": "uvx",
      "args": ["paper-toolkit-mcp"],
      "env": {
        "paper_toolkit_mcp_UNPAYWALL_EMAIL": "your@email.com",
        "paper_toolkit_mcp_CORE_API_KEY": "",
        "paper_toolkit_mcp_SEMANTIC_SCHOLAR_API_KEY": "",
        "paper_toolkit_mcp_ZENODO_ACCESS_TOKEN": "",
        "paper_toolkit_mcp_GOOGLE_SCHOLAR_PROXY_URL": "",
        "paper_toolkit_mcp_IEEE_API_KEY": "",
        "paper_toolkit_mcp_ACM_API_KEY": ""
      }
    }
  }
}

Method 3 — `uv` (persistent install)

uv tool install paper-toolkit-mcp

Claude Desktop config:

{
  "mcpServers": {
    "paper-toolkit-mcp": {
      "command": "uv",
      "args": ["tool", "run", "paper-toolkit-mcp"],
      "env": {
        "paper_toolkit_mcp_UNPAYWALL_EMAIL": "your@email.com",
        "paper_toolkit_mcp_CORE_API_KEY": "",
        "paper_toolkit_mcp_SEMANTIC_SCHOLAR_API_KEY": "",
        "paper_toolkit_mcp_ZENODO_ACCESS_TOKEN": "",
        "paper_toolkit_mcp_GOOGLE_SCHOLAR_PROXY_URL": "",
        "paper_toolkit_mcp_IEEE_API_KEY": "",
        "paper_toolkit_mcp_ACM_API_KEY": ""
      }
    }
  }
}

Method 4 — `pip` (standard Python install)

pip install paper-toolkit-mcp

Claude Desktop config:

{
  "mcpServers": {
    "paper-toolkit-mcp": {
      "command": "python",
      "args": ["-m", "paper_toolkit_mcp.server"],
      "env": {
        "paper_toolkit_mcp_UNPAYWALL_EMAIL": "your@email.com",
        "paper_toolkit_mcp_CORE_API_KEY": "",
        "paper_toolkit_mcp_SEMANTIC_SCHOLAR_API_KEY": "",
        "paper_toolkit_mcp_ZENODO_ACCESS_TOKEN": "",
        "paper_toolkit_mcp_GOOGLE_SCHOLAR_PROXY_URL": "",
        "paper_toolkit_mcp_IEEE_API_KEY": "",
        "paper_toolkit_mcp_ACM_API_KEY": ""
      }
    }
  }
}

If python is not on your PATH, replace it with the full path (e.g. /usr/bin/python3 or C:\Python311\python.exe). Run which python3 / where python to find it.

Method 5 — `npx` (via Smithery CLI, no local Python needed)

npx -y @smithery/cli run @openags/paper-toolkit-mcp

Claude Desktop config:

{
  "mcpServers": {
    "paper-toolkit-mcp": {
      "command": "npx",
      "args": ["-y", "@smithery/cli", "run", "@openags/paper-toolkit-mcp"],
      "env": {
        "paper_toolkit_mcp_UNPAYWALL_EMAIL": "your@email.com",
        "paper_toolkit_mcp_CORE_API_KEY": "",
        "paper_toolkit_mcp_SEMANTIC_SCHOLAR_API_KEY": ""
      }
    }
  }
}

Method 6 — Docker

docker build -t paper-toolkit-mcp .
docker run --rm -i \
  -e paper_toolkit_mcp_UNPAYWALL_EMAIL=your@email.com \
  -e paper_toolkit_mcp_CORE_API_KEY=your_core_key \
  paper-toolkit-mcp

Claude Desktop config:

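{
  "mcpServers": {
    "paper-toolkit-mcp": {
      "command": "docker",
      "args": [
        "run", "--rm", "-i",
        "-e", "paper_toolkit_mcp_UNPAYWALL_EMAIL",
        "-e", "paper_toolkit_mcp_CORE_API_KEY",
        "-e", "paper_toolkit_mcp_SEMANTIC_SCHOLAR_API_KEY",
        "-e", "paper_toolkit_mcp_ZENODO_ACCESS_TOKEN",
        "-e", "paper_toolkit_mcp_GOOGLE_SCHOLAR_PROXY_URL",
        "-e", "paper_toolkit_mcp_IEEE_API_KEY",
        "-e", "paper_toolkit_mcp_ACM_API_KEY",
        "paper-toolkit-mcp"
      ],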
      "env": {
        "paper_toolkit_mcp_UNPAYWALL_EMAIL": "your@email.com",
        "paper_toolkit_mcp_CORE_API_KEY": "",
        "paper_toolkit_mcp_SEMANTIC_SCHOLAR_API_KEY": "",
        "paper_toolkit_mcp_ZENODO_ACCESS_TOKEN": "",
        "paper_toolkit_mcp_GOOGLE_SCHOLAR_PROXY_URL": "",
        "paper_toolkit_mcp_IEEE_API_KEY": "",
        "paper_toolkit_mcp_ACM_API_KEY": ""
      }
    }
  }
}

Method 7 — Clone & run from source (development / recommended for macOS local)

This is the most reliable method on macOS — no wrapper scripts, no realpath issues.

# 1. Install uv (skip if already installed)
curl -LsSf https://astral.sh/uv/install.sh | sh

# 2. Clone repo
git clone https://github.com/openags/paper-toolkit-mcp.git
cd paper-toolkit-mcp

# 3. Verify it runs (uv auto-resolves dependencies, no manual install needed)
uv run -m paper_toolkit_mcp.server

Claude Desktop config (replace the directory path with your actual clone location):

{
  "mcpServers": {
    "paper-toolkit-mcp": {
      "command": "uv",
      "args": [
        "run",
        "--directory", "/path/to/paper-toolkit-mcp",
        "-m", "paper_toolkit_mcp.server"
      ],
      "env": {
        "paper_toolkit_mcp_UNPAYWALL_EMAIL": "your@email.com",
        "paper_toolkit_mcp_CORE_API_KEY": "",
        "paper_toolkit_mcp_SEMANTIC_SCHOLAR_API_KEY": "",
        "paper_toolkit_mcp_ZENODO_ACCESS_TOKEN": "",
        "paper_toolkit_mcp_GOOGLE_SCHOLAR_PROXY_URL": "",
        "paper_toolkit_mcp_IEEE_API_KEY": "",
        "paper_toolkit_mcp_ACM_API_KEY": ""
      }
    }
  }
}

For example, if you cloned to /Users/mac/Pengsong/paper-toolkit-mcp:

"args": ["run", "--directory", "/Users/mac/Pengsong/paper-toolkit-mcp", "-m", "paper_toolkit_mcp.server"]

uv run automatically installs dependencies into an isolated environment on first run — no pip install or venv needed.

For active development, optionally install an editable copy:

uv venv && source .venv/bin/activate   # Windows: .venv\Scripts\activate
uv pip install -e ".[dev]"

Environment Variables (`.env` file)

Instead of putting keys directly in the JSON config you can store them in a .env file in the project root (auto-loaded on startup):

cp .env.example .env   # if running from source
# or create ~/.paper-toolkit-mcp.env for global use

paper_toolkit_mcp_UNPAYWALL_EMAIL=your@email.com
paper_toolkit_mcp_CORE_API_KEY=
paper_toolkit_mcp_SEMANTIC_SCHOLAR_API_KEY=
paper_toolkit_mcp_ZENODO_ACCESS_TOKEN=
paper_toolkit_mcp_GOOGLE_SCHOLAR_PROXY_URL=
paper_toolkit_mcp_IEEE_API_KEY=
paper_toolkit_mcp_ACM_API_KEY=

To use a custom path: export paper_toolkit_mcp_ENV_FILE=/absolute/path/to/.env

Legacy variable names without the paper_toolkit_mcp_ prefix (e.g. CORE_API_KEY, UNPAYWALL_EMAIL) are still supported for backward compatibility.

Contributing

We welcome contributions! Here's how to get started:

Fork the Repository: Click "Fork" on GitHub.

Clone and Set Up:

git clone https://github.com/yourusername/paper-toolkit-mcp.git
cd paper-toolkit-mcp
uv venv && source .venv/bin/activate
uv pip install -e ".[dev]"

Make Changes:
- Add new platforms in academic_platforms/.
- Update tests in tests/.
Submit a Pull Request: Push changes and create a PR on GitHub.

TODO

Planned Academic Platforms

[√] arXiv
[√] PubMed
[√] bioRxiv
[√] medRxiv
[√] Google Scholar
[√] IACR ePrint Archive
[√] Semantic Scholar
[√] Crossref
[√] PubMed Central (PMC)
[√] CORE
[√] Europe PMC
[√] Sci-Hub warning and enablement docs

Development Tasks

[√] Fix Async search bugs and ensure reliable fast MCP events
[√] End-to-End full pipeline testing script (search, parse, download)
[√] Establish two-layer federated architecture (Layer 1 tool: search_papers)
[√] Ensure pervasive DOI extraction across metadata fields & abstract fallbacks
[ ] Citation graph & Paper relation context feature
[√] Expand full-stack OpenAlex provider

Priority Free and Open Sources

[√] PubMed Central (PMC)
[√] CORE
[√] OpenAlex
[√] Europe PMC
[√] OpenAIRE
[√] dblp
[√] CiteSeerX
[√] DOAJ
[√] BASE
[√] Zenodo
[√] HAL
[√] SSRN (discovery + best-effort full-text)
[√] Unpaywall (standalone DOI search source)

Optional and Non-Core Integrations

[ ] ResearchGate
[ ] JSTOR
[ ] ScienceDirect
[ ] Springer Link
[√] IEEE Xplore (optional skeleton — activate with IEEE_API_KEY)
[√] ACM Digital Library (optional skeleton — activate with ACM_API_KEY)
[ ] Web of Science
[ ] Scopus

License

This project is licensed under the MIT License. See the LICENSE file for details.

Happy researching with paper-toolkit-mcp! If you encounter issues, open a GitHub issue.

Version 0.2.0, released Apr 26, 2026
