
📚 Paper Distill MCP Server

License: AGPL-3.0 · Python 3.10+ · PyPI version · CI

Academic paper search, intelligent curation, and multi-platform delivery, built on the Model Context Protocol.

Compatible with all MCP clients: Claude Desktop, Claude Code, Cursor, Trae, Codex CLI, Gemini CLI, OpenClaw, VS Code, Zed, and more.

โš ๏ธ Early development stage. Many features are still being validated and may contain bugs or instabilities. Feedback and bug reports are warmly welcome!


✨ Features

  • ๐Ÿ” 11-source parallel search โ€” OpenAlex, Semantic Scholar, PubMed, arXiv, Papers with Code, CrossRef, Europe PMC, bioRxiv, DBLP, CORE, Unpaywall
  • ๐Ÿค– Adaptive AI delivery โ€” the agent tracks your evolving research interests and automatically refines search keywords and recommendations over time
  • ๐Ÿ“Š 4-dimensional weighted ranking โ€” relevance ร— recency ร— impact ร— novelty, fully customizable weights
  • ๐Ÿ‘ฅ Dual-AI blind review โ€” two AI reviewers independently shortlist papers; a chief reviewer synthesizes a final push/overflow/discard decision (optional)
  • ๐Ÿงน Scraper delegation โ€” offload abstract extraction to a low-cost agent or API to cut token spend significantly
  • ๐ŸŒ Personal paper library site โ€” Astro + Vercel auto-deploy; site updates within 30 seconds of each push
  • ๐Ÿ“ฌ Multi-platform delivery โ€” Telegram / Discord / Feishu / WeCom
  • ๐Ÿ“ฆ Zotero integration โ€” save papers to Zotero with one command
  • ๐Ÿ“ Obsidian integration โ€” auto-generate paper note cards with Zotero backlinks; supports summary and template modes

🚀 Quick Install

uvx paper-distill-mcp

That's it. Your AI client will discover all tools automatically. No API keys required for basic paper search.

No uv? → curl -LsSf https://astral.sh/uv/install.sh | sh or brew install uv

Other installation methods (pip / Homebrew / Docker / source)

pip:

pip install paper-distill-mcp

Homebrew:

brew tap Eclipse-Cj/tap
brew install paper-distill-mcp

Docker:

docker run -i --rm ghcr.io/eclipse-cj/paper-distill-mcp

From source (developers):

git clone https://github.com/Eclipse-Cj/paper-distill-mcp.git
cd paper-distill-mcp
python3 -m venv .venv && .venv/bin/pip install --upgrade pip && .venv/bin/pip install -e .

🔗 Connecting to AI Clients

Claude Desktop

Add to claude_desktop_config.json (Settings → Developer → Edit Config):

{
  "mcpServers": {
    "paper-distill": {
      "command": "uvx",
      "args": ["paper-distill-mcp"]
    }
  }
}

Claude Code

claude mcp add paper-distill -- uvx paper-distill-mcp

Or add to .mcp.json:

{
  "mcpServers": {
    "paper-distill": {
      "command": "uvx",
      "args": ["paper-distill-mcp"]
    }
  }
}

Codex CLI (OpenAI)

Add to ~/.codex/config.toml:

[mcp_servers.paper-distill]
command = "uvx"
args = ["paper-distill-mcp"]

Gemini CLI (Google)

Add to ~/.gemini/settings.json:

{
  "mcpServers": {
    "paper-distill": {
      "command": "uvx",
      "args": ["paper-distill-mcp"]
    }
  }
}

OpenClaw

mcporter config add paper-distill --command uvx --scope home -- paper-distill-mcp
mcporter list  # verify

To remove: mcporter config remove paper-distill

OpenClaw: install from source
git clone https://github.com/Eclipse-Cj/paper-distill-mcp.git ~/.openclaw/tools/paper-distill-mcp
cd ~/.openclaw/tools/paper-distill-mcp
uv venv .venv && uv pip install .
mcporter config add paper-distill \
  --command ~/.openclaw/tools/paper-distill-mcp/.venv/bin/python3 \
  --scope home \
  -- -m mcp_server.server
mcporter list

To remove: rm -rf ~/.openclaw/tools/paper-distill-mcp && mcporter config remove paper-distill

Other clients (Cursor, VS Code, Windsurf, Zed, Trae)

Same JSON config, different config file paths:

| Client | Config path |
| --- | --- |
| Claude Desktop | claude_desktop_config.json |
| Trae | Settings → MCP → Add |
| Cursor | ~/.cursor/mcp.json |
| VS Code | .vscode/mcp.json |
| Windsurf | ~/.codeium/windsurf/mcp_config.json |
| Zed | settings.json |

HTTP transport (remote / hosted)

paper-distill-mcp --transport http --port 8765

🎯 Getting Started

After connecting your client, tell the agent "initialize paper-distill". It will call setup() and walk you through:

  1. Research topics: describe your interests in plain language; the AI extracts keywords
  2. Delivery platform: set up Telegram / Discord / Feishu / WeCom (optional)
  3. Paper library site: build a personal paper library that updates automatically (optional)
  4. Scraper delegate: point to a low-cost agent or API for abstract extraction (recommended)
  5. Preferences: paper count, ranking weights, review mode, etc.
  6. First search: pool_refresh() populates the paper pool

All settings can be updated at any time through conversation:

  • "Push 8 papers next time"
  • "Add a new topic: RAG retrieval"
  • "Enable dual-AI blind review"
  • "Increase recency weight"

โš™๏ธ Configuration Reference

All parameters are set via configure() or add_topic(); no manual file editing is needed.

Research Topics (add_topic / manage_topics)

| Parameter | Description | Default |
| --- | --- | --- |
| key | Topic identifier (e.g. "llm-reasoning") | – |
| label | Display name (e.g. "LLM Reasoning") | – |
| keywords | Search keywords, 3–5 recommended | – |
| weight | Topic priority, 0.0–1.0 (higher = more papers) | 1.0 |
| blocked | Temporarily disable without deleting | false |

Paper Count & Review (configure)

| Parameter | Options | Default | Description |
| --- | --- | --- | --- |
| paper_count_value | any integer | 6 | Papers per push |
| paper_count_mode | "at_most" / "at_least" / "exactly" | "at_most" | Count mode |
| picks_per_reviewer | any integer | 5 | Shortlist size per reviewer |
| review_mode | "single" / "dual" | "single" | Single AI or dual blind review |
| custom_focus | free text | "" | Custom selection criteria |

💡 Dual blind review: two independent AI reviewers each shortlist papers; a chief reviewer makes the final push/overflow/discard call. Papers that don't make the cut are held for the next cycle rather than discarded. Enable with configure(review_mode="dual").

Ranking Weights (configure)

Controls paper scoring. The four weights should sum to approximately 1.0.

| Parameter | Measures | Default |
| --- | --- | --- |
| w_relevance | Keyword and topic match | 0.55 |
| w_recency | How recently the paper was published | 0.20 |
| w_impact | Citation count (log-normalized) | 0.15 |
| w_novelty | Whether this is the paper's first appearance | 0.10 |

Example: "Prioritize recent papers" → configure(w_recency=0.35, w_relevance=0.40)
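The weighting scheme can be sketched as follows. This is an illustrative Python reconstruction, not the server's actual code; the field names (relevance, recency, citations, is_new) and the citation cap are assumptions:

```python
import math

def rank_score(paper, w_relevance=0.55, w_recency=0.20,
               w_impact=0.15, w_novelty=0.10):
    """Combine the four normalized signals into a single score in [0, 1]."""
    # Citation counts are heavy-tailed, so log-normalize against a soft cap.
    impact = min(math.log1p(paper["citations"]) / math.log1p(10_000), 1.0)
    return (w_relevance * paper["relevance"]        # keyword/topic match, 0..1
            + w_recency * paper["recency"]          # newer papers score closer to 1
            + w_impact * impact
            + w_novelty * (1.0 if paper["is_new"] else 0.0))

paper = {"relevance": 0.9, "recency": 0.8, "citations": 120, "is_new": True}
score = rank_score(paper)
```

Raising w_recency while lowering w_relevance, as in the example above, shifts the ordering toward newer papers without changing any other signal.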

Scraper / Abstract Extraction Delegate (configure)

Abstract extraction is the most token-intensive step. It runs on the main agent by default, but can be delegated to a cheaper model to cut costs significantly.

All three forms are values of the single summarizer parameter:

| summarizer value | Description |
| --- | --- |
| "self" | Main agent handles extraction (most expensive) |
| agent name (e.g. "scraper") | Delegate to a low-cost sub-agent |
| API URL | Call an external LLM API (DeepSeek, Ollama, etc.) |

🔧 Strongly recommended: for 30+ papers, frontier-model costs add up fast. A $0.14/M-token model handles extraction just as well. Set this with configure(summarizer="scraper").

Paper Pool & Scan Batches (configure)

| Parameter | Description | Default |
| --- | --- | --- |
| scan_batches | Split the paper pool into N batches, reviewed over N+1 days | 2 (3 days) |

pool_refresh() searches all 11 APIs and fills the pool. The pool is then split into batches for daily AI review, avoiding a single 60+ paper dump.

  • scan_batches=2 (default): review first half on day 1, second half on day 2, finalize on day 3
  • scan_batches=3: review one-third per day, finalize on day 4

When all batches are reviewed, the pool is exhausted and the next run triggers a fresh API search automatically.
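The batching behavior described above can be sketched like this (illustrative only; the real pool logic lives in the server):

```python
def split_pool(papers, scan_batches=2):
    """Split the pool into scan_batches near-equal slices for daily review."""
    n = max(1, scan_batches)
    size = -(-len(papers) // n)  # ceiling division, so no paper is dropped
    return [papers[i:i + size] for i in range(0, len(papers), size)]

pool = [f"paper-{i}" for i in range(61)]    # e.g. a 61-paper pool
batches = split_pool(pool, scan_batches=2)  # day 1 and day 2 review sets
```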

Delivery Platforms (Environment Variables)

| Platform | Environment variables | platform value |
| --- | --- | --- |
| Telegram | TELEGRAM_BOT_TOKEN + TELEGRAM_CHAT_ID | "telegram" |
| Discord | DISCORD_WEBHOOK_URL | "discord" |
| Feishu | FEISHU_WEBHOOK_URL | "feishu" |
| WeCom | WECOM_WEBHOOK_URL | "wecom" |

โš ๏ธ Important: set environment variables in the MCP client config env field, not as system environment variables. Otherwise send_push() cannot access the webhook URL and the AI may generate scripts that call webhooks directly, causing encoding issues.

Config example (WeCom + Claude Desktop):

{
  "mcpServers": {
    "paper-distill": {
      "command": "uvx",
      "args": ["paper-distill-mcp"],
      "env": {
        "WECOM_WEBHOOK_URL": "https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=YOUR_KEY"
      }
    }
  }
}

Restart the MCP client after editing the config.

Push message format (fixed):

1. Paper Title (Year)
   Journal Name
   - One-sentence summary
   - Why it was selected
   https://doi.org/...
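The fixed layout above can be produced with a formatter along these lines (a sketch; the field names are assumptions, not the server's schema):

```python
def format_push_entry(index, paper):
    """Render one paper in the fixed push layout."""
    return (f"{index}. {paper['title']} ({paper['year']})\n"
            f"   {paper['journal']}\n"
            f"   - {paper['summary']}\n"
            f"   - {paper['reason']}\n"
            f"   https://doi.org/{paper['doi']}")

entry = format_push_entry(1, {
    "title": "Example Paper", "year": 2024, "journal": "Example Journal",
    "summary": "One-sentence summary.", "reason": "Why it was selected.",
    "doi": "10.1000/example",
})
```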

Paper Library Site (configure)

Personal paper library website, auto-updated on every push. Built on Astro + Vercel (free tier).

| Parameter | Description |
| --- | --- |
| site_deploy_hook | Vercel deploy hook URL (triggers a site rebuild) |
| site_repo_path | Local path to the paper-library repository |

Setup steps (the AI agent will guide you):

  1. Create a repo from the paper-library-template
  2. Connect to Vercel and deploy
  3. Create a deploy hook in Vercel (Settings > Git > Deploy Hooks)
  4. Tell the agent the hook URL; it is saved via configure(site_deploy_hook=...)

After setup, every finalize_review() call pushes the digest JSON to the site repo and triggers a Vercel rebuild. The site updates in ~30 seconds.

Zotero Integration

Save papers to Zotero with one command. Requires a Zotero account and API key.

Getting credentials:

  1. API Key: go to zotero.org/settings/keys/new → check "Allow library access" + "Allow write access" → Save Key
  2. Library ID: go to zotero.org/settings/keys → your userID is shown at the top

Add to MCP client config:

{
  "mcpServers": {
    "paper-distill": {
      "command": "uvx",
      "args": ["paper-distill-mcp"],
      "env": {
        "ZOTERO_LIBRARY_ID": "your userID",
        "ZOTERO_API_KEY": "your API key"
      }
    }
  }
}

After setup, reply collect 1 3 after a push to save papers 1 and 3 to Zotero, automatically sorted into per-topic folders.
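Under the hood, saving via DOI amounts to posting a journalArticle item to the Zotero Web API. A hedged sketch of the payload construction (the server's actual field mapping may differ):

```python
def build_zotero_item(paper):
    """Map a pushed paper onto a minimal Zotero journalArticle payload."""
    return {
        "itemType": "journalArticle",
        "title": paper["title"],
        "DOI": paper.get("doi", ""),
        "date": str(paper.get("year", "")),
        "publicationTitle": paper.get("journal", ""),
    }

# A list of such items would then be POSTed to
#   https://api.zotero.org/users/<ZOTERO_LIBRARY_ID>/items
# with the Zotero-API-Key header set from ZOTERO_API_KEY.
item = build_zotero_item({"title": "Example", "doi": "10.1000/x", "year": 2024})
```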

All Environment Variables

| Variable | Description | Required |
| --- | --- | --- |
| OPENALEX_EMAIL | Increases the OpenAlex API rate limit; also used for Unpaywall | optional |
| CORE_API_KEY | CORE API key (free registration) | optional |
| DEEPSEEK_API_KEY | Enhanced search via DeepSeek | optional |
| ZOTERO_LIBRARY_ID + ZOTERO_API_KEY | Save papers to Zotero | optional |
| SITE_URL | Paper library website URL | optional |
| PAPER_DISTILL_DATA_DIR | Data directory | default: ~/.paper-distill/ |

๐Ÿ› ๏ธ Tools (19 total)

Setup & Configuration

| Tool | Description |
| --- | --- |
| setup() | First call; detects a fresh install and returns guided initialization instructions |
| add_topic(key, label, keywords) | Add a research topic with search keywords |
| configure(...) | Update any setting: paper count, ranking weights, review mode, etc. |

Search & Curation

| Tool | Description |
| --- | --- |
| search_papers(query) | Parallel search across 11 sources |
| rank_papers(papers) | 4-dimensional weighted scoring |
| filter_duplicates(papers) | Deduplicate against previously pushed papers |
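Deduplication typically keys on DOI with a normalized-title fallback, since not every source returns a DOI. A plausible sketch of that logic (not the server's exact implementation):

```python
def filter_duplicates(papers, pushed_dois):
    """Drop papers already pushed (by DOI) or repeated in this batch (by title)."""
    seen_titles, fresh = set(), []
    for p in papers:
        doi = (p.get("doi") or "").lower()
        title = " ".join(p["title"].lower().split())  # normalize case/whitespace
        if doi and doi in pushed_dois:
            continue  # already delivered in an earlier push
        if title in seen_titles:
            continue  # same paper surfaced by two sources
        seen_titles.add(title)
        fresh.append(p)
    return fresh
```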

Daily Pipeline (paper pool mode)

| Tool | Description |
| --- | --- |
| pool_refresh(topic?) | Search all 11 APIs and build the paper pool |
| prepare_summarize(custom_focus?) | Generate the AI abstract-extraction prompt |
| prepare_review(dual?) | Generate the review prompt; the AI makes push/overflow/discard decisions |
| finalize_review(selections) | Process AI decisions, update the pool, output the push message |
| pool_status() | Pool status: count, scan day, exhausted or not |
| collect(paper_indices) | Save papers to Zotero + generate Obsidian notes |

Session & Output

| Tool | Description |
| --- | --- |
| init_session | Detect the delivery platform and load research context |
| load_session_context | Load historical research context |
| generate_digest(papers, date) | Generate output files (JSONL, site, Obsidian) |
| send_push(date, papers, platform) | Deliver to Telegram / Discord / Feishu / WeCom |
| collect_to_zotero(paper_ids) | Save to Zotero via DOI |
| manage_topics(action, topic) | List / disable / enable / reweight topics |
| ingest_research_context(text) | Inherit research context across sessions |

๐Ÿ—๏ธ Architecture

AI client (Claude Code / Codex CLI / Gemini CLI / Cursor / ...)
    ↓ MCP (stdio or HTTP)
paper-distill-mcp
    ├── search/         – 11-source academic search (with OA full-text enrichment)
    ├── curate/         – scoring + deduplication
    ├── generate/       – output (JSONL, Obsidian, site)
    ├── bot/            – push formatting (4 platforms)
    └── integrations/   – Zotero API

The server does not call any LLM internally. Search, ranking, and deduplication are pure data operations. Intelligence comes from your AI client.


📖 Paywalled Papers & Open Access

The system searches all papers by default (including subscription journals) and maximizes free full-text access through:

  1. CORE: the world's largest OA aggregator (200M+ papers), covering author self-archived versions from institutional repositories
  2. Unpaywall: after results are merged, automatically looks up legal free PDFs via DOI (preprints, green OA, author versions)

For papers with no free version, the system returns a DOI link. If you have institutional VPN access, clicking the DOI link while connected is usually enough, since publishers identify your institution by IP.

open_access_url priority: arXiv > CORE > Unpaywall > OpenAlex > Semantic Scholar > Papers with Code
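That priority order can be expressed as a simple first-match lookup (illustrative; the source keys are assumptions, not the server's actual identifiers):

```python
# Assumed source keys, ordered by the documented priority.
OA_PRIORITY = ["arxiv", "core", "unpaywall", "openalex",
               "semantic_scholar", "papers_with_code"]

def pick_open_access_url(urls_by_source):
    """Return the free full-text URL from the highest-priority source, if any."""
    for source in OA_PRIORITY:
        if urls_by_source.get(source):
            return urls_by_source[source]
    return None  # no free version: the caller falls back to the DOI link
```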


โ“ FAQ

Review stage hangs / no response for 30+ minutes

Symptom: the review prompt generated by prepare_review() causes the AI client to hang or time out.

Cause: too many candidate papers in the pool (e.g. 80–100), making the prompt exceed the client's context window or output token limit. VS Code Copilot and some IDE plugins have limited context capacity.

Solutions (pick one):

  1. Increase scan_batches (recommended): split the pool into more batches:
    configure(scan_batches=5)

  2. Reduce topics or keywords: fewer topics → fewer search results → a smaller pool.
  3. Switch to a higher-context client: Claude Code (200k), Claude Desktop (200k), or Cursor handle long prompts better.

Install error: Requires-Python >=3.10

Python 3.10+ is required. macOS ships with Python 3.9 by default; install a newer version with brew install python@3.13 or use uv.

Docker image fails to pull (mainland China)

ghcr.io is blocked in mainland China. Use pip with a Chinese mirror:

pip install paper-distill-mcp -i https://pypi.tuna.tsinghua.edu.cn/simple

๐Ÿง‘โ€๐Ÿ’ป Development

git clone https://github.com/Eclipse-Cj/paper-distill-mcp.git
cd paper-distill-mcp
python3 -m venv .venv && .venv/bin/pip install --upgrade pip && .venv/bin/pip install -e .
python tests/test_mcp_smoke.py   # 9 tests, no network required

📄 License

This project is licensed under AGPL-3.0. See LICENSE for details.

Unauthorized commercial use is prohibited. For commercial licensing inquiries, contact the author.


📬 Contact

Bug reports and feature requests are welcome. The project is in active early development; thank you for your patience and support 🙏
