MCP server exposing PaperQA2 for deep synthesis across scientific papers

paperqa-mcp-server

Give Claude the ability to read, search, and synthesize across your entire PDF library. Built on PaperQA2.

Point it at your Zotero storage folder (or any folder of PDFs) and ask Claude questions that require deep reading across multiple papers.

Quick start

1. Install uv

uv is a Python package manager. If you don't have it yet:

curl -LsSf https://astral.sh/uv/install.sh | sh

After installing, restart your terminal so uv is on your PATH.

Verify it works:

uv --version

2. Get an OpenAI API key

PaperQA2 uses OpenAI for embeddings and internal reasoning. Get a key at https://platform.openai.com/api-keys

3. Test that it runs

This downloads ~90 Python packages the first time — that's normal:

uvx paperqa-mcp-server --help 2>/dev/null; echo "OK if no Python errors above"

4. Add to Claude Desktop

  1. Open Claude Desktop
  2. Go to Settings → Developer → Edit Config
  3. This opens claude_desktop_config.json. Add a paperqa entry inside mcpServers (create mcpServers if it doesn't exist):

First, find your full path to uvx:

which uvx           # e.g. /Users/yourname/.local/bin/uvx

Then use that path in the config:

{
  "mcpServers": {
    "paperqa": {
      "command": "/FULL/PATH/TO/uvx",
      "args": ["paperqa-mcp-server"],
      "env": {
        "OPENAI_API_KEY": "sk-your-key-here"
      }
    }
  }
}

Replace the two placeholders:

  • /FULL/PATH/TO/uvx — paste the output of which uvx
  • sk-your-key-here — your OpenAI API key from step 2

If your PDFs are somewhere other than ~/Zotero/storage, add a PAPER_DIRECTORY entry to env:

"env": {
  "OPENAI_API_KEY": "sk-your-key-here",
  "PAPER_DIRECTORY": "/full/path/to/your/pdfs"
}
  4. Quit Claude Desktop completely (Cmd+Q, not just close the window) and reopen it
  5. You should see a hammer icon — click it and paper_qa should be listed

5. Pre-build the index

Before Claude can search your papers, the server needs to build a search index. This reads each PDF, splits it into chunks, and sends the chunks to OpenAI's embedding API. With hundreds of papers this takes a while and costs a few dollars in API calls.
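For a rough sense of the embedding cost, here's a back-of-envelope calculation. The per-paper token count and the price per million tokens are illustrative assumptions, not measured values — check OpenAI's current pricing page before relying on the result:

```shell
# Rough embedding-cost estimate for an index build.
# Assumptions (illustrative only): ~30k tokens of extracted text per paper,
# and a per-1M-token embedding price you fill in from OpenAI's pricing page.
PAPERS=500
TOKENS_PER_PAPER=30000
USD_PER_MILLION=0.02
COST=$(awk -v p="$PAPERS" -v t="$TOKENS_PER_PAPER" -v r="$USD_PER_MILLION" \
  'BEGIN { printf "%.2f", p * t / 1000000 * r }')
echo "Estimated embedding cost for $PAPERS papers: \$$COST"
```

Actual cost varies with paper length and chunking, so treat this as an order-of-magnitude guide only.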

If more than 10 papers are unindexed, the server refuses to answer queries and tells you to run this step first. If only a few papers are new, they are indexed automatically when you query.

OPENAI_API_KEY=sk-your-key-here uvx paperqa-mcp-server index

If this crashes with a rate limit error, just re-run the same command. It picks up where it left off — each run indexes more files. With a large library (500+ papers) you may need to run it a few times.

After that, the index is cached at ~/.pqa/indexes/. Only new or changed files get re-processed on subsequent runs.

Troubleshooting

"Server disconnected" in Claude Desktop

Claude Desktop has a short startup timeout. If uv needs to download packages on first launch, it will time out. Fix: run uvx paperqa-mcp-server once from the terminal first so packages are cached.

"Index incomplete" when querying

The server checks the index before each query. If too many papers are unindexed, it returns a diagnostic message instead of trying (and failing) to index them all on the fly. Fix: run the index command in step 5.

Hammer icon doesn't appear

Make sure you quit Claude Desktop completely (Cmd+Q) and reopened it. Check for JSON syntax errors in claude_desktop_config.json — a missing comma is the most common mistake.
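A quick way to catch those syntax errors is to run the config through a JSON parser before restarting. The path below is the usual macOS location for Claude Desktop's config — adjust it if yours differs:

```shell
# Validate claude_desktop_config.json before restarting Claude Desktop.
# Path is the typical macOS location; adjust for your setup.
CONFIG="$HOME/Library/Application Support/Claude/claude_desktop_config.json"
if python3 -m json.tool "$CONFIG" > /dev/null; then
  echo "JSON is valid"
else
  echo "JSON syntax error - check for missing commas"
fi
```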

Use a different LLM

By default, PaperQA2 uses gpt-4o-mini for its internal reasoning. This is separate from Claude — Claude calls the tool, PaperQA2 does its own LLM calls internally to gather and synthesize evidence.

To use a different model, add env vars to your Claude Desktop config:

"env": {
  "OPENAI_API_KEY": "sk-your-key-here",
  "PQA_LLM": "gpt-4o",
  "PQA_SUMMARY_LLM": "gpt-4o-mini"
}

All environment variables

Variable            Default                   Purpose
PAPER_DIRECTORY     ~/Zotero/storage          Folder containing your PDFs
OPENAI_API_KEY      (none)                    Required for default embeddings
PQA_LLM             gpt-4o-mini               LLM for internal reasoning
PQA_SUMMARY_LLM     gpt-4o-mini               LLM for summarizing chunks
PQA_EMBEDDING       text-embedding-3-small    Embedding model
ANTHROPIC_API_KEY   (none)                    Only if using Claude as internal LLM

Works with zotero-mcp

This pairs well with zotero-mcp:

  • paperqa-mcp-server — deep reading and synthesis across full paper text
  • zotero-mcp — browse your library, search metadata, read annotations

Claude can cross-reference between them — for example, finding papers with PaperQA and then pulling up their Zotero metadata and annotations. PaperQA2's citations include Zotero storage keys (e.g. ABC123DE from storage/ABC123DE/paper.pdf) that Claude can use to look up items via zotero-mcp.
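Pulling the storage key out of such a citation path is simple string surgery. A sketch, using the example path from above:

```shell
# Extract the Zotero storage key from a citation path like
# "storage/ABC123DE/paper.pdf": the key is the parent directory name.
CITATION_PATH="storage/ABC123DE/paper.pdf"
KEY=$(basename "$(dirname "$CITATION_PATH")")
echo "$KEY"
```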

Index implementation notes

paperqa-mcp-server index uses the same _settings() function as the MCP server, so the index it builds is exactly the one the server will look for. The PaperQA2 index directory name is a hash of the settings (embedding model, chunk size, paper directory path, etc.). The settings include:

  • Multimodal OFF — skip image extraction from PDFs (avoids a crash on PDFs with CMYK images)
  • Doc details OFF — skip Crossref/Semantic Scholar metadata lookups (avoids rate limits; Claude can get metadata from Zotero directly via zotero-mcp)
  • Concurrency 1 — index one file at a time to stay under OpenAI's embedding rate limit

Why not pqa index? The pqa CLI constructs settings via pydantic's CliSettingsSource, which produces different defaults than constructing Settings() directly in Python (e.g. chunk_chars of 7000 vs 5000). Different settings = different index hash = server can't find the index. Always use paperqa-mcp-server index to build the index.
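As an illustration of the general mechanism (not PaperQA2's actual hashing code), note how a single differing setting changes the whole digest, and with it the directory name the server looks for:

```shell
# Illustration of settings-hash divergence (not PaperQA2's actual code):
# identical settings except chunk size yield entirely different digests,
# hence entirely different index directory names.
sha256() { python3 -c 'import sys,hashlib;print(hashlib.sha256(sys.stdin.buffer.read()).hexdigest())'; }
H1=$(printf 'embedding=text-embedding-3-small;chunk_chars=5000' | sha256)
H2=$(printf 'embedding=text-embedding-3-small;chunk_chars=7000' | sha256)
echo "Settings() index hash: $H1"
echo "pqa CLI index hash:    $H2"
[ "$H1" != "$H2" ] && echo "different settings -> different index"
```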

Install from GitHub (latest)

To use the latest version from the main branch instead of PyPI:

{
  "mcpServers": {
    "paperqa": {
      "command": "/FULL/PATH/TO/uvx",
      "args": ["--from", "git+https://github.com/menyoung/paperqa-mcp-server", "paperqa-mcp-server"],
      "env": {
        "OPENAI_API_KEY": "sk-your-key-here"
      }
    }
  }
}

To build the index from the latest main branch:

OPENAI_API_KEY=sk-your-key-here uvx --from git+https://github.com/menyoung/paperqa-mcp-server paperqa-mcp-server index

Development

If you want to contribute or modify the server locally:

git clone https://github.com/menyoung/paperqa-mcp-server.git
cd paperqa-mcp-server
uv sync
uv run paperqa-mcp-server        # run the server
uv run paperqa-mcp-server index  # build the index
