An MCP server exposing arXiv research tools (search, abstracts, author lookup, trending) to LLM agents.

Maintainers

jananiv07

📚 arXiv Research MCP Server

Give any LLM agent a research librarian for arXiv.

Search 2.4M+ papers, pull full abstracts, track a researcher's latest work, and surface what a field is publishing right now — all over the Model Context Protocol.

🎬 Demo

▶️ Demo GIF coming soon — a 30-second walkthrough of an agent searching arXiv and reading an abstract through these tools.

✨ Why this server

Large language models are great at reasoning about papers but have no live access to the literature. This server closes that gap with four focused, read-only tools that an agent can call to discover, read, and monitor research on arXiv — with output shaped specifically for an LLM's context window.

🧠 Agent-first tool design — every tool carries a detailed docstring the host shows to the model, so it knows when and how to call each one.
📦 Structured, validated output — each tool returns a typed Pydantic model, surfaced as MCP structuredContent (not just a blob of text).
🎚️ Context-aware verbosity — concise mode (default) trims abstracts and caps author lists; detailed returns everything. You never blow the window by accident.
🛟 Honest by design — trending_topics refuses to fake popularity metrics arXiv doesn't expose, and says so in every response.
✅ Actually verified — ruff + pyright --strict + an in-process smoke test and a real end-to-end stdio MCP client test, all green against the live API.

🧰 The four tools

| Tool | What it's for | Parameters (defaults) |
| --- | --- | --- |
| 🔍 `search_papers` | Keyword discovery across all of arXiv. Supports field prefixes (`ti:`, `au:`, `abs:`, `cat:`) and boolean `AND` / `OR` / `ANDNOT`. | `query`, `max_results=10`, `sort_by="relevance"`, `response_format="concise"` |
| 📄 `get_abstract` | Full record for one paper by ID: untruncated abstract, every author, all categories, DOI / journal ref / comment, PDF + abstract URLs. | `arxiv_id` |
| 👤 `find_by_author` | A researcher's most recent papers, newest first. | `author_name`, `max_results=10`, `response_format="concise"` |
| 📈 `trending_topics` | Recent submissions in a category within a time window, plus the sub-topics that dominate them. | `category`, `days=7`, `max_results=10`, `response_format="concise"` |

Shared conventions

response_format: "concise" (default) shortens the abstract to ~280 chars and caps the author list to 8 names — abstract_truncated and author_count always tell the agent what was elided. "detailed" returns full text and all authors.
sort_by (search only): "relevance", "newest", or "last_updated".
Safety caps (auto-applied and reported back in a note field): max_results is clamped to 50; trending_topics scans at most 200 recent papers and honors a window of 1–90 days.
arxiv_id is forgiving — it accepts bare (2401.01234), versioned (2401.01234v2), legacy (math.GT/0309136), and full-URL forms.
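
A minimal sketch of how the concise/detailed split and the max_results cap described above might be implemented. The names `shape_paper` and `clamp_max_results`, the trailing "..." marker, and the lower bound of 1 are assumptions made for illustration, not the server's actual code; only the ~280-character, 8-author, and 50-result limits come from the conventions listed here.

```python
from __future__ import annotations

from typing import Any

ABSTRACT_LIMIT = 280   # approximate concise-mode abstract length
AUTHOR_LIMIT = 8       # concise-mode author cap
MAX_RESULTS_CAP = 50   # upper bound applied to max_results


def shape_paper(title: str, authors: list[str], abstract: str,
                response_format: str = "concise") -> dict[str, Any]:
    """Return a record trimmed for concise mode, or complete for detailed mode."""
    if response_format not in ("concise", "detailed"):
        raise ValueError('response_format must be "concise" or "detailed"')
    concise = response_format == "concise"
    truncated = concise and len(abstract) > ABSTRACT_LIMIT
    # The "..." suffix is an assumed convention for marking elided text.
    shown_abstract = abstract[:ABSTRACT_LIMIT].rstrip() + "..." if truncated else abstract
    shown_authors = authors[:AUTHOR_LIMIT] if concise else list(authors)
    return {
        "title": title,
        "authors": shown_authors,
        "author_count": len(authors),       # always the full count, even when capped
        "abstract": shown_abstract,
        "abstract_truncated": truncated,    # tells the agent whether text was elided
    }


def clamp_max_results(requested: int) -> tuple[int, str | None]:
    """Clamp max_results and return a note only when the value was changed."""
    clamped = max(1, min(requested, MAX_RESULTS_CAP))  # lower bound of 1 is an assumption
    note = None if clamped == requested else f"max_results clamped from {requested} to {clamped}"
    return clamped, note
```

Reporting `author_count` and `abstract_truncated` in both modes lets the agent decide whether a follow-up `get_abstract` call is worth the extra context.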

A deliberate note on "trending"

The arXiv API exposes no citation, download, or view counts — so genuine popularity cannot be measured. trending_topics therefore defines "trending" as recency of submission within the window, and ranks the sub-categories those recent papers co-occur in. Every response restates this in its note field so the agent never overclaims. Honesty over vanity metrics.
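
The recency-plus-co-occurrence definition above can be sketched as follows. This is an illustration under stated assumptions rather than the server's implementation: the `Submission` dataclass, the `rank_subtopics` function, and the sample records are all hypothetical, and the real tool works on results fetched from the arXiv API.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Submission:
    """Hypothetical stand-in for one arXiv result."""
    arxiv_id: str
    published: datetime
    categories: tuple[str, ...]


def rank_subtopics(papers: list[Submission], category: str, days: int,
                   now: datetime) -> list[tuple[str, int]]:
    """Count the other categories that recent papers in `category` are also listed under."""
    days = max(1, min(days, 90))  # honor the 1-90 day window
    cutoff = now - timedelta(days=days)
    recent = [p for p in papers if category in p.categories and p.published >= cutoff]
    # Only co-listed categories are counted; the requested category itself is skipped.
    counts = Counter(c for p in recent for c in p.categories if c != category)
    return counts.most_common()


now = datetime(2026, 5, 31, tzinfo=timezone.utc)
papers = [  # made-up records, not real arXiv papers
    Submission("paper-a", datetime(2026, 5, 30, tzinfo=timezone.utc), ("cs.LG", "stat.ML")),
    Submission("paper-b", datetime(2026, 5, 29, tzinfo=timezone.utc), ("cs.LG", "stat.ML", "cs.CV")),
    Submission("paper-c", datetime(2026, 5, 1, tzinfo=timezone.utc), ("cs.LG", "cs.AI")),
]
print(rank_subtopics(papers, "cs.LG", days=7, now=now))
# [('stat.ML', 2), ('cs.CV', 1)]
```

Nothing in this ranking depends on citations, downloads, or views, which is why the result measures recent activity rather than popularity.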

🚀 Quick start

Install from PyPI:

pip install arxiv-research-mcp

…then point your MCP client at the arxiv-research-mcp command (see Connect it to an MCP host).

Or install from source

git clone https://github.com/JananiV07/arxiv-mcp-server.git
cd arxiv-mcp-server

python -m venv .venv
# Windows (PowerShell):
.venv\Scripts\Activate.ps1
# macOS / Linux:
source .venv/bin/activate

pip install -r requirements.txt
python src/server.py

Requires Python 3.10+. Runtime deps are just mcp[cli] and arxiv. The PyPI package is named arxiv-research-mcp (the name arxiv-mcp-server was already taken by an unrelated project).

Run it directly (it speaks MCP over stdio, so normally a host launches it):

python src/server.py

🔌 Connect it to an MCP host

Configure your client

Add an entry to your client's MCP config file (for example, Claude Desktop uses claude_desktop_config.json; other clients expose an equivalent).

If you installed from PyPI (pip install arxiv-research-mcp), just reference the installed command:

{
  "mcpServers": {
    "arxiv-research": {
      "command": "arxiv-research-mcp"
    }
  }
}

If you installed from source, point at the Python interpreter from your virtual environment:

{
  "mcpServers": {
    "arxiv-research": {
      "command": "/absolute/path/to/arxiv-mcp-server/.venv/bin/python",
      "args": ["/absolute/path/to/arxiv-mcp-server/src/server.py"]
    }
  }
}

On Windows (from source), use the .exe and forward slashes — e.g. C:/path/to/arxiv-mcp-server/.venv/Scripts/python.exe.

Restart the host, and the four tools appear under the arxiv-research server.

Try it with the MCP Inspector

npx @modelcontextprotocol/inspector python src/server.py

💬 What an agent can do with it

Once connected, natural-language requests map cleanly onto the tools:

| You ask… | The agent calls… |
| --- | --- |
| "Find recent papers on diffusion models for video." | `search_papers("ti:diffusion AND cat:cs.CV", sort_by="newest")` |
| "Summarize 'Attention Is All You Need'." | `get_abstract("1706.03762")` |
| "What has Yoshua Bengio published lately?" | `find_by_author("Yoshua Bengio")` |
| "What's hot in machine learning this week?" | `trending_topics("cs.LG", days=7)` |

Example output (`get_abstract`, abridged)

{
  "arxiv_id": "1706.03762v7",
  "title": "Attention Is All You Need",
  "authors": ["Ashish Vaswani", "Noam Shazeer", "..."],
  "author_count": 8,
  "published": "2017-06-12",
  "updated": "2023-08-02",
  "primary_category": "cs.CL",
  "categories": ["cs.CL", "cs.LG"],
  "abstract": "The dominant sequence transduction models ...",
  "abstract_truncated": false,
  "abstract_url": "http://arxiv.org/abs/1706.03762v7",
  "pdf_url": "https://arxiv.org/pdf/1706.03762v7"
}

🏗️ Architecture & design choices

arxiv-mcp-server/
├── src/
│   └── server.py          # FastMCP server: 4 tools + Pydantic models + helpers
├── scripts/
│   ├── smoke_test.py      # in-process tests (import the tool fns directly)
│   └── client_test.py     # end-to-end test over the real stdio MCP protocol
├── pyproject.toml         # packaging + ruff + pyright config
├── requirements.txt       # runtime deps
└── README.md

FastMCP registers each tool via @mcp.tool(); type hints + pydantic.Field descriptions become the JSON input schema the host advertises to the model.
Typed output models — Paper, SearchResults, AuthorResults, TopicCount, TrendingResults — give the host structured, machine-readable results.
Read-only annotations — all four tools set readOnlyHint=True / destructiveHint=False, so hosts can treat them as safe to call freely.
One shared arxiv.Client with a polite delay + retries, respecting arXiv's fair-use guidance; its chatty INFO logging is silenced so stdout stays a clean MCP channel.
Actionable errors — bad input or a failed request raises a ValueError whose message tells the agent how to fix the call (correct ID format, valid category code, query-prefix syntax, …).
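
To illustrate the "actionable errors" idea for the ID case, here is a hedged sketch of an ID normalizer that accepts the bare, versioned, legacy, and full-URL forms and raises a ValueError explaining the expected format. The function name `normalize_arxiv_id` and the regular expressions are assumptions for this example; the server's real validation may differ.

```python
import re

# New-style IDs: YYMM.NNNN or YYMM.NNNNN, with an optional version suffix.
_NEW_STYLE = re.compile(r"^\d{4}\.\d{4,5}(v\d+)?$")
# Legacy IDs: archive, optional subject class, slash, seven digits, optional version.
_LEGACY = re.compile(r"^[a-z\-]+(\.[A-Z]{2})?/\d{7}(v\d+)?$")
# Leading part of an abstract or PDF URL on arxiv.org.
_URL_PREFIX = re.compile(r"^https?://(www\.|export\.)?arxiv\.org/(abs|pdf)/", re.IGNORECASE)


def normalize_arxiv_id(raw: str) -> str:
    """Reduce any accepted form to a plain arXiv ID, or raise with a fix-it message."""
    candidate = _URL_PREFIX.sub("", raw.strip())
    if candidate.lower().endswith(".pdf"):
        candidate = candidate[:-4]
    if _NEW_STYLE.match(candidate) or _LEGACY.match(candidate):
        return candidate
    # The message states what was wrong and shows forms that would work.
    raise ValueError(
        f"Could not parse arXiv ID {raw!r}. Use a form such as '2401.01234', "
        "'2401.01234v2', 'math.GT/0309136', or a full arxiv.org/abs/ URL."
    )


print(normalize_arxiv_id("https://arxiv.org/abs/1706.03762v7"))
# 1706.03762v7
```

Because the error text names valid forms, an agent that receives it can usually correct the argument and retry without human help.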

🧪 Development & testing

pip install -e ".[dev]"          # ruff + pyright

ruff check .                     # lint
pyright                          # type check (strict on our own code)
python scripts/smoke_test.py     # in-process checks vs the live arXiv API
python scripts/client_test.py    # full stdio MCP protocol round-trip

Two complementary test layers:

smoke_test.py imports the tool functions directly — fast feedback on tool logic, the concise/detailed split, max_results/days clamping, missing-field handling, and error paths.
client_test.py is a true MCP client: it spawns src/server.py as a subprocess and exercises initialize → list_tools → call_tool over stdio — the same path any MCP host uses. This is what proves the server works as an MCP server: input schemas, structuredContent, tool annotations, and protocol-level error reporting (isError).
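
For readers unfamiliar with what travels over stdio during that round-trip, the sketch below builds the three JSON-RPC 2.0 requests behind initialize → list_tools → call_tool using only the standard library. It is illustrative, not a copy of client_test.py: the `request` helper, the client name, and the protocol version string are assumptions, and a real client would normally let the MCP SDK build, send, and await these messages.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC requests carry a unique id so responses can be matched


def request(method: str, params: dict) -> str:
    """Serialize one JSON-RPC 2.0 request as a single line of JSON."""
    return json.dumps({"jsonrpc": "2.0", "id": next(_ids), "method": method, "params": params})


handshake = [
    request("initialize", {
        "protocolVersion": "2025-03-26",   # assumed revision; the SDK negotiates this
        "capabilities": {},
        "clientInfo": {"name": "sketch-client", "version": "0.0.0"},  # hypothetical client
    }),
    # A real client also sends a "notifications/initialized" notification
    # after the server answers initialize and before any other request.
    request("tools/list", {}),
    request("tools/call", {"name": "get_abstract", "arguments": {"arxiv_id": "1706.03762"}}),
]

for line in handshake:
    print(line)
```

Each message is written to the server's stdin as one line; the tool result comes back on stdout, which is why the server must keep stdout free of log output.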

📋 Requirements

Python 3.10+
mcp[cli] — the MCP Python SDK (FastMCP)
arxiv — Python wrapper for the arXiv API
Network access to export.arxiv.org

🙏 Acknowledgements

Paper data from the arXiv API. Thank you to arXiv for the open API — please use it within their Terms of Use.
Built on the Model Context Protocol.

arXiv is a trademark of Cornell University. This project is an independent, unofficial integration and is not affiliated with or endorsed by arXiv.

📄 License

Released under the MIT License — see LICENSE.

Built for the agentic era, so your LLM can read the literature, not just guess about it.

This version

1.0.0

May 31, 2026

Download files

Source Distribution

arxiv_research_mcp-1.0.0.tar.gz (18.5 kB)

Uploaded May 31, 2026 Source

Built Distribution

arxiv_research_mcp-1.0.0-py3-none-any.whl (15.6 kB)

Uploaded May 31, 2026 Python 3

File details

Details for the file arxiv_research_mcp-1.0.0.tar.gz.

File metadata

Download URL: arxiv_research_mcp-1.0.0.tar.gz
Upload date: May 31, 2026
Size: 18.5 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for arxiv_research_mcp-1.0.0.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `9988e390baaf3247997fd756ec405e8f29b48094b2cf776cefacf89de39f7434` |
| MD5 | `a7c8b46975eb703fe8f78d289b2899c1` |
| BLAKE2b-256 | `856cbf6dd6e52aceda5f36b99f5a42c4386328ca530bd6627675120e3fd734e4` |

Provenance

The following attestation bundles were made for arxiv_research_mcp-1.0.0.tar.gz:

Publisher: publish.yml on JananiV07/arxiv-mcp-server

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: arxiv_research_mcp-1.0.0.tar.gz
- Subject digest: 9988e390baaf3247997fd756ec405e8f29b48094b2cf776cefacf89de39f7434
- Sigstore transparency entry: 1682420695
- Sigstore integration time: May 31, 2026
Source repository:
- Permalink: JananiV07/arxiv-mcp-server@00144f7f63e9aecbb0db8593903ad93589b00b19
- Branch / Tag: refs/tags/v1.0.0
- Owner: https://github.com/JananiV07
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@00144f7f63e9aecbb0db8593903ad93589b00b19
- Trigger Event: release

File details

Details for the file arxiv_research_mcp-1.0.0-py3-none-any.whl.

File metadata

Download URL: arxiv_research_mcp-1.0.0-py3-none-any.whl
Upload date: May 31, 2026
Size: 15.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for arxiv_research_mcp-1.0.0-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `08d5b2bf194a5c1e4265c39d83f238a5d0c18f8f9e8cbeb9eaa0b5bd433e7c61` |
| MD5 | `6a457ebc66817acc4598cf6511874ebd` |
| BLAKE2b-256 | `f0998271106ce986c6e50a5f280c941e218853ee8fc1105fd5e0e29f4b8d298d` |

Provenance

The following attestation bundles were made for arxiv_research_mcp-1.0.0-py3-none-any.whl:

Publisher: publish.yml on JananiV07/arxiv-mcp-server

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: arxiv_research_mcp-1.0.0-py3-none-any.whl
- Subject digest: 08d5b2bf194a5c1e4265c39d83f238a5d0c18f8f9e8cbeb9eaa0b5bd433e7c61
- Sigstore transparency entry: 1682420746
- Sigstore integration time: May 31, 2026
Source repository:
- Permalink: JananiV07/arxiv-mcp-server@00144f7f63e9aecbb0db8593903ad93589b00b19
- Branch / Tag: refs/tags/v1.0.0
- Owner: https://github.com/JananiV07
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@00144f7f63e9aecbb0db8593903ad93589b00b19
- Trigger Event: release
