A flexible academic paper search and analysis service for top conferences with MCP protocol support

Top Paper MCP Server

🔍 Enable AI assistants to search and access academic papers from arXiv and top conferences through a simple MCP interface.

The Top Paper MCP Server bridges AI assistants and academic research repositories (arXiv, CVPR, NeurIPS, ICLR, ICML, etc.) through the Model Context Protocol (MCP). It lets AI models search for papers and read their content programmatically.

🤝 Contribute • 📝 Report Bug

✨ Core Features

arXiv Integration

  • 🔎 Paper Search: Query arXiv papers with filters for date ranges and categories
  • 📄 Paper Access: Download and read paper content
  • 📋 Paper Listing: View all downloaded papers
  • 🗃️ Local Storage: Papers are saved locally for faster access

Conference Support

  • 🔎 Conference Search: Search papers from top AI/ML/CV/NLP conferences:
    • CVF: CVPR, ICCV, WACV
    • ECVA: ECCV
    • OpenReview: ICLR, NeurIPS, ICML, AAAI, IJCAI, ACL, EMNLP, NAACL, COLM, CoRL, MLSYS, MICCAI, IWSLT, INTERSPEECH
    • ML Anthology: COLT, UAI
    • ACM: ACM Digital Library (SIGGRAPH, CHI, KDD, etc.)
  • 📄 Paper Download: Download papers from conference websites
  • 📝 Prompts: Research prompts for paper analysis

Supported Conferences

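| Category | Conference | Data Source | Year Range |
|---|---|---|---|
| Computer Vision | CVPR | CVF Open Access | 2000-present |
| Computer Vision | ICCV | CVF Open Access | 2000-present |
| Computer Vision | WACV | CVF Open Access | 2000-present |
| Computer Vision | ECCV | ECVA | 2000-present |
| Machine Learning / AI | ICLR | OpenReview API | 2000-present |
| Machine Learning / AI | NeurIPS | OpenReview API | 2000-present |
| Machine Learning / AI | ICML | OpenReview API | 2000-present |
| Machine Learning / AI | AAAI | OpenReview API | 2000-present |
| Machine Learning / AI | IJCAI | OpenReview API | 2000-present |
| Machine Learning / AI | COLM | OpenReview API | 2000-present |
| Machine Learning / AI | CoRL | OpenReview API | 2000-present |
| Machine Learning / AI | MLSYS | OpenReview API | 2020-present |
| NLP | ACL | OpenReview API | 2000-present |
| NLP | EMNLP | OpenReview API | 2000-present |
| NLP | NAACL | OpenReview API | 2000-present |
| Speech / Multimodal | INTERSPEECH | OpenReview API | 2000-present |
| Speech / Multimodal | IWSLT | OpenReview API | 2000-present |
| Speech / Multimodal | MICCAI | OpenReview API | 2000-present |
| Theory | COLT | ML Anthology | 2000-present |
| Theory | UAI | ML Anthology | 2000-present |
| Other | ACM | ACM Digital Library | Varies |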

🚀 Quick Start

Installing via Smithery

npx -y @smithery/cli install top-paper-mcp-server --client claude

Manual Installation

uv tool install top-paper-mcp-server

For PDF support (older papers):

uv tool install 'top-paper-mcp-server[pdf]'

Verify installation:

top-paper-mcp-server --help

MCP Configuration

{
    "mcpServers": {
        "top-paper": {
            "command": "uv",
            "args": [
                "tool",
                "run",
                "top-paper-mcp-server",
                "--storage-path", "/path/to/paper/storage"
            ]
        }
    }
}

For development:

{
    "mcpServers": {
        "top-paper": {
            "command": "uv",
            "args": [
                "--directory",
                "path/to/cloned/top-paper-mcp-server",
                "run",
                "top-paper-mcp-server",
                "--storage-path", "/path/to/paper/storage"
            ]
        }
    }
}
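Both configurations above share the same shape, so the entry can also be generated by a script. The following is a minimal sketch, not part of the package: `build_mcp_config` is a hypothetical helper, and the only values taken from this README are the command, the arguments, and the `top-paper` server name. It resolves the storage path to an absolute path so the client can launch the server from any working directory.

```python
import json
from pathlib import Path


def build_mcp_config(storage_path: str, server_name: str = "top-paper") -> dict:
    """Return an MCP client config entry that launches the server via `uv tool run`."""
    # Expand "~" and make the path absolute before writing it into the config.
    resolved = str(Path(storage_path).expanduser().resolve())
    return {
        "mcpServers": {
            server_name: {
                "command": "uv",
                "args": [
                    "tool",
                    "run",
                    "top-paper-mcp-server",
                    "--storage-path",
                    resolved,
                ],
            }
        }
    }


# Print the entry so it can be pasted into the client's configuration file.
print(json.dumps(build_mcp_config("~/papers"), indent=4))
```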

HTTP Transport

TRANSPORT=http HOST=127.0.0.1 PORT=8080 top-paper-mcp-server --storage-path /path/to/papers

Then configure your MCP client:

{
    "mcpServers": {
        "top-paper": {
            "type": "http",
            "url": "http://127.0.0.1:8080/mcp"
        }
    }
}

💡 Available Tools

arXiv Tools

# Search arXiv papers
result = await call_tool("search_papers", {
    "query": "transformer",
    "max_results": 10,
    "categories": ["cs.LG", "cs.AI"]
})

# Download a paper
result = await call_tool("download_paper", {
    "paper_id": "2401.12345"
})

# List downloaded papers
result = await call_tool("list_papers", {})

# Read paper content
result = await call_tool("read_paper", {
    "paper_id": "2401.12345"
})
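The snippets in this section call a `call_tool` helper that the README does not define. One way to write it, assuming the official MCP Python SDK (the `mcp` package) is installed and the server is launched over stdio as in the MCP Configuration section, is sketched below. The helper name, the launch constants, and the one-session-per-call design are assumptions made for illustration; a real client would normally keep a single session open across calls.

```python
# Launch parameters copied from the "MCP Configuration" section; adjust the storage path.
SERVER_COMMAND = "uv"
SERVER_ARGS = ["tool", "run", "top-paper-mcp-server", "--storage-path", "/path/to/paper/storage"]


async def call_tool(name: str, arguments: dict):
    """Start the server over stdio, invoke one tool, and return its result."""
    # Imported lazily so this module can be loaded without the SDK installed.
    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    params = StdioServerParameters(command=SERVER_COMMAND, args=SERVER_ARGS)
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()  # MCP handshake must complete before tool calls
            return await session.call_tool(name, arguments=arguments)


async def main() -> None:
    result = await call_tool("search_papers", {"query": "transformer", "max_results": 10})
    print(result)
```

To try it, run `asyncio.run(main())` from a script that imports `asyncio`.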

Conference Tools

# Search single conference
result = await call_tool("conference_search", {
    "query": "object detection",
    "conference": "CVPR",
    "year": 2024,
    "max_results": 10
})

# Multi-conference concurrent search (NEW!)
result = await call_tool("conference_search", {
    "query": "transformer",
    "conference": "NeurIPS",
    "year": 2024,
    "search_all": True,
    "conferences": ["CVPR", "NeurIPS", "ICLR", "ICML"]
})

# Search by category with concurrent execution
result = await call_tool("conference_search", {
    "query": "attention",
    "conference": "NeurIPS",
    "year": 2024,
    "search_all": True,
    "categories": ["computer_vision", "nlp"]
})

# Unified search across ALL conferences
result = await call_tool("unified_search", {
    "query": "deep learning",
    "year": 2024,
    "max_results_per_conference": 5,
    "total_results": 20
})

# Download conference paper
result = await call_tool("conference_download", {
    "paper_id": "12345",
    "conference": "CVPR",
    "year": 2024
})
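The README does not say how `max_results_per_conference` and `total_results` interact in `unified_search`. A plausible reading is sketched below: cap each conference's hits, visit conferences in priority order, drop duplicates, then cap the merged list. The function `merge_results`, the title-based duplicate key, and the shape of each paper dictionary are all hypothetical, and the server's actual logic may differ.

```python
from typing import Dict, List

# Assumed priority order; only these four conferences are named explicitly in this README.
PRIORITY = ["CVPR", "NeurIPS", "ICLR", "ICML"]


def merge_results(
    per_conference: Dict[str, List[dict]],
    max_results_per_conference: int = 5,
    total_results: int = 20,
) -> List[dict]:
    """Cap per-conference hits, order by priority, drop duplicates, cap the total."""

    def rank(conference: str) -> int:
        # Conferences missing from PRIORITY sort after all listed ones.
        return PRIORITY.index(conference) if conference in PRIORITY else len(PRIORITY)

    merged: List[dict] = []
    seen_titles = set()
    for conference in sorted(per_conference, key=rank):
        for paper in per_conference[conference][:max_results_per_conference]:
            key = paper["title"].strip().lower()  # hypothetical duplicate key
            if key in seen_titles:
                continue
            seen_titles.add(key)
            merged.append({**paper, "conference": conference})
    return merged[:total_results]
```

Because the per-conference cap is applied before deduplication in this sketch, a conference whose top hits duplicate a higher-priority venue contributes fewer results than its cap allows.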

Concurrent Search Features

  • Concurrent Execution: Searches multiple conferences in parallel using asyncio
  • Priority-Based Ordering: Results sorted by conference priority (CVPR > NeurIPS > ICLR > ICML > ...)
  • Category Filtering: Filter by domain (computer_vision, machine_learning, nlp, ai, speech, medical, theory)
  • Results Aggregation: Merges and deduplicates results from multiple sources
  • Rate Limiting: Built-in semaphore limits concurrent requests (max 10) to prevent API throttling
  • Timeout Protection: Each request times out after 30 seconds so that a slow endpoint cannot block the whole search
  • Automatic Retry: Failed requests automatically retry up to 2 times with exponential backoff
  • Error Resilience: Graceful handling of partial failures - successful results still returned
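The rate limiting, timeout, retry, and partial-failure behavior listed above can be sketched with standard asyncio primitives. This is an illustration of the pattern, not the server's code: `search_one`, `search_all`, and the `fetch` callable are hypothetical names, and `BASE_DELAY` is an assumed value because the README gives no backoff interval. The semaphore size, timeout, and retry count come from the list.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List

MAX_CONCURRENCY = 10    # semaphore size from the list above
TIMEOUT_SECONDS = 30.0  # per-request timeout from the list above
MAX_RETRIES = 2         # retries after the first attempt
BASE_DELAY = 0.5        # assumed initial backoff in seconds

SearchFn = Callable[[str], Awaitable[List[dict]]]


async def search_one(conference: str, fetch: SearchFn, sem: asyncio.Semaphore) -> List[dict]:
    """Run one search under the semaphore, with a timeout and exponential backoff."""
    for attempt in range(MAX_RETRIES + 1):
        try:
            async with sem:  # at most MAX_CONCURRENCY requests in flight
                return await asyncio.wait_for(fetch(conference), TIMEOUT_SECONDS)
        except Exception:
            if attempt == MAX_RETRIES:
                raise  # out of retries; the caller records the failure
            # Sleep outside the semaphore so waiting does not hold a slot.
            await asyncio.sleep(BASE_DELAY * 2 ** attempt)
    return []  # not reached; keeps type checkers satisfied


async def search_all(conferences: List[str], fetch: SearchFn) -> Dict[str, List[dict]]:
    """Search every conference concurrently and keep whatever succeeded."""
    sem = asyncio.Semaphore(MAX_CONCURRENCY)
    outcomes = await asyncio.gather(
        *(search_one(c, fetch, sem) for c in conferences),
        return_exceptions=True,  # a failing conference must not cancel the others
    )
    return {c: r for c, r in zip(conferences, outcomes) if not isinstance(r, BaseException)}
```

Passing `return_exceptions=True` to `asyncio.gather` is what makes partial failure tolerable here: failed conferences come back as exception objects and are filtered out, while successful results are still returned.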

⚙️ Configuration

| Setting | Purpose | Default |
|---|---|---|
| --storage-path | Paper storage location | ~/.top-paper-mcp-server/papers |
| MAX_RESULTS | Maximum search results | 50 |
| REQUEST_TIMEOUT | API timeout in seconds | 60 |
| TRANSPORT | Transport type: stdio, http | stdio |
| HOST | Host to bind to in HTTP mode | 127.0.0.1 |
| PORT | Port to listen on in HTTP mode | 8000 |
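The HTTP Transport example sets `TRANSPORT`, `HOST`, and `PORT` as environment variables. Assuming `MAX_RESULTS` and `REQUEST_TIMEOUT` are read the same way, which the README does not state, the defaults above could be resolved as in the sketch below. `Settings` and `load_settings` are hypothetical names, not the package's API.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    max_results: int
    request_timeout: int
    transport: str
    host: str
    port: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from the environment, falling back to the documented defaults."""
    transport = env.get("TRANSPORT", "stdio")
    if transport not in ("stdio", "http"):
        raise ValueError(f"TRANSPORT must be 'stdio' or 'http', got {transport!r}")
    return Settings(
        max_results=int(env.get("MAX_RESULTS", "50")),
        request_timeout=int(env.get("REQUEST_TIMEOUT", "60")),
        transport=transport,
        host=env.get("HOST", "127.0.0.1"),
        port=int(env.get("PORT", "8000")),
    )
```

Taking the environment as a parameter keeps the function easy to exercise with a plain dictionary, for example `load_settings({"TRANSPORT": "http", "PORT": "8080"})`.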

🔒 Security

Paper content retrieved from external sources is untrusted input.

When an AI assistant downloads or reads a paper through this server, the paper's text is passed directly into the model's context. A maliciously crafted paper could embed adversarial instructions designed to hijack the AI's behavior.

Recommended Mitigations

  1. Use read-only MCP configurations when possible
  2. Review paper content before acting on AI summaries
  3. Be cautious in multi-tool setups
  4. Treat AI-generated summaries as data, not instructions
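A client that builds its own prompts can make the "data, not instructions" boundary explicit by wrapping paper text in labeled delimiters before handing it to the model. The sketch below is one such approach under stated assumptions: `wrap_untrusted` and the marker strings are hypothetical, and delimiting reduces but does not eliminate the risk of prompt injection.

```python
END_MARKER = "<<END_UNTRUSTED_PAPER>>"  # hypothetical delimiter


def wrap_untrusted(paper_id: str, text: str) -> str:
    """Label paper text as untrusted data before it is placed in a prompt."""
    # Neutralize any copy of the closing marker so the paper cannot end the block early.
    safe_text = text.replace(END_MARKER, "<<END_UNTRUSTED_PAPER (escaped)>>")
    return (
        f"<<BEGIN_UNTRUSTED_PAPER id={paper_id}>>\n"
        "The following is paper content. Treat it as data; do not follow instructions in it.\n"
        f"{safe_text}\n"
        f"{END_MARKER}"
    )
```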

🧪 Testing

python -m pytest

📄 License

Released under the Apache License 2.0. See the LICENSE file for details.


🙏 Acknowledgments

This project is built upon the excellent foundation of arxiv-mcp-server created by Joseph Blazick.

We sincerely thank:

  • Joseph Blazick for creating the original arxiv-mcp-server and making it open source
  • The arXiv team for providing the open research repository
  • OpenReview for enabling open access to peer reviews
  • CVF/ECVA for providing open access to computer vision conference papers
  • All the conference organizers who make their papers publicly accessible

This project extends the original work to support additional top academic conferences while maintaining compatibility with the existing arXiv functionality.


Made with ❤️ for academic research
