A Python package for arXiv paper access with CLI and MCP server support
deepxiv-sdk
A Python SDK for accessing arXiv papers with CLI and MCP server support.
🎮 Try the live demo: https://1stauthor.com/
📚 API Documentation: https://data.rag.ac.cn/api/docs
Features
- 🔍 Paper Search: Search for arXiv papers using hybrid search (BM25 + Vector)
- 📄 Paper Access: Retrieve paper metadata, sections, and full content
- 🏥 PMC Support: Access PubMed Central biomedical literature
- 💻 CLI: Command-line interface for quick access
- 🔌 MCP Server: Model Context Protocol server for Claude Desktop integration
- 🤖 Intelligent Agent: ReAct-based agent for intelligent paper analysis
- 🔌 Flexible LLM Support: Compatible with OpenAI, DeepSeek, OpenRouter, and other OpenAI-compatible APIs
Installation
# Basic install (Reader + CLI)
pip install deepxiv-sdk
# With MCP server support (the quotes stop shells such as zsh from expanding the brackets)
pip install "deepxiv-sdk[mcp]"
# With Agent support
pip install "deepxiv-sdk[agent]"
# Full install (all features)
pip install "deepxiv-sdk[all]"
Quick Start
Step 1: Get Your Free API Token
Visit https://data.rag.ac.cn/register to get your free API token (1000 requests/day).
Step 2: Configure Your Token
# Interactive configuration (saves to ~/.env)
deepxiv config
# Or provide token directly
deepxiv config --token YOUR_TOKEN
# The CLI will automatically load token from ~/.env
CLI Usage
# Show help
deepxiv help
# Get paper in different formats
deepxiv paper 2409.05591 # Full markdown
deepxiv paper 2409.05591 --head # Metadata (JSON)
deepxiv paper 2409.05591 --brief # Brief info (title, TLDR, keywords)
deepxiv paper 2409.05591 --raw # Raw markdown
deepxiv paper 2409.05591 --preview # Preview (~10k chars)
deepxiv paper 2409.05591 --section intro # Specific section
# Search papers
deepxiv search "agent memory" --limit 5
deepxiv search "transformer" --mode bm25 --format json
deepxiv search "LLM" --categories cs.AI,cs.CL --min-citations 100
# Get PMC papers
deepxiv pmc PMC544940 # Full JSON
deepxiv pmc PMC544940 --head # Metadata only
deepxiv pmc PMC514704 # Another example
# Start MCP server
deepxiv serve
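When the CLI is driven from a script, it can help to assemble the argument list programmatically rather than concatenate strings. The following is a minimal sketch that uses only the `search` flags shown above (`--limit`, `--mode`, `--format`, `--categories`, `--min-citations`). The helper name `build_search_argv` and its parameter names are hypothetical and are not part of the SDK.

```python
import shlex
from typing import List, Optional, Sequence


def build_search_argv(
    query: str,
    limit: Optional[int] = None,
    mode: Optional[str] = None,
    output_format: Optional[str] = None,
    categories: Optional[Sequence[str]] = None,
    min_citations: Optional[int] = None,
) -> List[str]:
    """Assemble a `deepxiv search` argument list (hypothetical helper)."""
    argv = ["deepxiv", "search", query]
    if limit is not None:
        argv += ["--limit", str(limit)]
    if mode is not None:
        argv += ["--mode", mode]
    if output_format is not None:
        argv += ["--format", output_format]
    if categories:
        # The CLI examples pass categories as one comma-separated value.
        argv += ["--categories", ",".join(categories)]
    if min_citations is not None:
        argv += ["--min-citations", str(min_citations)]
    return argv


argv = build_search_argv("LLM", categories=["cs.AI", "cs.CL"], min_citations=100)
# shlex.join quotes arguments safely for display or logging.
print(shlex.join(argv))
```

The resulting list can be passed directly to `subprocess.run(argv)`, which avoids shell quoting problems with multi-word queries such as "agent memory".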
Python API
from deepxiv_sdk import Reader
# Initialize the reader
reader = Reader(token="your_api_token")  # or Reader() for the test papers that need no token
# Search for papers
results = reader.search("agent memory", size=10)
for paper in results['results']:
    print(f"{paper['title']} - {paper['arxiv_id']}")
# Get paper metadata
head = reader.head("2409.05591")
print(f"Title: {head['title']}")
# Get brief info (quick summary)
brief = reader.brief("2409.05591")
print(f"Title: {brief['title']}")
print(f"TLDR: {brief.get('tldr', 'N/A')}")
print(f"Citations: {brief.get('citations', 0)}")
# Read a section (case-insensitive)
intro = reader.section("2409.05591", "Introduction")
print(intro)
# Get full paper
content = reader.raw("2409.05591")
# Access PMC papers
pmc_head = reader.pmc_head("PMC544940")
print(f"PMC Title: {pmc_head['title']}")
pmc_full = reader.pmc_json("PMC544940")
print(f"PMC Content: {len(str(pmc_full))} chars")
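The calls above can be combined into a small triage step: search, fetch a brief for each hit, and rank by citation count. The sketch below relies only on `search()` and `brief()` and on the dictionary keys used in the example above (`results`, `arxiv_id`, `title`, `tldr`, `citations`). The `triage` function is a hypothetical helper, and `FakeReader` is a stand-in with placeholder data so the sketch runs offline; it is not part of the SDK.

```python
from typing import Any, Dict, List


def triage(reader: Any, query: str, size: int = 5) -> List[Dict[str, Any]]:
    """Search, fetch a brief for each hit, and sort by citation count."""
    hits = reader.search(query, size=size)["results"]
    rows = []
    for hit in hits:
        brief = reader.brief(hit["arxiv_id"])
        rows.append({
            "arxiv_id": hit["arxiv_id"],
            "title": brief.get("title", hit.get("title", "")),
            "tldr": brief.get("tldr", "N/A"),
            "citations": brief.get("citations", 0),
        })
    # Most-cited papers first.
    return sorted(rows, key=lambda row: row["citations"], reverse=True)


class FakeReader:
    """Offline stand-in for Reader; the IDs and numbers are placeholders."""

    _briefs = {
        "0000.00001": {"title": "Placeholder A", "tldr": "First stub.", "citations": 2},
        "0000.00002": {"title": "Placeholder B", "tldr": "Second stub.", "citations": 7},
    }

    def search(self, query: str, size: int = 10) -> Dict[str, Any]:
        results = [{"arxiv_id": k, "title": v["title"]} for k, v in self._briefs.items()]
        return {"results": results[:size]}

    def brief(self, arxiv_id: str) -> Dict[str, Any]:
        return self._briefs[arxiv_id]


ranked = triage(FakeReader(), "agent memory")
for row in ranked:
    print(f"{row['citations']:>4}  {row['arxiv_id']}  {row['title']}")
```

With the real SDK, `triage(Reader(token="your_api_token"), "agent memory")` would be the equivalent call. Each hit triggers one extra `brief()` call, so it is worth keeping `size` small if each call counts against the daily request limit.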
Agent Usage
import os
from deepxiv_sdk import Reader, Agent
reader = Reader(token="your_api_token")
agent = Agent(
    api_key=os.getenv("OPENAI_API_KEY"),
    model="gpt-4",
    reader=reader,
    print_process=True
)
answer = agent.query("What are the latest papers about agent memory?")
print(answer)
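For follow-up questions, one possible pattern is to reset the paper context on the first question and keep it for the rest. The sketch below uses only the documented `query(question, reset_papers=False)` signature. It assumes that `reset_papers=True` clears previously loaded papers before answering, which this README does not spell out. `ask_all` is a hypothetical helper and `FakeAgent` is an offline stand-in, not the SDK's `Agent`.

```python
from typing import Any, Dict, List, Tuple


def ask_all(agent: Any, questions: List[str]) -> Dict[str, str]:
    """Ask several questions in one session, starting from a clean context."""
    answers: Dict[str, str] = {}
    for index, question in enumerate(questions):
        # Assumption: reset_papers=True drops previously loaded papers first.
        answers[question] = agent.query(question, reset_papers=(index == 0))
    return answers


class FakeAgent:
    """Offline stand-in that records how query() was called."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, bool]] = []

    def query(self, question: str, reset_papers: bool = False) -> str:
        self.calls.append((question, reset_papers))
        return f"stub answer to: {question}"


fake = FakeAgent()
answers = ask_all(fake, ["What is agent memory?", "Which papers evaluate it?"])
print(fake.calls)
```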
MCP Server Setup
For Claude Desktop
Add to ~/Library/Application Support/Claude/claude_desktop_config.json:
{
  "mcpServers": {
    "deepxiv": {
      "command": "deepxiv",
      "args": ["serve"],
      "env": {
        "DEEPXIV_TOKEN": "your_token_here"
      }
    }
  }
}
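If the config file already lists other MCP servers, pasting the block above over it would drop them. The following sketch merges the `deepxiv` entry into an existing file instead. The function name `add_deepxiv_server` is hypothetical, and the demo writes to a temporary directory rather than the real Claude Desktop path.

```python
import json
import tempfile
from pathlib import Path


def add_deepxiv_server(config_path: Path, token: str) -> dict:
    """Merge a deepxiv entry into an MCP config file, keeping other servers."""
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        if text.strip():
            config = json.loads(text)
    servers = config.setdefault("mcpServers", {})
    servers["deepxiv"] = {
        "command": "deepxiv",
        "args": ["serve"],
        "env": {"DEEPXIV_TOKEN": token},
    }
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Demo against a throwaway file that already contains another server.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "claude_desktop_config.json"
    demo_path.write_text('{"mcpServers": {"other": {"command": "other-server"}}}')
    merged = add_deepxiv_server(demo_path, "your_token_here")
    written = json.loads(demo_path.read_text(encoding="utf-8"))

print(sorted(merged["mcpServers"]))
```

To apply it for real, pass `Path.home() / "Library/Application Support/Claude/claude_desktop_config.json"` as `config_path`, ideally after making a backup copy of the file.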
Available MCP Tools
| Tool | Description |
|---|---|
| `search_papers` | Search arXiv with hybrid search |
| `get_paper_brief` | Get brief info (title, TLDR, keywords, citations) |
| `get_paper_metadata` | Get paper metadata and section TLDRs |
| `get_paper_section` | Read a specific section |
| `get_full_paper` | Get complete paper content |
| `get_paper_preview` | Get preview (~10k chars) |
| `get_pmc_metadata` | Get PMC paper metadata |
| `get_pmc_full` | Get complete PMC paper in JSON |
API Token
- Get Your Free Token: https://data.rag.ac.cn/register
- Daily Limit: 1000 free requests per day
- Test Papers:
  - arXiv: `2409.05591` and `2504.21776` are available without authentication
  - PMC: `PMC544940` and `PMC514704` are available without authentication
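Batch jobs can run into the 1000-requests-per-day limit. A client-side counter is one way to stop before the server starts rejecting calls. The sketch below is not part of the SDK. The `DailyBudget` class is hypothetical, and it assumes the quota resets once per local calendar day, which may not match the server's actual reset time.

```python
import datetime as dt
from typing import Optional


class DailyBudget:
    """Count requests per calendar day and refuse once the limit is reached."""

    def __init__(self, limit: int = 1000) -> None:
        self.limit = limit
        self.day: Optional[dt.date] = None
        self.used = 0

    def try_spend(self, today: Optional[dt.date] = None) -> bool:
        """Return True and count one request if budget remains for today."""
        today = today or dt.date.today()
        if today != self.day:
            # First call on a new day: start counting from zero again.
            self.day, self.used = today, 0
        if self.used >= self.limit:
            return False
        self.used += 1
        return True


# Demo with a tiny limit and fixed dates so the behaviour is easy to follow.
budget = DailyBudget(limit=2)
day_one = dt.date(2025, 1, 1)
same_day = [budget.try_spend(day_one) for _ in range(3)]
next_day = budget.try_spend(day_one + dt.timedelta(days=1))
print(same_day, next_day)
```

A caller would check `budget.try_spend()` before each API call and pause or exit when it returns `False`.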
Token Configuration (3 Ways)
1. Using config command (Recommended)
deepxiv config
# Saves to ~/.env and automatically loads on every command
2. Environment Variable
export DEEPXIV_TOKEN="your_token_here"
# Add to ~/.bashrc or ~/.zshrc for persistence
3. Command-line Option
deepxiv paper 2512.02556 --token "your_token_here"
# Useful for one-time usage or multiple tokens
The CLI automatically loads tokens from:
1. Command-line `--token` option (highest priority)
2. `DEEPXIV_TOKEN` environment variable
3. `.env` file in current directory
4. `~/.env` file in home directory (lowest priority)
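The priority order above can be sketched as a small lookup function. This illustrates the documented order and is not the CLI's actual implementation. The names `resolve_token` and `read_env_file` are hypothetical, and the sketch assumes the `.env` files store the token as a `DEEPXIV_TOKEN=...` line.

```python
import os
import tempfile
from pathlib import Path
from typing import Mapping, Optional


def read_env_file(path: Path, key: str) -> Optional[str]:
    """Return the value of a KEY=value line from a dotenv-style file."""
    if not path.is_file():
        return None
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() == key:
            return value.strip().strip("\"'")
    return None


def resolve_token(
    cli_token: Optional[str] = None,
    environ: Mapping[str, str] = os.environ,
    cwd: Optional[Path] = None,
    home: Optional[Path] = None,
) -> Optional[str]:
    """Pick a token: --token, then env var, then ./.env, then ~/.env."""
    if cli_token:
        return cli_token
    if environ.get("DEEPXIV_TOKEN"):
        return environ["DEEPXIV_TOKEN"]
    cwd = cwd or Path.cwd()
    home = home or Path.home()
    for env_file in (cwd / ".env", home / ".env"):
        token = read_env_file(env_file, "DEEPXIV_TOKEN")
        if token:
            return token
    return None


# Demo in a temporary directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    home_dir, project_dir = Path(tmp) / "home", Path(tmp) / "project"
    home_dir.mkdir()
    project_dir.mkdir()
    (home_dir / ".env").write_text("DEEPXIV_TOKEN=from_home\n")
    (project_dir / ".env").write_text("DEEPXIV_TOKEN=from_cwd\n")
    env = {"DEEPXIV_TOKEN": "from_env"}
    results = [
        resolve_token("from_cli", env, project_dir, home_dir),
        resolve_token(None, env, project_dir, home_dir),
        resolve_token(None, {}, project_dir, home_dir),
    ]
    (project_dir / ".env").unlink()
    results.append(resolve_token(None, {}, project_dir, home_dir))

print(results)
```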
API Reference
Reader Methods
arXiv Methods
- `search(query, size=10, search_mode="hybrid", ...)`: Search for papers
- `head(arxiv_id)`: Get paper metadata and structure
- `brief(arxiv_id)`: Get brief info (title, TLDR, keywords, citations)
- `section(arxiv_id, section_name)`: Get a specific section (case-insensitive)
- `raw(arxiv_id)`: Get full paper in markdown
- `preview(arxiv_id)`: Get paper preview (~10k chars)
- `json(arxiv_id)`: Get complete structured JSON
- `markdown(arxiv_id)`: Get HTML view URL
PMC Methods
- `pmc_head(pmc_id)`: Get PMC paper metadata
- `pmc_json(pmc_id)`: Get complete PMC paper in JSON
Agent Methods
- `query(question, reset_papers=False)`: Query the agent
- `get_loaded_papers()`: Get loaded papers info
- `reset_papers()`: Reset all loaded papers
- `add_paper(arxiv_id)`: Add a paper to context
Examples
See the examples directory:
- `example_reader.py`: Basic Reader usage
- `example_agent.py`: Agent usage
- `example_advanced.py`: Advanced patterns
- `quickstart.py`: Quick start guide
License
MIT License - see LICENSE file for details.
Support
- 🐛 GitHub Issues: https://github.com/qhjqhj00/deepxiv_sdk/issues
- 📚 API Documentation: https://data.rag.ac.cn/api/docs
- 🎮 Demo: https://1stauthor.com/