A Python package for arXiv paper access with CLI and MCP server support
deepxiv-sdk
A Python SDK for accessing arXiv papers with CLI and MCP server support.
🎮 Try the live demo: https://1stauthor.com/
📚 API Documentation: https://data.rag.ac.cn/api/docs
Features
- 🔍 Paper Search: Search for arXiv papers using hybrid search (BM25 + Vector)
- 📄 Paper Access: Retrieve paper metadata, sections, and full content
- 🏥 PMC Support: Access PubMed Central biomedical literature
- 💻 CLI: Command-line interface for quick access
- 🔌 MCP Server: Model Context Protocol server for Claude Desktop integration
- 🤖 Intelligent Agent: ReAct-based agent for intelligent paper analysis
- 🔌 Flexible LLM Support: Compatible with OpenAI, DeepSeek, OpenRouter, and other OpenAI-compatible APIs
Installation
# Basic install (Reader + CLI)
pip install deepxiv-sdk
# With MCP server support
pip install deepxiv-sdk[mcp]
# With Agent support (includes OpenAI SDK)
pip install deepxiv-sdk[agent]
# Full install (all features)
pip install deepxiv-sdk[all]
Note: Agent requires openai>=1.0.0 for LLM calls. Install with [agent] or [all] extras.
Quick Start
Step 1: Get Your Free API Token
Visit https://data.rag.ac.cn/register to get your free API token (10000 requests/day).
Step 2: Configure Your Token
# Interactive configuration (saves to ~/.env)
deepxiv config
# Or provide token directly
deepxiv config --token YOUR_TOKEN
# The CLI will automatically load token from ~/.env
CLI Usage
# Show help
deepxiv help
# Get paper in different formats
deepxiv paper 2409.05591 # Full markdown
deepxiv paper 2409.05591 --head # Metadata (JSON)
deepxiv paper 2409.05591 --brief # Brief info (title, TLDR, keywords)
deepxiv paper 2409.05591 --raw # Raw markdown
deepxiv paper 2409.05591 --preview # Preview (~10k chars)
deepxiv paper 2409.05591 --section intro # Specific section
# Search papers
deepxiv search "agent memory" --limit 5
deepxiv search "transformer" --mode bm25 --format json
deepxiv search "LLM" --categories cs.AI,cs.CL --min-citations 100
# Get PMC papers
deepxiv pmc PMC544940 # Full JSON
deepxiv pmc PMC544940 --head # Metadata only
deepxiv pmc PMC514704 # Another example
# Intelligent Agent (requires agent installation)
deepxiv agent config # Configure LLM API first
deepxiv agent query "What are the latest papers about agent memory?"
# Show the reasoning process
deepxiv agent query "Latest HLE scores" --max-turn 10 --verbose
# Start MCP server
deepxiv serve
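When the CLI is driven from a script, it can help to assemble the argument list programmatically. The sketch below builds a `deepxiv search` command from the flags shown above (`--limit`, `--mode`, `--categories`, `--min-citations`, `--format`). The helper name `build_search_command` and its parameters are hypothetical and not part of the SDK.

```python
import shlex
from typing import List, Optional, Sequence


def build_search_command(
    query: str,
    limit: Optional[int] = None,
    mode: Optional[str] = None,
    categories: Optional[Sequence[str]] = None,
    min_citations: Optional[int] = None,
    as_json: bool = False,
) -> List[str]:
    """Build an argv list for `deepxiv search` (hypothetical helper)."""
    cmd = ["deepxiv", "search", query]
    if limit is not None:
        cmd += ["--limit", str(limit)]
    if mode is not None:
        cmd += ["--mode", mode]
    if categories:
        # The CLI examples pass categories as one comma-separated value.
        cmd += ["--categories", ",".join(categories)]
    if min_citations is not None:
        cmd += ["--min-citations", str(min_citations)]
    if as_json:
        cmd += ["--format", "json"]
    return cmd


cmd = build_search_command("LLM", categories=["cs.AI", "cs.CL"], min_citations=100)
print(shlex.join(cmd))  # shell-safe rendering of the command
# To execute it: subprocess.run(cmd, capture_output=True, text=True, check=True)
```

Passing a list to `subprocess.run` avoids shell quoting problems with multi-word queries such as "agent memory".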
Agent Configuration:
- Config is saved to ~/.deepxiv_agent_config.json
- Supports environment variables: DEEPXIV_AGENT_API_KEY, DEEPXIV_AGENT_BASE_URL, DEEPXIV_AGENT_MODEL
- Compatible with OpenAI, DeepSeek, OpenRouter, and other OpenAI-compatible APIs
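The following sketch shows one way a script could combine the saved config file with the environment variables listed above. Two things are assumptions, not documented behavior: the JSON key names (`api_key`, `base_url`, `model`) and the rule that environment variables override the file. The function names are hypothetical.

```python
import json
from pathlib import Path
from typing import Any, Dict, Mapping

CONFIG_PATH = Path.home() / ".deepxiv_agent_config.json"

# Variable names come from the README; the JSON key names are assumed.
ENV_TO_KEY = {
    "DEEPXIV_AGENT_API_KEY": "api_key",
    "DEEPXIV_AGENT_BASE_URL": "base_url",
    "DEEPXIV_AGENT_MODEL": "model",
}


def read_config_file(path: Path = CONFIG_PATH) -> Dict[str, Any]:
    """Return the saved config as a dict, or {} if the file does not exist."""
    if not path.is_file():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def resolve_agent_settings(
    file_settings: Mapping[str, Any], env: Mapping[str, str]
) -> Dict[str, Any]:
    """Merge file settings with environment variables (env wins, by assumption)."""
    known_keys = set(ENV_TO_KEY.values())
    settings = {k: v for k, v in file_settings.items() if k in known_keys and v}
    for var, key in ENV_TO_KEY.items():
        if env.get(var):
            settings[key] = env[var]
    return settings


example = resolve_agent_settings(
    {"api_key": "file-key", "model": "file-model"},
    {"DEEPXIV_AGENT_MODEL": "env-model"},
)
print(example)
```

In real use the two arguments would be `read_config_file()` and `os.environ`, and the result could be unpacked into `Agent(...)` if the key names match its parameters.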
Python API
from deepxiv_sdk import Reader
# Initialize the reader
reader = Reader(token="your_api_token") # or Reader() for free papers
# Search for papers
results = reader.search("agent memory", size=10)
for paper in results['results']:
    print(f"{paper['title']} - {paper['arxiv_id']}")
# Get paper metadata
head = reader.head("2409.05591")
print(f"Title: {head['title']}")
# Get brief info (quick summary)
brief = reader.brief("2409.05591")
print(f"Title: {brief['title']}")
print(f"TLDR: {brief.get('tldr', 'N/A')}")
print(f"Citations: {brief.get('citations', 0)}")
# Read a section (case-insensitive)
intro = reader.section("2409.05591", "Introduction")
print(intro)
# Get full paper
content = reader.raw("2409.05591")
# Access PMC papers
pmc_head = reader.pmc_head("PMC544940")
print(f"PMC Title: {pmc_head['title']}")
pmc_full = reader.pmc_json("PMC544940")
print(f"PMC Content: {len(str(pmc_full))} chars")
Agent Usage
The intelligent agent can search papers, read content, and answer questions using ReAct reasoning.
Python API
import os
from deepxiv_sdk import Reader, Agent
reader = Reader(token="your_api_token")
agent = Agent(
    api_key=os.getenv("OPENAI_API_KEY"),
    model="gpt-4",
    reader=reader,
    max_llm_calls=20,    # Maximum reasoning turns
    print_process=True,  # Show reasoning steps
)
answer = agent.query("What are the latest papers about agent memory?")
print(answer)
# For DeepSeek or other APIs
agent = Agent(
    api_key=os.getenv("DEEPSEEK_API_KEY"),
    base_url="https://api.deepseek.com",
    model="deepseek-chat",
    reader=reader,
)
MCP Server Setup
For Claude Desktop
Add to ~/Library/Application Support/Claude/claude_desktop_config.json:
{
  "mcpServers": {
    "deepxiv": {
      "command": "deepxiv",
      "args": ["serve"],
      "env": {
        "DEEPXIV_TOKEN": "your_token_here"
      }
    }
  }
}
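If the Claude Desktop config file already lists other MCP servers, editing it by hand risks overwriting them. The sketch below merges the `deepxiv` entry shown above into an existing config while leaving other entries untouched. Both function names are hypothetical, and the file path is passed in by the caller rather than assumed.

```python
import json
from pathlib import Path
from typing import Any, Dict


def add_deepxiv_server(config: Dict[str, Any], token: str) -> Dict[str, Any]:
    """Return a copy of a Claude Desktop config with the deepxiv entry added."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))  # copy, so the input is not mutated
    servers["deepxiv"] = {
        "command": "deepxiv",
        "args": ["serve"],
        "env": {"DEEPXIV_TOKEN": token},
    }
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path, token: str) -> None:
    """Read the config at `path` (if any), add the entry, and write it back."""
    existing = json.loads(path.read_text(encoding="utf-8")) if path.is_file() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(add_deepxiv_server(existing, token), indent=2) + "\n",
                    encoding="utf-8")


original = {"mcpServers": {"other": {"command": "other-server"}}}
merged = add_deepxiv_server(original, "your_token_here")
print(json.dumps(merged, indent=2))
```

Claude Desktop typically needs a restart before it picks up a changed config file.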
Available MCP Tools
| Tool | Description |
|---|---|
| search_papers | Search arXiv with hybrid search |
| get_paper_brief | Get brief info (title, TLDR, keywords, citations) |
| get_paper_metadata | Get paper metadata and section TLDRs |
| get_paper_section | Read a specific section |
| get_full_paper | Get complete paper content |
| get_paper_preview | Get preview (~10k chars) |
| get_pmc_metadata | Get PMC paper metadata |
| get_pmc_full | Get complete PMC paper in JSON |
API Token
- Get Your Free Token: https://data.rag.ac.cn/register
- Daily Limit: 1000 free requests per day
- Test Papers:
  - arXiv: 2409.05591 and 2504.21776 are available without authentication
  - PMC: PMC544940 and PMC514704 are available without authentication
Token Configuration (3 Ways)
1. Using config command (Recommended)
deepxiv config
# Saves to ~/.env and automatically loads on every command
2. Environment Variable
export DEEPXIV_TOKEN="your_token_here"
# Add to ~/.bashrc or ~/.zshrc for persistence
3. Command-line Option
deepxiv paper 2512.02556 --token "your_token_here"
# Useful for one-time usage or multiple tokens
The CLI automatically loads tokens from:
1. Command-line --token option (highest priority)
2. DEEPXIV_TOKEN environment variable
3. .env file in current directory
4. ~/.env file in home directory (lowest priority)
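The lookup order above can be mirrored in a script that needs the same token the CLI would use. This is a minimal sketch, not the CLI's actual implementation: it assumes the `.env` files store the token as a plain `DEEPXIV_TOKEN=value` line, and the function names `read_env_file` and `resolve_token` are hypothetical.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def read_env_file(path: Path, key: str = "DEEPXIV_TOKEN") -> Optional[str]:
    """Return the value of `key` from a simple KEY=VALUE file, if present."""
    if not path.is_file():
        return None
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() == key:
            return value.strip().strip("'\"") or None
    return None


def resolve_token(
    cli_token: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    cwd: Optional[Path] = None,
    home: Optional[Path] = None,
) -> Optional[str]:
    """Pick the first token found, from highest to lowest priority."""
    env = os.environ if env is None else env
    cwd = Path.cwd() if cwd is None else cwd
    home = Path.home() if home is None else home
    candidates = (
        cli_token,                      # 1. --token option
        env.get("DEEPXIV_TOKEN"),       # 2. environment variable
        read_env_file(cwd / ".env"),    # 3. .env in current directory
        read_env_file(home / ".env"),   # 4. ~/.env
    )
    return next((t for t in candidates if t), None)


token = resolve_token()  # None if no token is configured anywhere
print("token found" if token else "no token configured")
```

Passing `env`, `cwd`, and `home` explicitly makes the function easy to exercise without touching the real environment.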
API Reference
Reader Methods
arXiv Methods
- search(query, size=10, search_mode="hybrid", ...): Search for papers
- head(arxiv_id): Get paper metadata and structure
- brief(arxiv_id): Get brief info (title, TLDR, keywords, citations)
- section(arxiv_id, section_name): Get a specific section (case-insensitive)
- raw(arxiv_id): Get full paper in markdown
- preview(arxiv_id): Get paper preview (~10k chars)
- json(arxiv_id): Get complete structured JSON
- markdown(arxiv_id): Get HTML view URL
PMC Methods
- pmc_head(pmc_id): Get PMC paper metadata
- pmc_json(pmc_id): Get complete PMC paper in JSON
Agent Methods
- query(question, reset_papers=False): Query the agent with a question
- get_loaded_papers(): Get information about loaded papers
- reset_papers(): Clear all loaded papers from context
- add_paper(arxiv_id): Manually add a paper to context
Agent Tools
The agent has access to the following tools:
| Tool | Description |
|---|---|
| search_papers | Search arXiv papers with filters |
| load_paper | Load paper metadata and structure |
| read_section | Read a specific section |
| get_full_paper | Get complete paper content |
| get_paper_preview | Get paper preview (~10k chars) |
| quick_preview | Batch preview multiple papers (brief info only) |
Examples
See the examples directory:
- example_reader.py: Basic Reader usage
- example_agent.py: Agent usage
- example_advanced.py: Advanced patterns
- quickstart.py: Quick start guide
License
MIT License - see LICENSE file for details.
Support
- 🐛 GitHub Issues: https://github.com/qhjqhj00/deepxiv_sdk/issues
- 📚 API Documentation: https://data.rag.ac.cn/api/docs
- 🎮 Demo: https://1stauthor.com/