A Python package for arXiv paper access with CLI and MCP server support

deepxiv-sdk

Agent-first academic paper interface for CLI, MCP, and Python. deepxiv gives OpenClaw, Claude Code, Codex, and other coding agents a fast, structured way to search papers, inspect metadata, read only the right sections, and reason over open-access literature without wasting tokens.

📚 API Documentation: https://data.rag.ac.cn/api/docs
🎥 Demo Video:
📄 Technical Report:
📖 Chinese Documentation: README.zh.md

Why deepxiv for agents?

Feature	deepxiv	Standard arXiv API
Hybrid Search (BM25 + Vector)	✅	❌
AI-Generated Summaries (TLDR)	✅	❌
Section-by-Section Access	✅	❌
GitHub Link Extraction	✅	❌
MCP Protocol Support	✅	❌
Biomedical Papers (PMC)	✅	❌
Agent-Oriented CLI	✅	❌
Free Daily Requests	10,000	∞*

*The arXiv API has no daily request cap, but it enforces strict rate limiting.

Core Features

🔍 Hybrid Search: BM25 + vector search for better retrieval quality
📄 Section-Based Access: load only the sections an agent actually needs
✨ Brief Views: title, TLDR, keywords, citations, PDF, and GitHub link when available
💻 Three Interfaces: CLI / MCP Server / Python SDK
🤖 Agent-Friendly by Default: works well inside OpenClaw, Claude Code, Codex, and similar agent loops
📚 PMC Support: access biomedical literature alongside arXiv
🔥 Trending + Social Impact: discover papers getting attention online
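
This README does not say how deepxiv combines BM25 and vector results. Purely as an illustration of what "hybrid" ranking can look like, the sketch below uses reciprocal rank fusion (RRF), a common way to merge two ranked lists. The function name, the constant `k=60`, and the placeholder IDs are assumptions, not deepxiv's implementation.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of IDs into one list, best first."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every ID it contains.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Placeholder IDs standing in for keyword (BM25) and vector result lists.
bm25_hits = ["A", "B", "C"]
vector_hits = ["B", "D", "A"]
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)
```

IDs that appear near the top of both lists rise to the front, which is the general motivation for combining lexical and semantic retrieval.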

Agent Integration

deepxiv is designed to be the paper interface layer for coding and research agents.

Codex: install the CLI skill and let Codex call deepxiv search, deepxiv paper, and deepxiv pmc directly
Claude Code: load the same CLI skill or use the MCP server for tool-based access
OpenClaw: use the CLI as a stable shell interface, or wire the MCP server into your agent runtime
Other agents: use the CLI for predictable terminal workflows, the MCP server for tool calling, or the Python SDK for direct integration

The key design goal is simple: give agents a comprehensive and token-efficient academic paper interface instead of forcing them to scrape raw PDFs or overfetch entire papers.
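
A minimal sketch of that token-efficient pattern: search, filter on the short brief, and fetch a single section only for papers that pass. It relies on the `search`, `brief`, and `section` calls shown under Python Usage. The `triage` helper, the keyword filter, and the `StubReader` stand-in (used so the sketch runs offline) are hypothetical, and it assumes each selected paper has a section named "Introduction".

```python
def triage(reader, query, keyword, size=5):
    """Search, filter on each brief's TLDR, and read one section per match."""
    selected = []
    for hit in reader.search(query, size=size).get("results", []):
        brief = reader.brief(hit["arxiv_id"])
        tldr = brief.get("tldr") or ""
        if keyword.lower() not in tldr.lower():
            continue  # rejected without fetching any body text
        intro = reader.section(hit["arxiv_id"], "Introduction")
        selected.append(
            {"arxiv_id": hit["arxiv_id"], "title": brief["title"], "intro": intro[:500]}
        )
    return selected


class StubReader:
    """Offline stand-in with canned data (hypothetical, not part of the SDK)."""

    section_calls = 0

    def search(self, query, size=10):
        hits = [{"arxiv_id": "0000.00001"}, {"arxiv_id": "0000.00002"}]
        return {"results": hits[:size]}

    def brief(self, arxiv_id):
        tldrs = {
            "0000.00001": "A memory module for agents.",
            "0000.00002": "A survey of optimizers.",
        }
        return {"title": f"Paper {arxiv_id}", "tldr": tldrs[arxiv_id]}

    def section(self, arxiv_id, name):
        self.section_calls += 1
        return f"{name} of {arxiv_id}"


stub = StubReader()
picks = triage(stub, "agent memory", keyword="memory")
print(picks)
```

With a real `Reader()` in place of the stub, only papers whose TLDR mentions the keyword would trigger a section fetch.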

🌐 Open Access Literature Support

Current Support

✅ arXiv - Computer Science, Physics, Math, and more
✅ PubMed Central (PMC) - Biomedical and life sciences

Coming Soon (Roadmap)

🔄 bioRxiv - Preprints in biology
🔄 medRxiv - Preprints in medicine
🔄 Other OA Sources - Additional open access repositories
🔄 Full OA Literature Coverage - Comprehensive open access ecosystem

Why OA Literature? By focusing on open access papers, deepxiv ensures that researchers and AI systems have unrestricted access to knowledge without subscription barriers.

Quick Start

1. Installation

# Basic install (Reader + CLI)
pip install deepxiv-sdk

# Full install (MCP + Agent)
pip install "deepxiv-sdk[all]"  # quotes keep shells such as zsh from expanding the brackets

2. First Use

On first use, deepxiv automatically registers a free token and saves it to ~/.env:

deepxiv search "agent memory" --limit 5

3. CLI Usage

The CLI is the fastest way to plug deepxiv into agent workflows.

# Search papers
deepxiv search "transformer" --limit 10

# Quick paper understanding
deepxiv paper 2409.05591 --brief

# Paper structure and targeted reading
deepxiv paper 2409.05591 --head
deepxiv paper 2409.05591 --section Introduction
deepxiv paper 2409.05591 --preview
deepxiv paper 2409.05591

# Social/trending signals
deepxiv paper 2409.05591 --popularity
deepxiv trending --days 14 --limit 10

# Biomedical papers
deepxiv pmc PMC544940 --head
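
When an agent drives the CLI from Python, building the argument list in one place helps avoid quoting mistakes. The sketch below only assembles `deepxiv paper` commands from the flags listed above. The `paper_command` helper, the rule that a view and a section are not combined, and the "Related Work" section name are assumptions for illustration.

```python
import shlex

# Flags taken from the CLI examples in this README.
VIEW_FLAGS = {
    "brief": "--brief",
    "head": "--head",
    "preview": "--preview",
    "popularity": "--popularity",
}


def paper_command(arxiv_id, view=None, section=None):
    """Return the argv list for one `deepxiv paper` call."""
    if view is not None and section is not None:
        raise ValueError("choose either a view or a section, not both")
    argv = ["deepxiv", "paper", arxiv_id]
    if section is not None:
        argv += ["--section", section]
    elif view is not None:
        argv.append(VIEW_FLAGS[view])  # KeyError on an unknown view name
    return argv


print(shlex.join(paper_command("2409.05591", view="brief")))
print(shlex.join(paper_command("2409.05591", section="Related Work")))
```

The resulting list can be passed to `subprocess.run(argv, capture_output=True, text=True, check=True)` without going through a shell.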

4. Use with OpenClaw, Claude Code, and Codex

Codex skill

# Run from the root of the repository checkout, where skills/deepxiv-cli lives
mkdir -p "$CODEX_HOME/skills"
ln -s "$(pwd)/skills/deepxiv-cli" "$CODEX_HOME/skills/deepxiv-cli"

The included skill teaches agents when to use:

deepxiv search for literature discovery
deepxiv paper --brief for quick filtering
deepxiv paper --section for focused reading
deepxiv pmc for biomedical papers
deepxiv agent for deeper multi-turn reasoning

Claude Code / OpenClaw / custom agents

If your framework supports reusable operating instructions, load skills/deepxiv-cli/SKILL.md directly. This gives agents a clean command selection guide instead of relying on ad hoc shell usage.

5. MCP Server

Use MCP when you want tool-based integration rather than shell execution.

Add the deepxiv server entry to the Claude Desktop MCP config file for your platform:

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json

Windows: %APPDATA%\Claude\claude_desktop_config.json

Linux: ~/.config/Claude/claude_desktop_config.json

{
  "mcpServers": {
    "deepxiv": {
      "command": "deepxiv",
      "args": ["serve"],
      "env": {
        "DEEPXIV_TOKEN": "your_token_here"
      }
    }
  }
}
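
If the config file already lists other MCP servers, the entry needs to be merged in rather than pasted over them. A minimal sketch using only the standard library follows. The `add_deepxiv_server` helper is hypothetical, and it assumes the file, when present, contains valid JSON.

```python
import json
from pathlib import Path


def add_deepxiv_server(config_path, token):
    """Add or update the "deepxiv" entry, keeping other servers intact."""
    path = Path(config_path)
    if path.exists():
        config = json.loads(path.read_text(encoding="utf-8"))
    else:
        config = {}
    servers = config.setdefault("mcpServers", {})
    # Same entry as the JSON snippet shown in this section.
    servers["deepxiv"] = {
        "command": "deepxiv",
        "args": ["serve"],
        "env": {"DEEPXIV_TOKEN": token},
    }
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Example (not executed here): point it at the path for your platform.
# add_deepxiv_server(
#     Path.home() / "Library/Application Support/Claude/claude_desktop_config.json",
#     "your_token_here",
# )
```

Back up the existing file first if it has been edited by hand, since the helper rewrites it with fresh indentation.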

Available MCP tools:

Tool	Description
`search_papers`	Search arXiv papers
`get_paper_brief`	Quick summary
`get_paper_metadata`	Full metadata
`get_paper_section`	Read specific section
`get_full_paper`	Complete paper
`get_paper_preview`	Paper preview
`get_pmc_metadata`	PMC paper metadata
`get_pmc_full`	Complete PMC paper
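
For reference, MCP clients invoke tools through JSON-RPC 2.0 `tools/call` requests. The sketch below builds such a request for `get_paper_brief`. The argument name `arxiv_id` is an assumption that mirrors the Python SDK, so check the schema the server reports via `tools/list` before relying on it.

```python
import json


def tool_call_request(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# "arxiv_id" is an assumed argument name, not confirmed by this README.
request = tool_call_request(1, "get_paper_brief", {"arxiv_id": "2409.05591"})
print(json.dumps(request, indent=2))
```

In practice an MCP client such as Claude Desktop constructs these messages itself, so the snippet is mainly useful for debugging or for writing a custom client.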

6. Python Usage

from deepxiv_sdk import Reader

reader = Reader()

# Search papers
results = reader.search("agent memory", size=5)
for paper in results.get("results", []):
    print(f"{paper['title']} ({paper['arxiv_id']})")

# Get paper info
brief = reader.brief("2409.05591")
print(f"Title: {brief['title']}")
print(f"TLDR: {brief.get('tldr', 'N/A')}")
print(f"GitHub: {brief.get('github_url', 'N/A')}")

# Read specific section
intro = reader.section("2409.05591", "Introduction")
print(intro[:500])

# Get trending papers (no token required)
trending = reader.trending(days=7, limit=5)
for paper in trending['papers']:
    print(f"#{paper['rank']}: {paper['arxiv_id']}")
    print(f"  Views: {paper['stats']['total_views']}")

# Get social impact metrics (requires token)
reader_with_token = Reader(token="your_token_here")
impact = reader_with_token.social_impact("2409.05591")
if impact:
    print(f"Views: {impact['total_views']}")
    print(f"Tweets: {impact['total_tweets']}")

Complete API Reference

Search and Query

reader.search(query, size=10, search_mode="hybrid", categories=None, min_citation=None)
reader.head(arxiv_id)              # Paper metadata and sections overview
reader.brief(arxiv_id)             # Quick summary (title, TLDR, keywords, citations, GitHub URL)
reader.section(arxiv_id, section)  # Read specific section
reader.raw(arxiv_id)               # Full paper
reader.preview(arxiv_id)           # Paper preview (~10k characters)
reader.json(arxiv_id)              # Complete structured JSON

PMC (Biomedical Papers)

reader.pmc_head(pmc_id)            # PMC paper metadata
reader.pmc_full(pmc_id)            # Complete PMC paper JSON

Agent (Optional)

from deepxiv_sdk import Agent

agent = Agent(api_key="your_openai_key", model="gpt-4")
answer = agent.query("What are the latest papers about agent memory?")
print(answer)

Token Management

deepxiv supports 4 ways to configure tokens:

1. Auto-registration (Recommended) - A token is created and saved automatically on first use

deepxiv search "agent"

2. Using config command

deepxiv config --token YOUR_TOKEN

3. Environment variable

export DEEPXIV_TOKEN="your_token"

4. Command-line option

deepxiv paper 2409.05591 --token YOUR_TOKEN
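
In a script that uses the Python SDK, the options above can be mirrored with a small helper. This is only a sketch: `resolve_token` is a hypothetical name, and the precedence shown (explicit value first, then the `DEEPXIV_TOKEN` environment variable) is an assumption, since this README does not state which source wins.

```python
import os


def resolve_token(explicit=None, environ=None):
    """Return an explicit token if given, else DEEPXIV_TOKEN, else None."""
    environ = os.environ if environ is None else environ
    if explicit:
        return explicit
    # Treat an empty variable the same as a missing one.
    return environ.get("DEEPXIV_TOKEN") or None


token = resolve_token(environ={"DEEPXIV_TOKEN": "your_token"})
print(token)
```

The result could then be passed as `Reader(token=...)`. How `Reader` behaves when given `None` is not documented here, so falling back to a plain `Reader()` in that case is the safer choice.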

Increase daily limit: The default is 10,000 requests/day. For higher limits, send your name, email address, and phone number to tommy@chien.io.

Free Test Papers

These papers can be accessed without a token:

arXiv: 2409.05591, 2504.21776
PMC: PMC544940, PMC514704

Agent Usage (Optional)

The built-in ReAct agent can automatically search papers, read content, and perform multi-turn reasoning:

from deepxiv_sdk import Agent

agent = Agent(
    api_key="your_deepseek_key",
    base_url="https://api.deepseek.com/v1",
    model="deepseek-chat"
)

answer = agent.query("Compare key ideas in transformers and attention mechanisms")
print(answer)

Or via CLI:

deepxiv agent config  # Configure LLM API
deepxiv agent query "What are the latest papers about agent memory?" --verbose

Error Handling

deepxiv provides specific exception types:

from deepxiv_sdk import (
    Reader,
    AuthenticationError,  # 401 - Invalid or expired token
    RateLimitError,       # 429 - Daily limit reached
    NotFoundError,        # 404 - Paper not found
    ServerError,          # 5xx - Server error
    APIError              # Other API errors
)

reader = Reader()

try:
    paper = reader.brief("2409.05591")
except AuthenticationError:
    print("Please update your token")
except RateLimitError:
    print("Daily limit reached")
except NotFoundError:
    print("Paper not found")
except ServerError:
    print("Server error, try again later")
except APIError as e:
    print(f"API error: {e}")

Troubleshooting

Q: Do I need a token? A: Not always. The free test papers can be read without one. Search and some other content require a token, which is created automatically on first use.

Q: What is the maximum number of search results? A: 100 per request. Use the offset parameter to paginate through more.
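
A minimal pagination sketch under stated assumptions: it supposes that `reader.search` accepts an `offset` keyword (the signature in the API reference above does not list it) and that results arrive under the `"results"` key, as in the Python Usage example. `search_all` and `StubReader` are hypothetical names, and the stub exists only so the loop can run offline.

```python
def search_all(reader, query, total, page_size=100):
    """Collect up to `total` results by requesting pages of at most 100."""
    page_size = min(page_size, 100)  # documented per-request maximum
    collected = []
    offset = 0
    while len(collected) < total:
        size = min(page_size, total - len(collected))
        page = reader.search(query, size=size, offset=offset).get("results", [])
        if not page:
            break  # no more results available
        collected.extend(page)
        offset += len(page)
    return collected


class StubReader:
    """Offline stand-in that serves slices of a fixed list (hypothetical)."""

    def __init__(self, count):
        self.items = [{"arxiv_id": f"stub-{i}"} for i in range(count)]
        self.calls = []

    def search(self, query, size=10, offset=0):
        self.calls.append((size, offset))
        return {"results": self.items[offset:offset + size]}


stub = StubReader(250)
papers = search_all(stub, "transformer", total=230)
print(len(papers), stub.calls)
```

Each page counts as one request against the daily limit, so it is worth capping `total` at what the agent will actually read.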

Q: How do I handle timeouts? A: Reader automatically retries failed requests (up to 3 times by default) with exponential backoff. You can customize the timeout and the retry count:

reader = Reader(timeout=120, max_retries=5)
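
For readers unfamiliar with the term, the sketch below shows exponential backoff in general: wait 1 s, then 2 s, then 4 s between attempts. It is not the `Reader` implementation. The helper name, the delay values, and the choice of `TimeoutError` as the retryable error are assumptions.

```python
import time


def call_with_backoff(func, max_retries=3, base_delay=1.0,
                      retry_on=(TimeoutError,), sleep=time.sleep):
    """Call func(); on a retryable error wait base_delay * 2**attempt, then retry."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_retries:
                raise  # retries exhausted, surface the last error
            sleep(base_delay * 2 ** attempt)


# Demonstration with a function that fails twice before succeeding.
# Delays are recorded instead of slept so the example finishes instantly.
attempts = {"count": 0}
delays = []


def flaky():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TimeoutError("simulated timeout")
    return "ok"


result = call_with_backoff(flaky, sleep=delays.append)
print(result, delays)
```

Because `Reader` already retries internally, a wrapper like this is only relevant for calls made outside the SDK.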

Q: Can I cache paper content? A: Yes. After fetching content with Reader, store it locally in a database or on the file system.
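
One possible file-system cache, as a sketch: each section is stored as a small JSON file keyed by a hash of the paper ID and section name, so repeated reads do not spend requests. The `cached_section` helper, the `deepxiv_cache` directory name, and the `CountingReader` stand-in are hypothetical. It assumes `reader.section` returns a string, as the Python Usage example suggests.

```python
import hashlib
import json
from pathlib import Path


def cached_section(reader, arxiv_id, section, cache_dir="deepxiv_cache"):
    """Return a section from the local cache, fetching it once if missing."""
    key = hashlib.sha256(f"{arxiv_id}\n{section}".encode("utf-8")).hexdigest()
    path = Path(cache_dir) / f"{key}.json"
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))["text"]
    text = reader.section(arxiv_id, section)
    path.parent.mkdir(parents=True, exist_ok=True)
    record = {"arxiv_id": arxiv_id, "section": section, "text": text}
    path.write_text(json.dumps(record), encoding="utf-8")
    return text


class CountingReader:
    """Offline stand-in that counts how often it is asked (hypothetical)."""

    def __init__(self):
        self.calls = 0

    def section(self, arxiv_id, name):
        self.calls += 1
        return f"{name} text for {arxiv_id}"
```

Passing a real `Reader()` instead of the stand-in fetches from the API on the first call and from disk afterwards.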

Q: Which LLMs does the agent support? A: Any OpenAI-compatible API (OpenAI, DeepSeek, OpenRouter, local Ollama, etc.).

Examples

See examples/ directory:

quickstart.py - 5-minute quick start
example_reader.py - Basic Reader usage
example_agent.py - Agent usage
example_advanced.py - Advanced patterns
example_error_handling.py - Error handling examples

License

MIT License - see LICENSE file

Support

🐛 GitHub Issues: https://github.com/qhjqhj00/deepxiv_sdk/issues
📚 API Documentation: https://data.rag.ac.cn/api/docs
📧 Higher Limits: Send your name, email address, and phone number to tommy@chien.io

Release history

0.3.1 - May 22, 2026
0.3.0 - May 22, 2026
0.2.5 - Apr 20, 2026
0.2.4 - Apr 2, 2026
0.2.3 - Apr 1, 2026 (this version)
0.2.2 - Mar 31, 2026
0.2.0 - Mar 29, 2026
0.1.1 - Feb 12, 2026
0.1.0 - Feb 10, 2026

Download files

Source Distribution

deepxiv_sdk-0.2.3.tar.gz (47.7 kB)

Uploaded Apr 1, 2026 Source

Built Distribution

deepxiv_sdk-0.2.3-py3-none-any.whl (42.4 kB)

Uploaded Apr 1, 2026 Python 3

File details

Details for the file deepxiv_sdk-0.2.3.tar.gz.

File metadata

Download URL: deepxiv_sdk-0.2.3.tar.gz
Upload date: Apr 1, 2026
Size: 47.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.1

File hashes

Hashes for deepxiv_sdk-0.2.3.tar.gz
Algorithm	Hash digest
SHA256	`e926668f2dab9c65366eea0ca9cb61939626cb23baf328382f2955defb52ad26`
MD5	`d0ac34608064f51fa59c260770d94782`
BLAKE2b-256	`82710a4fa9847cbacf5f879a32d512be1d8fcbaa19ef72d1a52b567215553cf2`

File details

Details for the file deepxiv_sdk-0.2.3-py3-none-any.whl.

File metadata

Download URL: deepxiv_sdk-0.2.3-py3-none-any.whl
Upload date: Apr 1, 2026
Size: 42.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.1

File hashes

Hashes for deepxiv_sdk-0.2.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`924b38fc3942414d1863acf9008b5fec82889b7e291b74d8c571e4d13c87b24d`
MD5	`1b1b3dcc5959c7e405f44ef45cd4f994`
BLAKE2b-256	`a75c287b065413eab619a90ba031616ffbc825a7e396733658107bed3e057b8f`
