An intelligent arXiv literature crawler and analyzer for physics research
arXiv Pulse
Intelligent arXiv Literature Tracking System
Language: 中文文档 (Chinese documentation)
arXiv Pulse is a Python package for automated crawling, summarizing, and tracking of the latest research papers from arXiv in condensed matter physics, density functional theory (DFT), machine learning, force fields, and computational materials science. It provides a modern web interface for a professional literature management experience.
Key Features
- Web Interface: Modern FastAPI + Vue 3 + Element Plus interface with real-time SSE streaming
- One-Command Start: Simply run pulse serve to start the service
- Web Configuration: First-time setup wizard, all settings stored in database
- AI Auto-Processing: Automatic translation, AI summarization, and figure extraction
- AI Chat Assistant: Ask questions about papers with a context-aware AI assistant
- Smart Search: Natural language queries with AI-powered keyword parsing
- Paper Collections: Create, edit, and delete collections to organize important papers
- Paper Basket: Select multiple papers for batch operations
- Secure by Default: Localhost-only binding, explicit confirmation for remote access
- Multilingual Support: UI in Chinese/English, translation to multiple languages
What's New in 1.2.0
- Enhanced UI Components: Redesigned buttons, switches, selects, dialogs with refined shadows and transitions
- Paper Index Numbers: Visual index numbers on paper cards for easy reference
- Back-to-Top Button: Quick navigation with scroll-aware floating button
- Tooltips for Floating Buttons: Helpful labels on hover for all floating action buttons
- Recent Papers AI Search: Search within recent papers using natural language
- Sync Page Improvements: Better spacing, help icons with tooltips
- SQLite WAL Mode: Concurrent read/write operations for better performance
- Bug Fixes: Form submission, pagination visibility, index preservation during search
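The SQLite WAL Mode item refers to SQLite's write-ahead logging journal mode, in which readers are not blocked by an in-progress write. The following is a minimal sketch using Python's standard sqlite3 module. The function names, table, and sample row are illustrative assumptions and are not taken from arXiv Pulse's code.

```python
import os
import sqlite3
import tempfile


def open_wal_connection(db_path: str) -> sqlite3.Connection:
    """Open a SQLite database file and switch it to write-ahead logging."""
    conn = sqlite3.connect(db_path)
    # The PRAGMA returns the journal mode that is actually in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    if mode.lower() != "wal":
        conn.close()
        raise RuntimeError(f"could not enable WAL, journal mode is {mode!r}")
    return conn


def read_during_write() -> tuple:
    """Return the row counts a reader sees during and after a write."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")  # illustrative file name
        writer = open_wal_connection(path)
        reader = sqlite3.connect(path)
        try:
            writer.execute(
                "CREATE TABLE papers (arxiv_id TEXT PRIMARY KEY, title TEXT)"
            )
            writer.commit()
            # Begin a write and leave it uncommitted for the moment.
            writer.execute(
                "INSERT INTO papers VALUES (?, ?)", ("demo-id", "Demo title")
            )
            # In WAL mode the reader is not blocked; it sees the last
            # committed state of the table.
            during = reader.execute(
                "SELECT COUNT(*) FROM papers"
            ).fetchall()[0][0]
            writer.commit()
            after = reader.execute(
                "SELECT COUNT(*) FROM papers"
            ).fetchall()[0][0]
        finally:
            reader.close()
            writer.close()
    return during, after


if __name__ == "__main__":
    print(read_during_write())
```

WAL mode is a property of the database file, so it only needs to be set once; it is not available for in-memory databases, which is why the sketch uses a temporary file.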
Quick Start
Installation
pip install arxiv-pulse
Start Service
# Create data directory
mkdir my_papers && cd my_papers
# Start web service (background mode by default)
pulse serve .
# Or specify port
pulse serve . --port 3000
# Foreground mode (see logs in terminal)
pulse serve . -f
Then visit http://localhost:8000
Service Management
pulse status . # Check service status
pulse stop . # Stop service
pulse restart . # Restart service
pulse stop . --force # Force stop (SIGKILL)
Remote Access (SSH Tunnel)
By default, the service accepts only localhost connections for security. For remote access, use an SSH tunnel:
# On server
pulse serve .
# On your computer
ssh -L 8000:localhost:8000 user@server
# Then visit http://localhost:8000
This gives you an encrypted connection without exposing your API keys.
First-Time Setup
- Visit http://localhost:8000
- Follow the setup wizard:
  - Step 1: Configure AI API (OpenAI/DeepSeek key, model, endpoint)
  - Step 2: Select research fields
  - Step 3: Set sync parameters
  - Step 4: Start initial sync
Security
arXiv Pulse is designed with security in mind:
- Localhost-only by default: Service binds to 127.0.0.1, inaccessible from external networks
- No plaintext credentials: API keys stored in local SQLite database, never transmitted
- Explicit remote access: Opening to non-localhost requires a flag with security warning
For remote access, we recommend:
- SSH Tunnel (easiest): ssh -L 8000:localhost:8000 user@server
- VPN: WireGuard, OpenVPN, or Tailscale
- Reverse Proxy: Nginx/Caddy with HTTPS
# If you must open to network (not recommended)
pulse serve . --host 0.0.0.0 --allow-non-localhost-access-with-plaintext-transmission-risk
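The localhost-only default with an explicit opt-in can be illustrated with a small guard function. This is a sketch of the general idea only, not arXiv Pulse's actual implementation: `is_loopback_host` and `check_bind_host` are hypothetical names, and the `allow_remote` argument stands in for the explicit command-line flag.

```python
import ipaddress


def is_loopback_host(host: str) -> bool:
    """Return True if host refers to the local loopback interface."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (for example a DNS name): treat it as
        # non-local so that the caller has to opt in explicitly.
        return False


def check_bind_host(host: str, allow_remote: bool = False) -> str:
    """Return host if binding is permitted, otherwise raise an error."""
    if is_loopback_host(host) or allow_remote:
        return host
    raise PermissionError(
        f"refusing to bind to {host!r} without an explicit opt-in"
    )


if __name__ == "__main__":
    print(check_bind_host("127.0.0.1"))
    print(check_bind_host("0.0.0.0", allow_remote=True))
```

Note that 0.0.0.0 is the unspecified address, not a loopback address, so binding to it makes the service reachable on every network interface.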
Daily Usage
Pages
| Page | Description |
|---|---|
| Home | Statistics overview, search by natural language |
| Recent | Papers from last N days, filter by field |
| Sync | Sync status, field management, manual sync |
| Collections | Organize important papers into collections |
Features
- Search: Use natural language like "DFT calculations for battery materials"
- Filter: Click "Filter Fields" to select research areas
- AI Chat: Click the chat icon (bottom-right) to ask questions
- Paper Basket: Click basket icon on cards to collect papers for batch operations
- Settings: Click gear icon to modify API key, language, and sync options
Project Structure
arxiv_pulse/
├── core/       # Core infrastructure (Config, Database, Lock)
├── models/     # SQLAlchemy ORM models
├── services/   # Business logic (AI, translation, papers)
├── crawler/    # arXiv API crawler
├── ai/         # Paper summarizer, report generator
├── search/     # AI-powered search engine
├── cli/        # Command-line interface
├── web/        # FastAPI web application
│   ├── app.py  # FastAPI app
│   ├── api/    # API endpoints
│   └── static/ # Vue 3 frontend (components, stores, i18n)
└── i18n/       # Backend translations

Data Directory/
├── data/arxiv_papers.db  # SQLite database
└── web.log               # Service log
For detailed architecture, see DEV.md.
API Endpoints
| Endpoint | Method | Description |
|---|---|---|
| /api/config | GET/PUT | Get/update configuration |
| /api/config/status | GET | Get initialization status |
| /api/papers/search/stream | GET (SSE) | AI-powered search |
| /api/papers/recent/update | POST (SSE) | Update recent papers |
| /api/collections | GET/POST | List/create collections |
| /api/stats | GET | Database statistics |
| /api/chat/sessions/{id}/send | POST (SSE) | Send message to AI |
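Several of these endpoints are marked SSE (server-sent events), meaning the response is a stream of `data:` lines separated by blank lines. The query parameters and payload format of arXiv Pulse's endpoints are not documented here, so the sketch below makes no assumptions about them: `stream_events` takes a complete URL supplied by the caller and yields each payload as an opaque string. Both function names are illustrative.

```python
import urllib.request
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each server-sent event in a line stream."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # one leading space after the colon is dropped
        if field == "data":
            buffer.append(value)
        # Other fields (event, id, retry) are ignored in this sketch.


def stream_events(url: str) -> Iterator[str]:
    """Open an SSE endpoint with a GET request and yield event payloads."""
    request = urllib.request.Request(
        url, headers={"Accept": "text/event-stream"}
    )
    with urllib.request.urlopen(request) as response:
        yield from iter_sse_data(line.decode("utf-8") for line in response)


if __name__ == "__main__":
    sample = ["data: hello\n", "\n", ": ping\n", "data: a\n", "data: b\n", "\n"]
    for payload in iter_sse_data(sample):
        print(repr(payload))
```

Separating the parser from the network call keeps the parsing logic testable without a running service.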
Supported Research Fields
20+ research fields available:
| Category | Fields |
|---|---|
| Physics | Condensed Matter, Quantum Physics, High Energy, Nuclear, Astrophysics |
| Computation | DFT, First-Principles, MD, Force Fields, Computational Physics |
| AI/ML | Machine Learning, Artificial Intelligence |
| Chemistry | Quantum Chemistry, Chemical Physics |
| Math | Mathematical Physics, Numerical Analysis, Statistics |
| Others | Quantitative Biology, Electrical Engineering |
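For orientation, a field selection like the one above ultimately has to become a query against the public arXiv API. The sketch below builds such a query URL from arXiv category codes. How arXiv Pulse maps its field names to categories internally is not stated here, so the category codes in the example and the `build_arxiv_query` helper are assumptions for illustration.

```python
from urllib.parse import urlencode

# Query endpoint as given in the arXiv API documentation.
ARXIV_API = "http://export.arxiv.org/api/query"


def build_arxiv_query(categories, max_results: int = 10, start: int = 0) -> str:
    """Build an arXiv API URL listing the newest papers in given categories."""
    if not categories:
        raise ValueError("at least one category is required")
    # Match papers in any of the categories, e.g. "cat:quant-ph".
    search = " OR ".join(f"cat:{category}" for category in categories)
    params = {
        "search_query": search,
        "start": start,
        "max_results": max_results,
        "sortBy": "submittedDate",
        "sortOrder": "descending",
    }
    return f"{ARXIV_API}?{urlencode(params)}"


if __name__ == "__main__":
    # Assumed example mapping: materials science and quantum physics.
    print(build_arxiv_query(["cond-mat.mtrl-sci", "quant-ph"], max_results=5))
```

The API responds with an Atom feed, which can be parsed with any XML or feed library.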
Troubleshooting
Q: Port already in use?
pulse serve . --port 3000
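To pick a value for --port without guessing, a short script can test which local ports are free. This is a generic sketch using the standard socket module; the helper names and the default range starting at 8000 are illustrative choices.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def first_free_port(start: int = 8000, attempts: int = 20,
                    host: str = "127.0.0.1") -> int:
    """Return the first free port in the range [start, start + attempts)."""
    for port in range(start, start + attempts):
        if port_is_free(port, host):
            return port
    raise RuntimeError(f"no free port found from {start} to {start + attempts - 1}")


if __name__ == "__main__":
    print(first_free_port())
```

The result is only a snapshot: another process could claim the port between the check and the moment the service starts.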
Q: Service shows "not running" but port is occupied?
pulse stop . --force
# Or remove stale lock
rm .pulse.lock
Q: How to reinitialize?
rm data/arxiv_papers.db
pulse serve .
Q: AI not responding?
- Check API key in Settings
- Check the browser console for errors (F12 → Console)
- Try foreground mode to see logs:
pulse serve . -f
License
GPL-3.0 - see LICENSE for details.
Acknowledgments
This project was developed by OpenCode, an AI coding agent.
- Yang Li - For 500+ iterations of requirements discussions, design decisions, and testing feedback. This project would not exist without your patience and vision.
- GLM-5 - For providing the core intelligence that powers OpenCode. ~200 million tokens consumed in bringing this project to life.
- arXiv.org - For the open API
- DeepSeek - For AI models used in paper summarization
- Computational materials science community - For inspiration and use cases
arXiv Pulse - Making arXiv literature tracking simple and efficient!