An intelligent arXiv literature crawler and analyzer for physics research

arXiv Pulse

Intelligent arXiv Literature Tracking System

Language: Chinese documentation (中文文档)

arXiv Pulse is a Python package for automated crawling, summarizing, and tracking of the latest research papers from arXiv in condensed matter physics, density functional theory (DFT), machine learning, force fields, and computational materials science. It provides a modern web interface for a professional literature management experience.

Key Features

  • Web Interface: Modern FastAPI + Vue 3 + Element Plus interface with real-time SSE streaming
  • One-Command Start: Simply run pulse serve to start the service
  • Web Configuration: First-time setup wizard, all settings stored in database
  • AI Auto-Processing: Automatic translation, AI summarization, and figure extraction
  • AI Chat Assistant: Ask questions about papers with a context-aware AI assistant
  • Smart Search: Natural language queries with AI-powered keyword parsing
  • Paper Collections: Create, edit, and delete collections to organize important papers
  • Paper Basket: Select multiple papers for batch operations
  • Secure by Default: Localhost-only binding, explicit confirmation for remote access
  • Multilingual Support: UI in Chinese/English, translation to multiple languages

What's New in 1.2.0

  • Enhanced UI Components: Redesigned buttons, switches, selects, dialogs with refined shadows and transitions
  • Paper Index Numbers: Visual index numbers on paper cards for easy reference
  • Back-to-Top Button: Quick navigation with scroll-aware floating button
  • Tooltips for Floating Buttons: Helpful labels on hover for all floating action buttons
  • Recent Papers AI Search: Search within recent papers using natural language
  • Sync Page Improvements: Better spacing, help icons with tooltips
  • SQLite WAL Mode: Concurrent read/write operations for better performance
  • Bug Fixes: Form submission, pagination visibility, index preservation during search
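
For readers unfamiliar with WAL mode, the snippet below is a minimal sketch of how write-ahead logging is typically enabled on a SQLite database using Python's standard sqlite3 module. It is not taken from the arXiv Pulse source: the file name example.db and the papers table are placeholders chosen for illustration.

```python
import sqlite3
import tempfile
from pathlib import Path


def enable_wal(conn: sqlite3.Connection) -> str:
    """Switch the database to WAL and return the journal mode now in effect."""
    # The PRAGMA returns a single row holding the active mode.
    return conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]


def demo() -> tuple[str, int, int]:
    """Read from one connection while another holds an uncommitted write."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = Path(tmp) / "example.db"  # placeholder file name
        writer = sqlite3.connect(db_path)
        mode = enable_wal(writer)
        writer.execute("CREATE TABLE papers (title TEXT)")  # placeholder table
        writer.execute("INSERT INTO papers VALUES ('first')")
        writer.commit()

        # Start a second write but do not commit it yet.
        writer.execute("INSERT INTO papers VALUES ('second')")

        # The reader is not blocked and sees only committed rows.
        reader = sqlite3.connect(db_path)
        before = reader.execute("SELECT COUNT(*) FROM papers").fetchone()[0]
        writer.commit()
        after = reader.execute("SELECT COUNT(*) FROM papers").fetchone()[0]

        reader.close()
        writer.close()
        return mode, before, after
```

WAL requires a file-backed database, which is why the sketch uses a temporary directory rather than an in-memory connection.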

Quick Start

Installation

pip install arxiv-pulse

Start Service

# Create data directory
mkdir my_papers && cd my_papers

# Start web service (background mode by default)
pulse serve .

# Or specify port
pulse serve . --port 3000

# Foreground mode (see logs in terminal)
pulse serve . -f

Then visit http://localhost:8000 (or the port you passed to --port).
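
Because the service starts in the background by default, a script may want to confirm that it is reachable before continuing. The following is a minimal sketch using only the Python standard library. The helper names is_service_up and wait_for_service are hypothetical and not part of arXiv Pulse, and the URL assumes the address shown above.

```python
import http.client
import time
import urllib.error
import urllib.request

# Ignore any system-wide proxy settings: the target is a local address.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def is_service_up(url: str = "http://localhost:8000", timeout: float = 2.0) -> bool:
    """Return True if an HTTP server answers at `url`."""
    try:
        with _OPENER.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded, just with an error status code.
        return True
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, or a malformed response.
        return False


def wait_for_service(
    url: str = "http://localhost:8000", total: float = 30.0, interval: float = 0.5
) -> bool:
    """Poll `url` until it answers or `total` seconds have elapsed."""
    deadline = time.monotonic() + total
    while True:
        if is_service_up(url):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

For example, wait_for_service("http://localhost:3000") would suit a service started with --port 3000.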

Service Management

pulse status .          # Check service status
pulse stop .            # Stop service
pulse restart .         # Restart service
pulse stop . --force    # Force stop (SIGKILL)

Remote Access (SSH Tunnel)

By default, the service only accepts localhost connections for security. For remote access, use an SSH tunnel:

# On server
pulse serve .

# On your computer
ssh -L 8000:localhost:8000 user@server

# Then visit http://localhost:8000

This provides an encrypted connection without exposing your API keys.

First-Time Setup

  1. Visit http://localhost:8000
  2. Follow the setup wizard:
    • Step 1: Configure AI API (OpenAI/DeepSeek key, model, endpoint)
    • Step 2: Select research fields
    • Step 3: Set sync parameters
    • Step 4: Start initial sync

Security

arXiv Pulse is designed with security in mind:

  • Localhost-only by default: The service binds to 127.0.0.1 and is inaccessible from external networks
  • No plaintext credentials: API keys are stored in the local SQLite database, never transmitted
  • Explicit remote access: Binding to a non-localhost address requires an extra flag and triggers a security warning

For remote access, we recommend:

  1. SSH Tunnel (easiest): ssh -L 8000:localhost:8000 user@server
  2. VPN: WireGuard, OpenVPN, or Tailscale
  3. Reverse Proxy: Nginx/Caddy with HTTPS

# If you must open to network (not recommended)
pulse serve . --host 0.0.0.0 --allow-non-localhost-access-with-plaintext-transmission-risk
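
To illustrate the "localhost-only unless explicitly confirmed" policy described in this section, here is a minimal sketch of such a guard in Python. It is an assumption about how a check of this kind could look, not the arXiv Pulse implementation. The function names is_loopback and check_bind_host are hypothetical.

```python
import ipaddress


def is_loopback(host: str) -> bool:
    """Return True if `host` refers to the local machine only."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (an arbitrary hostname): treat as non-local to be safe.
        return False


def check_bind_host(host: str, allow_remote: bool = False) -> str:
    """Return `host` if binding is permitted, otherwise raise PermissionError."""
    if is_loopback(host) or allow_remote:
        return host
    raise PermissionError(
        f"Refusing to bind to {host!r} without explicit confirmation; "
        "traffic to a non-loopback address may cross the network unencrypted."
    )
```

Note that 0.0.0.0 is the "all interfaces" address, not a loopback address, so the guard rejects it unless allow_remote is set.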

Daily Usage

Pages

| Page | Description |
| --- | --- |
| Home | Statistics overview, search by natural language |
| Recent | Papers from last N days, filter by field |
| Sync | Sync status, field management, manual sync |
| Collections | Organize important papers into collections |

Features

  • Search: Use natural language like "DFT calculations for battery materials"
  • Filter: Click "Filter Fields" to select research areas
  • AI Chat: Click the chat icon (bottom-right) to ask questions
  • Paper Basket: Click basket icon on cards to collect papers for batch operations
  • Settings: Click gear icon to modify API key, language, and sync options

Project Structure

arxiv_pulse/
├── core/                   # Core infrastructure (Config, Database, Lock)
├── models/                 # SQLAlchemy ORM models
├── services/               # Business logic (AI, translation, papers)
├── crawler/                # ArXiv API crawler
├── ai/                     # Paper summarizer, report generator
├── search/                 # AI-powered search engine
├── cli/                    # Command-line interface
├── web/                    # FastAPI web application
│   ├── app.py             # FastAPI app
│   ├── api/               # API endpoints
│   └── static/            # Vue 3 frontend (components, stores, i18n)
└── i18n/                   # Backend translations

Data Directory/
├── data/arxiv_papers.db    # SQLite database
└── web.log                 # Service log

For detailed architecture, see DEV.md.
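
Since the data directory holds an ordinary SQLite file, it can be inspected with Python's built-in sqlite3 module. The sketch below lists the tables in a database opened read-only, so it cannot modify anything. The helper name list_tables is hypothetical, and no table names are assumed because the schema is not documented here.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path) -> list[str]:
    """Return the names of user tables in the SQLite file at `db_path`."""
    # A file: URI with mode=ro opens the database read-only.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    # Skip SQLite's own bookkeeping tables such as sqlite_sequence.
    return [name for (name,) in rows if not name.startswith("sqlite_")]
```

From inside the data directory, list_tables("data/arxiv_papers.db") would show what the service has created.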

API Endpoints

| Endpoint | Method | Description |
| --- | --- | --- |
| /api/config | GET/PUT | Get/update configuration |
| /api/config/status | GET | Get initialization status |
| /api/papers/search/stream | GET (SSE) | AI-powered search |
| /api/papers/recent/update | POST (SSE) | Update recent papers |
| /api/collections | GET/POST | List/create collections |
| /api/stats | GET | Database statistics |
| /api/chat/sessions/{id}/send | POST (SSE) | Send message to AI |
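
As a rough illustration of calling these endpoints from a script, the sketch below fetches /api/stats and includes a small parser for Server-Sent Events (SSE) text streams. Several details are assumptions not confirmed by this README: that /api/stats returns JSON, that the service listens on http://localhost:8000, and the helper names get_stats and iter_sse_data. Request parameters and payload formats of the streaming endpoints are not documented here, so the parser works on generic SSE lines only.

```python
import json
from typing import Iterable, Iterator
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # assumed default address


def get_stats(base_url: str = BASE_URL, timeout: float = 5.0):
    """Fetch /api/stats and decode the body, assuming it is JSON."""
    with urlopen(f"{base_url}/api/stats", timeout=timeout) as resp:
        return json.load(resp)


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each SSE event from an iterable of text lines."""
    buffer: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # One optional leading space after the colon is not part of the value.
            buffer.append(value[1:] if value.startswith(" ") else value)
        # Comment lines (starting with ":") and other fields are ignored.
    if buffer:
        # Lenient choice: emit a trailing event even without a final blank line.
        yield "\n".join(buffer)
```

To consume a live stream, the decoded lines of an open HTTP response can be passed to iter_sse_data; each yielded string is one event's payload.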

Supported Research Fields

20+ research fields available:

| Category | Fields |
| --- | --- |
| Physics | Condensed Matter, Quantum Physics, High Energy, Nuclear, Astrophysics |
| Computation | DFT, First-Principles, MD, Force Fields, Computational Physics |
| AI/ML | Machine Learning, Artificial Intelligence |
| Chemistry | Quantum Chemistry, Chemical Physics |
| Math | Mathematical Physics, Numerical Analysis, Statistics |
| Others | Quantitative Biology, Electrical Engineering |

Troubleshooting

Q: Port already in use?

pulse serve . --port 3000

Q: Service shows "not running" but port is occupied?

pulse stop . --force
# Or remove stale lock
rm .pulse.lock

Q: How to reinitialize?

rm data/arxiv_papers.db
pulse serve .
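
Deleting the database discards everything stored in it, including the settings that the web configuration keeps there. A cautious user may want a copy first. The following is a minimal sketch using the backup API of Python's sqlite3 module, which produces a consistent copy even if the database uses WAL mode. The helper name backup_database and the destination path are hypothetical. Stopping the service before taking the backup is a sensible precaution.

```python
import sqlite3
from pathlib import Path


def backup_database(data_dir, dest) -> Path:
    """Copy <data_dir>/data/arxiv_papers.db to `dest` using SQLite's backup API."""
    src_path = Path(data_dir) / "data" / "arxiv_papers.db"
    if not src_path.exists():
        raise FileNotFoundError(src_path)
    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(dest)
    try:
        # Copies every page of the source database into the destination.
        src.backup(dst)
    finally:
        dst.close()
        src.close()
    return Path(dest)
```

For example, backup_database(".", "arxiv_papers.backup.db") run from the data directory writes the copy next to it.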

Q: AI not responding?

  • Check API key in Settings
  • Check the browser console for errors (F12 → Console)
  • Try foreground mode to see logs: pulse serve . -f

License

GPL-3.0 - see LICENSE for details.

Acknowledgments

This project was developed by OpenCode, an AI coding agent.

  • Yang Li - For 500+ iterations of requirements discussions, design decisions, and testing feedback. This project would not exist without your patience and vision.
  • GLM-5 - For providing the core intelligence that powers OpenCode. ~200 million tokens consumed in bringing this project to life.
  • arXiv.org - For the open API
  • Computational materials science community - For inspiration and use cases

arXiv Pulse - Making arXiv literature tracking simple and efficient!
