A configurable daily ArXiv digest agent that filters and summarises papers in any research area

arxiv-paper-digest

A daily agent that, with its default configuration, monitors ArXiv for XAI and interpretability papers, filters them by semantic similarity, summarises them with a local LLM, and saves a Markdown digest. No external API keys are required.

How it works

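1. Fetch new entries from the ArXiv RSS feeds
2. filter_unseen: drop papers already recorded in SQLite
3. Semantic filter: keep papers similar to the anchor sentences (sentence-transformers)
4. Summarise the remaining papers with a local LLM (Ollama)
5. Write the digest to outputs/digests/YYYY-MM-DD.md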
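
The semantic filter in step 3 can be pictured as a threshold on cosine similarity between a paper's embedding and the anchor embeddings. The following is a minimal sketch, not the package's actual code: it assumes a paper is kept when its best match against any anchor reaches the threshold (the description above does not say whether the maximum or some other aggregate is used), and it uses toy 2-D vectors in place of real sentence-transformers embeddings. The names cosine and is_relevant are hypothetical.

```python
import math
from typing import Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def is_relevant(
    paper_vec: Sequence[float],
    anchor_vecs: Sequence[Sequence[float]],
    threshold: float = 0.35,  # mirrors the AGENT_SIMILARITY_THRESHOLD default
) -> bool:
    """Assumed rule: keep the paper if its best anchor match reaches the threshold."""
    best = max((cosine(paper_vec, a) for a in anchor_vecs), default=0.0)
    return best >= threshold


# Toy vectors standing in for embeddings of anchors and paper abstracts.
anchors = [[1.0, 0.0], [0.0, 1.0]]
on_topic = [0.9, 0.1]     # points almost the same way as the first anchor
off_topic = [-1.0, -1.0]  # points away from both anchors
```

Raising the threshold makes the digest shorter and stricter; lowering it lets more loosely related papers through.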

Quickstart

# Requires Ollama running with the model pulled
ollama pull llama3.2:3b

pip install arxiv-paper-digest
arxiv-digest              # full run
arxiv-digest --dry-run    # skip LLM, test the rest of the pipeline

Or from source:

git clone https://github.com/ilonae/research-agent
cd research-agent
pip install -e ".[dev]"
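
Because a full run needs Ollama running with the model pulled, a quick preflight check can save a failed run. The sketch below is an illustration rather than part of the package: it queries Ollama's /api/tags endpoint, which lists locally available models, and checks for the configured model name. The helper names fetch_tags and model_is_pulled are hypothetical.

```python
import json
import urllib.request


def fetch_tags(base_url: str = "http://localhost:11434", timeout: float = 5.0) -> dict:
    """Return the parsed JSON from Ollama's /api/tags endpoint (needs a running server)."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return json.load(resp)


def model_is_pulled(tags: dict, model: str) -> bool:
    """True if `model` appears by name in an /api/tags response."""
    return any(entry.get("name") == model for entry in tags.get("models", []))
```

With a server running, model_is_pulled(fetch_tags(), "llama3.2:3b") should be true once the pull has finished; if it is not, arxiv-digest --dry-run still exercises the rest of the pipeline.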

Configuration

Copy .env.example to .env. All variables are optional; the defaults are shown below:

Variable                     Default
AGENT_OLLAMA_MODEL           llama3.2:3b
AGENT_OLLAMA_URL             http://localhost:11434
AGENT_MAX_PER_FEED           20
AGENT_ARXIV_CATEGORIES       ["cs.LG","cs.AI","cs.CV"]
AGENT_SIMILARITY_THRESHOLD   0.35
AGENT_EMBEDDING_MODEL        all-MiniLM-L6-v2
AGENT_ANCHORS                8 XAI topic sentences
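
To illustrate how these variables might map onto typed settings, here is a minimal standard-library sketch. It is an assumption about the shape of the configuration, not the package's actual loader: the Settings class and load_settings function are hypothetical, and AGENT_ANCHORS is left out because its default sentences are not listed here. The one detail taken directly from the table is that AGENT_ARXIV_CATEGORIES is written as a JSON list.

```python
import json
import os
from dataclasses import dataclass, field
from typing import Mapping, Optional


@dataclass
class Settings:
    """Defaults copied from the table above."""
    ollama_model: str = "llama3.2:3b"
    ollama_url: str = "http://localhost:11434"
    max_per_feed: int = 20
    arxiv_categories: list = field(default_factory=lambda: ["cs.LG", "cs.AI", "cs.CV"])
    similarity_threshold: float = 0.35
    embedding_model: str = "all-MiniLM-L6-v2"


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from AGENT_* variables, falling back to the defaults."""
    env = os.environ if env is None else env
    d = Settings()
    categories = env.get("AGENT_ARXIV_CATEGORIES")
    return Settings(
        ollama_model=env.get("AGENT_OLLAMA_MODEL", d.ollama_model),
        ollama_url=env.get("AGENT_OLLAMA_URL", d.ollama_url),
        max_per_feed=int(env.get("AGENT_MAX_PER_FEED", d.max_per_feed)),
        # The category list is parsed as JSON, e.g. '["cs.CL","stat.ML"]'.
        arxiv_categories=json.loads(categories) if categories else d.arxiv_categories,
        similarity_threshold=float(
            env.get("AGENT_SIMILARITY_THRESHOLD", d.similarity_threshold)
        ),
        embedding_model=env.get("AGENT_EMBEDDING_MODEL", d.embedding_model),
    )
```

For example, setting AGENT_ARXIV_CATEGORIES='["cs.CL"]' and AGENT_SIMILARITY_THRESHOLD=0.5 in .env would narrow the feeds to one category and tighten the filter.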

Docker

docker compose up   # starts Ollama sidecar + agent

Scheduled runs

.github/workflows/daily-digest.yml runs daily at 07:00 UTC and commits the digest back to the repo. To test it, trigger the workflow manually from Actions → Run workflow.

Querying the memory store

sqlite3 outputs/seen_papers.db \
  "SELECT title, first_seen FROM seen_papers
   WHERE first_seen >= date('now', '-7 days')
   ORDER BY first_seen DESC;"
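
The same store can be read from Python with the standard sqlite3 module. The sketch below also shows one way a filter_unseen step could work against such a table. Only the table name and the title and first_seen columns come from the query above; the paper_id column, the timestamp format, and both function bodies are assumptions made for illustration. It uses an in-memory database so it does not touch outputs/seen_papers.db.

```python
import sqlite3
from datetime import datetime, timezone


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the store and create the (assumed) schema if it is missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS seen_papers ("
        "paper_id TEXT PRIMARY KEY, title TEXT NOT NULL, first_seen TEXT NOT NULL)"
    )
    return conn


def filter_unseen(conn: sqlite3.Connection, papers: list) -> list:
    """Record (paper_id, title) pairs and return only those not seen before."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
    unseen = []
    for paper_id, title in papers:
        cur = conn.execute(
            "INSERT OR IGNORE INTO seen_papers VALUES (?, ?, ?)",
            (paper_id, title, now),
        )
        if cur.rowcount == 1:  # the row was inserted, so the paper is new
            unseen.append((paper_id, title))
    conn.commit()
    return unseen


def seen_last_week(conn: sqlite3.Connection) -> list:
    """Python equivalent of the sqlite3 command-line query above."""
    return conn.execute(
        "SELECT title, first_seen FROM seen_papers "
        "WHERE first_seen >= date('now', '-7 days') "
        "ORDER BY first_seen DESC"
    ).fetchall()
```

Passing the same batch to filter_unseen twice returns the papers the first time and an empty list the second, which is the behaviour that keeps a paper from appearing in more than one digest.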

Development

pip install -e ".[dev]"
pytest && ruff check . && mypy agent/ tools/ config/

License

MIT
