
Command-line Papers Downloader. Citation extraction and PDF naming automation.

arXiv-dl

Command-line research paper downloader for papers hosted on arXiv, NeurIPS, CVF Open Access (CVPR, ICCV, WACV), and ECVA (ECCV).

Disclaimer: This is an opinionated command-line tool for downloading papers. It prioritizes ease of use for researchers and is not an official arXiv project.

What does it do?

  • Downloads papers from arXiv, NeurIPS, CVPR, ICCV, WACV, and ECCV with a simple CLI.
  • Speeds up downloads with aria2 when available.
  • Retrieves paper metadata:
    • Title, abstract, and year
    • Authors
    • Comments and conference acceptance info
    • Repository URLs when available
    • BibTeX citation
  • Maintains a list of local papers and their metadata in a JSON file.
  • Lets you configure the download destination with an environment variable or command-line option.
  • Saves downloaded papers with standardized filenames.

Why?

  • Save time downloading and organizing papers.
  • Use multiple parallel connections for faster downloads.
  • Keep a local paper list for lookup, notes, and citations.

Installation

For regular command-line use, install with pipx:

  • Prerequisite: Python 3.9 or later

pipx install arxiv-dl

If pipx is not installed:

# Debian/Ubuntu
sudo apt install pipx
pipx ensurepath

# macOS
brew install pipx
pipx ensurepath

[!NOTE] pipx installs command-line tools in isolated environments and exposes their commands on your PATH. This avoids conflicts with operating-system-managed Python installations, including Debian/Ubuntu environments that block global pip install through PEP 668.

To upgrade:

pipx upgrade arxiv-dl

If you prefer pip, install inside a virtual environment:

python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install -U arxiv-dl

Optionally, install aria2 (which provides the aria2c command) for multi-connection downloads.

  • macOS: brew install aria2
  • Linux: sudo snap install aria2c
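
To confirm that aria2 is visible before relying on multi-connection downloads, a quick check from Python can help. This is a minimal sketch that assumes only that aria2 installs an executable named `aria2c` on your PATH; the helper name `has_aria2c` is hypothetical and is not part of arxiv-dl.

```python
import shutil


def has_aria2c() -> bool:
    """Return True if an `aria2c` executable is found on PATH."""
    # Hypothetical helper for illustration; not part of arxiv-dl.
    return shutil.which("aria2c") is not None


print("aria2c available:", has_aria2c())
```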

Usage

After installation, use paper in your shell to download papers. The legacy commands arxiv-dl and getpaper are equivalent to paper.

paper [OPTIONS] TARGET(s)

Shell examples

# Download a single target
$ paper 1512.03385

# Download multiple targets
$ paper 2103.15538 2304.04415 https://arxiv.org/abs/1512.03385

Supported Targets

✅ Supported, 🚧 Not Yet Supported, ❌ Not Supported

  • arXiv
    • ✅ arXiv ID: 1512.03385 or arXiv:1512.03385
    • ✅ Legacy arXiv ID: alg-geom/9708001 or cs/0002001, etc.
    • ✅ arXiv Abstract Page URL: https://arxiv.org/abs/1512.03385
    • ✅ arXiv PDF Page URL: https://arxiv.org/pdf/1512.03385.pdf
    • ✅ arXiv HTML Page URL: https://arxiv.org/html/2506.15442
  • CVF Open Access (CVPR, ICCV, WACV)
    • ✅ CVF Abstract Page URL: https://openaccess.thecvf.com/content/**/html/**/*.html
    • ✅ CVF PDF Page URL: https://openaccess.thecvf.com/content/**/papers/**/*.pdf
  • ECVA (ECCV)
    • ✅ ECVA Abstract Page URL: https://www.ecva.net/html/**/*.php
    • ❌ ECVA PDF Page URL: https://www.ecva.net/papers/**/*.pdf
  • NeurIPS / NIPS
    • ✅ NeurIPS Abstract Page URL: https://proceedings.neurips.cc/paper_files/paper/**/hash/**/*.html
    • ✅ NeurIPS PDF Page URL: https://proceedings.neurips.cc/paper_files/paper/**/file/**/*.pdf
    • ✅ NIPS mirror Abstract Page URL: https://papers.nips.cc/paper_files/paper/**/hash/**/*.html
    • ✅ NIPS mirror PDF Page URL: https://papers.nips.cc/paper_files/paper/**/file/**/*.pdf
  • OpenReview
    • 🚧 TODO
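
The target forms listed above can be told apart with a few patterns. The sketch below is an illustration only and is not arxiv-dl's own parser: the function name `classify_target`, the regular expressions, and the returned labels are all assumptions made for this example.

```python
import re
from urllib.parse import urlparse

# Assumed patterns: new-style IDs such as 1512.03385 (optionally prefixed
# with "arXiv:"), and legacy IDs such as alg-geom/9708001 or cs/0002001.
NEW_ID = re.compile(r"^(?:arXiv:)?\d{4}\.\d{4,5}(?:v\d+)?$", re.IGNORECASE)
LEGACY_ID = re.compile(r"^[a-z-]+(?:\.[A-Z]{2})?/\d{7}(?:v\d+)?$")

# Hosts taken from the list of supported targets.
HOSTS = {
    "arxiv.org": "arxiv",
    "openaccess.thecvf.com": "cvf",
    "www.ecva.net": "ecva",
    "proceedings.neurips.cc": "neurips",
    "papers.nips.cc": "neurips",
}


def classify_target(target: str) -> str:
    """Return a coarse label for a download target (hypothetical helper)."""
    target = target.strip()
    if NEW_ID.match(target):
        return "arxiv-id"
    if LEGACY_ID.match(target):
        return "arxiv-legacy-id"
    parsed = urlparse(target)
    site = HOSTS.get(parsed.netloc.lower(), "unknown")
    # ECVA PDF page URLs are listed as not supported.
    if site == "ecva" and parsed.path.endswith(".pdf"):
        return "unsupported"
    return site


for example in ["1512.03385", "cs/0002001", "https://arxiv.org/abs/1512.03385"]:
    print(example, "->", classify_target(example))
```

A check like this could be used to filter a reading list before passing the entries to paper; it does not verify that a paper actually exists at the target.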

Common Options

  • -v, --verbose: Print full details.
  • -d, --download-dir: Set the download directory for this run. This overrides both the default path and ARXIV_DOWNLOAD_FOLDER.
  • -n, --n-threads: Set the number of parallel download connections used by aria2.

[!TIP] Run paper -h to see all options.
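
When calling the CLI from a script, the options above can be assembled into an argument list. This is a minimal sketch: `build_paper_command` is a hypothetical helper, and it assumes that options precede the targets as in the usage line and that -n takes an integer.

```python
import shlex
from typing import List, Optional, Sequence


def build_paper_command(
    targets: Sequence[str],
    download_dir: Optional[str] = None,
    n_threads: Optional[int] = None,
    verbose: bool = False,
) -> List[str]:
    """Build an argv list for the `paper` command (hypothetical helper)."""
    if not targets:
        raise ValueError("at least one target is required")
    cmd = ["paper"]
    if verbose:
        cmd.append("-v")
    if download_dir is not None:
        cmd += ["-d", download_dir]
    if n_threads is not None:
        cmd += ["-n", str(n_threads)]
    cmd += list(targets)
    return cmd


cmd = build_paper_command(["1512.03385"], download_dir="/tmp/papers", n_threads=4)
print(shlex.join(cmd))  # paper -d /tmp/papers -n 4 1512.03385
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Passing a list rather than a single string avoids shell quoting problems with paths that contain spaces.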

Python API

from arxiv_dl import download_paper

download_paper(target="1512.03385", download_dir=".", set_verbose_level="silent")
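
To download several targets from Python, the call above can be wrapped in a loop. The sketch below is hedged: `download_all` is a hypothetical helper, and it assumes that a failed download surfaces as an exception, which this README does not confirm. The downloader is passed in as a parameter so the loop can be tried without network access.

```python
from typing import Callable, Iterable, List, Tuple


def download_all(
    targets: Iterable[str],
    download_one: Callable[..., object],
    download_dir: str = ".",
) -> List[Tuple[str, str]]:
    """Call `download_one` for each target and collect failures.

    Assumption: a failed download raises an exception. If the real
    function reports errors differently, adapt the check accordingly.
    """
    failures = []
    for target in targets:
        try:
            # Keyword arguments mirror the documented download_paper call.
            download_one(
                target=target,
                download_dir=download_dir,
                set_verbose_level="silent",
            )
        except Exception as exc:
            failures.append((target, repr(exc)))
    return failures
```

To use it with the real API, import `download_paper` from `arxiv_dl` and pass it as `download_one`, for example `download_all(["1512.03385", "2103.15538"], download_paper)`.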

Configuration

Default Download Destination

  • By default, papers are downloaded to $HOME/Downloads/ArXiv_Papers.

Custom Download Destination

Set ARXIV_DOWNLOAD_FOLDER to choose a persistent download destination. Add this to your .bashrc or .zshrc:

export ARXIV_DOWNLOAD_FOLDER="YOUR/PATH/TO/ANY/FOLDER"

  • Download destination priority:
    1. Command-line option -d (highest priority)
    2. Environment variable ARXIV_DOWNLOAD_FOLDER
    3. Default download destination (lowest priority)
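
The priority order above can be expressed as a small resolution function. This is a sketch of the documented behavior rather than arxiv-dl's actual code; `resolve_download_dir` and `DEFAULT_DIR` are hypothetical names, and expanding `~` in the chosen path is an added assumption.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

# Default destination as documented: $HOME/Downloads/ArXiv_Papers
DEFAULT_DIR = Path.home() / "Downloads" / "ArXiv_Papers"


def resolve_download_dir(
    cli_option: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Path:
    """Pick the download directory: -d option, then env var, then default."""
    env = os.environ if env is None else env
    if cli_option:  # 1. command-line option -d
        return Path(cli_option).expanduser()
    from_env = env.get("ARXIV_DOWNLOAD_FOLDER")
    if from_env:  # 2. environment variable
        return Path(from_env).expanduser()
    return DEFAULT_DIR  # 3. default destination


print(resolve_download_dir("/tmp/papers", {"ARXIV_DOWNLOAD_FOLDER": "/data"}))
```

Accepting the environment as a parameter makes the three cases easy to check without changing the real process environment.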

Custom Command Alias

  • You can define aliases to rename the command or add default options:
    alias dp="paper"
    alias dpv='paper -v -d "$HOME/Documents/Papers"'

Contributing

Development, testing, build, and publishing notes are in DEVELOPMENT.md.

License

This project is licensed under the MIT License.
© Mark H. Huang. All rights reserved.
