Command-line Papers Downloader. Citation extraction and PDF naming automation.

arXiv-dl

Command-line research paper downloader for papers hosted on arXiv, NeurIPS, CVF Open Access (CVPR, ICCV, WACV), and ECVA (ECCV).

Disclaimer: This is an opinionated command-line tool for downloading papers. It prioritizes ease of use for researchers and is not an official arXiv project.

What does it do?

  • Downloads papers from arXiv, NeurIPS, CVPR, ICCV, WACV, and ECCV with a simple CLI.
  • Speeds up downloads with aria2 when available.
  • Retrieves paper metadata:
    • Title, abstract, and year
    • Authors
    • Comments and conference acceptance info
    • Repository URLs when available
    • BibTeX citation
  • Maintains a list of local papers and their metadata in a JSON file.
  • Lets you configure the download destination with an environment variable or command-line option.
  • Saves downloaded papers with standardized filenames.

Why?

  • Save time downloading and organizing papers.
  • Use multiple parallel connections for faster downloads.
  • Keep a local paper list for lookup, notes, and citations.

Installation

Install with pip:

  • Prerequisite: Python 3.9 or later
python3 -m pip install -U arxiv-dl

Note: After installation, make sure the Python script installation directory is on your PATH. If the paper command is not found, see this PATH setup note or the Python Packaging guide for installing stand-alone command-line tools.

Optionally, install aria2c for multi-connection downloads.

  • macOS: brew install aria2
  • Linux: sudo snap install aria2c
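
To confirm that both the paper command and the optional aria2c binary are reachable from your PATH, a quick check along the following lines may help. This is a minimal sketch using only the Python standard library; the helper name find_tools is hypothetical and not part of arxiv-dl.

```python
import shutil


def find_tools(names=("paper", "aria2c")):
    """Map each executable name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If paper is reported as not found, the PATH note above applies. If only aria2c is missing, downloads should still work, just without multi-connection acceleration.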

Usage

After installation, use paper in your shell to download papers. The legacy commands arxiv-dl and getpaper are equivalent to paper.

paper [OPTIONS] TARGET(s)

Shell examples

# Download a single target
$ paper 1512.03385

# Download multiple targets
$ paper 2103.15538 2304.04415 https://arxiv.org/abs/1512.03385

Supported Targets

✅ Supported, 🚧 Not Yet Supported, ❌ Not Supported

  • ArXiv
    • ✅ ArXiv ID: 1512.03385 or arXiv:1512.03385
    • ✅ Legacy ArXiv ID: alg-geom/9708001 or cs/0002001, etc.
    • ✅ ArXiv Abstract Page URL: https://arxiv.org/abs/1512.03385
    • ✅ ArXiv PDF Page URL: https://arxiv.org/pdf/1512.03385.pdf
    • ✅ ArXiv HTML Page URL: https://arxiv.org/html/2506.15442
  • CVF Open Access (CVPR, ICCV, WACV)
    • ✅ CVF Abstract Page URL: https://openaccess.thecvf.com/content/**/html/**/*.html
    • ✅ CVF PDF Page URL: https://openaccess.thecvf.com/content/**/papers/**/*.pdf
  • ECVA (ECCV)
    • ✅ ECVA Abstract Page URL: https://www.ecva.net/html/**/*.php
    • ❌ ECVA PDF Page URL: https://www.ecva.net/papers/**/*.pdf
  • NeurIPS / NIPS
    • ✅ NeurIPS Abstract Page URL: https://proceedings.neurips.cc/paper_files/paper/**/hash/**/*.html
    • ✅ NeurIPS PDF Page URL: https://proceedings.neurips.cc/paper_files/paper/**/file/**/*.pdf
    • ✅ NIPS mirror Abstract Page URL: https://papers.nips.cc/paper_files/paper/**/hash/**/*.html
    • ✅ NIPS mirror PDF Page URL: https://papers.nips.cc/paper_files/paper/**/file/**/*.pdf
  • OpenReview
    • 🚧 TODO
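
The target forms listed above can be told apart with a couple of regular expressions and a host lookup. The sketch below is illustrative only: it is not the parser arxiv-dl uses, and the function name classify_target, the returned labels, and the exact patterns are assumptions made for this example.

```python
import re
from urllib.parse import urlparse

# New-style IDs such as 1512.03385, optionally prefixed with "arXiv:".
NEW_ID = re.compile(r"^(?:arxiv:)?\d{4}\.\d{4,5}(?:v\d+)?$", re.IGNORECASE)
# Legacy IDs such as alg-geom/9708001 or cs/0002001.
LEGACY_ID = re.compile(
    r"^(?:arxiv:)?[a-z-]+(?:\.[a-z]{2})?/\d{7}(?:v\d+)?$", re.IGNORECASE
)

# Hosts taken from the list of supported targets above.
HOSTS = {
    "arxiv.org": "arxiv",
    "openaccess.thecvf.com": "cvf",
    "www.ecva.net": "ecva",
    "proceedings.neurips.cc": "neurips",
    "papers.nips.cc": "neurips",
}


def classify_target(target: str) -> str:
    """Return a coarse label for a target string, or "unsupported"."""
    target = target.strip()
    if NEW_ID.match(target):
        return "arxiv-id"
    if LEGACY_ID.match(target):
        return "arxiv-legacy-id"
    parsed = urlparse(target)
    if parsed.scheme in ("http", "https"):
        source = HOSTS.get((parsed.hostname or "").lower())
        if source is None:
            return "unsupported"
        # ECVA PDF URLs are listed as not supported; only abstract pages are.
        if source == "ecva" and parsed.path.lower().endswith(".pdf"):
            return "unsupported"
        return f"{source}-url"
    return "unsupported"
```

For example, classify_target("arXiv:1512.03385") yields "arxiv-id", while an ECVA PDF URL yields "unsupported", matching the table above.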

Common Options

  • -v, --verbose: Print full details.
  • -d, --download-dir: Set the download directory for this run. This overrides both the default path and ARXIV_DOWNLOAD_FOLDER.
  • -n, --n-threads: Set the number of parallel download connections used by aria2.

Tip: Run paper -h to see all options.

Python API

from arxiv_dl import download_paper

download_paper(target="1512.03385", download_dir=".", set_verbose_level="silent")
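
To fetch several papers from a script, the call above can be wrapped in a loop. The following is a minimal sketch: download_many is a hypothetical helper, and it only assumes the three keyword arguments shown above. Whether download_paper raises an exception on failure is not stated here, so the try/except is a precaution rather than documented behavior.

```python
def download_many(targets, download_dir, download_fn):
    """Call download_fn for each target and collect (target, error message) pairs."""
    failed = []
    for target in targets:
        try:
            download_fn(
                target=target,
                download_dir=download_dir,
                set_verbose_level="silent",
            )
        except Exception as exc:  # assumption: failures may surface as exceptions
            failed.append((target, str(exc)))
    return failed


def main():
    # Imported here so the helper above can be reused and tested without arxiv-dl.
    from arxiv_dl import download_paper

    failed = download_many(["1512.03385", "2103.15538"], ".", download_paper)
    for target, reason in failed:
        print(f"Could not download {target}: {reason}")
```

Passing the download function as an argument keeps the loop independent of the package, which makes it easy to try out with a stub before running real downloads.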

Configuration

Default Download Destination

  • By default, papers are downloaded to $HOME/Downloads/ArXiv_Papers.

Custom Download Destination

Set ARXIV_DOWNLOAD_FOLDER to choose a persistent download destination. Add this to your .bashrc or .zshrc:

export ARXIV_DOWNLOAD_FOLDER="YOUR/PATH/TO/ANY/FOLDER"
  • Download destination priority:
    1. Command-line option -d (highest priority)
    2. Environment variable ARXIV_DOWNLOAD_FOLDER
    3. Default download destination (lowest priority)
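
The priority order above can be expressed as a small resolution function. This is a sketch of the described behavior, not the package's actual code: resolve_download_dir and its parameters are hypothetical names, and only the ARXIV_DOWNLOAD_FOLDER variable and the default path come from this document.

```python
import os
from pathlib import Path

# Default destination as documented: $HOME/Downloads/ArXiv_Papers
DEFAULT_DIR = Path.home() / "Downloads" / "ArXiv_Papers"


def resolve_download_dir(cli_dir=None, env=None):
    """Pick the download directory: -d option, then env variable, then default."""
    env = os.environ if env is None else env
    if cli_dir:
        chosen = Path(cli_dir)
    elif env.get("ARXIV_DOWNLOAD_FOLDER"):
        chosen = Path(env["ARXIV_DOWNLOAD_FOLDER"])
    else:
        chosen = DEFAULT_DIR
    # Expand a leading "~" so that quoted paths still resolve to the home directory.
    return chosen.expanduser()
```

For instance, resolve_download_dir("/a", {"ARXIV_DOWNLOAD_FOLDER": "/b"}) returns Path("/a"), because the command-line value takes precedence over the environment variable.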

Custom Command Alias

  • You can define aliases to rename the command or add default options. Use $HOME rather than a quoted ~, because the shell does not expand ~ inside quotes:
    alias dp="paper"
    alias dpv='paper -v -d "$HOME/Documents/Papers"'
    

Contributing

Development, testing, build, and publishing notes are in DEVELOPMENT.md.

License

This project is licensed under the MIT License.
© Mark H. Huang. All rights reserved.
