Command-line Papers Downloader. Citation extraction and PDF naming automation.
arXiv-dl
Command-line research paper downloader for papers hosted on arXiv, NeurIPS, CVF Open Access (CVPR, ICCV, WACV), and ECVA (ECCV).
Disclaimer: This is an opinionated command-line tool for downloading papers. It prioritizes ease of use for researchers and is not an official arXiv project.
What does it do?
- Downloads papers from arXiv, NeurIPS, CVPR, ICCV, WACV, and ECCV with a simple CLI.
- Speeds up downloads with aria2 when available.
- Retrieves paper metadata:
  - Title, abstract, and year
  - Authors
  - Comments and conference acceptance info
  - Repository URLs when available
  - BibTeX citation
- Maintains a list of local papers and their metadata in a JSON file.
- Lets you configure the download destination with an environment variable or command-line option.
- Saves downloaded papers with standardized filenames.
Why?
- Save time downloading and organizing papers.
- Use multiple parallel connections for faster downloads.
- Keep a local paper list for lookup, notes, and citations.
Installation
For regular command-line use, install with pipx:
- Prerequisite: Python 3.9 or later
pipx install arxiv-dl
If pipx is not installed:
# Debian/Ubuntu
sudo apt install pipx
pipx ensurepath
# macOS
brew install pipx
pipx ensurepath
Note: pipx installs command-line tools in isolated environments and exposes their commands on your PATH. This avoids conflicts with operating-system-managed Python installations, including Debian/Ubuntu environments that block global pip install through PEP 668.
To upgrade:
pipx upgrade arxiv-dl
If you prefer pip, install inside a virtual environment:
python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install -U arxiv-dl
Optionally, install aria2c for multi-connection downloads.
- macOS:
  brew install aria2
- Linux:
  sudo snap install aria2c
Usage
After installation, use paper in your shell to download papers.
The legacy commands arxiv-dl and getpaper are equivalent to paper.
paper [OPTIONS] TARGET(s)
Shell examples
# Download a single target
$ paper 1512.03385
# Download multiple targets
$ paper 2103.15538 2304.04415 https://arxiv.org/abs/1512.03385
Supported Targets
✅ Supported, 🚧 Not Yet Supported, ❌ Not Supported
- ArXiv
  - ✅ ArXiv ID: 1512.03385 or arXiv:1512.03385
  - ✅ Legacy ArXiv ID: alg-geom/9708001 or cs/0002001, etc.
  - ✅ ArXiv Abstract Page URL: https://arxiv.org/abs/1512.03385
  - ✅ ArXiv PDF Page URL: https://arxiv.org/pdf/1512.03385.pdf
  - ✅ ArXiv HTML Page URL: https://arxiv.org/html/2506.15442
- CVF Open Access (CVPR, ICCV, WACV)
  - ✅ CVF Abstract Page URL: https://openaccess.thecvf.com/content/**/html/**/*.html
  - ✅ CVF PDF Page URL: https://openaccess.thecvf.com/content/**/papers/**/*.pdf
- ECVA (ECCV)
  - ✅ ECVA Abstract Page URL: https://www.ecva.net/html/**/*.php
  - ❌ ECVA PDF Page URL: https://www.ecva.net/papers/**/*.pdf
- NeurIPS / NIPS
  - ✅ NeurIPS Abstract Page URL: https://proceedings.neurips.cc/paper_files/paper/**/hash/**/*.html
  - ✅ NeurIPS PDF Page URL: https://proceedings.neurips.cc/paper_files/paper/**/file/**/*.pdf
  - ✅ NIPS mirror Abstract Page URL: https://papers.nips.cc/paper_files/paper/**/hash/**/*.html
  - ✅ NIPS mirror PDF Page URL: https://papers.nips.cc/paper_files/paper/**/file/**/*.pdf
- OpenReview
  - 🚧 TODO
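The target formats listed above can be told apart with a couple of regular expressions and a host lookup. The sketch below is illustrative only: `classify_target`, the patterns, and the returned labels are hypothetical names chosen for this example, and the tool's actual parsing rules may differ.

```python
import re
from urllib.parse import urlparse

# Illustrative patterns only; arxiv-dl's real matching logic is not shown in this README.
NEW_STYLE_ID = re.compile(r"^(?:arXiv:)?\d{4}\.\d{4,5}(?:v\d+)?$", re.IGNORECASE)
LEGACY_ID = re.compile(r"^(?:arXiv:)?[a-z\-]+(?:\.[A-Z]{2})?/\d{7}(?:v\d+)?$", re.IGNORECASE)

# Hosts taken from the supported-target list; the labels are hypothetical.
HOSTS = {
    "arxiv.org": "arxiv",
    "openaccess.thecvf.com": "cvf",
    "www.ecva.net": "ecva",
    "proceedings.neurips.cc": "neurips",
    "papers.nips.cc": "neurips",
}


def classify_target(target: str) -> str:
    """Return a site label for a target string, or "unknown"."""
    target = target.strip()
    # Bare identifiers (new-style or legacy) are treated as arXiv papers.
    if NEW_STYLE_ID.match(target) or LEGACY_ID.match(target):
        return "arxiv"
    # Anything else is treated as a URL and looked up by host name.
    host = urlparse(target).netloc.lower()
    return HOSTS.get(host, "unknown")
```

This sketch only inspects the host, so it does not distinguish abstract pages from PDF pages; a fuller version would also check the URL path.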
Common Options
- -v, --verbose: Print full details.
- -d, --download-dir: Set the download directory for this run. This overrides both the default path and ARXIV_DOWNLOAD_FOLDER.
- -n, --n-threads: Set the number of parallel download connections used by aria2.
Tip: Run paper -h to see all options.
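If you call the CLI from a script, one option is to assemble the argument list in Python and pass it to subprocess. The following is a minimal sketch, assuming only the -v, -d, and -n options described above; `build_paper_command` is a hypothetical helper, not part of arxiv-dl.

```python
from typing import List, Optional, Sequence


def build_paper_command(
    targets: Sequence[str],
    download_dir: Optional[str] = None,
    n_threads: Optional[int] = None,
    verbose: bool = False,
) -> List[str]:
    """Build an argument list for the `paper` command (hypothetical helper)."""
    if not targets:
        raise ValueError("at least one target is required")
    cmd = ["paper"]
    if verbose:
        cmd.append("-v")
    if download_dir is not None:
        cmd += ["-d", download_dir]
    if n_threads is not None:
        cmd += ["-n", str(n_threads)]
    # Targets go last, matching the `paper [OPTIONS] TARGET(s)` synopsis.
    cmd += list(targets)
    return cmd


# Example (not executed here):
#   import subprocess
#   subprocess.run(build_paper_command(["1512.03385"], n_threads=4), check=True)
```

Passing a list rather than a single string avoids shell quoting problems when the download directory contains spaces.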
Python API
from arxiv_dl import download_paper
download_paper(target="1512.03385", download_dir=".", set_verbose_level="silent")
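To download several papers from a script, the call above can be wrapped in a loop. This is a sketch under stated assumptions: `download_many` is a hypothetical helper, and it assumes that `download_paper` signals failure by raising an exception, which this README does not confirm. The `downloader` parameter exists only so the loop can be exercised without network access.

```python
from typing import Callable, Dict, Iterable, Optional


def download_many(
    targets: Iterable[str],
    download_dir: str = ".",
    downloader: Optional[Callable[..., object]] = None,
) -> Dict[str, Optional[str]]:
    """Download each target; map target -> None on success or an error message."""
    if downloader is None:
        # Imported lazily so the helper can be tested without arxiv-dl installed.
        from arxiv_dl import download_paper
        downloader = download_paper

    results: Dict[str, Optional[str]] = {}
    for target in targets:
        try:
            downloader(
                target=target,
                download_dir=download_dir,
                set_verbose_level="silent",
            )
            results[target] = None
        except Exception as exc:  # assumption: failures surface as exceptions
            results[target] = str(exc)
    return results
```

A call such as `download_many(["1512.03385", "2103.15538"], download_dir="papers")` would then return a dictionary you can scan for failed targets.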
Configuration
Default Download Destination
- By default, papers are downloaded to $HOME/Downloads/ArXiv_Papers.
Custom Download Destination
Set ARXIV_DOWNLOAD_FOLDER to choose a persistent download destination. Add this to your .bashrc or .zshrc:
export ARXIV_DOWNLOAD_FOLDER="YOUR/PATH/TO/ANY/FOLDER"
- Download destination priority:
  - Command-line option -d (highest priority)
  - Environment variable ARXIV_DOWNLOAD_FOLDER
  - Default download destination (lowest priority)
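The priority order above can be expressed as a small lookup. This is a minimal sketch, not the tool's actual code: `resolve_download_dir` is a hypothetical name, and expanding `~` in the configured paths is an assumption made for this example.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

ENV_VAR = "ARXIV_DOWNLOAD_FOLDER"


def resolve_download_dir(
    cli_option: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Path:
    """Pick the download directory: -d option, then env variable, then default."""
    env = os.environ if env is None else env
    if cli_option:
        # Highest priority: the value passed with -d / --download-dir.
        return Path(cli_option).expanduser()
    env_value = env.get(ENV_VAR)
    if env_value:
        # Next: the persistent setting from the environment.
        return Path(env_value).expanduser()
    # Lowest priority: the documented default location.
    return Path.home() / "Downloads" / "ArXiv_Papers"
```

Accepting `env` as a parameter keeps the function easy to test without modifying the real process environment.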
Custom Command Alias
- You can define aliases to rename the command or add default options:
alias dp="paper"
alias dpv="paper -v -d '~/Documents/Papers'"
Contributing
Development, testing, build, and publishing notes are in DEVELOPMENT.md.
License
This project is licensed under the MIT License.
© Mark H. Huang. All rights reserved.