Command-line arXiv Papers Downloader. Citation extraction and PDF naming automation.
arXiv-dl
Command-line ArXiv & CVF Open Access Paper Downloader. [PyPI] [Source]
Disclaimer: This is a highly opinionated CLI tool for downloading papers. It prioritizes ease of use for researchers. This is not an official arXiv or CVF project.
What does it do?
- Download papers from ArXiv, CVPR, ICCV, and WACV via a simple CLI.
- Speed up downloads by using aria2c.
- Retrieve the paper's metadata, such as:
  - Title, Abstract, Year
  - Authors
  - Comments (Conference acceptance info)
  - Repository URLs
  - BibTeX
  - Citation
- Automatically maintain a list of local papers and their metadata in a JSON file.
- Configure the desired download destination via an environment variable or a command-line argument.
- All downloaded papers are given a standardized filename for easy browsing.
Why?
- Save time and effort when downloading and organizing papers on your machine.
- Speed up the download process by using multiple parallel connections.
- A local paper list is handy for quick lookup, note-taking, and citations.
How to install it?
This is a command-line tool; use pip to install the package globally.
- Prerequisite: Python 3.x
python3 -m pip install --upgrade arxiv-dl
(Optional) Install aria2c for download speedup.
- MacOS:
brew install aria2
- Linux:
sudo snap install aria2c
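To confirm that the optional aria2c binary is visible after installation, a small check like the one below can help. This is only a sketch that looks for an executable named aria2c on the PATH; the function name aria2c_available is hypothetical and is not part of arxiv-dl.

```python
import shutil


def aria2c_available() -> bool:
    """Return True if an executable named `aria2c` is found on the PATH."""
    return shutil.which("aria2c") is not None


if __name__ == "__main__":
    if aria2c_available():
        print("aria2c found: accelerated downloads can be used.")
    else:
        print("aria2c not found: install it to enable accelerated downloads.")
```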
How to use it?
After installation, three equivalent commands (arxiv-dl, getpaper, and paper) should be available in your terminal.
$ paper [-h] [-v] [-p] [-d DOWNLOAD_DIR] [-n N_THREADS] urls [urls ...]
Options:
- -v, --verbose (optional): Print paper metadata.
- -p, --pdf_only (optional): Download PDF only without creating Markdown notes.
- -d, --download_dir (optional): Specify a one-time download directory. This option overrides the default download directory and the one specified in the environment variable ARXIV_DOWNLOAD_FOLDER.
- -n, --n_threads (optional): Specify the number of parallel connections to be used by aria2.
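The usage line and options above map naturally onto Python's argparse module. The sketch below reproduces the same flags to show how such a command line is parsed; it is an illustration, not the tool's actual source. The function name build_parser, the help strings, and the None defaults are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented `paper` options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="paper")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="Print paper metadata.")
    parser.add_argument("-p", "--pdf_only", action="store_true",
                        help="Download PDF only without creating Markdown notes.")
    # Default of None is an assumption: it means "fall back to env var or default folder".
    parser.add_argument("-d", "--download_dir", default=None,
                        help="One-time download directory.")
    # Default of None is an assumption: the real default thread count is not documented here.
    parser.add_argument("-n", "--n_threads", type=int, default=None,
                        help="Number of parallel connections for aria2.")
    parser.add_argument("urls", nargs="+",
                        help="One or more paper IDs or URLs.")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["-v", "-n", "8", "1512.03385", "2103.15538"])
    print(args.verbose, args.n_threads, args.urls)
```

Because urls uses nargs="+", at least one ID or URL is required, which matches the "urls [urls ...]" part of the usage line.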
Usage Examples:
# Use ArXiv Paper ID
$ paper 1512.03385 2103.15538
# Use ArXiv Abstract Page URL
$ paper https://arxiv.org/abs/2103.15538
# Use ArXiv PDF Page URL
$ paper https://arxiv.org/pdf/1512.03385.pdf
# Use CVF Open Access URL
$ paper "https://openaccess.thecvf.com/content/CVPR2021/html/Lin_Real-Time_High-Resolution_Background_Matting_CVPR_2021_paper.html"
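The first three input forms above all refer to a paper by its arXiv identifier. As a rough sketch of how such inputs could be normalized to a bare ID, the snippet below uses a regular expression for new-style identifiers (for example 1512.03385, optionally followed by a version such as v2). The function name extract_arxiv_id is hypothetical, old-style identifiers are not handled, and this is not necessarily how arxiv-dl does it.

```python
import re
from typing import Optional

# New-style arXiv identifiers: four digits, a dot, four or five digits,
# optionally followed by a version suffix such as "v2".
ARXIV_ID = re.compile(r"\d{4}\.\d{4,5}(v\d+)?")


def extract_arxiv_id(target: str) -> Optional[str]:
    """Return the arXiv ID found in a bare ID or arXiv URL, or None if there is none."""
    match = ARXIV_ID.search(target)
    return match.group(0) if match else None


if __name__ == "__main__":
    for target in (
        "1512.03385",
        "https://arxiv.org/abs/2103.15538",
        "https://arxiv.org/pdf/1512.03385.pdf",
    ):
        print(target, "->", extract_arxiv_id(target))
```

A CVF Open Access URL contains no such identifier, so the function returns None for it and a separate code path would be needed.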
Configurations
Set Custom Download Destination (Optional)
- Default Download Destination: ~/Downloads/ArXiv_Papers
- To set a custom download destination, use the environment variable ARXIV_DOWNLOAD_FOLDER. Include the following line in your .bashrc or .zshrc file:
  export ARXIV_DOWNLOAD_FOLDER=~/Documents/Papers
- Precedence (highest first):
  - Command-line option -d
  - Environment variable ARXIV_DOWNLOAD_FOLDER
  - Default download destination
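The precedence order can be sketched in a few lines of Python: the -d value wins if given, then the ARXIV_DOWNLOAD_FOLDER environment variable, then the default folder. The function name resolve_download_dir and its env parameter are hypothetical and exist only to make the example testable; the tool's real implementation may differ.

```python
import os
from typing import Mapping, Optional

DEFAULT_DOWNLOAD_DIR = "~/Downloads/ArXiv_Papers"


def resolve_download_dir(cli_dir: Optional[str] = None,
                         env: Optional[Mapping[str, str]] = None) -> str:
    """Pick the download directory: CLI option, then env var, then default."""
    env = os.environ if env is None else env
    chosen = cli_dir or env.get("ARXIV_DOWNLOAD_FOLDER") or DEFAULT_DOWNLOAD_DIR
    # Expand a leading "~" so the result is a usable path.
    return os.path.expanduser(chosen)


if __name__ == "__main__":
    print(resolve_download_dir())                 # env var if set, otherwise the default
    print(resolve_download_dir("/tmp/papers"))    # the CLI value always wins
```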
Set Custom Command Alias (Optional)
- You can always set your own preferred alias for the default getpaper command.
- Include the following line(s) in your .bashrc or .zshrc file to set your preferred alias:
  alias dp="getpaper"
  alias dpv="getpaper -v -d '~/Documents/Papers'"
Development
Set up development environment
python3 -m venv venv && \
source venv/bin/activate && \
pip install -e ".[dev]"
Run Tests
pytest
Build the package
make
Clean cache & build artifacts
make clean
TODOs
- Add support for aria2c.
- Add support for papers on CVF Open Access.
- Add support for papers on OpenReview.
License
MIT License - Copyright (c) 2021-2022 Mark Huang
Source Distribution: arxiv-dl-1.1.3.tar.gz (26.1 kB)
Built Distribution: arxiv_dl-1.1.3-py3-none-any.whl (20.1 kB)