A command line application for downloading scientific papers

Overview

papers-dl is a command line application for downloading scientific papers.

Usage

# parse DOI identifiers from a file:
papers-dl parse -m doi --path pages/my-paper.html

# parse ISBN identifiers from a file, output matches as CSV:
papers-dl parse -m isbn --path pages/my-paper.html -f csv

# fetch paper with given identifier from any known provider:
papers-dl fetch "10.1016/j.cub.2019.11.030"

# fetch paper from any known Sci-Hub URL with verbose logging, and store it in the "papers" directory:
papers-dl -v fetch -p "scihub" -o "papers" "10.1107/s0907444905036693"

# fetch paper from specific Sci-Hub URL:
papers-dl fetch -p "sci-hub.ee" "10.1107/s0907444905036693"

# fetch paper from SciDB (Anna's Archive):
papers-dl fetch -p "scidb" "10.1107/s0907444905036693"
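
To give a feel for what the parse command is doing conceptually, here is a minimal Python sketch of pulling DOI-like strings out of a saved page. It is not taken from the papers-dl source: the regular expression is a commonly used approximation of DOI syntax, and the names DOI_PATTERN, find_dois, and find_dois_in_file are hypothetical.

```python
import re
from pathlib import Path

# Assumed pattern (not from papers-dl): "10." + a 4-9 digit registrant code
# + "/" + a suffix that runs until whitespace, a double quote, or an angle bracket.
DOI_PATTERN = re.compile(r'\b10\.\d{4,9}/[^\s"<>]+')


def find_dois(text: str) -> list[str]:
    """Return unique DOI-like strings in order of first appearance."""
    seen: dict[str, None] = {}
    for match in DOI_PATTERN.finditer(text):
        # Trailing punctuation usually belongs to the sentence, not the DOI.
        doi = match.group(0).rstrip(".,;:)")
        seen.setdefault(doi, None)
    return list(seen)


def find_dois_in_file(path: str) -> list[str]:
    """Read a file (for example a saved HTML page) and extract DOI-like strings."""
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    return find_dois(text)
```

A real parser would likely need stricter validation, since a pattern this loose can also match strings that merely look like DOIs.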

About

papers-dl attempts to be a comprehensive tool for gathering research papers from popular open libraries. There are other solutions for this (see "Other tools" below), but papers-dl is trying to fill its own niche:

  • comprehensive: other tools usually work with a single library, while papers-dl is trying to support a collection of popular libraries.
  • performant: papers-dl tries to improve search and retrieval times by making use of concurrency where possible.
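
As an illustration of the concurrency idea, the sketch below queries several providers at once and returns the first successful result. It is only a sketch under stated assumptions, not the papers-dl implementation: the Provider callable interface and the fetch_first function are hypothetical, and threads are just one of several ways to get concurrency in Python.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Optional

# Hypothetical interface: a provider takes an identifier and returns the
# paper's bytes, or None when it has no match.
Provider = Callable[[str], Optional[bytes]]


def fetch_first(
    identifier: str, providers: dict[str, Provider]
) -> Optional[tuple[str, bytes]]:
    """Query all providers concurrently; return (name, content) of the first success."""
    if not providers:
        return None
    with ThreadPoolExecutor(max_workers=len(providers)) as pool:
        futures = {pool.submit(fn, identifier): name for name, fn in providers.items()}
        for future in as_completed(futures):
            try:
                content = future.result()
            except Exception:
                continue  # one failing provider should not abort the others
            if content:
                # Leaving the "with" block still waits for the remaining
                # providers to finish before the function returns.
                return futures[future], content
    return None
```

With this shape, the total wait is bounded by the slowest provider rather than the sum of all providers, which is where the time saving over a sequential loop would come from.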

That said, papers-dl may not be the best choice for your specific use case right now. For example, if you require features supported by a specific library, one of the more mature and specialized tools listed below may be a better option.

papers-dl was initially created to serve as an extractor for ArchiveBox, a powerful solution for self-hosted web archiving.

This project started as a fork of scihub.py.

Other tools

Roadmap

papers-dl's CLI is not yet stable.

Short-term roadmap:

parsing

  • add support for parsing more identifier types, like PMID, ISSN, and arXiv identifiers

fetching

  • add arXiv as a provider
  • add support for downloading formats other than PDF, such as HTML or EPUB

searching

  • add a CLI command for searching libraries for papers and metadata

Download files

Source Distribution

papers_dl-0.0.21.tar.gz (12.5 kB)

Uploaded Source

Built Distribution

papers_dl-0.0.21-py3-none-any.whl (12.1 kB)

Uploaded Python 3

File details

Details for the file papers_dl-0.0.21.tar.gz.

File metadata

  • Download URL: papers_dl-0.0.21.tar.gz
  • Upload date:
  • Size: 12.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.12.2

File hashes

Hashes for papers_dl-0.0.21.tar.gz
Algorithm    Hash digest
SHA256       ed6e36b72b413309c3a99ba3d82e69c23602bf4c289a4d90aa8ecaaf6b7f06cd
MD5          e3206aa301250fbf2fc23488fb60088c
BLAKE2b-256  ab9579f1b01f427fe9a4b670a4496e2c1ca1d1c1480c84258b4eb4edfd5a08ad
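
One way to use a published digest is to recompute it for the downloaded file and compare the two values. The following is a minimal sketch using Python's standard hashlib module; the helper names sha256_of and matches_published_hash are hypothetical, and the file path you pass in is whatever location you downloaded the archive to.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, matches_published_hash("papers_dl-0.0.21.tar.gz", "<SHA256 from the table above>") should return True for an intact download of the source distribution.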

File details

Details for the file papers_dl-0.0.21-py3-none-any.whl.

File metadata

  • Download URL: papers_dl-0.0.21-py3-none-any.whl
  • Upload date:
  • Size: 12.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.12.2

File hashes

Hashes for papers_dl-0.0.21-py3-none-any.whl
Algorithm    Hash digest
SHA256       e21b5c2f43b940dc7619aed3010f834198e40576236cd466200cfa0533bc300b
MD5          3849c695cb96b37a9bbc5153cc409ea1
BLAKE2b-256  20b0b79a38bdca312f8b47dfa5f9372457e4afaba89e1d74d88b25c50a520dd4
