getpaper - papers download made easy!

getpaper

Paper downloader

Getting started

Install the library with:

pip install getpaper

Usage

Downloading papers

After installation, you can either import the library into your Python code or use the console scripts, for example:

download download download_pubmed --pubmed 22266545 --folder papers --name pmid

This downloads the paper with PubMed ID 22266545 into the folder papers and uses the PubMed ID as the file name.

download download download_doi --doi 10.1519/JSC.0b013e318225bbae --folder papers

This downloads the paper with the given DOI into the folder papers. Because --name is not specified, the DOI is used as the name.
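If you need to fetch several papers, one option is to generate the command line for each PubMed ID from Python. The sketch below only builds the argument list; it does not call getpaper itself. The helper name build_download_command is hypothetical, and the command prefix is copied verbatim from the example above, so adjust it if your installation exposes the script under a different name.

```python
import shlex
from typing import List, Optional


def build_download_command(
    pubmed_id: str, folder: str = "papers", name: Optional[str] = "pmid"
) -> List[str]:
    """Build an argument list that mirrors the download_pubmed example.

    Hypothetical helper: it only assembles strings and knows nothing
    about getpaper internals.
    """
    # Prefix and flags copied from the README example (assumed to match your install).
    command = [
        "download", "download", "download_pubmed",
        "--pubmed", str(pubmed_id),
        "--folder", folder,
    ]
    # --name is optional in the README examples, so it is only added when given.
    if name is not None:
        command += ["--name", name]
    return command


# Render one shell-safe command line per PubMed ID.
for pmid in ["22266545"]:
    print(shlex.join(build_download_command(pmid)))
```

The resulting list can be passed to subprocess.run if you prefer to launch the downloads from Python rather than pasting the printed lines into a shell.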

Parsing the papers

You can parse the downloaded papers with the unstructured library. For example, if the papers are in the folder test, you can run:

getpaper/parse.py parse_folder --folder /home/antonkulaga/sources/getpaper/test

You can also parse papers one file at a time, for example:

getpaper/parse.py parse_paper --paper /home/antonkulaga/sources/getpaper/test/22266545.pdf
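Per-file parsing can be handy when you want to retry or skip individual papers. As a rough sketch, the snippet below lists the PDF files in a folder and builds one parse_paper command per file. The helpers list_pdfs and build_parse_commands are hypothetical and not part of getpaper; the script path and the --paper flag are taken from the example above.

```python
from pathlib import Path
from typing import List


def list_pdfs(folder: str) -> List[Path]:
    """Return the PDF files directly inside `folder`, sorted by name."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


def build_parse_commands(folder: str) -> List[List[str]]:
    """Build one parse_paper argument list per PDF (hypothetical helper)."""
    # Script path and flag copied from the README example above.
    return [
        ["getpaper/parse.py", "parse_paper", "--paper", str(pdf)]
        for pdf in list_pdfs(folder)
    ]
```

Each argument list can then be run on its own, which makes it easy to log which file failed without rerunning the whole folder.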

Additional requirements

Detectron2 is required for using models from the layoutparser model zoo, but it is not installed automatically with this package. On macOS and Linux, build it from source with:

pip install 'git+https://github.com/facebookresearch/detectron2.git@e2ce8dc#egg=detectron2'
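Because Detectron2 is an optional extra, it can help to check whether it is present before running a parse. The sketch below uses only the Python standard library; the helper name is_installed is hypothetical, and the check only confirms that a module called detectron2 can be found, not that it was built correctly.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if not is_installed("detectron2"):
    # Only a hint; parsing that does not rely on layoutparser models may still work.
    print("detectron2 not found: install it with the pip command above.")
```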
