
The Literature mining and processing utility


# Combined Data Mining Utility

This script combines several data mining tasks for PubMed articles in a single utility: searching PubMed, retrieving abstracts, downloading full texts, processing PubMed IDs, crawling URLs, removing duplicates, and converting PDFs to text files, along with a few additional utilities.

## Getting Started

### Prerequisites

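Ensure you have Python installed on your system. The script is described as compatible with both Python 2 and 3, but the wheel published on PyPI is tagged for Python 3 only (py3-none-any), so Python 3 is the safer choice.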

### Installation

Install the package from PyPI with pip:

pip install scholarsync

pip may print a warning like this one:

WARNING: The script scholarsync is installed in '/home/username/.local/bin' which is not on PATH.

If it does, add that directory to your PATH. In a Bash shell, open a terminal and run the following command:

echo 'export PATH="$PATH:/home/username/.local/bin"' >> ~/.bashrc && source ~/.bashrc

Replace username with your actual username.
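
To confirm that the PATH change took effect, a small check along these lines may help. This is a minimal sketch, not part of scholarsync: the helper name `dir_on_path` is hypothetical, and `~/.local/bin` is assumed to be the directory named in pip's warning.

```python
import os
import shutil


def dir_on_path(directory, path_value=None):
    """Return True if `directory` is one of the entries of a PATH-style string.

    Hypothetical helper for illustration; it is not part of scholarsync.
    """
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    target = os.path.normpath(os.path.expanduser(directory))
    entries = [
        os.path.normpath(os.path.expanduser(entry))
        for entry in path_value.split(os.pathsep)
        if entry  # skip empty entries such as a trailing separator
    ]
    return target in entries


if __name__ == "__main__":
    # Assumption: pip installed the script into the per-user bin directory.
    user_bin = "~/.local/bin"
    print("user bin directory on PATH:", dir_on_path(user_bin))
    # shutil.which returns the full path of the command, or None if not found.
    print("scholarsync resolved to:", shutil.which("scholarsync"))
```

If `shutil.which` prints `None`, the shell cannot find the command yet. Opening a new terminal or re-running `source ~/.bashrc` usually picks up the change.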

## Usage

Running the script opens a menu from which you select the task to perform. The available options are:

  • PubMed search/query

  • Get abstracts from PubMed IDs

  • Attempt full text download from PubMed

  • Process PubMed IDs to get DOI for 3rd party search

  • URL transformation of PubMed IDs for 3rd party search

  • Crawling and downloading from URLs

  • Removing duplicates

  • Converting PDFs to text files

  • Additional utilities

Follow the on-screen instructions to navigate through the menu and execute the desired tasks.
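
For readers who want a feel for what the search, abstract, and duplicate-removal options involve, the sketch below builds request URLs for the public NCBI E-utilities API and deduplicates a list of IDs. It is an illustration under stated assumptions, not scholarsync's actual implementation: the function names are hypothetical, the PubMed IDs are made-up placeholders, and nothing here confirms that the package uses E-utilities internally.

```python
from urllib.parse import urlencode

# Base URL of the public NCBI E-utilities API.
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(term, retmax=20):
    """Build an ESearch URL that asks for PubMed IDs matching `term` as JSON."""
    params = {"db": "pubmed", "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS}/esearch.fcgi?{urlencode(params)}"


def efetch_abstracts_url(pmids):
    """Build an EFetch URL that asks for plain-text abstracts of `pmids`."""
    params = {
        "db": "pubmed",
        "id": ",".join(str(p) for p in pmids),
        "rettype": "abstract",
        "retmode": "text",
    }
    return f"{EUTILS}/efetch.fcgi?{urlencode(params)}"


def dedupe(pmids):
    """Drop repeated IDs, ignoring surrounding whitespace, keeping first-seen order."""
    return list(dict.fromkeys(str(p).strip() for p in pmids))


if __name__ == "__main__":
    # Placeholder IDs for illustration only; they do not refer to real articles.
    ids = dedupe(["12345", "12345 ", "67890"])
    print(esearch_url("asthma treatment", retmax=5))
    print(efetch_abstracts_url(ids))
```

The sketch only constructs URLs, so it runs without network access. Fetching them, for example with `urllib.request.urlopen`, would return the ID list and the abstracts respectively.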

## License

By using this script, you agree to the terms of the LICENSE included in the repository.

## Contributing

Contributions are welcome! Feel free to submit pull requests or open issues for any improvements or bug fixes.

## Acknowledgments

  • This script was developed to simplify various data mining tasks related to PubMed articles.

  • Special thanks to the developers and contributors of the libraries used in this script.

Download files


  • Source distribution: scholarsync-0.0.3.0.tar.gz (30.0 kB)

  • Built distribution: scholarsync-0.0.3.0-py3-none-any.whl (31.5 kB, Python 3)
