
Reliably scrape and clean Google Scholar citations and upload them to Zotero automatically. Local BibTeX file exports are also supported.


pyserpZotero

Installation: pip install pyserpzotero

Usage: psz

Resources

  • Docs: https://pyserpzotero.readthedocs.io
  • GitHub Repo: https://github.com/hack-r/pyserpZotero
  • PyPI Package: https://pypi.org/project/pyserpZotero/
  • SerpAPI: https://serpAPI.com
  • Zotero: https://zotero.org

How to configure it?

arXiv, bioRxiv, and medRxiv do not require configuration, although they can be disabled in the config file. You'll need to provide API keys for SerpAPI and Zotero, as well as a Zotero library ID. You can pass these directly as arguments to the functions, enter them in interactive mode, or manage them more securely in a YAML configuration file, as in the config.yaml Example below.

Key Features

  • GUI and CLI Interfaces: Offers both a user-friendly graphical interface and a command-line interface for flexibility.
  • Automated Search: Search Google Scholar and preprint servers like arXiv, medRxiv, and bioRxiv.
  • Citation Management: Automatically fetches citations and uploads them to your Zotero library.
  • PDF Attachment: Attempts to find and download free PDFs of the articles and attach them to the Zotero entries.
  • Duplicate Avoidance: Checks your Zotero library to avoid adding duplicate entries.

Installation

pip install pyserpZotero

Requirements:

  • Python 3.7 or higher
  • The required Python packages, which pip installs automatically.

Usage

Graphical User Interface (GUI)

Starting with version 1.2, pyserpZotero includes a GUI application for an improved user experience.

Launching the GUI

After installation, you can start the GUI by running:

pyserpZotero_gui

Features of the GUI

  • Easy Configuration: Input your SerpAPI and Zotero credentials directly in the GUI.
  • Multiple Search Terms: Enter multiple search terms separated by semicolons (;).
  • Progress Monitoring: View real-time progress of your searches and downloads.
  • Log Viewer: See detailed logs of the operations being performed.
  • PDF Viewer: Open and view downloaded PDFs directly from the application.

Command-Line Interface (CLI)

For users who prefer the command line, pyserpZotero still offers a robust CLI.

Starting the CLI

Simply run:

psz

Interactive Mode

The CLI will prompt you for:

  • SerpAPI Key
  • Zotero Library ID
  • Zotero API Key
  • Download preferences
  • Search terms

Example Usage

Enter your SerpAPI API key:
Enter your Zotero library ID:
Enter your Zotero API key:
Enter download destination path (leave empty for current directory):
Do you want to download your citation library to avoid duplicating entries? [Y/n]:
Do you want to download PDFs? [Y/n]:
Enter the oldest year to search from (leave empty if none):
Enter the max number of searches you would like to do (leave empty for default value of 50):
Enter up to 20 search phrases separated by semi-colon(;): Cancer Research; Humanoid Robot; DNA mutation

Configuration

You can provide your API keys and preferences either during the interactive prompts or by editing the config.yaml file created in your current directory.

config.yaml Example

SERP_API_KEY: your_serpapi_key
ZOT_ID: your_zotero_library_id
ZOT_KEY: your_zotero_api_key
DOWNLOAD_DEST: ./downloads
ENABLE_LIB_DOWNLOAD: true
ENABLE_PDF_DOWNLOAD: true
NO_SERP: false
NO_ARXIV: false
NO_BIOARXIV: false
NO_MEDARXIV: false
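
As a rough illustration of how these settings might be read from Python, the sketch below parses a flat config.yaml like the one above and reports which of the three credentials are missing. It is not pyserpZotero's own loader: the helper names parse_flat_config, missing_credentials, and load_flat_config are hypothetical, and the parser handles only simple KEY: value lines so that it runs without extra packages. A full YAML parser such as PyYAML's yaml.safe_load would be the sturdier choice for anything more complex.

```python
from pathlib import Path

# Credential keys taken from the config.yaml example above.
REQUIRED_KEYS = ("SERP_API_KEY", "ZOT_ID", "ZOT_KEY")


def parse_flat_config(text):
    """Parse simple 'KEY: value' lines into a dict (hypothetical helper).

    Only flat mappings are handled; 'true'/'false' become booleans and
    every other value stays a string.
    """
    config = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything without a key separator.
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        value = value.strip()
        if value.lower() in ("true", "false"):
            config[key.strip()] = value.lower() == "true"
        else:
            config[key.strip()] = value
    return config


def missing_credentials(config):
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not config.get(key)]


def load_flat_config(path="config.yaml"):
    """Read and parse a flat config file from disk."""
    return parse_flat_config(Path(path).read_text(encoding="utf-8"))
```

Checking for missing credentials before starting a search gives a clear error up front instead of a failed API call partway through a run.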

Advanced Features

Skipping Specific Platforms

You can configure the application to skip searching specific platforms by setting the following options in your config.yaml:

  • NO_SERP: true - Skip searching using SerpAPI (Google Scholar).
  • NO_ARXIV: true - Skip searching on arXiv.
  • NO_BIOARXIV: true - Skip searching on bioRxiv.
  • NO_MEDARXIV: true - Skip searching on medRxiv.
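
To make the effect of these flags concrete, here is a minimal sketch that turns a config dictionary into the list of sources that would still be searched. The flag names come from the options above; the function name enabled_sources and the source labels are illustrative assumptions, not part of pyserpZotero's API.

```python
# Flag names come from the config.yaml options listed above; the
# human-readable source labels are illustrative.
SKIP_FLAGS = {
    "NO_SERP": "Google Scholar (SerpAPI)",
    "NO_ARXIV": "arXiv",
    "NO_BIOARXIV": "bioRxiv",
    "NO_MEDARXIV": "medRxiv",
}


def enabled_sources(config):
    """Return the sources whose NO_* flag is missing or false."""
    return [name for flag, name in SKIP_FLAGS.items() if not config.get(flag, False)]


# Skip Google Scholar only; the preprint servers stay enabled.
print(enabled_sources({"NO_SERP": True, "NO_BIOARXIV": False}))
# prints ['arXiv', 'bioRxiv', 'medRxiv']
```

Treating a missing flag as false matches the example config, where every platform is searched unless it is explicitly switched off.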

Multiple Queries

You can add multiple queries separated by semicolons (;). The application will process each query sequentially.
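
A semicolon-separated input can be split into individual queries along these lines. This is a sketch rather than the application's actual parsing code: split_queries is a hypothetical name, and the default limit of 20 simply mirrors the CLI prompt shown in Example Usage.

```python
def split_queries(raw, limit=20):
    """Split a semicolon-separated string into clean search phrases.

    Whitespace is trimmed, empty entries are dropped, and at most
    `limit` phrases are kept (20 mirrors the CLI prompt; how the real
    tool enforces the limit is an assumption).
    """
    phrases = [part.strip() for part in raw.split(";")]
    return [phrase for phrase in phrases if phrase][:limit]


# Each phrase would then be searched in turn.
for query in split_queries("Cancer Research; Humanoid Robot; DNA mutation"):
    print(query)
```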

PDF Viewer

The GUI includes a built-in PDF viewer. After downloading PDFs, you can open and view them directly from the application.

Dependencies

Make sure to have the following packages installed:

  • ttkbootstrap: For enhanced GUI styling.
    pip install ttkbootstrap
    
  • PyMuPDF: For PDF viewing functionality.
    pip install PyMuPDF
    

Why use SerpAPI?

SerpAPI provides stable access to Google Scholar without IP throttling, ensuring that your searches are reliable and uninterrupted.

How do I obtain API keys?

  • SerpAPI Key: Sign up at SerpAPI to get your API key.
  • Zotero API Key: Log in to your Zotero account and navigate to API Settings to create a new key.

Is there a free tier for SerpAPI?

Yes, SerpAPI offers a free tier which currently allows for 100 searches per month.

Why do you sometimes align assignment operators across lines like that?

It's a readability convention from R programming, based on major R style guides rather than PEP 8.

Contributing

Contributions and forks are welcome! Please see the GitHub repository for contribution guidelines.

License

This project is licensed under the MIT License.
