Reliably scrape and clean Google Scholar citations and automatically upload them to Zotero. Local BibTeX file exports are also supported.
pyserpZotero
Installation:
pip install pyserpzotero
Usage:
psz
| Resource | URL |
|---|---|
| Docs | https://pyserpzotero.readthedocs.io |
| GitHub Repo | https://github.com/hack-r/pyserpZotero |
| PyPI Package | https://pypi.org/project/pyserpZotero/ |
| SerpAPI | https://serpAPI.com |
| Zotero | https://zotero.org |
How to configure it?
arXiv, bioRxiv, and medRxiv do not require configuration, although they can be disabled in the config file. You'll need to provide API keys for SerpAPI and Zotero, as well as a Zotero library ID. You can pass these directly as arguments to the functions, enter them in interactive mode, or manage them more securely in a YAML configuration file, as shown in the Configuration section below.
Key Features
- GUI and CLI Interfaces: Offers both a user-friendly graphical interface and a command-line interface for flexibility.
- Automated Search: Search Google Scholar and preprint servers like arXiv, medRxiv, and bioRxiv.
- Citation Management: Automatically fetches citations and uploads them to your Zotero library.
- PDF Attachment: Attempts to find and download free PDFs of the articles and attach them to the Zotero entries.
- Duplicate Avoidance: Checks your Zotero library to avoid adding duplicate entries.
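The duplicate check described above can be pictured with a short sketch. This is not pyserpZotero's actual implementation: the function names, the dictionary keys (`doi`, `title`), and the matching rule (same DOI or same normalized title) are all assumptions made for illustration.

```python
import re


def normalize_title(title: str) -> str:
    """Lowercase a title and collapse punctuation and whitespace (hypothetical helper)."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def filter_new_entries(candidates, library):
    """Return the candidates whose DOI and normalized title are not already known.

    Both arguments are lists of dicts with optional "doi" and "title" keys.
    This structure is assumed for the example, not taken from the package.
    """
    seen_dois = {e["doi"].lower() for e in library if e.get("doi")}
    seen_titles = {normalize_title(e["title"]) for e in library if e.get("title")}

    fresh = []
    for entry in candidates:
        doi = (entry.get("doi") or "").lower()
        title = normalize_title(entry.get("title") or "")
        if (doi and doi in seen_dois) or (title and title in seen_titles):
            continue  # already in the library, or repeated earlier in this batch
        fresh.append(entry)
        if doi:
            seen_dois.add(doi)
        if title:
            seen_titles.add(title)
    return fresh
```

Matching on a normalized title as well as the DOI catches entries that lack a DOI, at the cost of occasionally merging two different works that share a title.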
Installation
pip install pyserpZotero
Ensure you have the following dependencies installed:
- Python 3.7 or higher
- Required Python packages will be installed automatically via pip.
Usage
Graphical User Interface (GUI)
Starting with version 1.2, pyserpZotero includes a GUI application for an improved user experience.
Launching the GUI
After installation, you can start the GUI by running:
pyserpZotero_gui
Features of the GUI
- Easy Configuration: Input your SerpAPI and Zotero credentials directly in the GUI.
- Multiple Search Terms: Enter multiple search terms separated by semicolons (;).
- Progress Monitoring: View real-time progress of your searches and downloads.
- Log Viewer: See detailed logs of the operations being performed.
- PDF Viewer: Open and view downloaded PDFs directly from the application.
Command-Line Interface (CLI)
For users who prefer the command line, pyserpZotero still offers a robust CLI.
Starting the CLI
Simply run:
psz
Interactive Mode
The CLI will prompt you for:
- SerpAPI Key
- Zotero Library ID
- Zotero API Key
- Download preferences
- Search terms
Example Usage
Enter your SerpAPI API key:
Enter your Zotero library ID:
Enter your Zotero API key:
Enter download destination path (leave empty for current directory):
Do you want to download your citation library to avoid duplicating entries? [Y/n]:
Do you want to download PDFs? [Y/n]:
Enter the oldest year to search from (leave empty if none):
Enter the max number of searches you would like to do (leave empty for default value of 50):
Enter up to 20 search phrases separated by semi-colon(;): Cancer Research; Humanoid Robot; DNA mutation
Configuration
You can provide your API keys and preferences either during the interactive prompts or by editing the config.yaml file created in your current directory.
config.yaml Example
SERP_API_KEY: your_serpapi_key
ZOT_ID: your_zotero_library_id
ZOT_KEY: your_zotero_api_key
DOWNLOAD_DEST: ./downloads
ENABLE_LIB_DOWNLOAD: true
ENABLE_PDF_DOWNLOAD: true
NO_SERP: false
NO_ARXIV: false
NO_BIOARXIV: false
NO_MEDARXIV: false
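Before launching a search, it can help to confirm that the three required credentials are present. The sketch below is a standalone illustration, not part of pyserpZotero: `parse_flat_config` and `missing_credentials` are hypothetical names, and the tiny parser only handles flat `KEY: value` lines like the example above. For real YAML files, PyYAML's `yaml.safe_load` is the more robust choice.

```python
REQUIRED_KEYS = ("SERP_API_KEY", "ZOT_ID", "ZOT_KEY")


def parse_flat_config(text: str) -> dict:
    """Parse flat `KEY: value` lines into a dict (assumed format, not full YAML)."""
    config = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue  # skip blanks, comments, and lines without a key
        key, value = line.split(":", 1)
        value = value.strip()
        if value.lower() in ("true", "false"):
            config[key.strip()] = value.lower() == "true"  # booleans
        else:
            config[key.strip()] = value  # everything else stays a string
    return config


def missing_credentials(config: dict) -> list:
    """List the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not config.get(key)]


if __name__ == "__main__":
    example = "SERP_API_KEY: your_serpapi_key\nZOT_ID: your_zotero_library_id\nZOT_KEY:\n"
    print(missing_credentials(parse_flat_config(example)))
```

Keeping the keys in config.yaml rather than in scripts makes it easier to exclude them from version control.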
Advanced Features
Skipping Specific Platforms
You can configure the application to skip searching specific platforms by setting the following options in your config.yaml:
- NO_SERP: true - Skip searching using SerpAPI (Google Scholar).
- NO_ARXIV: true - Skip searching on arXiv.
- NO_BIOARXIV: true - Skip searching on bioRxiv.
- NO_MEDARXIV: true - Skip searching on medRxiv.
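As a rough illustration of how these flags combine, the sketch below maps an already-loaded configuration dictionary to the platforms that would still be searched. The function name and the assumption that a missing flag means "do not skip" are illustrative and not taken from the package's code.

```python
# Flag names come from config.yaml; the display names are for illustration only.
SKIP_FLAGS = {
    "NO_SERP": "Google Scholar (SerpAPI)",
    "NO_ARXIV": "arXiv",
    "NO_BIOARXIV": "bioRxiv",
    "NO_MEDARXIV": "medRxiv",
}


def enabled_platforms(config: dict) -> list:
    """Return the platforms whose skip flag is missing or false."""
    return [name for flag, name in SKIP_FLAGS.items() if not config.get(flag, False)]


if __name__ == "__main__":
    # Skip Google Scholar and bioRxiv, keep the other two.
    print(enabled_platforms({"NO_SERP": True, "NO_BIOARXIV": True}))
```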
Multiple Queries
You can add multiple queries separated by semicolons (;). The application will process each query sequentially.
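A semicolon-separated input can be split into individual queries along these lines. This is a minimal sketch rather than the application's own parser: `split_queries` is a hypothetical name, and the limit of 20 phrases is taken from the interactive prompt shown in Example Usage.

```python
MAX_PHRASES = 20  # limit stated in the interactive prompt


def split_queries(raw: str, limit: int = MAX_PHRASES) -> list:
    """Split a semicolon-separated string into trimmed, non-empty search phrases."""
    phrases = [part.strip() for part in raw.split(";")]
    phrases = [part for part in phrases if part]  # drop empty segments such as ";;"
    if len(phrases) > limit:
        raise ValueError(f"Expected at most {limit} search phrases, got {len(phrases)}")
    return phrases


if __name__ == "__main__":
    for query in split_queries("Cancer Research; Humanoid Robot; DNA mutation"):
        print(query)  # each phrase would be searched in turn
```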
PDF Viewer
The GUI includes a built-in PDF viewer. After downloading PDFs, you can open and view them directly from the application.
Dependencies
Make sure to have the following packages installed:
- ttkbootstrap: For enhanced GUI styling.
pip install ttkbootstrap
- PyMuPDF: For PDF viewing functionality.
pip install PyMuPDF
Why use SerpAPI?
SerpAPI provides stable access to Google Scholar without IP throttling, ensuring that your searches are reliable and uninterrupted.
How do I obtain API keys?
- SerpAPI Key: Sign up at SerpAPI to get your API key.
- Zotero API Key: Log in to your Zotero account and navigate to API Settings to create a new key.
Is there a free tier for SerpAPI?
Yes, SerpAPI offers a free tier which currently allows for 100 searches per month.
Why do you sometimes align assignment operators across lines like that?
It's a readability practice carried over from R programming and its major style guides, not from PEP 8.
Contributing
Contributions and forks are welcome! Please see the GitHub repository for contribution guidelines.
License
This project is licensed under the MIT License.