A simple CLI tool to download PubMed articles without the complexity of EDirect

ppget

A simple CLI tool to easily download PubMed articles

Japanese README | English

ppget is a command-line tool for searching and downloading literature data from PubMed. Unlike EDirect, which requires a complex setup, ppget can be used immediately.

✨ Features

  • 🚀 No installation required - Run instantly with uvx
  • 📝 CSV/JSON support - Easy to use in spreadsheets or programs
  • 🔍 Flexible search - Full support for PubMed search syntax (AND, OR, MeSH, etc.)
  • 📊 Automatic metadata - Automatically records search queries and timestamps
  • 🎯 Simple interface - Clear and intuitive options

🚀 Quick Start

Run without installation (Recommended)

If you have uv installed, uvx runs ppget directly, with no separate install step:

# Basic usage
uvx ppget "machine learning AND medicine"

# Specify number of results
uvx ppget "COVID-19 vaccine" -l 50

# Save as JSON
uvx ppget "cancer immunotherapy" -f json

Install and use

For frequent use, you can install it:

# Install with pip
pip install ppget

# Install with uv
uv tool install ppget

# Run
ppget "your search query"

📖 Usage

Basic usage

# Simple search (CSV format by default, up to 100 results)
ppget "diabetes treatment"

# Example output:
# Searching PubMed...
# Query: 'diabetes treatment'
# Max results: 100
# ✓ Found 100 articles
# ✓ Saved 100 articles to pubmed_20251018_143022.csv
# ✓ Metadata saved to pubmed_20251018_143022.meta.txt

Options

ppget [query] [options]

Required:
  query                 Search query

Options:
  -l, --limit          Maximum number of results (default: 100)
  -o, --output         Output file or directory
  -f, --format         Output format: csv or json (default: csv)
  -e, --email          Email address (for API rate limit relaxation)
  -q, --quiet          Suppress progress messages (errors only)
  -v, --version        Show version and exit
  -h, --help           Show help message

Advanced usage

1. Change number of results

# Retrieve up to 200 results
ppget "machine learning healthcare" -l 200

2. Specify output format

# Save as JSON
ppget "spine surgery" -f json

# Default is CSV (can be opened in Excel)
ppget "orthopedics" -f csv

3. Specify output file or directory

# Specify file path directly
ppget "cancer research" -o results/cancer_papers.csv

# Specify directory (filename is auto-generated)
ppget "neuroscience" -o ./data/

# Extension determines format
ppget "cardiology" -o heart_disease.json

4. Specify email address (API rate limit relaxation)

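NCBI asks API users to identify themselves with an email address so that it can contact them before restricting access. Supply yours with -e, especially for larger downloads: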

ppget "genomics" -e your.email@example.com -l 500

5. Use PubMed search syntax

# AND search
ppget "machine learning AND radiology"

# OR search
ppget "COVID-19 OR SARS-CoV-2"

# MeSH term search
ppget "Diabetes Mellitus[MeSH] AND Drug Therapy[MeSH]"

# Filter by year
ppget "cancer immunotherapy AND 2024[PDAT]"

# Search by author
ppget "Smith J[Author]"

# Complex search
ppget "(machine learning OR deep learning) AND (radiology OR imaging) AND 2023:2024[PDAT]"
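
When queries like the complex search above are assembled in a script, a small helper can keep the parentheses and field tags consistent. The following is a minimal sketch only: the function names any_of and build_query are hypothetical and are not part of ppget, and the output is simply a string to pass as the query argument.

```python
def any_of(terms):
    """Join alternative terms with OR and wrap them in parentheses."""
    terms = list(terms)
    if len(terms) == 1:
        return terms[0]
    return "(" + " OR ".join(terms) + ")"


def build_query(groups, years=None):
    """Combine OR-groups with AND and optionally add a [PDAT] year filter.

    groups: list of lists of search terms; terms within a group are alternatives.
    years:  optional (start, end) tuple, e.g. (2023, 2024).
    """
    parts = [any_of(group) for group in groups if group]
    if years is not None:
        start, end = years
        # A single year is written as 2024[PDAT], a range as 2023:2024[PDAT].
        parts.append(f"{start}[PDAT]" if start == end else f"{start}:{end}[PDAT]")
    return " AND ".join(parts)


query = build_query(
    [["machine learning", "deep learning"], ["radiology", "imaging"]],
    years=(2023, 2024),
)
print(query)
```

The printed string reproduces the complex search shown above and can be passed to ppget as a single quoted argument.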

📁 Output Format

CSV format (default)

Easy to open in spreadsheets. A metadata file (.meta.txt) is also generated.

pubmed_20251018_143022.csv          # Article data
pubmed_20251018_143022.meta.txt     # Search metadata

CSV columns:

  • pubmed_id - PubMed ID
  • title - Title
  • abstract - Abstract
  • journal - Journal name
  • publication_date - Publication date
  • doi - DOI
  • authors - Author list (semicolon-separated)
  • keywords - Keywords (semicolon-separated)
  • conclusions - Conclusions
  • methods - Methods
  • results - Results
  • copyrights - Copyright information
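
The columns above can also be read with Python's standard csv module. The sketch below assumes the file has a header row using the column names listed and that authors and keywords are separated by semicolons (optionally followed by a space); the helper names split_multi and read_articles are hypothetical, and the sample row is made up for illustration.

```python
import csv
import io


def split_multi(value):
    """Split a semicolon-separated cell into a list, dropping empty items."""
    return [item.strip() for item in (value or "").split(";") if item.strip()]


def read_articles(handle):
    """Yield one dict per CSV row, with authors and keywords as lists."""
    for row in csv.DictReader(handle):
        row["authors"] = split_multi(row.get("authors"))
        row["keywords"] = split_multi(row.get("keywords"))
        yield row


# Tiny made-up sample standing in for a real ppget CSV file.
sample = io.StringIO(
    "pubmed_id,title,authors,keywords\n"
    '12345678,"Example title","Doe J; Roe A","alpha;beta"\n'
)
articles = list(read_articles(sample))
print(articles[0]["authors"])
```

For a real file, pass an open file handle instead of the sample, for example one opened with newline="" as the csv module recommends.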

JSON format

Easy to process programmatically.

[
  {
    "pubmed_id": "12345678",
    "title": "...",
    "abstract": "...",
    ...
  }
]

Metadata file (.meta.txt):

Query: machine learning
Search Date: 2025-10-18 14:30:22
Retrieved Results: 100
Data File: pubmed_20251018_143022.json
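
Because the metadata file is a short list of "Key: value" lines, it is easy to read back in a script. This is a minimal sketch assuming the file contains only lines in the form shown above; parse_metadata is a hypothetical helper, not something ppget provides. It splits on the first colon only, so values that contain colons (timestamps, or queries such as 2023:2024[PDAT]) stay intact.

```python
from datetime import datetime


def parse_metadata(text):
    """Parse 'Key: value' lines into a dict, splitting on the first colon only."""
    meta = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip blank lines and lines without a colon
            meta[key.strip()] = value.strip()
    return meta


sample = """Query: machine learning
Search Date: 2025-10-18 14:30:22
Retrieved Results: 100
Data File: pubmed_20251018_143022.json
"""
meta = parse_metadata(sample)

# The timestamp format is taken from the example above.
searched_at = datetime.strptime(meta["Search Date"], "%Y-%m-%d %H:%M:%S")
print(meta["Query"], searched_at.year, int(meta["Retrieved Results"]))
```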

🆚 Comparison with EDirect

Feature          ppget                            EDirect
Installation     Not required (uvx instant run)   Complex setup required
Ease of use      Single command                   Multiple command combinations
Output format    CSV/JSON                         XML/Text
Metadata         Automatic                        Manual management
Learning curve   Low                              High

EDirect example (complex)

# Search with EDirect (multiple steps required)
esearch -db pubmed -query "machine learning" | \
efetch -format xml | \
xtract -pattern PubmedArticle -element MedlineCitation/PMID,ArticleTitle

ppget example (simple)

# With ppget, just one command
ppget "machine learning"

💡 Use Cases

Collecting research papers

# Collect latest papers on a specific topic
ppget "CRISPR gene editing" -l 100 -o crispr_papers.csv

# Run several searches one after another
ppget "diabetes treatment 2024[PDAT]" -o diabetes_2024.csv
ppget "cancer immunotherapy 2024[PDAT]" -o cancer_2024.csv
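
For a longer list of topics, the same commands can be driven from a short Python script. The sketch below is one possible approach, assuming ppget is installed and on your PATH and that it exits with a non-zero status on failure (not confirmed here); build_command and run_searches are hypothetical helper names. Only the -l and -o flags from the Options section are used.

```python
import subprocess


def build_command(query, output, limit=100):
    """Return the ppget argument list for one search."""
    return ["ppget", query, "-l", str(limit), "-o", output]


def run_searches(jobs, runner=subprocess.run):
    """Run each (query, output) pair in turn; stop on the first failure."""
    for query, output in jobs:
        # check=True raises CalledProcessError if the command exits non-zero.
        runner(build_command(query, output), check=True)


jobs = [
    ("diabetes treatment 2024[PDAT]", "diabetes_2024.csv"),
    ("cancer immunotherapy 2024[PDAT]", "cancer_2024.csv"),
]
print(build_command(*jobs[0]))
```

Calling run_searches(jobs) then executes the searches in order. Passing the arguments as a list avoids shell quoting problems with brackets and spaces in queries.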

For data analysis

# Retrieve in JSON format and analyze with Python
ppget "artificial intelligence healthcare" -f json -l 500 -o ai_health.json

# Example Python code to read the file
import json

with open('ai_health.json') as f:
    data = json.load(f)  # a list of article records

# Analysis...
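
As one example of what the analysis step might look like, the sketch below counts articles per journal. It assumes the JSON records use the same field names as the CSV columns (in particular journal), which the JSON sample above does not show explicitly; top_journals is a hypothetical helper and the records are made up.

```python
from collections import Counter


def top_journals(articles, n=5):
    """Return the n most common journal names among the loaded articles."""
    counts = Counter(a.get("journal") or "(unknown)" for a in articles)
    return counts.most_common(n)


# Made-up records; real data would come from json.load() on a ppget JSON file.
data = [
    {"pubmed_id": "1", "journal": "Journal A"},
    {"pubmed_id": "2", "journal": "Journal B"},
    {"pubmed_id": "3", "journal": "Journal A"},
]
print(top_journals(data))
```

Records without a journal value are grouped under "(unknown)" rather than raising an error.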

Literature review

# Retrieve in CSV and manage in Excel
ppget "systematic review AND meta-analysis" -l 200 -o reviews.csv

# → Open in Excel and review titles and abstracts

🤝 Contributing

Bug reports and feature requests are welcome on the Issues page.

📄 License

MIT License - See LICENSE for details.

🙏 Acknowledgments

This tool uses pymed-paperscraper.


Start searching PubMed easily and quickly!

uvx ppget "your research topic"
