pubmed-id
Simple interface to query or scrape IDs from PubMed (The US National Library of Medicine).
This tool was originally developed to obtain temporal data for the well-known PubMed graph dataset.
Usage
Command line interface
A CLI is included for querying PubMed, either via its API or by web scraping.
usage: pubmed-id [-h] [-o OUTPUT_FILE] [-m {api,citedin,refs,scrape}]
                 [-w WORKERS] [-c SIZE] [--email ADDRESS] [--tool NAME]
                 [--quiet] [--log-level {critical,error,warning,info,debug}]
                 ID [ID ...]

positional arguments:
  ID                    IDs to query (separated by whitespaces).

optional arguments:
  -h, --help            show this help message and exit
  -o OUTPUT_FILE, --output-file OUTPUT_FILE
                        File to write results to (default: 'PubMedAPI.json').
  -m {api,citedin,refs,scrape}, --method {api,citedin,refs,scrape}
                        Method to obtain data with (default: 'api').
  -w WORKERS, --max-workers WORKERS
                        Number of processes to use (optional).
  -c SIZE, --chunksize SIZE
                        Number of objects sent to each worker (optional).
  --email ADDRESS       Your e-mail address (required to query API only).
  --tool NAME           Tool name (optional, used to query API only).
  --quiet               Does not print results (limited to a single item only
                        by default).
  --log-level {critical,error,warning,info,debug}
                        Logging level (optional).
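To make the defaults and choices in the help text concrete, here is a minimal argparse sketch that mirrors the documented options. It is reconstructed from the help text only and is not the package's actual source; the `build_parser` name and the integer types for WORKERS and SIZE are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Mirror the documented CLI options (illustrative reconstruction)."""
    parser = argparse.ArgumentParser(prog="pubmed-id")
    parser.add_argument("ids", metavar="ID", nargs="+", help="IDs to query.")
    parser.add_argument("-o", "--output-file", default="PubMedAPI.json")
    parser.add_argument("-m", "--method", default="api",
                        choices=["api", "citedin", "refs", "scrape"])
    # Assumption: worker count and chunk size are integers.
    parser.add_argument("-w", "--max-workers", type=int, metavar="WORKERS")
    parser.add_argument("-c", "--chunksize", type=int, metavar="SIZE")
    parser.add_argument("--email", metavar="ADDRESS")
    parser.add_argument("--tool", metavar="NAME")
    parser.add_argument("--quiet", action="store_true")
    parser.add_argument("--log-level",
                        choices=["critical", "error", "warning", "info", "debug"])
    return parser


# Equivalent to: pubmed-id 6798965 15356126 -m scrape -o papers.json
args = build_parser().parse_args(
    ["6798965", "15356126", "-m", "scrape", "-o", "papers.json"]
)
print(args.ids, args.method, args.output_file)
```

Several IDs can be passed in one invocation, and any option left out falls back to the default listed in the help text.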
Importing as a class
A quick example of how to set up the class for querying the API:
>>> from pubmed_id import PubMedAPI
>>> api = PubMedAPI(email="myemail@domain.com", tool="MyToolName")
For more information on the API, please check the official documentation.
Obtain data from API
By default, the returned data is a dictionary with the PMCID, the PMID, and the DOI of a paper:
>>> api(6798965)
{
    "pmcid": "PMC1163140",
    "pmid": "6798965",
    "doi": "10.1042/bj1970405"
}
Either an integer (PMID), a string (PMID or PMCID), or a list is accepted as input when calling the class directly.
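If IDs arrive in mixed forms from different sources, it can help to normalize them before calling the class. The sketch below is a hypothetical helper, not part of pubmed-id: `normalize_ids` and `is_pmcid` are names invented for illustration, and the check relies only on the "PMC" prefix seen in the example above.

```python
from typing import Iterable, List, Union

IdLike = Union[int, str]


def normalize_ids(ids: Union[IdLike, Iterable[IdLike]]) -> List[str]:
    """Return a single ID or an iterable of IDs as a flat list of strings."""
    if isinstance(ids, (int, str)):
        ids = [ids]
    return [str(i).strip() for i in ids]


def is_pmcid(identifier: str) -> bool:
    """Treat IDs with a 'PMC' prefix as PMCIDs; anything else as a PMID."""
    return identifier.upper().startswith("PMC")


ids = normalize_ids(["PMC1163140", 6798965])
print([(i, "pmcid" if is_pmcid(i) else "pmid") for i in ids])
```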
Note: NCBI recommends that users post no more than three URL requests per second and limit large jobs to either weekends or between 9:00 PM and 5:00 AM Eastern time during weekdays. See more: Usage Guidelines.
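One way to stay within the three-requests-per-second recommendation is to space calls out on the client side. The following is a minimal sketch under that assumption; the `Throttle` class is hypothetical and not provided by pubmed-id, and it only paces a single process, so it does not account for multiple workers.

```python
import time
from typing import Callable, Optional


class Throttle:
    """Enforce a minimum interval between calls (default: 3 per second)."""

    def __init__(self, max_per_second: float = 3.0,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # time of the previous call

    def wait(self) -> float:
        """Sleep just long enough to respect the interval; return the delay."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.min_interval - (now - self._last))
            if delay > 0:
                self._sleep(delay)
        self._last = now + delay
        return delay


throttle = Throttle()
for pmid in ["6798965", "15356126"]:
    throttle.wait()
    # result = api(pmid)  # the actual request would go here
```

The clock and sleep functions are injectable so the pacing logic can be checked without real delays.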
Scrape data from website
Scraping the PMID or PMCID instead returns more data (strings shortened for brevity):
>>> api(6798965, method="scrape")
{
    "6798965": {
        "date": "1981 Aug 1",
        "title": "Characterization of N-glycosylated...",
        "abstract": "The N epsilon-glycosylation of...",
        "author_names": "A Le Pape;J P Muh;A J Bailey",
        "author_ids": "6798965;6798965;6798965",
        "doi": "PMC1163140",
        "pmid": "6798965"
    }
}
Note: some papers are unavailable from the API, but still return data when scraped, e.g., PMID 15356126.
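Since the scraped record stores the date and author list as plain strings, a small post-processing step is usually needed to get temporal data out of it. The sketch below assumes the date always follows the "YYYY Mon D" form shown in the example and that author names are separated by semicolons; other formats may occur and would need extra handling. `parse_pub_date` is a hypothetical helper, not part of the package.

```python
from datetime import date, datetime

# Fields copied from the scraped example above.
record = {
    "date": "1981 Aug 1",
    "author_names": "A Le Pape;J P Muh;A J Bailey",
    "pmid": "6798965",
}


def parse_pub_date(text: str) -> date:
    """Parse a 'YYYY Mon D' string such as '1981 Aug 1' (English month names)."""
    return datetime.strptime(text, "%Y %b %d").date()


authors = record["author_names"].split(";")
published = parse_pub_date(record["date"])
print(published.isoformat(), authors)
```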
Get paper references
Returns the list of references cited by a paper:
>>> api(6798965, method="refs")
{
    "6798965": [
        "7430347",
        "..."
    ]
}
Get citations for a paper
Returns the list of papers that cite a paper:
>>> api(6798965, method="citedin")
{
    "15356126": [
        "32868408",
        "..."
    ]
}
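Both the "refs" and "citedin" results map a paper to a list of related IDs, so they can be flattened into directed (citing, cited) edges for a citation graph. This is a sketch of one possible post-processing step, assuming the dictionary shapes shown in the examples; the function names are hypothetical and the sample data reuses only the IDs that appear above.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]  # (citing paper, cited paper)


def refs_to_edges(refs: Dict[str, List[str]]) -> List[Edge]:
    """Each key cites every ID in its list."""
    return [(src, dst) for src, targets in refs.items() for dst in targets]


def citedin_to_edges(citedin: Dict[str, List[str]]) -> List[Edge]:
    """Each key is cited by every ID in its list."""
    return [(src, dst) for dst, sources in citedin.items() for src in sources]


refs = {"6798965": ["7430347"]}
citedin = {"15356126": ["32868408"]}

edges = refs_to_edges(refs) + citedin_to_edges(citedin)
print(edges)
```

Keeping the edge direction consistent across both methods means the two result types can be merged into a single edge list without further bookkeeping.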