Skip to main content

Search for author affiliations from Pubmed for a list of Pubmed IDs and DOIs

Project description

Command line tool the takes one or a list of Pubmed IDs or DOIs, searches Pubmed for corresponding author affiliations and outputs the information to file

Command line options

-h, --help Show help text
-i PUBMEDID, --pubmedid PUBMEDID
 Search for author affiliations for single Pubmed ID
-d doi, --doi doi
 Search for author affiliations for a single DOI
-f file, --infile file
 File with a list of Pubmed IDs and DOIs (they can be mixed). One entry per line.
-x format, --format format
 Output format. Choices=[‘json’ (default),’text’]. ‘text’ option produces tab separated table, denormalised in the sense that the pubmed ID/DOI is repeated on multiple rows if there are multiple authors with related affiliations.

Example runs

Pubmed input and JSON output:

python pubmedAuthorAffiliation.py -i 27863242

Output:

{'articleTitle': 'Decoding Mammalian Ribosome-mRNA States by Translational GTPase Complexes.', 'journalTitle': 'Cell', 'authorList': [{'firstName': 'n/a', 'institute': 'MRC-LMB', 'lastName': 'Shao', 'affiliation': 'MRC-LMB, Francis Crick Avenue, Cambridge CB2 0QH, UK.', 'country': 'UK', 'initials': 'S'}, {'firstName': 'n/a', 'institute': 'MRC-LMB', 'lastName': 'Murray', 'affiliation': 'MRC-LMB, Francis Crick Avenue, Cambridge CB2 0QH, UK.', 'country': 'UK', 'initials': 'J'}, {'firstName': 'n/a', 'institute': 'MRC-LMB', 'lastName': 'Brown', 'affiliation': 'MRC-LMB, Francis Crick Avenue, Cambridge CB2 0QH, UK.', 'country': 'UK', 'initials': 'A'}, {'firstName': 'n/a', 'institute': 'University of California', 'lastName': 'Taunton', 'affiliation': 'Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, CA 94158, USA.', 'country': 'USA', 'initials': 'J'}, {'firstName': 'n/a', 'institute': 'MRC-LMB', 'lastName': 'Ramakrishnan', 'affiliation': 'MRC-LMB, Francis Crick Avenue, Cambridge CB2 0QH, UK. Electronic address: ramak@mrc-lmb.cam.ac.uk.', 'country': 'UK', 'initials': 'V'}, {'firstName': 'n/a', 'institute': 'MRC-LMB', 'lastName': 'Hegde', 'affiliation': 'MRC-LMB, Francis Crick Avenue, Cambridge CB2 0QH, UK. Electronic address: rhegde@mrc-lmb.cam.ac.uk.', 'country': 'UK', 'initials': 'RS'}], 'pubmedId': '27863242', 'error': False}
{'articleTitle': 'Decoding Mammalian Ribosome-mRNA States by Translational GTPase Complexes.', 'journalTitle': 'Cell', 'authorList': [{'firstName': 'n/a', 'institute': 'MRC-LMB', 'lastName': 'Shao', 'affiliation': 'MRC-LMB, Francis Crick Avenue, Cambridge CB2 0QH, UK.', 'country': 'UK', 'initials': 'S'}, {'firstName': 'n/a', 'institute': 'MRC-LMB', 'lastName': 'Murray', 'affiliation': 'MRC-LMB, Francis Crick Avenue, Cambridge CB2 0QH, UK.', 'country': 'UK', 'initials': 'J'}, {'firstName': 'n/a', 'institute': 'MRC-LMB', 'lastName': 'Brown', 'affiliation': 'MRC-LMB, Francis Crick Avenue, Cambridge CB2 0QH, UK.', 'country': 'UK', 'initials': 'A'}, {'firstName': 'n/a', 'institute': 'University of California', 'lastName': 'Taunton', 'affiliation': 'Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, CA 94158, USA.', 'country': 'USA', 'initials': 'J'}, {'firstName': 'n/a', 'institute': 'MRC-LMB', 'lastName': 'Ramakrishnan', 'affiliation': 'MRC-LMB, Francis Crick Avenue, Cambridge CB2 0QH, UK. Electronic address: ramak@mrc-lmb.cam.ac.uk.', 'country': 'UK', 'initials': 'V'}, {'firstName': 'n/a', 'institute': 'MRC-LMB', 'lastName': 'Hegde', 'affiliation': 'MRC-LMB, Francis Crick Avenue, Cambridge CB2 0QH, UK. Electronic address: rhegde@mrc-lmb.cam.ac.uk.', 'country': 'UK', 'initials': 'RS'}], 'pubmedId': '27863242', 'error': False}

DOI input and text output:

python pubmedAuthorAffiliation.py -d 10.1016/j.molcel.2016.11.013 -x text

Output:

27867008    Molecular cell  Molecular Structures of Transcribing RNA Polymerase I.  n/a     L       Tafur   European Molecular Biology Laboratory (EMBL), Structural and Computational Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany.  Germany European Molecular Biology Laboratory (EMBL)
27867008    Molecular cell  Molecular Structures of Transcribing RNA Polymerase I.  n/a     Y       Sadian  European Molecular Biology Laboratory (EMBL), Structural and Computational Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany.  Germany European Molecular Biology Laboratory (EMBL)
27867008    Molecular cell  Molecular Structures of Transcribing RNA Polymerase I.  n/a     NA      Hoffmann        European Molecular Biology Laboratory (EMBL), Structural and Computational Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany.  Germany European Molecular Biology Laboratory (EMBL)
27867008    Molecular cell  Molecular Structures of Transcribing RNA Polymerase I.  n/a     AJ      Jakobi  European Molecular Biology Laboratory (EMBL), Structural and Computational Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany; European Molecular Biology Laboratory (EMBL), Hamburg Unit, Notkestrasse 85, 22607 Hamburg, Germany.     Germany European Molecular Biology Laboratory (EMBL)
27867008    Molecular cell  Molecular Structures of Transcribing RNA Polymerase I.  n/a     R       Wetzel  European Molecular Biology Laboratory (EMBL), Structural and Computational Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany.  Germany European Molecular Biology Laboratory (EMBL)
27867008    Molecular cell  Molecular Structures of Transcribing RNA Polymerase I.  n/a     WJH     Hagen   European Molecular Biology Laboratory (EMBL), Structural and Computational Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany.  Germany European Molecular Biology Laboratory (EMBL)
27867008    Molecular cell  Molecular Structures of Transcribing RNA Polymerase I.  n/a     C       Sachse  European Molecular Biology Laboratory (EMBL), Structural and Computational Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany.  Germany European Molecular Biology Laboratory (EMBL)
27867008    Molecular cell  Molecular Structures of Transcribing RNA Polymerase I.  n/a     CW      Müller  European Molecular Biology Laboratory (EMBL), Structural and Computational Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany. Electronic address: cmueller@embl.de.    Germany European Molecular Biology Laboratory (EMBL)

Input file with mixed DOI and Pubmed. Text output written to a file:

python pubmedAuthorAffiliation.py -f emdb-2010.txt -x text > /tmp/out.txt

In this case unrecognized lines are ignored, e.g.:

WARNING:root:processList: id not recognized: id

Code testing

This will go through lists of selected Pubmed and DOI known to work:

python test_pubmedAuthorAffiliation.py

Project details


Release history Release notifications

This version

0.3

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Files for pubmed-author-affiliation, version 0.3
Filename, size File type Python version Upload date Hashes
Filename, size pubmed_author_affiliation-0.3-py2-none-any.whl (19.4 kB) File type Wheel Python version py2 Upload date Hashes View hashes
Filename, size pubmed-author-affiliation-0.3.tar.gz (15.5 kB) File type Source Python version None Upload date Hashes View hashes

Supported by

Elastic Elastic Search Pingdom Pingdom Monitoring Google Google BigQuery Sentry Sentry Error logging AWS AWS Cloud computing DataDog DataDog Monitoring Fastly Fastly CDN SignalFx SignalFx Supporter DigiCert DigiCert EV certificate StatusPage StatusPage Status page