Search for author affiliations from Pubmed for a list of Pubmed IDs and DOIs

Project description

A command line tool that takes one Pubmed ID or DOI, or a list of them, searches Pubmed for the corresponding author affiliations, and outputs the information to a file.

Command line options

-h, --help Show help text
-i PUBMEDID, --pubmedid PUBMEDID
 Search for author affiliations for a single Pubmed ID
-d doi, --doi doi
 Search for author affiliations for a single DOI
-f file, --infile file
 File with a list of Pubmed IDs and DOIs (they can be mixed). One entry per line.
-x format, --format format
 Output format. Choices: 'json' (default) or 'text'. The 'text' option produces a tab-separated table. The table is denormalised: the Pubmed ID/DOI is repeated on every row when an article has multiple authors with affiliations.
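The options above map naturally onto Python's argparse module. The following is a minimal sketch of how they might be declared, not the tool's actual source; the function name build_parser and the help strings are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented options."""
    parser = argparse.ArgumentParser(
        description="Search Pubmed for author affiliations."
    )
    parser.add_argument("-i", "--pubmedid", metavar="PUBMEDID",
                        help="single Pubmed ID to look up")
    parser.add_argument("-d", "--doi", metavar="doi",
                        help="single DOI to look up")
    parser.add_argument("-f", "--infile", metavar="file",
                        help="file with one Pubmed ID or DOI per line")
    parser.add_argument("-x", "--format", metavar="format",
                        choices=["json", "text"], default="json",
                        help="output format: json (default) or text")
    return parser


if __name__ == "__main__":
    # Parse an explicit argument list so the sketch runs without a shell.
    args = build_parser().parse_args(["-i", "27863242"])
    print(args.pubmedid, args.format)
```

Declaring the format with choices means argparse itself rejects any value other than json or text.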

Example runs

Pubmed input and JSON output:

python pubmedAuthorAffiliation.py -i 27863242

Output:

{'articleTitle': 'Decoding Mammalian Ribosome-mRNA States by Translational GTPase Complexes.', 'journalTitle': 'Cell', 'authorList': [{'firstName': 'n/a', 'institute': 'MRC-LMB', 'lastName': 'Shao', 'affiliation': 'MRC-LMB, Francis Crick Avenue, Cambridge CB2 0QH, UK.', 'country': 'UK', 'initials': 'S'}, {'firstName': 'n/a', 'institute': 'MRC-LMB', 'lastName': 'Murray', 'affiliation': 'MRC-LMB, Francis Crick Avenue, Cambridge CB2 0QH, UK.', 'country': 'UK', 'initials': 'J'}, {'firstName': 'n/a', 'institute': 'MRC-LMB', 'lastName': 'Brown', 'affiliation': 'MRC-LMB, Francis Crick Avenue, Cambridge CB2 0QH, UK.', 'country': 'UK', 'initials': 'A'}, {'firstName': 'n/a', 'institute': 'University of California', 'lastName': 'Taunton', 'affiliation': 'Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, CA 94158, USA.', 'country': 'USA', 'initials': 'J'}, {'firstName': 'n/a', 'institute': 'MRC-LMB', 'lastName': 'Ramakrishnan', 'affiliation': 'MRC-LMB, Francis Crick Avenue, Cambridge CB2 0QH, UK. Electronic address: ramak@mrc-lmb.cam.ac.uk.', 'country': 'UK', 'initials': 'V'}, {'firstName': 'n/a', 'institute': 'MRC-LMB', 'lastName': 'Hegde', 'affiliation': 'MRC-LMB, Francis Crick Avenue, Cambridge CB2 0QH, UK. Electronic address: rhegde@mrc-lmb.cam.ac.uk.', 'country': 'UK', 'initials': 'RS'}], 'pubmedId': '27863242', 'error': False}
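The output shown above uses single-quoted strings and the Python literal False, so a strict JSON parser would reject it. Assuming the tool prints records in exactly this form, a sketch like the following could load one with ast.literal_eval. The helper names parse_record and countries are hypothetical, and the sample is shortened to two of the authors from the example.

```python
import ast

# Shortened copy of the example record (two of the six authors).
sample = (
    "{'articleTitle': 'Decoding Mammalian Ribosome-mRNA States by "
    "Translational GTPase Complexes.', 'journalTitle': 'Cell', "
    "'authorList': ["
    "{'firstName': 'n/a', 'institute': 'MRC-LMB', 'lastName': 'Shao', "
    "'affiliation': 'MRC-LMB, Francis Crick Avenue, Cambridge CB2 0QH, UK.', "
    "'country': 'UK', 'initials': 'S'}, "
    "{'firstName': 'n/a', 'institute': 'University of California', "
    "'lastName': 'Taunton', 'affiliation': 'Department of Cellular and "
    "Molecular Pharmacology, University of California, San Francisco, "
    "San Francisco, CA 94158, USA.', 'country': 'USA', 'initials': 'J'}], "
    "'pubmedId': '27863242', 'error': False}"
)


def parse_record(line):
    """Turn one printed record into a dict (safe: literals only)."""
    return ast.literal_eval(line)


def countries(record):
    """Return the sorted set of countries among the record's authors."""
    return sorted({author["country"] for author in record["authorList"]})


record = parse_record(sample)
print(record["pubmedId"], countries(record))
```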

DOI input and text output:

python pubmedAuthorAffiliation.py -d 10.1016/j.molcel.2016.11.013 -x text

Output:

27867008    Molecular cell  Molecular Structures of Transcribing RNA Polymerase I.  n/a     L       Tafur   European Molecular Biology Laboratory (EMBL), Structural and Computational Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany.  Germany European Molecular Biology Laboratory (EMBL)
27867008    Molecular cell  Molecular Structures of Transcribing RNA Polymerase I.  n/a     Y       Sadian  European Molecular Biology Laboratory (EMBL), Structural and Computational Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany.  Germany European Molecular Biology Laboratory (EMBL)
27867008    Molecular cell  Molecular Structures of Transcribing RNA Polymerase I.  n/a     NA      Hoffmann        European Molecular Biology Laboratory (EMBL), Structural and Computational Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany.  Germany European Molecular Biology Laboratory (EMBL)
27867008    Molecular cell  Molecular Structures of Transcribing RNA Polymerase I.  n/a     AJ      Jakobi  European Molecular Biology Laboratory (EMBL), Structural and Computational Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany; European Molecular Biology Laboratory (EMBL), Hamburg Unit, Notkestrasse 85, 22607 Hamburg, Germany.     Germany European Molecular Biology Laboratory (EMBL)
27867008    Molecular cell  Molecular Structures of Transcribing RNA Polymerase I.  n/a     R       Wetzel  European Molecular Biology Laboratory (EMBL), Structural and Computational Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany.  Germany European Molecular Biology Laboratory (EMBL)
27867008    Molecular cell  Molecular Structures of Transcribing RNA Polymerase I.  n/a     WJH     Hagen   European Molecular Biology Laboratory (EMBL), Structural and Computational Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany.  Germany European Molecular Biology Laboratory (EMBL)
27867008    Molecular cell  Molecular Structures of Transcribing RNA Polymerase I.  n/a     C       Sachse  European Molecular Biology Laboratory (EMBL), Structural and Computational Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany.  Germany European Molecular Biology Laboratory (EMBL)
27867008    Molecular cell  Molecular Structures of Transcribing RNA Polymerase I.  n/a     CW      Müller  European Molecular Biology Laboratory (EMBL), Structural and Computational Biology Unit, Meyerhofstrasse 1, 69117 Heidelberg, Germany. Electronic address: cmueller@embl.de.    Germany European Molecular Biology Laboratory (EMBL)
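The denormalised layout can be pictured as one row per author, each prefixed with the article-level fields. The sketch below flattens a record of the shape shown in the JSON example into such rows. The column order is inferred from the sample output above, and the function name record_to_rows is an assumption, not part of the tool.

```python
# Per-author columns, in the order they appear in the sample text output.
AUTHOR_FIELDS = ("firstName", "initials", "lastName",
                 "affiliation", "country", "institute")


def record_to_rows(record):
    """Return one tab-separated row per author, repeating article fields."""
    prefix = [record["pubmedId"], record["journalTitle"],
              record["articleTitle"]]
    return [
        "\t".join(prefix + [author[field] for field in AUTHOR_FIELDS])
        for author in record["authorList"]
    ]


# Illustrative record with made-up placeholder values.
example = {
    "pubmedId": "1",
    "journalTitle": "Journal",
    "articleTitle": "Title",
    "authorList": [
        {"firstName": "n/a", "initials": "A", "lastName": "One",
         "affiliation": "Lab X", "country": "UK", "institute": "X"},
        {"firstName": "n/a", "initials": "B", "lastName": "Two",
         "affiliation": "Lab Y", "country": "USA", "institute": "Y"},
    ],
}

for row in record_to_rows(example):
    print(row)
```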

Input file with mixed DOIs and Pubmed IDs, with the text output written to a file:

python pubmedAuthorAffiliation.py -f emdb-2010.txt -x text > /tmp/out.txt

In this case, unrecognized lines are ignored and a warning is logged, e.g.:

WARNING:root:processList: id not recognized: id
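How the tool decides whether a line is a Pubmed ID or a DOI is not described here. As a rough sketch, one could treat all-digit entries as Pubmed IDs and entries starting with "10." followed by a registrant code and a slash as DOIs, logging a warning for anything else. Both regular expressions and the names classify_entry and process_list are assumptions for illustration.

```python
import logging
import re

# Assumed patterns; the tool's real recognition rules may differ.
PMID_RE = re.compile(r"^\d+$")
DOI_RE = re.compile(r"^10\.\d{4,9}/\S+$")


def classify_entry(entry):
    """Return 'pubmed', 'doi', or None for an unrecognized entry."""
    if PMID_RE.match(entry):
        return "pubmed"
    if DOI_RE.match(entry):
        return "doi"
    return None


def process_list(lines):
    """Keep recognized entries; warn about and skip the rest."""
    recognized = []
    for line in lines:
        entry = line.strip()
        if not entry:
            continue  # silently skip blank lines
        kind = classify_entry(entry)
        if kind is None:
            logging.warning("processList: id not recognized: %s", entry)
            continue
        recognized.append((kind, entry))
    return recognized


print(process_list(["27863242", "10.1016/j.molcel.2016.11.013", "id"]))
```

With the default logging configuration, the skipped entry produces a message in the same WARNING:root: form as the example above.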

Code testing

This runs through lists of selected Pubmed IDs and DOIs that are known to work:

python test_pubmedAuthorAffiliation.py

Project details


This version

0.3

Download files

Source Distribution

pubmed-author-affiliation-0.3.tar.gz (15.5 kB)
