Skip to main content

A simple package for converting PubMed output (ie, the output from PubMed with all PubMed fields) into a flat file or pandas.DataFrame object. Simple functions are provided to merge the results of separate searches and automatically create variables indicating which search(es) a record was found in. Users can then add their own notes on the references by importing the flat file into a spreadsheet program of their choice and adding their notes in new columns.

Project description

Summary

A simple package for converting PubMed output with all PubMed fields into a flat file or pandas.DataFrame object. Simple functions are provided to merge the results of separate searches and automatically create tags (variables that are 'Y' or '') to indicate which search(es) a reference is found in.

Users can then add notes on the references by importing the flat file into a spreadsheet program and adding their notes in new columns. A user can then export this spreadsheet back to a flat file, convert it to a data frame, merge it with the results of additional searches, and convert this merged data frame back to a text file for further user notes, etc...

Obviously this is not a particularly robust 'database system' to use, and in particular users should take care that when updating flat files they do not omit any columns they manually created in the previous flat file.

Trademark Notice

The PubMed wordmark is a registered trademark of the U.S. Department of Health and Human Services. This software is not endorsed by or affiliated with the trademark holders.

Example

import flatmed as fm

# When rerunning, uncomment the `txt_to_df` call and change the next lines
# to a `merge_to_df()` call.
#master_df = fm.txt_to_df('ex_refs_mynotes.txt')
inpdir = 'downloaded/'

# Start with a single set of results and convert to a data frame
master_df = fm.pubmed_to_df(inpdir + 'TMLE.txt')
master_df['lta_TMLE'] = 'Y'

# Merge to this data frame four other searches using the default merge 
#   method (outer join) and create four more indicator variables.
master_df = fm.merge_to_df(master_df, inpdir + 'MSM.txt', 'lta_MSM')
master_df = fm.merge_to_df(master_df, inpdir + 'TVCONF.txt', 'lta_TVCONF')
master_df = fm.merge_to_df(master_df, inpdir + 'AIPW.txt', 'lta_AIPW')
master_df = fm.merge_to_df(master_df, inpdir + 'MSM_excl.txt',
                           'lta_MSM_excl')

# The `lta_MSM_excl` are a subset of the MSM results that we have already
#  determined can be excluded.
master_df = master_df[~((master_df.lta_MSM_excl == 'Y'))]

# After the above filter, the variable is always empty, so can be dropped.
master_df = master_df.drop(columns = ['lta_MSM_excl'])

# publication year
master_df['l_pubyear'] = master_df.DP.str.slice(start=0, stop=4)

# keep only a subset of the PubMed fields. Can add more if desired.
# We use the convention that any variables automatically created as a tag
# starts with 'lta_' and other variables we create (either externally or in
# this file) start with 'l_'.
my_cols = [ x for x in master_df.columns if x.startswith('lta_')
                                            or x.startswith('l_') ]
fm.df_to_txt(master_df, 'ex_refs.txt',
             ['PMID', 'TI', 'AU', 'SO', 'l_pubyear'] + my_cols + ['AB'])

Release Notes

Version 0.0.1

  • Initial release

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

flatmed-0.0.1.tar.gz (44.3 kB view details)

Uploaded Source

Built Distribution

flatmed-0.0.1-py3-none-any.whl (33.3 kB view details)

Uploaded Python 3

File details

Details for the file flatmed-0.0.1.tar.gz.

File metadata

  • Download URL: flatmed-0.0.1.tar.gz
  • Upload date:
  • Size: 44.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.11.2

File hashes

Hashes for flatmed-0.0.1.tar.gz
Algorithm Hash digest
SHA256 8147140aaa9da3a3318c72082713480bd5fa7cc8430fe8a2c269cfe1e0ffe02a
MD5 5f4214fe40e2da8ba76b0b94602dee79
BLAKE2b-256 18ba629732fc9834927eeb362911ba6be8c5288841dc1d20be7aecabcf267477

See more details on using hashes here.

File details

Details for the file flatmed-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: flatmed-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 33.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.11.2

File hashes

Hashes for flatmed-0.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 9c50a58a7d62d01bf11ef1be5ff107d8579e64c0fc3085199e795bfe2c4d5bf8
MD5 fe3ac7cf6c59c039f057ee65cca74774
BLAKE2b-256 3bf4d871bc490449ac0d9935d14e20d8d872d61d16050014a5afda42bb3b7d7b

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page