An open source deep learning library for Electronic Health Record (EHR) data

lemonpie

This initial release of the library:

  • implements 2 deep learning models (an LSTM and a CNN) based on popular papers
  • trains them on synthetic EHR data, created using the open source Synthea Patient Generator
  • predicts conditions that are on the CDC's list of top chronic diseases that contribute most to healthcare costs
    • and is easily configurable to train on and predict any conditions in the dataset

The end goal is to

  • keep adding more model implementations
  • keep adding different publicly available datasets
  • and have a leaderboard to track which models and configurations work best on these datasets

Install

With conda

  • conda install -c corazonlabs -c fastai -c conda-forge lemonpie

With pip

  • pip install lemonpie

Or, from source

  1. Clone the repo
    • git clone https://github.com/corazonlabs/lemonpie.git
  2. Create a new conda env using the environment.yml file
    • cd lemonpie
    • conda env create --name lemonpie --file environment.yml
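
After any of the install routes above, a quick check can confirm that the package is visible to the Python interpreter you plan to use. The sketch below relies only on the standard library (importlib.metadata, available from Python 3.8); the helper name installed_version is hypothetical and is not part of lemonpie.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Hypothetical helper, not a lemonpie API: report what this environment sees.
version = installed_version("lemonpie")
if version is None:
    print("lemonpie is not installed in this environment")
else:
    print(f"lemonpie {version} is installed")
```

If the check reports that the package is missing inside a notebook, the notebook kernel is probably running in a different environment from the one you installed into.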

How to use

  1. Read through and then run the following Quick Start guides to get a general idea.
    • if using the cloned repo, run these notebooks
      • 99_quick_walkthru.ipynb
      • 99_running_exps.ipynb
    • if using the installed library, open a Jupyter notebook and copy, paste and run the cells from these guides one by one
  2. Set up Synthea
  3. Run experiments
    • Refer to Detailed Docs for customizations

Roadmap

  • A leader-board to track which models and configurations work best on different publicly available datasets.

  • Callbacks, Mixed Precision, etc

    • Either upgrade the library to use fastai v2.
    • Or as a minimum, build functionality for fastai-style callbacks & PyTorch AMP.
  • More models

    • Pick some of the best EHR models out there and implement them.
    • Ideas are welcome.
  • More datasets

    • eICU and MIMIC3 possibly.
    • Ideas are welcome.
  • NLP on clinical notes

    • Synthea does not have clinical notes, so this can only be done with other datasets.
  • Predicting different conditions

    • Again different datasets will allow this - e.g. hospitalization data (length of stay, in-patient mortality), ER data, etc.
  • Integration with experiment management tools like W&B, Comet, etc.

Known Issues & Limitations

  1. num_workers > 0 does not work yet; this is under investigation
    • Workaround: if your GPU has enough memory, you can load the entire dataset onto the GPU before training with a single switch
      • If running manually, set lazy_load_gpu=False when creating the data object with EHRData(.... )
      • If running through an Experiment config file, set it in the experiment.yaml file
  2. Test coverage
    • Need to write more tests for more comprehensive coverage
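
To illustrate the trade-off behind the lazy_load_gpu workaround in item 1, here is a minimal, library-independent sketch. It is not lemonpie's implementation: lazy_batches, PreloadedData and the move callable are hypothetical names. The move argument stands in for whatever transfers one sample to the device (in PyTorch this could be something like lambda t: t.to("cuda")).

```python
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def lazy_batches(
    samples: Sequence[T], move: Callable[[T], T], batch_size: int
) -> Iterator[List[T]]:
    """Lazy loading: transfer each sample only when its batch is requested.

    The transfer cost is paid again on every pass over the data.
    """
    for start in range(0, len(samples), batch_size):
        yield [move(s) for s in samples[start:start + batch_size]]


class PreloadedData:
    """Eager loading: transfer every sample once, before training starts.

    Needs enough device memory to hold the whole dataset.
    """

    def __init__(self, samples: Sequence[T], move: Callable[[T], T]) -> None:
        self.samples: List[T] = [move(s) for s in samples]

    def batches(self, batch_size: int) -> Iterator[List[T]]:
        # Batches are sliced from data that already lives on the device.
        for start in range(0, len(self.samples), batch_size):
            yield self.samples[start:start + batch_size]
```

With lazy loading, three epochs over ten samples call move thirty times; with preloading, move runs ten times in total. This is why preloading can offset the lack of background workers when the dataset fits in GPU memory.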

References

This library is created using the awesome nbdev

Synthea Synthetic Patient Population Simulator

Jason Walonoski, Mark Kramer, Joseph Nichols, Andre Quina, Chris Moesel, Dylan Hall, Carlton Duffett, Kudakwashe Dube, Thomas Gallagher, Scott McLachlan, Synthea: An approach, method, and software mechanism for generating synthetic patients and the synthetic electronic health care record, Journal of the American Medical Informatics Association, Volume 25, Issue 3, March 2018, Pages 230–238, https://doi.org/10.1093/jamia/ocx079

LSTM Model based on this paper - Scalable and accurate deep learning with electronic health records

Rajkomar, A., Oren, E., Chen, K. et al. Scalable and accurate deep learning with electronic health records. npj Digital Med 1, 18 (2018). https://doi.org/10.1038/s41746-018-0029-1

CNN Model based on this paper - Deepr: A Convolutional Net for Medical Records

Nguyen, P., Tran, T., Wickramasinghe, N., & Venkatesh, S. (2017). Deepr: A Convolutional Net for Medical Records. IEEE Journal of Biomedical and Health Informatics, 21, 22-30.
