FEMR
Framework for Electronic Medical Records
FEMR is a Python package for building models using EHR data.
FEMR offers four main types of functionality. In order, they are the ability to:
- Convert EHR and claims data into a common schema, where each patient is associated with a timeline of events extracted from the EHR
- Apply labeling functions on that schema in order to derive labels for each patient
- Apply featurization schemes to obtain feature matrices for each patient
- Perform other common tasks necessary for research with EHR data
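To make the first three steps concrete, here is a minimal, self-contained sketch of a patient timeline, a labeling function, and a count-based featurizer. It is only an illustration of the ideas: the class and function names (Event, Patient, label_code_after, count_features) and the code strings are hypothetical and are not FEMR's actual API or schema.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime


@dataclass(frozen=True)
class Event:
    """One timestamped clinical event (hypothetical structure, not FEMR's)."""
    time: datetime
    code: str


@dataclass
class Patient:
    """A patient with a timeline of events, kept sorted by time."""
    patient_id: int
    events: list[Event] = field(default_factory=list)

    def __post_init__(self) -> None:
        self.events = sorted(self.events, key=lambda e: e.time)


def label_code_after(patient: Patient, code: str, after: datetime) -> bool:
    """Labeling function: does `code` occur strictly after the prediction time?"""
    return any(e.code == code and e.time > after for e in patient.events)


def count_features(patient: Patient, vocabulary: list[str], before: datetime) -> list[int]:
    """Featurizer: per-code event counts using only events up to the prediction time."""
    counts = Counter(e.code for e in patient.events if e.time <= before)
    return [counts[c] for c in vocabulary]


# Illustrative data only; the code strings are placeholders.
patient = Patient(
    patient_id=1,
    events=[
        Event(datetime(2021, 3, 1), "ICD10/E11.9"),
        Event(datetime(2020, 1, 5), "LOINC/4548-4"),
        Event(datetime(2020, 6, 1), "LOINC/4548-4"),
    ],
)
prediction_time = datetime(2020, 12, 31)
label = label_code_after(patient, "ICD10/E11.9", prediction_time)
features = count_features(patient, ["LOINC/4548-4", "ICD10/E11.9"], prediction_time)
print(label, features)
```

Restricting the featurizer to events at or before the prediction time, while the label looks only at later events, keeps information from the future out of the features.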
Installation
There are two variants of the FEMR package: a CPU-only version and a CUDA-enabled version.
How to install FEMR without CUDA
pip install femr
How to install FEMR with CUDA support
Note that CUDA-enabled FEMR requires JAX in order to function, so install JAX first.
pip install --upgrade "jax[cuda11_pip]" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
pip install "femr_cuda[models]"
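After installing, it can be useful to confirm which of these packages actually ended up in the active environment. The sketch below uses only the Python standard library; the distribution names are taken from the pip commands above, and the helper name installed_version is a hypothetical choice for this example.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Names taken from the pip commands above; absence is reported, not treated as an error.
for name in ("femr", "femr_cuda", "jax"):
    version = installed_version(name)
    print(f"{name}: {version if version is not None else 'not installed'}")
```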
Development
The following guides are for developers who want to contribute to FEMR.
Building from source
In some scenarios (such as contributing to FEMR), you might want to compile the package from source.
To do so, run the following commands.
conda create -n FEMR_ENV python=3.10 bazel=6 -c conda-forge -y
conda activate FEMR_ENV
export BAZEL_USE_CPP_ONLY_TOOLCHAIN=1
git clone https://github.com/som-shahlab/femr.git
cd femr
pip install -e .
Special note for Nero users
As Nero does not have internet access, you must run the following before running the code above.
export DISTDIR=/local-scratch/nigam/distdir
(Optional) Installing CUDA on Nero / Carina
Nero/Carina users: do not store the femr repo or installation files in your home directory, which has limited storage. We recommend using the shared project folder instead; e.g., on Nero, use '/local-scratch/nigam/project/...'
If you are using Nero, you will need to install CUDA manually until the CUDA version on Nero is updated. To do so, follow these steps:
1. Download version 11.8 of CUDA onto your local machine from here.
2. Copy the CUDA installer from your local machine onto Nero, into whatever folder you'd like. We'll refer to the path of the copied installer file as <PATH_TO_CUDA_INSTALLER> from now on.
   - Note: Nero doesn't work with scp. You can use an alternative like pscp, which functions basically identically to scp. You can install pscp on a Mac by using brew install putty.
3. ssh into Nero using
   ssh <username>@nero-nigam.compute.stanford.edu
4. On Nero, run the CUDA installer as a bash command as follows:
   bash <PATH_TO_CUDA_INSTALLER> --installpath=<INSTALL_PATH>
   where <PATH_TO_CUDA_INSTALLER> is the path to the file you transferred in step 2, and <INSTALL_PATH> is where you'd like to save your CUDA installation files. We recommend using ~ or something similar.
5. The CUDA installer will pop up a window during installation. Uncheck all of the boxes it presents except for the box labeled "cuda toolkit".
6. After the installation completes, the installer will print two paths to your console. Take note of these paths; you will add them to your .bashrc file in step 8.
7. Install cuDNN v8.7.0 (November 28th, 2022) for CUDA. Go to this link and download the file "Download cuDNN v8.7.0 (November 28th, 2022), for CUDA 11.x" -> "Local Installer for Linux x86_64 (Tar)" on your local computer, then transfer it to your folder on Nero. Then follow section 1.3 of the instructions here. Note that you need to copy the cuDNN files into your local CUDA installation. For example:
   cp cudnn-*-archive/include/cudnn*.h <path_to_your_cuda>/include
   cp -P cudnn-*-archive/lib/libcudnn* <path_to_your_cuda>/lib64
   chmod a+r <path_to_your_cuda>/include/cudnn*.h <path_to_your_cuda>/lib64/libcudnn*
8. Add the following to your .bashrc file. You may need to restart your terminal for the changes to take effect.
   export PATH="<INSTALL_PATH>/bin:$PATH"
   export LD_LIBRARY_PATH="<INSTALL_PATH>/lib64:$LD_LIBRARY_PATH"
   To edit your .bashrc file, use
   nano ~/.bashrc
9. Run
   rm /tmp/cuda-installer.log
   to remove the installer log (if you don't do this, it will cause a segmentation fault for other users when they try to install CUDA).
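Once the .bashrc changes are in place, a quick sanity check is to confirm that the two directories really appear in the environment of a fresh shell. The following is a minimal sketch using only the standard library; the function name cuda_env_report is hypothetical, and the install path passed in the example call is a placeholder for your own <INSTALL_PATH>.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def cuda_env_report(install_path: str, env: Optional[Mapping[str, str]] = None) -> dict:
    """Report whether <install_path>/bin and <install_path>/lib64 are on the search paths."""
    env = os.environ if env is None else env
    install = Path(install_path).expanduser()

    def contains(variable: str, subdir: str) -> bool:
        # Split the variable on the platform path separator and compare normalized paths.
        entries = [p for p in env.get(variable, "").split(os.pathsep) if p]
        normalized = {str(Path(p).expanduser()) for p in entries}
        return str(install / subdir) in normalized

    return {
        "PATH": contains("PATH", "bin"),
        "LD_LIBRARY_PATH": contains("LD_LIBRARY_PATH", "lib64"),
    }


# "~/cuda" is a placeholder; substitute the <INSTALL_PATH> you chose above.
print(cuda_env_report("~/cuda"))
```

A False entry suggests that the corresponding export line is missing from .bashrc or that the terminal has not been restarted since it was added.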
Precommit checks
Before committing, please run the following commands to ensure that your code is formatted correctly and passes all tests.
Installation
conda install pre-commit pytest -y
pre-commit install
Running
Test Functions
pytest tests
Formatting Checks
pre-commit run --all-files
Miscellaneous
GZIP decompression commands
export OMOP_SOURCE=/share/pi/nigam...
gunzip $OMOP_SOURCE/**/*.csv.gz
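Note that in bash the ** pattern only recurses into nested directories when the globstar option is enabled (shopt -s globstar); zsh recurses by default. As a shell-independent alternative, the sketch below walks a directory tree in Python and decompresses every matching file. The function name gunzip_tree is hypothetical, and, like gunzip, it deletes each compressed original by default.

```python
import gzip
import shutil
from pathlib import Path
from typing import List


def gunzip_tree(root: str, suffix: str = ".csv.gz", remove_original: bool = True) -> List[Path]:
    """Decompress every file under `root` whose name ends with `suffix`."""
    outputs = []
    for src in sorted(Path(root).rglob(f"*{suffix}")):
        dst = src.with_suffix("")  # "table.csv.gz" -> "table.csv"
        # Stream the data so large tables are never held fully in memory.
        with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
            shutil.copyfileobj(fin, fout)
        if remove_original:
            src.unlink()
        outputs.append(dst)
    return outputs


# Example call (not executed here): gunzip_tree(os.environ["OMOP_SOURCE"])
```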
Zstandard compression commands
export OMOP_SOURCE=/share/pi/nigam...
zstd -1 --rm $OMOP_SOURCE/**/*.csv
Generating extract
# Set up environment variables
# Path to a folder containing your raw STARR-OMOP download, generated via `tools.stanford.download_bigquery.py`
export OMOP_SOURCE=/path/to/omop/folder...
# Path to any arbitrary folder where you want to store your FEMR extract
export EXTRACT_DESTINATION=/path/to/femr/extract/folder...
# Path to any arbitrary folder where you want to store your FEMR extract logs
export EXTRACT_LOGS=/path/to/femr/extract/logs...
# Do some data preprocessing with Stanford-specific helper scripts
# Extract data from flowsheets
python tools/stanford/flowsheet_cleaner.py --num_threads 5 $OMOP_SOURCE "${EXTRACT_DESTINATION}_flowsheets"
# Normalize visits
python tools/omop/normalize_visit_detail.py --num_threads 5 "${EXTRACT_DESTINATION}_flowsheets" "${EXTRACT_DESTINATION}_flowsheets_detail"
# Run actual FEMR extraction
etl_stanford_omop "${EXTRACT_DESTINATION}_flowsheets_detail" $EXTRACT_DESTINATION $EXTRACT_LOGS --num_threads 10
Example usage (Note: This should take ~10 minutes on a 1% extract of STARR-OMOP)
export OMOP_SOURCE=/local-scratch/nigam/projects/ethanid/som-rit-phi-starr-prod.starr_omop_cdm5_deid_1pcent_2022_11_09
export EXTRACT_DESTINATION=/local-scratch/nigam/projects/mwornow/femr_starr_omop_cdm5_deid_1pcent_2022_11_09
export EXTRACT_LOGS=/local-scratch/nigam/projects/mwornow/femr_starr_omop_cdm5_deid_1pcent_2022_11_09_logs
python tools/stanford/flowsheet_cleaner.py --num_threads 5 $OMOP_SOURCE "${EXTRACT_DESTINATION}_flowsheets"
python tools/omop/normalize_visit_detail.py --num_threads 5 "${EXTRACT_DESTINATION}_flowsheets" "${EXTRACT_DESTINATION}_flowsheets_detail"
etl_stanford_omop "${EXTRACT_DESTINATION}_flowsheets_detail" $EXTRACT_DESTINATION $EXTRACT_LOGS --num_threads 10
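The three steps share intermediate folder names derived from EXTRACT_DESTINATION, which is easy to get wrong when typing the commands by hand. The sketch below builds the same three commands from the three paths so they can be inspected before being run. The function name build_extract_commands and the placeholder paths are assumptions for this example; the scripts, arguments, and thread counts are the ones shown above.

```python
import shlex
from typing import List


def build_extract_commands(
    omop_source: str,
    extract_destination: str,
    extract_logs: str,
    prep_threads: int = 5,
    etl_threads: int = 10,
) -> List[List[str]]:
    """Return the three extraction commands, in order, as argument lists."""
    flowsheets = f"{extract_destination}_flowsheets"
    detail = f"{extract_destination}_flowsheets_detail"
    return [
        # 1. Extract data from flowsheets
        ["python", "tools/stanford/flowsheet_cleaner.py",
         "--num_threads", str(prep_threads), omop_source, flowsheets],
        # 2. Normalize visits
        ["python", "tools/omop/normalize_visit_detail.py",
         "--num_threads", str(prep_threads), flowsheets, detail],
        # 3. Run the actual FEMR extraction
        ["etl_stanford_omop", detail, extract_destination, extract_logs,
         "--num_threads", str(etl_threads)],
    ]


# Placeholder paths; print the commands for review instead of running them.
for command in build_extract_commands("/path/to/omop", "/path/to/extract", "/path/to/logs"):
    print(shlex.join(command))
```

To execute the steps rather than print them, each argument list can be passed to subprocess.run(command, check=True) from the root of the femr repo, so that a failing step stops the pipeline.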
(Optional) Installing PyTorch
If you are on Nero, you need to install PyTorch using:
conda install numpy -y
pip install torch==1.12.0 torchvision==0.13.0 torchaudio==0.12.0 --extra-index-url https://download.pytorch.org/whl/cu111
If you are on Carina, you need to install PyTorch using:
conda install numpy pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia -y
Download files
Built Distributions
Hashes for femr-0.1.8-pp39-pypy39_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | c72c4954deae924e9a0a4079c9bad252d58976f78c96f6d4ed394a0b3cbda525
MD5 | 89e18d52f54bbb24181c60eb0135abbc
BLAKE2b-256 | f8993e42d44d3bc41b69745c27bd1460b51033565440b744bcac4a8c7ff01b4a
Hashes for femr-0.1.8-pp39-pypy39_pp73-macosx_10_14_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | cf4df19c85378164616c2307f428d2705456388f3a4488f6c704d60f71432640
MD5 | c90253049b59ea0dd14e528b92e90356
BLAKE2b-256 | 3d54c68f2f47b5be88adbaa4fb9e33b0daa0e49bd2c48159ac703d808ceaf828
Hashes for femr-0.1.8-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | e320f7039a17e7a78978037e38c3cd2141541565930bc657bf73492a90d5d62f
MD5 | 73512ab7905ba838f6661be06019daeb
BLAKE2b-256 | a0f7c3edf9fa2cc0789ea08127fb317998b63b67bc3754fbd1fc2eac24859b5a
Hashes for femr-0.1.8-cp311-cp311-macosx_11_0_arm64.whl
Algorithm | Hash digest
---|---
SHA256 | c474b7d9fe56308895836929d6598d39cef5cb516e2667b7464704b5a1c3a919
MD5 | b4e50787be42e2886db85b6f74fd6df2
BLAKE2b-256 | 70abc5304b522f1d37ae31f43d0a9cad31c59703826d966c0b40735e2363b1a1
Hashes for femr-0.1.8-cp311-cp311-macosx_10_14_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 4a461d1d2e5e2b0361f70ec20aa4173e9d731908dc97114f125702fdfcccfc68
MD5 | 343cd682268ab3d0751d845cd09e3d9d
BLAKE2b-256 | d9aba6f44e1e2ebb4d1455742cfe69eb436bac291c0814cbdc0380f14e7c0823
Hashes for femr-0.1.8-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | bbc9f920fc7a18ac22bb352d3860d3b09a4646e6b09f678605507a2372980a19
MD5 | 5b3b3b2a3c35e5358d572ac878d6fdc7
BLAKE2b-256 | 2897dc89e14f5514ac805a137edb4db04b0bba0d7796ee4f98dbea18c4d606c8
Hashes for femr-0.1.8-cp310-cp310-macosx_11_0_arm64.whl
Algorithm | Hash digest
---|---
SHA256 | c6fa314e34057fb986b0929bd0b3dd87de4ce77ef4c5a1e825a0e44969a8a31c
MD5 | a9023e9a252f6a7fed08d68cfbaf041c
BLAKE2b-256 | 49e8e25c8e13d7499afbcf05da7b98d58843db27ceb564fa4702ddf696cc65c0
Hashes for femr-0.1.8-cp310-cp310-macosx_10_14_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 1b75b15d9c8e6d51f766bd03236cdb83c2e5ca4368c5f6266d16adb29bc39587
MD5 | 0f42ea914459ee5a10301a002a1114e8
BLAKE2b-256 | 1e15ee8f01f526673a739d3e898dbdb6befa8b9b85ee3d230e7cdd2908c5d424
Hashes for femr-0.1.8-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 920bb59dc1380066cae166f83ae39fb042119d1dc98bbda7c6a4a044f3be2ec9
MD5 | 81279adbbfac348bc1bc8cea7dea68f9
BLAKE2b-256 | a38a40e4dc1449d4be8207c9ca90be233168574adee090bcb3027dc62e296ae2
Hashes for femr-0.1.8-cp39-cp39-macosx_11_0_arm64.whl
Algorithm | Hash digest
---|---
SHA256 | 78f2fe73609aa2bb4e8e29124028ecbc97118027e7d5343c36d8334fe09f1023
MD5 | c213844120ea67554f978e0343058714
BLAKE2b-256 | 2f12798128d20180dacd20d3aeb94c97f0a84ac3581e1f2de4dcf1297af71371
Hashes for femr-0.1.8-cp39-cp39-macosx_10_14_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | f2e0e1d9766226cd2cff9a0476005b0408ba6f31f093ec8bea3d2b97dfb58ddf
MD5 | 536c446db4c00e87a90e68ceaf7bacd9
BLAKE2b-256 | 61bae69d4ffcc89be087542b35c439878104d8f6027f733b250736c032a35785
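The digests listed above can be used to check that a downloaded wheel was not corrupted or altered in transit. The following is a minimal sketch using Python's hashlib; the helper names sha256_of and matches_published_digest are hypothetical, and the file path in the usage comment assumes the wheel was saved to the current directory.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published value, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()


# Example call (not executed here), using a SHA256 value from the tables above:
# matches_published_digest(
#     "femr-0.1.8-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
#     "e320f7039a17e7a78978037e38c3cd2141541565930bc657bf73492a90d5d62f",
# )
```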