Framework for Electronic Medical Records. A Python package for building models using EHR data.

Project description

FEMR

Framework for Electronic Medical Records

FEMR is a Python package for manipulating longitudinal EHR data for machine learning, with a focus on supporting the creation of foundation models and verifying their presumed benefits in healthcare. Such a framework is needed given the current state of large language models in healthcare and the need for better evaluation frameworks.

The only foundation model currently supported is MOTOR.

FEMR works with data that has been converted to the MEDS schema, a simple schema that supports a wide variety of EHR / claims datasets. Please see the MEDS documentation, and in particular its provided ETLs, for help converting your data to MEDS.

FEMR helps users:

Use ontologies to better understand / featurize medical codes
Algorithmically label subject records based on structured data
Generate tabular features from subject timelines for use with traditional gradient boosted tree models
Train and fine-tune MOTOR-derived models for binary classification and prediction tasks
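To make the labeling item above concrete, here is a minimal sketch in plain Python of what "algorithmically labeling" a subject record can look like. It does not use FEMR's API. The record layout (an `events` list whose entries carry `time` and `code`), the codes, and the `label_subject` helper are all assumptions made up for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical toy record; the field names and codes are illustrative assumptions,
# not FEMR's or MEDS's guaranteed layout.
subject = {
    "subject_id": 1,
    "events": [
        {"time": datetime(2020, 1, 1), "code": "Visit/IP"},
        {"time": datetime(2020, 1, 20), "code": "ICD10CM/I21"},
    ],
}


def label_subject(subject, prediction_time, target_code, horizon_days=30):
    """Return True if target_code occurs in (prediction_time, prediction_time + horizon]."""
    window_end = prediction_time + timedelta(days=horizon_days)
    return any(
        event["code"] == target_code and prediction_time < event["time"] <= window_end
        for event in subject["events"]
    )


# Label: does the target code appear within 30 days of the prediction time?
label = label_subject(subject, datetime(2020, 1, 1), "ICD10CM/I21")
print(label)
```

The key design point is that the label is computed only from events after the prediction time, so features built from earlier events cannot leak the outcome.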

We recommend that users start with our tutorial folder.

Installation

pip install femr

# If you are using deep learning, you also need to install xformers
#
# Note that xformers has some known issues with macOS.
# If you are using macOS, you might also need to install llvm. See https://stackoverflow.com/questions/60005176/how-to-deal-with-clang-error-unsupported-option-fopenmp-on-travis
pip install xformers

Getting Started

The first step of using FEMR is to convert your subject data into MEDS, the standard input format expected by the FEMR codebase. The best way to do this is with the ETLs provided by MEDS.

Note: FEMR currently only supports MEDS v3, so you will need to install the MEDS v3 versions of packages, for example: pip install meds-etl==0.3.11
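Because of the version pin in the note above, it can be useful to confirm what is installed before running an ETL. The following is a small sketch using only the standard library; the helper names `installed_version` and `matches_pin` are hypothetical, and the pinned version is the one quoted in the note.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(package, pinned):
    """Return True only if the package is installed at exactly the pinned version."""
    return installed_version(package) == pinned


# Report whether meds-etl is installed at the version quoted in the note above.
print("meds-etl installed:", installed_version("meds-etl"))
print("matches pin 0.3.11:", matches_pin("meds-etl", "0.3.11"))
```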

OMOP Data

If you have OMOP CDM-formatted data, follow these instructions:

Download your OMOP dataset to [PATH_TO_SOURCE_OMOP].
Convert OMOP => MEDS using the following:

# Convert OMOP => MEDS data format
meds_etl_omop [PATH_TO_SOURCE_OMOP] [PATH_TO_OUTPUT_MEDS]

Use Hugging Face's Datasets library to load the dataset in Python:

import os

import datasets

# Load every Parquet file under the data/ directory of the MEDS output
dataset = datasets.Dataset.from_parquet(os.path.join(PATH_TO_OUTPUT_MEDS, 'data', '*'))

# Print dataset stats
print(dataset)
# Dataset({
#   features: ['subject_id', 'events'],
#   num_rows: 6732
# })

# Print number of events in first subject in dataset
print(len(dataset[0]['events']))
# 2287
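Once a dataset is loaded, each row is a subject with an `events` list, as the printed features above indicate. The sketch below shows one way to summarize a single row in plain Python. It runs on a toy stand-in row rather than a real dataset, and the per-event `code` field and the example codes are assumptions for illustration; check your own data for the actual event fields.

```python
from collections import Counter

# Toy stand-in for one dataset row. Only 'subject_id' and 'events' come from the
# README; the 'code' field inside each event is an assumption for illustration.
row = {
    "subject_id": 42,
    "events": [
        {"code": "LOINC/718-7"},
        {"code": "SNOMED/3950001"},
        {"code": "LOINC/718-7"},
    ],
}


def code_counts(row):
    """Count how often each code appears in one subject's events."""
    return Counter(event["code"] for event in row["events"])


counts = code_counts(row)
print(row["subject_id"], len(row["events"]), counts.most_common(1))
```

With a real dataset, the same function could be applied to `dataset[0]`, provided the events expose a `code` field.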

Stanford STARR-OMOP Data

If you are using the STARR-OMOP dataset from Stanford (which uses the OMOP CDM), there is an additional Stanford-specific step that applies fixes to the converted data. Otherwise the process is identical to the OMOP Data section. Follow these instructions:

Download your STARR-OMOP dataset to [PATH_TO_SOURCE_OMOP].
Convert STARR-OMOP => MEDS using the following:

# Convert OMOP => MEDS data format
meds_etl_omop [PATH_TO_SOURCE_OMOP] [PATH_TO_OUTPUT_MEDS]_raw

# Apply Stanford fixes
femr_stanford_omop_fixer [PATH_TO_OUTPUT_MEDS]_raw [PATH_TO_OUTPUT_MEDS]

Use Hugging Face's Datasets library to load the dataset in Python:

import os

import datasets

# Load every Parquet file under the data/ directory of the MEDS output
dataset = datasets.Dataset.from_parquet(os.path.join(PATH_TO_OUTPUT_MEDS, 'data', '*'))

# Print dataset stats
print(dataset)
# Dataset({
#   features: ['subject_id', 'events'],
#   num_rows: 6732
# })

# Print number of events in first subject in dataset
print(len(dataset[0]['events']))
# 2287
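The two-step STARR-OMOP conversion above (ETL into a `_raw` directory, then the Stanford fixer) can also be scripted. The sketch below only builds and prints the two command lines from the README; the helper name `stanford_etl_commands` and the example paths are hypothetical, and nothing is executed unless you pass each command to `subprocess.run` yourself.

```python
import shlex


def stanford_etl_commands(source_omop, output_meds):
    """Build the two commands from the README: OMOP => raw MEDS, then the Stanford fixer."""
    raw_dir = output_meds.rstrip("/") + "_raw"
    return [
        ["meds_etl_omop", source_omop, raw_dir],
        ["femr_stanford_omop_fixer", raw_dir, output_meds],
    ]


# Hypothetical paths, for illustration only.
for command in stanford_etl_commands("/data/starr_omop", "/data/starr_meds"):
    print(shlex.join(command))
    # To actually run a step: subprocess.run(command, check=True)
```

Building the commands as argument lists rather than one shell string avoids quoting problems when paths contain spaces.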

Development

The following guides are for developers who want to contribute to FEMR.

Precommit checks

Before committing, please run the following commands to ensure that your code is formatted correctly and passes all tests.

Installation

conda install pre-commit pytest -y
pre-commit install

Running

Test Functions

pytest tests

Formatting Checks

pre-commit run --all-files

Project details

Release history

Version	Release date
0.2.4 (this version)	Jul 8, 2025
0.2.3	Mar 21, 2024
0.2.2	Mar 21, 2024
0.2.1	Dec 10, 2023
0.2.0	Dec 5, 2023
0.1.16	Nov 3, 2023
0.1.15	Nov 3, 2023
0.1.14	Nov 3, 2023
0.1.13	Nov 3, 2023
0.1.12	Nov 3, 2023
0.1.11	Nov 3, 2023
0.1.10	Nov 2, 2023
0.1.9	Jul 8, 2023
0.1.8	May 2, 2023
0.1.7	Apr 28, 2023
0.1.6	Apr 28, 2023
0.1.5	Apr 26, 2023
0.1.3	Apr 19, 2023
0.1.0	Apr 19, 2023
0.0.314	Mar 11, 2024
0.0.198	Jun 12, 2023
0.0.21	May 8, 2024
0.0.20	Jun 13, 2023
0.0.19	Jun 3, 2023

Download files


Source Distribution

femr-0.2.4.tar.gz (1.7 MB)

Uploaded Jul 8, 2025 Source

Built Distribution

femr-0.2.4-py3-none-any.whl (60.6 kB)

Uploaded Jul 8, 2025 Python 3

File details

Details for the file femr-0.2.4.tar.gz.

File metadata

Download URL: femr-0.2.4.tar.gz
Upload date: Jul 8, 2025
Size: 1.7 MB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for femr-0.2.4.tar.gz
Algorithm	Hash digest
SHA256	`3e8e4588bdf68b6d1a9224c6e90eb23b00e7e312a766a66fd180e13486f1b34e`
MD5	`522b3349c8288564e9ddb77ee3036b9c`
BLAKE2b-256	`3f9c0a2d954ff87cfae82592d8dd5a063e0817560a23907db68758e37e4d111d`


Provenance

The following attestation bundles were made for femr-0.2.4.tar.gz:

Publisher: build.yaml on som-shahlab/femr

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: femr-0.2.4.tar.gz
- Subject digest: 3e8e4588bdf68b6d1a9224c6e90eb23b00e7e312a766a66fd180e13486f1b34e
- Sigstore transparency entry: 267757751
- Sigstore integration time: Jul 8, 2025
Source repository:
- Permalink: som-shahlab/femr@51f678dc8ea0a042f9ed13948bf9c6c7053ef62e
- Branch / Tag: refs/tags/0.2.4
- Owner: https://github.com/som-shahlab
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: build.yaml@51f678dc8ea0a042f9ed13948bf9c6c7053ef62e
- Trigger Event: push

File details

Details for the file femr-0.2.4-py3-none-any.whl.

File metadata

Download URL: femr-0.2.4-py3-none-any.whl
Upload date: Jul 8, 2025
Size: 60.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for femr-0.2.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`525d56c06122167b1d67ccfdb6607b76e01093ea1e8842d6047003a48f81247d`
MD5	`b78ea9ddefdfaf85817fade25a6b6550`
BLAKE2b-256	`8a31ee09ff958784046800cd82885e42c1238263fab369ddfc11576210621dfc`


Provenance

The following attestation bundles were made for femr-0.2.4-py3-none-any.whl:

Publisher: build.yaml on som-shahlab/femr

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: femr-0.2.4-py3-none-any.whl
- Subject digest: 525d56c06122167b1d67ccfdb6607b76e01093ea1e8842d6047003a48f81247d
- Sigstore transparency entry: 267757755
- Sigstore integration time: Jul 8, 2025
Source repository:
- Permalink: som-shahlab/femr@51f678dc8ea0a042f9ed13948bf9c6c7053ef62e
- Branch / Tag: refs/tags/0.2.4
- Owner: https://github.com/som-shahlab
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: build.yaml@51f678dc8ea0a042f9ed13948bf9c6c7053ef62e
- Trigger Event: push
