Skip to main content

Utilities for using Models with health data.

Project description

Electronic Medical Record Machine Learning Utilities for using Models with health data.

https://badge.fury.io/py/ehrml.svg

Configuration:

These utilities are very dependent on a particular configuration format. In python, it is a list of dicts, where each dict represents configuration for a particular field. The keys in this dict are as follows:

field

Description

index

The index to write the value (or in the case of any one-hot field, to start writing values) in the numpy array.

missing_flag_index

The index to write a one to if the data is missing (pre-imputation) in the numpy array. Do not set if no such missing data flag is desired for this field.

rwb_src

The value used to represent the value for all observation lists. Also, the field name associated with the “flat” data source, without any time suffix.

api_parent

The key in the layered data which contains relevant data for this field.

api_time_src

Which field in the layered data contains a reference to datetime for this observation.

api_src

Regarding the layered data, either the direct access field for each item in the list under api_parent, or the desired value of api_by.

api_by

If a field in layered data is not direct access, this is the field under api_parent which contains the name matching api_src. Do not set for direct access values.

api_from

If a field in layered data is not direct access, this is the field under api_parent which contains the value. Do not set for direct access values.

transformation

The name of a transformation or encoding to be executed on this field.

one_hot_vals

An array of values corresponding to a one hot encoding for this field. Different for each encoding, unused for numerical transformations.

mean

For numeric transformations, the precomputed mean.

std

For numeric transformations, the precomputed standard deviation.

min

For numeric transformations, replace any value lower than this value with this value. Also used in some transformations.

max

For numeric transformations, replace any value higher than this value with this value. Also used in some transformations.

Documentation

This tool uses a few different input and output structures in order to facilitate computation and analysis. Descriptions of these formats, along with descriptions of the methods and their inputs are in the python docstrings for these methods.

Note

At this point, the utilities here may be very specific to a particular kind of EHR and model.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ehrml-1.0.3.tar.gz (20.3 kB view hashes)

Uploaded Source

Built Distribution

ehrml-1.0.3-py3-none-any.whl (20.5 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page