A collection of tools to assist with pregnancy research.

These details have not been verified by PyPI

Project links

Project description

pypreg

pypreg is a collection of packages that classifies pregnancy outcome types from health data, identifies severe maternal morbidity, adverse pregnancy outcomes, and calculates obstetric comorbidity scores.

Features

This package makes available several methods to support pregnancy research:

Pregnancy Outcome Classification:
- Classify pregnancies by their outcome (Live birth, Stillbirth, etc.)
Adverse Pregnancy Outcomes:
- Identify the presence of indicators for Cesarean section, Preeclampsia, Gestational Diabetes, Gestational Hypertension, and Fetal Growth Restriction
Severe Maternal Morbidity:
- Identify the presence of SMM and non-Transfusion SMM
- Get individual flags for each of 21 indicators that make up the SMM definition
Obstetric Comorbidity Index:
- Get a numeric obstetric comorbidity index consistent with methods published by Bateman or Leonard

General Usage Data Format

The methods in this package rely on diagnostic, procedure, and/or diagnostic related group (DRG) codes. These codes should be organized rowwise with the relevant patient identifiers in the context of a pandas dataframe.

Pregnancy Outcome Classification

This module is an implementation of the obstetric classification algorithm given by Moll(2020).

Data Requirements

A Pandas dataframe with the following elements is required to begin using this package:

Column	Description	Notes
Patient ID	Unique patient identifier
Encounter ID	Encounter identifier
Admit date	Date of encounter admission	Date or datetime
Code type	Describes if the given code is a procedure, diagnostic, or DRG	Accepts 'dx', 'diagnosis', 'diagnostic', 'px', 'procedure', 'drg', 'diagnostic related group'
Code version	Describes the coding system of the given code	Accepts '9', 'ICD9', '10', 'ICD10', 'ICD10-CM', 'ICD10-PCS', 'CPT', 'CPT4', 'HCPCS', 'DRG', 'MS-DRG'
Code	The procedure, diagnostic, or DRG code

The dataframe may contain multiple instances of a single encounter with a variety of codes. The data should be presented rowwise where the codes are differentiated by the Code type and Code version columns.

Code selection

Crosswalk

The codes given by Moll(2020) are primarily in the ICD10 coding system and are crosswalked to ICD9 for broader use. Crosswalked codes are retrieved via General Equivalence Mappings. ICD10 codes are mapped back to ICD9, and ICD9 codes are mapped forward to ICD10. The resulting codes are reviewed and kept if they are related to an obstetric outcome, and, if the original ICD10 code broadly refers to labor and delivery, then the mapped codes may also.

Change to Moll codes

ICD10 procedure codes 10D17ZZ and 10D18ZZ refer to extraction of retained products of conception. These codes describe a dilation and curettage procedure which is more frequently, though not exclusively, performed on abortive outcomes. These codes lead to misclassification of abortive outcomes as unknown deliveries and were removed from this implementation.

Expanded codes

Expanded codes are added based upon a review of ICD9 and ICD10 codes for other obstetric outcomes not captured. This primarily captures codes where episode of care indicates delivered in ICD9.

Algorithm

The specifics of the algorithm are contained within Moll(2020). In short, each code provided is classified by regular expression matching. Each individual patient is then processed independently to prioritize certain outcomes. The outcomes are then checked to ensure that the spacing between different pregnancy outcome types is sensible. Outcome types that do not fit the spacing are dropped from consideration. A pregnancy start window is estimated based on the outcome type. The start of this window is then adjusted if needed so that it is not overlapping a previous pregnancy.

Usage

Four processes are exposed and available to be imported:

process_outcomes

the entry into the classifcation algorithm. All parameters are required except for the expanded flag which indicates if you would like to use the Moll codes or an expanded list of codes identified by this author
the dataframe need not have the columns in a standardized order, instead the required column names must be passed as strings

from pypreg import process_outcomes

process_outcomes(df: pd.DataFrame,
                 patient_col: str,
                 encounter_col: str,
                 admit_date_col: str,
                 version_col: str,
                 type_col: str,
                 code_col: str,
                 expanded: bool = False)

Output

This process will produce a pandas dataframe with the following data (column names that are reflexive of provided data will use the originally provided column name):

Column	Description	Notes
Patient ID	Unique patient identifier	Will be the same as originally provided
Preg_num	Pregnancy identifier per patient	Synonomous with gravida, but only relevant to the provided data (uncaptured pregnancies cannot be counted)
Encounter ID	Encounter identifier belonging to the outcome encounter	Will be the same as originally provided
Admit	Date of admission for the outcome encounter	Formatted as date, not datetime
Event_date	Date of admission for the outcome encounter	Formatted as date, not datetime
Outcome	String classification of the pregnancy outcome	live_birth, stillbirth, delivery, trophoblastic, ectopic, therapeutic_abortion, spontaneous_abortion
Start_window	Date that delineates the beginning of the pregnancy start window
End_window	Date that delineates the end of the pregnancy start window

OUTCOME_LIST

exports the list of pregnancy outcomes in hierarchical order

from pypreg import OUTCOME_LIST

print(OUTCOME_LIST)
['LIVE_BIRTH', 'STILLBIRTH', 'DELIVERY', 'TROPHOBLASTIC', 'ECTOPIC', 'THERAPEUTIC_ABORTION', 'SPONTANEOUS_ABORTION']

OUTCOMES

exports the full list of codes, the code type, the code version, and the association outcome classification
codes are given as a regular expression

from pypreg import OUTCOMES

print(OUTCOMES.head())
  code_type version     schema                      code  outcome
0        DX   ICD10       MOLL                 ^O0[08].*  ECTOPIC
1        DX    ICD9  CROSSWALK                    ^633.*  ECTOPIC
2        PX   ICD10       MOLL  ^10(D2[78]|T2[03478])ZZ$  ECTOPIC
3        PX    ICD9  CROSSWALK                     ^743$  ECTOPIC
4        PX    ICD9  CROSSWALK                 ^66[06]2$  ECTOPIC

map_version_split

splits OUTCOMES into 4 subcategories
- ICD9 Diagnostic codes
- ICD10 Diagnositc codes
- Procedure codes
- Diagnostic Related Group (DRG) codes
default behavior is to not include the expanded code list

from pypreg import map_version_split

map_version_split(expanded: bool = False)

Adverse Pregnancy Outcomes

This package is an implementation to identify adverse pregnancy outcomes from longitudinal data. This implementation covers cesarean section, fetal growth restriction, gestational diabetes, gestational hypertension, and preeclampsia. Pre-term birth is not included in this package as it relies on gestational age which is not covered in this work.

Code selection

Cesarean section

Codes were sourced from a variety of authors. Additional codes were added by this author after review that indicated a cesarean section had been performed. These include anesthesia for c-section as they should appear alongside the cesarean section code anyway, and disruption of cesarean wound as this indicates a cesarean section had been recently performed.

Usage

Data requirement

A Pandas dataframe with the following elements is required to begin using this package:

Column	Description	Notes
Patient ID	Unique patient identifier
Pregnancy ID	Pregnancy identifier
Code type	Describes if the given code is a procedure, diagnostic, or DRG	Accepts 'dx', 'diagnosis', 'px', 'procedure', 'drg', 'diagnostic related group', 'diagnostic grouping'
Code version	Describes the coding system of the given code	Accepts '9', 'ICD9', '10', 'ICD10', 'ICD10-CM', 'ICD10-PCS', 'CPT', 'CPT4', 'HCPCS', 'DRG', 'DIAGNOSTIC RELATED GROUP', 'DIAGNOSTIC GROUPING', 'MS-DRG'
Code	The procedure, diagnostic, or DRG code

The dataframe may contain multiple instances of a single encounter with a variety of codes. The data should be presented rowwise.

Process outcomes

from pypreg import apo

apo_df = apo(data_df,
             patient_id='patient_id',
             preg_id='preg_id',
             code_type='code_type',
             version='code_version',
             code='code')

Output

This process will produce a pandas dataframe with the following data:

Column	Description	Notes
Patient ID	Unique patient identifier
Pregnancy ID	Pregnancy identifier
cesarean	Indicates if a cesarean section was recorded for the pregnancy	Boolean
fetal growth restriction	Indicates if a fetal growth restriction was recorded for the pregnancy	Boolean
gest diabetes mellitus	Indicates if gestational diabetes was recorded for the pregnancy	Boolean
gest hypertension	Indicates if gestational hypertension was recorded for the pregnancy	Boolean
preeclampsia	Indicates if preeclampsia was recorded for the pregnancy	Boolean

Severe Maternal Morbidity

This package is an implementation of Severe Maternal Mordbidity(SMM) classification defined by the Centers for Disease Control and Prevention (CDC).

Code selection

Codes are as provided by the CDC. Where all child codes are indicated, the top-level codes are also included. All codes have been translated into regular expressions.

Usage

Data requirement

A Pandas dataframe with the following elements is required to begin using this package:

Column	Description	Notes
Encounter ID	Encounter identifier of the pregnancy outcome	SMM definition requires using the delivery encounter
Code type	Describes if the given code is a procedure, diagnostic, or DRG	Accepts 'dx', 'diagnosis', 'px', 'procedure'
Code version	Describes the coding system of the given code	Accepts '9', 'ICD9', '10', 'ICD10', 'ICD10-CM', 'ICD10-PCS'
Code	The procedure or diagnostic code

The dataframe may contain multiple instances of a single encounter with a variety of codes. The data should be presented rowwise where the codes are differentiated by the Code type and Code version columns.

Process SMM

from pypreg import smm

smm_df = smm(data_df,
             enc_id='encounter_id',
             code_type='code_type',
             version='code_version',
             code='code',
             indicators=True)

Output

This process will produce a pandas dataframe with the following data:

Column	Description	Notes
Encounter ID	Encounter identifier of the pregnancy outcome	SMM definition requires using the delivery encounter
smm	Indicates the delivery encounter qualifies as SMM	Boolean
transfusion	Indicates the delivery encounter recorded a transfusion procedure	Boolean
Others	If `indicators = TRUE` every condition class that makes up SMM will be included	Boolean

Obstetric Comorbidity Index

This module is an implementation of both the Bateman(2013) and Leonard(2020) obstetric comorbidity indices.

Codes

Codes are given by Bateman(2013) and Leonard(2020). This author has adapted them as regular expressions to include the top-level code where appropriate, as some secondary databases include these codes.

Scoring

Scores are assigned by comorbidity. Comorbidity classifications are matched to codes using regular expressions. Weights assigned to each comorbidity vary depending on the scoring method chosen. Refer to Bateman(2013) and Leonard(2020) for information on the weights assigned.

Ranges

Bateman
- 0-45
Leonard
- non-transfusion SMM
  - 0-281
- transfusion SMM
  - 0-478

Usage

Data requirement

A Pandas dataframe with the following elements is required to begin using this package:

Column	Description	Notes
Patient ID	Unique patient identifier
Pregnancy ID	Pregnancy identifier
Code version	Describes the coding system of the given code	Accepts '9', 'ICD9', '10', 'ICD10', 'ICD10-CM'
Code	The diagnostic code
Method	String choice to select the method	Accepts 'bateman', 'leonard'
Age column	Optional parameter to provide the column containing patient age	Bateman needs the patient age at the beginning of the pregnancy, Leonard needs the age at the outcome

The dataframe may contain multiple instances of a single pregnancy with a variety of codes. The data should be presented rowwise.

Calculate index

Two different scoring methods can be selected:

Bateman

from pypreg import calc_index

calc_index(df,
           patient_col='patient_id',
           pregnancy_col='preg_id',
           code_col='code',
           version_col='code_version',
           method='bateman',
           age_col='age'):

Bateman asks for patient age at the Last Menstrual Period (LMP). If you've given an age column, the method assumes that the patient can age during the course of pregnancy and thus the minimum age is selected.

Leonard

from pypreg import calc_index

calc_index(df,
           patient_col='patient_id',
           pregnancy_col='preg_id',
           code_col='code',
           version_col='code_version',
           method='leonard',
           age_col='age'):

Leonard asks for patient age at the outcome. If you've given an age column, the method assumes that the patient can age during the course of pregnancy and thus the maximum age is selected.

Output

This process will produce a pandas dataframe with the following data:

Column	Description	Notes
Patient ID	Unique patient identifier
Pregnancy ID	Pregnancy identifier
bateman_score	When the 'bateman' method is selected	Scores range from 0-5
leonard_smm_score	When the 'leonard' method is selected	Scores range from 0-478
leonard_nontransfucion_smm_score	When the 'leonard' method is selected	Scores range from 0-281

References

Centers for Disease Control and Prevention. How does CDC identify severe maternal morbidity? https://www.cdc.gov/reproductivehealth/maternalinfanthealth/smm/severe-morbidity-ICD.htm. Accessed 2023.
- Updated link: https://www.cdc.gov/maternal-infant-health/php/severe-maternal-morbidity/icd.html
Moll K FK Wong H-L. Task order HHSF22301001T: Pregnancy outcomes validation final report. U.S. Food and Drug Administration; 2020. Available at: https://www.bestinitiative.org/wp-content/uploads/2020/08/Validating_Pregnancy_Outcomes_Linked_Database_Report_2020-1.pdf
Agency for Healthcare Research and Quality U.S. Department of Health and Human Services. AHRQ Quality Indicators™ (AHRQ QI™)ICD-9-CM and ICD-10-CM/PCSSpecification Enhanced Version 5.0: Inpatient Quality Indicators #21 (IQI #21)Cesarean Delivery Rate, Uncomplicated 2015: https://qualityindicators.ahrq.gov/Downloads/Modules/IQI/V50-ICD10/TechSpecs/IQI%2021%20Cesarean%20Delivery%20Rate,%20Uncomplicated.pdf
Castillo WC, Boggess K, Stürmer T, Brookhart MA, Benjamin DK, Funk MJ. Trends in Glyburide Compared With Insulin Use for Gestational Diabetes Treatment in the United States, 2000-2011 (Supplemental Materials) Obstetrics & Gynecology , Vol. 123, No. 6 Ovid Technologies (Wolters Kluwer Health) p. 1177-1184
Bateman BT, Mhyre JM, Hernandez-Diaz S, Huybrechts KF, Fischer MA, Creanga AA, Callaghan WM, Gagne JJ. Development of a comorbidity index for use in obstetric patients. Obstet Gynecol. 2013 Nov;122(5):957-965. doi: 10.1097/AOG.0b013e3182a603bb. PMID: 24104771; PMCID: PMC3829199
Leonard SA, Kennedy CJ, Carmichael SL, Lyell DJ, Main EK. An Expanded Obstetric Comorbidity Scoring System for Predicting Severe Maternal Morbidity. Obstet Gynecol. 2020 Sep;136(3):440-449. doi: 10.1097/AOG.0000000000004022. PMID: 32769656; PMCID: PMC7523732

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

This version

0.1.0

Jun 3, 2024

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pypreg-0.1.0.tar.gz (70.3 kB view details)

Uploaded Jun 3, 2024 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

pypreg-0.1.0-py3-none-any.whl (72.3 kB view details)

Uploaded Jun 3, 2024 Python 3

File details

Details for the file pypreg-0.1.0.tar.gz.

File metadata

Download URL: pypreg-0.1.0.tar.gz
Upload date: Jun 3, 2024
Size: 70.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.0 CPython/3.10.14

File hashes

Hashes for pypreg-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`8e632ebecbf253fa0b81a3490250d49abf88f0aa1aeabe8227e1ff651d28ff2e`
MD5	`4e8d18198bdee651eb0362c0be5ebd4f`
BLAKE2b-256	`c1162a9c0a8385b2bade225e40898a6739aa8947f0dd26ce7b316ef7911ca924`

See more details on using hashes here.

File details

Details for the file pypreg-0.1.0-py3-none-any.whl.

File metadata

Download URL: pypreg-0.1.0-py3-none-any.whl
Upload date: Jun 3, 2024
Size: 72.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.0 CPython/3.10.14

File hashes

Hashes for pypreg-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`e9c54cbe52b309a2537eedecc4005fd17da6a6598f92bd2def616a7c2dab67e5`
MD5	`99d1864a9aa383bd126e0cf6fa2b9107`
BLAKE2b-256	`ab849c1834cb25a60b60a3301a0ad83db82ee140ab1a3035192a88279485e916`

See more details on using hashes here.

pypreg 0.1.0

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

pypreg

Features

General Usage Data Format

Pregnancy Outcome Classification

Data Requirements

Code selection

Crosswalk

Change to Moll codes

Expanded codes

Algorithm

Usage

Output

Adverse Pregnancy Outcomes

Code selection

Cesarean section

Usage

Data requirement

Process outcomes

Output

Severe Maternal Morbidity

Code selection

Usage

Data requirement

Process SMM

Output

Obstetric Comorbidity Index

Codes

Scoring

Ranges

Usage

Data requirement

Calculate index

Output

References

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes