A rule-based clinical concept extraction tool that captures microorganisms and estimates infection status from semi-structured microbiology culture reports.
RBMCE (Rule-Based Microbiology Concept Extractor):
This code was developed to provide an open-source Python package for extracting clinical concepts from free-text, semi-structured microbiology reports. The two primary outputs of this package are (1) a binary estimate of the patient's bacterial infection status and (2) a list of all clinically relevant microorganisms found in the report. These outputs were validated on two independent datasets and achieved F1 scores over 0.95 on both outputs when compared to expert review. Full details on the background, algorithm, and validation results are described in our paper (currently being written; this will be updated once it is submitted to an archive).
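For readers unfamiliar with the metric, the F1 score mentioned above is the harmonic mean of precision and recall. The sketch below shows how such a score could be computed for the binary infection-status output against expert labels. The function name and the toy labels are made up for illustration and are not the validation data or code from this project.

```python
def f1_score(predicted, expected):
    """Return the F1 score for two equal-length sequences of booleans."""
    tp = sum(1 for p, e in zip(predicted, expected) if p and e)
    fp = sum(1 for p, e in zip(predicted, expected) if p and not e)
    fn = sum(1 for p, e in zip(predicted, expected) if not p and e)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example only: hypothetical tool output vs. hypothetical expert review.
tool_positive = [True, True, False, True, False]
expert_positive = [True, False, False, True, False]

score = f1_score(tool_positive, expert_positive)
print(round(score, 3))
```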
🏠 Homepage
✨ package
Requirements
* python >=3.6.8
* pandas >=0.25.0
Install
pip install rbmce
Usage
Recommended data structure:
The rbmce.run() function expects a pandas DataFrame with the following elements (the associated column names can be specified as kwargs):
- parsed_note:
  - microbiology report text, either raw or (preferably) split into its components (e.g. gram stain / growth report / antibiotic susceptibility)
- culture_id:
  - a primary key tied to a given sample/specimen + microbiological exam order.
  - A microbiology order can often be tied to numerous components (e.g. gram stain / growth report / antibiotic susceptibility). These can be appended to the same report or added as a new report tied to the same sample + order. All reports tied to a sample + order should share the same culture_id.
- visit_id:
  - primary key for the patient's visit/encounter
  - the culture_id to visit_id relationship can be many:1 or 1:1 (in the 1:1 case, culture_id can be specified as the visit_id)
  - in some datasets a patient may have multiple cultures performed in a single visit/encounter.
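As a rough illustration of the layout described above, the sketch below reshapes a small raw export into a DataFrame with parsed_note, culture_id, and visit_id columns. The raw column names (report_text, order_key, encounter) and two of the report strings are hypothetical and chosen only for the example; only pandas is used, and rbmce itself is not called here.

```python
import pandas as pd

# Hypothetical raw export; these column names are illustrative, not part of rbmce.
raw = pd.DataFrame(
    {
        "encounter": [101, 101, 101],
        "order_key": ["A1", "A1", "B7"],
        "report_text": [
            "GRAM STAIN: no organisms seen.",  # made-up component text
            "No Salmonella, Shigella, Campylobacter, Aeromonas or Plesiomonas isolated.",
            "No growth.",  # made-up component text
        ],
    }
)

# Rename to the element names listed above and keep only those columns.
df = raw.rename(
    columns={
        "report_text": "parsed_note",
        "order_key": "culture_id",
        "encounter": "visit_id",
    }
)[["parsed_note", "culture_id", "visit_id"]]

# Sanity check: every culture_id should map to exactly one visit_id,
# while a visit_id may have several culture_ids.
visits_per_culture = df.groupby("culture_id")["visit_id"].nunique()
assert (visits_per_culture == 1).all()
```

Here the two components of order A1 share one culture_id, and both cultures belong to the same visit, which is the many:1 case.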
Inline:
import rbmce
import pandas as pd
d={'parsed_note': 'No Salmonella, Shigella, Campylobacter, Aeromonas or Plesiomonas isolated.', 'culture_id': 1, 'visit_id': 1}
df=pd.DataFrame(data=d, index=[1])
rbmce.run(df)
Command Line:
See rbcme_run_example.py for an example of an executable Python file that imports the data, formats it, processes it with rbmce, and saves the outputs (an annotated dataframe and a markdown_summary file).
Run tests
Inline:
from rbmce import debug
test_str='No Salmonella, Shigella, Campylobacter, Aeromonas or Plesiomonas isolated.'
debug.rbmce_str_in(test_str)
Command Line:
python -m rbmce.debug 'No Salmonella, Shigella, Campylobacter, Aeromonas or Plesiomonas isolated.'
Author
👤 Garrett Eickelberg
🤝 Contributing
Contributions, issues and feature requests are welcome!
Feel free to check the issues page. You can also take a look at the contributing guide.
Show your support
Give a ⭐️ if this project helped you!
Credits
📝 License
This project is MIT licensed.
This README was created with the markdown-readme-generator