A rule-based clinical concept extraction tool that captures microorganisms and estimates infection status from semi-structured microbiology culture reports.

Project description

RBMCE (Rule-Based Microbiology Concept Extractor):

This code was developed to provide an open-source Python package for extracting clinical concepts from free-text, semi-structured microbiology reports. The package has two primary outputs: (1) a binary estimate of the patient's bacterial infection status and (2) a list of all clinically relevant microorganisms found in the report. These outputs were validated on two independent datasets and achieved F1 scores over 0.95 on both outputs when compared to expert review. Full details on the background, algorithm, and validation results will be available in our paper (currently being written, will update once submitted to archive).
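
To make the two outputs concrete, here is a deliberately tiny sketch of what a rule-based pass over a report might look like. It is not rbmce's algorithm or API: the organism vocabulary, the negation pattern, the choice to treat "any non-negated organism" as a positive status, and the function name `toy_extract` are all assumptions made for illustration. Real reports need far more rules than this.

```python
import re

# Hypothetical, tiny vocabulary; a real tool would need a much larger one.
ORGANISMS = [
    "Salmonella", "Shigella", "Campylobacter", "Aeromonas",
    "Plesiomonas", "Escherichia coli", "Staphylococcus aureus",
]

# Assumed negation rule: a sentence that starts with "No" and later
# contains "isolated", "seen", or "growth" is treated as negative.
NEGATED = re.compile(r"^\s*no\b.*\b(?:isolated|seen|growth)\b", re.IGNORECASE)


def toy_extract(report: str):
    """Return (status, organisms) for one report (illustrative only)."""
    organisms = []
    # Split into rough sentences on '.', ';', or newlines.
    for sentence in re.split(r"(?<=[.;])\s+|\n+", report):
        if not sentence.strip() or NEGATED.search(sentence):
            continue
        for name in ORGANISMS:
            found = re.search(re.escape(name), sentence, re.IGNORECASE)
            if found and name not in organisms:
                organisms.append(name)
    # Toy assumption: status is positive if any organism survived negation.
    return bool(organisms), organisms


negative = toy_extract(
    "No Salmonella, Shigella, Campylobacter, Aeromonas or Plesiomonas isolated."
)
positive = toy_extract("Escherichia coli >100,000 CFU/mL.")
```

Here `negative` holds a false status with an empty organism list, while `positive` holds a true status with the single matched organism.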

🏠 Homepage



Prerequisites

* Python >=3.6.8
* pandas >=0.25.0


Install

pip install rbmce


Recommended data structure:

The function expects a pandas dataframe with the following elements (the associated column names can be specified as kwargs):

  • parsed_note:
    • Microbiology report text, either raw or (preferably) split into its components (e.g. gram stain / growth report / antibiotic susceptibility).
  • culture_id:
    • A primary key tied to a given sample/specimen plus microbiological exam order.
    • A microbiology order is often tied to numerous components (e.g. gram stain / growth report / antibiotic susceptibility). These can be appended to the same report or added as a new report tied to the same sample and order. All reports tied to one sample and order should share the same culture_id.
  • visit_id:
    • Primary key for the patient's visit/encounter.
    • Its relationship to culture_id can be one-to-many or one-to-one (in the one-to-one case, it can be specified as culture_id).
    • In some datasets a patient may have multiple cultures performed during a single visit/encounter.
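
As a rough illustration of the structure described above, the following sketch builds a small dataframe in which one order has two report components sharing a culture_id, and one visit has two cultures. The report strings, the ID values, and the helper `check_structure` are hypothetical and not part of rbmce. The helper only checks the relationships stated in the list and uses plain pandas.

```python
import pandas as pd

REQUIRED_COLUMNS = ["parsed_note", "culture_id", "visit_id"]


def check_structure(df: pd.DataFrame) -> None:
    """Hypothetical helper (not part of rbmce): sanity-check the layout."""
    missing = [c for c in REQUIRED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    # Every culture_id should belong to exactly one visit_id.
    visits_per_culture = df.groupby("culture_id")["visit_id"].nunique()
    if (visits_per_culture > 1).any():
        raise ValueError("a culture_id is tied to more than one visit_id")


# Illustrative rows only: made-up report text and IDs.
df = pd.DataFrame(
    {
        "parsed_note": [
            "GRAM STAIN: no organisms seen.",
            "No Salmonella, Shigella, Campylobacter, Aeromonas or Plesiomonas isolated.",
            "No growth.",
        ],
        # Rows 0 and 1 are components of the same sample + order.
        "culture_id": [1, 1, 2],
        # Both cultures were taken during the same visit.
        "visit_id": [10, 10, 10],
    }
)

check_structure(df)
cultures_per_visit = df.groupby("visit_id")["culture_id"].nunique()
```

If a dataset uses different column names, they can either be renamed to these three before processing or passed through the kwargs mentioned above.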


import rbmce
import pandas as pd

# One row per report; scalar values require an explicit index.
d = {'parsed_note': 'No Salmonella, Shigella, Campylobacter, Aeromonas or Plesiomonas isolated.', 'culture_id': 1, 'visit_id': 1}
df = pd.DataFrame(data=d, index=[1])

Command Line:

An example executable Python file is provided that imports and formats the data, processes it with rbmce, and saves the outputs (an annotated dataframe and a markdown_summary file).

Run tests


from rbmce import debug
test_str = 'No Salmonella, Shigella, Campylobacter, Aeromonas or Plesiomonas isolated.'

Command Line:

python -m rbmce.debug 'No Salmonella, Shigella, Campylobacter, Aeromonas or Plesiomonas isolated.'


👤 Garrett Eickelberg

🤝 Contributing

Contributions, issues and feature requests are welcome!
Feel free to check the issues page. You can also take a look at the contributing guide.

Show your support

Give a ⭐️ if this project helped you!


📝 License

This project is MIT licensed.

This README was created with the markdown-readme-generator

Download files

Source distribution: rbmce-0.0.2.tar.gz (40.7 kB)

Built distribution: rbmce-0.0.2-py3-none-any.whl (44.7 kB, Python 3)
