FHIR to pandas.dataframe for AI and ML

:fire: fhiry - FHIR to pandas dataframe for data analytics, AI and ML

Virtual flattened view of FHIR Bundle / ndjson / FHIR server / BigQuery!

:fire: FHIRy is a Python package that facilitates health data analytics and machine learning by converting a folder of FHIR bundles/ndjson from bulk data export into a pandas dataframe for analysis. You can import the dataframe into ML packages such as TensorFlow and PyTorch. FHIRy also supports FHIR server search and FHIR tables on BigQuery.

UPDATE

Recently added support for LLM-based natural language queries of FHIR bundles/ndjson using llama-index. Please be mindful of the privacy issues with publicly hosted LLMs. Any feedback will be highly appreciated. Install the llm extras as follows:

pip install fhiry[llm]

See usage.

Test this with the Synthea sample or with ndjson downloaded from the SMART Bulk Data Server. Use the 'Discussions' tab of the GitHub repository for feature requests.

:sparkles: Check out this template for multimodal machine learning in healthcare!

Installation

Stable

pip install fhiry

Latest dev version

pip install git+https://github.com/dermatologist/fhiry.git

Usage

1. Import FHIR bundles (JSON) from folder to pandas dataframe

import fhiry.parallel as fp
df = fp.process('/path/to/fhir/resources')
print(df.info())

Example source data set: Synthea

Jupyter notebook example: notebooks/synthea.ipynb
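
For orientation, a FHIR Bundle is a single JSON document whose entry array holds one fullUrl/resource pair per resource. The sketch below uses only the standard library and a made-up two-entry bundle (the IDs and URLs are illustrative, not taken from Synthea) to show that shape. It does not reproduce how fhiry parses bundles internally.

```python
import json
from collections import Counter

# Hypothetical minimal Bundle; real Synthea bundles hold many more entries and fields.
bundle_text = """
{
  "resourceType": "Bundle",
  "type": "collection",
  "entry": [
    {"fullUrl": "urn:uuid:example-patient",
     "resource": {"resourceType": "Patient", "id": "example-patient", "gender": "female"}},
    {"fullUrl": "urn:uuid:example-condition",
     "resource": {"resourceType": "Condition", "id": "example-condition"}}
  ]
}
"""

bundle = json.loads(bundle_text)

# Each entry wraps exactly one resource; count how many of each type the bundle holds.
resource_counts = Counter(
    entry["resource"]["resourceType"] for entry in bundle["entry"]
)
print(dict(resource_counts))
```

Each entry in such a bundle is a natural candidate for one dataframe row.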

2. Import NDJSON from folder to pandas dataframe

import fhiry.parallel as fp
df = fp.ndjson('/path/to/fhir/ndjson/files')
print(df.info())

Example source data set: SMART Bulk Data Server Export

Jupyter notebook example: notebooks/ndjson.ipynb
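
As background, NDJSON (newline-delimited JSON) stores one complete resource per line with no enclosing array, which is the layout bulk data exports use. The following is a minimal standard-library sketch of reading that layout. The sample lines and the helper name parse_ndjson are assumptions for illustration, not part of fhiry.

```python
import json

# Hypothetical NDJSON content: one complete FHIR resource per line.
ndjson_text = (
    '{"resourceType": "Patient", "id": "p1", "gender": "male"}\n'
    '{"resourceType": "Patient", "id": "p2", "gender": "female"}\n'
)


def parse_ndjson(text):
    """Parse newline-delimited JSON into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


resources = parse_ndjson(ndjson_text)
print([resource["id"] for resource in resources])
```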

3. Import FHIR Search results to pandas dataframe

Fetch resources from FHIR Search API results and import them into a pandas dataframe.

Documentation: fhir-search.md

Example: Import all conditions with a certain code from FHIR Server

Fetch all Condition resources with SNOMED CT (code system http://snomed.info/sct) code 39065001 in the FHIR element Condition.code (matched through the resource-type-specific FHIR search parameter code) and import them into a pandas dataframe:

from fhiry.fhirsearch import Fhirsearch

fs = Fhirsearch(fhir_base_url = "http://fhir-server:8080/fhir")

my_fhir_search_parameters = {
    "code": "http://snomed.info/sct|39065001",
}

df = fs.search(resource_type = "Condition", search_parameters = my_fhir_search_parameters)

print(df.info())
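
To make the search parameters above more concrete: in the FHIR search syntax, a request takes the form [base]/[type]?[parameters], and a token parameter such as code is written as system|code. The sketch below builds that URL with the standard library. The helper build_search_url is hypothetical, and how fhiry itself assembles its requests is not shown in this document.

```python
from urllib.parse import urlencode


def build_search_url(fhir_base_url, resource_type, search_parameters):
    """Build a FHIR search URL of the form [base]/[type]?[parameters].

    Illustrative helper only; it is not part of fhiry.
    """
    query = urlencode(search_parameters)  # percent-encodes ':', '/' and '|'
    return f"{fhir_base_url.rstrip('/')}/{resource_type}?{query}"


url = build_search_url(
    "http://fhir-server:8080/fhir",
    "Condition",
    {"code": "http://snomed.info/sct|39065001"},
)
print(url)
```

Pasting the printed URL into a browser or HTTP client is a quick way to check what a server returns before loading the results into a dataframe.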

4. Import Google BigQuery FHIR dataset

from fhiry.bqsearch import BQsearch
bqs = BQsearch()

df = bqs.search("SELECT * FROM `bigquery-public-data.fhir_synthea.patient` LIMIT 20") # can be a path to .sql file

Filters

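Pass a config JSON string to fp.process() or to the Fhirsearch / BQsearch constructors. In the examples below, REMOVE lists the columns to drop and RENAME maps existing column names to new ones.

  • config_json can also be a path to a JSON file.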
df = fp.process('/path/to/fhir/resources', config_json='{ "REMOVE": ["resource.text.div"], "RENAME": { "resource.id": "id" }  }')

fs = Fhirsearch(fhir_base_url = "http://fhir-server:8080/fhir", config_json = '{ "REMOVE": ["resource.text.div"], "RENAME": { "resource.id": "id" }  }')

bqs = BQsearch('{ "REMOVE": ["resource.text.div"], "RENAME": { "resource.id": "id" }  }')
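
The effect of such a config can be sketched on a single flattened row using plain dictionaries. This is an illustration under assumptions: the helper apply_config and the sample row are hypothetical, and the order shown (remove first, then rename) is a guess rather than documented fhiry behaviour. In pandas terms the result resembles df.drop(columns=...) followed by df.rename(columns=...).

```python
import json

config_json = '{ "REMOVE": ["resource.text.div"], "RENAME": { "resource.id": "id" } }'


def apply_config(record, config):
    """Drop the columns listed under REMOVE, then rename keys per RENAME."""
    removed = set(config.get("REMOVE", []))
    renames = config.get("RENAME", {})
    return {
        renames.get(column, column): value
        for column, value in record.items()
        if column not in removed
    }


# Hypothetical flattened row, keyed by column name.
row = {
    "resource.id": "p1",
    "resource.gender": "male",
    "resource.text.div": "<div>...</div>",
}
filtered = apply_config(row, json.loads(config_json))
print(filtered)
```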

Columns

  • see df.columns
patientId
fullUrl
resource.resourceType
resource.id
resource.name
resource.telecom
resource.gender
...
...
...
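
The dotted column names reflect the nesting of the original JSON. As a rough sketch of how such names can arise, the function below flattens nested dictionaries into dotted keys and leaves lists (such as name or telecom) as single values, similar in spirit to pandas.json_normalize. The function flatten and the sample entry are assumptions for illustration. fhiry's actual implementation may differ, and derived columns such as patientId are not produced here.

```python
def flatten(record, prefix=""):
    """Flatten nested dicts into dotted keys; lists and scalars stay as values."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


# Hypothetical bundle entry, trimmed to a few fields.
entry = {
    "fullUrl": "urn:uuid:example-patient",
    "resource": {
        "resourceType": "Patient",
        "id": "example-patient",
        "name": [{"family": "Example", "given": ["Pat"]}],
        "gender": "female",
    },
}

columns = list(flatten(entry))
print(columns)
```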

Give us a star ⭐️

If you find this project useful, give us a star. It helps others discover the project.
