FHIR to pandas DataFrame for AI and ML
🔥 fhiry — FHIR to Pandas DataFrame for Data Analytics, AI, and ML
FHIRy is a Python package that simplifies health data analytics and machine learning by converting FHIR bundles or NDJSON files from bulk data export into pandas DataFrames. These DataFrames can be used directly with ML libraries such as TensorFlow and PyTorch. FHIRy also supports FHIR server search and FHIR tables on BigQuery.
✨ Features
- Flatten FHIR Bundles/NDJSON to DataFrames for analytics and ML
- Import from FHIR Server via FHIR Search API
- Query FHIR Data on Google BigQuery
- LLM-based Natural Language Queries (see examples/llm_example.py)
- Flexible Filtering and Column Selection
🔧 Quick Start
Installation
Stable release:
pip install fhiry
Latest development version:
pip install git+https://github.com/dermatologist/fhiry.git
LLM support:
pip install "fhiry[llm]"
Usage
1. Import FHIR Bundles (JSON) from Folder
import fhiry.parallel as fp
df = fp.process('/path/to/fhir/resources')
print(df.info())
Example dataset: Synthea
Notebook: notebooks/synthea.ipynb
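To give a feel for where dotted column names such as resource.id come from, here is a minimal standard-library sketch of flattening the entries of a Bundle. It is an illustration only and not FHIRy's implementation: the flatten helper and the tiny hand-written bundle are hypothetical, and FHIRy may treat lists and nested fields differently.

```python
import json


def flatten(obj, prefix=""):
    """Flatten nested dicts into one dict keyed by dotted paths.

    Hypothetical helper for illustration; lists are kept as-is.
    """
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


# A tiny hand-written Bundle (assumption); real Synthea bundles are much larger.
bundle = {
    "resourceType": "Bundle",
    "entry": [
        {
            "fullUrl": "urn:uuid:example-1",
            "resource": {
                "resourceType": "Patient",
                "id": "example-1",
                "gender": "female",
            },
        }
    ],
}

# One flat row per Bundle entry; each row could become one DataFrame row.
rows = [flatten(entry) for entry in bundle["entry"]]
print(json.dumps(rows[0], indent=2))
```

Each flat row maps naturally onto one DataFrame row, with the dotted paths as column names.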
2. Import NDJSON from Folder
import fhiry.parallel as fp
df = fp.ndjson('/path/to/fhir/ndjson/files')
print(df.info())
Example dataset: SMART Bulk Data Server
Notebook: notebooks/ndjson.ipynb
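NDJSON files from a bulk data export hold one JSON resource per line. The sketch below shows that layout using only the standard library. The read_ndjson helper and the sample data are hypothetical and are not part of FHIRy.

```python
import io
import json


def read_ndjson(stream):
    """Yield one parsed FHIR resource per non-empty line (hypothetical helper)."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# In-memory stand-in for an exported .ndjson file (assumption).
sample = io.StringIO(
    '{"resourceType": "Patient", "id": "p1"}\n'
    "\n"
    '{"resourceType": "Observation", "id": "o1"}\n'
)

resources = list(read_ndjson(sample))
print([r["resourceType"] for r in resources])
```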
3. Import FHIR Search Results
Fetch resources from a FHIR server using the FHIR Search API:
from fhiry.fhirsearch import Fhirsearch
fs = Fhirsearch(fhir_base_url="http://fhir-server:8080/fhir")
params = {"code": "http://snomed.info/sct|39065001"}
df = fs.search(resource_type="Condition", search_parameters=params)
print(df.info())
See fhir-search.md for details.
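For orientation, the search above corresponds to a plain HTTP GET against the server's Condition endpoint. The following standard-library sketch shows how such a URL is put together and how a paged result Bundle advertises its next page. Both helpers are hypothetical, and whether Fhirsearch builds requests or follows pages exactly this way is an assumption.

```python
from urllib.parse import urlencode


def build_search_url(base_url, resource_type, params):
    """Build a FHIR search URL (hypothetical helper, not FHIRy's API)."""
    return f"{base_url.rstrip('/')}/{resource_type}?{urlencode(params)}"


def next_page_url(bundle):
    """Return the URL of the 'next' link in a searchset Bundle, or None."""
    for link in bundle.get("link", []):
        if link.get("relation") == "next":
            return link.get("url")
    return None


url = build_search_url(
    "http://fhir-server:8080/fhir",
    "Condition",
    {"code": "http://snomed.info/sct|39065001"},
)
print(url)

# Hand-written page of results (assumption) with a pagination link.
page = {
    "resourceType": "Bundle",
    "type": "searchset",
    "link": [
        {"relation": "self", "url": url},
        {"relation": "next", "url": "http://fhir-server:8080/fhir?page=2"},
    ],
}
print(next_page_url(page))
```

The token value is percent-encoded in the query string, so the system|code pair travels as a single parameter value.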
4. Import from Google BigQuery FHIR Dataset
from fhiry.bqsearch import BQsearch
bqs = BQsearch()
df = bqs.search("SELECT * FROM `bigquery-public-data.fhir_synthea.patient` LIMIT 20")
🚀 5. LLM-based Natural Language Queries
FHIRy supports natural language queries over FHIR bundles/NDJSON using llama-index:
pip install "fhiry[llm]"
See usage: examples/llm_example.py
🚀 6. Convert FHIR Bundles/Resources to Text for LLMs
Convert a FHIR Bundle or resource to a textual representation for LLMs:
from fhiry import FlattenFhir
import json
with open('bundle.json') as f:
    bundle = json.load(f)
flatten_fhir = FlattenFhir(bundle)
print(flatten_fhir.flattened)
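As a rough idea of what a textual representation of a resource can look like, here is a standard-library sketch that renders a Patient resource as one sentence. It does not reproduce FlattenFhir's actual output: the patient_to_text helper, its wording, and the sample patient are all hypothetical.

```python
def patient_to_text(patient):
    """Render a FHIR Patient dict as a short sentence (hypothetical helper)."""
    names = patient.get("name") or [{}]
    first = names[0]
    full_name = " ".join(first.get("given", []) + [first.get("family", "")]).strip()
    parts = [f"Patient {full_name or 'unknown'}"]
    if "gender" in patient:
        parts.append(f"gender {patient['gender']}")
    if "birthDate" in patient:
        parts.append(f"born {patient['birthDate']}")
    return ", ".join(parts) + "."


# Hand-written sample resource (assumption).
patient = {
    "resourceType": "Patient",
    "name": [{"family": "Doe", "given": ["Jane"]}],
    "gender": "female",
    "birthDate": "1980-01-01",
}

text = patient_to_text(patient)
print(text)
```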
Filters and Column Selection
You can pass a config JSON string to the loader functions and constructors to remove or rename columns:
df = fp.process('/path/to/fhir/resources', config_json='{ "REMOVE": ["resource.text.div"], "RENAME": { "resource.id": "id" } }')
fs = Fhirsearch(fhir_base_url="http://fhir-server:8080/fhir", config_json='{ "REMOVE": ["resource.text.div"], "RENAME": { "resource.id": "id" } }')
bqs = BQsearch('{ "REMOVE": ["resource.text.div"], "RENAME": { "resource.id": "id" } }')
See df.columns for available columns.
Example columns:
patientId
fullUrl
resource.resourceType
resource.id
resource.name
resource.telecom
resource.gender
...
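Hand-written JSON strings are easy to get wrong, so one option is to build the config as a Python dict and serialize it with json.dumps. The sketch below does that and also illustrates, on a single plain dict, what removing and renaming columns means. The apply_config helper and the sample row are hypothetical and do not reflect how FHIRy applies the config internally.

```python
import json

# Same REMOVE/RENAME settings as in the examples above, built as a dict.
config = {
    "REMOVE": ["resource.text.div"],
    "RENAME": {"resource.id": "id"},
}
config_json = json.dumps(config)  # string suitable for a config_json argument


def apply_config(row, config):
    """Drop REMOVE keys, then rename RENAME keys (hypothetical helper)."""
    removed = set(config.get("REMOVE", []))
    renames = config.get("RENAME", {})
    kept = {k: v for k, v in row.items() if k not in removed}
    return {renames.get(k, k): v for k, v in kept.items()}


# Hand-written flattened row (assumption).
row = {
    "resource.id": "example-1",
    "resource.text.div": "<div>...</div>",
    "resource.gender": "female",
}
result = apply_config(row, json.loads(config_json))
print(result)
```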
Command Line Interface (CLI)
List the available CLI commands and options:
fhiry --help
Documentation
Full documentation: https://dermatologist.github.io/fhiry/
Contributing
We welcome contributions! See CONTRIBUTING.md.
Give Us a Star ⭐️
If you find this project useful, please give us a star to help others discover it.
Contributors
- Bell Eapen
- Markus Mandalka
- PRs welcome! See CONTRIBUTING.md
File details
Details for the file fhiry-5.2.2.tar.gz.
File metadata
- Download URL: fhiry-5.2.2.tar.gz
- Upload date:
- Size: 715.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f583412afeeb26714a2d2fc3e1bba80e5438804c3d12f257fb08c96538c6e047 |
| MD5 | 6b124314af630e62a2e4eb46df0647f3 |
| BLAKE2b-256 | 792d4106076de73b886aac071e7b41238628915454c126f1a2bb0b4dcc3267f9 |
File details
Details for the file fhiry-5.2.2-py3-none-any.whl.
File metadata
- Download URL: fhiry-5.2.2-py3-none-any.whl
- Upload date:
- Size: 19.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6324df48673314d617dc86686b633ce677ced3cecd65bc4cf1b14fda828ad998 |
| MD5 | 696b5309a2263205d34ae2a6401462e6 |
| BLAKE2b-256 | ff720bdb7a1f6e77df183ba631f63ee9e2765590bd26ac8196f606b21f44512c |