Sequence patient electronic healthcare record generation with large language models (LLMs) as the neural database.
Project description
PromptEHR
Wang, Zifeng and Sun, Jimeng. (2022). PromptEHR: Conditional Electronic Healthcare Records Generation with Prompt Learning. EMNLP'22.
News
- [2023/01/08]
PromptEHR
is now integrated intoPyTrial
with a complete documentation and example, please check! New version with bugs fixed is also released!
Usage
Get pretrained PromptEHR model (learned on MIMIC-III sequence EHRs) in three lines:
from promptehr import PromptEHR
model = PromptEHR()
model.from_pretrained()
A jupyter example is available at https://github.com/RyanWangZf/PromptEHR/blob/main/example/demo_promptehr.ipynb.
How to install
Install the correct PyTorch
version by referring to https://pytorch.org/get-started/locally/.
Then try to install PromptEHR
by
pip install git+https://github.com/RyanWangZf/PromptEHR.git
or
pip install promptehr
Load demo synthetic EHRs (generated by PromptEHR)
from promptehr import load_synthetic_data
data = load_synthetic_data()
Use PromptEHR for generation
from promptehr import SequencePatient
from promptehr import load_synthetic_data
from promptehr import PromptEHR
# init model
model = PromptEHR()
model.from_pretrained()
# load input data
demo = load_synthetic_data(n_sample=1000) # we have 10,000 samples in total
# build the standard input data for train or test PromptEHR models
seqdata = SequencePatient(data={'v':demo['visit'], 'y':demo['y'], 'x':demo['feature'],},
metadata={
'visit':{'mode':'dense'},
'label':{'mode':'tensor'},
'voc':demo['voc'],
'max_visit':20,
}
)
# you can try to fit on this data by
# model.fit(seqdata)
# start generate
# n: the target total number of samples to generate
# n_per_sample: based on each sample, how many fake samples will be generated
# the output will have the same format of `SequencePatient`
fake_data = model.predict(seqdata, n=1000, n_per_sample=10)
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
PromptEHR-0.0.5.tar.gz
(188.2 kB
view hashes)
Built Distribution
PromptEHR-0.0.5-py3-none-any.whl
(260.6 kB
view hashes)
Close
Hashes for PromptEHR-0.0.5-py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | 858513481da3d093d14f2dc4c830698a83970743074af34f11d6c8c8aa899b34 |
|
MD5 | 238ecf5aaac190bd43673f1d0408ed71 |
|
BLAKE2b-256 | b43c767e0f90d071eab0932c04337d21d68ece7eb00889b425a72cd7d9abeeb2 |