Pretrained BERT models for encoding clinical trial documents to compact embeddings.
Trial2Vec
Wang, Zifeng and Sun, Jimeng. (2022). Trial2Vec: Zero-Shot Clinical Trial Document Similarity Search using Self-Supervision. Findings of EMNLP'22.
Usage
Get the pretrained Trial2Vec model in three lines:
from trial2vec import Trial2Vec
model = Trial2Vec()
model.from_pretrained()
How to install
First install a suitable PyTorch version by referring to https://pytorch.org/get-started/locally/. Then install Trial2Vec from GitHub:
pip install git+https://github.com/RyanWangZf/Trial2Vec.git
or from PyPI:
pip install trial2vec
Search similar trials
Use Trial2Vec to search for similar clinical trials:
# load demo data
from trial2vec import load_demo_data
data = load_demo_data()

# 'x' contains the trial documents
test_data = {'x': data['x']}

# retrieve similar trials with the pretrained model loaded above
pred = model.predict(test_data)
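Similarity search over trial embeddings boils down to nearest-neighbor ranking. The following is a hypothetical illustration of cosine-similarity ranking with NumPy; it is not the library's internal code, and the toy 4-dimensional vectors are made up for the example:

```python
import numpy as np

def rank_by_cosine(query_emb, candidate_embs):
    """Return candidate indices sorted by descending cosine similarity."""
    q = query_emb / np.linalg.norm(query_emb)
    c = candidate_embs / np.linalg.norm(candidate_embs, axis=1, keepdims=True)
    sims = c @ q  # cosine similarity of each candidate to the query
    return np.argsort(-sims)

# toy 4-dim embeddings for a query trial and three candidate trials
query = np.array([1.0, 0.0, 0.0, 0.0])
cands = np.array([
    [0.9, 0.1, 0.0, 0.0],  # nearly parallel to the query
    [0.0, 1.0, 0.0, 0.0],  # orthogonal to the query
    [0.5, 0.5, 0.0, 0.0],  # in between
])
order = rank_by_cosine(query, cands)
print(order)  # [0 2 1]
```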
Encode trials
Use Trial2Vec to encode clinical trial documents:
test_data = {'x': df} # a DataFrame of trial documents
emb = model.encode(test_data) # run inference

# or look up the pre-encoded trial documents by NCT ID
emb = [model[nct_id] for nct_id in test_data['x']['nct_id']]
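Once per-trial embeddings are in hand (e.g. retrieved by NCT ID as above), a common next step is to stack them into a matrix for downstream tasks. A minimal sketch with NumPy; the NCT IDs and 4-dimensional vectors here are fabricated placeholders, not real Trial2Vec outputs:

```python
import numpy as np

# hypothetical per-trial embedding store: NCT ID -> embedding vector
emb_store = {
    "NCT00000001": np.array([0.1, 0.2, 0.3, 0.4]),
    "NCT00000002": np.array([0.4, 0.3, 0.2, 0.1]),
}

# stack into an (n_trials, dim) matrix, keeping ID order alongside it
nct_ids = list(emb_store)
emb_matrix = np.stack([emb_store[i] for i in nct_ids])
print(emb_matrix.shape)  # (2, 4)
```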