This project contains standardized tools to use LLMs in research studies for improving patient care.
project_ryland
Description
This project develops standardized tools to use LLMs in research studies for improving patient care. The two main features are:
- Provide a user-friendly access point for GPT4DFCI, especially for those without a computing background
- (In development) Offer a set of tools to process OncDRS data and prepare it for use in LLM research
History
This project was conceived in fall 2025 when Justin Vinh noticed that no modular, user-friendly package existed at the Dana-Farber Cancer Institute in Boston, MA, to allow users to take advantage of the newly offered GPT4DFCI. GPT4DFCI is the HIPAA-compliant large language model (LLM) interface offered to researchers, and the associated API can be powerful if utilized. So he developed this project in collaboration with Thomas Sounack and the support of the Lindvall Lab to fill this gap.
RYLAND stands for "Research sYstem for LLM-based Analytics of Novel Data." Ryland is the protagonist of Justin's favorite book Project Hail Mary by Andy Weir.
Project Organization
project_ryland/
├── .github/
│   └── workflows/
│       └── publish.yml
├── .gitignore
├── CHANGELOG.md
├── LICENSE
├── project_ryland/
│   ├── __init__.py
│   └── llm_utils/
│       ├── __init__.py
│       ├── llm_config.py
│       └── llm_generation_utils.py
├── pyproject.toml
└── README.md
Features of the LLM Utilities Package
- Enables user-friendly use of the GPT4DFCI API
- Enables the use of a prompt library to keep track of prompts and associated metadata
Instructions for Use
1. Installing the API
- Ensure that you are on the DFCI network or running the VPN client.
- Follow the instructions on the Azure website to install the Azure CLI tool. This will be necessary to enable the API for GPT4DFCI.
- Once installed, run this command in Terminal (MacOS) or Command Prompt (Windows):
az login --allow-no-subscriptions
- Running the prior command will open a window for you to log in to your account. Log in.
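Before moving on, it can help to confirm that the Azure CLI is reachable from the environment where your script will run. The sketch below is not part of Project Ryland. The helper name `azure_cli_available` is hypothetical, and it only checks that an `az` executable is on your PATH, not that your login succeeded.

```python
import shutil


def azure_cli_available() -> bool:
    """Return True if an `az` executable can be found on the PATH."""
    # shutil.which returns the full path of the executable, or None if absent.
    return shutil.which("az") is not None


if __name__ == "__main__":
    if azure_cli_available():
        print("Azure CLI found. Run `az login --allow-no-subscriptions` if you have not logged in.")
    else:
        print("Azure CLI not found. Install it before using GPT4DFCI.")
```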
2. Using Project Ryland
Note: A copy-paste version of the script is available at the end. Variable definitions can also be found at the end after the example script.
Note: You must be using the VPN client or be on the DFCI network to use GPT4DFCI.
- If this is your first time using Project Ryland, you must install it into your environment. In Terminal or Command Prompt, run the following:
pip install project_ryland
- Import llm_generation_utils from Project Ryland:
from project_ryland.llm_utils import llm_generation_utils as llm
- In your Jupyter notebook or Python script, define your endpoint and entra_scope. The endpoint is user-specific, while the entra_scope is the same for all users (current default for DFCI shown below). These values should have been provided when you were granted GPT4DFCI API access.
- Specify the LLM model that you will be using to run your prompts.
- Model names can be found in the llm_config.py file.
ENDPOINT = "https://xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
ENTRA_SCOPE = "https://cognitiveservices.azure.com/.default"
model_name="gpt-5"
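Because the endpoint is user-specific, you may prefer not to hard-code it in a shared notebook. A minimal sketch, assuming you store it in an environment variable: the variable name `GPT4DFCI_ENDPOINT` and the helper `load_endpoint` are hypothetical and are not defined by Project Ryland.

```python
import os
from urllib.parse import urlparse

# Default scope from the example above; the same for all DFCI users.
ENTRA_SCOPE = "https://cognitiveservices.azure.com/.default"


def load_endpoint(env_var: str = "GPT4DFCI_ENDPOINT") -> str:
    """Read the user-specific endpoint from an environment variable (hypothetical name)."""
    endpoint = os.environ.get(env_var, "").strip()
    if not endpoint:
        raise ValueError(f"Set the {env_var} environment variable to your endpoint URL.")
    parsed = urlparse(endpoint)
    # Reject values that are not https URLs with a host name.
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError(f"{env_var} must be an https URL, got: {endpoint!r}")
    return endpoint
```

With this in place, `ENDPOINT = load_endpoint()` can stand in for the hard-coded string, and a missing or malformed value fails early with a readable message.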
- Call llm.LLM_wrapper to initialize the API.
- Note that this only has to be done once per run. You can call the API multiple times in one run.
LLM_wrapper = llm.LLM_wrapper(
    model_name,
    endpoint=ENDPOINT,
    entra_scope=ENTRA_SCOPE,
)
- Declare the path to your input CSV file.
- Declare the path to your LLM prompt gallery if you will be using that feature. A template prompt gallery is available for download from the GitHub repository. Add the gallery to the same directory as your main script. Using the gallery is highly recommended to track prompt texts, prompt structures, and associated metadata.
input_file = 'pathology_llm_tests.csv'
gallery_path = "llm_prompt_gallery"
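Since the generation step reads one named text column from the input CSV, a quick check that the file exists and contains that column can save a failed run. The sketch below uses only the standard library. The helper name `check_text_column` is hypothetical and is not part of Project Ryland.

```python
import csv
from pathlib import Path


def check_text_column(csv_path: str, text_column: str) -> int:
    """Return the number of data rows, after confirming the text column exists."""
    path = Path(csv_path)
    if not path.is_file():
        raise FileNotFoundError(f"Input file not found: {path}")
    with path.open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        # Accessing fieldnames reads the header row.
        columns = reader.fieldnames or []
        if text_column not in columns:
            raise KeyError(f"Column {text_column!r} not in {columns}")
        # Count the remaining rows (the header was already consumed).
        return sum(1 for _ in reader)
```

For the example above, `check_text_column(input_file, "SECTION_TEXT")` would either return the row count or raise an error naming the problem.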
- Use the generation function, process_text_data, to obtain your LLM output.
dfp_new = LLM_wrapper.process_text_data(
    # Essential to specify
    input_file_path=input_file,
    text_column="SECTION_TEXT",
    # Assumes ps is an already-imported module that defines your output format classes
    format_class=ps.AssessNanoPathology,
    use_prompt_gallery=False,
    # Specify if using the prompt gallery, else put None
    prompt_gallery_path=gallery_path,
    prompt_to_get="gwas_symptoms_prompt_v1",
    # Assumes gwas_prompt_variables_v1 is defined earlier in your script
    user_prompt_vars=gwas_prompt_variables_v1,
    # Specify if NOT using the prompt gallery, else put None
    prompt_text="Give me a hello",
    # Optional to specify
    output_dir="output_tests",
    flatten=True,
    sample_mode=False,
    resume=True,
    keep_checkpoints=False,
    save_every=10,
)
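The optional `resume`, `save_every`, and `keep_checkpoints` arguments suggest periodic checkpointing. Their exact behavior is not documented here, so the following is only a generic sketch of that pattern, not Project Ryland's implementation. It assumes results are written to a checkpoint file every `save_every` rows and that a resumed run skips rows already saved. The names `run_with_checkpoints` and `fake_llm` are hypothetical.

```python
import json
from pathlib import Path
from typing import Callable


def run_with_checkpoints(
    texts: list[str],
    generate: Callable[[str], str],
    checkpoint_path: str,
    save_every: int = 10,
    resume: bool = True,
) -> list[str]:
    """Apply `generate` to each text, saving partial results periodically."""
    path = Path(checkpoint_path)
    results: list[str] = []
    if resume and path.exists():
        # Reload outputs from an earlier, interrupted run.
        results = json.loads(path.read_text(encoding="utf-8"))
    # Start from the first row that has no saved output yet.
    for index in range(len(results), len(texts)):
        results.append(generate(texts[index]))
        if len(results) % save_every == 0:
            path.write_text(json.dumps(results), encoding="utf-8")
    # Final save so the checkpoint reflects the completed run.
    path.write_text(json.dumps(results), encoding="utf-8")
    return results


def fake_llm(text: str) -> str:
    """Stand-in for a real model call."""
    return text.upper()
```

The design point is that an interrupted run loses at most `save_every - 1` completed rows, which matters when each row costs an API call.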
- Alternatively, use the quickstart to get off the ground quickly!
from project_ryland.templates.quickstart import create_quickstart
create_quickstart(dest="~/quickstart")
or use the command line tool:
project-ryland-init quickstart