This project contains standardized tools to use LLMs in research studies for improving patient care.

project_ryland

Description

This project develops standardized tools to use LLMs in research studies for improving patient care. The two main features are:

  1. Enable a user-friendly access point for the use of GPT4DFCI, especially for those without a computing background
  2. (In development) Offer a set of tools to process OncDRS data and prepare it for use in LLM research

History

This project was conceived in fall 2025 when Justin Vinh noticed that the Dana-Farber Cancer Institute in Boston, MA, had no modular, user-friendly package for working with the newly offered GPT4DFCI. GPT4DFCI is the HIPAA-compliant large language model (LLM) interface offered to researchers, and its associated API is powerful when put to use. He developed this project in collaboration with Thomas Sounack and with the support of the Lindvall Lab to fill this gap.

RYLAND stands for "Research sYstem for LLM-based Analytics of Novel Data." Ryland is the protagonist of Justin's favorite book, Project Hail Mary by Andy Weir.

Project Organization

project_ryland/
├── .github/
│   └── workflows/
│       └── publish.yml
├── .gitignore
├── CHANGELOG.md
├── LICENSE
├── project_ryland/
│   ├── __init__.py
│   └── llm_utils/
│       ├── __init__.py
│       ├── llm_config.py
│       └── llm_generation_utils.py
├── pyproject.toml
└── README.md

Features of the LLM Utilities Package

  1. Enables user-friendly use of the GPT4DFCI API
  2. Enables the use of a prompt library to keep track of prompts and associated metadata

Instructions for Use

1. Installing the API

  1. Ensure that you are on the DFCI network or running the VPN client.
  2. Follow the instructions on the Azure website to install the Azure CLI tool. This will be necessary to enable the API for GPT4DFCI.
  3. Once installed, run this command in Terminal (macOS) or Command Prompt (Windows):
az login --allow-no-subscriptions
  4. Running this command opens a window where you can log in to your account. Log in.
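
If you would rather run the login step from Python (for example at the top of a notebook), the following is a minimal sketch. It only wraps the `az login --allow-no-subscriptions` command shown above; the helper names `azure_cli_path` and `login` are hypothetical and are not part of Project Ryland.

```python
import shutil
import subprocess

# Arguments taken from the login command in step 3.
LOGIN_ARGS = ["login", "--allow-no-subscriptions"]


def azure_cli_path():
    """Return the full path of the `az` executable, or None if it is not on PATH."""
    return shutil.which("az")


def login():
    """Run `az login --allow-no-subscriptions`; raise if the CLI is missing or login fails."""
    az = azure_cli_path()
    if az is None:
        raise RuntimeError("Azure CLI not found on PATH; install it before logging in.")
    # check=True raises CalledProcessError when the command exits with an error.
    subprocess.run([az, *LOGIN_ARGS], check=True)
```

Calling `login()` opens the same sign-in window as running the command by hand.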

2. Using Project Ryland

Note: A copy-paste version of the script is available at the end. Variable definitions can also be found at the end after the example script.

Note: You must be using the VPN client or be on the DFCI network to use GPT4DFCI.

  1. If this is your first time using Project Ryland, you must install it into your environment. In Terminal or Command Prompt, run the following:
pip install project_ryland

  2. Import llm_generation_utils from Project Ryland

from project_ryland.llm_utils import llm_generation_utils as llm
  3. In your Jupyter notebook or Python script, define your endpoint and entra_scope. The endpoint is user-specific, while the entra_scope is the same for all users (current default for DFCI shown below). These values should have been provided when you were granted GPT4DFCI API access.
  4. Specify the LLM model that you will be using to run your prompts.
ENDPOINT = "https://xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
ENTRA_SCOPE = "https://cognitiveservices.azure.com/.default"
model_name = "gpt-5"
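
Because the endpoint is user-specific, you may prefer not to hard-code it in a shared script. The sketch below reads the three values from environment variables instead. The variable names (`GPT4DFCI_ENDPOINT`, `GPT4DFCI_ENTRA_SCOPE`, `GPT4DFCI_MODEL`) and the helper `load_gpt4dfci_settings` are assumptions made for illustration; Project Ryland itself does not define them.

```python
import os

# Default scope and model copied from the example values above.
DEFAULT_ENTRA_SCOPE = "https://cognitiveservices.azure.com/.default"
DEFAULT_MODEL = "gpt-5"


def load_gpt4dfci_settings(env=None):
    """Build the endpoint, scope, and model settings from environment variables.

    The variable names are hypothetical; choose whatever fits your setup.
    """
    env = os.environ if env is None else env
    endpoint = env.get("GPT4DFCI_ENDPOINT", "").strip()
    if not endpoint.startswith("https://"):
        raise ValueError("Set GPT4DFCI_ENDPOINT to the https:// endpoint you were given.")
    return {
        "endpoint": endpoint,
        "entra_scope": env.get("GPT4DFCI_ENTRA_SCOPE", DEFAULT_ENTRA_SCOPE),
        "model_name": env.get("GPT4DFCI_MODEL", DEFAULT_MODEL),
    }
```

The returned values can then be used wherever `ENDPOINT`, `ENTRA_SCOPE`, and `model_name` appear in this walkthrough.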
  5. Call LLM_wrapper to initialize the API.
    • Note that this only has to be done once per run. You can then call the API multiple times in the same run.
LLM_wrapper = llm.LLM_wrapper(
    model_name,
    endpoint=ENDPOINT,
    entra_scope=ENTRA_SCOPE,
)
  6. Declare the path to your input CSV file.
  7. Declare the path to your LLM prompt gallery (prompt library) if you will be using that feature. A template prompt gallery is available for download from the GitHub repository. Place the gallery in the same directory as your main script. Using the gallery is highly recommended for tracking prompt texts, prompt structures, and associated metadata.

input_file = 'pathology_llm_tests.csv'
gallery_path = "llm_prompt_gallery"
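
Before sending a whole file to the LLM, it can help to confirm that the CSV actually contains the text column you plan to pass. The following is a minimal sketch using only the standard library; `check_text_column` is a hypothetical helper, not a Project Ryland function, and the column name `SECTION_TEXT` is taken from the example call in this walkthrough.

```python
import csv


def check_text_column(csv_path, text_column):
    """Return how many data rows have non-empty text in `text_column`.

    Raises KeyError if the column is not present in the CSV header.
    """
    with open(csv_path, newline="", encoding="utf-8") as fh:
        reader = csv.DictReader(fh)
        columns = reader.fieldnames or []
        if text_column not in columns:
            raise KeyError(f"Column {text_column!r} not found; available columns: {columns}")
        # Count rows whose text cell is present and not just whitespace.
        return sum(1 for row in reader if (row[text_column] or "").strip())
```

For example, `check_text_column(input_file, "SECTION_TEXT")` returns the number of rows that have text to process.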
  8. Use the generation function to obtain your LLM output.
# Assumes `ps` (the module that defines your output format classes) and
# `gwas_prompt_variables_v1` (your prompt variables) are already defined.
dfp_new = LLM_wrapper.process_text_data(
    # Essential to specify
    input_file_path=input_file,
    text_column="SECTION_TEXT",
    format_class=ps.AssessNanoPathology,
    use_prompt_gallery=False,

    # Specify if using the prompt gallery, else put None
    prompt_gallery_path=gallery_path,
    prompt_to_get="gwas_symptoms_prompt_v1",
    user_prompt_vars=gwas_prompt_variables_v1,

    # Specify if NOT using the prompt gallery, else put None
    prompt_text="Give me a hello",

    # Optional to specify
    output_dir="output_tests",
    flatten=True,
    sample_mode=False,
    resume=True,
    keep_checkpoints=False,
    save_every=10,
)
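
The `resume` and `save_every` options suggest a checkpointing pattern: results are saved periodically, and a rerun skips rows that were already completed. The sketch below illustrates that general pattern with the standard library only. It is not the package's implementation; `run_with_checkpoints`, `load_checkpoint`, and the checkpoint file layout are all assumptions made for illustration.

```python
import csv
import os


def load_checkpoint(path):
    """Return {row_index: output} for rows already saved, or {} if no file exists."""
    if not os.path.exists(path):
        return {}
    with open(path, newline="", encoding="utf-8") as fh:
        return {int(idx): out for idx, out in csv.reader(fh)}


def _flush(path, rows):
    """Append (row_index, output) pairs to the checkpoint file."""
    if not rows:
        return
    with open(path, "a", newline="", encoding="utf-8") as fh:
        csv.writer(fh).writerows(rows)


def run_with_checkpoints(texts, generate, checkpoint_path, save_every=10):
    """Call generate(text) for each unsaved row, saving every `save_every` rows.

    `generate` stands in for the LLM call and must return a string.
    """
    done = load_checkpoint(checkpoint_path)
    pending = []
    for idx, text in enumerate(texts):
        if idx in done:
            continue  # resume: this row was completed in an earlier run
        pending.append((idx, generate(text)))
        if len(pending) >= save_every:
            _flush(checkpoint_path, pending)
            done.update(pending)
            pending = []
    # Save whatever is left after the last full batch.
    _flush(checkpoint_path, pending)
    done.update(pending)
    return [done[i] for i in range(len(texts))]
```

With this pattern, an interrupted run loses at most `save_every - 1` completed rows, and rerunning the same call picks up where it left off.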
  9. Alternatively, use the quickstart to get off the ground quickly!
from project_ryland.templates.quickstart import create_quickstart
create_quickstart(dest="~/quickstart")

Or use the command-line tool:

project-ryland-init quickstart

Download files

Source Distribution

project_ryland-2.1.6.tar.gz (112.9 kB)

Built Distribution

project_ryland-2.1.6-py3-none-any.whl (116.3 kB)

File details

Details for the file project_ryland-2.1.6.tar.gz.

File metadata

  • Download URL: project_ryland-2.1.6.tar.gz
  • Size: 112.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for project_ryland-2.1.6.tar.gz
Algorithm Hash digest
SHA256 e3ea9fae3a2acfe1010d61b85841b64bf3b9346858a0e97ca4cd0c67c3cff6ff
MD5 a88868678af3187bd61fe9382ca61001
BLAKE2b-256 0b5e348e60d393e3044e07c4c45488e21a2c62e7cd3211438b98e02687ac8090
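
To check a downloaded file against the SHA256 digest listed above, a short script like the following can be used. This is a minimal sketch using Python's standard `hashlib` module; the helper names are hypothetical, and the expected digest is the source distribution's SHA256 value copied from the list above.

```python
import hashlib

# SHA256 digest listed above for project_ryland-2.1.6.tar.gz.
EXPECTED_SDIST_SHA256 = "e3ea9fae3a2acfe1010d61b85841b64bf3b9346858a0e97ca4cd0c67c3cff6ff"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SDIST_SHA256):
    """Return True if the file's SHA256 digest equals the expected digest."""
    return sha256_of_file(path) == expected.lower()
```

For example, `matches_expected("project_ryland-2.1.6.tar.gz")` should return True for an intact download of the source distribution.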

Provenance

The following attestation bundles were made for project_ryland-2.1.6.tar.gz:

Publisher: publish.yml on justin-vinh/project_ryland

File details

Details for the file project_ryland-2.1.6-py3-none-any.whl.

File metadata

  • Download URL: project_ryland-2.1.6-py3-none-any.whl
  • Size: 116.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for project_ryland-2.1.6-py3-none-any.whl
Algorithm Hash digest
SHA256 5ae07f560a5eb8c221898b0b6af7ea065438e443b5a31f10f643f49d092552ed
MD5 7858617d236e1b26d91867e71ff8b75b
BLAKE2b-256 8ff0a957bb163969bf3f07a0b6fba56f4b2da03e09e0e8592109804dda91c1b7

Provenance

The following attestation bundles were made for project_ryland-2.1.6-py3-none-any.whl:

Publisher: publish.yml on justin-vinh/project_ryland
