Retrieval augmented validation for TF inference in single cell biology

Project description

RAGulate

Retrieval-Augmented Generation for Post-hoc Literature-Grounded Regulatory Assessment

Introduction

RAGulate is a Retrieval-Augmented Generation (RAG) pipeline that integrates domain-specific large language models (LLMs) with curated literature to identify, score, and assess inferred regulatory (transcription factor-target gene) interactions in their biological context.

For further information and example tutorials, please check our documentation.

If you have any questions or concerns, feel free to open an issue.

Requirements

RAGulate is implemented in the LlamaIndex framework. Running RAGulate on CUDA is highly recommended if available.

Before installing and running RAGulate, ensure you have the following libraries installed:

  • PyTorch (version 2.0 or higher)
    Install with the exact command from the PyTorch “Get Started” page for your OS, Python version and (optionally) CUDA toolkit.
  • NumPy (version 1.23 or higher)

You can install these dependencies using pip:

pip install torch numpy
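After installing, it can help to confirm that the installed versions meet the minimums listed above. The following is a minimal sketch using only the standard library; the helper names `parse_major_minor` and `check_requirements` are hypothetical and are not part of RAGulate.

```python
import re
from importlib import metadata

# Minimum (major, minor) versions taken from the requirements list above.
MINIMUMS = {"torch": (2, 0), "numpy": (1, 23)}


def parse_major_minor(version):
    """Return (major, minor) from a version string such as '2.3.1+cu121'."""
    match = re.match(r"(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2) or 0)


def check_requirements(minimums=MINIMUMS):
    """Map each distribution name to 'missing', '<version> (ok)' or '<version> (too old)'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        status = "ok" if parse_major_minor(installed) >= minimum else "too old"
        report[name] = f"{installed} ({status})"
    return report


if __name__ == "__main__":
    for name, status in check_requirements().items():
        print(f"{name}: {status}")
```

Only the major and minor components are compared, which is enough for the two thresholds stated here; pre-release suffixes and local tags such as `+cu121` are ignored.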

Installation

Option 1 (Coming soon):
You can install RAGulate via pip for a lightweight installation:

pip install ragulate-bio

Option 2 (Coming soon):
Alternatively, if you want the latest, unreleased version, you can install it directly from the source on GitHub:

pip install git+https://github.com/YDaiLab/RAGulate.git

Import

import ragulate_bio as ragulate  # recommended alias

Note: The PyPI distribution is named ragulate-bio to avoid a name conflict with an unrelated project called ragulate. Always import ragulate_bio in Python (you may alias it to ragulate for convenience).
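To see which of the two distributions is present in the current environment, a small standard-library check along these lines may be useful. This is a sketch, not RAGulate's own conflict detection: the helper names are hypothetical, and it assumes the unrelated project is published under the distribution name `ragulate`.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def summarize(ours, other):
    """Describe the environment given the two versions (None means not installed)."""
    if ours is None:
        return "ragulate-bio is not installed"
    if other is not None:
        return (
            f"ragulate-bio {ours} and ragulate {other} are both installed; "
            "consider a clean virtual environment"
        )
    return f"ragulate-bio {ours} is installed with no conflicting distribution"


if __name__ == "__main__":
    # "ragulate" as the other project's distribution name is an assumption.
    print(summarize(installed_version("ragulate-bio"), installed_version("ragulate")))
```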

Option 3 (Coming soon):
For users who prefer Conda or Mamba for environment management, you can install RAGulate along with extra dependencies:

Conda:

conda install -c zandigohar RAGulate

Mamba:

mamba create -n RAGulate -c zandigohar RAGulate

FAQ

Q1: Do I need a GPU to run RAGulate?
No, a GPU is not required. However, using a CUDA-enabled GPU is strongly recommended for faster runs, especially with large queries.

Q2: How do I know if I can use a GPU with RAGulate?
There are two quick checks:

  1. System check
    In your terminal, run nvidia-smi. If you see your GPU listed (model, memory, driver version), your machine has an NVIDIA GPU with the driver installed.

  2. Python check
    In a Python shell, run:

    import torch
    print(torch.cuda.is_available())  # True means PyTorch can see your GPU
    print(torch.cuda.device_count())  # How many GPUs are usable
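The Python check above can be folded into a small helper that falls back to the CPU when CUDA is unavailable. This sketch covers only the PyTorch side; how RAGulate itself accepts a device setting is not described here, and `pick_device` is a hypothetical name.

```python
def pick_device():
    """Return 'cuda' when PyTorch can see a GPU, otherwise 'cpu'."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed, so nothing can run on a GPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"Selected device: {pick_device()}")
```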
    

Q3: Can I use RAGulate with R-based tools?
RAGulate is written in Python and works directly with NumPy objects.

Q4: What if I also have another package called ragulate installed?
RAGulate will warn you if it detects a conflicting installation. We recommend using a clean virtual environment to avoid import clashes.

Q5: How do I cite RAGulate?
See the Citation section below for the latest reference and preprint link.

Q6: How can I reproduce the paper’s results?
See our Reproducibility Guide for step-by-step instructions. Then run RAGulate.

Citation

⚠️ Preprint coming soon (bioRxiv, 2026)
This repository is under active development. Please cite as:
Zandigohar M, Rehman J, Dai Y. RAGulate: Retrieval-Augmented Generation for Post-hoc Literature-Grounded Regulatory Assessment. 2026.

Development & Contact

RAGulate was developed and is actively maintained by Mehrdad Zandigohar as part of his PhD research at the University of Illinois Chicago (UIC), in the lab of Dr. Yang Dai.

📬 For private questions, please email: mzandi2@uic.edu

🤝 For collaboration inquiries, please contact PI: Dr. Yang Dai (yangdai@uic.edu)

Contributions, feature suggestions, and feedback are always welcome!

License

The code in RAGulate is licensed under the MIT License, which permits academic and commercial use, modification, and distribution.

Please note that any third-party dependencies bundled with RAGulate may have their own respective licenses.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ragulate_bio-0.1.2.tar.gz (422.8 kB view details)

Uploaded Source

Built Distribution

ragulate_bio-0.1.2-py3-none-any.whl (75.9 kB view details)

Uploaded Python 3

File details

Details for the file ragulate_bio-0.1.2.tar.gz.

File metadata

  • Download URL: ragulate_bio-0.1.2.tar.gz
  • Upload date:
  • Size: 422.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for ragulate_bio-0.1.2.tar.gz
Algorithm Hash digest
SHA256 ca6eb9d026cecc073a39dc0631dc617fe85d44cf629a3499d0e21f599cdd7902
MD5 1e0f647237470033bc20877ba6e4523a
BLAKE2b-256 65661b98ffd5c749b8a92b884de85232f18dcd76acb1c895b52a1c0f74d8901a

Provenance

The following attestation bundles were made for ragulate_bio-0.1.2.tar.gz:

Publisher: release-pypi.yml on YDaiLab/RAGulate

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file ragulate_bio-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: ragulate_bio-0.1.2-py3-none-any.whl
  • Upload date:
  • Size: 75.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for ragulate_bio-0.1.2-py3-none-any.whl
Algorithm Hash digest
SHA256 c698fa2f3fbaf6f06ef9994d3e742f6a2fc9786f08ea624e5f9e347a821e0825
MD5 4a0bac5e4cd32dc5af14333d4359ca97
BLAKE2b-256 9edee15115e62564d0de1f964a66462560e000d43c0c3a731b5d06352693a094
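A downloaded file can be checked against the SHA256 digests listed above with Python's `hashlib`. The sketch below is one way to do it; the function names are hypothetical, and the file is assumed to keep the name shown on this page.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the "File hashes" tables on this page.
PUBLISHED_SHA256 = {
    "ragulate_bio-0.1.2.tar.gz":
        "ca6eb9d026cecc073a39dc0631dc617fe85d44cf629a3499d0e21f599cdd7902",
    "ragulate_bio-0.1.2-py3-none-any.whl":
        "c698fa2f3fbaf6f06ef9994d3e742f6a2fc9786f08ea624e5f9e347a821e0825",
}


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path):
    """Return True if the file's digest equals the published one for its name."""
    expected = PUBLISHED_SHA256.get(Path(path).name)
    if expected is None:
        raise KeyError(f"no published digest for {Path(path).name}")
    return sha256_of(path) == expected
```

For example, `matches_published("ragulate_bio-0.1.2.tar.gz")` returns `True` only when the local copy is byte-for-byte identical to the uploaded source distribution.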

Provenance

The following attestation bundles were made for ragulate_bio-0.1.2-py3-none-any.whl:

Publisher: release-pypi.yml on YDaiLab/RAGulate

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
