A command-line tool for BRAD enrichment analysis

Gene Enrichment with BRAD

BRAD Enrichment is a command-line tool that generates a gene enrichment analysis report from a gene set and a targeted literature database. The tool uses a BRAD Agent to identify the contextual significance of enrichment terms and databases based on a custom literature database. This repository contains two command-line tools, brad-builddb and brad-enrichment, for building literature databases and generating reports, respectively.

Quickstart

Run the following commands to install this tool:

pip install brad-enrichment
brad-builddb --help
brad-enrichment --help

See below for detailed installation instructions.

Installation

To install BRAD Enrichment, activate your conda environment from the BRAD repository and install the following two packages:

conda activate BRAD
pip install BRAD-Agent
pip install brad-enrichment

Usage

Gene Enrichment Analysis

To perform gene enrichment analysis, use the following command:

brad-enrichment <gene_string> [OPTIONS]

Arguments

  • <gene_string>: A string containing gene names separated by spaces or commas.

Options

  • --databases, -d: List of databases to use for enrichment (default: KEGG, GO, PanglaoDB).
  • --threshold-p-value, -p: P-value threshold for enrichment results (default: 0.05).
  • --minimum-enrichment-terms, -min: Minimum number of enrichment terms to report (default: 3).
  • --maximum-enrichment-terms, -max: Maximum number of enrichment terms to report (default: 10).
  • --literature-database, -l: Path to the literature database (default: ../databases/enrichment_database).
  • --output, -o: Output file name for results (default: enrichment_results.xlsx).
  • --query, -q: Custom query for enrichment analysis (default: standard query).
  • --verbose, -v: Enable verbose mode for debugging and logging.
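The option list does not say how the p-value threshold interacts with the minimum and maximum term counts. One plausible reading, sketched below, is to rank terms by p-value, keep those at or below the threshold, pad up to the minimum, and cap at the maximum. This is an assumption for illustration only: select_terms and the (name, p_value) tuple layout are hypothetical, and the tool may apply these limits differently.

```python
def select_terms(terms, p_threshold=0.05, min_terms=3, max_terms=10):
    """Pick enrichment terms to report (assumed logic, not the tool's own).

    terms: list of (name, p_value) tuples.
    """
    # Most significant terms first.
    ranked = sorted(terms, key=lambda t: t[1])
    # Count terms passing the threshold (inclusive comparison is an assumption).
    n_significant = sum(1 for _, p in ranked if p <= p_threshold)
    # Report at least min_terms and at most max_terms.
    n_keep = min(max(n_significant, min_terms), max_terms)
    return ranked[:n_keep]


if __name__ == "__main__":
    example = [("term_a", 0.001), ("term_b", 0.2), ("term_c", 0.03), ("term_d", 0.5)]
    for name, p in select_terms(example):
        print(name, p)
```

With the defaults, only two of the four example terms pass the threshold, so the minimum of three pads the report with the next-best term.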

Example Usage

brad-enrichment "TP53, MYC, EGFR" -d KEGG_2021_Human -p 0.01 -o my_results.xlsx

Building an Enrichment Literature Database

To build an enrichment literature database, use the following command:

brad-builddb [OPTIONS]

Options

  • --documents-directory, -d: Path to the directory containing document files (default: documents).
  • --database-directory, -D: Path to the directory where the database should be stored (default: databases).
  • --database-name, -n: Name of the database to be created (default: enrichment_database).
  • --text-size, -s: Size of text chunks for processing (default: 700).
  • --text-overlap, -o: Number of overlapping characters between chunks (default: 100).
  • --verbose, -v: Enable verbose output.
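To make --text-size and --text-overlap concrete, here is a minimal sketch of fixed-size chunking with overlap, assuming both values are measured in characters. It is not the splitter brad-builddb uses internally, which is not documented here; chunk_text is a hypothetical helper.

```python
def chunk_text(text: str, size: int = 700, overlap: int = 100) -> list[str]:
    """Split text into chunks of `size` characters sharing `overlap` characters."""
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        # Stop once the current window has reached the end of the text.
        if start + size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    sample = "".join(chr(97 + i % 26) for i in range(1500))
    pieces = chunk_text(sample)  # defaults mirror --text-size 700 --text-overlap 100
    print([len(p) for p in pieces])
```

In this scheme, the last `overlap` characters of each chunk repeat at the start of the next one, so a sentence cut at a chunk boundary still appears whole in at least one chunk when it is shorter than the overlap.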

Example Usage

brad-builddb -d /path/to/documents -D /path/to/databases -n my_database -s 500 -o 50 -v

Development Setup

If you wish to contribute or modify the tool, follow these steps:

git clone https://github.com/Jpickard1/BRAD-Enrichment.git
cd BRAD-Enrichment
pip install -e .

Citation

If you use this tool in your research, please cite our paper, Language Model Powered Digital Biology with BRAD:

@article{pickard2024language,
  title={Language Model Powered Digital Biology with BRAD},
  author={Pickard, Joshua and Prakash, Ram and Choi, Marc Andrew and Oliven, Natalie and
          Stansbury, Cooper and Cwycyshyn, Jillian
          and Gorodetsky, Alex and Velasquez, Alvaro and Rajapakse, Indika},
  journal={arXiv preprint arXiv:2409.02864},
  url={https://arxiv.org/abs/2409.02864},
  year={2024}
}
