Create summaries of groups of genes.

Project description

👓 GeneGist

Deciphering how systems of genes interact and function is difficult: genes take part in a vast array of interactions, are subject to layered regulatory mechanisms, and play multiple roles in biological processes. GeneGist addresses this problem by generating detailed summaries of gene behaviors and interactions, and of their roles in biological pathways and systems.

GeneGist first scrapes academic articles, then analyzes them with a Large Language Model (LLM), selectable with the --llm option. From this distilled knowledge it produces biological process summaries.

GeneGist can also create Gene References Into Function (GeneRIFs) directly from scientific literature. GeneRIFs are concise, sentence-like annotations, typically written by a human, that describe the function of a gene. GeneGist can instead generate them with LLM-based generative AI.

License

Apache License

Installation

To install GeneGist, ensure you have Python 3.10 or higher. It can be installed via pip:

pip install genegist
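
Before installing, it may help to confirm that the interpreter meets the stated minimum. The following is a minimal sketch; the helper name `meets_minimum` is illustrative and not part of GeneGist.

```python
import sys

# Minimum version stated in the installation notes above.
MIN_VERSION = (3, 10)


def meets_minimum(version=None, minimum=MIN_VERSION):
    """Return True if `version` is at least `minimum`.

    `version` defaults to the running interpreter. Only the major and
    minor components are compared.
    """
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= tuple(minimum)


if __name__ == "__main__":
    if meets_minimum():
        print("Python version is sufficient for GeneGist.")
    else:
        print(f"Python {MIN_VERSION[0]}.{MIN_VERSION[1]} or higher is required.")
```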

Development

Installing Poetry

Poetry is required to handle dependencies and package management. To install Poetry, run:

curl -sSL https://install.python-poetry.org | python3 -

Setting Up genegist

  1. Clone the repository:

    git clone [repository URL]
    cd genegist
    
  2. Install the dependencies using Poetry:

    poetry install
    

Usage

To use genegist, run the following command:

poetry run genegist [options]

Options

  • -g GENE, --gene GENE: Look up GeneRIFs for a given gene.

  • -s GENESET, --geneset GENESET: Look up GeneRIFs for a given gene set.

  • -f GENESET_FILE, --geneset-file GENESET_FILE: Look up GeneRIFs for a file containing a list of genes.

  • -p PROCESS, --process PROCESS: Find a biological process for the input gene set.

  • -d CREATE_DRY_RUN, --create-dry-run CREATE_DRY_RUN: Don't actually run the biological process finder, but save the gene summaries to a file.

  • -a, --abstracts: Also look up abstracts.

  • -r LOAD_DRY_RUN, --load-dry-run LOAD_DRY_RUN: Load the gene summaries from a file instead of running the LLM on them explicitly.

  • --llm {gpt-3.5-turbo-1106,gpt-4-1106-preview}: Specify the LLM to use.

  • -m ARTICLE, --article ARTICLE: Get the summary for a given PMID.

  • -t, --tasks: Run a given custom task. Currently only E3 ligase analysis is supported.

  • -y, --synthetic-generifs: Create synthetic GeneRIFs and save them to a tab-delimited file.

  • -i, --build-index: Build an embedding index for all the GeneRIFs.
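
The options above can also be assembled programmatically, for example when scripting several runs. The sketch below only builds the argument list from flags documented in this section; the function name `build_command`, the example gene value, and the assumption that these flags can be combined in one call are illustrative and not confirmed by the GeneGist documentation.

```python
import shlex

# LLM choices as listed for the --llm option above.
SUPPORTED_LLMS = ("gpt-3.5-turbo-1106", "gpt-4-1106-preview")


def build_command(gene=None, geneset_file=None, article=None,
                  abstracts=False, llm=None):
    """Assemble a `poetry run genegist` argument list.

    Hypothetical helper: it uses only the long option names documented
    above and does not check which combinations the CLI accepts.
    """
    cmd = ["poetry", "run", "genegist"]
    if gene is not None:
        cmd += ["--gene", str(gene)]
    if geneset_file is not None:
        cmd += ["--geneset-file", str(geneset_file)]
    if article is not None:
        cmd += ["--article", str(article)]
    if abstracts:
        cmd.append("--abstracts")
    if llm is not None:
        if llm not in SUPPORTED_LLMS:
            raise ValueError(f"unsupported LLM: {llm!r}")
        cmd += ["--llm", llm]
    return cmd


if __name__ == "__main__":
    # "TP53" is an illustrative value; the expected gene format is an assumption.
    command = build_command(gene="TP53", abstracts=True, llm="gpt-4-1106-preview")
    print(shlex.join(command))
    # To execute it, pass the list to subprocess.run(command, check=True).
```

Building the command as a list rather than a single string avoids shell quoting problems when a file path contains spaces.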

Development

Running Tests

To run tests, use:

poetry run pytest

Download files

Source Distribution

genegist-0.1.5.tar.gz (14.8 kB)

Uploaded Source

Built Distribution

genegist-0.1.5-py3-none-any.whl (15.6 kB)

Uploaded Python 3

File details

Details for the file genegist-0.1.5.tar.gz.

File metadata

  • Download URL: genegist-0.1.5.tar.gz
  • Upload date:
  • Size: 14.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.7.1 CPython/3.12.1 Darwin/23.2.0

File hashes

Hashes for genegist-0.1.5.tar.gz
Algorithm Hash digest
SHA256 d6e75b4d2add7e12e02637aaf01dc50f0d290c301e603517c8ae94b359c1f814
MD5 b5bfc90d0a0f60d4b4fc73bfb3d95411
BLAKE2b-256 ca57c77a00f6cf2c3375a81da273ba87a05a1c6da900a4cef5e42721845b8f42
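
A downloaded file can be checked against the SHA256 digest listed above. The following is a minimal sketch using Python's standard `hashlib`; the function names and the local file path are assumptions for illustration.

```python
import hashlib
import os

# SHA256 digest published above for genegist-0.1.5.tar.gz.
EXPECTED_SDIST_SHA256 = (
    "d6e75b4d2add7e12e02637aaf01dc50f0d290c301e603517c8ae94b359c1f814"
)


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Assumed location of the downloaded source distribution.
    local_path = "genegist-0.1.5.tar.gz"
    if os.path.exists(local_path):
        ok = matches_published_hash(local_path, EXPECTED_SDIST_SHA256)
        print("hash matches" if ok else "hash mismatch")
    else:
        print(f"{local_path} not found; download it first.")
```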

File details

Details for the file genegist-0.1.5-py3-none-any.whl.

File metadata

  • Download URL: genegist-0.1.5-py3-none-any.whl
  • Upload date:
  • Size: 15.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.7.1 CPython/3.12.1 Darwin/23.2.0

File hashes

Hashes for genegist-0.1.5-py3-none-any.whl
Algorithm Hash digest
SHA256 8f119c2febcb72997af04bc02fae940f9d681fb974576cf9b833cb4f27797e34
MD5 fcf3a648ea989dad3e3908b2908f43a5
BLAKE2b-256 6ecbaac9a2f630845f194ce30c88b9bb50e2ec92e611f3106388f60e54d5474a
