GPTBioInsightor utilizes the powerful capabilities of large language models to help people quickly gain knowledge and insight, enhancing their work efficiency.

GPTBioInsightor

GPTBioInsightor is a tool for single-cell data analysis. It is particularly useful for newcomers to a biological field and for researchers in interdisciplinary areas who may lack sufficient biological background knowledge. It uses large language models to help users quickly gain knowledge and insight from their data, improving their work efficiency.

Documentation

Please check out the documentation at:

Supported LLM providers

  • openai
  • anthropic
  • openrouter
  • groq
  • aliyun
  • zhipuai
  • siliconflow
  • deepseek
  • perplexity

Get started

Installation

Install GPTBioInsightor from PyPI:

pip install gptbioinsightor

Usage

import os

import gptbioinsightor as gbi

# Set the API key for your LLM provider
os.environ['API_KEY'] = "sk-***"
# or, for Anthropic, set the provider-specific key
os.environ['ANTHROPIC_API_KEY'] = "sk-***"

# Describe the biological background of your data
background = "Human Healthy Donor PBMCs"

# `adata` is your AnnData object. Make sure you have already performed DEG analysis on it, e.g.:
# sc.tl.rank_genes_groups(adata, "leiden", method="wilcoxon")
# This example uses Anthropic's claude-3-5-sonnet-20241022, but any other supported LLM provider also works.
res = gbi.get_celltype(adata, background=background,
                       out="gbi.claude.celltype.md",
                       topnumber=15, provider="anthropic",
                       n_jobs=4, model="claude-3-5-sonnet-20241022")
# {'0': 'CD4+ T Helper Cells',
#  '1': 'B Cells',
#  '2': 'Monocytes/Macrophages',
#  '3': 'Natural Killer (NK) cells',
#  '4': 'Cytotoxic T Cells (CD8+)',
#  '5': 'Monocytes/Macrophages',
#  '6': 'Dendritic Cells',
#  '7': 'Platelets'}
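
The returned dictionary maps cluster labels to predicted cell types, so a natural next step is to attach those predictions to each cell. The following is a minimal sketch of that mapping in plain Python; `annotate_clusters`, the `unknown` fallback, and the sample `leiden` labels are hypothetical names introduced here for illustration and are not part of GPTBioInsightor.

```python
def annotate_clusters(cluster_labels, celltype_map, unknown="Unknown"):
    """Map each per-cell cluster label to its predicted cell type.

    Labels missing from the mapping fall back to `unknown`.
    """
    return [celltype_map.get(str(label), unknown) for label in cluster_labels]


# Abbreviated copy of the dictionary shown above.
res = {"0": "CD4+ T Helper Cells", "1": "B Cells", "7": "Platelets"}

# Hypothetical per-cell cluster labels, standing in for adata.obs["leiden"].
leiden = ["0", "1", "0", "7", "9"]

celltypes = annotate_clusters(leiden, res)
print(celltypes)

# With a real AnnData object, the pandas equivalent would be:
# adata.obs["celltype"] = adata.obs["leiden"].map(res)
```

Because the predictions come from an LLM, it is worth reviewing the mapping against the report before using the labels downstream.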

The call also writes a Markdown report to the file given by out (here gbi.claude.celltype.md), like:

# CellType Analysis
## cluster geneset 0

### Gene List
\```
LDHB, LTB, RGCC, IL32, NOSIP, CD3D, CD3E, TMEM123, VIM, TMEM66, FYB, JUNB, CCR7, CD27, MYL12A
\```

### Celltype Prediction
#### Optimal Celltype: T Cells (likely CD4+ T cells)
**Key Markers**:
- Cell-specific: CD3D, CD3E, CCR7, CD27, FYB
- Context-specific: LTB, JUNB, VIM

**Evidence and Reasoning**
- **PRIMARY EVIDENCE**: The presence of CD3D and CD3E, which are essential components of the T cell receptor complex, strongly indicates a T cell population. These markers are highly specific to T cells.
- **SECONDARY EVIDENCE**: CCR7 is a chemokine receptor that is typically expressed on naïve and central memory T cells, suggesting that these cells may be in a non-activated or memory state.
- **ADDITIONAL EVIDENCE**: CD27 and FYB are also known to be expressed in T cells, particularly in activated T cells. LTB (lymphotoxin beta) and JUNB (a transcription factor) are involved in T cell activation and function.

**Validation**: Other gold standard markers for T cells (not in Geneset 0) include CD4, CD8, and TCRα/β. For CD4+ T cells, additional markers like CD45RA and CD45RO can be used to distinguish between naïve and memory T cells.

#### Alternative Considerations
- **Alternative celltype1: NK cells**
    - **WHY Alternative? Key MARKERS, Evidence and Reasoning**: NK cells can express some of the markers found in this geneset, such as CCR7 and CD27, but the presence of CD3D and CD3E, which are not expressed in NK cells, makes this less likely.
    - **OTHER Gold Standard MARKERS(NOT IN Geneset 0) TO VALIDATE THE Alternative celltype1**: NK cells would typically express NKp46, KIRs, and NKG2D, which are not present in this geneset.

- **Alternative celltype2: B cells**
    - **WHY Alternative? Key MARKERS, Evidence and Reasoning**: B cells do not typically express CD3D and CD3E, which are strong T cell markers. However, some B cell markers like CD27 and CCR7 are present, which might lead to confusion. The absence of B cell-specific markers like CD19 and CD20 makes this alternative less likely.
    - **OTHER Gold Standard MARKERS(NOT IN Geneset 0) TO VALIDATE THE Alternative celltype2**: B cells would typically express CD19, CD20, and surface IgM, which are not present in this geneset.

### Novel Insights
- **NOTEWORTHY PATTERNS**: The co-expression of CCR7 and CD27 suggests a population of T cells that are either naïve or central memory T cells.
- **CELL STATE**: The expression of JUNB and LTB, along with other activation-related genes, suggests that these T cells may be in an activated or recently activated state.
- **POTENTIAL NEW FINDINGS**: The presence of VIM (vimentin), a marker often associated with mesenchymal cells, in T cells is intriguing and could indicate a unique subset of T cells or a state of T cells that have undergone some form of stress or activation leading to the upregulation of vimentin.
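
If you want to post-process the report, the headings shown above are regular enough to parse. The sketch below extracts the optimal cell type per cluster, assuming the report always uses the `## cluster geneset N` and `#### Optimal Celltype: ...` headings seen in this example; `parse_optimal_celltypes` is a hypothetical helper, not a GPTBioInsightor function, and the real report layout may differ between versions or models.

```python
import re


def parse_optimal_celltypes(report_text):
    """Return {cluster_id: optimal cell type} from a report in the format shown above."""
    cluster_re = re.compile(r"^##\s+cluster geneset\s+(\S+)\s*$")
    optimal_re = re.compile(r"^####\s+Optimal Celltype:\s*(.+?)\s*$")
    result = {}
    current = None  # cluster id of the section currently being read
    for line in report_text.splitlines():
        m = cluster_re.match(line)
        if m:
            current = m.group(1)
            continue
        m = optimal_re.match(line)
        if m and current is not None:
            result[current] = m.group(1)
    return result


# Abbreviated excerpt of the example report.
sample = """# CellType Analysis
## cluster geneset 0

### Gene List

### Celltype Prediction
#### Optimal Celltype: T Cells (likely CD4+ T cells)
"""

parsed = parse_optimal_celltypes(sample)
print(parsed)

# To parse the file written by the usage example:
# with open("gbi.claude.celltype.md", encoding="utf-8") as fh:
#     parsed = parse_optimal_celltypes(fh.read())
```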

Contact

Download files

Source Distribution

gptbioinsightor-0.7.6.tar.gz (6.4 MB)

Uploaded Source

Built Distribution

gptbioinsightor-0.7.6-py3-none-any.whl (40.8 kB)

Uploaded Python 3

File details

Details for the file gptbioinsightor-0.7.6.tar.gz.

File metadata

  • Download URL: gptbioinsightor-0.7.6.tar.gz
  • Upload date:
  • Size: 6.4 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for gptbioinsightor-0.7.6.tar.gz
Algorithm Hash digest
SHA256 9a3e5bced30883c50514515c2d9867d7efb9635dc1fc99d796d07e3edf787b26
MD5 a3b770aaeebd0ef232647cf097ec3386
BLAKE2b-256 973834e1f35327d7d4540d95fab8c35371ed42692e06ba299c164bff1f22feab

See more details on using hashes here.
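
A downloaded file can be checked against the SHA256 digest listed above with Python's standard `hashlib` module. This is a minimal sketch; `sha256_of_file` and `matches_sha256` are hypothetical helper names, and the local file path in the usage comment is an assumption about where the download was saved.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path, expected_hex):
    """Return True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Usage, assuming the sdist was saved to the current directory:
# expected = "9a3e5bced30883c50514515c2d9867d7efb9635dc1fc99d796d07e3edf787b26"
# print(matches_sha256("gptbioinsightor-0.7.6.tar.gz", expected))
```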

Provenance

The following attestation bundles were made for gptbioinsightor-0.7.6.tar.gz:

Publisher: publish.yml on huang-sh/GPTBioInsightor

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file gptbioinsightor-0.7.6-py3-none-any.whl.

File hashes

Hashes for gptbioinsightor-0.7.6-py3-none-any.whl
Algorithm Hash digest
SHA256 c17d6c55a73006c929a560cadf279c28f099645c900ade90d0ad85b175630b55
MD5 a9ba14171b5aa41775a25c67e872729c
BLAKE2b-256 498f5b5b4a90b0fbb453902b1a22c04f0348fc88e3cc1e5b95bcea0d51912812

Provenance

The following attestation bundles were made for gptbioinsightor-0.7.6-py3-none-any.whl:

Publisher: publish.yml on huang-sh/GPTBioInsightor
