Efficient querying of genomic databases.

gget

gget is a free, open-source command-line tool and Python package that enables efficient querying of genomic databases. gget consists of a collection of separate but interoperable modules, each designed to facilitate one type of database querying in a single line of code. gget was developed by Laura Luebbert in the Pachter Lab.

gget is part of the scverse® project and is fiscally sponsored by NumFOCUS. If you like gget and want to support our mission, please consider making a tax-deductible donation.

If you use gget in a publication, please cite:

Luebbert, L., & Pachter, L. (2023). Efficient querying of genomic reference databases with gget. Bioinformatics. https://doi.org/10.1093/bioinformatics/btac836

Read the article here: https://doi.org/10.1093/bioinformatics/btac836

Installation

uv pip install gget

Alternatively, with pip:

pip install --upgrade gget

Install from source:

git clone https://github.com/pachterlab/gget.git
cd gget
uv pip install .

For use in Jupyter Lab / Google Colab:

# Python
import gget
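
To confirm which gget version an environment actually picked up after installation, the standard library's importlib.metadata can be queried without importing the package itself. The following is a minimal sketch; `installed_version` is a hypothetical helper written for illustration and is not part of gget.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


found = installed_version("gget")
if found is None:
    print("gget is not installed in this environment")
else:
    print(f"gget version: {found}")
```

Because the lookup reads package metadata only, it also works in environments where importing gget would be slow or would fail on an optional dependency.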

🔗 Manual

🪄 Quick start guide

Command line:

# Fetch all Homo sapiens reference and annotation FTPs from the latest Ensembl release
$ gget ref homo_sapiens

# Get Ensembl IDs of human genes with "ace2" or "angiotensin converting enzyme 2" in their name/description
$ gget search -s homo_sapiens 'ace2' 'angiotensin converting enzyme 2'

# Look up gene ENSG00000130234 (ACE2) and its transcript ENST00000252519
$ gget info ENSG00000130234 ENST00000252519

# Fetch the amino acid sequence of the canonical transcript of gene ENSG00000130234
$ gget seq --translate ENSG00000130234

# Quickly find the genomic location of (the start of) that amino acid sequence
$ gget blat MSSSSWLLLSLVAVTAAQSTIEEQAKTFLDKFNHEAEDLFYQSSLAS

# BLAST (the start of) that amino acid sequence
$ gget blast MSSSSWLLLSLVAVTAAQSTIEEQAKTFLDKFNHEAEDLFYQSSLAS

# Align multiple nucleotide or amino acid sequences against each other (also accepts path to FASTA file)
$ gget muscle MSSSSWLLLSLVAVTAAQSTIEEQAKTFLDKFNHEAEDLFYQSSLAS MSSSSWLLLSLVEVTAAQSTIEQQAKTFLDKFHEAEDLFYQSLLAS

# Align one or more amino acid sequences against a reference containing one or more sequences (local BLAST; also accepts paths to FASTA files)
$ gget diamond MSSSSWLLLSLVAVTAAQSTIEEQAKTFLDKFNHEAEDLFYQSSLAS -ref MSSSSWLLLSLVEVTAAQSTIEQQAKTFLDKFHEAEDLFYQSLLAS

# Use Enrichr for an ontology analysis of a list of genes
$ gget enrichr -db ontology ACE2 AGT AGTR1 ACE AGTRAP AGTR2 ACE3P

# Get the human tissue expression of gene ACE2
$ gget archs4 -w tissue ACE2

# Get the protein structure (in PDB format) of ACE2 as stored in the Protein Data Bank (PDB ID returned by gget info)
$ gget pdb 1R42 -o 1R42.pdb

# Download virus genome datasets from NCBI Virus (e.g., Zika virus sequences)
$ gget virus "Zika virus" --host "Homo sapiens" --nuc_completeness complete

# Find Eukaryotic Linear Motifs (ELMs) in a protein sequence
$ gget setup elm # setup only needs to be run once
$ gget elm -o results MSSSSWLLLSLVAVTAAQSTIEEQAKTFLDKFNHEAEDLFYQSSLAS

# Fetch a scRNAseq count matrix (AnnData format) based on specified gene(s), tissue(s), and cell type(s) (default species: human)
$ gget setup cellxgene # setup only needs to be run once
$ gget cellxgene --gene ACE2 SLC5A1 --tissue lung --cell_type 'mucus secreting cell' -o example_adata.h5ad

# Predict the protein structure of GFP from its amino acid sequence
$ gget setup alphafold # setup only needs to be run once
$ gget alphafold MSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTTGKLPVPWPTLVTTFSYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNYNSHNVYIMADKQKNGIKVNFKIRHNIEDGSVQLADHYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGITHGMDELYK
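
The muscle and diamond commands above are described as also accepting paths to FASTA files. As a rough sketch of how such a file could be prepared, the snippet below writes the two example sequences to disk using only the Python standard library. The helper `write_fasta`, the record names `seq1` and `seq2`, and the file name `example_seqs.fa` are assumptions made for illustration; none of them come from gget.

```python
from pathlib import Path


def write_fasta(records: dict, path: str, width: int = 60) -> Path:
    """Write {name: sequence} pairs to a FASTA file, wrapping lines at `width`."""
    out = Path(path)
    with out.open("w") as handle:
        for name, seq in records.items():
            handle.write(f">{name}\n")
            # Wrap long sequences so each line holds at most `width` residues.
            for start in range(0, len(seq), width):
                handle.write(seq[start:start + width] + "\n")
    return out


# The two amino acid sequences used in the quick start examples.
records = {
    "seq1": "MSSSSWLLLSLVAVTAAQSTIEEQAKTFLDKFNHEAEDLFYQSSLAS",
    "seq2": "MSSSSWLLLSLVEVTAAQSTIEQQAKTFLDKFHEAEDLFYQSLLAS",
}

fasta_path = write_fasta(records, "example_seqs.fa")
print(fasta_path.read_text())
```

The resulting path could then be passed in place of the literal sequences, for example `gget muscle example_seqs.fa`.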

Python (Jupyter Lab / Google Colab):

import gget
gget.ref("homo_sapiens")
gget.search(["ace2", "angiotensin converting enzyme 2"], "homo_sapiens")
gget.info(["ENSG00000130234", "ENST00000252519"])
gget.seq("ENSG00000130234", translate=True)
gget.blat("MSSSSWLLLSLVAVTAAQSTIEEQAKTFLDKFNHEAEDLFYQSSLAS")
gget.blast("MSSSSWLLLSLVAVTAAQSTIEEQAKTFLDKFNHEAEDLFYQSSLAS")
gget.muscle(["MSSSSWLLLSLVAVTAAQSTIEEQAKTFLDKFNHEAEDLFYQSSLAS", "MSSSSWLLLSLVEVTAAQSTIEQQAKTFLDKFHEAEDLFYQSLLAS"])
gget.diamond("MSSSSWLLLSLVAVTAAQSTIEEQAKTFLDKFNHEAEDLFYQSSLAS", reference="MSSSSWLLLSLVEVTAAQSTIEQQAKTFLDKFHEAEDLFYQSLLAS")
gget.enrichr(["ACE2", "AGT", "AGTR1", "ACE", "AGTRAP", "AGTR2", "ACE3P"], database="ontology", plot=True)
gget.archs4("ACE2", which="tissue")
gget.pdb("1R42", save=True)
gget.virus("Zika virus", host="Homo sapiens", nuc_completeness="complete")

gget.setup("elm") # setup only needs to be run once
ortho_df, regex_df = gget.elm("MSSSSWLLLSLVAVTAAQSTIEEQAKTFLDKFNHEAEDLFYQSSLAS")

gget.setup("cellxgene") # setup only needs to be run once
gget.cellxgene(gene=["ACE2", "SLC5A1"], tissue="lung", cell_type="mucus secreting cell")

gget.setup("alphafold") # setup only needs to be run once
gget.alphafold("MSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTTGKLPVPWPTLVTTFSYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNYNSHNVYIMADKQKNGIKVNFKIRHNIEDGSVQLADHYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGITHGMDELYK")
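
The two sequences passed to muscle and diamond above differ in length by one residue, so comparing them position by position would misreport every residue after the missing one. The sketch below uses the standard library's difflib to list the differing stretches. It is only a rough stand-in to show why an alignment step is useful: it is not the MUSCLE or DIAMOND algorithm, and it does not call gget.

```python
from difflib import SequenceMatcher

# The two example amino acid sequences from the quick start guide.
seq_a = "MSSSSWLLLSLVAVTAAQSTIEEQAKTFLDKFNHEAEDLFYQSSLAS"
seq_b = "MSSSSWLLLSLVEVTAAQSTIEQQAKTFLDKFHEAEDLFYQSLLAS"

print(f"lengths: {len(seq_a)} vs {len(seq_b)}")

# autojunk=False disables difflib's heuristic that ignores very frequent characters.
matcher = SequenceMatcher(None, seq_a, seq_b, autojunk=False)

# Keep only the stretches where the sequences differ (replace / delete / insert).
differences = [
    (tag, seq_a[i1:i2], seq_b[j1:j2])
    for tag, i1, i2, j1, j2 in matcher.get_opcodes()
    if tag != "equal"
]

for tag, left, right in differences:
    print(f"{tag:8s} {left!r} -> {right!r}")
```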

Call gget from R using reticulate:

system("pip install gget")
install.packages("reticulate")
library(reticulate)
gget <- import("gget")

gget$ref("homo_sapiens")
gget$search(list("ace2", "angiotensin converting enzyme 2"), "homo_sapiens")
gget$info(list("ENSG00000130234", "ENST00000252519"))
gget$seq("ENSG00000130234", translate=TRUE)
gget$blat("MSSSSWLLLSLVAVTAAQSTIEEQAKTFLDKFNHEAEDLFYQSSLAS")
gget$blast("MSSSSWLLLSLVAVTAAQSTIEEQAKTFLDKFNHEAEDLFYQSSLAS")
gget$muscle(list("MSSSSWLLLSLVAVTAAQSTIEEQAKTFLDKFNHEAEDLFYQSSLAS", "MSSSSWLLLSLVEVTAAQSTIEQQAKTFLDKFHEAEDLFYQSLLAS"), out="out.afa")
gget$diamond("MSSSSWLLLSLVAVTAAQSTIEEQAKTFLDKFNHEAEDLFYQSSLAS", reference="MSSSSWLLLSLVEVTAAQSTIEQQAKTFLDKFHEAEDLFYQSLLAS")
gget$enrichr(list("ACE2", "AGT", "AGTR1", "ACE", "AGTRAP", "AGTR2", "ACE3P"), database="ontology")
gget$archs4("ACE2", which="tissue")
gget$pdb("1R42", save=TRUE)
gget$virus("Zika virus", host="Homo sapiens", nuc_completeness="complete")

Release history

Version	Release date	Notes
0.30.7	Jun 22, 2026	This version
0.30.6	Jun 11, 2026
0.30.5	May 24, 2026
0.30.3	Feb 27, 2026
0.30.2	Feb 8, 2026
0.30.0	Jan 20, 2026
0.29.3	Sep 11, 2025
0.29.2	Jul 3, 2025
0.29.1	Apr 21, 2025
0.29.0	Sep 26, 2024
0.28.6	Jun 3, 2024
0.28.5	May 30, 2024	Yanked: Bug in gget.setup("alphafold") + inversion mutations in gget mutate did not take the complement
0.28.4	Feb 1, 2024
0.28.3	Jan 22, 2024
0.28.2	Nov 16, 2023
0.28.0	Nov 12, 2023
0.27.9	Aug 7, 2023
0.27.8	Jul 12, 2023
0.27.7	May 16, 2023
0.27.6	May 2, 2023	Yanked: Requirement clashes
0.27.5	Apr 6, 2023
0.27.4	Mar 19, 2023
0.27.3	Mar 11, 2023
0.27.2	Jan 1, 2023
0.27.1	Dec 30, 2022
0.27.0	Dec 10, 2022
0.3.13	Nov 11, 2022
0.3.12	Nov 10, 2022
0.3.11	Sep 7, 2022
0.3.10	Sep 2, 2022
0.3.9	Aug 25, 2022
0.3.8	Aug 12, 2022
0.3.7	Aug 9, 2022
0.3.5	Aug 6, 2022
0.3.4	Aug 6, 2022	Yanked: Bug in gget alphafold reading .fa files
0.3.3	Aug 5, 2022	Yanked: Bug in gget alphafold reading .fa files
0.3.1	Aug 5, 2022	Yanked: Bug in gget alphafold relax flag
0.3.0	Aug 4, 2022	Yanked: Bug in gget alphafold relax flag
0.2.7	Jul 29, 2022
0.2.6	Jul 8, 2022
0.2.5	Jun 30, 2022
0.2.4	Jun 29, 2022
0.2.3	Jun 27, 2022
0.2.2	Jun 24, 2022
0.2.1	Jun 9, 2022
0.2.0	Jun 8, 2022
0.1.2	Jun 3, 2022
0.1.1	May 28, 2022
0.1.0	May 25, 2022
0.0.24	May 17, 2022
0.0.23	May 17, 2022	Yanked: Bug in terminal functionality
0.0.22	May 10, 2022
0.0.17	Mar 2, 2022
0.0.16	Mar 2, 2022
0.0.6	Feb 26, 2022
0.0.5	Feb 25, 2022
0.0.4	Feb 22, 2022
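
The history above moves from 0.3.13 to 0.27.0. This is an increase, not a step backwards, because version components are compared as integers rather than as decimal fractions. The sketch below shows that ordering with a hypothetical helper, `version_key`, that only handles plain dotted numeric versions like the ones listed here. For anything beyond that (pre-releases, post-releases, local versions), `packaging.version.Version` is the more robust choice.

```python
def version_key(version: str) -> tuple:
    """Split a dotted numeric version into a tuple of ints for comparison."""
    return tuple(int(part) for part in version.split("."))


# A few versions taken from the release history, deliberately out of order.
releases = ["0.3.13", "0.27.0", "0.30.7", "0.0.4", "0.28.6"]

ordered = sorted(releases, key=version_key)
print(ordered)

# Comparing as plain strings would get this pair wrong, since "0.3" > "0.2" as text.
print(version_key("0.27.0") > version_key("0.3.13"))
```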

Download files

File details

Details for the file gget-0.30.6.tar.gz.

File metadata

Download URL: gget-0.30.6.tar.gz
Upload date: Jun 11, 2026
Size: 79.3 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for gget-0.30.6.tar.gz
Algorithm	Hash digest
SHA256	`94d44fa131cd54ca88ee95a934e3ce2ca3db14fbcb7cdb1293906949656d239a`
MD5	`199c4dc92380e6ef7b47c8da6be07ff4`
BLAKE2b-256	`de4792871c1103dd88a4189efa90deb8290dd6180cbfb74d01c423930d3553ad`

File details

Details for the file gget-0.30.6-py3-none-any.whl.

File metadata

Download URL: gget-0.30.6-py3-none-any.whl
Upload date: Jun 11, 2026
Size: 79.6 MB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for gget-0.30.6-py3-none-any.whl
Algorithm	Hash digest
SHA256	`067542a294c0cf61d6aa71fe945561748b3f45520110d1c7f81ca7dcdab2425c`
MD5	`5401993235bb4430d9f2e1c5b40f99a8`
BLAKE2b-256	`9faaec6461f579a79248f4ca588ddf1faff3960a734c5b3475fd29f73c2af0cb`
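
A downloaded archive can be checked against the SHA256 digests listed above before it is installed. The following is a minimal sketch using only the standard library's hashlib; the helper names `sha256_of_file` and `matches_published_digest` are assumptions made for illustration, and the file path in the commented usage is whatever local path the file was saved under.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: str, expected_hex: str) -> bool:
    """Return True if the file's SHA256 digest equals the published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Example usage (assumes the wheel was downloaded to the current directory):
# expected = "067542a294c0cf61d6aa71fe945561748b3f45520110d1c7f81ca7dcdab2425c"
# print(matches_published_digest("gget-0.30.6-py3-none-any.whl", expected))
```

pip can also enforce digests at install time through its hash-checking mode (`--require-hashes` together with a requirements file that pins hashes), which avoids a separate manual step.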
