
Command-line tool for retrieving pangenomes using the PanGBank API.


PanGBank-cli

PanGBank-cli is a command-line interface to search, retrieve, and download pangenomes from PanGBank via the PanGBank REST API. It acts as a convenient wrapper around the API, making PanGBank data easily accessible directly from the terminal.

PanGBank is a large-scale resource that hosts collections of microbial pangenomes constructed from diverse genome sources using PPanGGOLiN.

With PanGBank-cli you can:

  • Search pangenomes by taxon, genome, or collection
  • Retrieve detailed metrics for selected pangenomes
  • Download pangenome files for downstream analyses
  • Map an input genome to its corresponding pangenome in PanGBank and fetch it automatically

For interactive exploration, you can also browse PanGBank collections through the PanGBank web application: https://pangbank.genoscope.cns.fr/

Installation

Option 1: Install using conda

# Create a new conda environment with Python and Mash
conda create -n pangbank-cli python=3.12 mash=2.3

# Activate the environment
conda activate pangbank-cli

# Clone the repository
git clone https://github.com/labgem/PanGBank-cli.git
cd PanGBank-cli

# Install PanGBank-cli
pip install .

Option 2: Install with pip

# Clone the repository
git clone https://github.com/labgem/PanGBank-cli.git
cd PanGBank-cli

# Create a virtual environment
python -m venv venv

# Activate the virtual environment
# On Linux/macOS:
source venv/bin/activate

# Install PanGBank-cli
pip install .

[!WARNING] Installing PanGBank-cli with this method will only set up the Python dependencies. The external tool Mash (required for the match-pangenome command) is not included and must be installed separately to enable full functionality.
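Since a pip-only installation does not provide Mash, it can help to confirm that the executable is reachable before running match-pangenome. The following is a minimal sketch; the `require_tool` helper is a hypothetical name introduced here for illustration and is not part of PanGBank-cli.

```shell
# Hypothetical helper: succeed only if the named executable is on PATH.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found: $(command -v "$1")"
    else
        echo "$1 not found on PATH; install it before running match-pangenome" >&2
        return 1
    fi
}

# Check for Mash; '|| true' keeps the check advisory rather than fatal.
require_tool mash || true
```

If the check fails, installing Mash through conda (as in Option 1) or from its own distribution should make the match-pangenome command usable.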

Usage

Once installed, you can access the CLI by running:

pangbank --help

This will display the list of available commands and options.

List available collections

pangbank list-collections

Displays all pangenome collections available in PanGBank, along with their description and the number of pangenomes they contain.

Search for pangenomes

pangbank search-pangenomes --taxon "g__Escherichia"

Searches PanGBank for pangenomes matching the given taxon. By default, results are saved to a TSV file named 'pangenomes_information.tsv' containing summary metrics for the matching pangenomes.
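To get a quick overview of the resulting TSV file, standard shell tools are enough. The sketch below lists the column names and counts the data rows. It assumes the file has a single header line, holds one pangenome per row, and sits in the current directory; the helper names `tsv_columns` and `tsv_row_count` are hypothetical.

```shell
# Hypothetical helper: print each column name of a TSV header with its 1-based position.
tsv_columns() {
    head -n 1 "$1" | tr '\t' '\n' | awk '{ printf "%d\t%s\n", NR, $0 }'
}

# Hypothetical helper: count data rows (every non-empty line after the header).
tsv_row_count() {
    awk 'NR > 1 && NF > 0 { n++ } END { print n + 0 }' "$1"
}

# Default output name taken from the text above; adjust the path if needed.
results="pangenomes_information.tsv"
if [ -f "$results" ]; then
    tsv_columns "$results"
    echo "data rows: $(tsv_row_count "$results")"
fi
```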

Download pangenomes

pangbank search-pangenomes --taxon "g__Chlamydia" \
    --collection GTDB_refseq \
    --outdir Chlamydia_pangenomes/ \
    --download

Searches for Chlamydia pangenomes in the GTDB_refseq collection, then downloads the corresponding pangenome files into Chlamydia_pangenomes/.
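The same search-and-download command can be repeated for several taxa with a small loop. This is only a sketch built from the options shown above (--taxon, --collection, --outdir, --download). The `download_taxa` function and the `PANGBANK_BIN` variable are hypothetical conveniences, not features of PanGBank-cli, and the output directory naming simply mirrors the example above.

```shell
# Hypothetical wrapper: run the documented search-and-download command once per taxon.
# Usage: download_taxa COLLECTION TAXON [TAXON...]
# PANGBANK_BIN is an assumed override for the executable name, handy for dry runs.
download_taxa() {
    collection="$1"
    shift
    for taxon in "$@"; do
        # Derive a directory name from the taxon, e.g. g__Chlamydia -> Chlamydia_pangenomes
        outdir="${taxon#*__}_pangenomes"
        "${PANGBANK_BIN:-pangbank}" search-pangenomes \
            --taxon "$taxon" \
            --collection "$collection" \
            --outdir "$outdir" \
            --download || echo "failed for $taxon" >&2
    done
}
```

Calling `download_taxa GTDB_refseq g__Chlamydia g__Escherichia` would then run one search per genus. Setting `PANGBANK_BIN=echo` beforehand prints the commands instead of executing them.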

Match a genome to an existing pangenome

pangbank match-pangenome --input-genome <genome.fasta> --collection GTDB_all

Matches the given input genome (FASTA format) to the most similar pangenome in the selected collection. The closest pangenome is identified by comparing the genome against a precomputed sketch of the collection with Mash. The command outputs detailed information about the best matching pangenome.

[!NOTE]

  • Add the --download flag to download the corresponding pangenome file.
  • The downloaded file can then be used with PPanGGOLiN’s projection command to annotate the input genome. See the PPanGGOLiN documentation for details.
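Because match-pangenome expects a FASTA file, a cheap sanity check on the input can catch a wrong path or file type before any comparison is attempted. The sketch below only tests that the first non-empty line starts with '>', so it applies to plain-text (uncompressed) files. The `looks_like_fasta` and `match_genome` helpers and the `PANGBANK_BIN` variable are hypothetical names used for illustration.

```shell
# Hypothetical helper: succeed if the first non-empty line of the file starts with '>'.
looks_like_fasta() {
    [ -s "$1" ] || return 1
    first=$(awk 'NF { print; exit }' "$1")
    case "$first" in
        ">"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical wrapper: check the input, then run the documented command with --download.
# Usage: match_genome GENOME_FASTA COLLECTION
match_genome() {
    if looks_like_fasta "$1"; then
        "${PANGBANK_BIN:-pangbank}" match-pangenome \
            --input-genome "$1" --collection "$2" --download
    else
        echo "skipping $1: not a plain-text FASTA file" >&2
        return 1
    fi
}
```

For example, `match_genome genome.fasta GTDB_all` would run the match only if `genome.fasta` passes the check.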

Citation

PanGBank pangenomes are constructed with PPanGGOLiN and its companion tools. If you use PanGBank or PanGBank-cli in your research, please cite the following references:

PPanGGOLiN: Depicting microbial diversity via a partitioned pangenome graph. Gautreau G et al. (2020) PLOS Computational Biology 16(3): e1007732. doi: 10.1371/journal.pcbi.1007732

panRGP: a pangenome-based method to predict genomic islands and explore their diversity. Bazin et al. (2020) Bioinformatics 36(Supplement_2): i651–i658. doi: 10.1093/bioinformatics/btaa792

panModule: detecting conserved modules in the variable regions of a pangenome graph. Bazin et al. (2021) bioRxiv. doi: 10.1101/2021.12.06.471380
