A bioinformatics tool for analyzing Bifidobacteria in sequencing data.

Bifidotyper

Bifidotyper is a fast, lightweight bioinformatics tool that takes you from raw FASTQ files to a complete, reproducible analysis of the Bifidobacterial strains in your samples. It uses Sylph for rapid, k-mer-based profiling of reads against reference genomes. It also uses Salmon to detect genes required for the metabolism of human milk oligosaccharides (HMOs), based on alignments to Bifidobacterium longum genes annotated by Henrick et al.

Bifidotyper was developed as part of a PhD rotation in the Olm Lab and the IQ Biology program.

Bifidotyper Graphical Abstract


Installation

pip install bifidotyper

Note: Bifidotyper can be installed with pip, but it depends on Sylph and Salmon, which have no pip distributions. For ease of use, binaries are bundled for Sylph and downloaded automatically for Salmon if the tools aren't found in your PATH. If you have problems with either, you can install both manually with Conda (conda install -c bioconda sylph salmon).
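
Before falling back to Conda, it can help to see which of the two tools are already visible on your PATH. The following is a minimal sketch, not part of Bifidotyper: the function names missing_tools and report_deps are hypothetical, and a tool reported as missing is not necessarily a problem, since Bifidotyper ships or downloads its own binaries as described above.

```shell
# Hypothetical helper (not part of Bifidotyper): print each named tool
# that cannot be found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Report on the two external dependencies named in the note above.
report_deps() {
    missing=$(missing_tools sylph salmon)
    if [ -n "$missing" ]; then
        printf 'Not found on PATH:\n%s\n' "$missing"
        printf 'Manual fallback: conda install -c bioconda sylph salmon\n'
    else
        printf 'sylph and salmon are both on PATH\n'
    fi
}
```

Calling report_deps in the environment where you installed Bifidotyper prints either a confirmation or the list of tools it could not find.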


Usage

Command Line Interface

For single-end reads:

bifidotyper -se <single-end FASTQ files> [-t <threads>] [-g <genome-dir> | -s <genome-sketch>] [-r <rpm_threshold>]

Or paired-end reads:

bifidotyper -pe <paired-end FASTQ files> [--r1-suffix <R1 suffix>] [--r2-suffix <R2 suffix>] [-t <threads>] [-g <genome-dir> | -s <genome-sketch>] [-r <rpm_threshold>]
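
The --r1-suffix and --r2-suffix options imply that mates are matched by file name. How Bifidotyper pairs files internally is not documented here, so the sketch below only illustrates the assumed convention: the R2 name is the R1 name with the suffix swapped. The function mate_for is hypothetical and can be used to check that every R1 file has a mate before starting a run.

```shell
# Hypothetical helper (not part of Bifidotyper): derive the R2 file name
# from an R1 file name by swapping the last occurrence of the R1 suffix.
# Arguments: <R1 file> [R1 suffix, default _R1] [R2 suffix, default _R2]
mate_for() {
    r1_file=$1
    r1_suffix=${2:-_R1}
    r2_suffix=${3:-_R2}
    # Strip everything from the last occurrence of the R1 suffix onward.
    prefix=${r1_file%"$r1_suffix"*}
    if [ "$prefix" = "$r1_file" ]; then
        printf 'no "%s" in %s\n' "$r1_suffix" "$r1_file" >&2
        return 1
    fi
    # Whatever followed the suffix (for example .fastq.gz) is kept as-is.
    rest=${r1_file#"$prefix$r1_suffix"}
    printf '%s%s%s\n' "$prefix" "$r2_suffix" "$rest"
}

# Example check, assuming reads live in data/ (adjust to your layout):
# for r1 in data/*_R1*.fastq.gz; do
#     r2=$(mate_for "$r1") && [ -f "$r2" ] || echo "missing mate for $r1"
# done
```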

Options

  • -se, --single-end: Single-end FASTQ files.
  • -pe, --paired-end: Paired-end FASTQ files (R1 and R2 files, supports wildcards).
  • -t, --threads: Number of threads to use for parallel processing (default: 1).
  • --r1-suffix: Suffix for R1 files (optional, paired-end mode only; default: "_R1").
  • --r2-suffix: Suffix for R2 files (optional, paired-end mode only; default: "_R2").
  • -g, --genome-dir: Directory containing reference genomes (optional, defaults to provided genomes).
  • -s, --genome-sketch: Path to a pre-sketched Sylph genome database (optional, defaults to provided database). Use sylph sketch *.fna to generate your own database.
  • -r, --rpm_threshold: Minimum RPM threshold for HMO genes to be considered present (default: 10).
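
To make the -r threshold concrete, here is a worked example that assumes RPM means reads per million, that is, the reads assigned to a gene scaled to one million total reads. The README does not state how Bifidotyper normalizes counts, so treat the formula, and the hypothetical functions rpm and meets_threshold, as an illustration only.

```shell
# Assumed definition: RPM = mapped reads * 1,000,000 / total reads.
# Arguments: <reads mapped to the gene> <total reads in the sample>
rpm() {
    awk -v mapped="$1" -v total="$2" \
        'BEGIN { printf "%.2f\n", mapped * 1e6 / total }'
}

# Succeeds (exit status 0) when the value meets the threshold.
# Arguments: <RPM value> [threshold, default 10 as for -r]
meets_threshold() {
    awk -v value="$1" -v threshold="${2:-10}" \
        'BEGIN { exit !(value + 0 >= threshold + 0) }'
}

# 50 reads on a gene in a sample of 2,000,000 reads:
rpm 50 2000000    # prints 25.00
```

Under this assumption, a gene with 50 reads in a 2,000,000-read sample sits at 25 RPM and passes the default threshold of 10, while a gene with 9 reads in the same sample sits at 4.5 RPM and does not.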

Examples

# Run bifidotyper in paired-end mode with 4 threads and _R1/_R2 suffixes
bifidotyper -pe data/*.fastq.gz --r1-suffix _R1 --r2-suffix _R2 -t 4
# Run bifidotyper in single-end mode with 8 threads, a custom genome directory, and a custom RPM threshold
bifidotyper -se data/*.fastq.gz -t 8 -g my_genomes/ -r 25

Output

The tool generates several output files and directories:

  • plots/: All plots generated by the program. Also includes tables for convenience.
  • sylph_genome_sketches/: The database of Sylph genome indices.
  • sylph_fastq_sketches/: K-mer indices of input samples processed with Sylph.
  • sylph_genome_queries/: Results of running Sylph queries against the genomes.
  • hmo_quantification/: HMO gene alignments with Salmon.
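
After a run, a quick way to confirm that every stage produced output is to check for the directories listed above. This sketch assumes they are all created under a single location (the current directory by default), which the README does not state explicitly; missing_outputs is a hypothetical helper, not a Bifidotyper command.

```shell
# Hypothetical helper (not part of Bifidotyper): print each expected
# output directory that is absent under the given root (default: .).
missing_outputs() {
    root=${1:-.}
    for d in plots sylph_genome_sketches sylph_fastq_sketches \
             sylph_genome_queries hmo_quantification; do
        [ -d "$root/$d" ] || printf '%s\n' "$d"
    done
}
```

Running missing_outputs with no argument in the directory where you ran Bifidotyper prints nothing when all five directories exist.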

Provided Reference Files


License

This project is licensed under the MIT License.

Contributing

Contributions are welcome! Please open an issue or submit a pull request on GitHub.
