A bioinformatics tool for analyzing Bifidobacteria in sequencing data.

Bifidotyper

Bifidotyper is a fast, lightweight bioinformatics tool that takes you from raw FASTQ files to a complete, reproducible analysis of the bifidobacterial strains in your samples. It uses Sylph for rapid, k-mer-based comparison of reads against reference genomes. It also uses Salmon to detect genes required for the metabolism of human milk oligosaccharides (HMOs), based on alignments to the Bifidobacterium longum[^1] genome with gene annotations from Henrick et al.[^2]

[^2]: Henrick, B. M. et al. (2021). Bifidobacteria-mediated immune system imprinting early in life. Cell, 184(15). https://doi.org/10.1016/j.cell.2021.05.030

Bifidotyper was developed as part of a PhD rotation in the Olm Lab and the IQ Biology program.

Installation

Bifidotyper can be installed with pip, but it depends on Sylph and Salmon, which are not distributed through pip. You can install both with Conda (conda install -c bioconda sylph salmon). For ease of use, a Sylph binary is bundled with the package and Salmon is downloaded automatically if they aren't found on your PATH.
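To see in advance whether the bundled or downloaded binaries would be needed, you can check your PATH yourself. The snippet below is a minimal sketch using only the Python standard library; the helper name missing_tools is hypothetical and is not part of Bifidotyper.

```python
import shutil


def missing_tools(tools=("sylph", "salmon")):
    """Return the names in `tools` that cannot be found on PATH.

    Hypothetical helper for illustration; Bifidotyper performs its own check.
    """
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not on PATH:", ", ".join(missing))
    else:
        print("sylph and salmon are both on PATH")
```

An empty result means both tools are already available, for example from the Conda install above.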

Clone the repository and install the package:

git clone https://github.com/Bennibraun/bifidotyper.git
cd bifidotyper
pip install -e .

A proper distribution via PyPI or Anaconda is planned.

Usage

Command Line Interface

For single-end reads:

bifidotyper -se <single-end FASTQ files> [-t <threads>] [-g <genome-dir> | -s <genome-sketch>]

Or paired-end reads:

bifidotyper -pe <paired-end FASTQ files> [--r1-suffix <R1 suffix>] [--r2-suffix <R2 suffix>] [-t <threads>] [-g <genome-dir> | -s <genome-sketch>]

Options

  • -se, --single-end: Single-end FASTQ files.
  • -pe, --paired-end: Paired-end FASTQ files (R1 and R2 files, supports wildcards).
  • -t, --threads: Number of threads to use for parallel processing (default: 1).
  • --r1-suffix: Suffix for R1 files (optional, paired-end mode only; default: "_R1").
  • --r2-suffix: Suffix for R2 files (optional, paired-end mode only; default: "_R2").
  • -g, --genome-dir: Directory containing reference genomes (optional, defaults to provided genomes).
  • -s, --genome-sketch: Path to a pre-sketched Sylph genome database (optional, defaults to provided database). Use sylph sketch *.fna to generate your own database.
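The --r1-suffix and --r2-suffix options imply that mates are matched by file name. The sketch below shows one way such matching could work, so you can check that your files pair up as expected before a run. It is an illustration under that assumption, not Bifidotyper's actual implementation, and pair_reads is a hypothetical name.

```python
from pathlib import Path


def pair_reads(paths, r1_suffix="_R1", r2_suffix="_R2"):
    """Group FASTQ paths into {key: (r1_path, r2_path)}.

    The key is the file name with the first occurrence of the suffix removed.
    Illustrative only: files with the same name in different directories
    would collide, and Bifidotyper's own matching rules may differ.
    """
    r1, r2 = {}, {}
    for path in map(Path, paths):
        name = path.name
        if r1_suffix in name:
            r1[name.replace(r1_suffix, "", 1)] = path
        elif r2_suffix in name:
            r2[name.replace(r2_suffix, "", 1)] = path
    # Any key present on only one side has no mate.
    unpaired = sorted(set(r1) ^ set(r2))
    if unpaired:
        raise ValueError(f"Files without a mate: {unpaired}")
    return {key: (r1[key], r2[key]) for key in sorted(r1)}
```

Passing the result of glob.glob("data/*.fastq.gz") to pair_reads raises a ValueError if any file lacks its mate.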

Examples

# Run bifidotyper in paired-end mode with 4 threads and _R1/_R2 suffixes
bifidotyper -pe data/*.fastq.gz --r1-suffix _R1 --r2-suffix _R2 -t 4
# Run bifidotyper in single-end mode with 8 threads and a custom genome directory
bifidotyper -se data/*.fastq.gz -t 8 -g my_genomes/
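If you drive Bifidotyper from a Python script rather than a shell, the same invocations can be assembled as argument lists. This is a minimal sketch that uses only the flags documented above; build_command and run_bifidotyper are hypothetical helper names, and the package is not assumed to expose a Python API.

```python
import subprocess


def build_command(fastqs, paired=True, threads=1,
                  genome_dir=None, genome_sketch=None):
    """Assemble a bifidotyper argument list from the documented CLI flags."""
    # The usage line shows -g and -s as alternatives, so reject both together.
    if genome_dir and genome_sketch:
        raise ValueError("use either genome_dir (-g) or genome_sketch (-s), not both")
    cmd = ["bifidotyper", "-pe" if paired else "-se"]
    cmd += [str(f) for f in fastqs]
    cmd += ["-t", str(threads)]
    if genome_dir:
        cmd += ["-g", str(genome_dir)]
    if genome_sketch:
        cmd += ["-s", str(genome_sketch)]
    return cmd


def run_bifidotyper(*args, **kwargs):
    """Run the assembled command; raises CalledProcessError on failure."""
    return subprocess.run(build_command(*args, **kwargs), check=True)
```

Because no shell is involved, wildcards are not expanded; expand them first, for example with glob.glob("data/*.fastq.gz"), and pass the resulting list as fastqs.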

Output

The tool generates several output files and directories:

  • plots/: All plots generated by the program.
  • sylph_genome_sketches/: The database of Sylph genome indices.
  • sylph_fastq_sketches/: K-mer indices of input samples processed with Sylph.
  • sylph_genome_queries/: Results of running Sylph queries against the genomes.
  • hmo_quantification/: HMO gene alignments with Salmon.

Provided Reference Files


License

This project is licensed under the MIT License.

Contributing

Contributions are welcome! Please open an issue or submit a pull request on GitHub.
