A bioinformatics tool for analyzing Bifidobacteria in sequencing data.
Bifidotyper
Bifidotyper is a fast, lightweight bioinformatics tool designed to take you from raw FastQ files to a complete and reproducible analysis of Bifidobacterial strains in your samples. It makes use of Sylph for rapid, k-mer-based read alignments. It also uses Salmon to detect the presence of genes necessary for the metabolism of human milk oligosaccharides (HMOs) based on alignments to Bifidobacterium longum genes annotated by Henrick et al.
Bifidotyper was developed as part of a PhD rotation in the Olm Lab and the IQ Biology program.
Installation
pip install bifidotyper
[!NOTE] Bifidotyper can be installed with `pip`, but it depends on Sylph and Salmon, which don't have `pip` distributions. For ease of use, binaries are included for Sylph and automatically downloaded for Salmon if they aren't found in your `PATH`. If you have problems with these, you can install both manually with Conda (`conda install -c bioconda sylph salmon`).
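If you are unsure whether Sylph and Salmon are already visible on your `PATH`, a quick check along these lines can help. This is a minimal sketch using only the Python standard library. The helper name `find_tools` is hypothetical and not part of Bifidotyper, and the executable names `sylph` and `salmon` are assumed from the Conda command above.

```python
import shutil


def find_tools(names=("sylph", "salmon")):
    """Map each executable name to its resolved path, or None if it is not on PATH.

    The default names are an assumption based on the Conda package names.
    """
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

A `None` entry only means the executable was not found on `PATH`; per the note above, Bifidotyper may still fall back to its bundled or downloaded binaries.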
Usage
Command Line Interface
For single-end reads:
bifidotyper -se <single-end FASTQ files> [-t <threads>] [-g <genome-dir> | -s <genome-sketch>] [-r <rpm_threshold>]
Or paired-end reads:
bifidotyper -pe <paired-end FASTQ files> [--r1-suffix <R1 suffix>] [--r2-suffix <R2 suffix>] [-t <threads>] [-g <genome-dir> | -s <genome-sketch>] [-r <rpm_threshold>]
Options
- `-se`, `--single-end`: Single-end FASTQ files.
- `-pe`, `--paired-end`: Paired-end FASTQ files (R1 and R2 files, supports wildcards).
- `-t`, `--threads`: Number of threads to use for parallel processing (default: 1).
- `--r1-suffix`: Suffix for R1 files (optional, paired-end mode only; default: "_R1").
- `--r2-suffix`: Suffix for R2 files (optional, paired-end mode only; default: "_R2").
- `-g`, `--genome-dir`: Directory containing reference genomes (optional, defaults to provided genomes).
- `-s`, `--genome-sketch`: Path to a pre-sketched Sylph genome database (optional, defaults to provided database). Use `sylph sketch *.fna` to generate your own database.
- `-r`, `--rpm_threshold`: Minimum RPM threshold for HMO genes to be considered present (default: 10).
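To check ahead of time that your file names will pair up under a given `--r1-suffix` and `--r2-suffix`, something like the following may be useful. It is only a sketch of one plausible pairing rule (strip the suffix and everything after it, then match on what remains). Bifidotyper's own matching logic may differ, and the function name `pair_fastqs` is hypothetical.

```python
from pathlib import Path


def pair_fastqs(paths, r1_suffix="_R1", r2_suffix="_R2"):
    """Group FASTQ paths into (R1, R2) pairs keyed by sample name.

    Assumed rule: a file is R1 (or R2) when the suffix occurs in its file name,
    and the sample name is the file name up to the last occurrence of that suffix.
    Files containing neither suffix are ignored.
    """
    r1, r2 = {}, {}
    for p in map(Path, paths):
        name = p.name
        if r1_suffix in name:
            r1[name.rsplit(r1_suffix, 1)[0]] = p
        elif r2_suffix in name:
            r2[name.rsplit(r2_suffix, 1)[0]] = p
    # Samples that appear in only one of the two groups have no mate.
    unpaired = sorted(set(r1) ^ set(r2))
    if unpaired:
        raise ValueError(f"Samples missing a mate: {unpaired}")
    return {sample: (r1[sample], r2[sample]) for sample in sorted(r1)}
```

For example, `pair_fastqs(["data/a_R1.fastq.gz", "data/a_R2.fastq.gz"])` groups both files under the sample name `a`, while a lone `a_R1.fastq.gz` raises a `ValueError`.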
Examples
# Run bifidotyper in paired-end mode with 4 threads and _R1/_R2 suffixes
bifidotyper -pe data/*.fastq.gz --r1-suffix _R1 --r2-suffix _R2 -t 4
# Run bifidotyper in single-end mode with 8 threads, a custom genome directory, and a custom RPM threshold
bifidotyper -se data/*.fastq.gz -t 8 -g my_genomes/ -r 25
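The `-r` option sets a minimum RPM (reads per million) value for an HMO gene to count as present. As a rough illustration of what such a cutoff means, the sketch below normalizes per-gene read counts by the total read count and applies the threshold. The choice of denominator (total reads in the sample), the `>=` comparison, and the names `reads_per_million` and `genes_present` are all assumptions made for illustration; they are not taken from Bifidotyper's source.

```python
def reads_per_million(gene_reads, total_reads):
    """Scale a gene's read count to reads per million total reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return gene_reads * 1_000_000 / total_reads


def genes_present(gene_counts, total_reads, rpm_threshold=10.0):
    """Return {gene: True/False} depending on whether its RPM meets the threshold.

    Assumes a gene is 'present' when RPM >= rpm_threshold.
    """
    return {
        gene: reads_per_million(count, total_reads) >= rpm_threshold
        for gene, count in gene_counts.items()
    }


# Hypothetical counts for two placeholder genes in a sample of 2 million reads.
example = genes_present({"geneA": 50, "geneB": 5}, total_reads=2_000_000)
```

Under these assumptions, `geneA` (25 RPM) passes the default threshold of 10 and `geneB` (2.5 RPM) does not. Raising the threshold makes presence calls stricter, which may be preferable for deeply sequenced samples.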
Output
The tool generates several output files and directories:
- `plots/`: All plots generated by the program. Also includes tables for convenience.
- `sylph_genome_sketches/`: The database of Sylph genome indices.
- `sylph_fastq_sketches/`: K-mer indices of input samples processed with Sylph.
- `sylph_genome_queries/`: Results of running Sylph queries against the genomes.
- `hmo_quantification/`: HMO gene alignments with Salmon.
Provided Reference Files
- HMO functional annotations were retrieved from Henrick et al. 2021. The table is provided in `src/data/reference/humann2_HMO_annotation.csv`.
- All B. longum annotations are from the NCBI record for CP001095.1.
- A pre-processed Sylph genome database is provided for ease of use. Any genome assigned to the genus Bifidobacterium in NCBI and GTDB was included. The genome database was dereplicated with dRep using `--S_ani 0.95` before indexing. All genome accessions are listed in `src/data/reference/genomes.csv`.
License
This project is licensed under the MIT License.
Contributing
Contributions are welcome! Please open an issue or submit a pull request on GitHub.