A multi-sample, multi-database taxonomic analysis using Kraken
MULTITAX — Multi-database Taxonomic Classification pipeline
- Overview: Runs taxonomic analysis on a set of samples using sequana_taxonomy (Kraken2 under the hood), optionally followed by BLAST on unclassified reads.
- Input: A set of FastQ files (paired-end or single-end).
- Output: An HTML report for each sample and a summary HTML report covering all samples.
- Status: Production
- Citation: Cokelaer et al. (2017), 'Sequana': a Set of Snakemake NGS pipelines, Journal of Open Source Software, 2(16), 352, doi:10.21105/joss.00352
Installation
pip install sequana-multitax
To upgrade an existing installation:
pip install sequana-multitax --upgrade
Quick Start
Step 1 — prepare the working directory:
sequana_multitax \
--input-directory /path/to/reads \
--databases /path/to/krakendb
This creates a multitax/ working directory containing config.yaml and a multitax.sh launch script.
Step 2 — review the configuration (optional but recommended):
cd multitax
cat config.yaml   # review the settings; edit the file to adjust parameters as needed
Step 3 — run the pipeline:
sh multitax.sh
Taxonomic database
You will need one or more Kraken2 databases. You can download a toy database for testing:
sequana_taxonomy --download toydb
The pipeline also requires a taxonomy file stored in ~/.config/sequana/taxonomy.dat. Download it once with:
sequana_multitax --update-taxonomy
Re-run this command whenever unknown taxon IDs appear in the HTML reports; this usually means the local taxonomy file is out of date.
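Before launching a run, it can help to confirm that the taxonomy file is actually in place. The following is a minimal sketch, assuming only the path given above; the function names `taxonomy_file` and `taxonomy_is_present` are hypothetical helpers and are not part of sequana.

```python
from pathlib import Path
from typing import Optional


def taxonomy_file(home: Optional[Path] = None) -> Path:
    """Return the expected location of taxonomy.dat under a home directory."""
    base = home if home is not None else Path.home()
    return base / ".config" / "sequana" / "taxonomy.dat"


def taxonomy_is_present(home: Optional[Path] = None) -> bool:
    """True if the taxonomy file exists and is not empty."""
    path = taxonomy_file(home)
    # An empty file most likely means an interrupted download.
    return path.is_file() and path.stat().st_size > 0


if __name__ == "__main__":
    if taxonomy_is_present():
        print(f"Found {taxonomy_file()}")
    else:
        print("Taxonomy file missing or empty; download it as described above.")
```

The check says nothing about how recent the file is; it only catches the case where the download was never done or was interrupted.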
Multiple databases can be passed to run iterative classification:
sequana_multitax \
--input-directory /path/to/reads \
--databases /path/to/virusdb /path/to/bacteriadb
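When the same call has to be generated for several projects, assembling the argument list in a script keeps the database order explicit. This is a minimal sketch that uses only the two options shown above; `multitax_command` is a hypothetical helper, not part of the package.

```python
import shlex
from typing import List, Sequence


def multitax_command(input_directory: str, databases: Sequence[str]) -> List[str]:
    """Build the sequana_multitax argument list; database order is preserved."""
    if not databases:
        raise ValueError("at least one Kraken2 database is required")
    return [
        "sequana_multitax",
        "--input-directory", input_directory,
        "--databases", *databases,
    ]


cmd = multitax_command(
    "/path/to/reads",
    ["/path/to/virusdb", "/path/to/bacteriadb"],
)
# Print a shell-safe version of the command for inspection.
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` to execute it; printing it first makes it easy to verify the order of the databases.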
Apptainer / Singularity
Every tool runs inside a pre-built container. Point --apptainer-prefix to a shared directory so images are downloaded once and reused across projects:
sequana_multitax \
--input-directory /path/to/reads \
--databases /path/to/krakendb \
--apptainer-prefix ~/.sequana/apptainers
Pass extra bind mounts with --apptainer-args if your data lives outside $HOME:
--apptainer-args "-B /data:/data"
When running snakemake manually, include the apptainer options:
snakemake -s multitax.rules --configfile config.yaml --cores 4 \
--use-apptainer \
--apptainer-prefix ~/.sequana/apptainers \
--apptainer-args "-B /home:/home"
HPC / SLURM cluster
On a cluster with SLURM, pass --profile slurm:
sequana_multitax \
--input-directory /path/to/reads \
--databases /path/to/krakendb \
--profile slurm \
--slurm-queue fast \
--jobs 40 \
--apptainer-prefix /shared/containers
BLAST on unclassified reads
Reads that remain unclassified after Kraken can optionally be BLASTed against a local database:
sequana_multitax \
--input-directory /path/to/reads \
--databases /path/to/krakendb \
--store-unclassified \
--do-blast-unclassified
This requires a local BLAST+ installation and a downloaded nt database.
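A quick pre-flight check can catch a missing BLAST+ installation before the pipeline starts. The sketch below only looks for the `blastn` executable (part of BLAST+) on the PATH; it does not verify that the nt database has been downloaded, and `blast_available` is a hypothetical helper name.

```python
import shutil


def blast_available(executable: str = "blastn") -> bool:
    """Return True if the given BLAST+ executable is found on the PATH."""
    return shutil.which(executable) is not None


if blast_available():
    print("blastn found on PATH")
else:
    print("blastn not found; install BLAST+ before enabling the BLAST step")
```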
Pipeline overview
- Kraken2: classify reads against one or more databases sequentially.
- Krona: interactive pie charts per sample.
- BLAST (optional): align unclassified reads against a nucleotide database.
- MultiQC: aggregated summary report across all samples.
Each sample produces an HTML report with a static pie chart (species distribution; grey = unclassified) that links to an interactive Krona chart.
When multiple databases are provided, they are applied sequentially, so the order matters: reads classified by the first database are removed before the second database is run.
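The effect of database order can be illustrated with a toy model. This is a minimal sketch and not the pipeline's implementation: each "database" is a hypothetical dictionary mapping read IDs to taxon names, standing in for a real Kraken2 run, and `classify_sequentially` is an illustrative name.

```python
from typing import Dict, List, Tuple

Database = Tuple[str, Dict[str, str]]  # (database name, read ID -> taxon)


def classify_sequentially(
    reads: List[str], databases: List[Database]
) -> Tuple[Dict[str, Tuple[str, str]], List[str]]:
    """Toy model: classified reads are removed before the next database runs."""
    assigned: Dict[str, Tuple[str, str]] = {}
    remaining = list(reads)
    for db_name, lookup in databases:
        still_unclassified = []
        for read in remaining:
            if read in lookup:
                # The first database that recognises the read wins.
                assigned[read] = (db_name, lookup[read])
            else:
                still_unclassified.append(read)
        # Only unclassified reads are passed on to the next database.
        remaining = still_unclassified
    return assigned, remaining


# Hypothetical data: read r2 is known to both databases.
virus_db: Database = ("virusdb", {"r1": "Virus A", "r2": "Virus B"})
bacteria_db: Database = ("bacteriadb", {"r2": "Bacterium X", "r3": "Bacterium Y"})
reads = ["r1", "r2", "r3", "r4"]

assigned, unclassified = classify_sequentially(reads, [virus_db, bacteria_db])
print(assigned["r2"])  # ('virusdb', 'Virus B')
print(unclassified)    # ['r4']
```

Swapping the two databases would assign r2 to the bacterial database instead, which is why the order given to --databases should reflect the priority of each database.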
Configuration file
After running sequana_multitax, a config.yaml is created in the working directory. Key sections:
- sequana_taxonomy: databases, confidence threshold, store_unclassified
- blast: enable or disable BLAST on unclassified reads
- multiqc: aggregated report settings
Full reference: config.yaml
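After editing the configuration by hand, a small sanity check can confirm that the key sections are still there. The sketch below works on an already parsed configuration (for instance a dictionary loaded with PyYAML's `yaml.safe_load`). Treating the three section names as top-level keys is an assumption, the inner keys of the example are illustrative only, and `missing_sections` is a hypothetical helper.

```python
from typing import Any, Dict, List

# Section names taken from the list above; treating them as top-level keys
# of config.yaml is an assumption.
EXPECTED_SECTIONS = ("sequana_taxonomy", "blast", "multiqc")


def missing_sections(config: Dict[str, Any]) -> List[str]:
    """Return the expected sections that are absent from a parsed config."""
    return [name for name in EXPECTED_SECTIONS if name not in config]


# Hypothetical parsed configuration; the inner keys are illustrative only.
example = {
    "sequana_taxonomy": {"databases": ["/path/to/krakendb"]},
    "multiqc": {},
}
print(missing_sections(example))  # ['blast']
```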
Requirements
- kraken2
- sequana_taxonomy
- krona
Changelog
| Version | Description |
|---|---|
| 0.15.0 | |
| 0.14.1 | |
| 0.14.0 | |
| 0.13.0 | |
| 0.12.2 | |
| 0.12.1 | |
| 0.12.0 | |
| 0.11.1 | |
| 0.11.0 | |
| 0.10.2 | |
| 0.10.1 | |
| 0.10.0 | |
| 0.9.2 | |
| 0.9.1 | |
| 0.9.0 | |
| 0.8.7 | |
| 0.8.6 | |
| 0.8.5 | |
| 0.8.4 | |
| 0.8.3 | |
| 0.8.2 | |
| 0.8.1 | Fix requirements. |
| 0.8.0 | First release. |