A collection of scripts designed to process Kraken2 reports and convert them into CSV format.
Reason this release was yanked:
Outdated
KrakenParser: Convert Kraken2 Reports to CSV
Overview
KrakenParser is a collection of scripts designed to process Kraken2 reports and convert them into CSV format. This pipeline extracts taxonomic abundance data at six levels:
- Phylum
- Class
- Order
- Family
- Genus
- Species
You can run the entire pipeline with a single command, or use the scripts individually depending on your needs.
Output example
counts_phylum.csv parsed from 7 Kraken2 reports of metagenomic samples using KrakenParser:
Sample_id,Euryarchaeota,Euglenozoa,Parabasalia,Apicomplexa,Basidiomycota,Ascomycota,Acidobacteriota,Bdellovibrionota,Chlorobiota,Ignavibacteriota,Planctomycetota,Spirochaetota,Thermotogota,Fusobacteriota,Cyanobacteriota,Mycoplasmatota,Actinomycetota,Pseudomonadota,Bacteroidota,Deferribacterota,Campylobacterota,Thermodesulfobacteriota,Bacillota,Negarnaviricota,Nucleocytoviricota,Uroviricota,Peploviricota
X1,5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
X2,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,4,0,0,0,0
X3,11,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,1,0,0,0,4,0,0,0,0
X4,1313,0,0,0,0,4,0,0,0,0,0,1,2,2,1,3,3,17,33,4,5,4,112,0,0,0,0
X5,5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4,0,0,0,0
X6,0,0,0,0,0,0,0,0,0,0,1,1,0,1,1,0,0,3,3,0,3,2,13,0,0,0,1
X7,20,1,1,5,1,9,1,6,1,7,1,13,1,3,9,4,10,139,519,0,8,2,81,1,3,1,0
This counts_phylum.csv is easy to visualize as a relative abundance bar plot!
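Before plotting, the raw counts are usually converted to per-sample fractions. The following is a minimal sketch of that conversion using only the Python standard library; it is not part of KrakenParser. The function name `relative_abundance` is hypothetical, and the embedded table keeps only three samples (X1, X3, X5) and the four phyla in which they have non-zero counts in the example above.

```python
import csv
import io

# Subset of the counts_phylum.csv example: samples X1, X3, X5 and the only
# four phyla in which these samples have non-zero counts.
SAMPLE_CSV = """Sample_id,Euryarchaeota,Pseudomonadota,Bacteroidota,Bacillota
X1,5,0,0,0
X3,11,2,1,4
X5,5,0,0,4
"""


def relative_abundance(csv_text):
    """Return {sample: {taxon: fraction}} from a KrakenParser-style counts CSV."""
    result = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        sample = row.pop("Sample_id")
        counts = {taxon: int(value) for taxon, value in row.items()}
        total = sum(counts.values())
        # Guard against samples with no classified reads at this rank.
        result[sample] = {
            taxon: (count / total if total else 0.0)
            for taxon, count in counts.items()
        }
    return result


rel = relative_abundance(SAMPLE_CSV)
for sample, fractions in rel.items():
    print(sample, {taxon: round(value, 3) for taxon, value in fractions.items()})
```

Each row of the result sums to 1 (or to 0 for an empty sample), which is the form a stacked bar plot expects.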
Quick Start (Full Pipeline)
To run the full pipeline, use the following command:
KrakenParser data/kreports
This will:
- Convert Kraken2 reports to MPA format
- Combine MPA files into a single file
- Extract taxonomic levels into separate text files
- Process extracted text files
- Convert them into CSV format
Input Requirements
- The Kraken2 reports must be inside a subdirectory (e.g., data/kreports).
- The script automatically creates output directories and processes the data.
Installation
pip install krakenparser
Using Individual Modules
You can also run each step manually if needed.
Step 1: Convert Kraken2 Reports to MPA Format
KrakenParser --kreport2mpa -i data/kreports -o data/mpa
This script converts Kraken2 .kreport files into MPA format using KrakenTools.
Step 2: Combine MPA Files
KrakenParser --combine_mpa -i data/mpa/* -o data/COMBINED.txt
This merges multiple MPA files into a single combined file.
Step 3: Extract Taxonomic Levels
KrakenParser --deconstruct -i data/COMBINED.txt -o data/counts
This step extracts phylum-, class-, order-, family-, genus-, and species-level data into separate text files (excluding human reads).
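To illustrate what selecting one rank from an MPA-style table involves, here is a rough sketch; it is not KrakenParser's own implementation. It assumes each data line holds a `|`-separated lineage with rank prefixes such as `p__` or `s__`, followed by tab-separated counts, and that header lines start with `#`. The function name `extract_rank` and the example counts are made up for illustration, and human-read filtering is left out.

```python
# Rank prefixes as used in MPA-style lineages.
RANK_PREFIXES = {
    "phylum": "p__",
    "class": "c__",
    "order": "o__",
    "family": "f__",
    "genus": "g__",
    "species": "s__",
}

# Hypothetical combined table: lineage, then one count column per sample.
EXAMPLE_LINES = [
    "#Classification\tX1\tX2",
    "d__Bacteria\t9\t250",
    "d__Bacteria|p__Bacillota\t4\t112",
    "d__Bacteria|p__Bacillota|c__Bacilli\t3\t100",
]


def extract_rank(lines, rank):
    """Keep rows whose deepest clade belongs to the requested rank."""
    prefix = RANK_PREFIXES[rank]
    selected = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and the assumed header line
        lineage, *counts = line.rstrip("\n").split("\t")
        deepest = lineage.split("|")[-1]
        if deepest.startswith(prefix):
            selected.append((deepest, [int(c) for c in counts]))
    return selected


print(extract_rank(EXAMPLE_LINES, "phylum"))
```

Matching on the deepest clade rather than on any clade in the lineage avoids counting a phylum row again for each of its descendants.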
Step 4: Process Extracted Taxonomic Data
KrakenParser --process -i data/COMBINED.txt -o data/counts/txt/counts_phylum.txt
KrakenParser --process -i data/COMBINED.txt -o data/counts/txt/counts_class.txt
KrakenParser --process -i data/COMBINED.txt -o data/counts/txt/counts_order.txt
KrakenParser --process -i data/COMBINED.txt -o data/counts/txt/counts_family.txt
KrakenParser --process -i data/COMBINED.txt -o data/counts/txt/counts_genus.txt
KrakenParser --process -i data/COMBINED.txt -o data/counts/txt/counts_species.txt
This script cleans up taxonomic names (removes prefixes, replaces underscores with spaces).
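The name clean-up described here can be sketched in a few lines of Python. This is an illustration of the idea, not the script shipped with KrakenParser; `clean_taxon_name` is a hypothetical helper, and the assumption is that a prefix is one lowercase letter followed by two underscores.

```python
import re

# One lowercase letter plus two underscores at the start, e.g. "s__" or "g__".
_RANK_PREFIX = re.compile(r"^[a-z]__")


def clean_taxon_name(name):
    """Strip the rank prefix, then turn remaining underscores into spaces."""
    without_prefix = _RANK_PREFIX.sub("", name)
    return without_prefix.replace("_", " ")


for raw in ["s__Escherichia_coli", "p__Bacillota", "Bacillota"]:
    print(raw, "->", clean_taxon_name(raw))
```

The prefix is removed first so that its two underscores are not turned into leading spaces.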
Step 5: Convert TXT to CSV
KrakenParser --txt2csv -i data/counts/txt/counts_phylum.txt -o data/counts/csv/counts_phylum.csv
KrakenParser --txt2csv -i data/counts/txt/counts_class.txt -o data/counts/csv/counts_class.csv
KrakenParser --txt2csv -i data/counts/txt/counts_order.txt -o data/counts/csv/counts_order.csv
KrakenParser --txt2csv -i data/counts/txt/counts_family.txt -o data/counts/csv/counts_family.csv
KrakenParser --txt2csv -i data/counts/txt/counts_genus.txt -o data/counts/csv/counts_genus.csv
KrakenParser --txt2csv -i data/counts/txt/counts_species.txt -o data/counts/csv/counts_species.csv
This converts the processed text files into structured CSV format.
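Steps 4 and 5 repeat the same command once per rank, so the twelve command lines can be generated in a loop. The sketch below only builds and prints the commands shown above, using the flags and paths from this README; the function name `build_commands` and its parameters are hypothetical.

```python
import shlex

RANKS = ["phylum", "class", "order", "family", "genus", "species"]


def build_commands(combined="data/COMBINED.txt", counts_dir="data/counts"):
    """Return the Step 4 and Step 5 command lines as argument lists."""
    commands = []
    # Step 4: one --process call per rank.
    for rank in RANKS:
        txt_path = f"{counts_dir}/txt/counts_{rank}.txt"
        commands.append(["KrakenParser", "--process", "-i", combined, "-o", txt_path])
    # Step 5: one --txt2csv call per rank.
    for rank in RANKS:
        txt_path = f"{counts_dir}/txt/counts_{rank}.txt"
        csv_path = f"{counts_dir}/csv/counts_{rank}.csv"
        commands.append(["KrakenParser", "--txt2csv", "-i", txt_path, "-o", csv_path])
    return commands


commands = build_commands()
for command in commands:
    print(shlex.join(command))
```

To actually run them, each argument list could be passed to `subprocess.run(command, check=True)`, provided KrakenParser is installed and on the PATH.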
Arguments Breakdown
KrakenParser (Main Pipeline)
- Automates the entire workflow.
- Takes one argument: the path to Kraken2 reports (data/kreports).
- Runs all the scripts in sequence.
--kreport2mpa (Step 1)
- Converts Kraken2 reports to MPA format.
- Uses KrakenTools/kreport2mpa.py.
--combine_mpa (Step 2)
- Combines multiple MPA files into one.
- Uses KrakenTools/combine_mpa.py.
--deconstruct (Step 3)
- Extracts phylum, class, order, family, genus, species into separate text files.
- Removes human-related reads.
--process (Step 4)
- Cleans and formats extracted taxonomic data.
- Removes prefixes (s__, g__, etc.), replaces underscores with spaces.
--txt2csv (Step 5)
- Converts cleaned text files to CSV.
- Transposes data so that sample names become rows.
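The transposition can be pictured with a small standard-library sketch. It is not the package's own code, and it assumes a tab-separated input whose first row lists the sample names and whose remaining rows each hold one taxon and its counts; the actual layout of the intermediate text files may differ. `transpose_to_csv` is a hypothetical name, and the counts are taken from the X1 and X2 rows of the phylum example.

```python
import csv
import io

# Assumed taxon-by-sample layout (tab-separated), one taxon per row.
EXAMPLE_TXT = "Taxon\tX1\tX2\nEuryarchaeota\t5\t4\nBacillota\t0\t4\n"


def transpose_to_csv(txt):
    """Turn a taxon-by-sample table into sample-by-taxon CSV text."""
    rows = [line.split("\t") for line in txt.splitlines() if line.strip()]
    # zip(*rows) swaps rows and columns: each sample becomes one row.
    transposed = [list(column) for column in zip(*rows)]
    transposed[0][0] = "Sample_id"  # label of the first column in the output
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(transposed)
    return buffer.getvalue()


csv_text = transpose_to_csv(EXAMPLE_TXT)
print(csv_text)
```

With the example input this prints a header row `Sample_id,Euryarchaeota,Bacillota` followed by one row per sample.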
Example Output Structure
After running the full pipeline, the output directory will look like this:
data/
├─ kreports/          # Input Kraken2 reports
├─ mpa/               # Converted MPA files
├─ COMBINED.txt       # Merged MPA file
└─ counts/
   ├─ txt/            # Extracted taxonomic levels in TXT
   │  ├─ counts_species.txt
   │  ├─ counts_genus.txt
   │  ├─ counts_family.txt
   │  ├─ ...
   └─ csv/            # Final CSV output
      ├─ counts_species.csv
      ├─ counts_genus.csv
      ├─ counts_family.csv
      ├─ ...
Conclusion
KrakenParser provides a simple and automated way to convert Kraken2 reports into usable CSV files for downstream analysis. You can run the full pipeline with a single command or use individual scripts as needed.
For any issues or feature requests, feel free to open an issue on GitHub!
🚀 Happy analyzing!
File details
Details for the file krakenparser-0.1.31.tar.gz.
File metadata
- Download URL: krakenparser-0.1.31.tar.gz
- Upload date:
- Size: 8.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 54b746a6c3bf71dacbeed94b124a19ee9c2c3a45f811723fd61c2620ddc7f149 |
| MD5 | 85c0c16625b9d50d9e5ff4dc0652dfd3 |
| BLAKE2b-256 | 3a58981c8c4cab146e8ef9e6abe337af5a271d613e9b10a0c44e5081b4a5ec3f |
File details
Details for the file KrakenParser-0.1.31-py3-none-any.whl.
File metadata
- Download URL: KrakenParser-0.1.31-py3-none-any.whl
- Upload date:
- Size: 15.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0dbe290b0f348103489bed6e8ab3bfd4b28cd43d7bea9c7e65424535f26e9c9e |
| MD5 | d60638d6b074ab036922276b5a3cb643 |
| BLAKE2b-256 | 01f8d79f83258740295e684c3622f61d683b59be32680304ecf6d77c782b04a5 |