A computational workflow designed to recover plastid genomes from metagenomes.
Project description
Author: Yuhao Tong (The University of Melbourne)
A Snakemake workflow for MMA metagenomics that recovers chloroplast genomes.
Project implementation commencement date: August 8th, 2023
This workflow aims to cover all the essential steps needed to carry out a genomics-based metagenomic analysis for recovering algal plastid genomes. Before starting new jobs:
Set up the binny workflow within the ChloroScan working directory.
Set up the CAT taxonomy identification database; for details, see https://tbb.bio.uu.nl/bastiaan/CAT_prepare/
Make sure FragGeneScanRs is on your PATH.
Running autoInit.sh covers all of the steps above.
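As a quick sanity check before launching a job, the prerequisites listed above can be tested from Python. This is a minimal sketch, not part of ChloroScan: the function name `missing_prerequisites` is hypothetical, and the assumption that binny lives in a subdirectory called `binny` of the working directory is ours, so adjust it to your layout.

```python
import shutil
from pathlib import Path


def missing_prerequisites(workdir, executables=("FragGeneScanRs",), directories=("binny",)):
    """Return a list of human-readable problems; an empty list means nothing obvious is missing.

    The directory name "binny" is an assumption about the layout, not a documented path.
    """
    problems = []
    for exe in executables:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(exe) is None:
            problems.append(f"{exe} not found on PATH")
    for name in directories:
        if not (Path(workdir) / name).is_dir():
            problems.append(f"directory {name!r} not found in {workdir}")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites("."):
        print("WARNING:", problem)
```

The check only confirms that the tools can be found; it does not verify that the CAT database is complete or that binny is configured correctly.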
Modules of the workflow:
bio-corgi: classifies contigs so that plastid contigs can be identified and retained.
binny (customized for ChloroScan): clusters contigs into bins, i.e. metagenome-assembled genomes (MAGs).
CAT/BAT: assigns a taxonomy to each contig to help identify the taxon of each bin (this workflow uses the conda version of CAT/BAT; the nr database has been updated to the 2023/11/20 version).
summary.py: stores binning and taxonomic identification information in tabular form to support further interpretation of the data.
visualization.py: uses the spreadsheet output from summary.py to visualize contig clustering as a scatterplot, taxonomy as a pie chart, and contig depth as a violin plot.
refinement.py: removes contigs from bins that are “not taxonomically identified as eukaryotic” AND contain no markers predicted from the database used by binny.
CDS extraction: FragGeneScanRs (installed via cargo) predicts ORFs from contigs, and gffread converts the GFF files into FASTA format to enable downstream analysis.
Dataset microbial community visualization: Krona produces a pie chart of the microbial taxonomic groups in the metagenome.
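The refinement rule described for refinement.py combines two conditions with a logical AND, which is easy to misread as OR. The sketch below illustrates only that rule; it is not the actual refinement.py code, and the `Contig` class and its fields `is_eukaryotic` and `marker_count` are hypothetical stand-ins for whatever the taxonomy and marker outputs provide.

```python
from dataclasses import dataclass


@dataclass
class Contig:
    name: str
    is_eukaryotic: bool  # hypothetical flag derived from the contig's taxonomic assignment
    marker_count: int    # hypothetical count of marker genes found on the contig


def should_remove(contig):
    # A contig is dropped only when BOTH conditions hold:
    # it is not identified as eukaryotic AND it carries no markers.
    return (not contig.is_eukaryotic) and contig.marker_count == 0


def refine_bin(contigs):
    """Return the contigs that survive refinement, in their original order."""
    return [c for c in contigs if not should_remove(c)]


if __name__ == "__main__":
    bin_contigs = [
        Contig("contig_a", is_eukaryotic=True, marker_count=0),   # kept: eukaryotic
        Contig("contig_b", is_eukaryotic=False, marker_count=2),  # kept: has markers
        Contig("contig_c", is_eukaryotic=False, marker_count=0),  # removed: fails both
    ]
    print([c.name for c in refine_bin(bin_contigs)])  # → ['contig_a', 'contig_b']
```

Under this reading, a contig with no eukaryotic assignment is still kept as long as it carries at least one marker, so the filter is conservative about discarding sequence.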
Currently ChloroScan is only available via installation from source.
Once the workflow has finished, the recovered MAGs are passed to the Snakemake workflow Orthoflow (https://rbturnbull.github.io/orthoflow/main/installation.html) for phylogenetic analysis, to see whether these MAGs bring any new insights.
To install ChloroScan from source:
git clone https://github.com/Andyargueasae/chloroscan.git
cd chloroscan
poetry install
poetry shell
Future Update:
Comprehensive algae lineage-specific marker gene database (in development).
For detailed information on installing ChloroScan, fine-tuning the workflow, and configuring it, please visit the wiki: https://andyargueasae.github.io/chloroscan/. To discuss bugs and difficulties in running ChloroScan, visit the project's issues page.
Download files
File details
Details for the file chloroscan-0.1.0.tar.gz.
File metadata
- Download URL: chloroscan-0.1.0.tar.gz
- Upload date:
- Size: 25.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.8.2 CPython/3.11.9 Linux/5.4.0-186-generic
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 82a1bc33679a7c8279b84dfb8a5e71bbc984b671640981804a4575cef12792e0 |
| MD5 | 550b22d2205f54e9b0d3af393f39fcb1 |
| BLAKE2b-256 | 683e7f5be290c0f9ba7641dde65682728d9b58687f71559bf12bebc75ccdca97 |
File details
Details for the file chloroscan-0.1.0-py3-none-any.whl.
File metadata
- Download URL: chloroscan-0.1.0-py3-none-any.whl
- Upload date:
- Size: 30.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.8.2 CPython/3.11.9 Linux/5.4.0-186-generic
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4d1fb2cb3e9843a2bb9ebd82aa3c75396816d2e36ba13975ad23cc9d15a4ae6d |
| MD5 | e12f2bd40fd2d228a050cd982291c109 |
| BLAKE2b-256 | 7acc2254ef0541fadd33bf1c5d177df9fc7aa733ce882e336bed1edb4e60a569 |
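The published digests can be used to confirm that a downloaded file is intact. The following is a minimal sketch using Python's standard hashlib module; the helper names `sha256_of` and `matches_published` are hypothetical, and the local file path in the usage note is an assumption about where the download was saved.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, expected_hex):
    """Return True when the file's SHA256 digest equals the published value."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, assuming the source distribution was saved to the current directory, `matches_published("chloroscan-0.1.0.tar.gz", "82a1bc33679a7c8279b84dfb8a5e71bbc984b671640981804a4575cef12792e0")` should return True for an uncorrupted download.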