A computational workflow designed to recover plastid genomes from metagenomes.
This workflow is designed to recover chloroplast genomes from metagenomic datasets.
Installation
Install the workflow with pip3. Python >=3.9 and <4.0 is required to set up the virtual environment.
pip3 install chloroscan==0.1.5
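The version requirement above can be checked before installing. The following is a minimal sketch, not part of ChloroScan itself: the helper names `python_version_ok` and `install_chloroscan` and the directory name `chloroscan-venv` are assumptions made for illustration.

```shell
# Return success when a "major.minor[.patch]" Python version satisfies >=3.9,<4.0.
python_version_ok() {
    major=${1%%.*}
    rest=${1#*.}
    minor=${rest%%.*}
    [ "$major" -eq 3 ] && [ "$minor" -ge 9 ]
}

# Hypothetical helper: create a virtual environment and install ChloroScan in it.
# The directory name "chloroscan-venv" is an assumption for illustration.
install_chloroscan() {
    version=$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')
    if ! python_version_ok "$version"; then
        echo "Python $version found; ChloroScan needs >=3.9,<4.0" >&2
        return 1
    fi
    python3 -m venv chloroscan-venv &&
        . chloroscan-venv/bin/activate &&
        pip3 install chloroscan==0.1.5
}
```

Calling `install_chloroscan` in the current shell leaves the virtual environment activated, so the package is available straight away.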
Detailed workflow instructions can be found at: https://andyargueasae.github.io/chloroscan/index.html. The website also provides a Chinese version of the documentation with identical content.
Machine/OS Requirements
ChloroScan has only been tested on Linux (x86_64); running it on Apple operating systems is not recommended. ChloroScan can be installed on HPC clusters, and using a GPU is recommended to speed up runs.
Note: In testing, the current version of ChloroScan does not support the NVIDIA H100 GPU because of CUDA version incompatibilities. We are working on an update to address this.
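To find out whether a machine is affected by the H100 limitation above, the installed GPUs can be listed with `nvidia-smi`. This is a rough sketch that matches on the GPU name only; the helper names `is_h100` and `warn_unsupported_gpus` are hypothetical and the name patterns are an assumption.

```shell
# Return success when a GPU name string looks like an NVIDIA H100.
is_h100() {
    case "$1" in
        *H100*|*H-100*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print a warning for every detected H100; does nothing when nvidia-smi is absent.
warn_unsupported_gpus() {
    command -v nvidia-smi >/dev/null 2>&1 || return 0
    nvidia-smi --query-gpu=name --format=csv,noheader |
        while IFS= read -r name; do
            if is_h100 "$name"; then
                echo "Warning: $name is not supported by this ChloroScan version" >&2
            fi
        done
}
```

Running `warn_unsupported_gpus` on a login node may report different hardware than the compute nodes, so on a cluster it is best run inside a job.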
Configuration databases
Before running ChloroScan, some packages and datasets need to be installed so that CAT taxonomy prediction runs properly. ChloroScan also uses a marker gene database during binning; no action is needed for it, because it is loaded when the conda environments are built. To download our curated UniRef90-algae plastid protein database, use the link: https://doi.org/10.26188/27990278.
To avoid authentication issues, we recommend downloading with the pyfigshare command-line tool. More information about this tool is available at: https://pypi.org/project/pyfigshare/. Python > 3.0 is required to install pyfigshare.
Before downloading the files, set up your own figshare account and add an API token to the file ~/.figshare/token. Then run:

.. code-block:: bash

   figshare download -o CAT_db.tar.gz 27990278
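The token file mentioned above can be created from the shell. The sketch below assumes the file holds the token as plain text on a single line, which is not confirmed here; check the pyfigshare documentation if the download still fails to authenticate. The helper name `save_figshare_token` is hypothetical.

```shell
# Hypothetical helper: store a figshare API token in CONFIG_DIR/token.
# Assumption: the file holds the token as plain text on one line.
# Usage: save_figshare_token TOKEN [CONFIG_DIR]   (CONFIG_DIR defaults to ~/.figshare)
save_figshare_token() {
    token=$1
    config_dir=${2:-"$HOME/.figshare"}
    if [ -z "$token" ]; then
        echo "usage: save_figshare_token TOKEN [CONFIG_DIR]" >&2
        return 1
    fi
    mkdir -p "$config_dir" || return 1
    # The token is a credential, so keep the file readable by its owner only.
    (umask 077 && printf '%s\n' "$token" > "$config_dir/token") &&
        chmod 600 "$config_dir/token"
}
```

For example, `save_figshare_token "$MY_TOKEN"` writes the value of a shell variable to `~/.figshare/token` without the token appearing in the file listing of a shared directory.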
Note: The CAT database archive (tar.gz) is 47 GB and expands to nearly 85 GB once extracted, so please ensure you have enough disk space. Setting up the conda environments requires a further 15 GB of disk space.
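Because the download is large, it can help to check free space first. The following sketch uses `df`; the helper name `has_free_space` is hypothetical. The figure of 147 GB in the usage note simply adds the three sizes quoted above (47 + 85 + 15) and assumes the archive, the extracted database, and the conda environments all sit on the same filesystem.

```shell
# Return success when the filesystem holding DIR has at least REQUIRED_GB free.
# Usage: has_free_space DIR REQUIRED_GB
has_free_space() {
    # df -P prints POSIX-format output; column 4 is available space in 1 KiB blocks.
    avail_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    required_kb=$(( $2 * 1024 * 1024 ))
    [ "$avail_kb" -ge "$required_kb" ]
}
```

A guarded download could then be written as `has_free_space . 147 && figshare download -o CAT_db.tar.gz 27990278`.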
Sample data to try
To try ChloroScan, we recommend downloading our synthetic metagenome data with the command:
figshare download -o simulated_metagenomes.tar.gz 28748540
There are also some real metagenome datasets (modified to keep them lightweight) available at: https://figshare.unimelb.edu.au/articles/dataset/ChloroScan_test_data/30218614.
To download:
figshare download -o real_test_samples.tar.gz 30218614
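Once downloaded, the archives need to be unpacked. This sketch assumes they are gzip-compressed tarballs, as the `.tar.gz` file names suggest; the helper name `unpack_archive` and the destination directory in the usage note are hypothetical, and the layout inside each archive is not described here.

```shell
# Hypothetical helper: unpack a .tar.gz archive into a destination directory.
# Usage: unpack_archive ARCHIVE DEST_DIR
unpack_archive() {
    archive=$1
    dest=$2
    if [ ! -f "$archive" ]; then
        echo "missing archive: $archive" >&2
        return 1
    fi
    # -C extracts into DEST_DIR instead of the current directory.
    mkdir -p "$dest" && tar -xzf "$archive" -C "$dest"
}
```

For example, `unpack_archive simulated_metagenomes.tar.gz test_data` extracts the synthetic data into a `test_data` directory.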
Credit
ChloroScan is developed by:
Yuhao Tong 童禹皓 (University of Melbourne)
Yuhao Tong is the primary developer. To contact us, please email:
yuhtong@student.unimelb.edu.au
Download files
File details
Details for the file chloroscan-0.1.6.tar.gz.
File metadata
- Download URL: chloroscan-0.1.6.tar.gz
- Upload date:
- Size: 30.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.3.2 CPython/3.12.3 Linux/6.14.0-1017-azure
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 91e54424949f23198c710c821e8e2fff58b30f96eb24f18bff828bd9910249f5 |
| MD5 | 04a8827c8b0ec27e2365e74b983f511a |
| BLAKE2b-256 | 639e6bfff084f8fcb9f6bfcc8314c59a39b5d9a958ea6c007b68e6e4a1208f18 |
File details
Details for the file chloroscan-0.1.6-py3-none-any.whl.
File metadata
- Download URL: chloroscan-0.1.6-py3-none-any.whl
- Upload date:
- Size: 45.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.3.2 CPython/3.12.3 Linux/6.14.0-1017-azure
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9335deaede0d1ec04f728fe34c5e13701eb643c569cc549e9d291d1d582b8e81 |
| MD5 | 99b578313261184045ec663a5724feb7 |
| BLAKE2b-256 | 0f3cb1e159b7ec78d473c6197d1366f1cc6b28f4382dda0be8d56078bd3456d8 |
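The SHA256 digests listed above can be used to check a manually downloaded file. This is a small sketch that relies on `sha256sum` from GNU coreutils, which is normally present on Linux; the helper name `verify_sha256` is hypothetical.

```shell
# Return success when FILE's SHA256 digest equals EXPECTED_HEX_DIGEST.
# Usage: verify_sha256 FILE EXPECTED_HEX_DIGEST
verify_sha256() {
    # sha256sum prints "<digest>  <file>"; keep only the digest field.
    actual=$(sha256sum "$1" 2>/dev/null | awk '{ print $1 }')
    [ -n "$actual" ] && [ "$actual" = "$2" ]
}
```

For the source distribution, the check would be `verify_sha256 chloroscan-0.1.6.tar.gz 91e54424949f23198c710c821e8e2fff58b30f96eb24f18bff828bd9910249f5`, which succeeds only when the file matches the published digest.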