A computational workflow designed to recover plastid genomes from metagenomes.

Project description

Figure: ChloroScan workflow diagram (docs/source/_static/images/new_ChloroScan_workflow.drawio.png)

This workflow is designed to recover chloroplast genomes from metagenomic datasets.

Installation

Install the workflow with pip3. Setting up the virtual environment requires Python >=3.9 and <4.0.

pip3 install chloroscan==0.1.5
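The installation step can be wrapped in a small script that checks the Python version first. This is a minimal sketch, not part of ChloroScan itself: the function names `python_ok` and `install_chloroscan` and the directory name `chloroscan-venv` are hypothetical choices made for illustration.

```shell
# Succeed only if the given interpreter satisfies the stated range (>=3.9, <4.0).
python_ok() {
    "${1:-python3}" -c 'import sys; sys.exit(0 if (3, 9) <= sys.version_info[:2] < (4, 0) else 1)'
}

# Create an isolated virtual environment and install the pinned release.
# "chloroscan-venv" is an arbitrary directory name chosen for this sketch.
install_chloroscan() {
    python_ok python3 || { echo "Python >=3.9,<4.0 is required" >&2; return 1; }
    python3 -m venv chloroscan-venv
    . chloroscan-venv/bin/activate
    pip3 install chloroscan==0.1.5
}
```

After sourcing the snippet, calling `install_chloroscan` creates the environment in the current directory and installs the package into it.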

Detailed workflow instructions can be found at: https://andyargueasae.github.io/chloroscan/index.html. The website also provides a Chinese version of the documentation with identical content.

Machine/OS Requirements

ChloroScan has only been tested on Linux (x86_64); running it on IOS systems is not recommended. It can be installed on servers with HPC clusters, and using a GPU is recommended to speed up runs.

Note: In our testing, the current version of ChloroScan does not support NVIDIA H100 GPUs because of CUDA version incompatibilities. We are working on an update to improve performance.
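To catch the GPU limitation before starting a run, the GPU model names can be checked up front. The sketch below is an illustration only: `check_gpu_names` is a hypothetical helper, and matching on the substring "H100" is an assumption about how the driver reports the device name.

```shell
# Read GPU model names from standard input, one per line.
# Fail if any of them looks like an H100 (assumption: the name contains "H100").
check_gpu_names() {
    if grep -q 'H100'; then
        echo "H100 GPU detected: not supported by the current ChloroScan version" >&2
        return 1
    fi
    return 0
}
```

On a machine with the NVIDIA driver installed, the names can be piped in with `nvidia-smi --query-gpu=name --format=csv,noheader | check_gpu_names`.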

Configuring databases

Before running ChloroScan, some packages and datasets need to be installed for CAT taxonomy prediction to run properly. ChloroScan also uses a marker gene database during binning; this requires no action on your part, because it is loaded when the conda environments are built. To download our curated UniRef90-algae plastid protein database, use the link: https://doi.org/10.26188/27990278.

To avoid authentication issues, we recommend downloading with the pyfigshare command-line tool. More information about this tool is available at: https://pypi.org/project/pyfigshare/. Note that Python > 3.0 is required to install pyfigshare.

Before downloading the files, create a figshare account and add an API token to the file ~/.figshare/token. Then run:

figshare download -o CAT_db.tar.gz 27990278
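The token file mentioned above can be created from the shell. This is a minimal sketch under one assumption: that the file holds just the token as plain text. Check the pyfigshare documentation if the download fails to authenticate. The helper name `save_figshare_token` is hypothetical.

```shell
# Store a figshare API token at ~/.figshare/token (or under another directory
# passed as the second argument).
# Assumption: the file contains only the token as plain text.
save_figshare_token() {
    token="$1"
    token_dir="${2:-$HOME/.figshare}"
    [ -n "$token" ] || { echo "usage: save_figshare_token TOKEN [DIR]" >&2; return 1; }
    mkdir -p "$token_dir"
    printf '%s\n' "$token" > "$token_dir/token"
    chmod 600 "$token_dir/token"   # keep the token readable by the owner only
}
```

Call it once with the token copied from your figshare account settings, for example `save_figshare_token "YOUR_TOKEN"`.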

Note: The CAT database archive (tar.gz) is 47 GB and expands to nearly 85 GB once extracted, so please ensure you have enough disk space. Setting up the conda environments requires a further 15 GB of disk space.
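Given these sizes, it can help to check free space before downloading or extracting. The following sketch uses the standard `df` utility; `has_free_gb` is a hypothetical helper name.

```shell
# Succeed if the filesystem holding DIR has at least NEED_GB gigabytes free.
has_free_gb() {
    dir="$1"
    need_gb="$2"
    # -P forces one output line per filesystem; -k reports sizes in kilobytes.
    avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    [ "$avail_kb" -ge $((need_gb * 1024 * 1024)) ]
}
```

For example, `has_free_gb . 85 && tar -xzf CAT_db.tar.gz` extracts the database only if about 85 GB is still free in the current directory, on top of the 47 GB archive already on disk.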

Sample data to try

To try ChloroScan, we recommend downloading our synthetic metagenome data with the command:

figshare download -o simulated_metagenomes.tar.gz 28748540

There are also some real metagenome datasets (modified to keep them lightweight) available at: https://figshare.unimelb.edu.au/articles/dataset/ChloroScan_test_data/30218614.

To download:

figshare download -o real_test_samples.tar.gz 30218614
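Once downloaded, the archives need to be unpacked. The sketch below checks that an archive is readable before extracting it into a chosen directory. `unpack_archive` is a hypothetical helper, and the layout of the files inside each archive is not described here, so inspect the result after extraction.

```shell
# Verify that a .tar.gz archive can be read, then extract it into DEST
# (defaults to the current directory).
unpack_archive() {
    archive="$1"
    dest="${2:-.}"
    # Listing the contents fails if the download is truncated or corrupt.
    tar -tzf "$archive" > /dev/null || { echo "cannot read $archive" >&2; return 1; }
    mkdir -p "$dest"
    tar -xzf "$archive" -C "$dest"
}
```

For example, `unpack_archive real_test_samples.tar.gz test_data` places the sample files under a `test_data` directory.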

Credit

ChloroScan is developed by:

Yuhao Tong is the primary developer. To contact us, please email:

yuhtong@student.unimelb.edu.au
