The Microbial Co-occurrence Network Explorer

MiCoNE - Microbial Co-occurrence Network Explorer

MiCoNE is a Python package for exploring how the choice of tools at each step of the 16S data processing workflow affects the inferred co-occurrence networks. It is also a flexible and modular pipeline for 16S data analysis, offering parallelized, fast, and reproducible runs across different combinations of tools for each step of the workflow. It incorporates popular, publicly available tools as well as custom Python modules and scripts to facilitate the inference of co-occurrence networks from 16S data.

The MiCoNE framework is introduced in:

Kishore, D., Birzu, G., Hu, Z., DeLisi, C., Korolev, K., & Segrè, D. (2023). Inferring microbial co-occurrence networks from amplicon data: A systematic evaluation. mSystems. doi:10.1128/msystems.00961-22.

Data related to the publication can be found on Zenodo: https://doi.org/10.5281/zenodo.7051556.

Features

  • Plug and play architecture: allows easy addition and removal of tools
  • Flexible and portable: allows running the pipeline on a local machine, a compute cluster, or the cloud with minimal configuration changes through the use of nextflow
  • Parallelization: automatic parallelization both within and across samples (needs to be enabled in the nextflow.config file)
  • Ease of use: available as a minimal Python library (without the pipeline) or as a full conda package

Installation

Installing the conda package:

mamba env create -n micone -f https://raw.githubusercontent.com/segrelab/MiCoNE/master/env.yml

NOTE:

  1. MiCoNE requires the mamba package manager; without it, micone init will not work.
  2. Direct installation via anaconda cloud will be available soon.

Installing the minimal Python library:

pip install micone

NOTE: The Python library does not provide the functionality to execute pipelines.

Workflow

[Figure: pipeline]

The pipeline converts raw 16S sequence data into co-occurrence networks. Each process in the pipeline supports alternative tools for performing the same task; users can change the selected tools in the configuration file.

Usage

The MiCoNE pipeline comes with an easy-to-use CLI. To get a list of subcommands, run:

micone --help

Supported subcommands:

  1. install - Initializes the package and environments (creates conda environments for various pipeline processes)
  2. init - Initializes the nextflow templates for the micone workflow
  3. clean - Cleans files from a pipeline run (cleans temporary data, log files and other extraneous files)
  4. validate-results - Checks the results of the pipeline execution

Installing the environments

To run the pipeline, various conda environments must first be installed on the system. Use the following command to initialize all the environments:

micone install

To initialize a single environment instead, use:

micone install -e "micone-qiime2"

The supported environments are:

  • micone-cozine
  • micone-dada2
  • micone-flashweave
  • micone-harmonies
  • micone-mldm
  • micone-propr
  • micone-qiime2
  • micone-sparcc
  • micone-spieceasi
  • micone-spring

Initializing the pipeline template

To initialize a pipeline template, such as the full pipeline (from raw 16S sequencing reads to co-occurrence networks), run:

micone init -w <workflow> -o <path/to/folder>

The supported values for <workflow> are (work in progress):

  • full
  • ni
  • op_ni
  • ta_op_ni

To run the pipeline, update the relevant config files (see next section), activate the micone environment and run the run.sh script that was copied to the directory:

bash run.sh

This runs the pipeline locally using the config options specified.

To run the pipeline on an SGE-enabled cluster, add the relevant project/resource allocation flags to the run.sh script and submit it with:

qsub run.sh

Configuration and the pipeline template

The pipeline template for the chosen micone workflow (see the previous section for the list of supported options) is copied to the desired folder when you run micone init -w <workflow>. The template folder contains the following folders and files:

  • nf_micone: Folder containing the micone default configs, data, functions, and modules
  • templates: Folder containing the templates (scripts) that are executed during the pipeline run
  • main.nf: The pipeline "workflow" defined in the nextflow DSL 2 specification
  • nextflow.config: The configuration for the pipeline. This file needs to be modified in order to change any configuration options for the pipeline run
  • metadata.json: Contains the basic metadata that describes the dataset that is to be processed. Should be updated accordingly before pipeline execution
  • samplesheet.csv: The file that contains the locations of the input data necessary for the pipeline run. Should be updated accordingly before pipeline execution
  • run.sh: The bash script that contains commands used to execute the nextflow pipeline
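
Before editing the configuration, it can be useful to confirm that the template was copied completely. The following is a minimal Python sketch under stated assumptions: `missing_template_entries` is a hypothetical helper, not a MiCoNE function, and the expected names are taken directly from the list above.

```python
from pathlib import Path
from typing import List

# Folder and file names copied from the template listing above.
EXPECTED_ENTRIES = (
    "nf_micone",
    "templates",
    "main.nf",
    "nextflow.config",
    "metadata.json",
    "samplesheet.csv",
    "run.sh",
)


def missing_template_entries(template_dir: str) -> List[str]:
    """Return the expected template entries that are absent from template_dir.

    Hypothetical helper: it checks only that each name exists, not that
    the contents are valid.
    """
    root = Path(template_dir)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]


if __name__ == "__main__":
    # Check the current directory and report anything that is missing.
    missing = missing_template_entries(".")
    if missing:
        print("Missing template entries: " + ", ".join(missing))
    else:
        print("All expected template entries are present.")
```

This check covers only the top-level entries; it does not inspect the contents of nf_micone, such as the nf_micone/data directory.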

The folder nf_micone/configs contains the default configs for all the micone pipeline workflows. These options can also be viewed in tabular format in the documentation.

For example, to select dada2 and deblur as the tools used for OTU assignment, add the following to nextflow.config:

// ... config initialization
params {
    // ... other config options
    denoise_cluster {
        otu_assignment {
            selection = ['dada2', 'deblur']
        }
    }
}
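
When the same template is run with several tool combinations, the fragment above can be generated rather than edited by hand. The following is a minimal Python sketch: `render_selection` is a hypothetical helper, not part of MiCoNE, and it reproduces only the nesting shown in the example (a step, a process, and a `selection` list). Whether other steps follow the same layout should be checked against the defaults in nf_micone/configs.

```python
from typing import Sequence


def render_selection(step: str, process: str, tools: Sequence[str]) -> str:
    """Render a nextflow.config `params` fragment that sets `selection`.

    Hypothetical helper: it mirrors the step > process > selection nesting
    from the example above and performs no validation of the names.
    """
    quoted = ", ".join(f"'{tool}'" for tool in tools)
    return (
        "params {\n"
        f"    {step} {{\n"
        f"        {process} {{\n"
        f"            selection = [{quoted}]\n"
        "        }\n"
        "    }\n"
        "}\n"
    )


if __name__ == "__main__":
    # Reproduce the fragment from the example above.
    print(render_selection("denoise_cluster", "otu_assignment", ["dada2", "deblur"]))
```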

Example configuration files used for the analyses in the manuscript can be found here.

Visualization of results (coming soon)

The results of the pipeline execution can be visualized using the scripts in the manuscript repository.

Known issues

  1. If you have a preinstalled version of julia, make sure that it does not conflict with the version downloaded by the micone-flashweave environment.
  2. The data directory (nf_micone/data) needs to be manually downloaded using this link.

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
