# MiCoNE - Microbial Co-occurrence Network Explorer

MiCoNE is a flexible and modular pipeline for 16S data analysis.
It incorporates various popular, publicly available tools as well as custom Python modules and scripts to facilitate inference of co-occurrence networks from 16S data.
NOTE: The package is under active development, and breaking changes are possible.
- Free software: MIT license
- Documentation: https://micone.readthedocs.io/
- Manuscript: available on bioRxiv
## Features
- Plug and play architecture: allows easy addition and removal of tools
- Flexible and portable: allows running the pipeline on a local machine, a compute cluster, or the cloud with minimal configuration changes. Uses Nextflow under the hood
- Parallelization: automatic parallelization both within and across samples (needs to be enabled in the `config` file)
- Ease of use: available as a minimal Python library (without the pipeline) or the full `conda` package
## Installation
Installing the minimal Python library:

```sh
pip install micone
```

Installing the `conda` package:

```sh
git clone https://github.com/segrelab/MiCoNE.git
cd MiCoNE
conda env create -n micone -f env.yml
pip install .
```

NOTE: The `conda` package is currently being updated and will be available soon.
## Workflow
MiCoNE supports the conversion of raw 16S sequence data or count matrices into co-occurrence networks through multiple methods. Each process in the pipeline supports alternate tools for performing the same task; users can select among them through the configuration file.
## Usage
The MiCoNE pipeline comes with an easy-to-use CLI. To get a list of subcommands, run:

```sh
micone --help
```
Supported subcommands:

- `init` - Creates `conda` environments for various pipeline processes
- `run` - The main subcommand that runs the pipeline
- `clean` - Cleans temporary data, log files, and other extraneous files
To run the pipeline:

```sh
micone run -p local -c run.toml -m 4
```

This runs the pipeline on the local machine, using `run.toml` for the pipeline configuration and a maximum of 4 processes in parallel at a time.
## Configuration
The pipeline is configured using a `.toml` file. The details can be found in the relevant section of the docs. Here is an example config file that performs:
- grouping of OTUs by taxonomy level
- correlation of the taxa using `fastspar`
- calculation of p-values
- construction of the networks
```toml
title = "An example pipeline for testing"

order = """
otu_processing.filter.group
otu_processing.export.biom2tsv
network_inference.bootstrap.resample
network_inference.correlation.sparcc
network_inference.bootstrap.pvalue
network_inference.network.make_network_with_pvalue
"""

output_location = "/home/dileep/Documents/results/sparcc_network"

[otu_processing.filter.group]
[[otu_processing.filter.group.input]]
datatype = "otu_table"
format = ["biom"]
location = "correlations/good/deblur/deblur.biom"
[[otu_processing.filter.group.parameters]]
process = "group"
tax_levels = "['Family', 'Genus', 'Species']"

[otu_processing.export.biom2tsv]

[network_inference.bootstrap.resample]
[[network_inference.bootstrap.resample.parameters]]
process = "resample"
bootstraps = 10

[network_inference.correlation.sparcc]
[[network_inference.correlation.sparcc.parameters]]
process = "sparcc"
iterations = 5

[network_inference.bootstrap.pvalue]

[network_inference.network.make_network_with_pvalue]
[[network_inference.network.make_network_with_pvalue.input]]
datatype = "metadata"
format = ["json"]
location = "correlations/good/deblur/deblur_metadata.json"
[[network_inference.network.make_network_with_pvalue.input]]
datatype = "computational_metadata"
format = ["json"]
location = "correlations/good/deblur/deblur_cmetadata.json"
```
Other example config files can be found at `tests/data/pipelines`.
## Credits
This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.