GraphBin2: Refined and Overlapped Binning of Metagenomic Contigs Using Assembly Graphs

GraphBin2 is an extension of GraphBin which refines the binning results obtained from existing tools and, more importantly, is able to assign contigs to multiple bins. GraphBin2 uses the connectivity and coverage information from assembly graphs to adjust existing binning results on contigs and to infer contigs shared by multiple species.
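The idea of spreading existing bin labels along the assembly graph can be illustrated with a toy sketch. This is not GraphBin2's algorithm: it ignores coverage, never assigns a contig to more than one bin, and simply gives each unlabelled contig the bin of the closest labelled contig. The function name `propagate_labels` and the contig and bin names are hypothetical.

```python
from collections import deque


def propagate_labels(edges, labels):
    """Toy label propagation: copy the bin of the nearest labelled contig.

    edges  -- iterable of (contig_a, contig_b) pairs from an assembly graph
    labels -- dict mapping already-binned contigs to their bin
    """
    # Build an undirected adjacency map from the edge list.
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    result = dict(labels)
    # Multi-source breadth-first search: every labelled contig starts at distance 0.
    queue = deque(sorted(labels))
    while queue:
        node = queue.popleft()
        for neighbour in sorted(adjacency.get(node, ())):
            if neighbour not in result:
                result[neighbour] = result[node]
                queue.append(neighbour)
    return result


# Hypothetical graph: c1-c2-c3 form one component, c4-c5 another.
edges = [("c1", "c2"), ("c2", "c3"), ("c4", "c5")]
labels = {"c1": "bin_1", "c5": "bin_2"}
refined = propagate_labels(edges, labels)
print(refined)
```

Contigs in a component with no labelled contig stay unlabelled in this sketch, which is one reason sparse graphs limit what propagation can do.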

For detailed instructions on installation, usage and visualisation, please refer to the documentation hosted at Read the Docs.

Note: Following requests from the community, we have added support for long-read assemblies produced by Flye. GraphBin2 was originally developed for short-read assemblies and has not been tested extensively on long-read assemblies. Long-read assembly graphs can be sparsely connected, which may make the label propagation process less effective, so GraphBin2 may not improve the binning result.
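To get a rough feel for how sparsely connected a graph is before running GraphBin2, you could count segments and links in the GFA file. The sketch below relies only on the GFA 1 record layout (`S` lines name a segment, `L` lines connect two segments); the function name `gfa_connectivity` and the sample records are hypothetical, and it does not read SGA `.asqg` files.

```python
def gfa_connectivity(lines):
    """Count segments, links and isolated segments in GFA 1 text lines."""
    segments = set()
    degree = {}
    links = 0
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "S" and len(fields) >= 2:
            # S <name> <sequence> ...
            segments.add(fields[1])
        elif fields[0] == "L" and len(fields) >= 4:
            # L <from> <from_orient> <to> <to_orient> <overlap>
            links += 1
            for name in (fields[1], fields[3]):
                degree[name] = degree.get(name, 0) + 1
    # A segment is isolated if no link mentions it.
    isolated = sum(1 for name in segments if degree.get(name, 0) == 0)
    return {"segments": len(segments), "links": links, "isolated": isolated}


# Hypothetical three-segment graph with a single link.
sample = [
    "S\ts1\tACGT",
    "S\ts2\tGGCC",
    "S\ts3\tTTAA",
    "L\ts1\t+\ts2\t-\t0M",
]
stats = gfa_connectivity(sample)
print(stats)

# For a real file, pass the open file handle instead of a list:
# with open("/path/to/graph_file.gfa") as handle:
#     stats = gfa_connectivity(handle)
```

A high share of isolated segments suggests that few contigs have neighbours to receive labels from.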

NEW: GraphBin2 is now available on Bioconda at https://anaconda.org/bioconda/graphbin2 and on PyPI at https://pypi.org/project/graphbin2/.

Installing GraphBin2

Using Conda (recommended)

You can install GraphBin2 from the bioconda distribution. This requires conda, which is included in both Anaconda and Miniconda.

# add channels
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge

# create conda environment and install
conda create -n graphbin2 graphbin2

# activate conda environment
conda activate graphbin2

# check graphbin2 installation
graphbin2 --help

Using pip

You can install GraphBin2 using pip from the PyPI distribution.

# install graphbin2
pip install graphbin2

# check graphbin2 installation
graphbin2 --help
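If you prefer to check the installation from Python, for example inside a pipeline script, a minimal sketch using the standard library's `importlib.metadata` could look like this. The helper name `installed_version` is hypothetical, and the distribution name `graphbin2` is taken from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string if graphbin2 is installed in this environment, else None.
print(installed_version("graphbin2"))
```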

For development purposes, please clone the repository and install via flit.

# clone repository to your local machine
git clone https://github.com/metagentools/GraphBin2.git

# go to repo directory
cd GraphBin2

# install flit
pip install flit

# install graphbin2 via flit
flit install -s --python `which python`

Example Usage

# SPAdes version
graphbin2 --assembler spades --graph /path/to/graph_file.gfa --contigs /path/to/contigs.fasta --paths /path/to/paths_file.paths --binned /path/to/binning_result.csv --abundance /path/to/abundance.tsv --output /path/to/output_folder

# SGA version
graphbin2 --assembler sga --graph /path/to/graph_file.asqg --contigs /path/to/contigs.fa --binned /path/to/binning_result.csv --abundance /path/to/abundance.tsv --output /path/to/output_folder

# MEGAHIT version
graphbin2 --assembler megahit --graph /path/to/final.gfa --contigs /path/to/final.contigs.fa --binned /path/to/binning_result.csv --abundance /path/to/abundance.tsv --output /path/to/output_folder

# metaFlye version
graphbin2 --assembler flye --graph /path/to/graph_file.gfa --contigs /path/to/assembly.fasta --paths /path/to/assembly_info.txt --binned /path/to/binning_result.csv --abundance /path/to/abundance.tsv --output /path/to/output_folder
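When running GraphBin2 over many samples, it can help to assemble the command in a script. The sketch below uses only the flags shown in the examples above; the helper name `build_graphbin2_command` is hypothetical, and treating `--paths` as optional simply mirrors the fact that the SGA and MEGAHIT examples omit it.

```python
import shlex


def build_graphbin2_command(assembler, graph, contigs, binned, abundance, output, paths=None):
    """Build the graphbin2 argument list from the flags used in the examples."""
    command = ["graphbin2", "--assembler", assembler, "--graph", graph, "--contigs", contigs]
    if paths is not None:
        # Only the SPAdes and metaFlye examples pass a paths file.
        command += ["--paths", paths]
    command += ["--binned", binned, "--abundance", abundance, "--output", output]
    return command


command = build_graphbin2_command(
    assembler="spades",
    graph="/path/to/graph_file.gfa",
    contigs="/path/to/contigs.fasta",
    paths="/path/to/paths_file.paths",
    binned="/path/to/binning_result.csv",
    abundance="/path/to/abundance.tsv",
    output="/path/to/output_folder",
)
# Show the command as a single shell-quoted line.
print(shlex.join(command))

# To actually run it (requires graphbin2 on PATH and real input files):
# import subprocess
# subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.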

Citation

GraphBin2 was accepted for presentation at the 20th International Workshop on Algorithms in Bioinformatics (WABI 2020) and is published in Leibniz International Proceedings in Informatics (LIPIcs), DOI: 10.4230/LIPIcs.WABI.2020.8.

An extended journal article on GraphBin2 has been published in BMC Algorithms for Molecular Biology, DOI: 10.1186/s13015-021-00185-6.

If you use GraphBin2 in your work, please cite the following publications.

@InProceedings{mallawaarachchi_et_al:LIPIcs:2020:12797,
  author =	{Vijini G. Mallawaarachchi and Anuradha S. Wickramarachchi and Yu Lin},
  title =	{{GraphBin2: Refined and Overlapped Binning of Metagenomic Contigs Using Assembly Graphs}},
  booktitle =	{20th International Workshop on Algorithms in Bioinformatics (WABI 2020)},
  pages =	{8:1--8:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-161-0},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{172},
  editor =	{Carl Kingsford and Nadia Pisanti},
  publisher =	{Schloss Dagstuhl--Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/opus/volltexte/2020/12797},
  URN =		{urn:nbn:de:0030-drops-127974},
  doi =		{10.4230/LIPIcs.WABI.2020.8},
  annote =	{Keywords: Metagenomics binning, contigs, assembly graphs, overlapped binning}
}

@Article{Mallawaarachchi2021,
  author={Mallawaarachchi, Vijini G. and Wickramarachchi, Anuradha S. and Lin, Yu},
  title={Improving metagenomic binning results with overlapped bins using assembly graphs},
  journal={Algorithms for Molecular Biology},
  year={2021},
  month={May},
  day={04},
  volume={16},
  number={1},
  pages={3},
  abstract={Metagenomic sequencing allows us to study the structure, diversity and ecology in microbial communities without the necessity of obtaining pure cultures. In many metagenomics studies, the reads obtained from metagenomics sequencing are first assembled into longer contigs and these contigs are then binned into clusters of contigs where contigs in a cluster are expected to come from the same species. As different species may share common sequences in their genomes, one assembled contig may belong to multiple species. However, existing tools for binning contigs only support non-overlapped binning, i.e., each contig is assigned to at most one bin (species).},
  issn={1748-7188},
  doi={10.1186/s13015-021-00185-6},
  url={https://doi.org/10.1186/s13015-021-00185-6}
}

Funding

GraphBin2 is funded by an Essential Open Source Software for Science Grant from the Chan Zuckerberg Initiative.
