GraphBin2: Refined and Overlapped Binning of Metagenomic Contigs Using Assembly Graphs

GraphBin2 is an extension of GraphBin which refines the binning results obtained from existing tools and, more importantly, is able to assign contigs to multiple bins. GraphBin2 uses the connectivity and coverage information from assembly graphs to adjust existing binning results on contigs and to infer contigs shared by multiple species.

Note: In response to requests from the community, we have added support for long-read assemblies produced by Flye. GraphBin2 was originally developed for short-read assemblies and has not been tested extensively on long-read assemblies. Long-read assemblies may have sparsely connected graphs, which can make the label propagation process less effective, so GraphBin2 may not improve the binning results.

NEW: GraphBin2 is now available on PyPI at https://pypi.org/project/graphbin2/.
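
As a minimal sketch of installing from PyPI, the helper below installs the `graphbin2` package with pip only when the `graphbin2` command is not already available. It assumes `python` and pip are present in the active environment; the function name `ensure_graphbin2` is hypothetical and not part of GraphBin2.

```shell
# Install GraphBin2 from PyPI unless the `graphbin2` command already exists.
# `ensure_graphbin2` is a hypothetical helper name used for illustration only.
ensure_graphbin2() {
    if command -v graphbin2 >/dev/null 2>&1; then
        echo "graphbin2 already installed"
    else
        # Assumes pip is available for the active Python interpreter.
        python -m pip install graphbin2
    fi
}
```

Calling `ensure_graphbin2` and then `graphbin2 -h` should confirm that the command is available.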

Getting Started

Downloading GraphBin2

You can download the latest release of GraphBin2 from Releases or clone the GraphBin2 repository to your machine.

git clone https://github.com/Vini2/GraphBin2.git

If you have downloaded a release, you will have to extract the files using the following command.

unzip [file_name].zip

Now go into the GraphBin2 folder using the following command.

cd GraphBin2/

Setting up the environment

We recommend that you use Conda to run GraphBin2. You can download Anaconda or Miniconda, both of which include Conda.

Once you have installed Conda, make sure you are in the GraphBin2 folder. Now run the following commands to create a Conda environment and activate it to run GraphBin2.

conda env create -f environment.yml
conda activate graphbin2

Now install GraphBin2 using the following command.

flit install

Test the setup

After installing, run the following command to ensure that GraphBin2 is working.

graphbin2 -h

Now you are ready to run GraphBin2.

Citation

GraphBin2 was accepted for publication at the 20th International Workshop on Algorithms in Bioinformatics (WABI 2020) and is published in Leibniz International Proceedings in Informatics (LIPIcs), DOI: 10.4230/LIPIcs.WABI.2020.8.

An extended journal article on GraphBin2 has been published in BMC Algorithms for Molecular Biology, DOI: 10.1186/s13015-021-00185-6.

If you use GraphBin2 in your work, please cite the following publications.

@InProceedings{mallawaarachchi_et_al:LIPIcs:2020:12797,
  author =	{Vijini G. Mallawaarachchi and Anuradha S. Wickramarachchi and Yu Lin},
  title =	{{GraphBin2: Refined and Overlapped Binning of Metagenomic Contigs Using Assembly Graphs}},
  booktitle =	{20th International Workshop on Algorithms in Bioinformatics (WABI 2020)},
  pages =	{8:1--8:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-161-0},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{172},
  editor =	{Carl Kingsford and Nadia Pisanti},
  publisher =	{Schloss Dagstuhl--Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/opus/volltexte/2020/12797},
  URN =		{urn:nbn:de:0030-drops-127974},
  doi =		{10.4230/LIPIcs.WABI.2020.8},
  annote =	{Keywords: Metagenomics binning, contigs, assembly graphs, overlapped binning}
}

@Article{Mallawaarachchi2021,
  author={Mallawaarachchi, Vijini G. and Wickramarachchi, Anuradha S. and Lin, Yu},
  title={Improving metagenomic binning results with overlapped bins using assembly graphs},
  journal={Algorithms for Molecular Biology},
  year={2021},
  month={May},
  day={04},
  volume={16},
  number={1},
  pages={3},
  abstract={Metagenomic sequencing allows us to study the structure, diversity and ecology in microbial communities without the necessity of obtaining pure cultures. In many metagenomics studies, the reads obtained from metagenomics sequencing are first assembled into longer contigs and these contigs are then binned into clusters of contigs where contigs in a cluster are expected to come from the same species. As different species may share common sequences in their genomes, one assembled contig may belong to multiple species. However, existing tools for binning contigs only support non-overlapped binning, i.e., each contig is assigned to at most one bin (species).},
  issn={1748-7188},
  doi={10.1186/s13015-021-00185-6},
  url={https://doi.org/10.1186/s13015-021-00185-6}
}

Funding

GraphBin2 is funded by an Essential Open Source Software for Science Grant from the Chan Zuckerberg Initiative.

Download files

Source Distribution

graphbin2-1.3.0.tar.gz (902.6 kB)

Uploaded Source

Built Distribution

graphbin2-1.3.0-py3-none-any.whl (30.3 kB)

Uploaded Python 3

File details

Details for the file graphbin2-1.3.0.tar.gz.

File metadata

  • Download URL: graphbin2-1.3.0.tar.gz
  • Size: 902.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.9.19

File hashes

Hashes for graphbin2-1.3.0.tar.gz

Algorithm    Hash digest
SHA256       c9500627ace78c2ba632316d320d17668e78857fda71a88e243395eca6f498a2
MD5          17266acd143e2c70262df01d6bf672e6
BLAKE2b-256  8b32247d45b0492505787f10320741847c1aef55efc49d869dc44974d74191dd

File details

Details for the file graphbin2-1.3.0-py3-none-any.whl.

File metadata

  • Download URL: graphbin2-1.3.0-py3-none-any.whl
  • Size: 30.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.9.19

File hashes

Hashes for graphbin2-1.3.0-py3-none-any.whl

Algorithm    Hash digest
SHA256       91cb517db16b2ec0a2d8b6115c5b2ba8ffcf51086bea5c6a49978668aa6c1b28
MD5          1a0e62aa9c68cf3c7893eb2eae0c8b0f
BLAKE2b-256  fb01b0fb4f6cad2ea65c21d6f42ebfece6abc4af4f0a66f08f176863fc1a54f5
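
The SHA256 digests listed above can be used to check a downloaded file before installing it. The following is a minimal sketch, assuming the GNU coreutils `sha256sum` tool is available (on macOS, `shasum -a 256` is the usual substitute). The function name `verify_sha256` is hypothetical and not part of GraphBin2 or pip.

```shell
# Compare a file's SHA256 digest against an expected value.
# `verify_sha256` is a hypothetical helper name used for illustration only.
verify_sha256() {
    file="$1"
    expected="$2"
    # sha256sum prints "<digest>  <filename>"; keep only the digest.
    actual=$(sha256sum "$file" | cut -d ' ' -f 1)
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file" >&2
        return 1
    fi
}
```

For example, after downloading the source distribution into the current directory, one could run `verify_sha256 graphbin2-1.3.0.tar.gz c9500627ace78c2ba632316d320d17668e78857fda71a88e243395eca6f498a2`.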