Metagenomic binning with semi-supervised siamese neural network
S³N²Bin (Semi-supervised Siamese Neural Network for metagenomic binning)
NOTE: This tool is still in development. You are welcome to try it out, and feedback is appreciated, but expect some bugs and rapid changes until it stabilizes. Please use GitHub issues for bug reports and the Discussions for more open-ended discussions and questions.
Command-line tool for metagenomic binning with semi-supervised deep learning using information from reference genomes.
S3N2Bin runs on Python 3.6-3.8.
Install from source
You can download the source code from GitHub and install it.
conda install -c bioconda bedtools hmmer fraggenescan
conda install -c anaconda cmake=3.19.6
python setup.py install
Easy single/co-assembly binning mode
You will need the following inputs:
- A contig file (contig.fna in the example below)
- BAM files from mapping
You can get the results with one line of code. The
single_easy_bin command can be used in
single-sample and co-assembly binning modes (contigs are annotated using mmseqs
with the GTDB reference genomes).
single_easy_bin includes the following steps:
S3N2Bin single_easy_bin -i contig.fna -b *.bam -o output
In this example, S³N²Bin will download GTDB to
$HOME/.cache/S3N2Bin/mmseqs2-GTDB/GTDB. You can change this default using the
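The single_easy_bin call shown above relies on the shell to expand *.bam. When the command is launched from a script instead, the BAM list has to be built explicitly. The following is a minimal sketch of one way to do that. It uses only the -i, -b and -o options that appear in the example command and assumes the S3N2Bin executable is on PATH. The helper names build_single_easy_bin_command and run_single_easy_bin are hypothetical and not part of S3N2Bin.

```python
import glob
import os
import subprocess


def build_single_easy_bin_command(contig_fasta, bam_dir, output_dir):
    """Return the argument list for a single_easy_bin run.

    Hypothetical helper: it only mirrors the command line shown in this
    README (S3N2Bin single_easy_bin -i ... -b ... -o ...).
    """
    # Expand *.bam ourselves, since no shell is involved; sort for a stable order.
    bam_files = sorted(glob.glob(os.path.join(bam_dir, "*.bam")))
    if not bam_files:
        raise FileNotFoundError(f"no .bam files found in {bam_dir!r}")
    return [
        "S3N2Bin", "single_easy_bin",
        "-i", contig_fasta,
        "-b", *bam_files,
        "-o", output_dir,
    ]


def run_single_easy_bin(contig_fasta, bam_dir, output_dir):
    """Run the command and raise CalledProcessError if it exits non-zero."""
    cmd = build_single_easy_bin_command(contig_fasta, bam_dir, output_dir)
    subprocess.run(cmd, check=True)
    return cmd

# Example (assumes S3N2Bin is installed and the inputs exist):
# run_single_easy_bin("contig.fna", ".", "output")
```

Passing the arguments as a list avoids shell quoting problems with unusual file names.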
Easy multi-samples binning mode
The multi_easy_bin command can be used in
multi-samples binning mode (contigs are annotated using mmseqs
with the GTDB reference genomes).
multi_easy_bin includes the following steps:
You will need the following inputs:
- A combined contig file
- BAM files from mapping
For every contig, the name consists of the sample name, a separator, and the contig name.
: is the default separator (it can be changed with a command-line
argument). Note: Make sure the sample names are unique and that the separator
does not make the split ambiguous. For example:
>S1:Contig_1
AGATAATAAAGATAATAATA
>S1:Contig_2
CGAATTTATCTCAAGAACAAGAAAA
>S1:Contig_3
AAAAAGAGAAAATTCAGAATTAGCCAATAAAATA
>S2:Contig_1
AATGATATAATACTTAATA
>S2:Contig_2
AAAATATTAAAGAAATAATGAAAGAAA
>S3:Contig_1
ATAAAGACGATAAAATAATAAAAGCCAAATCCGACAAAGAAAGAACGG
>S3:Contig_2
AATATTTTAGAGAAAGACATAAACAATAAGAAAAGTATT
>S3:Contig_3
CAAATACGAATGATTCTTTATTAGATTATCTTAATAAGAATATC
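A combined contig file with names like those above can be produced from per-sample assemblies with a short script. The sketch below is one possible approach, not part of S3N2Bin. The function names read_fasta and combine_contigs are hypothetical, and the sketch assumes each sample has its own FASTA file and that the default : separator is used.

```python
def read_fasta(path):
    """Yield (name, sequence) pairs from a FASTA file."""
    name, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if name is not None:
                    yield name, "".join(chunks)
                # Keep only the first whitespace-delimited token of the header.
                name, chunks = line[1:].split()[0], []
            else:
                chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def combine_contigs(sample_files, out_path, separator=":"):
    """Write one FASTA whose names are sample + separator + contig.

    sample_files maps a sample name to the path of that sample's contigs.
    Returns the number of contigs written.
    """
    seen = set()
    with open(out_path, "w") as out:
        for sample, path in sample_files.items():
            # A separator inside the sample name would make splitting ambiguous.
            if separator in sample:
                raise ValueError(f"sample name {sample!r} contains {separator!r}")
            for contig, seq in read_fasta(path):
                full_name = f"{sample}{separator}{contig}"
                if full_name in seen:
                    raise ValueError(f"duplicate contig name {full_name!r}")
                seen.add(full_name)
                out.write(f">{full_name}\n{seq}\n")
    return len(seen)

# Example with hypothetical file names:
# combine_contigs({"S1": "S1.fna", "S2": "S2.fna"}, "contig_whole.fna")
```

Because sample names are dictionary keys, they are unique by construction, and the explicit check on the separator covers the second condition in the note above.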
You can get the results with one line of code.
S3N2Bin multi_easy_bin -i contig_whole.fna -b *.bam -o output
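Before running multi_easy_bin on a combined contig file, it can help to confirm that every contig name splits cleanly into a sample name and a contig name. The sketch below is a hypothetical check, not part of S3N2Bin. It assumes the name is split at the first occurrence of the separator, which may differ from how the tool itself splits names.

```python
from collections import Counter


def count_contigs_per_sample(fasta_path, separator=":"):
    """Count contigs per sample in a combined FASTA file.

    Raises ValueError for any header that cannot be split into a
    non-empty sample name and a non-empty contig name.
    """
    counts = Counter()
    with open(fasta_path) as handle:
        for line in handle:
            if not line.startswith(">"):
                continue  # sequence line
            tokens = line[1:].split()
            name = tokens[0] if tokens else ""
            # Assumption: split at the first separator only.
            sample, sep, contig = name.partition(separator)
            if not sep or not sample or not contig:
                raise ValueError(
                    f"cannot split contig name {name!r} on {separator!r}"
                )
            counts[sample] += 1
    return counts

# Example with the file name used in the command above:
# print(count_contigs_per_sample("contig_whole.fna"))
```

The per-sample counts give a quick sanity check that the expected samples are present in the combined file.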
You can also run the individual steps yourself, which makes it possible to use compute clusters to speed up the binning process (especially in multi-samples binning mode).
For more details on usage, including information on how to run individual steps separately, read the docs.
The output folder will contain:
- Datasets used for training and clustering.
- Saved semi-supervised deep learning model.
- Some intermediate files.
For every sample, reconstructed bins are in
For more details about the output, read the docs.