SnakeMake Bioinformatics Library.

In case of any problem, don’t hesitate to contact me on

Short description

SMBL is a library of useful rules and Python functions for Snakemake pipelines. It makes it possible to install various bioinformatics programs automatically, such as read mappers, read simulators, and conversion tools. It also supports downloading and converting some important references in FASTA format (e.g., the human genome).

Installation / upgrade

To install SMBL, you need a Unix-like operating system (e.g., Linux or macOS) and Python 3.3 or later. Install or upgrade it with the following command:

pip3 install --upgrade smbl

If Snakemake has not been installed yet, it will be installed automatically together with SMBL.

The current version of SMBL from the git repository can be installed with:

pip3 install --upgrade git+git://


To download and install software automatically, SMBL requires the following programs on your Unix system:

  • wget or curl
  • gcc 4.7+
  • git
  • make
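
Before the first run it can be convenient to check that these tools are available. The following is a minimal sketch and not part of SMBL: the function names missing_tools and version_at_least are hypothetical, and the check only looks for the executables on PATH.

```python
import shutil

# Tools from the list above; wget and curl are alternatives, so one is enough.
REQUIRED = ["gcc", "git", "make"]
DOWNLOADERS = ["wget", "curl"]


def missing_tools(which=shutil.which):
    """Return the names of required tools that are not found on PATH."""
    missing = [tool for tool in REQUIRED if which(tool) is None]
    if not any(which(tool) is not None for tool in DOWNLOADERS):
        missing.append("wget or curl")
    return missing


def version_at_least(version_text, minimum):
    """Compare a dotted version string such as '4.8.5' with a tuple such as (4, 7)."""
    parts = tuple(int(p) for p in version_text.strip().split(".") if p.isdigit())
    return parts >= minimum


if __name__ == "__main__":
    print("Missing:", missing_tools() or "nothing")
```

The gcc 4.7+ requirement could be checked by passing the output of gcc -dumpversion to version_at_least together with (4, 7).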


To use SMBL in a Snakefile, import the smbl Python package and include the file with all its rules:

import smbl
include: smbl.include()

You can then use any supported program or data file. When one of them appears as an input of a rule, it is downloaded or compiled automatically.
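
The idea of installing something only when a rule asks for it can be pictured with a small toy model. This is a conceptual sketch in plain Python, not SMBL's or Snakemake's actual implementation; ensure and fake_install are hypothetical names, and the "installation" just writes a placeholder file.

```python
import os
import tempfile


def ensure(path, build):
    """Call build(path) only when path does not exist yet; report what happened."""
    if os.path.exists(path):
        return "already present"
    build(path)
    return "built"


def fake_install(path):
    # Stand-in for a real download or compilation step.
    with open(path, "w") as handle:
        handle.write("placeholder\n")


workdir = tempfile.mkdtemp()
tool = os.path.join(workdir, "some_tool")

first = ensure(tool, fake_install)   # file is missing, so it gets built
second = ensure(tool, fake_install)  # file exists now, so nothing is done
print(first, "/", second)
```

In the same spirit, a program that is already present is not downloaded or compiled again on later runs.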

All the programs are installed into ~/.smbl/bin/ and all FASTA files into ~/.smbl/fa/.
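
To see what has already been installed, the two directories can simply be listed. The sketch below assumes only the directory layout stated above; the helper names smbl_dirs and list_installed are hypothetical and are not provided by SMBL.

```python
import os


def smbl_dirs(home=None):
    """Return the bin and fa directories named in the text, under the given home."""
    home = home or os.path.expanduser("~")
    base = os.path.join(home, ".smbl")
    return os.path.join(base, "bin"), os.path.join(base, "fa")


def list_installed(directory):
    """Return the sorted file names in directory, or an empty list if it is absent."""
    if not os.path.isdir(directory):
        return []
    return sorted(os.listdir(directory))


if __name__ == "__main__":
    bin_dir, fa_dir = smbl_dirs()
    print("programs:", list_installed(bin_dir))
    print("FASTA files:", list_installed(fa_dir))
```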


Program                 Variable with its filename
art_454                 smbl.prog.ART_454
art_illumina            smbl.prog.ART_ILLUMINA
art_solid               smbl.prog.ART_SOLID
bcftools                smbl.prog.BCFTOOLS
bfast                   smbl.prog.BFAST
bgzip                   smbl.prog.BGZIP
bowtie2                 smbl.prog.BOWTIE2
bowtie2-build           smbl.prog.BOWTIE2_BUILD
bowtie2-inspect         smbl.prog.BOWTIE2_INSPECT
bwa                     smbl.prog.BWA
curesim.jar             smbl.prog.CURESIM
curesim_eval.jar        smbl.prog.CURESIM_EVAL
deez                    smbl.prog.DEEZ
drfast                  smbl.prog.DRFAST
dwgsim                  smbl.prog.DWGSIM
                        smbl.prog.DWGSIM_EVAL
freec                   smbl.prog.FREEC
gem-indexer             smbl.prog.GEM_INDEXER
gem-mapper              smbl.prog.GEM_MAPPER
gem-2-sam               smbl.prog.GEM_2_SAM
gnuplot4                smbl.prog.GNUPLOT4
gnuplot5                smbl.prog.GNUPLOT5
kallisto                smbl.prog.KALLISTO
lastal                  smbl.prog.LASTAL
lastdb                  smbl.prog.LASTDB
mason_frag_sequencing   smbl.prog.MASON_FRAG_SEQUENCING
mason_genome            smbl.prog.MASON_GENOME
mason_materializer      smbl.prog.MASON_MATERIALIZER
mason_methylation       smbl.prog.MASON_METHYLATION
mason_simulator         smbl.prog.MASON_SIMULATOR
mason_splicing          smbl.prog.MASON_SPLICING
mason_variator          smbl.prog.MASON_VARIATOR
mrfast                  smbl.prog.MRFAST
mrsfast                 smbl.prog.MRSFAST
perm                    smbl.prog.PERM
pbsim                   smbl.prog.PBSIM
picard                  smbl.prog.PICARD
sambamba                smbl.prog.SAMBAMBA
samtools                smbl.prog.SAMTOOLS
sirfast                 smbl.prog.SIRFAST
storm-color             smbl.prog.STORM_COLOR
storm-nucleotide        smbl.prog.STORM_NUCLEOTIDE
tabix                   smbl.prog.TABIX
twoBitToFa              smbl.prog.TWOBITTOFA
                        smbl.prog.VCFTULS
wgsim                   smbl.prog.WGSIM
                        smbl.prog.WGSIM_EVAL
xs                      smbl.prog.XS

FASTA files

FASTA file                    Variable with its filename
An example small FASTA file   smbl.fasta.EXAMPLE_1
An example small FASTA file   smbl.fasta.EXAMPLE_2
An example small FASTA file   smbl.fasta.EXAMPLE_3
Human genome HG38 (GRCh38)    smbl.fasta.HG38, smbl.fasta.HUMAN_GRCH38
Mouse genome MM10             smbl.fasta.MOUSE_MM10
Chimpanzee genome PANTRO4     smbl.fasta.CHIMP_PANTRO4


The following example demonstrates how SMBL can be used for automatic installation of software.

Create a file named Snakefile with the following content:

import smbl
include: smbl.include()

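rule all:
        input:
                smbl.prog.DWGSIM,       # input[0]: read simulator
                smbl.prog.BWA,          # input[1]: read mapper
                smbl.fasta.EXAMPLE_1    # input[2]: reference (assumed here; any example FASTA file should work)
        params:
                PREF="simulated_reads"  # prefix of the simulated read files (name chosen for illustration)
        output:
                "alignment.sam"
        run:
                # read simulation
                shell("{input[0]} -C 1 {input[2]} {params.PREF}")

                # creating BWA index of the reference sequence
                shell("{input[1]} index {input[2]}")

                # mapping by BWA
                shell("{input[1]} mem {input[2]} {params.PREF}.bfast.fastq > alignment.sam")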

Run the script by executing snakemake in the directory that contains the Snakefile.


What happens:

  1. An example FASTA file is downloaded
  2. DWGSIM and BWA are downloaded, compiled and installed
  3. DWGSIM simulates reads from the example FASTA file
  4. These reads are mapped back to the reference by BWA (alignment.sam is created)
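
The commands behind steps 3 and 4 can be spelled out as plain strings, which shows how the placeholders in the shell calls of the example expand. This is only an illustrative sketch: pipeline_commands is a hypothetical helper, and the paths and the prefix used in the call are made-up values, not the ones SMBL generates.

```python
def pipeline_commands(dwgsim, bwa, fasta, prefix):
    """Return the three shell commands of the example as plain strings."""
    return [
        # read simulation
        "{} -C 1 {} {}".format(dwgsim, fasta, prefix),
        # creating the BWA index of the reference sequence
        "{} index {}".format(bwa, fasta),
        # mapping the simulated reads back to the reference
        "{} mem {} {}.bfast.fastq > alignment.sam".format(bwa, fasta, prefix),
    ]


# Hypothetical paths, used only to show the expansion.
commands = pipeline_commands("dwgsim", "bwa", "example.fa", "reads")
for command in commands:
    print(command)
```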

