A bioinformatic classifier of Rab GTPases
Rabifier is an automated bioinformatic pipeline for the prediction and classification of Rab GTPases. For a more detailed description of the pipeline, see the references below. To browse Rab GTPases in all sequenced eukaryotic genomes, visit rabdb.org.
Rabifier is freely distributed under the GNU General Public License; see the LICENCE file for details.
Please cite our papers if you use Rabifier in your projects.
Rabifier2: an improved bioinformatic classifier of Rab GTPases. Surkont J, et al.
Thousands of Rab GTPases for the Cell Biologist. Diekmann Y, et al. PLoS Comput Biol 7(10): e1002217. doi:10.1371/journal.pcbi.1002217
Installation
To install Rabifier, run:
pip install rabifier
Python requirements, third party packages and other dependencies
Rabifier supports Python 2.7 and Python 3.4. It has been tested only on GNU/Linux; we do not plan to support other platforms.
Rabifier depends on third-party Python libraries:
biopython (>=1.66)
numpy (>=1.10.1)
scipy (>=0.16.1)
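The minimum versions listed above can be checked with a short script. This is only a sketch: the helper names `parse_version` and `meets_minimum` are hypothetical and not part of Rabifier, and the comparison looks only at the leading numeric part of each version string (suffixes such as `rc1` are ignored).

```python
import importlib
import re


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError("cannot parse version: %r" % text)
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum version."""
    return parse_version(installed) >= parse_version(minimum)


# Import name -> minimum version, copied from the list above.
# Biopython is imported under the name "Bio".
REQUIREMENTS = {"Bio": "1.66", "numpy": "1.10.1", "scipy": "0.16.1"}

if __name__ == "__main__":
    for module_name, minimum in sorted(REQUIREMENTS.items()):
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            print("%s: not installed (need >= %s)" % (module_name, minimum))
            continue
        installed = getattr(module, "__version__", None)
        if installed is None:
            print("%s: installed, version unknown" % module_name)
        elif meets_minimum(installed, minimum):
            print("%s %s: OK" % (module_name, installed))
        else:
            print("%s %s: too old (need >= %s)" % (module_name, installed, minimum))
```

Installing with pip normally resolves these dependencies automatically, so a check like this is mostly useful when working from a cloned repository.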
Rabifier uses several bioinformatic tools, which are required for most of the classification stages. Ensure that the following programs (or links pointing to them) are available in the system path.
HMMER (3.1b1): phmmer, hmmbuild, hmmpress, hmmscan
BLAST+ (2.2.30): blastp
MEME4 (4.10.2): meme, mast
Superfamily (>=1.75): superfamily (NOTE: this is a folder containing several Superfamily database files and scripts, see below)
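Before running the pipeline, it can help to confirm that the programs listed above are visible on the system path. The following is a minimal sketch, assuming Python 3 (it relies on `shutil.which`); the function name `find_missing` is hypothetical and not part of Rabifier. Superfamily is left out because it is a folder rather than an executable, and the script does not check program versions.

```python
import shutil

# Executable names copied from the list above.
REQUIRED_TOOLS = [
    "phmmer", "hmmbuild", "hmmpress", "hmmscan",  # HMMER
    "blastp",                                     # BLAST+
    "meme", "mast",                               # MEME
]


def find_missing(tools):
    """Return the names in `tools` that cannot be found on the system path."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_TOOLS)
    if missing:
        print("Not found on the system path: " + ", ".join(missing))
    else:
        print("All required programs were found.")
```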
If you have cloned this repository, you need to compile the HMMs of Rab subfamilies using hmmpress, i.e. run hmmpress rabifier/data/rab_subfamily.hmm
Rabifier requires a seed database for Rab classification. A precomputed database is part of this repository. You can also create the database by running rabifier-mkdb on the raw, manually curated data sets, available in a separate repository https://github.com/evocell/rabifier-data. The build process requires additional software:
CD-HIT (v4.6.4): cd-hit
PRANK (v.150803): prank
MAFFT (v7.221): mafft
matplotlib (>=1.4.3) (optional)
To install the Superfamily database, follow the instructions below (based on the Superfamily website).
# Register at the Superfamily website to get your username and password
# Download files
mkdir superfamily
cd superfamily
wget --http-user USERNAME --http-password PASSWORD -r -np -nd -e robots=off \
-R 'index.html*' 'http://supfam.org/SUPERFAMILY/downloads/license/supfam-local-1.75/'
wget http://scop.mrc-lmb.cam.ac.uk/scop/parse/dir.cla.scop.txt_1.75 -O dir.cla.scop.txt
wget http://scop.mrc-lmb.cam.ac.uk/scop/parse/dir.des.scop.txt_1.75 -O dir.des.scop.txt
# Uncompress files
gzip -d *.gz
mv hmmlib_1.75 hmmlib
# Make Perl scripts executable
chmod u+x *.pl
# Build the HMM library
hmmpress hmmlib
# Create a symbolic link pointing to the database directory e.g. ln -s superfamily $HOME/bin/
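After following the steps above, the resulting directory can be sanity-checked with a short script. This is a sketch under stated assumptions: the expected file list contains only the names that appear in the commands above plus the four index files that `hmmpress` writes next to its input (`.h3f`, `.h3i`, `.h3m`, `.h3p`); the Perl scripts and other downloaded files are not checked, and `missing_files` is a hypothetical helper, not part of Rabifier.

```python
import os
import sys

# Files named in the installation steps, plus the hmmpress index files.
EXPECTED_FILES = [
    "hmmlib",
    "hmmlib.h3f", "hmmlib.h3i", "hmmlib.h3m", "hmmlib.h3p",
    "dir.cla.scop.txt",
    "dir.des.scop.txt",
]


def missing_files(directory, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from `directory`."""
    return [
        name for name in expected
        if not os.path.isfile(os.path.join(directory, name))
    ]


if __name__ == "__main__":
    # The directory defaults to ./superfamily, as created by the steps above.
    directory = sys.argv[1] if len(sys.argv) > 1 else "superfamily"
    missing = missing_files(directory)
    if missing:
        print("Missing from %s: %s" % (directory, ", ".join(missing)))
    else:
        print("%s contains all expected files." % directory)
```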
Usage
To run Rab prediction on protein sequences, save sequences in the FASTA format and run:
rabifier sequences.fa
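The same workflow can be scripted. The sketch below, assuming Python 3, writes sequences to a FASTA file and then calls the `rabifier` command shown above if it is installed. The helper `format_fasta` is hypothetical, and the record is a made-up placeholder peptide, not a real Rab sequence; only the invocation `rabifier <file>` comes from this document.

```python
import os
import shutil
import subprocess
import tempfile


def format_fasta(records, width=60):
    """Format (identifier, sequence) pairs as FASTA text, wrapping at `width`."""
    lines = []
    for identifier, sequence in records:
        lines.append(">" + identifier)
        for start in range(0, len(sequence), width):
            lines.append(sequence[start:start + width])
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder record for illustration only; replace with real proteins.
    records = [("example_protein", "MKTAYIAKQRQISFVKSHFSRQ")]

    # Write the FASTA file into a fresh temporary directory.
    path = os.path.join(tempfile.mkdtemp(), "sequences.fa")
    with open(path, "w") as handle:
        handle.write(format_fasta(records))

    # Run Rabifier only if the command is available on the system path.
    if shutil.which("rabifier") is None:
        print("rabifier not found; FASTA file written to " + path)
    else:
        subprocess.call(["rabifier", path])
```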
For more options controlling Rabifier's behaviour, type:
rabifier -h
Bug reports and contributing
Please use the issue tracker to report bugs and suggest improvements.