
Prediction and classification of conopeptides

Project description


ConoDictor: A fast and accurate prediction and classification tool for conopeptides

Introduction

Cone snails are among the richest sources of natural peptides with promising pharmacological and therapeutic applications. With the reduced cost of RNA-seq, scientists now heavily rely on venom gland transcriptomes for mining novel bioactive conopeptides, but the required bioinformatic analyses often hamper the discovery process.

ConoDictor 2 is a standalone, user-friendly command-line program. We have updated the program, originally published as a web server 10 years ago, with novel and updated tools and algorithms, and improved our classification models with new, higher-quality sequences. ConoDictor 2 is now more accurate, faster, multiplatform, and able to handle a whole cone snail venom gland transcriptome (raw reads or contigs) in a very short time.

The only input ConoDictor 2 requires is the assembled transcriptome or the raw reads file, in either DNA or amino acid form; the alphabet used is recognized automatically. ConoDictor 2 runs predictions directly on the protein file (submitted or dynamically generated) and tries to report the longest conopeptide precursor-like sequence.

Installation

Install from Pip

You will first have to install HMMER 3 and Pftools to be able to run conodictor.
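
If you use conda, both dependencies should also be installable from the Bioconda channel; the package names below (hmmer and pftools) are assumptions to check against your channel configuration.

# install the mandatory dependencies (assumed Bioconda package names)
conda install -c bioconda hmmer pftools

Then install conodictor itself: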

pip install conodictor

Using containers

Docker

Accessible at https://hub.docker.com/u/ebedthan or on BioContainers.

docker pull ebedthan/conodictor:latest
docker run ebedthan/conodictor:latest conodictor -h

Example of a run

docker run --rm=True -v $PWD:/data -u $(id -u):$(id -g) ebedthan/conodictor:latest conodictor --out /data/outdir /data/input.fa.gz

See https://staph-b.github.io/docker-builds/run_containers/ for more information on how to properly run a Docker container.

Singularity

The Singularity container does not need admin privileges, making it suitable for university clusters and HPC environments.

singularity build conodictor.sif docker://ebedthan/conodictor:latest
singularity exec conodictor.sif conodictor -h
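
For a full analysis run, bind your working directory into the container, analogous to the Docker example above (paths are illustrative):

singularity exec --bind $PWD:/data conodictor.sif conodictor --out /data/outdir /data/input.fa.gz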

Install from source

# Download ConoDictor development version
git clone https://github.com/koualab/conodictor.git conodictor

# Navigate to directory
cd conodictor

# Install with poetry: see https://python-poetry.org
poetry install --no-dev

# Enter the Python virtual environment with
poetry shell

# Test conodictor is correctly installed
conodictor -h

If you do not want to enter the virtual environment, just run:

poetry run conodictor -h

Test

  • Type conodictor -h and it should output something like:
usage: conodictor [options] <FILE>

optional arguments:
  -o DIR, --out DIR   output result to DIR [ConoDictor]
  --mlen INT          minimum length of sequences to be considered [off]
  --ndup INT          minimum occurrence of sequences to be considered [off]
  --faa               dump a fasta file of matched sequences [false]
  --filter            only keep sequences matching sig, pro and mat regions [false]
  -a, --all           add unclassified sequences in result [false]
  -j INT, --cpus INT  number of threads [1]
  --force             re-use output directory [false]
  -q, --quiet         decrease program verbosity
  -v, --version       show program's version number and exit
  -h, --help          show this help message and exit

Citation: Koua et al., 2021, Bioinformatics Advances

Invoking conodictor

conodictor file.fa.gz
conodictor --out outfolder --cpus 4 --mlen 51 file.fa
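
The options listed in the help message can be combined; for example, a run that keeps only sequences matching all three regions, dumps the matched sequences to FASTA, and also reports unclassified sequences could look like this (file and folder names are placeholders):

conodictor --out results --cpus 4 --mlen 50 --ndup 2 --faa --filter --all transcriptome.fa.gz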

Output files

The comma-separated values file summary.csv can be easily viewed with any office suite or text editor.

sequence,hmm_pred,pssm_pred,definitive_pred
SEQ_ID_1,A,A,A
SEQ_ID_2,B,D,CONFLICT B and D
SEQ_ID_3,O1,O1,O1
...
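
Since summary.csv is plain CSV, quick summaries are easy to script. For example, to count sequences per definitive superfamily prediction, assuming the default ConoDictor output directory:

# tally the definitive_pred column (4th field), skipping the header line
tail -n +2 ConoDictor/summary.csv | cut -d, -f4 | sort | uniq -c | sort -rn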

Citation

When using ConoDictor 2 in your work, please cite:

Dominique Koua, Anicet Ebou, Sébastien Dutertre, Improved prediction of conopeptide superfamilies with ConoDictor 2.0, Bioinformatics Advances, Volume 1, Issue 1, 2021, vbab011, https://doi.org/10.1093/bioadv/vbab011.

Bugs

Submit problems or requests to the Issue Tracker.

Dependencies

Mandatory

  • HMMER 3
    Used for HMM profile prediction.
    Eddy SR, Accelerated Profile HMM Searches. PLOS Computational Biology 2011, 10.1371/journal.pcbi.1002195

  • Pftools
    Used for PSSM prediction.
    Schuepbach T et al. pfsearchV3: a code acceleration and heuristic to search PROSITE profiles. Bioinformatics 2013, 10.1093/bioinformatics/btt129
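
A quick way to check that both tools are available on your PATH before running conodictor (pfsearchV3 is the binary name assumed for Pftools 3; adjust if your build differs):

# both commands should resolve to an executable path
command -v hmmsearch     # HMMER 3
command -v pfsearchV3    # Pftools 3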

Licence

GPL v3.

For commercial use, please contact Dominique Koua at dominique.koua@inphb.ci.

Authors



Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

conodictor-2.3.5.tar.gz (270.2 kB)


Built Distribution

conodictor-2.3.5-py3-none-any.whl (273.7 kB)


File details

Details for the file conodictor-2.3.5.tar.gz.

File metadata

  • Download URL: conodictor-2.3.5.tar.gz
  • Upload date:
  • Size: 270.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.12 CPython/3.8.10 Linux/5.15.0-46-generic

File hashes

Hashes for conodictor-2.3.5.tar.gz
Algorithm    Hash digest
SHA256       94830674c1718eda480923cb5fe359c0776f070d913c75da08c410618cf554fe
MD5          94c01669412a253ef8bba7425894cc64
BLAKE2b-256  0509c5331e85208f8a20c270ff27ab986a27bd0b2ffe9ad0a692ee3d2df64b99

See more details on using hashes here.
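
To verify a downloaded archive against the digests listed above, for example:

sha256sum conodictor-2.3.5.tar.gz
# expected: 94830674c1718eda480923cb5fe359c0776f070d913c75da08c410618cf554fe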

File details

Details for the file conodictor-2.3.5-py3-none-any.whl.

File metadata

  • Download URL: conodictor-2.3.5-py3-none-any.whl
  • Upload date:
  • Size: 273.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.12 CPython/3.8.10 Linux/5.15.0-46-generic

File hashes

Hashes for conodictor-2.3.5-py3-none-any.whl
Algorithm    Hash digest
SHA256       a62c8220115ad1de7765911a358e36fa085803a34daec95663946e7947d6a290
MD5          0f89506844597da23dfc27b8c4320059
BLAKE2b-256  e07004158cdb1a5e0291fdf8a1601924b320626098d59ac1bb219ee9e93e7854

See more details on using hashes here.
