A resistance gene annotation tool

Resistify

Resistify is a lightweight and fast program designed to classify NLRs by their protein domain architecture. I created it as an alternative to several similar programs for two reasons.

The first is to move away from using InterProScan as a dependency. While InterProScan is a useful resource for annotating protein domains, it is very feature-rich and can be challenging to set up on a new system. Its distribution isn't well supported by conda, which is an additional challenge when integrating it into automated workflows. Resistify comes packaged with all the necessary databases, so you don't have to worry about setting them up manually.

Secondly, I've created this to be as free of dependencies as possible. This allows Resistify to be easily distributed and quickly installed!

I'm grateful to the authors of NLRexpress for the motif models used in this program.

Installation

To get started with Resistify:

pip install resistify

Resistify requires biopython and scikit-learn==0.24.2. It also requires hmmsearch and jackhmmer, both part of the HMMER suite; install these via conda or any other means. A conda distribution is in progress!
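
Before a first run, it can help to confirm that the two external tools are actually visible on your PATH. The following is a minimal sketch of such a check, not part of Resistify itself; the function name missing_tools is made up for illustration.

```python
import shutil

# Tool names taken from the requirements listed above.
REQUIRED_TOOLS = ("hmmsearch", "jackhmmer")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("All required tools found.")
```

If anything is reported missing, install HMMER before running Resistify.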

Usage

To run Resistify:

resistify <input.fa> <output_directory>

Your input.fa should contain the amino acid sequences of your proteins of interest. Multi-line sequences and sequence description fields are allowed.
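
If you are unsure whether a file meets these expectations, a quick pre-check along the following lines may be useful. This is only a sketch using the standard library: the accepted alphabet (the 20 standard residues plus X and *) is an assumption rather than something Resistify documents, and read_fasta and non_protein_records are hypothetical helper names.

```python
import io

# Assumed alphabet: 20 standard amino acids plus X (unknown) and * (stop).
# Whether Resistify itself accepts X and * is not stated in its documentation.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWYX*")


def read_fasta(handle):
    """Yield (identifier, sequence) pairs, joining multi-line sequences."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep the identifier only; the description after it is ignored.
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def non_protein_records(handle):
    """Return identifiers whose sequence has characters outside AMINO_ACIDS."""
    return [name for name, seq in read_fasta(handle)
            if not set(seq.upper()) <= AMINO_ACIDS]


if __name__ == "__main__":
    example = ">seq1 some description\nMADA\nVSF\n>seq2\nMKT123\n"
    print(non_protein_records(io.StringIO(example)))
```

Note that this check cannot tell nucleotide input apart from protein input, because A, C, G, and T are all valid amino acid codes.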

An output_directory will be created which will contain the results of your run:

  • results.tsv - A table of the length, classification, and predicted functionality of each sequence, as well as the presence of any MADA motif or CJID domain
  • motifs.tsv - A table of all the NLRexpress motifs for each sequence
  • domains.tsv - A table of all the domains identified for each sequence
  • nbarc.fasta - A FASTA file of all the NB-ARC domains identified
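
The .tsv outputs can be loaded with Python's csv module for downstream summaries. The sketch below assumes each table has a header row; the exact column names are not listed here, so the column to summarise is passed in by the caller and any name used in the example ("Classification") is hypothetical.

```python
import csv
from collections import Counter


def load_tsv(path):
    """Read a tab-separated file with a header row into a list of dicts."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def count_by_column(rows, column):
    """Count how many rows carry each value of `column`."""
    return Counter(row[column] for row in rows)


# Example usage (path and column name are assumptions; check the header
# of your own results.tsv first):
#   rows = load_tsv("output_directory/results.tsv")
#   print(count_by_column(rows, "Classification"))
```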

How does it work?

Resistify is a two-step process.

First, all sequences are searched for CC, RPW8, TIR, and NB-ARC domains. This is used to quickly filter out any non-NLR sequences and identify the primary architecture of each NLR.
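
To illustrate the idea of this first step, here is a simplified sketch of how a list of domain hits could be turned into a primary architecture. It is not Resistify's actual code: the one-letter labels and the rule that a sequence without an NB-ARC domain counts as non-NLR are assumptions made for the example.

```python
# Domain names follow the four searched in the first step. The one-letter
# codes are illustrative and may differ from Resistify's own labels.
DOMAIN_CODES = {"CC": "C", "RPW8": "R", "TIR": "T", "NB-ARC": "N"}


def primary_architecture(domains):
    """Map domain names (in N- to C-terminal order) to a short string.

    Returns None when no NB-ARC domain is present, which this sketch
    treats as a non-NLR sequence (an assumption).
    """
    if "NB-ARC" not in domains:
        return None
    return "".join(DOMAIN_CODES[d] for d in domains if d in DOMAIN_CODES)


if __name__ == "__main__":
    print(primary_architecture(["CC", "NB-ARC"]))
    print(primary_architecture(["TIR"]))
```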

Secondly, each potential NLR sequence is scanned for CC, NB-ARC, and LRR-associated motifs via the NLRexpress models. These motifs provide an additional layer of evidence to reclassify each NLR: they are used to predict LRR domains, and to predict any CC domains missed in the initial hmmsearch, which can be less sensitive for this domain. The functionality of each NLR is predicted by counting the number of conserved NB-ARC motifs. Currently, motifs are accepted in any order (this may change in the future!).
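
The motif-counting idea can be sketched as follows. This is a simplified illustration rather than Resistify's implementation: the set of NB-ARC motif names, the choice to count each motif once, and the cut-off are all assumptions made for the example.

```python
# Assumed NB-ARC motif names; check motifs.tsv for the names actually used.
NBARC_MOTIFS = {
    "VG", "P-loop", "RNSB-A", "Walker-B", "RNSB-B",
    "RNSB-C", "RNSB-D", "GLPL", "MHD",
}
MIN_CONSERVED_MOTIFS = 5  # hypothetical cut-off, not Resistify's value


def count_nbarc_motifs(motif_hits):
    """Count distinct NB-ARC motifs among (name, position) hits.

    Positions are ignored, so motifs are accepted in any order.
    """
    return len({name for name, _pos in motif_hits if name in NBARC_MOTIFS})


def looks_functional(motif_hits, minimum=MIN_CONSERVED_MOTIFS):
    """Return True when at least `minimum` distinct NB-ARC motifs are found."""
    return count_nbarc_motifs(motif_hits) >= minimum


if __name__ == "__main__":
    hits = [("P-loop", 200), ("VG", 180), ("GLPL", 350),
            ("LxxLxL", 600), ("P-loop", 210)]
    print(count_nbarc_motifs(hits), looks_functional(hits))
```

Because positions are discarded, an order-aware check would need a different design, for example comparing the sorted hit positions against an expected motif order.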

Resistify will also search for N-terminal MADA motifs and CJID domains, which are common to CNLs and TNLs, respectively.

Future improvements

Once the core functionality is stable, I will begin integrating NLR-associated into the pipeline.
