
Automatic prediction and classification of protein domain architectures


synthaser


Process

synthaser parses the results of a batch NCBI conserved domain search and determines the domain architecture of secondary metabolite synthases.

Installation

Install from PyPI using pip:

$ pip install --user synthaser

or clone the repo and install locally:

$ git clone https://www.github.com/gamcil/synthaser
$ cd synthaser
$ pip install .

Dependencies

synthaser is written in pure Python (3.6+) and requires only the following dependencies for remote searches:

  • requests, for interaction with the NCBI's CD-Search API
  • biopython, for retrieving sequences from NCBI Entrez

If you want to do local searches, you'll need:

  • RPS-BLAST, for performing local domain searches
  • rpsbproc, for formatting RPS-BLAST results like CD-Search

These can be obtained from the NCBI FTP.
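
Before attempting a local search, it can help to confirm that both tools are reachable. The following is a minimal sketch, assuming the executables are installed under their usual NCBI names (`rpsblast` and `rpsbproc`) and are expected on your PATH; the helper `missing_tools` is a hypothetical name and not part of synthaser.

```python
import shutil

# Executable names as distributed by NCBI (assumption: your install keeps them).
LOCAL_SEARCH_TOOLS = ["rpsblast", "rpsbproc"]


def missing_tools(tools=LOCAL_SEARCH_TOOLS):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All local search tools found.")
```

If either name is reported as missing, check that the directory containing the NCBI binaries has been added to your PATH.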

Usage

A full synthaser search can be performed as simply as:

$ synthaser -qf sequences.fasta

Here, sequences.fasta is a FASTA-format file containing the protein sequences that you would like to search.
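
If you are unsure whether your input is well-formed FASTA, a quick check before searching can save a failed run. The sketch below is a plain-Python reader written for illustration only; `read_fasta` is a hypothetical helper, not a synthaser function, and the example sequences are made up.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) tuples from an open FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is None:
            raise ValueError("Sequence data found before the first '>' header")
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Made-up records, standing in for the contents of sequences.fasta.
example = io.StringIO(">seq1 hypothetical protein\nMKT\nAYI\n>seq2\nMSL\n")
for header, sequence in read_fasta(example):
    print(header, len(sequence))
```

To check a real file, replace the `io.StringIO` object with `open("sequences.fasta")`.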

For a full listing of available arguments, enter:

$ synthaser -h

Visualising your results

synthaser can generate fully interactive, annotated visualisations so you can easily explore your results. All that is required is one extra argument:

$ synthaser -qf sequences.fasta -p

This will generate a figure like so:

[Figure: example synthaser output]

Click here to play around with the full version of this example.

Saving your search session

synthaser allows you to save your search results so that they can be reloaded for further visualisation or exploration without repeating the search.

To do this, use the --json_file argument:

$ synthaser -qf sequences.fasta --json_file sequences.json

This will save all of your results, in JSON format, to the file sequences.json. Loading this session back into synthaser is then as easy as:

$ synthaser --json_file sequences.json ...
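
Because the session is ordinary JSON, it can also be opened outside synthaser. The sketch below uses only the standard library and deliberately assumes nothing about the file's internal layout, which is not described here; `summarise_session` is a hypothetical helper that just reports the top-level structure.

```python
import json


def summarise_session(path):
    """Load a saved session file and describe its top-level structure."""
    with open(path) as handle:
        session = json.load(handle)
    if isinstance(session, dict):
        return {"type": "object", "keys": sorted(session)}
    if isinstance(session, list):
        return {"type": "array", "length": len(session)}
    return {"type": type(session).__name__}
```

Calling `summarise_session("sequences.json")` on a saved session is a quick way to see what it contains before writing any further processing code.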

Using your own rules

Though synthaser was originally designed to analyse secondary metabolite synthases, it can easily be repurposed to analyse the domain architectures of any type of protein sequence.

Under the hood, synthaser uses a central rule file which contains:

  1. Domain types, containing specific families to save in CD-Search results, corresponding to domain 'islands';
  2. Rules for classifying the sequences based on domain architecture predictions; and
  3. A hierarchy which determines the order of evaluation for the rules.
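
To make the three parts concrete, here is a toy sketch of how they might fit together. It is not synthaser's rule file schema or its implementation: the names `DOMAIN_TYPES`, `RULES` and `HIERARCHY`, the placeholder family names, and the "all required domain types present" matching logic are all assumptions made for illustration. Refer to the documentation for the real format.

```python
# Everything below is illustrative; none of it is synthaser's actual schema.

# 1. Domain types: which (placeholder) CD-Search families map to which type.
DOMAIN_TYPES = {
    "KS": {"KS_family_a", "KS_family_b"},
    "AT": {"AT_family_a"},
    "A": {"A_family_a"},
    "C": {"C_family_a"},
}

# 2. Rules: a label applies when all of its required domain types are present.
RULES = {
    "Hybrid PKS-NRPS": {"KS", "AT", "A", "C"},
    "PKS": {"KS", "AT"},
    "NRPS": {"A", "C"},
}

# 3. Hierarchy: rules are tried in this order and the first match wins,
#    so the most specific rule is listed first.
HIERARCHY = ["Hybrid PKS-NRPS", "PKS", "NRPS"]


def domain_architecture(hits):
    """Turn (family, start, end) hits into domain types ordered by start."""
    family_to_type = {
        family: dtype
        for dtype, families in DOMAIN_TYPES.items()
        for family in families
    }
    kept = [
        (start, family_to_type[family])
        for family, start, end in hits
        if family in family_to_type  # families outside DOMAIN_TYPES are dropped
    ]
    return [dtype for start, dtype in sorted(kept)]


def classify(architecture):
    """Return the first rule in HIERARCHY satisfied by the architecture."""
    present = set(architecture)
    for name in HIERARCHY:
        if RULES[name] <= present:
            return name
    return None


hits = [("AT_family_a", 500, 800), ("KS_family_a", 10, 400), ("other", 900, 950)]
architecture = domain_architecture(hits)
print(architecture, classify(architecture))
```

The ordering in `HIERARCHY` matters in this sketch: if "PKS" were evaluated before "Hybrid PKS-NRPS", a hybrid sequence would be labelled as a plain PKS.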

We distribute our fungal megasynthase rule file as the default, but providing your own rule file is as simple as:

$ synthaser -qf sequences.fasta --rule_file my_rules.json

We also provide a web application for assembling your own rule files, which can be found here.

For a detailed explanation of how the rule file works, as well as API documentation, please refer to the documentation.

Citations

If you found synthaser helpful, please cite:

1. <pending>

