Python package that generates a consensus sequence from forward and reverse sequences, performs multiple sequence alignment of FASTA sequences, and generates phylogenetic trees using Bayesian and maximum likelihood methods.
SATO - Sequence Analysis Toolkit
This Python application, built with PyQt6 and integrated with the BioPython library, serves as a Sequence Analysis Toolkit (SATO). SATO offers a user-friendly graphical interface with multiple tabs for various sequence analysis tasks. Users can perform tasks such as generating consensus sequences from two input sequences, aligning sequences using Clustal Omega or MAFFT, and conducting phylogenetic analysis using MrBayes or FastTree. The application also provides features for visualizing alignment results and phylogenetic trees, making it a versatile tool for researchers and scientists working with biological sequences.
Purpose of the Package
The package provides a comprehensive and user-friendly solution for biologists and researchers working with biological sequence data. It aims to streamline and simplify various sequence analysis tasks, including generating consensus sequences, conducting sequence alignments, and performing phylogenetic analysis. By offering a graphical user interface (GUI) and integrating with external tools and libraries like BioPython, Clustal Omega, MAFFT, MrBayes, FastTree, Jalview, and FigTree, the package empowers users to efficiently analyze and visualize biological sequence data, making it a valuable resource for molecular biology and bioinformatics research.
Features
SATO covers three core tasks for biological sequences: consensus sequence generation, sequence alignment, and phylogenetic analysis. Its key features are listed below:
Consensus Sequence Generation
- Users can provide two sequences in FASTA format.
- The app generates a consensus sequence by finding the best overlapping window that minimizes mismatches while maximizing sequence length.
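The overlap search described above can be sketched as follows. This is a minimal illustration, not SATO's actual implementation: the function names, the `min_overlap` parameter, the use of `N` for disagreeing bases, and the tie-breaking rule (lowest mismatch rate first, then the longest overlap) are all assumptions made for the example. It also assumes the reverse read must be reverse-complemented before merging.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (unknown bases become N)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement.get(base, "N") for base in reversed(seq.upper()))


def best_overlap(forward: str, reverse_rc: str, min_overlap: int = 4):
    """Find the overlap between the end of `forward` and the start of `reverse_rc`.

    Returns (overlap_length, mismatches), or None if the reads are too short.
    Assumed scoring: lowest mismatch rate wins; ties go to the longer overlap.
    """
    best = None
    max_len = min(len(forward), len(reverse_rc))
    for length in range(max(1, min_overlap), max_len + 1):
        tail = forward[-length:]
        head = reverse_rc[:length]
        mismatches = sum(a != b for a, b in zip(tail, head))
        score = (mismatches / length, -length)
        if best is None or score < best[0]:
            best = (score, length, mismatches)
    return None if best is None else (best[1], best[2])


def consensus(forward: str, reverse: str, min_overlap: int = 4) -> str:
    """Merge a forward read and a reverse read into one consensus sequence."""
    forward = forward.upper()
    reverse_rc = reverse_complement(reverse)
    found = best_overlap(forward, reverse_rc, min_overlap)
    if found is None:
        raise ValueError("reads are too short to overlap")
    length, _ = found
    tail, head = forward[-length:], reverse_rc[:length]
    # Keep agreeing bases; mark disagreements as N (an assumption of this sketch).
    merged = "".join(a if a == b else "N" for a, b in zip(tail, head))
    return forward[:-length] + merged + reverse_rc[length:]


if __name__ == "__main__":
    fwd = "ACGTACGTTGCA"
    rev = reverse_complement("TTGCAGGATCC")  # simulate a reverse read
    print(consensus(fwd, rev))
```

A real tool would likely also weigh base quality scores when resolving mismatches; this sketch ignores them.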
Sequence Alignment
- Users can perform sequence alignment using either Clustal Omega or MAFFT.
- Input sequences are validated for FASTA format.
- Aligned sequences are displayed in a user-friendly format.
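As a rough illustration of the validation and alignment steps, the sketch below checks FASTA text with the standard library only and builds the command line for either aligner. The function names are hypothetical and this is not SATO's code. It assumes the executables are called `clustalo` and `mafft` and are on the `PATH`; the flags used (`-i`, `-o`, `--outfmt`, `--force` for Clustal Omega, `--auto` for MAFFT) are standard options of those tools.

```python
import shutil
import subprocess


def parse_fasta(text: str) -> dict:
    """Parse FASTA text into {name: sequence}; raise ValueError if malformed."""
    records = {}
    name = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            fields = line[1:].split()
            if not fields:
                raise ValueError("FASTA header has no identifier")
            name = fields[0]
            if name in records:
                raise ValueError(f"duplicate identifier: {name}")
            records[name] = ""
        else:
            if name is None:
                raise ValueError("sequence data found before the first header")
            records[name] += line
    if not records:
        raise ValueError("no FASTA records found")
    empty = [n for n, seq in records.items() if not seq]
    if empty:
        raise ValueError(f"records without sequence data: {empty}")
    return records


def alignment_command(tool: str, infile: str, outfile: str) -> list:
    """Build the command line for the chosen aligner."""
    if tool == "clustalo":
        return ["clustalo", "-i", infile, "-o", outfile, "--outfmt=fasta", "--force"]
    if tool == "mafft":
        # MAFFT writes the alignment to standard output.
        return ["mafft", "--auto", infile]
    raise ValueError(f"unsupported aligner: {tool}")


def run_alignment(tool: str, infile: str, outfile: str) -> None:
    """Run the aligner, writing the aligned FASTA to `outfile`."""
    cmd = alignment_command(tool, infile, outfile)
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(f"{cmd[0]} was not found on PATH")
    if tool == "mafft":
        with open(outfile, "w") as handle:
            subprocess.run(cmd, stdout=handle, check=True)
    else:
        subprocess.run(cmd, check=True)
```

Validating before calling the aligner gives a clearer error message than the one the external program would produce for a malformed file.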
Phylogenetic Analysis
- Users can conduct phylogenetic analysis using either MrBayes (Bayesian Phylogeny) or FastTree (Maximum Likelihood).
- Supports both DNA and Protein sequences.
- The app handles input alignments in FASTA or Nexus format.
- Generates a phylogenetic tree and visualizes it using FigTree.
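To give a sense of what driving these two programs involves, here is a hedged sketch that builds a FastTree command line and a MrBayes command block. The function names, the default chain length, and the model choices (GTR for DNA in FastTree; `nst=6 rates=gamma` or a mixed amino-acid prior in MrBayes) are illustrative assumptions, not SATO's actual settings. The executable name `FastTree` also varies between installations.

```python
def fasttree_command(alignment: str, seq_type: str = "dna") -> list:
    """Build a FastTree command; FastTree prints the Newick tree to stdout."""
    if seq_type == "dna":
        # -nt selects nucleotide mode, -gtr the GTR substitution model.
        return ["FastTree", "-gtr", "-nt", alignment]
    if seq_type == "protein":
        # Protein alignments are FastTree's default input type.
        return ["FastTree", alignment]
    raise ValueError(f"unknown sequence type: {seq_type}")


def mrbayes_block(seq_type: str = "dna", ngen: int = 10000, samplefreq: int = 100) -> str:
    """Return a MrBayes block to append to a Nexus alignment file."""
    if seq_type == "dna":
        model = "lset nst=6 rates=gamma;"
    elif seq_type == "protein":
        model = "prset aamodelpr=mixed;"
    else:
        raise ValueError(f"unknown sequence type: {seq_type}")
    lines = [
        "begin mrbayes;",
        "  set autoclose=yes nowarn=yes;",
        f"  {model}",
        f"  mcmc ngen={ngen} samplefreq={samplefreq};",
        "  sump;",
        "  sumt;",
        "end;",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(" ".join(fasttree_command("alignment.fasta", "dna")))
    print(mrbayes_block("dna"))
```

The FastTree command would typically be run with its standard output redirected to a tree file, while the MrBayes block is appended to the Nexus alignment before passing that file to the `mb` executable.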
User-Friendly Interface
- The app offers a tabbed interface for easy navigation between different analysis functions.
Installation Instructions
1. Python environment - SATO requires Python 3, so make sure it is installed on your computer, then install SATO with pip:
pip install SATO
2. SATO uses the following programs:
- Clustal Omega and/or MAFFT for sequence alignment
- MrBayes and/or FastTree for phylogenetic analysis
- SeaView and/or Jalview for visualization and analysis of multiple sequence alignments, and FigTree for visualization and analysis of phylogenetic trees
Ensure that the programs you intend to use are installed on your computer.
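One quick way to check which of these programs are available is to look them up on the `PATH`. The snippet below is a small helper for that purpose, not part of SATO. The executable names are assumptions: they differ between platforms and packagers (for example, FastTree may be installed as `FastTree` or `fasttree`), and GUI programs such as Jalview or FigTree are often installed without a command on the `PATH`.

```python
import shutil

# Assumed executable names; adjust them to match your installation.
TOOLS = {
    "Clustal Omega": ["clustalo"],
    "MAFFT": ["mafft"],
    "MrBayes": ["mb"],
    "FastTree": ["FastTree", "fasttree"],
    "SeaView": ["seaview"],
    "Jalview": ["jalview"],
    "FigTree": ["figtree"],
}


def find_tools(tools=TOOLS) -> dict:
    """Map each tool label to the first matching executable path, or None."""
    found = {}
    for label, names in tools.items():
        paths = (shutil.which(name) for name in names)
        found[label] = next((path for path in paths if path), None)
    return found


if __name__ == "__main__":
    for label, path in find_tools().items():
        print(f"{label}: {path or 'not found on PATH'}")
```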
Usage
After installation, open the terminal (Linux or macOS) or Command Prompt (Windows), type sato, and press Enter to launch SATO's GUI.
Standalone
There is also a standalone executable at SATO v0.1.1
After extracting the archive, double-click the executable to launch the GUI.
Acknowledgment
- Huelsenbeck, J. P., & Ronquist, F. (2001). MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics, 17(8), 754-755.
- Price, M. N., Dehal, P. S., & Arkin, A. P. (2009). FastTree: computing large minimum evolution trees with profiles instead of a distance matrix. Molecular biology and evolution, 26(7), 1641-1650.
- Sievers, F., Wilm, A., Dineen, D., Gibson, T. J., Karplus, K., Li, W., Lopez, R., McWilliam, H., Remmert, M., Söding, J., Thompson, J. D., & Higgins, D. G. (2011). Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Molecular systems biology, 7, 539. https://doi.org/10.1038/msb.2011.75
- Katoh, K., Misawa, K., Kuma, K., & Miyata, T. (2002). MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic acids research, 30(14), 3059–3066. https://doi.org/10.1093/nar/gkf436
- Rambaut, A. (2009). FigTree. Tree figure drawing tool. http://tree.bio.ed.ac.uk/software/figtree/
- Waterhouse, A., Procter, J., Martin, D. A., & Barton, G. J. (2005). Jalview: visualization and analysis of molecular sequences, alignments, and structures. BMC Bioinformatics, 6(3), 1-1.
- Gouy, M., Guindon, S., & Gascuel, O. (2010). SeaView version 4: a multiplatform graphical user interface for sequence alignment and phylogenetic tree building. Molecular biology and evolution, 27(2), 221-224.
Contribution
Should you notice a bug, please let us know by opening an issue in the GitHub Issue Tracker.
Author