Solving Multiple Sequence Alignments with Python

Sequoya


Sequoya is an open-source software tool for solving Multiple Sequence Alignment (MSA) problems with multi-objective metaheuristics.
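To illustrate what "multi-objective" means here, the following minimal sketch compares candidate alignments by Pareto dominance over their score vectors. It assumes every objective is maximized; the function names `dominates` and `non_dominated` are hypothetical and are not part of Sequoya's API.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if score vector `a` Pareto-dominates `b`.

    Assumption: every objective is to be maximized.
    """
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def non_dominated(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep only the score vectors that no other vector dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # Hypothetical (objective 1, objective 2) scores for three alignments.
    scores = [(10, 0.5), (8, 0.9), (7, 0.4)]
    print(non_dominated(scores))
```

A multi-objective algorithm such as NSGA-II returns a set of such non-dominated trade-offs rather than a single best alignment.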

This tool implements a distributed, asynchronous version of the M2Align algorithm, described in:

"M2Align: parallel multiple sequence alignment with a multi-objective metaheuristic". Cristian Zambrano-Vega, Antonio J. Nebro, José García-Nieto, José F. Aldana-Montes. Bioinformatics, Volume 33, Issue 19, 1 October 2017, Pages 3011–3017 (DOI).

Features

  • Score functions:
    • Sum of pairs,
    • Star,
    • Minimum entropy,
    • Percentage of non-gaps,
    • Percentage of totally conserved columns,
    • STRIKE.
  • Algorithms:
    • NSGA-II,
    • Distributed NSGA-II.
  • Crossover operator:
    • Single-point crossover (GapSequenceSolutionSinglePoint).
  • Mutation operators:
    • Shift closest gap group (ShiftClosedGapGroups),
    • Shift gap group (ShiftGapGroup),
    • Random gap insertion (OneRandomGapInsertion),
    • Merge two random adjacent gap groups (TwoRandomAdjacentGapGroup),
    • Multiple mutation (MultipleMSAMutation).
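As a rough illustration of three of the score functions listed above, the sketch below computes them for an alignment given as equal-length strings with `-` as the gap symbol. The function names and the match/mismatch/gap values are assumptions chosen for the example; Sequoya's own implementations may differ, for instance by scoring pairs with a substitution matrix.

```python
from itertools import combinations
from typing import List

GAP = "-"


def _check(alignment: List[str]) -> None:
    """Reject empty alignments and rows of unequal length."""
    if not alignment or not alignment[0] or len({len(s) for s in alignment}) != 1:
        raise ValueError("aligned sequences must be non-empty and of equal length")


def percentage_of_non_gaps(alignment: List[str]) -> float:
    """Share of cells in the alignment that are not gaps, in percent."""
    _check(alignment)
    total = sum(len(s) for s in alignment)
    gaps = sum(s.count(GAP) for s in alignment)
    return 100.0 * (total - gaps) / total


def percentage_of_totally_conserved_columns(alignment: List[str]) -> float:
    """Share of columns holding one identical, non-gap symbol, in percent."""
    _check(alignment)
    columns = list(zip(*alignment))
    conserved = sum(1 for col in columns if col[0] != GAP and len(set(col)) == 1)
    return 100.0 * conserved / len(columns)


def sum_of_pairs(alignment: List[str], match: int = 1,
                 mismatch: int = -1, gap: int = -2) -> int:
    """Sum a simple pairwise score over every pair of rows in every column.

    Assumption: flat match/mismatch/gap values stand in for a substitution matrix.
    """
    _check(alignment)
    score = 0
    for col in zip(*alignment):
        for a, b in combinations(col, 2):
            if a == GAP and b == GAP:
                continue  # a gap aligned with a gap contributes nothing
            if a == GAP or b == GAP:
                score += gap
            elif a == b:
                score += match
            else:
                score += mismatch
    return score


if __name__ == "__main__":
    msa = ["AC-T", "ACGT", "A-GT"]
    print(percentage_of_non_gaps(msa))
    print(percentage_of_totally_conserved_columns(msa))
    print(sum_of_pairs(msa))
```

All three scores are written so that higher is better, which makes them usable together as objectives in a multi-objective search.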

Install

To download and install Sequoya, clone the Git repository hosted on GitHub:

git clone https://github.com/benhid/Sequoya.git
cd Sequoya
python setup.py install

Or via pip:

pip install Sequoya
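One way to confirm that the installation succeeded is to query the installed distribution's version from Python. This is a minimal sketch using only the standard library (Python 3.8+); the helper name `installed_version` is hypothetical.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str = "Sequoya") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    print(f"Sequoya {version}" if version else "Sequoya is not installed")
```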

Usage

Examples of running Sequoya are located in the examples folder.

Dask distributed

To run Sequoya on a cluster of machines, first set up a network with at least one dask-scheduler node and several dask-worker nodes:

conda create --name dask-cluster
conda activate dask-cluster

pip install git+https://github.com/benhid/Sequoya.git@develop

Then, on the master node run:

dask-scheduler

On each worker node run:

dask-worker <master-ip>:8786 --nprocs <total-cores> --nthreads 1
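Once the scheduler and workers are up, a Python process can connect to the cluster with `dask.distributed.Client`. The sketch below is only a connectivity check under stated assumptions: `scheduler_address`, `square`, and `run_on_cluster` are hypothetical helpers, `square` stands in for whatever work Sequoya would submit, and port 8786 is the one used in the command above.

```python
from typing import List


def scheduler_address(host: str, port: int = 8786) -> str:
    """Build the TCP address of the dask-scheduler node."""
    return f"tcp://{host}:{port}"


def square(x: int) -> int:
    """Placeholder task; stands in for evaluating one candidate alignment."""
    return x * x


def run_on_cluster(host: str) -> List[int]:
    """Submit a few placeholder tasks to the cluster and collect the results."""
    # Imported here so the helpers above work without dask installed.
    from dask.distributed import Client

    with Client(scheduler_address(host)) as client:
        futures = client.map(square, range(8))
        return client.gather(futures)


if __name__ == "__main__":
    # Replace with the IP of the node running dask-scheduler.
    print(scheduler_address("127.0.0.1"))
```

Calling `run_on_cluster("<master-ip>")` from any machine that can reach the scheduler should return the squared values if the workers are registered correctly.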

Authors

Active development team

License

This project is licensed under the terms of the MIT License. See the LICENSE file for details.


Source Distribution

Sequoya-0.9.0.tar.gz (19.8 kB)
