Needleman-Wunsch and Smith-Waterman algorithms in Python for any iterable objects.

Algorithms

Needleman-Wunsch

The Needleman–Wunsch algorithm is an algorithm used in bioinformatics to align protein or nucleotide sequences. It was one of the first applications of dynamic programming to compare biological sequences. The algorithm was developed by Saul B. Needleman and Christian D. Wunsch and published in 1970. The algorithm essentially divides a large problem (e.g. the full sequence) into a series of smaller problems and uses the solutions to the smaller problems to reconstruct a solution to the larger problem. It is also sometimes referred to as the optimal matching algorithm and the global alignment technique. The Needleman–Wunsch algorithm is still widely used for optimal global alignment, particularly when the quality of the global alignment is of the utmost importance.

-- From the Wikipedia article
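
The dynamic-programming idea described above can be sketched in plain Python. This is a minimal, score-only illustration and not minineedle's implementation; the function name nw_score and the default scoring values (match=1, miss=-1, gap=-1) are assumptions chosen for the example.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")


def nw_score(a: Sequence[T], b: Sequence[T],
             match: int = 1, miss: int = -1, gap: int = -1) -> int:
    """Return the optimal global alignment score of a and b (hypothetical helper)."""
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against an empty sequence costs one gap per element.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    # Each cell reuses the three neighbouring sub-problems.
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else miss)
            up = score[i - 1][j] + gap
            left = score[i][j - 1] + gap
            score[i][j] = max(diagonal, up, left)
    return score[-1][-1]


print(nw_score("ACTG", "ATCTG"))  # four matches and one gap
```

Because elements are only compared with "==", the same function works for strings, lists, or any other indexable sequence.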

Smith-Waterman

The Smith–Waterman algorithm performs local sequence alignment; that is, for determining similar regions between two strings of nucleic acid sequences or protein sequences. Instead of looking at the entire sequence, the Smith–Waterman algorithm compares segments of all possible lengths and optimizes the similarity measure.

-- From the Wikipedia article
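
The local variant differs from the global one mainly in that a cell score never drops below zero and the best score can appear anywhere in the matrix. The following is a minimal, score-only sketch under assumed scoring values (match=2, miss=-1, gap=-1); sw_score is a hypothetical name, not part of minineedle.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")


def sw_score(a: Sequence[T], b: Sequence[T],
             match: int = 2, miss: int = -1, gap: int = -1) -> int:
    """Return the best local alignment score of a and b (hypothetical helper)."""
    rows, cols = len(a) + 1, len(b) + 1
    # The first row and column stay at zero: a local alignment may start anywhere.
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else miss)
            up = score[i - 1][j] + gap
            left = score[i][j - 1] + gap
            # Clamping at zero lets a new local alignment begin at this cell.
            score[i][j] = max(0, diagonal, up, left)
            best = max(best, score[i][j])
    return best


# Only the shared segment "ACGT" contributes to the score.
print(sw_score("XXXACGTYYY", "ZZACGTWW"))
```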

Usage

from minineedle import needle, smith, core

# Use miniseq objects
# Load sequences as miniseq FASTA object
import miniseq
fasta = miniseq.FASTA(filename="myfasta.fa")
seq1, seq2 = fasta[0], fasta[1]

# Or use strings, lists, etc
# seq1, seq2 = "ACTG", "ATCTG"
# seq1, seq2 = ["A","C","T","G"], ["A","T","C","T","G"]

# Create the instance
alignment: needle.NeedlemanWunsch[str] = needle.NeedlemanWunsch(seq1, seq2)
# or
# alignment: smith.SmithWaterman[str] = smith.SmithWaterman(seq1, seq2)

# Make the alignment
alignment.align()

# Get the score
alignment.get_score()

# Get the sequences aligned as lists
al1, al2 = alignment.get_aligned_sequences(core.AlignmentFormat.list) # or "list"

# Get the sequences as strings
al1, al2 = alignment.get_aligned_sequences(core.AlignmentFormat.str) # or "str"

# Change the matrix and run again
alignment.change_matrix(core.ScoreMatrix(match=4, miss=-4, gap=-2))
alignment.align()

# Print the sequences aligned
print(alignment)

# Change gap character
alignment.gap_character = "-gap-"
print(alignment)

# Sort a list of alignments by score
# (seq3 and seq4 are a second pair of sequences, defined like seq1 and seq2)
first_al  = needle.NeedlemanWunsch(seq1, seq2)
second_al = needle.NeedlemanWunsch(seq3, seq4)

for align in sorted([first_al, second_al], reverse=True):
    print(align)
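
Sorting with sorted() requires that the objects can be compared with each other. How minineedle implements this is not shown here; the sketch below uses a hypothetical stand-in class, ScoredAlignment, to illustrate one way an object can be ordered by its score.

```python
from dataclasses import dataclass
from functools import total_ordering


@total_ordering
@dataclass(eq=False)
class ScoredAlignment:
    """Hypothetical stand-in for an alignment object; not minineedle's class."""
    name: str
    score: int

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, ScoredAlignment):
            return NotImplemented
        return self.score == other.score

    def __lt__(self, other: object) -> bool:
        if not isinstance(other, ScoredAlignment):
            return NotImplemented
        return self.score < other.score


# reverse=True puts the highest-scoring alignment first.
ranked = sorted(
    [ScoredAlignment("first", 3), ScoredAlignment("second", 7)],
    reverse=True,
)
print([al.name for al in ranked])  # ['second', 'first']
```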

Install

pip install minineedle

Classes

NeedlemanWunsch

Needleman-Wunsch alignment class. It has the following attributes:

  • seq1
  • seq2
  • alseq1
  • alseq2
  • nmatrix
  • pmatrix
  • smatrix
  • score
  • identity
  • gap_character

To create the instance you have to provide two iterable objects with elements that can be compared with "==".

SmithWaterman

Smith-Waterman alignment class. It has the following attributes:

  • seq1
  • seq2
  • alseq1
  • alseq2
  • nmatrix
  • pmatrix
  • smatrix
  • score
  • identity

To create the instance you have to provide two iterable objects with elements that can be compared with "==".

ScoreMatrix

With this class you can define your own score matrices. It has three attributes:

  • match
  • miss
  • gap

Methods

align()

Performs the alignment.

get_score()

Returns the score of the alignment. Runs align() first if the alignment has not been computed yet.

change_matrix(newmatrix)

Takes a ScoreMatrix object and updates the score matrix used for the alignment. The alignment is not recomputed automatically; call align() again to apply the new matrix.

get_identity()

Returns the percent identity of the alignment, rounded to two decimal places.
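
As an illustration of what a percent identity can mean, the sketch below uses one common definition: matching positions divided by the alignment length. This is an assumption; the exact formula used by minineedle is not stated here, and percent_identity is a hypothetical helper.

```python
from typing import Sequence


def percent_identity(al1: Sequence, al2: Sequence, gap: str = "-") -> float:
    """Percent of aligned positions holding identical, non-gap elements."""
    if len(al1) != len(al2):
        raise ValueError("aligned sequences must have the same length")
    if not al1:
        return 0.0
    # A position counts only when both elements are equal and not a gap.
    matches = sum(1 for x, y in zip(al1, al2) if x == y and x != gap)
    return round(100 * matches / len(al1), 2)


print(percent_identity("A-CTG", "ATCTG"))  # 4 of 5 positions match: 80.0
```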

get_almatrix()

Returns the alignment matrix as a list of lists.
