Sequence alignment of Python objects.

paired

Paired is a Python package for pairwise alignment of arbitrary sequences.

Python has many good packages for sequence alignment and warping, but most of them target biological or numerical data. Paired performs global alignment on lists of arbitrary Python objects, and lets you define how pairs of elements are matched and scored.

Basic usage

import paired

seq_1 = 'The quick brown fox jumped over the lazy dog'.split(' ')
seq_2 = 'The brown fox leaped over the lazy dog'.split(' ')
alignment = paired.align(seq_1, seq_2)

print(alignment)
# [(0, 0), (1, None), (2, 1), (3, 2), (4, 3), (5, 4), (6, 5), (7, 6), (8, 7)]

for i_1, i_2 in alignment:
    print((seq_1[i_1] if i_1 is not None else '').ljust(15), end='')
    print(seq_2[i_2] if i_2 is not None else '')

# The            The
# quick          
# brown          brown
# fox            fox
# jumped         leaped
# over           over
# the            the
# lazy           lazy
# dog            dog

Custom scores

Paired uses the Needleman-Wunsch algorithm. The scoring for the different operations (match, mismatch, gap) can be specified:

alignment = paired.align(seq_1, seq_2, match_score=5, mismatch_score=-1, gap_score=-5)
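
To show how the three scores interact, here is a minimal sketch of Needleman-Wunsch global alignment. It is not Paired's source code: the function name nw_align, the back-pointer table, and the tie-breaking order (pairing preferred over gaps) are assumptions made for illustration. Only the parameter names and the output format (index pairs, with None marking a gap) follow what is documented here.

```python
def nw_align(x, y, match_score=1, mismatch_score=-1, gap_score=-1, scorer=None):
    """Illustrative Needleman-Wunsch sketch (hypothetical, not Paired's code)."""
    if scorer is None:
        def scorer(a, b):
            return match_score if a == b else mismatch_score

    DIAG, UP, LEFT = 0, 1, 2
    n, m = len(x), len(y)

    # score[i][j] is the best score for aligning x[:i] with y[:j];
    # move[i][j] records which step produced that score.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    move = [[None] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0], move[i][0] = i * gap_score, UP
    for j in range(1, m + 1):
        score[0][j], move[0][j] = j * gap_score, LEFT

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            candidates = [
                (score[i - 1][j - 1] + scorer(x[i - 1], y[j - 1]), DIAG),  # pair x[i-1] with y[j-1]
                (score[i - 1][j] + gap_score, UP),                         # x[i-1] paired with a gap
                (score[i][j - 1] + gap_score, LEFT),                       # y[j-1] paired with a gap
            ]
            # max() keeps the first best candidate, so ties favor pairing.
            score[i][j], move[i][j] = max(candidates, key=lambda c: c[0])

    # Walk the back-pointers from the bottom-right corner to the origin.
    alignment = []
    i, j = n, m
    while i > 0 or j > 0:
        step = move[i][j]
        if step == DIAG:
            alignment.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif step == UP:
            alignment.append((i - 1, None))
            i -= 1
        else:
            alignment.append((None, j - 1))
            j -= 1
    alignment.reverse()
    return alignment
```

A strongly negative gap_score relative to mismatch_score pushes the result toward pairing unequal elements rather than leaving them unpaired; a mild gap_score does the opposite.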

Custom similarity

By default, two elements are said to match if element_1 == element_2. Paired also allows you to pass a function to return a match/mismatch score for a given pair of elements. For example, you could give different scores to case-sensitive and case-insensitive matches of strings:

def scorer(a, b):
    if a == b:
        return 2
    elif a.lower() == b.lower():
        return 1
    else:
        return -1

alignment = paired.align(seq_1, seq_2, scorer=scorer, gap_score=-3)
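
Because the scorer only has to return a number for a pair of elements, the same idea applies to non-string data. The sketch below scores numeric readings by closeness; the name closeness_scorer, its thresholds, and the sample data are hypothetical and chosen only for illustration.

```python
def closeness_scorer(a, b, tight=0.1, loose=0.5):
    """Score two numbers by how close they are (hypothetical example)."""
    difference = abs(a - b)
    if difference <= tight:
        return 2    # near-identical readings
    elif difference <= loose:
        return 1    # roughly similar readings
    else:
        return -1   # treated as a mismatch

readings_1 = [1.0, 2.0, 3.5, 5.0]
readings_2 = [1.05, 3.2, 5.0]

# With Paired installed, the scorer would be passed exactly as in the example above:
# alignment = paired.align(readings_1, readings_2, scorer=closeness_scorer, gap_score=-3)
```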

Installation

Paired is on PyPI and can be installed with pip. It has no dependencies.

pip install paired

API

paired.align(x, y, match_score=1, mismatch_score=-1, gap_score=-1, scorer=None)

Get the global alignment of two sequences.

Arguments:

  • x, y (list): Sequences of objects to align.
  • match_score (numeric): Score when matching elements are paired.
  • mismatch_score (numeric): Score when mismatching elements are paired.
  • gap_score (numeric): Score for an insertion/deletion, when an element is paired with no other element.
  • scorer (callable): Function that takes two elements as inputs and returns a numerical score based on how well they match. If None is passed, the default function is equivalent to lambda a, b: match_score if a == b else mismatch_score.

Returns:

  • alignment (list of tuples): The aligned sequence, as a list of pairs of indices into x and y respectively. A gap is represented by None instead of an integer index.
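
Since the return value holds indices rather than elements, a small helper can turn it into labeled rows. The following is a sketch only: describe_alignment is a hypothetical helper, not part of Paired, and it assumes elements are compared with ==, as in the default scorer. The alignment literal is the output shown under Basic usage.

```python
def describe_alignment(x, y, alignment):
    """Label each aligned pair as 'match', 'mismatch', or 'gap' (hypothetical helper)."""
    rows = []
    for i, j in alignment:
        if i is None:
            rows.append(('gap', None, y[j]))       # element of y with no partner
        elif j is None:
            rows.append(('gap', x[i], None))       # element of x with no partner
        elif x[i] == y[j]:
            rows.append(('match', x[i], y[j]))
        else:
            rows.append(('mismatch', x[i], y[j]))
    return rows

seq_1 = 'The quick brown fox jumped over the lazy dog'.split(' ')
seq_2 = 'The brown fox leaped over the lazy dog'.split(' ')
# Output of paired.align(seq_1, seq_2) as shown under Basic usage.
alignment = [(0, 0), (1, None), (2, 1), (3, 2), (4, 3), (5, 4), (6, 5), (7, 6), (8, 7)]

rows = describe_alignment(seq_1, seq_2, alignment)
```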
