Skip to main content

pyabpoa: SIMD-based partial order alignment using adaptive band

Project description

pyabpoa: abPOA Python interface

Introduction

pyabpoa provides an easy-to-use interface to abPOA, it contains all the APIs that can be used to perform MSA for a set of sequences and consensus calling from the final alignment graph.

Installation

Install pyabpoa with pip

pyabpoa can be installed with pip:

pip install pyabpoa

Install pyabpoa from source

Alternatively, you can install pyabpoa from source (cython is required):

git clone https://github.com/yangao07/abPOA.git
cd abPOA
make install_py

Examples

The following code illustrates how to use pyabpoa.

import pyabpoa as pa
a = pa.msa_aligner()
seqs=[
'CCGAAGA',
'CCGAACTCGA',
'CCCGGAAGA',
'CCGAAGA'
]
res=a.msa(seqs, out_cons=True, out_msa=True, out_pog='pog.png', incr_fn='') # perform multiple sequence alignment 
                                                                # generate a figure of alignment graph to pog.png

for seq in res.cons_seq:
    print(seq)  # print consensus sequence

res.print_msa() # print row-column multiple sequence alignment in PIR format

You can also try the example script provided in the source folder:

python ./python/example.py

APIs

Class pyabpoa.msa_aligner

pyabpoa.msa_aligner(aln_mode='g', ...)

This constructs a multiple sequence alignment handler of pyabpoa, it accepts the following arguments:

  • aln_mode: alignment mode. 'g': global, 'l': local, 'e': extension; default: 'g'
  • match: match score; default: 2
  • mismatch: match penaty; default: 4
  • score_matrix: scoring matrix file, match and mismatch are not used when score_matrix is used; default: ''
  • gap_open1: first gap opening penalty; default: 4
  • gap_ext1: first gap extension penalty; default: 2
  • gap_open2: second gap opening penalty; default: 24
  • gap_ext2: second gap extension penalty; default: 1
  • extra_b: first adaptive banding paremeter; set as < 0 to disable adaptive banded DP; default: 10
  • extra_f: second adaptive banding paremete; the number of extra bases added on both sites of the band is b+f*L, where L is the length of the aligned sequence; default : 0.01
  • is_diploid: set as 1 if input is diploid datal default: 0
  • min_freq: minimum frequency of each consensus to output for diploid datal default: 0.3

The msa_aligner handler provides one method which performs multiple sequence alignment and takes four arguments:

pyabpoa.msa_aligner.msa(seqs, out_cons, out_msa, out_pog='', incr_fn='')
  • seqs: a list variable containing a set of input sequences; positional
  • out_cons: a bool variable to ask pyabpoa to generate consensus sequence; positional
  • out_msa: a bool variable to ask pyabpoa to generate RC-MSA; positional
  • out_pog: name of a file (.png or .pdf) to store the plot of the final alignment graph; optional, default: ''
  • incr_fn: name of an existing graph (GFA) or MSA (FASTA) file, incrementally align sequence to this graph/MSA; optional, default: ''

Class pyabpoa.msa_result

pyabpoa.msa_result(seq_n, cons_n, cons_len, ...)

This class describes the information of the generated consensus sequence and the RC-MSA. The returned result of pyabpoa.msa_aligner.msa() is an object of this class that has the following properties:

  • seq_n: number of input aligned sequences
  • cons_n: number of generated consensus sequences (generally 1, could be 2 if input is specified as diploid)
  • cons_len: an array of consensus sequence length(s)
  • cons_seq: an array of consensus sequence(s)
  • msa_len: size of each row in the RC-MSA
  • msa_seq: an array containing seq_n strings that demonstrates the RC-MSA, each consisting of one input sequence and several - indicating the alignment gaps.

pyabpoa.msa_result() has a function of print_msa which prints the RC-MSA to screen.

pyabpoa.msa_result().print_msa()

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pyabpoa-1.2.3.tar.gz (138.8 kB view details)

Uploaded Source

Built Distribution

pyabpoa-1.2.3-py3.8-linux-x86_64.egg (670.6 kB view details)

Uploaded Source

File details

Details for the file pyabpoa-1.2.3.tar.gz.

File metadata

  • Download URL: pyabpoa-1.2.3.tar.gz
  • Upload date:
  • Size: 138.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.7.0 requests/2.25.1 setuptools/52.0.0.post20210125 requests-toolbelt/0.9.1 tqdm/4.59.0 CPython/3.8.2

File hashes

Hashes for pyabpoa-1.2.3.tar.gz
Algorithm Hash digest
SHA256 3da878a5a839403a76e083e44f64dd556a0099464e1064b2ea1a1c7349944755
MD5 21ec4cf5054d32dd7ecc657e951f6908
BLAKE2b-256 3233605170870d6773d6bbfb138463ef8789dc9a9decdeab7b3e27d1261c82c5

See more details on using hashes here.

Provenance

File details

Details for the file pyabpoa-1.2.3-py3.8-linux-x86_64.egg.

File metadata

  • Download URL: pyabpoa-1.2.3-py3.8-linux-x86_64.egg
  • Upload date:
  • Size: 670.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.7.0 requests/2.25.1 setuptools/52.0.0.post20210125 requests-toolbelt/0.9.1 tqdm/4.59.0 CPython/3.8.2

File hashes

Hashes for pyabpoa-1.2.3-py3.8-linux-x86_64.egg
Algorithm Hash digest
SHA256 9e6fe283951ba4138de09ee3b4969e695202c7ae8d2735303bbf5736cb8c9707
MD5 31432785848127ad4be570e2b1e31746
BLAKE2b-256 3e6ebbe955a90b5b861355388ae2d0dd0037b9f5f7a5c04b9c8458c886b50d36

See more details on using hashes here.

Provenance

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page