pyabPOA: SIMD-based partial order alignment using adaptive band
pyabPOA: abPOA Python interface
Introduction
pyabPOA provides an easy-to-use interface to abPOA.
Installation
Install pyabPOA with conda or pip
pyabPOA can be installed with conda or pip:
conda install -c bioconda pyabpoa
pip install pyabpoa
Install pyabPOA from source
Alternatively, you can install pyabPOA from source:
git clone https://github.com/yangao07/abPOA.git
cd abPOA/python
make install
Getting started
After installation, you can run the toy example script to test it:
python ./example.py
Usage
import pyabpoa as pa

a = pa.msa_aligner()
seqs = [
    'CCGAAGA',
    'CCGAACTCGA',
    'CCCGGAAGA',
    'CCGAAGA'
]
# perform multiple sequence alignment and
# generate a figure of the alignment graph in pog.png
res = a.msa(seqs, out_cons=True, out_msa=True, out_pog='pog.png')

for seq in res.cons_seq:
    print(seq)  # print consensus sequence

res.print_msa()  # print row-column multiple sequence alignment in PIR format
APIs
Class pyabpoa.msa_aligner
pyabpoa.msa_aligner(aln_mode='g', ...)
This constructs a multiple sequence alignment handler for pyabPOA. It accepts the following arguments:
- aln_mode: alignment mode. 'g': global, 'l': local, 'e': extension; default: 'g'
- match: match score; default: 2
- gap_open1: first gap opening penalty; default: 4
- gap_ext1: first gap extension penalty; default: 2
- gap_open2: second gap opening penalty; default: 24
- gap_ext2: second gap extension penalty; default: 1
- extra_b: first part of extra band width; default: 10
- extra_f: second part of extra band width; the total extra band width is b+f*L, where L is the sequence length; default: 0.01
- is_diploid: set as 1 if input is diploid data; default: 0
- min_freq: minimum frequency of each consensus to output for diploid data; default: 0.3
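To make the numeric arguments above more concrete, here is a minimal sketch in plain Python that does not call pyabPOA at all. The function `extra_band_width` simply evaluates the b+f*L expression from the list. The function `two_piece_gap_cost` is an assumption: it supposes that the two gap penalty pairs combine as the minimum of two affine costs, which this page does not state. Both function names are hypothetical and exist only for illustration.

```python
def extra_band_width(seq_len, extra_b=10, extra_f=0.01):
    """Total extra band width b + f*L for a sequence of length L.

    Whether the library rounds this value is not stated in the
    documentation, so the raw float is returned.
    """
    return extra_b + extra_f * seq_len


def two_piece_gap_cost(gap_len, gap_open1=4, gap_ext1=2,
                       gap_open2=24, gap_ext2=1):
    """ASSUMPTION: a gap of length g costs min(o1 + g*e1, o2 + g*e2).

    This two-piece form is not confirmed by this page; it is only one
    plausible reading of the "first" and "second" gap penalties.
    """
    first = gap_open1 + gap_len * gap_ext1
    second = gap_open2 + gap_len * gap_ext2
    return min(first, second)


if __name__ == "__main__":
    # Show how the band grows with sequence length under the defaults.
    for length in (100, 1000, 10000):
        print(length, extra_band_width(length))
    # Under the assumed formula, short gaps use the first pair
    # and long gaps use the second pair.
    for gap in (1, 30):
        print(gap, two_piece_gap_cost(gap))
```

With the default values, the extra band is dominated by extra_b for short reads and by extra_f*L for long reads, which is one way to decide which of the two to tune.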
pyabpoa.msa_aligner.msa(seqs, out_cons, out_msa, out_pog=None)
This method performs multiple sequence alignment and generates
- consensus sequence if out_cons is set as True
- row-column multiple sequence alignment in PIR format if out_msa is set as True
- plot of alignment graph if out_pog is set as a file name with suffix .png or .pdf
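Since out_pog is expected to end in .png or .pdf, it can be convenient to check the file name before starting a long alignment. The sketch below is a hypothetical helper, not part of pyabPOA. It assumes the suffix match is case-sensitive, which the documentation does not specify.

```python
import os

# Suffixes named in the documentation for out_pog.
ALLOWED_POG_SUFFIXES = {'.png', '.pdf'}


def check_pog_name(out_pog):
    """Hypothetical helper: return out_pog if it is None or ends in an
    allowed suffix, otherwise raise ValueError.

    ASSUMPTION: the comparison is case-sensitive.
    """
    if out_pog is None:
        return None
    suffix = os.path.splitext(out_pog)[1]
    if suffix not in ALLOWED_POG_SUFFIXES:
        raise ValueError(
            "out_pog must end in .png or .pdf, got %r" % out_pog
        )
    return out_pog


if __name__ == "__main__":
    print(check_pog_name('pog.pdf'))
    # Possible use with an aligner `a` and sequences `seqs`:
    # res = a.msa(seqs, out_cons=True, out_msa=False,
    #             out_pog=check_pog_name('pog.pdf'))
```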
Class pyabpoa.msa_result
pyabpoa.msa_result(seq_n, cons_n, cons_len, ...)
This class describes the generated consensus sequence and row-column multiple sequence alignment. The result returned by pyabpoa.msa_aligner.msa() is an object of this class, and it has the following properties:
- seq_n: number of input sequences
- cons_n: number of generated consensus sequences
- cons_len: an array of consensus sequence lengths
- cons_seq: an array of consensus sequences
- msa_len: size of each row in the row-column multiple sequence alignment
- msa_seq: an array containing seq_n rows of the row-column multiple sequence alignment
pyabpoa.msa_result() also has a function, print_msa. It prints the row-column multiple sequence alignment.
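The properties listed above can be read like ordinary attributes. The following sketch shows one way to walk them. So that it runs without pyabPOA installed, it uses a hand-made stand-in object with the same attribute names. The values in the stand-in are illustrative only and are not output from abPOA, and `summarize` is a hypothetical helper.

```python
from types import SimpleNamespace


def summarize(res):
    """Build a few human-readable lines from an object that exposes the
    documented msa_result properties."""
    lines = ["%d input sequences, %d consensus sequence(s)"
             % (res.seq_n, res.cons_n)]
    # cons_len and cons_seq are parallel arrays, one entry per consensus.
    for length, cons in zip(res.cons_len, res.cons_seq):
        lines.append("consensus (length %d): %s" % (length, cons))
    # msa_seq holds seq_n rows, each msa_len columns wide.
    for i in range(res.seq_n):
        lines.append("row %d (%d columns): %s"
                     % (i, res.msa_len, res.msa_seq[i]))
    return lines


# Hand-made stand-in (NOT real abPOA output), used only so the sketch runs.
fake_result = SimpleNamespace(
    seq_n=2,
    cons_n=1,
    cons_len=[4],
    cons_seq=['ACGT'],
    msa_len=4,
    msa_seq=['ACGT', 'A-GT'],
)

if __name__ == "__main__":
    for line in summarize(fake_result):
        print(line)
```

In real use, the object returned by msa() would be passed to `summarize` in place of `fake_result`, assuming its properties behave like Python sequences as the list above suggests.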