Efficiently computes distances between protein sequences
pwseqdist
A small package that efficiently computes pairwise distances between protein sequences. It accommodates similarity matrices, sequences of different lengths, and custom metrics.
Install
pip install pwseqdist
Example
import pwseqdist as pw
import multiprocessing
from scipy.spatial.distance import squareform
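# Example peptide sequences of differing lengths
peptides = ['CACADLGAYPDKLIF', 'CACDALLAYTDKLIF',
            'CACDAVGDTLDKLIF', 'CACDDVTEVEGDKLIF',
            'CACDFISPSNWGIQSGRNTDKLIF', 'CACDPVLGDTRLTDKLIF']
# Compute all pairwise distances using every available CPU core;
# the result is a condensed distance vector
dvec = pw.apply_pairwise_sq(seqs=peptides,
                            metric=pw.metrics.nw_hamming_metric,
                            ncpus=multiprocessing.cpu_count())
# Expand the condensed vector into a square, symmetric distance matrix
dmat = squareform(dvec).astype(int)
dmat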
array([[ 0,  4,  6,  7, 15,  8],
       [ 4,  0,  5,  7, 14,  7],
       [ 6,  5,  0,  6, 14,  4],
       [ 7,  7,  6,  0, 14,  8],
       [15, 14, 14, 14,  0, 11],
       [ 8,  7,  4,  8, 11,  0]])
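The example above passes a condensed distance vector to squareform to obtain the full matrix. The following is a minimal pure-Python sketch of that layout, not pwseqdist's implementation: it assumes the vector holds one distance per unordered pair (i < j) in row-major order, which is the convention squareform expects. The functions padded_hamming, pairwise_condensed and to_square are hypothetical names introduced only for illustration, and padded_hamming is a deliberately simple stand-in for a custom metric, not the nw_hamming_metric used above.

```python
from itertools import combinations


def padded_hamming(s1, s2):
    """Hypothetical metric: mismatches over the shared prefix plus the length difference."""
    mismatches = sum(a != b for a, b in zip(s1, s2))
    return mismatches + abs(len(s1) - len(s2))


def pairwise_condensed(seqs, metric):
    """Return one distance per unordered pair (i < j), in row-major order."""
    return [metric(a, b) for a, b in combinations(seqs, 2)]


def to_square(dvec, n):
    """Expand a condensed vector of length n*(n-1)/2 into a symmetric n x n matrix."""
    if len(dvec) != n * (n - 1) // 2:
        raise ValueError("condensed vector length does not match n")
    dmat = [[0] * n for _ in range(n)]
    # combinations(range(n), 2) yields (0,1), (0,2), ..., (n-2,n-1),
    # the same order in which pairwise_condensed produced the distances
    for d, (i, j) in zip(dvec, combinations(range(n), 2)):
        dmat[i][j] = dmat[j][i] = d
    return dmat


seqs = ['CACADLGAYPDKLIF', 'CACDALLAYTDKLIF', 'CACDAVGDTLDKLIF']
dvec = pairwise_condensed(seqs, padded_hamming)
dmat = to_square(dvec, len(seqs))
```

Storing only the n*(n-1)/2 unique pairs avoids computing each distance twice, which matters when the metric is expensive; the square form is reconstructed only when a full matrix is needed.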