super fast cpp implementation of longest common subsequence
pylcs
The original repository is no longer maintained. This is a transferred version of it.
pylcs is a very fast C++ library that uses dynamic programming (DP) to solve the two classic LCS problems below.
The longest common subsequence problem is the problem of finding the longest subsequence common to all sequences in a set of sequences (often just two sequences).
The longest common substring problem is to find the longest string (or strings) that is a substring (or are substrings) of two or more strings.
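To make the difference between the two problems concrete, here is a minimal pure-Python sketch of the textbook DP recurrences. It is not the pylcs C++ code; the function names `lcs_subsequence_length` and `lcs_substring_length` are hypothetical and were chosen only for this illustration.

```python
def lcs_subsequence_length(a: str, b: str) -> int:
    """Longest common subsequence: matched characters need not be adjacent."""
    # prev[j] is the LCS length of the already processed prefix of a and b[:j].
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Extend the best result that ends just before both characters.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise skip a character from a or from b.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_substring_length(a: str, b: str) -> int:
    """Longest common substring: matched characters must be contiguous."""
    best = 0
    # prev[j] is the length of the common run ending at the previous
    # character of a and at b[j - 1].
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
            # On a mismatch the run is broken, so curr[j] stays 0.
        prev = curr
    return best
```

The only difference is what happens on a mismatch: the subsequence version carries the best value forward, while the substring version resets the run to zero. Both sketches take O(len(a) * len(b)) time.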
Levenshtein distance (also known as edit distance) is supported as well, despite the package name. Example usage is in the tests.
Chinese (or any UTF-8) strings are also supported.
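As a rough illustration of the edit distance mentioned above, the following pure-Python sketch computes it over Python `str` values, so each Chinese character counts as one symbol rather than several bytes. The function name `weighted_edit_distance` is hypothetical. The optional nested-dict `weights` argument follows the form used in the examples in this README, with `''` standing for the empty side of an insertion or deletion. The default costs (1 for an edit, 0 for a match) are assumptions of this sketch, not a statement about pylcs internals.

```python
from typing import Dict, Optional

Weights = Dict[str, Dict[str, float]]


def weighted_edit_distance(a: str, b: str, weights: Optional[Weights] = None) -> float:
    """Cheapest sequence of insertions, deletions and substitutions turning a into b."""
    weights = weights or {}

    def cost(src: str, dst: str) -> float:
        # Assumption: unlisted edits cost 1 and unlisted matches cost 0.
        default = 0 if src == dst else 1
        return weights.get(src, {}).get(dst, default)

    # prev[j] is the cheapest way to turn the processed prefix of a into b[:j].
    prev = [0]
    for ch_b in b:
        prev.append(prev[-1] + cost('', ch_b))  # build b[:j] by insertions only
    for ch_a in a:
        curr = [prev[0] + cost(ch_a, '')]  # empty target: delete everything
        for j, ch_b in enumerate(b, start=1):
            substitute = prev[j - 1] + cost(ch_a, ch_b)
            delete = prev[j] + cost(ch_a, '')
            insert = curr[j - 1] + cost('', ch_b)
            curr.append(min(substitute, delete, insert))
        prev = curr
    return prev[-1]
```

For example, `weighted_edit_distance("你好世界", "你好,世界")` counts a single insertion, because Python strings are indexed by code point.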
Install
To install, run pip install pylcs
to pull down the latest version from PyPI.
Python code example
import pylcs
# finding the longest common subsequence length of string A and string B
A = 'We are shannonai'
B = 'We like shannonai'
pylcs.lcs_sequence_length(A, B)
"""
>>> pylcs.lcs_sequence_length(A, B)
14
"""
# finding alignment from string A to B
A = 'We are shannonai'
B = 'We like shannonai'
res = pylcs.lcs_sequence_idx(A, B)
''.join([B[i] for i in res if i != -1])
"""
>>> res
[0, 1, 2, -1, -1, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]
>>> ''.join([B[i] for i in res if i != -1])
'We e shannonai'
"""
# finding the longest common subsequence length of string A and each string in list B
A = 'We are shannonai'
B = ['We like shannonai', 'We work in shannonai', 'We are not shannonai']
pylcs.lcs_sequence_of_list(A, B)
"""
>>> pylcs.lcs_sequence_of_list(A, B)
[14, 14, 16]
"""
# finding the longest common substring length of string A and string B
A = 'We are shannonai'
B = 'We like shannonai'
pylcs.lcs_string_length(A, B)
"""
>>> pylcs.lcs_string_length(A, B)
11
"""
# finding the longest common substring alignment from string A to B
A = 'We are shannonai'
B = 'We like shannonai'
res = pylcs.lcs_string_idx(A, B)
''.join([B[i] for i in res if i != -1])
"""
>>> res
[-1, -1, -1, -1, -1, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]
>>> ''.join([B[i] for i in res if i != -1])
'e shannonai'
"""
# finding the longest common substring length of string A and each string in list B
A = 'We are shannonai'
B = ['We like shannonai', 'We work in shannonai', 'We are not shannonai']
pylcs.lcs_string_of_list(A, B)
"""
>>> pylcs.lcs_string_of_list(A, B)
[11, 10, 10]
"""
# finding the weighted edit distance from string A to B
pylcs.edit_distance("aaa", "aba")
pylcs.edit_distance("aaa", "aba", {'a': {'b': 2.0}})
pylcs.edit_distance("", "aa", {'': {'a': 0.5}})
# weight['']['a'] is the cost of inserting a char 'a' (0.5 here)
# similarly, weight['a'][''] is the cost of deleting a char 'a'
"""
>>> pylcs.edit_distance("aaa", "aba")
1
>>> pylcs.edit_distance("aaa", "aba", {'a': {'b': 2.0}})
2.0
>>> pylcs.edit_distance("", "aa", {'': {'a': 0.5}})
1.0
"""
# finding edit distance alignment from string A to B
pylcs.edit_distance_idx("aaa", "aba")
pylcs.edit_distance_idx("aaa", "aba", {'a': {'b': 3}})
pylcs.edit_distance_idx("aa", "aabb", {'a': {'a': 2, 'b': 0}})
"""
>>> pylcs.edit_distance_idx("aaa", "aba")
[0, 1, 2]
>>> pylcs.edit_distance_idx("aaa", "aba", {'a': {'b': 3}})
[0, 1, 2]
>>> pylcs.edit_distance_idx("aa", "aabb", {'a': {'a': 2, 'b': 0}})
[2, 3]
"""
Hashes for pylcs-0.0.7-cp37-cp37m-win_amd64.whl
Algorithm    Hash digest
SHA256       bf3b70fa3412a998a97d68c4358563bfc61cf2ccd7b67d02ff001d10db804d7a
MD5          1f109034e42b1cd9ae3195d76f449b70
BLAKE2-256   9eb245846f068947ecc43d53233aef4a25a83c964346a98fe6f38d0e3077e5c9