seq-smith
A Rust-based sequence alignment library for Python.
Installation
You can install seq-smith using pip:
pip install seq-smith
Usage
seq-smith provides alignment functions for several alignment modes, plus helper functions for building scoring matrices and encoding sequences. Here's a basic example of a global alignment:
from seq_smith import global_align, make_score_matrix, encode
# Define your alphabet
alphabet = "ACGT"
# Create a scoring matrix
score_matrix = make_score_matrix(alphabet, match_score=1, mismatch_score=-1)
# Encode sequences
seqa = encode("ACGT", alphabet)
seqb = encode("AGCT", alphabet)
# Define gap penalties
gap_open = -2
gap_extend = -1
# Perform the alignment
alignment = global_align(seqa, seqb, score_matrix, gap_open, gap_extend)
# Print the alignment score
print(f"Alignment score: {alignment.score}")
# Print the alignment fragments
for frag in alignment.align_frag:
    print(frag)
Helper Functions
make_score_matrix(alphabet, match_score, mismatch_score)
Creates a scoring matrix for a given alphabet.
Example:
from seq_smith import make_score_matrix
alphabet = "ACGT"
score_matrix = make_score_matrix(alphabet, match_score=2, mismatch_score=-1)
print(score_matrix)
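To make the idea of a match/mismatch matrix concrete, here is a minimal pure-Python sketch. It is not seq-smith's implementation: `sketch_score_matrix` is a hypothetical name, and the nested-list return type is an assumption (the type returned by `make_score_matrix` is not stated here).

```python
def sketch_score_matrix(alphabet, match_score, mismatch_score):
    """Hypothetical stand-in: square matrix indexed by alphabet position."""
    n = len(alphabet)
    # Diagonal cells pair a symbol with itself, so they get the match score.
    return [
        [match_score if i == j else mismatch_score for j in range(n)]
        for i in range(n)
    ]


matrix = sketch_score_matrix("ACGT", match_score=2, mismatch_score=-1)
for row in matrix:
    print(row)
```

Row i, column j holds the score for aligning the i-th alphabet symbol with the j-th.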
encode(seq, alphabet)
Encodes a sequence into a byte array using the provided alphabet.
Example:
from seq_smith import encode
alphabet = "ACGT"
encoded_seq = encode("AGCT", alphabet)
print(encoded_seq)
# Output: b'\x00\x02\x01\x03'
decode(encoded_seq, alphabet)
Decodes a byte-encoded sequence back to a string using the provided alphabet.
Example:
from seq_smith import decode
alphabet = "ACGT"
decoded_seq = decode(b'\x00\x02\x01\x03', alphabet)
print(decoded_seq)
# Output: AGCT
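The encoding shown above maps each character to its index in the alphabet. A pure-Python sketch of that mapping follows; `sketch_encode` and `sketch_decode` are hypothetical names used for illustration, not the library's code.

```python
def sketch_encode(seq, alphabet):
    """Map each character to its position in the alphabet, packed as bytes."""
    index = {ch: i for i, ch in enumerate(alphabet)}
    return bytes(index[ch] for ch in seq)


def sketch_decode(encoded, alphabet):
    """Invert sketch_encode: each byte value is an index into the alphabet."""
    return "".join(alphabet[b] for b in encoded)


encoded = sketch_encode("AGCT", "ACGT")
print(encoded)                         # b'\x00\x02\x01\x03'
print(sketch_decode(encoded, "ACGT"))  # AGCT
```

Because the bytes are alphabet indices, they can be used directly as row and column indices into a scoring matrix built from the same alphabet.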
generate_cigar(alignment)
Generates a CIGAR string from an Alignment object.
Example:
from seq_smith import global_align, make_score_matrix, encode, generate_cigar
alphabet = "ACGT"
score_matrix = make_score_matrix(alphabet, match_score=1, mismatch_score=-1)
seqa = encode("ACTTTTGT", alphabet)
seqb = encode("AGCT", alphabet)
alignment = global_align(seqa, seqb, score_matrix, -2, -1)
cigar_string = generate_cigar(alignment)
print(cigar_string)
# Expected output: 1M4D3M
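A CIGAR string is a run-length encoding of per-column alignment operations. The sketch below derives it from two gapped strings rather than from an Alignment object, whose internal layout is not described here. The function names are hypothetical, and treating a residue present only in the first sequence as `D` is an assumption inferred from the expected output above.

```python
from itertools import groupby


def sketch_ops(gapped_a, gapped_b):
    """One operation per alignment column (assumed convention, see lead-in)."""
    ops = []
    for x, y in zip(gapped_a, gapped_b):
        if y == "-":
            ops.append("D")  # residue only in the first sequence
        elif x == "-":
            ops.append("I")  # residue only in the second sequence
        else:
            ops.append("M")  # aligned pair, match or mismatch
    return "".join(ops)


def sketch_cigar(ops):
    """Collapse runs of identical operations into <count><op> pairs."""
    return "".join(f"{len(list(run))}{op}" for op, run in groupby(ops))


cigar = sketch_cigar(sketch_ops("ACTTTTGT", "A----GCT"))
print(cigar)  # 1M4D3M
```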
Alignment Types
Global Alignment (global_align)
Performs a global alignment between two sequences using the Needleman-Wunsch algorithm. This alignment type attempts to align every residue in both sequences.
Example:
from seq_smith import global_align, make_score_matrix, encode
alphabet = "ACGT"
score_matrix = make_score_matrix(alphabet, 2, -1)
seqa = encode("GATTACA", alphabet)
seqb = encode("GCATGCA", alphabet)
alignment = global_align(seqa, seqb, score_matrix, -2, -1)
# Expected score: 5 (the ungapped alignment has 4 matches and 3 mismatches: 4 * 2 - 3 = 5)
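For readers who want to see the recurrence behind a global alignment score, here is a pure-Python sketch of the dynamic program with affine gaps. It is an illustration, not seq-smith's Rust implementation. It works on plain strings with scalar match/mismatch scores, and it assumes a gap of length k costs `gap_open + k * gap_extend`; the library's exact gap convention is not stated here and may differ.

```python
def sketch_global_score(a, b, match, mismatch, gap_open, gap_extend):
    """Global alignment score with affine gaps (assumed convention, see lead-in)."""
    neg = float("-inf")
    n, m = len(a), len(b)
    H = [[neg] * (m + 1) for _ in range(n + 1)]  # best score for prefixes a[:i], b[:j]
    E = [[neg] * (m + 1) for _ in range(n + 1)]  # best score ending with a gap in a
    F = [[neg] * (m + 1) for _ in range(n + 1)]  # best score ending with a gap in b
    H[0][0] = 0
    # Global mode: leading gaps are penalized along both borders.
    for j in range(1, m + 1):
        H[0][j] = gap_open + j * gap_extend
    for i in range(1, n + 1):
        H[i][0] = gap_open + i * gap_extend
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            E[i][j] = max(H[i][j - 1] + gap_open + gap_extend, E[i][j - 1] + gap_extend)
            F[i][j] = max(H[i - 1][j] + gap_open + gap_extend, F[i - 1][j] + gap_extend)
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(H[i - 1][j - 1] + s, E[i][j], F[i][j])
    # Global mode: the alignment must consume both sequences completely.
    return H[n][m]


score = sketch_global_score("ACGT", "AGT", 2, -1, -2, -1)
print(score)  # 3: three matches (6) minus one gap of length 1 (3)
```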
Local Alignment (local_align)
Performs a local alignment between two sequences using the Smith-Waterman algorithm. This alignment type finds the best-scoring local region of similarity between the two sequences.
Example:
from seq_smith import local_align, make_score_matrix, encode
alphabet = "ACGTXYZW"
score_matrix = make_score_matrix(alphabet, 2, -1)
seqa = encode("XXXXXAGCTYYYYY", alphabet)
seqb = encode("ZZZAGCTWWW", alphabet)
alignment = local_align(seqa, seqb, score_matrix, -2, -1)
# Expected score: 8 (for "AGCT")
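The local variant differs from the global recurrence in two places: cell scores are floored at zero, and the result is the best cell anywhere in the matrix. The sketch below shows this in pure Python on plain strings. It is not the library's code, and it assumes a gap of length k costs `gap_open + k * gap_extend`.

```python
def sketch_local_score(a, b, match, mismatch, gap_open, gap_extend):
    """Local alignment score with affine gaps (assumed convention, see lead-in)."""
    neg = float("-inf")
    n, m = len(a), len(b)
    H = [[0] * (m + 1) for _ in range(n + 1)]    # borders stay 0: start anywhere
    E = [[neg] * (m + 1) for _ in range(n + 1)]  # best score ending with a gap in a
    F = [[neg] * (m + 1) for _ in range(n + 1)]  # best score ending with a gap in b
    best = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            E[i][j] = max(H[i][j - 1] + gap_open + gap_extend, E[i][j - 1] + gap_extend)
            F[i][j] = max(H[i - 1][j] + gap_open + gap_extend, F[i - 1][j] + gap_extend)
            s = match if a[i - 1] == b[j - 1] else mismatch
            # The 0 lets an alignment restart here instead of carrying a negative score.
            H[i][j] = max(0, H[i - 1][j - 1] + s, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best


score = sketch_local_score("XXXXXAGCTYYYYY", "ZZZAGCTWWW", 2, -1, -2, -1)
print(score)  # 8: the shared "AGCT" contributes four matches
```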
Local-Global Alignment (local_global_align)
Aligns all of seqb (globally) against the best-matching region of seqa (locally): every residue of seqb must be aligned, while the unaligned ends of seqa are not penalized.
Example:
from seq_smith import local_global_align, make_score_matrix, encode
alphabet = "ACGTX"
score_matrix = make_score_matrix(alphabet, 2, -1)
seqa = encode("XACGTX", alphabet)
seqb = encode("ACGT", alphabet)
alignment = local_global_align(seqa, seqb, score_matrix, -2, -1)
# Expected score: 8
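In dynamic-programming terms, this mode can be expressed purely through boundary conditions: skipping a prefix of seqa is free, leading gaps against seqb are penalized, and the result is read from the column where seqb is fully consumed. The pure-Python sketch below illustrates that reading. It is not the library's code, and it assumes a gap of length k costs `gap_open + k * gap_extend`.

```python
def sketch_local_global_score(a, b, match, mismatch, gap_open, gap_extend):
    """Score with a aligned locally and b globally (assumptions in lead-in)."""
    neg = float("-inf")
    n, m = len(a), len(b)
    H = [[neg] * (m + 1) for _ in range(n + 1)]
    E = [[neg] * (m + 1) for _ in range(n + 1)]  # best score ending with a gap in a
    F = [[neg] * (m + 1) for _ in range(n + 1)]  # best score ending with a gap in b
    for i in range(n + 1):
        H[i][0] = 0  # skipping a prefix of a is free
    for j in range(1, m + 1):
        H[0][j] = gap_open + j * gap_extend  # b must be aligned from its first residue
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            E[i][j] = max(H[i][j - 1] + gap_open + gap_extend, E[i][j - 1] + gap_extend)
            F[i][j] = max(H[i - 1][j] + gap_open + gap_extend, F[i - 1][j] + gap_extend)
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(H[i - 1][j - 1] + s, E[i][j], F[i][j])
    # b is fully consumed in the last column; any unaligned suffix of a is free.
    return max(H[i][m] for i in range(n + 1))


score = sketch_local_global_score("XACGTX", "ACGT", 2, -1, -2, -1)
print(score)  # 8: "ACGT" matches inside seqa, the flanking X residues cost nothing
```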
Overlap Alignment (overlap_align)
Performs an overlap alignment between two sequences. This alignment type does not penalize gaps at the start or end of either sequence, making it suitable for finding overlaps between sequences, such as in sequence assembly.
Example:
from seq_smith import overlap_align, make_score_matrix, encode
alphabet = "ACGT"
score_matrix = make_score_matrix(alphabet, 2, -1)
seqa = encode("ACGTACGT", alphabet)
seqb = encode("CGTA", alphabet)
alignment = overlap_align(seqa, seqb, score_matrix, -2, -1)
# Expected score: 8
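Free end gaps can be expressed in the dynamic program by zeroing both borders and taking the best score from the last row or last column. The pure-Python sketch below shows one way to do that. It is an illustration under assumptions, not seq-smith's implementation: it works on plain strings and assumes an internal gap of length k costs `gap_open + k * gap_extend`.

```python
def sketch_overlap_score(a, b, match, mismatch, gap_open, gap_extend):
    """Overlap alignment score: end gaps in either sequence are free."""
    neg = float("-inf")
    n, m = len(a), len(b)
    H = [[neg] * (m + 1) for _ in range(n + 1)]
    E = [[neg] * (m + 1) for _ in range(n + 1)]  # best score ending with a gap in a
    F = [[neg] * (m + 1) for _ in range(n + 1)]  # best score ending with a gap in b
    for i in range(n + 1):
        H[i][0] = 0  # free leading gap: skip a prefix of a
    for j in range(m + 1):
        H[0][j] = 0  # free leading gap: skip a prefix of b
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            E[i][j] = max(H[i][j - 1] + gap_open + gap_extend, E[i][j - 1] + gap_extend)
            F[i][j] = max(H[i - 1][j] + gap_open + gap_extend, F[i - 1][j] + gap_extend)
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(H[i - 1][j - 1] + s, E[i][j], F[i][j])
    # Free trailing gaps: stop once either sequence has been fully consumed.
    return max(max(H[n]), max(H[i][m] for i in range(n + 1)))


score = sketch_overlap_score("ACGTACGT", "CGTA", 2, -1, -2, -1)
print(score)  # 8: "CGTA" matches positions 2-5 of seqa
```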