seq-smith

A Rust-based sequence alignment library for Python.

Installation

You can install seq-smith using pip:

pip install seq-smith

Usage

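seq-smith provides four alignment functions (global_align, local_align, local_global_align, and overlap_align) plus helper functions for building scoring matrices, encoding and decoding sequences, and generating CIGAR strings. Here's a basic example of how to perform a global alignment: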

from seq_smith import global_align, make_score_matrix, encode

# Define your alphabet
alphabet = "ACGT"

# Create a scoring matrix
score_matrix = make_score_matrix(alphabet, match_score=1, mismatch_score=-1)

# Encode sequences
seqa = encode("ACGT", alphabet)
seqb = encode("AGCT", alphabet)

# Define gap penalties
gap_open = -2
gap_extend = -1

# Perform the alignment
alignment = global_align(seqa, seqb, score_matrix, gap_open, gap_extend)

# Print the alignment score
print(f"Alignment score: {alignment.score}")

# Print the alignment fragments
for frag in alignment.align_frag:
    print(frag)

Helper Functions

make_score_matrix(alphabet, match_score, mismatch_score)

Creates a scoring matrix for a given alphabet.

Example:

from seq_smith import make_score_matrix

alphabet = "ACGT"
score_matrix = make_score_matrix(alphabet, match_score=2, mismatch_score=-1)
print(score_matrix)
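
To make the shape of such a matrix concrete, here is a minimal pure-Python sketch of a match/mismatch matrix. It assumes the matrix is indexed by alphabet position, with match_score on the diagonal and mismatch_score elsewhere. The name sketch_score_matrix is hypothetical, and the object actually returned by make_score_matrix may be of a different type.

```python
def sketch_score_matrix(alphabet, match_score, mismatch_score):
    """Square matrix: entry [i][j] scores alphabet[i] against alphabet[j]."""
    return [
        [match_score if a == b else mismatch_score for b in alphabet]
        for a in alphabet
    ]

# Illustration only: rows and columns follow the order of the alphabet.
for row in sketch_score_matrix("ACGT", 2, -1):
    print(row)
```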

encode(seq, alphabet)

Encodes a sequence as bytes, mapping each character to its index in the provided alphabet.

Example:

from seq_smith import encode

alphabet = "ACGT"
encoded_seq = encode("AGCT", alphabet)
print(encoded_seq)
# Output: b'\x00\x02\x01\x03'

decode(encoded_seq, alphabet)

Decodes a byte-encoded sequence back to a string using the provided alphabet.

Example:

from seq_smith import decode

alphabet = "ACGT"
decoded_seq = decode(b'\x00\x02\x01\x03', alphabet)
print(decoded_seq)
# Output: AGCT
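
The outputs shown for encode and decode are consistent with a plain index lookup into the alphabet. The following sketch reproduces that mapping in pure Python. The names sketch_encode and sketch_decode are hypothetical, and the library's handling of characters outside the alphabet is not covered here.

```python
def sketch_encode(seq, alphabet):
    """Map each character to its position in the alphabet."""
    index = {ch: i for i, ch in enumerate(alphabet)}
    return bytes(index[ch] for ch in seq)

def sketch_decode(encoded, alphabet):
    """Map each byte value back to the character at that position."""
    return "".join(alphabet[i] for i in encoded)

alphabet = "ACGT"
encoded = sketch_encode("AGCT", alphabet)
print(encoded)                           # b'\x00\x02\x01\x03'
print(sketch_decode(encoded, alphabet))  # AGCT
```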

generate_cigar(alignment)

Generates a CIGAR string from an Alignment object.

Example:

from seq_smith import global_align, make_score_matrix, encode, generate_cigar

alphabet = "ACGT"
score_matrix = make_score_matrix(alphabet, match_score=1, mismatch_score=-1)
seqa = encode("ACTTTTGT", alphabet)
seqb = encode("AGCT", alphabet)
alignment = global_align(seqa, seqb, score_matrix, -2, -1)

cigar_string = generate_cigar(alignment)
print(cigar_string)
# Expected output: 1M4D3M
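
A CIGAR string is a run-length encoding of per-column alignment operations. The sketch below shows only that encoding step. It assumes the operations are already available as a string with one letter per alignment column, which is a hypothetical input format: generate_cigar itself takes an Alignment object, whose internals are not described here.

```python
from itertools import groupby

def sketch_cigar(ops):
    """Collapse runs of identical operations into <count><op> pairs."""
    return "".join(f"{len(list(group))}{op}" for op, group in groupby(ops))

# One match column, four deletion columns, three match columns.
print(sketch_cigar("MDDDDMMM"))  # 1M4D3M
```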

Alignment Types

Global Alignment (global_align)

Performs a global alignment between two sequences using the Needleman-Wunsch algorithm. This alignment type attempts to align every residue in both sequences.

Example:

from seq_smith import global_align, make_score_matrix, encode

alphabet = "ACGT"  # list each symbol once
score_matrix = make_score_matrix(alphabet, 2, -1)
seqa = encode("GATTACA", alphabet)
seqb = encode("GCATGCA", alphabet)

alignment = global_align(seqa, seqb, score_matrix, -2, -1)
# The ungapped alignment alone scores 5 (4 matches, 3 mismatches), so expect a score of at least 5
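
For readers who want to see the recurrence behind a global alignment with separate gap-open and gap-extend penalties, here is a score-only sketch in pure Python. It is not the library's implementation. It assumes that a gap of length k costs gap_open + (k - 1) * gap_extend, which may differ from seq-smith's convention, and the name sketch_global_score is hypothetical.

```python
NEG_INF = float("-inf")

def sketch_global_score(a, b, match=2, mismatch=-1, gap_open=-2, gap_extend=-1):
    """Optimal global alignment score with affine gaps (no traceback).

    Assumption: a gap of length k costs gap_open + (k - 1) * gap_extend.
    """
    n, m = len(a), len(b)
    # M: last column pairs a[i-1] with b[j-1]
    # X: last column pairs a[i-1] with a gap
    # Y: last column pairs b[j-1] with a gap
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    # Global alignment: leading gaps are charged like any other gap.
    for i in range(1, n + 1):
        X[i][0] = gap_open + (i - 1) * gap_extend
    for j in range(1, m + 1):
        Y[0][j] = gap_open + (j - 1) * gap_extend
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1]) + s
            X[i][j] = max(M[i - 1][j] + gap_open,
                          Y[i - 1][j] + gap_open,
                          X[i - 1][j] + gap_extend)
            Y[i][j] = max(M[i][j - 1] + gap_open,
                          X[i][j - 1] + gap_open,
                          Y[i][j - 1] + gap_extend)
    # Both sequences must be fully consumed, so read the bottom-right cell.
    return max(M[n][m], X[n][m], Y[n][m])

print(sketch_global_score("GATTACA", "GCATGCA"))  # 5
```

Under these assumptions the sketch returns 5 for this pair: the ungapped alignment has 4 matches and 3 mismatches, and no gapped alignment scores higher.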

Local Alignment (local_align)

Performs a local alignment between two sequences using the Smith-Waterman algorithm. This alignment type finds the best-scoring local region of similarity between the two sequences.

Example:

from seq_smith import local_align, make_score_matrix, encode

alphabet = "ACGTXYZW"
score_matrix = make_score_matrix(alphabet, 2, -1)
seqa = encode("XXXXXAGCTYYYYY", alphabet)
seqb = encode("ZZZAGCTWWW", alphabet)

alignment = local_align(seqa, seqb, score_matrix, -2, -1)
# Expected score: 8 (for "AGCT")
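
What distinguishes local from global alignment is that every cell is clamped at zero and the result is the best cell anywhere in the table. The sketch below illustrates this with a single linear gap penalty, a simplification of the gap-open/gap-extend scheme used above. The name sketch_local_score and its gap parameter are hypothetical.

```python
def sketch_local_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score with a linear gap penalty (no traceback)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            # The 0 lets an alignment restart wherever the score would go negative.
            curr[j] = max(0, prev[j - 1] + s, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best

print(sketch_local_score("XXXXXAGCTYYYYY", "ZZZAGCTWWW"))  # 8
```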

Local-Global Alignment (local_global_align)

Aligns all of seqb (globally) to the best-matching region of seqa (locally): every residue of seqb must be aligned, while unaligned residues at either end of seqa are not penalized.

Example:

from seq_smith import local_global_align, make_score_matrix, encode

alphabet = "ACGTX"
score_matrix = make_score_matrix(alphabet, 2, -1)
seqa = encode("XACGTX", alphabet)
seqb = encode("ACGT", alphabet)

alignment = local_global_align(seqa, seqb, score_matrix, -2, -1)
# Expected score: 8
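
One way to picture this mode is as a dynamic program in which seqb must be consumed completely while the alignment may begin and end anywhere in seqa. The sketch below uses a single linear gap penalty for brevity, which is a simplification. The name sketch_local_global_score and its gap parameter are hypothetical and not part of the library.

```python
def sketch_local_global_score(a, b, match=2, mismatch=-1, gap=-2):
    """Score of aligning all of b to the best region of a (linear gaps)."""
    n, m = len(a), len(b)
    # Row 0: nothing of b consumed yet, so starting anywhere in a is free.
    prev = [0] * (n + 1)
    for i in range(1, m + 1):
        # Column 0: b[:i] aligned entirely against gaps.
        curr = [i * gap] + [0] * n
        for j in range(1, n + 1):
            s = match if a[j - 1] == b[i - 1] else mismatch
            curr[j] = max(prev[j - 1] + s, prev[j] + gap, curr[j - 1] + gap)
        prev = curr
    # All of b is consumed in the last row; ending anywhere in a is free.
    return max(prev)

print(sketch_local_global_score("XACGTX", "ACGT"))  # 8
```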

Overlap Alignment (overlap_align)

Performs an overlap alignment between two sequences. This alignment type does not penalize gaps at the start or end of either sequence, making it suitable for finding overlaps between sequences, such as in sequence assembly.

Example:

from seq_smith import overlap_align, make_score_matrix, encode

alphabet = "ACGT"
score_matrix = make_score_matrix(alphabet, 2, -1)
seqa = encode("ACGTACGT", alphabet)
seqb = encode("CGTA", alphabet)

alignment = overlap_align(seqa, seqb, score_matrix, -2, -1)
# Expected score: 8
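
The free end gaps described above can be expressed as two changes to a global alignment table: the first row and first column start at zero, and the result is the best value in the last row or last column. The sketch below shows this with a single linear gap penalty, a simplification. The name sketch_overlap_score and its gap parameter are hypothetical.

```python
def sketch_overlap_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best overlap alignment score with free end gaps (linear gaps)."""
    n, m = len(a), len(b)
    prev = [0] * (m + 1)          # skipping a prefix of b is free
    best = float("-inf")
    for i in range(1, n + 1):
        curr = [0] * (m + 1)      # curr[0] = 0: skipping a prefix of a is free
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(prev[j - 1] + s, prev[j] + gap, curr[j - 1] + gap)
        best = max(best, curr[m])  # b is exhausted: the rest of a is free
        prev = curr
    # Last row: a is exhausted, so the rest of b is free.
    return max(best, max(prev))

print(sketch_overlap_score("ACGTACGT", "CGTA"))  # 8
```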
