Python wrapper for the Kalign multiple sequence alignment engine

Kalign Python Package

Python bindings for Kalign, a fast multiple sequence alignment program for biological sequences (DNA, RNA, protein).

Installation

pip install kalign

Optional dependencies for ecosystem integration:

pip install "kalign[biopython]"    # Biopython integration (fmt="biopython", I/O helpers)
pip install "kalign[skbio]"        # scikit-bio integration (fmt="skbio")
pip install "kalign[io]"           # I/O helpers (requires Biopython)
pip install "kalign[analysis]"     # pandas + matplotlib for downstream analysis
pip install "kalign[all]"          # all of the above (quotes keep shells such as zsh from expanding the brackets)

Quick Start

import kalign

sequences = [
    "ATCGATCGATCG",
    "ATCGTCGATCG",
    "ATCGATCATCG"
]

aligned = kalign.align(sequences, seq_type="dna")
for seq in aligned:
    print(seq)
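
A quick sanity check on the result can be sketched in plain Python. The helper below is hypothetical (it is not part of kalign) and assumes that gaps are written as "-" and that the output keeps the input order and letter case. The aligned rows in the demo are hand-written so the snippet runs without the package installed.

```python
def check_alignment(original, aligned, gap="-"):
    """Return True if `aligned` looks like a valid alignment of `original`."""
    if len(original) != len(aligned):
        return False
    # Every row of an alignment has the same number of columns.
    if len({len(row) for row in aligned}) > 1:
        return False
    # Stripping the gaps must give back the input sequences, in order.
    return all(row.replace(gap, "") == seq for seq, row in zip(original, aligned))


sequences = ["ATCGATCGATCG", "ATCGTCGATCG", "ATCGATCATCG"]
# Hand-written example rows; real rows would come from kalign.align(sequences).
example_rows = ["ATCGATCGATCG", "ATCG-TCGATCG", "ATCGATC-ATCG"]
print(check_alignment(sequences, example_rows))
```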

Core API

kalign.align()

aligned = kalign.align(
    sequences,              # list of str
    seq_type="auto",        # "auto", "dna", "rna", "protein", "divergent", "internal"
    gap_open=None,          # positive float, or None for defaults
    gap_extend=None,        # positive float, or None for defaults
    terminal_gap_extend=None,
    n_threads=None,         # int, or None for global default
    fmt="plain",            # "plain", "biopython", "skbio"
    ids=None,               # list of str (for biopython/skbio output)
)

Returns a list of aligned strings (default), a Bio.Align.MultipleSeqAlignment (fmt="biopython"), or a skbio.TabularMSA (fmt="skbio").

kalign.align_from_file()

Align sequences directly from a FASTA, MSF, or Clustal file:

result = kalign.align_from_file("sequences.fasta", seq_type="protein")
for name, seq in zip(result.names, result.sequences):
    print(f"{name}: {seq}")

Returns an AlignedSequences named tuple with .names and .sequences.
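
Because the result is a named tuple, its two fields can be combined with ordinary Python tools. The sketch below uses a stand-in type with only the two fields named above (the real class may carry more) and hand-written data in place of a real file.

```python
from collections import namedtuple

# Stand-in type: only the two fields named above are assumed to exist.
AlignedSequences = namedtuple("AlignedSequences", ["names", "sequences"])

# Hand-written example data in place of a real kalign.align_from_file() call.
result = AlignedSequences(
    names=["s1", "s2"],
    sequences=["ATCG-TCG", "ATCGATCG"],
)

# Look rows up by name instead of by position.
by_name = dict(zip(result.names, result.sequences))
print(by_name["s1"])
```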

kalign.compare()

Score a test alignment against a reference using the Sum-of-Pairs (SP) score:

score = kalign.compare("reference.msf", "test.fasta")
print(f"SP score: {score:.1f}")  # 0 (no match) to 100 (identical)
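
To make the 0 to 100 scale concrete, here is a minimal pure-Python sketch of one common sum-of-pairs definition: the percentage of residue pairs aligned in the reference that are also aligned in the test alignment. It is an illustration only. The function names are hypothetical, and the exact scoring rules used by kalign.compare() may differ.

```python
from itertools import combinations


def aligned_pairs(rows, gap="-"):
    """Collect (seq_i, residue_i, seq_j, residue_j) for residues sharing a column."""
    pairs = set()
    counters = [0] * len(rows)  # index of the next residue in each sequence
    for column in zip(*rows):
        present = []
        for seq_idx, char in enumerate(column):
            if char != gap:
                present.append((seq_idx, counters[seq_idx]))
                counters[seq_idx] += 1
        for (i, a), (j, b) in combinations(present, 2):
            pairs.add((i, a, j, b))
    return pairs


def sp_score(reference, test):
    """Percentage of reference residue pairs that the test alignment reproduces."""
    ref = aligned_pairs(reference)
    if not ref:
        return 0.0
    return 100.0 * len(ref & aligned_pairs(test)) / len(ref)


reference = ["AC-GT", "ACTGT"]
shifted = ["ACG-T", "ACTGT"]  # the gap is one column later than in the reference
print(sp_score(reference, reference))  # 100.0
print(sp_score(reference, shifted))    # 75.0
```

In the shifted example, three of the four reference residue pairs are still aligned, which gives 75.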

kalign.write_alignment()

Write aligned sequences to a file:

kalign.write_alignment(aligned, "output.fasta", format="fasta", ids=ids)

Supported formats: fasta, clustal, stockholm, phylip (non-FASTA formats require Biopython).
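
For readers unfamiliar with the FASTA layout, the following sketch shows roughly what a FASTA writer produces. It is a stand-alone illustration, not the code behind kalign.write_alignment(); the function name, the placeholder ids, and the line width of 60 are assumptions.

```python
import io


def format_fasta(aligned, ids=None, width=60):
    """Render aligned rows as FASTA text, wrapping sequence lines at `width`."""
    if ids is None:
        # Placeholder names, used only when the caller supplies none.
        ids = [f"seq{i + 1}" for i in range(len(aligned))]
    if len(ids) != len(aligned):
        raise ValueError("ids and aligned must have the same length")
    out = io.StringIO()
    for name, row in zip(ids, aligned):
        out.write(f">{name}\n")
        for start in range(0, len(row), width):
            out.write(row[start:start + width] + "\n")
    return out.getvalue()


fasta_text = format_fasta(["ATCG-TCG", "ATCGATCG"], ids=["s1", "s2"], width=4)
print(fasta_text, end="")
```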

Threading

import kalign

kalign.set_num_threads(4)        # set global default
n = kalign.get_num_threads()     # query current default

# or override per call
aligned = kalign.align(sequences, n_threads=8)

The default set with set_num_threads() is thread-local, so different threads can each use their own default.
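
What "thread-local" means in practice can be shown with the standard library alone. The sketch below is not kalign's implementation; the helper names and the fallback value of 1 are hypothetical. It only demonstrates that a value set in one thread is invisible to the others.

```python
import threading

_settings = threading.local()  # each thread sees its own attributes
_FALLBACK_THREADS = 1          # hypothetical value used when nothing was set


def set_default_threads(n):
    if n < 1:
        raise ValueError("n must be at least 1")
    _settings.n_threads = n


def get_default_threads():
    return getattr(_settings, "n_threads", _FALLBACK_THREADS)


def worker(n, results, key):
    set_default_threads(n)               # affects only the calling thread
    results[key] = get_default_threads()


results = {}
threads = [
    threading.Thread(target=worker, args=(n, results, f"t{n}")) for n in (2, 8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The main thread never set a value, so it still sees the fallback.
results["main"] = get_default_threads()
print(results)
```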

Utilities (kalign.utils)

Requires only NumPy (installed automatically):

import kalign

aligned = kalign.align(sequences)

arr = kalign.utils.to_array(aligned)                          # numpy array
stats = kalign.utils.alignment_stats(aligned)                 # dict with gap_fraction, conservation, identity
consensus = kalign.utils.consensus_sequence(aligned, threshold=0.7)
matrix = kalign.utils.pairwise_identity_matrix(aligned)       # numpy array
trimmed = kalign.utils.remove_gap_columns(aligned)
region = kalign.utils.trim_alignment(aligned, start=2, end=10)
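
As a rough picture of what two of these helpers compute, here is a pure-Python sketch. The function names are hypothetical, and several details are assumptions rather than documented behaviour of kalign.utils: the threshold is compared against the fraction of all rows (gaps included), positions below the threshold are written as "X", and only columns made up entirely of gaps are removed.

```python
from collections import Counter


def consensus_sketch(aligned, threshold=0.7, gap="-", unknown="X"):
    """Most common residue per column, or `unknown` if it is too rare."""
    consensus = []
    for column in zip(*aligned):
        residues = [c for c in column if c != gap]
        if not residues:
            consensus.append(gap)  # all-gap column
            continue
        char, count = Counter(residues).most_common(1)[0]
        # Assumption: frequency is measured against all rows, gaps included.
        consensus.append(char if count / len(column) >= threshold else unknown)
    return "".join(consensus)


def remove_gap_columns_sketch(aligned, gap="-"):
    """Drop columns that consist only of gaps."""
    kept = [col for col in zip(*aligned) if any(c != gap for c in col)]
    if not kept:
        return ["" for _ in aligned]
    return ["".join(chars) for chars in zip(*kept)]


rows = ["AC-GT", "AC-GA", "AT-GT"]
print(consensus_sketch(rows, threshold=0.6))  # AC-GT
print(consensus_sketch(rows, threshold=0.7))  # AX-GX
print(remove_gap_columns_sketch(rows))        # ['ACGT', 'ACGA', 'ATGT']
```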

Biopython Integration

Requires pip install kalign[biopython].

import kalign

# Return a Biopython MultipleSeqAlignment
aln = kalign.align(sequences, fmt="biopython", ids=["s1", "s2", "s3"])
print(aln.get_alignment_length())

# Write in various formats via Biopython
from Bio import AlignIO
AlignIO.write(aln, "output.clustal", "clustal")

I/O helpers (kalign.io)

sequences = kalign.io.read_fasta("input.fasta")
sequences, ids = kalign.io.read_sequences("input.fasta")

aligned = kalign.align(sequences)
kalign.io.write_fasta(aligned, "output.fasta", ids=ids)
kalign.io.write_clustal(aligned, "output.aln", ids=ids)
kalign.io.write_stockholm(aligned, "output.sto", ids=ids)
kalign.io.write_phylip(aligned, "output.phy", ids=ids)
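
When Biopython is not installed, a small hand-rolled FASTA reader can stand in for the read helpers. This is a minimal sketch under stated assumptions: the function name is hypothetical, the id is taken to be the first word of the header line, and the return order (sequences, ids) mirrors the read_sequences() call above.

```python
def parse_fasta(text):
    """Parse FASTA text into (sequences, ids)."""
    sequences, ids, chunks = [], [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if ids:
                # Close the previous record before starting a new one.
                sequences.append("".join(chunks))
            header = line[1:].split()
            ids.append(header[0] if header else "")
            chunks = []
        elif ids:
            # Sequence data may span several lines; lines before the
            # first header are ignored.
            chunks.append(line)
    if ids:
        sequences.append("".join(chunks))
    return sequences, ids


example = ">s1 first record\nATCG\nATCG\n>s2\nATCGTCG\n"
sequences, ids = parse_fasta(example)
print(ids, sequences)
```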

scikit-bio Integration

Requires pip install kalign[skbio].

import kalign

# Returns a TabularMSA of DNA, RNA, or Protein depending on seq_type
aln = kalign.align(sequences, seq_type="dna", fmt="skbio")
print(type(aln))  # <class 'skbio.alignment._tabular_msa.TabularMSA'>

Sequence Types

String       Constant                   Description
"auto"       kalign.AUTO                Auto-detect (default)
"dna"        kalign.DNA                 DNA sequences
"rna"        kalign.RNA                 RNA sequences
"protein"    kalign.PROTEIN             Protein sequences
"divergent"  kalign.PROTEIN_DIVERGENT   Divergent protein sequences
"internal"   kalign.DNA_INTERNAL        DNA with internal gap preference

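To give a feel for what auto-detection has to decide, here is a deliberately naive alphabet check. It is not Kalign's detection logic, which is not described here. The function name and the letter sets are assumptions made for illustration.

```python
DNA_LETTERS = set("ACGTN")
RNA_LETTERS = set("ACGUN")


def guess_seq_type(sequences, gap="-"):
    """Guess "dna", "rna" or "protein" from the letters that occur."""
    letters = {c for seq in sequences for c in seq.upper() if c != gap}
    if not letters:
        raise ValueError("no residues to inspect")
    if letters <= DNA_LETTERS:
        return "dna"
    if letters <= RNA_LETTERS:
        return "rna"
    return "protein"


print(guess_seq_type(["ATCGATCG", "ATCG-TCG"]))  # dna
print(guess_seq_type(["AUCGAUCG"]))              # rna
print(guess_seq_type(["MKVLAAGI"]))              # protein
```

A check like this misclassifies short peptides that happen to use only A, C, G and T, which is one reason to pass seq_type explicitly when the type is known.
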
Command-line Interface

kalign-py -i sequences.fasta -o aligned.fasta --format fasta --type protein
kalign-py -i sequences.fasta -o - --format clustal   # stdout
cat input.fa | kalign-py -i - -o aligned.fasta        # stdin
kalign-py --version
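
The same command line can be driven from a Python script. The sketch below builds the argument list from the flags shown above (-i, -o, --format, --type) and runs it only if kalign-py is on PATH and the input file exists. The helper name build_kalign_cmd is hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def build_kalign_cmd(input_path, output_path, fmt=None, seq_type=None):
    """Build the kalign-py argument list using only the flags shown above."""
    cmd = ["kalign-py", "-i", str(input_path), "-o", str(output_path)]
    if fmt is not None:
        cmd += ["--format", fmt]
    if seq_type is not None:
        cmd += ["--type", seq_type]
    return cmd


cmd = build_kalign_cmd(
    "sequences.fasta", "aligned.fasta", fmt="fasta", seq_type="protein"
)
print(" ".join(cmd))

# Run only when the executable is installed and the input file is present.
if shutil.which("kalign-py") and Path("sequences.fasta").exists():
    subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.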

Development

git clone https://github.com/TimoLassmann/kalign.git
cd kalign
uv pip install -e .
uv run pytest tests/python/ -v

Requirements: Python 3.9+, CMake 3.18+, C++11 compiler, NumPy.

Citation

If you use Kalign in your research, please cite:

Lassmann, T. (2020). Kalign 3: multiple sequence alignment of large data sets. Bioinformatics, 36(6), 1928-1929. doi:10.1093/bioinformatics/btz795

License

GNU General Public License v3.0 or later. See COPYING.

Download files

Source Distribution

kalign_python-3.4.8.tar.gz (1.3 MB)

Built Distributions

kalign_python-3.4.8-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (299.7 kB): CPython 3.13; manylinux: glibc 2.24+ x86-64; manylinux: glibc 2.28+ x86-64
kalign_python-3.4.8-cp313-cp313-macosx_14_0_x86_64.whl (166.6 kB): CPython 3.13; macOS 14.0+ x86-64
kalign_python-3.4.8-cp313-cp313-macosx_14_0_arm64.whl (400.5 kB): CPython 3.13; macOS 14.0+ ARM64
kalign_python-3.4.8-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (298.6 kB): CPython 3.12; manylinux: glibc 2.24+ x86-64; manylinux: glibc 2.28+ x86-64
kalign_python-3.4.8-cp312-cp312-macosx_14_0_x86_64.whl (166.6 kB): CPython 3.12; macOS 14.0+ x86-64
kalign_python-3.4.8-cp312-cp312-macosx_14_0_arm64.whl (400.4 kB): CPython 3.12; macOS 14.0+ ARM64
kalign_python-3.4.8-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (297.4 kB): CPython 3.11; manylinux: glibc 2.24+ x86-64; manylinux: glibc 2.28+ x86-64
kalign_python-3.4.8-cp311-cp311-macosx_14_0_x86_64.whl (165.4 kB): CPython 3.11; macOS 14.0+ x86-64
kalign_python-3.4.8-cp311-cp311-macosx_14_0_arm64.whl (399.8 kB): CPython 3.11; macOS 14.0+ ARM64
kalign_python-3.4.8-cp310-cp310-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (295.7 kB): CPython 3.10; manylinux: glibc 2.24+ x86-64; manylinux: glibc 2.28+ x86-64
kalign_python-3.4.8-cp310-cp310-macosx_14_0_x86_64.whl (163.9 kB): CPython 3.10; macOS 14.0+ x86-64
kalign_python-3.4.8-cp310-cp310-macosx_14_0_arm64.whl (398.3 kB): CPython 3.10; macOS 14.0+ ARM64
kalign_python-3.4.8-cp39-cp39-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (295.9 kB): CPython 3.9; manylinux: glibc 2.24+ x86-64; manylinux: glibc 2.28+ x86-64
kalign_python-3.4.8-cp39-cp39-macosx_14_0_x86_64.whl (164.0 kB): CPython 3.9; macOS 14.0+ x86-64
kalign_python-3.4.8-cp39-cp39-macosx_14_0_arm64.whl (398.4 kB): CPython 3.9; macOS 14.0+ ARM64

File details

Details for the file kalign_python-3.4.8.tar.gz.

File metadata

  • Download URL: kalign_python-3.4.8.tar.gz
  • Upload date:
  • Size: 1.3 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for kalign_python-3.4.8.tar.gz
Algorithm Hash digest
SHA256 4a8be8a874ab2637aad1bac0a7026697cd71b8896cfc875cf3371f41626f1f7c
MD5 3898ee759dba64d018b0c1b74f8d35a4
BLAKE2b-256 d84a27c6f689673646c46b56d17267913153fa71276b0b5bb72cbfed061a9b4f

Provenance

The following attestation bundles were made for kalign_python-3.4.8.tar.gz:

Publisher: wheels.yml on TimoLassmann/kalign

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
