Global Sequence Alignment
Project description
Purpose
Performs global sequence alignment of two strings. Allows for affine gap penalties.
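To illustrate what global alignment with affine gap penalties computes, here is a minimal sketch of the classic three-matrix dynamic program (in the style of Gotoh's algorithm, listed in the references). It is not the globalign API: the function name `affine_global_score`, its parameters, and the default scores are all hypothetical choices made for this example. It assumes a scoring (maximizing) scheme in which a gap of length k scores `gap_open + k * gap_extend`.

```python
import math


def affine_global_score(a, b, match=1, mismatch=-1, gap_open=-2, gap_extend=-1):
    """Return the optimal global alignment score of strings a and b.

    Hypothetical helper for illustration only (not part of globalign).
    A gap of length k contributes gap_open + k * gap_extend to the score.
    """
    n, m = len(a), len(b)
    neg = -math.inf
    # M[i][j]: best score for a[:i] vs b[:j] ending with a[i-1] aligned to b[j-1].
    # X[i][j]: best score ending with a[i-1] aligned to a gap.
    # Y[i][j]: best score ending with b[j-1] aligned to a gap.
    M = [[neg] * (m + 1) for _ in range(n + 1)]
    X = [[neg] * (m + 1) for _ in range(n + 1)]
    Y = [[neg] * (m + 1) for _ in range(n + 1)]

    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = gap_open + i * gap_extend  # a[:i] aligned entirely to gaps
    for j in range(1, m + 1):
        Y[0][j] = gap_open + j * gap_extend  # b[:j] aligned entirely to gaps

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1]) + s
            # Extending an existing gap costs gap_extend only;
            # starting a new gap costs gap_open + gap_extend.
            X[i][j] = max(
                M[i - 1][j] + gap_open + gap_extend,
                X[i - 1][j] + gap_extend,
                Y[i - 1][j] + gap_open + gap_extend,
            )
            Y[i][j] = max(
                M[i][j - 1] + gap_open + gap_extend,
                Y[i][j - 1] + gap_extend,
                X[i][j - 1] + gap_open + gap_extend,
            )

    return max(M[n][m], X[n][m], Y[n][m])


print(affine_global_score("ACGT", "AT"))
```

With the default scores above, aligning `ACGT` to `AT` is best done with two matches and a single gap of length 2, which is why one long gap is preferred over two separate short ones. This sketch returns only the score and uses quadratic memory; several of the references describe traceback and more space-efficient variants.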
Installation
The package can be installed via pip, conda, or mamba.
pip
Run the following in a terminal.
python3 -m venv my_venv_for_globalign
source my_venv_for_globalign/bin/activate
pip install globalign
Documentation
Contributing
You will need quarto and quartodoc installed to publish documentation updates. To build the documentation locally before publishing it to GitHub, run the following from the project root:
quartodoc build
quarto render
If you are happy with the resulting website, push your changes to a branch other than gh-pages. Then, while still on a branch other than gh-pages, run:
quarto publish gh-pages
The public-facing website should now be updated.
To run the unit tests, make sure pytest and hatch are installed and available. Building also requires the hatch-vcs plugin. To build and test, run the following from the project root:
rm -rf dist/
hatch build
hatch test
To install from source, run the following from the project root after building:
pip install --editable .
Versions are set via git tags. For example, create and push a tag with the version you want:
git tag -a "v0.0.0"
git push origin v0.0.0
Acknowledgements
Special thanks go to Mykola Akulov and Ragnar Groot Koerkamp for their insightful blog post, without which I would not have known how to make this package work with both scoring and costing schemes.
References
- https://web.stanford.edu/class/cs262/archives/presentations/lecture3.pdf
- https://ocw.mit.edu/courses/6-096-algorithms-for-computational-biology-spring-2005/01f55f348ea1e95f7015bd1b40586012_lecture5.pdf
- Martin Mann, Mostafa M Mohamed, Syed M Ali, and Rolf Backofen. Interactive implementations of thermodynamics-based RNA structure and RNA-RNA interaction prediction approaches for example-driven teaching. PLOS Computational Biology, 14(8), e1006341, 2018.
- Martin Raden, Syed M Ali, Omer S Alkhnbashi, Anke Busch, Fabrizio Costa, Jason A Davis, Florian Eggenhofer, Rick Gelhausen, Jens Georg, Steffen Heyne, Michael Hiller, Kousik Kundu, Robert Kleinkauf, Steffen C Lott, Mostafa M Mohamed, Alexander Mattheis, Milad Miladi, Andreas S Richter, Sebastian Will, Joachim Wolff, Patrick R Wright, and Rolf Backofen. Freiburg RNA tools: a central online resource for RNA-focused research and teaching. Nucleic Acids Research, 46(W1), W25-W29, 2018.
- An improved algorithm for matching biological sequences. Osamu Gotoh. https://doi.org/10.1016/0022-2836(82)90398-9
- http://www.cs.cmu.edu/~durand/03-711/2017/Lectures/Sequence-Alignment-2017.pdf
- https://bioboot.github.io/bimm143_W20/class-material/nw/
- https://www.ncbi.nlm.nih.gov/CBBresearch/Przytycka/download/lectures/PCB_Lect02_Pairwise_allign.pdf
- https://ics.uci.edu/~xhx/courses/CS284A-F08/lectures/alignment.pdf
- https://link.springer.com/chapter/10.1007/978-3-319-90684-3_2
- Optimal sequence alignment using affine gap costs. https://link.springer.com/content/pdf/10.1007/BF02462326.pdf
- Optimal alignments in linear space. Eugene W. Myers, Webb Miller. https://doi.org/10.1093/bioinformatics/4.1.11
- Sequence alignment using FastLSA. https://webdocs.cs.ualberta.ca/~duane/publications/pdf/2000metmbs.pdf
- MASA: A Multiplatform Architecture for Sequence Aligners with Block Pruning. https://doi.org/10.1145/2858656
- https://community.gep.wustl.edu/repository/course_materials_WU/annotation/Introduction_Dynamic_Programming.pdf
- Optimal gap-affine alignment in O(s) space. https://doi.org/10.1093/bioinformatics/btad074
- Exact global alignment using A* with chaining seed heuristic and match pruning. https://doi.org/10.1093/bioinformatics/btae032
- Transforming match bonus into cost. https://curiouscoding.nl/posts/alignment-scores-transform/
- Improving the time and space complexity of the WFA algorithm and generalizing its scoring. https://doi.org/10.1101/2022.01.12.476087
- A* PA2: up to 20 times faster exact global alignment. https://doi.org/10.1101/2024.03.24.586481
- Notes on Dynamic-Programming Sequence Alignment. https://globin.bx.psu.edu/courses/fall2001/DP.pdf
- Lecture 6: Affine gap penalty function. https://www.cs.hunter.cuny.edu/~saad/courses/compbio/lectures/lecture6.pdf
Download files
File details
Details for the file globalign-0.1.15.tar.gz.
File metadata
- Download URL: globalign-0.1.15.tar.gz
- Upload date:
- Size: 26.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 882aeddf343bb15121ee7523b05c055a86a184318b8d07b0442311e3eb61c39b |
| MD5 | d1cf14a7766773ae3b7569cc84f3d036 |
| BLAKE2b-256 | 6149cc9f446836b3f3d90dbb2e4c883eda29b2a0e00e375ed23ca31a2e14d55a |
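A downloaded file can be checked against the SHA256 digest listed above. The following is a minimal sketch using Python's standard `hashlib` module; the helper name `sha256_of_file` and the assumption that the source distribution sits in the current directory are illustrative only.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Digest published for globalign-0.1.15.tar.gz in the table above.
EXPECTED_SHA256 = "882aeddf343bb15121ee7523b05c055a86a184318b8d07b0442311e3eb61c39b"

# Usage, assuming the file was downloaded to the current directory:
# assert sha256_of_file("globalign-0.1.15.tar.gz") == EXPECTED_SHA256
```

pip can also enforce digests at install time when a requirements file pins them with `--hash` entries.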
Provenance
The following attestation bundles were made for globalign-0.1.15.tar.gz:
Publisher: python_publish.yml on iamgiddyaboutgit/globalign
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: globalign-0.1.15.tar.gz
  - Subject digest: 882aeddf343bb15121ee7523b05c055a86a184318b8d07b0442311e3eb61c39b
  - Sigstore transparency entry: 245669587
  - Sigstore integration time:
- Permalink: iamgiddyaboutgit/globalign@c8dc2c94cefd4e1fc299aa542c678986f5f82ccf
- Branch / Tag: refs/tags/v0.1.15
- Owner: https://github.com/iamgiddyaboutgit
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python_publish.yml@c8dc2c94cefd4e1fc299aa542c678986f5f82ccf
- Trigger Event: release
File details
Details for the file globalign-0.1.15-py3-none-any.whl.
File metadata
- Download URL: globalign-0.1.15-py3-none-any.whl
- Upload date:
- Size: 21.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d61529e69060e7e16e9f6b69e57b6a55912d8af6eeb3c0d965a72f4bca11c333 |
| MD5 | b6639edf93aa72ee9be5185bbfe49a37 |
| BLAKE2b-256 | 54e822bcb75db90c8922c31e58d77e7f7ce8711e98d7ae200ba281d6c8a21b4e |
Provenance
The following attestation bundles were made for globalign-0.1.15-py3-none-any.whl:
Publisher: python_publish.yml on iamgiddyaboutgit/globalign
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: globalign-0.1.15-py3-none-any.whl
  - Subject digest: d61529e69060e7e16e9f6b69e57b6a55912d8af6eeb3c0d965a72f4bca11c333
  - Sigstore transparency entry: 245669589
  - Sigstore integration time:
- Permalink: iamgiddyaboutgit/globalign@c8dc2c94cefd4e1fc299aa542c678986f5f82ccf
- Branch / Tag: refs/tags/v0.1.15
- Owner: https://github.com/iamgiddyaboutgit
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python_publish.yml@c8dc2c94cefd4e1fc299aa542c678986f5f82ccf
- Trigger Event: release