Global Sequence Alignment

Purpose

Performs global sequence alignment of two strings. Allows for affine gap penalties.
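To illustrate what an affine gap penalty means, here is a minimal sketch of the classic three-matrix dynamic program (in the style of Gotoh, reference 5) that computes only the optimal global alignment score. It is not this package's implementation or API: the function name `affine_global_score`, its parameters, and the default scores are assumptions chosen for illustration. A gap of length k is scored as `gap_open + k * gap_extend`.

```python
import math


def affine_global_score(a, b, match=1, mismatch=-1, gap_open=-2, gap_extend=-1):
    """Return the best global alignment score of strings a and b.

    Hypothetical helper for illustration only (not part of globalign).
    A gap of length k scores gap_open + k * gap_extend.
    """
    neg_inf = -math.inf
    n, m = len(a), len(b)
    # M[i][j]: best score where a[i-1] is aligned to b[j-1].
    # X[i][j]: best score where a[i-1] is aligned to a gap.
    # Y[i][j]: best score where b[j-1] is aligned to a gap.
    M = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    X = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    Y = [[neg_inf] * (m + 1) for _ in range(n + 1)]

    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = gap_open + i * gap_extend  # prefix of a against one long gap
    for j in range(1, m + 1):
        Y[0][j] = gap_open + j * gap_extend  # prefix of b against one long gap

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = s + max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            # Extending an existing gap costs gap_extend only;
            # starting a new gap additionally costs gap_open.
            X[i][j] = gap_extend + max(
                M[i - 1][j] + gap_open, X[i - 1][j], Y[i - 1][j] + gap_open
            )
            Y[i][j] = gap_extend + max(
                M[i][j - 1] + gap_open, Y[i][j - 1], X[i][j - 1] + gap_open
            )

    return max(M[n][m], X[n][m], Y[n][m])


if __name__ == "__main__":
    print(affine_global_score("ACGT", "AT"))
```

With the default scores, aligning "ACGT" to "AT" is best done with one gap of length two (A--T), so a single gap-open charge is paid rather than two. This sketch uses quadratic time and space and does not recover the alignment itself.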

Installation

The package can be installed via pip, conda, or mamba.

pip

Run the following in a terminal.

python3 -m venv my_venv_for_globalign
source my_venv_for_globalign/bin/activate
pip install globalign

conda

We recommend using mamba, but conda works too. Be careful not to install into your base environment. Here, we create and activate an environment first. For more information on using conda to install packages, refer to the conda documentation. Run the following in a terminal.

conda create -n globalign_conda_test
conda activate globalign_conda_test
conda install --channel conda-forge globalign

mamba

A drop-in replacement for conda:

mamba create -n globalign_conda_test
mamba activate globalign_conda_test
mamba install --channel conda-forge globalign
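Whichever installer you used, one way to confirm that the package landed in the active environment is to ask Python for the installed distribution's version. The sketch below uses only the standard library; the helper name `installed_version` is an assumption made for illustration, and the distribution name `globalign` is taken from the install commands above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is not found."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints a version string if globalign is installed here, otherwise None.
    print(installed_version("globalign"))
```

If this prints None, the most likely cause is that the virtual or conda environment created above is not the one currently active.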

Documentation

Contributing

To publish documentation updates, you will need quarto and quartodoc installed. To build the documentation locally, run the following from the project root:

quartodoc build
quarto render

If you are happy with the resulting website, push your changes to a branch other than gh-pages. Then, while still on a branch other than gh-pages, run:

quarto publish gh-pages

The public-facing website should now be updated.

To run the unit tests, make sure pytest and hatch are installed and available. Building also requires the hatch-vcs plugin. To build and test, run the following from the project root:

rm -rf dist/
hatch build
hatch test

To install from source, run the following from the project root after building:

pip install --editable .

Versions can be changed via git tags. For example, run this with the version you want:

git tag -a "v0.0.0"
git push origin v0.0.0

Acknowledgements

Special thanks go to Mykola Akulov and Ragnar Groot Koerkamp for their insightful blog post, without which I would not have known how to make this package work with both scoring and costing schemes.

References

  1. https://web.stanford.edu/class/cs262/archives/presentations/lecture3.pdf
  2. https://ocw.mit.edu/courses/6-096-algorithms-for-computational-biology-spring-2005/01f55f348ea1e95f7015bd1b40586012_lecture5.pdf
  3. Martin Mann, Mostafa M Mohamed, Syed M Ali, and Rolf Backofen. Interactive implementations of thermodynamics-based RNA structure and RNA-RNA interaction prediction approaches for example-driven teaching. PLOS Computational Biology, 14(8), e1006341, 2018.
  4. Martin Raden, Syed M Ali, Omer S Alkhnbashi, Anke Busch, Fabrizio Costa, Jason A Davis, Florian Eggenhofer, Rick Gelhausen, Jens Georg, Steffen Heyne, Michael Hiller, Kousik Kundu, Robert Kleinkauf, Steffen C Lott, Mostafa M Mohamed, Alexander Mattheis, Milad Miladi, Andreas S Richter, Sebastian Will, Joachim Wolff, Patrick R Wright, and Rolf Backofen. Freiburg RNA tools: a central online resource for RNA-focused research and teaching. Nucleic Acids Research, 46(W1), W25-W29, 2018.
  5. An improved algorithm for matching biological sequences. Osamu Gotoh. https://doi.org/10.1016/0022-2836(82)90398-9
  6. http://www.cs.cmu.edu/~durand/03-711/2017/Lectures/Sequence-Alignment-2017.pdf
  7. https://bioboot.github.io/bimm143_W20/class-material/nw/
  8. https://www.ncbi.nlm.nih.gov/CBBresearch/Przytycka/download/lectures/PCB_Lect02_Pairwise_allign.pdf
  9. https://ics.uci.edu/~xhx/courses/CS284A-F08/lectures/alignment.pdf
  10. https://link.springer.com/chapter/10.1007/978-3-319-90684-3_2
  11. Optimal sequence alignment using affine gap costs. https://link.springer.com/content/pdf/10.1007/BF02462326.pdf
  12. Optimal alignments in linear space. Eugene W. Myers, Webb Miller. https://doi.org/10.1093/bioinformatics/4.1.11
  13. Sequence alignment using FastLSA. https://webdocs.cs.ualberta.ca/~duane/publications/pdf/2000metmbs.pdf
  14. MASA: A Multiplatform Architecture for Sequence Aligners with Block Pruning. https://doi.org/10.1145/2858656
  15. https://community.gep.wustl.edu/repository/course_materials_WU/annotation/Introduction_Dynamic_Programming.pdf
  16. Optimal gap-affine alignment in O(s) space. https://doi.org/10.1093/bioinformatics/btad074
  17. Exact global alignment using A* with chaining seed heuristic and match pruning. https://doi.org/10.1093/bioinformatics/btae032
  18. Transforming match bonus into cost. https://curiouscoding.nl/posts/alignment-scores-transform/
  19. Improving the time and space complexity of the WFA algorithm and generalizing its scoring. https://doi.org/10.1101/2022.01.12.476087
  20. A* PA2: up to 20 times faster exact global alignment. https://doi.org/10.1101/2024.03.24.586481
  21. Notes on Dynamic-Programming Sequence Alignment. https://globin.bx.psu.edu/courses/fall2001/DP.pdf
  22. Lecture 6: Affine gap penalty function. https://www.cs.hunter.cuny.edu/~saad/courses/compbio/lectures/lecture6.pdf
