Fast discovery of similar strings in bulk

Project description

SymScan

Check out the documentation page.

SymScan enables extremely fast discovery of pairs of similar strings within and across large collections.

SymScan is a variation on the symmetric deletion algorithm, optimised for bulk searches for similar strings within one large collection or across two (e.g. finding similar protein sequences in a collection of 10M). The key algorithmic difference from traditional symmetric deletion is that SymScan uses a sort-merge join, rather than hash maps, to discover input strings that share a common deletion variant. This sort-and-scan approach adds a factor of O(log N) to the expected time complexity (where N is the total number of strings being compared) in exchange for better cache locality and more effective parallelization, and it ends up much faster for the bulk use case above.
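
To make the sort-and-scan idea concrete, here is a minimal single-threaded sketch in pure Python. It is not SymScan's implementation or API: the function names (deletion_variants, edit_distance, similar_pairs) are hypothetical, and it assumes that "similar" means within a given Levenshtein edit distance. It only illustrates how sorting (variant, index) records and scanning runs of equal variants can stand in for a hash map.

```python
from itertools import combinations, groupby


def deletion_variants(s, max_deletions):
    """Return every string obtainable from s by deleting up to max_deletions characters."""
    variants = set()
    for k in range(min(max_deletions, len(s)) + 1):
        for kept in combinations(range(len(s)), len(s) - k):
            variants.add("".join(s[i] for i in kept))
    return variants


def edit_distance(a, b):
    """Plain dynamic-programming Levenshtein distance, used to verify candidates."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]


def similar_pairs(strings, max_distance=1):
    """Find index pairs (i, j), i < j, whose strings are within max_distance edits."""
    # 1. Emit a flat list of (variant, index) records for every input string.
    records = [
        (variant, idx)
        for idx, s in enumerate(strings)
        for variant in deletion_variants(s, max_distance)
    ]
    # 2. Sort so records sharing a variant become adjacent (this replaces the hash map).
    records.sort()
    # 3. Scan runs of equal variants; every pair within a run is a candidate.
    candidates = set()
    for _, run in groupby(records, key=lambda rec: rec[0]):
        indices = [idx for _, idx in run]
        candidates.update(combinations(indices, 2))
    # 4. Sharing a variant is necessary but not sufficient, so verify each candidate.
    return sorted(
        (i, j)
        for i, j in candidates
        if edit_distance(strings[i], strings[j]) <= max_distance
    )


if __name__ == "__main__":
    print(similar_pairs(["CASSL", "CASSF", "CASL", "QQQQ"], max_distance=1))
```

The sort in step 2 is where the extra O(log N) factor comes from. In exchange, step 3 reads the records sequentially rather than probing a hash table at scattered addresses, and a sorted array is straightforward to split into chunks for parallel scanning.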

Installing

CLI

brew install yutanagano/tap/symscan-cli

Rust library

cargo add symscan

Python package

pip install symscan

Licensing

SymScan is dual-licensed under the MIT and Apache 2.0 licenses. Unless explicitly stated otherwise, any contribution submitted by you, as defined in the Apache license, shall be dual-licensed as above, without any additional terms and conditions.

Download files

Source Distribution

  • symscan-0.8.2.tar.gz (29.9 kB): Source

Built Distributions

  • symscan-0.8.2-cp314-cp314t-win_arm64.whl (213.1 kB): CPython 3.14t, Windows ARM64
  • symscan-0.8.2-cp314-cp314t-win_amd64.whl (231.2 kB): CPython 3.14t, Windows x86-64
  • symscan-0.8.2-cp314-cp314t-win32.whl (224.6 kB): CPython 3.14t, Windows x86
  • symscan-0.8.2-cp314-cp314t-musllinux_1_2_x86_64.whl (415.7 kB): CPython 3.14t, musllinux: musl 1.2+ x86-64
  • symscan-0.8.2-cp314-cp314t-musllinux_1_2_aarch64.whl (376.5 kB): CPython 3.14t, musllinux: musl 1.2+ ARM64
  • symscan-0.8.2-cp314-cp314t-manylinux_2_28_x86_64.whl (335.4 kB): CPython 3.14t, manylinux: glibc 2.28+ x86-64
  • symscan-0.8.2-cp314-cp314t-manylinux_2_28_aarch64.whl (311.1 kB): CPython 3.14t, manylinux: glibc 2.28+ ARM64
  • symscan-0.8.2-cp314-cp314t-macosx_11_0_arm64.whl (280.5 kB): CPython 3.14t, macOS 11.0+ ARM64
  • symscan-0.8.2-cp314-cp314t-macosx_10_15_x86_64.whl (306.9 kB): CPython 3.14t, macOS 10.15+ x86-64
  • symscan-0.8.2-cp310-abi3-win_arm64.whl (218.6 kB): CPython 3.10+, Windows ARM64
  • symscan-0.8.2-cp310-abi3-win_amd64.whl (236.1 kB): CPython 3.10+, Windows x86-64
  • symscan-0.8.2-cp310-abi3-win32.whl (230.4 kB): CPython 3.10+, Windows x86
  • symscan-0.8.2-cp310-abi3-musllinux_1_2_x86_64.whl (420.2 kB): CPython 3.10+, musllinux: musl 1.2+ x86-64
  • symscan-0.8.2-cp310-abi3-musllinux_1_2_aarch64.whl (380.8 kB): CPython 3.10+, musllinux: musl 1.2+ ARM64
  • symscan-0.8.2-cp310-abi3-manylinux_2_28_x86_64.whl (340.0 kB): CPython 3.10+, manylinux: glibc 2.28+ x86-64
  • symscan-0.8.2-cp310-abi3-manylinux_2_28_aarch64.whl (315.6 kB): CPython 3.10+, manylinux: glibc 2.28+ ARM64
  • symscan-0.8.2-cp310-abi3-macosx_11_0_arm64.whl (287.3 kB): CPython 3.10+, macOS 11.0+ ARM64
  • symscan-0.8.2-cp310-abi3-macosx_10_12_x86_64.whl (311.6 kB): CPython 3.10+, macOS 10.12+ x86-64

File details

Details for the file symscan-0.8.2.tar.gz.

File metadata

  • Download URL: symscan-0.8.2.tar.gz
  • Upload date:
  • Size: 29.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for symscan-0.8.2.tar.gz
Algorithm Hash digest
SHA256 b0171e43ab7064acd00b8d3fe31255c6d1737b24d81e56b985911282bab42d87
MD5 d38648d3f95eb800e91ae6315f96b7c9
BLAKE2b-256 6e32be159fe03d94c1cc369ada532a8bd1c69038515b04eb90e1fc963b033a75

Provenance

The following attestation bundles were made for symscan-0.8.2.tar.gz:

Publisher: release_python.yml on yutanagano/symscan

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file symscan-0.8.2-cp314-cp314t-win_arm64.whl.

File metadata

  • Download URL: symscan-0.8.2-cp314-cp314t-win_arm64.whl
  • Upload date:
  • Size: 213.1 kB
  • Tags: CPython 3.14t, Windows ARM64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for symscan-0.8.2-cp314-cp314t-win_arm64.whl
Algorithm Hash digest
SHA256 c3ab85c9723b360266029e21ebcb669f65b5bc35e8fe99acc54200e70cf626b3
MD5 cd49e40692b28cb77518ca64e5e577b1
BLAKE2b-256 ad0714a0cc2ea05d7f9096cf9ff504240c03058786e53fae5148d0c6a9e68888

Provenance

The following attestation bundles were made for symscan-0.8.2-cp314-cp314t-win_arm64.whl:

Publisher: release_python.yml on yutanagano/symscan

File details

Details for the file symscan-0.8.2-cp314-cp314t-win_amd64.whl.

File metadata

  • Download URL: symscan-0.8.2-cp314-cp314t-win_amd64.whl
  • Upload date:
  • Size: 231.2 kB
  • Tags: CPython 3.14t, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for symscan-0.8.2-cp314-cp314t-win_amd64.whl
Algorithm Hash digest
SHA256 960f3aaeda3ee05aa2648db4062775c419d254497b775bc3885db4ac940b2b27
MD5 b45ff3ddef26fbe7bf0df76d26c0e04f
BLAKE2b-256 a08fbe959af8730071a876773008707a0a823d8e86676a3b3c1dc0d6f6b92215

Provenance

The following attestation bundles were made for symscan-0.8.2-cp314-cp314t-win_amd64.whl:

Publisher: release_python.yml on yutanagano/symscan

File details

Details for the file symscan-0.8.2-cp314-cp314t-win32.whl.

File metadata

  • Download URL: symscan-0.8.2-cp314-cp314t-win32.whl
  • Upload date:
  • Size: 224.6 kB
  • Tags: CPython 3.14t, Windows x86
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for symscan-0.8.2-cp314-cp314t-win32.whl
Algorithm Hash digest
SHA256 f0b50921cf4d2278d52a10f8a959641f4f63b59a3adddd1617d62bb9fa3526b4
MD5 e74fe2e8ca59b925e53fb4ccf87c6503
BLAKE2b-256 be7bee3a67e17ab424387b3146d8d3674b8f902bb93a19885233c86b0df5db41

Provenance

The following attestation bundles were made for symscan-0.8.2-cp314-cp314t-win32.whl:

Publisher: release_python.yml on yutanagano/symscan

File details

Details for the file symscan-0.8.2-cp314-cp314t-musllinux_1_2_x86_64.whl.

File hashes

Hashes for symscan-0.8.2-cp314-cp314t-musllinux_1_2_x86_64.whl
Algorithm Hash digest
SHA256 125819988a85ee24172ef7fc8549995003895cacc4ba2e2415865e86487cacaf
MD5 7fe8449f45542fd2928dad67e704dec7
BLAKE2b-256 4a1c70cb62fc9c92126256118496ad37cd97a44d546a828cee8e05f86a996121

Provenance

The following attestation bundles were made for symscan-0.8.2-cp314-cp314t-musllinux_1_2_x86_64.whl:

Publisher: release_python.yml on yutanagano/symscan

File details

Details for the file symscan-0.8.2-cp314-cp314t-musllinux_1_2_aarch64.whl.

File hashes

Hashes for symscan-0.8.2-cp314-cp314t-musllinux_1_2_aarch64.whl
Algorithm Hash digest
SHA256 7e51511e6ed1aea72320ee1a25751762f2c3ff39ae742fcf584add0b523a2b8a
MD5 e9265ee5eebec8e4f60c71366f5afd8e
BLAKE2b-256 752e3c44b5d6e3467cd6f3a30563606efd8da09a35331b8381b9f1ef61a2114e

Provenance

The following attestation bundles were made for symscan-0.8.2-cp314-cp314t-musllinux_1_2_aarch64.whl:

Publisher: release_python.yml on yutanagano/symscan

File details

Details for the file symscan-0.8.2-cp314-cp314t-manylinux_2_28_x86_64.whl.

File hashes

Hashes for symscan-0.8.2-cp314-cp314t-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 65a00187273e2f03adba488a82ddbdfef403b085471329e490915d7d8132e0fa
MD5 1157000781d8d1323f224ca6515b0fd6
BLAKE2b-256 4c72a2203d1393d23f6101a7f52b6c79606481f10c2c58399109579e9d1a4d85

Provenance

The following attestation bundles were made for symscan-0.8.2-cp314-cp314t-manylinux_2_28_x86_64.whl:

Publisher: release_python.yml on yutanagano/symscan

File details

Details for the file symscan-0.8.2-cp314-cp314t-manylinux_2_28_aarch64.whl.

File hashes

Hashes for symscan-0.8.2-cp314-cp314t-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 3c14268271c43c70ff0b8eb60289b320d8ed46aadc84e45ac673b6ba53c150e9
MD5 d072195046d9da7e44a1d7d81278148c
BLAKE2b-256 cabff13c963850a778eb68d1d2e0f4d8db8df30bff64ade804d4151d9081e5de

Provenance

The following attestation bundles were made for symscan-0.8.2-cp314-cp314t-manylinux_2_28_aarch64.whl:

Publisher: release_python.yml on yutanagano/symscan

File details

Details for the file symscan-0.8.2-cp314-cp314t-macosx_11_0_arm64.whl.

File hashes

Hashes for symscan-0.8.2-cp314-cp314t-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 f9c4025292c12e88a2bf49ccdcf57b83e51dc3a256be11d1bbdfc6a1c18ee9be
MD5 1517cf85a00cfae586bf045f0329a4bb
BLAKE2b-256 d20b9fc2315d8902371323f21b1fb5575db9993228fca99405a4072f29a1ff7b

Provenance

The following attestation bundles were made for symscan-0.8.2-cp314-cp314t-macosx_11_0_arm64.whl:

Publisher: release_python.yml on yutanagano/symscan

File details

Details for the file symscan-0.8.2-cp314-cp314t-macosx_10_15_x86_64.whl.

File hashes

Hashes for symscan-0.8.2-cp314-cp314t-macosx_10_15_x86_64.whl
Algorithm Hash digest
SHA256 0a26fe08f015a5361ba890183b0ce623690302bec6edb0f34752c78bfdb6c730
MD5 a85d6b38e4fb6b3b3082892a3bc0f39c
BLAKE2b-256 da79a914b7c4f6fde48a64faf6d6794a4735750f8a41cf927d6039260347f9dd

Provenance

The following attestation bundles were made for symscan-0.8.2-cp314-cp314t-macosx_10_15_x86_64.whl:

Publisher: release_python.yml on yutanagano/symscan

File details

Details for the file symscan-0.8.2-cp310-abi3-win_arm64.whl.

File metadata

  • Download URL: symscan-0.8.2-cp310-abi3-win_arm64.whl
  • Upload date:
  • Size: 218.6 kB
  • Tags: CPython 3.10+, Windows ARM64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for symscan-0.8.2-cp310-abi3-win_arm64.whl
Algorithm Hash digest
SHA256 1f5aa3c3b61903287e0b20b5faf1b95c8ba672df5bbec46c46709263ae49c311
MD5 c88db8322071ece36f94f8492b817304
BLAKE2b-256 55b28226aabbeefcf521a2938be01130d772bd5a5e6fe0d70c1fe896c4cc1e8e

Provenance

The following attestation bundles were made for symscan-0.8.2-cp310-abi3-win_arm64.whl:

Publisher: release_python.yml on yutanagano/symscan

File details

Details for the file symscan-0.8.2-cp310-abi3-win_amd64.whl.

File metadata

  • Download URL: symscan-0.8.2-cp310-abi3-win_amd64.whl
  • Upload date:
  • Size: 236.1 kB
  • Tags: CPython 3.10+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for symscan-0.8.2-cp310-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 9ed98ab0755450520a9a2d23d87570010b0ad4f21bc3505bb711028cf8b8be42
MD5 361179e2f01a86969e845507ae3040d7
BLAKE2b-256 1420b422ffa8959177f7ffc11b20377bdaa0835c25d600906300d3a9cb91fc31

Provenance

The following attestation bundles were made for symscan-0.8.2-cp310-abi3-win_amd64.whl:

Publisher: release_python.yml on yutanagano/symscan

File details

Details for the file symscan-0.8.2-cp310-abi3-win32.whl.

File metadata

  • Download URL: symscan-0.8.2-cp310-abi3-win32.whl
  • Upload date:
  • Size: 230.4 kB
  • Tags: CPython 3.10+, Windows x86
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for symscan-0.8.2-cp310-abi3-win32.whl
Algorithm Hash digest
SHA256 6cf48ea32406b68914802bc87796e65e067395dc867fcc36ee20878445401b83
MD5 0409a4613d93aa42ec5445e36c3548e2
BLAKE2b-256 ad8d1de697ea78bc0eab0a801cee0af094eac90dcd13b5176d278d3b37543d3b

Provenance

The following attestation bundles were made for symscan-0.8.2-cp310-abi3-win32.whl:

Publisher: release_python.yml on yutanagano/symscan

File details

Details for the file symscan-0.8.2-cp310-abi3-musllinux_1_2_x86_64.whl.

File hashes

Hashes for symscan-0.8.2-cp310-abi3-musllinux_1_2_x86_64.whl
Algorithm Hash digest
SHA256 1c47880644deaabda20ebcb4bf7971b562971bab5d41f1455889622410311963
MD5 3c35f1b53500585facc43f44a21e6e85
BLAKE2b-256 0b8c65c9c7ee2e2f767dce5ada370ff30f689df5ba4a82877f72544d3a8c2b52

Provenance

The following attestation bundles were made for symscan-0.8.2-cp310-abi3-musllinux_1_2_x86_64.whl:

Publisher: release_python.yml on yutanagano/symscan

File details

Details for the file symscan-0.8.2-cp310-abi3-musllinux_1_2_aarch64.whl.

File hashes

Hashes for symscan-0.8.2-cp310-abi3-musllinux_1_2_aarch64.whl
Algorithm Hash digest
SHA256 a39f769b6b170d3a5d5d852f4fe99a67fddb8af924bc233a5fa42cee09bbd8c3
MD5 5c31e46de45a2718376a5b478de3781c
BLAKE2b-256 e6d2e235c020660bc60cdb753b69960b73db1fca8ab4ca45456a2e46e49deba1

Provenance

The following attestation bundles were made for symscan-0.8.2-cp310-abi3-musllinux_1_2_aarch64.whl:

Publisher: release_python.yml on yutanagano/symscan

File details

Details for the file symscan-0.8.2-cp310-abi3-manylinux_2_28_x86_64.whl.

File hashes

Hashes for symscan-0.8.2-cp310-abi3-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 954e932de706a3c099845c00270e2dbdf2d8d36a910512e8303a8831eca39f36
MD5 788de508505a8fb590c695161e269a2b
BLAKE2b-256 ad4da2995349b5192e383f957e84fba8ed2804f6d864bd01778603c2970996f5

Provenance

The following attestation bundles were made for symscan-0.8.2-cp310-abi3-manylinux_2_28_x86_64.whl:

Publisher: release_python.yml on yutanagano/symscan

File details

Details for the file symscan-0.8.2-cp310-abi3-manylinux_2_28_aarch64.whl.

File hashes

Hashes for symscan-0.8.2-cp310-abi3-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 6dc584a68b9fbbe8cf6e7103ef738450c151ded68717b023eba574b157527ba5
MD5 ee93fe9742357f5b0456c902e4962486
BLAKE2b-256 b266dd7e025981098bb6b5ae580bd4cd7a58d27b05aa2b12c37e1ffafa4b85f1

Provenance

The following attestation bundles were made for symscan-0.8.2-cp310-abi3-manylinux_2_28_aarch64.whl:

Publisher: release_python.yml on yutanagano/symscan

File details

Details for the file symscan-0.8.2-cp310-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for symscan-0.8.2-cp310-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 5059c72e29b12b2e0f15a452e7514408d9f94a2259a7c342474b19ba6bb298d7
MD5 0446bc614ea94564032b1021317d64e5
BLAKE2b-256 8d4a810444b7c8f5c0a600510a7c6c6d6b8b1ef1ab823dc4bcb8486557163793

Provenance

The following attestation bundles were made for symscan-0.8.2-cp310-abi3-macosx_11_0_arm64.whl:

Publisher: release_python.yml on yutanagano/symscan

File details

Details for the file symscan-0.8.2-cp310-abi3-macosx_10_12_x86_64.whl.

File hashes

Hashes for symscan-0.8.2-cp310-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 1730595b54932d77c8f0eabfbc59d4fe40b5a4d1ea35d57cef146c98b3b274ca
MD5 d06e078caa5187828395c821083ed1cf
BLAKE2b-256 1be19776b91a846f356ea73f93791b9b2bab0c269afedb5cc15cec6e1c29fb02

Provenance

The following attestation bundles were made for symscan-0.8.2-cp310-abi3-macosx_10_12_x86_64.whl:

Publisher: release_python.yml on yutanagano/symscan
