Fast discovery of similar strings in bulk

SymScan

Check out the documentation page.

SymScan enables extremely fast discovery of pairs of similar strings within and across large collections.

SymScan is a variation on the symmetric deletion algorithm, optimised for bulk searches for similar strings within one large collection or across two (e.g. finding similar protein sequences in a collection of 10 million). The key algorithmic difference from traditional symmetric deletion is that SymScan uses a sort-merge join, rather than hash maps, to discover input strings that share common deletion variants. This sort-and-scan approach costs an additional factor of O(log N) in expected time complexity, where N is the total number of strings being compared, in exchange for improved cache locality and effective parallelization, and ends up being much faster for this use case.
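
To make the sort-and-scan idea concrete, here is a minimal pure-Python sketch of symmetric deletion with a sort-merge join for a single collection. It is an illustration only, not SymScan's implementation or API: the function names (`deletion_variants`, `edit_distance`, `similar_pairs`) are hypothetical, and it assumes that "similar" means Levenshtein edit distance at most `max_distance`. Strings within that distance always share a deletion variant, but the converse does not hold, so each candidate pair is verified at the end.

```python
from itertools import combinations


def deletion_variants(s: str, max_deletions: int) -> set[str]:
    """Return s plus every string obtained by deleting up to max_deletions characters."""
    variants = {s}
    for k in range(1, min(max_deletions, len(s)) + 1):
        for positions in combinations(range(len(s)), k):
            removed = set(positions)
            variants.add("".join(c for i, c in enumerate(s) if i not in removed))
    return variants


def edit_distance(a: str, b: str) -> int:
    """Plain dynamic-programming Levenshtein distance (assumed similarity measure)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]


def similar_pairs(strings: list[str], max_distance: int = 1) -> list[tuple[int, int]]:
    """Return sorted index pairs (i, j), i < j, whose strings are within max_distance."""
    # 1. Emit one (variant, index) record per deletion variant of each string.
    records = [
        (variant, index)
        for index, s in enumerate(strings)
        for variant in deletion_variants(s, max_distance)
    ]
    # 2. Sort so that identical variants become adjacent (instead of hashing them).
    records.sort()
    # 3. Scan each run of equal variants; strings in the same run are candidates.
    candidates = set()
    start = 0
    while start < len(records):
        end = start
        while end < len(records) and records[end][0] == records[start][0]:
            end += 1
        for (_, i), (_, j) in combinations(records[start:end], 2):
            candidates.add((i, j))  # i < j because records are sorted by index within a run
        start = end
    # 4. Verify: sharing a variant is necessary but not sufficient.
    return sorted(
        (i, j) for i, j in candidates
        if edit_distance(strings[i], strings[j]) <= max_distance
    )


if __name__ == "__main__":
    words = ["cat", "cart", "dog", "cot", "dot"]
    print(similar_pairs(words, max_distance=1))
    # [(0, 1), (0, 3), (2, 4), (3, 4)]
```

The sort in step 2 is where the extra O(log N) factor comes from; in return, step 3 reads the records sequentially, which is the access pattern that favours cache locality and lets the work be split into independent chunks.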

Installing

CLI

brew install yutanagano/tap/symscan-cli

Rust library

cargo add symscan

Python package

pip install symscan

Licensing

SymScan is dual-licensed under the MIT and Apache 2.0 licenses. Unless explicitly stated otherwise, any contribution submitted by you, as defined in the Apache license, shall be dual-licensed as above, without any additional terms and conditions.

Download files

Download the file for your platform.

Source Distribution

  • symscan-0.8.1.tar.gz (29.6 kB): Source

Built Distributions

  • symscan-0.8.1-cp314-cp314t-win_arm64.whl (296.9 kB): CPython 3.14t, Windows ARM64
  • symscan-0.8.1-cp314-cp314t-win_amd64.whl (317.7 kB): CPython 3.14t, Windows x86-64
  • symscan-0.8.1-cp314-cp314t-win32.whl (300.1 kB): CPython 3.14t, Windows x86
  • symscan-0.8.1-cp314-cp314t-musllinux_1_2_x86_64.whl (559.2 kB): CPython 3.14t, musllinux (musl 1.2+) x86-64
  • symscan-0.8.1-cp314-cp314t-musllinux_1_2_aarch64.whl (531.7 kB): CPython 3.14t, musllinux (musl 1.2+) ARM64
  • symscan-0.8.1-cp314-cp314t-manylinux_2_28_x86_64.whl (487.6 kB): CPython 3.14t, manylinux (glibc 2.28+) x86-64
  • symscan-0.8.1-cp314-cp314t-manylinux_2_28_aarch64.whl (467.8 kB): CPython 3.14t, manylinux (glibc 2.28+) ARM64
  • symscan-0.8.1-cp314-cp314t-macosx_11_0_arm64.whl (418.8 kB): CPython 3.14t, macOS 11.0+ ARM64
  • symscan-0.8.1-cp314-cp314t-macosx_10_15_x86_64.whl (436.1 kB): CPython 3.14t, macOS 10.15+ x86-64
  • symscan-0.8.1-cp310-abi3-win_arm64.whl (303.5 kB): CPython 3.10+, Windows ARM64
  • symscan-0.8.1-cp310-abi3-win_amd64.whl (321.8 kB): CPython 3.10+, Windows x86-64
  • symscan-0.8.1-cp310-abi3-win32.whl (303.4 kB): CPython 3.10+, Windows x86
  • symscan-0.8.1-cp310-abi3-musllinux_1_2_x86_64.whl (563.2 kB): CPython 3.10+, musllinux (musl 1.2+) x86-64
  • symscan-0.8.1-cp310-abi3-musllinux_1_2_aarch64.whl (541.0 kB): CPython 3.10+, musllinux (musl 1.2+) ARM64
  • symscan-0.8.1-cp310-abi3-manylinux_2_28_x86_64.whl (494.4 kB): CPython 3.10+, manylinux (glibc 2.28+) x86-64
  • symscan-0.8.1-cp310-abi3-manylinux_2_28_aarch64.whl (475.9 kB): CPython 3.10+, manylinux (glibc 2.28+) ARM64
  • symscan-0.8.1-cp310-abi3-macosx_11_0_arm64.whl (427.4 kB): CPython 3.10+, macOS 11.0+ ARM64
  • symscan-0.8.1-cp310-abi3-macosx_10_12_x86_64.whl (440.6 kB): CPython 3.10+, macOS 10.12+ x86-64
