Fast discovery of similar strings in bulk

Project description

SymScan

Check out the documentation page.

SymScan enables extremely fast discovery of pairs of similar strings within and across large collections.

SymScan is a variation on the symmetric deletion algorithm that is optimised for bulk searches for similar strings within one large string collection, or across two at once (e.g. searching for similar protein sequences among a collection of 10M). The key algorithmic difference between SymScan and traditional symmetric deletion is that it uses a sort-merge join in place of hash maps to discover input strings that share common deletion variants. This sort-and-scan approach costs an additional factor of O(log N) in expected time complexity (with N the total number of strings being compared), but in return gains better cache locality and more effective parallelisation, and ends up being much faster for the above use case.
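
To make the sort-and-scan idea concrete, here is a minimal pure-Python sketch of symmetric deletion with a sort-merge join. It is an illustration only and not SymScan's implementation or API: the function names (`deletion_variants`, `levenshtein`, `similar_pairs`) and the `max_distance` parameter are hypothetical, and the final edit-distance check is an assumption of this sketch, needed because two strings that share a deletion variant may be up to twice `max_distance` apart.

```python
from itertools import combinations, groupby


def deletion_variants(s: str, max_deletions: int) -> set[str]:
    """Return s plus every string obtained by deleting up to max_deletions characters."""
    variants = {s}
    for k in range(1, min(max_deletions, len(s)) + 1):
        for keep in combinations(range(len(s)), len(s) - k):
            variants.add("".join(s[i] for i in keep))
    return variants


def levenshtein(a: str, b: str) -> int:
    """Plain dynamic-programming edit distance, used to verify candidate pairs."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def similar_pairs(strings: list[str], max_distance: int = 1) -> list[tuple[int, int]]:
    """Index pairs (i, j), i < j, whose strings are within max_distance edits."""
    # 1. Emit one (variant, index) record per deletion variant of each input.
    records = [
        (variant, index)
        for index, s in enumerate(strings)
        for variant in deletion_variants(s, max_distance)
    ]
    # 2. Sort instead of hashing: equal variants become adjacent (the O(log N) factor).
    records.sort()
    # 3. Scan runs of equal variants; inputs in the same run are candidate pairs.
    candidates = set()
    for _, run in groupby(records, key=lambda record: record[0]):
        indices = [index for _, index in run]
        candidates.update(combinations(indices, 2))
    # 4. Verify candidates, since sharing a variant only bounds the distance loosely.
    return sorted(
        (i, j)
        for i, j in candidates
        if levenshtein(strings[i], strings[j]) <= max_distance
    )


pairs = similar_pairs(["CASSLG", "CASSLA", "CASSL", "GATTACA"], max_distance=1)
print(pairs)
```

In this sketch the sorted record list is scanned sequentially, which is the access pattern that favours cache locality, and the sorted runs could be split into independent chunks for parallel processing.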

Installing

CLI

brew install yutanagano/tap/symscan-cli

Rust library

cargo add symscan

Python package

pip install symscan

Licensing

SymScan is dual-licensed under the MIT and Apache 2.0 licenses. Unless explicitly stated otherwise, any contribution submitted by you, as defined in the Apache license, shall be dual-licensed as above, without any additional terms and conditions.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

symscan-0.8.0.tar.gz (29.6 kB)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

symscan-0.8.0-cp314-cp314t-win_arm64.whl (297.0 kB)

Uploaded CPython 3.14t, Windows ARM64

symscan-0.8.0-cp314-cp314t-win_amd64.whl (317.6 kB)

Uploaded CPython 3.14t, Windows x86-64

symscan-0.8.0-cp314-cp314t-win32.whl (298.8 kB)

Uploaded CPython 3.14t, Windows x86

symscan-0.8.0-cp314-cp314t-musllinux_1_2_x86_64.whl (559.7 kB)

Uploaded CPython 3.14t, musllinux: musl 1.2+ x86-64

symscan-0.8.0-cp314-cp314t-musllinux_1_2_aarch64.whl (531.7 kB)

Uploaded CPython 3.14t, musllinux: musl 1.2+ ARM64

symscan-0.8.0-cp314-cp314t-manylinux_2_28_x86_64.whl (487.9 kB)

Uploaded CPython 3.14t, manylinux: glibc 2.28+ x86-64

symscan-0.8.0-cp314-cp314t-manylinux_2_28_aarch64.whl (468.4 kB)

Uploaded CPython 3.14t, manylinux: glibc 2.28+ ARM64

symscan-0.8.0-cp314-cp314t-macosx_11_0_arm64.whl (419.2 kB)

Uploaded CPython 3.14t, macOS 11.0+ ARM64

symscan-0.8.0-cp314-cp314t-macosx_10_15_x86_64.whl (434.1 kB)

Uploaded CPython 3.14t, macOS 10.15+ x86-64

symscan-0.8.0-cp310-abi3-win_arm64.whl (303.7 kB)

Uploaded CPython 3.10+, Windows ARM64

symscan-0.8.0-cp310-abi3-win_amd64.whl (321.8 kB)

Uploaded CPython 3.10+, Windows x86-64

symscan-0.8.0-cp310-abi3-win32.whl (302.2 kB)

Uploaded CPython 3.10+, Windows x86

symscan-0.8.0-cp310-abi3-musllinux_1_2_x86_64.whl (565.4 kB)

Uploaded CPython 3.10+, musllinux: musl 1.2+ x86-64

symscan-0.8.0-cp310-abi3-musllinux_1_2_aarch64.whl (541.3 kB)

Uploaded CPython 3.10+, musllinux: musl 1.2+ ARM64

symscan-0.8.0-cp310-abi3-manylinux_2_28_x86_64.whl (494.1 kB)

Uploaded CPython 3.10+, manylinux: glibc 2.28+ x86-64

symscan-0.8.0-cp310-abi3-manylinux_2_28_aarch64.whl (476.2 kB)

Uploaded CPython 3.10+, manylinux: glibc 2.28+ ARM64

symscan-0.8.0-cp310-abi3-macosx_11_0_arm64.whl (428.1 kB)

Uploaded CPython 3.10+, macOS 11.0+ ARM64

symscan-0.8.0-cp310-abi3-macosx_10_12_x86_64.whl (440.6 kB)

Uploaded CPython 3.10+, macOS 10.12+ x86-64

File details

Details for the file symscan-0.8.0.tar.gz.

File metadata

  • Download URL: symscan-0.8.0.tar.gz
  • Upload date:
  • Size: 29.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for symscan-0.8.0.tar.gz
Algorithm Hash digest
SHA256 53fd1836799cae96b7cff855ee58eda34c12d949a880bde12aa9526bb6595818
MD5 ec9fb3be93939287facf0f9eab07c1b0
BLAKE2b-256 40eb326ec140439224095e89d2c1e1efdafd1854b982fc018b5ca329b457c9c0

See more details on using hashes here.
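
As a hedged illustration of how a published digest can be checked, the sketch below computes the SHA256 of a downloaded file with Python's standard `hashlib` module and compares it with the value listed above for symscan-0.8.0.tar.gz. The helper names (`sha256_of_file`, `matches_published_digest`) and the local file path are hypothetical; only the digest string is taken from this page.

```python
import hashlib

# SHA256 digest listed on this page for symscan-0.8.0.tar.gz.
PUBLISHED_SHA256 = "53fd1836799cae96b7cff855ee58eda34c12d949a880bde12aa9526bb6595818"


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: str, expected: str = PUBLISHED_SHA256) -> bool:
    """True if the file's SHA256 equals the expected hex digest (case-insensitive)."""
    return sha256_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Hypothetical location of the downloaded source distribution.
    archive = "symscan-0.8.0.tar.gz"
    try:
        print("digest matches:", matches_published_digest(archive))
    except FileNotFoundError:
        print(f"{archive} not found; download it first")
```

The same check applies to any of the wheels by substituting the SHA256 value listed for that file.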

Provenance

The following attestation bundles were made for symscan-0.8.0.tar.gz:

Publisher: release_python.yml on yutanagano/symscan

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file symscan-0.8.0-cp314-cp314t-win_arm64.whl.

File metadata

  • Download URL: symscan-0.8.0-cp314-cp314t-win_arm64.whl
  • Upload date:
  • Size: 297.0 kB
  • Tags: CPython 3.14t, Windows ARM64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for symscan-0.8.0-cp314-cp314t-win_arm64.whl
Algorithm Hash digest
SHA256 5731bd1a8158a186321d361eb1476271feba081bad283d3b498c973f4247f644
MD5 8b69d880ad8b72da40fac41ba7a5a668
BLAKE2b-256 d683bd141b37d29afe6d1a01a80e169637ff7b94536201a8acba87d5d67bc01c

See more details on using hashes here.

Provenance

The following attestation bundles were made for symscan-0.8.0-cp314-cp314t-win_arm64.whl:

Publisher: release_python.yml on yutanagano/symscan

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file symscan-0.8.0-cp314-cp314t-win_amd64.whl.

File metadata

  • Download URL: symscan-0.8.0-cp314-cp314t-win_amd64.whl
  • Upload date:
  • Size: 317.6 kB
  • Tags: CPython 3.14t, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for symscan-0.8.0-cp314-cp314t-win_amd64.whl
Algorithm Hash digest
SHA256 842f7b9a1a0e7741445a4f640bbe71da85fd92ae4c4d2a753a42f3d0cc88cf26
MD5 e032b2aee7b3c11525c195944af8e696
BLAKE2b-256 524d0f14eed27af6a76356e552ce7b615dc93dcd5aeb565577a8b59330b4fe54

See more details on using hashes here.

Provenance

The following attestation bundles were made for symscan-0.8.0-cp314-cp314t-win_amd64.whl:

Publisher: release_python.yml on yutanagano/symscan

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file symscan-0.8.0-cp314-cp314t-win32.whl.

File metadata

  • Download URL: symscan-0.8.0-cp314-cp314t-win32.whl
  • Upload date:
  • Size: 298.8 kB
  • Tags: CPython 3.14t, Windows x86
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for symscan-0.8.0-cp314-cp314t-win32.whl
Algorithm Hash digest
SHA256 23bab9e52550bb7efd03c904950b2f34cb6fce753facc640eb2616c568f181bd
MD5 b7cf29ff491b2b226bacc9effd26850a
BLAKE2b-256 3e74ec7396cd4605c6549ae867042c307474eb6baafd74c136390e7927198392

See more details on using hashes here.

Provenance

The following attestation bundles were made for symscan-0.8.0-cp314-cp314t-win32.whl:

Publisher: release_python.yml on yutanagano/symscan

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file symscan-0.8.0-cp314-cp314t-musllinux_1_2_x86_64.whl.

File hashes

Hashes for symscan-0.8.0-cp314-cp314t-musllinux_1_2_x86_64.whl
Algorithm Hash digest
SHA256 692efdc88a32f43d3c362f243b49bfda804b2d5cd32c8c242969f8b296d51404
MD5 b00738d08e605db172e50748164c752a
BLAKE2b-256 c8958a82fd7f91ec3b893f94868dc9a11d2ce5b29a7319aaf12c9ecbfa3d349a

See more details on using hashes here.

Provenance

The following attestation bundles were made for symscan-0.8.0-cp314-cp314t-musllinux_1_2_x86_64.whl:

Publisher: release_python.yml on yutanagano/symscan

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file symscan-0.8.0-cp314-cp314t-musllinux_1_2_aarch64.whl.

File hashes

Hashes for symscan-0.8.0-cp314-cp314t-musllinux_1_2_aarch64.whl
Algorithm Hash digest
SHA256 43b8866dd19a13ce740d5854c30eb74efa08ccba9b2c9d8fdf0344c6bd24fc9d
MD5 13c9c267deebbd03d37f8abaa6e6782b
BLAKE2b-256 440e94a28a31dbcd34e030589bc9db343fdb676e2059034405150866df482b7d

See more details on using hashes here.

Provenance

The following attestation bundles were made for symscan-0.8.0-cp314-cp314t-musllinux_1_2_aarch64.whl:

Publisher: release_python.yml on yutanagano/symscan

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file symscan-0.8.0-cp314-cp314t-manylinux_2_28_x86_64.whl.

File hashes

Hashes for symscan-0.8.0-cp314-cp314t-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 3b1fc6a064299df0c27162c13c363f5358bca48e3d6787c0054c1d7a2d9edbf4
MD5 31e5ad85eeb419b107a49a88df87e24d
BLAKE2b-256 257d980f47e20f97c28ee8b62d2078189d2944dd435d3318fb4fd445cabfbc11

See more details on using hashes here.

Provenance

The following attestation bundles were made for symscan-0.8.0-cp314-cp314t-manylinux_2_28_x86_64.whl:

Publisher: release_python.yml on yutanagano/symscan

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file symscan-0.8.0-cp314-cp314t-manylinux_2_28_aarch64.whl.

File hashes

Hashes for symscan-0.8.0-cp314-cp314t-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 5d398359d0e6ff806f983ccf693cca26e3df26223ed7ea05fd63fc5548829d6e
MD5 9f09672ba66e75ffc4da016d1f060eb1
BLAKE2b-256 1f242cd0c90c688f4bdfa8748163b78ca596d150e46097a04cc8a351a7dc2dab

See more details on using hashes here.

Provenance

The following attestation bundles were made for symscan-0.8.0-cp314-cp314t-manylinux_2_28_aarch64.whl:

Publisher: release_python.yml on yutanagano/symscan

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file symscan-0.8.0-cp314-cp314t-macosx_11_0_arm64.whl.

File hashes

Hashes for symscan-0.8.0-cp314-cp314t-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 0c88dc5f3ce454ade56956ba0a90c7dadc40ce5b044bbddc2ea0af1bf8ac6b5f
MD5 a6596744289003befb897a36d5022b9a
BLAKE2b-256 740d31f5da3bd711291de281406adc01a76c75fbb5794e1395754c49aa7f2f3c

See more details on using hashes here.

Provenance

The following attestation bundles were made for symscan-0.8.0-cp314-cp314t-macosx_11_0_arm64.whl:

Publisher: release_python.yml on yutanagano/symscan

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file symscan-0.8.0-cp314-cp314t-macosx_10_15_x86_64.whl.

File hashes

Hashes for symscan-0.8.0-cp314-cp314t-macosx_10_15_x86_64.whl
Algorithm Hash digest
SHA256 e815c0b0cf241dec04464d6337c6a5e5518dcdf3254155566d52ad00104768f2
MD5 10ed18a25adc09f366974cfb5b153c53
BLAKE2b-256 ddf30c24b58464cf05c2bbfda5301470fe93a0ee655b59d035a2ca5b2625e301

See more details on using hashes here.

Provenance

The following attestation bundles were made for symscan-0.8.0-cp314-cp314t-macosx_10_15_x86_64.whl:

Publisher: release_python.yml on yutanagano/symscan

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file symscan-0.8.0-cp310-abi3-win_arm64.whl.

File metadata

  • Download URL: symscan-0.8.0-cp310-abi3-win_arm64.whl
  • Upload date:
  • Size: 303.7 kB
  • Tags: CPython 3.10+, Windows ARM64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for symscan-0.8.0-cp310-abi3-win_arm64.whl
Algorithm Hash digest
SHA256 16e74b2a763d0b40f7000a8d541131e89adc3878a7f447bc97367cadc77085a8
MD5 ff40e6a7589efe93e819debb6fc7b36e
BLAKE2b-256 ac742b05b83cceb89b019e2dab7aad4fc2e6f25388d6ee9a49fdafc8ce61af50

See more details on using hashes here.

Provenance

The following attestation bundles were made for symscan-0.8.0-cp310-abi3-win_arm64.whl:

Publisher: release_python.yml on yutanagano/symscan

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file symscan-0.8.0-cp310-abi3-win_amd64.whl.

File metadata

  • Download URL: symscan-0.8.0-cp310-abi3-win_amd64.whl
  • Upload date:
  • Size: 321.8 kB
  • Tags: CPython 3.10+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for symscan-0.8.0-cp310-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 6e91029ad6d61e328cf140735617218de5ae6b3432b36f83b5ddfdaf01e94708
MD5 045198e4577d1975baa234c599fee552
BLAKE2b-256 ff239b9c129ef4ac831425e285553cacbda3ae04306a310e6b1b129cfa43167f

See more details on using hashes here.

Provenance

The following attestation bundles were made for symscan-0.8.0-cp310-abi3-win_amd64.whl:

Publisher: release_python.yml on yutanagano/symscan

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file symscan-0.8.0-cp310-abi3-win32.whl.

File metadata

  • Download URL: symscan-0.8.0-cp310-abi3-win32.whl
  • Upload date:
  • Size: 302.2 kB
  • Tags: CPython 3.10+, Windows x86
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for symscan-0.8.0-cp310-abi3-win32.whl
Algorithm Hash digest
SHA256 226bf64b223516ca2db391cc91d46e4f3f2cba56827e107f7bfc3c186f1251e5
MD5 ae5a12efcc5f0673275ddcdc85519f35
BLAKE2b-256 7d64969b048062fc8a8fdabcbf07b9f1a5a1c1354cfc14171d22463367b98e35

See more details on using hashes here.

Provenance

The following attestation bundles were made for symscan-0.8.0-cp310-abi3-win32.whl:

Publisher: release_python.yml on yutanagano/symscan

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file symscan-0.8.0-cp310-abi3-musllinux_1_2_x86_64.whl.

File hashes

Hashes for symscan-0.8.0-cp310-abi3-musllinux_1_2_x86_64.whl
Algorithm Hash digest
SHA256 e880c9c39a6b05fe04fb944519764a6df3ecb98d5cd6855d0c71a0acf15744fb
MD5 acbaeaa67eaa4fb40c45ca95ea65dddb
BLAKE2b-256 ad66eafe11f0d0dfa148852bb161a8473da961eb05f6c292a40e14d25ac503fe

See more details on using hashes here.

Provenance

The following attestation bundles were made for symscan-0.8.0-cp310-abi3-musllinux_1_2_x86_64.whl:

Publisher: release_python.yml on yutanagano/symscan

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file symscan-0.8.0-cp310-abi3-musllinux_1_2_aarch64.whl.

File hashes

Hashes for symscan-0.8.0-cp310-abi3-musllinux_1_2_aarch64.whl
Algorithm Hash digest
SHA256 3a688e2d0f32a6dc00d2ae20f209890db9a53b92d21518c16dab3a49f2720b48
MD5 cf41135269b8ba62ea93cb2eb4eb4455
BLAKE2b-256 11b264589dcd3cd68f44ea7c972261b4a32d65ae49361714b8374adf94217fc1

See more details on using hashes here.

Provenance

The following attestation bundles were made for symscan-0.8.0-cp310-abi3-musllinux_1_2_aarch64.whl:

Publisher: release_python.yml on yutanagano/symscan

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file symscan-0.8.0-cp310-abi3-manylinux_2_28_x86_64.whl.

File hashes

Hashes for symscan-0.8.0-cp310-abi3-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 aa9794eb9f9b3786445543e620fa8b668af73d32a123806c1dbccf53f154f1de
MD5 06f94cfb8cf4438201fe0c3cd5030bc9
BLAKE2b-256 8ad2ab62c0ede4f16e72a7ea3d2f8dc5b042faca2f28bb3f554de63fa86838bf

See more details on using hashes here.

Provenance

The following attestation bundles were made for symscan-0.8.0-cp310-abi3-manylinux_2_28_x86_64.whl:

Publisher: release_python.yml on yutanagano/symscan

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file symscan-0.8.0-cp310-abi3-manylinux_2_28_aarch64.whl.

File hashes

Hashes for symscan-0.8.0-cp310-abi3-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 cae2057af454d12aad0e1ad437304bde6621c9d4829c7c102f99460b3e2bf039
MD5 9cd9637075982f94392ecc1374f2e2b1
BLAKE2b-256 13e14371ce4acaa90d0ca112d9824c23c410ffe62519677a7ba6ccca7b8ee507

See more details on using hashes here.

Provenance

The following attestation bundles were made for symscan-0.8.0-cp310-abi3-manylinux_2_28_aarch64.whl:

Publisher: release_python.yml on yutanagano/symscan

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file symscan-0.8.0-cp310-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for symscan-0.8.0-cp310-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 9afc480a45f2228305ec7d58fda399d8a5209b8dc5ce65e1c825339e9f2bb5e3
MD5 a44c19e10873fa6e15698f70e5846e0f
BLAKE2b-256 6e886ab6f21087a72287cc4f2e104e30f8e7a0757f9fa2409fe8fa1e9d5d30f9

See more details on using hashes here.

Provenance

The following attestation bundles were made for symscan-0.8.0-cp310-abi3-macosx_11_0_arm64.whl:

Publisher: release_python.yml on yutanagano/symscan

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file symscan-0.8.0-cp310-abi3-macosx_10_12_x86_64.whl.

File hashes

Hashes for symscan-0.8.0-cp310-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 40fa3b9babf62cc08f4b45b61aabe6c74a44c80edda548a1c7d785f80fcc0203
MD5 4d40fe4e1a858c3d06b92d349d9fca42
BLAKE2b-256 3748a6ca8e04ca8d44644d4ff3a15773cd503718d5041500fb6f01d0a0a061ac

See more details on using hashes here.

Provenance

The following attestation bundles were made for symscan-0.8.0-cp310-abi3-macosx_10_12_x86_64.whl:

Publisher: release_python.yml on yutanagano/symscan

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
