SIMD-accelerated similarity measures for x86 and Arm

SimSIMD 📏

SIMD-accelerated similarity measures, metrics, and distance functions for x86 and Arm. They are tuned for machine learning applications and mid-size vectors with 100 to 1024 dimensions. The following table shows the throughput to expect for cosine (angular) distance, the most common metric in AI.

Method  Vectors  Any Length  Speed on 256b  Speed on 1024b
Serial  f32      yes         5 GB/s         5 GB/s
SVE     f32      yes         34 GB/s        40 GB/s
SVE     f16      yes         28 GB/s        35 GB/s
NEON    f16      no          16 GB/s        18 GB/s

The benchmarks were run on Arm-based "Graviton 3" CPUs powering AWS c7g.metal instances. The Arm NEON implementation is only used with vector lengths that are multiples of 128 bits, which avoids extra head or tail loops for misaligned data. By default, the benchmarks are compiled with GCC 12 using -O3 and -march=native. Serial versions use auto-vectorization pragmas.
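For reference, the quantity being benchmarked can be written in a few lines of plain Python. This is a minimal sketch of the textbook cosine distance, not the library's implementation; the function name cosine_distance and the choice to reject zero vectors are assumptions made for illustration, since the source does not say how SimSIMD treats that case.

```python
import math


def cosine_distance(a, b):
    """Textbook cosine distance: 1 - dot(a, b) / (|a| * |b|)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption: treat the undefined case as an error.
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)
```

A scalar loop like this is useful mainly as a correctness baseline when comparing against an accelerated kernel.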

Need something like this in your CMake-based project?

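# FetchContent must be included before its commands can be used
include(FetchContent)
FetchContent_Declare(
    simsimd
    GIT_REPOSITORY https://github.com/ashvardanian/simsimd.git
    GIT_SHALLOW TRUE
)
FetchContent_MakeAvailable(simsimd)
# Expose the SimSIMD headers to targets defined after this point
include_directories(${simsimd_SOURCE_DIR}/include)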

Want to use it in Python with USearch?

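from usearch import Index, CompiledMetric, MetricKind, MetricSignature
from simsimd import to_int, cos_f32x4_neon

# cos_f32x4_neon requires Arm NEON and a dimension count divisible by 4
metric = CompiledMetric(
    pointer=to_int(cos_f32x4_neon),  # address of the compiled function
    kind=MetricKind.Cos,
    signature=MetricSignature.ArrayArraySize,  # two pointers and a length
)

# 256 dimensions satisfies the d % 4 == 0 constraint
index = Index(ndim=256, metric=metric)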
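The pointer argument above is the integer address of a compiled function. The sketch below uses only the standard ctypes module to show the idea with a stand-in callback. The C shape float f(const float*, const float*, size_t) is an assumption about what a "two arrays and size" signature looks like, and METRIC_FN, dot_callback, and address_of are hypothetical names that belong to neither SimSIMD nor USearch.

```python
import ctypes

# Assumed C shape of a "two pointers and length" metric:
#   float metric(const float *a, const float *b, size_t n);
METRIC_FN = ctypes.CFUNCTYPE(
    ctypes.c_float,
    ctypes.POINTER(ctypes.c_float),
    ctypes.POINTER(ctypes.c_float),
    ctypes.c_size_t,
)


def _dot(a, b, n):
    # Stand-in for a compiled kernel: a plain dot product over n floats.
    return sum(a[i] * b[i] for i in range(n))


# Keep a reference to the callback so it is not garbage-collected.
dot_callback = METRIC_FN(_dot)


def address_of(fn) -> int:
    """Return the raw address of a ctypes function object as an integer."""
    return ctypes.cast(fn, ctypes.c_void_p).value


if __name__ == "__main__":
    a = (ctypes.c_float * 4)(1, 2, 3, 4)
    b = (ctypes.c_float * 4)(4, 3, 2, 1)
    addr = address_of(dot_callback)
    # Rebuild a callable from the integer address, as a consumer would.
    print(METRIC_FN(addr)(a, b, 4))
```

Passing an integer address keeps the Python layer out of the hot path: the consumer can call the native function directly for every pair of vectors.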

Available Metrics

In the C99 interface, all function names are prefixed with simsimd_. The signature defines the number of arguments:

  • ✳️✳️📏: two pointers and a length,
  • ✳️✳️: two pointers.

The latter is intended for cases where the number of dimensions is hard-coded. Constraints define the limitations on the number of dimensions an argument vector can have.
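Constraints of this kind are easy to validate before a kernel is called. The helper below is a minimal sketch under the assumption that a constraint is either "divisible by some lane count" or "exactly some fixed size"; check_dimensions and its parameters are hypothetical names, not part of the library.

```python
def check_dimensions(d, multiple_of=None, exactly=None):
    """Return True if a vector with `d` dimensions satisfies the constraint.

    multiple_of: required divisor of `d`, e.g. 4 for a "d % 4 == 0" constraint.
    exactly:     required fixed size, e.g. 166 for a "d == 166" constraint.
    With neither argument set, any positive `d` is accepted.
    """
    if d <= 0:
        return False
    if exactly is not None and d != exactly:
        return False
    if multiple_of is not None and d % multiple_of != 0:
        return False
    return True
```

For example, check_dimensions(256, multiple_of=4) passes, while check_dimensions(250, multiple_of=4) does not, so a caller could fall back to a kernel without constraints in the second case.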

Name                    Signature  ISA Extension  Constraints
dot_f32_sve             ✳️✳️📏     Arm SVE
dot_f32x4_neon          ✳️✳️📏     Arm NEON       d % 4 == 0
cos_f32_sve             ✳️✳️📏     Arm SVE
cos_f16_sve             ✳️✳️📏     Arm SVE
cos_f16x4_neon          ✳️✳️📏     Arm NEON       d % 4 == 0
cos_i8x16_neon          ✳️✳️📏     Arm NEON       d % 16 == 0
cos_f32x4_neon          ✳️✳️📏     Arm NEON       d % 4 == 0
cos_f16x16_avx512       ✳️✳️📏     x86 AVX-512    d % 16 == 0
cos_f32x4_avx2          ✳️✳️📏     x86 AVX2       d % 4 == 0
l2sq_f32_sve            ✳️✳️📏     Arm SVE
l2sq_f16_sve            ✳️✳️📏     Arm SVE
hamming_b1x8_sve        ✳️✳️📏     Arm SVE        d % 8 == 0
hamming_b1x128_sve      ✳️✳️📏     Arm SVE        d % 128 == 0
hamming_b1x128_avx512   ✳️✳️📏     x86 AVX-512    d % 128 == 0
tanimoto_b1x8_naive     ✳️✳️📏                    d % 8 == 0
tanimoto_maccs_naive    ✳️✳️                      d == 166
tanimoto_maccs_neon     ✳️✳️       Arm NEON       d == 166
tanimoto_maccs_sve      ✳️✳️       Arm SVE        d == 166
tanimoto_maccs_avx512   ✳️✳️       x86 AVX-512    d == 166
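The name prefixes above (dot, l2sq, hamming, tanimoto) correspond to well-known measures. The scalar definitions below are a sketch of the textbook formulas, useful as a baseline for testing; they are not the library's code. In particular, whether SimSIMD reports Tanimoto as a distance or a similarity, and how it handles two all-zero bit strings, is not stated here, so the conventions chosen below are assumptions.

```python
def dot(a, b):
    """Inner product of two equal-length numeric sequences."""
    return sum(x * y for x, y in zip(a, b))


def l2sq(a, b):
    """Squared Euclidean distance (no square root taken)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _popcount(byte):
    # Number of set bits in one byte value (0..255).
    return bin(byte).count("1")


def hamming_bits(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length bit strings."""
    return sum(_popcount(x ^ y) for x, y in zip(a, b))


def tanimoto_distance_bits(a: bytes, b: bytes) -> float:
    """1 - |a AND b| / |a OR b| over packed bits (Jaccard distance)."""
    intersection = sum(_popcount(x & y) for x, y in zip(a, b))
    union = sum(_popcount(x | y) for x, y in zip(a, b))
    if union == 0:
        # Assumption: two empty bit sets are treated as identical.
        return 0.0
    return 1.0 - intersection / union
```

Bit strings are packed eight dimensions per byte, which is why the binary kernels require the dimension count to be a multiple of 8 or 128.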

Benchmarks

The benchmarks are repeated for every function with a different number of cores involved. Lightweight distance functions tend to be memory-bound, so multi-core performance may be lower when the memory bus cannot supply data fast enough to keep all the cores busy. Similarly, heavyweight distance functions running on all cores may trigger CPU frequency downclocking. This is well illustrated by comparing single-core and 32-thread results on the Intel i9-13950HX, equipped with DDR5 memory.

Method                Threads  Vector Size  Speed
dot_f32x4_avx2        1        1024 b       96.2 GB/s
dot_f32x4_avx2        32       1024 b       23.6 GB/s
cos_f32_naive         1        1024 b       15.3 GB/s
cos_f32_naive         32       1024 b       4.5 GB/s
cos_f32x4_avx2        1        1024 b       56.3 GB/s
cos_f32x4_avx2        32       1024 b       13.9 GB/s
tanimoto_maccs_naive  1        21 b         2.8 GB/s
tanimoto_maccs_naive  32       21 b         1.2 GB/s

Switching to the Intel Sapphire Rapids server platform, we can also evaluate some of the AVX-512 extensions, including VPOPCNTDQ and F16.

Method                 Threads  Vector Size  Speed
dot_f32x4_avx2         1        1024 b       57.8 GB/s
dot_f32x4_avx2         224      1024 b       16.1 GB/s
cos_f32_naive          1        1024 b       10.7 GB/s
cos_f32_naive          224      1024 b       3.0 GB/s
cos_f32x4_avx2         1        1024 b       39.5 GB/s
cos_f32x4_avx2         224      1024 b       15.1 GB/s
cos_f16x16_avx512      1        1024 b       50.6 GB/s
cos_f16x16_avx512      224      1024 b       15.9 GB/s
hamming_b1x128_avx512  1        1024 b       790.3 GB/s
hamming_b1x128_avx512  224      1024 b       259.3 GB/s
tanimoto_maccs_naive   1        21 b         3.0 GB/s
tanimoto_maccs_naive   224      21 b         1.3 GB/s
tanimoto_maccs_avx512  1        21 b         13.1 GB/s
tanimoto_maccs_avx512  224      21 b         3.7 GB/s
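To put these numbers in perspective, the relative gain of each accelerated kernel over its naive counterpart can be derived directly from the Sapphire Rapids figures. The snippet below is a small sketch that only does this arithmetic; the dictionary name SAPPHIRE_RAPIDS and the helper speedup are hypothetical, and the values are copied from the table above.

```python
# (1 thread, 224 threads) throughput in GB/s, copied from the table above.
SAPPHIRE_RAPIDS = {
    "cos_f32_naive": (10.7, 3.0),
    "cos_f32x4_avx2": (39.5, 15.1),
    "cos_f16x16_avx512": (50.6, 15.9),
    "tanimoto_maccs_naive": (3.0, 1.3),
    "tanimoto_maccs_avx512": (13.1, 3.7),
}


def speedup(fast, slow, column=0):
    """Throughput ratio of `fast` over `slow`.

    column=0 compares single-thread rows, column=1 the 224-thread rows.
    """
    return SAPPHIRE_RAPIDS[fast][column] / SAPPHIRE_RAPIDS[slow][column]


if __name__ == "__main__":
    pairs = [
        ("cos_f32x4_avx2", "cos_f32_naive"),
        ("cos_f16x16_avx512", "cos_f32_naive"),
        ("tanimoto_maccs_avx512", "tanimoto_maccs_naive"),
    ]
    for fast, slow in pairs:
        print(f"{fast} vs {slow}: {speedup(fast, slow):.1f}x on one thread")
```

On a single thread this gives roughly 3.7x for cos_f32x4_avx2 and 4.4x for tanimoto_maccs_avx512 over their naive versions.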

To replicate this on your hardware, please run the following on Linux:

git clone https://github.com/ashvardanian/SimSIMD.git && cd SimSIMD
cmake -DCMAKE_BUILD_TYPE=Release -DSIMSIMD_BUILD_BENCHMARKS=1 \
    -DCMAKE_CXX_COMPILER="g++-12" -DCMAKE_C_COMPILER="gcc-12" \
    -B ./build && make -C ./build && ./build/simsimd_bench

On macOS:

brew install llvm
git clone https://github.com/ashvardanian/SimSIMD.git && cd SimSIMD
cmake -B ./build \
    -DCMAKE_C_COMPILER="/opt/homebrew/opt/llvm/bin/clang" \
    -DCMAKE_CXX_COMPILER="/opt/homebrew/opt/llvm/bin/clang++" \
    -DSIMSIMD_BUILD_BENCHMARKS=1 \
    && \
    make -C ./build -j && ./build/simsimd_bench

Install and test locally:

pip install -e . && pytest python/test.py -s -x

This version

1.2.0
