SIMD-accelerated similarity measures for x86 and Arm

SimSIMD 📏

Efficient Alternative to scipy.spatial.distance and numpy.inner

SimSIMD is written with SIMD intrinsics, which reach hardware capabilities that few compilers exploit well through auto-vectorization. It supports the common AVX2 instructions on x86 and NEON on Arm, as well as the less widely available AVX-512 FP16 instructions on x86 and the Scalable Vector Extension (SVE) on Arm. It is designed for machine learning workloads and optimized for high-dimensional vector embeddings.

  • 3-200x faster than NumPy and SciPy distance functions.
  • ✅ Euclidean (L2), Inner Product, and Cosine (Angular) distances.
  • ✅ Single-precision f32, half-precision f16, and i8 vectors.
  • ✅ Compatible with NumPy, PyTorch, TensorFlow, and other tensors.
  • ✅ Has no dependencies, not even LibC.

Benchmarks

Apple M2 Pro

Given 10,000 embeddings from OpenAI Ada API with 1536 dimensions, running on the Apple M2 Pro Arm CPU with NEON support, here's how SimSIMD performs against conventional methods:

Conventional                       | SimSIMD     | f32 improvement | f16 improvement | i8 improvement
scipy.spatial.distance.cosine      | cosine      | 39 x            | 84 x            | 196 x
scipy.spatial.distance.sqeuclidean | sqeuclidean | 8 x             | 25 x            | 22 x
numpy.inner                        | inner       | 3 x             | 10 x            | 18 x

Intel Sapphire Rapids

On the Intel Sapphire Rapids platform, SimSIMD was benchmarked against code auto-vectorized by GCC 12. GCC handles single-precision float and int8_t well. However, it performs poorly on _Float16 arrays, even though the type has been specified as an extension to C11 since ISO/IEC TS 18661-3 (2015).

Metric      | GCC 12 f32 | GCC 12 f16 | SimSIMD f16 | f16 improvement
cosine      | 3.28 M/s   | 336.29 k/s | 6.88 M/s    | 20 x
sqeuclidean | 4.62 M/s   | 147.25 k/s | 5.32 M/s    | 36 x
inner       | 3.81 M/s   | 192.02 k/s | 5.99 M/s    | 31 x

Technical Insights:

  • Uses Arm SVE and x86 AVX-512's masked loads to eliminate tail for-loops.
  • Substitutes LibC's sqrt calls with bithacks using Jan Kadlec's constant.
  • Avoids slow PyBind11 and SWIG, directly using the CPython C API.
  • Avoids slow PyArg_ParseTuple and manually unpacks argument tuples.
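
The square-root bithack mentioned above can be illustrated in plain Python. This is a minimal sketch, not SimSIMD's actual code: the magic constant 0x5F1FFFF9 and the two refinement coefficients are the values commonly attributed to Jan Kadlec, and the function name approx_rsqrt is hypothetical. It approximates 1/sqrt(x), which is the quantity needed to normalize a dot product into a cosine.

```python
import math
import struct

# Assumption: these are the values commonly attributed to Jan Kadlec;
# the constants used inside SimSIMD itself may differ.
KADLEC_MAGIC = 0x5F1FFFF9


def approx_rsqrt(x: float) -> float:
    """Approximate 1/sqrt(x) for a positive, normal float32 value."""
    # Reinterpret the float32 bit pattern as an unsigned 32-bit integer.
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    # Shifting halves the exponent; subtracting from the magic negates it.
    bits = (KADLEC_MAGIC - (bits >> 1)) & 0xFFFFFFFF
    (guess,) = struct.unpack("<f", struct.pack("<I", bits))
    # One tuned Newton-style refinement step.
    return 0.703952253 * guess * (2.38924456 - x * guess * guess)


for value in (0.25, 1.0, 2.0, 1536.0):
    exact = 1.0 / math.sqrt(value)
    print(value, approx_rsqrt(value), exact)
```

The approximation trades a small relative error for the absence of a library call and a division, which matters when the result only rescales a similarity score.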

Using in Python

Installation

pip install simsimd

Distance Between 2 Vectors

import simsimd
import numpy as np

vec1 = np.random.randn(1536).astype(np.float32)
vec2 = np.random.randn(1536).astype(np.float32)
dist = simsimd.cosine(vec1, vec2)
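
For reference, the three metrics can be written in a few lines of plain Python. The definitions below follow scipy.spatial.distance.cosine, scipy.spatial.distance.sqeuclidean, and numpy.inner; that SimSIMD returns exactly the same values is an assumption based on the benchmark pairing above, and the small input vectors are made up for illustration.

```python
import math


def inner(a, b):
    # Plain dot product, as computed by numpy.inner for 1-D inputs.
    return sum(x * y for x, y in zip(a, b))


def sqeuclidean(a, b):
    # Squared Euclidean (L2) distance, without the final square root.
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cosine(a, b):
    # Cosine distance: one minus the cosine of the angle between a and b.
    return 1.0 - inner(a, b) / math.sqrt(inner(a, a) * inner(b, b))


a = [1.0, 0.0, 2.0]
b = [0.0, 1.0, 2.0]
print(inner(a, b))        # 4.0
print(sqeuclidean(a, b))  # 2.0
print(cosine(a, b))       # approximately 0.2
```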

Distance Between 2 Batches

batch1 = np.random.randn(100, 1536).astype(np.float32)
batch2 = np.random.randn(100, 1536).astype(np.float32)
dist = simsimd.cosine(batch1, batch2)

All Pairwise Distances

For calculating distances between all possible pairs of rows across two matrices (akin to scipy.spatial.distance.cdist):

matrix1 = np.random.randn(1000, 1536).astype(np.float32)
matrix2 = np.random.randn(10, 1536).astype(np.float32)
distances = simsimd.cdist(matrix1, matrix2, metric="cosine")
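
The difference between the batch call and cdist is easiest to see in the output shapes. The sketch below uses plain Python lists and hypothetical helper names (rowwise, all_pairs, random_rows); it assumes the batch form pairs row i with row i, while the all-pairs form mirrors scipy.spatial.distance.cdist and returns one row of distances per row of the first matrix.

```python
import math
import random


def cosine(a, b):
    # Cosine distance between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return 1.0 - dot / norm


def rowwise(batch1, batch2):
    # Row i of batch1 against row i of batch2: N distances for N rows.
    return [cosine(a, b) for a, b in zip(batch1, batch2)]


def all_pairs(matrix1, matrix2):
    # Every row of matrix1 against every row of matrix2: an N x M grid.
    return [[cosine(a, b) for b in matrix2] for a in matrix1]


def random_rows(count, dims):
    return [[random.gauss(0.0, 1.0) for _ in range(dims)] for _ in range(count)]


random.seed(0)
batch1, batch2 = random_rows(4, 8), random_rows(4, 8)
matrix1, matrix2 = random_rows(5, 8), random_rows(3, 8)

print(len(rowwise(batch1, batch2)))  # 4
grid = all_pairs(matrix1, matrix2)
print(len(grid), len(grid[0]))       # 5 3
```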

Multithreading

By default, computations use a single CPU core. To use all available CPU cores on Linux systems, pass threads=0. Alternatively, pass a specific number of threads:

distances = simsimd.cdist(matrix1, matrix2, metric="cosine", threads=0)

Hardware Backend Capabilities

To view a list of hardware backends that SimSIMD supports:

print(simsimd.get_capabilities())

Using Python API with USearch

Want to use it in Python with USearch? You can wrap the raw C function pointers of the SimSIMD backends into a CompiledMetric and pass it to USearch, similar to how USearch handles Numba's JIT-compiled code.

from usearch.index import Index, CompiledMetric, MetricKind, MetricSignature
from simsimd import pointer_to_sqeuclidean, pointer_to_cosine, pointer_to_inner

metric = CompiledMetric(
    pointer=pointer_to_cosine("f16"),
    kind=MetricKind.Cos,
    signature=MetricSignature.ArrayArraySize,
)

index = Index(256, metric=metric)

Using SimSIMD in C

If you want to use the _Float16 functionality of SimSIMD, make sure your toolchain supports C11. For the rest of SimSIMD, C99 support is sufficient.

For integration within a CMake-based project, add the following segment to your CMakeLists.txt:

# The FetchContent module must be included before its commands are used.
include(FetchContent)
FetchContent_Declare(
    simsimd
    GIT_REPOSITORY https://github.com/ashvardanian/simsimd.git
    GIT_SHALLOW TRUE
)
FetchContent_MakeAvailable(simsimd)
include_directories(${simsimd_SOURCE_DIR}/include)

Use the most recent compiler available for your platform, so that you benefit from the newest intrinsics.

To use SimSIMD within USearch, compile USearch with the flag USEARCH_USE_SIMSIMD=1. This is the default setting on most platforms.

Upcoming Features

Here's a glance at the exciting developments on our horizon:

  • Exposing Hamming and Tanimoto bitwise distances to the Python interface.
  • Intel AMX backend. Note: Currently, the intrinsics are functional only with Intel's latest compiler.

To rerun the experiments, use the following command:

cmake -DCMAKE_BUILD_TYPE=Release -DSIMSIMD_BUILD_BENCHMARKS=1 -B ./build_release && make -C ./build_release && ./build_release/simsimd_bench

To run the tests with PyTest:

pip install -e . && pytest python/test.py -s -x
