Scalable Vector Search

Scalable Vector Search (SVS) is a performance library for vector similarity search. Thanks to the use of Locally-adaptive Vector Quantization [ABHT23] and its highly optimized indexing and search algorithms, SVS provides vector similarity search:

  • on billions of high-dimensional vectors,
  • at high accuracy,
  • and at state-of-the-art speed,
  • while using less memory than its alternatives.

This lets application and framework developers who rely on similarity search get its full performance on Intel® Xeon® CPUs (2nd generation and newer).
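To make the compression idea concrete, here is a minimal NumPy sketch of per-vector scalar quantization in the spirit of Locally-adaptive Vector Quantization: the data is mean-centered, and each vector is then quantized with its own minimum and maximum as bounds. This is an illustrative simplification, not the SVS implementation; the function names `lvq_encode` and `lvq_decode` are hypothetical, and the 96-dimensional random data is only a stand-in.

```python
import numpy as np


def lvq_encode(data: np.ndarray, bits: int = 8):
    """Quantize each row with its own [min, max] range after mean-centering.

    Hypothetical helper for illustration only; not part of the SVS API.
    """
    if not 1 <= bits <= 8:
        raise ValueError("this sketch stores codes in uint8, so bits must be 1..8")
    mean = data.mean(axis=0)                      # global per-dimension mean
    centered = data - mean
    lower = centered.min(axis=1, keepdims=True)   # per-vector lower bound
    upper = centered.max(axis=1, keepdims=True)   # per-vector upper bound
    levels = (1 << bits) - 1
    # Guard against constant vectors, where upper == lower.
    step = np.where(upper > lower, (upper - lower) / levels, 1.0)
    codes = np.rint((centered - lower) / step).astype(np.uint8)
    return codes, lower, step, mean


def lvq_decode(codes, lower, step, mean):
    """Reconstruct approximate float vectors from the integer codes."""
    return codes * step + lower + mean


rng = np.random.default_rng(0)
vectors = rng.normal(size=(1000, 96)).astype(np.float32)  # stand-in data
codes, lower, step, mean = lvq_encode(vectors, bits=8)
restored = lvq_decode(codes, lower, step, mean)
max_error = float(np.abs(restored - vectors).max())
print("largest reconstruction error:", max_error)
```

Because the bounds adapt to each vector, the rounding error per component is at most half of that vector's own step size, at the cost of storing two extra constants per vector.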

SVS offers a full-featured yet simple Python API that is compatible with most standard libraries. The library itself is written in C++ to ease its integration into performance-critical applications.

Performance

SVS provides state-of-the-art performance and accuracy [ABHT23] for billion-scale similarity search on standard benchmarks.

For example, on the standard billion-scale Deep-1B dataset, different configurations of SVS yield significantly higher throughput (measured in queries per second, QPS) with a smaller memory footprint than the alternatives[^1].

SVS is primarily optimized for large-scale similarity search, but it still offers state-of-the-art performance at million scale.
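To give a sense of why memory footprint matters at this scale, the following back-of-the-envelope sketch computes the storage needed for the raw vectors alone under different element encodings. It assumes Deep-1B's one billion vectors have 96 dimensions each, and it ignores the index structure, padding, and any compression metadata, so the numbers are lower bounds rather than SVS measurements.

```python
import numpy as np

NUM_VECTORS = 1_000_000_000   # Deep-1B holds one billion vectors
DIMENSIONS = 96               # assumed dimensionality of Deep-1B vectors


def raw_vector_gib(dtype) -> float:
    """GiB needed to store the vectors alone, with no index or padding."""
    bytes_per_vector = DIMENSIONS * np.dtype(dtype).itemsize
    return NUM_VECTORS * bytes_per_vector / 2**30


footprints = {
    name: raw_vector_gib(name)
    for name in ("float32", "float16", "uint8", "int8")
}
for name, gib in footprints.items():
    print(f"{name:>8}: {gib:7.1f} GiB")
```

Halving the bytes per value halves the raw footprint, which is why 8-bit and compressed encodings can keep a billion-scale dataset in memory on a single machine.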

Best performance is obtained on 4th generation Intel® Xeon® processors (Sapphire Rapids) by making use of Intel® AVX-512 instructions, with excellent results also on 2nd and 3rd generation Intel® Xeon® processors (Cascade Lake and Ice Lake).

Performance will be degraded if Intel® AVX-512 instructions are not available. A warning message will appear when loading the SVS Python module if the system does not support Intel® AVX-512 instructions.
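If you want to check for AVX-512 support before installing, one option on Linux is to look for the `avx512f` (AVX-512 Foundation) flag in `/proc/cpuinfo`. The sketch below is an independent check, not the mechanism SVS uses to produce its warning, and the helper name `has_avx512` is hypothetical.

```python
from pathlib import Path


def has_avx512(cpuinfo_text: str) -> bool:
    """Return True if the first 'flags' line lists the AVX-512 Foundation flag."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            flags = line.partition(":")[2].split()
            return "avx512f" in flags
    return False


cpuinfo = Path("/proc/cpuinfo")
if cpuinfo.exists():  # the file only exists on Linux
    print("AVX-512 available:", has_avx512(cpuinfo.read_text()))
```

The parsing is kept separate from the file read so the function can be exercised with sample text on any platform.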

Key Features

SVS supports:

  • Similarity functions: Euclidean distance, inner product, cosine similarity.
  • Vectors with individual values encoded as: float32, float16, uint8, int8.
  • Vector compression (including Locally-adaptive Vector Quantization [ABHT23]).
  • Optimizations for Intel® Xeon® processors:
    • 2nd generation (Cascade Lake)
    • 3rd generation (Ice Lake)
    • 4th generation (Sapphire Rapids)
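For readers unfamiliar with the three similarity functions listed above, here is a plain NumPy reference sketch of each. It shows the textbook definitions only; the function names are hypothetical, and SVS's internal conventions (for example, whether a squared distance is used) may differ.

```python
import numpy as np


def euclidean_distance(a: np.ndarray, b: np.ndarray) -> float:
    """L2 distance: smaller means more similar."""
    return float(np.linalg.norm(a - b))


def inner_product(a: np.ndarray, b: np.ndarray) -> float:
    """Dot product: larger means more similar."""
    return float(np.dot(a, b))


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Inner product of the length-normalized vectors, in [-1, 1]."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


query = np.array([1.0, 0.0, 0.0], dtype=np.float32)
candidate = np.array([0.0, 2.0, 0.0], dtype=np.float32)

print("euclidean:", euclidean_distance(query, candidate))
print("inner product:", inner_product(query, candidate))
print("cosine:", cosine_similarity(query, candidate))
```

Note that cosine similarity equals the inner product once vectors are normalized to unit length, so normalizing the data up front is a common way to serve cosine queries with an inner-product index.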

See Roadmap for upcoming features.

Documentation

SVS documentation includes getting started tutorials with installation instructions for Python and C++ and step-by-step search examples, an API reference, as well as several guides and benchmark comparisons.

References

Reference to cite when you use SVS in a research paper:

@article{aguerrebere2023similarity,
        title={Similarity search in the blink of an eye with compressed indices},
        volume = {16},
        number = {11},
        pages = {3433--3446},
        journal = {Proceedings of the VLDB Endowment},
        author={Cecilia Aguerrebere and Ishwar Bhati and Mark Hildebrand and Mariano Tepper and Ted Willke},
        year = {2023}
}

[ABHT23] Aguerrebere, C.; Bhati, I.; Hildebrand, M.; Tepper, M.; Willke, T.: Similarity search in the blink of an eye with compressed indices. In: Proceedings of the VLDB Endowment, 16, 11, 3433-3446. (2023)

Legal

Refer to the LICENSE file for details.

[^1]: Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex. Performance results are based on testing as of dates shown in configurations and may not reflect all publicly available updates. No product or component can be absolutely secure. Your costs and results may vary. Intel technologies may require enabled hardware, software or service activation. © Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.

