Fast CPU implementation of MaxSim scoring

maxsim-cpu

maxsim-cpu is a high-performance CPU implementation of MaxSim scoring for late-interaction (ColBERT, ColPali) workflows.

It is a Python library written in Rust, powered by libxsmm on x86 CPUs and Apple Accelerate on ARM Macs. It currently supports only Linux x86 machines and ARM Macs.

maxsim-cpu is built to run exclusively on CPU and achieves speed-ups that scale with the core count of the scoring machine. It is designed for situations where index/scoring machines do not have access to GPUs, and achieves ~2-3x speed-ups on ARM Macs and 5x speed-ups on Linux CPUs over common PyTorch MaxSim implementations.

It also implements effective just-in-time batching and padding for variable-length documents, greatly reducing padding overhead and needless computation.
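To give a feel for why batching matters with variable-length documents, here is a minimal sketch that counts padded token slots under two strategies. It is an illustration only and makes no claim about how maxsim-cpu batches internally: the helper `padded_cost`, the batch size of 32, and the sort-then-batch strategy are all assumptions chosen for the example.

```python
import numpy as np


def padded_cost(lengths, batch_size):
    """Token slots used when each consecutive batch is padded to its longest document.

    Hypothetical helper for illustration; not part of maxsim-cpu.
    """
    lengths = np.asarray(lengths)
    total = 0
    for start in range(0, len(lengths), batch_size):
        batch = lengths[start:start + batch_size]
        total += int(batch.max()) * len(batch)
    return total


rng = np.random.default_rng(0)
lengths = rng.integers(50, 800, size=1000)  # same length range as the usage example

real = int(lengths.sum())                   # tokens that actually exist
naive = int(lengths.max()) * len(lengths)   # pad every document to the global maximum
bucketed = padded_cost(np.sort(lengths), batch_size=32)  # sort by length, pad per batch

print(f"real tokens:         {real}")
print(f"global-max padding:  {naive} ({naive / real:.2f}x)")
print(f"sorted mini-batches: {bucketed} ({bucketed / real:.2f}x)")
```

Grouping documents of similar length keeps each batch close to its real token count, which is the general idea behind reducing padding overhead.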

Getting Started

Pre-built wheels are available on PyPI for Python 3.9 through 3.13 and can be installed in the usual way:

uv pip install maxsim-cpu # You may use vanilla pip install but why would you? If you're sophisticated, you could use `uv add` too!

Once installed, the simple API exposes two functions. For uniform-length inputs, use maxsim_scores:

import numpy as np
import maxsim_cpu

# Prepare normalized embeddings
query = np.random.randn(32, 128).astype(np.float32)  # [num_query_tokens, dim]

# NOTE: maxsim-cpu expects normalized vectors.
query /= np.linalg.norm(query, axis=1, keepdims=True)

docs = np.random.randn(1000, 512, 128).astype(np.float32)  # [num_docs, doc_len, dim]
# Normalize document embeddings the same way, along the embedding dimension
docs /= np.linalg.norm(docs, axis=2, keepdims=True)

# Compute MaxSim scores
scores = maxsim_cpu.maxsim_scores(query, docs)  # Returns [num_docs] scores
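For reference, MaxSim takes, for each query token, the highest similarity to any document token, then sums those maxima over the query tokens. The sketch below is a plain NumPy version of that definition for uniform-length inputs. The function name `maxsim_reference` is made up for this example and it is not the library's implementation; it is only meant as something to compare results against, within floating-point tolerance.

```python
import numpy as np


def maxsim_reference(query, docs):
    """Plain NumPy MaxSim (hypothetical helper, not part of maxsim-cpu).

    query: [num_query_tokens, dim]
    docs:  [num_docs, doc_len, dim]
    returns: [num_docs] scores
    """
    # Similarity of every query token with every document token: [num_docs, q, doc_len]
    sims = np.einsum("qd,nld->nql", query, docs)
    # Best-matching document token per query token, summed over query tokens
    return sims.max(axis=2).sum(axis=1)


rng = np.random.default_rng(0)
query = rng.standard_normal((32, 128)).astype(np.float32)
query /= np.linalg.norm(query, axis=1, keepdims=True)
docs = rng.standard_normal((10, 64, 128)).astype(np.float32)
docs /= np.linalg.norm(docs, axis=2, keepdims=True)

print(maxsim_reference(query, docs).shape)
```

With normalized vectors the dot product equals cosine similarity, which is why the inputs are normalized first.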

For variable-length inputs, use the alternative maxsim_scores_variable:

import numpy as np
import maxsim_cpu

# Prepare normalized embeddings
query = np.random.randn(32, 128).astype(np.float32)  # [num_query_tokens, dim]

# NOTE: maxsim-cpu expects normalized vectors.
query /= np.linalg.norm(query, axis=1, keepdims=True)

# Create variable-length documents as a list
docs = [
    np.random.randn(np.random.randint(50, 800), 128).astype(np.float32)  # Variable length docs
    for _ in range(1000)
]
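# Normalize each document's embeddings along the embedding dimension
docs = [doc / np.linalg.norm(doc, axis=1, keepdims=True) for doc in docs]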

# Compute MaxSim scores
scores = maxsim_cpu.maxsim_scores_variable(query, docs)  # Returns [num_docs] scores
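The variable-length case can be sketched in NumPy in two ways: a per-document loop, and a padded batch with a mask. Both helpers below (`maxsim_loop`, `maxsim_padded`) are hypothetical names written for this example, not the library's code. The padded version shows why padding has to be masked: zero padding yields a similarity of 0, which would win the max whenever all real similarities for a query token are negative.

```python
import numpy as np


def maxsim_loop(query, docs):
    """One matrix product per document; no padding needed."""
    return np.array([(query @ doc.T).max(axis=1).sum() for doc in docs])


def maxsim_padded(query, docs):
    """Pad to the longest document, then mask the padding out of the max."""
    max_len = max(len(doc) for doc in docs)
    padded = np.zeros((len(docs), max_len, query.shape[1]), dtype=query.dtype)
    mask = np.zeros((len(docs), max_len), dtype=bool)
    for i, doc in enumerate(docs):
        padded[i, :len(doc)] = doc
        mask[i, :len(doc)] = True
    sims = np.einsum("qd,nld->nql", query, padded)
    # Padded positions must never be selected as the best match
    sims = np.where(mask[:, None, :], sims, -np.inf)
    return sims.max(axis=2).sum(axis=1)


rng = np.random.default_rng(0)
query = rng.standard_normal((8, 16)).astype(np.float32)
docs = [rng.standard_normal((n, 16)).astype(np.float32) for n in (3, 7, 5)]

print(np.allclose(maxsim_loop(query, docs), maxsim_padded(query, docs), atol=1e-4))
```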

Platform Requirements

  • macOS ARM: Apple Silicon (M1+)
  • macOS Intel: x86_64 with AVX2 (Intel Haswell 2013+ - Core i3/i5/i7 4xxx series or newer)
  • Linux: x86_64 with AVX2 (Intel Haswell 2013+, AMD Excavator 2015+)

We currently do not support Windows or take advantage of AVX512 instructions, nor do we optimise caching for specific CPUs. Contributions/PRs in this direction are welcome!

Note: Pre-built wheels on PyPI are currently only available for Linux x86_64 and macOS ARM (Apple Silicon). Intel Mac users need to build from source (see below).
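The requirements above can be checked from Python before installing. The following is a rough sketch: `detect_avx2` and `install_route` are hypothetical helpers written for this example, AVX2 detection is only attempted on Linux (by reading /proc/cpuinfo), and an unknown AVX2 status is treated optimistically.

```python
import platform
from pathlib import Path


def detect_avx2():
    """Return True/False on Linux based on /proc/cpuinfo, or None when unknown."""
    cpuinfo = Path("/proc/cpuinfo")
    if platform.system() != "Linux" or not cpuinfo.exists():
        return None
    return "avx2" in cpuinfo.read_text(errors="ignore").split()


def install_route(system, machine, has_avx2):
    """Map a platform to the install path described in the requirements above."""
    machine = machine.lower()
    if system == "Darwin" and machine == "arm64":
        return "prebuilt wheel"
    # has_avx2 is None when detection was not possible; assume it may be present
    if machine in ("x86_64", "amd64") and has_avx2 is not False:
        if system == "Linux":
            return "prebuilt wheel"
        if system == "Darwin":
            return "build from source"
    return "unsupported"


print(install_route(platform.system(), platform.machine(), detect_avx2()))
```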

Building

We use maturin as our build system.

Linux

The easy way to build maxsim-cpu from source on Linux is as follows:

# Install necessary system deps
apt-get install libssl-dev libopenblas-dev -y
apt-get install pkg-config -y
# Install tooling
uv pip install maturin patchelf numpy
# Clone and build libxsmm as a static library, then return to the parent directory
git clone git@github.com:libxsmm/libxsmm.git && cd libxsmm && make STATIC=1 && cd ..
# Install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
. "$HOME/.cargo/env"
# Clone and install maxsim-cpu
git clone git@github.com:mixedbread-ai/maxsim-cpu.git
cd maxsim-cpu
RUSTFLAGS="-L native=$(pwd)/../libxsmm/lib" maturin build --release --features use-libxsmm

Step by step:

  • Installs OpenSSL and OpenBLAS, which are required for compiling, along with pkg-config so the build can locate them.
  • Installs the Python build tooling (maturin, patchelf and numpy).
  • Clones and builds libxsmm, on which most of the performance depends.
  • Installs Rust and loads its environment.
  • Clones this repository and builds it.

You may adapt the script and skip any step whose dependencies are already present on your machine. By default, maturin build writes the resulting wheel to target/wheels, from where it can be installed like any other wheel.

Mac

On Mac, the build is simpler, assuming you use Homebrew:

For Apple Silicon (M1+):

# Install maturin
uv pip install maturin
# Install patchelf
brew install patchelf
# Install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
. "$HOME/.cargo/env"
# Clone and install maxsim-cpu
git clone git@github.com:mixedbread-ai/maxsim-cpu.git
cd maxsim-cpu
maturin build --release

For Intel Mac (x86_64):

# Install maturin
uv pip install maturin
# Install patchelf
brew install patchelf
# Install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
. "$HOME/.cargo/env"
# Clone and install maxsim-cpu
git clone git@github.com:mixedbread-ai/maxsim-cpu.git
cd maxsim-cpu
# Build with AVX2 support (requires Intel Haswell 2013+ or newer)
RUSTFLAGS="-C target-cpu=haswell" maturin build --release

Performance

For documents of uniform length, performance on Linux is slower than JAX on 4-core machines, somewhat faster or slower at 8 cores depending on the CPU, and always faster than the alternatives on ARM Macs. For variable document lengths (evaluated with lengths drawn uniformly between 128 and 1536 tokens), maxsim-cpu is consistently fast thanks to more efficient batching.
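To reproduce this kind of comparison on your own machine, a small timing harness is enough. The sketch below is not the benchmark script behind the reported results; `best_of` and `numpy_baseline` are hypothetical helpers, and the document count of 50 is arbitrary. It times a NumPy baseline on documents whose lengths are drawn uniformly between 128 and 1536 tokens.

```python
import time

import numpy as np


def best_of(fn, *args, repeats=3):
    """Best wall-clock time of fn(*args) over several runs, after one warm-up call."""
    fn(*args)  # warm-up run, excluded from the timings
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        timings.append(time.perf_counter() - start)
    return min(timings)


def numpy_baseline(query, docs):
    """Per-document NumPy MaxSim, used here only as a point of comparison."""
    return np.array([(query @ doc.T).max(axis=1).sum() for doc in docs])


rng = np.random.default_rng(0)
query = rng.standard_normal((32, 128)).astype(np.float32)
docs = [
    rng.standard_normal((int(n), 128)).astype(np.float32)
    for n in rng.integers(128, 1537, size=50)  # upper bound is exclusive
]

print(f"NumPy baseline: {best_of(numpy_baseline, query, docs) * 1e3:.1f} ms")
```

With maxsim-cpu installed, passing `maxsim_cpu.maxsim_scores_variable` to `best_of` with the same arguments gives a directly comparable number.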

Mac M4 Ultra

[Figure: Mac M4 Ultra performance]

Linux AMD EPYC

32-core limit performance

[Figure: Linux AMD EPYC 32-core performance]

16-core limit performance

Our performance appears to have been hindered during benchmarking by a Rayon configuration issue when limiting the available cores. We are leaving the reported numbers as-is for now, but performance is expected to be considerably better on an actual 16-core CPU.

[Figure: Linux AMD EPYC 16-core performance]
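When limiting cores for a benchmark, one common approach with Rayon-based code is the RAYON_NUM_THREADS environment variable, which Rayon's default global thread pool reads when it is first created. Whether maxsim-cpu relies on that default pool is an assumption here, and this is not a confirmed fix for the issue described above. The helper `limit_threads` is hypothetical.

```python
import os


def limit_threads(n_threads, pin_cpus=False):
    """Ask Rayon's default pool for n_threads workers; optionally pin CPUs on Linux.

    Hypothetical helper. It must run before the Rust extension is imported,
    because the global thread pool is sized once, at first use.
    """
    if n_threads < 1:
        raise ValueError("n_threads must be at least 1")
    os.environ["RAYON_NUM_THREADS"] = str(n_threads)
    # CPU affinity is only exposed on some platforms (for example Linux)
    if pin_cpus and hasattr(os, "sched_setaffinity"):
        allowed = sorted(os.sched_getaffinity(0))[:n_threads]
        os.sched_setaffinity(0, allowed)


limit_threads(16)
# import maxsim_cpu  # import only after the limit has been set
print(os.environ["RAYON_NUM_THREADS"])
```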
