A lightweight, zero-PyTorch ONNX encoder for generic ColBERT models.

intextus

License: MIT. Python: 3.9+.

ColBERT embedding and MaxSim scoring without PyTorch. Uses a native C++ extension (ONNX Runtime + tokenizers-cpp) so you don't need to pull in 2 GB of deep learning dependencies just to encode some text.

Install

pip install intextus-embed

The only runtime dependencies are numpy and huggingface-hub; the C++ parts (ONNX Runtime, tokenizer) are compiled into the wheel. Note that the package is installed as intextus-embed but imported as intextus.

Usage

from intextus import LateInteractionEncoder, compute_maxsim

model = LateInteractionEncoder()  # downloads intextus/mxbai-edge-colbert-v0-17m-onnx

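# q[0] and d[0] are the token embeddings of the first (here, only) input.
q = model.encode_queries("What is late interaction?")
d = model.encode_docs("ColBERT computes token-level similarity.")

# Late-interaction (MaxSim) score between the query and the document.
score = compute_maxsim(q[0], d[0])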
print(score)
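
For intuition about what the score means, here is a minimal NumPy sketch of the standard ColBERT MaxSim definition: for each query token, take the best dot product against all document tokens, then sum over query tokens. The function name maxsim_reference is hypothetical, and whether compute_maxsim matches this exactly (for example, in how it handles normalization) is an assumption, not something the README states.

```python
import numpy as np


def maxsim_reference(query_emb: np.ndarray, doc_emb: np.ndarray) -> float:
    """Late-interaction (MaxSim) score between one query and one document.

    query_emb: (num_query_tokens, dim) array of token embeddings.
    doc_emb:   (num_doc_tokens, dim) array of token embeddings.
    """
    # Token-by-token similarity matrix, shape (num_query_tokens, num_doc_tokens).
    sim = query_emb @ doc_emb.T
    # For each query token keep its best-matching document token, then sum.
    return float(sim.max(axis=1).sum())


# Toy example with hand-written, L2-normalized 2-d "embeddings".
query = np.array([[1.0, 0.0], [0.0, 1.0]])
doc = np.array([[1.0, 0.0], [0.6, 0.8]])
print(maxsim_reference(query, doc))
```

With L2-normalized embeddings each dot product is a cosine similarity, so every query token contributes at most 1 to the total.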

You can also point it at a local directory containing model.onnx and tokenizer.json:

model = LateInteractionEncoder("./my-model/")

Models

| Alias | Repo | Size | Dim | Notes |
|---|---|---|---|---|
| mxbai-edge-colbert-v0-17m | intextus/mxbai-edge-colbert-v0-17m-onnx | 66 MB | 48 | Default |
| mxbai-edge-colbert-v0-32m | intextus/mxbai-edge-colbert-v0-32m-onnx | 124 MB | 64 | |
| lateon | intextus/lateon-onnx | 580 MB | 128 | Case-sensitive: use do_lower_case=False |

Any ColBERT ONNX model should work if you put model.onnx and tokenizer.json in a folder and pass the path.
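
Before passing a custom folder, it can help to check that both files are actually there. The helper below is a hypothetical convenience, not part of intextus; it only relies on the two file names mentioned above.

```python
from pathlib import Path

# File names taken from the text above.
REQUIRED_FILES = ("model.onnx", "tokenizer.json")


def missing_model_files(model_dir):
    """Return the required file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


missing = missing_model_files("./my-model/")
if missing:
    print("Missing files:", ", ".join(missing))
else:
    print("Model directory looks complete.")
```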

How it works

  • Tokenization and inference run in C++ via a nanobind extension
  • The GIL is released during encode and MaxSim calls, so multiple threads can run in parallel
  • Punctuation tokens are masked out of document embeddings (standard ColBERT behavior)
  • Embeddings are L2-normalized by default
  • CPU only for now
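
Because the GIL is released during encoding, a plain thread pool is one way to spread work across cores. The sketch below is an illustration under assumptions: encode_in_threads and fake_encode are hypothetical names, and sharing a single encoder instance across threads is assumed to be safe based on the note above rather than confirmed by it. A stand-in function replaces the real encoder so the example runs without downloading a model.

```python
from concurrent.futures import ThreadPoolExecutor


def encode_in_threads(encode_one, texts, max_workers=4):
    """Apply encode_one to every text using a thread pool, preserving order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, whichever thread finishes first.
        return list(pool.map(encode_one, texts))


# Stand-in so the sketch runs without the model. With the real encoder this
# could be (assumption): encode_one = lambda text: model.encode_docs(text)[0]
def fake_encode(text):
    return len(text.split())


print(encode_in_threads(fake_encode, ["late interaction", "token level similarity"]))
```

Threads only help here because the heavy work happens outside the GIL; a pure-Python encode function like the stand-in would not speed up.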

License

MIT. See LICENSE.

Download files

Source distribution

  • intextus_embed-0.1.4.tar.gz (16.7 kB)

Built distributions

  • intextus_embed-0.1.4-cp313-cp313-win_amd64.whl (6.7 MB): CPython 3.13, Windows x86-64
  • intextus_embed-0.1.4-cp313-cp313-manylinux_2_28_x86_64.whl (21.4 MB): CPython 3.13, manylinux (glibc 2.28+) x86-64
  • intextus_embed-0.1.4-cp313-cp313-manylinux_2_28_aarch64.whl (18.8 MB): CPython 3.13, manylinux (glibc 2.28+) ARM64
  • intextus_embed-0.1.4-cp313-cp313-macosx_13_0_arm64.whl (16.6 MB): CPython 3.13, macOS 13.0+ ARM64
  • intextus_embed-0.1.4-cp312-cp312-win_amd64.whl (6.7 MB): CPython 3.12, Windows x86-64
  • intextus_embed-0.1.4-cp312-cp312-manylinux_2_28_x86_64.whl (21.4 MB): CPython 3.12, manylinux (glibc 2.28+) x86-64
  • intextus_embed-0.1.4-cp312-cp312-manylinux_2_28_aarch64.whl (18.8 MB): CPython 3.12, manylinux (glibc 2.28+) ARM64
  • intextus_embed-0.1.4-cp312-cp312-macosx_13_0_arm64.whl (16.6 MB): CPython 3.12, macOS 13.0+ ARM64
  • intextus_embed-0.1.4-cp311-cp311-win_amd64.whl (6.7 MB): CPython 3.11, Windows x86-64
  • intextus_embed-0.1.4-cp311-cp311-manylinux_2_28_x86_64.whl (21.4 MB): CPython 3.11, manylinux (glibc 2.28+) x86-64
  • intextus_embed-0.1.4-cp311-cp311-manylinux_2_28_aarch64.whl (18.8 MB): CPython 3.11, manylinux (glibc 2.28+) ARM64
  • intextus_embed-0.1.4-cp311-cp311-macosx_13_0_arm64.whl (16.6 MB): CPython 3.11, macOS 13.0+ ARM64
  • intextus_embed-0.1.4-cp310-cp310-win_amd64.whl (6.7 MB): CPython 3.10, Windows x86-64
  • intextus_embed-0.1.4-cp310-cp310-manylinux_2_28_x86_64.whl (21.4 MB): CPython 3.10, manylinux (glibc 2.28+) x86-64
  • intextus_embed-0.1.4-cp310-cp310-manylinux_2_28_aarch64.whl (18.8 MB): CPython 3.10, manylinux (glibc 2.28+) ARM64
  • intextus_embed-0.1.4-cp310-cp310-macosx_13_0_arm64.whl (16.6 MB): CPython 3.10, macOS 13.0+ ARM64
  • intextus_embed-0.1.4-cp39-cp39-win_amd64.whl (6.7 MB): CPython 3.9, Windows x86-64
  • intextus_embed-0.1.4-cp39-cp39-manylinux_2_28_x86_64.whl (21.4 MB): CPython 3.9, manylinux (glibc 2.28+) x86-64
  • intextus_embed-0.1.4-cp39-cp39-manylinux_2_28_aarch64.whl (18.8 MB): CPython 3.9, manylinux (glibc 2.28+) ARM64
  • intextus_embed-0.1.4-cp39-cp39-macosx_13_0_arm64.whl (16.6 MB): CPython 3.9, macOS 13.0+ ARM64
