A lightweight, zero-PyTorch ONNX encoder for generic ColBERT models.

intextus

License: MIT | Python 3.8+

intextus is a lightweight, PyTorch-free, production-grade Python library for encoding text into late-interaction (ColBERT) multi-vector embeddings.

Instead of a full deep learning framework, intextus relies on compiled C++/Rust backends. It produces ColBERT multi-vector embeddings and MaxSim scores in under 65 MB of RAM, with no PyTorch or Transformers dependency. It targets edge devices, serverless functions (AWS Lambda, Cloudflare Workers), and other resource-constrained environments.


Installation

Install the library directly via pip:

pip install intextus-embed

[!NOTE] intextus currently runs optimized CPU inference by default. GPU execution and other hardware acceleration are planned for a future release.


Quick Start

Here is how to load a model, extract multi-vector embeddings, and compute late-interaction cross-similarity scores entirely in NumPy:

from intextus import IntextusEncoder, compute_maxsim

# Initialize the encoder (defaults to intextus/mxbai-edge-colbert-v0-17m-onnx)
model = IntextusEncoder()

# Or initialize from a local directory containing 'model.onnx' and 'tokenizer.json'
# model = IntextusEncoder("./my_model_directory")

# Encode a query and a document; each result has shape (batch_size, sequence_length, dimension)
query_embeddings = model.encode_queries("What is ultra-low latency?")
doc_embeddings = model.encode_docs("ONNX runtime bypasses the PyTorch layer completely.")

# Compute the cross-similarity score via NumPy (using the first item in the batch)
score = compute_maxsim(query_embeddings[0], doc_embeddings[0])
print(f"Relevance Score (MaxSim): {score:.4f}")
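For readers unfamiliar with MaxSim, the scoring step can be sketched in a few lines of NumPy. This is a minimal illustration of the standard late-interaction formula, not necessarily how compute_maxsim is implemented internally. The function name maxsim and the toy vectors are hypothetical, and the rows are assumed to be L2-normalized so that a dot product equals cosine similarity.

```python
import numpy as np


def maxsim(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
    """Hypothetical reference MaxSim: for each query token, take the highest
    dot product against any document token, then sum over query tokens."""
    # Token-to-token similarity matrix, shape (num_query_tokens, num_doc_tokens)
    sim = query_vecs @ doc_vecs.T
    # Best-matching document token per query token, summed into one score
    return float(sim.max(axis=1).sum())


# Toy input: 2 query tokens and 2 document tokens in 2 dimensions (unit-length rows)
toy_query = np.array([[1.0, 0.0], [0.6, 0.8]])
toy_doc = np.array([[0.0, 1.0], [0.8, 0.6]])

toy_score = maxsim(toy_query, toy_doc)
print(f"Toy MaxSim score: {toy_score:.2f}")
```

Because the score is a sum over query tokens, it grows with query length. Scores are therefore comparable across documents for the same query, but not across different queries.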

Supported & Tested Models

intextus is designed for fast, edge-compatible ColBERT inference. The following models are officially supported and fully validated:

  • intextus/mxbai-edge-colbert-v0-17m-onnx (alias: mxbai-edge-colbert-v0-17m, default model): a single-file ONNX export of the ModernBERT-backed mxbai-edge-colbert-v0-17m (66 MB, 48-dimensional late-interaction embeddings).
  • intextus/mxbai-edge-colbert-v0-32m-onnx (alias: mxbai-edge-colbert-v0-32m): a larger, higher-capacity ONNX export of the ModernBERT-backed mxbai-edge-colbert-v0-32m (124 MB, 64-dimensional late-interaction embeddings).
  • intextus/lateon-onnx (alias: lateon): a high-capacity, ModernBERT-backed base model (580 MB, 128-dimensional late-interaction embeddings). Note: LateOn is case-sensitive, so load it with IntextusEncoder("lateon", do_lower_case=False).
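The embedding dimension drives how much space stored document vectors occupy, since a multi-vector model keeps one vector per token. The sketch below works through that arithmetic for the three dimensions listed above. It assumes float32 storage and a hypothetical 256-token document; neither figure comes from the library, and the helper name embedding_bytes is illustrative.

```python
BYTES_PER_FLOAT32 = 4  # assumption: embeddings are stored as float32

# Embedding dimensions taken from the model list above
MODEL_DIMS = {
    "mxbai-edge-colbert-v0-17m": 48,
    "mxbai-edge-colbert-v0-32m": 64,
    "lateon": 128,
}


def embedding_bytes(num_tokens: int, dim: int) -> int:
    """Bytes needed to store one document's token vectors (hypothetical helper)."""
    return num_tokens * dim * BYTES_PER_FLOAT32


NUM_TOKENS = 256  # hypothetical document length in tokens
footprint = {alias: embedding_bytes(NUM_TOKENS, dim) for alias, dim in MODEL_DIMS.items()}

for alias, size in footprint.items():
    print(f"{alias}: {size / 1024:.0f} KiB per {NUM_TOKENS}-token document")
```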

[!NOTE] Any ColBERT model exported via standard Hugging Face/PyLate workflows can be loaded locally by passing the path to a directory that contains its model.onnx and tokenizer.json.
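Before pointing the encoder at a local export, it can help to confirm that both expected files are present. The following is a small standard-library sketch. The helper missing_model_files is hypothetical and not part of intextus; only the two file names come from this README.

```python
from pathlib import Path

# File names the README says a local model directory must contain
REQUIRED_FILES = ("model.onnx", "tokenizer.json")


def missing_model_files(model_dir: str) -> list:
    """Return the required file names that are absent from model_dir
    (hypothetical helper, not an intextus API)."""
    root = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


missing = missing_model_files("./my_model_directory")
if missing:
    print("Missing files: " + ", ".join(missing))
else:
    print("Directory layout looks complete.")
```

An empty result only confirms the directory layout. It does not validate that the ONNX graph or tokenizer is compatible with the encoder.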


License

This project is licensed under the MIT License. See the LICENSE file for details.
