A lightweight, zero-PyTorch ONNX encoder for generic ColBERT models.

intextus

intextus is a lightweight, PyTorch-free Python library for production use that encodes text into late-interaction ColBERT multi-vector embeddings.

By replacing large deep learning frameworks with compiled C++ and Rust backends (ONNX Runtime and Hugging Face's Rust tokenizer), intextus produces full ColBERT multi-vector embeddings using under 65 MB of RAM, with no PyTorch or Transformers dependency. It is optimized for edge devices, serverless functions (AWS Lambda, Cloudflare Workers), and resource-constrained environments.

Key Features

No PyTorch or Transformers: Fully decoupled from the usual heavyweight deep learning stack, so a plain pip install completes in seconds.
Micro Memory Footprint: Runs the multi-vector model graph inside ONNX Runtime, using less than 65 MB of RAM during inference.
Fast Rust Tokenization: Calls Hugging Face's Rust tokenization backend directly.
Dynamic Punctuation Skiplist: Parses tokenizer.json once at initialization and builds a precomputed mask that discards punctuation token vectors, matching how ColBERT drops punctuation from document embeddings when saving an index.
Standardized Late Interaction: Exposes native NumPy-based MaxSim calculations.
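
The punctuation skiplist described above can be pictured with a short sketch. This is an illustration only, not intextus's actual code: it assumes tokenizer.json stores its vocabulary as a token-to-id mapping under model.vocab (the layout used by WordPiece and BPE tokenizers), and the helper names build_punctuation_mask and drop_punctuation are hypothetical.

```python
import json
import string

import numpy as np

PUNCTUATION = set(string.punctuation)


def build_punctuation_mask(tokenizer_json: str) -> np.ndarray:
    """Return a boolean array over token ids: True where the token is a punctuation mark."""
    # Assumption: the vocabulary is a dict of token string -> token id.
    vocab = json.loads(tokenizer_json)["model"]["vocab"]
    mask = np.zeros(max(vocab.values()) + 1, dtype=bool)
    for token, token_id in vocab.items():
        if token in PUNCTUATION:
            mask[token_id] = True
    return mask


def drop_punctuation(token_ids: np.ndarray, embeddings: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Keep only the embedding rows whose token id is not flagged as punctuation."""
    keep = ~mask[token_ids]
    return embeddings[keep]


# Toy vocabulary standing in for the contents of a real tokenizer.json file.
toy_tokenizer = json.dumps(
    {"model": {"vocab": {"[PAD]": 0, "hello": 1, ",": 2, "world": 3, "!": 4}}}
)
mask = build_punctuation_mask(toy_tokenizer)

token_ids = np.array([1, 2, 3, 4])  # "hello", ",", "world", "!"
embeddings = np.arange(8, dtype=np.float32).reshape(4, 2)  # one 2-d vector per token
kept = drop_punctuation(token_ids, embeddings, mask)
print(kept.shape)  # (2, 2)
```

Because the mask is built once and indexed by token id, filtering a document costs a single boolean lookup per token. A real tokenizer may also contain prefixed variants of punctuation tokens, which this sketch does not handle.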

Installation

Install the library directly via pip:

pip install intextus-embed

[!NOTE] intextus currently defaults to highly optimized CPU inference. Full hardware acceleration and GPU execution support are planned for a future release.

Quick Start

Here is how to load a model, extract multi-vector embeddings, and compute late-interaction cross-similarity scores entirely in NumPy:

from intextus import IntextusEncoder, compute_maxsim

# Initialize the encoder (defaults to intextus/mxbai-edge-colbert-v0-17m-onnx)
model = IntextusEncoder()

# Or initialize from a local directory containing 'model.onnx' and 'tokenizer.json'
# model = IntextusEncoder("./my_model_directory")

# Extract query and document embeddings, each shaped (batch_size, sequence_length, dimension)
query_embeddings = model.encode_queries("What is ultra-low latency?")
doc_embeddings = model.encode_docs("ONNX runtime bypasses the PyTorch layer completely.")

# Compute the cross-similarity score via NumPy (using the first item in the batch)
score = compute_maxsim(query_embeddings[0], doc_embeddings[0])
print(f"Relevance Score (MaxSim): {score:.4f}")
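
For readers unfamiliar with late interaction, the following sketch shows the standard MaxSim operator in plain NumPy: each query token vector is matched to its most similar document token vector, and those maxima are summed. The function name maxsim is hypothetical, and whether compute_maxsim applies any extra normalization or masking is not confirmed by this page, so treat this as the textbook definition rather than the library's implementation.

```python
import numpy as np


def maxsim(query_vectors: np.ndarray, doc_vectors: np.ndarray) -> float:
    """Sum, over query tokens, of the highest dot product with any document token."""
    # similarity[i, j] is the dot product of query token i and document token j.
    similarity = query_vectors @ doc_vectors.T
    return float(similarity.max(axis=1).sum())


# Two query token vectors and three document token vectors, all 2-dimensional.
query = np.array([[1.0, 0.0], [0.0, 1.0]])
doc = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, -1.0]])

print(maxsim(query, doc))  # 1.5
```

Here the first query token matches the first document token exactly (1.0), and the second query token's best match is the middle document token (0.5), giving 1.5.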

Supported & Tested Models

intextus is designed for ultra-fast, edge-compatible ColBERT execution. The primary officially supported and fully validated models are:

intextus/mxbai-edge-colbert-v0-17m-onnx (Alias: mxbai-edge-colbert-v0-17m, default model): A highly optimized, single-file ONNX export of the ModernBERT-backed mxbai-edge-colbert-v0-17m (66 MB, 48-dimensional late-interaction embeddings).
intextus/mxbai-edge-colbert-v0-32m-onnx (Alias: mxbai-edge-colbert-v0-32m): A larger, higher-capacity ONNX export of the ModernBERT-backed mxbai-edge-colbert-v0-32m (124 MB, 64-dimensional late-interaction embeddings).
intextus/lateon-onnx (Alias: lateon): A high-capacity, base ModernBERT-backed model (580 MB, 128-dimensional late-interaction embeddings). Note: LateOn is case-sensitive, so load it with IntextusEncoder("lateon", do_lower_case=False).

[!NOTE] Any ColBERT model exported via standard Hugging Face/PyLate workflows can be loaded locally by providing the path to its model.onnx and tokenizer.json.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Project details

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Release history

- 0.1.5: Jun 21, 2026
- 0.1.4: Jun 20, 2026
- 0.1.3: Jun 17, 2026
- 0.1.2: Jun 17, 2026 (this version)
- 0.1.1: Jun 17, 2026
- 0.1.0: Jun 17, 2026

Download files

Source Distribution

intextus_embed-0.1.2.tar.gz (12.2 kB)

Uploaded Jun 17, 2026 Source

Built Distribution

intextus_embed-0.1.2-py3-none-any.whl (9.9 kB)

Uploaded Jun 17, 2026 Python 3

File details

Details for the file intextus_embed-0.1.2.tar.gz.

File metadata

Download URL: intextus_embed-0.1.2.tar.gz
Upload date: Jun 17, 2026
Size: 12.2 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for intextus_embed-0.1.2.tar.gz
Algorithm	Hash digest
SHA256	`b834bcf8e6dd7eb4c514da23744fbddf94b3eb0c0451f607d3e04d6579e3a41e`
MD5	`59f96a93f5284f3e3549e550fb33ba11`
BLAKE2b-256	`000ae2c9ec493faf10d6854ce2691b3577fd38e14b8640fc2559018503be4de2`
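
A downloaded archive can be checked against the published SHA256 digest with Python's standard hashlib module. The sketch below is one way to do that; the helper names and the local file path are assumptions for illustration, and only the digest value is taken from the table above.

```python
import hashlib

# SHA256 digest published above for intextus_embed-0.1.2.tar.gz.
EXPECTED_SHA256 = "b834bcf8e6dd7eb4c514da23744fbddf94b3eb0c0451f607d3e04d6579e3a41e"


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so large files are never fully loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """Return True when the file's SHA256 digest equals the expected hex string."""
    return sha256_of_file(path) == expected
```

Calling matches_published_hash("intextus_embed-0.1.2.tar.gz") on a local copy of the source distribution (the path is hypothetical) should return True only if the file is byte-for-byte identical to the one uploaded.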

Provenance

The following attestation bundles were made for intextus_embed-0.1.2.tar.gz:

Publisher: publish.yml on Intextus/intextus-embed

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: intextus_embed-0.1.2.tar.gz
- Subject digest: b834bcf8e6dd7eb4c514da23744fbddf94b3eb0c0451f607d3e04d6579e3a41e
- Sigstore transparency entry: 1847919088
- Sigstore integration time: Jun 17, 2026
Source repository:
- Permalink: Intextus/intextus-embed@a6e3a8c86ef87b1c4bdc9afc5ebf7168276b883b
- Branch / Tag: refs/heads/main
- Owner: https://github.com/Intextus
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@a6e3a8c86ef87b1c4bdc9afc5ebf7168276b883b
- Trigger Event: push

File details

Details for the file intextus_embed-0.1.2-py3-none-any.whl.

File metadata

Download URL: intextus_embed-0.1.2-py3-none-any.whl
Upload date: Jun 17, 2026
Size: 9.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for intextus_embed-0.1.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`566934ac13b5c6ef5cedc914dec163ccd941a4dcd07f34d87cd386e8be5d1d94`
MD5	`c05aa6e17c4075c4255063a6879de1a2`
BLAKE2b-256	`6589d5bc5aa13b948674875f6b75e53ab1b45da0cfdf25360d597e4771c5cb7d`

Provenance

The following attestation bundles were made for intextus_embed-0.1.2-py3-none-any.whl:

Publisher: publish.yml on Intextus/intextus-embed

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: intextus_embed-0.1.2-py3-none-any.whl
- Subject digest: 566934ac13b5c6ef5cedc914dec163ccd941a4dcd07f34d87cd386e8be5d1d94
- Sigstore transparency entry: 1847919187
- Sigstore integration time: Jun 17, 2026
Source repository:
- Permalink: Intextus/intextus-embed@a6e3a8c86ef87b1c4bdc9afc5ebf7168276b883b
- Branch / Tag: refs/heads/main
- Owner: https://github.com/Intextus
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@a6e3a8c86ef87b1c4bdc9afc5ebf7168276b883b
- Trigger Event: push
