Modern ColBERT for Late Interaction with native multi-vector support

Lateness - Modern ColBERT for Late Interaction

A Python package for Modern ColBERT (late interaction) embeddings with native multi-vector support, built for efficient retrieval with the Qdrant vector database.

Features

Dual Backend Architecture: ONNX for fast retrieval, PyTorch for GPU indexing
Native Multi-Vector Support: Optimized for Qdrant's MaxSim comparator
Smart Installation: Lightweight retrieval or heavy indexing based on your needs
Production Ready: Separate deployment targets for different workloads

Quick Start

Installation

# Lightweight retrieval (ONNX + Qdrant)
pip install lateness

# Heavy indexing (PyTorch + Transformers + ONNX + Qdrant)
pip install "lateness[index]"

Basic Usage

Default Installation (ONNX Backend):

# pip install lateness
from lateness import ModernColBERT
colbert = ModernColBERT("prithivida/modern_colbert_base_en_v1")
# Output:
# 🚀 Using ONNX backend (default, for GPU accelerated indexing, install lateness[index] and set LATENESS_USE_TORCH=true)
# 🔄 Downloading model: prithivida/modern_colbert_base_en_v1
# ✅ ONNX ColBERT loaded with providers: ['CPUExecutionProvider']
# Query max length: 256, Document max length: 300

Index Installation (PyTorch Backend):

# pip install lateness[index]
import os
os.environ['LATENESS_USE_TORCH'] = 'true'
from lateness import ModernColBERT

colbert = ModernColBERT("prithivida/modern_colbert_base_en_v1")
# Output:
# 🚀 Using PyTorch backend (LATENESS_USE_TORCH=true)
# 🔄 Downloading model: prithivida/modern_colbert_base_en_v1
# Loading model from: /root/.cache/huggingface/hub/models--prithivida--modern_colbert_base_en_v1/...
# ✅ PyTorch ColBERT loaded on cuda
# Query max length: 256, Document max length: 300

Complete Example with Qdrant:

For a complete working example with Qdrant integration, environment setup, and testing instructions, see the examples/qdrant folder.

The examples include:

Environment setup and testing
Local Qdrant server management
Complete indexing and retrieval workflows
Both ONNX and PyTorch backend examples

Architecture

Two Deployment Models

Retrieval Service (Lightweight)

pip install lateness

ONNX backend (fast CPU inference)
Qdrant integration
~50MB total dependencies
Perfect for user-facing search APIs

Indexing Service (Heavy)

pip install "lateness[index]"

PyTorch backend (GPU acceleration)
Full Transformers support
~2GB+ dependencies
Perfect for batch document processing

Backend Selection

The package uses environment variables for backend control:

Default behavior → ONNX backend (CPU retrieval)
LATENESS_USE_TORCH=true → PyTorch backend (GPU indexing)

Note: The PyTorch backend requires the index extra (pip install "lateness[index]"), which installs the PyTorch dependencies.
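The selection rule above can be sketched as a small function. This is an illustration only, not the package's actual code: the name choose_backend, its parameters, and the case-insensitive matching of "true" are assumptions made for this example. The only facts taken from this README are the variable name LATENESS_USE_TORCH, the ONNX default, and the need for the index extra when PyTorch is requested.

```python
import importlib.util
import os


def choose_backend(env=None, torch_available=None):
    """Return "onnx" or "torch" following the rule described above.

    Hypothetical helper for illustration; lateness may implement this differently.
    """
    env = os.environ if env is None else env
    # Assumption: only the value "true" (any letter case) opts in to PyTorch.
    wants_torch = env.get("LATENESS_USE_TORCH", "").strip().lower() == "true"
    if not wants_torch:
        return "onnx"  # default: lightweight CPU retrieval

    if torch_available is None:
        # Probe for PyTorch without importing it.
        torch_available = importlib.util.find_spec("torch") is not None
    if not torch_available:
        raise RuntimeError(
            "LATENESS_USE_TORCH=true but PyTorch is not installed; "
            'run: pip install "lateness[index]"'
        )
    return "torch"


print(choose_backend({}))  # no variable set, so the ONNX default applies
```

The Basic Usage example sets the variable before importing lateness, so it is safest to do the same in your own code.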

API Reference

ModernColBERT

from lateness import ModernColBERT

# Initialize
colbert = ModernColBERT("prithivida/modern_colbert_base_en_v1")

# Encode queries
query_embeddings = colbert.encode_queries(["What is AI?"])

# Encode documents
doc_embeddings = colbert.encode_documents(["AI is artificial intelligence"])

# Compute similarity
scores = ModernColBERT.compute_similarity(query_embeddings, doc_embeddings)
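For readers new to late interaction, the scoring idea behind MaxSim can be shown in a few lines of plain Python. This is a minimal sketch of the general technique, not the implementation behind compute_similarity: the function names and the toy two-dimensional vectors are made up for illustration. Each query token is matched to its best-scoring document token, and those maxima are summed.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_tokens, doc_tokens):
    """Late-interaction score: for every query token, take the highest
    dot product against any document token, then sum those maxima."""
    if not doc_tokens:
        raise ValueError("doc_tokens must contain at least one vector")
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


# Toy token embeddings (one vector per token); real ones come from the model.
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
doc_b = [[0.6, 0.8]]

score_a = maxsim_score(query, doc_a)  # each query token finds an exact match
score_b = maxsim_score(query, doc_b)  # both query tokens share one partial match
print(score_a > score_b)
```

Because every query token is scored independently, a document is rewarded for covering each part of the query rather than for matching one pooled vector.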

Qdrant Integration

from lateness import QdrantIndexer, QdrantRetriever
from qdrant_client import QdrantClient

client = QdrantClient("localhost", port=6333)

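# Indexing
# Assumption: documents is a list of document strings; see examples/qdrant
# for the exact input format expected by index_documents_simple.
documents = ["AI is artificial intelligence"]
indexer = QdrantIndexer(client, "documents")
indexer.create_collection()
indexer.index_documents_simple(documents)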

# Retrieval
retriever = QdrantRetriever(client, "documents")
results = retriever.search_simple("query", top_k=10)
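Native multi-vector support in Qdrant is configured per collection. What QdrantIndexer.create_collection() sends is not documented in this README, so the sketch below only builds the request body that Qdrant's create-collection REST endpoint (PUT /collections/{collection_name}) accepts for a MaxSim multi-vector field. The helper name and the dimension of 128 are placeholders; check the model card for the real embedding size.

```python
import json


def multivector_collection_body(dim, distance="Cosine"):
    """Build a Qdrant create-collection body for a MaxSim multi-vector field.

    Hypothetical helper for illustration; it is not part of lateness.
    """
    if dim <= 0:
        raise ValueError("dim must be a positive integer")
    return {
        "vectors": {
            "size": dim,           # per-token embedding dimension
            "distance": distance,  # similarity used between token vectors
            "multivector_config": {"comparator": "max_sim"},
        }
    }


# 128 is a placeholder dimension, not a value confirmed for this model.
body = multivector_collection_body(dim=128)
print(json.dumps(body, indent=2))
```

With this configuration each point stores a list of token vectors, and Qdrant applies MaxSim between the query's and the point's vectors at search time.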

License

Apache License 2.0

Contributing

Contributions welcome! Please check our contributing guidelines.

Project details

This version

0.1.0

Jul 17, 2025

Download files

Source Distribution

lateness-0.1.0.tar.gz (17.5 kB)

Uploaded Jul 17, 2025 Source

Built Distribution

lateness-0.1.0-py3-none-any.whl (19.7 kB)

Uploaded Jul 17, 2025 Python 3

File details

Details for the file lateness-0.1.0.tar.gz.

File metadata

Download URL: lateness-0.1.0.tar.gz
Upload date: Jul 17, 2025
Size: 17.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.11.0

File hashes

Hashes for lateness-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`fe807e2e8cd6757f4acf78eb15dcfbdef21e6aa917bf5964143ae6703b5e18e7`
MD5	`f62faf8d16cd0194bcaeedbd55c87121`
BLAKE2b-256	`bdfd352f8bd8c48526cc37f7dc8614ec899cdd1624bf40b818856d5037767622`

File details

Details for the file lateness-0.1.0-py3-none-any.whl.

File metadata

Download URL: lateness-0.1.0-py3-none-any.whl
Upload date: Jul 17, 2025
Size: 19.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.11.0

File hashes

Hashes for lateness-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a36d2ac44be6b48ba06a2e9c8bd643318f947ba56cee24347cdcada034b6a580`
MD5	`c015281960e6d63d9169054611c47448`
BLAKE2b-256	`0e1402351c6bdf42d9880e999ab5234bbd10f6b99b99b8674cd68b8e40b4b26f`
