Production-grade multimodal embedding model with ONNX export and int8 quantization

These details have not been verified by PyPI

Project description

OmniVector-Embed

Production-grade multimodal embedding model — unified 4096-dimensional embeddings for text, code, image, video, and audio. Built on Mistral-7B with ONNX export and int8 quantization for deployment.

Replicates and extends NV-Embed-v2 with multimodal support and CPU-friendly inference.

Installation

pip install omnivector-embed

With vision support:

pip install omnivector-embed[vision]

Quick Start

from omnivector.model import OmniVectorModel

# Load a trained model
model = OmniVectorModel.from_pretrained("AuralithAI/omnivector-embed-v1")

# Encode text
embeddings = model.encode(["What is machine learning?", "ML is a subset of AI."])

# Encode with Matryoshka dimensionality
embeddings_512 = model.encode(["query"], output_dim=512)
embeddings_4096 = model.encode(["query"], output_dim=4096)
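Matryoshka embeddings are usually consumed by keeping a prefix of the full vector and rescaling it to unit length. The sketch below shows that convention on toy vectors using only the standard library. Whether `output_dim` renormalizes internally is not stated here, so treat `truncate_and_normalize` and `cosine` as hypothetical helpers rather than part of the package.

```python
import math

def truncate_and_normalize(vec, dim):
    """Keep the first `dim` components and rescale them to unit L2 norm."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in head]

def cosine(a, b):
    """Cosine similarity of two unit vectors, which is just their dot product."""
    return sum(x * y for x, y in zip(a, b))

# Toy 8-dimensional stand-ins for real 4096-dimensional embeddings.
query = [3.0, 4.0, 0.0, 0.0, 1.0, 2.0, 2.0, 0.0]
doc = [4.0, 3.0, 0.0, 0.0, 2.0, 1.0, 2.0, 0.0]

# Compare the two vectors using only their first two dimensions.
q2 = truncate_and_normalize(query, 2)
d2 = truncate_and_normalize(doc, 2)
score = cosine(q2, d2)
print(round(score, 2))  # 0.96
```

Shorter prefixes trade some retrieval quality for smaller indexes and faster similarity search; the same stored 4096-dimensional vector can serve every supported size.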

Key Features

Feature	Description
Multimodal	Text, code, image, video, and audio in one embedding space
Matryoshka	Flexible output dimensions: 512, 1024, 2048, 4096
ONNX Export	Opset 17 with dynamic int8 quantization
CPU Inference	Full ONNX Runtime support — no GPU required
LoRA Training	Fine-tune about 0.1% of the parameters with LoRA (rank 16)
3-Stage Pipeline	Retrieval → Generalist → Multimodal training

Architecture

Input → Mistral-7B (bidirectional, eager attention, LoRA)
      → Latent Attention Pooling (512 latents × 4096, 8 heads)
      → Matryoshka dimensions [512, 1024, 2048, 4096]
      → L2 normalize
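The pooling step in the diagram can be pictured as cross-attention in which a fixed set of latent vectors attends over the token states, so the output size does not depend on sequence length. The sketch below is a single-head toy version with no learned projections, written with the standard library only. It is an illustration under those assumptions, not the package's implementation, and `latent_attention_pool` is a hypothetical name.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def latent_attention_pool(tokens, latents):
    """Pool variable-length token states into one unit-norm vector.

    Each latent acts as a query over the tokens (keys and values),
    the per-latent outputs are averaged, and the result is L2-normalized.
    """
    dim = len(latents[0])
    attended = []
    for q in latents:
        weights = softmax([dot(q, t) / math.sqrt(dim) for t in tokens])
        attended.append(
            [sum(w * t[j] for w, t in zip(weights, tokens)) for j in range(dim)]
        )
    pooled = [sum(row[j] for row in attended) / len(attended) for j in range(dim)]
    norm = math.sqrt(dot(pooled, pooled))
    return [x / norm for x in pooled]

# Three toy token states and two toy latents, all 4-dimensional.
tokens = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 2.0, 0.0]]
latents = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
embedding = latent_attention_pool(tokens, latents)
```

In the real model the latent count is 512, the width is 4096, and attention is split across 8 heads; the toy sizes here only keep the arithmetic readable.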

Backbone: Mistral-7B-v0.1 with bidirectional attention
Pooling: Cross-attention with learned latent queries
Vision: SigLIP-SO400M (1152 → 4096 projection)
Audio: Whisper-tiny (384 → 4096 MLP projection)
Loss: InfoNCE + Matryoshka Representation Learning + cross-modal contrastive
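To make the loss line concrete, here is a minimal sketch of InfoNCE with in-batch negatives, averaged over several Matryoshka prefix lengths. The temperature of 0.05, the equal weighting across dimensions, and the function names are assumptions for illustration; the cross-modal contrastive term is omitted, and the actual training code may differ.

```python
import math

def normalize_prefix(vec, dim):
    """Take the first `dim` components and rescale to unit L2 norm."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

def info_nce(queries, docs, temperature=0.05):
    """Mean InfoNCE loss; docs[i] is the positive for queries[i],
    and every other document in the batch serves as a negative."""
    total = 0.0
    for i, q in enumerate(queries):
        logits = [sum(a * b for a, b in zip(q, d)) / temperature for d in docs]
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_z - logits[i]
    return total / len(queries)

def matryoshka_info_nce(queries, docs, dims, temperature=0.05):
    """Average the InfoNCE loss over each prefix length in `dims`."""
    losses = []
    for dim in dims:
        q = [normalize_prefix(v, dim) for v in queries]
        d = [normalize_prefix(v, dim) for v in docs]
        losses.append(info_nce(q, d, temperature))
    return sum(losses) / len(losses)

# Toy batch: each query matches the document at the same index.
queries = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
docs = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]

aligned = matryoshka_info_nce(queries, docs, dims=[2, 4])
misaligned = matryoshka_info_nce(queries, docs[::-1], dims=[2, 4])
```

Applying the loss at every prefix length is what lets a truncated embedding remain useful on its own: the first 512 dimensions are trained to separate positives from negatives without help from the rest.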

ONNX Export

from omnivector.export import OnnxExporter, OnnxQuantizer

# Export the trained model to ONNX (opset 17)
exporter = OnnxExporter(model_path="path/to/model", opset_version=17)
exporter.export("model.onnx")

# Quantize the exported graph to int8
OnnxQuantizer.quantize_dynamic("model.onnx", "model_int8.onnx")
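For readers new to int8 quantization, the sketch below shows the basic arithmetic on a handful of weights: pick a scale from the largest magnitude, round each weight to an 8-bit integer, and multiply back by the scale at inference time. It illustrates symmetric per-tensor quantization in general and is not a description of what `OnnxQuantizer` or ONNX Runtime does internally; both function names are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate float weights from int8 values."""
    return [q * scale for q in quantized]

weights = [0.8, -1.27, 0.003, 0.5]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

# Rounding error per weight is bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Storing one byte per weight instead of four is where the size reduction comes from; the cost is the bounded rounding error computed above, which is why quantized models should be re-evaluated before deployment.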

Evaluation

Built-in MTEB evaluation:

python scripts/evaluate.py --model-path path/to/model --tasks retrieval

Training

3-stage training pipeline with DeepSpeed ZeRO-2:

# Stage 1: Retrieval (text pairs with hard negatives)
python scripts/training.py --config configs/stage1_retrieval.yaml

# Stage 2: Generalist (55M+ pairs)
python scripts/training.py --config configs/stage2_generalist.yaml

# Stage 3: Multimodal (image/video/audio + text)
python scripts/train_multimodal.py --config configs/stage3_multimodal.yaml

Stack

Component	Version
Python	≥ 3.9
PyTorch	≥ 2.2.0
Transformers	4.44.2
PEFT	0.12.0
ONNX Runtime	≥ 1.18.0
DeepSpeed	≥ 0.14.0

License

Apache 2.0 — see LICENSE.

Project details

This version

0.1.0

Mar 20, 2026

Download files

Source Distribution

omnivector_embed-0.1.0.tar.gz (661.7 kB)

Uploaded Mar 20, 2026 Source

Built Distribution

omnivector_embed-0.1.0-py3-none-any.whl (49.0 kB)

Uploaded Mar 20, 2026 Python 3

File details

Details for the file omnivector_embed-0.1.0.tar.gz.

File metadata

Download URL: omnivector_embed-0.1.0.tar.gz
Upload date: Mar 20, 2026
Size: 661.7 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for omnivector_embed-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`02769b456c6236f8ae4f15d201748e2f59e53f965db66d94d1ee12d4d8fd99cc`
MD5	`0ed5f2de4ba2fa0462f454cdd70dd6b9`
BLAKE2b-256	`3883818f15066f96ae445ffcebdd2c957c1ed25bc407433eaef0b56a93b59a8f`
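A downloaded archive can be checked against the SHA256 digest in the table with a few lines of standard-library Python. This is a minimal sketch: `sha256_of` and `matches` are hypothetical helper names, and the file is read in chunks so large archives do not need to fit in memory.

```python
import hashlib

# SHA256 digest published above for omnivector_embed-0.1.0.tar.gz.
EXPECTED_SHA256 = "02769b456c6236f8ae4f15d201748e2f59e53f965db66d94d1ee12d4d8fd99cc"

def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches(path, expected_hex):
    """True if the file's SHA256 digest equals `expected_hex` (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

After downloading the source distribution, `matches("omnivector_embed-0.1.0.tar.gz", EXPECTED_SHA256)` should return `True`; a `False` result means the file differs from the one that was published.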

Provenance

The following attestation bundles were made for omnivector_embed-0.1.0.tar.gz:

Publisher: release.yml on AuralithAI/OmniVector-Embed

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: omnivector_embed-0.1.0.tar.gz
- Subject digest: 02769b456c6236f8ae4f15d201748e2f59e53f965db66d94d1ee12d4d8fd99cc
- Sigstore transparency entry: 1150016626
- Sigstore integration time: Mar 20, 2026
Source repository:
- Permalink: AuralithAI/OmniVector-Embed@5621935eb3c1f43da123c1fd7e8df765a90f046f
- Branch / Tag: refs/heads/main
- Owner: https://github.com/AuralithAI
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@5621935eb3c1f43da123c1fd7e8df765a90f046f
- Trigger Event: push

File details

Details for the file omnivector_embed-0.1.0-py3-none-any.whl.

File metadata

Download URL: omnivector_embed-0.1.0-py3-none-any.whl
Upload date: Mar 20, 2026
Size: 49.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for omnivector_embed-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`caffba33a36a04176a6206849cf2f345d64456c5f14507aabdce5dce82862a57`
MD5	`4cb3c91f09fe16951a146c226f0c17bb`
BLAKE2b-256	`e964e2fbaf204ee4cb349df63fd592e4b31860d4902aecc5b426ddb2100a6ee6`

Provenance

The following attestation bundles were made for omnivector_embed-0.1.0-py3-none-any.whl:

Publisher: release.yml on AuralithAI/OmniVector-Embed

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: omnivector_embed-0.1.0-py3-none-any.whl
- Subject digest: caffba33a36a04176a6206849cf2f345d64456c5f14507aabdce5dce82862a57
- Sigstore transparency entry: 1150016661
- Sigstore integration time: Mar 20, 2026
Source repository:
- Permalink: AuralithAI/OmniVector-Embed@5621935eb3c1f43da123c1fd7e8df765a90f046f
- Branch / Tag: refs/heads/main
- Owner: https://github.com/AuralithAI
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@5621935eb3c1f43da123c1fd7e8df765a90f046f
- Trigger Event: push
