libembedding

Fast ONNX-based text, image, and sparse embeddings for Python. 5-8x faster than fastembed with 3.5x less memory.

Built on a C/C++ backend using ONNX Runtime, exposed to Python via zero-overhead cffi bindings. Supports 44 text embedding models, 5 image models, 2 sparse models, and 4 rerankers with automatic model downloading from HuggingFace Hub.

Installation

pip install libembedding

Requirements: ONNX Runtime must be installed on your system.

# macOS
brew install onnxruntime

# Ubuntu/Debian
apt install libonnxruntime-dev

# Or set ONNXRUNTIME_ROOT to your installation path

Quick Start

Text Embeddings

from libembedding import TextEmbedding

model = TextEmbedding("BAAI/bge-small-en-v1.5")
embeddings = model.embed(["Hello world", "How are you?"])

print(embeddings.shape)  # (2, 384)
print(embeddings.dtype)  # float32
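Because the returned embeddings are L2-normalized (see the API reference below), cosine similarity between two texts reduces to a plain dot product. A minimal sketch with stand-in vectors rather than real model output:

```python
import numpy as np

# Stand-in vectors playing the role of two rows from model.embed(...)
a = np.array([3.0, 4.0], dtype=np.float32)
b = np.array([4.0, 3.0], dtype=np.float32)
a /= np.linalg.norm(a)  # L2-normalize, matching the library's output
b /= np.linalg.norm(b)

# For unit-norm vectors, cosine similarity is just the dot product
cos = float(a @ b)
print(round(cos, 2))  # 0.96
```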

Sparse Embeddings

from libembedding import SparseTextEmbedding

model = SparseTextEmbedding()
results = model.embed(["machine learning algorithms"])

for r in results:
    print(r.indices.shape, r.values.shape)
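Each sparse result pairs token ids with weights as parallel arrays, and downstream relevance scoring is a sparse dot product over the shared ids. A sketch using made-up indices and values (the `.indices`/`.values` field names come from the API reference below; the data here is invented):

```python
import numpy as np

# Stand-in sparse vectors: parallel arrays of token ids and weights
q_idx, q_val = np.array([7, 42, 901]), np.array([0.5, 1.2, 0.3])
d_idx, d_val = np.array([7, 100, 901]), np.array([0.4, 0.8, 0.6])

# Sparse dot product over the token ids the two vectors share
shared, qi, di = np.intersect1d(q_idx, d_idx, return_indices=True)
score = float(q_val[qi] @ d_val[di])
print(round(score, 2))  # 0.5*0.4 + 0.3*0.6 = 0.38
```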

Image Embeddings

from libembedding import ImageEmbedding

model = ImageEmbedding()
embeddings = model.embed_files(["photo.jpg", "diagram.png"])

Reranking

from libembedding import Reranker

reranker = Reranker("BAAI/bge-reranker-base")
results = reranker.rerank(
    "What is deep learning?",
    [
        "Deep learning uses neural networks with many layers",
        "The weather is sunny today",
        "Neural networks are inspired by biological brains",
    ],
)
for r in results:
    print(f"doc[{r.index}] score={r.score:.4f}")
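Each result carries the index of the document in the original list, so recovering the ranked documents is a simple lookup. A toy illustration with a stand-in result type (the real `RerankResult` class is returned by `rerank()`; the scores here are invented):

```python
from collections import namedtuple

# Stand-in mimicking libembedding's RerankResult (.index, .score)
RerankResult = namedtuple("RerankResult", ["index", "score"])

docs = ["doc A", "doc B", "doc C"]
results = [RerankResult(2, 0.91), RerankResult(0, 0.87), RerankResult(1, 0.02)]

# Results come back sorted by score descending; map them back to the documents
ranked = [docs[r.index] for r in results]
print(ranked)  # ['doc C', 'doc A', 'doc B']
```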

Model Discovery

import libembedding

for m in libembedding.list_text_models():
    print(f"{m.model_name:45} dim={m.dim:<5} {m.pooling}")

API Reference

TextEmbedding

TextEmbedding(
    model_name="BAAI/bge-small-en-v1.5",  # HuggingFace model name or repo code
    provider="cpu",                         # "cpu", "cuda", "coreml", "directml", "tensorrt"
    device_id=0,
    cache_dir=None,                         # None = ~/.cache/libembedding
    max_length=0,                           # 0 = model default
    num_threads=0,                          # 0 = auto
    show_download_progress=True,
)
| Method | Returns | Description |
| --- | --- | --- |
| `embed(texts, batch_size=0)` | `np.ndarray (n, dim)` | L2-normalized dense embeddings |
| `dim` | `int` | Embedding dimension |
| `close()` | `None` | Release resources |

SparseTextEmbedding

SparseTextEmbedding(model_name="prithvida/SPLADE_PP_en_v1", ...)
| Method | Returns | Description |
| --- | --- | --- |
| `embed(texts, batch_size=0)` | `list[SparseEmbedding]` | Sparse vectors with `.indices` and `.values` |

ImageEmbedding

ImageEmbedding(model_name="Qdrant/clip-ViT-B-32-vision", ...)
| Method | Returns | Description |
| --- | --- | --- |
| `embed_files(paths, batch_size=0)` | `np.ndarray (n, dim)` | Embed from file paths |
| `embed_bytes(images, batch_size=0)` | `np.ndarray (n, dim)` | Embed from raw image bytes |

Reranker

Reranker(model_name="BAAI/bge-reranker-base", ...)
| Method | Returns | Description |
| --- | --- | --- |
| `rerank(query, documents, batch_size=0)` | `list[RerankResult]` | Results sorted by score, descending |

All classes support context managers (with TextEmbedding(...) as model:).
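The context-manager form guarantees that `close()` runs even if embedding raises. A toy stand-in class showing the equivalence (not the real implementation, just the close-on-exit pattern):

```python
class FakeModel:
    """Toy stand-in mimicking close-on-exit behavior."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.close()
        return False  # don't swallow exceptions

with FakeModel() as m:
    pass  # embed() calls would go here
print(m.closed)  # True
```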

Available Models

44 text models including BGE, MiniLM, Nomic, E5, CLIP, Jina, GTE, Snowflake, ModernBERT (with quantized variants).

5 image models including CLIP ViT-B/32, ResNet-50, Unicom, Nomic Vision.

2 sparse models: SPLADE++, BGE-M3.

4 reranker models: BGE Reranker, Jina Reranker.

Benchmarks

Measured on Apple M-series with all-MiniLM-L6-v2 (384-dim). Median of 10 runs.

| Metric | libembedding | fastembed | Speedup |
| --- | --- | --- | --- |
| Single text latency (ms) | 4.4 | 38.0 | 8.6x |
| Batch 8 (texts/sec) | 641 | 92 | 7.0x |
| Batch 32 (texts/sec) | 581 | 89 | 6.5x |
| Peak RSS (MB) | 567 | 1,981 | 3.5x less |

Configuration

| Environment variable | Purpose |
| --- | --- |
| `LIBEMBEDDING_CACHE_DIR` | Override the model cache directory |
| `FASTEMBED_CACHE_DIR` | Alternative cache directory (fastembed compatibility) |
| `HF_ENDPOINT` | Custom HuggingFace Hub endpoint |
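A sketch of how cache-directory resolution might work given the variables above, assuming `LIBEMBEDDING_CACHE_DIR` takes precedence over the fastembed-compatibility variable (the precedence order is an assumption, not documented here), with the documented default of `~/.cache/libembedding`:

```python
import os
from pathlib import Path

def resolve_cache_dir() -> Path:
    # Assumed precedence: LIBEMBEDDING_CACHE_DIR, then FASTEMBED_CACHE_DIR,
    # then the documented default ~/.cache/libembedding
    for var in ("LIBEMBEDDING_CACHE_DIR", "FASTEMBED_CACHE_DIR"):
        if os.environ.get(var):
            return Path(os.environ[var])
    return Path.home() / ".cache" / "libembedding"

os.environ["LIBEMBEDDING_CACHE_DIR"] = "/tmp/emb-cache"
print(resolve_cache_dir())  # /tmp/emb-cache
```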

Building & Publishing

Prerequisites

pip install build twine

Build the shared library + wheel

cd python/

# Step 1: Build the C/C++ shared library and copy it into the package
./setup.sh --build-only

# Step 2: Build sdist and wheel
python -m build

This produces files in dist/:

dist/
  libembedding-0.1.0.tar.gz              # source distribution
  libembedding-0.1.0-py3-none-any.whl    # wheel (includes bundled .dylib/.so)

Upload to PyPI

# Upload to TestPyPI first to verify
twine upload --repository testpypi dist/*

# Install from TestPyPI to verify
pip install --index-url https://test.pypi.org/simple/ libembedding

# Upload to production PyPI
twine upload dist/*

One-liner (build + upload)

./setup.sh --build-only && python -m build && twine upload dist/*

Platform-specific wheels

The default wheel is py3-none-any and bundles the shared library for the build platform. To build platform-tagged wheels for distribution:

# macOS (current arch)
./setup.sh --build-only
python -m build

# For other platforms, build on that platform or use cibuildwheel:
pip install cibuildwheel
cibuildwheel --platform linux   # builds manylinux wheels
cibuildwheel --platform macos   # builds macOS wheels

License

MIT
