
libembedding

Fast ONNX-based text, image, and sparse embeddings for Python. 5-8x faster than fastembed with 3.5x less memory.

Built on a C/C++ backend using ONNX Runtime, exposed to Python via zero-overhead cffi bindings. Supports 44 text embedding models, 5 image models, 2 sparse models, and 4 rerankers with automatic model downloading from HuggingFace Hub.

Installation

pip install libembedding

Requirements: ONNX Runtime must be installed on your system.

# macOS
brew install onnxruntime

# Ubuntu/Debian
apt install libonnxruntime-dev

# Or set ONNXRUNTIME_ROOT to your installation path

Quick Start

Text Embeddings

from libembedding import TextEmbedding

model = TextEmbedding("BAAI/bge-small-en-v1.5")
embeddings = model.embed(["Hello world", "How are you?"])

print(embeddings.shape)  # (2, 384)
print(embeddings.dtype)  # float32
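Because the returned embeddings are L2-normalized, cosine similarity between two texts reduces to a dot product of their rows. A minimal numpy sketch with stand-in vectors (randomly generated, not real model output) illustrates the idea:

```python
import numpy as np

# Stand-in for model.embed(...): two L2-normalized 384-dim float32 vectors.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(2, 384)).astype(np.float32)
embeddings /= np.linalg.norm(embeddings, axis=1, keepdims=True)

# For unit-length rows, cosine similarity is a plain matrix product.
similarity = embeddings @ embeddings.T
print(similarity.shape)  # (2, 2); diagonal entries are ~1.0 (self-similarity)
```

The same pattern scales to query-vs-corpus search: normalize once, then rank by dot product.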

Sparse Embeddings

from libembedding import SparseTextEmbedding

model = SparseTextEmbedding()
results = model.embed(["machine learning algorithms"])

for r in results:
    print(r.indices.shape, r.values.shape)

Image Embeddings

from libembedding import ImageEmbedding

model = ImageEmbedding()
embeddings = model.embed_files(["photo.jpg", "diagram.png"])

Reranking

from libembedding import Reranker

reranker = Reranker("BAAI/bge-reranker-base")
results = reranker.rerank(
    "What is deep learning?",
    [
        "Deep learning uses neural networks with many layers",
        "The weather is sunny today",
        "Neural networks are inspired by biological brains",
    ],
)
for r in results:
    print(f"doc[{r.index}] score={r.score:.4f}")

Model Discovery

import libembedding

for m in libembedding.list_text_models():
    print(f"{m.model_name:45} dim={m.dim:<5} {m.pooling}")

API Reference

TextEmbedding

TextEmbedding(
    model_name="BAAI/bge-small-en-v1.5",  # HuggingFace repo id
    provider="cpu",                         # "cpu", "cuda", "coreml", "directml", "tensorrt"
    device_id=0,
    cache_dir=None,                         # None = ~/.cache/libembedding
    max_length=0,                           # 0 = model default
    num_threads=0,                          # 0 = auto
    show_download_progress=True,
)
| Method | Returns | Description |
|---|---|---|
| embed(texts, batch_size=0) | np.ndarray (n, dim) | L2-normalized dense embeddings |
| dim | int | Embedding dimension |
| close() | None | Release resources |

SparseTextEmbedding

SparseTextEmbedding(model_name="prithvida/SPLADE_PP_en_v1", ...)
| Method | Returns | Description |
|---|---|---|
| embed(texts, batch_size=0) | list[SparseEmbedding] | Sparse vectors with .indices and .values |

ImageEmbedding

ImageEmbedding(model_name="Qdrant/clip-ViT-B-32-vision", ...)
| Method | Returns | Description |
|---|---|---|
| embed_files(paths, batch_size=0) | np.ndarray (n, dim) | Embed from file paths |
| embed_bytes(images, batch_size=0) | np.ndarray (n, dim) | Embed from raw bytes |

Reranker

Reranker(model_name="BAAI/bge-reranker-base", ...)
| Method | Returns | Description |
|---|---|---|
| rerank(query, documents, batch_size=0) | list[RerankResult] | Sorted by score descending |

All classes support context managers (with TextEmbedding(...) as model:).

Available Models

44 text models including BGE, MiniLM, Nomic, E5, CLIP, Jina, GTE, Snowflake, ModernBERT (with quantized variants).

5 image models including CLIP ViT-B/32, ResNet-50, Unicom, Nomic Vision.

2 sparse models: SPLADE++, BGE-M3.

4 reranker models across the BGE Reranker and Jina Reranker families.

Benchmarks

Measured on Apple M-series with all-MiniLM-L6-v2 (384-dim). Median of 10 runs.

| Metric | libembedding | fastembed | Speedup |
|---|---|---|---|
| Single text latency (ms) | 4.4 | 38.0 | 8.6x |
| Batch 8 (texts/sec) | 641 | 92 | 7.0x |
| Batch 32 (texts/sec) | 581 | 89 | 6.5x |
| Peak RSS (MB) | 567 | 1,981 | 3.5x less |
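The exact harness behind these numbers isn't shown here, but a minimal latency measurement in the same spirit (median of repeated runs) can be sketched with the standard library. The embed_stub below is a placeholder; swap in a real TextEmbedding.embed call to benchmark for yourself:

```python
import statistics
import time

def embed_stub(texts):
    # Placeholder for TextEmbedding.embed; returns fake 384-dim vectors.
    return [[0.0] * 384 for _ in texts]

def median_latency_ms(fn, texts, runs=10):
    """Time fn(texts) `runs` times and return the median latency in ms."""
    samples = []
    for _ in range(runs):
        t0 = time.perf_counter()
        fn(texts)
        samples.append((time.perf_counter() - t0) * 1000.0)
    return statistics.median(samples)

latency = median_latency_ms(embed_stub, ["Hello world"])
print(f"median single-text latency: {latency:.3f} ms")
```

Using the median rather than the mean keeps one slow outlier run (e.g. a cold cache) from skewing the result.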

Configuration

| Environment Variable | Purpose |
|---|---|
| LIBEMBEDDING_CACHE_DIR | Override model cache directory |
| FASTEMBED_CACHE_DIR | Alternative cache dir (fastembed compatibility) |
| HF_ENDPOINT | Custom HuggingFace Hub endpoint |
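A sketch of how cache-directory resolution could work given these variables. Note: the precedence shown (explicit argument, then LIBEMBEDDING_CACHE_DIR, then FASTEMBED_CACHE_DIR, then the default) is an assumption for illustration, not documented library behavior:

```python
import os
from pathlib import Path

def resolve_cache_dir(explicit=None):
    """Hypothetical resolution order: explicit arg > LIBEMBEDDING_CACHE_DIR
    > FASTEMBED_CACHE_DIR > ~/.cache/libembedding (an assumption)."""
    if explicit:
        return Path(explicit)
    for var in ("LIBEMBEDDING_CACHE_DIR", "FASTEMBED_CACHE_DIR"):
        value = os.environ.get(var)
        if value:
            return Path(value)
    return Path.home() / ".cache" / "libembedding"

os.environ["LIBEMBEDDING_CACHE_DIR"] = "/tmp/emb-cache"
print(resolve_cache_dir())  # /tmp/emb-cache
```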

Building & Publishing

Prerequisites

pip install build twine

Build the shared library + wheel

cd python/

# Step 1: Build the C/C++ shared library and copy it into the package
./setup.sh --build-only

# Step 2: Build sdist and wheel
python -m build

This produces files in dist/:

dist/
  libembedding-0.1.0.tar.gz              # source distribution
  libembedding-0.1.0-py3-none-any.whl    # wheel (includes bundled .dylib/.so)

Upload to PyPI

# Upload to TestPyPI first to verify
twine upload --repository testpypi dist/*

# Install from TestPyPI to verify
pip install --index-url https://test.pypi.org/simple/ libembedding

# Upload to production PyPI
twine upload dist/*

One-liner (build + upload)

./setup.sh --build-only && python -m build && twine upload dist/*

Platform-specific wheels

The default wheel is py3-none-any and bundles the shared library for the build platform. To build platform-tagged wheels for distribution:

# macOS (current arch)
./setup.sh --build-only
python -m build

# For other platforms, build on that platform or use cibuildwheel:
pip install cibuildwheel
cibuildwheel --platform linux   # builds manylinux wheels
cibuildwheel --platform macos   # builds macOS wheels

License

MIT
