
libembedding

Fast ONNX-based text, image, and sparse embeddings for Python. 5-8x faster than fastembed with 3.5x less memory.

Built on a C/C++ backend using ONNX Runtime, exposed to Python via zero-overhead cffi bindings. Supports 44 text embedding models, 5 image models, 2 sparse models, and 4 rerankers with automatic model downloading from HuggingFace Hub.

Installation

pip install libembedding

Requirements: ONNX Runtime must be installed on your system.

# macOS
brew install onnxruntime

# Ubuntu/Debian
apt install libonnxruntime-dev

# Or set ONNXRUNTIME_ROOT to your installation path

Quick Start

Text Embeddings

from libembedding import TextEmbedding

model = TextEmbedding("BAAI/bge-small-en-v1.5")
embeddings = model.embed(["Hello world", "How are you?"])

print(embeddings.shape)  # (2, 384)
print(embeddings.dtype)  # float32
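Because the returned embeddings are L2-normalized, cosine similarity between two texts reduces to a dot product of their rows. A minimal numpy sketch with stand-in vectors (randomly generated, not real model output) illustrates the idea:

```python
import numpy as np

# Stand-in for model.embed(...): two L2-normalized 384-dim float32 vectors.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(2, 384)).astype(np.float32)
embeddings /= np.linalg.norm(embeddings, axis=1, keepdims=True)

# For unit-length rows, cosine similarity is a plain matrix product.
similarity = embeddings @ embeddings.T
print(similarity.shape)  # (2, 2); diagonal entries are ~1.0 (self-similarity)
```

The same pattern scales to query-vs-corpus search: normalize once, then rank by dot product.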

Sparse Embeddings

from libembedding import SparseTextEmbedding

model = SparseTextEmbedding()
results = model.embed(["machine learning algorithms"])

for r in results:
    print(r.indices.shape, r.values.shape)

Image Embeddings

from libembedding import ImageEmbedding

model = ImageEmbedding()
embeddings = model.embed_files(["photo.jpg", "diagram.png"])

Reranking

from libembedding import Reranker

reranker = Reranker("BAAI/bge-reranker-base")
results = reranker.rerank(
    "What is deep learning?",
    [
        "Deep learning uses neural networks with many layers",
        "The weather is sunny today",
        "Neural networks are inspired by biological brains",
    ],
)
for r in results:
    print(f"doc[{r.index}] score={r.score:.4f}")

Model Discovery

import libembedding

for m in libembedding.list_text_models():
    print(f"{m.model_name:45} dim={m.dim:<5} {m.pooling}")

API Reference

TextEmbedding

TextEmbedding(
    model_name="BAAI/bge-small-en-v1.5",  # HuggingFace repo id
    provider="cpu",                         # "cpu", "cuda", "coreml", "directml", "tensorrt"
    device_id=0,
    cache_dir=None,                         # None = ~/.cache/libembedding
    max_length=0,                           # 0 = model default
    num_threads=0,                          # 0 = auto
    show_download_progress=True,
)
| Method | Returns | Description |
|---|---|---|
| embed(texts, batch_size=0) | np.ndarray (n, dim) | L2-normalized dense embeddings |
| dim | int | Embedding dimension |
| close() | None | Release resources |

SparseTextEmbedding

SparseTextEmbedding(model_name="prithvida/SPLADE_PP_en_v1", ...)
| Method | Returns | Description |
|---|---|---|
| embed(texts, batch_size=0) | list[SparseEmbedding] | Sparse vectors with .indices and .values |

ImageEmbedding

ImageEmbedding(model_name="Qdrant/clip-ViT-B-32-vision", ...)
| Method | Returns | Description |
|---|---|---|
| embed_files(paths, batch_size=0) | np.ndarray (n, dim) | Embed from file paths |
| embed_bytes(images, batch_size=0) | np.ndarray (n, dim) | Embed from raw bytes |

Reranker

Reranker(model_name="BAAI/bge-reranker-base", ...)
| Method | Returns | Description |
|---|---|---|
| rerank(query, documents, batch_size=0) | list[RerankResult] | Sorted by score descending |

All classes support context managers (with TextEmbedding(...) as model:).

Available Models

44 text models including BGE, MiniLM, Nomic, E5, CLIP, Jina, GTE, Snowflake, ModernBERT (with quantized variants).

5 image models including CLIP ViT-B/32, ResNet-50, Unicom, Nomic Vision.

2 sparse models: SPLADE++, BGE-M3.

4 reranker models across the BGE Reranker and Jina Reranker families.

Benchmarks

Measured on Apple M-series with all-MiniLM-L6-v2 (384-dim). Median of 10 runs.

| Metric | libembedding | fastembed | Speedup |
|---|---|---|---|
| Single text latency (ms) | 4.4 | 38.0 | 8.6x |
| Batch 8 (texts/sec) | 641 | 92 | 7.0x |
| Batch 32 (texts/sec) | 581 | 89 | 6.5x |
| Peak RSS (MB) | 567 | 1,981 | 3.5x less |
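The exact harness behind these numbers isn't shown here, but a minimal latency measurement in the same spirit (median of repeated runs) can be sketched with the standard library. The embed_stub below is a placeholder; swap in a real TextEmbedding.embed call to benchmark for yourself:

```python
import statistics
import time

def embed_stub(texts):
    # Placeholder for TextEmbedding.embed; returns fake 384-dim vectors.
    return [[0.0] * 384 for _ in texts]

def median_latency_ms(fn, texts, runs=10):
    """Time fn(texts) `runs` times and return the median latency in ms."""
    samples = []
    for _ in range(runs):
        t0 = time.perf_counter()
        fn(texts)
        samples.append((time.perf_counter() - t0) * 1000.0)
    return statistics.median(samples)

latency = median_latency_ms(embed_stub, ["Hello world"])
print(f"median single-text latency: {latency:.3f} ms")
```

Using the median rather than the mean keeps one slow outlier run (e.g. a cold cache) from skewing the result.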

Configuration

| Environment Variable | Purpose |
|---|---|
| LIBEMBEDDING_CACHE_DIR | Override model cache directory |
| FASTEMBED_CACHE_DIR | Alternative cache dir (fastembed compatibility) |
| HF_ENDPOINT | Custom HuggingFace Hub endpoint |
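A sketch of how cache-directory resolution could work given these variables. Note: the precedence shown (explicit argument, then LIBEMBEDDING_CACHE_DIR, then FASTEMBED_CACHE_DIR, then the default) is an assumption for illustration, not documented library behavior:

```python
import os
from pathlib import Path

def resolve_cache_dir(explicit=None):
    """Hypothetical resolution order: explicit arg > LIBEMBEDDING_CACHE_DIR
    > FASTEMBED_CACHE_DIR > ~/.cache/libembedding (an assumption)."""
    if explicit:
        return Path(explicit)
    for var in ("LIBEMBEDDING_CACHE_DIR", "FASTEMBED_CACHE_DIR"):
        value = os.environ.get(var)
        if value:
            return Path(value)
    return Path.home() / ".cache" / "libembedding"

os.environ["LIBEMBEDDING_CACHE_DIR"] = "/tmp/emb-cache"
print(resolve_cache_dir())  # /tmp/emb-cache
```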

Building & Publishing

Prerequisites

pip install build twine

Build the shared library + wheel

cd python/

# Step 1: Build the C/C++ shared library and copy it into the package
./setup.sh --build-only

# Step 2: Build sdist and wheel
python -m build

This produces files in dist/:

dist/
  libembedding-0.1.0.tar.gz              # source distribution
  libembedding-0.1.0-py3-none-any.whl    # wheel (includes bundled .dylib/.so)

Upload to PyPI

# Upload to TestPyPI first to verify
twine upload --repository testpypi dist/*

# Install from TestPyPI to verify
pip install --index-url https://test.pypi.org/simple/ libembedding

# Upload to production PyPI
twine upload dist/*

One-liner (build + upload)

./setup.sh --build-only && python -m build && twine upload dist/*

Platform-specific wheels

The default wheel is py3-none-any and bundles the shared library for the build platform. To build platform-tagged wheels for distribution:

# macOS (current arch)
./setup.sh --build-only
python -m build

# For other platforms, build on that platform or use cibuildwheel:
pip install cibuildwheel
cibuildwheel --platform linux   # builds manylinux wheels
cibuildwheel --platform macos   # builds macOS wheels

License

MIT
