Fast, State-of-the-Art Quantized Embedding Models
⚡️ What is FastEmbed?
FastEmbed is an easy-to-use, lightweight, fast Python library built for retrieval-augmented generation. The default embedding model supports "query" and "passage" prefixes for the input text.
- Light
  - Quantized model weights
  - ONNX Runtime for inference
  - No hidden dependencies on PyTorch or TensorFlow via Hugging Face Transformers
- Accuracy/Recall
  - Better than OpenAI Ada-002
  - The default is Flag Embedding, which is at the top of the MTEB leaderboard
- Fast
  - About 2x faster than Hugging Face (PyTorch) Transformers on single queries
  - Much faster on batches
  - ONNX Runtime lets you use dedicated runtimes for even higher throughput and lower latency
🚀 Installation
To install the FastEmbed library, pip works:

```bash
pip install fastembed
```
📖 Usage
```python
from typing import List

import numpy as np

from fastembed.embedding import FlagEmbedding as Embedding

documents: List[str] = [
    "passage: Hello, World!",
    "query: Hello, World!",  # these are two different embeddings
    "passage: This is an example passage.",
    # You can leave out the prefix, but it's recommended
    "fastembed is supported by and maintained by Qdrant.",
]
embedding_model = Embedding(model_name="BAAI/bge-base-en", max_length=512)
embeddings: List[np.ndarray] = list(embedding_model.embed(documents))
```
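In a retrieval pipeline, the returned vectors are typically compared by cosine similarity to rank passages against a query. A minimal sketch, using placeholder NumPy vectors in place of actual model output (bge-base-en produces 768-dimensional embeddings; the vectors here are random stand-ins):

```python
import numpy as np

# Placeholder vectors standing in for FastEmbed output.
rng = np.random.default_rng(0)
query = rng.normal(size=768)          # one query embedding
passages = rng.normal(size=(3, 768))  # three passage embeddings

def cosine_similarity(q: np.ndarray, m: np.ndarray) -> np.ndarray:
    # Normalize, then take the dot product against each row.
    q = q / np.linalg.norm(q)
    m = m / np.linalg.norm(m, axis=1, keepdims=True)
    return m @ q

scores = cosine_similarity(query, passages)
best = int(np.argmax(scores))  # index of the most similar passage
```

With real embeddings, you would pass the "query: ..." vector as `q` and the "passage: ..." vectors as rows of `m`.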
🚒 Under the hood
Why fast?
It's important that we justify the "fast" in FastEmbed. FastEmbed is fast because of:
- Quantized model weights
- ONNX Runtime, which allows for inference on CPU, GPU, and other dedicated runtimes
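To illustrate why quantized weights make a model lighter, here is a toy symmetric int8 quantization sketch in NumPy. This is an illustration of the general technique, not FastEmbed's actual quantization scheme (which lives inside the exported ONNX model):

```python
import numpy as np

# A toy float32 weight matrix standing in for real model weights.
weights = np.random.default_rng(1).normal(size=(4, 8)).astype(np.float32)

# Symmetric quantization: store int8 values plus one float scale.
scale = np.abs(weights).max() / 127.0
q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)

# Approximate reconstruction at inference time.
dequant = q.astype(np.float32) * scale

size_ratio = q.nbytes / weights.nbytes     # int8 is 1/4 the size of float32
max_err = np.abs(weights - dequant).max()  # bounded by half a quantization step
```

Storing int8 instead of float32 cuts the weight footprint to a quarter, at the cost of a small, bounded rounding error per weight.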
Why light?
- No hidden dependencies on PyTorch or TensorFlow via Hugging Face Transformers
Why accurate?
- Better than OpenAI Ada-002
- Top of the embedding leaderboards, e.g. MTEB
Download files
Source Distribution

fastembed-0.0.3a1.tar.gz (9.2 kB)
Built Distribution
Hashes for fastembed-0.0.3a1-py3-none-any.whl

| Algorithm | Hash digest |
|---|---|
| SHA256 | 7975160610025cfce45396d7ef7a1befd410bea2da54aef7b7301c47a0965ed2 |
| MD5 | 7edf1a4443b2d8068972edf508836153 |
| BLAKE2b-256 | f29a8211f081b3aecb7ad4278ac88178378f8198ff42cfa9376159ae444f86e6 |