Fast, State of the Art Quantized Embedding Models
⚡️ What is FastEmbed?
FastEmbed is an easy-to-use, lightweight, fast Python library built for retrieval-augmented generation. The default embedding model supports "query" and "passage" prefixes for the input text.
Light
- Quantized model weights
- ONNX Runtime for inference
- No hidden dependencies on PyTorch or TensorFlow via Huggingface Transformers

Accuracy/Recall
- Better than OpenAI Ada-002
- Default is Flag Embedding, which is at the top of the MTEB leaderboard

Fast
- About 2x faster than Huggingface (PyTorch) Transformers on single queries
- A lot faster for batches!
- ONNX Runtime lets you use dedicated runtimes for even higher throughput and lower latency
🚀 Installation
To install the FastEmbed library, pip works:
pip install fastembed
📖 Usage
from typing import List

import numpy as np

from fastembed.embedding import FlagEmbedding as Embedding

documents: List[str] = [
    "passage: Hello, World!",
    "query: Hello, World!",  # these are two different embeddings
    "passage: This is an example passage.",
    # You can leave out the prefix, but it's recommended
    "fastembed is supported by and maintained by Qdrant.",
]
embedding_model = Embedding(model_name="BAAI/bge-base-en", max_length=512)
embeddings: List[np.ndarray] = list(embedding_model.embed(documents))
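A common next step is scoring a query embedding against passage embeddings. A minimal sketch using only NumPy; the vectors below are stand-ins for the real outputs of embedding_model.embed(...):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine similarity: dot product of the vectors divided by
    # the product of their L2 norms.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Stand-in vectors; in practice these come from embedding_model.embed(...)
query_vec = np.array([0.1, 0.3, 0.5])
passage_vec = np.array([0.2, 0.1, 0.4])

score = cosine_similarity(query_vec, passage_vec)
```

Higher scores indicate passages that are semantically closer to the query, which is the basis for ranking results in a retrieval pipeline.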
🚒 Under the hood
Why fast?
It's important we justify the "fast" in FastEmbed. FastEmbed is fast because:
- Quantized model weights
- ONNX Runtime which allows for inference on CPU, GPU, and other dedicated runtimes
Why light?
- No hidden dependencies on PyTorch or TensorFlow via Huggingface Transformers
Why accurate?
- Better than OpenAI Ada-002
- Top of the embedding leaderboards, e.g. MTEB