Fast, State of the Art Quantized Embedding Models
⚡️ What is FastEmbed?
FastEmbed is an easy-to-use, lightweight, fast Python library built for retrieval-augmented generation. The default embedding model supports "query" and "passage" prefixes for the input text.
Light
- Quantized model weights
- ONNX Runtime for inference
- No hidden dependencies on PyTorch or TensorFlow via Hugging Face Transformers
Accuracy/Recall
- Better than OpenAI Ada-002
- The default is Flag Embedding, which is at the top of the MTEB leaderboard
Fast
- About 2x faster than Hugging Face (PyTorch) Transformers on single queries
- Much faster for batches
- ONNX Runtime lets you use dedicated runtimes for even higher throughput and lower latency
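The single-query and batch speed claims are easiest to judge on your own hardware. The following is a minimal timing sketch, not a FastEmbed benchmark: `seconds_per_text` and `fake_embed` are hypothetical names introduced here, and `fake_embed` is only a stand-in that you would swap for a real embedding call.

```python
import time
from typing import Callable, Iterable, List, Sequence


def seconds_per_text(
    embed: Callable[[Sequence[str]], Iterable],
    texts: Sequence[str],
    batch_size: int,
) -> float:
    """Return the average wall-clock seconds spent per text."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    start = time.perf_counter()
    for i in range(0, len(texts), batch_size):
        # list() forces lazy generators to do their work inside the timed region
        list(embed(texts[i:i + batch_size]))
    elapsed = time.perf_counter() - start
    return elapsed / max(len(texts), 1)


def fake_embed(batch: Sequence[str]) -> List[List[float]]:
    # Hypothetical stand-in for a real model call; replace with your embedder.
    return [[float(len(text))] for text in batch]


texts = ["passage: example text"] * 64
single = seconds_per_text(fake_embed, texts, batch_size=1)
batched = seconds_per_text(fake_embed, texts, batch_size=32)
print(f"single: {single:.2e} s/text, batched: {batched:.2e} s/text")
```

With a real model, run each configuration several times and discard the first run, since model loading and warm-up can dominate a one-off measurement.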
🚀 Installation
Install FastEmbed with pip:
pip install fastembed
📖 Usage
from typing import List

import numpy as np

from fastembed.embedding import FlagEmbedding as Embedding

documents: List[str] = [
    "passage: Hello, World!",
    "query: Hello, World!",  # same text, different prefix: a different embedding
    "passage: This is an example passage.",
    # You can leave out the prefix, but including it is recommended
    "fastembed is supported by and maintained by Qdrant.",
]
embedding_model = Embedding(model_name="BAAI/bge-base-en", max_length=512)
# list() collects one vector per document from embed()
embeddings: List[np.ndarray] = list(embedding_model.embed(documents))
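Once you have embeddings, a typical retrieval step is to prefix the inputs consistently and rank passages by cosine similarity to the query. The sketch below illustrates that idea in plain Python. The helpers `with_prefix`, `cosine_similarity`, and `rank_passages` are hypothetical and not part of FastEmbed, and the toy 3-dimensional vectors stand in for real model output.

```python
import math
from typing import List, Sequence


def with_prefix(text: str, role: str) -> str:
    """Prepend "query: " or "passage: " unless a prefix is already present."""
    if role not in ("query", "passage"):
        raise ValueError("role must be 'query' or 'passage'")
    if text.startswith(("query: ", "passage: ")):
        return text
    return f"{role}: {text}"


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_passages(
    query_vec: Sequence[float], passage_vecs: Sequence[Sequence[float]]
) -> List[int]:
    """Return passage indices ordered from most to least similar."""
    scores = [cosine_similarity(query_vec, p) for p in passage_vecs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Toy vectors standing in for real embeddings (assumption for illustration)
query_vec = [1.0, 0.0, 0.0]
passage_vecs = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [0.5, 0.5, 0.0]]

print(with_prefix("Hello, World!", "query"))   # query: Hello, World!
print(rank_passages(query_vec, passage_vecs))  # [1, 2, 0]
```

In real use, the vectors would come from the embedding model, with queries embedded using the "query" prefix and stored documents using the "passage" prefix.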
🚒 Under the hood
Why fast?
It's important we justify the "fast" in FastEmbed. FastEmbed is fast because:
- Quantized model weights
- ONNX Runtime, which allows inference on CPU, GPU, and other dedicated runtimes
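To make "quantized model weights" concrete, here is a generic sketch of symmetric int8 quantization. It is an illustration of the general technique only: this README does not describe which quantization scheme FastEmbed's models use, and `quantize_int8` and `dequantize` are hypothetical helpers.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization of floats into the range [-127, 127]."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # One scale factor maps the largest magnitude onto 127
    scale = max_abs / 127.0 if max_abs > 0.0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float values from the stored integers."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.0, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
print(q)         # [50, -127, 0, 100]
print(restored)  # close to the original weights, within scale / 2 each
```

Each weight is stored in one byte instead of the four bytes of a float32, at the cost of a rounding error of at most half the scale factor per value.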
Why light?
- No hidden dependencies on PyTorch or TensorFlow via Hugging Face Transformers
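One way to check the "no hidden dependencies" claim in your own environment is to look at which modules are loaded after importing a library. The sketch below assumes the heavy frameworks would appear in `sys.modules` under the names `torch` and `tensorflow`. `loaded_heavy_frameworks` is a hypothetical helper, not part of FastEmbed.

```python
import sys
from typing import Iterable, List, Mapping, Optional

HEAVY_FRAMEWORKS = ("torch", "tensorflow")


def loaded_heavy_frameworks(
    modules: Optional[Mapping[str, object]] = None,
    names: Iterable[str] = HEAVY_FRAMEWORKS,
) -> List[str]:
    """Return the heavy frameworks present in the given module table."""
    table = sys.modules if modules is None else modules
    return [name for name in names if name in table]


# Call this after importing the library you want to audit.
print(loaded_heavy_frameworks())
```

This only reports modules that have actually been imported in the current process. It does not tell you what is installed on disk, so pair it with your package manager's dependency listing for a full picture.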
Why accurate?
- Better than OpenAI Ada-002
- Top of embedding leaderboards, e.g. MTEB