
A fork of FastEmbed, extended with BM25s for sparse embeddings. From the FastEmbed description: a fast, light, accurate library built for retrieval embedding generation.


⚡️ What is FastEmbed?

FastEmbed is a lightweight, fast Python library built for embedding generation. We support popular text models. Please open a GitHub issue if you want us to add a new model.

The default text embedding (TextEmbedding) model is Flag Embedding, which is listed on the MTEB leaderboard. It supports "query" and "passage" prefixes for the input text. Examples are available for retrieval embedding generation and for using FastEmbed with Qdrant.
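The prefix idea can be sketched without loading a model. This is a minimal illustration only: the helper `with_prefix` is hypothetical, and the literal prefix strings ("query: " and "passage: ") are an assumption, so check the model card for the strings a given model actually expects.

```python
# Hypothetical helper, not part of FastEmbed's API.
# Assumption: the model expects the literal prefixes "query: " and "passage: ".
def with_prefix(texts: list[str], kind: str) -> list[str]:
    """Prepend a role prefix to each input text before embedding."""
    if kind not in ("query", "passage"):
        raise ValueError(f"kind must be 'query' or 'passage', got {kind!r}")
    return [f"{kind}: {text}" for text in texts]


queries = with_prefix(["Who maintains FastEmbed?"], "query")
passages = with_prefix(
    ["fastembed is supported by and maintained by Qdrant."], "passage"
)

print(queries[0])
print(passages[0])
```

Marking the role of each text lets a retrieval model treat short questions and longer documents differently, even though both pass through the same encoder.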

📈 Why FastEmbed?

  1. Light: FastEmbed is a lightweight library with few external dependencies. We don't require a GPU and don't download GBs of PyTorch dependencies, and instead use the ONNX Runtime. This makes it a great candidate for serverless runtimes like AWS Lambda.

  2. Fast: FastEmbed is designed for speed. We use the ONNX Runtime, which is typically faster than PyTorch for inference. We also use data parallelism for encoding large datasets.

  3. Accurate: The FastEmbed authors report better accuracy than OpenAI Ada-002. We also support an ever-expanding set of models, including a few multilingual models.

🚀 Installation

To install the FastEmbed library, pip works best. You can install it with or without GPU support:

pip install fastembed

# or with GPU support

pip install fastembed-gpu

📖 Quickstart

from fastembed import TextEmbedding


# Example list of documents
documents: list[str] = [
    "This is built to be faster and lighter than other embedding libraries e.g. Transformers, Sentence-Transformers, etc.",
    "fastembed is supported by and maintained by Qdrant.",
]

# This will trigger the model download and initialization
embedding_model = TextEmbedding()
print("The model BAAI/bge-small-en-v1.5 is ready to use.")

embeddings_generator = embedding_model.embed(documents)  # reminder: embed() returns a generator
# You can also convert the generator to a list, and that list to a NumPy array
embeddings_list = list(embedding_model.embed(documents))
len(embeddings_list[0])  # Vector of 384 dimensions
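Once embeddings are in a list, a common next step is comparing them. The following is a minimal sketch of cosine similarity in plain Python; `cosine_similarity` is a helper introduced here for illustration, and the three-dimensional toy vectors stand in for the 384-dimensional vectors produced above.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embeddings
doc_vec = [1.0, 0.0, 1.0]
query_vec = [1.0, 1.0, 0.0]

similarity = cosine_similarity(doc_vec, query_vec)
print(round(similarity, 3))  # 0.5
```

The same function should also accept the entries of `embeddings_list`, since it only iterates over the values.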

FastEmbed supports a variety of models for different tasks and modalities. The list of all available models can be found at https://qdrant.github.io/fastembed/examples/Supported_Models

🎒 Dense text embeddings

from fastembed import TextEmbedding

model = TextEmbedding(model_name="BAAI/bge-small-en-v1.5")
embeddings = list(model.embed(documents))

# [
#   array([-0.1115,  0.0097,  0.0052,  0.0195, ...], dtype=float32),
#   array([-0.1019,  0.0635, -0.0332,  0.0522, ...], dtype=float32)
# ]

🔱 Sparse text embeddings

  • SPLADE++
from fastembed import SparseTextEmbedding

model = SparseTextEmbedding(model_name="prithivida/Splade_PP_en_v1")
embeddings = list(model.embed(documents))

# [
#   SparseEmbedding(indices=[ 17, 123, 919, ... ], values=[0.71, 0.22, 0.39, ...]),
#   SparseEmbedding(indices=[ 38,  12,  91, ... ], values=[0.11, 0.22, 0.39, ...])
# ]
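A sparse embedding stores only the non-zero dimensions as parallel `indices` and `values`, so two embeddings can be compared with a dot product over their shared indices. The sketch below assumes that layout; `SparseVector` and `sparse_dot` are hypothetical stand-ins written for this example, not classes from the library.

```python
from dataclasses import dataclass


@dataclass
class SparseVector:
    """Hypothetical stand-in mirroring the indices/values layout shown above."""
    indices: list[int]
    values: list[float]


def sparse_dot(a: SparseVector, b: SparseVector) -> float:
    """Dot product over the indices that both vectors share."""
    b_lookup = dict(zip(b.indices, b.values))
    return sum(
        value * b_lookup.get(index, 0.0)
        for index, value in zip(a.indices, a.values)
    )


query = SparseVector(indices=[17, 919], values=[1.0, 0.5])
doc = SparseVector(indices=[17, 123, 919], values=[0.5, 0.25, 2.0])

score = sparse_dot(query, doc)  # 1.0 * 0.5 + 0.5 * 2.0
print(score)  # 1.5
```

Index 123 contributes nothing because the query has no weight there, which is what keeps sparse scoring cheap.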

🦥 Late interaction models (aka ColBERT)

from fastembed import LateInteractionTextEmbedding

model = LateInteractionTextEmbedding(model_name="colbert-ir/colbertv2.0")
embeddings = list(model.embed(documents))

# [
#   array([
#       [-0.1115,  0.0097,  0.0052,  0.0195, ...],
#       [-0.1019,  0.0635, -0.0332,  0.0522, ...],
#   ]),
#   array([
#       [-0.9019,  0.0335, -0.0032,  0.0991, ...],
#       [-0.2115,  0.8097,  0.1052,  0.0195, ...],
#   ]),  
# ]
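Late interaction models return one vector per token, so scoring a document against a query needs an extra step. A minimal sketch follows, assuming the MaxSim operator used by ColBERT: for each query token take the best-matching document token, then sum those maxima. The function names and toy two-dimensional vectors are illustrative only.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    return sum(x * y for x, y in zip(u, v))


def maxsim_score(query_tokens: Sequence[Vector], doc_tokens: Sequence[Vector]) -> float:
    """Sum, over query tokens, of the highest dot product with any document token."""
    if not doc_tokens:
        return 0.0  # convention chosen here for an empty document
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


# Toy token embeddings: two query tokens, two tokens per document
query_tokens = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.5, 0.5]]
doc_b = [[0.0, 0.5], [0.25, 0.25]]

score_a = maxsim_score(query_tokens, doc_a)
score_b = maxsim_score(query_tokens, doc_b)
print(score_a, score_b)  # 1.5 0.75
```

In this toy case `doc_a` ranks first because each query token finds a closer match among its tokens.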

🖼️ Image embeddings

from fastembed import ImageEmbedding

images = [
    "./path/to/image1.jpg",
    "./path/to/image2.jpg",
]

model = ImageEmbedding(model_name="Qdrant/clip-ViT-B-32-vision")
embeddings = list(model.embed(images))

# [
#   array([-0.1115,  0.0097,  0.0052,  0.0195, ...], dtype=float32),
#   array([-0.1019,  0.0635, -0.0332,  0.0522, ...], dtype=float32)
# ]

🔄 Rerankers

from fastembed.rerank.cross_encoder import TextCrossEncoder

query = "Who is maintaining Qdrant?"
documents: list[str] = [
    "This is built to be faster and lighter than other embedding libraries e.g. Transformers, Sentence-Transformers, etc.",
    "fastembed is supported by and maintained by Qdrant.",
]
encoder = TextCrossEncoder(model_name="Xenova/ms-marco-MiniLM-L-6-v2")
scores = list(encoder.rerank(query, documents))

# [-11.48061752319336, 5.472434997558594]
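The scores can then be used to reorder the documents. This sketch assumes the scores come back in the same order as the input documents and that a higher score means a better match, which is consistent with the sample output above; it reuses those sample scores rather than running the model.

```python
documents = [
    "This is built to be faster and lighter than other embedding libraries e.g. Transformers, Sentence-Transformers, etc.",
    "fastembed is supported by and maintained by Qdrant.",
]
# Sample scores copied from the output above, one per document
scores = [-11.48061752319336, 5.472434997558594]

# Pair each document with its score and sort from most to least relevant
ranked = sorted(zip(documents, scores), key=lambda pair: pair[1], reverse=True)

for doc, score in ranked:
    print(f"{score:8.3f}  {doc}")
```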

⚡️ FastEmbed on a GPU

FastEmbed supports running on GPU devices. It requires installation of the fastembed-gpu package.

pip install fastembed-gpu

See our example for detailed instructions, CUDA 12.x support, and troubleshooting of common issues.

from fastembed import TextEmbedding

embedding_model = TextEmbedding(
    model_name="BAAI/bge-small-en-v1.5", 
    providers=["CUDAExecutionProvider"]
)
print("The model BAAI/bge-small-en-v1.5 is ready to use on a GPU.")
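On machines where a GPU may or may not be present, the provider list can be chosen at runtime. The sketch below is an assumption-laden illustration: `choose_providers` is a hypothetical helper, and in practice its input would come from `onnxruntime.get_available_providers()`; here literal lists are used so the snippet runs without ONNX Runtime installed.

```python
def choose_providers(available: list[str]) -> list[str]:
    """Prefer CUDA when it is available, otherwise fall back to CPU."""
    preferred = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    chosen = [name for name in preferred if name in available]
    return chosen or ["CPUExecutionProvider"]


# In real use (assumption): available = onnxruntime.get_available_providers()
gpu_machine = ["TensorrtExecutionProvider", "CUDAExecutionProvider", "CPUExecutionProvider"]
cpu_machine = ["CPUExecutionProvider"]

print(choose_providers(gpu_machine))
print(choose_providers(cpu_machine))
```

The returned list could then be passed as the `providers` argument shown above.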

Usage with Qdrant

Installation with Qdrant Client in Python:

pip install qdrant-client[fastembed]

or

pip install qdrant-client[fastembed-gpu]

On zsh, you may need to quote the package spec: pip install 'qdrant-client[fastembed]'

from qdrant_client import QdrantClient

# Initialize the client
client = QdrantClient("localhost", port=6333) # For production
# client = QdrantClient(":memory:") # For small experiments

# Prepare your documents, metadata, and IDs
docs = ["Qdrant has Langchain integrations", "Qdrant also has Llama Index integrations"]
metadata = [
    {"source": "Langchain-docs"},
    {"source": "Llama-index-docs"},
]
ids = [42, 2]

# If you want to change the model:
# client.set_model("sentence-transformers/all-MiniLM-L6-v2")
# List of supported models: https://qdrant.github.io/fastembed/examples/Supported_Models

# Use the new add() instead of upsert()
# This internally calls embed() of the configured embedding model
client.add(
    collection_name="demo_collection",
    documents=docs,
    metadata=metadata,
    ids=ids
)

search_result = client.query(
    collection_name="demo_collection",
    query_text="This is a query document"
)
print(search_result)
