
Haystack 2.x components for embedding strings and Documents with fastembed embedding models


fastembed-haystack

Installation

pip install fastembed-haystack
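
To confirm that the install succeeded, you can query the installed distribution from Python. This is a minimal sketch using only the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical and not part of fastembed-haystack.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Prints a version string if the package is installed, otherwise None.
print(installed_version("fastembed-haystack"))
```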

Usage

You can use FastembedTextEmbedder and FastembedDocumentEmbedder by importing them as follows:

from haystack_integrations.components.embedders.fastembed import FastembedTextEmbedder

text = "fastembed is supported by and maintained by Qdrant."
text_embedder = FastembedTextEmbedder(
    model="BAAI/bge-small-en-v1.5"
)
text_embedder.warm_up()
embedding = text_embedder.run(text)["embedding"]

from haystack_integrations.components.embedders.fastembed import FastembedDocumentEmbedder
from haystack import Document

embedder = FastembedDocumentEmbedder(
    model="BAAI/bge-small-en-v1.5",
)
embedder.warm_up()
doc = Document(content="fastembed is supported by and maintained by Qdrant.", meta={"long_answer": "no",})
result = embedder.run(documents=[doc])
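
The text embedder returns its vector under the "embedding" key, and each Document returned by the document embedder carries its vector in its `embedding` field. A common next step is to compare a query vector against document vectors. The sketch below computes cosine similarity in plain Python so that it runs without downloading a model. The three-dimensional vectors are made-up stand-ins (real models return much longer vectors), and `cosine_similarity` is a hypothetical helper, not part of this package.

```python
import math
from typing import List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two equal-length dense vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Made-up stand-ins for a query embedding and two document embeddings.
query_vec = [1.0, 0.0, 0.0]
doc_vecs = {"doc_a": [2.0, 0.0, 0.0], "doc_b": [0.0, 3.0, 0.0]}

# Score every document against the query and pick the closest one.
scores = {name: cosine_similarity(query_vec, vec) for name, vec in doc_vecs.items()}
best = max(scores, key=scores.get)
```

In practice a document store and retriever would normally handle this comparison; the helper only illustrates what the vectors are used for.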

You can use FastembedSparseTextEmbedder and FastembedSparseDocumentEmbedder by importing them as follows:

from haystack_integrations.components.embedders.fastembed import FastembedSparseTextEmbedder

text = "fastembed is supported by and maintained by Qdrant."
text_embedder = FastembedSparseTextEmbedder(
    model="prithivida/Splade_PP_en_v1"
)
text_embedder.warm_up()
embedding = text_embedder.run(text)["sparse_embedding"]

from haystack_integrations.components.embedders.fastembed import FastembedSparseDocumentEmbedder
from haystack import Document

embedder = FastembedSparseDocumentEmbedder(
    model="prithivida/Splade_PP_en_v1",
)
embedder.warm_up()
doc = Document(content="fastembed is supported by and maintained by Qdrant.", meta={"long_answer": "no",})
result = embedder.run(documents=[doc])
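
A sparse embedding stores only its non-zero dimensions, as parallel lists of indices and values. To illustrate how two such vectors are compared, the sketch below computes their dot product in plain Python. `SparseVector` is a hypothetical stand-in with the same two fields so that the example runs without the library, `sparse_dot` is likewise a hypothetical helper, and the numbers are made up.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SparseVector:
    """Stand-in with parallel lists: indices[i] is the dimension of values[i]."""

    indices: List[int]
    values: List[float]


def sparse_dot(a: SparseVector, b: SparseVector) -> float:
    """Dot product over the dimensions that are non-zero in both vectors."""
    b_lookup = dict(zip(b.indices, b.values))
    return sum(v * b_lookup[i] for i, v in zip(a.indices, a.values) if i in b_lookup)


# Made-up values; real sparse vectors span a much larger index range.
query = SparseVector(indices=[3, 17, 42], values=[0.5, 1.0, 2.0])
document = SparseVector(indices=[17, 42, 99], values=[2.0, 0.25, 4.0])

# Only dimensions 17 and 42 overlap, so only they contribute to the score.
score = sparse_dot(query, document)
```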

You can use FastembedRanker by importing it as follows:

from haystack import Document

from haystack_integrations.components.rankers.fastembed import FastembedRanker

query = "Who is maintaining Qdrant?"
documents = [
    Document(
        content="This is built to be faster and lighter than other embedding libraries e.g. Transformers, Sentence-Transformers, etc."
    ),
    Document(content="fastembed is supported by and maintained by Qdrant."),
]

ranker = FastembedRanker(model_name="Xenova/ms-marco-MiniLM-L-6-v2")
ranker.warm_up()
reranked_documents = ranker.run(query=query, documents=documents)["documents"]

print(reranked_documents[0])

# Document(id=...,
#  content: 'fastembed is supported by and maintained by Qdrant.',
#  score: 5.472434997558594..)
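
The score in the sample output above is greater than 1, so it is not a probability. Assuming the ranker emits raw cross-encoder logits (an assumption; the README does not say so), a logistic function maps them into the range (0, 1), which can make thresholds easier to reason about. The helper `sigmoid` below is a hypothetical illustration and not part of this package.

```python
import math


def sigmoid(logit: float) -> float:
    """Numerically stable logistic function mapping a real score to (0, 1)."""
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    exp_x = math.exp(logit)
    return exp_x / (1.0 + exp_x)


raw_score = 5.472434997558594  # score copied from the sample output above
normalized = sigmoid(raw_score)
```

Because the logistic function is monotonic, applying it does not change the order of the reranked documents.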

License

fastembed-haystack is distributed under the terms of the Apache-2.0 license.
