Marqo Document Store for Haystack

Installation

pip install marqo-haystack

About

This is a document store integration for Marqo with Haystack.

Marqo is an end-to-end vector search engine which includes preprocessing and inference to generate vectors from your data. You can use pre-trained models or bring your own fine-tuned ones.

Haystack is an end-to-end NLP framework for building applications powered by LLMs. It lets you assemble pipelines that solve your use case with state-of-the-art models.

Examples

from marqo_haystack import MarqoDocumentStore
 
document_store = MarqoDocumentStore()

You can find a code example showing how to use the Document Store and the Retriever under the example/ folder of this repo.

Usage

For documentation on Marqo itself, please refer to the documentation.

You can use the MarqoDocumentStore in your Haystack pipelines for single queries like so:

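# Assumed import path for Haystack 2.x; preview releases exposed Pipeline under haystack.preview
from haystack import Pipeline
from marqo_haystack import MarqoDocumentStore
from marqo_haystack.retriever import MarqoSingleRetriever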

document_store = MarqoDocumentStore()

querying = Pipeline()
querying.add_component("retriever", MarqoSingleRetriever(document_store))
results = querying.run({"retriever": {"query": "Is black and white text boring?", "top_k": 3}})

Or for a list of queries:

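# Assumed import path for Haystack 2.x; preview releases exposed Pipeline under haystack.preview
from haystack import Pipeline
from marqo_haystack import MarqoDocumentStore
from marqo_haystack.retriever import MarqoRetriever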

document_store = MarqoDocumentStore()

querying = Pipeline()
querying.add_component("retriever", MarqoRetriever(document_store))
results = querying.run({"retriever": {"queries": ["Is black and white text boring?"], "top_k": 3}})
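In the two examples above, the only difference in the input passed to run is the key: MarqoSingleRetriever takes a query string, while MarqoRetriever takes a queries list. A small helper along these lines could build the right input for either case. Note that build_retriever_input is a hypothetical name used for illustration and is not part of marqo-haystack; the component name "retriever" simply matches the examples above.

```python
from typing import Any, Dict, List, Union


def build_retriever_input(
    query: Union[str, List[str]],
    top_k: int = 3,
    component: str = "retriever",
) -> Dict[str, Dict[str, Any]]:
    """Build the dict passed to Pipeline.run() for either retriever.

    Hypothetical helper, not part of marqo-haystack:
    a str produces the "query" key (as used with MarqoSingleRetriever),
    a list produces the "queries" key (as used with MarqoRetriever).
    """
    if isinstance(query, str):
        return {component: {"query": query, "top_k": top_k}}
    return {component: {"queries": list(query), "top_k": top_k}}


# Input shape for a pipeline built around MarqoSingleRetriever
single_input = build_retriever_input("Is black and white text boring?")

# Input shape for a pipeline built around MarqoRetriever
batch_input = build_retriever_input(["Is black and white text boring?"], top_k=5)
```

The resulting dict would then be passed straight to querying.run(...), keeping the choice between the two input shapes in one place.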

Using Locally

If you specify a collection_name that doesn't exist as a Marqo index, one will be created for you.

from marqo_haystack import MarqoDocumentStore

# Use an existing index (if my-index already exists)
document_store = MarqoDocumentStore(collection_name="my-index")

# Create a new index (if my-new-index doesn't exist yet)
document_store = MarqoDocumentStore(collection_name="my-new-index")

# Use the default index name, 'documents'. One will be created if it doesn't exist.
document_store = MarqoDocumentStore()

You can also configure the index that gets created by passing a dictionary to the settings_dict parameter. For details on the settings object, please refer to the Marqo docs.

In this example we specify that the index should use the e5-large-v2 model and increase the ef_construction parameter to 512 for the HNSW graph construction.

from marqo_haystack import MarqoDocumentStore

index_settings = {
    "index_defaults": {
        "model": "hf/e5-large-v2",
        "ann_parameters": {
            "parameters": {
                "ef_construction": 512
            }
        }
    }
}

document_store = MarqoDocumentStore(settings_dict=index_settings)
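Because the settings dictionary is deeply nested, it can be convenient to build it from a couple of arguments instead of writing it out each time. The following is a minimal sketch; build_index_settings is a hypothetical helper (not part of marqo-haystack or Marqo) that only reproduces the structure shown in the example above, and the positive-value check is an added assumption rather than a documented Marqo constraint.

```python
from typing import Any, Dict


def build_index_settings(model: str, ef_construction: int) -> Dict[str, Any]:
    """Return a settings dict in the same shape as the example above.

    Hypothetical helper: it covers only the model name and the
    ef_construction HNSW parameter, nothing else from the Marqo settings object.
    """
    if ef_construction <= 0:
        # Assumption: a non-positive value is never meaningful here
        raise ValueError("ef_construction must be a positive integer")
    return {
        "index_defaults": {
            "model": model,
            "ann_parameters": {
                "parameters": {
                    "ef_construction": ef_construction,
                }
            },
        }
    }


index_settings = build_index_settings("hf/e5-large-v2", 512)

# The result would then be passed to the constructor as in the example above:
# document_store = MarqoDocumentStore(settings_dict=index_settings)
```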

Using with Marqo Cloud

This integration can also be used with Marqo Cloud. You can sign up or access your Marqo Cloud account here.

To use Marqo Cloud with this integration you will need to pass the collection_name (index name), url (https://api.marqo.ai), and api_key into the constructor.

Note that when using this integration with Marqo Cloud you will need to have already created an index in your Marqo Cloud account.

from marqo_haystack import MarqoDocumentStore
 
document_store = MarqoDocumentStore(
    url="https://api.marqo.ai",
    api_key="XXXXXXXXXXXXX",
    collection_name="my-cloud-index"
)
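To avoid hardcoding the API key in source files, the constructor arguments can be assembled from the environment. This is a sketch under stated assumptions: cloud_store_kwargs is a hypothetical helper, and MARQO_API_KEY is an environment variable name chosen for illustration; neither is defined by marqo-haystack. Only the url, api_key, and collection_name parameters come from the example above.

```python
import os
from typing import Dict, Mapping

MARQO_CLOUD_URL = "https://api.marqo.ai"


def cloud_store_kwargs(
    index_name: str, env: Mapping[str, str] = os.environ
) -> Dict[str, str]:
    """Collect the constructor arguments for a Marqo Cloud index.

    Hypothetical helper: reads the key from MARQO_API_KEY (an assumed
    variable name) and fails early if it is missing.
    """
    api_key = env.get("MARQO_API_KEY")
    if not api_key:
        raise RuntimeError("MARQO_API_KEY is not set")
    return {
        "url": MARQO_CLOUD_URL,
        "api_key": api_key,
        "collection_name": index_name,
    }


# Demonstration with an explicit mapping instead of the real environment
kwargs = cloud_store_kwargs("my-cloud-index", env={"MARQO_API_KEY": "XXXXXXXXXXXXX"})

# With the real environment, the store would be created like this:
# document_store = MarqoDocumentStore(**cloud_store_kwargs("my-cloud-index"))
```

As noted above, the index named here must already exist in your Marqo Cloud account.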

License

marqo-haystack is distributed under the terms of the Apache-2.0 license.
