Marqo Document Store for Haystack

Installation

pip install marqo-haystack
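To confirm that the install succeeded, one option is to look up the installed distribution's version with the standard library. This is a minimal sketch; the helper name `installed_version` is hypothetical and not part of marqo-haystack.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = "marqo-haystack") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # The distribution is not installed in the current environment.
        return None


if __name__ == "__main__":
    print(installed_version() or "marqo-haystack is not installed")
```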

About

This package provides a Marqo document store integration for Haystack.

Marqo is an end-to-end vector search engine which includes preprocessing and inference to generate vectors from your data. You can use pre-trained models or bring finetuned ones.

Haystack is an end-to-end NLP framework for building applications powered by LLMs. With Haystack you can build NLP applications that solve your use case using state-of-the-art models.

Examples

from marqo_haystack import MarqoDocumentStore
 
document_store = MarqoDocumentStore()

You can find a code example showing how to use the Document Store and the Retriever under the example/ folder of this repo.

Usage

For documentation on Marqo itself, please refer to the Marqo documentation.

You can use the MarqoDocumentStore in your Haystack pipelines for single queries like so:

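# Assumed import path for Pipeline; early Haystack 2.0 previews exposed it under haystack.preview.
from haystack import Pipeline
from marqo_haystack import MarqoDocumentStore
from marqo_haystack.retriever import MarqoSingleRetriever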

document_store = MarqoDocumentStore()

querying = Pipeline()
querying.add_component("retriever", MarqoSingleRetriever(document_store))
results = querying.run({"retriever": {"query": "Is black and white text boring?", "top_k": 3}})

Or for a list of queries:

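# Assumed import path for Pipeline; early Haystack 2.0 previews exposed it under haystack.preview.
from haystack import Pipeline
from marqo_haystack import MarqoDocumentStore
from marqo_haystack.retriever import MarqoRetriever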

document_store = MarqoDocumentStore()

querying = Pipeline()
querying.add_component("retriever", MarqoRetriever(document_store))
results = querying.run({"retriever": {"queries": ["Is black and white text boring?"], "top_k": 3}})
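The two examples above differ only in the input key the retriever expects: `query` (a string) for MarqoSingleRetriever and `queries` (a list) for MarqoRetriever. The sketch below builds the matching `run` payload for either case. `build_retriever_inputs` is a hypothetical helper, not part of marqo-haystack, and it assumes the component was registered under the name "retriever" as in the examples.

```python
from typing import Any, Dict, List, Union


def build_retriever_inputs(
    query: Union[str, List[str]],
    top_k: int = 3,
    component_name: str = "retriever",
) -> Dict[str, Dict[str, Any]]:
    """Build the dict passed to Pipeline.run for one query or a list of queries."""
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    if isinstance(query, str):
        # Shape used with MarqoSingleRetriever in the example above.
        payload: Dict[str, Any] = {"query": query, "top_k": top_k}
    else:
        # Shape used with MarqoRetriever in the example above.
        payload = {"queries": list(query), "top_k": top_k}
    return {component_name: payload}
```

With a pipeline built as above, this could be called as `querying.run(build_retriever_inputs("Is black and white text boring?"))`.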

Using Locally

If you specify a collection_name that doesn't exist as a Marqo index then one will be created for you.

from marqo_haystack import MarqoDocumentStore

# Use an existing index (if my-index does exist)
document_store = MarqoDocumentStore(collection_name="my-index")

# Create a new index (if my-new-index doesn't exist)
document_store = MarqoDocumentStore(collection_name="my-new-index")

# Use the default index name, 'documents'. One will be created if it doesn't exist.
document_store = MarqoDocumentStore()

You can also pass in settings for the index created by the API by passing a dictionary to the settings_dict parameter. For details on the settings object please refer to the Marqo docs.

In this example we specify that the index should use the e5-large-v2 model and increase the ef_construction parameter to 512 for the HNSW graph construction.

from marqo_haystack import MarqoDocumentStore

index_settings = {
    "index_defaults": {
        "model": "hf/e5-large-v2",
        "ann_parameters": {
            "parameters": {
                "ef_construction": 512
            }
        }
    }
}

document_store = MarqoDocumentStore(settings_dict=index_settings)
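When several indexes share the same settings structure, it can be convenient to generate the dictionary rather than write it by hand. The sketch below reproduces the structure shown above. `make_index_settings` is a hypothetical helper, and it uses only the keys that appear in the example (`index_defaults`, `model`, `ann_parameters`, `parameters`, `ef_construction`); consult the Marqo docs for the full set of supported settings.

```python
from typing import Any, Dict


def make_index_settings(model: str, ef_construction: int) -> Dict[str, Any]:
    """Build a settings dict with the same nesting as the example above."""
    if ef_construction < 1:
        raise ValueError("ef_construction must be a positive integer")
    return {
        "index_defaults": {
            "model": model,
            "ann_parameters": {
                "parameters": {
                    # HNSW graph construction parameter from the example above.
                    "ef_construction": ef_construction,
                }
            },
        }
    }
```

The result can then be passed as `MarqoDocumentStore(settings_dict=make_index_settings("hf/e5-large-v2", 512))`.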

Using with Marqo Cloud

This integration can also be used with Marqo Cloud. You can sign up for or access your Marqo Cloud account here.

To use Marqo Cloud with this integration you will need to pass the collection_name (index name), url (https://api.marqo.ai), and api_key into the constructor.

Note that when using this integration with Marqo Cloud you will need to have already created an index in your Marqo Cloud account.

from marqo_haystack import MarqoDocumentStore
 
document_store = MarqoDocumentStore(
    url="https://api.marqo.ai",
    api_key="XXXXXXXXXXXXX",
    collection_name="my-cloud-index"
)
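Hard-coding an API key is rarely desirable outside a quick test. A minimal sketch of reading it from the environment follows. The environment variable name `MARQO_API_KEY` and the helper `cloud_store_kwargs` are assumptions made for illustration and are not part of marqo-haystack; the three keyword arguments are the ones named above.

```python
import os
from typing import Dict, Mapping, Optional

MARQO_CLOUD_URL = "https://api.marqo.ai"


def cloud_store_kwargs(
    collection_name: str,
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Collect the constructor arguments needed for Marqo Cloud."""
    env = os.environ if env is None else env
    # MARQO_API_KEY is a hypothetical variable name chosen for this sketch.
    api_key = env.get("MARQO_API_KEY", "")
    if not api_key:
        raise RuntimeError("Set MARQO_API_KEY before connecting to Marqo Cloud")
    return {
        "url": MARQO_CLOUD_URL,
        "api_key": api_key,
        "collection_name": collection_name,
    }
```

The store could then be created with `MarqoDocumentStore(**cloud_store_kwargs("my-cloud-index"))`, provided the index already exists in your Marqo Cloud account.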

License

marqo-haystack is distributed under the terms of the Apache-2.0 license.
