Haystack integration for Dewey — document store, retriever, and research component

dewey-haystack

Installation

pip install dewey-haystack

Components

DeweyDocumentStore

Haystack DocumentStore backed by a Dewey collection. Handles document upload and deletion; Dewey manages chunking and embeddings automatically.

from haystack_integrations.document_stores.dewey import DeweyDocumentStore
from haystack.utils import Secret

store = DeweyDocumentStore(
    api_key=Secret.from_env_var("DEWEY_API_KEY"),
    collection_id="3f7a1b2c-...",
)

Upload Haystack Documents:

from haystack import Document

store.write_documents([
    Document(content="Neural networks learn via backpropagation.", meta={"source": "ml.txt"}),
    Document(content="Transformers use self-attention mechanisms."),
])

DeweyRetriever

Drop-in Haystack retriever backed by Dewey's hybrid semantic + BM25 search.

from haystack import Pipeline
from haystack_integrations.document_stores.dewey import DeweyDocumentStore
from haystack_integrations.components.retrievers.dewey import DeweyRetriever
from haystack.utils import Secret

store = DeweyDocumentStore(
    api_key=Secret.from_env_var("DEWEY_API_KEY"),
    collection_id="3f7a1b2c-...",
)

pipeline = Pipeline()
pipeline.add_component("retriever", DeweyRetriever(document_store=store, top_k=8))

result = pipeline.run({"retriever": {"query": "What are the key findings?"}})
for doc in result["retriever"]["documents"]:
    print(f"[{doc.meta['filename']}] {doc.content}")

Each returned Document carries citation metadata:

| Field | Description |
| --- | --- |
| score | Relevance score (0–1) |
| document_id | Dewey document ID |
| filename | Original filename |
| section_id | Section ID |
| section_title | Section heading |
| section_level | Heading depth (1 = top-level) |
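
The metadata fields listed above can be combined into a readable citation. The sketch below is a minimal illustration and not part of dewey-haystack: `format_citation` and `filter_by_score` are hypothetical helpers, the sample values are invented, and it assumes `score` sits in the same mapping as the other fields (in a real pipeline it may instead be exposed as `doc.score`). Plain dicts stand in for `doc.meta` so the example runs without a Dewey API key.

```python
from typing import Any, Dict, List


def format_citation(meta: Dict[str, Any]) -> str:
    """Build a short citation string from a chunk's metadata (hypothetical helper)."""
    filename = meta.get("filename", "unknown file")
    title = meta.get("section_title")
    # Fall back to the filename alone when the chunk has no section heading.
    return f'{filename}, section "{title}"' if title else filename


def filter_by_score(metas: List[Dict[str, Any]], min_score: float) -> List[Dict[str, Any]]:
    """Keep only chunks whose relevance score (0-1) meets the threshold."""
    return [m for m in metas if m.get("score", 0.0) >= min_score]


# Stand-in metadata; the values are invented for illustration.
example_metas = [
    {"score": 0.91, "document_id": "doc-1", "filename": "ml.txt",
     "section_id": "sec-2", "section_title": "Backpropagation", "section_level": 2},
    {"score": 0.42, "document_id": "doc-2", "filename": "notes.txt",
     "section_id": "sec-1", "section_title": None, "section_level": 1},
]

citations = [format_citation(m) for m in filter_by_score(example_metas, min_score=0.5)]
print(citations)
```

With real results, the same helpers could be applied to `doc.meta` for each document in `result["retriever"]["documents"]`.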

RAG pipeline with an LLM:

from haystack.components.builders import PromptBuilder
from haystack.components.generators import OpenAIGenerator

prompt_template = """
Answer the question using only the provided context.

Context:
{% for doc in documents %}
- {{ doc.content }}
{% endfor %}

Question: {{ query }}
"""

pipeline = Pipeline()
pipeline.add_component("retriever", DeweyRetriever(document_store=store, top_k=5))
pipeline.add_component("prompt", PromptBuilder(template=prompt_template))
# OpenAIGenerator reads OPENAI_API_KEY from the environment by default.
pipeline.add_component("llm", OpenAIGenerator(model="gpt-4o-mini"))

pipeline.connect("retriever.documents", "prompt.documents")
pipeline.connect("prompt.prompt", "llm.prompt")

# The same question is passed to both the retriever and the prompt template.
question = "What were the main findings?"
result = pipeline.run({
    "retriever": {"query": question},
    "prompt": {"query": question},
})
print(result["llm"]["replies"][0])

DeweyResearchComponent

A Haystack component that runs Dewey's full agentic research loop — searching, reading, and synthesising across multiple documents — and returns a grounded Markdown answer with cited sources.

Use this as a drop-in replacement for an LLM generator when you want Dewey to handle both retrieval and generation.

from haystack import Pipeline
from haystack_integrations.components.retrievers.dewey import DeweyResearchComponent
from haystack.utils import Secret

pipeline = Pipeline()
pipeline.add_component(
    "research",
    DeweyResearchComponent(
        api_key=Secret.from_env_var("DEWEY_API_KEY"),
        collection_id="3f7a1b2c-...",
        depth="balanced",
    ),
)

result = pipeline.run({"research": {"query": "What were the key findings across all studies?"}})
print(result["research"]["answer"])

for source in result["research"]["sources"]:
    print(f"  [{source.meta['filename']}] {source.content[:80]}...")

Outputs:

| Key | Type | Description |
| --- | --- | --- |
| answer | str | Synthesised Markdown answer |
| sources | list[Document] | Source chunks cited by the answer |
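
One way to present the two outputs together is to append a numbered source list to the answer. The following is a sketch only: `Source` is a stand-in dataclass that mimics the `content` and `meta` attributes of a Haystack `Document`, `render_report` is a hypothetical helper, and the sample text is invented.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Source:
    """Stand-in for a Haystack Document: only `content` and `meta` are used."""
    content: str
    meta: Dict[str, Any] = field(default_factory=dict)


def render_report(answer: str, sources: List[Source], preview_chars: int = 80) -> str:
    """Append a numbered list of cited source chunks to the Markdown answer."""
    lines = [answer.rstrip(), "", "Sources:"]
    for i, src in enumerate(sources, start=1):
        filename = src.meta.get("filename", "unknown file")
        preview = src.content[:preview_chars]
        # Mark truncated previews so readers know the chunk continues.
        if len(src.content) > preview_chars:
            preview += "..."
        lines.append(f"{i}. [{filename}] {preview}")
    return "\n".join(lines)


report = render_report(
    "Both studies report improved accuracy.",
    [Source("Study A found a gain in accuracy.", {"filename": "a.pdf"}),
     Source("Study B replicated the result.", {"filename": "b.pdf"})],
)
print(report)
```

In a real pipeline, `result["research"]["answer"]` and `result["research"]["sources"]` would take the place of the invented arguments.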

Research depths:

| depth | Speed | Tools | Requires BYOK |
| --- | --- | --- | --- |
| quick | fast | basic search | no |
| balanced | fast | basic search | no |
| deep | slower | full tool suite | yes |
| exhaustive | slowest | full tool suite | yes |

The deep and exhaustive depths require a Dewey Pro plan and a bring-your-own-key (BYOK) API key configured on your project.
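
An application that cannot assume every project meets these requirements might choose the depth defensively before constructing the component. The sketch below is an assumption-laden illustration, not library behaviour: `resolve_depth` and the `pro_with_byok` flag are hypothetical, and the mapping simply mirrors the "Requires BYOK" column above.

```python
# Mirrors the "Requires BYOK" column of the research depths table.
DEPTH_REQUIRES_BYOK = {
    "quick": False,
    "balanced": False,
    "deep": True,
    "exhaustive": True,
}


def resolve_depth(requested: str, pro_with_byok: bool, fallback: str = "balanced") -> str:
    """Return `requested` if the project can use it, otherwise `fallback`.

    `pro_with_byok` is a hypothetical flag meaning the project has a Pro plan
    and a BYOK key configured.
    """
    if requested not in DEPTH_REQUIRES_BYOK:
        raise ValueError(
            f"Unknown depth {requested!r}; expected one of {sorted(DEPTH_REQUIRES_BYOK)}"
        )
    if DEPTH_REQUIRES_BYOK[requested] and not pro_with_byok:
        return fallback
    return requested


print(resolve_depth("deep", pro_with_byok=False))
print(resolve_depth("deep", pro_with_byok=True))
```

The returned value could then be passed as the `depth` argument of `DeweyResearchComponent`.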

With a custom model:

DeweyResearchComponent(
    api_key=Secret.from_env_var("DEWEY_API_KEY"),
    collection_id="3f7a1b2c-...",
    depth="deep",
    model="claude-sonnet-4-6",  # requires Anthropic BYOK key on your project
)

Requirements

  • Python 3.9+
  • meetdewey >= 1.0
  • haystack-ai >= 2.0
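
To see whether an environment meets these requirements, the installed versions can be inspected with the standard library. This is a rough sketch: `installed_versions` is a hypothetical helper, it assumes the distribution names match the list above, and it only reports versions without comparing them against the minimums.

```python
import sys
from importlib import metadata
from typing import Dict, Optional, Sequence

REQUIRED_DISTRIBUTIONS = ("meetdewey", "haystack-ai")


def installed_versions(names: Sequence[str] = REQUIRED_DISTRIBUTIONS) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if missing."""
    versions: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


python_ok = sys.version_info >= (3, 9)
print(f"Python 3.9+: {python_ok}")
for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```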

Development

pip install -e ".[dev]"
pytest
