dewey-haystack
Haystack integration for Dewey — document store, retriever, and research component.
Installation
pip install dewey-haystack
Components
DeweyDocumentStore
Haystack DocumentStore backed by a Dewey collection. Handles document upload and deletion; Dewey manages chunking and embeddings automatically.
from haystack_integrations.document_stores.dewey import DeweyDocumentStore
from haystack.utils import Secret
store = DeweyDocumentStore(
    api_key=Secret.from_env_var("DEWEY_API_KEY"),
    collection_id="3f7a1b2c-...",
)
Upload Haystack Documents:
from haystack import Document
store.write_documents([
    Document(content="Neural networks learn via backpropagation.", meta={"source": "ml.txt"}),
    Document(content="Transformers use self-attention mechanisms."),
])
DeweyRetriever
Drop-in Haystack retriever backed by Dewey's hybrid semantic + BM25 search.
from haystack import Pipeline
from haystack_integrations.document_stores.dewey import DeweyDocumentStore
from haystack_integrations.components.retrievers.dewey import DeweyRetriever
from haystack.utils import Secret
store = DeweyDocumentStore(
    api_key=Secret.from_env_var("DEWEY_API_KEY"),
    collection_id="3f7a1b2c-...",
)
pipeline = Pipeline()
pipeline.add_component("retriever", DeweyRetriever(document_store=store, top_k=8))
result = pipeline.run({"retriever": {"query": "What are the key findings?"}})
for doc in result["retriever"]["documents"]:
    print(f"[{doc.meta['filename']}] {doc.content}")
Each returned Document carries citation metadata:
| Field | Description |
|---|---|
| `score` | Relevance score (0–1) |
| `document_id` | Dewey document ID |
| `filename` | Original filename |
| `section_id` | Section ID |
| `section_title` | Section heading |
| `section_level` | Heading depth (1 = top-level) |
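The citation fields above can be combined into a short, human-readable reference. The following is a minimal sketch, assuming the fields are available as a plain dict (for example `doc.meta`). `format_citation` is a hypothetical helper, not part of dewey-haystack, and the sample values are made up for illustration.

```python
from typing import Any, Mapping


def format_citation(meta: Mapping[str, Any]) -> str:
    """Build a short citation from the metadata fields listed in the table.

    Hypothetical helper: it only reads `filename`, `section_title`, and
    `score`, and tolerates any of them being absent.
    """
    parts = [str(meta.get("filename", "unknown file"))]
    title = meta.get("section_title")
    if title:
        parts.append(f'"{title}"')
    citation = ", ".join(parts)
    score = meta.get("score")
    if score is not None:
        citation += f" (score {score:.2f})"
    return citation


# Illustrative values only; real values come from the retriever.
example_meta = {
    "filename": "report.pdf",
    "section_title": "Results",
    "section_level": 2,
    "score": 0.874,
}
print(format_citation(example_meta))
```

Keeping the formatter independent of the retriever means it can be reused for any component that returns the same metadata keys.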
RAG pipeline with an LLM:
from haystack.components.builders import PromptBuilder
from haystack.components.generators import OpenAIGenerator
prompt_template = """
Answer the question using only the provided context.
Context:
{% for doc in documents %}
- {{ doc.content }}
{% endfor %}
Question: {{ query }}
"""
pipeline = Pipeline()
pipeline.add_component("retriever", DeweyRetriever(document_store=store, top_k=5))
pipeline.add_component("prompt", PromptBuilder(template=prompt_template))
pipeline.add_component("llm", OpenAIGenerator(model="gpt-4o-mini"))
pipeline.connect("retriever.documents", "prompt.documents")
pipeline.connect("prompt.prompt", "llm.prompt")
result = pipeline.run({
    "retriever": {"query": "What were the main findings?"},
    "prompt": {"query": "What were the main findings?"},
})
print(result["llm"]["replies"][0])
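The run call above repeats the question because the retriever and the prompt builder each take their own `query` input. A small sketch of a helper that builds that input dict once, so the two copies cannot drift apart. `build_rag_inputs` is a hypothetical name, and the keys assume the component names `retriever` and `prompt` used in the pipeline above.

```python
from typing import Dict


def build_rag_inputs(query: str) -> Dict[str, Dict[str, str]]:
    """Return pipeline.run() inputs that send one query to both components."""
    query = query.strip()
    if not query:
        raise ValueError("query must not be empty")
    return {
        "retriever": {"query": query},
        "prompt": {"query": query},
    }


inputs = build_rag_inputs("What were the main findings?")
# With the pipeline defined above, this would be: result = pipeline.run(inputs)
```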
DeweyResearchComponent
A Haystack component that runs Dewey's full agentic research loop — searching, reading, and synthesising across multiple documents — and returns a grounded Markdown answer with cited sources.
Use this as a drop-in replacement for an LLM generator when you want Dewey to handle both retrieval and generation.
from haystack import Pipeline
from haystack_integrations.components.retrievers.dewey import DeweyResearchComponent
from haystack.utils import Secret
pipeline = Pipeline()
pipeline.add_component(
    "research",
    DeweyResearchComponent(
        api_key=Secret.from_env_var("DEWEY_API_KEY"),
        collection_id="3f7a1b2c-...",
        depth="balanced",
    ),
)
result = pipeline.run({"research": {"query": "What were the key findings across all studies?"}})
print(result["research"]["answer"])
for source in result["research"]["sources"]:
    print(f"  [{source.meta['filename']}] {source.content[:80]}...")
Outputs:
| Key | Type | Description |
|---|---|---|
| `answer` | `str` | Synthesised Markdown answer |
| `sources` | `list[Document]` | Source chunks cited by the answer |
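The two outputs can be merged into a single report by appending a numbered source list to the answer. This is a sketch under stated assumptions: `render_report` is a hypothetical helper, and the demo uses `SimpleNamespace` stand-ins that expose the same `.content` and `.meta` attributes as a Haystack `Document`, so it runs without a Dewey account. The sample text is invented.

```python
from types import SimpleNamespace
from typing import Any, Iterable


def render_report(answer: str, sources: Iterable[Any], preview: int = 80) -> str:
    """Append a numbered source list to the Markdown answer."""
    lines = [answer.rstrip(), "", "Sources:"]
    for i, src in enumerate(sources, start=1):
        filename = src.meta.get("filename", "unknown file")
        snippet = src.content[:preview]  # truncate long chunks for display
        lines.append(f"{i}. [{filename}] {snippet}")
    return "\n".join(lines)


# Stand-ins for the Documents returned under result["research"]["sources"].
fake_sources = [
    SimpleNamespace(content="First cited chunk.", meta={"filename": "a.pdf"}),
    SimpleNamespace(content="Second cited chunk.", meta={"filename": "b.pdf"}),
]
report = render_report("Results were mixed.", fake_sources)
print(report)
```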
Research depths:
| depth | Speed | Tools | Requires BYOK |
|---|---|---|---|
| `quick` | fast | basic search | no |
| `balanced` | fast | basic search | no |
| `deep` | slower | full tool suite | yes |
| `exhaustive` | slowest | full tool suite | yes |
The deep and exhaustive depths require a Dewey Pro plan and a BYOK (bring-your-own-key) API key configured on your project.
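Since two of the four depths depend on project configuration, it can help to validate the setting before building the component. The sketch below hard-codes the table above rather than querying Dewey. `BYOK_REQUIRED` and `check_depth` are hypothetical names, and the `has_byok_key` flag is something the caller is assumed to know about their own project.

```python
# Mirrors the "Research depths" table; not read from the Dewey API.
BYOK_REQUIRED = {
    "quick": False,
    "balanced": False,
    "deep": True,
    "exhaustive": True,
}


def check_depth(depth: str, has_byok_key: bool) -> str:
    """Validate a depth value locally and return it unchanged if usable."""
    if depth not in BYOK_REQUIRED:
        raise ValueError(
            f"unknown depth {depth!r}; expected one of {sorted(BYOK_REQUIRED)}"
        )
    if BYOK_REQUIRED[depth] and not has_byok_key:
        raise ValueError(f"depth {depth!r} requires a BYOK API key on the project")
    return depth


depth = check_depth("balanced", has_byok_key=False)
```

The returned value can then be passed as the `depth` argument shown in the examples here. This check only covers the BYOK column; plan-level requirements are still enforced by Dewey.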
With a custom model:
DeweyResearchComponent(
    api_key=Secret.from_env_var("DEWEY_API_KEY"),
    collection_id="3f7a1b2c-...",
    depth="deep",
    model="claude-sonnet-4-6",  # requires Anthropic BYOK key on your project
)
Requirements
- Python 3.9+
- meetdewey >= 1.0
- haystack-ai >= 2.0
Development
pip install -e ".[dev]"
pytest