Haystack 2.x DocumentStore for VelesDB: The Local AI Memory Database.

haystack-velesdb

A Haystack 2.x DocumentStore backed by VelesDB — the local-first, microsecond-latency vector database.

This integration joins the existing LangChain and LlamaIndex connectors, completing the trio of major Python RAG frameworks supported by VelesDB.

Installation

pip install haystack-velesdb

For development:

pip install -e "integrations/haystack[dev]"

Quick start

from haystack_velesdb import VelesDBDocumentStore
from haystack.dataclasses import Document

store = VelesDBDocumentStore(
    path="./my_docs",
    collection_name="knowledge_base",
    embedding_dim=768,
    metric="cosine",
)

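# Write pre-embedded documents. `...` marks elided values: each embedding must
# contain exactly `embedding_dim` (here 768) floats.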
documents = [
    Document(id="doc1", content="VelesDB is fast.", embedding=[0.1, 0.2, ...]),
    Document(id="doc2", content="Local-first AI memory.", embedding=[0.3, 0.4, ...]),
]
store.write_documents(documents)

# Retrieve by vector
results = store.embedding_retrieval(query_embedding=[0.1, 0.2, ...], top_k=5)
for doc in results:
    print(doc.content, doc.score)
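To try the quick start without an embedding model, the elided vectors can be replaced by placeholders of the right length. The helper below is a minimal sketch: `placeholder_embedding` is a hypothetical name, not part of haystack-velesdb, and the vectors it returns carry no semantic meaning. It only guarantees a deterministic, unit-length list of `dim` floats.

```python
import hashlib
import math
from typing import List


def placeholder_embedding(text: str, dim: int = 768) -> List[float]:
    """Return a deterministic unit-length vector derived from `text`.

    Hypothetical helper for smoke tests only: NOT a semantic embedding.
    """
    values: List[float] = []
    counter = 0
    while len(values) < dim:
        # Hash the text with a counter to get as many bytes as needed.
        digest = hashlib.sha256(f"{text}:{counter}".encode("utf-8")).digest()
        # Map each byte (0..255) into the range [-1, 1].
        values.extend(b / 127.5 - 1.0 for b in digest)
        counter += 1
    values = values[:dim]
    norm = math.sqrt(sum(v * v for v in values))
    return [v / norm for v in values]
```

A document could then be built as `Document(id="doc1", content="VelesDB is fast.", embedding=placeholder_embedding("VelesDB is fast."))`, with `dim` matching the store's `embedding_dim`.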

Full RAG pipeline

See examples/rag_pipeline.py for a complete PDF ingestion and semantic search example using SentenceTransformersDocumentEmbedder.

from haystack import Pipeline
from haystack.components.converters import PyPDFToDocument
from haystack.components.embedders import (
    SentenceTransformersDocumentEmbedder,
    SentenceTransformersTextEmbedder,
)
from haystack.components.preprocessors import DocumentSplitter
from haystack.components.writers import DocumentWriter
from haystack_velesdb import VelesDBDocumentStore

# all-MiniLM-L6-v2 produces 384-dimensional embeddings, so embedding_dim is 384.
store = VelesDBDocumentStore(path="./rag_store", embedding_dim=384)

# Indexing pipeline
indexer = Pipeline()
indexer.add_component("converter", PyPDFToDocument())
indexer.add_component("splitter", DocumentSplitter(split_by="sentence", split_length=3))
indexer.add_component("embedder", SentenceTransformersDocumentEmbedder(model="all-MiniLM-L6-v2"))
indexer.add_component("writer", DocumentWriter(document_store=store))
indexer.connect("converter", "splitter")
indexer.connect("splitter", "embedder")
indexer.connect("embedder", "writer")
indexer.run({"converter": {"sources": ["paper.pdf"]}})

# Query pipeline. `InMemoryEmbeddingRetriever` is bound to `InMemoryDocumentStore`
# and does not work against a custom DocumentStore, so wrap `embedding_retrieval`
# in a thin Haystack component that forwards the call. A full working example is in
# `integrations/haystack/examples/rag_pipeline.py` (`_VelesRetriever`).
from typing import List

from haystack import component
from haystack.dataclasses import Document

@component
class VelesRetriever:
    """Forwards a query embedding to the store's `embedding_retrieval` method."""

    def __init__(self, document_store, top_k: int = 10):
        self._store = document_store
        self._top_k = top_k

    @component.output_types(documents=List[Document])
    def run(self, query_embedding: List[float]):
        # Return the `top_k` nearest documents on the `documents` output.
        return {"documents": self._store.embedding_retrieval(query_embedding, top_k=self._top_k)}

querier = Pipeline()
querier.add_component("embedder", SentenceTransformersTextEmbedder(model="all-MiniLM-L6-v2"))
querier.add_component("retriever", VelesRetriever(document_store=store))
querier.connect("embedder.embedding", "retriever.query_embedding")
result = querier.run({"embedder": {"text": "What is VelesDB?"}})
print(result["retriever"]["documents"])

API reference

`VelesDBDocumentStore`

Parameter	Default	Description
`path`	`"./velesdb_haystack"`	Directory where VelesDB persists data
`collection_name`	`"haystack_documents"`	VelesDB collection name
`embedding_dim`	`768`	Embedding vector dimension
`metric`	`"cosine"`	Distance metric: `"cosine"`, `"euclidean"`, `"dot"`, `"hamming"`, or `"jaccard"`
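For orientation, the five metric names can be read against their textbook definitions. The sketch below is an assumption-laden illustration in plain Python, not VelesDB's implementation: the source does not say whether VelesDB squares the Euclidean distance, how it treats non-binary input for `"hamming"` and `"jaccard"`, or whether it reports similarities or distances. All function names are illustrative.

```python
import math
from typing import Sequence


def _check(a: Sequence[float], b: Sequence[float]) -> None:
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    _check(a, b)
    return sum(x * y for x, y in zip(a, b))


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    # Cosine similarity in [-1, 1]; undefined for zero vectors.
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    _check(a, b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hamming(a: Sequence[float], b: Sequence[float]) -> int:
    # Number of positions where the two vectors differ.
    _check(a, b)
    return sum(1 for x, y in zip(a, b) if x != y)


def jaccard(a: Sequence[float], b: Sequence[float]) -> float:
    # Treat non-zero entries as set members: |A ∩ B| / |A ∪ B|.
    _check(a, b)
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Assumption: two empty sets are considered identical.
    return both / either if either else 1.0
```

Whichever metric is chosen, it should match the one the embedding model was trained for.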

Methods

Method	Description
`write_documents(documents, policy)`	Upsert documents; returns count written
`filter_documents(filters)`	Scroll documents matching a VelesDB filter dict
`embedding_retrieval(query_embedding, top_k, filters, scale_score)`	Vector similarity search
`count_documents()`	Total document count
`delete_documents(document_ids)`	Delete by Haystack string IDs
`to_dict()` / `from_dict()`	Haystack pipeline serialisation

Note on DuplicatePolicy: NONE and OVERWRITE both use VelesDB upsert semantics, so an existing document with the same ID is always overwritten. FAIL is fully enforced: the store pre-scans for existing IDs before writing and raises DuplicateDocumentError if any document already exists. For bulk loads, prefer OVERWRITE or NONE to skip the scan cost.
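The difference between the policies can be modelled with a plain dictionary. This is a toy sketch of the behaviour described in the note, not the package's code: `Policy`, `DuplicateError` and `write_documents` here are stand-ins for Haystack's `DuplicatePolicy`, `DuplicateDocumentError` and the store method, and Haystack's SKIP policy is left out because the note does not describe it.

```python
from enum import Enum
from typing import Dict, List, Tuple


class Policy(Enum):
    """Stand-in for Haystack's DuplicatePolicy (SKIP omitted)."""
    NONE = "none"
    OVERWRITE = "overwrite"
    FAIL = "fail"


class DuplicateError(Exception):
    """Stand-in for Haystack's DuplicateDocumentError."""


def write_documents(store: Dict[str, str],
                    docs: List[Tuple[str, str]],
                    policy: Policy) -> int:
    if policy is Policy.FAIL:
        # Pre-scan: check every incoming ID before anything is written.
        existing = [doc_id for doc_id, _ in docs if doc_id in store]
        if existing:
            raise DuplicateError(f"documents already exist: {existing}")
    # NONE and OVERWRITE (and FAIL after a clean scan) upsert every document.
    for doc_id, content in docs:
        store[doc_id] = content
    return len(docs)
```

Because the scan runs before the first write, a failing batch leaves the store unchanged in this model. The scan is also the extra cost that OVERWRITE and NONE avoid.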

Note on document IDs and SHA-256: Haystack string IDs are mapped to 63-bit integers using the first 8 bytes of SHA-256 (~9.2 × 10¹⁸ slots). By the birthday approximation n²/(2N), the probability that any two IDs collide in a 1 M-document collection is roughly 5 × 10⁻⁸, which is negligible for typical RAG workloads. A ValueError is raised at write time if a collision is detected between a new document and an existing one.
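The mapping and the collision estimate can be reproduced with the standard library. The sketch below makes two assumptions the note does not confirm: the 8 bytes are read big-endian, and the value is reduced to 63 bits by clearing the top bit. The function names are illustrative, not the package's API.

```python
import hashlib

ID_BITS = 63
ID_SPACE = 2 ** ID_BITS  # about 9.2e18 possible integer IDs


def string_id_to_int(doc_id: str) -> int:
    """Map a Haystack string ID to a 63-bit integer (illustrative sketch)."""
    digest = hashlib.sha256(doc_id.encode("utf-8")).digest()
    raw = int.from_bytes(digest[:8], "big")  # assumption: big-endian
    return raw & (ID_SPACE - 1)              # assumption: clear the top bit


def any_collision_probability(n_docs: int, space: int = ID_SPACE) -> float:
    """Birthday approximation: chance that any two of n_docs IDs collide."""
    return n_docs * (n_docs - 1) / (2 * space)


def next_write_collision_probability(n_docs: int, space: int = ID_SPACE) -> float:
    """Chance that one new ID collides with any of n_docs existing IDs."""
    return n_docs / space
```

For one million documents, `any_collision_probability(1_000_000)` comes out near 5 × 10⁻⁸, and `next_write_collision_probability(1_000_000)` near 10⁻¹³.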

Note on scale_score: When True (the default), cosine similarity scores are rescaled from [-1, 1] to [0, 1], so downstream re-ranking can treat them like probabilities.
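A rescaling from [-1, 1] to [0, 1] is commonly done with the linear map (s + 1) / 2. The sketch below assumes that form. The package's exact formula is not stated here, and the clamping step is an addition to guard against floating-point drift just outside the range.

```python
def scale_cosine_score(score: float) -> float:
    """Linearly map a cosine similarity from [-1, 1] to [0, 1]."""
    # Clamp first so values like 1.0000001 do not leave the target range.
    clamped = max(-1.0, min(1.0, score))
    return (clamped + 1.0) / 2.0
```

Under this map, orthogonal vectors (cosine 0) score 0.5, so a threshold tuned on raw cosine values needs the same transformation.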

Running tests

cd integrations/haystack
pip install -e ".[dev]"
pytest tests/ -v

Tests use lightweight fake VelesDB objects — no running server required.

Release history

Version	Released
3.0.1 (this version)	Jun 16, 2026
3.0.0	Jun 16, 2026
2.0.0	Jun 12, 2026
1.18.0	Jun 7, 2026
1.17.0	Jun 5, 2026
1.16.0	May 30, 2026
1.15.0	May 14, 2026
1.14.4	May 1, 2026
1.14.3	May 1, 2026
1.14.2	May 1, 2026
1.14.1	Apr 30, 2026

Download files

Source Distribution

haystack_velesdb-3.0.1.tar.gz (26.3 kB), uploaded Jun 16, 2026

Built Distribution

haystack_velesdb-3.0.1-py3-none-any.whl (13.1 kB), uploaded Jun 16, 2026, Python 3

File details

Details for the file haystack_velesdb-3.0.1.tar.gz.

File metadata

Download URL: haystack_velesdb-3.0.1.tar.gz
Upload date: Jun 16, 2026
Size: 26.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for haystack_velesdb-3.0.1.tar.gz
Algorithm	Hash digest
SHA256	`f25cbedc4a7fff92514a146ee6d394b3e43faccd8e97187724db585e21cae5ac`
MD5	`186722cf24c958ed6c4a822ba5b957a9`
BLAKE2b-256	`12d78ff078b24c0a53327483f16a6dd63cadd75b28910840ceade63cabaae439`

File details

Details for the file haystack_velesdb-3.0.1-py3-none-any.whl.

File metadata

Download URL: haystack_velesdb-3.0.1-py3-none-any.whl
Upload date: Jun 16, 2026
Size: 13.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for haystack_velesdb-3.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`cf06d742520bf3a5e0163cec940528bf3ddfeb3914cf5acbeeac5755502d514d`
MD5	`af90403e815ab8cd33b864dfbc83d180`
BLAKE2b-256	`0860fa5bf8509e9dd84f85c1f67815c9a7f87300f7fb1c59c8f555df4cfe0d09`
