An integration of Google Cloud AlloyDB with Haystack for vector search
AlloyDB Haystack Integration
AlloyDB is a fully managed, PostgreSQL-compatible database service on Google Cloud, optimised for demanding transactional and analytical workloads.
This package provides a Haystack DocumentStore backed by AlloyDB with the pgvector extension, enabling both dense vector similarity search and full-text keyword search.
Connections are established through the AlloyDB Python Connector, which handles IAM-based authentication and TLS encryption without requiring manual firewall rules or IP allowlisting.
Installation
pip install alloydb-haystack
Usage
from haystack_integrations.document_stores.alloydb import AlloyDBDocumentStore
from haystack_integrations.components.retrievers.alloydb import (
    AlloyDBEmbeddingRetriever,
    AlloyDBKeywordRetriever,
)
Environment Variables
| Variable | Description |
|---|---|
| `ALLOYDB_INSTANCE_URI` | AlloyDB instance URI: `projects/P/locations/R/clusters/C/instances/I` |
| `ALLOYDB_USER` | Database user (or IAM principal for IAM auth) |
| `ALLOYDB_PASSWORD` | Database password (not required when `enable_iam_auth=True`) |
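A misconfigured environment tends to surface only as a connection error, so it can help to check the variables up front. The following is a minimal sketch, not part of this package: the helper name `check_alloydb_env` and its `iam_auth` flag are hypothetical, and the URI pattern is derived only from the format shown in the table above.

```python
import os
import re

# Format taken from the table above: projects/P/locations/R/clusters/C/instances/I
_URI_PATTERN = re.compile(
    r"projects/[^/]+/locations/[^/]+/clusters/[^/]+/instances/[^/]+"
)


def check_alloydb_env(env=None, iam_auth=False):
    """Hypothetical helper: return a list of problems, empty if the settings look usable."""
    env = os.environ if env is None else env
    problems = []

    uri = env.get("ALLOYDB_INSTANCE_URI", "")
    if not uri:
        problems.append("ALLOYDB_INSTANCE_URI is not set")
    elif not _URI_PATTERN.fullmatch(uri):
        problems.append("ALLOYDB_INSTANCE_URI has an unexpected format")

    if not env.get("ALLOYDB_USER"):
        problems.append("ALLOYDB_USER is not set")

    # The password is only needed for built-in database authentication.
    if not iam_auth and not env.get("ALLOYDB_PASSWORD"):
        problems.append("ALLOYDB_PASSWORD is not set")

    return problems
```

Calling `check_alloydb_env()` before constructing the document store and raising on a non-empty result gives a clearer failure than a connection timeout.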
Basic Example
import os
from haystack import Document
from haystack_integrations.document_stores.alloydb import AlloyDBDocumentStore
# Requires ALLOYDB_INSTANCE_URI, ALLOYDB_USER, and ALLOYDB_PASSWORD env vars
store = AlloyDBDocumentStore(
    db="my-database",
    embedding_dimension=768,
    recreate_table=True,
)
store.write_documents([
    Document(content="Paris is the capital of France", embedding=[0.1] * 768),
    Document(content="Berlin is the capital of Germany", embedding=[0.2] * 768),
])
print(store.count_documents()) # 2
IAM Authentication
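To authenticate as an IAM service account instead of with a database password, pass the IAM principal as the user and set enable_iam_auth=True:
from haystack.utils import Secret
store = AlloyDBDocumentStore(
    db="my-database",
    user=Secret.from_env_var("ALLOYDB_IAM_USER"),  # e.g. "my-sa@my-project.iam"
    enable_iam_auth=True,
    embedding_dimension=768,
)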
Vector Similarity Search
from haystack_integrations.components.retrievers.alloydb import AlloyDBEmbeddingRetriever
retriever = AlloyDBEmbeddingRetriever(document_store=store, top_k=5)
result = retriever.run(query_embedding=[0.1] * 768)
print(result["documents"])
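To make clear what a dense similarity search ranks by, here is a brute-force sketch in plain Python. It assumes cosine similarity as the scoring function, which is an assumption for illustration rather than a statement about this package's default; the function names are hypothetical, and the real retriever delegates the computation to pgvector inside the database.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_embedding, docs, k=5):
    """Rank (content, embedding) pairs by similarity to the query, best first."""
    scored = [(cosine_similarity(query_embedding, emb), content) for content, emb in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

One consequence worth noting: the placeholder embeddings in the Basic Example, `[0.1] * 768` and `[0.2] * 768`, point in the same direction, so under cosine similarity both documents score essentially the same against the query `[0.1] * 768`. Real embeddings from an embedding model are needed to see a meaningful ranking.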
Keyword Search
from haystack_integrations.components.retrievers.alloydb import AlloyDBKeywordRetriever
retriever = AlloyDBKeywordRetriever(document_store=store, top_k=5)
result = retriever.run(query="capital France")
print(result["documents"])
HNSW Index
For large datasets, an HNSW index enables approximate nearest-neighbour search, trading a small amount of recall for significantly higher query throughput than an exact scan over every row:
store = AlloyDBDocumentStore(
    db="my-database",
    embedding_dimension=768,
    search_strategy="hnsw",
    hnsw_index_creation_kwargs={"m": 16, "ef_construction": 64},
    hnsw_ef_search=40,
)
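The three HNSW settings correspond to pgvector options: `m` is the maximum number of connections per graph node, `ef_construction` is the candidate-list size used while building the index, and `hnsw.ef_search` is the candidate-list size used at query time. The values shown above are also pgvector's defaults. The sketch below builds the kind of SQL pgvector accepts for these options; it is an illustration only, not the statements this package actually issues, and the function name, table name, column name, index name, and operator class are all assumptions.

```python
def hnsw_statements(table, column, index_kwargs, ef_search, opclass="vector_cosine_ops"):
    """Hypothetical helper: return pgvector SQL for creating and tuning an HNSW index."""
    # Identifiers are interpolated directly, so accept only plain names in this sketch.
    for name in (table, column, opclass, *index_kwargs):
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")

    options = ", ".join(f"{key} = {int(value)}" for key, value in index_kwargs.items())
    create = (
        f"CREATE INDEX IF NOT EXISTS {table}_hnsw_idx "
        f"ON {table} USING hnsw ({column} {opclass})"
    )
    if options:
        create += f" WITH ({options})"

    # ef_search is a session-level setting rather than an index property.
    return [create, f"SET hnsw.ef_search = {int(ef_search)}"]
```

Raising `ef_search` generally improves recall at the cost of slower queries, while `m` and `ef_construction` affect index build time and size, so they are worth tuning against a representative query set.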
Integration Tests
Integration tests require a running AlloyDB instance. Set the following environment variables before running:
export ALLOYDB_INSTANCE_URI="projects/MY_PROJECT/locations/MY_REGION/clusters/MY_CLUSTER/instances/MY_INSTANCE"
export ALLOYDB_USER="my-db-user"
export ALLOYDB_PASSWORD="my-db-password"
Then run:
cd integrations/alloydb
hatch run test:integration
License
alloydb-haystack is distributed under the terms of the Apache-2.0 license.