An integration of Google Cloud AlloyDB with Haystack for vector search

AlloyDB Haystack Integration


AlloyDB is a fully managed, PostgreSQL-compatible database service on Google Cloud, optimised for demanding transactional and analytical workloads.

This package provides a Haystack DocumentStore backed by AlloyDB with the pgvector extension, enabling both dense vector similarity search and full-text keyword search.

Connections are established through the AlloyDB Python Connector, which handles IAM-based authentication and TLS encryption without requiring manual firewall rules or IP allowlisting.

Installation

pip install alloydb-haystack

Usage

from haystack_integrations.document_stores.alloydb import AlloyDBDocumentStore
from haystack_integrations.components.retrievers.alloydb import (
    AlloyDBEmbeddingRetriever,
    AlloyDBKeywordRetriever,
)

Environment Variables

Variable              Description
ALLOYDB_INSTANCE_URI  AlloyDB instance URI, in the form projects/P/locations/R/clusters/C/instances/I
ALLOYDB_USER          Database user (or IAM principal for IAM auth)
ALLOYDB_PASSWORD      Database password (not required when enable_iam_auth=True)
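
A small pre-flight check can catch a missing variable or a malformed instance URI before the store attempts a connection. The sketch below is illustrative only: `parse_instance_uri` and `check_environment` are hypothetical helper names that are not part of this package, and the pattern simply mirrors the URI form listed above.

```python
import os
import re

# Mirrors the documented form: projects/P/locations/R/clusters/C/instances/I
_URI_PATTERN = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<region>[^/]+)"
    r"/clusters/(?P<cluster>[^/]+)"
    r"/instances/(?P<instance>[^/]+)$"
)


def parse_instance_uri(uri: str) -> dict[str, str]:
    """Split an instance URI into its four components, or raise ValueError."""
    match = _URI_PATTERN.match(uri)
    if match is None:
        raise ValueError(f"Unexpected AlloyDB instance URI format: {uri!r}")
    return match.groupdict()


def check_environment(env=None, iam_auth: bool = False) -> list[str]:
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    required = ["ALLOYDB_INSTANCE_URI", "ALLOYDB_USER"]
    if not iam_auth:
        # The password is only needed for built-in database authentication.
        required.append("ALLOYDB_PASSWORD")
    return [name for name in required if not env.get(name)]
```

Calling `check_environment()` with no arguments inspects the real process environment; passing a plain dict makes the helper easy to exercise in a unit test.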

Basic Example

from haystack import Document
from haystack_integrations.document_stores.alloydb import AlloyDBDocumentStore

# Requires ALLOYDB_INSTANCE_URI, ALLOYDB_USER, and ALLOYDB_PASSWORD env vars
store = AlloyDBDocumentStore(
    db="my-database",
    embedding_dimension=768,
    recreate_table=True,
)

store.write_documents([
    Document(content="Paris is the capital of France", embedding=[0.1] * 768),
    Document(content="Berlin is the capital of Germany", embedding=[0.2] * 768),
])

print(store.count_documents())  # 2

IAM Authentication

When using a service account for database access:

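from haystack.utils import Secret
from haystack_integrations.document_stores.alloydb import AlloyDBDocumentStore

# No password is needed when enable_iam_auth=True
store = AlloyDBDocumentStore(
    db="my-database",
    user=Secret.from_env_var("ALLOYDB_IAM_USER"),  # e.g. "my-sa@my-project.iam"
    enable_iam_auth=True,
    embedding_dimension=768,
)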

Vector Similarity Search

from haystack_integrations.components.retrievers.alloydb import AlloyDBEmbeddingRetriever

retriever = AlloyDBEmbeddingRetriever(document_store=store, top_k=5)
result = retriever.run(query_embedding=[0.1] * 768)
print(result["documents"])
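
To make concrete what a similarity ranking computes, here is a brute-force sketch in plain Python. It is not how the retriever works internally (the search runs inside the database via pgvector), and the use of cosine similarity is an assumption, since this README does not state which metric the store uses by default. `cosine_similarity` and `rank` are hypothetical helper names, and the two-dimensional vectors are toy data.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank(query, docs, top_k=5):
    """Return (name, score) pairs sorted by descending cosine similarity."""
    scored = [(name, cosine_similarity(query, emb)) for name, emb in docs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy 2-d "embeddings" keyed by a label
docs = {
    "north": [0.0, 1.0],
    "east": [1.0, 0.0],
    "north_east": [1.0, 1.0],
}

# The query points mostly east, so "east" ranks first, then "north_east"
print(rank([0.9, 0.1], docs, top_k=2))
```

Cosine similarity ignores vector magnitude, so constant placeholder embeddings such as [0.1] * 768 and [0.2] * 768 point in the same direction and would score identically under this metric. Embeddings produced by a real model do not have this problem.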

Keyword Search

from haystack_integrations.components.retrievers.alloydb import AlloyDBKeywordRetriever

retriever = AlloyDBKeywordRetriever(document_store=store, top_k=5)
result = retriever.run(query="capital France")
print(result["documents"])

HNSW Index

For large datasets, an HNSW index enables approximate nearest-neighbour search, trading a small amount of recall for much higher query throughput than an exact scan:

store = AlloyDBDocumentStore(
    db="my-database",
    embedding_dimension=768,
    search_strategy="hnsw",
    hnsw_index_creation_kwargs={"m": 16, "ef_construction": 64},
    hnsw_ef_search=40,
)
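
For readers who want to relate these settings to pgvector itself, the sketch below builds the SQL statements that pgvector documents for HNSW indexes. It is an assumption that the integration issues equivalent statements; this README does not confirm it. The function name `hnsw_statements`, the table name, the column name, and the `vector_cosine_ops` operator class are all hypothetical choices made for illustration.

```python
def hnsw_statements(table, column="embedding", m=16, ef_construction=64, ef_search=40):
    """Build pgvector-style HNSW statements as strings (for display, not execution).

    m               -- maximum number of connections per graph layer
    ef_construction -- candidate list size while building the graph
    ef_search       -- candidate list size while querying (higher: better recall, slower)
    """
    create_index = (
        f"CREATE INDEX ON {table} USING hnsw ({column} vector_cosine_ops) "
        f"WITH (m = {int(m)}, ef_construction = {int(ef_construction)});"
    )
    set_ef_search = f"SET hnsw.ef_search = {int(ef_search)};"
    return create_index, set_ef_search


# "haystack_documents" is a placeholder table name
for statement in hnsw_statements("haystack_documents"):
    print(statement)
```

Because the identifiers are interpolated directly into the string, this helper is only suitable for inspecting the statements, not for running them with untrusted input.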

Integration Tests

Integration tests require a running AlloyDB instance. Set the following environment variables before running:

export ALLOYDB_INSTANCE_URI="projects/MY_PROJECT/locations/MY_REGION/clusters/MY_CLUSTER/instances/MY_INSTANCE"
export ALLOYDB_USER="my-db-user"
export ALLOYDB_PASSWORD="my-db-password"

Then run:

cd integrations/alloydb
hatch run test:integration

License

alloydb-haystack is distributed under the terms of the Apache-2.0 license.
