Azure AI Search vector database adapter for Cognee

Azure AI Search Adapter for Cognee

This adapter provides integration between Cognee and Azure AI Search (formerly Azure Cognitive Search) for vector storage and retrieval operations.

Features

  • Full vector search capabilities using Azure AI Search
  • Hybrid search (combining text and vector search)
  • HNSW algorithm for efficient similarity search
  • Async/await support for all operations
  • Batch operations for improved performance

Installation

If the package is published on PyPI, install it with pip:

pip install cognee-community-vector-adapter-azure

If it is not published yet, use Poetry to install the adapter package locally:

pip install poetry
poetry install # run this command in the directory containing the pyproject.toml file

Configuration

The adapter requires the following settings:

  • endpoint: Your Azure AI Search service endpoint (e.g., https://your-service.search.windows.net)
  • api_key: Your Azure AI Search API key
  • embedding_engine: An instance of EmbeddingEngine for text vectorization
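One way to keep the endpoint and API key out of source code is to read them from environment variables. The following is a minimal sketch only: the variable names AZURE_SEARCH_ENDPOINT and AZURE_SEARCH_API_KEY and the helper load_azure_search_settings are hypothetical and are not defined by the adapter.

```python
import os


def load_azure_search_settings(env=None):
    """Read the adapter credentials from environment variables.

    The variable names used here are illustrative assumptions,
    not conventions of the adapter or of Cognee.
    """
    env = os.environ if env is None else env
    endpoint = env.get("AZURE_SEARCH_ENDPOINT", "").rstrip("/")
    api_key = env.get("AZURE_SEARCH_API_KEY", "")
    if not endpoint.startswith("https://"):
        raise ValueError("AZURE_SEARCH_ENDPOINT must be an https:// URL")
    if not api_key:
        raise ValueError("AZURE_SEARCH_API_KEY is not set")
    # Keys match the constructor arguments listed above.
    return {"endpoint": endpoint, "api_key": api_key}
```

The returned dictionary uses the same keys as the endpoint and api_key settings above, so it could be unpacked into the adapter constructor alongside an embedding_engine.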

Usage

import asyncio

from cognee.infrastructure.databases.vector.embeddings.EmbeddingEngine import EmbeddingEngine
from packages.vector.azureaisearch import AzureAISearchAdapter


async def main(data_points):
    # Initialize the adapter
    embedding_engine = EmbeddingEngine(...)  # Your embedding engine
    adapter = AzureAISearchAdapter(
        endpoint="https://your-service.search.windows.net",
        api_key="your-api-key",
        embedding_engine=embedding_engine,
    )

    # Create a collection (an Azure AI Search index)
    await adapter.create_collection("my_collection")

    # Add data points
    await adapter.create_data_points("my_collection", data_points)

    # Search with a single query
    results = await adapter.search(
        collection_name="my_collection",
        query_text="search query",
        limit=10,
    )

    # Batch search with several queries
    batch_results = await adapter.batch_search(
        collection_name="my_collection",
        query_texts=["query1", "query2"],
        limit=10,
    )
    return results, batch_results


# All adapter methods are coroutines, so run them inside an event loop:
# asyncio.run(main(data_points))  # data_points: the items you want to index
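Because Azure AI Search has no native batch search, a batch search amounts to several single searches run concurrently. The sketch below illustrates that pattern with asyncio.gather. It is not the adapter's actual implementation; batch_search_concurrently and fake_search are hypothetical names, and fake_search stands in for a real single-query search call.

```python
import asyncio


async def batch_search_concurrently(search_one, query_texts, limit=10):
    """Run one search per query concurrently; results keep the input order."""
    tasks = [search_one(query_text, limit) for query_text in query_texts]
    return await asyncio.gather(*tasks)


async def fake_search(query_text, limit):
    # Stand-in for a single search call; returns placeholder hits.
    await asyncio.sleep(0)
    return [f"{query_text}-hit-{i}" for i in range(limit)]


results = asyncio.run(
    batch_search_concurrently(fake_search, ["query1", "query2"], limit=2)
)
```

asyncio.gather returns results in the order the queries were given, so each result list can be matched to its query by position.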

Key Differences from Other Vector Databases

  1. Collections as Indexes: In Azure AI Search, what other vector databases call "collections" are called "indexes"
  2. Document Structure: Documents in Azure AI Search have a specific schema with defined fields
  3. Batch Operations: Azure AI Search doesn't have native batch search, so batch operations are parallelized
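  4. Scoring: Azure AI Search returns @search.score, which is scaled differently from the raw similarity or distance values that many other vector databases return, so scores are not directly comparable across backends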
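To make the scoring difference concrete: for pure vector queries with the cosine metric, the Azure AI Search documentation describes @search.score as 1 / (1 + cosine distance), where cosine distance is 1 minus cosine similarity. Scores therefore fall between about 0.333 and 1.0. The sketch below inverts that relation; the helper name is hypothetical, and the formula does not apply to hybrid queries, which use a fused ranking score instead.

```python
def cosine_similarity_from_search_score(score):
    """Recover cosine similarity from a cosine-metric vector @search.score.

    Assumes score = 1 / (1 + cosine_distance) and
    cosine_distance = 1 - cosine_similarity.
    """
    if not 0.0 < score <= 1.0:
        raise ValueError("expected a score in the interval (0, 1]")
    cosine_distance = 1.0 / score - 1.0
    return 1.0 - cosine_distance
```

Under these assumptions a score of 1.0 corresponds to identical directions and a score of 0.5 to orthogonal vectors.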

Vector Search Configuration

The adapter uses the HNSW (Hierarchical Navigable Small World) algorithm with the following default parameters:

  • m: 4 (number of bi-directional links created for each new element in the graph)
  • efConstruction: 400 (size of the dynamic candidate list used while building the index)
  • efSearch: 500 (size of the dynamic candidate list used at query time)
  • metric: cosine (similarity metric)

These parameters can be adjusted in the create_collection method if needed.
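For reference, these defaults correspond to an HNSW algorithm entry in the vectorSearch section of an Azure AI Search index definition. The sketch below builds that entry as a plain dictionary in the REST API's JSON shape. The function name, the configuration name "my-hnsw", and the range checks are illustrative assumptions (the ranges reflect the Azure documentation as understood here and should be verified against the current docs); the adapter itself may build the configuration differently, for example through the Azure SDK models.

```python
import json

# Assumed allowed ranges for HNSW parameters in Azure AI Search.
HNSW_RANGES = {
    "m": (4, 10),
    "efConstruction": (100, 1000),
    "efSearch": (100, 1000),
}


def build_hnsw_algorithm(name, m=4, ef_construction=400, ef_search=500, metric="cosine"):
    """Return an HNSW algorithm entry shaped like the REST index definition."""
    params = {"m": m, "efConstruction": ef_construction, "efSearch": ef_search}
    for key, value in params.items():
        low, high = HNSW_RANGES[key]
        if not low <= value <= high:
            raise ValueError(f"{key}={value} is outside the range {low}..{high}")
    return {
        "name": name,
        "kind": "hnsw",
        "hnswParameters": {**params, "metric": metric},
    }


print(json.dumps(build_hnsw_algorithm("my-hnsw"), indent=2))
```

Raising m or efConstruction generally improves recall at the cost of index size and build time, while efSearch trades query latency for recall.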

Download files

Source Distribution

cognee_community_vector_adapter_azure-0.0.2.tar.gz (376.2 kB)

File details

Details for the file cognee_community_vector_adapter_azure-0.0.2.tar.gz.

File hashes

Hashes for cognee_community_vector_adapter_azure-0.0.2.tar.gz
Algorithm Hash digest
SHA256 a6bf9dcb1271c41136a0ad30334e09d8d4dda55c9cbbe5e888674a78970c58c6
MD5 501b81e35a3eabc14c09f81461692e3d
BLAKE2b-256 29ca4c2c699096b2765097c4e41d20fc4a3c6677a0c7fb6d28120cc79296ebde

File details

Details for the file cognee_community_vector_adapter_azure-0.0.2-py3-none-any.whl.

File hashes

Hashes for cognee_community_vector_adapter_azure-0.0.2-py3-none-any.whl
Algorithm Hash digest
SHA256 f6b5a9068f6b179b4ad906d8e1571b8b13e61eb2b2e41e93a7228b8bf6c2bab2
MD5 709ff0bc09416ae947a5d9646475ecc0
BLAKE2b-256 31ba62390160167880cec5b8e192831f723baf8fe6f76db8d7aade3e98cf21b1
