LangChain vector store implementation for Azure Data Explorer (Kusto) - PRE-ALPHA VERSION

LangChain Kusto Vector Store

⚠️ PRE-ALPHA VERSION - This package is in very early development and not recommended for production use.

A LangChain vector store implementation for Azure Data Explorer (Kusto), Microsoft Fabric Eventhouse, and other Kusto-compatible databases.

Current Status

This initial version supports retrieval only. Document storage is not yet implemented, so the target table must already contain embeddings.

Features

  • ✅ Retrieve vector embeddings from Azure Data Explorer (Kusto) or Microsoft Fabric Eventhouse
  • ✅ Compatible with LangChain's vector store interface
  • ✅ Similarity search using the cosine similarity metric
  • ❌ Document storage (not yet implemented)
  • ❌ Batch operations (not yet implemented)
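
For readers unfamiliar with the metric named above, here is a minimal pure-Python sketch of cosine similarity and a top-k ranking built on it. It only illustrates the math and is not this package's implementation; the helper names `cosine_similarity` and `top_k` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    rows: Sequence[Tuple[str, Sequence[float]]],
    k: int = 5,
) -> List[Tuple[str, float]]:
    """Rank (id, vector) pairs by similarity to the query, highest first."""
    scored = [(row_id, cosine_similarity(query, vec)) for row_id, vec in rows]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]
```

A score of 1.0 means the vectors point in the same direction, and 0.0 means they are orthogonal.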

Installation

pip install langchain-kusto

Quick Start

from langchain_kusto import KustoVectorStore
from langchain_openai import AzureOpenAIEmbeddings
from azure.identity import DefaultAzureCredential

# Initialize embeddings
embeddings = AzureOpenAIEmbeddings(
    azure_endpoint="your-openai-endpoint",
    azure_deployment="your-embedding-deployment",
    openai_api_version="2023-05-15"
)

# Initialize the vector store (retrieval only)
vector_store = KustoVectorStore(
    connection="https://your-cluster.kusto.windows.net",  # or KustoConnectionStringBuilder
    database="your_database",
    collection_name="your_table",
    embedding=embeddings,
    embedding_column="embedding_text",  # optional, defaults to "embedding"
    id_column="vector_id",              # optional, defaults to "id"
    content_column="doc_text"           # optional, defaults to "text"
)

# Search for similar documents (this requires pre-existing data in Kusto)
results = vector_store.similarity_search("your query text", k=5)
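
LangChain's vector store interface returns `Document` objects from `similarity_search`, each with a `page_content` attribute. The sketch below shows one way to turn such results into a context string for a RAG prompt. `format_context` and its `max_chars` parameter are hypothetical helpers, not part of this package, and the `SimpleNamespace` objects stand in for real search results so the snippet runs without a cluster.

```python
from types import SimpleNamespace


def format_context(docs, max_chars=2000):
    """Join document texts into a numbered context block, capped by length."""
    parts = []
    used = 0
    for i, doc in enumerate(docs, start=1):
        snippet = f"[{i}] {doc.page_content.strip()}"
        # Stop before the context would exceed the character budget.
        if used + len(snippet) > max_chars:
            break
        parts.append(snippet)
        used += len(snippet)
    return "\n\n".join(parts)


# Stand-ins for the objects returned by vector_store.similarity_search(...)
sample_results = [
    SimpleNamespace(page_content="First retrieved passage."),
    SimpleNamespace(page_content="Second retrieved passage."),
]

prompt = (
    "Answer using only the context below.\n\n"
    f"{format_context(sample_results)}\n\n"
    "Question: your query text"
)
```

With a live store, pass the `results` list from the Quick Start in place of `sample_results`.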

Complete Example

See demo.py for a complete working example using Azure OpenAI embeddings and a RAG (Retrieval-Augmented Generation) pipeline.

Requirements

  • Python >= 3.8
  • Azure Data Explorer cluster or Microsoft Fabric Eventhouse with pre-existing vector data
  • LangChain Core >= 0.1.0
  • Azure authentication (DefaultAzureCredential)

Data Prerequisites

Because this version supports retrieval only, your vector embeddings must already be stored in a Kusto table. The structure below uses the same column names as the Quick Start example:

.create table your_table (
    vector_id: string,
    doc_text: string,
    embedding_text: dynamic  // Array of float values representing the vector
    // ... other metadata columns
)
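
To check that a table with this layout can be searched, you can run a similarity query against it directly. The sketch below builds such a query as a string using Kusto's `series_cosine_similarity` function. It is an assumption about what a suitable query looks like, not necessarily the query this package issues, and `build_similarity_query` is a hypothetical helper.

```python
import json
import re
from typing import Sequence

_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_similarity_query(
    table: str,
    query_vector: Sequence[float],
    k: int = 5,
    embedding_column: str = "embedding",
    id_column: str = "id",
    content_column: str = "text",
) -> str:
    """Build a KQL query returning the k rows most similar to query_vector."""
    # Reject names that are not plain identifiers, since they are
    # interpolated into the query text.
    for name in (table, embedding_column, id_column, content_column):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"not a simple identifier: {name!r}")
    vector_literal = json.dumps([float(x) for x in query_vector])
    return (
        f"{table}\n"
        f"| extend similarity = series_cosine_similarity("
        f"dynamic({vector_literal}), {embedding_column})\n"
        f"| top {int(k)} by similarity desc\n"
        f"| project {id_column}, {content_column}, similarity"
    )


query = build_similarity_query(
    "your_table",
    [0.1, 0.2, 0.3],
    k=5,
    embedding_column="embedding_text",
    id_column="vector_id",
    content_column="doc_text",
)
```

The query vector must have the same dimensionality as the stored embeddings, so it should come from the same embedding model that produced them.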

Development Status

This package is currently in pre-alpha development, and APIs may change significantly between versions. The current version only reads existing vector data from Kusto; document ingestion and storage capabilities will be added in future releases.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.
