Encrypted Vector Database for Secure and Fast ANN Searches with LangChain

VectorX LangChain Integration

This package integrates VectorX, an encrypted vector database, with LangChain, so that VectorX can serve as a LangChain vector store backend.

Features

  • Encrypted Vector Storage: Use VectorX's client-side encryption for your LangChain embeddings
  • Multiple Distance Metrics: Support for cosine, L2, and inner product distance metrics
  • Metadata Filtering: Filter search results based on metadata
  • High Performance: Approximate nearest neighbor (ANN) search designed to stay fast on encrypted data
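
To make the three metrics concrete, here is a small pure-Python sketch of how inner product, L2 distance, and cosine similarity are computed for two vectors. It does not use VectorX at all, and the function names are illustrative only; it is meant as a reference for choosing a metric, not as a description of how VectorX computes them.

```python
import math


def inner_product(a, b):
    """Sum of element-wise products; larger means more similar."""
    return sum(x * y for x, y in zip(a, b))


def l2_distance(a, b):
    """Euclidean distance; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Inner product of the two vectors after scaling each to unit length."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return inner_product(a, b) / (norm_a * norm_b)


a = [1.0, 0.0]
b = [3.0, 4.0]
print(inner_product(a, b))      # 1*3 + 0*4 = 3.0
print(l2_distance(a, b))        # sqrt(4 + 16) = sqrt(20)
print(cosine_similarity(a, b))  # 3 / (1 * 5) = 0.6
```

Cosine similarity ignores vector length, while L2 distance and inner product do not, so the right choice depends on whether your embedding model produces normalized vectors.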

Installation

pip install vecx-langchain

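This installs the vecx-langchain package along with its dependencies (vecx, langchain, and langchain-core). The examples below also import langchain_openai, which is provided by the separate langchain-openai package.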

Quick Start

import os
from langchain_openai import OpenAIEmbeddings
from langchain_core.documents import Document
from vecx.vectorx import VectorX
from vecx_langchain import VectorXVectorStore

# Configure your VectorX credentials
# (indexing os.environ raises KeyError immediately if the variable is unset)
api_token = os.environ["VECTORX_API_TOKEN"]
vx = VectorX(token=api_token)

# Generate a secure encryption key
encryption_key = vx.generate_key()
# The key is automatically printed with a warning to store it securely

# Initialize embedding model
embedding_model = OpenAIEmbeddings()

# Initialize the vector store
vector_store = VectorXVectorStore.from_params(
    embedding=embedding_model,
    api_token=api_token,
    encryption_key=encryption_key,
    index_name="my_langchain_vectors",
    space_type="cosine"
)

# Add documents
texts = [
    "VectorX is an encrypted vector database",
    "LangChain is a framework for developing applications powered by language models",
    "Encryption keeps your data secure"
]

metadatas = [
    {"source": "product", "category": "database"},
    {"source": "github", "category": "framework"},
    {"source": "textbook", "category": "security"}
]

vector_store.add_texts(texts=texts, metadatas=metadatas)

# Search similar documents
results = vector_store.similarity_search("How does encryption work?", k=2)

# Process results
for doc in results:
    print(f"Content: {doc.page_content}")
    print(f"Metadata: {doc.metadata}")
    print()
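
Since the key has to be stored securely, later runs need a way to load it again rather than generate a new one. The sketch below reads it from an environment variable. Both the helper load_secret and the variable name VECTORX_ENCRYPTION_KEY are hypothetical and not part of vecx or vecx-langchain; a secrets manager would work just as well.

```python
import os


def load_secret(name: str) -> str:
    """Return the value of environment variable `name`, failing loudly if unset."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Usage (assumes you exported the key produced by vx.generate_key()
# under the hypothetical name VECTORX_ENCRYPTION_KEY):
# encryption_key = load_secret("VECTORX_ENCRYPTION_KEY")
```

Failing early with a clear message avoids passing an empty key on to the vector store.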

How Encryption Works

When using the VectorX LangChain integration:

  1. Key Generation: The vx.generate_key() method generates a secure encryption key
  2. Client-Side Encryption: Your vectors and metadata are encrypted before being sent to the server
  3. Secure Queries: Query vectors are also encrypted, maintaining security throughout the process
  4. Zero-Knowledge Architecture: The VectorX server never sees your unencrypted data
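
The idea that a server can rank vectors it cannot read may be easier to picture with a toy example. The sketch below is not VectorX's scheme, which this page does not describe, and it is not cryptographically secure. It only assumes a key-derived signed permutation (a shuffle of coordinates plus sign flips) to show that a secret transformation can hide the original coordinates while leaving inner products unchanged. All names are hypothetical.

```python
import hashlib
import random


def make_transform(key: str, dim: int):
    """Derive a secret permutation and sign flips from the key (toy scheme)."""
    seed = int.from_bytes(hashlib.sha256(key.encode()).digest(), "big")
    rng = random.Random(seed)
    order = list(range(dim))
    rng.shuffle(order)
    signs = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
    return order, signs


def transform(vec, order, signs):
    """Apply the signed permutation; only this output would leave the client."""
    return [signs[i] * vec[src] for i, src in enumerate(order)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


order, signs = make_transform("demo-key", dim=4)
doc = [0.5, -1.0, 2.0, 0.25]
query = [1.0, 0.0, -0.5, 4.0]

# The inner product is the same before and after the transform, so a
# server holding only transformed vectors can still rank by similarity.
plain = dot(doc, query)
hidden = dot(transform(doc, order, signs), transform(query, order, signs))
print(plain, hidden)
```

The same key must be used for documents and queries, which mirrors why the integration asks for one encryption key per index.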

Using with LangChain

VectorX can be used anywhere a LangChain vector store is needed, for example as the retriever in a retrieval-augmented generation (RAG) chain:

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from vecx_langchain import VectorXVectorStore

# Initialize your vector store (replace the placeholder strings with your own values)
vector_store = VectorXVectorStore.from_params(
    embedding=OpenAIEmbeddings(),
    api_token="your_api_token",
    encryption_key="your_encryption_key",
    index_name="your_index_name"
)

# Create a retriever
retriever = vector_store.as_retriever()

# Create the RAG chain
model = ChatOpenAI()
prompt = ChatPromptTemplate.from_template(
    """Answer the following question based on the provided context:
    
    Context: {context}
    Question: {question}
    """
)

rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | model
    | StrOutputParser()
)

# Use the chain
response = rag_chain.invoke("What is VectorX?")
print(response)
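
In the chain above, the retriever returns a list of Document objects, and that list is placed into the prompt as-is. A common LangChain pattern is to join the document texts into a single string first. The helper format_docs below is a hypothetical name, and FakeDoc is only a stand-in so the snippet runs without LangChain installed.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDoc:
    """Stand-in for langchain_core.documents.Document, for demonstration only."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs(docs) -> str:
    """Join the text of retrieved documents into one context string."""
    return "\n\n".join(doc.page_content for doc in docs)


docs = [
    FakeDoc("VectorX is an encrypted vector database"),
    FakeDoc("Encryption keeps your data secure"),
]
print(format_docs(docs))

# In the chain, the helper would be piped after the retriever:
# {"context": retriever | format_docs, "question": RunnablePassthrough()}
```

This keeps metadata and object representations out of the prompt, so the model sees only the document text.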

API Reference

VectorXVectorStore

VectorXVectorStore is the main class of the integration. Its key methods are:

  • __init__: Initialize with an existing VectorX index, or with the parameters needed to create a new one
  • from_params: Create a vector store using an API token and encryption key
  • add_texts: Add text documents with optional metadata
  • similarity_search: Search for similar documents
  • similarity_search_with_score: Search and return similarity scores
  • delete: Delete documents by ID or filter
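
As a rough illustration of similarity_search_with_score, the sketch below assumes the method follows the usual LangChain convention of accepting a query and k and returning (Document, score) pairs; this page does not confirm the exact signature. The helper describe_hits and the StubStore class are hypothetical, and the scores in the stub are made up so the snippet runs without a VectorX account.

```python
from types import SimpleNamespace


def describe_hits(store, query, k=2):
    """Return (score, text) pairs from similarity_search_with_score."""
    pairs = store.similarity_search_with_score(query, k=k)
    return [(float(score), doc.page_content) for doc, score in pairs]


class StubStore:
    """Stand-in for VectorXVectorStore with made-up scores."""

    def similarity_search_with_score(self, query, k=4):
        hits = [
            (SimpleNamespace(page_content="Encryption keeps your data secure", metadata={}), 0.83),
            (SimpleNamespace(page_content="VectorX is an encrypted vector database", metadata={}), 0.71),
        ]
        return hits[:k]


for score, text in describe_hits(StubStore(), "How does encryption work?"):
    print(f"{score:.2f}  {text}")
```

Whether a higher or lower score means a closer match depends on the distance metric and on how the store reports it, so check the values against a known query before relying on a threshold.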

Configuration Options

The VectorXVectorStore constructor and from_params method accept the following parameters:

  • embedding: LangChain embedding function to use
  • api_token: Your VectorX API token
  • encryption_key: Your encryption key for the index
  • index_name: Name of the VectorX index
  • dimension: Vector dimension (can be inferred from embedding model)
  • space_type: Distance metric, one of "cosine", "l2", or "ip" (default: "cosine")
  • text_key: Key to use for storing text in metadata (default: "text")
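
The options above can be collected and checked before the store is created. The sketch below uses only the parameter names and the three space_type values listed in this section; the helper build_store_params is hypothetical and not part of vecx-langchain.

```python
VALID_SPACE_TYPES = {"cosine", "l2", "ip"}


def build_store_params(api_token, encryption_key, index_name,
                       space_type="cosine", text_key="text", dimension=None):
    """Collect and validate keyword arguments for VectorXVectorStore.from_params."""
    if space_type not in VALID_SPACE_TYPES:
        raise ValueError(
            f"space_type must be one of {sorted(VALID_SPACE_TYPES)}, got {space_type!r}"
        )
    params = {
        "api_token": api_token,
        "encryption_key": encryption_key,
        "index_name": index_name,
        "space_type": space_type,
        "text_key": text_key,
    }
    # Leave dimension out unless given, so it can be inferred from the embedding model.
    if dimension is not None:
        params["dimension"] = dimension
    return params


params = build_store_params(
    "your_api_token", "your_encryption_key", "your_index_name", space_type="l2"
)
print(params["space_type"])

# Then, with an embedding model of your choice:
# vector_store = VectorXVectorStore.from_params(embedding=embedding_model, **params)
```

Validating space_type on the client catches typos such as "dot" or "euclidean" before any request is made.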
