LangChain integration for Snowflake vector store

LangChain Snowflake Vector Store

A LangChain integration for Snowflake's vector database capabilities, enabling semantic search and similarity matching using Snowflake's native VECTOR data type and VECTOR_COSINE_SIMILARITY function.

Features

  • 🏔️ Native Snowflake Integration: Uses Snowflake's built-in vector capabilities
  • 🔍 Semantic Search: Powered by VECTOR_COSINE_SIMILARITY function
  • 📊 Scalable: Leverages Snowflake's cloud-native architecture
  • 🔒 Secure: Enterprise-grade security and compliance
  • 🚀 High Performance: Optimized for large-scale vector operations
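For readers unfamiliar with the measure behind the semantic search feature, the following is a minimal pure-Python sketch of cosine similarity using its standard mathematical definition. It is an illustration only and not Snowflake's implementation; the function name `cosine_similarity` is chosen here for the example.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Vectors pointing the same way score close to 1; orthogonal vectors score 0.
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

Because the measure depends only on direction, scaling a vector does not change its score, which is why it suits embeddings of differing magnitude.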

Installation

pip install langchain-snowflake-vectorstore

Quick Start

from langchain_snowflake_vectorstore import SnowflakeVectorStore
from langchain_openai import OpenAIEmbeddings

# Initialize embeddings
embeddings = OpenAIEmbeddings()

# Create vector store
vector_store = SnowflakeVectorStore(
    account="your-account",
    user="your-username",
    password="your-password",
    database="your-database",
    schema="your-schema",
    warehouse="your-warehouse",
    role="your-role",
    table_name="vector_documents",
    embedding_function=embeddings,
    embedding_dimension=1536
)

# Add documents
texts = [
    "LangChain is a framework for developing applications powered by language models.",
    "Snowflake is a cloud-based data warehousing platform.",
    "Vector databases enable semantic search and similarity matching."
]

ids = vector_store.add_texts(texts)

# Search for similar documents
results = vector_store.similarity_search("What is LangChain?", k=2)
for doc in results:
    print(doc.page_content)

Configuration

Environment Variables

You can set your Snowflake credentials using environment variables:

export SNOWFLAKE_ACCOUNT=your-account
export SNOWFLAKE_USER=your-username
export SNOWFLAKE_PASSWORD=your-password
export SNOWFLAKE_DATABASE=your-database
export SNOWFLAKE_SCHEMA=your-schema
export SNOWFLAKE_WAREHOUSE=your-warehouse
export SNOWFLAKE_ROLE=your-role
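This README does not say whether the package reads these variables on its own. As a sketch under that uncertainty, the hypothetical helper below (`snowflake_kwargs_from_env` is not part of the package) collects the variables into keyword arguments matching the constructor call in the Quick Start, so they can be passed explicitly.

```python
import os
from typing import Dict, Mapping, Optional

# Environment variable names from this section, mapped to the constructor
# keyword names used in the Quick Start.
ENV_TO_KWARG = {
    "SNOWFLAKE_ACCOUNT": "account",
    "SNOWFLAKE_USER": "user",
    "SNOWFLAKE_PASSWORD": "password",
    "SNOWFLAKE_DATABASE": "database",
    "SNOWFLAKE_SCHEMA": "schema",
    "SNOWFLAKE_WAREHOUSE": "warehouse",
    "SNOWFLAKE_ROLE": "role",
}
OPTIONAL_VARS = {"SNOWFLAKE_ROLE"}  # role is listed as not required


def snowflake_kwargs_from_env(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Hypothetical helper: build connection kwargs from environment variables."""
    env = os.environ if env is None else env
    missing = [
        name for name in ENV_TO_KWARG
        if name not in OPTIONAL_VARS and not env.get(name)
    ]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    # Unset optional variables are simply left out of the result.
    return {kwarg: env[name] for name, kwarg in ENV_TO_KWARG.items() if env.get(name)}


# Possible usage, assuming the constructor shown in the Quick Start:
# vector_store = SnowflakeVectorStore(
#     **snowflake_kwargs_from_env(),
#     table_name="vector_documents",
#     embedding_function=embeddings,
#     embedding_dimension=1536,
# )
```

Failing early with the list of missing names tends to be easier to debug than a connection error raised later by the driver.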

Connection Parameters

Parameter            Description                       Required
account              Snowflake account identifier      Yes
user                 Username for authentication       Yes
password             Password for authentication       Yes
database             Database name                     Yes
schema               Schema name                       Yes
warehouse            Warehouse name                    Yes
role                 Role name                         No
table_name           Table name for storing vectors    Yes
embedding_function   Function to generate embeddings   Yes
embedding_dimension  Dimension of embedding vectors    Yes
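The value of `embedding_dimension` presumably has to equal the length of the vectors that `embedding_function` produces (an assumption; this README does not describe what happens on a mismatch). One way to avoid hard-coding the number is to embed a short string once and measure the result. The sketch below uses `embed_query`, the single-text method of the LangChain embeddings interface; `FakeEmbeddings` and `probe_embedding_dimension` are hypothetical names used so the example runs without an API key.

```python
from typing import List


class FakeEmbeddings:
    """Stand-in for a real embedding model; returns fixed-size zero vectors."""

    def __init__(self, dimension: int) -> None:
        self.dimension = dimension

    def embed_query(self, text: str) -> List[float]:
        return [0.0] * self.dimension


def probe_embedding_dimension(embeddings) -> int:
    """Embed a short string and report the length of the resulting vector."""
    return len(embeddings.embed_query("dimension probe"))


dimension = probe_embedding_dimension(FakeEmbeddings(1536))
```

With a hosted model the probe costs one embedding request, so it is best done once at startup rather than per call.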

Advanced Usage

Custom Table Creation

# Recreate table with custom settings
vector_store.recreate_table()

Similarity Search with Scores

# Get similarity scores along with documents
results_with_scores = vector_store.similarity_search_with_score("query", k=5)
for doc, score in results_with_scores:
    print(f"Score: {score}, Content: {doc.page_content}")
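A common follow-up is to drop weak matches. The sketch below filters `(document, score)` pairs by a minimum score. It assumes the score is a cosine similarity where larger means more similar, which this README does not state explicitly; if the store returns a distance instead, the comparison should be reversed. `filter_by_score` is a hypothetical helper, and plain strings stand in for documents so the example is self-contained.

```python
from typing import List, Tuple, TypeVar

Doc = TypeVar("Doc")


def filter_by_score(
    results: List[Tuple[Doc, float]], min_score: float
) -> List[Tuple[Doc, float]]:
    """Keep only pairs whose score is at least min_score, preserving order."""
    return [(doc, score) for doc, score in results if score >= min_score]


# Stand-in for the output of similarity_search_with_score.
sample = [("doc about LangChain", 0.91), ("doc about Snowflake", 0.42)]
relevant = filter_by_score(sample, min_score=0.75)
```

A sensible threshold depends on the embedding model and the data, so it is worth inspecting real scores before fixing a value.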

Adding Documents with Metadata

texts = ["Document 1", "Document 2"]
metadatas = [{"source": "file1.txt"}, {"source": "file2.txt"}]
ids = vector_store.add_texts(texts, metadatas=metadatas)

Batch Operations

# Create from texts (class method)
vector_store = SnowflakeVectorStore.from_texts(
    texts=texts,
    embedding=embeddings,
    account="your-account",
    # ... other parameters
)
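For large corpora it can help to insert texts in fixed-size chunks so that a single failure does not discard all progress. Whether `add_texts` already batches internally is not stated here, so the following is only a sketch; `batched` and `add_texts_in_batches` are hypothetical helpers, and the only assumption about the store is that `add_texts` accepts a list of strings and returns a list of ids, as in the Quick Start.

```python
from typing import Iterator, List, Sequence


def batched(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of items, each at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def add_texts_in_batches(store, texts: Sequence[str], batch_size: int = 100) -> List[str]:
    """Call store.add_texts once per batch and collect the returned ids in order."""
    ids: List[str] = []
    for chunk in batched(texts, batch_size):
        ids.extend(store.add_texts(chunk))
    return ids
```

Usage would look like `ids = add_texts_in_batches(vector_store, texts, batch_size=500)`; the batch size here is arbitrary and should be tuned to the embedding provider's limits.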

Requirements

  • Python 3.8+
  • Snowflake account with vector support
  • LangChain Core
  • Snowflake Connector for Python
  • SQLAlchemy

Snowflake Setup

Your Snowflake account must support the VECTOR data type and VECTOR_COSINE_SIMILARITY function. These features are available in recent Snowflake versions.
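One way to check for this support before creating a store is to run a small probe query. The sketch below assumes a DB-API style cursor with `execute` and `fetchone`, such as the one returned by the Snowflake Connector for Python; `supports_vectors` and `PROBE_SQL` are hypothetical names, not part of this package.

```python
# Compares a three-element vector with itself; the expected similarity is 1.
PROBE_SQL = (
    "SELECT VECTOR_COSINE_SIMILARITY("
    "[1, 2, 3]::VECTOR(FLOAT, 3), "
    "[1, 2, 3]::VECTOR(FLOAT, 3))"
)


def supports_vectors(cursor) -> bool:
    """Return True if the probe query runs and yields a similarity near 1."""
    try:
        cursor.execute(PROBE_SQL)
        row = cursor.fetchone()
    except Exception:
        # Any failure is treated as "not supported"; this also hides
        # connection or permission errors, so log the exception in real use.
        return False
    return row is not None and abs(float(row[0]) - 1.0) < 1e-6
```

With the connector installed, the cursor would typically come from `snowflake.connector.connect(...).cursor()`.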

Required Permissions

Ensure your Snowflake role has the following permissions:

  • CREATE TABLE on the target schema
  • INSERT, SELECT, UPDATE, DELETE on the vector table
  • USAGE on the database, schema, and warehouse
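The permissions above correspond to standard Snowflake `GRANT` statements. As a rough sketch, the hypothetical helper below (`grant_statements` is not part of this package) renders them for review by an administrator. Identifiers are interpolated without quoting, so it should only be used with trusted names, and the table-level grant can only succeed once the table exists.

```python
from typing import List


def grant_statements(
    role: str, database: str, schema: str, warehouse: str, table: str
) -> List[str]:
    """Render GRANT statements matching the permission list above."""
    qualified_schema = f"{database}.{schema}"
    return [
        f"GRANT USAGE ON DATABASE {database} TO ROLE {role};",
        f"GRANT USAGE ON SCHEMA {qualified_schema} TO ROLE {role};",
        f"GRANT USAGE ON WAREHOUSE {warehouse} TO ROLE {role};",
        f"GRANT CREATE TABLE ON SCHEMA {qualified_schema} TO ROLE {role};",
        f"GRANT SELECT, INSERT, UPDATE, DELETE ON TABLE "
        f"{qualified_schema}.{table} TO ROLE {role};",
    ]


# Placeholder names for illustration only.
statements = grant_statements(
    role="VECTOR_APP",
    database="MY_DB",
    schema="MY_SCHEMA",
    warehouse="MY_WH",
    table="VECTOR_DOCUMENTS",
)
```

If the same role creates the table it will own it, in which case the final statement is usually unnecessary.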

Testing

Run the test suite:

# Unit tests
pytest tests/test_vectorstore.py

# Integration tests (requires Snowflake credentials)
export SNOWFLAKE_ACCOUNT=your-account
# ... set other environment variables
pytest tests/test_integration.py -m integration

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Support

For issues and questions:

Changelog

v0.1.0

  • Initial release
  • Basic vector store functionality
  • Snowflake integration with VECTOR data type
  • Similarity search using VECTOR_COSINE_SIMILARITY
  • Comprehensive test suite
