An integration package connecting ClickZetta and LangChain
LangChain ClickZetta Integration
LangChain integration for ClickZetta, providing SQL queries, vector storage, and full-text search capabilities.
Features
- SQL Queries: Natural language to SQL conversion and execution
- Vector Storage: Efficient vector storage and similarity search
- Full-text Search: Advanced text search capabilities with inverted index
- Chat History: Persistent conversation memory
- Hybrid Search: Combine vector and full-text search
- True Hybrid Store: Single table with both vector and inverted indexes (ClickZetta native)
- Key-Value Store: LangChain BaseStore implementation for persistent key-value storage
- Document Store: Structured document storage with metadata support
- File Store: Binary file storage using ClickZetta Volume
- Volume Store: Native ClickZetta Volume storage for large binary data
Installation
pip install langchain-clickzetta
Quick Start
Basic Setup
from langchain_clickzetta import ClickZettaEngine
# Create engine
engine = ClickZettaEngine(
    service="your-service",
    instance="your-instance",
    workspace="your-workspace",
    schema="your-schema",
    username="your-username",
    password="your-password",
    vcluster="your-vcluster",
)
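Hard-coding credentials as in the snippet above is fine for a first test, but in practice they usually come from the environment. The following is a minimal sketch of that idea. The environment variable names (`CLICKZETTA_SERVICE` and so on) and the helper `clickzetta_kwargs_from_env` are hypothetical and not part of this package; only the keyword names match the `ClickZettaEngine` call shown above.

```python
import os

# Hypothetical environment variable names mapped to the ClickZettaEngine
# keyword arguments shown above. Rename them to suit your deployment.
_ENV_TO_KWARG = {
    "CLICKZETTA_SERVICE": "service",
    "CLICKZETTA_INSTANCE": "instance",
    "CLICKZETTA_WORKSPACE": "workspace",
    "CLICKZETTA_SCHEMA": "schema",
    "CLICKZETTA_USERNAME": "username",
    "CLICKZETTA_PASSWORD": "password",
    "CLICKZETTA_VCLUSTER": "vcluster",
}


def clickzetta_kwargs_from_env(environ=None):
    """Collect engine keyword arguments from environment variables.

    Raises KeyError listing every variable that is unset or empty.
    """
    env = os.environ if environ is None else environ
    missing = [name for name in _ENV_TO_KWARG if not env.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(sorted(missing)))
    return {kwarg: env[name] for name, kwarg in _ENV_TO_KWARG.items()}
```

With the variables set, the engine could then be created as `ClickZettaEngine(**clickzetta_kwargs_from_env())`.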
Vector Storage
from langchain_clickzetta import ClickZettaVectorStore
from langchain_community.embeddings import DashScopeEmbeddings
# Setup embeddings
embeddings = DashScopeEmbeddings(
    dashscope_api_key="your-api-key",
    model="text-embedding-v4",
)
# Create vector store
vector_store = ClickZettaVectorStore(
    engine=engine,
    embeddings=embeddings,
    table_name="my_vectors",
)
# Add documents
texts = ["Hello world", "LangChain is great"]
vector_store.add_texts(texts)
# Search
results = vector_store.similarity_search("greeting", k=2)
True Hybrid Search
from langchain_clickzetta import ClickZettaHybridStore, ClickZettaUnifiedRetriever
# Create hybrid store (single table with vector + full-text indexes)
hybrid_store = ClickZettaHybridStore(
    engine=engine,
    embeddings=embeddings,
    table_name="hybrid_docs",
)
# Add documents
hybrid_store.add_texts([
    "ClickZetta is a high-performance analytics database",
    "LangChain enables building applications with LLMs",
])
# Create unified retriever
retriever = ClickZettaUnifiedRetriever(
    hybrid_store=hybrid_store,
    search_type="hybrid",  # "vector", "fulltext", or "hybrid"
    alpha=0.5,  # Balance between vector and full-text search
)
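# Search with hybrid approach. invoke() is the current LangChain retriever
# entry point; get_relevant_documents() is deprecated in recent releases.
results = retriever.invoke("analytics database")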
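To give a feel for what a balance parameter such as `alpha` typically does, here is a small, library-independent sketch of weighted score fusion. It is an illustration under assumptions, not the package's actual implementation: the function `fuse_scores`, the min-max normalization, and the convention that `alpha` weights the vector side are all choices made for this example.

```python
def fuse_scores(vector_scores, fulltext_scores, alpha=0.5):
    """Blend two {doc_id: score} maps into one ranked list.

    Assumed convention: alpha=1.0 uses only vector scores,
    alpha=0.0 uses only full-text scores.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")

    def normalize(scores):
        # Min-max scale to [0, 1] so the two score types are comparable.
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        span = hi - lo
        return {doc: ((s - lo) / span if span else 1.0) for doc, s in scores.items()}

    v = normalize(vector_scores)
    f = normalize(fulltext_scores)
    fused = {
        doc: alpha * v.get(doc, 0.0) + (1.0 - alpha) * f.get(doc, 0.0)
        for doc in set(v) | set(f)
    }
    # Highest fused score first.
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)
```

A document found by only one of the two searches contributes a zero for the other side, so raising or lowering `alpha` shifts the ranking toward semantic or keyword matches respectively.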
SQL Chain
from langchain_clickzetta import ClickZettaSQLChain
from langchain_community.llms import Tongyi
llm = Tongyi(dashscope_api_key="your-api-key")
sql_chain = ClickZettaSQLChain.from_engine(
    engine=engine,
    llm=llm,
)
result = sql_chain.invoke({"query": "How many tables are there?"})
print(result["result"])
Key-Value Store
from langchain_clickzetta import ClickZettaStore
# Create key-value store
store = ClickZettaStore(
    engine=engine,
    table_name="my_store",
)
# Store and retrieve data
store.mset([("key1", b"value1"), ("key2", b"value2")])
values = store.mget(["key1", "key2"])
print(values) # [b'value1', b'value2']
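Because the store in the example above works with `bytes` values, structured data has to be serialized first. The sketch below shows one way to do that with JSON from the standard library. The helpers `to_store_pairs` and `from_store_values` are hypothetical names introduced here, and the handling of `None` assumes the usual LangChain `BaseStore` convention that `mget` returns `None` for missing keys.

```python
import json


def to_store_pairs(items):
    """Turn {key: JSON-serializable value} into (key, bytes) pairs for mset()."""
    return [
        (key, json.dumps(value, sort_keys=True).encode("utf-8"))
        for key, value in items.items()
    ]


def from_store_values(raw_values):
    """Decode a list returned by mget(); entries that are None stay None."""
    return [
        None if raw is None else json.loads(raw.decode("utf-8"))
        for raw in raw_values
    ]
```

Usage would then look like `store.mset(to_store_pairs({"user:1": {"name": "Ada"}}))` followed by `from_store_values(store.mget(["user:1"]))`.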
Document Store
from langchain_clickzetta import ClickZettaDocumentStore
# Create document store
doc_store = ClickZettaDocumentStore(
    engine=engine,
    table_name="documents",
)
# Store document with metadata
doc_store.store_document(
    doc_id="doc1",
    content="This is a sample document",
    metadata={"author": "John", "category": "sample"},
)
# Retrieve document
content, metadata = doc_store.get_document("doc1")
File Store
from langchain_clickzetta import ClickZettaFileStore
# Create file store using ClickZetta Volume
file_store = ClickZettaFileStore(
    engine=engine,
    volume_type="user",
    subdirectory="my_files",
)
# Read a local binary file, then store it under a path inside the volume
with open("image.png", "rb") as f:
    content = f.read()
file_store.store_file("images/logo.png", content, "image/png")
# Retrieve file
file_content, mime_type = file_store.get_file("images/logo.png")
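The example above passes the MIME type by hand. When storing many files, it can be derived from the file name instead. The following sketch uses only the standard library; the helper `read_file_with_mime` and its fallback value are assumptions for illustration and not part of this package.

```python
import mimetypes
from pathlib import Path


def read_file_with_mime(path, default_mime="application/octet-stream"):
    """Return (content_bytes, mime_type) for a local file.

    The MIME type is guessed from the file extension; unknown
    extensions fall back to default_mime.
    """
    file_path = Path(path)
    mime_type, _encoding = mimetypes.guess_type(file_path.name)
    return file_path.read_bytes(), mime_type or default_mime
```

The result can be passed straight on, for example `content, mime = read_file_with_mime("image.png")` and then `file_store.store_file("images/logo.png", content, mime)`.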
Chat History
from langchain_clickzetta import ClickZettaChatMessageHistory
from langchain_core.messages import HumanMessage, AIMessage
# Create chat history
chat_history = ClickZettaChatMessageHistory(
    engine=engine,
    session_id="session123",
    table_name="chat_history",
)
# Add messages
chat_history.add_message(HumanMessage(content="Hello"))
chat_history.add_message(AIMessage(content="Hi there!"))
# Retrieve messages
messages = chat_history.messages
Documentation
For more detailed documentation, see the main repository README and examples.
Development
See CONTRIBUTING.md for development setup and guidelines.
License
This package is released under the MIT License.