LangChain VectorStore integration for Envector
LangChain Envector Integration
Encrypted vector search for LangChain using Envector, powered by homomorphic encryption (CKKS). This repo ships a LangChain-compatible VectorStore and retriever utilities built on the high-level pyenvector Python SDK.
Features
- LangChain VectorStore interface with similarity_search, from_texts, etc.
- Optional VectorStoreRetriever helper for quick RAG integrations.
- Client-side encryption handled transparently by the SDK, including score thresholds and filtering.
Installation
- Python 3.9–3.13 (3.11 recommended)
- Create and activate a virtualenv:
    python3.11 -m venv .venv && source .venv/bin/activate
- Install runtime dependencies:
    pip install -U pip setuptools wheel
    pip install pyenvector langchain sentence-transformers
Usage Overview
- Configure Envector using EnvectorConfig, pointing to your EnVector endpoint and keys.
- Initialize embeddings (or provide pre-computed vectors).
- Instantiate Envector(config=cfg, embeddings=emb) and call add_texts, add_documents, or use as_retriever.
- Run similarity_search or plug the retriever into your LangChain pipeline.
See notebooks/ for end-to-end walkthroughs and the libs/envector package for implementation details.
Configuration
Key dataclasses live in libs/envector/config.py:
- ConnectionConfig: address or host/port for EnVector.
- KeyConfig: key path, key ID, optional preset/eval mode.
- IndexSettings: index name, dimension (32–4096), query encryption mode, optional output fields and fetch parameters.
- EnvectorConfig: wraps the above and enables auto-creation via create_if_missing.
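The dimension range above can be checked before an index is created. The following is a minimal sketch, assuming a hypothetical stand-in class (IndexSettingsSketch is not the real IndexSettings, whose fields and validation may differ); only the 32–4096 range comes from this README.

```python
from dataclasses import dataclass

# Range stated in this README for the index dimension.
MIN_DIM, MAX_DIM = 32, 4096


@dataclass(frozen=True)
class IndexSettingsSketch:
    """Hypothetical stand-in for illustration; not the library's IndexSettings."""

    index_name: str
    dim: int
    query_encryption: str = "cipher"

    def __post_init__(self) -> None:
        # Reject obviously invalid settings early, before any network call.
        if not self.index_name:
            raise ValueError("index_name must be non-empty")
        if not MIN_DIM <= self.dim <= MAX_DIM:
            raise ValueError(
                f"dim must be in [{MIN_DIM}, {MAX_DIM}], got {self.dim}"
            )


settings = IndexSettingsSketch(index_name="papers", dim=384)
```

Failing at construction time keeps a bad dimension from surfacing later as a server-side error.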
Data Model
- Each vector stores a single metadata string in EnVector.
- To align with LangChain's Document, inserts wrap data as JSON: {"text": ..., "metadata": ...}.
- Retrieval unwraps JSON, returning Document(page_content=text, metadata={...}).
- Client-side filtering requires the JSON envelope to include an object under metadata.
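The envelope described above can be sketched with the standard json module. The helper names wrap and unwrap are hypothetical, and the fallback to the raw string for non-envelope data is an assumption rather than confirmed library behavior.

```python
import json
from typing import Any, Dict, Optional, Tuple


def wrap(text: str, metadata: Optional[Dict[str, Any]] = None) -> str:
    """Serialize text and metadata into the single string stored per vector."""
    return json.dumps({"text": text, "metadata": metadata or {}})


def unwrap(raw: str) -> Tuple[str, Dict[str, Any]]:
    """Return (text, metadata); assumed fallback: raw string, empty metadata."""
    try:
        obj = json.loads(raw)
    except (TypeError, ValueError):
        return raw, {}  # not JSON at all
    if not isinstance(obj, dict) or "text" not in obj:
        return raw, {}  # valid JSON, but not an envelope
    meta = obj.get("metadata")
    return str(obj["text"]), meta if isinstance(meta, dict) else {}


text, meta = unwrap(wrap("chunk-1", {"source": "paper.pdf", "page": 1}))
```

The pair (text, meta) maps directly onto Document(page_content=text, metadata=meta).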
Limitations
- Item-level delete/update is unsupported (drop the index to reset).
- Manual item IDs are not accepted; returned IDs from add_texts are ephemeral.
- Filtering happens client-side; ensure metadata is JSON for structured filters.
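Because filtering happens client-side, it amounts to post-processing the decrypted hits. Below is a minimal sketch of exact-match metadata filtering with an optional score threshold; filter_hits and the (text, metadata, score) tuple layout are hypothetical and do not reflect the library's actual filter syntax.

```python
from typing import Any, Dict, List, Optional, Tuple

Hit = Tuple[str, Dict[str, Any], float]  # (text, metadata, score)


def filter_hits(
    hits: List[Hit],
    where: Dict[str, Any],
    score_threshold: Optional[float] = None,
) -> List[Hit]:
    """Keep hits whose metadata matches every key in `where` exactly."""
    kept = []
    for text, meta, score in hits:
        if score_threshold is not None and score < score_threshold:
            continue  # below the requested similarity
        if all(k in meta and meta[k] == v for k, v in where.items()):
            kept.append((text, meta, score))
    return kept


hits = [
    ("chunk-1", {"source": "paper.pdf", "page": 1}, 0.91),
    ("chunk-2", {"source": "paper.pdf", "page": 2}, 0.85),
    ("chunk-3", {"source": "notes.txt", "page": 1}, 0.40),
]
page_one = filter_hits(hits, {"page": 1}, score_threshold=0.5)
```

Since the filter runs after retrieval, a selective filter can return fewer than k results; fetching more candidates than needed is one way to compensate.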
Examples
- Configuration:

    from langchain_envector.config import (
        ConnectionConfig,
        EnvectorConfig,
        IndexSettings,
        KeyConfig,
    )

    cfg = EnvectorConfig(
        connection=ConnectionConfig(
            address=ENVECTOR_ADDRESS, access_token=ENVECTOR_ACCESS_TOKEN
        ),
        key=KeyConfig(
            key_path=ENVECTOR_KEY_PATH,
            key_id=ENVECTOR_KEY_ID,
            preset="ip",
            eval_mode="rmp",
        ),
        index=IndexSettings(
            index_name=INDEX_NAME, dim=vector_dim, query_encryption="cipher"
        ),
        create_if_missing=True,
    )

- Add documents (from LangChain Documents):

    from langchain_core.documents import Document
    from langchain_envector.vectorstore import Envector

    docs = [
        Document(
            page_content="chunk-1",
            metadata={"source": "paper.pdf", "page": 1, "chunk": 0},
        ),
        Document(
            page_content="chunk-2",
            metadata={"source": "paper.pdf", "page": 1, "chunk": 1},
        ),
    ]

    # emb is any LangChain embeddings object; cfg is the config from above.
    store = Envector(config=cfg, embeddings=emb)
    store.add_documents(docs)

  The method add_texts is also available to store texts.

- Similarity search:

    results = store.similarity_search_with_score(query, k=3)
    for doc, score in results:
        print(f"* [SIM={score:.3f}] {doc.page_content} [{doc.metadata}]")

  The methods similarity_search and similarity_search_with_vector (with embeddings.embed_query()) are also available to perform vector search.
Troubleshooting
- Connection issues: verify the EnVector address and registered keys.
- Embeddings mismatch: ensure the embedding dimension equals index.dim when supplying vectors.
- Unexpected raw strings: confirm inserts used the JSON envelope.
- Key issues: check that the local key's metadata matches the registered key.
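For the embeddings mismatch case, a quick check before inserting pre-computed vectors can pinpoint the offending rows. This is a minimal sketch; check_dims is a hypothetical helper, not part of the package.

```python
from typing import List, Sequence


def check_dims(vectors: Sequence[Sequence[float]], expected_dim: int) -> List[int]:
    """Return the positions of vectors whose length differs from expected_dim."""
    return [i for i, vec in enumerate(vectors) if len(vec) != expected_dim]


vectors = [[0.1] * 384, [0.2] * 384, [0.3] * 768]
bad = check_dims(vectors, expected_dim=384)
if bad:
    print(f"vectors at positions {bad} do not match the index dimension")
```

A common cause is switching embedding models without recreating the index, since models often produce different output sizes.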
Testing Without EnVector
- Run unit tests offline (no EnVector or SDK required):
    python -m pytest -q -m "not integration"
  or
    python scripts/run_unit_tests.py
- Run integration tests (requires server and keys):
  - Export ENVECTOR_ADDRESS, ENVECTOR_KEY_PATH, ENVECTOR_KEY_ID
  - Optional: ENVECTOR_USE_EMBEDDINGS=1, ENVECTOR_EMB_MODEL, ENVECTOR_USE_HF_DATASET=1
    python -m pytest -q -m integration -s
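Before launching the integration tests, it can help to confirm the required variables are set. The sketch below uses only the variable names listed above; missing_vars is a hypothetical helper and is not part of the repository's test suite.

```python
import os
from typing import List, Mapping

# Required variables, as listed in this README.
REQUIRED = ("ENVECTOR_ADDRESS", "ENVECTOR_KEY_PATH", "ENVECTOR_KEY_ID")


def missing_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or empty in env."""
    return [name for name in REQUIRED if not env.get(name)]


example_env = {"ENVECTOR_ADDRESS": "localhost:50050", "ENVECTOR_KEY_ID": "demo"}
print(missing_vars(example_env))
```

A conftest.py fixture could call this and skip the integration tests via pytest.skip when the list is non-empty, so a missing variable shows up as a clear skip reason instead of a connection error.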
Contributing
See CONTRIBUTE.md for development, testing, and PR guidelines.
File details
Details for the file langchain_envector-0.1.3-py3-none-any.whl.
File metadata
- Download URL: langchain_envector-0.1.3-py3-none-any.whl
- Upload date:
- Size: 11.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 568fbc0feb63e9982ef276bfc93fa0fe2e06ccbbe0716067a4f16dee5743f0c3 |
| MD5 | e6b7cd09f37211e3787e4c089745e4a3 |
| BLAKE2b-256 | fa950024a10efb8d6d87631024e5cdad1643a7c8c78602630b8bd3abadde1234 |