High-Speed Vector Database for Fast, Efficient ANN Search with LangChain
Endee LangChain Integration
LangChain vector store integration for Endee.
For Endee setup, features, and server docs see docs.endee.io.
Sections: Setup | Dense | Hybrid | Filters | RAG Chain
1. Setup
Install
```bash
pip install langchain-endee endee endee-model
```
Pick an embedding model:
```bash
# Option A: Local (no API key)
pip install langchain-huggingface sentence-transformers

# Option B: OpenAI
pip install langchain-openai
```
For hybrid search with SPLADE (optional):
```bash
pip install fastembed
```
Endee Serverless
Create a token at app.endee.io. See docs for details.
```python
from langchain_endee import EndeeVectorStore
from langchain_core.documents import Document
from endee import Precision
from langchain_huggingface import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
DIMENSION = 384

# Or OpenAI:
# from langchain_openai import OpenAIEmbeddings
# embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
# DIMENSION = 1536

vector_store = EndeeVectorStore(
    embedding=embeddings,
    api_token="your-token",  # from app.endee.io
    index_name="my_index",
    dimension=DIMENSION,
)
```
Endee Local (Docker)
Run Endee locally with Docker; no token is needed. See the GitHub repo for setup.
```bash
docker run -p 8000:8080 -v endee-data:/data endee-oss:latest
```
The API is served at /api/v1, so pass base_url pointing to that path:
```python
vector_store = EndeeVectorStore(
    embedding=embeddings,
    index_name="local_index",
    dimension=DIMENSION,
    base_url="http://localhost:8000/api/v1",  # local server, no token needed
)
```
base_url works with all factory methods:
```python
# from_documents
store = EndeeVectorStore.from_documents(
    documents=documents,
    embedding=embeddings,
    index_name="my_index",
    dimension=DIMENSION,
    base_url="http://localhost:8000/api/v1",
)

# from_existing_index
store = EndeeVectorStore.from_existing_index(
    index_name="my_index",
    embedding=embeddings,
    base_url="http://localhost:8000/api/v1",
)
```
Ingest Documents
Each LangChain Document has page_content (the text to embed) and metadata (key-value pairs for filtering).
```python
documents = [
    Document(
        page_content="Python is a high-level programming language known for readability.",
        metadata={"topic": "programming", "language": "python"},
    ),
    Document(
        page_content="Rust is a systems language focused on safety and speed.",
        metadata={"topic": "programming", "language": "rust"},
    ),
    Document(
        page_content="Machine learning gives systems the ability to learn from data.",
        metadata={"topic": "ai", "field": "ml"},
    ),
    Document(
        page_content="Vector databases store embeddings for fast similarity search.",
        metadata={"topic": "database", "type": "vector"},
    ),
    Document(
        page_content="RAG enhances LLM responses by retrieving relevant documents first.",
        metadata={"topic": "ai", "field": "rag"},
    ),
]
```
There are three ways to insert:
from_documents() — create index + insert Document objects
```python
vector_store = EndeeVectorStore.from_documents(
    documents=documents,
    embedding=embeddings,
    api_token="your-token",
    index_name="my_index",
    dimension=DIMENSION,
    space_type="cosine",
    precision=Precision.INT16,
    force_recreate=True,
)
```
from_texts() — create index + insert raw strings
```python
vector_store = EndeeVectorStore.from_texts(
    texts=[
        "Python is a high-level programming language.",
        "Rust is a systems language focused on safety.",
    ],
    metadatas=[
        {"topic": "programming", "language": "python"},
        {"topic": "programming", "language": "rust"},
    ],
    embedding=embeddings,
    api_token="your-token",
    index_name="my_index",
    dimension=DIMENSION,
)
```
add_texts() — insert into an existing store
```python
new_ids = vector_store.add_texts(
    texts=[
        "Go is designed for scalable services.",
        "TypeScript adds static typing to JavaScript.",
    ],
    metadatas=[
        {"topic": "programming", "language": "go"},
        {"topic": "programming", "language": "typescript"},
    ],
    batch_size=1000,           # vectors per upsert (max 1000)
    embedding_chunk_size=100,  # texts per embedding API call
)
print(f"Inserted IDs: {new_ids}")
```
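The two knobs interact like this: texts are first grouped into embedding_chunk_size-sized chunks for the embedding model, and the resulting vectors are then upserted batch_size at a time. A pure-Python sketch of that partitioning (illustrative only, not the library's internals):

```python
# Illustrative sketch of how embedding_chunk_size and batch_size would
# partition a large ingest; not the library's actual implementation.
def chunked(items, size):
    """Yield successive chunks of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

texts = [f"doc {i}" for i in range(2500)]

embedding_calls = list(chunked(texts, 100))   # 25 embedding API calls
upsert_batches = list(chunked(texts, 1000))   # 3 upserts (1000 + 1000 + 500)

print(len(embedding_calls), len(upsert_batches))  # 25 3
```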
Reconnect to an Existing Index
Use from_existing_index() to reconnect without re-ingesting — ideal for production.
```python
vector_store = EndeeVectorStore.from_existing_index(
    index_name="my_index",
    embedding=embeddings,
    api_token="your-token",
)
```
2. Dense Search
similarity_search()
```python
results = vector_store.similarity_search(query="How does RAG work?", k=3)
for doc in results:
    print(f"[{doc.metadata.get('topic')}] {doc.page_content[:70]}")
```
similarity_search_with_score()
```python
scored = vector_store.similarity_search_with_score(query="neural networks", k=3)
for doc, score in scored:
    print(f"sim={score:.3f} {doc.page_content[:60]}")
```
similarity_search_by_vector()
```python
query_vec = embeddings.embed_query("programming language safety")

# Dense mode
results = vector_store.similarity_search_by_vector(embedding=query_vec, k=2)

# Hybrid mode — sparse_indices and sparse_values must be supplied
# (omitting them logs a warning and falls back to dense-only).
# `sparse` and `hybrid_store` are created in the Hybrid Search section below.
sparse_vec = sparse.embed_query("programming language safety")
results = hybrid_store.similarity_search_by_vector(
    embedding=query_vec,
    sparse_indices=sparse_vec.indices,
    sparse_values=sparse_vec.values,
    k=2,
)
```
similarity_search_by_vector_with_score()
```python
scored_by_vec = vector_store.similarity_search_by_vector_with_score(
    embedding=query_vec,
    k=3,
    filter=[{"topic": {"$eq": "programming"}}],
)
for doc, score in scored_by_vec:
    print(f"sim={score:.3f} {doc.page_content[:65]}")
```
Search tuning
See Endee docs for details on ef, prefilter_cardinality_threshold, and filter_boost_percentage.
```python
results = vector_store.similarity_search(
    query="vector search",
    k=10,
    ef=256,
    filter=[{"topic": {"$eq": "database"}}],
    prefilter_cardinality_threshold=5_000,
    filter_boost_percentage=20,
    include_vectors=False,
)
```
as_retriever()
```python
retriever = vector_store.as_retriever(search_kwargs={"k": 3})
docs = retriever.invoke("What are vector databases used for?")
```
3. Hybrid Search
Pass retrieval_mode=RetrievalMode.HYBRID and a sparse_embedding to enable hybrid search. The correct sparse_model is auto-detected.
Sparse Embedding Classes
| Class | Model | Install |
|---|---|---|
| `EndeeModelSparse` | Native BM25 (recommended) | included with `endee-model` |
| `FastEmbedSparse` | SPLADE (neural) | `pip install fastembed` |
Create a Hybrid Store
```python
from langchain_endee import EndeeVectorStore, EndeeModelSparse, FastEmbedSparse, RetrievalMode

# Option A: EndeeModelSparse (recommended)
sparse = EndeeModelSparse()

# Option B: FastEmbedSparse with SPLADE
# sparse = FastEmbedSparse()

hybrid_store = EndeeVectorStore.from_documents(
    documents=documents,
    embedding=embeddings,
    api_token="your-token",
    index_name="hybrid_index",
    dimension=DIMENSION,
    space_type="cosine",
    retrieval_mode=RetrievalMode.HYBRID,
    sparse_embedding=sparse,
    force_recreate=True,
)
```
All search methods automatically use both dense and sparse:
```python
results = hybrid_store.similarity_search("vector database semantic search", k=3)
```
RRF Tuning
See Endee docs for details on Reciprocal Rank Fusion.
```python
results = hybrid_store.similarity_search_with_score(
    query="vector database semantic search",
    k=3,
    rrf_rank_constant=60,
    dense_rrf_weight=0.7,
)
```
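In classic Reciprocal Rank Fusion, each document's fused score is a weighted sum of 1 / (rank_constant + rank) over the dense and sparse result lists. The sketch below illustrates that standard formula using the parameters above; it is not Endee's exact server-side implementation, and the assumption that dense_rrf_weight applies to the dense list (with the remainder going to sparse) should be confirmed in the Endee docs.

```python
# Classic RRF sketch (assumed, not guaranteed, to mirror Endee's fusion).
def rrf_fuse(dense_ranked, sparse_ranked, rank_constant=60, dense_weight=0.7):
    """Fuse two ranked ID lists; higher fused score ranks first."""
    scores = {}
    for rank, doc_id in enumerate(dense_ranked, start=1):
        scores[doc_id] = scores.get(doc_id, 0.0) + dense_weight / (rank_constant + rank)
    for rank, doc_id in enumerate(sparse_ranked, start=1):
        scores[doc_id] = scores.get(doc_id, 0.0) + (1 - dense_weight) / (rank_constant + rank)
    return sorted(scores, key=scores.get, reverse=True)

fused = rrf_fuse(["a", "b", "c"], ["b", "c", "a"])
print(fused)  # ['a', 'b', 'c'] — dense order dominates at weight 0.7
```

A larger rrf_rank_constant flattens the rank contribution, so lower-ranked documents matter relatively more; dense_rrf_weight shifts the balance between the dense and sparse lists.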
4. Filters
Pass filters as a list of dicts (AND logic). See Endee docs for filter operators ($eq, $in, $range).
Search with filters
```python
results = vector_store.similarity_search(
    query="learning from data",
    k=5,
    filter=[{"topic": {"$eq": "ai"}}],
)

results = vector_store.similarity_search(
    query="safe languages",
    k=5,
    filter=[
        {"topic": {"$eq": "programming"}},
        {"language": {"$in": ["python", "rust"]}},
    ],
)
```
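The $range operator is listed above but not demonstrated. A sketch follows; the payload shape (here assumed to take gte/lte bounds) is an assumption and should be confirmed against the Endee filter docs.

```python
# Hypothetical $range filter. The {"gte": ..., "lte": ...} shape is an
# assumption; check the Endee filter docs for the exact operator syntax.
range_filter = [
    {"topic": {"$eq": "programming"}},
    {"year": {"$range": {"gte": 2020, "lte": 2024}}},
]

# Passed like any other filter:
# results = vector_store.similarity_search(query="modern languages", k=5, filter=range_filter)
print(range_filter)
```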
Retriever with filters
```python
retriever = vector_store.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 3, "filter": [{"topic": {"$eq": "ai"}}]},
)
docs = retriever.invoke("machine learning")
```
get_by_ids()
```python
docs = vector_store.get_by_ids(["id1", "id2"])  # positional-only
```
update_filters()
Update filter metadata without re-embedding.
```python
vector_store.update_filters([
    {"id": "id1", "filter": {"topic": "updated", "priority": 1}},
])
```
delete()
```python
# Delete by IDs
vector_store.delete(ids=["id1", "id2"])

# Delete by filter
vector_store.delete(filter=[{"status": {"$eq": "expired"}}])
```
5. RAG Chain
Wire the retriever into a LangChain chain that passes retrieved context to an LLM.
```python
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough


def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


retriever = vector_store.as_retriever(search_kwargs={"k": 3})
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

prompt = ChatPromptTemplate.from_template(
    "Answer the question based only on the context below.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}"
)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

answer = rag_chain.invoke("How does vector search work?")
print(answer)
```
Works with any retriever — dense, hybrid, or filtered:
```python
# Hybrid RAG
retriever = hybrid_store.as_retriever(search_kwargs={"k": 3})

# Filtered RAG
retriever = vector_store.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 3, "filter": [{"topic": {"$eq": "ai"}}]},
)
```
Constructor Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `embedding` | `Embeddings` | required | LangChain embedding function |
| `index_name` | `str` | required | Name of the Endee index |
| `api_token` | `str \| None` | `None` | From app.endee.io (`None` for local) |
| `base_url` | `str \| None` | `None` | API base URL for local deployment (e.g. `http://localhost:8000/api/v1`) |
| `dimension` | `int \| None` | `None` | Vector dimension (required for new indexes) |
| `space_type` | `str` | `"cosine"` | `"cosine"`, `"l2"`, or `"ip"` |
| `precision` | `str` | `Precision.INT16` | See Endee docs |
| `M` | `int` | `16` | See Endee docs |
| `ef_con` | `int` | `128` | See Endee docs |
| `retrieval_mode` | `RetrievalMode` | `DENSE` | `DENSE` or `HYBRID` |
| `sparse_embedding` | `SparseEmbeddings \| None` | `None` | Sparse model for hybrid search |
| `max_text_length` | `int \| None` | auto-detected | Max text length in tokens |
| `force_recreate` | `bool` | `False` | Delete and recreate the index if it exists |
| `validate_index_config` | `bool` | `True` | Validate dimension/config on connect |
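Putting the table together, a fully specified constructor call might look like the sketch below. Values are illustrative (mostly the documented defaults), and `embeddings` is any LangChain Embeddings object as created in the Setup section.

```python
from langchain_endee import EndeeVectorStore, RetrievalMode
from endee import Precision

store = EndeeVectorStore(
    embedding=embeddings,                # any LangChain Embeddings (see Setup)
    index_name="my_index",
    api_token="your-token",              # or base_url="http://localhost:8000/api/v1" for local
    dimension=384,                       # must match the embedding model
    space_type="cosine",                 # "cosine", "l2", or "ip"
    precision=Precision.INT16,
    M=16,
    ef_con=128,
    retrieval_mode=RetrievalMode.DENSE,  # HYBRID also needs sparse_embedding
    force_recreate=False,
    validate_index_config=True,
)
```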
License
MIT License