High-Speed Vector Database for Fast, Efficient ANN Search with LangChain

Endee LangChain Integration

LangChain vector store integration for Endee.

For Endee setup, features, and server docs see docs.endee.io.

Sections: Setup | Dense | Hybrid | Filters | RAG Chain


1. Setup

Install

pip install langchain-endee endee endee-model

Pick an embedding model:

# Option A: Local (no API key)
pip install langchain-huggingface sentence-transformers

# Option B: OpenAI
pip install langchain-openai

For hybrid search with SPLADE (optional):

pip install fastembed

Endee Serverless

Create a token at app.endee.io. See docs for details.

from langchain_endee import EndeeVectorStore
from langchain_core.documents import Document
from endee import Precision

from langchain_huggingface import HuggingFaceEmbeddings
embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
DIMENSION = 384

# Or OpenAI:
# from langchain_openai import OpenAIEmbeddings
# embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
# DIMENSION = 1536

vector_store = EndeeVectorStore(
    embedding=embeddings,
    api_token="your-token",       # from app.endee.io
    index_name="my_index",
    dimension=DIMENSION,
)

Endee Local (Docker)

Run Endee locally — no token needed. See GitHub for setup.

docker run -p 8000:8080 -v endee-data:/data endee-oss:latest

The API is served at /api/v1, so pass base_url pointing to that path:

vector_store = EndeeVectorStore(
    embedding=embeddings,
    index_name="local_index",
    dimension=DIMENSION,
    base_url="http://localhost:8000/api/v1",   # local server, no token needed
)

base_url works with all factory methods:

# from_documents
store = EndeeVectorStore.from_documents(
    documents=documents,
    embedding=embeddings,
    index_name="my_index",
    dimension=DIMENSION,
    base_url="http://localhost:8000/api/v1",
)

# from_existing_index
store = EndeeVectorStore.from_existing_index(
    index_name="my_index",
    embedding=embeddings,
    base_url="http://localhost:8000/api/v1",
)

Ingest Documents

Each LangChain Document has page_content (the text to embed) and metadata (key-value pairs for filtering).

documents = [
    Document(
        page_content="Python is a high-level programming language known for readability.",
        metadata={"topic": "programming", "language": "python"},
    ),
    Document(
        page_content="Rust is a systems language focused on safety and speed.",
        metadata={"topic": "programming", "language": "rust"},
    ),
    Document(
        page_content="Machine learning gives systems the ability to learn from data.",
        metadata={"topic": "ai", "field": "ml"},
    ),
    Document(
        page_content="Vector databases store embeddings for fast similarity search.",
        metadata={"topic": "database", "type": "vector"},
    ),
    Document(
        page_content="RAG enhances LLM responses by retrieving relevant documents first.",
        metadata={"topic": "ai", "field": "rag"},
    ),
]

There are three ways to insert:

from_documents() — create index + insert Document objects

vector_store = EndeeVectorStore.from_documents(
    documents=documents,
    embedding=embeddings,
    api_token="your-token",
    index_name="my_index",
    dimension=DIMENSION,
    space_type="cosine",
    precision=Precision.INT16,
    force_recreate=True,
)

from_texts() — create index + insert raw strings

vector_store = EndeeVectorStore.from_texts(
    texts=[
        "Python is a high-level programming language.",
        "Rust is a systems language focused on safety.",
    ],
    metadatas=[
        {"topic": "programming", "language": "python"},
        {"topic": "programming", "language": "rust"},
    ],
    embedding=embeddings,
    api_token="your-token",
    index_name="my_index",
    dimension=DIMENSION,
)

add_texts() — insert into an existing store

new_ids = vector_store.add_texts(
    texts=[
        "Go is designed for scalable services.",
        "TypeScript adds static typing to JavaScript.",
    ],
    metadatas=[
        {"topic": "programming", "language": "go"},
        {"topic": "programming", "language": "typescript"},
    ],
    batch_size=1000,           # vectors per upsert (max 1000)
    embedding_chunk_size=100,  # texts per embedding API call
)
print(f"Inserted IDs: {new_ids}")
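The two batching knobs above can be read as: texts are sent to the embedding model in chunks of `embedding_chunk_size`, and the resulting vectors are upserted in batches of `batch_size`. A pure-Python sketch of that arithmetic (illustrative only, not the library's internals):

```python
from typing import Iterator, List, Tuple

def chunked(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def plan_ingest(texts: List[str], batch_size: int = 1000,
                embedding_chunk_size: int = 100) -> Tuple[int, int]:
    """Return (embedding_calls, upsert_batches) for a given text count,
    mirroring the batch_size / embedding_chunk_size comments above."""
    embedding_calls = sum(1 for _ in chunked(texts, embedding_chunk_size))
    upsert_batches = sum(1 for _ in chunked(texts, batch_size))
    return embedding_calls, upsert_batches

texts = [f"doc {i}" for i in range(2500)]
print(plan_ingest(texts))  # (25, 3): 25 embedding calls, 3 upsert batches
```

So ingesting 2,500 texts with the defaults would mean 25 embedding API calls but only 3 upsert requests.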

Reconnect to an Existing Index

Use from_existing_index() to reconnect without re-ingesting — ideal for production.

vector_store = EndeeVectorStore.from_existing_index(
    index_name="my_index",
    embedding=embeddings,
    api_token="your-token",
)

2. Dense Search

similarity_search()

results = vector_store.similarity_search(query="How does RAG work?", k=3)

for doc in results:
    print(f"[{doc.metadata.get('topic')}] {doc.page_content[:70]}")

similarity_search_with_score()

scored = vector_store.similarity_search_with_score(query="neural networks", k=3)

for doc, score in scored:
    print(f"sim={score:.3f}  {doc.page_content[:60]}")

similarity_search_by_vector()

query_vec = embeddings.embed_query("programming language safety")

# Dense mode
results = vector_store.similarity_search_by_vector(embedding=query_vec, k=2)

# Hybrid mode — sparse_indices and sparse_values must be supplied
# (omitting them logs a warning and falls back to dense-only).
# `sparse` and `hybrid_store` are created in section 3, Hybrid Search.
sparse_vec = sparse.embed_query("programming language safety")
results = hybrid_store.similarity_search_by_vector(
    embedding=query_vec,
    sparse_indices=sparse_vec.indices,
    sparse_values=sparse_vec.values,
    k=2,
)

similarity_search_by_vector_with_score()

scored_by_vec = vector_store.similarity_search_by_vector_with_score(
    embedding=query_vec,
    k=3,
    filter=[{"topic": {"$eq": "programming"}}],
)

for doc, score in scored_by_vec:
    print(f"sim={score:.3f}  {doc.page_content[:65]}")

Search tuning

See Endee docs for details on ef, prefilter_cardinality_threshold, and filter_boost_percentage.

results = vector_store.similarity_search(
    query="vector search",
    k=10,
    ef=256,
    filter=[{"topic": {"$eq": "database"}}],
    prefilter_cardinality_threshold=5_000,
    filter_boost_percentage=20,
    include_vectors=False,
)

as_retriever()

retriever = vector_store.as_retriever(search_kwargs={"k": 3})
docs = retriever.invoke("What are vector databases used for?")

3. Hybrid Search

Pass retrieval_mode=RetrievalMode.HYBRID and a sparse_embedding to enable hybrid search. The correct sparse_model is auto-detected.

Sparse Embedding Classes

Class             Model                      Install
EndeeModelSparse  Native BM25 (recommended)  included with endee-model
FastEmbedSparse   SPLADE (neural)            pip install fastembed
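Both classes produce sparse vectors as parallel (indices, values) arrays, the same shape the `sparse_indices` / `sparse_values` parameters expect. A toy hash-based encoder makes the representation concrete (illustrative only; real BM25 and SPLADE models weight terms far more carefully):

```python
from collections import Counter
from typing import List, Tuple

def toy_sparse_encode(text: str, vocab_size: int = 2**20) -> Tuple[List[int], List[float]]:
    """Toy sparse encoding: hash each token to a vocabulary index and use
    its term frequency as the value. Only the (indices, values) shape is
    the point here, not the weighting scheme."""
    counts = Counter(text.lower().split())
    pairs = sorted((hash(tok) % vocab_size, float(c)) for tok, c in counts.items())
    indices = [i for i, _ in pairs]
    values = [v for _, v in pairs]
    return indices, values

idx, vals = toy_sparse_encode("vector database vector search")
print(len(idx), len(vals))  # equal-length parallel arrays, one entry per unique token
```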

Create a Hybrid Store

from langchain_endee import EndeeVectorStore, EndeeModelSparse, FastEmbedSparse, RetrievalMode

# Option A: EndeeModelSparse (recommended)
sparse = EndeeModelSparse()

# Option B: FastEmbedSparse with SPLADE
# sparse = FastEmbedSparse()

hybrid_store = EndeeVectorStore.from_documents(
    documents=documents,
    embedding=embeddings,
    api_token="your-token",
    index_name="hybrid_index",
    dimension=DIMENSION,
    space_type="cosine",
    retrieval_mode=RetrievalMode.HYBRID,
    sparse_embedding=sparse,
    force_recreate=True,
)

All search methods automatically use both dense and sparse:

results = hybrid_store.similarity_search("vector database semantic search", k=3)

RRF Tuning

See Endee docs for details on Reciprocal Rank Fusion.

results = hybrid_store.similarity_search_with_score(
    query="vector database semantic search",
    k=3,
    rrf_rank_constant=60,
    dense_rrf_weight=0.7,
)
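The `rrf_rank_constant` and `dense_rrf_weight` parameters suggest standard weighted Reciprocal Rank Fusion: each ranked list contributes `weight / (rank_constant + rank)` per document. A pure-Python sketch of that formula (an illustration of the standard technique, not Endee's implementation):

```python
from typing import Dict, List

def weighted_rrf(dense_ids: List[str], sparse_ids: List[str],
                 rank_constant: int = 60, dense_weight: float = 0.7) -> Dict[str, float]:
    """Weighted Reciprocal Rank Fusion: each list contributes
    weight / (rank_constant + rank) per document; ranks start at 1."""
    scores: Dict[str, float] = {}
    for weight, ids in ((dense_weight, dense_ids), (1.0 - dense_weight, sparse_ids)):
        for rank, doc_id in enumerate(ids, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (rank_constant + rank)
    return scores

fused = weighted_rrf(["a", "b", "c"], ["b", "c", "a"])
ranking = sorted(fused, key=fused.get, reverse=True)
print(ranking)  # ['a', 'b', 'c'] — a's dense rank-1 contribution dominates at weight 0.7
```

Raising `dense_rrf_weight` toward 1.0 makes the fused order track the dense ranking; lowering it favors the sparse ranking.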

4. Filters

Pass filters as a list of dicts (AND logic). See Endee docs for filter operators ($eq, $in, $range).

Search with filters

results = vector_store.similarity_search(
    query="learning from data",
    k=5,
    filter=[{"topic": {"$eq": "ai"}}],
)
results = vector_store.similarity_search(
    query="safe languages",
    k=5,
    filter=[
        {"topic": {"$eq": "programming"}},
        {"language": {"$in": ["python", "rust"]}},
    ],
)
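As an illustration of the AND semantics, here is a toy in-memory evaluator for the filter shape used above. It is not the server implementation, and the `$range` operand shape (`gte`/`lte` bounds) is an assumption for the sketch:

```python
from typing import Any, Dict, List

def matches(metadata: Dict[str, Any], filters: List[Dict[str, Any]]) -> bool:
    """Evaluate a list of {field: {op: operand}} clauses with AND logic.
    Supports $eq, $in, and an assumed gte/lte shape for $range."""
    for clause in filters:
        for field, condition in clause.items():
            value = metadata.get(field)
            for op, operand in condition.items():
                if op == "$eq" and value != operand:
                    return False
                if op == "$in" and value not in operand:
                    return False
                if op == "$range":
                    if "gte" in operand and (value is None or value < operand["gte"]):
                        return False
                    if "lte" in operand and (value is None or value > operand["lte"]):
                        return False
    return True

meta = {"topic": "programming", "language": "rust"}
print(matches(meta, [{"topic": {"$eq": "programming"}},
                     {"language": {"$in": ["python", "rust"]}}]))  # True
```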

Retriever with filters

retriever = vector_store.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 3, "filter": [{"topic": {"$eq": "ai"}}]},
)
docs = retriever.invoke("machine learning")

get_by_ids()

docs = vector_store.get_by_ids(["id1", "id2"])  # positional-only

update_filters()

Update filter metadata without re-embedding.

vector_store.update_filters([
    {"id": "id1", "filter": {"topic": "updated", "priority": 1}},
])

delete()

# Delete by IDs
vector_store.delete(ids=["id1", "id2"])

# Delete by filter
vector_store.delete(filter=[{"status": {"$eq": "expired"}}])

5. RAG Chain

Wire the retriever into a LangChain chain that passes retrieved context to an LLM.

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough


def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


retriever = vector_store.as_retriever(search_kwargs={"k": 3})
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

prompt = ChatPromptTemplate.from_template(
    "Answer the question based only on the context below.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}"
)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

answer = rag_chain.invoke("How does vector search work?")
print(answer)

Works with any retriever — dense, hybrid, or filtered:

# Hybrid RAG
retriever = hybrid_store.as_retriever(search_kwargs={"k": 3})

# Filtered RAG
retriever = vector_store.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 3, "filter": [{"topic": {"$eq": "ai"}}]},
)

Constructor Parameters

Parameter              Type                      Default          Description
embedding              Embeddings                required         LangChain embedding function
index_name             str                       required         Name of the Endee index
api_token              str | None                None             Token from app.endee.io (None for local)
base_url               str | None                None             API base URL for local deployment (e.g. http://localhost:8000/api/v1)
dimension              int | None                None             Vector dimension (required for new indexes)
space_type             str                       "cosine"         "cosine", "l2", or "ip"
precision              Precision                 Precision.INT16  See Endee docs
M                      int                       16               See Endee docs
ef_con                 int                       128              See Endee docs
retrieval_mode         RetrievalMode             DENSE            DENSE or HYBRID
sparse_embedding       SparseEmbeddings | None   None             Sparse model for hybrid search
max_text_length        int | None                auto-detected    Maximum text length in tokens
force_recreate         bool                      False            Delete and recreate the index if it exists
validate_index_config  bool                      True             Validate dimension/config on connect
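The three `space_type` values name standard similarity measures: cosine similarity, Euclidean (l2) distance, and inner product (ip). A small pure-Python sketch of each, for intuition only:

```python
import math
from typing import List

def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity: dot product of the normalized vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def l2(a: List[float], b: List[float]) -> float:
    """Euclidean distance: lower means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def ip(a: List[float], b: List[float]) -> float:
    """Inner product: equals cosine similarity when vectors are unit-length."""
    return sum(x * y for x, y in zip(a, b))

a, b = [1.0, 0.0], [0.6, 0.8]  # both unit-length, so cosine == ip here
print(round(cosine(a, b), 2), round(l2(a, b), 2), round(ip(a, b), 2))  # 0.6 0.89 0.6
```

Note the direction of each measure: cosine and ip are similarities (higher is closer), while l2 is a distance (lower is closer).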

License

MIT License
