Vector Database for Fast ANN Searches

Endee + LlamaIndex Integration

llama-index-vector-stores-endee connects LlamaIndex to the Endee vector database — so you can use LlamaIndex's retrievers, query engines, and filters backed by Endee.

New to Endee? See the Endee Quick Start or the GitHub repo.

New to LlamaIndex? See the LlamaIndex docs.


1. Setup

Installation

pip install llama-index-vector-stores-endee

This installs endee, endee_model, llama-index, and pydantic as core dependencies.
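
If you plan to use the SPLADE sparse encoder (see section 3), also install the optional extra:

pip install llama-index-vector-stores-endee[splade]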

Connecting to Endee

With an API token (Endee Cloud)

Get a token from the Endee Quick Start.

import os
from llama_index.core import Document, StorageContext, VectorStoreIndex, Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index_endee import EndeeVectorStore

Settings.embed_model = HuggingFaceEmbedding(model_name="sentence-transformers/all-MiniLM-L6-v2")

# EndeeVectorStore.from_params() → creates a new index or reconnects to an existing one
vector_store = EndeeVectorStore.from_params(
    api_token=os.getenv("ENDEE_API_TOKEN"),
    index_name="my_index",
    dimension=384,
)

Without a token (local Endee server)

Set up a local server using the Endee GitHub repo.

# No api_token needed for local Endee
vector_store = EndeeVectorStore.from_params(
    index_name="my_index",
    dimension=384,
)

Reconnecting to an existing index

If the index already exists, from_params reconnects — no data loss. You don't need to pass dimension:

vector_store = EndeeVectorStore.from_params(
    api_token="your-token",
    index_name="my_existing_index",
)
# VectorStoreIndex.from_vector_store() → loads existing index for querying
index = VectorStoreIndex.from_vector_store(vector_store)

from_params Parameters

Parameter       Description                                                   Default
api_token       Endee API token (get one)                                     None (local config)
index_name      Index name                                                    required
dimension       Vector dimension (must match your embedding model)           required for new indexes
sparse_model    None (dense), "endee_bm25" (BM25), or "default" (SPLADE)      None
batch_size      Vectors per upsert                                            100

For Endee-specific index parameters (space_type, precision, M, ef_con), see the Endee docs.


2. Dense Search

The default mode when sparse_model is not set.

import os
from llama_index.core import Document, StorageContext, VectorStoreIndex, Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index_endee import EndeeVectorStore

Settings.embed_model = HuggingFaceEmbedding(model_name="sentence-transformers/all-MiniLM-L6-v2")

# EndeeVectorStore.from_params() → creates or reconnects to index
vector_store = EndeeVectorStore.from_params(
    api_token=os.getenv("ENDEE_API_TOKEN"),
    index_name="dense_demo",
    dimension=384,
)

# VectorStoreIndex.from_documents() → chunks, embeds, and calls vector_store.add()
documents = [
    Document(text="Endee is a vector database for AI search.", metadata={"category": "database"}),
    Document(text="LlamaIndex is a data framework for LLM apps.", metadata={"category": "ai"}),
]
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

# index.as_retriever().retrieve() → calls vector_store.query()
results = index.as_retriever(similarity_top_k=3).retrieve("Tell me about vector databases")
for node in results:
    print(f"{node.get_score():.4f} | {node.text}")

Loading many documents

from llama_index.core import SimpleDirectoryReader

# Load all files from a directory (PDF, TXT, CSV, etc.)
documents = SimpleDirectoryReader("./data").load_data()

storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)
print(f"Indexed {len(documents)} documents")

Documents are automatically chunked, embedded, and upserted in batches (default batch_size=100).
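
To tune that pipeline, set the node parser globally and the upsert batch size on the store. A minimal sketch; the chunk sizes shown are illustrative values, not recommendations:

from llama_index.core import Settings
from llama_index.core.node_parser import SentenceSplitter

# Control how documents are chunked before embedding (illustrative values)
Settings.node_parser = SentenceSplitter(chunk_size=512, chunk_overlap=50)

# Control how many vectors are sent to Endee per upsert call
vector_store = EndeeVectorStore.from_params(
    api_token=os.getenv("ENDEE_API_TOKEN"),
    index_name="dense_demo",
    dimension=384,
    batch_size=200,
)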


3. Dense + Sparse Search

Set sparse_model to enable dense + sparse search.

sparse_model value   Encoder                   Install
"endee_bm25"         BM25 via endee_model      included (core dependency)
"default"            SPLADE++ via fastembed    pip install llama-index-vector-stores-endee[splade]

import os
from llama_index.core import Document, StorageContext, VectorStoreIndex, Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index_endee import EndeeVectorStore

Settings.embed_model = HuggingFaceEmbedding(model_name="sentence-transformers/all-MiniLM-L6-v2")

# EndeeVectorStore.from_params(sparse_model=...) → creates sparse-enabled index
# BM25 sparse search (works out of the box)
vector_store = EndeeVectorStore.from_params(
    api_token=os.getenv("ENDEE_API_TOKEN"),
    index_name="bm25_demo",
    dimension=384,
    sparse_model="endee_bm25",
)

# SPLADE sparse search (requires fastembed)
# vector_store = EndeeVectorStore.from_params(
#     api_token=os.getenv("ENDEE_API_TOKEN"),
#     index_name="splade_demo",
#     dimension=384,
#     sparse_model="default",
# )

# VectorStoreIndex.from_documents() → chunks, embeds, computes sparse vectors, and calls vector_store.add()
documents = [
    Document(text="Endee is a vector database for AI search.", metadata={"category": "database"}),
    Document(text="LlamaIndex is a data framework for LLM apps.", metadata={"category": "ai"}),
]
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

# index.as_retriever().retrieve() → calls vector_store.query() with both dense and sparse vectors
results = index.as_retriever(similarity_top_k=3).retrieve("Tell me about vector databases")
for node in results:
    print(f"{node.get_score():.4f} | {node.text}")

Control dense vs sparse balance

Use vector_store_kwargs to pass dense_rrf_weight to vector_store.query():

for weight, label in [(1.0, "dense-only"), (0.5, "balanced"), (0.0, "sparse-only")]:
    retriever = index.as_retriever(
        similarity_top_k=3,
        vector_store_kwargs={"dense_rrf_weight": weight},  # passed to vector_store.query()
    )
    results = retriever.retrieve("privacy vector search")
    print(f"  {label}: {results[0].get_score():.4f} | {results[0].text[:60]}...")

dense_rrf_weight   Effect
1.0                Dense only
0.5                Balanced (default)
0.0                Sparse only

4. Metadata Filtering

Pass filters to as_retriever() — they are converted and forwarded to vector_store.query().

from llama_index.core.vector_stores.types import MetadataFilters, MetadataFilter, FilterOperator

# EQ — exact match
filters = MetadataFilters(
    filters=[MetadataFilter(key="category", value="ai", operator=FilterOperator.EQ)]
)
results = index.as_retriever(similarity_top_k=2, filters=filters).retrieve("machine learning")

# IN — match any in list
filters = MetadataFilters(
    filters=[MetadataFilter(key="category", value=["ai", "database"], operator=FilterOperator.IN)]
)
results = index.as_retriever(similarity_top_k=3, filters=filters).retrieve("vector search")

# Multiple filters (AND logic)
filters = MetadataFilters(filters=[
    MetadataFilter(key="category", value="database", operator=FilterOperator.EQ),
    MetadataFilter(key="type", value="vector", operator=FilterOperator.EQ),
])
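
The multi-filter example above only builds the filters; a brief sketch of running them through a retriever follows (combining them with vector_store_kwargs is optional, and dense_rrf_weight is only meaningful on a sparse-enabled index):

# Filters and vector_store_kwargs can be passed to the same retriever
retriever = index.as_retriever(
    similarity_top_k=3,
    filters=filters,
    vector_store_kwargs={"dense_rrf_weight": 0.5},  # optional; only meaningful when sparse_model is set
)
for node in retriever.retrieve("vector search"):
    print(f"{node.get_score():.4f} | {node.metadata.get('category')} | {node.text[:60]}")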

Supported operators: EQ and IN.


Additional Features

Query Tuning

Pass tuning parameters via vector_store_kwargs — they are forwarded to vector_store.query():

Parameter          Description                                                      Default
dense_rrf_weight   Dense (1.0) vs sparse (0.0) balance when sparse_model is set     0.5
include_vectors    False to skip returning embeddings                               True
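
For example, to skip returning the stored embeddings when you only need scores and text:

# include_vectors=False → the query response omits the stored embeddings
retriever = index.as_retriever(
    similarity_top_k=5,
    vector_store_kwargs={"include_vectors": False},
)
results = retriever.retrieve("vector databases")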

For Endee-specific query parameters (ef, prefilter_cardinality_threshold, filter_boost_percentage, rrf_rank_constant), see the Endee docs.

Vector Operations

These methods call the Endee SDK directly, bypassing LlamaIndex's query engine:

Method                                                          What it does
vector_store.fetch(["node-id-1"])                               Fetch vectors by ID → calls Index.get_vector()
vector_store.update_filters([{"id": "...", "filter": {...}}])   Update filter metadata → calls Index.update_filters()
vector_store.delete_vector("node-id-1")                         Delete by vector ID → calls Index.delete_vector()
vector_store.delete(ref_doc_id="doc-uuid")                      Delete by source document → calls Index.delete_with_filter()
vector_store.describe()                                         Index metadata → calls Index.describe()
vector_store.client                                             Direct access to the Endee SDK Index object
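
A short usage sketch of these calls, built from the signatures in the table (the IDs and the filter payload are placeholders):

# Inspect the index (dimension, vector count, and other metadata)
print(vector_store.describe())

# Fetch stored vectors by node ID, then update their filter metadata
fetched = vector_store.fetch(["node-id-1"])
vector_store.update_filters([{"id": "node-id-1", "filter": {"category": "archived"}}])

# Delete a single vector, or everything that came from one source document
vector_store.delete_vector("node-id-1")
vector_store.delete(ref_doc_id="doc-uuid")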

How LlamaIndex maps to EndeeVectorStore

LlamaIndex call                               EndeeVectorStore method     Endee SDK call
VectorStoreIndex.from_documents(docs, ...)    vector_store.add(nodes)     Index.upsert()
index.as_retriever().retrieve("query")        vector_store.query(query)   Index.query()
EndeeVectorStore.from_params(...)             creates or reconnects       Endee.create_index() / Endee.get_index()
