Vector Database for Fast ANN Searches

Endee + LlamaIndex Integration

llama-index-vector-stores-endee connects LlamaIndex to the Endee vector database — so you can use LlamaIndex's retrievers, query engines, and filters backed by Endee.

New to Endee? See the Endee Quick Start or the GitHub repo.

New to LlamaIndex? See the LlamaIndex docs.


1. Setup

Installation

pip install llama-index-vector-stores-endee

This installs endee, endee_model, llama-index, and pydantic as core dependencies.

Connecting to Endee

With an API token (Endee Cloud)

Get a token from the Endee Quick Start.

import os
from llama_index.core import Document, StorageContext, VectorStoreIndex, Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index_endee import EndeeVectorStore

Settings.embed_model = HuggingFaceEmbedding(model_name="sentence-transformers/all-MiniLM-L6-v2")

# EndeeVectorStore.from_params() → creates a new index or reconnects to an existing one
vector_store = EndeeVectorStore.from_params(
    api_token=os.getenv("ENDEE_API_TOKEN"),
    index_name="my_index",
    dimension=384,
)

Without a token (local Endee server)

Set up a local server using the Endee GitHub repo.

# No api_token needed for local Endee
vector_store = EndeeVectorStore.from_params(
    index_name="my_index",
    dimension=384,
)

Reconnecting to an existing index

If the index already exists, from_params reconnects — no data loss. You don't need to pass dimension:

vector_store = EndeeVectorStore.from_params(
    api_token="your-token",
    index_name="my_existing_index",
)
# VectorStoreIndex.from_vector_store() → loads existing index for querying
index = VectorStoreIndex.from_vector_store(vector_store)

from_params Parameters

| Parameter | Description | Default |
| --- | --- | --- |
| api_token | Endee API token (get one) | None (local config) |
| index_name | Index name | required |
| dimension | Vector dimension (must match your embedding model) | required for new indexes |
| sparse_model | None (dense), "endee_bm25" (BM25), "default" (SPLADE) | None |
| batch_size | Vectors per upsert | 100 |

For Endee-specific index parameters (space_type, precision, M, ef_con), see the Endee docs.


2. Dense Search

Dense search is the default mode when sparse_model is not set.

import os
from llama_index.core import Document, StorageContext, VectorStoreIndex, Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index_endee import EndeeVectorStore

Settings.embed_model = HuggingFaceEmbedding(model_name="sentence-transformers/all-MiniLM-L6-v2")

# EndeeVectorStore.from_params() → creates or reconnects to index
vector_store = EndeeVectorStore.from_params(
    api_token=os.getenv("ENDEE_API_TOKEN"),
    index_name="dense_demo",
    dimension=384,
)

# VectorStoreIndex.from_documents() → chunks, embeds, and calls vector_store.add()
documents = [
    Document(text="Endee is a vector database for AI search.", metadata={"category": "database"}),
    Document(text="LlamaIndex is a data framework for LLM apps.", metadata={"category": "ai"}),
]
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

# index.as_retriever().retrieve() → calls vector_store.query()
results = index.as_retriever(similarity_top_k=3).retrieve("Tell me about vector databases")
for node in results:
    print(f"{node.get_score():.4f} | {node.text}")

Loading many documents

from llama_index.core import SimpleDirectoryReader

# Load all files from a directory (PDF, TXT, CSV, etc.)
documents = SimpleDirectoryReader("./data").load_data()

storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)
print(f"Indexed {len(documents)} documents")

Documents are automatically chunked, embedded, and upserted in batches (default batch_size=100).
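Conceptually, the batching works like simple list slicing — here is a pure-Python sketch of the grouping behavior (an illustration, not the library's actual internals):

```python
def batched(items, batch_size=100):
    """Yield successive batches of at most batch_size items,
    mirroring how nodes are grouped before each upsert call."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

# 250 chunks with the default batch_size=100 → three upserts: 100, 100, 50
chunks = [f"chunk-{i}" for i in range(250)]
print([len(b) for b in batched(chunks)])  # [100, 100, 50]
```

Raising batch_size reduces the number of round trips to Endee at the cost of larger request payloads.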


3. Dense + Sparse Search

Set sparse_model to enable dense + sparse search.

| sparse_model value | Encoder | Install |
| --- | --- | --- |
| "endee_bm25" | BM25 via endee_model | included (core dep) |
| "default" | SPLADE++ via fastembed | pip install llama-index-vector-stores-endee[splade] |

import os
from llama_index.core import Document, StorageContext, VectorStoreIndex, Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index_endee import EndeeVectorStore

Settings.embed_model = HuggingFaceEmbedding(model_name="sentence-transformers/all-MiniLM-L6-v2")

# EndeeVectorStore.from_params(sparse_model=...) → creates sparse-enabled index
# BM25 sparse search (works out of the box)
vector_store = EndeeVectorStore.from_params(
    api_token=os.getenv("ENDEE_API_TOKEN"),
    index_name="bm25_demo",
    dimension=384,
    sparse_model="endee_bm25",
)

# SPLADE sparse search (requires fastembed)
# vector_store = EndeeVectorStore.from_params(
#     api_token=os.getenv("ENDEE_API_TOKEN"),
#     index_name="splade_demo",
#     dimension=384,
#     sparse_model="default",
# )

# VectorStoreIndex.from_documents() → chunks, embeds, computes sparse vectors, and calls vector_store.add()
documents = [
    Document(text="Endee is a vector database for AI search.", metadata={"category": "database"}),
    Document(text="LlamaIndex is a data framework for LLM apps.", metadata={"category": "ai"}),
]
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

# index.as_retriever().retrieve() → calls vector_store.query() with both dense and sparse vectors
results = index.as_retriever(similarity_top_k=3).retrieve("Tell me about vector databases")
for node in results:
    print(f"{node.get_score():.4f} | {node.text}")

Control dense vs sparse balance

Use vector_store_kwargs to pass dense_rrf_weight to vector_store.query():

for weight, label in [(1.0, "dense-only"), (0.5, "balanced"), (0.0, "sparse-only")]:
    retriever = index.as_retriever(
        similarity_top_k=3,
        vector_store_kwargs={"dense_rrf_weight": weight},  # passed to vector_store.query()
    )
    results = retriever.retrieve("privacy vector search")
    print(f"  {label}: {results[0].get_score():.4f} | {results[0].text[:60]}...")

| dense_rrf_weight | Effect |
| --- | --- |
| 1.0 | Dense only |
| 0.5 | Balanced (default) |
| 0.0 | Sparse only |
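To build intuition for what the weight does, here is a sketch of standard weighted reciprocal rank fusion (RRF). Endee's exact fusion formula is not documented here, so treat this as an illustration of the general scheme rather than Endee's implementation; the rank constant k corresponds to the rrf_rank_constant query parameter mentioned below:

```python
def weighted_rrf(dense_ids, sparse_ids, dense_weight=0.5, k=60):
    """Fuse two ranked ID lists with weighted reciprocal rank fusion.
    Each list contributes weight / (k + rank) per document."""
    scores = {}
    for rank, doc_id in enumerate(dense_ids, start=1):
        scores[doc_id] = scores.get(doc_id, 0.0) + dense_weight / (k + rank)
    for rank, doc_id in enumerate(sparse_ids, start=1):
        scores[doc_id] = scores.get(doc_id, 0.0) + (1.0 - dense_weight) / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# dense_weight=1.0 reproduces the dense ordering; 0.0 the sparse ordering
print(weighted_rrf(["a", "b"], ["c", "a"], dense_weight=1.0)[0])  # a
print(weighted_rrf(["a", "b"], ["c", "a"], dense_weight=0.0)[0])  # c
```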

4. Metadata Filtering

Pass filters to as_retriever() — they are converted and forwarded to vector_store.query().

from llama_index.core.vector_stores.types import MetadataFilters, MetadataFilter, FilterOperator

# EQ — exact match
filters = MetadataFilters(
    filters=[MetadataFilter(key="category", value="ai", operator=FilterOperator.EQ)]
)
results = index.as_retriever(similarity_top_k=2, filters=filters).retrieve("machine learning")

# IN — match any in list
filters = MetadataFilters(
    filters=[MetadataFilter(key="category", value=["ai", "database"], operator=FilterOperator.IN)]
)
results = index.as_retriever(similarity_top_k=3, filters=filters).retrieve("vector search")

# Multiple filters (AND logic)
filters = MetadataFilters(filters=[
    MetadataFilter(key="category", value="database", operator=FilterOperator.EQ),
    MetadataFilter(key="type", value="vector", operator=FilterOperator.EQ),
])

Supported operators: EQ and IN.
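The semantics of the two operators, and of combining multiple filters with AND logic, can be sketched as a plain predicate. This is purely illustrative — the real filtering happens server-side in Endee, not client-side:

```python
def matches(metadata, filters):
    """Return True if metadata satisfies every (key, value, op) filter.
    EQ requires an exact match; IN requires membership in a list;
    multiple filters combine with AND logic."""
    for key, value, op in filters:
        if op == "EQ" and metadata.get(key) != value:
            return False
        if op == "IN" and metadata.get(key) not in value:
            return False
    return True

doc = {"category": "database", "type": "vector"}
print(matches(doc, [("category", "database", "EQ"), ("type", "vector", "EQ")]))  # True
print(matches(doc, [("category", ["ai", "database"], "IN")]))  # True
print(matches(doc, [("category", "ai", "EQ")]))  # False
```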


Additional Features

Query Tuning

Pass tuning parameters via vector_store_kwargs — they are forwarded to vector_store.query():

| Parameter | Description | Default |
| --- | --- | --- |
| dense_rrf_weight | Balance between dense (1.0) and sparse (0.0) when sparse_model is set | 0.5 |
| include_vectors | Set to False to skip returning embeddings in results | True |

For Endee-specific query parameters (ef, prefilter_cardinality_threshold, filter_boost_percentage, rrf_rank_constant), see the Endee docs.

Vector Operations

These methods call the Endee SDK directly, bypassing LlamaIndex's query engine:

| Method | What it does |
| --- | --- |
| vector_store.fetch(["node-id-1"]) | Fetch vectors by ID → calls Index.get_vector() |
| vector_store.update_filters([{"id": "...", "filter": {...}}]) | Update filter metadata → calls Index.update_filters() |
| vector_store.delete_vector("node-id-1") | Delete by vector ID → calls Index.delete_vector() |
| vector_store.delete(ref_doc_id="doc-uuid") | Delete by source document → calls Index.delete_with_filter() |
| vector_store.describe() | Index metadata → calls Index.describe() |
| vector_store.client | Direct access to the Endee SDK Index object |

How LlamaIndex maps to EndeeVectorStore

| LlamaIndex call | EndeeVectorStore method | Endee SDK call |
| --- | --- | --- |
| VectorStoreIndex.from_documents(docs, ...) | vector_store.add(nodes) | Index.upsert() |
| index.as_retriever().retrieve("query") | vector_store.query(query) | Index.query() |
| EndeeVectorStore.from_params(...) | creates or reconnects | Endee.create_index() / Endee.get_index() |
