llama-index vector_stores Apache Solr integration

Apache Solr Vector Store for LlamaIndex

A LlamaIndex VectorStore using Apache Solr as the backend.

Install

pip install llama-index
pip install llama-index-vector-stores-solr

Quickstart

Imports

from llama_index.vector_stores.solr import (
    ApacheSolrVectorStore,
    SyncSolrClient,
    AsyncSolrClient,
)
from llama_index.core import (
    VectorStoreIndex,
    StorageContext,
    Document,
    MockEmbedding,
    Settings,
)
from llama_index.core.vector_stores.types import (
    VectorStoreQuery,
    VectorStoreQueryMode,
)
from llama_index.core.schema import NodeWithScore

Setup Vector Store

SOLR_COLLECTION_URL = "http://localhost:8983/solr/my_collection"  # assumes a Solr collection is available at this URL

sync_client = SyncSolrClient(base_url=SOLR_COLLECTION_URL)
async_client = AsyncSolrClient(base_url=SOLR_COLLECTION_URL)

vector_store = ApacheSolrVectorStore(
    sync_client=sync_client,
    async_client=async_client,
    nodeid_field="id",
    content_field="content_txt_en",  # store content in a text field searchable by BM25
    embedding_field="vector_field",  # dense vector field configured in Solr schema
    metadata_to_solr_field_mapping=[
        ("author", "author_s"),
    ],
    text_search_fields=["content_txt_en"],
)
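
The embedding_field configured above must already exist in the Solr schema as a dense vector field. As a minimal sketch, assuming Solr 9 or later with a managed schema, the snippet below builds a Schema API request that defines such a field. The helper names, the field type name, and the dimension of 384 are illustrative assumptions and not part of this package; the dimension must match the output size of your embedding model.

```python
import json
from urllib import request


def dense_vector_schema_payload(field_name, dim, similarity="cosine"):
    """Build a Schema API payload that adds a DenseVectorField type and one field of that type."""
    if dim <= 0:
        raise ValueError("dim must be a positive integer")
    type_name = f"knn_vector_{dim}"  # hypothetical type name, chosen for this example
    return {
        "add-field-type": {
            "name": type_name,
            "class": "solr.DenseVectorField",
            "vectorDimension": dim,
            "similarityFunction": similarity,
        },
        "add-field": {
            "name": field_name,
            "type": type_name,
            "indexed": True,
            "stored": True,
        },
    }


def build_schema_request(collection_url, payload):
    """Wrap the payload in a POST request to <collection_url>/schema (not sent here)."""
    return request.Request(
        url=f"{collection_url.rstrip('/')}/schema",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# 384 is a placeholder dimension; replace it with your embedding model's size.
payload = dense_vector_schema_payload("vector_field", dim=384)
req = build_schema_request("http://localhost:8983/solr/my_collection", payload)
# request.urlopen(req)  # uncomment to apply the change to a running Solr instance
```

The other field names in the setup (content_txt_en, author_s) follow the dynamic-field naming patterns that Solr's default configset typically provides, so they may not need explicit definitions. Check your own schema to confirm.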

Create Index and Query

# Dummy Documents
docs = [
    Document(
        text="Apache Solr integrates with LlamaIndex.",
        metadata={"author": "alice"},
    ),
    Document(
        text="Vector search lets you find semantically similar text.",
        metadata={"author": "bob"},
    ),
]

# Create the index
storage_context = StorageContext.from_defaults(vector_store=vector_store)
# NOTE: This uses the default embedding model set at Settings.embed_model.
# Set Settings.embed_model beforehand to use a different one.
# This vector store can also be used with the IngestionPipeline.
index = VectorStoreIndex.from_documents(docs, storage_context=storage_context)

BM25 Search

  • This is a naive implementation; ideally, create a retriever by subclassing BaseRetriever.
def simple_bm25(vector_store, query_str="semantic search"):
    results = vector_store.query(
        VectorStoreQuery(
            mode=VectorStoreQueryMode.TEXT_SEARCH,
            query_str=query_str,
            sparse_top_k=2,
        )
    )
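    nodes = results.nodes or []
    # Fall back to None scores if the store returns no similarities,
    # so the strict zip below does not fail on a length mismatch.
    scores = results.similarities or [None] * len(nodes)
    return [
        NodeWithScore(node=node, score=score)
        for node, score in zip(nodes, scores, strict=True)
    ]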


bm25_search_results = simple_bm25(vector_store, query_str="semantic search")

Dense Vector Search

  • This is a naive implementation; ideally, create a retriever by subclassing BaseRetriever.
def simple_vector_search(index, query_str="semantic search"):
    # The retriever embeds the query with Settings.embed_model
    # and returns the single most similar node.
    retriever = index.as_retriever(similarity_top_k=1)
    return retriever.retrieve(query_str)


vector_search_results = simple_vector_search(index, "semantic search")

Query Engine

query_engine = index.as_query_engine(similarity_top_k=2)
# NOTE: This uses the default embedding model and LLM set at Settings.
# Set Settings.embed_model and Settings.llm to use different ones.
res = query_engine.query("semantic search")
