
A library that provides a Solr-based vector store for LangChain


Eurelis-Langchain-SolR-VectorStore

This library lets you use a Solr-based vector store with the Python version of LangChain.

Usage

This library assumes you already have a running Solr instance with a configured dense vector field:

    <fieldType name="knn_vector" class="solr.DenseVectorField" vectorDimension="768" similarityFunction="euclidean"/>
    <field name="vector" type="knn_vector" indexed="true" stored="true"/>

Be sure to set a vectorDimension value that matches the output dimension of your embeddings model.
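A mismatch between the model and the schema can be caught early by embedding a probe string and comparing the vector length with the schema value. The sketch below assumes only that the embeddings object exposes an embed_query method, as LangChain embeddings classes do; check_vector_dimension is a hypothetical helper and not part of this library.

```python
def check_vector_dimension(embeddings, expected_dim, probe="dimension probe"):
    """Raise ValueError if the model's output size differs from the Solr schema.

    `embeddings` is any object with an `embed_query(text)` method returning
    a list of floats (the interface LangChain embeddings classes expose).
    `expected_dim` is the vectorDimension declared in the Solr schema.
    """
    actual_dim = len(embeddings.embed_query(probe))
    if actual_dim != expected_dim:
        raise ValueError(
            f"Embedding model returns {actual_dim} dimensions, "
            f"but the Solr field expects {expected_dim}"
        )
    return actual_dim
```

With the schema shown above, this would be called as check_vector_dimension(embeddings, 768) before creating the vector store.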

from langchain.embeddings import HuggingFaceEmbeddings
from eurelis_langchain_solr_vectorstore import Solr

embeddings = HuggingFaceEmbeddings()  # you are free to use any embeddings method allowed by langchain

vector_store = Solr(embeddings)  # with default core configuration

You can also specify data about the solr instance and core to use:

vector_store = Solr(embeddings, core_kwargs={
        'page_content_field': 'text_t',  # field containing the text content
        'vector_field': 'vector',        # field containing the embeddings of the text content
        'core_name': 'langchain',        # core name
        'url_base': 'http://localhost:8983/solr' # base url to access solr
    })  # with custom default core configuration

The code above lists every allowed core argument together with its default value.

Metadata

The Solr-based vector store also supports storing and filtering on metadata.

Metadata are mapped into Solr using the following convention: metadata_{key}_{type}, where key is the original metadata key and type is inferred automatically from the value:

  • i for integer fields
  • d for float fields
  • s for string fields
  • b for boolean fields
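The naming convention above can be sketched as a small function. This is an illustration of the convention only, not the library's actual code; metadata_field_name is a hypothetical name.

```python
def metadata_field_name(key, value):
    """Return the Solr field name for a metadata entry, e.g. metadata_year_i."""
    # bool must be tested before int because bool is a subclass of int in Python
    if isinstance(value, bool):
        suffix = "b"
    elif isinstance(value, int):
        suffix = "i"
    elif isinstance(value, float):
        suffix = "d"
    elif isinstance(value, str):
        suffix = "s"
    else:
        raise TypeError(f"Unsupported metadata type: {type(value).__name__}")
    return f"metadata_{key}_{suffix}"


print(metadata_field_name("language", "en"))  # metadata_language_s
print(metadata_field_name("year", 2000))      # metadata_year_i
```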

The vector_search method takes an optional where parameter expecting a dict:

  • dict item key: base name of a metadata field
  • dict item value: value expected in the metadata field
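How the library turns this dict into a Solr query is not described here. One plausible translation, shown purely as an assumption, is a list of field:value clauses of the kind Solr accepts as filter queries (fq), with field names built from the metadata_{key}_{type} convention. where_to_filter_queries is a hypothetical helper.

```python
def where_to_filter_queries(where):
    """Turn a where dict into Solr-style field:value clauses (illustrative only)."""
    clauses = []
    for key, value in where.items():
        if isinstance(value, bool):  # check bool first: bool is a subclass of int
            field, term = f"metadata_{key}_b", "true" if value else "false"
        elif isinstance(value, int):
            field, term = f"metadata_{key}_i", str(value)
        elif isinstance(value, float):
            field, term = f"metadata_{key}_d", repr(value)
        elif isinstance(value, str):
            # Quote the string and escape backslashes and double quotes inside it
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            field, term = f"metadata_{key}_s", f'"{escaped}"'
        else:
            raise TypeError(f"Unsupported metadata type: {type(value).__name__}")
        clauses.append(f"{field}:{term}")
    return clauses


print(where_to_filter_queries({"language": "en", "year": 2000}))
# ['metadata_language_s:"en"', 'metadata_year_i:2000']
```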

Example using the vector store as a retriever:

retriever = vector_store.as_retriever(search_kwargs={'language': 'en', 'year': 2000})

Docker

A Docker Compose file is provided in the etc/docker folder; run it with

docker compose up -d

to launch a Solr instance with a core named langchain and a 'vector' field with 768 dimensions.
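Once the container is running, it can be useful to confirm that the core answers before indexing anything. The sketch below assumes the core exposes Solr's ping handler at /admin/ping and that the default url_base and core_name shown earlier are in use; ping_url and core_is_up are hypothetical helpers, not part of this library.

```python
import json
import urllib.error
import urllib.request


def ping_url(url_base="http://localhost:8983/solr", core_name="langchain"):
    """Build the URL of the core's ping handler (assumed to be /admin/ping)."""
    return f"{url_base.rstrip('/')}/{core_name}/admin/ping?wt=json"


def core_is_up(url_base="http://localhost:8983/solr", core_name="langchain", timeout=5.0):
    """Return True if the ping handler answers with status OK, False otherwise."""
    try:
        with urllib.request.urlopen(ping_url(url_base, core_name), timeout=timeout) as response:
            return json.load(response).get("status") == "OK"
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error, or a non-JSON response
        return False
```

Calling core_is_up() right after docker compose up -d may return False for a few seconds while Solr starts, so retrying with a short delay is a reasonable pattern.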
