
LlamaIndex Vector_Stores Integration: Vespa

Vespa.ai is an open-source big data serving engine, designed for low-latency, high-throughput serving of data and models. Many companies use Vespa.ai to serve search results, recommendations, and rankings over billions of documents and users, with response times in milliseconds.

This integration allows you to use Vespa.ai as a vector store for LlamaIndex. Vespa has integrated support for embedding inference, so you don't need to run a separate embedding service.

Hugging Face 🤗 embedders are supported, as well as SPLADE and ColBERT.
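To illustrate the basic flow, here is a minimal usage sketch. It assumes a running local Docker daemon (the default deployment target) and is illustrative rather than authoritative; see the integration's documentation for the exact API.

# pip install llama-index-vector-stores-vespa

from llama_index.core import StorageContext, VectorStoreIndex
from llama_index.core.schema import TextNode
from llama_index.vector_stores.vespa import VespaVectorStore

nodes = [
    TextNode(text="The Shawshank Redemption", metadata={"year": 1994}),
    TextNode(text="The Godfather", metadata={"year": 1972}),
]

# Initializing the store deploys the template application shown below,
# by default to a local Docker container (assumed to be available here).
vector_store = VespaVectorStore()
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Vespa embeds the stored text server-side via the template's embedder
# component. Depending on your setup, LlamaIndex may still expect an
# embed_model to be configured (e.g. via Settings) for other pipeline steps.
index = VectorStoreIndex(nodes, storage_context=storage_context)

retriever = index.as_retriever()
print(retriever.retrieve("Which movie is about hope?"))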

Abstraction level of this integration

To make it really simple to get started, we provide a template Vespa application that will be deployed upon initializing the vector store. This removes some of the complexity of setting up Vespa for the first time, but for serious use cases, we strongly recommend that you read the Vespa documentation and tailor the application to your needs.

The template

The provided template Vespa application can be seen below:

from vespa.package import (
    ApplicationPackage,
    Field,
    Schema,
    Document,
    HNSW,
    RankProfile,
    Component,
    Parameter,
    FieldSet,
    GlobalPhaseRanking,
    Function,
)

# A hybrid-search application: one "doc" schema plus an embedder component.
hybrid_template = ApplicationPackage(
    name="hybridsearch",
    schema=[
        Schema(
            name="doc",
            document=Document(
                fields=[
                    Field(name="id", type="string", indexing=["summary"]),
                    Field(
                        name="metadata", type="string", indexing=["summary"]
                    ),
                    Field(
                        name="text",
                        type="string",
                        indexing=["index", "summary"],
                        index="enable-bm25",
                        bolding=True,
                    ),
                    # Synthetic (non-document) field: Vespa embeds the text
                    # field at feed time ("input text | embed") using the
                    # embedder component defined below.
                    Field(
                        name="embedding",
                        type="tensor<float>(x[384])",
                        indexing=[
                            "input text",
                            "embed",
                            "index",
                            "attribute",
                        ],
                        ann=HNSW(distance_metric="angular"),
                        is_document_field=False,
                    ),
                ]
            ),
            fieldsets=[FieldSet(name="default", fields=["text", "metadata"])],
            # Three ranking strategies: lexical (bm25), dense (semantic),
            # and reciprocal rank fusion of the two (fusion).
            rank_profiles=[
                RankProfile(
                    name="bm25",
                    inputs=[("query(q)", "tensor<float>(x[384])")],
                    functions=[
                        Function(name="bm25sum", expression="bm25(text)")
                    ],
                    first_phase="bm25sum",
                ),
                RankProfile(
                    name="semantic",
                    inputs=[("query(q)", "tensor<float>(x[384])")],
                    first_phase="closeness(field, embedding)",
                ),
                RankProfile(
                    name="fusion",
                    inherits="bm25",
                    inputs=[("query(q)", "tensor<float>(x[384])")],
                    first_phase="closeness(field, embedding)",
                    global_phase=GlobalPhaseRanking(
                        expression="reciprocal_rank_fusion(bm25sum, closeness(field, embedding))",
                        rerank_count=1000,
                    ),
                ),
            ],
        )
    ],
    # Embedding inference runs inside Vespa via this Hugging Face embedder
    # (int8-quantized e5-small-v2 ONNX model, 384-dimensional output).
    components=[
        Component(
            id="e5",
            type="hugging-face-embedder",
            parameters=[
                Parameter(
                    "transformer-model",
                    {
                        "url": "https://github.com/vespa-engine/sample-apps/raw/master/simple-semantic-search/model/e5-small-v2-int8.onnx"
                    },
                ),
                Parameter(
                    "tokenizer-model",
                    {
                        "url": "https://raw.githubusercontent.com/vespa-engine/sample-apps/master/simple-semantic-search/model/tokenizer.json"
                    },
                ),
            ],
        )
    ],
)

Note that the fields id, metadata, text, and embedding are required for the integration to work. The schema name must also be doc, and the rank profiles must be named bm25, semantic, and fusion.
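The three rank profiles back the integration's query modes. Below is a hedged sketch of querying each, reusing the vector_store from the earlier example; the mode-to-profile mapping in the comment is our reading of the integration and worth verifying against its source.

from llama_index.core.vector_stores.types import (
    VectorStoreQuery,
    VectorStoreQueryMode,
)

# Assumed mapping: TEXT_SEARCH -> "bm25", DEFAULT -> "semantic",
# HYBRID -> "fusion".
for mode in (
    VectorStoreQueryMode.TEXT_SEARCH,
    VectorStoreQueryMode.DEFAULT,
    VectorStoreQueryMode.HYBRID,
):
    result = vector_store.query(
        VectorStoreQuery(
            query_str="Which movie is about hope?",
            mode=mode,
            similarity_top_k=3,
        )
    )
    print(mode, result.ids)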

Other than that, you are free to modify the application as you see fit: switch out the embedding model, add more fields, or change the ranking expressions; a sketch of swapping the embedding model follows below.
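For example, switching to a different Hugging Face embedding model means keeping the tensor dimension in sync with the model's output size. Here is a sketch of the two pieces of the template that change (the URLs are placeholders, not real endpoints, and x[768] is an illustrative dimension):

# Bump the dimension everywhere it appears: in the embedding field and in
# the query(q) inputs of the rank profiles (x[384] -> x[768] in this sketch).
embedding_field = Field(
    name="embedding",
    type="tensor<float>(x[768])",  # must match the embedder's output size
    indexing=["input text", "embed", "index", "attribute"],
    ann=HNSW(distance_metric="angular"),
    is_document_field=False,
)

embedder = Component(
    id="e5",  # keep the id that the schema's "embed" step resolves to
    type="hugging-face-embedder",
    parameters=[
        Parameter("transformer-model", {"url": "<your-onnx-model-url>"}),
        Parameter("tokenizer-model", {"url": "<your-tokenizer.json-url>"}),
    ],
)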

For more details, check out this Pyvespa example notebook on hybrid search.

Going to production

If you are ready to graduate to a production setup, we highly recommend checking out the Vespa Cloud service, where we manage all infrastructure and operations for you. Free trials are available.

Next steps

There are many awesome features in Vespa that are not exposed directly in this integration. Check out the Pyvespa examples for inspiration on what you can do with Vespa.

Teasers:

  • Binary + Matryoshka embeddings.
  • ColBERT.
  • ONNX models.
  • XGBoost and LightGBM models for ranking.
  • Multivector indexing.
  • and much more.
