llama-index embeddings ollama integration

LlamaIndex Embeddings Integration: Ollama

The llama-index-embeddings-ollama package contains LlamaIndex integrations for generating embeddings using Ollama, a tool for running large language models locally.

Because Ollama runs embedding models on your own machine, your text stays local, there are no per-request API fees, and you can work offline. This integration lets you use Ollama's embedding models with LlamaIndex's vector stores and retrieval systems.

Installation

To install the llama-index-embeddings-ollama package, run the following command:

pip install llama-index-embeddings-ollama

You'll also need to have Ollama installed and running on your machine. Visit ollama.ai to download and install Ollama.

Prerequisites

Before using this integration, ensure you have:

Ollama installed: Download from ollama.ai
Ollama running: Start the Ollama service (by default it listens on http://localhost:11434)
An embedding model pulled: Pull an embedding model using the Ollama CLI:

ollama pull nomic-embed-text
# or
ollama pull embeddinggemma
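
Before wiring the integration into an application, it can help to confirm that the server is reachable and that the model has actually been pulled. The sketch below does this with only the standard library, assuming Ollama's REST endpoint GET /api/tags, which lists locally available models. The helper names extract_model_names and list_local_models are illustrative and not part of this package.

```python
import json
from urllib.request import urlopen


def extract_model_names(payload: dict) -> list[str]:
    """Return the model names from a decoded /api/tags response."""
    return [model["name"] for model in payload.get("models", [])]


def list_local_models(base_url: str = "http://localhost:11434") -> list[str]:
    """Ask a running Ollama server which models it has pulled."""
    url = f"{base_url.rstrip('/')}/api/tags"
    with urlopen(url, timeout=5) as resp:
        return extract_model_names(json.load(resp))


# Requires a running Ollama server:
# print(list_local_models())
```

The parsing step is kept separate from the network call so it can be exercised without a server.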

Basic Usage

Simple Embedding Generation

from llama_index.embeddings.ollama import OllamaEmbedding

# Initialize the embedding model
embed_model = OllamaEmbedding(
    model_name="nomic-embed-text",  # or "embeddinggemma"
    base_url="http://localhost:11434",  # default Ollama URL
)

# Generate an embedding for a single text
text_embedding = embed_model.get_text_embedding("Hello, world!")
print(f"Embedding dimension: {len(text_embedding)}")

# Generate an embedding for a query
query_embedding = embed_model.get_query_embedding("What is AI?")

Batch Embedding Generation

# Generate embeddings for multiple texts at once
texts = [
    "The capital of France is Paris.",
    "Python is a programming language.",
    "Machine learning is a subset of AI.",
]

embeddings = embed_model.get_text_embeddings(texts)
print(f"Generated {len(embeddings)} embeddings")
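
Each embedding is a plain list of floats, so the vectors returned above can be compared directly. A common way to do that is cosine similarity; the following is a minimal standard-library sketch, and both helper names are illustrative rather than part of the package.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return document indices ordered from most to least similar."""
    scores = [cosine_similarity(query_vec, doc) for doc in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


# With a running server, for example:
# order = rank_by_similarity(embed_model.get_query_embedding("What is AI?"), embeddings)
```

In practice a vector store performs this ranking for you; the sketch only illustrates what the embeddings are used for.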

Integration with LlamaIndex

Using with VectorStoreIndex

The most common use case is to integrate Ollama embeddings with LlamaIndex's vector store:

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.embeddings.ollama import OllamaEmbedding

# Set the embedding model globally
Settings.embed_model = OllamaEmbedding(
    model_name="nomic-embed-text",
    base_url="http://localhost:11434",
)

# Load documents
documents = SimpleDirectoryReader("data").load_data()

# Create index with Ollama embeddings
index = VectorStoreIndex.from_documents(documents)

# Query the index
query_engine = index.as_query_engine()
response = query_engine.query("What is the main topic?")
print(response)

Using with Custom LLM

You can combine Ollama embeddings with other LLMs (including Ollama LLMs):

from llama_index.core import VectorStoreIndex, Settings
from llama_index.embeddings.ollama import OllamaEmbedding
from llama_index.llms.ollama import Ollama

# Set both LLM and embedding model
Settings.llm = Ollama(model="llama3.1", base_url="http://localhost:11434")
Settings.embed_model = OllamaEmbedding(
    model_name="nomic-embed-text",
    base_url="http://localhost:11434",
)

# Your documents and indexing code here...

Configuration Options

The OllamaEmbedding class supports several configuration options:

embed_model = OllamaEmbedding(
    model_name="nomic-embed-text",  # Required: Ollama model name
    base_url="http://localhost:11434",  # Optional: Ollama server URL (default: http://localhost:11434)
    embed_batch_size=10,  # Optional: Batch size for embeddings (default: 10)
    keep_alive="5m",  # Optional: How long to keep model in memory (default: "5m")
    query_instruction=None,  # Optional: Instruction to prepend to queries
    text_instruction=None,  # Optional: Instruction to prepend to text
    ollama_additional_kwargs={},  # Optional: Additional kwargs for Ollama API
    client_kwargs={},  # Optional: Additional kwargs for Ollama client
)

Parameter Details

model_name (required): The name of the Ollama embedding model to use (e.g., "nomic-embed-text", "embeddinggemma")
base_url (optional): The base URL of your Ollama server. Defaults to "http://localhost:11434"
embed_batch_size (optional): Number of texts to process in each batch. Must be between 1 and 2048. Defaults to 10
keep_alive (optional): Controls how long the model stays loaded in memory after a request. Can be a duration string (e.g., "5m", "10s") or a number of seconds. Defaults to "5m"
query_instruction (optional): Instruction text to prepend to query strings before embedding
text_instruction (optional): Instruction text to prepend to document text before embedding
ollama_additional_kwargs (optional): Additional keyword arguments to pass to the Ollama API
client_kwargs (optional): Additional keyword arguments for the Ollama client (e.g., authentication headers)
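
One way to keep these options out of application code is to read them from the environment. The sketch below is an assumption-laden example: the environment variable names (OLLAMA_EMBED_MODEL, OLLAMA_BASE_URL, OLLAMA_EMBED_BATCH_SIZE, OLLAMA_KEEP_ALIVE) and both helper functions are hypothetical. Only the keyword names, defaults, and the 1 to 2048 batch-size range come from the parameter list above.

```python
import os


def embedding_kwargs_from_env(env=None) -> dict:
    """Build OllamaEmbedding keyword arguments from environment variables.

    The variable names are hypothetical; rename them to suit your setup.
    """
    env = os.environ if env is None else env
    batch_size = int(env.get("OLLAMA_EMBED_BATCH_SIZE", "10"))
    if not 1 <= batch_size <= 2048:
        raise ValueError("embed_batch_size must be between 1 and 2048")
    return {
        "model_name": env.get("OLLAMA_EMBED_MODEL", "nomic-embed-text"),
        "base_url": env.get("OLLAMA_BASE_URL", "http://localhost:11434"),
        "embed_batch_size": batch_size,
        "keep_alive": env.get("OLLAMA_KEEP_ALIVE", "5m"),
    }


def build_embed_model(env=None):
    """Create the embedding model; imported lazily so the config is testable alone."""
    from llama_index.embeddings.ollama import OllamaEmbedding

    return OllamaEmbedding(**embedding_kwargs_from_env(env))
```

Validating the batch size up front gives a clearer error than letting the constructor reject it later.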

Using Instructions for Better Retrieval

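Some embedding models are trained with task-specific prefixes and retrieve better when queries and documents carry them. Check the model's documentation for the wording it expects; the instruction strings below are illustrative: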

embed_model = OllamaEmbedding(
    model_name="nomic-embed-text",
    query_instruction="Represent the question for retrieving supporting documents:",
    text_instruction="Represent the document for retrieval:",
)

# The instructions will be automatically prepended
query_embedding = embed_model.get_query_embedding("What is machine learning?")
# Internally processes: "Represent the question for retrieving supporting documents: What is machine learning?"

text_embedding = embed_model.get_text_embedding(
    "Machine learning is a method of data analysis."
)
# Internally processes: "Represent the document for retrieval: Machine learning is a method of data analysis."
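
The prepending described in the comments above can be pictured as simple string concatenation. This is a sketch of that behaviour, not the package's actual code: it assumes the instruction and the text are joined by a single space, and with_instruction is a hypothetical helper.

```python
def with_instruction(instruction, text):
    """Prepend an optional instruction to text, separated by one space."""
    if not instruction:
        return text  # no instruction configured: embed the text unchanged
    return f"{instruction.strip()} {text.strip()}"


formatted = with_instruction(
    "Represent the document for retrieval:",
    "Machine learning is a method of data analysis.",
)
print(formatted)
```

Because the instruction becomes part of the embedded string, changing it changes the vectors, so documents indexed under one instruction should be re-embedded if the instruction is altered.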

Async Usage

The integration also provides async counterparts of the embedding methods (prefixed with a), so embedding requests can run without blocking the event loop:

import asyncio
from llama_index.embeddings.ollama import OllamaEmbedding

embed_model = OllamaEmbedding(model_name="nomic-embed-text")


async def main():
    # Async single embedding
    embedding = await embed_model.aget_text_embedding("Hello, world!")

    # Async batch embeddings
    embeddings = await embed_model.aget_text_embeddings(
        [
            "Text 1",
            "Text 2",
            "Text 3",
        ]
    )

    # Async query embedding
    query_embedding = await embed_model.aget_query_embedding("What is AI?")


asyncio.run(main())
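
When many independent texts need embedding, the async methods can be run concurrently. The helper below is a sketch using asyncio.gather with a semaphore to cap the number of in-flight requests. embed_concurrently is a hypothetical name, and embed_fn stands for any coroutine function that takes a string, such as embed_model.aget_text_embedding.

```python
import asyncio


async def embed_concurrently(embed_fn, texts, limit=4):
    """Embed texts with at most `limit` requests in flight; input order is kept."""
    semaphore = asyncio.Semaphore(limit)

    async def embed_one(text):
        async with semaphore:
            return await embed_fn(text)

    return await asyncio.gather(*(embed_one(text) for text in texts))


# With a running server, for example:
# vectors = asyncio.run(
#     embed_concurrently(embed_model.aget_text_embedding, ["Text 1", "Text 2"])
# )
```

The cap matters for a local server, which typically processes requests on limited hardware; a suitable limit depends on your machine and model.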

Remote Ollama Server

If you're running Ollama on a remote server, specify the base_url:

embed_model = OllamaEmbedding(
    model_name="nomic-embed-text",
    base_url="http://your-remote-server:11434",
)
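
A remote server is often placed behind a reverse proxy that requires credentials. The sketch below builds a client_kwargs dictionary for that case. It rests on assumptions not stated in this README: that the proxy checks a bearer token, and that client_kwargs is forwarded to an httpx-based Ollama client accepting headers and timeout. The helper name and the environment variable are hypothetical.

```python
def remote_client_kwargs(token=None, timeout=30.0):
    """Build client_kwargs for a remote server (hypothetical helper).

    Assumes the kwargs reach an httpx-style client that accepts
    `headers` and `timeout`.
    """
    kwargs = {"timeout": timeout}
    if token:
        kwargs["headers"] = {"Authorization": f"Bearer {token}"}
    return kwargs


# Example wiring (needs the package installed and a reachable server):
# embed_model = OllamaEmbedding(
#     model_name="nomic-embed-text",
#     base_url="http://your-remote-server:11434",
#     client_kwargs=remote_client_kwargs(token=os.environ["OLLAMA_PROXY_TOKEN"]),
# )
```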

Available Models

Popular embedding models available in Ollama include:

nomic-embed-text: General-purpose embedding model
embeddinggemma: Google's Gemma-based embedding model
mxbai-embed-large: Large embedding model for better quality

Pull a model using:

ollama pull nomic-embed-text

Examples

For more detailed examples, see the Ollama Embeddings notebook in the LlamaIndex documentation.

Troubleshooting

Connection Errors

If you encounter connection errors, ensure:

Ollama is running: start it with ollama serve, or check the status of the Ollama service
The base_url matches your Ollama server address
The model is pulled: ollama pull <model-name>
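
The first two checks can be scripted. The following is a minimal sketch, assuming Ollama's GET /api/version endpoint; check_server is a hypothetical helper, and its opener parameter exists only so the function can be tried without a live server.

```python
import json
from urllib.request import urlopen


def check_server(base_url="http://localhost:11434", opener=urlopen):
    """Return (ok, message) describing whether the server answered."""
    url = f"{base_url.rstrip('/')}/api/version"
    try:
        with opener(url, timeout=5) as resp:
            version = json.load(resp).get("version", "unknown")
    except (OSError, ValueError) as exc:
        # URLError is a subclass of OSError; malformed JSON raises ValueError.
        return False, f"Could not reach Ollama at {base_url}: {exc}"
    return True, f"Ollama {version} is reachable at {base_url}"


# ok, message = check_server("http://localhost:11434")
# print(message)
```

Pass the same base_url you give to OllamaEmbedding so the check exercises the address your code actually uses.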

Model Not Found

If you get a "model not found" error:

List available models: ollama list
Pull the required model: ollama pull <model-name>
Verify the model name matches exactly in your code
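
A frequent source of mismatches is the tag: ollama list prints names with an explicit tag such as nomic-embed-text:latest, while code often omits it. The sketch below compares names on the assumption that an untagged name refers to the latest tag; both helpers are hypothetical.

```python
def normalize_model_name(name):
    """Append ':latest' when the name carries no explicit tag."""
    last_segment = name.rsplit("/", 1)[-1]  # ignore any registry host:port prefix
    return name if ":" in last_segment else f"{name}:latest"


def is_installed(requested, installed_names):
    """Check a requested model against names as printed by `ollama list`."""
    target = normalize_model_name(requested)
    return any(normalize_model_name(name) == target for name in installed_names)


print(is_installed("nomic-embed-text", ["nomic-embed-text:latest", "llama3.1:latest"]))
```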

License

This package is licensed under the MIT License. See the LICENSE file for details.

Project details

Release history

0.9.0 (this version): Mar 12, 2026
0.8.6: Jan 7, 2026
0.8.5: Dec 19, 2025
0.8.4: Nov 10, 2025
0.8.3: Sep 8, 2025
0.8.2: Aug 22, 2025
0.8.1: Aug 10, 2025
0.8.0: Aug 8, 2025
0.7.0: Jul 30, 2025
0.6.0: Mar 5, 2025
0.5.0: Dec 8, 2024
0.4.0: Nov 17, 2024
0.3.1: Sep 13, 2024
0.3.0: Aug 22, 2024
0.2.0: Aug 19, 2024
0.1.3: Aug 1, 2024
0.1.2: Feb 21, 2024
0.1.1: Feb 12, 2024
0.1.0: Feb 10, 2024
0.0.1: Feb 3, 2024

Download files

Source Distribution

llama_index_embeddings_ollama-0.9.0.tar.gz (6.6 kB), uploaded Mar 12, 2026

Built Distribution

llama_index_embeddings_ollama-0.9.0-py3-none-any.whl (6.2 kB), uploaded Mar 12, 2026, Python 3

File details

Details for the file llama_index_embeddings_ollama-0.9.0.tar.gz.

File metadata

Download URL: llama_index_embeddings_ollama-0.9.0.tar.gz
Upload date: Mar 12, 2026
Size: 6.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.10.9 (uv publish, run in CI on Ubuntu 24.04)

File hashes

Hashes for llama_index_embeddings_ollama-0.9.0.tar.gz
Algorithm	Hash digest
SHA256	`19d2d2a0e3f0934480eae31243ac5f1ce171319578b9c0adad25cf1b6c35659e`
MD5	`ea4f3b31a69386fe6131c06ab4093da1`
BLAKE2b-256	`8bcd2cff1feac66368a4c60ea7afbdbb3f3fdd49ee8c279fc105457e726a3ad2`

File details

Details for the file llama_index_embeddings_ollama-0.9.0-py3-none-any.whl.

File metadata

Download URL: llama_index_embeddings_ollama-0.9.0-py3-none-any.whl
Upload date: Mar 12, 2026
Size: 6.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.10.9 (uv publish, run in CI on Ubuntu 24.04)

File hashes

Hashes for llama_index_embeddings_ollama-0.9.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`92e0ce481e60a9bcbddbe2c369d2f72c6fdd7158d03a34ab9b35d80869b673c3`
MD5	`b8c9d0c995b6be59776802d54de30ae8`
BLAKE2b-256	`9a3653674403380483510a7f657c5d6f0bdac5b7f9ec5a1a8d06cdfdd6dc47f2`
