llm-ollama

LLM plugin providing access to models running on an Ollama server.

Installation

Install this plugin in the same environment as LLM.

llm install llm-ollama

Usage

First, ensure that the Ollama server is running and that you have pulled some models. You can use ollama list to check what is locally available.

The plugin queries the Ollama server for the list of models. You can use llm ollama models to see the list; it should match the output of ollama list. All of these models are automatically registered with LLM and made available for prompting, chatting, and embedding.

Assuming you have llama3.2:latest available, you can run a prompt using:

llm -m llama3.2:latest 'How much is 2+2?'

The plugin automatically creates a shorter alias for models that have :latest in the name, so the previous command is equivalent to running:

llm -m llama3.2 'How much is 2+2?'
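The aliasing rule can be illustrated with a small sketch. The function name short_alias is hypothetical and not part of the plugin; treating :latest as a suffix is an assumption based on the example above.

```python
from typing import Optional

LATEST_SUFFIX = ":latest"


def short_alias(model_name: str) -> Optional[str]:
    """Return the name without ':latest', or None if no short alias applies.

    Hypothetical helper: it only mirrors the documented behavior and
    assumes ':latest' appears as a suffix of the model name.
    """
    if model_name.endswith(LATEST_SUFFIX):
        return model_name[: -len(LATEST_SUFFIX)]
    return None


print(short_alias("llama3.2:latest"))  # llama3.2
print(short_alias("stable-code:3b"))   # None
```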

To start an interactive chat session:

llm chat -m llama3.2
Chatting with llama3.2:latest
Type 'exit' or 'quit' to exit
Type '!multi' to enter multiple lines, then '!end' to finish
Type '!edit' to open your default editor and modify the prompt
Type '!fragment <my_fragment> [<another_fragment> ...]' to insert one or more fragments
>

Image attachments

Multi-modal Ollama models can accept image attachments using the LLM attachments option:

llm -m llava "Describe this image" -a https://static.simonwillison.net/static/2024/pelicans.jpg

Tools

Ollama models with tools support can make use of LLM tools passed to them:

llm -m llama3.2 -T llm_time 'What is the time?' --td

Embeddings

The plugin supports LLM embeddings. Both regular and specialized embedding models (such as mxbai-embed-large) can be used:

llm embed -m mxbai-embed-large -i README.md

By default, the input is truncated from the end to fit within the context length. This behavior can be changed by setting the OLLAMA_EMBED_TRUNCATE=no environment variable, in which case the embedding operation fails if the context length is exceeded.
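The two behaviors can be pictured with a toy sketch. This is not how the plugin or Ollama implements truncation; fit_to_context and truncation_enabled are hypothetical names, the token list stands in for real tokenizer output, and only the documented value no is handled.

```python
from typing import List, Mapping


def truncation_enabled(environ: Mapping[str, str]) -> bool:
    """Assumption: truncation stays on unless OLLAMA_EMBED_TRUNCATE is 'no'."""
    return environ.get("OLLAMA_EMBED_TRUNCATE") != "no"


def fit_to_context(tokens: List[int], context_length: int, truncate: bool = True) -> List[int]:
    """Keep the leading tokens that fit, or fail when truncation is off."""
    if len(tokens) <= context_length:
        return tokens
    if truncate:
        # "Truncated from the end": the tail of the input is dropped.
        return tokens[:context_length]
    raise ValueError(
        f"input has {len(tokens)} tokens, context length is {context_length}"
    )


print(fit_to_context([1, 2, 3, 4, 5], 3))  # [1, 2, 3]
```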

JSON schemas

Ollama's built-in support for structured outputs can be accessed through LLM schemas, for example:

llm -m llama3.2 --schema "name, age int, one_sentence_bio" "invent a cool dog"
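To show roughly what the concise schema string stands for, here is a simplified sketch that expands it into a JSON schema. It is an illustration only, not LLM's own parser: concise_to_json_schema is a hypothetical name, field descriptions are not handled, and the type keywords are assumed to be int, float, bool, and str.

```python
import json
from typing import Any, Dict

# Assumed mapping from short type keywords to JSON schema types.
TYPE_MAP = {"int": "integer", "float": "number", "bool": "boolean", "str": "string"}


def concise_to_json_schema(spec: str) -> Dict[str, Any]:
    """Expand 'name, age int' style text into a JSON schema object."""
    properties: Dict[str, Any] = {}
    for field in spec.split(","):
        parts = field.split()
        if not parts:
            continue  # ignore empty entries such as a trailing comma
        name = parts[0]
        type_name = parts[1] if len(parts) > 1 else "str"  # default to string
        properties[name] = {"type": TYPE_MAP[type_name]}
    return {"type": "object", "properties": properties, "required": list(properties)}


print(json.dumps(concise_to_json_schema("name, age int, one_sentence_bio"), indent=2))
```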

Async models

The plugin registers async LLM models suitable for use with Python asyncio.

To use an async model, retrieve it with the llm.get_async_model() function instead of llm.get_model(), then await the response:

import asyncio, llm

async def run():
    model = llm.get_async_model("llama3.2:latest")
    response = model.prompt("A short poem about tea")
    print(await response.text())

asyncio.run(run())

Model aliases

The same Ollama model may be referred to by several names with different tags. For example, the following list contains a single unique model with three different names:

ollama list
NAME                    ID              SIZE    MODIFIED
stable-code:3b          aa5ab8afb862    1.6 GB  9 hours ago
stable-code:code        aa5ab8afb862    1.6 GB  9 seconds ago
stable-code:latest      aa5ab8afb862    1.6 GB  14 seconds ago

In such cases, the plugin will register a single model and create additional aliases. Continuing the previous example, this is what LLM will have:

llm models
...

Ollama: stable-code:3b (aliases: stable-code:code, stable-code:latest, stable-code)
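The grouping shown above can be sketched as follows. This is an assumption about the logic, not the plugin's code: group_aliases is a hypothetical name, the first name seen for an ID is taken as the registered model, and the short alias for :latest is appended last to match the listing.

```python
from typing import Dict, List, Tuple


def group_aliases(listing: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Map one primary name per model ID to its remaining names (aliases)."""
    names_by_id: Dict[str, List[str]] = {}
    for name, model_id in listing:
        names_by_id.setdefault(model_id, []).append(name)

    registered: Dict[str, List[str]] = {}
    for names in names_by_id.values():
        primary, *aliases = names  # assumption: first listed name wins
        for name in names:
            if name.endswith(":latest"):
                aliases.append(name[: -len(":latest")])  # short alias
        registered[primary] = aliases
    return registered


listing = [
    ("stable-code:3b", "aa5ab8afb862"),
    ("stable-code:code", "aa5ab8afb862"),
    ("stable-code:latest", "aa5ab8afb862"),
]
print(group_aliases(listing))
# {'stable-code:3b': ['stable-code:code', 'stable-code:latest', 'stable-code']}
```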

Model options

All models accept Ollama modelfile parameters as options. Use the -o name value syntax to specify them, for example:

  • -o temperature 0.8: set the temperature of the model
  • -o num_ctx 256000: set the size of the context window used to generate the next token

See the Ollama modelfile documentation for the complete list of parameters with descriptions and default values.

Additionally, the -o json_object 1 option can be used to force the model to reply with a valid JSON object. Note that your prompt must mention JSON for this to work.
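When calling the CLI from a script, the -o name value pairs can be assembled from a dictionary. The sketch below only builds the argument list; build_llm_command is a hypothetical helper, and the model name and prompt are placeholders.

```python
import shlex
from typing import Dict, List, Union


def build_llm_command(
    model: str, prompt: str, options: Dict[str, Union[int, float, str]]
) -> List[str]:
    """Build an argument list: llm -m MODEL [-o NAME VALUE]... PROMPT."""
    command = ["llm", "-m", model]
    for name, value in options.items():
        command += ["-o", name, str(value)]
    command.append(prompt)
    return command


command = build_llm_command(
    "llama3.2",
    "Reply in JSON: name three kinds of tea",  # the prompt mentions JSON
    {"temperature": 0.8, "json_object": 1},
)
print(shlex.join(command))  # shell-quoted form, handy for logging
```

The resulting list can be passed to subprocess.run(command) as is, which avoids shell quoting issues in the prompt.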

Ollama server address

By default, llm-ollama tries to connect to a server at localhost:11434. If your Ollama server is remote or runs on a non-default port, use the OLLAMA_HOST environment variable to point the plugin to it, e.g.:

export OLLAMA_HOST=https://192.168.1.13:11434
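To check from a script which address is in effect and whether a server answers there, something like the following could be used. It is a sketch under assumptions: resolve_host and server_is_reachable are hypothetical names, the http scheme for the default address is assumed, and the probe uses Ollama's /api/tags endpoint. The plugin's own client may normalize addresses differently, for example when the scheme is omitted.

```python
import os
import urllib.request
from typing import Mapping

DEFAULT_HOST = "http://localhost:11434"  # assumed scheme for the default address


def resolve_host(environ: Mapping[str, str]) -> str:
    """Return OLLAMA_HOST if it is set and non-empty, else the default."""
    return environ.get("OLLAMA_HOST") or DEFAULT_HOST


def server_is_reachable(host: str, timeout: float = 2.0) -> bool:
    """Probe the model-listing endpoint; False on any connection problem."""
    url = host.rstrip("/") + "/api/tags"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, ValueError):
        # OSError covers refused connections, timeouts, and HTTP errors;
        # ValueError covers addresses without a scheme.
        return False


print(resolve_host(os.environ))
```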

Authentication

If your Ollama server is protected with Basic Authentication, you can include the credentials directly in the OLLAMA_HOST environment variable:

export OLLAMA_HOST=https://username:password@192.168.1.13:11434

The plugin will parse the credentials and use them for authentication. Special characters in usernames or passwords should be URL-encoded:

# For username "user@domain" and password "p@ssw0rd"
export OLLAMA_HOST=https://user%40domain:p%40ssw0rd@192.168.1.13:11434
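The encoding step can be done with Python's standard library. In this sketch, host_with_credentials and split_credentials are hypothetical helper names; the second function shows how the encoded value decodes back to the original credentials.

```python
from typing import Tuple
from urllib.parse import quote, unquote, urlsplit


def host_with_credentials(
    username: str, password: str, address: str, scheme: str = "https"
) -> str:
    """Percent-encode the credentials and embed them in the host URL."""
    user = quote(username, safe="")  # safe="" also encodes '/' and ':'
    secret = quote(password, safe="")
    return f"{scheme}://{user}:{secret}@{address}"


def split_credentials(url: str) -> Tuple[str, str]:
    """Recover the decoded username and password from a host URL."""
    parts = urlsplit(url)
    return unquote(parts.username or ""), unquote(parts.password or "")


url = host_with_credentials("user@domain", "p@ssw0rd", "192.168.1.13:11434")
print(url)  # https://user%40domain:p%40ssw0rd@192.168.1.13:11434
print(split_credentials(url))  # ('user@domain', 'p@ssw0rd')
```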

Development

Setup

To set up this plugin locally, first check out the code. Then create a new virtual environment and install the dependencies. If you are using uv:

cd llm-ollama
uv venv
uv pip install -e '.[test,lint]'

Otherwise, if you prefer using standard tools:

cd llm-ollama
python3 -m venv .venv
.venv/bin/python -m pip install -e '.[test,lint]'

Testing and linting

To test or lint the code, first activate the environment:

source .venv/bin/activate

To run unit and integration tests:

python -m pytest

Integration tests require a running Ollama server and will be:

  • Enabled automatically if an Ollama server is available;
  • Skipped if the Ollama server is unavailable;
  • Force-enabled with --integration (but will fail if the Ollama server is unavailable);
  • Force-disabled with --no-integration.
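The rules above amount to a small decision table, sketched here for clarity. The function integration_action is hypothetical and is not taken from the project's test configuration; flag is True for --integration, False for --no-integration, and None when neither option is given.

```python
from typing import Optional


def integration_action(flag: Optional[bool], server_available: bool) -> str:
    """Return 'run', 'skip', or 'fail' for the integration tests."""
    if flag is False:
        return "skip"  # --no-integration always disables them
    if server_available:
        return "run"  # enabled automatically or explicitly
    # No server: fail only when the tests were explicitly requested.
    return "fail" if flag is True else "skip"


for flag in (None, True, False):
    for available in (True, False):
        print(flag, available, integration_action(flag, available))
```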

To format the code:

python -m black .
