LlamaIndex Llms Integration: Ollama
Installation
To install the required package, run:
pip install llama-index-llms-ollama
Setup
- Follow the Ollama README to set up and run a local Ollama instance.
- When the Ollama app is running on your local machine, it will serve all of your local models on localhost:11434.
- Select your model when creating the Ollama instance by specifying model="<model family>:<version>".
- You can increase the default timeout (30 seconds) by setting Ollama(..., request_timeout=300.0); see the sketch after this list.
- If you set llm = Ollama(..., model="<model family>") without a version, it will automatically look for the latest version.
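Putting those options together, here is a minimal sketch of constructing the client; the base_url value is shown only for illustration, since the client already defaults to http://localhost:11434:

from llama_index.llms.ollama import Ollama

llm = Ollama(
    model="llama3.1:latest",            # "<model family>:<version>"
    base_url="http://localhost:11434",  # the Ollama server's address (this is the default)
    request_timeout=300.0,              # raise the 30-second default for slow models
)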
Usage
Initialize Ollama
from llama_index.llms.ollama import Ollama
llm = Ollama(model="llama3.1:latest", request_timeout=120.0)
Generate Completions
To generate a text completion for a prompt, use the complete method:
resp = llm.complete("Who is Paul Graham?")
print(resp)
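The call returns a CompletionResponse; printing it prints the generated text, which is also available directly on the text attribute:

print(resp.text)  # same text that print(resp) displays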
Chat Responses
To send a chat message and receive a response, create a list of ChatMessage instances and use the chat method:
from llama_index.core.llms import ChatMessage

messages = [
    ChatMessage(
        role="system", content="You are a pirate with a colorful personality."
    ),
    ChatMessage(role="user", content="What is your name?"),
]
resp = llm.chat(messages)
print(resp)
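chat returns a ChatResponse that wraps the assistant's ChatMessage, so the role and reply text can be read off explicitly:

print(resp.message.role)     # MessageRole.ASSISTANT
print(resp.message.content)  # the pirate's reply text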
Streaming Responses
Stream Complete
To stream responses for a prompt, use the stream_complete method:
response = llm.stream_complete("Who is Paul Graham?")
for r in response:
    print(r.delta, end="")
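Each streamed chunk is a CompletionResponse whose delta attribute holds only the newly generated text, so you can accumulate the chunks yourself; a minimal sketch (with a guard in case a chunk's delta is empty):

chunks = []
for r in llm.stream_complete("Who is Paul Graham?"):
    chunks.append(r.delta or "")
print("".join(chunks))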
Stream Chat
To stream chat responses, use the stream_chat method:
messages = [
    ChatMessage(
        role="system", content="You are a pirate with a colorful personality."
    ),
    ChatMessage(role="user", content="What is your name?"),
]
resp = llm.stream_chat(messages)
for r in resp:
    print(r.delta, end="")
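As with stream_complete, each streamed ChatResponse carries the increment in its delta attribute; joining the deltas keeps the full reply as a plain string:

parts = []
for r in llm.stream_chat(messages):
    parts.append(r.delta or "")
full_reply = "".join(parts)
print(full_reply)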
JSON Mode
Ollama supports a JSON mode to ensure all responses are valid JSON, which is useful for tools that need to parse structured outputs:
llm = Ollama(model="llama3.1:latest", request_timeout=120.0, json_mode=True)
response = llm.complete(
    "Who is Paul Graham? Output as a structured JSON object."
)
print(str(response))
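Since json_mode constrains the output to valid JSON, the response text should parse directly with the standard library; the keys below depend entirely on what the model chooses to emit, so treat this as a sketch:

import json

data = json.loads(str(response))
print(list(data.keys()))  # whatever fields the model produced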
Structured Outputs
You can attach a Pydantic class to the LLM to ensure structured outputs:
from llama_index.core.bridge.pydantic import BaseModel


class Song(BaseModel):
    """A song with name and artist."""

    name: str
    artist: str


llm = Ollama(model="llama3.1:latest", request_timeout=120.0)
sllm = llm.as_structured_llm(Song)

response = sllm.chat([ChatMessage(role="user", content="Name a random song!")])
print(
    response.message.content
)  # e.g., {"name": "Yesterday", "artist": "The Beatles"}
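The message content is the JSON-serialized object, so it can also be validated back into the Pydantic class; a sketch assuming the bridge re-exports Pydantic v2 (where model_validate_json is available):

song = Song.model_validate_json(response.message.content)
print(f"{song.name} by {song.artist}")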
Asynchronous Chat
You can also use asynchronous chat; note that await requires an async context, such as a notebook cell or an async function:
response = await sllm.achat(
    [ChatMessage(role="user", content="Name a random song!")]
)
print(response.message.content)
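Outside of a notebook, top-level await is not available; a minimal sketch that runs the same call in an event loop with asyncio.run, reusing the sllm instance from above:

import asyncio


async def main() -> None:
    response = await sllm.achat(
        [ChatMessage(role="user", content="Name a random song!")]
    )
    print(response.message.content)


asyncio.run(main())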
LLM Implementation example
For a complete, runnable walkthrough, see the Ollama LLM implementation example in the LlamaIndex documentation.