The official Python client for Ollama.

Ollama Python Library

The Ollama Python library provides the easiest way to integrate Python 3.8+ projects with Ollama.

Prerequisites

  • Ollama should be installed and running
  • Pull a model to use with the library: ollama pull <model>, e.g. ollama pull gemma3
    • See Ollama.com for more information on the models available.

Install

pip install ollama

Usage

from ollama import chat
from ollama import ChatResponse

response: ChatResponse = chat(model='gemma3', messages=[
  {
    'role': 'user',
    'content': 'Why is the sky blue?',
  },
])
print(response['message']['content'])
# or access fields directly from the response object
print(response.message.content)

See _types.py for more information on the response types.
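The messages list carries the whole conversation, so a follow-up question is sent together with the earlier turns. A minimal sketch of building that history follows; the helper name add_turn and the placeholder reply are illustrative and not part of the library, and the real chat call is left in comments so the snippet runs without a server.

```python
def add_turn(history, role, content):
    """Return a new history list with one more message appended."""
    return [*history, {'role': role, 'content': content}]


history = add_turn([], 'user', 'Why is the sky blue?')

# With a running server, the reply would come from the library:
#   response = chat(model='gemma3', messages=history)
#   history = add_turn(history, 'assistant', response.message.content)
# A placeholder reply stands in for it here.
history = add_turn(history, 'assistant', '(model reply goes here)')
history = add_turn(history, 'user', 'And why are sunsets red?')

for message in history:
    print(message['role'], '->', message['content'])
```

Returning a new list on each call keeps earlier snapshots of the conversation unchanged, which can help when branching or retrying a turn.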

Streaming responses

Response streaming can be enabled by setting stream=True.

from ollama import chat

stream = chat(
    model='gemma3',
    messages=[{'role': 'user', 'content': 'Why is the sky blue?'}],
    stream=True,
)

for chunk in stream:
  print(chunk['message']['content'], end='', flush=True)
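When the full reply is needed after streaming finishes, the chunks can be joined as they arrive. This is a minimal sketch: collect_stream is a hypothetical helper, and the fake chunks are plain dicts shaped like the streamed responses so the example runs without a server.

```python
def collect_stream(chunks):
    """Join the content of every streamed chunk into one string."""
    parts = []
    for chunk in chunks:
        # Each chunk is indexed the same way as in the streaming loop above.
        parts.append(chunk['message']['content'])
    return ''.join(parts)


# Stand-in chunks; a real call would pass
# chat(model='gemma3', messages=[...], stream=True) instead.
fake_chunks = [
    {'message': {'content': 'Rayleigh '}},
    {'message': {'content': 'scattering.'}},
]
full_text = collect_stream(fake_chunks)
print(full_text)
```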

Cloud Models

Run larger models by offloading to Ollama’s cloud while keeping your local workflow.

  • Supported models: deepseek-v3.1:671b-cloud, gpt-oss:20b-cloud, gpt-oss:120b-cloud, kimi-k2:1t-cloud, qwen3-coder:480b-cloud, kimi-k2-thinking. See Ollama Models - Cloud for more information.

Run via local Ollama

  1. Sign in (one-time):
ollama signin
  2. Pull a cloud model:
ollama pull gpt-oss:120b-cloud
  3. Make a request:
from ollama import Client

client = Client()

messages = [
  {
    'role': 'user',
    'content': 'Why is the sky blue?',
  },
]

for part in client.chat('gpt-oss:120b-cloud', messages=messages, stream=True):
  print(part.message.content, end='', flush=True)

Cloud API (ollama.com)

Access cloud models directly by pointing the client at https://ollama.com.

  1. Create an API key from ollama.com, then set:
export OLLAMA_API_KEY=your_api_key
  2. (Optional) List models available via the API:
curl https://ollama.com/api/tags
  3. Generate a response via the cloud API:
import os
from ollama import Client

client = Client(
    host='https://ollama.com',
    headers={'Authorization': 'Bearer ' + os.environ['OLLAMA_API_KEY']}
)

messages = [
  {
    'role': 'user',
    'content': 'Why is the sky blue?',
  },
]

for part in client.chat('gpt-oss:120b', messages=messages, stream=True):
  print(part.message.content, end='', flush=True)
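If OLLAMA_API_KEY is not set, the request cannot be authorized, so it may help to check for the key before building the client. A minimal sketch under that assumption: cloud_headers is a hypothetical helper, and the env parameter exists only so the function can be exercised with a plain dict.

```python
import os


def cloud_headers(env=os.environ):
    """Build the Authorization header, failing early if the key is missing."""
    api_key = env.get('OLLAMA_API_KEY')
    if not api_key:
        raise RuntimeError('OLLAMA_API_KEY is not set')
    return {'Authorization': 'Bearer ' + api_key}


# Demonstrated with a plain dict standing in for the environment.
print(cloud_headers({'OLLAMA_API_KEY': 'example-key'}))
```

The result can then be passed as Client(host='https://ollama.com', headers=cloud_headers()).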

Custom client

A custom client can be created by instantiating Client or AsyncClient from ollama.

All extra keyword arguments are passed into the httpx.Client.

from ollama import Client
client = Client(
  host='http://localhost:11434',
  headers={'x-some-header': 'some-value'}
)
response = client.chat(model='gemma3', messages=[
  {
    'role': 'user',
    'content': 'Why is the sky blue?',
  },
])
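Because extra keyword arguments reach httpx.Client, standard httpx options such as timeout can be set when the client is created. The sketch below is illustrative: client_options and make_client are hypothetical helpers, and the 30-second default is an arbitrary choice. The ollama import sits inside make_client so the options helper can be run on its own.

```python
def client_options(host='http://localhost:11434', timeout_seconds=30.0):
    """Collect keyword arguments for Client; timeout is forwarded to httpx."""
    return {'host': host, 'timeout': timeout_seconds}


def make_client(**overrides):
    """Create a Client from the defaults above plus any overrides."""
    # Imported here so client_options() can be used without the package.
    from ollama import Client

    return Client(**{**client_options(), **overrides})


print(client_options(timeout_seconds=5.0))
```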

Async client

The AsyncClient class is used to make asynchronous requests. It can be configured with the same fields as the Client class.

import asyncio
from ollama import AsyncClient

async def chat():
  message = {'role': 'user', 'content': 'Why is the sky blue?'}
  response = await AsyncClient().chat(model='gemma3', messages=[message])
  print(response.message.content)

asyncio.run(chat())

Setting stream=True modifies functions to return a Python asynchronous generator:

import asyncio
from ollama import AsyncClient

async def chat():
  message = {'role': 'user', 'content': 'Why is the sky blue?'}
  async for part in await AsyncClient().chat(model='gemma3', messages=[message], stream=True):
    print(part['message']['content'], end='', flush=True)

asyncio.run(chat())
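An asynchronous generator can also be consumed into a single string rather than printed piece by piece. In this minimal sketch, collect_async is a hypothetical helper, and fake_stream is a stand-in generator that yields dicts shaped like the streamed parts, so the example runs without a server.

```python
import asyncio


async def collect_async(parts):
    """Consume an asynchronous generator of chunks and join their content."""
    pieces = []
    async for part in parts:
        pieces.append(part['message']['content'])
    return ''.join(pieces)


async def fake_stream():
    """Stand-in for `await AsyncClient().chat(..., stream=True)`."""
    for text in ('Hello', ', ', 'world'):
        yield {'message': {'content': text}}


result = asyncio.run(collect_async(fake_stream()))
print(result)
```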

API

The Ollama Python library's API is designed around the Ollama REST API.

Chat

ollama.chat(model='gemma3', messages=[{'role': 'user', 'content': 'Why is the sky blue?'}])

Generate

ollama.generate(model='gemma3', prompt='Why is the sky blue?')

List

ollama.list()

Show

ollama.show('gemma3')

Create

ollama.create(model='example', from_='gemma3', system="You are Mario from Super Mario Bros.")

Copy

ollama.copy('gemma3', 'user/gemma3')

Delete

ollama.delete('gemma3')

Pull

ollama.pull('gemma3')

Push

ollama.push('user/gemma3')

Embed

ollama.embed(model='gemma3', input='The sky is blue because of Rayleigh scattering')

Embed (batch)

ollama.embed(model='gemma3', input=['The sky is blue because of Rayleigh scattering', 'Grass is green because of chlorophyll'])
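Embeddings are commonly compared with cosine similarity. The sketch below uses toy vectors in place of real model output; it assumes that a batch response exposes one vector per input under an embeddings field, and cosine_similarity is a hypothetical helper written in plain Python.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction; treat it as unrelated.
        return 0.0
    return dot / (norm_a * norm_b)


# Toy vectors stand in for two rows of response['embeddings'] (assumed field).
sky = [1.0, 0.0, 1.0]
grass = [0.0, 1.0, 1.0]
print(cosine_similarity(sky, grass))
```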

Ps

ollama.ps()

Errors

Errors are raised if requests return an error status or if an error is detected while streaming.

import ollama

model = 'does-not-yet-exist'

try:
  ollama.chat(model)
except ollama.ResponseError as e:
  print('Error:', e.error)
  # A 404 means the model is not available locally, so try pulling it.
  if e.status_code == 404:
    ollama.pull(model)
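The pull-on-404 pattern above can be wrapped so the request is retried once after the pull. This is a sketch, not library behavior: chat_with_pull is a hypothetical helper that takes the chat and pull functions as arguments, and the fake functions below exist only so the example runs without a server.

```python
def chat_with_pull(model, messages, chat_fn, pull_fn):
    """Call chat_fn; on a 404 error, pull the model and try once more."""
    try:
        return chat_fn(model=model, messages=messages)
    except Exception as e:
        # ollama.ResponseError carries status_code; anything else is re-raised.
        if getattr(e, 'status_code', None) != 404:
            raise
    pull_fn(model)
    return chat_fn(model=model, messages=messages)


class FakeNotFound(Exception):
    """Stand-in for a ResponseError with a 404 status."""
    status_code = 404


pulled = []


def fake_chat(model, messages):
    # Fails until the model has been "pulled".
    if model not in pulled:
        raise FakeNotFound(model)
    return 'ok'


reply = chat_with_pull('gemma3', [], fake_chat, pulled.append)
print(reply, pulled)
```

With the library installed, the same helper could be called as chat_with_pull('gemma3', messages, ollama.chat, ollama.pull).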

Download files

Source Distribution

ollama-0.6.2.tar.gz (53.1 kB)

Uploaded Source

Built Distribution

ollama-0.6.2-py3-none-any.whl (15.1 kB)

Uploaded Python 3

File details

Details for the file ollama-0.6.2.tar.gz.

File metadata

  • Download URL: ollama-0.6.2.tar.gz
  • Size: 53.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for ollama-0.6.2.tar.gz
Algorithm Hash digest
SHA256 936d55daa684f474364c098611c933626f8d6c7d67065c5b7ae0c477b508b07f
MD5 d3cace01e4c45f57560b4ac483d8888b
BLAKE2b-256 fc725f12423b6b39ca8430fbe56f77fcf4ef60f63067c7c4a2e30e200ed9ec16

Provenance

The following attestation bundles were made for ollama-0.6.2.tar.gz:

Publisher: publish.yaml on ollama/ollama-python

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file ollama-0.6.2-py3-none-any.whl.

File metadata

  • Download URL: ollama-0.6.2-py3-none-any.whl
  • Size: 15.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for ollama-0.6.2-py3-none-any.whl
Algorithm Hash digest
SHA256 3ad7daab28e5a973445c36a73882a3ef698c2ebb00e21e308652741577509f7d
MD5 4c8f57c24b0d67f671d84a80e01fbcc7
BLAKE2b-256 c4abd6722beeb2d10f7a3b9ff49375708904fde18f82b5609a0bc4aeb5996a4d

Provenance

The following attestation bundles were made for ollama-0.6.2-py3-none-any.whl:

Publisher: publish.yaml on ollama/ollama-python

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
