TraceAI Ollama Instrumentation
OpenTelemetry instrumentation for Ollama, the local LLM runner.
Installation
pip install traceai-ollama
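The usage examples below also assume the Ollama Python client and the OpenTelemetry SDK are available; if they are not already installed, something like the following should cover them:
pip install ollama opentelemetry-sdk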
Features
- Automatic tracing of Ollama API calls
- Support for chat, generate, and embed endpoints
- Streaming response support
- Token usage tracking
- Performance metrics (total_duration, eval_duration, etc.)
- Full OpenTelemetry semantic conventions compliance
Usage
Basic Setup
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor, ConsoleSpanExporter
from traceai_ollama import OllamaInstrumentor
import ollama
# Set up tracing
provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)
# Instrument Ollama
OllamaInstrumentor().instrument(tracer_provider=provider)
# Use Ollama - calls are automatically traced
response = ollama.chat(
    model="llama3.2",
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)
print(response["message"]["content"])
Chat Completions
import ollama
# Simple chat
response = ollama.chat(
    model="llama3.2",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"}
    ]
)
# Multi-turn conversation
messages = [
    {"role": "user", "content": "My name is Alice."},
    {"role": "assistant", "content": "Hello Alice! Nice to meet you."},
    {"role": "user", "content": "What's my name?"}
]
response = ollama.chat(model="llama3.2", messages=messages)
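For longer conversations, the returned message can be appended to the history before the next call. A small sketch building on the example above (the follow-up question is illustrative):
# Append the assistant's reply and continue the conversation
messages.append(response["message"])
messages.append({"role": "user", "content": "Spell it backwards."})
response = ollama.chat(model="llama3.2", messages=messages)
print(response["message"]["content"])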
Streaming Responses
import ollama
# Streaming chat
stream = ollama.chat(
    model="llama3.2",
    messages=[{"role": "user", "content": "Tell me a story."}],
    stream=True
)
for chunk in stream:
    print(chunk["message"]["content"], end="", flush=True)
Text Generation
import ollama
# Simple text generation
response = ollama.generate(
    model="llama3.2",
    prompt="The quick brown fox"
)
print(response["response"])
# With system prompt
response = ollama.generate(
    model="llama3.2",
    prompt="Write a haiku",
    system="You are a poet."
)
Embeddings
import ollama
# Single embedding
response = ollama.embed(
    model="nomic-embed-text",
    input="Hello, world!"
)
print(f"Embedding dimensions: {len(response['embeddings'][0])}")
# Multiple embeddings (batch)
response = ollama.embed(
    model="nomic-embed-text",
    input=["Hello", "World"]
)
Using Client Class
import ollama
# Create a client
client = ollama.Client(host="http://localhost:11434")
# Use the client
response = client.chat(
    model="llama3.2",
    messages=[{"role": "user", "content": "Hello!"}]
)
Async Support
import asyncio
import ollama
async def main():
    client = ollama.AsyncClient()

    # Async chat
    response = await client.chat(
        model="llama3.2",
        messages=[{"role": "user", "content": "Hello!"}]
    )
    print(response["message"]["content"])

    # Async streaming
    async for chunk in await client.chat(
        model="llama3.2",
        messages=[{"role": "user", "content": "Tell me a joke."}],
        stream=True
    ):
        print(chunk["message"]["content"], end="", flush=True)

asyncio.run(main())
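Because AsyncClient calls are coroutines, several requests can also run concurrently with asyncio.gather; a minimal sketch:
import asyncio
import ollama

async def main():
    client = ollama.AsyncClient()
    prompts = ["Define entropy.", "Define enthalpy.", "Define free energy."]

    # Issue the chat requests concurrently
    responses = await asyncio.gather(
        *(client.chat(model="llama3.2", messages=[{"role": "user", "content": p}])
          for p in prompts)
    )
    for prompt, resp in zip(prompts, responses):
        print(prompt, "->", resp["message"]["content"][:60])

asyncio.run(main())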
Multimodal (Vision)
import ollama
# With image (base64 encoded)
response = ollama.chat(
    model="llava",
    messages=[
        {
            "role": "user",
            "content": "What's in this image?",
            "images": ["base64_encoded_image_data"]
        }
    ]
)
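The images field accepts base64-encoded image data; a sketch showing one way to produce it from a local file (the file path is a placeholder):
import base64
import ollama

# Read a local image and base64-encode it (replace the path with your own)
with open("photo.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = ollama.chat(
    model="llava",
    messages=[
        {"role": "user", "content": "What's in this image?", "images": [image_b64]}
    ],
)
print(response["message"]["content"])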
Configuration Options
TraceConfig
from fi_instrumentation import TraceConfig
from traceai_ollama import OllamaInstrumentor
config = TraceConfig(
    hide_inputs=False,  # Set True to hide input content
    hide_outputs=False,  # Set True to hide output content
    base64_image_max_length=100,  # Max length for base64 images in traces
)
OllamaInstrumentor().instrument(
    tracer_provider=provider,
    config=config
)
Captured Attributes
Request Attributes
| Attribute | Description |
|---|---|
| fi.span.kind | "LLM" for chat/generate, "EMBEDDING" for embed |
| llm.system | "ollama" |
| llm.provider | "ollama" |
| llm.model | Model name (llama3.2, etc.) |
| llm.input_messages.{n}.role | Message role |
| llm.input_messages.{n}.content | Message content |
Response Attributes
| Attribute | Description |
|---|---|
| llm.token_count.prompt | Input token count (prompt_eval_count) |
| llm.token_count.completion | Output token count (eval_count) |
| llm.token_count.total | Total token count |
| llm.output_messages.{n}.role | Response role |
| llm.output_messages.{n}.content | Response content |
| ollama.total_duration_ns | Total request duration (nanoseconds) |
| ollama.load_duration_ns | Model load duration (nanoseconds) |
| ollama.prompt_eval_duration_ns | Prompt evaluation duration (nanoseconds) |
| ollama.eval_duration_ns | Response generation duration (nanoseconds) |
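One way to see these attributes during development is to capture spans in memory and inspect them; a sketch using the OpenTelemetry SDK's in-memory exporter, with the attribute names listed above:
import ollama
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor
from opentelemetry.sdk.trace.export.in_memory_span_exporter import InMemorySpanExporter
from traceai_ollama import OllamaInstrumentor

exporter = InMemorySpanExporter()
provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(exporter))
trace.set_tracer_provider(provider)
OllamaInstrumentor().instrument(tracer_provider=provider)

ollama.chat(model="llama3.2", messages=[{"role": "user", "content": "Hi"}])

# Print the attributes captured on the most recent span
span = exporter.get_finished_spans()[-1]
for key, value in span.attributes.items():
    print(key, "=", value)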
Supported Models
Any model available in Ollama can be traced, including:
| Model | Description |
|---|---|
| llama3.2 | Meta's Llama 3.2 |
| mistral | Mistral 7B |
| mixtral | Mixtral 8x7B |
| codellama | Code Llama |
| llava | LLaVA (vision) |
| nomic-embed-text | Nomic embeddings |
Real-World Use Cases
RAG Pipeline
import ollama

# Generate an embedding for the query
query = "What is machine learning?"
query_embedding = ollama.embed(
    model="nomic-embed-text",
    input=query
)

# Search a vector database for relevant context (not shown)
# context = search_vector_db(query_embedding["embeddings"][0])
context = "..."  # placeholder for the retrieved context

# Generate a response using the retrieved context
response = ollama.chat(
    model="llama3.2",
    messages=[
        {"role": "system", "content": f"Use this context: {context}"},
        {"role": "user", "content": query}
    ]
)
Code Assistant
import ollama
response = ollama.chat(
    model="codellama",
    messages=[
        {"role": "system", "content": "You are a Python expert."},
        {"role": "user", "content": "Write a function to calculate fibonacci numbers"}
    ]
)
License
Apache-2.0
Download files
Download the file for your platform.
Source Distribution
traceai_ollama-0.1.0.tar.gz (10.0 kB)
Built Distribution
traceai_ollama-0.1.0-py3-none-any.whl (11.4 kB)
File details
Details for the file traceai_ollama-0.1.0.tar.gz.
File metadata
- Download URL: traceai_ollama-0.1.0.tar.gz
- Upload date:
- Size: 10.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.9.21
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | fa40f2b773d7ada3ab59d1af7ee56218280f70cf716bd4b578f407951676fa6d |
| MD5 | 19e77bc830fee51945db1f84955270db |
| BLAKE2b-256 | 75a74d38ba1fe677234902e7b97e55115a86237b881e8ba078d987fda65d13c3 |
File details
Details for the file traceai_ollama-0.1.0-py3-none-any.whl.
File metadata
- Download URL: traceai_ollama-0.1.0-py3-none-any.whl
- Upload date:
- Size: 11.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.9.21
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 446b836b2c396883e5a5ed14cdc579593147322a6b00b7476a7fe064270454b8 |
| MD5 | b4100c36a0d294f45b586f14e7da058a |
| BLAKE2b-256 | 6e59589b86ec408d818bfaceb448addc256b7630d705171293f467964960c857 |