ollama2a
A Python library for running Ollama agents with automated server management.
Features
- 🚀 Automated Ollama Server Management: Automatically starts and manages Ollama servers
- 🤖 Pydantic AI Integration: Seamless integration with pydantic-ai for agent creation
- 🔧 Configurable: Easy configuration of host, port, models, and tools
- 🧪 Well Tested: Comprehensive test suite with high coverage
- 📦 Production Ready: Robust error handling and resource management
- 🌊 Streaming Support: A2A-compliant SSE streaming for real-time responses
Installation
pip install ollama2a
Development Installation
pip install ollama2a[dev]
Quick Start
Basic Usage
from ollama2a.agent_executor import OllamaAgentExecutor
# Create an agent executor with default settings
executor = OllamaAgentExecutor(
    ollama_model="qwen3:0.6b",
    system_prompt="You are a helpful assistant."
)
# The server starts automatically and the agent is ready to use
result = executor.agent.run_sync(user_prompt="What is the capital of France?")
print(result)
With Custom Tools
from pydantic_ai import Tool, RunContext
from ollama2a.agent_executor import OllamaAgentExecutor
# Define a custom tool
async def my_tool(ctx: RunContext[int], x: int, y: int) -> str:
    """Add two numbers and return the result."""
    return f"Result: {x + y}"
# Create executor with custom tools
executor = OllamaAgentExecutor(
    ollama_model="qwen3:0.6b",
    system_prompt="You are a math assistant.",
    tools=[Tool(my_tool)]
)
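Once the executor exists, the tool is available to the agent. A minimal usage sketch, mirroring the run_sync call from Basic Usage (the prompt is only an illustration):

# Ask a question the model can answer by calling my_tool
result = executor.agent.run_sync(user_prompt="What is 17 + 25?")
print(result)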
FastAPI Integration
from pydantic_ai import Tool, RunContext
from ollama2a.agent_executor import OllamaAgentExecutor

def my_tool(ctx: RunContext[int], x: int, y: int) -> str:
    """Add two numbers and return the result."""
    return f"Result: {x + y}"
# Create the agent executor
executor = OllamaAgentExecutor(
    ollama_host="localhost",
    ollama_port=11434,
    ollama_model="qwen3:0.6b",
    system_prompt="You are a helpful assistant.",
    tools=[Tool(my_tool)]
)
# Get the FastAPI app
app = executor.app
# Run with: uvicorn main:app --host 0.0.0.0 --port 8000
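The app can also be served programmatically instead of via the uvicorn CLI. A minimal sketch, assuming uvicorn is installed in your environment:

import uvicorn

if __name__ == "__main__":
    # Serve the A2A app on the same host/port as the CLI command above
    uvicorn.run(app, host="0.0.0.0", port=8000)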
Streaming API
The library now includes A2A-compliant SSE streaming support for real-time responses:
from ollama2a.agent_executor import OllamaAgentExecutor
# Create executor with streaming support
executor = OllamaAgentExecutor(
    ollama_model="qwen3:0.6b",
    system_prompt="You are a helpful assistant."
)
# The streaming endpoint is automatically available at /stream
app = executor.app
# Example client usage with httpx
import httpx
import json
async def stream_chat():
    async with httpx.AsyncClient() as client:
        # Use a streaming request so chunks are printed as they arrive
        async with client.stream(
            "POST",
            "http://localhost:8000/stream",
            json={
                "prompt": "Tell me a story",
                "temperature": 0.7,
                "max_tokens": 500
            },
            timeout=60.0
        ) as response:
            async for line in response.aiter_lines():
                if line.startswith("data: "):
                    data = json.loads(line[6:])
                    if "text" in data:
                        print(data["text"], end="", flush=True)
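The coroutine can be driven with asyncio from a regular script:

import asyncio

# Run the streaming client until the stream completes
asyncio.run(stream_chat())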
Streaming Request Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| prompt | str | Required | The prompt to stream responses for |
| temperature | float | 0.7 | Sampling temperature (0.0-2.0) |
| max_tokens | int | None | Maximum tokens to generate |
| top_p | float | None | Top-p sampling parameter (0.0-1.0) |
| frequency_penalty | float | None | Frequency penalty (-2.0 to 2.0) |
| presence_penalty | float | None | Presence penalty (-2.0 to 2.0) |
| timeout | float | 60.0 | Stream timeout in seconds |
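As an illustration, a request body that sets every parameter from the table above might look like the following (the values are arbitrary examples):

payload = {
    "prompt": "Summarize the plot of Hamlet",
    "temperature": 0.7,
    "max_tokens": 500,
    "top_p": 0.9,
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0,
    "timeout": 60.0,
}
# POST this as JSON to the /stream endpoint, e.g. with httpx as shown above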
SSE Event Types
The streaming endpoint emits the following A2A-compliant SSE events:
- start: Initial event with model info and parameters
- content: Streaming text chunks with incremental content
- complete: Final event with full response and statistics
- timeout: Emitted if the stream exceeds the timeout
- error: Emitted on errors with error details
- end: Final event marking stream completion
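As a rough client-side sketch, assuming each data: payload carries an event field naming one of the types above (the event key is an assumption, not a documented schema; only the text key appears in the example above), a dispatcher might look like this:

import json

def handle_sse_line(line: str) -> None:
    """Dispatch a single 'data: ...' SSE line by event type.

    Assumes the JSON payload has an 'event' field naming one of the
    event types listed above; adjust key names to the actual payloads.
    """
    if not line.startswith("data: "):
        return
    data = json.loads(line[6:])
    event = data.get("event")
    if event == "content":
        # Incremental text chunk
        print(data.get("text", ""), end="", flush=True)
    elif event in ("complete", "end"):
        # Finish the output line when the stream is done
        print()
    elif event in ("timeout", "error"):
        print(f"\n[stream {event}] {data}")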
Configuration
OllamaAgentExecutor Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| ollama_host | str | "localhost" | Ollama server host |
| ollama_port | int | 11434 | Ollama server port |
| ollama_model | str | "qwen3:0.6b" | Model to use |
| system_prompt | str | "You are a helpful assistant." | System prompt for the agent |
| description | str | "An agent that uses the Ollama API to execute tasks." | Agent description |
| tools | List[Tool] | [] | Custom tools for the agent |
| a2a_port | int | 8000 | Port for the A2A server |
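Putting the table together, a fully specified executor might look like the following sketch (the values are simply the documented defaults, spelled out for clarity):

from pydantic_ai import Tool, RunContext
from ollama2a.agent_executor import OllamaAgentExecutor

def add(ctx: RunContext[int], x: int, y: int) -> str:
    """Example tool: add two numbers."""
    return f"Result: {x + y}"

executor = OllamaAgentExecutor(
    ollama_host="localhost",
    ollama_port=11434,
    ollama_model="qwen3:0.6b",
    system_prompt="You are a helpful assistant.",
    description="An agent that uses the Ollama API to execute tasks.",
    tools=[Tool(add)],
    a2a_port=8000,
)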
Server Management
The HybridOllamaManager automatically handles:
- ✅ Server startup: Starts Ollama server if not running
- ✅ Model downloading: Downloads models if not available locally
- ✅ Health checks: Monitors server health
- ✅ Graceful shutdown: Properly terminates processes
- ✅ Error handling: Robust error handling and retries
Manual Server Management
from ollama2a.ollama_manager import HybridOllamaManager
manager = HybridOllamaManager(host="localhost", port=11434)
manager.ensure_server_running()
# Use the manager
response = manager.run_model("qwen3:0.6b", "Hello world!")
print(response)
# Cleanup when done
manager.cleanup()
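Because cleanup() terminates the managed Ollama process, it is worth guarding against exceptions. A small sketch using only the calls shown above:

manager = HybridOllamaManager(host="localhost", port=11434)
try:
    manager.ensure_server_running()
    print(manager.run_model("qwen3:0.6b", "Hello world!"))
finally:
    # Always release the managed Ollama process, even on errors
    manager.cleanup()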
Requirements
- Python 3.9+
- Ollama installed on your system
Installation of Ollama
Follow the official Ollama installation guide for your operating system.
Development
Setup Development Environment
# Clone the repository
git clone https://github.com/thijshakkenbergecolab/ollama2a.git
cd ollama2a
# Install in development mode
pip install -e .[dev]
Running Tests
# Run all tests
pytest
# Run with coverage
pytest --cov=ollama2a --cov-report=html
# Run specific test file
pytest tests/test_ollama_manager.py -v
Code Quality
# Format code
black .
# Sort imports
isort .
# Lint code
flake8 .
# Type checking
mypy ollama2a/
Contributing
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- Ollama for the amazing local LLM runtime
- Pydantic AI for the agent framework
- FastA2A for the API framework
Changelog
1.1.0 (2025-10-07)
- Added A2A-compliant SSE streaming support
- New /stream endpoint for real-time responses
- Configurable streaming parameters (temperature, max_tokens, etc.)
- Timeout handling for streaming requests
- Comprehensive test coverage for streaming features
1.0.0 (2025-09-12)
- Initial release
- Basic Ollama server management
- Pydantic AI integration
- Starlette app generation
- Comprehensive test suite