Object-oriented library for interacting with Ollama Cloud
ollama-client-lib
Object-oriented Python library for interacting with Ollama Cloud. It supports multiple client instances with different configurations, automatic resource management through context managers, and RAG (Retrieval-Augmented Generation) through caller-supplied context chunks.
Features
- ✅ Multiple client instances with independent configurations
- ✅ Context manager for automatic resource management
- ✅ RAG support with context
- ✅ Streaming responses
- ✅ Robust error handling and automatic reconnection
- ✅ Reusable HTTP connection pool
- ✅ Model and prompt agnostic (no hardcoded defaults)
Installation
From PyPI
pip install ollama-client-lib
From GitHub
pip install git+https://github.com/ctangarife/ollama-client-lib.git
For local development
git clone https://github.com/ctangarife/ollama-client-lib.git
cd ollama-client-lib
pip install -e .
Quick Start
import asyncio

from ollama_client_lib import OllamaClient


async def main():
    async with OllamaClient(
        api_key="your_api_key",
        default_model="kimi-k2:1t"  # Model available in Ollama Cloud
    ) as client:
        response = await client.generate_response(
            prompt="What is Python?",
            context=["Python is a programming language..."]
        )
        print(response)


asyncio.run(main())
Note: This library is specifically designed for Ollama Cloud (requires API key), not for local Ollama installations. The -cloud suffix is automatically added to model names if not present.
Configuration
The library supports configuration via environment variables or constructor parameters:
- OLLAMA_API_KEY: API key (required). Get one at https://ollama.com/settings/keys
- OLLAMA_URL: Base URL (default: https://ollama.com)
- OLLAMA_MODEL: Default model (optional; must be specified per request if not set)
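The lookup described above can be sketched with the standard library alone. This is an illustration, not the library's actual code: `resolve_settings` is a hypothetical helper, and the rule that explicit arguments take precedence over environment variables is an assumption.

```python
import os
from typing import Mapping, Optional

DEFAULT_BASE_URL = "https://ollama.com"  # default stated in this README


def resolve_settings(
    api_key: Optional[str] = None,
    base_url: Optional[str] = None,
    default_model: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> dict:
    """Hypothetical helper: merge explicit arguments with environment variables.

    Assumption: explicit arguments win over the environment.
    """
    env = os.environ if env is None else env
    key = api_key or env.get("OLLAMA_API_KEY")
    if not key:
        raise ValueError("Set OLLAMA_API_KEY or pass api_key explicitly")
    return {
        "api_key": key,
        "base_url": base_url or env.get("OLLAMA_URL") or DEFAULT_BASE_URL,
        # May stay None: the model must then be given on each request.
        "default_model": default_model or env.get("OLLAMA_MODEL"),
    }


# A plain dict stands in for os.environ so the example is reproducible.
settings = resolve_settings(env={"OLLAMA_API_KEY": "demo-key"})
print(settings["base_url"])
```

Passing the environment as a mapping keeps the sketch easy to test; real code would normally rely on the `os.environ` default.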
Important:
- This library is for Ollama Cloud only (not local Ollama)
- Model names automatically get the -cloud suffix added if not present (e.g., kimi-k2:1t → kimi-k2:1t-cloud)
- Use list_available_models() to see which models are available in your account
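The suffix rule can be illustrated with a one-line function. `with_cloud_suffix` is a hypothetical name used only for this sketch; the library applies the rule internally and its actual implementation may differ.

```python
def with_cloud_suffix(model: str, suffix: str = "-cloud") -> str:
    """Hypothetical helper: append the suffix unless the name already ends with it."""
    return model if model.endswith(suffix) else model + suffix


print(with_cloud_suffix("kimi-k2:1t"))        # kimi-k2:1t-cloud
print(with_cloud_suffix("kimi-k2:1t-cloud"))  # kimi-k2:1t-cloud (unchanged)
```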
Setting up environment variables
- Copy env.example to .env:
  cp env.example .env
- Edit .env and add your Ollama Cloud API key:
  OLLAMA_API_KEY=your_actual_api_key_here
- Get your API key at: https://ollama.com/settings/keys
Note: The .env file is already in .gitignore and won't be committed to version control.
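If python-dotenv is not installed, a small loader can read simple KEY=VALUE files. The sketch below is a stdlib-only illustration under that assumption; `load_env_file` is a hypothetical helper and is not part of this library.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env", override: bool = False) -> dict:
    """Hypothetical stdlib-only loader for simple KEY=VALUE files.

    Skips blank lines, comment lines starting with '#', and lines without '='.
    Strips surrounding quote characters from values. It does not handle
    'export' prefixes, escapes, or variable expansion.
    """
    loaded = {}
    file = Path(path)
    if not file.is_file():
        return loaded  # nothing to load
    for raw in file.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip().strip("'\"")
        loaded[key] = value
        # Existing environment values are kept unless override is requested.
        if override or key not in os.environ:
            os.environ[key] = value
    return loaded


load_env_file()  # returns an empty dict when no .env file is present
```

For anything beyond flat KEY=VALUE pairs, python-dotenv is the safer choice.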
Usage Examples
Basic Usage
import asyncio

from ollama_client_lib import OllamaClient


async def main():
    async with OllamaClient(
        api_key="your_api_key",
        default_model="kimi-k2:1t"  # Use a model available in Ollama Cloud
    ) as client:
        response = await client.generate_response(
            prompt="Explain quantum computing"
        )
        print(response)


asyncio.run(main())
Multiple Clients
from ollama_client_lib import OllamaClient

# Fast client: shorter timeout
client_fast = OllamaClient(
    api_key="your_api_key",
    default_model="kimi-k2:1t",
    timeout=60.0
)

# Powerful client: larger model, longer timeout
client_powerful = OllamaClient(
    api_key="your_api_key",
    default_model="gpt-oss:120b",
    timeout=300.0
)
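When several clients are open at once, each one still needs to be closed. One way to do that is `contextlib.AsyncExitStack`, sketched below. `StubClient` is a stand-in defined only so the snippet runs without network access; with the real library, `OllamaClient` instances would take its place, on the assumption that they behave like any other async context manager.

```python
import asyncio
from contextlib import AsyncExitStack


class StubClient:
    """Stand-in for OllamaClient so this sketch runs without network access."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.closed = False

    async def __aenter__(self) -> "StubClient":
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        self.closed = True


async def use_clients(*clients):
    """Enter every client, use them, and close all of them on exit."""
    async with AsyncExitStack() as stack:
        entered = [await stack.enter_async_context(c) for c in clients]
        return [c.name for c in entered]


fast, powerful = StubClient("fast"), StubClient("powerful")
names = asyncio.run(use_clients(fast, powerful))
print(names, fast.closed, powerful.closed)  # ['fast', 'powerful'] True True
```

The exit stack closes the clients in reverse order of entry, even if one of the calls inside the block raises.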
RAG with Context
import asyncio

from ollama_client_lib import OllamaClient


async def main():
    async with OllamaClient(
        api_key="your_api_key",
        default_model="kimi-k2:1t"
    ) as client:
        context = [
            "Python is a high-level programming language...",
            "It was created by Guido van Rossum...",
        ]
        response = await client.generate_response(
            prompt="Who created Python?",
            context=context
        )
        print(response)


asyncio.run(main())
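How the library combines context chunks with the prompt is not documented here. As a rough mental model only, the sketch below shows one common way to do it: number the chunks and place them ahead of the question. `build_rag_prompt` and its template are hypothetical and are not taken from the library.

```python
from typing import List, Optional


def build_rag_prompt(prompt: str, context: Optional[List[str]] = None) -> str:
    """Hypothetical helper: prepend numbered context chunks to the question."""
    if not context:
        return prompt  # no context: send the question unchanged
    numbered = "\n".join(
        f"[{i}] {chunk}" for i, chunk in enumerate(context, start=1)
    )
    return f"Context:\n{numbered}\n\nQuestion: {prompt}"


print(build_rag_prompt(
    "Who created Python?",
    ["Python is a high-level programming language...",
     "It was created by Guido van Rossum..."],
))
```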
Streaming
import asyncio

from ollama_client_lib import OllamaClient


async def main():
    async with OllamaClient(
        api_key="your_api_key",
        default_model="kimi-k2:1t"
    ) as client:
        async for chunk in client.generate_response_streaming(
            prompt="Tell me a story"
        ):
            print(chunk, end="", flush=True)


asyncio.run(main())
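Streaming code often needs the full text as well as the incremental output. The sketch below prints chunks as they arrive and returns the joined result. `fake_stream` is a stand-in async generator used so the snippet runs offline; in real use, the iterator returned by `generate_response_streaming()` would be passed to `collect` instead.

```python
import asyncio
from typing import AsyncIterator


async def fake_stream() -> AsyncIterator[str]:
    """Stand-in for a streaming call; yields fixed chunks."""
    for chunk in ("Once ", "upon ", "a time"):
        yield chunk


async def collect(stream: AsyncIterator[str]) -> str:
    """Print chunks as they arrive and return the joined text."""
    parts = []
    async for chunk in stream:
        print(chunk, end="", flush=True)
        parts.append(chunk)
    print()  # finish the line after the last chunk
    return "".join(parts)


full_text = asyncio.run(collect(fake_stream()))
```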
Custom System Prompt
import asyncio

from ollama_client_lib import OllamaClient


async def main():
    async with OllamaClient(
        api_key="your_api_key",
        default_model="kimi-k2:1t"
    ) as client:
        response = await client.generate_response(
            prompt="Explain this code",
            system_prompt="You are a code reviewer. Explain code clearly and concisely.",
            context=["def hello(): print('world')"]
        )
        print(response)


asyncio.run(main())
Check Model Availability
import asyncio

from ollama_client_lib import OllamaClient


async def main():
    async with OllamaClient(api_key="your_api_key") as client:
        # List all available models
        models = await client.list_available_models()
        print(f"Available models: {models}")

        # Check if a specific model is available
        available = await client.check_model_available("kimi-k2:1t")
        if available:
            print("Model is available!")


asyncio.run(main())
Note: Model names in the list don't include the -cloud suffix, but it's automatically added when making API calls.
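A typical follow-up is choosing a model from a preference list once the available names are known. The sketch below is a pure function that does this; `pick_model` and `strip_cloud_suffix` are hypothetical helpers, and the `available` list would come from `list_available_models()` in real use.

```python
from typing import Iterable, Optional, Sequence


def strip_cloud_suffix(name: str, suffix: str = "-cloud") -> str:
    """Return the name without a trailing -cloud suffix, if present."""
    return name[: -len(suffix)] if name.endswith(suffix) else name


def pick_model(preferred: Sequence[str], available: Iterable[str]) -> Optional[str]:
    """Return the first preferred model found in `available`, else None.

    Names are compared without the -cloud suffix, so either form matches.
    """
    known = {strip_cloud_suffix(m) for m in available}
    for name in preferred:
        bare = strip_cloud_suffix(name)
        if bare in known:
            return bare
    return None


print(pick_model(["gpt-oss:120b", "kimi-k2:1t"], ["kimi-k2:1t"]))  # kimi-k2:1t
```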
API Reference
For detailed API documentation, see Ollama Cloud Documentation.
OllamaClient
Constructor
OllamaClient(
    api_key: Optional[str] = None,        # API key (required, or set OLLAMA_API_KEY env var)
    base_url: Optional[str] = None,       # Base URL (default: https://ollama.com)
    default_model: Optional[str] = None,  # Default model name (without -cloud suffix)
    timeout: float = 120.0,               # Request timeout in seconds
    use_http2: bool = False,              # Enable HTTP/2 (may cause DNS issues)
    max_keepalive_connections: int = 10,  # Max keep-alive connections in pool
    max_connections: int = 20             # Max total connections in pool
)
Methods
generate_response()
Generate a complete response from the model.
async def generate_response(
    prompt: str,                              # User prompt/question
    model: Optional[str] = None,              # Model name (uses default_model if not set)
    context: Optional[List[str]] = None,      # RAG context chunks
    system_prompt: Optional[str] = None,      # Custom system prompt
    temperature: float = 0.7,                 # Creativity (0.0-1.0, lower = more deterministic)
    max_tokens: Optional[int] = None,         # Maximum tokens to generate
    num_ctx: Optional[int] = None,            # Context window size (default: 4096)
    options: Optional[Dict[str, Any]] = None  # Additional model options
) -> str
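The sampling-related parameters above plausibly end up in a single options dictionary sent with the request. The sketch below shows one way that merge could work. It is an assumption, not the library's documented behavior: `build_options` is a hypothetical helper, `max_tokens` is assumed to map to Ollama's `num_predict` option, and keys in `options` are assumed to override the named arguments.

```python
from typing import Any, Dict, Optional


def build_options(
    temperature: float = 0.7,
    max_tokens: Optional[int] = None,
    num_ctx: Optional[int] = None,
    options: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: fold the keyword arguments into one options dict.

    Assumptions: max_tokens maps to Ollama's num_predict option, and keys in
    `options` override the named arguments.
    """
    merged: Dict[str, Any] = {"temperature": temperature}
    if max_tokens is not None:
        merged["num_predict"] = max_tokens
    if num_ctx is not None:
        merged["num_ctx"] = num_ctx
    merged.update(options or {})
    return merged


print(build_options(temperature=0.2, max_tokens=256, options={"top_p": 0.9}))
# {'temperature': 0.2, 'num_predict': 256, 'top_p': 0.9}
```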
generate_response_streaming()
Generate a streaming response (yields chunks incrementally).
async def generate_response_streaming(
    prompt: str,
    model: Optional[str] = None,
    context: Optional[List[str]] = None,
    system_prompt: Optional[str] = None,
    temperature: float = 0.7,
    max_tokens: Optional[int] = None,
    num_ctx: Optional[int] = None,
    options: Optional[Dict[str, Any]] = None
) -> AsyncIterator[str]
check_model_available()
Check if a model is available in Ollama Cloud.
async def check_model_available(model: Optional[str] = None) -> bool
list_available_models()
List all models available in your Ollama Cloud account.
async def list_available_models() -> List[str]
close()
Close the HTTP client and release resources. Called automatically when the client is used as a context manager.
async def close() -> None
Examples
See the examples/ directory for more complete examples:
- basic_usage.py - Basic client usage
- multiple_clients.py - Using multiple client instances
- streaming.py - Streaming responses
- rag_example.py - RAG with context
- context_manager.py - Using context manager
- custom_prompt.py - Custom system prompts
- check_models.py - Checking model availability
Requirements
- Python 3.8+
- httpx >= 0.24.0
- python-dotenv (optional, for loading .env files)
License
MIT License - see LICENSE file for details.
Testing
Run the test suite:
# Unit tests only
pytest tests/test_client.py -v
# All tests (including integration tests - requires API key in .env)
pytest tests/ -v
Integration tests require a valid OLLAMA_API_KEY in your .env file.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a Pull Request