The Oshepherd guiding the Ollama(s) inference orchestration.

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

oshepherd

A centralized FastAPI service, using Celery and Redis to orchestrate multiple Ollama servers as workers.

Install

pip install oshepherd

Usage

Set up Redis:

Celery uses Redis as its message broker and result backend. You'll need a Redis instance, which you can provision for free at redislabs.com.

Set up the FastAPI server:

# define configuration env file
# use credentials for redis as broker and backend
cp .api.env.template .api.env

# start api
oshepherd start-api --env-file .api.env

Set up the Celery/Ollama worker(s):

# install ollama https://ollama.com/download
# optionally pull the model
ollama pull mistral

# define configuration env file
# use credentials for redis as broker and backend
cp .worker.env.template .worker.env

# start worker
oshepherd start-worker --env-file .worker.env

Logging

Oshepherd uses Python's standard logging library for both the API server and Celery worker.

Set LOGLEVEL in either .api.env or .worker.env to control verbosity:

LOGLEVEL="info"

Valid values include debug, info, warning, error, and critical. API access logs from Uvicorn are disabled by default; enable them with:

UVICORN_ACCESS_LOG=true

Request and response payload bodies are only logged at debug level because they may contain prompts, model output, or other sensitive data.
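
How a LOGLEVEL string maps onto the standard library's numeric levels can be sketched as follows. This is only an illustration of the behavior described above, not oshepherd's actual code: the helper names resolve_log_level and log_payload are hypothetical, and the fallback to info for unrecognized values is an assumption.

```python
import logging
import os

VALID_LEVELS = ("debug", "info", "warning", "error", "critical")


def resolve_log_level(value: str, default: str = "info") -> int:
    """Translate a LOGLEVEL string into a numeric logging level."""
    # Tolerate surrounding quotes and mixed case, as in LOGLEVEL="info".
    name = (value or default).strip().strip('"').lower()
    if name not in VALID_LEVELS:
        name = default  # assumption: unknown values fall back to the default
    return getattr(logging, name.upper())


def log_payload(logger: logging.Logger, label: str, payload: object) -> None:
    """Emit request/response bodies at debug level only."""
    logger.debug("%s payload: %r", label, payload)


if __name__ == "__main__":
    logging.basicConfig(level=resolve_log_level(os.environ.get("LOGLEVEL", "info")))
    log = logging.getLogger("example")
    log.info("shown unless LOGLEVEL is warning or higher")
    log_payload(log, "request", {"prompt": "Why is the sky blue?"})
```

Run with LOGLEVEL=debug, the last call prints the payload; at info or above it stays silent, which keeps prompts and model output out of routine logs.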

You can now run Ollama completions remotely. Point your Ollama client at your oshepherd API server by setting the host, and it will return the requested completions from any of the workers:

ollama-python client:

import ollama

client = ollama.Client(host="http://127.0.0.1:5001")

# Standard request
response = client.generate(model="mistral", prompt="Why is the sky blue?")

# Streaming request
for chunk in client.generate(model="mistral", prompt="Why is the sky blue?", stream=True):
    print(chunk['response'], end='', flush=True)

For a complete Python example with streaming support, see examples/pretty_streaming.py.

ollama-js client:

import { Ollama } from "ollama/browser";

const ollama = new Ollama({ host: "http://127.0.0.1:5001" });

// Standard request
const response = await ollama.generate({
    model: "mistral",
    prompt: "Why is the sky blue?",
});

// Streaming request
const streamResponse = await ollama.generate({
    model: "mistral",
    prompt: "Why is the sky blue?",
    stream: true
});

for await (const chunk of streamResponse) {
    process.stdout.write(chunk.response);
}

For a complete TypeScript/JavaScript example with streaming support, see examples/ts-scripts/README.md.

Raw HTTP request:

curl -X POST -H "Content-Type: application/json" -L http://127.0.0.1:5001/api/generate/ \
-d '{"model":"mistral","prompt":"Why is the sky blue?","stream":true}' \
--no-buffer
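
With "stream": true, the Ollama API answers with newline-delimited JSON objects, each carrying a response fragment and a done flag. Assuming oshepherd relays that format unchanged, the same request can be sketched with only the Python standard library. The function names iter_fragments and stream_generate are hypothetical, and the path follows the POST /api/generate entry in the endpoint list rather than the trailing-slash form used with curl.

```python
import json
import urllib.request
from typing import Iterable, Iterator


def iter_fragments(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield the text fragment from each newline-delimited JSON chunk."""
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # skip blank lines between chunks
        chunk = json.loads(raw)
        yield chunk.get("response", "")
        if chunk.get("done"):
            break  # the final chunk is marked with "done": true


def stream_generate(host: str, model: str, prompt: str) -> Iterator[str]:
    """POST to /api/generate and yield fragments as they arrive."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": True}).encode()
    request = urllib.request.Request(
        f"{host}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # An HTTP response object iterates line by line, one JSON chunk per line.
    with urllib.request.urlopen(request) as response:
        yield from iter_fragments(response)
```

Iterating over stream_generate("http://127.0.0.1:5001", "mistral", "Why is the sky blue?") and printing each fragment reproduces the curl output as plain text. Unlike curl -L, urllib does not follow a 307 redirect for POST requests, so if the server redirects to the trailing-slash path, use that path directly.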

Example: PyCon Austria 2025

For a practical example of how oshepherd can be used to orchestrate on-premise open-source LLMs, see the companion repository from the PyCon Austria 2025 talk "Beyond the Cloud: On-Premise Orchestration for Open-Source LLMs".

Disclaimers 🚨

This package is in alpha; its architecture and API might change in the near future. It is currently being tested in a controlled environment by real users, but it has not been audited or tested thoroughly. Use it at your own risk.

As this is an alpha version, support and responses might be limited. We'll do our best to address questions and issues as quickly as possible.

API server parity

Generate a completion: POST /api/generate
Generate a chat completion: POST /api/chat
Generate Embeddings: POST /api/embeddings
List Local Models: GET /api/tags
Version: GET /api/version
Show Model Information: POST /api/show
List Running Models: GET /api/ps

The oshepherd API server currently supports the endpoints listed above, which makes it compatible with the official Ollama clients (e.g. ollama-python, ollama-js) for those calls. These endpoints cover the most common use cases. Additional endpoints from the official Ollama API are not planned for the near future. For the full Ollama API specification, refer to the Ollama API documentation.
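
Because only a subset of the Ollama API is available, a client-side guard can fail fast before sending an unsupported call. The sketch below simply encodes the endpoint list above as a lookup table; SUPPORTED_ENDPOINTS and is_supported are hypothetical names and are not part of oshepherd.

```python
# Method/path pairs copied from the endpoint list above.
SUPPORTED_ENDPOINTS = {
    ("POST", "/api/generate"),
    ("POST", "/api/chat"),
    ("POST", "/api/embeddings"),
    ("GET", "/api/tags"),
    ("GET", "/api/version"),
    ("POST", "/api/show"),
    ("GET", "/api/ps"),
}


def is_supported(method: str, path: str) -> bool:
    """Return True if the method/path pair appears in the list above."""
    normalized = path.rstrip("/") or "/"  # treat a trailing slash as equivalent
    return (method.upper(), normalized) in SUPPORTED_ENDPOINTS
```

For example, is_supported("POST", "/api/chat") is true, while is_supported("POST", "/api/pull") is false, because model pulls are part of the official Ollama API but not of the list above.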

Contribution guidelines

We welcome contributions! If you find a bug or have suggestions for improvements, please open an issue or submit a pull request targeting the development branch. Before creating a new issue or pull request, take a moment to search the existing ones to avoid duplicates.

Conda Support

To run and build locally you can use conda:

conda create -n oshepherd python=3.12
conda activate oshepherd
pip install -r requirements.txt

# install oshepherd
pip install -e .

Tests

The e2e tests require the following models to be available on your local Ollama instance:

ollama pull mistral        # used by generate, chat, and show tests
ollama pull embeddinggemma # used by embeddings tests

Then follow the usage instructions to start the API server and Celery worker, and run:

pytest -s tests/
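
Before running the e2e tests, it can help to confirm that both models are actually present on the local Ollama instance. The following is a minimal sketch, assuming Ollama listens on its default port 11434 and that GET /api/tags returns a models array whose entries have a name such as mistral:latest. The helper names missing_models and fetch_tags are hypothetical.

```python
import json
import urllib.request

# Models named in the test prerequisites above.
REQUIRED_MODELS = ("mistral", "embeddinggemma")


def missing_models(tags_payload: dict, required=REQUIRED_MODELS) -> list:
    """Return required model names absent from a GET /api/tags payload."""
    installed = {
        entry.get("name", "").split(":", 1)[0]  # drop the tag, e.g. ":latest"
        for entry in tags_payload.get("models", [])
    }
    return [name for name in required if name not in installed]


def fetch_tags(host: str = "http://127.0.0.1:11434") -> dict:
    """Fetch the model list from an Ollama instance (default port assumed)."""
    with urllib.request.urlopen(f"{host}/api/tags") as response:
        return json.load(response)
```

Calling missing_models(fetch_tags()) returns an empty list when both models are installed; any name it returns still needs an ollama pull.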

Author

This is a project developed and maintained by mnemonica.ai.

License

MIT

Release history

Version	Release date
0.0.23 (this version)	Jun 26, 2026
0.0.22	Jun 24, 2026
0.0.21	Dec 3, 2025
0.0.20	Dec 1, 2025
0.0.19	Nov 29, 2025
0.0.18	Sep 13, 2025
0.0.17	Sep 7, 2025
0.0.16	Apr 6, 2025
0.0.14	Feb 23, 2025
0.0.13	Feb 20, 2025
0.0.12	Nov 10, 2024
0.0.11	Nov 3, 2024
0.0.9	Jun 17, 2024
0.0.6	May 3, 2024
0.0.5	Apr 29, 2024
0.0.4	Apr 29, 2024
0.0.3	Apr 9, 2024

Download files

Source Distribution

oshepherd-0.0.23.tar.gz (25.0 kB)

Uploaded Jun 26, 2026 Source

Built Distribution

oshepherd-0.0.23-py3-none-any.whl (31.1 kB)

Uploaded Jun 26, 2026 Python 3

File details

Details for the file oshepherd-0.0.23.tar.gz.

File metadata

Download URL: oshepherd-0.0.23.tar.gz
Upload date: Jun 26, 2026
Size: 25.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.11

File hashes

Hashes for oshepherd-0.0.23.tar.gz
Algorithm	Hash digest
SHA256	`52d5da4056082736d20f55bd696f5342f0a56cb51b9877aa6a01e829608da7d3`
MD5	`fa8869854c0878b5655811f390cd8a17`
BLAKE2b-256	`25247b029c6cc4b82d92e3324e179e5710152a0db69d735c31af6b8c52627af0`

File details

Details for the file oshepherd-0.0.23-py3-none-any.whl.

File metadata

Download URL: oshepherd-0.0.23-py3-none-any.whl
Upload date: Jun 26, 2026
Size: 31.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.11

File hashes

Hashes for oshepherd-0.0.23-py3-none-any.whl
Algorithm	Hash digest
SHA256	`34f7b7c6eea25c84e73ae24f5da11436f09eda4193a620a997a64e3e62a29441`
MD5	`638d74d01e3b4dd973ad9ca2db711ae6`
BLAKE2b-256	`32cead78cc63933cadc287b760223d35da1d0cdba887c84b2c1185b0d7f08821`
