
oshepherd

The Oshepherd guiding the Ollama(s) inference orchestration.

A centralized FastAPI service that uses Celery and Redis to orchestrate multiple Ollama servers as workers.
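
At a high level, the API process enqueues each inference request as a Celery task on Redis, and a worker process running next to an Ollama server picks it up, runs the completion, and pushes the result back through Redis. A minimal sketch of that pattern (not oshepherd's actual code; the broker/backend URLs and the local Ollama address are illustrative):

    import requests
    from celery import Celery

    # illustrative broker/backend URLs; oshepherd reads these from env files
    app = Celery(
        "sketch",
        broker="redis://localhost:6379/0",
        backend="redis://localhost:6379/0",
    )

    @app.task
    def generate(payload: dict) -> dict:
        # a worker colocated with an Ollama server forwards the request to it
        return requests.post("http://localhost:11434/api/generate", json=payload).json()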

Install

pip install oshepherd

Usage

  1. Setup Redis:

    Celery uses Redis as its message broker and result backend. You'll need a Redis instance reachable by both the API server and the workers (via a standard redis:// connection URL), which you can provision for free at redislabs.com.

  2. Setup FastAPI Server:

    # define configuration env file
    # use credentials for redis as broker and backend
    cp .api.env.template .api.env
    
    # start api
    oshepherd start-api --env-file .api.env
    
  3. Setup Celery/Ollama Worker(s):

    # install ollama https://ollama.com/download
    # optionally pull the model
    ollama pull mistral
    
    # define configuration env file
    # use credentials for redis as broker and backend
    cp .worker.env.template .worker.env
    
    # start worker
    oshepherd start-worker --env-file .worker.env
    
  4. Now you're ready to execute Ollama completions remotely. Point your Ollama client at your oshepherd API server by setting the host, and your requested completions will be returned from any of the workers:

    • Python client (ollama-python):

    import ollama
    
    client = ollama.Client(host="http://127.0.0.1:5001")
    ollama_response = client.generate(model="mistral", prompt="Why is the sky blue?")
    
    • JavaScript client (ollama-js):

    import { Ollama } from "ollama/browser";
    
    const ollama = new Ollama({ host: "http://127.0.0.1:5001" });
    const ollamaResponse = await ollama.generate({
        model: "mistral",
        prompt: "Why is the sky blue?",
    });
    
    • Raw HTTP request:
    curl -X POST -H "Content-Type: application/json" -L http://127.0.0.1:5001/api/generate/ -d '{
        "model": "mistral",
        "prompt":"Why is the sky blue?"
    }'
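
    • Non-streaming request from Python, as a minimal sketch using the requests library (the stream flag is part of the standard Ollama API; whether a given oshepherd endpoint honors it is an assumption here):

    import requests
    
    response = requests.post(
        "http://127.0.0.1:5001/api/generate/",
        json={"model": "mistral", "prompt": "Why is the sky blue?", "stream": False},
    )
    # with streaming disabled, the body is a single JSON object whose
    # "response" field holds the completion text
    print(response.json()["response"])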
    

Disclaimers 🚨

This package is in alpha; its architecture and API might change in the near future. It is currently being tested in a controlled environment by real users, but it has not been audited or thoroughly tested. Use it at your own risk.

As this is an alpha version, support and responses might be limited. We'll do our best to address questions and issues as quickly as possible.

API server parity

  • Generate a completion: POST /api/generate
  • Generate a chat completion: POST /api/chat
  • Generate Embeddings: POST /api/embeddings
  • List Local Models: GET /api/tags (pending)
  • Show Model Information: POST /api/show (pending)
  • List Running Models: GET /api/ps (pending)

The oshepherd API server has been designed to maintain compatibility with the endpoints defined by Ollama, ensuring that any official client (e.g., ollama-python, ollama-js) can use this server as host and receive the expected responses. For the full API specifications, refer to the official Ollama API documentation.
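
As a quick illustration of that parity, the official Python client can reach the chat and embeddings endpoints through oshepherd as well (a sketch; the model name assumes a worker has pulled mistral):

    import ollama
    
    client = ollama.Client(host="http://127.0.0.1:5001")
    
    # chat completion via POST /api/chat
    chat_response = client.chat(
        model="mistral",
        messages=[{"role": "user", "content": "Why is the sky blue?"}],
    )
    
    # embeddings via POST /api/embeddings
    embeddings_response = client.embeddings(model="mistral", prompt="Why is the sky blue?")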

Contribution guidelines

We welcome contributions! If you find a bug or have suggestions for improvements, please open an issue or submit a pull request targeting the development branch. Before creating a new issue or pull request, take a moment to search the existing ones to avoid duplicates.

Conda Support

To build and run locally, you can use conda:

conda create -n oshepherd python=3.8
conda activate oshepherd
pip install -r requirements.txt

# install oshepherd
pip install -e .

Tests

Follow the usage instructions above to start the API server and a Celery worker against a local Ollama, then run the tests:

pytest -s tests/

Author

This is a project developed and maintained by mnemonica.ai.

License

MIT

