
Instruct and validate structured outputs from LLMs with Ollama.

Project description

ollama-instructor

ollama-instructor is a lightweight Python library that provides a convenient wrapper around the Client of the renowned Ollama repository, extending it with validation features for obtaining valid JSON responses from a Large Language Model (LLM). Utilizing Pydantic, ollama-instructor lets users specify Pydantic models for JSON schema generation and data validation, ensuring that responses from LLMs adhere to the defined schema.


Note 1: This library has native support for Ollama's Python client. If you want more flexibility with other providers like Groq, OpenAI, Perplexity and more, have a look at the great instructor library by Jason Liu.

Note 2: This library depends on having Ollama installed and running. For more information, please refer to the official website of Ollama.


Documentation and guides

Examples

Blog

Features

  • Easy integration with the Ollama repository for running open-source LLMs locally.
  • Data validation using Pydantic BaseModel to ensure the JSON response from an LLM meets the specified schema.
  • Retries with error guidance if the LLM returns invalid responses. You can set the maximum number of retries.
  • Partial responses can be allowed by setting the allow_partial flag to True. Invalid data within the response is then cleaned and set to None, and data that is not part of the Pydantic model is removed from the response (see the sketch after this list).
  • Reasoning to enhance the response quality of an LLM. This can be useful for complex tasks and JSON schemas and helps smaller LLMs perform better. By setting format to '' instead of 'json' (the default), the LLM can return a string with step-by-step reasoning. The LLM is instructed to return the JSON response within a fenced json code block, which ollama-instructor extracts (see example).
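A rough sketch of how the retries and allow_partial flags can be combined with chat_completion (the method is shown in the Quick guide below; the model name, prompt and flag values here are purely illustrative):

from pydantic import BaseModel
from ollama_instructor.ollama_instructor_client import OllamaInstructorClient

class Person(BaseModel):
    # Defaults are useful with allow_partial: invalid or missing fields
    # can fall back to these instead of failing validation.
    name: str = ''
    age: int = 0

client = OllamaInstructorClient()  # defaults to the local Ollama server
response = client.chat_completion(
    model='phi3',
    pydantic_model=Person,
    messages=[{'role': 'user', 'content': 'Jason is 30 years old.'}],
    retries=2,           # maximum number of guided retries (default: 3)
    allow_partial=True   # return a cleaned, partially filled response instead of raising
)
print(response['message']['content'])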

ollama-instructor can help you to get structured and reliable JSON from local LLMs like:

  • llama3 & llama3.1
  • phi3
  • mistral
  • gemma
  • ...

ollama-instructor can be your starting point to build agents by yourself, with full control over agent flows and without relying on a complex agent framework.

Concept

[Concept diagram: Concept.png]

Find more here: The concept of ollama-instructor

Quick guide

Installation

To install ollama-instructor, run the following command in your terminal:

pip install ollama-instructor

Quick Start

Here are quick examples to get you started with ollama-instructor:

chat completion:

from ollama_instructor.ollama_instructor_client import OllamaInstructorClient
from pydantic import BaseModel

class Person(BaseModel):
    name: str
    age: int

client = OllamaInstructorClient(...)
response = client.chat_completion(
    model='phi3',
    pydantic_model=Person,
    messages=[
        {
            'role': 'user',
            'content': 'Jason is 30 years old.'
        }
    ]
)

print(response['message']['content'])

Output:

{"name": "Jason", "age": 30}

asynchronous chat completion:

from pydantic import BaseModel, ConfigDict
from enum import Enum
from typing import List
import rich
import asyncio

from ollama_instructor.ollama_instructor_client import OllamaInstructorAsyncClient

class Gender(Enum):
    MALE = 'male'
    FEMALE = 'female'

class Person(BaseModel):
    '''
    This model defines a person.
    '''
    name: str
    age: int
    gender: Gender
    friends: List[str] = []

    model_config = ConfigDict(
        extra='forbid'
    )

async def main():
    client = OllamaInstructorAsyncClient(...)
    await client.async_init()  # Important: must call this before using the client

    response = await client.chat_completion(
        model='phi3:instruct',
        pydantic_model=Person,
        messages=[
            {
                'role': 'user',
                'content': 'Jason is 25 years old. Jason loves to play soccer with his friends Nick and Gabriel. His favorite food is pizza.'
            }
        ],
    )
    rich.print(response['message']['content'])

if __name__ == "__main__":
    asyncio.run(main())

chat completion with streaming:

from ollama_instructor.ollama_instructor_client import OllamaInstructorClient
from pydantic import BaseModel

class Person(BaseModel):
    name: str
    age: int

client = OllamaInstructorClient(...)
response = client.chat_completion_with_stream(
    model='phi3',
    pydantic_model=Person,
    messages=[
        {
            'role': 'user',
            'content': 'Jason is 30 years old.'
        }
    ]
)

for chunk in response:
    print(chunk['message']['content'])

OllamaInstructorClient and OllamaInstructorAsyncClient

The classes OllamaInstructorClient and OllamaInstructorAsyncClient are the main classes of the ollama-instructor library. They are wrappers around the Ollama (sync and async) client and take the following arguments (instantiation is sketched after the note below):

  • host: the URL of the Ollama server (default: http://localhost:11434). See documentation of Ollama
  • debug: a bool indicating whether to print debug messages (default: False).

Note: Up to version v0.4.2 this library used icecream for debugging; since then it uses the logging module.
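A minimal sketch of instantiating both clients with these arguments (the values shown are the documented defaults):

from ollama_instructor.ollama_instructor_client import (
    OllamaInstructorClient,
    OllamaInstructorAsyncClient,
)

# Synchronous client; both arguments are optional.
client = OllamaInstructorClient(
    host='http://localhost:11434',  # URL of the running Ollama server
    debug=False                     # set to True to get debug logging
)

# Asynchronous client; remember to await async_init() before first use,
# as shown in the Quick Start above.
async_client = OllamaInstructorAsyncClient(host='http://localhost:11434')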

chat_completion & chat_completion_with_stream

The chat_completion and chat_completion_with_stream methods are the main methods of the library. They are used to generate text completions from a given prompt.

ollama-instructor uses chat_completion and chat_completion_with_stream to expand the chat method of Ollama. For all available arguments of chat see the Ollama documentation.

The following arguments are added to the chat method within chat_completion and chat_completion_with_stream:

  • pydantic_model: a subclass of Pydantic's BaseModel that is used first to instruct the LLM with the JSON schema of the BaseModel and second to validate the LLM's response with Pydantic's built-in validation.
  • retries: the number of retries if the LLM fails to generate a valid response (default: 3). On a retry, the LLM's last response is provided together with the resulting ValidationError and the LLM is instructed to generate a valid response.
  • allow_partial: If set to True, ollama-instructor modifies the BaseModel to allow partial responses. In this case it makes sure to return a correct instance of the JSON schema, but with default or None values. It is therefore useful to provide default values within the BaseModel. As the library improves, you will find examples and best-practice guides on this topic in the docs folder.
  • format: This is in fact an argument of Ollama already, but since version 0.4.0 of ollama-instructor it can be set to 'json' or ''. By default ollama-instructor uses the 'json' format; before version 0.4.0 only 'json' was possible. Within chat_completion (NOT chat_completion_with_stream) you can set format='' to enable the reasoning capabilities. The default system prompt of ollama-instructor instructs the LLM to place its JSON response in a fenced json code block, from which the JSON is extracted for validation. When providing your own system prompt and setting format='', this has to be taken into account. See an example here; a sketch also follows this list.
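A hedged sketch of the reasoning mode with chat_completion; only format differs from the earlier examples, and the model and prompt are again illustrative:

from pydantic import BaseModel
from ollama_instructor.ollama_instructor_client import OllamaInstructorClient

class Person(BaseModel):
    name: str
    age: int

client = OllamaInstructorClient()
response = client.chat_completion(
    model='phi3',
    pydantic_model=Person,
    format='',  # '' enables step-by-step reasoning; 'json' (default) forces pure JSON output
    messages=[
        {'role': 'user', 'content': 'Jason is 30 years old.'}
    ]
)
# The LLM is instructed to place the final JSON in a fenced json code block,
# which ollama-instructor extracts for validation.
print(response['message']['content'])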

Documentation and examples

  • It is my goal to have a well documented library. Therefore, have a look into the repository's code to get an idea of how to use it.
  • There will be a bunch of guides and examples in the docs folder (work in progress).
  • If you need more information about the library, please feel free to open a discussion or write an email to lennartpollvogt@protonmail.com.

License

ollama-instructor is released under the MIT License. See the LICENSE file for more details.

Support and Community

If you need help or want to discuss ollama-instructor, feel free to open an issue or a discussion on GitHub, or just drop me an email (lennartpollvogt@protonmail.com). I always welcome new ideas and use cases for LLMs and vision models, and would love to cover them in the examples folder. Feel free to discuss them with me via email, issue or the discussion section of this repository. 😊

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ollama_instructor-0.5.0.tar.gz (17.6 kB)

Uploaded Source

Built Distribution

ollama_instructor-0.5.0-py3-none-any.whl (18.1 kB)

Uploaded Python 3

File details

Details for the file ollama_instructor-0.5.0.tar.gz.

File metadata

  • Download URL: ollama_instructor-0.5.0.tar.gz
  • Upload date:
  • Size: 17.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.3 CPython/3.12.3 Darwin/24.0.0

File hashes

Hashes for ollama_instructor-0.5.0.tar.gz

  • SHA256: 21effd10e78b1f644625ee38ff4d1d29f857350d5cf72452c349789a34fc0562
  • MD5: 933f628a18482bae8e47037ce4c39951
  • BLAKE2b-256: d8a641ed9f0938ba938b17d4d0f1006c2cb2700b1bb8a759bfac091d0a968f5f

See more details on using hashes here.

File details

Details for the file ollama_instructor-0.5.0-py3-none-any.whl.

File metadata

File hashes

Hashes for ollama_instructor-0.5.0-py3-none-any.whl

  • SHA256: 7faaa17819190097972db66e8c5c775017737168c6c09c8a33a469fc30f62b97
  • MD5: 74fa8da79f3aaf2b71f9174cc3bb0e9a
  • BLAKE2b-256: e24cadea8846303fc85b76244f53e88c9b8bcf203ce8835a1881dacfa62692ba

See more details on using hashes here.
