Zrb LLM plugin

Zrb Ollama

Zrb Ollama is a PyPI package that wraps LiteLLM, allowing you to incorporate LLMs into your workflow.

Zrb Ollama is a part of the Zrb ecosystem, but you can install it independently from Zrb.

Installation

You can install Zrb Ollama by invoking any of the following commands:

# From pypi
pip install "zrb-ollama[rag,aws]"

# From github
pip install git+https://github.com/state-alchemists/zrb-ollama.git@main

# From directory
pip install --use-feature=in-tree-build path/to/this/directory

By default, Zrb Ollama uses an Ollama-based LLM. You can install Ollama by visiting the official website: https://ollama.ai/.

The default LLM is ollama/mistral:7b-instruct, while the default embedding LLM is ollama/nomic-embed-text.

You can change these defaults by setting the model parameter on LLMTask or on the create_rag function. See the LiteLLM provider documentation to use a custom LLM.

Setup LLM

Using Ollama

You can install and use Ollama to run models locally. To use Ollama in zrb-ollama, you need to set two variables:

  • ZRB_OLLAMA_LLM_MODEL (set this to ollama/gemma2, ollama/qwen2 or other ollama models)
  • ZRB_OLLAMA_EMBEDDING_MODEL (set this to ollama/nomic-embed-text or other ollama models)
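If you start zrb-ollama from a Python script rather than a shell, one way to set these two variables is through os.environ. This is a minimal sketch: ollama_env is a hypothetical helper, and the sketch assumes the variables must be in place before zrb_ollama is imported, which is not confirmed here.

```python
import os


def ollama_env(llm_model="ollama/gemma2", embedding_model="ollama/nomic-embed-text"):
    """Hypothetical helper: build the two variables named in the list above."""
    return {
        "ZRB_OLLAMA_LLM_MODEL": llm_model,
        "ZRB_OLLAMA_EMBEDDING_MODEL": embedding_model,
    }


# Assumption: set the variables before importing zrb_ollama so they are picked up.
os.environ.update(ollama_env())
print(os.environ["ZRB_OLLAMA_LLM_MODEL"])
```

In a shell, the equivalent is to export both variables before running zrb-ollama.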

Using OpenAI

To use OpenAI, you need to set three variables:

  • OPENAI_API_KEY
  • ZRB_OLLAMA_LLM_MODEL (set this to gpt-4o, gpt-4o-mini or other OpenAI models)
  • ZRB_OLLAMA_EMBEDDING_MODEL (set this to text-embedding-ada-002 or other OpenAI models)
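Because all three variables are needed, a quick check before starting can save a confusing failure later. The sketch below is only an illustration: missing_openai_vars is a hypothetical helper, not part of zrb-ollama, and the key value is a placeholder.

```python
import os

# Variable names are taken from the list above.
REQUIRED_OPENAI_VARS = (
    "OPENAI_API_KEY",
    "ZRB_OLLAMA_LLM_MODEL",
    "ZRB_OLLAMA_EMBEDDING_MODEL",
)


def missing_openai_vars(env=None):
    """Hypothetical helper: return the required names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_OPENAI_VARS if not env.get(name)]


# Example with a partial configuration (placeholder key, no embedding model).
example_env = {"OPENAI_API_KEY": "sk-placeholder", "ZRB_OLLAMA_LLM_MODEL": "gpt-4o"}
print(missing_openai_vars(example_env))
```

Calling missing_openai_vars() with no argument checks the real environment instead of the example dictionary.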

Interactive Mode

Zrb Ollama provides a simple CLI command so you can interact with the LLM immediately:

zrb-ollama

To enhance zrb-ollama with tools, you can create a file named zrb_ollama_init.py and register the tools:

import os
from zrb_ollama import interactive_tools
from zrb_ollama.tools import create_rag, get_rag_documents

_CURRENT_DIR = os.path.dirname(__file__)

# Create RAG function
retrieve_john_titor_info = create_rag(
    tool_name='retrieve_john_titor_info',
    tool_description="Look for anything related to John Titor",
    documents=get_rag_documents(os.path.join(_CURRENT_DIR, "rag", "document")),
    vector_db_path=os.path.join(_CURRENT_DIR, "rag", "vector"),
    # reset_db=True,
)

# Register RAG function as zrb-ollama tool
interactive_tools.register(retrieve_john_titor_info)


# Create a simple function
def add(a: int, b: int) -> int:
    """Adding two numbers and return the result"""
    return a + b


# Register the function as zrb-ollama tool
interactive_tools.register(add)

zrb-ollama automatically loads zrb_ollama_init.py and makes any registered tools available in the interface.
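A plain function such as add describes itself through its name, type hints, and docstring, so it is worth writing all three carefully. As a rough illustration of the information such a function exposes, the standard inspect module can render it. This is not zrb-ollama's actual implementation, and describe_tool is a hypothetical helper.

```python
import inspect


def add(a: int, b: int) -> int:
    """Adding two numbers and return the result"""
    return a + b


def describe_tool(func):
    """Hypothetical helper: summarize a function as 'name(signature): docstring'."""
    signature = inspect.signature(func)
    doc = inspect.getdoc(func) or ""
    return f"{func.__name__}{signature}: {doc}"


print(describe_tool(add))
```

A function without type hints or a docstring would produce a much less informative summary, which is a reasonable argument for annotating every tool you register.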

Command

While in interactive mode, you can use the following commands:

/?                     Show help
/bye                   Quit
/clear                 Clear context
/multi                 Start multiline mode
/end                   Stop multiline mode
/model [model]         Get/set current model (e.g., ollama/mistral:7b-instruct, gpt-4o)

/tool                  Get list of tools
/tool add <tool-name>  Add tool
/tool rm <tool-name>   Remove tool

All commands start with a /.
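To make the command syntax concrete, here is a minimal sketch of how a line of input could be split into a command name and its arguments. It is an illustration only: parse_command is a hypothetical function, not the parser zrb-ollama uses.

```python
def parse_command(line):
    """Hypothetical parser: return (name, args) for '/name arg ...', else None."""
    stripped = line.strip()
    if not stripped.startswith("/"):
        return None  # ordinary chat input, not a command
    parts = stripped[1:].split()
    if not parts:
        return None  # a lone "/" carries no command name
    return parts[0], parts[1:]


print(parse_command("/tool add query_internet"))
print(parse_command("/model gpt-4o"))
print(parse_command("How are you?"))
```

Anything that does not begin with / falls through as a normal message to the LLM in this sketch.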

Using LLMTask (For Integration with Zrb)

Zrb Ollama provides a task named LLMTask, allowing you to create a Zrb Task with a custom model or tools.

import os

from zrb import StrInput, runner
from zrb_ollama import LLMTask, ToolFactory
from zrb_ollama.tools import (
    create_rag, get_rag_documents, query_internet
)

_CURRENT_DIR = os.path.dirname(__file__)
_RAG_DIR = os.path.join(_CURRENT_DIR, "rag")

rag = LLMTask(
    name="rag",
    inputs=[
        StrInput(name="user-prompt", default="How does John Titor introduce himself?"),
    ],
    # model="gpt-4o",
    user_message="{{input.user_prompt}}",
    tools=[query_internet],
    tool_factories=[
        ToolFactory(
            create_rag,
            tool_name="retrieve_john_titor_info",
            tool_description="Look for anything related to John Titor",
            documents=get_rag_documents(os.path.join(_RAG_DIR, "document")),
            # model="text-embedding-ada-002",
            vector_db_path=os.path.join(_RAG_DIR, "vector"),
            # reset_db=True,
        )
    ],
)
runner.register(rag)

Assuming there is a file named john-titor.md inside the rag/document folder, you can run the Task with the following command.

zrb rag

The LLM can browse the article or look for anything on the internet.

Using Agent (For Integration with Anything Else)

Under the hood, LLMTask makes use of Agent. You can create and interact with the agent programmatically as follows.

import asyncio
import os

from zrb_ollama import Agent
from zrb_ollama.tools import create_rag, get_rag_documents, query_internet

_CURRENT_DIR = os.path.dirname(__file__)
_RAG_DIR = os.path.join(_CURRENT_DIR, "rag")

agent = Agent(
    model="gpt-4o",
    tools=[
        # RAG tool backed by the documents in rag/document
        create_rag(
            tool_name="retrieve",
            tool_description="Look for anything related to John Titor",
            documents=get_rag_documents(os.path.join(_RAG_DIR, "document")),
            # model="text-embedding-ada-002",
            vector_db_path=os.path.join(_RAG_DIR, "vector"),
            # reset_db=True,
        ),
        # Internet search tool from zrb_ollama.tools
        query_internet,
    ],
)
# add_user_message is awaited through asyncio.run
result = asyncio.run(agent.add_user_message("How does John Titor introduce himself?"))
print(result)

Configurations

You can set Zrb Ollama configurations using environment variables.

  • LLM_MODEL
    • Default: ollama/mistral:7b-instruct
    • Description: Default LLM model for LLMTask and interactive mode. See LiteLLM for valid values.
  • INTERACTIVE_ENABLED_TOOL_NAMES
    • Default: query_internet,open_web_page,run_shell_command
    • Description: Default tools enabled for interactive mode.
  • RAG_EMBEDDING_MODEL
    • Default: ollama/nomic-embed-text
    • Description: Default RAG embedding model for LLMTask and interactive mode. See LiteLLM for valid values.
  • RAG_CHUNK_SIZE
    • Default: 1024
    • Description: Default chunk size for RAG.
  • RAG_OVERLAP
    • Default: 128
    • Description: Default chunk overlap size for RAG.
  • RAG_MAX_RESULT_COUNT
    • Default: 5
    • Description: Default result count for RAG.
  • DEFAULT_SYSTEM_PROMPT
    • Default: You are a helpful assistant. You provide accurate and comprehensive answers.
    • Description: Default system prompt for LLM Agent.
  • DEFAULT_SYSTEM_MESSAGE_TEMPLATE
    • Default: See config.py
    • Description: Default template for the LLM Agent's system message. May contain the following:
      • {system_prompt}
      • {response_format}
      • {function_signatures}
  • DEFAULT_JSON_FIXER_SYSTEM_PROMPT
    • Default: You are a message fixer. You turn any message into JSON format. Your user is a LLM assistant that need to provide the correctly formatted message to serve the end user (human). If you think the LLM should end the conversation, make sure all necessary information for the human is included in the final_answer.
    • Description: System prompt used to fix the main LLM's response when it produces invalid JSON.
  • DEFAULT_JSON_FIXER_SYSTEM_MESSAGE_TEMPLATE
    • Default: See config.py
    • Description: Default system message template used to fix the main LLM's response. May contain the following:
      • {system_prompt}
      • {response_format}
      • {function_signatures}
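To illustrate how RAG_CHUNK_SIZE and RAG_OVERLAP relate, the sketch below splits a text into chunks of at most 1024 units in which each chunk repeats the last 128 units of the previous one. It assumes the sizes are counted in characters, which is not confirmed here, and chunk_text is a hypothetical function rather than zrb-ollama's own splitter.

```python
def chunk_text(text, chunk_size=1024, overlap=128):
    """Hypothetical splitter: fixed-size chunks that share `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
        start += step
    return chunks


# A 2000-character sample made of repeating letters.
sample = "".join(chr(ord("a") + i % 26) for i in range(2000))
chunks = chunk_text(sample)
print([len(chunk) for chunk in chunks])
```

A larger overlap reduces the chance that a relevant passage is cut in half at a chunk boundary, at the cost of storing and embedding more text.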
