A framework for building LLM based AI agents with llama-cpp-python.

These details have not been verified by PyPI

Project links

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

llama-cpp-agent Framework

Introduction

The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs). It provides a simple yet robust interface using llama-cpp-python, allowing users to chat with LLM models, execute structured function calls and get structured output. It does this by generating a formal GGML-BNF grammar of the user defined structures and functions, which is then used by llama.cpp to generate text valid to that grammar. In contrast to most GBNF grammar generators it also supports nested objects, dictionaries, enums and lists of them.

Key Features

Simple Chat Interface: Engage in seamless conversations with LLMs.
Structured Output: Get structured output from LLMs.
Function Calling: Execute structured outputs from LLMs, enhancing the interaction capabilities.
Flexibility: Suited for various applications from casual chatting to specific function executions.

Installation

The llama-cpp-agent framework can be installed using pip:

pip install llama-cpp-agent

Usage Examples

The following examples demonstrate the usage of the llama-cpp-agent framework. You can find a lot more examples in the examples folder.

Simple Chat Example

This example demonstrates how to initiate a chat with an LLM model.

from llama_cpp import Llama
from llama_cpp_agent.llm_agent import LlamaCppAgent
from llama_cpp_agent.messages_formatter import MessagesFormatterType
main_model = Llama(
    "../gguf-models/dolphin-2.7-mixtral-8x7b.Q4_K_M.gguf",
    n_gpu_layers=13,
    f16_kv=True,
    use_mlock=False,
    embedding=False,
    n_threads=8,
    n_batch=1024,
    n_ctx=8192,
    last_n_tokens_size=1024,
    verbose=False,
    seed=42,
)
llama_cpp_agent = LlamaCppAgent(main_model, debug_output=False,
                              system_prompt="You are Dolphin, an uncensored and unbiased AI assistant.", predefined_messages_formatter_type=MessagesFormatterType.CHATML)


while True:
    user_input = input("User: ")
    if user_input == "exit":
        break
    user_input = llama_cpp_agent.get_chat_response(user_input, temperature=0.7)
    print("AI: " + user_input)

Function Calling Agent Example

This example shows how to use the FunctionCallingAgent for function calling with normal python functions and functions defined as pydantic models.

# Example that uses the FunctionCallingAgent class to create a function calling agent.

from enum import Enum
from typing import Union

from llama_cpp import Llama
from pydantic import BaseModel, Field

from llama_cpp_agent.llm_settings import LlamaLLMSettings, LlamaLLMGenerationSettings

from llama_cpp_agent.function_calling_agent import FunctionCallingAgent


# Write to file function that can be used by the agent. Docstring will be used in system prompt.
def write_to_file(chain_of_thought: str, file_path: str, file_content: str):
    """
    Write file to the user filesystem.
    :param chain_of_thought: Your chain of thought while writing the file.
    :param file_path: The file path includes the filename and file ending.
    :param file_content: The actual content to write.
    """
    print(chain_of_thought)
    with open(file_path, mode="w", encoding="utf-8") as file:
        file.write(file_content)
    return f"File {file_path} successfully written."


# Read file function that can be used by the agent. Docstring will be used in system prompt.
def read_file(file_path: str):
    """
    Read file from the user filesystem.
    :param file_path: The file path includes the filename and file ending.
    :return: File content.
    """
    output = ""
    with open(file_path, mode="r", encoding="utf-8") as file:
        output = file.read()
    return f"Content of file '{file_path}':\n\n{output}"


# Enum for the calculator tool.
class MathOperation(Enum):
    ADD = "add"
    SUBTRACT = "subtract"
    MULTIPLY = "multiply"
    DIVIDE = "divide"


# Simple pydantic calculator tool for the agent that can add, subtract, multiply, and divide. Docstring and description of fields will be used in system prompt.
class Calculator(BaseModel):
    """
    Perform a math operation on two numbers.
    """
    number_one: Union[int, float] = Field(..., description="First number.")
    operation: MathOperation = Field(..., description="Math operation to perform.")
    number_two: Union[int, float] = Field(..., description="Second number.")

    def run(self):
        if self.operation == MathOperation.ADD:
            return self.number_one + self.number_two
        elif self.operation == MathOperation.SUBTRACT:
            return self.number_one - self.number_two
        elif self.operation == MathOperation.MULTIPLY:
            return self.number_one * self.number_two
        elif self.operation == MathOperation.DIVIDE:
            return self.number_one / self.number_two
        else:
            raise ValueError("Unknown operation.")


# Callback for receiving messages for the user.
def send_message_to_user_callback(message: str):
    print(message)

generation_settings = LlamaLLMGenerationSettings(temperature=0.65, top_p=0.5, tfs_z=0.975)

# Can be saved and loaded like that:
# generation_settings.save("generation_settings.json")
# generation_settings = LlamaLLMGenerationSettings.load_from_file("generation_settings.json")

function_call_agent = FunctionCallingAgent(LlamaLLMSettings.load_from_file("openhermes-2.5-mistral-7b.Q8_0.json"),  # Can lama-cpp-python Llama class or LlamaLLMSettings class.
                                           llama_generation_settings=generation_settings,
                                           python_functions=[write_to_file, read_file],
                                           pydantic_functions=[Calculator],
                                           send_message_to_user_callback=send_message_to_user_callback)

while True:
    user_input = input(">")
    function_call_agent.generate_response(user_input)
    function_call_agent.save("function_calling_agent.json")

Example output

{ "function": "calculator","function_parameters": { "number_one": 42.00000 ,  "operation": "multiply" ,  "number_two": 42.00000 }}
1764.0

Structured Output

This example shows how to get structured output objects using the StructureOutputAgent class.

# Example agent that uses the StructuredOutputAgent class to create a dataset entry of a book out of unstructured data.

from enum import Enum

from llama_cpp import Llama
from pydantic import BaseModel, Field

from llama_cpp_agent.structured_output_agent import StructuredOutputAgent


# Example enum for our output model
class Category(Enum):
    Fiction = "Fiction"
    NonFiction = "Non-Fiction"


# Example output model
class Book(BaseModel):
    """
    Represents an entry about a book.
    """
    title: str = Field(..., description="Title of the book.")
    author: str = Field(..., description="Author of the book.")
    published_year: int = Field(..., description="Publishing year of the book.")
    keywords: list[str] = Field(..., description="A list of keywords.")
    category: Category = Field(..., description="Category of the book.")
    summary: str = Field(..., description="Summary of the book.")


main_model = Llama(
    "../gguf-models/nous-hermes-2-solar-10.7b.Q6_K.gguf",
    n_gpu_layers=49,
    offload_kqv=True,
    f16_kv=True,
    use_mlock=False,
    embedding=False,
    n_threads=8,
    n_batch=1024,
    n_ctx=4096,
    last_n_tokens_size=1024,
    verbose=False,
    seed=42,
)

structured_output_agent = StructuredOutputAgent(main_model, debug_output=True)

text = """The Feynman Lectures on Physics is a physics textbook based on some lectures by Richard Feynman, a Nobel laureate who has sometimes been called "The Great Explainer". The lectures were presented before undergraduate students at the California Institute of Technology (Caltech), during 1961–1963. The book's co-authors are Feynman, Robert B. Leighton, and Matthew Sands."""
print(structured_output_agent.create_object(Book, text))

Example output

 { "title": "The Feynman Lectures on Physics"  ,  "author": "Richard Feynman, Robert B. Leighton, Matthew Sands"  ,  "published_year": 1963 ,  "keywords": [ "physics" , "textbook" , "Nobel laureate" , "The Great Explainer" , "California Institute of Technology" , "undergraduate" , "lectures"  ] ,  "category": "Non-Fiction" ,  "summary": "The Feynman Lectures on Physics is a physics textbook based on lectures by Nobel laureate Richard Feynman, known as 'The Great Explainer'. The lectures were presented to undergraduate students at Caltech between 1961 and 1963. Co-authors of the book are Feynman, Robert B. Leighton, and Matthew Sands."  }


title='The Feynman Lectures on Physics' author='Richard Feynman, Robert B. Leighton, Matthew Sands' published_year=1963 keywords=['physics', 'textbook', 'Nobel laureate', 'The Great Explainer', 'California Institute of Technology', 'undergraduate', 'lectures'] category=<Category.NonFiction: 'Non-Fiction'> summary="The Feynman Lectures on Physics is a physics textbook based on lectures by Nobel laureate Richard Feynman, known as 'The Great Explainer'. The lectures were presented to undergraduate students at Caltech between 1961 and 1963. Co-authors of the book are Feynman, Robert B. Leighton, and Matthew Sands."

Manual Function Calling Example

This example shows how to do function calling with pydantic models. You can also convert Python functions with type hints, automatically to pydantic models using the function: create_dynamic_model_from_function under: llama_cpp_agent.gbnf_grammar_generator.gbnf_grammar_from_pydantic_models

from enum import Enum

from llama_cpp import Llama
from pydantic import BaseModel, Field

from llama_cpp_agent.llm_agent import LlamaCppAgent

from llama_cpp_agent.messages_formatter import MessagesFormatterType
from llama_cpp_agent.function_calling import LlamaCppFunctionTool


# Simple calculator tool for the agent that can add, subtract, multiply, and divide.
class MathOperation(Enum):
    ADD = "add"
    SUBTRACT = "subtract"
    MULTIPLY = "multiply"
    DIVIDE = "divide"


class Calculator(BaseModel):
    """
    Perform a math operation on two numbers.
    """
    number_one: float = Field(..., description="First number.", max_precision=5, min_precision=2)
    operation: MathOperation = Field(..., description="Math operation to perform.")
    number_two: float = Field(..., description="Second number.", max_precision=5, min_precision=2)

    def run(self):
        if self.operation == MathOperation.ADD:
            return self.number_one + self.number_two
        elif self.operation == MathOperation.SUBTRACT:
            return self.number_one - self.number_two
        elif self.operation == MathOperation.MULTIPLY:
            return self.number_one * self.number_two
        elif self.operation == MathOperation.DIVIDE:
            return self.number_one / self.number_two
        else:
            raise ValueError("Unknown operation.")


function_tools = [LlamaCppFunctionTool(Calculator)]

function_tool_registry = LlamaCppAgent.get_function_tool_registry(function_tools)

main_model = Llama(
    "../gguf-models/dolphin-2.6-mistral-7b-Q8_0.gguf",
    n_gpu_layers=35,
    f16_kv=True,
    use_mlock=False,
    embedding=False,
    n_threads=8,
    n_batch=1024,
    n_ctx=8192,
    last_n_tokens_size=1024,
    verbose=False,
    seed=42,
)
llama_cpp_agent = LlamaCppAgent(main_model, debug_output=False,
                                system_prompt="You are an advanced AI, tasked to assist the user by calling functions in JSON format.\n\n\n" + function_tool_registry.get_documentation(),
                                predefined_messages_formatter_type=MessagesFormatterType.CHATML)
user_input = 'What is 42 * 42?'
print(llama_cpp_agent.get_chat_response(user_input, temperature=0.45, function_tool_registry=function_tool_registry))

Example output

{ "function": "calculator","function_parameters": { "number_one": 42.00000 ,  "operation": "multiply" ,  "number_two": 42.00000 }}
1764.0

Manual Function Calling with Python Function Example

This example shows how to do function calling using actual Python functions.

from llama_cpp import Llama
from typing import Union
import math

from llama_cpp_agent.llm_agent import LlamaCppAgent

from llama_cpp_agent.messages_formatter import MessagesFormatterType
from llama_cpp_agent.function_calling import LlamaCppFunctionTool
from llama_cpp_agent.gbnf_grammar_generator.gbnf_grammar_from_pydantic_models import create_dynamic_model_from_function


def calculate_a_to_the_power_b(a: Union[int, float], b: Union[int, float]):
    """
    Calculates 'a' to the power 'b' and returns the result
    """
    return f"Result: {math.pow(a, b)}"


DynamicSampleModel = create_dynamic_model_from_function(calculate_a_to_the_power_b)

function_tools = [LlamaCppFunctionTool(DynamicSampleModel)]

function_tool_registry = LlamaCppAgent.get_function_tool_registry(function_tools)

main_model = Llama(
    "../../gguf-models/openhermes-2.5-mistral-7b-16k.Q8_0.gguf",
    n_gpu_layers=49,
    offload_kqv=True,
    f16_kv=True,
    use_mlock=False,
    embedding=False,
    n_threads=8,
    n_batch=1024,
    n_ctx=8192,
    last_n_tokens_size=1024,
    verbose=True,
    seed=42,
)

llama_cpp_agent = LlamaCppAgent(main_model, debug_output=True,
                                system_prompt="You are an advanced AI, tasked to assist the user by calling functions in JSON format. The following are the available functions and their parameters and types:\n\n" + function_tool_registry.get_documentation(),
                                predefined_messages_formatter_type=MessagesFormatterType.CHATML)
user_input = "Calculate 5 to power 42"

print(llama_cpp_agent.get_chat_response(user_input, temperature=0.45, function_tool_registry=function_tool_registry))

Example output

{ "function": "calculate-a-to-the-power-b","function_parameters": { "a": 5 ,  "b": 42  }}
Result: 2.2737367544323207e+29

Knowledge Graph Creation Example

This example, based on an example of the Instructor library for OpenAI, demonstrates how to create a knowledge graph using the llama-cpp-agent framework.

import json
from typing import List

from enum import Enum

from llama_cpp import Llama, LlamaGrammar
from pydantic import BaseModel, Field

from llama_cpp_agent.llm_agent import LlamaCppAgent
from llama_cpp_agent.gbnf_grammar_generator.gbnf_grammar_from_pydantic_models import generate_gbnf_grammar_and_documentation

main_model = Llama(
    "../gguf-models/mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf",
    n_gpu_layers=13,
    f16_kv=True,
    use_mlock=False,
    embedding=False,
    n_threads=8,
    n_batch=1024,
    n_ctx=8192,
    last_n_tokens_size=1024,
    verbose=True,
    seed=42,
)

class Node(BaseModel):
    id: int
    label: str
    color: str


class Edge(BaseModel):
    source: int
    target: int
    label: str
    color: str = "black"


class KnowledgeGraph(BaseModel):
    nodes: List[Node] = Field(..., default_factory=list)
    edges: List[Edge] = Field(..., default_factory=list)




gbnf_grammar, documentation = generate_gbnf_grammar_and_documentation([KnowledgeGraph],False)

print(gbnf_grammar)
grammar = LlamaGrammar.from_string(gbnf_grammar, verbose=True)


llama_cpp_agent = LlamaCppAgent(main_model, debug_output=True,
                              system_prompt="You are an advanced AI assistant responding in JSON format.\n\nAvailable JSON response models:\n\n" + documentation)


from graphviz import Digraph


def visualize_knowledge_graph(kg: KnowledgeGraph):
    dot = Digraph(comment="Knowledge Graph")

    # Add nodes
    for node in kg.nodes:
        dot.node(str(node.id), node.label, color=node.color)

    # Add edges
    for edge in kg.edges:
        dot.edge(str(edge.source), str(edge.target), label=edge.label, color=edge.color)

    # Render the graph
    dot.render("knowledge_graph.gv", view=True)


def generate_graph(user_input: str) -> KnowledgeGraph:
    prompt = f'''Help me understand the following by describing it as a detailed knowledge graph: {user_input}'''.strip()
    response = llama_cpp_agent.get_chat_response(message=prompt, temperature=0.65, mirostat_mode=0, mirostat_tau=3.0,
                                               mirostat_eta=0.1, grammar=grammar)
    knowledge_graph = json.loads(response)
    cls = KnowledgeGraph
    knowledge_graph = cls(**knowledge_graph)
    return knowledge_graph


graph = generate_graph("Teach me about quantum mechanics")
visualize_knowledge_graph(graph)

Example Output:

Additional Information

Dependencies: pydantic for grammars based generation and of course llama-cpp-python.

Predefined Messages Formatter

The llama-cpp-agent framework uses custom messages formatters to format messages for the LLM model. The MessagesFormatterType enum defines the available predefined formatters. The following predefined formatters are available:

MessagesFormatterType.CHATML: Formats messages using the CHATML format.
MessagesFormatterType.MIXTRAL: Formats messages using the MIXTRAL format.
MessagesFormatterType.VICUNA: Formats messages using the VICUNA format.
MessagesFormatterType.LLAMA_2: Formats messages using the LLAMA 2 format.
MessagesFormatterType.SYNTHIA: Formats messages using the SYNTHIA format.
MessagesFormatterType.NEURAL_CHAT: Formats messages using the NEURAL CHAT format.
MessagesFormatterType.SOLAR: Formats messages using the SOLAR format.
MessagesFormatterType.OPEN_CHAT: Formats messages using the OPEN CHAT format.

You can also define your own custom messages formatter by creating an instance of the MessagesFormatter class. The MessagesFormatter class takes the following parameters:

PRE_PROMPT: The pre-prompt to use for the messages.
SYS_PROMPT_START: The system prompt start to use for the messages.
SYS_PROMPT_END: The system prompt end to use for the messages.
USER_PROMPT_START: The user prompt start to use for the messages.
USER_PROMPT_END: The user prompt end to use for the messages.
ASSISTANT_PROMPT_START: The assistant prompt start to use for the messages.
ASSISTANT_PROMPT_END: The assistant prompt end to use for the messages.
INCLUDE_SYS_PROMPT_IN_FIRST_USER_MESSAGE: Whether to include the system prompt in the first user message.
DEFAULT_STOP_SEQUENCES: The default stop sequences to use for the messages.

Project details

These details have not been verified by PyPI

Project links

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Release history Release notifications | RSS feed

0.2.35

Jun 29, 2024

0.2.34

Jun 23, 2024

0.2.33

Jun 19, 2024

0.2.32

Jun 12, 2024

0.2.31

Jun 12, 2024

0.2.30

Jun 11, 2024

0.2.29

Jun 11, 2024

0.2.28

Jun 11, 2024

0.2.27

Jun 11, 2024

0.2.26

Jun 11, 2024

0.2.25

Jun 6, 2024

0.2.24

Jun 5, 2024

0.2.23

May 31, 2024

0.2.22

May 31, 2024

0.2.21

May 30, 2024

0.2.20

May 27, 2024

0.2.19

May 27, 2024

0.2.18

May 26, 2024

0.2.17

May 26, 2024

0.2.16

May 26, 2024

0.2.15

May 26, 2024

0.2.14

May 26, 2024

0.2.13

May 26, 2024

0.2.12

May 26, 2024

0.2.11

May 26, 2024

0.2.10

May 20, 2024

0.2.9

May 20, 2024

0.2.8

May 20, 2024

0.2.7

May 14, 2024

0.2.6

May 14, 2024

0.2.5

May 13, 2024

0.2.4

May 13, 2024

0.2.3

May 13, 2024

0.2.2

May 12, 2024

0.2.1

May 12, 2024

0.2.0

May 12, 2024

0.1.4

May 7, 2024

0.1.3

May 5, 2024

0.1.2

May 5, 2024

0.1.1

May 4, 2024

0.1.0

May 1, 2024

0.0.32

May 1, 2024

0.0.31

May 1, 2024

0.0.30

May 1, 2024

0.0.29

May 1, 2024

0.0.28

May 1, 2024

0.0.27

May 1, 2024

0.0.26

Apr 18, 2024

0.0.25

Apr 16, 2024

0.0.24

Apr 10, 2024

0.0.23

Apr 10, 2024

0.0.22

Apr 3, 2024

0.0.21

Apr 3, 2024

0.0.20

Apr 3, 2024

0.0.19

Apr 2, 2024

0.0.18

Apr 2, 2024

0.0.17

Jan 16, 2024

0.0.16

Jan 16, 2024

0.0.15

Jan 16, 2024

0.0.14

Jan 13, 2024

0.0.13

Jan 12, 2024

0.0.12

Jan 12, 2024

0.0.11

Jan 12, 2024

0.0.10

Jan 9, 2024

0.0.9

Jan 9, 2024

0.0.8

Jan 9, 2024

This version

0.0.7

Jan 7, 2024

0.0.6

Jan 7, 2024

0.0.5

Jan 7, 2024

0.0.4

Jan 5, 2024

0.0.3

Jan 4, 2024

0.0.2

Jan 4, 2024

0.0.1

Jan 4, 2024

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

llama-cpp-agent-0.0.7.tar.gz (32.3 kB view details)

Uploaded Jan 7, 2024 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

llama_cpp_agent-0.0.7-py3-none-any.whl (31.9 kB view details)

Uploaded Jan 7, 2024 Python 3

File details

Details for the file llama-cpp-agent-0.0.7.tar.gz.

File metadata

Download URL: llama-cpp-agent-0.0.7.tar.gz
Upload date: Jan 7, 2024
Size: 32.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.11.5

File hashes

Hashes for llama-cpp-agent-0.0.7.tar.gz
Algorithm	Hash digest
SHA256	`cee516fd7a556ffa94ccd326a49655a121c23db27b3d87420d44e78e53e3ef8a`
MD5	`02e09f0c3c001070b335e80657930b65`
BLAKE2b-256	`e486836800fbaad6959dff2eb633372c5093fc6d5d5595a8dc97b4fff41f7596`

See more details on using hashes here.

File details

Details for the file llama_cpp_agent-0.0.7-py3-none-any.whl.

File metadata

Download URL: llama_cpp_agent-0.0.7-py3-none-any.whl
Upload date: Jan 7, 2024
Size: 31.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.11.5

File hashes

Hashes for llama_cpp_agent-0.0.7-py3-none-any.whl
Algorithm	Hash digest
SHA256	`da266564f1bd9576d43fee4146149c8373b0306db36bd2601154d1ad0e7cee40`
MD5	`1c64be4218a062441aa0f709e4e834ec`
BLAKE2b-256	`7a0a0ffae2e069c518f2cd0d4a4e7da52181e7aa207671acdb61fabd1820bedc`

See more details on using hashes here.

llama-cpp-agent 0.0.7

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

llama-cpp-agent Framework

Introduction

Key Features

Installation

Usage Examples

Simple Chat Example

Function Calling Agent Example

Structured Output

Manual Function Calling Example

Manual Function Calling with Python Function Example

Knowledge Graph Creation Example

Additional Information

Predefined Messages Formatter

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes