Generates GBNF grammars from pydantic models
Pydantic GBNF Grammar Generator
Pydantic GBNF Grammar Generator converts Pydantic data models into GBNF grammars. The library was created for use with llama.cpp. At this point it is just a repackaged version of some Python scripts from the llama.cpp repository, published separately to make integration into other projects easier.
Installation
The easiest way to install is from PyPI:
pip install pydantic-gbnf-grammar-generator
Alternatively, you can install from source:
git clone https://github.com/rhohndorf/pydantic-gbnf-gammar-generator.git
cd pydantic-gbnf-gammar-generator
pip install -e .
Usage
The following examples demonstrate how to use the library. All examples can be found in the examples folder. Running them requires a llama.cpp server listening on port 8080.
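Before running the examples, it can help to confirm that the server is reachable. The following is a minimal sketch using only the standard library. It assumes the llama.cpp server exposes a `/health` endpoint that answers with HTTP 200 once it is ready; the helper name `server_is_up` is hypothetical and not part of this library.

```python
import urllib.request


def server_is_up(base_url: str = "http://127.0.0.1:8080", timeout: float = 2.0) -> bool:
    """Return True if GET {base_url}/health answers with HTTP 200.

    Assumption: the server offers a /health endpoint. Adjust the path if
    your llama.cpp build differs.
    """
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers refused connections, timeouts, and HTTP error statuses
        # (urllib's URLError and HTTPError both derive from OSError).
        return False
```

Calling `server_is_up()` before the first completion request gives a clear yes/no answer instead of a connection traceback from inside the example scripts.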
Structured Output
from enum import Enum
import json
from typing import Optional, List
from pydantic import BaseModel, Field
import requests
from pydantic_gbnf_grammar_generator import generate_gbnf_grammar_and_documentation
# Function to get completion on the llama.cpp server with grammar.
def create_completion(prompt, grammar):
    headers = {"Content-Type": "application/json"}
    data = {"prompt": prompt, "grammar": grammar, "stop": ["<|im_end|>"]}
    response = requests.post("http://127.0.0.1:8080/completion", headers=headers, json=data)
    data = response.json()
    print(data["content"])
    return data["content"]
# An example of structured output based on pydantic models. The LLM creates an entry for a book database from unstructured text.
class Category(Enum):
    """
    The category of the book.
    """
    Fiction = "Fiction"
    NonFiction = "Non-Fiction"
class Book(BaseModel):
    """
    Represents an entry about a book.
    """
    title: str = Field(..., description="Title of the book.")
    author: str = Field(..., description="Author of the book.")
    published_year: Optional[int] = Field(..., description="Publishing year of the book.")
    keywords: List[str] = Field(..., description="A list of keywords.")
    category: Category = Field(..., description="Category of the book.")
    summary: str = Field(..., description="Summary of the book.")
# We need no additional parameters other than our list of pydantic models.
gbnf_grammar, documentation = generate_gbnf_grammar_and_documentation([Book])
print(gbnf_grammar)
system_message = "You are an advanced AI, tasked to create a dataset entry in JSON for a Book. The following is the expected output model:\n\n" + documentation
text = """The Feynman Lectures on Physics is a physics textbook based on some lectures by Richard Feynman, a Nobel laureate who has sometimes been called "The Great Explainer". The lectures were presented before undergraduate students at the California Institute of Technology (Caltech), during 1961–1963. The book's co-authors are Feynman, Robert B. Leighton, and Matthew Sands."""
prompt = f"<|im_start|>system\n{system_message}<|im_end|>\n<|im_start|>user\n{text}<|im_end|>\n<|im_start|>assistant"
text = create_completion(prompt=prompt, grammar=gbnf_grammar)
json_data = json.loads(text)
print(Book(**json_data))
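The prompt in the example above is assembled by hand from ChatML-style markers. As a small sketch, the same string can be produced by a helper so the template lives in one place. The function name `build_chatml_prompt` is hypothetical and not part of this library, and the template only suits models trained on this chat format.

```python
def build_chatml_prompt(system_message: str, user_message: str) -> str:
    """Assemble a ChatML-style prompt that ends with an open assistant turn."""
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        "<|im_start|>assistant"
    )


# Usage with placeholder strings; in the example above these would be
# system_message and text.
prompt = build_chatml_prompt("You are a helpful assistant.", "Describe this book.")
print(prompt)
```

The returned string has the same layout as the f-string used in the example, so it can be passed to `create_completion` unchanged.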
Function Calling
from enum import Enum
import json
from typing import Union
from pydantic import BaseModel, Field
import requests
from pydantic_gbnf_grammar_generator import generate_gbnf_grammar_and_documentation
# Function to get completion on the llama.cpp server with grammar.
def create_completion(prompt, grammar):
    headers = {"Content-Type": "application/json"}
    data = {"prompt": prompt, "grammar": grammar, "stop": ["<|im_end|>"]}
    response = requests.post("http://127.0.0.1:8080/completion", headers=headers, json=data)
    data = response.json()
    print(data["content"])
    return data["content"]
# A function for the agent to send a message to the user.
class SendMessageToUser(BaseModel):
    """
    Send a message to the User.
    """
    chain_of_thought: str = Field(..., description="Your chain of thought while sending the message.")
    message: str = Field(..., description="Message you want to send to the user.")
    def run(self):
        print(self.message)
# Enum for the calculator tool.
class MathOperation(Enum):
    ADD = "add"
    SUBTRACT = "subtract"
    MULTIPLY = "multiply"
    DIVIDE = "divide"
# Simple pydantic calculator tool for the agent that can add, subtract, multiply, and divide. Docstring and description of fields will be used in system prompt.
class Calculator(BaseModel):
    """
    Perform a math operation on two numbers.
    """
    number_one: Union[int, float] = Field(..., description="First number.")
    operation: MathOperation = Field(..., description="Math operation to perform.")
    number_two: Union[int, float] = Field(..., description="Second number.")
    def run(self):
        if self.operation == MathOperation.ADD:
            return self.number_one + self.number_two
        elif self.operation == MathOperation.SUBTRACT:
            return self.number_one - self.number_two
        elif self.operation == MathOperation.MULTIPLY:
            return self.number_one * self.number_two
        elif self.operation == MathOperation.DIVIDE:
            return self.number_one / self.number_two
        else:
            raise ValueError("Unknown operation.")
# The grammar is generated by passing the available function models to generate_gbnf_grammar_and_documentation. The function also returns documentation that can be given to the LLM.
# pydantic_model_list is the list of pydantic models.
# outer_object_name is an optional name for an outer object wrapped around the actual model object, for example a "function" object with "function_parameters" that contains the actual model object. If None, no outer object is generated.
# outer_object_content is the name of the outer object's content field.
# model_prefix is the optional prefix for models in the documentation. (Default="Output Model")
# fields_prefix is the prefix for the model fields in the documentation. (Default="Output Fields")
gbnf_grammar, documentation = generate_gbnf_grammar_and_documentation(
    pydantic_model_list=[SendMessageToUser, Calculator],
    outer_object_name="function",
    outer_object_content="function_parameters",
    model_prefix="Function",
    fields_prefix="Parameters",
)
print(gbnf_grammar)
print(documentation)
system_message = (
    "You are an advanced AI, tasked to assist the user by calling functions in JSON format. The following are the available functions and their parameters and types:\n\n"
    + documentation
)
user_message = "What is 42 * 42?"
prompt = (
    f"<|im_start|>system\n{system_message}<|im_end|>\n<|im_start|>user\n{user_message}<|im_end|>\n<|im_start|>assistant"
)
text = create_completion(prompt=prompt, grammar=gbnf_grammar)
function_dictionary = json.loads(text)
if function_dictionary["function"] == "Calculator":
    function_parameters = {**function_dictionary["function_parameters"]}
    print(Calculator(**function_parameters).run())
    # This should output: 1764
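The example above handles only the `Calculator` case with a single `if`. When several functions are available, a lookup table keyed by function name is one way to route the call. The sketch below is dependency-free so it runs without a model server: the handlers are plain stand-in functions rather than the pydantic models, the names `HANDLERS` and `dispatch` are hypothetical, and the sample string is hand-written in the `"function"` / `"function_parameters"` shape the example reads from.

```python
import json
from typing import Any, Callable, Dict


def calculator(number_one: float, operation: str, number_two: float) -> float:
    """Stand-in for Calculator(...).run(), written without pydantic."""
    operations: Dict[str, Callable[[float, float], float]] = {
        "add": lambda a, b: a + b,
        "subtract": lambda a, b: a - b,
        "multiply": lambda a, b: a * b,
        "divide": lambda a, b: a / b,
    }
    if operation not in operations:
        raise ValueError(f"Unknown operation: {operation!r}")
    return operations[operation](number_one, number_two)


def send_message_to_user(chain_of_thought: str, message: str) -> str:
    """Stand-in for SendMessageToUser(...).run(); returns the message."""
    return message


# Maps the value of the outer "function" key to the callable that handles it.
HANDLERS: Dict[str, Callable[..., Any]] = {
    "Calculator": calculator,
    "SendMessageToUser": send_message_to_user,
}


def dispatch(raw_output: str) -> Any:
    """Parse the model output and call the handler named in it."""
    call = json.loads(raw_output)
    name = call["function"]
    if name not in HANDLERS:
        raise ValueError(f"Unknown function: {name!r}")
    return HANDLERS[name](**call["function_parameters"])


# Hand-written sample in the shape the example above expects.
sample = (
    '{"function": "Calculator", "function_parameters": '
    '{"number_one": 42, "operation": "multiply", "number_two": 42}}'
)
print(dispatch(sample))  # 1764
```

With the real models, the table could map names to the pydantic classes instead, and `dispatch` would call `HANDLERS[name](**params).run()`.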
Download files
File details
Details for the file pydantic_gbnf_grammar_generator-0.1.1.tar.gz.
File metadata
- Download URL: pydantic_gbnf_grammar_generator-0.1.1.tar.gz
- Upload date:
- Size: 16.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.0.0 CPython/3.12.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0c1c5ffc7c6853fab87ead81a3b7e2db7d9cd4910c145cccd34b89081ff0dd0c |
| MD5 | e3a66d952120bc26ca2aae5a29456425 |
| BLAKE2b-256 | 1f59b3381ec58a1a8435e5c466e4c5a879e767872f1ae8b3e859c0f4358c9f33 |
File details
Details for the file pydantic_gbnf_grammar_generator-0.1.1-py3-none-any.whl.
File metadata
- Download URL: pydantic_gbnf_grammar_generator-0.1.1-py3-none-any.whl
- Upload date:
- Size: 15.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.0.0 CPython/3.12.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 62b6c9fbd2f7b51fd83e4d04164466b6a89f2a34c25b056bbf54186cbcb35152 |
| MD5 | afd76a10ff09ec99f38eab0e79e28601 |
| BLAKE2b-256 | c309d6622d0ad9215d02c19555b42fcdcf73cde3c79b4a4a0595cc5bedf696a5 |