
pygbnf

A composable Python DSL for building GBNF grammars compatible with llama.cpp.

  1. Define context-free grammars using expressive Python functions.

  2. Compile them into valid GBNF strings for constrained LLM generation.

  3. Match rules in real time during inference.

Installation

pip install pygbnf          # core DSL only
pip install pygbnf[llm]     # + openai (for GrammarLLM)
pip install pygbnf[all]     # everything

For grammar visualization (DOT / SVG export), install Graphviz:

brew install graphviz   # macOS
apt install graphviz    # Debian / Ubuntu

Quick Start

Start llama-server with your favorite GGUF model.

$ llama-server -m LFM2-8B-A1B-Q4_K_M.gguf

Build a grammar and constrain the model.

from pygbnf import Grammar, GrammarLLM, select

g = Grammar()

@g.rule
def answer():
    return select(["yes", "no", "maybe"])

g.start("answer")

llm = GrammarLLM("http://localhost:8080/v1")

text, _ = llm.complete(
    messages=[{"role": "user", "content": "Is the sky blue?"}],
    grammar=g
)

print(text)

The grammar constrains the LLM output — it can only produce yes, no, or maybe.

Guidance-Style GBNF

import pygbnf as cfg
from pygbnf import select, one_or_more, zero_or_more

g = cfg.Grammar()

@g.rule
def number():
    n = one_or_more(select("0123456789"))
    return select(['-' + n, n])

@g.rule
def operator():
    return select(['+', '*', '**', '/', '-'])

@g.rule
def expression():
    return select([
        number(),
        expression() + zero_or_more(" ") + operator()
            + zero_or_more(" ") + expression(),
        "(" + expression() + ")"
    ])

g.start("expression")
print(g.to_gbnf())

Output:

root ::= expression

number ::= "-" [0123456789]+ | [0123456789]+
operator ::= "+" | "*" | "**" | "/" | "-"
expression ::=
    number
  | expression " "* operator " "* expression
  | "(" expression ")"

LLM Usage

pygbnf includes GrammarLLM, a thin wrapper around any OpenAI-compatible endpoint (llama.cpp, vLLM, Ollama…) that injects the GBNF grammar automatically.
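
Conceptually, the injection amounts to adding a grammar field to the request body. The sketch below uses a hypothetical build_payload helper (not pygbnf's API) and assumes a backend that, like llama.cpp's server, reads a top-level grammar key:

```python
# Hypothetical helper (not pygbnf's API): build an OpenAI-style chat payload
# that carries the compiled GBNF text in a `grammar` field. llama.cpp's server
# accepts this key; other backends may expect the grammar elsewhere.

def build_payload(messages, gbnf_text, model="default"):
    return {
        "model": model,
        "messages": messages,
        "grammar": gbnf_text,  # GBNF text constraining token sampling
    }

payload = build_payload(
    [{"role": "user", "content": "Is the sky blue?"}],
    'root ::= "yes" | "no" | "maybe"',
)
```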

Streaming with rule matching

Enable match=True (or pass only/exclude) to get real-time RuleEvents as the LLM generates tokens:

from pygbnf import Grammar, GrammarLLM, select, one_or_more

g = Grammar()

@g.rule
def name():
    """A person's name."""
    return one_or_more(select("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ "))

@g.rule
def greeting():
    """A greeting message."""
    return select(["hello", "hi", "hey"]) + " " + name()

g.start("greeting")

llm = GrammarLLM("http://localhost:8080/v1")

for token, events in llm.stream(
    messages=[{"role": "user", "content": "Greet Alice."}],
    grammar=g,
    match=True,
):
    print(token, end="", flush=True)
    if events:
        for ev in events:
            print(f"\n  ← [{ev.rule}] {ev.text!r} (doc: {ev.doc})")
print()

Each RuleEvent carries:

  • rule — the matched rule name
  • text — the matched text
  • fn — the original Python function
  • doc — the function's docstring
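
As a mental model, the event can be pictured as a small dataclass with exactly those fields (a sketch; the real class may carry more):

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Sketch of the event structure described above. Field names match the docs;
# this is illustrative, not pygbnf's actual class definition.

@dataclass
class RuleEvent:
    rule: str                      # matched rule name
    text: str                      # matched text
    fn: Optional[Callable] = None  # original Python function
    doc: Optional[str] = None      # the function's docstring

ev = RuleEvent(rule="greeting", text="hello Alice", doc="A greeting message.")
```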

Non-streaming completion

text, events = llm.complete(
    messages=[{"role": "user", "content": "Is the sky blue?"}],
    grammar=g,
    match=True,
)
print(text)
for ev in events:
    print(f"  [{ev.rule}] {ev.text!r}")

Schema-based grammar with LLM

Combine grammar_from_type with GrammarLLM to constrain output to a JSON schema:

from dataclasses import dataclass
from pygbnf import grammar_from_type, GrammarLLM

@dataclass
class City:
    name: str
    country: str
    population: int

g = grammar_from_type(City)
llm = GrammarLLM("http://localhost:8080/v1")

text, _ = llm.complete(
    messages=[{"role": "user", "content": "Describe Tokyo in JSON."}],
    grammar=g,
)
print(text)
# → {"name": "Tokyo", "country": "Japan", "population": 13960000}

Tool calling with Toolkit

Toolkit is a decorator-based tool registry. Register functions with @toolkit.tool, then pass the toolkit to llm.stream() or llm.complete() — the grammar and system prompt are injected automatically.

import enum
from pygbnf import GrammarLLM, Toolkit

toolkit = Toolkit()

class Units(enum.Enum):
    CELSIUS = "celsius"
    FAHRENHEIT = "fahrenheit"

@toolkit.tool
def get_weather(city: str, units: Units = Units.CELSIUS) -> str:
    """Get current weather for a city."""
    return f"22° {units.value} in {city}"

@toolkit.tool
def search_web(query: str, max_results: int = 5) -> str:
    """Search the web."""
    return f"Found {max_results} results for {query!r}"

llm = GrammarLLM("http://localhost:8080/v1")

# Stream with toolkit — grammar + system prompt auto-injected
result = ""
for token, _ in llm.stream(
    messages=[{"role": "user", "content": "Weather in Tokyo?"}],
    toolkit=toolkit,
):
    print(token, end="", flush=True)
    result += token

# Dispatch the JSON result to the matching function
output = toolkit.dispatch(result)
print(output)  # → "22° celsius in Tokyo"

The toolkit:

  • Builds a GBNF grammar constraining the LLM to produce {"function": "...", "arguments": {...}} with only registered tool names and typed arguments
  • Generates a system prompt listing available tools with signatures and docstrings
  • Dispatches the parsed JSON to the right function, converting enum strings back to Python Enum instances automatically
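
The dispatch step can be sketched in plain Python: parse the JSON call, look up the registered function, convert enum-valued strings back to Enum members using the parameter annotations, and invoke it. The dispatch helper and REGISTRY below are hypothetical, not Toolkit's actual code:

```python
import enum
import inspect
import json

class Units(enum.Enum):
    CELSIUS = "celsius"
    FAHRENHEIT = "fahrenheit"

def get_weather(city: str, units: Units = Units.CELSIUS) -> str:
    return f"22° {units.value} in {city}"

# Hypothetical registry; Toolkit builds this from @toolkit.tool decorators.
REGISTRY = {"get_weather": get_weather}

def dispatch(payload: str):
    call = json.loads(payload)
    fn = REGISTRY[call["function"]]
    params = inspect.signature(fn).parameters
    kwargs = {}
    for name, value in call["arguments"].items():
        ann = params[name].annotation
        # Convert enum-typed string arguments back to Enum members
        if isinstance(ann, type) and issubclass(ann, enum.Enum):
            value = ann(value)
        kwargs[name] = value
    return fn(**kwargs)

out = dispatch('{"function": "get_weather", '
               '"arguments": {"city": "Tokyo", "units": "celsius"}}')
```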

You can also use llm.tool_call() as a one-liner that streams + dispatches:

output = llm.tool_call(toolkit, "Weather in Tokyo?")
print(output)  # → "22° celsius in Tokyo"

Note: GrammarLLM requires the openai package: pip install openai. The LLM server must support the grammar field in its API (llama.cpp does natively).

Architecture

AST Nodes

Every grammar construct is a frozen dataclass node. Nodes compose via + (sequence) and | (alternative):

Node Description GBNF
Literal Double-quoted string "hello"
CharacterClass Character class [0-9]
Sequence Ordered concatenation a b c
Alternative Choice between options a | b | c
Repeat Quantified repetition x+, x*, x?, x{2,5}
RuleReference Reference to named rule expression
TokenReference Token-level constraint <think>, <[1000]>
Group Parenthesized group (a b)
Optional_ Optional element x?
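
The composition mechanics can be sketched in a few lines of plain Python (illustrative only, not pygbnf's internals): frozen dataclasses whose __add__ and __or__ return Sequence and Alternative nodes:

```python
from dataclasses import dataclass

# Illustrative sketch of operator-based node composition, not pygbnf's code.

@dataclass(frozen=True)
class Node:
    def __add__(self, other):
        return Sequence((self, other))
    def __or__(self, other):
        return Alternative((self, other))

@dataclass(frozen=True)
class Literal(Node):
    value: str
    def gbnf(self):
        return f'"{self.value}"'

@dataclass(frozen=True)
class Sequence(Node):
    parts: tuple
    def gbnf(self):
        return " ".join(p.gbnf() for p in self.parts)

@dataclass(frozen=True)
class Alternative(Node):
    options: tuple
    def gbnf(self):
        return " | ".join(o.gbnf() for o in self.options)

# + binds tighter than |, so this parses as ("a" "b") | "c"
expr = Literal("a") + Literal("b") | Literal("c")
```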

DSL Combinators

from pygbnf import select, one_or_more, zero_or_more, optional, repeat, group

# Character class from string
select("0123456789")          # → [0123456789]

# Alternative from list
select(["+", "-", "*"])       # → "+" | "-" | "*"

# Repetition
one_or_more(x)                # → x+
zero_or_more(x)               # → x*
optional(x)                   # → x?
repeat(x, 2, 5)               # → x{2,5}

# Grouping
group(a + b)                  # → (a b)

# Operators
a + b                         # → a b   (sequence)
a | b                         # → a | b (alternative)

Rule Definition

Rules are defined with the @g.rule decorator. Calling a rule function inside another rule creates a rule reference (not an inline expansion):

g = cfg.Grammar()

@g.rule
def digit():
    return select("0123456789")

@g.rule
def number():
    return one_or_more(digit())  # → digit+  (reference, not inlined)

Forward references work naturally — rules can reference rules defined later.
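
One way this can work is late binding: calling a decorated rule emits only a name reference, and rule bodies are expanded when the grammar is serialized, after every rule has been registered. A toy sketch (the TinyGrammar class is hypothetical, not pygbnf's implementation):

```python
# Toy sketch of late-bound rule references, not pygbnf's actual mechanism.

class TinyGrammar:
    def __init__(self):
        self.rules = {}

    def rule(self, fn):
        # Register the builder; the decorated name becomes a stub that
        # emits just a reference to the rule by name.
        self.rules[fn.__name__] = fn
        return lambda: fn.__name__

    def to_gbnf(self):
        # Bodies run only here, so any rule name they mention resolves.
        return "\n".join(f"{name} ::= {fn()}"
                         for name, fn in self.rules.items())

g = TinyGrammar()

@g.rule
def number():
    return digit() + "+"   # `digit` is defined below; resolved lazily

@g.rule
def digit():
    return "[0-9]"
```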

Token Constraints

llama.cpp supports token-level matching:

from pygbnf import token, token_id, not_token, not_token_id

token("think")        # → <think>
token_id(1000)        # → <[1000]>
not_token("think")    # → !<think>
not_token_id(1001)    # → !<[1001]>

Grammar Helpers

Common patterns come prebuilt:

from pygbnf import (
    WS, ws, ws_required,           # whitespace
    keyword, identifier, number,    # basic tokens
    float_number, string_literal,   # complex tokens
    comma_list, between,           # structural patterns
    separated_by, spaced_comma_list,
)

comma_list(identifier())   # → ident ("," " "* ident)*
between("(", expr, ")")    # → "(" expr ")"
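
In plain GBNF strings, the two helpers above expand roughly as follows (a sketch of the output shapes, not pygbnf's code):

```python
# Sketch of the GBNF these helpers produce, using plain string templates.
# Real pygbnf helpers return AST nodes, not strings.

def comma_list(item):
    return f'{item} ("," " "* {item})*'

def between(left, inner, right):
    return f'"{left}" {inner} "{right}"'

rule = between("(", comma_list("ident"), ")")
```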

Recursion Analysis

Detect left recursion in your grammar:

cycles = g.detect_left_recursion()
# Warns: "Left recursion detected: expression -> expression"
# Suggests: rewrite as base (op base)*
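
Left recursion can be found by treating "rule X may start with rule Y" as a directed graph and searching it for cycles. A standalone sketch over a plain-dict grammar (not pygbnf's representation):

```python
# Illustrative cycle search over leftmost-reference edges, not pygbnf's code.

def detect_left_recursion(first_refs):
    """first_refs maps each rule to rules that may appear leftmost in it."""
    cycles = []

    def visit(rule, path):
        for nxt in first_refs.get(rule, []):
            if nxt in path:
                # Closing edge found: record the cycle from its first visit
                cycles.append(path[path.index(nxt):] + [nxt])
            else:
                visit(nxt, path + [nxt])

    for rule in first_refs:
        visit(rule, [rule])
    return cycles

# `expression` may start with itself -> direct left recursion
cycles = detect_left_recursion({"expression": ["number", "expression"],
                                "number": []})
```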

Examples

See the examples/ directory:

File Description
quickstart.py The quick-start example from this README
arithmetic.py Arithmetic expressions with operator precedence
csv_grammar.py CSV file format
json_grammar.py Full JSON grammar
simple_lang.py A small programming language
token_demo.py Token-level constraints
demo_schema.py Schema → grammar examples
demo_enum_select.py Enum-based selection
demo_simple_lang.py Mini-language generation with LLM
demo_vision.py Vision + grammar: solve math from an image
demo_visualization.py Export grammar NFA as DOT / SVG

Run any example:

python examples/arithmetic.py

Schema Generation

Auto-generate grammars from Python types and dataclasses:

from dataclasses import dataclass
from pygbnf import grammar_from_type

@dataclass
class Movie:
    title: str
    year: int
    rating: float

g = grammar_from_type(Movie)
print(g.to_gbnf())

Also supports function signatures:

from pygbnf import grammar_from_args

def search(query: str, limit: int = 10):
    ...

g = grammar_from_args(search)
print(g.to_gbnf())
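
This kind of mapping can be done with inspect.signature. The sketch below (hypothetical args_to_gbnf helper with an illustrative type table, not pygbnf's implementation) shows the idea of turning annotations into per-argument GBNF fragments:

```python
import inspect

# Illustrative annotation-to-GBNF table; real mappings would be richer.
GBNF_FOR_TYPE = {
    str: '"\\"" [^"]* "\\""',        # quoted string
    int: "[0-9]+",
    float: '[0-9]+ "." [0-9]+',
}

def args_to_gbnf(fn):
    """Map each parameter of fn to a GBNF fragment via its annotation."""
    sig = inspect.signature(fn)
    return {name: GBNF_FOR_TYPE[p.annotation]
            for name, p in sig.parameters.items()}

def search(query: str, limit: int = 10):
    ...

fragments = args_to_gbnf(search)
```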

Visualization

Export any grammar as an NFA diagram in DOT or SVG format:

import pygbnf as cfg
from pygbnf import select, one_or_more, optional
from pygbnf.visualization import write_grammar_svg

g = cfg.Grammar()

@g.rule
def number():
    return optional("-") + one_or_more(select("0123456789"))

@g.rule
def operator():
    return select(["+", "-", "*", "/"])

@g.rule
def expression():
    atom = select([number(), "(" + expression() + ")"])
    return atom + cfg.zero_or_more(cfg.group(" " + operator() + " " + expression()))

g.start("expression")

# Generates .dot + .svg (requires Graphviz)
write_grammar_svg(g, "arithmetic.svg")

When the rule_names argument is omitted, only user-defined rules are included in the diagram (auto-generated infrastructure rules like ws, json-string, etc. are filtered out).

Requirements

  • Python 3.8+
  • Optional: openai>=1.0 for GrammarLLM (pip install pygbnf[llm])
  • Optional: Graphviz CLI for SVG rendering

Acknowledgements

  • guidance-ai — pygbnf's composable API is inspired by their approach to constrained generation
  • llama.cpp — for the GBNF format and the underlying inference engine
