pygbnf

A composable Python DSL for building GBNF grammars compatible with llama.cpp.

  1. Define context-free grammars using expressive Python functions.

  2. Compile them into valid GBNF strings for constrained LLM generation.

  3. Match rules in real time during inference.

Installation

pip install pygbnf            # core DSL only
pip install "pygbnf[llm]"     # + openai (for GrammarLLM); quoted so zsh does not expand the brackets
pip install "pygbnf[all]"     # everything

For grammar visualization (DOT / SVG export), install Graphviz:

brew install graphviz   # macOS
apt install graphviz    # Debian / Ubuntu

Quick Start

Start llama-server with your favorite GGUF model.

$ llama-server -m LFM2-8B-A1B-Q4_K_M.gguf

Build a grammar and constrain the model.

from pygbnf import Grammar, GrammarLLM, select

g = Grammar()

@g.rule
def answer():
    return select(["yes", "no", "maybe"])

g.start("answer")

llm = GrammarLLM("http://localhost:8080/v1")

text, _ = llm.complete(
    messages=[{"role": "user", "content": "Is the sky blue?"}],
    grammar=g
)

print(text)

The grammar constrains the LLM output — it can only produce yes, no, or maybe.

Guidance-Style GBNF

import pygbnf as cfg
from pygbnf import select, one_or_more, zero_or_more

g = cfg.Grammar()

@g.rule
def number():
    n = one_or_more(select("0123456789"))
    return select(['-' + n, n])

@g.rule
def operator():
    return select(['+', '*', '**', '/', '-'])

@g.rule
def expression():
    return select([
        number(),
        expression() + zero_or_more(" ") + operator()
            + zero_or_more(" ") + expression(),
        "(" + expression() + ")"
    ])

g.start("expression")
print(g.to_gbnf())

Output:

root ::= expression

number ::= "-" [0123456789]+ | [0123456789]+
operator ::= "+" | "*" | "**" | "/" | "-"
expression ::=
    number
  | expression " "* operator " "* expression
  | "(" expression ")"

LLM Usage

pygbnf includes GrammarLLM, a thin wrapper around any OpenAI-compatible endpoint (llama.cpp, vLLM, Ollama…) that injects the GBNF grammar automatically.
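
As a rough illustration of what "injecting the grammar" might look like on the wire, the sketch below builds a chat-completion request body with an extra grammar field, which llama.cpp's server accepts. The helper name build_grammar_request and the model placeholder are hypothetical, and this is not GrammarLLM's actual implementation.

```python
import json

def build_grammar_request(messages, gbnf, model="local-model"):
    """Return a chat-completion request body carrying a GBNF grammar.

    Hypothetical helper: `grammar` is a non-standard field, so the server
    must understand it for the constraint to take effect.
    """
    return {
        "model": model,              # placeholder model name (assumption)
        "messages": list(messages),
        "grammar": gbnf,             # GBNF text, e.g. from g.to_gbnf()
    }

gbnf = 'root ::= "yes" | "no" | "maybe"'
body = build_grammar_request(
    [{"role": "user", "content": "Is the sky blue?"}], gbnf
)
print(json.dumps(body, indent=2))
```

With the openai Python client, non-standard fields like this are usually passed through the extra_body argument of chat.completions.create.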

Streaming with rule matching

Enable match=True (or pass only/exclude) to get real-time RuleEvents as the LLM generates tokens:

from pygbnf import Grammar, GrammarLLM, select, one_or_more

g = Grammar()

@g.rule
def name():
    """A person's name."""
    return one_or_more(select("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ "))

@g.rule
def greeting():
    """A greeting message."""
    return select(["hello", "hi", "hey"]) + " " + name()

g.start("greeting")

llm = GrammarLLM("http://localhost:8080/v1")

for token, events in llm.stream(
    messages=[{"role": "user", "content": "Greet Alice."}],
    grammar=g,
    match=True,
):
    print(token, end="", flush=True)
    if events:
        for ev in events:
            print(f"\n  ← [{ev.rule}] {ev.text!r} (doc: {ev.doc})")
print()

Each RuleEvent carries:

  • rule — the matched rule name
  • text — the matched text
  • fn — the original Python function
  • doc — the function's docstring

Non-streaming completion

text, events = llm.complete(
    messages=[{"role": "user", "content": "Is the sky blue?"}],
    grammar=g,
    match=True,
)
print(text)
for ev in events:
    print(f"  [{ev.rule}] {ev.text!r}")

Schema-based grammar with LLM

Combine grammar_from_type with GrammarLLM to constrain output to a JSON schema:

from dataclasses import dataclass
from pygbnf import grammar_from_type, GrammarLLM

@dataclass
class City:
    name: str
    country: str
    population: int

g = grammar_from_type(City)
llm = GrammarLLM("http://localhost:8080/v1")

text, _ = llm.complete(
    messages=[{"role": "user", "content": "Describe Tokyo in JSON."}],
    grammar=g,
)
print(text)
# → {"name": "Tokyo", "country": "Japan", "population": 13960000}
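
Because the output is constrained to the schema, it can usually be loaded straight back into the dataclass. A minimal sketch, assuming the JSON keys match the field names exactly and using the sample output above in place of a live response:

```python
import json
from dataclasses import dataclass

@dataclass
class City:
    name: str
    country: str
    population: int

# Sample output copied from the example above (no server needed).
text = '{"name": "Tokyo", "country": "Japan", "population": 13960000}'

# Keys map one-to-one onto dataclass fields, so ** unpacking is enough.
city = City(**json.loads(text))
print(city.name, city.population)  # Tokyo 13960000
```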

Tool calling with Toolkit

Toolkit is a decorator-based tool registry. Register functions with @toolkit.tool, then pass the toolkit to llm.stream() or llm.complete() — the grammar and system prompt are injected automatically.

import enum
from pygbnf import GrammarLLM, Toolkit

toolkit = Toolkit()

class Units(enum.Enum):
    CELSIUS = "celsius"
    FAHRENHEIT = "fahrenheit"

@toolkit.tool
def get_weather(city: str, units: Units = Units.CELSIUS) -> str:
    """Get current weather for a city."""
    return f"22° {units.value} in {city}"

@toolkit.tool
def search_web(query: str, max_results: int = 5) -> str:
    """Search the web."""
    return f"Found {max_results} results for {query!r}"

llm = GrammarLLM("http://localhost:8080/v1")

# Stream with toolkit — grammar + system prompt auto-injected
result = ""
for token, _ in llm.stream(
    messages=[{"role": "user", "content": "Weather in Tokyo?"}],
    toolkit=toolkit,
):
    print(token, end="", flush=True)
    result += token

# Dispatch the JSON result to the matching function
output = toolkit.dispatch(result)
print(output)  # → "22° celsius in Tokyo"

The toolkit:

  • Builds a GBNF grammar constraining the LLM to produce {"function": "...", "arguments": {...}} with only registered tool names and typed arguments
  • Generates a system prompt listing available tools with signatures and docstrings
  • Dispatches the parsed JSON to the right function, converting enum strings back to Python Enum instances automatically

You can also use llm.tool_call() as a one-liner that streams + dispatches:

output = llm.tool_call(toolkit, "Weather in Tokyo?")
print(output)  # → "22° celsius in Tokyo"
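
To make the dispatch step concrete, here is a standalone sketch of how a {"function": "...", "arguments": {...}} payload could be routed to a function, with enum strings converted via type hints. The TOOLS registry and dispatch function are hypothetical stand-ins, not Toolkit's real internals.

```python
import enum
import json
import typing

class Units(enum.Enum):
    CELSIUS = "celsius"
    FAHRENHEIT = "fahrenheit"

def get_weather(city: str, units: Units = Units.CELSIUS) -> str:
    """Get current weather for a city."""
    return f"22° {units.value} in {city}"

# Hypothetical registry: tool name -> function.
TOOLS = {"get_weather": get_weather}

def dispatch(raw: str) -> str:
    """Parse a tool-call JSON string and invoke the named function."""
    call = json.loads(raw)
    fn = TOOLS[call["function"]]
    hints = typing.get_type_hints(fn)
    kwargs = {}
    for key, value in call["arguments"].items():
        hint = hints.get(key)
        # Convert enum values (e.g. "fahrenheit") back to Enum members.
        if isinstance(hint, type) and issubclass(hint, enum.Enum):
            value = hint(value)
        kwargs[key] = value
    return fn(**kwargs)

raw = '{"function": "get_weather", "arguments": {"city": "Tokyo", "units": "fahrenheit"}}'
result = dispatch(raw)
print(result)  # 22° fahrenheit in Tokyo
```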

Note: GrammarLLM requires the openai package: pip install openai. The LLM server must support the grammar field in its API (llama.cpp does so natively).

Architecture

AST Nodes

Every grammar construct is a frozen dataclass node. Nodes compose via + (sequence) and | (alternative):

Node            Description              GBNF
Literal         Double-quoted string     "hello"
CharacterClass  Character class          [0-9]
Sequence        Ordered concatenation    a b c
Alternative     Choice between options   a | b | c
Repeat          Quantified repetition    x+, x*, x?, x{2,5}
RuleReference   Reference to named rule  expression
TokenReference  Token-level constraint   <think>, <[1000]>
Group           Parenthesised group      (a b)
Optional_       Optional element         x?
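
The composition idea can be sketched with a few toy frozen dataclasses. This is an illustration of the pattern only, covering three node kinds with simplified rendering; the class bodies and the _lift helper are assumptions and do not reproduce pygbnf's source.

```python
from dataclasses import dataclass
from typing import Tuple

class Node:
    """Base class providing the + (sequence) and | (alternative) operators."""
    def __add__(self, other):
        return Sequence((self, _lift(other)))
    def __radd__(self, other):
        return Sequence((_lift(other), self))
    def __or__(self, other):
        return Alternative((self, _lift(other)))

def _lift(value):
    """Wrap plain strings as Literal nodes."""
    return Literal(value) if isinstance(value, str) else value

@dataclass(frozen=True)
class Literal(Node):
    value: str
    def to_gbnf(self):
        escaped = self.value.replace("\\", "\\\\").replace('"', '\\"')
        return '"' + escaped + '"'

@dataclass(frozen=True)
class Sequence(Node):
    items: Tuple[Node, ...]
    def to_gbnf(self):
        parts = []
        for item in self.items:
            text = item.to_gbnf()
            # Alternatives bind looser than sequences, so parenthesise them.
            parts.append(f"({text})" if isinstance(item, Alternative) else text)
        return " ".join(parts)

@dataclass(frozen=True)
class Alternative(Node):
    options: Tuple[Node, ...]
    def to_gbnf(self):
        return " | ".join(o.to_gbnf() for o in self.options)

greeting = (Literal("hello") | "hi") + " " + Literal("world")
print(greeting.to_gbnf())  # ("hello" | "hi") " " "world"
```

Freezing the nodes means an expression can be reused in several rules without one use mutating another.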

DSL Combinators

from pygbnf import select, one_or_more, zero_or_more, optional, repeat, group

# Character class from string
select("0123456789")          # → [0123456789]

# Alternative from list
select(["+", "-", "*"])       # → "+" | "-" | "*"

# Repetition
one_or_more(x)                # → x+
zero_or_more(x)               # → x*
optional(x)                   # → x?
repeat(x, 2, 5)               # → x{2,5}

# Grouping
group(a + b)                  # → (a b)

# Operators
a + b                         # → a b   (sequence)
a | b                         # → a | b (alternative)

Rule Definition

Rules are defined with the @g.rule decorator. Calling a rule function inside another rule creates a rule reference (not an inline expansion):

import pygbnf as cfg
from pygbnf import select, one_or_more

g = cfg.Grammar()

@g.rule
def digit():
    return select("0123456789")

@g.rule
def number():
    return one_or_more(digit())  # → digit+  (reference, not inlined)

Forward references work naturally — rules can reference rules defined later.
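
A toy registry shows why references make forward and recursive use possible: the decorated name returns only the rule's name, and bodies are evaluated once every rule is registered. ToyGrammar is a hypothetical stand-in that uses plain strings as rule bodies; it is not how pygbnf is implemented.

```python
class ToyGrammar:
    """Minimal rule registry; rule bodies are strings for simplicity."""

    def __init__(self):
        self._bodies = {}  # rule name -> function producing the body

    def rule(self, fn):
        self._bodies[fn.__name__] = fn

        # Calling the decorated name yields a reference (the rule's name),
        # so the body is never expanded inline.
        def reference():
            return fn.__name__

        reference.__name__ = fn.__name__
        return reference

    def to_gbnf(self):
        # Bodies run only here, after all rules are registered, which is
        # what lets a rule mention one that is defined later.
        return "\n".join(
            f"{name} ::= {fn()}" for name, fn in self._bodies.items()
        )

g = ToyGrammar()

@g.rule
def number():
    return digit() + "+"  # digit is defined below

@g.rule
def digit():
    return "[0-9]"

print(g.to_gbnf())
# number ::= digit+
# digit ::= [0-9]
```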

Token Constraints

llama.cpp supports token-level matching:

from pygbnf import token, token_id, not_token, not_token_id

token("think")        # → <think>
token_id(1000)        # → <[1000]>
not_token("think")    # → !<think>
not_token_id(1001)    # → !<[1001]>

Grammar Helpers

Common patterns prebuilt:

from pygbnf import (
    WS, ws, ws_required,           # whitespace
    keyword, identifier, number,    # basic tokens
    float_number, string_literal,   # complex tokens
    comma_list, between,           # structural patterns
    separated_by, spaced_comma_list,
)

comma_list(identifier())   # → ident ("," " "* ident)*
between("(", expr, ")")    # → "(" expr ")"

Recursion Analysis

Detect left recursion in your grammar:

cycles = g.detect_left_recursion()
# Warns: "Left recursion detected: expression -> expression"
# Suggests: rewrite as base (op base)*
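
For intuition, left recursion can be found by following only the leftmost symbol of each alternative and checking whether a rule reaches itself. The sketch below works on a plain dict representation and ignores nullable prefixes; find_left_recursion is a hypothetical helper, not the library's detect_left_recursion. The second grammar applies the suggested base (op base)* rewrite, encoded with a right-recursive tail rule.

```python
def find_left_recursion(grammar):
    """Return the rules that can reach themselves through leftmost symbols.

    `grammar` maps a rule name to a list of alternatives; each alternative
    is a list of symbols. Symbols that are not keys are terminals.
    Simplification: nullable prefixes are not considered.
    """
    def leftmost(rule):
        return {alt[0] for alt in grammar[rule] if alt and alt[0] in grammar}

    recursive = []
    for start in grammar:
        seen, stack = set(), list(leftmost(start))
        while stack:
            rule = stack.pop()
            if rule == start:
                recursive.append(start)
                break
            if rule not in seen:
                seen.add(rule)
                stack.extend(leftmost(rule))
    return recursive

left_recursive = {
    "expression": [["number"], ["expression", "operator", "expression"]],
    "number": [["DIGITS"]],
    "operator": [["+"], ["-"]],
}
rewritten = {
    "expression": [["number", "tail"]],
    "tail": [["operator", "number", "tail"], []],  # (operator number)*
    "number": [["DIGITS"]],
    "operator": [["+"], ["-"]],
}
print(find_left_recursion(left_recursive))  # ['expression']
print(find_left_recursion(rewritten))       # []
```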

Examples

See the examples/ directory:

File                   Description
quickstart.py          The quick-start example from this README
arithmetic.py          Arithmetic expressions with operator precedence
csv_grammar.py         CSV file format
json_grammar.py        Full JSON grammar
simple_lang.py         A small programming language
token_demo.py          Token-level constraints
demo_schema.py         Schema → grammar examples
demo_enum_select.py    Enum-based selection
demo_simple_lang.py    Mini-language generation with LLM
demo_vision.py         Vision + grammar: solve math from an image
demo_visualization.py  Export grammar NFA as DOT / SVG

Run any example:

python examples/arithmetic.py

Schema Generation

Auto-generate grammars from Python types and dataclasses:

from dataclasses import dataclass
from pygbnf import grammar_from_type

@dataclass
class Movie:
    title: str
    year: int
    rating: float

g = grammar_from_type(Movie)
print(g.to_gbnf())
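
The general idea of deriving a grammar from a dataclass can be sketched with dataclasses.fields. The toy below emits a single GBNF rule for a flat JSON object; the TYPE_RULES mapping, the rule names and the output layout are assumptions for illustration and will differ from what grammar_from_type actually generates.

```python
from dataclasses import dataclass, fields

# Hypothetical mapping from Python types to value-rule names; the rules
# themselves (string, integer, number) would be defined elsewhere.
TYPE_RULES = {str: "string", int: "integer", float: "number"}

def toy_object_rule(cls):
    """Build one GBNF rule for a flat JSON object mirroring `cls`."""
    parts = []
    for f in fields(cls):
        key = '"\\"' + f.name + '\\":"'  # GBNF literal matching "name":
        parts.append(key + " " + TYPE_RULES[f.type])
    body = ' "," '.join(parts)
    return cls.__name__.lower() + ' ::= "{" ' + body + ' "}"'

@dataclass
class Movie:
    title: str
    year: int
    rating: float

rule = toy_object_rule(Movie)
print(rule)
```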

Also supports function signatures:

from pygbnf import grammar_from_args

def search(query: str, limit: int = 10):
    ...

g = grammar_from_args(search)
print(g.to_gbnf())
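
For reference, this is the kind of information a function signature exposes through the standard inspect module: parameter names, annotations and whether a default exists. The describe_parameters helper is hypothetical and only shows what is available to a signature-based generator; it does not claim to mirror grammar_from_args.

```python
import inspect

def search(query: str, limit: int = 10):
    ...

def describe_parameters(fn):
    """List (name, annotation, has_default) for each parameter of `fn`."""
    described = []
    for name, param in inspect.signature(fn).parameters.items():
        has_default = param.default is not inspect.Parameter.empty
        described.append((name, param.annotation, has_default))
    return described

params = describe_parameters(search)
for name, annotation, has_default in params:
    print(name, annotation.__name__, "optional" if has_default else "required")
# query str required
# limit int optional
```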

Visualization

Export any grammar as an NFA diagram in DOT or SVG format:

import pygbnf as cfg
from pygbnf import select, one_or_more, optional
from pygbnf.visualization import write_grammar_svg

g = cfg.Grammar()

@g.rule
def number():
    return optional("-") + one_or_more(select("0123456789"))

@g.rule
def operator():
    return select(["+", "-", "*", "/"])

@g.rule
def expression():
    atom = select([number(), "(" + expression() + ")"])
    return atom + cfg.zero_or_more(cfg.group(" " + operator() + " " + expression()))

g.start("expression")

# Generates .dot + .svg (requires Graphviz)
write_grammar_svg(g, "arithmetic.svg")

When the optional rule_names argument is omitted, only user-defined rules are included; auto-generated infrastructure rules such as ws and json-string are filtered out.

Requirements

  • Python 3.8+
  • Optional: openai>=1.0 for GrammarLLM (pip install pygbnf[llm])
  • Optional: Graphviz CLI for SVG rendering

Acknowledgements

  • guidance-ai — pygbnf's composable API is inspired by their approach to constrained generation
  • llama.cpp — for the GBNF format and the underlying inference engine
