A Python library for constraining language model outputs to follow CFG, REGEX and JSON (experimental).
Features
- Zero dependencies
- Parses all context-free grammars, including ambiguous grammars
- Returns tokens constrained to a specified vocabulary if needed
- Type annotations with mypy
- Includes a Rust implementation
Quick Start
Installation
pip install lextrail
Usage Modes
The library supports two ways to generate constrained text, depending on your use case:
Trail
Use a Trail object when you want to generate the complete next element without vocabulary constraints.
CFG
from lextrail.guide import trail_cfg
example = r"""
start: expression
expression: term (("+" | "-") term)
term: factor (("*" | "/") factor)
factor: NUMBER
NUMBER: /-?[0-9]+/
"""
trail = trail_cfg(example)
Regex
from lextrail.guide import trail_rex
example = r"[a-z]+@[a-z]+\.(com|org|net)"
trail = trail_rex(example)
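To see which strings this pattern admits, it can be checked with the standard library's re module, independently of lextrail. This is only a minimal sketch: the sample addresses are made up for illustration, and it says nothing about how trail_rex works internally.

```python
import re

# Same pattern as in the example above.
pattern = re.compile(r"[a-z]+@[a-z]+\.(com|org|net)")

# Made-up sample strings, for illustration only.
samples = [
    "alice@example.org",
    "bob@mail.com",
    "alice@example.io",   # top-level domain not in (com|org|net)
    "Alice@example.org",  # uppercase letter not in [a-z]
]

# fullmatch requires the whole string to match, not just a prefix.
accepted = [s for s in samples if pattern.fullmatch(s)]
print(accepted)
```

Only the first two samples are accepted, which is the set of strings a constrained generation should stay within.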
You can also combine both TERMINAL and REGEX expressions using trail_exp.
from lextrail.guide import trail_exp
example = r'/[0-9]\.[0-9]/ "+" /[0-9]\.[0-9]/'
trail = trail_exp(example)
JSON
This is an experimental version. Not intended for production use.
- Currently supported keywords: type, enum, const, properties, required, items, prefixItems, oneOf
- Constraint intersection (e.g., combining prefixItems with items, or const with enum) is not yet implemented
from lextrail.json import trail_json
example = r"""
{
"type": "object",
"properties": {
"user": {
"type": "object",
"properties": {
"name": {"type": "string"},
"email": {"type": "string"}
},
"required": ["email"]
}
}
}
"""
trail = trail_json(example)
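The schema above marks only email as required inside user, and user itself is optional. The sketch below makes that concrete with a hand-rolled checker covering just the three keywords used here (type, properties, required). The function name conforms and the sample documents are hypothetical, and this is not lextrail's implementation.

```python
import json

schema = json.loads("""
{
  "type": "object",
  "properties": {
    "user": {
      "type": "object",
      "properties": {
        "name": {"type": "string"},
        "email": {"type": "string"}
      },
      "required": ["email"]
    }
  }
}
""")

# Map the JSON Schema type names used in this example to Python types.
TYPES = {"object": dict, "string": str}

def conforms(value, schema):
    """Hypothetical checker for type, properties and required only."""
    expected = TYPES.get(schema.get("type"))
    if expected is not None and not isinstance(value, expected):
        return False
    if isinstance(value, dict):
        # Every required key must be present.
        if any(key not in value for key in schema.get("required", [])):
            return False
        # Keys that are present must satisfy their sub-schema.
        for key, sub in schema.get("properties", {}).items():
            if key in value and not conforms(value[key], sub):
                return False
    return True

with_email = json.loads('{"user": {"name": "Ada", "email": "ada@example.org"}}')
without_email = json.loads('{"user": {"name": "Ada"}}')
print(conforms(with_email, schema))
print(conforms(without_email, schema))
```

The first document conforms and the second does not, because email is missing. An empty object also conforms, since user is not listed as required at the top level.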
Then, run a random simulation.
import random
from lextrail.guide import get_next_values
response, value = [], ""
while values := get_next_values(trail, value):
    value = random.choice(values)
    response.append(value)
print("".join(response))
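The loop stops as soon as no further values are proposed. To show that control flow without installing anything, here is a self-contained sketch in which ToyTrail, TABLE and toy_next_values are hypothetical stand-ins, not lextrail objects. It assumes an empty string marks the end of the output.

```python
import random

# Hypothetical transition table: state -> {proposed value: next state}.
# It accepts a digit, then any number of "+digit" pairs; "" marks the end.
TABLE = {
    "start": {"1": "number", "2": "number"},
    "number": {"+": "operator", "": "done"},
    "operator": {"1": "number", "2": "number"},
    "done": {},
}

class ToyTrail:
    """Stand-in for a Trail object; it only remembers the current state."""

    def __init__(self):
        self.state = "start"

def toy_next_values(trail, value):
    """Consume the previously chosen value, then list the values allowed next."""
    options = TABLE[trail.state]
    if value in options:
        trail.state = options[value]
    return list(TABLE[trail.state])

random.seed(0)  # fixed seed so runs are repeatable
trail = ToyTrail()
response, value = [], ""
while values := toy_next_values(trail, value):
    value = random.choice(values)
    response.append(value)
print("".join(response))
```

Whatever the random choices are, the printed string is a digit followed by zero or more "+digit" pairs, and the last recorded value is the empty end marker.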
ASM
Use an ASM object when you need to constrain the next token to a predefined vocabulary.
Example
from lextrail.assemble import asm_cfg
example = r"""
start: L0
L0: ("A" | "B")+ L1
L1: ("C" | "D") L2
L2: "E" L3*
L3: /FGH/
"""
asm = asm_cfg(example, ["AD", "EF", "GH"])
If you launch a simulation, then the proposals will be elements of the provided vocabulary.
import random
from lextrail.assemble import get_next_tokens
response, value = [], ""
while values := get_next_tokens(asm, value):
    value = random.choice(values)
    response.append(value)
print("".join(response))
assert response == ["AD", "EF", "GH", ""]
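The assertion holds even though the choice is random, because this vocabulary leaves only one way through the grammar. The brute-force sketch below checks that with the standard library alone. The regex is an assumed hand translation of the grammar above, and the search is capped at five tokens, so it illustrates the result rather than how lextrail computes it.

```python
import re
from itertools import product

# Assumed regex equivalent of the grammar above:
# L0 = ("A"|"B")+ L1, L1 = ("C"|"D") L2, L2 = "E" L3*, L3 = /FGH/
language = re.compile(r"[AB]+[CD]E(FGH)*")
vocabulary = ["AD", "EF", "GH"]

# Try every sequence of 1 to 5 tokens and keep those whose
# concatenation is a complete sentence of the language.
valid = [
    seq
    for length in range(1, 6)
    for seq in product(vocabulary, repeat=length)
    if language.fullmatch("".join(seq))
]
print(valid)  # [('AD', 'EF', 'GH')]
```

No token starts with "F", so after "GH" nothing can extend the output, and the only remaining proposal is the empty string that ends the response.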
The same works with any of the supported formats.
# CFG
from lextrail.assemble import asm_cfg
asm_cfg(.., [..])
# REGEX
from lextrail.assemble import asm_rex
asm_rex(.., [..])
# MIXED
from lextrail.assemble import asm_exp
asm_exp(.., [..])
# JSON
from lextrail.json import asm_json
asm_json(.., [..])
Playground
I've built a playground to showcase the different simulations. You can use either a Trail object or an ASM object.
from lextrail.guide import trail_cfg
from lextrail.playground import run_playground
example = r"""
start: expression
expression: term* (( "+" | "-") term)+
term: factor* (("*" | "/") factor)+
factor: NUMBER?
NUMBER: /[0-1]+/
"""
trail = trail_cfg(example)
run_playground(trail)
Project details
File details
Details for the file lextrail-0.1.0.tar.gz.
File metadata
- Download URL: lextrail-0.1.0.tar.gz
- Upload date:
- Size: 156.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d4d018877c172e8456cd4a95e654007b8df5c0447d936b2ebb3f51a5c6b40bca |
| MD5 | c13cf8f2e38293535b2128d7cf251f5d |
| BLAKE2b-256 | 92262eb994cf8b8e2b9ac6e659ffe832a34b6cb64117172abc7c900cc52da50d |
File details
Details for the file lextrail-0.1.0-py3-none-any.whl.
File metadata
- Download URL: lextrail-0.1.0-py3-none-any.whl
- Upload date:
- Size: 165.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8321017cd5f39405035aa688df1c50046d59abb0070d907b70e1d5ba188b3c9a |
| MD5 | 4e493677579fd953a918b656c364a071 |
| BLAKE2b-256 | b8ef02450e9d4b00b9ad83b178aacbacb44d0dffe460a2ab1757f8311b138577 |