Multi-language library for parsing labeled/structured LLM output

Project description

Python usage

The Python bindings call into the core Go implementation, which is compiled to WebAssembly and executed through a WASI runtime (wasmtime).

Requirements

  • Python 3.8+
  • wasmtime CLI installed and available on your PATH

Install:

pip install structured-parse

Basic import:

from structured_parse import StructuredParser, Label

Types

The exact implementation may vary, but conceptually:

from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class Label:
    name: str
    required: bool = False
    required_with: Optional[List[str]] = None
    is_json: bool = False
    is_block_start: bool = False

@dataclass
class ParserOptions:
    separators: str = ":~-="  # default separators

# typing.Dict/List keep these annotations valid on Python 3.8,
# where dict[...] and list[...] raise TypeError at class definition.
@dataclass
class ParseResult:
    result: Dict[str, object]
    errors: List[str]

@dataclass
class ParseBlocksResult:
    blocks: List[Dict[str, object]]
    errors: List[str]

StructuredParser uses the WASM module under the hood; the interface is synchronous, but initialization may perform some one-time setup on first use.


Creating a parser

from structured_parse import StructuredParser, Label

labels = [
    Label(name="Reason", required=True),
    Label(name="Sentiment", required=True),
    Label(name="Confidence", is_json=True),
]

parser = StructuredParser()  # default options (":~-=" separators)

If your installed version exposes ParserOptions, pass it to the constructor:

from structured_parse import StructuredParser, Label, ParserOptions

options = ParserOptions(separators=":")  # only allow colon
parser = StructuredParser(options=options)

Parsing a single record

from structured_parse import StructuredParser, Label

labels = [
    Label(name="Reason", required=True),
    Label(name="Sentiment", required=True),
    Label(name="Confidence", is_json=True),
]

llm_output = """
Reason: I see mostly positive language.
Sentiment: Positive
Confidence: {"score": 0.94, "threshold": 0.8}
"""

parser = StructuredParser()
result = parser.parse(labels, llm_output)

if result.errors:
    print("Warnings:")
    for err in result.errors:
        print("  -", err)

reason = result.result.get("Reason")
sentiment = result.result.get("Sentiment")
confidence = result.result.get("Confidence")  # likely a dict

print("Reason:", reason)
print("Sentiment:", sentiment)
print("Confidence:", confidence)

Notes:

  • Matching is case-insensitive, but result.result keys use your original label names ("Reason", "Sentiment", etc.).

  • JSON fields (is_json=True) are parsed; on failure:

    • The raw string is kept.
    • An error is added to result.errors.
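
The JSON fallback described in the notes can be pictured with the sketch below. It is not the library's code: parse_json_field is a hypothetical helper, and the error text is only modeled on the "JSON error in '...'" message listed under Common errors.

```python
import json
from typing import List, Tuple


def parse_json_field(label: str, raw: str) -> Tuple[object, List[str]]:
    """Return (value, errors); keep the raw string when JSON decoding fails."""
    try:
        return json.loads(raw), []
    except json.JSONDecodeError as exc:
        # Assumed message format, modeled on the documented example.
        return raw, [f"JSON error in '{label}': {exc.msg}"]


# Valid JSON is decoded into a Python object.
value, errors = parse_json_field("Confidence", '{"score": 0.94, "threshold": 0.8}')
print(value, errors)

# Invalid JSON keeps the raw text and reports one error.
value, errors = parse_json_field("Confidence", "about 94%")
print(value, errors)
```

Because the raw string is kept on failure, callers that need a dict should check the value's type (or result.errors) before indexing into it.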

Parsing multiple blocks

from structured_parse import StructuredParser, Label

labels = [
    Label(name="Step", is_block_start=True, required=True),
    Label(name="Analysis"),
    Label(name="Data", is_json=True),
    Label(name="Conclusion"),
]

llm_output = """
Step: Data Collection
Analysis: Gathered user feedback from 1,000 responses
Data: {"positive": 650, "neutral": 250, "negative": 100}
Conclusion: Majority positive sentiment

Step: Trend Analysis
Analysis: Comparing with previous quarter
Data: {"growth": 15.5, "retention": 92.3}
Conclusion: Strong upward trend
"""

parser = StructuredParser()
blocks_result = parser.parse_blocks(labels, llm_output)

for i, block in enumerate(blocks_result.blocks, start=1):
    print(f"\n=== Step {i}: {block.get('Step')} ===")
    print("Analysis:", block.get("Analysis"))
    print("Data:", block.get("Data"))
    print("Conclusion:", block.get("Conclusion"))

if blocks_result.errors:
    print("\nWarnings:")
    for err in blocks_result.errors:
        print("  -", err)

Each block is a dict[str, object].
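
As a rough mental model of how a block-start label splits the output, the sketch below groups already-parsed (label, value) pairs into blocks. group_into_blocks is a hypothetical helper, not part of structured-parse, and how the library treats fields that appear before the first block-start label is not stated here; this sketch keeps them in a leading block.

```python
from typing import Dict, List, Tuple


def group_into_blocks(
    pairs: List[Tuple[str, object]], block_start: str
) -> List[Dict[str, object]]:
    """Open a new dict each time the block-start label appears."""
    blocks: List[Dict[str, object]] = []
    for name, value in pairs:
        # Assumption: fields seen before any block start go into a leading block.
        if name == block_start or not blocks:
            blocks.append({})
        blocks[-1][name] = value
    return blocks


pairs = [
    ("Step", "Data Collection"),
    ("Conclusion", "Majority positive sentiment"),
    ("Step", "Trend Analysis"),
    ("Conclusion", "Strong upward trend"),
]
for block in group_into_blocks(pairs, block_start="Step"):
    print(block)
```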


Custom separators

Default separators: :, ~, -, =.

If ParserOptions is exposed, you can restrict or change them:

from structured_parse import StructuredParser, Label, ParserOptions

labels = [
    Label(name="Key"),
    Label(name="Value"),
]

options = ParserOptions(separators=":")  # only colon
parser = StructuredParser(options=options)

output = """
Key: foo
Value: bar
"""

result = parser.parse(labels, output)
print(result.result)

Lines without a configured separator aren’t recognized as label lines and are treated as part of the current field’s value.
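
To illustrate why restricting separators changes which lines count as label lines, here is a minimal sketch of a label-line matcher. The regular expression and the helper names (build_label_pattern, split_label_line) are assumptions for illustration; the actual matching rules live in the Go implementation.

```python
import re
from typing import List, Optional, Tuple


def build_label_pattern(names: List[str], separators: str) -> re.Pattern:
    """Compile a case-insensitive pattern: label, one separator, then the value."""
    # Longest names first so "Next Field" would win over a shorter "Next".
    labels = "|".join(re.escape(n) for n in sorted(names, key=len, reverse=True))
    seps = "".join(re.escape(s) for s in separators)
    return re.compile(rf"^\s*({labels})\s*[{seps}]\s*(.*)$", re.IGNORECASE)


def split_label_line(line: str, pattern: re.Pattern) -> Optional[Tuple[str, str]]:
    """Return (label as written, value), or None if the line is not a label line."""
    m = pattern.match(line)
    if m is None:
        return None
    return m.group(1), m.group(2)


colon_only = build_label_pattern(["Key", "Value"], ":")
defaults = build_label_pattern(["Key", "Value"], ":~-=")

print(split_label_line("Key: foo", colon_only))   # recognized
print(split_label_line("Key - foo", colon_only))  # not a label line with ":" only
print(split_label_line("Key - foo", defaults))    # recognized with the defaults
```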


Multiline values

Values span multiple lines until the next recognized label:

labels = [
    Label(name="Description"),
    Label(name="Next Field"),
]

llm_output = """
Description: This is a long description
that spans multiple lines and will be
captured as a single value.
Next Field: Done
"""

parser = StructuredParser()
result = parser.parse(labels, llm_output)

print(result.result["Description"])
# "This is a long description\nthat spans multiple lines and will be\ncaptured as a single value."
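
The accumulation rule can be sketched in plain Python as follows. This is a simplified, colon-only illustration (collect_fields is a hypothetical helper, not the library's parser): a line that does not start a known label is appended to the value of the current field, and keys keep the original label spelling even though matching ignores case.

```python
from typing import Dict, List, Optional


def collect_fields(text: str, names: List[str]) -> Dict[str, str]:
    """Colon-only sketch: non-label lines extend the current field's value."""
    lookup = {n.lower(): n for n in names}  # case-insensitive lookup table
    fields: Dict[str, List[str]] = {}
    current: Optional[str] = None
    for line in text.strip().splitlines():
        head, sep, tail = line.partition(":")
        name = lookup.get(head.strip().lower()) if sep else None
        if name is not None:
            # A recognized label starts a new field.
            current = name
            fields[current] = [tail.strip()]
        elif current is not None:
            # Anything else continues the field being read.
            fields[current].append(line)
    return {k: "\n".join(v).strip() for k, v in fields.items()}


sample = """
description: This is a long description
that spans multiple lines and will be
captured as a single value.
Next Field: Done
"""
print(collect_fields(sample, ["Description", "Next Field"]))
```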

Error handling

Both parse and parse_blocks return result objects with an errors list:

result = parser.parse(labels, text)

if result.errors:
    print("Warnings:")
    for err in result.errors:
        print("  -", err)

# result.result is available even if there are warnings

Common errors:

  • Missing required fields: "Sentiment" is required
  • Dependency failures: "Action" requires "Action Input"
  • JSON parse errors: JSON error in 'Config': ...
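
The first two error kinds can be reproduced with a small validator. The sketch below assumes that required_with means "if this label is present, the listed labels must be present too", which is inferred from the example message rather than confirmed; validate and the trimmed Label class are illustrative stand-ins.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Label:  # trimmed stand-in for the conceptual Label type
    name: str
    required: bool = False
    required_with: Optional[List[str]] = None


def validate(labels: List[Label], result: Dict[str, object]) -> List[str]:
    """Collect required-field and dependency errors without raising."""
    errors: List[str] = []
    for label in labels:
        if label.required and label.name not in result:
            errors.append(f'"{label.name}" is required')
        if label.name in result:
            # Assumed semantics: a present label needs its companions present.
            for other in label.required_with or []:
                if other not in result:
                    errors.append(f'"{label.name}" requires "{other}"')
    return errors


labels = [
    Label("Sentiment", required=True),
    Label("Action", required_with=["Action Input"]),
]
for message in validate(labels, {"Action": "search"}):
    print(message)
```

Returning a list instead of raising mirrors the documented behavior that result.result stays available alongside the warnings.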

WASM and wasmtime

Under the hood, StructuredParser:

  1. Loads the WebAssembly module built from the Go implementation.
  2. Executes it via the wasmtime CLI (WASI).

This means:

  • You get the same behavior and bugfixes as the Go implementation.
  • You must have wasmtime installed and accessible on your system.
  • If initialization fails (e.g., WASM module not found, wasmtime missing), the library will raise a Python exception.

If you encounter WASM-related issues, verify:

  • wasmtime --version works in your shell.
  • Your environment can find the WASM module packaged with structured-parse.
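
The first check can also be done from Python before constructing a parser. This sketch uses only the standard library; find_wasmtime and wasmtime_version are hypothetical helper names, and the only CLI flag used is the --version flag mentioned above.

```python
import shutil
import subprocess
from typing import Optional


def find_wasmtime() -> Optional[str]:
    """Return the full path of the wasmtime executable, or None if not on PATH."""
    return shutil.which("wasmtime")


def wasmtime_version(path: str) -> str:
    """Run `wasmtime --version` and return its output."""
    done = subprocess.run([path, "--version"], capture_output=True, text=True)
    return done.stdout.strip()


path = find_wasmtime()
if path is None:
    print("wasmtime not found on PATH; install it before using structured-parse")
else:
    print("found", path, "-", wasmtime_version(path))
```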
