GASP (Gee Another Schema Parser) - A validator and type safe deserializer for LLM output.

GASP - Type-Safe LLM Output Parser

⚠️ MAJOR BREAKING CHANGES IN VERSION 1.0.0 ⚠️
Version 1.0.0 is a complete rewrite that removes WAIL entirely and introduces a new tag-based parsing approach.
If you're using an older version of GASP, you'll need to significantly change your code to upgrade.
See the Migration Guide below.

GASP is a Rust-based parser for turning LLM outputs into properly typed Python objects. It handles streaming JSON fragments, recovers from common LLM quirks, and makes structured data extraction actually pleasant.

The Problem

LLMs are great at generating structured data when asked, but not perfect:

<Person>
{
  "name": "Alice Smith",
  "age": 30,
  hobbies: ["coding", "hiking"]
}
</Person>

That output has an unquoted key (hobbies), and in practice it usually arrives embedded in natural-language text. Strict JSON parsers reject it outright.

How GASP Works

GASP uses a tag-based approach to extract and type-cast structured data:

  1. Tags like <Person>...</Person> mark where the structured data lives (and what type it is)
  2. The parser ignores everything outside those tags
  3. Inside the tags, it handles messy JSON with broken quotes, trailing commas, etc.
  4. The data gets converted into proper Python objects based on type annotations
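
The four steps above can be illustrated with a small standard-library sketch. This is not GASP's Rust parser: the helper names (`repair_json`, `extract_tagged`) and the regex-based repairs are assumptions made for illustration, and the repairs are naive (they do not check whether a match sits inside a string value).

```python
import json
import re

# Hypothetical helpers for illustration only; GASP's real parser works differently.
TAG_RE = re.compile(r"<(?P<tag>[A-Za-z_]\w*)>(?P<body>.*?)</(?P=tag)>", re.DOTALL)
BARE_KEY_RE = re.compile(r'([{,]\s*)([A-Za-z_]\w*)(\s*:)')
TRAILING_COMMA_RE = re.compile(r",(\s*[}\]])")


def repair_json(text: str) -> str:
    """Quote bare object keys and drop trailing commas (naive text rewrite)."""
    text = BARE_KEY_RE.sub(r'\1"\2"\3', text)
    return TRAILING_COMMA_RE.sub(r"\1", text)


def extract_tagged(output: str):
    """Return (tag_name, parsed_data) for the first tagged block, or None."""
    match = TAG_RE.search(output)
    if match is None:
        return None
    return match.group("tag"), json.loads(repair_json(match.group("body")))


llm_output = """Sure, here is the person you asked for:
<Person>
{
  "name": "Alice Smith",
  "age": 30,
  hobbies: ["coding", "hiking",],
}
</Person>
Let me know if you need anything else."""

tag, data = extract_tagged(llm_output)
print(tag, data)
```

The tag name tells the caller which type to build, and the text outside the tags never reaches the JSON step.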

Features

  • Tag-Based Extraction: Extract structured data even when surrounded by explanatory text
  • Streaming Support: Process data incrementally as it arrives from the LLM
  • Type Inference: Automatically match JSON objects to Python classes
  • Error Recovery: Handle common JSON mistakes that LLMs make
  • Pydantic Integration: Works with Pydantic for validation and schema definition

Installation

pip install gasp-py

Quick Example

from gasp import Parser
from typing import List, Optional

# Regular classes work now - no need for Deserializable
class Address:
    def __init__(self, street="", city="", zip_code=""):
        self.street = street
        self.city = city
        self.zip_code = zip_code

class Person:
    def __init__(self, name="", age=0, address=None, hobbies=None):
        self.name = name
        self.age = age
        self.address = address or Address()
        self.hobbies = hobbies or []

# Create a parser for the Person type
parser = Parser(Person)

# Process LLM output chunks as they arrive
chunks = [
    '<Person>{"name": "Alice", "age": 30',
    ', "address": {"street": "123 Main St", "city": "Springfield"',
    ', "zip_code": "12345"}, "hobbies": ["reading", "coding"]}</Person>'
]

for chunk in chunks:
    result = parser.feed(chunk)
    print(result)  # Will show partial objects as they're built

# Get the final validated result
person = parser.validate()
print(f"Hello {person.name}!")  # Hello Alice!
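
To make the chunk-by-chunk flow more concrete, here is a simplified standard-library sketch of the buffering problem a streaming parser has to solve: tags and JSON can be split at arbitrary points across chunks. The `TagBuffer` class is hypothetical and much weaker than `Parser.feed`, which returns partial objects as they are built; this sketch only reports the tag body once the closing tag has fully arrived.

```python
from typing import Optional


class TagBuffer:
    """Hypothetical illustration: buffer chunks until one complete <tag>...</tag> block is seen."""

    def __init__(self, tag: str):
        self.open_tag = f"<{tag}>"
        self.close_tag = f"</{tag}>"
        self.buffer = ""

    def feed(self, chunk: str) -> Optional[str]:
        """Append a chunk; return the tag body once the closing tag has arrived, else None."""
        self.buffer += chunk
        start = self.buffer.find(self.open_tag)
        if start == -1:
            return None
        end = self.buffer.find(self.close_tag, start)
        if end == -1:
            return None
        return self.buffer[start + len(self.open_tag):end]


# Note that the closing tag itself is split across the last two chunks.
chunks = [
    'Here you go: <Person>{"name": "Alice", "age": 30',
    ', "hobbies": ["reading", "coding"]}</Per',
    'son> Hope that helps!',
]

buf = TagBuffer("Person")
results = [buf.feed(chunk) for chunk in chunks]
print(results)
```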

Container Types

Lists and tuples get their own tags:

from typing import List, Tuple
from gasp import Parser

# List[T] uses <list> tag
parser = Parser(List[int])
result = parser.feed('<list>[1, 2, 3]</list>')  # returns [1, 2, 3]

# Tuple[T, ...] uses <tuple> tag
parser = Parser(Tuple[str, int, bool])
result = parser.feed('<tuple>["hello", 42, true]</tuple>')  # returns ("hello", 42, True)
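
As a rough picture of what converting a JSON array into a typed tuple involves, the sketch below uses only the standard `typing` helpers. The function name `coerce_tuple` is an assumption for illustration, it handles fixed-length `Tuple[...]` annotations only (not the variadic `Tuple[T, ...]` form), and it is not how GASP implements the conversion.

```python
import json
from typing import Tuple, get_args, get_origin


def coerce_tuple(raw_json: str, tuple_type):
    """Parse a JSON array and convert it to a tuple, applying each positional type."""
    if get_origin(tuple_type) is not tuple:
        raise TypeError("expected a Tuple[...] annotation")
    element_types = get_args(tuple_type)
    values = json.loads(raw_json)
    if len(values) != len(element_types):
        raise ValueError(f"expected {len(element_types)} items, got {len(values)}")
    # Call each annotated type on the value in the same position.
    return tuple(t(v) for t, v in zip(element_types, values))


result = coerce_tuple('["hello", 42, true]', Tuple[str, int, bool])
print(result)  # ('hello', 42, True)
```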

Using Deserializable for Advanced Streaming

Regular classes work for most use cases. Use Deserializable when you need:

  • Streaming control: React to data as it arrives
  • Custom validation: Validate/transform data during parsing
  • State management: Maintain computed fields or derived state

from gasp import Deserializable

class LiveDashboard(Deserializable):
    def __init__(self):
        self.events = []
        self.summary_stats = {}
    
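    def __gasp_update__(self, partial_data):
        # React to streaming updates
        if 'new_event' in partial_data:
            self.events.append(partial_data['new_event'])
            self._recalculate_stats()

    def _recalculate_stats(self):
        # Illustrative helper: keep derived state in sync with the events seen so far
        self.summary_stats['event_count'] = len(self.events)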
    
    @classmethod
    def __gasp_from_partial__(cls, partial_data):
        # Custom instantiation logic
        instance = cls()
        # Apply defaults, validate, etc.
        return instance

TL;DR: Regular classes for simple parsing, Deserializable for complex streaming behavior.

Working with Pydantic

GASP integrates seamlessly with Pydantic:

from pydantic import BaseModel
from gasp import Parser

class UserProfile(BaseModel):
    username: str
    email: str
    is_active: bool = True

# Create parser from Pydantic model
parser = Parser.from_pydantic(UserProfile)

# Feed LLM output with tags
llm_output = '<UserProfile>{"username": "alice42", "email": "alice@example.com"}</UserProfile>'
result = parser.feed(llm_output)

# Access as a proper Pydantic object
profile = UserProfile.model_validate(parser.validate())
print(profile.model_dump_json(indent=2))

How Tags Work

The tag name directly indicates what Python type to instantiate:

<Person>{ ... JSON data ... }</Person>  # Creates a Person instance
<list>[ ... array data ... ]</list>     # Creates a list
<Address>{ ... address data ... }</Address>  # Creates an Address

The parser ignores everything outside of the tags, so the LLM can provide explanations, context, or other text alongside the structured data.
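
One way to picture the tag-to-type mapping is a registry keyed by class name. The sketch below is an assumption-laden illustration using only the standard library: `REGISTRY` and `parse_all` are hypothetical names, the JSON is assumed to be well formed, and GASP's own dispatch logic may work differently.

```python
import json
import re
from dataclasses import dataclass


@dataclass
class Address:
    street: str = ""
    city: str = ""


@dataclass
class Person:
    name: str = ""
    age: int = 0


# Hypothetical registry mapping tag names to the classes they should produce.
REGISTRY = {cls.__name__: cls for cls in (Person, Address)}
TAG_RE = re.compile(r"<(\w+)>(.*?)</\1>", re.DOTALL)


def parse_all(output: str):
    """Instantiate one object per recognised tag; text outside tags is skipped."""
    objects = []
    for tag, body in TAG_RE.findall(output):
        cls = REGISTRY.get(tag)
        if cls is not None:
            objects.append(cls(**json.loads(body)))
    return objects


text = (
    'Here is the person: <Person>{"name": "Alice", "age": 30}</Person> '
    'and her address: <Address>{"street": "123 Main St", "city": "Springfield"}</Address>.'
)
objects = parse_all(text)
print(objects)
```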

Advanced Templating with Jinja2

GASP provides built-in Jinja2 integration for more advanced prompt templating:

from gasp import Deserializable, render_template
from typing import List, Optional

class Person(Deserializable):
    """Information about a person"""
    name: str
    age: int
    hobbies: Optional[List[str]] = None

# Create a template with Jinja2 syntax
template = """
# {{ title }}

Generate a {{ type_name|type_description }}.

{% if include_format_instructions %}
Your response must be formatted as:
{{ person_type|format_type }}
{% endif %}
"""

# Provide template context
context = {
    'title': 'Person Generator',
    'type_name': Person,
    'include_format_instructions': True,
    'person_type': Person
}

# Render the template
prompt = render_template(template, context)

Available Jinja2 Filters

  • format_type: Generates format instructions for a type (e.g., {{ my_type|format_type }})
  • type_description: Provides a human-readable description of a type, including docstring info
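
To give a feel for what a format-instruction filter has to do, here is a standard-library sketch that derives a tagged format string from a class's type annotations. The names `describe` and `format_instructions` are hypothetical, and the exact text GASP's `format_type` filter emits may differ; the shape is modelled on the `<Person>{ "name": string, "age": number }</Person>` example in the migration section.

```python
from typing import List, Optional, Union, get_args, get_origin, get_type_hints

PRIMITIVES = {str: "string", int: "number", float: "number", bool: "boolean"}


def describe(tp) -> str:
    """Return a short, human-readable name for a type annotation."""
    if tp in PRIMITIVES:
        return PRIMITIVES[tp]
    origin = get_origin(tp)
    args = get_args(tp)
    if origin is list:
        item = describe(args[0]) if args else "any"
        return f"[{item}, ...]"
    if origin is Union:
        parts = [describe(a) for a in args if a is not type(None)]
        suffix = " (optional)" if type(None) in args else ""
        return " | ".join(parts) + suffix
    return getattr(tp, "__name__", str(tp))


def format_instructions(cls) -> str:
    """Build a one-line tagged format description from a class's annotations."""
    hints = get_type_hints(cls)
    fields = ", ".join(f'"{name}": {describe(tp)}' for name, tp in hints.items())
    return f"<{cls.__name__}>{{ {fields} }}</{cls.__name__}>"


class Person:
    """Information about a person"""
    name: str
    age: int
    hobbies: Optional[List[str]] = None


print(format_instructions(Person))
```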

Template Files & Inheritance

You can also use template files with inheritance:

from gasp import render_file_template

# Renders a template file with GASP filters included
prompt = render_file_template("templates/person_prompt.j2", context)

Direct Jinja2 Access

For advanced cases, you can use the Jinja2 environment directly:

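import jinja2
from gasp.jinja_helpers import create_type_environment

# Example custom filter (any callable works)
def my_filter_function(value):
    return str(value).upper()

# Create a Jinja2 environment with GASP filters
env = create_type_environment()

# Add your own filters
env.filters["my_filter"] = my_filter_function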

# Load templates from a directory
env.loader = jinja2.FileSystemLoader("templates/")

# Use directly with Jinja2 API (context is a dict of template variables, as in the earlier example)
template = env.get_template("my_template.j2")
prompt = template.render(**context)

Customizing Behavior

Need more control? You can customize type conversion, validation, and parsing behavior:

# Custom type conversions and validation
from gasp import Deserializable

class CustomPerson(Deserializable):
    name: str
    age: int

    @classmethod
    def __gasp_from_partial__(cls, partial_data):
        """Add custom validation or pre-processing"""
        # Normalize name to title case
        if "name" in partial_data:
            partial_data["name"] = partial_data["name"].title()
        return super().__gasp_from_partial__(partial_data)

Migrating from pre-1.0 Versions

Version 1.0.0 represents a complete architectural shift:

What's Been Removed

  • WAIL Parser: The entire WAIL language and validation system has been removed
  • Schema Validation: The schema-based approach has been replaced with typed parsing
  • WAILGenerator: This class and its API are no longer available
  • All WAIL-related files and examples

What's New

  • Tag-Based Parsing: Uses XML-like tags in LLM output to identify data types
  • Type Annotations: Direct use of Python type annotations to define structures
  • Template Helpers: Functions to generate format instructions from types
  • Streaming Support: Improved support for processing data as it arrives

Migration Steps

  1. Replace WAIL schema definitions with Python classes using type annotations
  2. Replace WAILGenerator with the new Parser class
  3. Update your prompts to use the new tag-based format
  4. Use template_helpers.interpolate_prompt() to generate type-aware prompts

Example of old WAIL approach:

schema = r'''
object Response { name: String, age: Number }
template GenerateResponse() -> Response { ... }
'''
generator = WAILGenerator()
generator.load_wail(schema)
(prompt, _, _) = generator.get_prompt()
llm_response = your_llm_client.generate(prompt)
parsed_data = generator.parse_llm_output(llm_response)

New approach:

from gasp import Parser
from gasp.template_helpers import interpolate_prompt

# Regular class
class Person:
    def __init__(self, name="", age=0):
        self.name = name
        self.age = age

# Create a template with a {{return_type}} placeholder
template = """
Generate a profile for a person who loves coding.

{{return_type}}
"""

# Generate a complete prompt with type information
prompt = interpolate_prompt(template, Person)
print(prompt)
# Output will include:
# Your response should be formatted as:
# <Person>{ "name": string, "age": number }</Person>

# Send to your LLM
llm_response = your_llm_client.generate(prompt)

# Parse the tagged response
parser = Parser(Person)
parser.feed(llm_response)
person = parser.validate()

print(f"Created person: {person.name}, {person.age} years old")

Contributing

Contributions welcome! Check out the examples directory to see how things work.

License

Apache License, Version 2.0
