GASP (Gee Another Schema Parser) - A validator and type safe deserializer for LLM output.

GASP - Type-Safe LLM Output Parser

⚠️ MAJOR BREAKING CHANGES IN VERSION 1.0.0 ⚠️
Version 1.0.0 is a complete rewrite that removes WAIL entirely and introduces a new tag-based parsing approach.
If you're using an older version of GASP, you'll need to significantly change your code to upgrade.
See the Migration Guide below.

GASP is a Rust-based parser for turning LLM outputs into properly typed Python objects. It handles streaming JSON fragments, recovers from common LLM quirks, and makes structured data extraction actually pleasant.

The Problem

LLMs are great at generating structured data when asked, but not perfect:

<Person>
{
  "name": "Alice Smith",
  "age": 30,
  hobbies: ["coding", "hiking"]
}
</Person>

That output has an unquoted key (hobbies), is wrapped in tags rather than being bare JSON, and in practice usually arrives surrounded by natural-language commentary. A strict JSON parser rejects it outright.
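
To see the failure concretely, here is a small check that uses only Python's standard json module (GASP itself is not involved). It feeds the JSON body from the example above to a strict parser and records whether parsing succeeds.

```python
import json

# The JSON body from the example above, without the surrounding tags.
raw = """{
  "name": "Alice Smith",
  "age": 30,
  hobbies: ["coding", "hiking"]
}"""

try:
    json.loads(raw)
    strict_ok = True
except json.JSONDecodeError as exc:
    # The unquoted key makes the standard parser raise.
    strict_ok = False
    print(f"strict parse failed: {exc}")
```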

How GASP Works

GASP uses a tag-based approach to extract and type-cast structured data:

  1. Tags like <Person>...</Person> mark where the structured data lives (and what type it is)
  2. The parser ignores everything outside those tags
  3. Inside the tags, it handles messy JSON with broken quotes, trailing commas, etc.
  4. The data gets converted into proper Python objects based on type annotations
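
The four steps above can be sketched in plain Python. This is not GASP's implementation (the real parser is written in Rust); it is a minimal illustration under stated assumptions. The helper names extract_tagged, repair_json, and parse_as are hypothetical, and the regex repairs are deliberately naive: they can misfire on string values that happen to contain text like `, key:`.

```python
import json
import re


class Person:
    def __init__(self, name="", age=0, hobbies=None):
        self.name = name
        self.age = age
        self.hobbies = hobbies or []


def extract_tagged(text, tag):
    """Steps 1-2: return the text between <tag> and </tag>, ignoring the rest."""
    pattern = rf"<{re.escape(tag)}>(.*?)</{re.escape(tag)}>"
    match = re.search(pattern, text, re.DOTALL)
    return match.group(1) if match else None


def repair_json(body):
    """Step 3 (naive): quote bare keys and drop trailing commas."""
    # Quote an identifier that follows '{' or ',' and precedes ':'.
    body = re.sub(r'([{,]\s*)([A-Za-z_]\w*)(\s*:)', r'\1"\2"\3', body)
    # Drop a comma that directly precedes a closing brace or bracket.
    body = re.sub(r",(\s*[}\]])", r"\1", body)
    return body


def parse_as(text, cls):
    """Step 4: build an instance of cls from the repaired JSON body."""
    body = extract_tagged(text, cls.__name__)
    if body is None:
        return None
    return cls(**json.loads(repair_json(body)))


llm_output = """Sure, here is the person you asked for:
<Person>
{
  "name": "Alice Smith",
  "age": 30,
  hobbies: ["coding", "hiking",]
}
</Person>
Let me know if you need anything else."""

person = parse_as(llm_output, Person)
print(person.name, person.age, person.hobbies)
```

A production parser would tokenize the input rather than patch it with regular expressions, which is one reason a dedicated library is preferable to this kind of ad hoc repair.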

Features

  • Tag-Based Extraction: Extract structured data even when surrounded by explanatory text
  • Streaming Support: Process data incrementally as it arrives from the LLM
  • Type Inference: Automatically match JSON objects to Python classes
  • Error Recovery: Handle common JSON mistakes that LLMs make
  • Pydantic Integration: Works with Pydantic for validation and schema definition

Installation

pip install gasp-py

Quick Example

from gasp import Parser
from typing import List, Optional

# Regular classes work now - no need for Deserializable
class Address:
    def __init__(self, street="", city="", zip_code=""):
        self.street = street
        self.city = city
        self.zip_code = zip_code

class Person:
    def __init__(self, name="", age=0, address=None, hobbies=None):
        self.name = name
        self.age = age
        self.address = address or Address()
        self.hobbies = hobbies or []

# Create a parser for the Person type
parser = Parser(Person)

# Process LLM output chunks as they arrive
chunks = [
    '<Person>{"name": "Alice", "age": 30',
    ', "address": {"street": "123 Main St", "city": "Springfield"',
    ', "zip_code": "12345"}, "hobbies": ["reading", "coding"]}</Person>'
]

for chunk in chunks:
    result = parser.feed(chunk)
    print(result)  # Will show partial objects as they're built

# Get the final validated result
person = parser.validate()
print(f"Hello {person.name}!")  # Hello Alice!
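
For intuition about chunked input, here is a much simpler stand-in written with the standard library only. It is an assumption-laden sketch, not how GASP works internally: the hypothetical BufferingTagReader class just accumulates chunks and decodes the body once the closing tag has arrived, whereas GASP's feed() is described above as exposing partial objects while they are still being built.

```python
import json


class BufferingTagReader:
    """Hypothetical helper: buffer chunks, decode once </tag> is seen."""

    def __init__(self, tag):
        self.open_tag = f"<{tag}>"
        self.close_tag = f"</{tag}>"
        self.buffer = ""

    def feed(self, chunk):
        self.buffer += chunk
        start = self.buffer.find(self.open_tag)
        if start == -1:
            return None  # opening tag not seen yet
        body_start = start + len(self.open_tag)
        end = self.buffer.find(self.close_tag, body_start)
        if end == -1:
            return None  # body still incomplete
        return json.loads(self.buffer[body_start:end])


chunks = [
    '<Person>{"name": "Alice", "age": 30',
    ', "address": {"street": "123 Main St", "city": "Springfield"',
    ', "zip_code": "12345"}, "hobbies": ["reading", "coding"]}</Person>',
]

reader = BufferingTagReader("Person")
results = [reader.feed(chunk) for chunk in chunks]
```

Here the first two calls yield None and only the last one yields a dict, which shows what is lost without incremental parsing: nothing is usable until the whole object has arrived.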

Container Types

Lists and tuples get their own tags:

from typing import List, Tuple
from gasp import Parser

# List[T] uses <list> tag
parser = Parser(List[int])
result = parser.feed('<list>[1, 2, 3]</list>')  # returns [1, 2, 3]

# Tuple[T, ...] uses <tuple> tag
parser = Parser(Tuple[str, int, bool])
result = parser.feed('<tuple>["hello", 42, true]</tuple>')  # returns ("hello", 42, True)
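
The naming rule described here (class name for classes, lowercase list and tuple for containers) can be expressed with the standard typing helpers. The function tag_name_for below is a hypothetical illustration of that rule, not part of GASP's API, and it only covers the cases this README mentions.

```python
from typing import List, Tuple, get_origin


class Person:
    pass


def tag_name_for(tp):
    """Hypothetical helper: derive the expected tag name from a type."""
    origin = get_origin(tp)  # list for List[int], tuple for Tuple[...]
    if origin in (list, tuple):
        return origin.__name__
    return tp.__name__  # plain classes use their own name


names = [
    tag_name_for(List[int]),
    tag_name_for(Tuple[str, int, bool]),
    tag_name_for(Person),
]
print(names)
```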

Using Deserializable for Advanced Streaming

Regular classes work for most use cases. Use Deserializable when you need:

  • Streaming control: React to data as it arrives
  • Custom validation: Validate/transform data during parsing
  • State management: Maintain computed fields or derived state

from gasp import Deserializable

class LiveDashboard(Deserializable):
    def __init__(self):
        self.events = []
        self.summary_stats = {}

    def __gasp_update__(self, partial_data):
        # React to streaming updates
        if 'new_event' in partial_data:
            self.events.append(partial_data['new_event'])
            self._recalculate_stats()

    def _recalculate_stats(self):
        # Illustrative derived state: keep a running count of events
        self.summary_stats = {'event_count': len(self.events)}

    @classmethod
    def __gasp_from_partial__(cls, partial_data):
        # Custom instantiation logic
        instance = cls()
        # Apply defaults, validate, etc.
        return instance

TL;DR: Regular classes for simple parsing, Deserializable for complex streaming behavior.
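
To make the update hook idea concrete without depending on GASP, the sketch below uses a plain class that defines a __gasp_update__ method and a hypothetical drive() function that hands it one partial dict at a time. How and when GASP actually calls the hook is an assumption here; the point is only to show derived state being kept current as data arrives.

```python
class RunningTotal:
    """Stand-in for a Deserializable subclass; does not import gasp."""

    def __init__(self):
        self.events = []
        self.total = 0

    def __gasp_update__(self, partial_data):
        # Only react to updates that carry a new event.
        if "new_event" in partial_data:
            self.events.append(partial_data["new_event"])
            # Derived state is recomputed on every relevant update.
            self.total = sum(event["amount"] for event in self.events)


def drive(instance, partials):
    """Hypothetical driver: pass each partial dict to the hook in order."""
    for partial in partials:
        instance.__gasp_update__(partial)
    return instance


dashboard = drive(
    RunningTotal(),
    [
        {"new_event": {"amount": 5}},
        {"status": "ok"},  # ignored: no new_event key
        {"new_event": {"amount": 7}},
    ],
)
print(dashboard.total, len(dashboard.events))
```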

Working with Pydantic

GASP integrates seamlessly with Pydantic:

from pydantic import BaseModel
from gasp import Parser

class UserProfile(BaseModel):
    username: str
    email: str
    is_active: bool = True

# Create parser from Pydantic model
parser = Parser.from_pydantic(UserProfile)

# Feed LLM output with tags
llm_output = '<UserProfile>{"username": "alice42", "email": "alice@example.com"}</UserProfile>'
result = parser.feed(llm_output)

# Access as a proper Pydantic object
profile = UserProfile.model_validate(parser.validate())
print(profile.model_dump_json(indent=2))

How Tags Work

The tag name directly indicates what Python type to instantiate:

<Person>{ ... JSON data ... }</Person>  # Creates a Person instance
<list>[ ... array data ... ]</list>     # Creates a list
<Address>{ ... address data ... }</Address>  # Creates an Address

The parser ignores everything outside of the tags, so the LLM can provide explanations, context, or other text alongside the structured data.
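
The tag-to-type mapping can be pictured as a registry lookup. The sketch below is a simplified illustration with the standard library, assuming well-formed JSON inside each tag; REGISTRY, TAG_PATTERN, and parse_all are hypothetical names and do not reflect GASP's internals.

```python
import json
import re


class Person:
    def __init__(self, name="", age=0):
        self.name = name
        self.age = age


class Address:
    def __init__(self, street="", city="", zip_code=""):
        self.street = street
        self.city = city
        self.zip_code = zip_code


# Map each tag name to the class it should instantiate.
REGISTRY = {cls.__name__: cls for cls in (Person, Address)}

# Match <Tag>...</Tag> pairs; the backreference \1 requires matching names.
TAG_PATTERN = re.compile(r"<(\w+)>(.*?)</\1>", re.DOTALL)


def parse_all(text):
    """Instantiate one object per registered tag; skip everything else."""
    objects = []
    for tag, body in TAG_PATTERN.findall(text):
        cls = REGISTRY.get(tag)
        if cls is None:
            continue  # unknown tag: treat like surrounding prose
        objects.append(cls(**json.loads(body)))
    return objects


text = (
    'Here you go: <Person>{"name": "Alice", "age": 30}</Person> lives at '
    '<Address>{"street": "123 Main St", "city": "Springfield"}</Address>. '
    "<Note>not a registered type</Note> Hope that helps!"
)

objects = parse_all(text)
print([type(obj).__name__ for obj in objects])
```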

Advanced Templating with Jinja2

GASP provides built-in Jinja2 integration for more advanced prompt templating:

from gasp import Deserializable, render_template
from typing import List, Optional

class Person(Deserializable):
    """Information about a person"""
    name: str
    age: int
    hobbies: Optional[List[str]] = None

# Create a template with Jinja2 syntax
template = """
# {{ title }}

Generate a {{ type_name|type_description }}.

{% if include_format_instructions %}
Your response must be formatted as:
{{ person_type|format_type }}
{% endif %}
"""

# Provide template context
context = {
    'title': 'Person Generator',
    'type_name': Person,
    'include_format_instructions': True,
    'person_type': Person
}

# Render the template
prompt = render_template(template, context)

Available Jinja2 Filters

  • format_type: Generates format instructions for a type (e.g., {{ my_type|format_type }})
  • type_description: Provides a human-readable description of a type, including docstring info

Template Files & Inheritance

You can also use template files with inheritance:

from gasp import render_file_template

# Renders a template file with GASP filters included
prompt = render_file_template("templates/person_prompt.j2", context)

Direct Jinja2 Access

For advanced cases, you can use the Jinja2 environment directly:

import jinja2
from gasp.jinja_helpers import create_type_environment

# Example custom filter (any callable works)
def my_filter_function(value):
    return str(value).upper()

# Create a Jinja2 environment with GASP filters
env = create_type_environment()

# Add your own filters
env.filters["my_filter"] = my_filter_function

# Load templates from a directory
env.loader = jinja2.FileSystemLoader("templates/")

# Use directly with Jinja2 API; `context` is the dict from the earlier example
template = env.get_template("my_template.j2")
prompt = template.render(**context)

Customizing Behavior

Need more control? You can customize type conversion, validation, and parsing behavior:

from gasp import Deserializable

# Custom type conversions and validation
class CustomPerson(Deserializable):
    name: str
    age: int

    @classmethod
    def __gasp_from_partial__(cls, partial_data):
        """Add custom validation or pre-processing"""
        # Normalize name to title case
        if "name" in partial_data:
            partial_data["name"] = partial_data["name"].title()
        return super().__gasp_from_partial__(partial_data)

Migrating from pre-1.0 Versions

Version 1.0.0 represents a complete architectural shift:

What's Been Removed

  • WAIL Parser: The entire WAIL language and validation system has been removed
  • Schema Validation: The schema-based approach has been replaced with typed parsing
  • WAILGenerator: This class and its API are no longer available
  • All WAIL-related files and examples

What's New

  • Tag-Based Parsing: Uses XML-like tags in LLM output to identify data types
  • Type Annotations: Direct use of Python type annotations to define structures
  • Template Helpers: Functions to generate format instructions from types
  • Streaming Support: Improved support for processing data as it arrives

Migration Steps

  1. Replace WAIL schema definitions with Python classes using type annotations
  2. Replace WAILGenerator with the new Parser class
  3. Update your prompts to use the new tag-based format
  4. Use template_helpers.interpolate_prompt() to generate type-aware prompts

Example of old WAIL approach:

schema = r'''
object Response { name: String, age: Number }
template GenerateResponse() -> Response { ... }
'''
generator = WAILGenerator()
generator.load_wail(schema)
(prompt, _, _) = generator.get_prompt()
llm_response = your_llm_client.generate(prompt)
parsed_data = generator.parse_llm_output(llm_response)

New approach:

from gasp import Parser
from gasp.template_helpers import interpolate_prompt

# Regular class
class Person:
    def __init__(self, name="", age=0):
        self.name = name
        self.age = age

# Create a template with a {{return_type}} placeholder
template = """
Generate a profile for a person who loves coding.

{{return_type}}
"""

# Generate a complete prompt with type information
prompt = interpolate_prompt(template, Person)
print(prompt)
# Output will include:
# Your response should be formatted as:
# <Person>{ "name": string, "age": number }</Person>

# Send to your LLM
llm_response = your_llm_client.generate(prompt)

# Parse the tagged response
parser = Parser(Person)
parser.feed(llm_response)
person = parser.validate()

print(f"Created person: {person.name}, {person.age} years old")

Contributing

Contributions welcome! Check out the examples directory to see how things work.

License

Apache License, Version 2.0
