GASP (Gee Another Schema Parser) - A validator and type-safe deserializer for LLM output.

GASP - Type-Safe LLM Output Parser

⚠️ MAJOR BREAKING CHANGES IN VERSION 1.0.0 ⚠️
Version 1.0.0 is a complete rewrite that removes WAIL entirely and introduces a new tag-based parsing approach.
If you're using an older version of GASP, you'll need to significantly change your code to upgrade.
See the Migration Guide below.

GASP is a Rust-based parser for turning LLM outputs into properly typed Python objects. It handles streaming JSON fragments, recovers from common LLM quirks, and makes structured data extraction actually pleasant.

The Problem

LLMs are great at generating structured data when asked, but not perfect:

<Person>
{
  "name": "Alice Smith",
  "age": 30,
  hobbies: ["coding", "hiking"]
}
</Person>

That output has an unquoted key (hobbies) and inconsistent formatting, and in practice it usually arrives embedded in natural-language text. Strict JSON parsers reject it outright.

How GASP Works

GASP uses a tag-based approach to extract and type-cast structured data:

  1. Tags like <Person>...</Person> mark where the structured data lives (and what type it is)
  2. The parser ignores everything outside those tags
  3. Inside the tags, it handles messy JSON with broken quotes, trailing commas, etc.
  4. The data gets converted into proper Python objects based on type annotations
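The four steps above can be approximated with the standard library alone. The sketch below is not GASP's Rust implementation: `repair_json`, `extract`, and both regular expressions are illustrative assumptions, and the repair step only covers unquoted keys and trailing commas.

```python
import json
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class Person:
    name: str = ""
    age: int = 0
    hobbies: List[str] = field(default_factory=list)


def repair_json(raw: str) -> str:
    """Patch two common LLM slips: unquoted keys and trailing commas."""
    # Quote bare identifiers that appear where an object key is expected.
    # Naive: it could misfire on string values that contain ", word:".
    quoted = re.sub(r'([{,]\s*)([A-Za-z_]\w*)(\s*:)', r'\1"\2"\3', raw)
    # Drop a comma that directly precedes a closing brace or bracket.
    return re.sub(r',\s*([}\]])', r'\1', quoted)


def extract(text: str, cls):
    """Find <ClassName>...</ClassName>, ignore the rest, build an instance."""
    tag = cls.__name__
    match = re.search(rf"<{tag}>(.*?)</{tag}>", text, re.DOTALL)
    if match is None:
        return None
    data = json.loads(repair_json(match.group(1)))
    return cls(**data)


llm_output = """Sure, here is the profile you asked for:
<Person>
{
  "name": "Alice Smith",
  "age": 30,
  hobbies: ["coding", "hiking",]
}
</Person>
Let me know if you need anything else."""

person = extract(llm_output, Person)
print(person)
```

The text before and after the tags never reaches the JSON step, which is why the surrounding explanation does no harm.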

Features

  • Tag-Based Extraction: Extract structured data even when surrounded by explanatory text
  • Streaming Support: Process data incrementally as it arrives from the LLM
  • Type Inference: Automatically match JSON objects to Python classes
  • Error Recovery: Handle common JSON mistakes that LLMs make
  • Pydantic Integration: Works with Pydantic for validation and schema definition

Installation

pip install gasp-py

Quick Example

from gasp import Parser
from typing import List, Optional

# Regular classes work now - no need for Deserializable
class Address:
    def __init__(self, street="", city="", zip_code=""):
        self.street = street
        self.city = city
        self.zip_code = zip_code

class Person:
    def __init__(self, name="", age=0, address=None, hobbies=None):
        self.name = name
        self.age = age
        self.address = address or Address()
        self.hobbies = hobbies or []

# Create a parser for the Person type
parser = Parser(Person)

# Process LLM output chunks as they arrive
chunks = [
    '<Person>{"name": "Alice", "age": 30',
    ', "address": {"street": "123 Main St", "city": "Springfield"',
    ', "zip_code": "12345"}, "hobbies": ["reading", "coding"]}</Person>'
]

for chunk in chunks:
    result = parser.feed(chunk)
    print(result)  # Will show partial objects as they're built

# Get the final validated result
person = parser.validate()
print(f"Hello {person.name}!")  # Hello Alice!
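To make the idea of partial objects concrete, here is a standard-library sketch of one way to get a usable snapshot from an unfinished stream: close whatever brackets and quotes are still open, then try to parse. `TagStream` and `close_partial_json` are hypothetical names; GASP's own parser is written in Rust and may work quite differently.

```python
import json
from typing import Optional


def close_partial_json(fragment: str) -> str:
    """Append the quotes/brackets needed to close an incomplete JSON fragment."""
    stack = []          # closers we still owe, innermost last
    in_string = False
    escaped = False
    for ch in fragment:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "{[":
            stack.append("}" if ch == "{" else "]")
        elif ch in "}]" and stack:
            stack.pop()
    tail = '"' if in_string else ""
    return fragment + tail + "".join(reversed(stack))


class TagStream:
    """Buffers chunks and exposes a best-effort snapshot of the tagged JSON."""

    def __init__(self, tag: str):
        self.open_tag, self.close_tag = f"<{tag}>", f"</{tag}>"
        self.buffer = ""

    def feed(self, chunk: str) -> Optional[dict]:
        self.buffer += chunk
        start = self.buffer.find(self.open_tag)
        if start == -1:
            return None  # opening tag not seen yet
        body = self.buffer[start + len(self.open_tag):]
        end = body.find(self.close_tag)
        if end != -1:
            body = body[:end]
        try:
            return json.loads(close_partial_json(body))
        except json.JSONDecodeError:
            return None  # fragment ends mid-token; wait for more data


stream = TagStream("Person")
chunks = [
    '<Person>{"name": "Alice", "age": 30',
    ', "address": {"street": "123 Main St", "city": "Springfield"',
    ', "zip_code": "12345"}, "hobbies": ["reading", "coding"]}</Person>',
]
snapshots = [stream.feed(chunk) for chunk in chunks]
for snapshot in snapshots:
    print(snapshot)
```

A snapshot taken mid-value can be misleading (a chunk ending in `3` of `30` would report an age of 3), so treat anything before the closing tag as provisional.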

Container Types

Lists and tuples get their own tags:

from typing import List, Tuple
from gasp import Parser

# List[T] uses <list> tag
parser = Parser(List[int])
result = parser.feed('<list>[1, 2, 3]</list>')  # returns [1, 2, 3]

# Tuple[T, ...] uses <tuple> tag
parser = Parser(Tuple[str, int, bool])
result = parser.feed('<tuple>["hello", 42, true]</tuple>')  # returns ("hello", 42, True)
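The mapping from a container type to its tag can be pictured with `typing.get_origin`. This is a standard-library sketch, not GASP code: `tag_for` and `parse_container` are hypothetical helpers, and element types are not checked.

```python
import json
import re
from typing import List, Tuple, get_origin


def tag_for(type_hint) -> str:
    """Pick the tag name for a type: containers use lowercase names."""
    origin = get_origin(type_hint)
    if origin is list:
        return "list"
    if origin is tuple:
        return "tuple"
    return type_hint.__name__


def parse_container(text: str, type_hint):
    """Extract the tagged JSON array and convert it to the requested container."""
    tag = tag_for(type_hint)
    match = re.search(rf"<{tag}>(.*?)</{tag}>", text, re.DOTALL)
    if match is None:
        return None
    items = json.loads(match.group(1))
    # JSON only has arrays, so tuples need an explicit conversion.
    return tuple(items) if get_origin(type_hint) is tuple else items


numbers = parse_container("<list>[1, 2, 3]</list>", List[int])
triple = parse_container('<tuple>["hello", 42, true]</tuple>', Tuple[str, int, bool])
print(numbers, triple)
```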

Using Deserializable for Advanced Streaming

Regular classes work for most use cases. Use Deserializable when you need:

  • Streaming control: React to data as it arrives
  • Custom validation: Validate/transform data during parsing
  • State management: Maintain computed fields or derived state

from gasp import Deserializable

class LiveDashboard(Deserializable):
    def __init__(self):
        self.events = []
        self.summary_stats = {}

    def __gasp_update__(self, partial_data):
        # React to streaming updates
        if 'new_event' in partial_data:
            self.events.append(partial_data['new_event'])
            self._recalculate_stats()

    def _recalculate_stats(self):
        # Illustrative helper (not part of GASP): refresh derived state
        self.summary_stats = {'event_count': len(self.events)}

    @classmethod
    def __gasp_from_partial__(cls, partial_data):
        # Custom instantiation logic
        instance = cls()
        # Apply defaults, validate, etc.
        return instance

TL;DR: Regular classes for simple parsing, Deserializable for complex streaming behavior.
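The update-hook pattern can be tried without GASP installed. In the sketch below, `drive` is a hypothetical stand-in for the parser, and the shape of each `partial_data` dict is an assumption made for illustration. In real use GASP calls the hook itself.

```python
class RunningStats:
    """Keeps derived state up to date as partial data arrives."""

    def __init__(self):
        self.events = []
        self.summary_stats = {"count": 0}

    def __gasp_update__(self, partial_data):
        # Same hook name as above; here it is called by our own loop.
        if "new_event" in partial_data:
            self.events.append(partial_data["new_event"])
            self.summary_stats["count"] = len(self.events)


def drive(target, partial_updates):
    """Hypothetical stand-in for the parser: push each update into the hook."""
    for partial_data in partial_updates:
        target.__gasp_update__(partial_data)
    return target


stats = drive(RunningStats(), [{"new_event": "login"}, {}, {"new_event": "logout"}])
print(stats.events, stats.summary_stats)
```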

Working with Pydantic

GASP integrates seamlessly with Pydantic:

from pydantic import BaseModel
from gasp import Parser

class UserProfile(BaseModel):
    username: str
    email: str
    is_active: bool = True

# Create parser from Pydantic model
parser = Parser.from_pydantic(UserProfile)

# Feed LLM output with tags
llm_output = '<UserProfile>{"username": "alice42", "email": "alice@example.com"}</UserProfile>'
result = parser.feed(llm_output)

# Access as a proper Pydantic object
profile = UserProfile.model_validate(parser.validate())
print(profile.model_dump_json(indent=2))

How Tags Work

The tag name directly indicates what Python type to instantiate:

<Person>{ ... JSON data ... }</Person>  # Creates a Person instance
<list>[ ... array data ... ]</list>     # Creates a list
<Address>{ ... address data ... }</Address>  # Creates an Address

The parser ignores everything outside of the tags, so the LLM can provide explanations, context, or other text alongside the structured data.
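Tag-to-type dispatch can be sketched as a lookup table keyed by tag name. `REGISTRY`, `TAG_PATTERN`, and `parse_all` are hypothetical names for this illustration, which uses only the standard library and strict JSON. It does not show how GASP resolves tags internally.

```python
import json
import re
from dataclasses import dataclass


@dataclass
class Person:
    name: str = ""
    age: int = 0


@dataclass
class Address:
    street: str = ""
    city: str = ""


# Hypothetical registry: tag name -> class to instantiate.
REGISTRY = {cls.__name__: cls for cls in (Person, Address)}

# The backreference \1 makes the closing tag match the opening one.
TAG_PATTERN = re.compile(r"<(\w+)>(.*?)</\1>", re.DOTALL)


def parse_all(text: str) -> list:
    """Instantiate every tagged block whose tag names a registered class."""
    results = []
    for tag, body in TAG_PATTERN.findall(text):
        cls = REGISTRY.get(tag)
        if cls is not None:  # unknown tags are skipped like any other text
            results.append(cls(**json.loads(body)))
    return results


text = (
    'Here is the person: <Person>{"name": "Alice", "age": 30}</Person> '
    'and where she lives: <Address>{"street": "123 Main St", "city": "Springfield"}</Address>. '
    "<em>Hope that helps!</em>"
)
objects = parse_all(text)
print(objects)
```

The `<em>` element is matched by the pattern but has no registry entry, so it is ignored along with the rest of the prose.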

Advanced Templating with Jinja2

GASP provides built-in Jinja2 integration for more advanced prompt templating:

from gasp import Deserializable, render_template
from typing import List, Optional

class Person(Deserializable):
    """Information about a person"""
    name: str
    age: int
    hobbies: Optional[List[str]] = None

# Create a template with Jinja2 syntax
template = """
# {{ title }}

Generate a {{ type_name|type_description }}.

{% if include_format_instructions %}
Your response must be formatted as:
{{ person_type|format_type }}
{% endif %}
"""

# Provide template context
context = {
    'title': 'Person Generator',
    'type_name': Person,
    'include_format_instructions': True,
    'person_type': Person
}

# Render the template
prompt = render_template(template, context)

Available Jinja2 Filters

  • format_type: Generates format instructions for a type (e.g., {{ my_type|format_type }})
  • type_description: Provides a human-readable description of a type, including docstring info
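As a rough picture of what a format-instruction filter has to do, the sketch below walks a class's annotations and renders them in the `<Person>{ "name": string, "age": number }</Person>` style shown in the migration example. It is a standard-library approximation under assumed naming (`describe`, `format_instructions`, `PRIMITIVES`); the real `format_type` filter may produce different text.

```python
from typing import List, Optional, Union, get_args, get_origin, get_type_hints

PRIMITIVES = {str: "string", int: "number", float: "number", bool: "boolean"}


def describe(type_hint) -> str:
    """Render a type hint as a short, JSON-flavoured description."""
    if type_hint in PRIMITIVES:
        return PRIMITIVES[type_hint]
    origin, args = get_origin(type_hint), get_args(type_hint)
    if origin is list and args:
        return f"{describe(args[0])}[]"
    if origin is Union:
        # Optional[X] is Union[X, None]; list the real types, then "null".
        parts = [describe(a) for a in args if a is not type(None)]
        if type(None) in args:
            parts.append("null")
        return " | ".join(parts)
    return getattr(type_hint, "__name__", str(type_hint))


def format_instructions(cls) -> str:
    """Build a '<Tag>{ ... }</Tag>' hint from a class's annotations."""
    fields = ", ".join(
        f'"{name}": {describe(hint)}' for name, hint in get_type_hints(cls).items()
    )
    return f"<{cls.__name__}>{{ {fields} }}</{cls.__name__}>"


class Person:
    name: str
    age: int
    hobbies: Optional[List[str]] = None


print(format_instructions(Person))
```

A function like this could be registered as a Jinja2 filter, which is the role the built-in filters play in the templates above.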

Template Files & Inheritance

You can also use template files with inheritance:

from gasp import render_file_template

# Renders a template file with GASP filters included
prompt = render_file_template("templates/person_prompt.j2", context)

Direct Jinja2 Access

For advanced cases, you can use the Jinja2 environment directly:

import jinja2
from gasp.jinja_helpers import create_type_environment

# Create a Jinja2 environment with GASP filters
env = create_type_environment()

# Add your own filters (my_filter_function is a placeholder for any callable)
def my_filter_function(value):
    return str(value).upper()

env.filters["my_filter"] = my_filter_function

# Load templates from a directory
env.loader = jinja2.FileSystemLoader("templates/")

# Use directly with the Jinja2 API; context is the dict from the earlier example
template = env.get_template("my_template.j2")
prompt = template.render(**context)

Customizing Behavior

Need more control? You can customize type conversion, validation, and parsing behavior:

# Custom type conversions and validation
class CustomPerson(Deserializable):
    name: str
    age: int

    @classmethod
    def __gasp_from_partial__(cls, partial_data):
        """Add custom validation or pre-processing"""
        # Normalize name to title case
        if "name" in partial_data:
            partial_data["name"] = partial_data["name"].title()
        return super().__gasp_from_partial__(partial_data)

Migrating from pre-1.0 Versions

Version 1.0.0 represents a complete architectural shift:

What's Been Removed

  • WAIL Parser: The entire WAIL language and validation system has been removed
  • Schema Validation: The schema-based approach has been replaced with typed parsing
  • WAILGenerator: This class and its API are no longer available
  • All WAIL-related files and examples

What's New

  • Tag-Based Parsing: Uses XML-like tags in LLM output to identify data types
  • Type Annotations: Direct use of Python type annotations to define structures
  • Template Helpers: Functions to generate format instructions from types
  • Streaming Support: Improved support for processing data as it arrives

Migration Steps

  1. Replace WAIL schema definitions with Python classes using type annotations
  2. Replace WAILGenerator with the new Parser class
  3. Update your prompts to use the new tag-based format
  4. Use template_helpers.interpolate_prompt() to generate type-aware prompts

Example of old WAIL approach:

schema = r'''
object Response { name: String, age: Number }
template GenerateResponse() -> Response { ... }
'''
generator = WAILGenerator()
generator.load_wail(schema)
(prompt, _, _) = generator.get_prompt()
llm_response = your_llm_client.generate(prompt)
parsed_data = generator.parse_llm_output(llm_response)

New approach:

from gasp import Parser
from gasp.template_helpers import interpolate_prompt

# Regular class
class Person:
    def __init__(self, name="", age=0):
        self.name = name
        self.age = age

# Create a template with a {{return_type}} placeholder
template = """
Generate a profile for a person who loves coding.

{{return_type}}
"""

# Generate a complete prompt with type information
prompt = interpolate_prompt(template, Person)
print(prompt)
# Output will include:
# Your response should be formatted as:
# <Person>{ "name": string, "age": number }</Person>

# Send to your LLM
llm_response = your_llm_client.generate(prompt)

# Parse the tagged response
parser = Parser(Person)
parser.feed(llm_response)
person = parser.validate()

print(f"Created person: {person.name}, {person.age} years old")

Contributing

Contributions welcome! Check out the examples directory to see how things work.

License

Apache License, Version 2.0
