
GASP (Gee Another Schema Parser) - A validator and type-safe deserializer for LLM output.


GASP - Type-Safe LLM Output Parser

⚠️ MAJOR BREAKING CHANGES IN VERSION 1.0.0 ⚠️
Version 1.0.0 is a complete rewrite that removes WAIL entirely and introduces a new tag-based parsing approach.
If you're using an older version of GASP, you'll need to significantly change your code to upgrade.
See "Migrating from pre-1.0 Versions" below.

GASP is a Rust-based parser for turning LLM outputs into properly typed Python objects. It handles streaming JSON fragments, recovers from common LLM quirks, and makes structured data extraction actually pleasant.

The Problem

LLMs are great at generating structured data when asked, but not perfect:

<Person>
{
  "name": "Alice Smith",
  "age": 30,
  hobbies: ["coding", "hiking"]
}
</Person>

That output has an unquoted key (hobbies), and in practice it usually arrives surrounded by natural-language text. A strict JSON parser rejects it outright.
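
To make the failure concrete, here is a small check that uses only Python's standard json module (not GASP). The payload variable is simply the body of the example above, copied for illustration.

```python
import json

# Body of the <Person> example above; note the unquoted `hobbies` key.
payload = '''{
  "name": "Alice Smith",
  "age": 30,
  hobbies: ["coding", "hiking"]
}'''

try:
    json.loads(payload)
    strict_ok = True
except json.JSONDecodeError as exc:
    # A strict parser stops at the first bare key it meets.
    strict_ok = False
    print(f"strict parse failed: {exc.msg}")
```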

How GASP Works

GASP uses a tag-based approach to extract and type-cast structured data:

  1. Tags like <Person>...</Person> mark where the structured data lives (and what type it is)
  2. The parser ignores everything outside those tags
  3. Inside the tags, it handles messy JSON with broken quotes, trailing commas, etc.
  4. The data gets converted into proper Python objects based on type annotations
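
The four steps above can be sketched in plain Python. This is a rough illustration using only the standard library, not GASP's Rust implementation: the helper names (extract_tagged, repair_json, to_person) are made up for this example, the regex-based repair is naive (it can corrupt string values that look like keys), and no type coercion is attempted.

```python
import json
import re
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class Person:
    name: str
    age: int
    hobbies: Optional[List[str]] = None


def extract_tagged(text: str, tag: str) -> Optional[str]:
    """Steps 1-2: return the body between <tag> and </tag>, ignoring the rest."""
    match = re.search(rf"<{tag}>(.*?)</{tag}>", text, re.DOTALL)
    return match.group(1) if match else None


def repair_json(body: str) -> str:
    """Step 3 (naive): quote bare keys, then drop trailing commas."""
    body = re.sub(r'([{,]\s*)([A-Za-z_]\w*)(\s*:)', r'\1"\2"\3', body)
    return re.sub(r",\s*([}\]])", r"\1", body)


def to_person(text: str) -> Person:
    """Step 4: build a typed object from the repaired JSON."""
    body = extract_tagged(text, "Person")
    if body is None:
        raise ValueError("no <Person> block found")
    data = json.loads(repair_json(body))
    known = {f.name for f in fields(Person)}
    return Person(**{k: v for k, v in data.items() if k in known})


reply = 'Sure! <Person>{"name": "Alice", "age": 30, hobbies: ["coding",],}</Person> Hope that helps.'
person = to_person(reply)
print(person)
```

A real parser would tokenize the input rather than patch it with regular expressions; the sketch only shows how the steps fit together.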

Features

  • Tag-Based Extraction: Extract structured data even when surrounded by explanatory text
  • Streaming Support: Process data incrementally as it arrives from the LLM
  • Type Inference: Automatically match JSON objects to Python classes
  • Error Recovery: Handle common JSON mistakes that LLMs make
  • Pydantic Integration: Works with Pydantic for validation and schema definition

Installation

pip install gasp-py

Quick Example

from gasp import Parser, Deserializable
from typing import List, Optional

class Address(Deserializable):
    street: str
    city: str
    zip_code: str

class Person(Deserializable):
    name: str
    age: int
    address: Address
    hobbies: Optional[List[str]] = None

# Create a parser for the Person type
parser = Parser(Person)

# Process LLM output chunks as they arrive
chunks = [
    '<Person>{"name": "Alice", "age": 30',
    ', "address": {"street": "123 Main St", "city": "Springfield"',
    ', "zip_code": "12345"}, "hobbies": ["reading", "coding"]}</Person>'
]

for chunk in chunks:
    result = parser.feed(chunk)
    print(result)  # Will show partial objects as they're built

# Get the final validated result
person = parser.validate()
print(f"Hello {person.name}!")  # Hello Alice!
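
To see why a streaming parser has to buffer across chunk boundaries, here is a toy accumulator written with the standard library only. TagBuffer is a hypothetical class for this illustration, not part of GASP. Unlike the Parser above, which returns partial objects as they are built, this sketch only reports the raw tag body once the closing tag has fully arrived.

```python
from typing import Optional


class TagBuffer:
    """Toy accumulator: collects chunks and reports the tag body once closed."""

    def __init__(self, tag: str) -> None:
        self.open_tag = f"<{tag}>"
        self.close_tag = f"</{tag}>"
        self.buffer = ""

    def feed(self, chunk: str) -> Optional[str]:
        """Append a chunk; return the complete body, or None if still open."""
        self.buffer += chunk
        start = self.buffer.find(self.open_tag)
        if start == -1:
            return None
        start += len(self.open_tag)
        end = self.buffer.find(self.close_tag, start)
        if end == -1:
            return None
        return self.buffer[start:end]


# The closing tag is deliberately split across two chunks.
chunks = ['Here you go: <Person>{"name": "Al', 'ice", "age": 30}', '</Per', 'son> Done.']
buf = TagBuffer("Person")
results = [buf.feed(chunk) for chunk in chunks]
print(results)
```

Searching the whole buffer on every call keeps the sketch short; a production parser would keep a cursor and state machine instead of rescanning.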

Working with Pydantic

GASP integrates seamlessly with Pydantic:

from pydantic import BaseModel
from gasp import Parser

class UserProfile(BaseModel):
    username: str
    email: str
    is_active: bool = True

# Create parser from Pydantic model
parser = Parser.from_pydantic(UserProfile)

# Feed LLM output with tags
llm_output = '<UserProfile>{"username": "alice42", "email": "alice@example.com"}</UserProfile>'
result = parser.feed(llm_output)

# Access as a proper Pydantic object
profile = UserProfile.model_validate(parser.validate())
print(profile.model_dump_json(indent=2))

How Tags Work

The tag name directly indicates what Python type to instantiate:

<Person>{ ... JSON data ... }</Person>  # Creates a Person instance
<List>[ ... array data ... ]</List>     # Creates a List
<Address>{ ... address data ... }</Address>  # Creates an Address

The parser ignores everything outside of the tags, so the LLM can provide explanations, context, or other text alongside the structured data.
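
One way to picture the tag-to-type mapping is a registry keyed by tag name. The sketch below is an assumption-laden illustration in plain Python: BUILDERS and parse_tagged are hypothetical names, the JSON inside the tags is assumed to be strictly valid, and GASP's actual lookup may work quite differently.

```python
import json
import re
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Person:
    name: str
    age: int


@dataclass
class Address:
    street: str
    city: str


# Hypothetical registry: tag name -> callable that builds the Python value.
BUILDERS: Dict[str, Callable[[Any], Any]] = {
    "Person": lambda data: Person(**data),
    "Address": lambda data: Address(**data),
    "List": list,
}


def parse_tagged(text: str) -> List[Any]:
    """Build one object per recognized <Tag>...</Tag> block, in order."""
    objects = []
    for tag, body in re.findall(r"<(\w+)>(.*?)</\1>", text, re.DOTALL):
        if tag in BUILDERS:  # unknown tags are skipped, like surrounding prose
            objects.append(BUILDERS[tag](json.loads(body)))
    return objects


reply = (
    'Here is the person: <Person>{"name": "Alice", "age": 30}</Person> '
    "and some numbers: <List>[1, 2, 3]</List>. <Unknown>{}</Unknown>"
)
parsed = parse_tagged(reply)
print(parsed)
```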

Advanced Templating with Jinja2

GASP provides built-in Jinja2 integration for more advanced prompt templating:

from gasp import Deserializable, render_template
from typing import List, Optional

class Person(Deserializable):
    """Information about a person"""
    name: str
    age: int
    hobbies: Optional[List[str]] = None

# Create a template with Jinja2 syntax
template = """
# {{ title }}

Generate a {{ type_name|type_description }}.

{% if include_format_instructions %}
Your response must be formatted as:
{{ person_type|format_type }}
{% endif %}
"""

# Provide template context
context = {
    'title': 'Person Generator',
    'type_name': Person,
    'include_format_instructions': True,
    'person_type': Person
}

# Render the template
prompt = render_template(template, context)

Available Jinja2 Filters

  • format_type: Generates format instructions for a type (e.g., {{ my_type|format_type }})
  • type_description: Provides a human-readable description of a type, including docstring info
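
For intuition about what a filter like format_type has to do, the following standard-library sketch derives a format instruction from a class's type annotations. The functions json_type_name and format_instructions are hypothetical, and the exact wording GASP emits may differ; only the general shape (a tag named after the class wrapping a field listing) is taken from the examples in this README.

```python
from typing import Any, List, Optional, Union, get_args, get_origin, get_type_hints

# Assumed mapping from Python scalar types to JSON-style names.
_JSON_NAMES = {str: "string", int: "number", float: "number", bool: "boolean"}


def json_type_name(tp: Any) -> str:
    """Map a Python annotation to a JSON-style type name."""
    if tp in _JSON_NAMES:
        return _JSON_NAMES[tp]
    origin, args = get_origin(tp), get_args(tp)
    if origin is list:
        return f"array of {json_type_name(args[0])}"
    if origin is Union:
        # Optional[X] is Union[X, None]; describe the non-None part.
        inner = [a for a in args if a is not type(None)]
        if len(inner) == 1:
            return f"{json_type_name(inner[0])} (optional)"
    return getattr(tp, "__name__", str(tp))


def format_instructions(cls: type) -> str:
    """Render '<Cls>{ "field": type, ... }</Cls>' from the class annotations."""
    hints = get_type_hints(cls)
    body = ", ".join(f'"{name}": {json_type_name(tp)}' for name, tp in hints.items())
    return f"<{cls.__name__}>{{ {body} }}</{cls.__name__}>"


class Person:
    name: str
    age: int
    hobbies: Optional[List[str]] = None


instructions = format_instructions(Person)
print(instructions)
```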

Template Files & Inheritance

You can also use template files with inheritance:

from gasp import render_file_template

# Renders a template file with GASP filters included
prompt = render_file_template("templates/person_prompt.j2", context)

Direct Jinja2 Access

For advanced cases, you can use the Jinja2 environment directly:

import jinja2
from gasp.jinja_helpers import create_type_environment

# Create a Jinja2 environment with GASP filters
env = create_type_environment()

# Add your own filters (my_filter_function is any callable you define)
env.filters["my_filter"] = my_filter_function

# Load templates from a directory
env.loader = jinja2.FileSystemLoader("templates/")

# Use directly with Jinja2 API
template = env.get_template("my_template.j2")
prompt = template.render(**context)

Customizing Behavior

Need more control? You can customize type conversion, validation, and parsing behavior:

from gasp import Deserializable

# Custom type conversions and validation
class CustomPerson(Deserializable):
    name: str
    age: int

    @classmethod
    def __gasp_from_partial__(cls, partial_data):
        """Add custom validation or pre-processing"""
        # Normalize name to title case
        if "name" in partial_data:
            partial_data["name"] = partial_data["name"].title()
        return super().__gasp_from_partial__(partial_data)
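
The hook pattern above can be tried without GASP installed. The sketch below uses a hypothetical ToyBase class with a from_partial classmethod as a stand-in; it is not GASP's Deserializable and makes no claim about how __gasp_from_partial__ is invoked. It illustrates two defensive choices worth considering in such a hook: copying the incoming dict instead of mutating it, and tolerating fields that have not arrived yet.

```python
from typing import Any, Dict


class ToyBase:
    """Hypothetical stand-in for a deserializable base class (not GASP's)."""

    def __init__(self, **values: Any) -> None:
        for key, value in values.items():
            setattr(self, key, value)

    @classmethod
    def from_partial(cls, partial_data: Dict[str, Any]) -> "ToyBase":
        return cls(**partial_data)


class TitledPerson(ToyBase):
    @classmethod
    def from_partial(cls, partial_data: Dict[str, Any]) -> "TitledPerson":
        # Copy first so the caller's dict is not mutated.
        cleaned = dict(partial_data)
        # During streaming the name may be missing or not yet a string.
        if isinstance(cleaned.get("name"), str):
            cleaned["name"] = cleaned["name"].title()
        return super().from_partial(cleaned)


raw = {"name": "alice smith", "age": 30}
person = TitledPerson.from_partial(raw)
print(person.name, raw["name"])
```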

Migrating from pre-1.0 Versions

Version 1.0.0 represents a complete architectural shift:

What's Been Removed

  • WAIL Parser: The entire WAIL language and validation system has been removed
  • Schema Validation: The schema-based approach has been replaced with typed parsing
  • WAILGenerator: This class and its API are no longer available
  • All WAIL-related files and examples

What's New

  • Tag-Based Parsing: Uses XML-like tags in LLM output to identify data types
  • Type Annotations: Direct use of Python type annotations to define structures
  • Template Helpers: Functions to generate format instructions from types
  • Streaming Support: Improved support for processing data as it arrives

Migration Steps

  1. Replace WAIL schema definitions with Python classes using type annotations
  2. Replace WAILGenerator with the new Parser class
  3. Update your prompts to use the new tag-based format
  4. Use template_helpers.interpolate_prompt() to generate type-aware prompts

Example of old WAIL approach:

schema = r'''
object Response { name: String, age: Number }
template GenerateResponse() -> Response { ... }
'''
generator = WAILGenerator()
generator.load_wail(schema)
(prompt, _, _) = generator.get_prompt()
llm_response = your_llm_client.generate(prompt)
parsed_data = generator.parse_llm_output(llm_response)

New approach:

from gasp import Deserializable, Parser
from gasp.template_helpers import interpolate_prompt

class Person(Deserializable):
    name: str
    age: int

# Create a template with a {{return_type}} placeholder
template = """
Generate a profile for a person who loves coding.

{{return_type}}
"""

# Generate a complete prompt with type information
prompt = interpolate_prompt(template, Person)
print(prompt)
# Output will include:
# Your response should be formatted as:
# <Person>{ "name": string, "age": number }</Person>

# Send to your LLM
llm_response = your_llm_client.generate(prompt)

# Parse the tagged response
parser = Parser(Person)
parser.feed(llm_response)
person = parser.validate()

print(f"Created person: {person.name}, {person.age} years old")

Contributing

Contributions welcome! Check out the examples directory to see how things work.

License

Apache License, Version 2.0

