GASP (Gee Another Schema Parser) - A validator and type-safe deserializer for LLM output.

GASP - Type-Safe LLM Output Parser

⚠️ MAJOR BREAKING CHANGES IN VERSION 1.0.0 ⚠️
Version 1.0.0 is a complete rewrite that removes WAIL entirely and introduces a new tag-based parsing approach.
If you're using an older version of GASP, you'll need to significantly change your code to upgrade.
See the Migration Guide below.

GASP is a Rust-based parser for turning LLM outputs into properly typed Python objects. It handles streaming JSON fragments, recovers from common LLM quirks, and makes structured data extraction actually pleasant.

The Problem

LLMs are great at generating structured data when asked, but not perfect:

<Person>
{
  "name": "Alice Smith",
  "age": 30,
  hobbies: ["coding", "hiking"]
}
</Person>

That output has an unquoted key (hobbies), and in real responses it is usually embedded in natural-language text with inconsistent formatting. Strict JSON parsers reject it outright.
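
To make the failure concrete, the snippet below feeds the JSON body from the example to Python's standard json module. It uses only the standard library and nothing GASP-specific; the variable names are illustrative.

```python
import json

# The JSON body from the example above, including the unquoted `hobbies` key.
raw = '''{
  "name": "Alice Smith",
  "age": 30,
  hobbies: ["coding", "hiking"]
}'''

try:
    json.loads(raw)
    strict_parse_ok = True
except json.JSONDecodeError as exc:
    # The strict parser stops at the first bare key.
    strict_parse_ok = False
    error_message = str(exc)

print(strict_parse_ok)  # False
```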

How GASP Works

GASP uses a tag-based approach to extract and type-cast structured data:

  1. Tags like <Person>...</Person> mark where the structured data lives (and what type it is)
  2. The parser ignores everything outside those tags
  3. Inside the tags, it handles messy JSON with broken quotes, trailing commas, etc.
  4. The data gets converted into proper Python objects based on type annotations
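
The four steps above can be pictured with a small standard-library sketch. This is not GASP's implementation (the real parser is written in Rust and handles far more cases); the helper names extract_tag_body, repair_json and to_instance are made up for illustration, and the regex-based repair is deliberately naive: it would also rewrite text inside string values that happens to look like a bare key.

```python
import json
import re
from dataclasses import dataclass
from typing import List, get_type_hints


@dataclass
class Person:
    name: str
    age: int
    hobbies: List[str]


def extract_tag_body(text: str, tag: str) -> str:
    """Steps 1-2: return the text between <tag> and </tag>, ignoring the rest."""
    match = re.search(rf"<{tag}>(.*?)</{tag}>", text, re.DOTALL)
    if match is None:
        raise ValueError(f"no <{tag}> block found")
    return match.group(1)


def repair_json(body: str) -> str:
    """Step 3 (naive): quote bare keys and drop trailing commas."""
    body = re.sub(r'([{,]\s*)([A-Za-z_]\w*)(\s*:)', r'\1"\2"\3', body)
    body = re.sub(r',(\s*[}\]])', r'\1', body)
    return body


def to_instance(cls, data: dict):
    """Step 4: build the object, coercing scalar fields to their annotated type."""
    kwargs = {}
    for field, annotation in get_type_hints(cls).items():
        value = data[field]
        if annotation in (int, float, str):
            value = annotation(value)
        kwargs[field] = value
    return cls(**kwargs)


llm_output = '''Sure, here is the record you asked for:
<Person>
{
  "name": "Alice Smith",
  "age": "30",
  hobbies: ["coding", "hiking",]
}
</Person>
Let me know if you need anything else.'''

body = extract_tag_body(llm_output, "Person")
person = to_instance(Person, json.loads(repair_json(body)))
print(person)
```

The surrounding chatter is dropped, the bare key and trailing comma are repaired, and the string "30" ends up as the integer 30 because the annotation says int.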

Features

  • Tag-Based Extraction: Extract structured data even when surrounded by explanatory text
  • Streaming Support: Process data incrementally as it arrives from the LLM
  • Type Inference: Automatically match JSON objects to Python classes
  • Error Recovery: Handle common JSON mistakes that LLMs make
  • Pydantic Integration: Works with Pydantic for validation and schema definition
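
As a rough illustration of the streaming bullet, the sketch below buffers chunks and reports the tag body once the closing tag has arrived, even when a tag is split across chunks. TagBuffer is a hypothetical helper, not part of GASP, and it is much simpler than the real parser, which also builds partial objects before the closing tag is seen.

```python
from typing import Optional


class TagBuffer:
    """Hypothetical helper: accumulate chunks until <tag>...</tag> is complete."""

    def __init__(self, tag: str) -> None:
        self.open_tag = f"<{tag}>"
        self.close_tag = f"</{tag}>"
        self.buffer = ""

    def feed(self, chunk: str) -> Optional[str]:
        """Append a chunk; return the tag body once the closing tag is present."""
        self.buffer += chunk
        start = self.buffer.find(self.open_tag)
        if start == -1:
            return None  # opening tag not seen yet
        body_start = start + len(self.open_tag)
        end = self.buffer.find(self.close_tag, body_start)
        if end == -1:
            return None  # still waiting for the closing tag
        return self.buffer[body_start:end]


# Chunks deliberately split both tags in the middle.
chunks = ['Here you go: <Per', 'son>{"name": "Al', 'ice"}</Pers', 'on> Done.']

tag_buffer = TagBuffer("Person")
results = [tag_buffer.feed(chunk) for chunk in chunks]
print(results[-1])  # {"name": "Alice"}
```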

Installation

pip install gasp-py

Quick Example

from gasp import Parser, Deserializable
from typing import List, Optional

class Address(Deserializable):
    street: str
    city: str
    zip_code: str

class Person(Deserializable):
    name: str
    age: int
    address: Address
    hobbies: Optional[List[str]] = None

# Create a parser for the Person type
parser = Parser(Person)

# Process LLM output chunks as they arrive
chunks = [
    '<Person>{"name": "Alice", "age": 30',
    ', "address": {"street": "123 Main St", "city": "Springfield"',
    ', "zip_code": "12345"}, "hobbies": ["reading", "coding"]}</Person>'
]

for chunk in chunks:
    result = parser.feed(chunk)
    print(result)  # Will show partial objects as they're built

# Get the final validated result
person = parser.validate()
print(f"Hello {person.name}!")  # Hello Alice!

Working with Pydantic

GASP integrates seamlessly with Pydantic:

from pydantic import BaseModel
from gasp import Parser

class UserProfile(BaseModel):
    username: str
    email: str
    is_active: bool = True

# Create parser from Pydantic model
parser = Parser.from_pydantic(UserProfile)

# Feed LLM output with tags
llm_output = '<UserProfile>{"username": "alice42", "email": "alice@example.com"}</UserProfile>'
result = parser.feed(llm_output)

# Access as a proper Pydantic object
profile = UserProfile.model_validate(parser.validate())
print(profile.model_dump_json(indent=2))

How Tags Work

The tag name directly indicates what Python type to instantiate:

<Person>{ ... JSON data ... }</Person>  # Creates a Person instance
<List>[ ... array data ... ]</List>     # Creates a List
<Address>{ ... address data ... }</Address>  # Creates an Address

The parser ignores everything outside of the tags, so the LLM can provide explanations, context, or other text alongside the structured data.
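
One way to picture the tag-to-type mapping is a registry keyed by class name. The sketch below is an assumption about the general idea, not GASP's code: TAG_TYPES and parse_tagged are hypothetical names, and the JSON inside the tags is assumed to be well formed so that the standard json module can read it.

```python
import json
import re
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    age: int


@dataclass
class Address:
    street: str
    city: str


# Hypothetical registry mapping tag names to the classes they should build.
TAG_TYPES = {cls.__name__: cls for cls in (Person, Address)}


def parse_tagged(text: str) -> list:
    """Build one object per recognised <Tag>...</Tag> block, in document order."""
    results = []
    for match in re.finditer(r"<(\w+)>(.*?)</\1>", text, re.DOTALL):
        tag, body = match.group(1), match.group(2)
        cls = TAG_TYPES.get(tag)
        if cls is None:
            continue  # unknown tag: treat it like any other ignored text
        results.append(cls(**json.loads(body)))
    return results


text = (
    'Here is the person: <Person>{"name": "Alice", "age": 30}</Person> '
    'and she lives at <Address>{"street": "123 Main St", "city": "Springfield"}</Address>. '
    '<Note>this tag is not registered</Note>'
)
objects = parse_tagged(text)
print(objects)
```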

Advanced Templating with Jinja2

GASP provides built-in Jinja2 integration for more advanced prompt templating:

from gasp import Deserializable, render_template
from typing import List, Optional

class Person(Deserializable):
    """Information about a person"""
    name: str
    age: int
    hobbies: Optional[List[str]] = None

# Create a template with Jinja2 syntax
template = """
# {{ title }}

Generate a {{ type_name|type_description }}.

{% if include_format_instructions %}
Your response must be formatted as:
{{ person_type|format_type }}
{% endif %}
"""

# Provide template context
context = {
    'title': 'Person Generator',
    'type_name': Person,
    'include_format_instructions': True,
    'person_type': Person
}

# Render the template
prompt = render_template(template, context)

Available Jinja2 Filters

  • format_type: Generates format instructions for a type (e.g., {{ my_type|format_type }})
  • type_description: Provides a human-readable description of a type, including docstring info

Template Files & Inheritance

You can also use template files with inheritance:

from gasp import render_file_template

# Renders a template file with GASP filters included
prompt = render_file_template("templates/person_prompt.j2", context)

Direct Jinja2 Access

For advanced cases, you can use the Jinja2 environment directly:

import jinja2
from gasp.jinja_helpers import create_type_environment

# Create a Jinja2 environment with GASP filters
env = create_type_environment()

# Add your own filters (my_filter_function is any callable you have defined)
env.filters["my_filter"] = my_filter_function

# Load templates from a directory
env.loader = jinja2.FileSystemLoader("templates/")

# Use directly with Jinja2 API
template = env.get_template("my_template.j2")
prompt = template.render(**context)

Customizing Behavior

Need more control? You can customize type conversion, validation, and parsing behavior:

from gasp import Deserializable

# Custom type conversions and validation
class CustomPerson(Deserializable):
    name: str
    age: int

    @classmethod
    def __gasp_from_partial__(cls, partial_data):
        """Add custom validation or pre-processing"""
        # Normalize name to title case
        if "name" in partial_data:
            partial_data["name"] = partial_data["name"].title()
        return super().__gasp_from_partial__(partial_data)

Migrating from pre-1.0 Versions

Version 1.0.0 represents a complete architectural shift:

What's Been Removed

  • WAIL Parser: The entire WAIL language and validation system has been removed
  • Schema Validation: The schema-based approach has been replaced with typed parsing
  • WAILGenerator: This class and its API are no longer available
  • All WAIL-related files and examples

What's New

  • Tag-Based Parsing: Uses XML-like tags in LLM output to identify data types
  • Type Annotations: Direct use of Python type annotations to define structures
  • Template Helpers: Functions to generate format instructions from types
  • Streaming Support: Improved support for processing data as it arrives

Migration Steps

  1. Replace WAIL schema definitions with Python classes using type annotations
  2. Replace WAILGenerator with the new Parser class
  3. Update your prompts to use the new tag-based format
  4. Use template_helpers.interpolate_prompt() to generate type-aware prompts

Example of old WAIL approach:

schema = r'''
object Response { name: String, age: Number }
template GenerateResponse() -> Response { ... }
'''
generator = WAILGenerator()
generator.load_wail(schema)
(prompt, _, _) = generator.get_prompt()
llm_response = your_llm_client.generate(prompt)
parsed_data = generator.parse_llm_output(llm_response)

New approach:

from gasp import Deserializable, Parser
from gasp.template_helpers import interpolate_prompt

class Person(Deserializable):
    name: str
    age: int

# Create a template with a {{return_type}} placeholder
template = """
Generate a profile for a person who loves coding.

{{return_type}}
"""

# Generate a complete prompt with type information
prompt = interpolate_prompt(template, Person)
print(prompt)
# Output will include:
# Your response should be formatted as:
# <Person>{ "name": string, "age": number }</Person>

# Send to your LLM
llm_response = your_llm_client.generate(prompt)

# Parse the tagged response
parser = Parser(Person)
parser.feed(llm_response)
person = parser.validate()

print(f"Created person: {person.name}, {person.age} years old")

Contributing

Contributions welcome! Check out the examples directory to see how things work.

License

Apache License, Version 2.0
