Token-efficient schema language for LLMs with validation and conversion

SlimSchema

Compact schema language and data extraction utilities for LLMs.

from slimschema import to_data

schema = """
name: str
age: 18..120
email: email
"""

response = '<json>{"name": "Ace", "age": 30, "email": "a@g.com"}</json>'
data, error = to_data(response, schema)

assert error is None
assert data == {'name': 'Ace', 'age': 30, 'email': 'a@g.com'}

Install

uv add slimschema
# or: pip install slimschema

API

to_schema(obj) -> Schema                     # YAML, Pydantic model, or msgspec Struct → SlimSchema
to_data(response, obj) -> (data, error)      # Extract JSON from the response and validate it
str(schema)                                  # SlimSchema → YAML

With msgspec (Dynamic Validation)

No class definition needed. SlimSchema creates a msgspec Struct dynamically:

from slimschema import to_data

schema = """
name: str
age: 18..120
email: email
tags: [str]
"""

# SlimSchema creates this msgspec Struct internally:
# class DynamicStruct(msgspec.Struct):
#     name: str
#     age: Annotated[int, msgspec.Meta(ge=18, le=120)]
#     email: Annotated[str, msgspec.Meta(pattern=r"^[^@]+@...")]
#     tags: list[str]

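# Example LLM output (any text containing the JSON payload)
llm_response = '<json>{"name": "Ace", "age": 30, "email": "a@g.com", "tags": ["vip"]}</json>'
data, error = to_data(llm_response, schema)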
# Returns dict validated by msgspec (117x faster)

With Pydantic

Convert models to SlimSchema, get instances back:

from pydantic import BaseModel, Field
from slimschema import to_schema, to_data

class User(BaseModel):
    username: str = Field(min_length=3, max_length=20)
    age: int = Field(ge=18, le=120)
    tags: set[str]

# Pydantic → SlimSchema YAML
schema = to_schema(User)

assert str(schema) == """# User
username: str{3..20}
age: 18..120
tags: {str}"""

# Validate → Pydantic instance
llm_response = '<json>{"username": "alice", "age": 30, "tags": ["admin"]}</json>'
user, error = to_data(llm_response, User)
assert isinstance(user, User)

With msgspec Struct

Pass msgspec Structs directly:

import msgspec
from slimschema import to_data, to_schema

# Define your msgspec Struct
class User(msgspec.Struct):
    name: str
    age: int
    groups: set[str] = set()
    email: str | None = None

# Pass Struct directly - returns Struct instance
response = '<json>{"name": "alice", "age": 30, "groups": ["admin"]}</json>'
user, error = to_data(response, User)

assert error is None
assert isinstance(user, User)
assert user.name == "alice"
assert user.groups == {"admin"}

# Also works: convert to SlimSchema for prompts
schema = to_schema(User)
assert str(schema) == """# User
name: str
age: int
groups: {str}
email?: str"""

Types

# Primitives
name: str
age: int
price: float
active: bool
data: obj

# Constraints
username: str{3..20}
age: 18..120
ratio: 0.0..1.0

# Formats
email: email
url: url
birthday: date
timestamp: datetime
id: uuid

# Regex
slug: /^[a-z0-9-]+$/

# Enums
status: active | pending | done

# Collections
tags: [str]
ids: {int}

# Optional
bio?: str
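
As an illustration of how the range notation reads, the sketch below treats `low..high` as inclusive bounds, matching the `ge`/`le` constraints shown in the msgspec section. The `parse_range` and `in_range` helpers are hypothetical and are not part of the SlimSchema API.

```python
from typing import Tuple, Union

Number = Union[int, float]


def parse_range(expr: str) -> Tuple[Number, Number]:
    """Split 'low..high' into numeric bounds (ints unless a decimal point appears)."""
    low_text, high_text = expr.split("..", 1)
    cast = float if "." in low_text or "." in high_text else int
    return cast(low_text), cast(high_text)


def in_range(value: Number, expr: str) -> bool:
    """Check a value against an inclusive 'low..high' range."""
    low, high = parse_range(expr)
    return low <= value <= high
```

Under this reading, `in_range(18, "18..120")` is true and `in_range(10, "18..120")` is false.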

Complete Example

from slimschema import to_data

schema = r"""
# Product
name: str{3..100}
sku: /^[A-Z]{3}-\d{4}$/
price: 0.01..99999.99
status: draft | active | archived
tags: [str]
ids: {int}
in_stock: bool
metadata: obj
supplier: obj
created: datetime
updated?: datetime
"""

response = '''
<json>
{
  "name": "Laptop Pro",
  "sku": "ELC-1234",
  "price": 1299.99,
  "status": "active",
  "tags": ["laptop", "computer"],
  "ids": [101, 202, 303],
  "in_stock": true,
  "metadata": {"warranty": "2 years"},
  "supplier": {
    "name": "TechCorp",
    "country": "USA"
  },
  "created": "2025-01-15T10:30:00"
}
</json>
'''

product, error = to_data(response, schema)

assert error is None
assert product["name"] == "Laptop Pro"
assert product["supplier"]["name"] == "TechCorp"

Error Handling

Concise, LLM-friendly error messages:

from slimschema import to_data

# Missing required field
_, err = to_data('<json>{"name": "Al"}</json>', """
name: str
age: int
""")
assert err == "Object missing required field `age`"

# Out of range
_, err = to_data('<json>{"age": 10}</json>', "age: 18..120")
assert "Expected `int` >= 18" in err

# Invalid email
_, err = to_data('<json>{"email": "bad"}</json>', "email: email")
assert "matching regex" in err

# Nested object error
_, err = to_data('<json>{"user": "not-an-object"}</json>', "user: obj")
assert "Expected `object`" in err

# Multiple errors (reports first)
_, err = to_data('<json>{}</json>', """
name: str
age: int
email: email
""")
assert "name" in err  # First missing field reported

msgspec reports one error at a time - perfect for LLM retry loops.
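
A retry loop built on the single-error behavior might look like the sketch below. The function `extract_with_retries` and its `call_llm` and `validate` parameters are assumptions made for illustration; nothing here is provided by SlimSchema itself.

```python
from typing import Any, Callable, Optional, Tuple


def extract_with_retries(
    prompt: str,
    call_llm: Callable[[str], str],
    validate: Callable[[str], Tuple[Any, Optional[str]]],
    max_attempts: int = 3,
) -> Tuple[Any, Optional[str]]:
    """Ask the model, validate the reply, and feed the error back until it passes."""
    current_prompt = prompt
    error: Optional[str] = None
    for _ in range(max_attempts):
        response = call_llm(current_prompt)
        data, error = validate(response)
        if error is None:
            return data, None
        # Only one error is reported per attempt, so the correction request stays short.
        current_prompt = (
            f"{prompt}\n\nYour previous answer was invalid: {error}\n"
            "Please return corrected JSON."
        )
    # All attempts failed: return the last error to the caller.
    return None, error
```

With SlimSchema, `validate` could be `lambda response: to_data(response, schema)`, and `call_llm` would wrap whichever model client is in use.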

JSON Extraction

to_data automatically extracts the JSON payload from <json> tags, <output> tags, ```json fences, or raw JSON.
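
The general idea can be sketched with the standard library alone. This is an illustrative approximation, not SlimSchema's actual extraction code; `extract_json` is a hypothetical helper, and the order in which wrappers are tried is an assumption.

```python
import json
import re
from typing import Any, Optional

FENCE = "`" * 3  # a Markdown code fence, built here to avoid nesting fences

# Wrappers are tried in this order; raw JSON is the fallback.
_PATTERNS = [
    re.compile(r"<json>(.*?)</json>", re.DOTALL),
    re.compile(r"<output>(.*?)</output>", re.DOTALL),
    re.compile(FENCE + r"json\s*(.*?)" + FENCE, re.DOTALL),
]


def extract_json(text: str) -> Optional[Any]:
    """Return the first parseable JSON payload found in text, or None."""
    for pattern in _PATTERNS:
        match = pattern.search(text)
        if match is None:
            continue
        try:
            return json.loads(match.group(1).strip())
        except json.JSONDecodeError:
            continue  # wrapper found but its content is not valid JSON
    try:
        return json.loads(text.strip())
    except json.JSONDecodeError:
        return None
```

One limitation of this sketch: a literal JSON `null` payload is indistinguishable from "nothing found", which a real implementation would need to handle separately.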

Comments (Round-Trip)

Comments flow through: YAML ↔ SlimSchema ↔ Pydantic

from slimschema import to_schema

# YAML comments
yaml_in = """
# User
name: str       # Full name
email: email    # Contact
age?: int
"""

schema = to_schema(yaml_in)
assert schema.name == "User"
assert schema.fields[0].description == "Full name"

# Convert back
yaml_out = str(schema)
assert yaml_out == """# User
name: str       # Full name
email: email    # Contact
age?: int"""

# Pydantic descriptions → comments
from pydantic import BaseModel, Field

class Person(BaseModel):
    name: str = Field(description="Full name")
    email: str = Field(description="Contact email")

schema2 = to_schema(Person)
assert "# Full name" in str(schema2)
assert "# Contact email" in str(schema2)

Why SlimSchema?

  • Token-efficient schemas - 5-10x smaller than JSON Schema
  • Fast validation - msgspec is 117x faster than alternatives
  • LLM-friendly - Unambiguous notation, clear constraints
  • Seamless integration - Works with Pydantic & msgspec
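
To get a rough feel for the size difference, the sketch below writes the schema from the opening example in both notations and compares character counts. Character count is only a proxy for tokens, and the ratio for this one small schema says nothing about the 5-10x figure in general; the JSON Schema shown is a hand-written equivalent, not output from SlimSchema.

```python
import json

# The SlimSchema text from the opening example.
slim_schema = "name: str\nage: 18..120\nemail: email"

# A hand-written JSON Schema expressing the same constraints.
json_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer", "minimum": 18, "maximum": 120},
        "email": {"type": "string", "format": "email"},
    },
    "required": ["name", "age", "email"],
}

# Serialize compactly so whitespace does not inflate the comparison.
json_schema_text = json.dumps(json_schema, separators=(",", ":"))

ratio = len(json_schema_text) / len(slim_schema)
print(f"SlimSchema:  {len(slim_schema)} characters")
print(f"JSON Schema: {len(json_schema_text)} characters")
print(f"Ratio:       {ratio:.1f}x")
```

For an actual token count, the same two strings can be passed to the tokenizer of the target model.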

For developers: See CLAUDE.md for comprehensive testing results, token economics analysis, and development guidelines.
