Practical, robust structured generation with retries using Pydantic schemas
schema_agent
Practical, robust structured generation for LLMs using Pydantic schemas. Provide a schema, a prompt, and a model; get back a validated BaseModel instance with automatic retries when validation fails.
Note: This is a minimalist, experimental package. It does nearly the same thing as Instructor, with some slight differences in implementation design. If you need this for production, I recommend using Instructor.
Features
- Schema-first: define your output as a Pydantic model
- Automatic retries: validates via a tool call and re-prompts on failure
- Provider-agnostic: accepts LangChain-compatible models or a provider string (e.g. "openai:gpt-4o-mini")
- Strong typing: returns a Pydantic instance alongside raw agent traces
- Simple API: one function, generate_with_schema(...)
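The validate-and-retry idea behind these features can be sketched in a few lines. This is not the package's implementation (which, per the list above, validates via a tool call inside an agent). It is a minimal loop under the assumption that the model is any callable returning a JSON string. The names generate_validated and make_fake_llm, and the wording of the feedback message, are hypothetical and exist only for illustration.

```python
import json
from typing import Callable

from pydantic import BaseModel, Field, ValidationError


class Person(BaseModel):
    name: str = Field(description="Full name")
    age: int = Field(description="Age in years")


def generate_validated(
    call_llm: Callable[[str], str],
    prompt: str,
    schema: type[BaseModel],
    max_retries: int = 2,
) -> dict:
    """Call the model, validate against `schema`, and re-prompt on failure."""
    current_prompt = prompt
    for attempt in range(max_retries + 1):
        raw = call_llm(current_prompt)
        try:
            output = schema.model_validate(json.loads(raw))
            return {"output": output, "success": True, "retries": attempt}
        except (json.JSONDecodeError, ValidationError) as exc:
            # Feed the error back so the next attempt can correct itself.
            current_prompt = f"{prompt}\n\nPrevious attempt failed validation:\n{exc}"
    return {"output": None, "success": False, "retries": max_retries}


def make_fake_llm() -> Callable[[str], str]:
    """Hypothetical stand-in for a model: first reply is invalid, second is valid."""
    replies = iter(
        [
            '{"name": "John Doe", "age": "forty-two"}',
            '{"name": "John Doe", "age": 42}',
        ]
    )

    def fake_llm(prompt: str) -> str:
        return next(replies)

    return fake_llm


result = generate_validated(
    make_fake_llm(),
    "His name was John Doe and he was 42 years old",
    Person,
)
print(result["success"], result["retries"], result["output"])
```

The first fake reply fails because "forty-two" cannot be coerced to an int, so the loop re-prompts once and the second reply validates.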
Usage
Install from PyPI:
pip install schema_agent
# Optional OpenAI support (needed to run scripts/demo.py as-is)
pip install "schema_agent[openai]"
Basic example:
from pydantic import BaseModel, Field
from schema_agent import generate_with_schema

class Person(BaseModel):
    name: str = Field(description="Full name")
    age: int = Field(description="Age in years")

resp = generate_with_schema(
    user_prompt="His name was John Doe and he was 42 years old",
    llm="openai:gpt-4o-mini",  # or pass a LangChain model instance
    schema=Person,
    max_retries=2,
)

# Validated Pydantic instance
print(resp["output"])  # -> Person(name='John Doe', age=42)
print(resp["success"])  # -> True/False
print(resp["retries"])  # -> number of retries performed
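Because the result carries a success flag, callers will usually want to check it before using the output. The following is a small sketch that relies only on the three keys shown above ("output", "success", "retries") and assumes the output should not be used when success is false. The names unwrap and GenerationFailed are hypothetical helpers, not part of schema_agent, and the dicts are hand-written stand-ins rather than real library output.

```python
from typing import Any


class GenerationFailed(RuntimeError):
    """Hypothetical error raised when no validated output was produced."""


def unwrap(resp: dict[str, Any]) -> Any:
    """Return resp["output"] if generation succeeded, otherwise raise."""
    if not resp.get("success"):
        raise GenerationFailed(
            f"no valid output after {resp.get('retries', 0)} retries"
        )
    return resp["output"]


# Stand-in dicts shaped like the response above (not real library output).
ok = {"output": {"name": "John Doe", "age": 42}, "success": True, "retries": 0}
failed = {"output": None, "success": False, "retries": 2}

print(unwrap(ok))
try:
    unwrap(failed)
except GenerationFailed as exc:
    print(f"failed: {exc}")
```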
With a LangChain model object:
from langchain_openai import ChatOpenAI
from schema_agent import generate_with_schema

llm = ChatOpenAI(model="gpt-4o-mini")

resp = generate_with_schema(
    user_prompt="His name was John Doe and he was 42 years old",
    llm=llm,
    schema=Person,
    max_retries=2,
)
With a validation callback (example that extracts a phone number from a large text):
def validate_output(x: str | dict) -> None:
    if x["name"] != "John Doe":
        raise ValueError("Name is not John Doe")

class PhoneNumber(BaseModel):
    phone_number: str = Field(description="Phone number")

def check_phone_number_in_data(x: str | dict) -> None:
    if x["phone_number"] not in large_text:
        raise ValueError("Extracted attribute 'phone_number' not found in data")

resp = generate_with_schema(
    user_prompt=large_text,  # large text that contains a phone number
    llm=llm,
    schema=PhoneNumber,  # the callback checks the phone_number field
    max_retries=2,
    validation_callback=check_phone_number_in_data,
)
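As far as the examples show, a validation callback receives the extracted output, returns None when it is acceptable, and raises ValueError otherwise. Instead of reading a module-level large_text, a small factory can bind the source text explicitly. This sketch assumes the callback is handed a dict (the annotation above also allows str); make_grounding_check is a hypothetical helper, not part of schema_agent.

```python
from typing import Callable


def make_grounding_check(field: str, source_text: str) -> Callable[[dict], None]:
    """Build a callback that rejects values not found verbatim in source_text."""

    def check(x: dict) -> None:
        value = x[field]
        if value not in source_text:
            raise ValueError(f"Extracted attribute {field!r} not found in data")

    return check


large_text = "Contact John Doe at 555-0100 between 9 and 5."
check_phone = make_grounding_check("phone_number", large_text)

check_phone({"phone_number": "555-0100"})  # passes silently
try:
    check_phone({"phone_number": "555-9999"})
except ValueError as exc:
    print(exc)
```

The resulting function has the same shape as check_phone_number_in_data, so it could presumably be passed as validation_callback in the same way.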
Run the demo script:
pixi run demo
Notes:
- Set OPENAI_API_KEY in your environment if using OpenAI (e.g., via a .env file when installing the openai extra).
- On unexpected tool errors the call raises an exception; expected validation failures are retried up to max_retries.
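To fail early with a clear message rather than deep inside a model call, the key can be checked before calling the library. This is a standard-library sketch only; require_env is a hypothetical helper, and loading a .env file is deliberately left out.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or blank."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it or add it to your .env file")
    return value


# Typical use before the first model call:
# api_key = require_env("OPENAI_API_KEY")
```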
Project Structure
- schema_agent/: Package logic
  - llm.py: generate_with_schema agent orchestration and validation tool
  - str.py: schema-to-example string utilities
  - utils.py, errors.py, consts.py, types.py: helpers, exceptions, prompts, typings
- tests/: Unit tests for all modules
- scripts/: demo.py script
Development
This package has been created with pymc-labs/project-starter. It features:
- 📦 pixi for dependency and environment management.
- 🧹 pre-commit for formatting, spellcheck, etc. If everyone uses the same standard formatting, PRs won't contain incidental formatting changes that distract from the actual contribution, and reviewing code becomes much easier.
- 🧪 pytest for testing.
- 🔄 GitHub Actions for running the pre-commit checks on each PR, automated testing, and dependency management (dependabot). Merges to main publish to PyPI via trusted publishing.
Prerequisites
- Python 3.11 or higher
- Pixi package manager
Get started
- Run pixi install to install the dependencies.
- Run pixi r test to run the tests.
- Run pre-commit install to set up pre-commit hooks.