Instructor: Structured Outputs for LLMs
Get reliable JSON from any LLM. Built on Pydantic for validation, type safety, and IDE support.
import instructor
from pydantic import BaseModel

# Define what you want
class User(BaseModel):
    name: str
    age: int

# Extract it from natural language
client = instructor.from_provider("openai/gpt-4o-mini")

user = client.chat.completions.create(
    response_model=User,
    messages=[{"role": "user", "content": "John is 25 years old"}],
)

print(user)  # User(name='John', age=25)
That's it. No JSON parsing, no manual error handling, no hand-rolled retry loops. Just define a model and get structured data back.
Why Instructor?
Getting structured data from LLMs is hard. You need to:
- Write complex JSON schemas
- Handle validation errors
- Retry failed extractions
- Parse unstructured responses
- Deal with different provider APIs
Instructor handles all of this with one simple interface:
Without Instructor:

import json
import openai

response = openai.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "..."}],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "extract_user",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "name": {"type": "string"},
                        "age": {"type": "integer"},
                    },
                },
            },
        }
    ],
)

# Parse the raw tool call yourself
tool_call = response.choices[0].message.tool_calls[0]
user_data = json.loads(tool_call.function.arguments)

# Validate manually
if "name" not in user_data:
    # Handle error...
    pass

With Instructor:

client = instructor.from_provider("openai/gpt-4")
user = client.chat.completions.create(
    response_model=User,
    messages=[{"role": "user", "content": "..."}],
)
# That's it! user is validated and typed
Install in seconds
pip install instructor
Or with your package manager:
uv add instructor
poetry add instructor
Works with every major provider
Use the same code with any LLM provider:
# OpenAI
client = instructor.from_provider("openai/gpt-4o")
# Anthropic
client = instructor.from_provider("anthropic/claude-3-5-sonnet")
# Google
client = instructor.from_provider("google/gemini-pro")
# Ollama (local)
client = instructor.from_provider("ollama/llama3.2")
# All use the same API!
user = client.chat.completions.create(
    response_model=User,
    messages=[{"role": "user", "content": "..."}],
)
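The same call also works asynchronously. Here is a minimal sketch, assuming `from_provider` accepts the `async_client=True` flag (as in recent releases) and reusing the `User` model from above:

import asyncio
import instructor

# Assumption: async_client=True returns an awaitable client
client = instructor.from_provider("openai/gpt-4o-mini", async_client=True)

async def main() -> None:
    user = await client.chat.completions.create(
        response_model=User,
        messages=[{"role": "user", "content": "John is 25 years old"}],
    )
    print(user)

asyncio.run(main())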
Production-ready features
Automatic retries
Failed validations are automatically retried, with the validation error fed back to the model:
from pydantic import BaseModel, field_validator

class User(BaseModel):
    name: str
    age: int

    @field_validator('age')
    @classmethod
    def validate_age(cls, v):
        if v < 0:
            raise ValueError('Age must be positive')
        return v

# Instructor automatically retries when validation fails
user = client.chat.completions.create(
    response_model=User,
    messages=[{"role": "user", "content": "..."}],
    max_retries=3,
)
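If the model still fails validation after the final retry, the call raises an exception rather than returning bad data. A short sketch, assuming the `InstructorRetryException` class exported by recent Instructor versions:

from instructor.exceptions import InstructorRetryException

try:
    user = client.chat.completions.create(
        response_model=User,
        messages=[{"role": "user", "content": "..."}],
        max_retries=3,
    )
except InstructorRetryException as exc:
    # Retries exhausted: log the last validation error and fall back
    print(exc)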
Streaming support
Stream partial objects as they're generated:
from instructor import Partial

for partial_user in client.chat.completions.create(
    response_model=Partial[User],
    messages=[{"role": "user", "content": "..."}],
    stream=True,
):
    print(partial_user)
    # User(name=None, age=None)
    # User(name="John", age=None)
    # User(name="John", age=25)
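Streaming also works for pulling several objects out of one prompt. A minimal sketch, assuming the documented `Iterable[...]` response-model pattern:

from typing import Iterable

# Each User is yielded as soon as it has been fully generated
users = client.chat.completions.create(
    response_model=Iterable[User],
    messages=[{"role": "user", "content": "John is 25 and Mary is 30"}],
    stream=True,
)

for user in users:
    print(user)
    # e.g. User(name='John', age=25), then User(name='Mary', age=30)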
Nested objects
Extract complex, nested data structures:
from typing import List

class Address(BaseModel):
    street: str
    city: str
    country: str

class User(BaseModel):
    name: str
    age: int
    addresses: List[Address]

# Instructor handles nested objects automatically
user = client.chat.completions.create(
    response_model=User,
    messages=[{"role": "user", "content": "..."}],
)
Used in production by
Trusted by over 100,000 developers and companies building AI applications:
- 3M+ monthly downloads
- 10K+ GitHub stars
- 1000+ community contributors
Companies using Instructor include teams at OpenAI, Google, Microsoft, AWS, and many YC startups.
Get started
Basic extraction
Extract structured data from any text:
from pydantic import BaseModel
import instructor

client = instructor.from_provider("openai/gpt-4o-mini")

class Product(BaseModel):
    name: str
    price: float
    in_stock: bool

product = client.chat.completions.create(
    response_model=Product,
    messages=[{"role": "user", "content": "iPhone 15 Pro, $999, available now"}],
)

print(product)
# Product(name='iPhone 15 Pro', price=999.0, in_stock=True)
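Because the response model is plain Pydantic, `Field` descriptions and `Optional` fields are included in the schema the model sees, which helps steer what gets extracted. A small sketch (the `Invoice` model and prompt here are illustrative, not part of the library):

from typing import Optional
from pydantic import BaseModel, Field

class Invoice(BaseModel):
    vendor: str = Field(description="Company that issued the invoice")
    total: float = Field(description="Grand total in USD")
    po_number: Optional[str] = Field(
        default=None, description="Purchase order number, if one is mentioned"
    )

invoice = client.chat.completions.create(
    response_model=Invoice,
    messages=[{"role": "user", "content": "Acme Corp billed us $1,200; no PO was referenced."}],
)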
Multiple languages
Instructor's simple API is available in many languages:
- Python - The original
- TypeScript - Full TypeScript support
- Ruby - Ruby implementation
- Go - Go implementation
- Elixir - Elixir implementation
- Rust - Rust implementation
Learn more
- Documentation - Comprehensive guides
- Examples - Copy-paste recipes
- Blog - Tutorials and best practices
- Discord - Get help from the community
Why use Instructor over alternatives?
vs Raw JSON mode: Instructor provides automatic validation, retries, streaming, and nested object support. No manual schema writing (see the sketch after this list).
vs LangChain/LlamaIndex: Instructor is focused on one thing - structured extraction. It's lighter, faster, and easier to debug.
vs Custom solutions: Battle-tested by thousands of developers. Handles edge cases you haven't thought of yet.
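For comparison, the raw JSON mode path looks roughly like this. A minimal sketch using the OpenAI SDK's `response_format={"type": "json_object"}` option and the `User` model from above; the schema lives in the prompt and validation is manual:

import json

from openai import OpenAI
from pydantic import ValidationError

openai_client = OpenAI()

resp = openai_client.chat.completions.create(
    model="gpt-4o-mini",
    response_format={"type": "json_object"},
    messages=[
        {"role": "system", "content": 'Reply with JSON like {"name": string, "age": integer}.'},
        {"role": "user", "content": "John is 25 years old"},
    ],
)

try:
    user = User.model_validate(json.loads(resp.choices[0].message.content or ""))
except (json.JSONDecodeError, ValidationError):
    pass  # re-prompt, log, or give up; Instructor automates this loop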
Contributing
We welcome contributions! Check out our good first issues to get started.
License
MIT License - see LICENSE for details.
Built by the Instructor community. Special thanks to Jason Liu and all contributors.
Download files
File details
Details for the file instructor-1.9.0.tar.gz.
File metadata
- Download URL: instructor-1.9.0.tar.gz
- Upload date:
- Size: 69.3 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.7.13
File hashes
Algorithm | Hash digest
---|---
SHA256 | 34fb37084d40130bdde9d8dd27f592cef81b409ded8a8c11ccd81945e52fd9e6
MD5 | 838408e1b474a253bd16ca31147e7011
BLAKE2b-256 | b9bcfcd5e5ba55cfb263277fb31d0ca65ed65e9e0f7a38eb9bc3d405aacb2ba8
File details
Details for the file instructor-1.9.0-py3-none-any.whl.
File metadata
- Download URL: instructor-1.9.0-py3-none-any.whl
- Upload date:
- Size: 94.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.7.13
File hashes
Algorithm | Hash digest
---|---
SHA256 | ab0eda1d20a2135cd15e1ed0928ea6428bee65cdc1ebfd09aba980c8da6389ee
MD5 | d7a58cec899d16d0906c31c1f7529673
BLAKE2b-256 | 7011c29d11567c39b360b61284e7bd6932a6aeb2772ff5088cf9d26b3f160d05