structured outputs for llm

Project description

Instructor: Structured Outputs for LLMs

Get reliable JSON from any LLM. Built on Pydantic for validation, type safety, and IDE support.

import instructor
from pydantic import BaseModel


# Define what you want
class User(BaseModel):
    name: str
    age: int


# Extract it from natural language
client = instructor.from_provider("openai/gpt-4o-mini")
user = client.chat.completions.create(
    response_model=User,
    messages=[{"role": "user", "content": "John is 25 years old"}],
)

print(user)  # User(name='John', age=25)

That's it. No manual JSON parsing, no hand-written validation, no retry loops. Just define a model and get structured data.

Use Instructor for fast extraction, reach for PydanticAI when you need agents. Instructor keeps schema-first flows simple and cheap. If your app needs richer agent runs, built-in observability, or shareable traces, try PydanticAI. PydanticAI is the official agent runtime from the Pydantic team, adding typed tools, replayable datasets, evals, and production dashboards while using the same Pydantic models. Dive into the PydanticAI docs to see how it extends Instructor-style workflows.

Why Instructor?

Getting structured data from LLMs is hard. You need to:

  1. Write complex JSON schemas
  2. Handle validation errors
  3. Retry failed extractions
  4. Parse unstructured responses
  5. Deal with different provider APIs

Instructor handles all of this with one simple interface:

Without Instructor:

import json

import openai

response = openai.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "..."}],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "extract_user",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "name": {"type": "string"},
                        "age": {"type": "integer"},
                    },
                },
            },
        }
    ],
)

# Parse response
tool_call = response.choices[0].message.tool_calls[0]
user_data = json.loads(tool_call.function.arguments)

# Validate manually
if "name" not in user_data:
    # Handle error...
    pass

With Instructor:

client = instructor.from_provider("openai/gpt-4")

user = client.chat.completions.create(
    response_model=User,
    messages=[{"role": "user", "content": "..."}],
)

# That's it! user is validated and typed
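
The hand-written schema in the "Without Instructor" example does not have to be typed out: Pydantic can derive an equivalent JSON Schema from the model itself. The sketch below uses only Pydantic's `model_json_schema()` (Pydantic v2 is assumed); how Instructor turns that schema into a provider-specific tool definition is not shown here.

```python
from pydantic import BaseModel


class User(BaseModel):
    name: str
    age: int


# Pydantic builds the JSON Schema from the type annotations.
schema = User.model_json_schema()

print(schema["properties"]["age"])  # schema fragment for the `age` field
print(schema["required"])           # fields without defaults are required
```

Because the schema comes from the same class that validates the response, the two cannot drift apart the way a hand-maintained dictionary can.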

Install in seconds

pip install instructor

Or with your package manager:

uv add instructor
poetry add instructor

Works with every major provider

Use the same code with any LLM provider:

# OpenAI
client = instructor.from_provider("openai/gpt-4o")

# Anthropic
client = instructor.from_provider("anthropic/claude-3-5-sonnet")

# Google
client = instructor.from_provider("google/gemini-pro")

# Ollama (local)
client = instructor.from_provider("ollama/llama3.2")

# With API keys directly (no environment variables needed)
client = instructor.from_provider("openai/gpt-4o", api_key="sk-...")
client = instructor.from_provider("anthropic/claude-3-5-sonnet", api_key="sk-ant-...")
client = instructor.from_provider("groq/llama-3.1-8b-instant", api_key="gsk_...")

# All use the same API!
user = client.chat.completions.create(
    response_model=User,
    messages=[{"role": "user", "content": "..."}],
)

Production-ready features

Automatic retries

Failed validations are automatically retried with the error message:

from pydantic import BaseModel, field_validator


class User(BaseModel):
    name: str
    age: int

    @field_validator('age')
    @classmethod
    def validate_age(cls, v: int) -> int:
        # Zero is allowed; only negative ages are rejected
        if v < 0:
            raise ValueError('Age must be non-negative')
        return v


# Instructor automatically retries when validation fails
user = client.chat.completions.create(
    response_model=User,
    messages=[{"role": "user", "content": "..."}],
    max_retries=3,
)
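
To make the retry idea concrete, here is a minimal standard-library sketch of a validate-and-re-ask loop. It is not Instructor's implementation: `fake_llm`, `validate_user`, and `extract_with_retries` are hypothetical names, the model call is stubbed out, and `max_retries` is treated here as the total number of attempts, which is an assumption of this sketch.

```python
import json
from dataclasses import dataclass


@dataclass
class User:
    name: str
    age: int


def validate_user(raw: str) -> User:
    """Parse a JSON string and reject negative ages."""
    data = json.loads(raw)
    user = User(name=str(data["name"]), age=int(data["age"]))
    if user.age < 0:
        raise ValueError("Age must be non-negative")
    return user


def fake_llm(messages: list[dict]) -> str:
    """Hypothetical stand-in for a model call: it answers badly at first,
    then correctly once an error message appears in the conversation."""
    saw_error = any("Validation error" in m["content"] for m in messages)
    if saw_error:
        return '{"name": "John", "age": 25}'
    return '{"name": "John", "age": -25}'


def extract_with_retries(messages: list[dict], max_retries: int = 3) -> tuple[User, int]:
    """Call the model, validate the reply, and re-ask with the error text on failure."""
    messages = list(messages)  # avoid mutating the caller's list
    for attempt in range(1, max_retries + 1):
        raw = fake_llm(messages)
        try:
            return validate_user(raw), attempt
        except (ValueError, KeyError) as exc:
            # Feed the failure back so the next attempt can correct it.
            messages.append({"role": "assistant", "content": raw})
            messages.append({"role": "user", "content": f"Validation error: {exc}"})
    raise RuntimeError(f"no valid response after {max_retries} attempts")


user, attempts = extract_with_retries(
    [{"role": "user", "content": "John is 25 years old"}]
)
print(user, attempts)
```

The key design point is that the validation error text goes back into the conversation, so the model sees why its previous answer was rejected instead of being asked the same question again.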

Streaming support

Stream partial objects as they're generated:

from instructor import Partial

for partial_user in client.chat.completions.create(
    response_model=Partial[User],
    messages=[{"role": "user", "content": "..."}],
    stream=True,
):
    print(partial_user)
    # User(name=None, age=None)
    # User(name="John", age=None)
    # User(name="John", age=25)

Nested objects

Extract complex, nested data structures:

from typing import List


class Address(BaseModel):
    street: str
    city: str
    country: str


class User(BaseModel):
    name: str
    age: int
    addresses: List[Address]


# Instructor handles nested objects automatically
user = client.chat.completions.create(
    response_model=User,
    messages=[{"role": "user", "content": "..."}],
)
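
The nested validation itself comes from Pydantic and can be tried without calling any model. The sketch below assumes Pydantic v2 and uses a made-up payload in place of an LLM response; it shows that nested dictionaries become typed `Address` objects and that errors are reported with the full path to the offending field.

```python
from typing import List

from pydantic import BaseModel, ValidationError


class Address(BaseModel):
    street: str
    city: str
    country: str


class User(BaseModel):
    name: str
    age: int
    addresses: List[Address]


# Illustrative payload standing in for a model's structured output.
payload = {
    "name": "John",
    "age": 25,
    "addresses": [{"street": "1 Main St", "city": "Springfield", "country": "US"}],
}
user = User.model_validate(payload)
print(user.addresses[0].city)

# Missing nested fields are reported with their location inside the structure.
try:
    User.model_validate(
        {"name": "John", "age": 25, "addresses": [{"street": "1 Main St"}]}
    )
except ValidationError as exc:
    error_paths = [err["loc"] for err in exc.errors()]
    print(error_paths)
```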

Used in production by

Trusted by over 100,000 developers and companies building AI applications:

  • 3M+ monthly downloads
  • 10K+ GitHub stars
  • 1000+ community contributors

Companies using Instructor include teams at OpenAI, Google, Microsoft, AWS, and many YC startups.

Get started

Basic extraction

Extract structured data from any text:

from pydantic import BaseModel
import instructor

client = instructor.from_provider("openai/gpt-4o-mini")


class Product(BaseModel):
    name: str
    price: float
    in_stock: bool


product = client.chat.completions.create(
    response_model=Product,
    messages=[{"role": "user", "content": "iPhone 15 Pro, $999, available now"}],
)

print(product)
# Product(name='iPhone 15 Pro', price=999.0, in_stock=True)

Multiple languages

Instructor's simple API is available in many languages:

  • Python - The original
  • TypeScript - Full TypeScript support
  • Ruby - Ruby implementation
  • Go - Go implementation
  • Elixir - Elixir implementation
  • Rust - Rust implementation

Learn more

Why use Instructor over alternatives?

vs Raw JSON mode: Instructor provides automatic validation, retries, streaming, and nested object support. No manual schema writing.

vs LangChain/LlamaIndex: Instructor is focused on one thing - structured extraction. It's lighter, faster, and easier to debug.

vs Custom solutions: Battle-tested by thousands of developers. Handles edge cases you haven't thought of yet.

Contributing

We welcome contributions! Check out our good first issues to get started.

License

MIT License - see LICENSE for details.


Built by the Instructor community. Special thanks to Jason Liu and all contributors.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

instructor-1.14.5.tar.gz (70.0 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

instructor-1.14.5-py3-none-any.whl (177.4 kB view details)

Uploaded Python 3

File details

Details for the file instructor-1.14.5.tar.gz.

File metadata

  • Download URL: instructor-1.14.5.tar.gz
  • Upload date:
  • Size: 70.0 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.27 {"installer":{"name":"uv","version":"0.9.27","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for instructor-1.14.5.tar.gz
Algorithm Hash digest
SHA256 fcb6432867f2fe4a5986e8bf389dcc64ed2ad4039a12a2dff85464e51c2f171a
MD5 d91bc15abd258ef6c395dbe93933038b
BLAKE2b-256 0bef986d059424db204ed57b29d8c07fda35de2a2c72dee8ea7994bc90a6f767

See more details on using hashes here.

File details

Details for the file instructor-1.14.5-py3-none-any.whl.

File metadata

  • Download URL: instructor-1.14.5-py3-none-any.whl
  • Upload date:
  • Size: 177.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.27 {"installer":{"name":"uv","version":"0.9.27","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for instructor-1.14.5-py3-none-any.whl
Algorithm Hash digest
SHA256 2a5a31222b008c0989be1cc001e33a237f49506e80ac5833f6d36d7690bae7b1
MD5 5098d71a8e4c9a3001ac4c942b26c678
BLAKE2b-256 4504e442e1356c97b03a6d30d2b462f7c0bdfbf207e75f6833815fd1225a75b4

See more details on using hashes here.
