Structured outputs for LLMs with 30-70% token savings using TSON optimization

OpenInstruct

Structured outputs for LLMs with 30-70% token savings

Extract structured data from any LLM. TSON optimization reduces token costs while maintaining type safety.

Python 3.9+ | License: MIT


Why OpenInstruct?

Getting structured data from LLMs is expensive and complex:

# ❌ Without OpenInstruct: Manual JSON, verbose prompts, wasted tokens
import json

import openai

response = openai.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "..."}],
    tools=[{
        "type": "function",
        "function": {
            "name": "extract_user",
            "parameters": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "age": {"type": "integer"},
                },
            },
        },
    }],
)
# Parse response manually
tool_call = response.choices[0].message.tool_calls[0]
user_data = json.loads(tool_call.function.arguments)
# Validate manually...

# ✅ With OpenInstruct: Simple, validated, 30-70% fewer tokens on large payloads
from openinstruct import OpenInstruct
from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int

client = OpenInstruct.from_provider("openai/gpt-4o")
user = client.extract(
    response_model=User,
    messages=[{"role": "user", "content": "Extract: John, 25 years old"}],
)
# user.name = "John", user.age = 25 ✅ Validated & typed

Install

pip install openinstruct

Token Savings

OpenInstruct uses TSON (Token-efficient Structured Object Notation) to reduce token consumption:

Format   Example                         Savings
JSON     {"name": "Alice", "age": 30}    -
TSON     {@name,age|Alice,30}            ~50%

For arrays of objects, savings can reach 70%+.
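
To make the comparison in the table concrete, here is a minimal sketch that renders a flat dict in the same layout as the TSON example and compares character counts with JSON. `to_tson_like` is a hypothetical helper written for illustration, not OpenInstruct's encoder: it handles flat objects only, does no escaping, and counts characters rather than tokens, so the real savings will differ by tokenizer.

```python
import json

def to_tson_like(obj: dict) -> str:
    """Render a flat dict as {@key1,key2|value1,value2}.

    Illustration only: values containing ',' or '|' are not escaped,
    and nested objects or arrays are not handled.
    """
    keys = ",".join(obj.keys())
    values = ",".join(str(v) for v in obj.values())
    return "{@" + keys + "|" + values + "}"

record = {"name": "Alice", "age": 30}
as_json = json.dumps(record)
as_tson = to_tson_like(record)

print(as_json, len(as_json))  # {"name": "Alice", "age": 30} 28
print(as_tson, len(as_tson))  # {@name,age|Alice,30} 20
```

Most of the difference comes from dropping the quotes and the repeated punctuation around each key and value.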

When NOT to Use TSON

Disable TSON optimization (optimize=False and optimize_context=False) in these cases:

Scenario                    Why
Small payloads              Overhead outweighs savings for simple objects
Debugging                   JSON is more readable for troubleshooting
Smaller/fine-tuned models   May not understand TSON syntax well
Native JSON mode            If using provider's built-in structured output
High-stakes extraction      JSON has better LLM reliability

# Disable TSON for simple extractions
user = client.extract(
    response_model=User,
    messages=[...],
    optimize=False,  # Use JSON instead
)

Rule of thumb: Use TSON for large context data and arrays. Use JSON for simple single-object extractions.
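
The rule of thumb can be written as a small helper that decides the `optimize` flags before calling `client.extract`. This is a sketch only: `should_use_tson` and the `min_rows` threshold are hypothetical and not part of OpenInstruct, and a sensible cutoff depends on your model and data.

```python
from typing import Any

def should_use_tson(payload: Any, min_rows: int = 10) -> bool:
    """Return True for lists of objects long enough to benefit from a shared header.

    min_rows is an illustrative threshold, not a value taken from OpenInstruct.
    """
    if isinstance(payload, list):
        return len(payload) >= min_rows and all(isinstance(r, dict) for r in payload)
    return False  # single objects and scalars: plain JSON is fine

single = {"name": "Alice", "age": 30}
rows = [{"month": f"M{i}", "revenue": 1000 * i} for i in range(100)]

print(should_use_tson(single))  # False
print(should_use_tson(rows))    # True
```

The result could then be passed along, for example as `optimize_context=should_use_tson(data)`.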


Works with Every Major Provider

# OpenAI
client = OpenInstruct.from_provider("openai/gpt-4o")

# Anthropic
client = OpenInstruct.from_provider("anthropic/claude-3-5-sonnet")

# Google Gemini
client = OpenInstruct.from_provider("google/gemini-2.0-flash")

# Groq (fast inference)
client = OpenInstruct.from_provider("groq/llama-3.1-8b-instant")

# Ollama (local)
client = OpenInstruct.from_provider("ollama/llama3.2")

# OpenRouter (multiple providers)
client = OpenInstruct.from_provider("openrouter/openai/gpt-4o-mini")

# With explicit API key
client = OpenInstruct.from_provider("openai/gpt-4o", api_key="sk-...")

Provider     Environment Variable
openai       OPENAI_API_KEY
anthropic    ANTHROPIC_API_KEY
google       GOOGLE_API_KEY
groq         GROQ_API_KEY
together     TOGETHER_API_KEY
mistral      MISTRAL_API_KEY
ollama       None (local)
openrouter   OPENROUTER_API_KEY
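
A quick way to catch a missing key before the first request is to check the environment against the table above. The sketch below is not part of OpenInstruct: `required_env_var` and `missing_key` are hypothetical helpers, and splitting the string on the first slash is an assumption that happens to fit `openrouter/openai/gpt-4o-mini`.

```python
import os
from typing import Optional

# Provider prefix -> environment variable, copied from the table above.
ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "google": "GOOGLE_API_KEY",
    "groq": "GROQ_API_KEY",
    "together": "TOGETHER_API_KEY",
    "mistral": "MISTRAL_API_KEY",
    "ollama": None,  # local, no key needed
    "openrouter": "OPENROUTER_API_KEY",
}

def required_env_var(provider_model: str) -> Optional[str]:
    """Return the env var name for a "provider/model" string, or None if no key is needed."""
    provider, _, model = provider_model.partition("/")  # split on the first slash only
    if not model or provider not in ENV_VARS:
        raise ValueError(f"Unrecognized provider string: {provider_model!r}")
    return ENV_VARS[provider]

def missing_key(provider_model: str) -> bool:
    """True when the provider needs a key and the variable is unset or empty."""
    var = required_env_var(provider_model)
    return var is not None and not os.environ.get(var)

print(required_env_var("openrouter/openai/gpt-4o-mini"))  # OPENROUTER_API_KEY
print(missing_key("ollama/llama3.2"))                     # False
```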

Features

Automatic Retries with Backoff

Failed validations are automatically retried:

from openinstruct import OpenInstruct, RetryConfig

config = RetryConfig(
    max_retries=3,
    retry_delay=0.5,      # 0.5s, 1s, 2s delays
    backoff_factor=2.0,
    on_retry=lambda attempt, error, response: print(f"Retry {attempt}"),
)

user = client.extract(
    response_model=User,
    messages=[...],
    retry_config=config,
)
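
The delay schedule in the comment above (0.5s, 1s, 2s) is what plain exponential backoff produces from these settings. The sketch below reproduces that schedule; `backoff_delays` is a hypothetical helper, and the formula is an assumption that matches the comment rather than a statement of how OpenInstruct computes delays internally.

```python
def backoff_delays(max_retries: int, retry_delay: float, backoff_factor: float) -> list[float]:
    """Delay before each retry: retry_delay * backoff_factor ** attempt, attempt = 0, 1, ..."""
    return [retry_delay * backoff_factor ** attempt for attempt in range(max_retries)]

delays = backoff_delays(max_retries=3, retry_delay=0.5, backoff_factor=2.0)
print(delays)       # [0.5, 1.0, 2.0]
print(sum(delays))  # 3.5 seconds of waiting in the worst case
```

Summing the schedule gives a rough upper bound on the time spent sleeping between attempts, which helps when choosing `max_retries` against a request timeout.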

Token Usage Tracking

Track costs across requests:

result = client.extract(
    response_model=User,
    messages=[...],
    return_usage=True,
)

print(result.data.name)              # "Alice"
print(result.usage.total_tokens)     # 175
print(result.attempts)               # 1
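
To track usage across many requests, the per-call values can be summed in a small accumulator. This is a sketch under assumptions: `UsageTotals` is a hypothetical class, it reads only the fields shown in the example above (`usage.total_tokens` and `attempts`), and the results here are stand-in objects rather than real API responses.

```python
from types import SimpleNamespace

class UsageTotals:
    """Running totals over results returned with return_usage=True."""

    def __init__(self) -> None:
        self.requests = 0
        self.total_tokens = 0
        self.attempts = 0

    def add(self, result) -> None:
        # Only fields shown in the example above are read here.
        self.requests += 1
        self.total_tokens += result.usage.total_tokens
        self.attempts += result.attempts

# Stand-ins shaped like the result object above; real code would pass
# the values returned by client.extract(..., return_usage=True).
fake_results = [
    SimpleNamespace(usage=SimpleNamespace(total_tokens=175), attempts=1),
    SimpleNamespace(usage=SimpleNamespace(total_tokens=240), attempts=2),
]

totals = UsageTotals()
for r in fake_results:
    totals.add(r)

print(totals.requests, totals.total_tokens, totals.attempts)  # 2 415 3
```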

Nested Objects

Extract complex, nested data:

class Address(BaseModel):
    city: str
    country: str

class UserWithAddress(BaseModel):
    name: str
    email: str
    address: Address

user = client.extract(
    response_model=UserWithAddress,
    messages=[{"role": "user", "content": "John, john@example.com, NYC, USA"}],
)
# user.address.city = "NYC"

List Extraction

Extract arrays of objects:

users = client.extract(
    response_model=list[User],
    messages=[{"role": "user", "content": "List 5 random users"}],
)
# Returns list of validated User objects

Input Optimization

Large context data is automatically converted to TSON:

sales_data = [
    {"month": "Jan", "revenue": 50000},
    {"month": "Feb", "revenue": 62000},
    # ... 100 more rows
]

class Analysis(BaseModel):
    total_revenue: float
    best_month: str

result = client.extract(
    response_model=Analysis,
    messages=[{"role": "user", "content": "Analyze: {data}"}],
    context={"data": sales_data},  # ~60% fewer tokens than JSON
)
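
Row-oriented data like `sales_data` shrinks so much because JSON repeats every key on every row. The sketch below shows the effect with a plain CSV-style layout that states the keys once. It is a stand-in for illustration, not TSON itself, and it compares characters rather than tokens. `header_once` and the sample data are hypothetical.

```python
import json

def header_once(rows: list[dict]) -> str:
    """Render rows with the keys stated once, then one comma-separated line per row."""
    keys = list(rows[0].keys())
    lines = [",".join(keys)]
    lines += [",".join(str(row[k]) for k in keys) for row in rows]
    return "\n".join(lines)

# Twelve illustrative rows in the same shape as sales_data above.
sales_data = [{"month": f"2024-{m:02d}", "revenue": 50000 + 1000 * m} for m in range(1, 13)]

as_json = json.dumps(sales_data)
compact = header_once(sales_data)
saving = 1 - len(compact) / len(as_json)

print(f"JSON: {len(as_json)} chars, header-once: {len(compact)} chars, saving: {saving:.0%}")
```

The saving grows with the number of rows and with the length of the key names, which is consistent with the advice to reserve the optimization for large arrays.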

Async Support

import asyncio

from openinstruct import AsyncOpenInstruct

async def main():
    client = AsyncOpenInstruct.from_provider("openai/gpt-4o")

    user = await client.extract(
        response_model=User,
        messages=[...],
    )

    await client.close()

asyncio.run(main())
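
The main benefit of an async client is running several extractions concurrently. The sketch below shows the pattern with `asyncio.gather` and a semaphore that caps concurrency. `fake_extract` is a stand-in that sleeps instead of calling an API, and `extract_all` and its `limit` parameter are hypothetical. In real code the stand-in would be replaced by `await client.extract(...)`.

```python
import asyncio

async def fake_extract(text: str) -> dict:
    """Stand-in for `await client.extract(...)`; sleeps instead of calling an API."""
    await asyncio.sleep(0.01)
    name, age = text.split(",")
    return {"name": name.strip(), "age": int(age)}

async def extract_all(texts: list[str], limit: int = 2) -> list[dict]:
    """Run extractions concurrently, with at most `limit` in flight at once."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(text: str) -> dict:
        async with semaphore:
            return await fake_extract(text)

    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(bounded(t) for t in texts))

results = asyncio.run(extract_all(["John, 25", "Alice, 30", "Bob, 41"]))
print(results)
```

Capping concurrency keeps a large batch from running into provider rate limits.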

API Reference

OpenInstruct.from_provider()

client = OpenInstruct.from_provider(
    provider_model: str,    # "provider/model" format
    api_key: str = None,    # Optional API key
    base_url: str = None,   # Custom endpoint
    timeout: float = 60.0,
)

client.extract()

result = client.extract(
    response_model: Type[T],       # Pydantic model or list[Model]
    messages: list[dict],          # Chat messages
    context: dict = None,          # Data to inject
    optimize: bool = True,         # Use TSON for LLM output
    optimize_context: bool = True, # Use TSON for context data
    retry_config: RetryConfig = None,
    return_usage: bool = False,
    temperature: float = 0.0,
    max_tokens: int = None,
)

Comparison with Instructor

Feature              OpenInstruct          Instructor
Token Savings        ✅ 30-70% (TSON)      ❌ JSON only
Input Optimization   ✅ Context as TSON
Multi-Provider       ✅ 8+ providers
Token Tracking       ✅ Built-in
Retry with Backoff   ✅ Configurable       ✅ Basic
Streaming            🚧 Coming soon

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.


License

MIT License - see LICENSE


Version: 1.1.0

Built for efficiency. Optimized for LLMs.
