Structured outputs for LLMs with 30-70% token savings using TSON optimization
Project description
OpenInstruct
Structured outputs for LLMs with 30-70% token savings
Extract structured data from any LLM. TSON optimization reduces token costs while maintaining type safety.
Why OpenInstruct?
Getting structured data from LLMs is expensive and complex:
# ❌ Without OpenInstruct: Manual JSON, verbose prompts, wasted tokens
response = openai.chat.completions.create(
model="gpt-4",
messages=[{"role": "user", "content": "..."}],
tools=[{
"type": "function",
"function": {
"name": "extract_user",
"parameters": {
"type": "object",
"properties": {
"name": {"type": "string"},
"age": {"type": "integer"},
},
},
},
}],
)
# Parse response manually
tool_call = response.choices[0].message.tool_calls[0]
user_data = json.loads(tool_call.function.arguments)
# Validate manually...
# ✅ With OpenInstruct: Simple, validated, 30-70% fewer tokens with large payload
from openinstruct import OpenInstruct
from pydantic import BaseModel
class User(BaseModel):
name: str
age: int
client = OpenInstruct.from_provider("openai/gpt-4o")
user = client.extract(
response_model=User,
messages=[{"role": "user", "content": "Extract: John, 25 years old"}],
)
# user.name = "John", user.age = 25 ✅ Validated & typed
Install
pip install openinstruct
Token Savings
OpenInstruct uses TSON (Token-efficient Structured Object Notation) to reduce token consumption:
| Format | Tokens | Savings |
|---|---|---|
| JSON | {"name": "Alice", "age": 30} |
- |
| TSON | {@name,age|Alice,30} |
~50% |
For arrays of objects, savings can reach 70%+.
When NOT to Use TSON
Disable TSON optimization (optimize=False and optimize_context=False) in these cases:
| Scenario | Why |
|---|---|
| Small payloads | Overhead outweighs savings for simple objects |
| Debugging | JSON is more readable for troubleshooting |
| Smaller/fine-tuned models | May not understand TSON syntax well |
| Native JSON mode | If using provider's built-in structured output |
| High-stakes extraction | JSON has better LLM reliability |
# Disable TSON for simple extractions
user = client.extract(
response_model=User,
messages=[...],
optimize=False, # Use JSON instead
)
Rule of thumb: Use TSON for large context data and arrays. Use JSON for simple single-object extractions.
Works with Every Major Provider
# OpenAI
client = OpenInstruct.from_provider("openai/gpt-4o")
# Anthropic
client = OpenInstruct.from_provider("anthropic/claude-3-5-sonnet")
# Google Gemini
client = OpenInstruct.from_provider("google/gemini-2.0-flash")
# Groq (fast inference)
client = OpenInstruct.from_provider("groq/llama-3.1-8b-instant")
# Ollama (local)
client = OpenInstruct.from_provider("ollama/llama3.2")
# OpenRouter (multiple providers)
client = OpenInstruct.from_provider("openrouter/openai/gpt-4o-mini")
# With explicit API key
client = OpenInstruct.from_provider("openai/gpt-4o", api_key="sk-...")
| Provider | Environment Variable |
|---|---|
openai |
OPENAI_API_KEY |
anthropic |
ANTHROPIC_API_KEY |
google |
GOOGLE_API_KEY |
groq |
GROQ_API_KEY |
together |
TOGETHER_API_KEY |
mistral |
MISTRAL_API_KEY |
ollama |
None (local) |
openrouter |
OPENROUTER_API_KEY |
Features
Automatic Retries with Backoff
Failed validations are automatically retried:
from openinstruct import OpenInstruct, RetryConfig
config = RetryConfig(
max_retries=3,
retry_delay=0.5, # 0.5s, 1s, 2s delays
backoff_factor=2.0,
on_retry=lambda attempt, error, response: print(f"Retry {attempt}"),
)
user = client.extract(
response_model=User,
messages=[...],
retry_config=config,
)
Token Usage Tracking
Track costs across requests:
result = client.extract(
response_model=User,
messages=[...],
return_usage=True,
)
print(result.data.name) # "Alice"
print(result.usage.total_tokens) # 175
print(result.attempts) # 1
Nested Objects
Extract complex, nested data:
class Address(BaseModel):
city: str
country: str
class UserWithAddress(BaseModel):
name: str
email: str
address: Address
user = client.extract(
response_model=UserWithAddress,
messages=[{"role": "user", "content": "John, john@example.com, NYC, USA"}],
)
# user.address.city = "NYC"
List Extraction
Extract arrays of objects:
users = client.extract(
response_model=list[User],
messages=[{"role": "user", "content": "List 5 random users"}],
)
# Returns list of validated User objects
Input Optimization
Large context data is automatically converted to TSON:
sales_data = [
{"month": "Jan", "revenue": 50000},
{"month": "Feb", "revenue": 62000},
# ... 100 more rows
]
class Analysis(BaseModel):
total_revenue: float
best_month: str
result = client.extract(
response_model=Analysis,
messages=[{"role": "user", "content": "Analyze: {data}"}],
context={"data": sales_data}, # 60% smaller in tokens
)
Async Support
from openinstruct import AsyncOpenInstruct
async def main():
client = AsyncOpenInstruct.from_provider("openai/gpt-4o")
user = await client.extract(
response_model=User,
messages=[...],
)
await client.close()
API Reference
OpenInstruct.from_provider()
client = OpenInstruct.from_provider(
provider_model: str, # "provider/model" format
api_key: str = None, # Optional API key
base_url: str = None, # Custom endpoint
timeout: float = 60.0,
)
client.extract()
result = client.extract(
response_model: Type[T], # Pydantic model or list[Model]
messages: list[dict], # Chat messages
context: dict = None, # Data to inject
optimize: bool = True, # Use TSON for LLM output
optimize_context: bool = True, # Use TSON for context data
retry_config: RetryConfig = None,
return_usage: bool = False,
temperature: float = 0.0,
max_tokens: int = None,
)
Comparison with Instructor
| Feature | OpenInstruct | Instructor |
|---|---|---|
| Token Savings | ✅ 30-70% (TSON) | ❌ JSON only |
| Input Optimization | ✅ Context as TSON | ❌ |
| Multi-Provider | ✅ 8+ providers | ✅ |
| Token Tracking | ✅ Built-in | ❌ |
| Retry with Backoff | ✅ Configurable | ✅ Basic |
| Streaming | 🚧 Coming soon | ✅ |
Contributing
Contributions are welcome! Please see CONTRIBUTING.md for guidelines.
License
MIT License - see LICENSE
Version: 1.1.0
Built for efficiency. Optimized for LLMs.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file openinstruct-1.1.0.tar.gz.
File metadata
- Download URL: openinstruct-1.1.0.tar.gz
- Upload date:
- Size: 18.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.4
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
e1d0d7e40e8bba369c067eb4377d23c9f1c8373a0a854b2f48d4ffbcc3f60cb6
|
|
| MD5 |
0673f3e14ea3a47368d2d3dfba293b12
|
|
| BLAKE2b-256 |
7be124fa54c64899c9ad015d3220045c5d3c3871e03cef27107bff0a6f187aee
|
File details
Details for the file openinstruct-1.1.0-py3-none-any.whl.
File metadata
- Download URL: openinstruct-1.1.0-py3-none-any.whl
- Upload date:
- Size: 23.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.4
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
e9df7d1dcaa965eb0297c98c31612f25edbfa58201c1f1366561c864b71cd1ad
|
|
| MD5 |
50caff0f4a53d330eb6d4943ff32d6d7
|
|
| BLAKE2b-256 |
6737050585013b964701b8b536763b62232997439910570e287bde54c379cc4e
|