CLI for OpenAI Structured Output with Multi-Tool Integration

These details have not been verified by PyPI

Project description

02:07 A.M. CI Failure

"Build failed. Again. The third time this week our data extraction pipeline broke because someone changed the log format, and our regex-based parser couldn't handle the new structure. Sarah's on vacation, Mike's parsing code is unreadable, and the client wants their analytics dashboard working by morning.

There has to be a better way to turn messy data into structured JSON without writing custom parsers for every format change..."

ostruct

ostruct transforms unstructured inputs into structured, usable JSON output using OpenAI APIs with multi-tool integration

The better way you've been looking for.

ostruct-cli

ostruct processes unstructured data (text files, code, CSVs, etc.), input variables, and dynamic prompt templates to produce structured JSON output defined by a JSON schema. With enhanced multi-tool integration, ostruct now supports Code Interpreter for data analysis, File Search for document retrieval, Web Search for real-time information access, and MCP (Model Context Protocol) servers for extended capabilities.

How ostruct works

Why ostruct?

LLMs are powerful, but getting consistent, structured output from them can be challenging. ostruct solves this problem by providing a streamlined approach to transform unstructured data into reliable JSON structures. The motivation behind creating ostruct was to:

Bridge the gap between freeform LLM capabilities and structured data needs in production systems
Simplify integration of AI into existing workflows and applications that expect consistent data formats
Ensure reliability and validate output against a defined schema to avoid unexpected formats or missing data
Reduce development time by providing a standardized way to interact with OpenAI models for structured outputs
Enable non-developers to leverage AI capabilities through a simple CLI interface with templates

Real-World Use Cases

ostruct can be used for various scenarios, including:

Automated Code Review with Multi-Tool Analysis

# Traditional pattern matching
ostruct run prompts/task.j2 schemas/code_review.json -p source "examples/security/*.py"

# Enhanced with Code Interpreter for deeper analysis
ostruct run prompts/task.j2 schemas/code_review.json -fc examples/security/ -fs documentation/

Analyze code for security vulnerabilities, style issues, and performance problems. The enhanced version uses Code Interpreter for execution analysis and File Search for documentation context.

Security Vulnerability Scanning

# Budget-friendly static analysis (recommended for most projects)
ostruct run prompts/static_analysis.j2 schemas/scan_result.json \
  -d code examples -R --sys-file prompts/system.txt

# Professional security analysis with Code Interpreter (best balance)
ostruct run prompts/code_interpreter.j2 schemas/scan_result.json \
  -dc examples --sys-file prompts/system.txt

# Comprehensive hybrid analysis for critical applications
ostruct run prompts/hybrid_analysis.j2 schemas/scan_result.json \
  -d code examples -R -dc examples --sys-file prompts/system.txt

Three optimized approaches for automated security vulnerability scanning:

Static Analysis: $0.18 cost, fast processing, comprehensive vulnerability detection
Code Interpreter: $0.18 cost (same!), superior analysis quality with evidence-based findings
Hybrid Analysis: $0.20 cost (+13%), maximum depth with cross-validation

Each approach finds the same core vulnerabilities but with different levels of detail and analysis quality. Directory-based analysis provides comprehensive project coverage in a single scan.

Data Analysis with Code Interpreter

# Upload data for analysis and visualization
ostruct run analysis.j2 schemas/analysis_result.json \
  -fc sales_data.csv -fc customer_data.json \
  -fs reports/ -ft config.yaml

Perform sophisticated data analysis using Python execution, generate visualizations, and create comprehensive reports with document context.

Configuration Validation & Analysis

# Traditional file comparison
ostruct run prompts/task.j2 schemas/validation_result.json \
  -f dev examples/basic/dev.yaml -f prod examples/basic/prod.yaml

# Enhanced with environment context
ostruct run prompts/task.j2 schemas/validation_result.json \
  -ft dev.yaml -ft prod.yaml -fs infrastructure_docs/

Validate configuration files across environments with documentation context for better analysis and recommendations.

Oh, and also, among endless other use cases:

Etymology Analysis

ostruct run prompts/task.j2 schemas/etymology.json -ft examples/scientific.txt

Break down words into their components, showing their origins, meanings, and hierarchical relationships. Useful for linguistics, educational tools, and understanding terminology in specialized fields.

Features

Core Capabilities

Generate structured JSON output defined by dynamic prompts using OpenAI models and JSON schemas
Rich template system for defining prompts (Jinja2-based)
Automatic token counting and context window management
Streaming support for real-time output
Secure handling of sensitive data with comprehensive path validation
Automatic prompt optimization and token management

Multi-Tool Integration

Code Interpreter: Upload and analyze data files, execute Python code, generate visualizations
File Search: Vector-based document search and retrieval from uploaded files
Web Search: Real-time information retrieval and current data access via OpenAI's web search tool
MCP Servers: Connect to Model Context Protocol servers for extended functionality
Explicit File Routing: Route different files to specific tools for optimized processing

Advanced Features

Configuration System: YAML-based configuration with environment variable support
Unattended Operation: Designed for CI/CD and automation scenarios
Progress Reporting: Real-time progress updates with clear, user-friendly messaging
Model Registry: Dynamic model management with support for latest OpenAI models

Requirements

Python 3.10 or higher

Installation

For Users

To install the latest stable version from PyPI:

pip install ostruct-cli

For Developers

If you plan to contribute to the project, see the Development Setup section below for instructions on setting up the development environment with Poetry.

Environment Variables

ostruct-cli respects the following environment variables:

OPENAI_API_KEY: Your OpenAI API key (required unless provided via command line)
OPENAI_API_BASE: Custom API base URL (optional)
OPENAI_API_VERSION: API version to use (optional)
OPENAI_API_TYPE: API type (e.g., "azure") (optional)
OSTRUCT_DISABLE_REGISTRY_UPDATE_CHECKS: Set to "1", "true", or "yes" to disable automatic registry update checks
MCP_<NAME>_URL: Custom MCP server URLs (e.g., MCP_STRIPE_URL=https://mcp.stripe.com)

Shell Completion Setup (Click to expand)

ostruct-cli supports shell completion for Bash, Zsh, and Fish shells. To enable it:

Bash

Add this to your ~/.bashrc:

eval "$(_OSTRUCT_COMPLETE=bash_source ostruct)"

Zsh

Add this to your ~/.zshrc:

eval "$(_OSTRUCT_COMPLETE=zsh_source ostruct)"

Fish

Add this to your ~/.config/fish/completions/ostruct.fish:

eval (env _OSTRUCT_COMPLETE=fish_source ostruct)

After adding the appropriate line, restart your shell or source the configuration file. Shell completion will help you with:

Command options and their arguments
File paths for template and schema files
Directory paths for -d and --base-dir options
And more!

Enhanced CLI with Multi-Tool Integration

Migration Notice

ostruct now includes powerful multi-tool integration while maintaining full backward compatibility. All existing commands continue to work exactly as before, but you can now take advantage of:

Code Interpreter for data analysis and visualization
File Search for document retrieval
Web Search for real-time information access
MCP Servers for extended functionality
Explicit File Routing for optimized processing

New File Routing Options (Click to expand)

Basic File Routing (Explicit Tool Assignment)

# Template access only (config files, small data)
ostruct run template.j2 schema.json -ft config.yaml

# Code Interpreter (data analysis, code execution)
ostruct run analysis.j2 schema.json -fc data.csv

# File Search (document retrieval)
ostruct run search.j2 schema.json -fs documentation.pdf

# Web Search (real-time information)
ostruct run research.j2 schema.json --web-search -V topic="latest AI developments"

# Multiple tools with one file
ostruct run template.j2 schema.json --file-for code-interpreter shared.json --file-for file-search shared.json

Directory Routing

ostruct provides two directory routing patterns to match different use cases:

Auto-Naming Pattern (for known directory structures):

# Variables are auto-generated from directory contents
ostruct run template.j2 schema.json -dt ./config -dc ./datasets -ds ./docs
# Creates variables like: config_yaml, datasets_csv, docs_pdf (based on actual files)

Alias Pattern (for generic, reusable templates):

# Create stable variable names regardless of directory contents
ostruct run template.j2 schema.json --dta app_config ./config --dca data ./datasets --dsa knowledge ./docs
# Creates stable variables: app_config, data, knowledge (always these names)

When to Use Each Pattern:

Use auto-naming (-dt, -dc, -ds) when your template knows the specific directory structure
Use alias syntax (--dta, --dca, --dsa) when your template is generic and needs stable variable names

Template Example:

{# Works with alias pattern - variables are predictable #}
{% for file in app_config %}
Configuration: {{ file.name }} = {{ file.content }}
{% endfor %}

{# Analysis data from stable variable name #}
{% for file in data %}
Processing: {{ file.path }}
{% endfor %}

This design pattern makes templates reusable across different projects while maintaining full backward compatibility.

MCP Server Integration

# Connect to MCP servers for extended capabilities
ostruct run template.j2 schema.json --mcp-server deepwiki@https://mcp.deepwiki.com/sse

Configuration System

Create an ostruct.yaml file for persistent settings:

models:
  default: gpt-4o

tools:
  code_interpreter:
    auto_download: true
    output_directory: "./output"

mcp:
  custom_server: "https://my-mcp-server.com"

limits:
  max_cost_per_run: 10.00

Load custom configuration:

ostruct --config my-config.yaml run template.j2 schema.json

Get Started Quickly

🚀 New to ostruct? Follow our step-by-step quickstart guide featuring Juno the beagle for a hands-on introduction.

📖 Full Documentation: https://ostruct.readthedocs.io/

Quick Start

Set your OpenAI API key:

export OPENAI_API_KEY=your-api-key

Example 1: Basic Text Extraction (Simplest)

Create a template file extract_person.j2:

Extract information about the person from this text: {{ stdin }}

Create a schema file schema.json:

{
  "type": "object",
  "properties": {
    "person": {
      "type": "object",
      "properties": {
        "name": {"type": "string", "description": "The person's full name"},
        "age": {"type": "integer", "description": "The person's age"},
        "occupation": {"type": "string", "description": "The person's job"}
      },
      "required": ["name", "age", "occupation"],
      "additionalProperties": false
    }
  },
  "required": ["person"],
  "additionalProperties": false
}

Run the CLI:

# Basic usage
echo "John Smith is a 35-year-old software engineer" | ostruct run extract_person.j2 schema.json

# With enhanced options
echo "John Smith is a 35-year-old software engineer" | \
  ostruct run extract_person.j2 schema.json \
  --model gpt-4o \
  --temperature 0.7

Example 2: Multi-Tool Data Analysis

Create an analysis template analysis_template.j2:

Analyze the following data sources:

{% if sales_data_csv is defined %}
Sales Data: {{ sales_data_csv.name }} ({{ sales_data_csv.size }} bytes)
{% endif %}

{% if customer_data_json is defined %}
Customer Data: {{ customer_data_json.name }} ({{ customer_data_json.size }} bytes)
{% endif %}

{% if market_reports_pdf is defined %}
Market Reports: {{ market_reports_pdf.name }} ({{ market_reports_pdf.size }} bytes)
{% endif %}

{% if config_yaml is defined %}
Configuration: {{ config_yaml.content }}
{% endif %}

Provide comprehensive analysis and actionable insights.

Create an analysis schema analysis_schema.json:

{
  "type": "object",
  "properties": {
    "analysis": {
      "type": "object",
      "properties": {
        "insights": {"type": "string", "description": "Key insights from the data"},
        "recommendations": {
          "type": "array",
          "items": {"type": "string"},
          "description": "Actionable recommendations"
        },
        "data_quality": {"type": "string", "description": "Assessment of data quality"}
      },
      "required": ["insights", "recommendations"],
      "additionalProperties": false
    }
  },
  "required": ["analysis"],
  "additionalProperties": false
}

For more complex scenarios, use explicit file routing with flexible syntax options:

# Auto-naming (fastest for one-off analysis)
ostruct run analysis_template.j2 analysis_schema.json \
  -fc sales_data.csv \
  -fc customer_data.json \
  -fs market_reports.pdf \
  -ft config.yaml

# Mixed syntax with custom variable names
ostruct run analysis_template.j2 analysis_schema.json \
  -fc sales_data.csv \
  -fc customers customer_data.json \
  --fsa reports market_reports.pdf \
  --fta app_config config.yaml

# Alias syntax for reusable templates (best tab completion)
ostruct run reusable_analysis.j2 analysis_schema.json \
  --fca sales_data sales_data.csv \
  --fca customer_data customer_data.json \
  --fsa market_reports market_reports.pdf \
  --fta config config.yaml

# Code review with stable variable names
ostruct run code_review.j2 review_schema.json \
  --fca source_code source_code/ \
  --fsa documentation docs/ \
  --fta eslint_config .eslintrc.json

Example 3: Legacy Compatibility

All existing commands continue to work unchanged:

# Traditional usage (fully supported)
ostruct run extract_from_file.j2 schema.json -f text input.txt -d configs
ostruct run template.j2 schema.json -p "*.py" source -V env=prod

System Prompt Handling (Click to expand)

ostruct-cli provides three ways to specify a system prompt, with a clear precedence order:

Command-line option (--sys-prompt or --sys-file):

# Direct string
ostruct run template.j2 schema.json --sys-prompt "You are an expert analyst"

# From file
ostruct run template.j2 schema.json --sys-file system_prompt.txt

Template frontmatter:

---
system_prompt: You are an expert analyst
---
Extract information from: {{ text }}

Shared system prompts (with template frontmatter):

---
include_system: shared/base_analyst.txt
system_prompt: Focus on financial metrics
---
Extract information from: {{ text }}

Default system prompt (built into the CLI)

Precedence Rules

When multiple system prompts are provided, they are resolved in this order:

Command-line options take highest precedence:
- If both --sys-prompt and --sys-file are provided, --sys-prompt wins
- Use --ignore-task-sysprompt to ignore template frontmatter
Template frontmatter is used if:
- No command-line options are provided
- --ignore-task-sysprompt is not set
Default system prompt is used only if no other prompts are provided

Example combining multiple sources:

# Command-line prompt will override template frontmatter
ostruct run template.j2 schema.json --sys-prompt "Override prompt"

# Ignore template frontmatter and use default
ostruct run template.j2 schema.json --ignore-task-sysprompt

Model Registry Management

ostruct-cli maintains a registry of OpenAI models and their capabilities, which includes:

Context window sizes for each model
Maximum output token limits
Supported parameters and their constraints
Model version information

To ensure you're using the latest models and features, you can update the registry:

# Update from the official repository
ostruct update-registry

# Update from a custom URL
ostruct update-registry --url https://example.com/models.yml

# Force an update even if the registry is current
ostruct update-registry --force

This is especially useful when:

New OpenAI models are released
Model capabilities or parameters change
You need to work with custom model configurations

The registry file is stored at ~/.openai_structured/config/models.yml and is automatically referenced when validating model parameters and token limits.

The update command uses HTTP conditional requests (If-Modified-Since headers) to check if the remote registry has changed before downloading, ensuring efficient updates.

Project details

These details have not been verified by PyPI

Release history Release notifications | RSS feed

1.6.1

Aug 2, 2025

1.6.0

Aug 2, 2025

1.5.0

Jul 30, 2025

1.4.1

Jul 21, 2025

1.4.0

Jul 20, 2025

1.3.0

Jul 14, 2025

1.2.0

Jul 6, 2025

1.1.0

Jun 30, 2025

1.0.3

Jun 29, 2025

1.0.2

Jun 28, 2025

1.0.1

Jun 28, 2025

1.0.0

Jun 28, 2025

0.8.29

Jun 15, 2025

0.8.8

Jun 5, 2025

0.8.7

Jun 5, 2025

0.8.4

Jun 5, 2025

0.8.3

Jun 5, 2025

0.8.2

Jun 1, 2025

This version

0.8.1

Jun 1, 2025

0.8.0

Jun 1, 2025

0.7.2

Mar 15, 2025

0.7.1

Mar 15, 2025

0.7.0

Mar 1, 2025

0.6.2

Feb 22, 2025

0.6.1

Feb 18, 2025

0.6.0

Feb 17, 2025

0.5.0

Feb 16, 2025

0.4.0

Feb 8, 2025

0.3.0

Feb 3, 2025

0.2.0

Jan 29, 2025

0.1.4

Jan 29, 2025

0.1.2

Jan 29, 2025

0.1.0

Jan 29, 2025

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ostruct_cli-0.8.1.tar.gz (158.7 kB view details)

Uploaded Jun 1, 2025 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

ostruct_cli-0.8.1-py3-none-any.whl (184.2 kB view details)

Uploaded Jun 1, 2025 Python 3

File details

Details for the file ostruct_cli-0.8.1.tar.gz.

File metadata

Download URL: ostruct_cli-0.8.1.tar.gz
Upload date: Jun 1, 2025
Size: 158.7 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for ostruct_cli-0.8.1.tar.gz
Algorithm	Hash digest
SHA256	`5389a419db8f6bfb28b0c90d4d1d2a235d04ab01f03348d847f3b3f88b4f3042`
MD5	`e96ec20d2448f540e8c0be9e125c3e0c`
BLAKE2b-256	`c5e29dc534318077366a3c8387a713dea87813345d34857b642ed62f01e960ae`

See more details on using hashes here.

Provenance

The following attestation bundles were made for ostruct_cli-0.8.1.tar.gz:

Publisher: publish.yml on yaniv-golan/ostruct

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: ostruct_cli-0.8.1.tar.gz
- Subject digest: 5389a419db8f6bfb28b0c90d4d1d2a235d04ab01f03348d847f3b3f88b4f3042
- Sigstore transparency entry: 226911625
- Sigstore integration time: Jun 1, 2025
Source repository:
- Permalink: yaniv-golan/ostruct@c2bc08a45601da6294ffe8d893265524c653e2f4
- Branch / Tag: refs/tags/v0.8.1
- Owner: https://github.com/yaniv-golan
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@c2bc08a45601da6294ffe8d893265524c653e2f4
- Trigger Event: release

File details

Details for the file ostruct_cli-0.8.1-py3-none-any.whl.

File metadata

Download URL: ostruct_cli-0.8.1-py3-none-any.whl
Upload date: Jun 1, 2025
Size: 184.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for ostruct_cli-0.8.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`f6713a36051c2f66bd920e96dc7b635ec7889f33a4bcfd59fe6f4c564430d7af`
MD5	`a3c31791d0764e591aa3b829573d0f9c`
BLAKE2b-256	`b61fb3bffd092b51adfd53ff17aec89ef9fc2d3dceecffe089ca6c535c6cebf7`

See more details on using hashes here.

Provenance

The following attestation bundles were made for ostruct_cli-0.8.1-py3-none-any.whl:

Publisher: publish.yml on yaniv-golan/ostruct

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: ostruct_cli-0.8.1-py3-none-any.whl
- Subject digest: f6713a36051c2f66bd920e96dc7b635ec7889f33a4bcfd59fe6f4c564430d7af
- Sigstore transparency entry: 226911627
- Sigstore integration time: Jun 1, 2025
Source repository:
- Permalink: yaniv-golan/ostruct@c2bc08a45601da6294ffe8d893265524c653e2f4
- Branch / Tag: refs/tags/v0.8.1
- Owner: https://github.com/yaniv-golan
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@c2bc08a45601da6294ffe8d893265524c653e2f4
- Trigger Event: release

ostruct-cli 0.8.1

Navigation

Verified details

Maintainers

Unverified details

Meta

Classifiers

Project description

02:07 A.M. CI Failure

ostruct-cli

Why ostruct?

Real-World Use Cases

Automated Code Review with Multi-Tool Analysis

Security Vulnerability Scanning

Data Analysis with Code Interpreter

Configuration Validation & Analysis

Etymology Analysis

Features

Core Capabilities

Multi-Tool Integration

Advanced Features

Requirements

Installation

For Users

For Developers

Environment Variables

Bash

Zsh

Fish

Enhanced CLI with Multi-Tool Integration

Migration Notice

Basic File Routing (Explicit Tool Assignment)

Directory Routing

MCP Server Integration

Configuration System

Get Started Quickly

Quick Start

Example 1: Basic Text Extraction (Simplest)

Example 2: Multi-Tool Data Analysis

Example 3: Legacy Compatibility

Precedence Rules

Model Registry Management

Project details

Verified details

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

Provenance

File details

File metadata

File hashes

Provenance