Outformer: Structure Outputs from Language Models

Outformer is a library that enables language models to generate structured outputs. It guarantees valid JSON by having the model generate only the values, while the structure of the output is taken directly from your schema.
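To make the "values only" idea concrete, here is a minimal sketch of the general technique. It is not Outformer's implementation: `fill_schema` and `fake_model` are hypothetical names, and `fake_model` stands in for a real language model call so the example runs on its own.

```python
import json
from typing import Any, Callable

# Hypothetical stand-in for a model call: given a schema node and the path to
# the field, return a value. A real library would query the model here.
ValueFn = Callable[[dict, str], Any]


def fill_schema(schema: dict, generate_value: ValueFn, path: str = "$") -> Any:
    """Walk the schema; build the structure in code, delegate only leaf values."""
    kind = schema["type"]
    if kind == "object":
        # Keys come from the schema, never from the model.
        return {
            key: fill_schema(sub, generate_value, f"{path}.{key}")
            for key, sub in schema["properties"].items()
        }
    if kind == "array":
        # For simplicity this sketch emits exactly minItems elements (at least one).
        count = max(schema.get("minItems", 1), 1)
        return [
            fill_schema(schema["items"], generate_value, f"{path}[{i}]")
            for i in range(count)
        ]
    # Leaf (string, number, boolean): only here is the "model" consulted.
    return generate_value(schema, path)


def fake_model(schema: dict, path: str) -> Any:
    """Toy generator so the example runs without a model."""
    if "enum" in schema:
        return schema["enum"][0]
    defaults = {"string": f"value for {path}", "number": 0.0, "boolean": False}
    return defaults[schema["type"]]


schema = {
    "type": "object",
    "properties": {
        "brand": {"type": "string"},
        "gender": {"type": "string", "enum": ["Female", "Male", "Unisex"]},
        "features": {"type": "array", "minItems": 2, "items": {"type": "string"}},
    },
}

result = fill_schema(schema, fake_model)
text = json.dumps(result, indent=4)  # serializable because the structure is built in code
print(text)
```

Because braces, brackets, keys, and commas never pass through the model, a malformed structure cannot occur in this scheme; only the quality of the individual values depends on the model.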

Features

  • 🔄 Structured Output Generation: Generate valid JSON outputs from language models
  • 🎯 Schema Validation: Ensure outputs conform to your JSON schema
  • 🛠️ Flexible Integration: Works with any Hugging Face transformer model
  • 🚀 Easy to Use: Simple API with minimal configuration
  • 🎨 Value Highlighting: Visualize generated values in your JSON structure

Installation

We recommend Python 3.10+, PyTorch 2.7.0+, and transformers 4.51.3+.

Install via pip

pip install outformer

Install from source

git clone https://github.com/milistu/outformer.git
cd outformer
pip install -e .
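If you want to confirm that your environment meets the recommended versions, a small standard-library check along these lines may help. This is a sketch, not part of Outformer: `parse_version` and `check_environment` are hypothetical helpers, and it assumes the PyPI distribution names `torch` and `transformers`.

```python
import re
import sys
from importlib.metadata import PackageNotFoundError, version

# Minimums taken from the recommendation above.
MINIMUMS = {"torch": (2, 7, 0), "transformers": (4, 51, 3)}


def parse_version(text: str) -> tuple:
    """Turn '2.7.0+cu126' into (2, 7, 0); local and pre-release suffixes are ignored."""
    parts = [int(n) for n in re.findall(r"\d+", text.split("+")[0])[:3]]
    parts += [0] * (3 - len(parts))  # pad '4.51' to (4, 51, 0)
    return tuple(parts)


def check_environment() -> list:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < (3, 10):
        problems.append(
            f"Python {sys.version_info.major}.{sys.version_info.minor} is older than 3.10"
        )
    for name, minimum in MINIMUMS.items():
        try:
            installed = parse_version(version(name))
        except PackageNotFoundError:
            problems.append(f"{name} is not installed")
            continue
        if installed < minimum:
            problems.append(f"{name} {installed} is older than {minimum}")
    return problems


for problem in check_environment():
    print("warning:", problem)
```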

Quick Start

Here's a simple example to get you started:

from outformer import Jsonformer, highlight_values
from transformers import AutoModelForCausalLM, AutoTokenizer

# Initialize model and tokenizer
model_name = "Qwen/Qwen3-1.7B"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Create Jsonformer instance
jsonformer = Jsonformer(model, tokenizer, max_tokens_string=30)

# Define your JSON schema
json_schema = {
    "type": "object",
    "properties": {
        "brand": {
            "type": "string",
            "description": "Brand of the product",
        },
        "model": {
            "type": "string",
            "description": "Model of the product",
        },
        "product_type": {
            "type": "string",
            "description": "Type of the product",
        },
        "gender": {
            "type": "string",
            "enum": ["Female", "Male", "Unisex"],
        },
        "color": {
            "type": "string",
            "description": "Color of the product if specified, otherwise return 'Unknown'",
        },
        "material": {
            "type": "string",
            "description": "Material of the product if specified, otherwise return 'Unknown'",
        },
        "features": {
            "type": "array",
            "minItems": 3,
            "items": {
                "type": "string",
                "description": "Features of the product that may be relevant for the customer. Extract as much as possible.",
            },
        },
    },
}

# Your input prompt
prompt = """
Extract key information from the product description:

adidas Men's Powerlift.3 Cross-Trainer Shoes
A powerful shoe with lockdown fit. Made with an extra-wide design that allows the foot to spread, these men's lifting/weight-training shoes pair a snug-fitting upper with a wide midfoot strap for extra support. A high-density die-cut wedge midsole keeps you close to the ground.
100% Synthetic leather
Imported
Rubber sole
Removable Insole
"""

# Generate structured output
generated_data = jsonformer.generate(schema=json_schema, prompt=prompt)

# Highlight generated values
highlight_values(generated_data)

The code above will generate a structured JSON output and display it with highlighted values. Here's what you'll get:

{
    "brand": "Adidas",
    "model": "Powerlift.3 Cross-Trainer Shoes",
    "product_type": "Cross-Trainer Shoes",
    "gender": "Male",
    "color": "Unknown",
    "material": "Synthetic leather",
    "features": [
        "Lockdown fit",
        "Extra-wide design",
        "High-density die-cut wedge midsole"
    ]
}
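If you want to double-check a result like this against the constraints in your schema, a hand-rolled check using only the standard library could look like the sketch below. `check_output` is a hypothetical helper written for this example; it covers only flat objects with string, enum, and array rules and is not a full JSON Schema validator.

```python
import json


def check_output(data: dict, schema: dict) -> list:
    """Return a list of violations for a flat object schema (partial check)."""
    errors = []
    for key, rule in schema["properties"].items():
        if key not in data:
            errors.append(f"missing key: {key}")
            continue
        value = data[key]
        if rule["type"] == "string":
            if not isinstance(value, str):
                errors.append(f"{key}: expected string")
            elif "enum" in rule and value not in rule["enum"]:
                errors.append(f"{key}: {value!r} not in {rule['enum']}")
        elif rule["type"] == "array":
            if not isinstance(value, list):
                errors.append(f"{key}: expected array")
            elif len(value) < rule.get("minItems", 0):
                errors.append(f"{key}: fewer than {rule['minItems']} items")
    return errors


# Constraints copied from the Quick Start schema (descriptions omitted).
schema = {
    "type": "object",
    "properties": {
        "brand": {"type": "string"},
        "model": {"type": "string"},
        "product_type": {"type": "string"},
        "gender": {"type": "string", "enum": ["Female", "Male", "Unisex"]},
        "color": {"type": "string"},
        "material": {"type": "string"},
        "features": {"type": "array", "minItems": 3, "items": {"type": "string"}},
    },
}

# Values copied from the example output above.
generated = {
    "brand": "Adidas",
    "model": "Powerlift.3 Cross-Trainer Shoes",
    "product_type": "Cross-Trainer Shoes",
    "gender": "Male",
    "color": "Unknown",
    "material": "Synthetic leather",
    "features": [
        "Lockdown fit",
        "Extra-wide design",
        "High-density die-cut wedge midsole",
    ],
}

json.loads(json.dumps(generated))  # raises if the data cannot be serialized as JSON
print(check_output(generated, schema))  # prints []
```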

When using highlight_values(), the output will be displayed in your terminal with the generated values highlighted in color, making it easy to distinguish between the structure and the generated content.
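For intuition, value highlighting of this kind can be approximated with ANSI escape codes. The sketch below is an illustration only and makes no claim about how highlight_values() is implemented; `render` and the chosen color are assumptions.

```python
import json

GREEN = "\033[92m"  # ANSI escape: bright green foreground
RESET = "\033[0m"   # ANSI escape: reset styling


def render(value, indent: int = 0) -> str:
    """Serialize to JSON text, wrapping every leaf value in color codes."""
    pad = " " * indent
    if isinstance(value, dict):
        if not value:
            return "{}"
        items = [
            f"{pad}    {json.dumps(key)}: {render(item, indent + 4)}"
            for key, item in value.items()
        ]
        return "{\n" + ",\n".join(items) + f"\n{pad}}}"
    if isinstance(value, list):
        if not value:
            return "[]"
        items = [f"{pad}    {render(item, indent + 4)}" for item in value]
        return "[\n" + ",\n".join(items) + f"\n{pad}]"
    # Leaf value: this is the part a model would have generated.
    return f"{GREEN}{json.dumps(value)}{RESET}"


sample = {"brand": "Adidas", "features": ["Lockdown fit", "Extra-wide design"]}
print(render(sample))
```

Keys and punctuation stay in the terminal's default color, so only the leaf values stand out.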

Advanced Usage

Configuration Options

The Jsonformer class accepts several configuration parameters:

  • debug (bool): Enable debug mode to show details of the generation process
  • max_array_length (int): Maximum number of elements in an array
  • max_tokens_number (int): Maximum number of tokens for number generation
  • max_tokens_string (int): Maximum number of tokens for string generation
  • temperature (float): Sampling temperature for generation
  • generation_marker (str): Marker for tracking generation position
  • max_attempts (int): Maximum attempts for value generation
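As a sketch of how these options might be combined, the snippet below collects them in one dictionary and passes them to the constructor. It assumes the parameters are accepted as keyword arguments, as max_tokens_string is in the Quick Start. The values are illustrative, not the library's defaults, and `build_jsonformer` is a hypothetical helper. generation_marker is left at its default because its expected format is not described here.

```python
# Illustrative values only; the library's actual defaults are not stated here.
JSONFORMER_OPTIONS = {
    "debug": False,            # set to True to inspect the generation process
    "max_array_length": 10,    # upper bound on elements per array
    "max_tokens_number": 6,    # token budget for each number value
    "max_tokens_string": 30,   # token budget for each string value
    "temperature": 0.7,        # sampling temperature
    "max_attempts": 3,         # attempts per value
}


def build_jsonformer(model, tokenizer, **overrides):
    """Create a Jsonformer with the options above, allowing per-call overrides."""
    # Imported lazily so this module can be loaded without outformer installed.
    from outformer import Jsonformer

    options = {**JSONFORMER_OPTIONS, **overrides}
    return Jsonformer(model, tokenizer, **options)
```

With a model and tokenizer loaded as in the Quick Start, `build_jsonformer(model, tokenizer, debug=True)` would then create an instance with debug mode switched on.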

Supported JSON Schema Features

  • Basic types: string, number, boolean
  • Arrays with min/max items
  • Objects with nested properties
  • Enums for constrained string values
  • Descriptions for better generation context
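The sketch below shows a schema that combines the features listed above, plus a small check that flags any type outside that list. The schema content is invented for illustration, `unsupported_types` is a hypothetical helper, and the use of the standard `maxItems` keyword is an assumption based on the "min/max items" entry.

```python
# Types named in the list above (arrays and objects included).
SUPPORTED_TYPES = {"string", "number", "boolean", "array", "object"}

order_schema = {
    "type": "object",
    "properties": {
        "customer": {  # nested object
            "type": "object",
            "properties": {
                "name": {"type": "string", "description": "Full name of the customer"},
                "is_member": {"type": "boolean"},
            },
        },
        "status": {"type": "string", "enum": ["pending", "shipped", "delivered"]},
        "total": {"type": "number", "description": "Order total in USD"},
        "items": {  # array with min/max items
            "type": "array",
            "minItems": 1,
            "maxItems": 5,
            "items": {"type": "string", "description": "Name of an ordered item"},
        },
    },
}


def unsupported_types(schema: dict, path: str = "$") -> list:
    """List (path, type) pairs whose type is outside SUPPORTED_TYPES."""
    found = []
    kind = schema.get("type")
    if kind not in SUPPORTED_TYPES:
        found.append((path, kind))
    for key, sub in schema.get("properties", {}).items():
        found.extend(unsupported_types(sub, f"{path}.{key}"))
    if "items" in schema:
        found.extend(unsupported_types(schema["items"], f"{path}[]"))
    return found


print(unsupported_types(order_schema))  # prints []
```

Running a check like this before generation surfaces schema nodes that fall outside the documented feature set early, rather than partway through a generation run.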

Contributing

We welcome contributions! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Citing & Authors

The idea for this repository was inspired by jsonformer.

Maintainer: Milutin Studen

Support

If you encounter any issues or have questions, please open an issue on our GitHub repository.
