Logprobs for OpenAI Structured Outputs

structured-logprobs

This Python library enhances OpenAI chat completion responses with detailed information about token log probabilities. It works with OpenAI Structured Outputs, a feature that ensures the model always generates responses that adhere to your supplied JSON Schema, so you don't need to worry about the model omitting a required key or hallucinating an invalid enum value. The library provides utilities to analyze token-level log probabilities and incorporate them into structured outputs, helping developers judge the reliability of structured data extracted from OpenAI models.

Purpose

The primary goal of structured-logprobs is to provide insight into the reliability of extracted data. By analyzing token-level log probabilities, the library helps you:

Understand how likely each token is according to the model's predictions.
Detect low-confidence areas in responses that need further review.
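
As a rough illustration of the second point, the sketch below flags fields whose log probability falls below a cutoff. The input dictionary copies the values printed in the Example section of this page. The helper name flag_low_confidence and the 0.9 probability threshold are assumptions made for illustration; they are not part of the library.

```python
import math


def flag_low_confidence(field_logprobs, min_probability=0.9):
    """Return the names of fields whose probability is below min_probability.

    Hypothetical helper, not part of structured-logprobs.
    """
    flagged = []
    for name, value in field_logprobs.items():
        # A field may hold one log probability or a list of them (array fields).
        values = value if isinstance(value, list) else [value]
        if any(math.exp(v) < min_probability for v in values):
            flagged.append(name)
    return flagged


field_logprobs = {
    "capital_of_France": -5.5122365e-07,
    "the_two_nicest_colors": [-0.0033997903, -0.011364183612649998],
    "die_shows": -0.48048785,
}
print(flag_low_confidence(field_logprobs))  # ['die_shows']
```

Only the die roll is flagged, which matches intuition: the model has no basis for preferring one face of a die over another.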

Prerequisites

Before using this library, you should be familiar with:

The OpenAI API and its Python client.
The concept of log probabilities, a measure of the likelihood the model assigns to each token.
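
For readers new to log probabilities, here is a minimal sketch of how they relate to ordinary probabilities. It assumes natural logarithms, which is what the OpenAI API returns. The function names are chosen for this illustration and do not come from the library.

```python
import math


def logprob_to_probability(logprob):
    """Convert a natural-log probability to a linear probability in [0, 1]."""
    return math.exp(logprob)


def sequence_logprob(token_logprobs):
    """Log probability of a token sequence: the sum of its token log probabilities."""
    return sum(token_logprobs)


# A log probability near 0 means the model was almost certain of the token;
# more negative values mean lower confidence.
print(round(logprob_to_probability(-5.5122365e-07), 3))
print(round(logprob_to_probability(-0.48048785), 3))
```

Summing log probabilities corresponds to multiplying the underlying probabilities, which is why a value spanning several tokens can be scored by adding up its token log probabilities.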

Key Features

The module contains a function for mapping characters to token indices (map_characters_to_token_indices) and two methods for incorporating log probabilities:

Adding log probabilities as a separate field in the response (add_logprobs).
Embedding log probabilities inline within the message content (add_logprobs_inline).
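
To make the idea of mapping characters to token indices concrete, the sketch below builds such a mapping from a list of token strings. It is an illustrative stand-in: the library's map_characters_to_token_indices may take different arguments or behave differently, and the tokenization shown is hand-made rather than real model output.

```python
def map_characters_to_tokens(tokens):
    """Return a list whose i-th entry is the index of the token covering character i.

    Hypothetical helper written for this illustration.
    """
    mapping = []
    for token_index, token in enumerate(tokens):
        # Every character of this token points back to the same token index.
        mapping.extend([token_index] * len(token))
    return mapping


# Hand-made tokenization of a small JSON string (an assumption, not real output).
tokens = ['{"', 'capital', '":"', 'Paris', '"}']
text = "".join(tokens)
mapping = map_characters_to_tokens(tokens)

# Find which token(s) produced the value "Paris".
start = text.index("Paris")
print(mapping[start:start + len("Paris")])  # [3, 3, 3, 3, 3]
```

With a mapping like this, the character span of a JSON value located by a parser can be traced back to the tokens that produced it, and from there to their log probabilities.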

Example

To use this library, first create a chat completion response with the OpenAI Python SDK, then enhance the response with log probabilities. Here is an example of how to do that:

import json

from openai import OpenAI
from openai.types import ResponseFormatJSONSchema
from structured_logprobs import add_logprobs, add_logprobs_inline

# Initialize the OpenAI client
client = OpenAI(api_key="your-api-key")

# Load the JSON schema that describes the expected response
schema_path = "path-to-your-json-schema"
with open(schema_path) as f:
    schema_content = json.load(f)

# Validate the schema content
response_schema = ResponseFormatJSONSchema.model_validate(schema_content)

# Create a chat completion request; logprobs=True asks the API to return
# token log probabilities with the response
completion = client.chat.completions.create(
    model="gpt-4o-2024-08-06",
    messages=[
        {
            "role": "system",
            "content": (
                "I have three questions. The first question is: What is the capital of France? "
                "The second question is: Which are the two nicest colors? "
                "The third question is: Can you roll a die and tell me which number comes up?"
            ),
        }
    ],
    logprobs=True,
    response_format=response_schema.model_dump(by_alias=True),
)

# Add log probabilities as a separate field, or inline in the message content
chat_completion = add_logprobs(completion)
chat_completion_inline = add_logprobs_inline(completion)

print(chat_completion.log_probs[0])
# {'capital_of_France': -5.5122365e-07, 'the_two_nicest_colors': [-0.0033997903, -0.011364183612649998], 'die_shows': -0.48048785}

print(chat_completion_inline.choices[0].message.content)
# {"capital_of_France": "Paris", "capital_of_France_logprob": -6.704273e-07, "the_two_nicest_colors": ["blue", "green"], "die_shows": 5.0, "die_shows_logprob": -2.3782086}

Example JSON Schema

The response_format parameter in the request body is an object specifying the format that the model must output. Setting it to { "type": "json_schema", "json_schema": {...} } ensures the model's output matches your supplied JSON schema.

Below is an example of a JSON file that defines the schema the responses must follow.

{
    "type": "json_schema",
    "json_schema": {
        "name": "answers",
        "description": "Response to questions in JSON format",
        "schema": {
            "type": "object",
            "properties": {
                "capital_of_France": { "type": "string" },
                "the_two_nicest_colors": {
                    "type": "array",
                    "items": {
                        "type": "string",
                        "enum": ["red", "blue", "green", "yellow", "purple"]
                    }
                },
                "die_shows": { "type": "number" }
            },
            "required": ["capital_of_France", "the_two_nicest_colors", "die_shows"],
            "additionalProperties": false
        },
        "strict": true
    }
}
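
To show what these constraints mean in practice, here is a small sketch that checks a response string against the required keys, the additionalProperties flag, and the enum values of array items. Structured Outputs already enforces the schema on the API side, so this is only an illustration. The function check_response is hypothetical and is not a full JSON Schema validator.

```python
import json

# The inner "schema" object from the example above.
schema = {
    "type": "object",
    "properties": {
        "capital_of_France": {"type": "string"},
        "the_two_nicest_colors": {
            "type": "array",
            "items": {
                "type": "string",
                "enum": ["red", "blue", "green", "yellow", "purple"],
            },
        },
        "die_shows": {"type": "number"},
    },
    "required": ["capital_of_France", "the_two_nicest_colors", "die_shows"],
    "additionalProperties": False,
}


def check_response(content, schema):
    """Return a list of problems found in a JSON response string.

    Hypothetical helper covering only required keys, extra keys, and
    enum-constrained array items.
    """
    data = json.loads(content)
    problems = []
    for key in schema["required"]:
        if key not in data:
            problems.append(f"missing key: {key}")
    if schema.get("additionalProperties") is False:
        for key in data:
            if key not in schema["properties"]:
                problems.append(f"unexpected key: {key}")
    for key, spec in schema["properties"].items():
        allowed = spec.get("items", {}).get("enum")
        if allowed is not None and isinstance(data.get(key), list):
            for item in data[key]:
                if item not in allowed:
                    problems.append(f"invalid value for {key}: {item}")
    return problems


good = '{"capital_of_France": "Paris", "the_two_nicest_colors": ["blue", "green"], "die_shows": 5.0}'
bad = '{"capital_of_France": "Paris", "the_two_nicest_colors": ["pink"]}'
print(check_response(good, schema))  # []
print(check_response(bad, schema))
```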

Release history

0.1.5 (Jan 24, 2025)
0.1.4 (Jan 14, 2025)
0.1.3 (Jan 14, 2025)
0.1.2 (Jan 14, 2025)
0.1.1 (Jan 14, 2025)
0.1.0 (Jan 14, 2025)
0.0.2 (Jan 10, 2025)
0.0.1 (Jan 10, 2025), this version

Download files

Source Distribution

structured_logprobs-0.0.1.tar.gz (8.9 kB)

Uploaded Jan 10, 2025 Source

Built Distribution

structured_logprobs-0.0.1-py3-none-any.whl (7.4 kB)

Uploaded Jan 10, 2025 Python 3

File details

Details for the file structured_logprobs-0.0.1.tar.gz.

File metadata

Download URL: structured_logprobs-0.0.1.tar.gz
Upload date: Jan 10, 2025
Size: 8.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.0.1 CPython/3.12.8

File hashes

Hashes for structured_logprobs-0.0.1.tar.gz
Algorithm	Hash digest
SHA256	`b341c1a5578f7d55c9db7204a20790b494d121eb5cdd4b787c77b0f8b0359a62`
MD5	`da03307ab61a456bd4519d5f84d85fb5`
BLAKE2b-256	`a5e033dc45147c105f4e49da952bac1ec80424987833e59e60202392462bd41c`

File details

Details for the file structured_logprobs-0.0.1-py3-none-any.whl.

File metadata

Download URL: structured_logprobs-0.0.1-py3-none-any.whl
Upload date: Jan 10, 2025
Size: 7.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.0.1 CPython/3.12.8

File hashes

Hashes for structured_logprobs-0.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`3ef7a5d6f5090f68f190bfae78a66f386e255e9537d4e4890cd414728c8d6457`
MD5	`227aaa242453cf112684cc10089cc4ac`
BLAKE2b-256	`df42179bfde8f2d41d5ffc4017b81f24fc400fef8faaff2cc3cc3277ac5a6c54`

structured-logprobs 0.0.1
