hiddenlayer-openai-guardrails

Guardrails for the OpenAI Agents SDK

These details have not been verified by PyPI

Project description

HiddenLayer Guardrails for OpenAI Agents (Beta)

Drop-in replacement for the Agents SDK Agent that wires HiddenLayer guardrails into agent and tool execution. Agent input/output and every tool call are sent through HiddenLayer's analyze endpoint so prompt-injection and policy violations are caught automatically.

Note: OpenAI's native guardrails only support blocking content—they do not support redacting sensitive information from input or output. This library provides redaction capabilities through HiddenLayer's REDACT action, allowing you to sanitize content while still allowing the request to proceed.

Installation

pip install hiddenlayer-openai-guardrails

Configuration

Environment Variables

The following environment variables must be set for authentication:

HIDDENLAYER_CLIENT_ID - HiddenLayer API client ID (required)
HIDDENLAYER_CLIENT_SECRET - HiddenLayer API client secret (required)

Optional environment variables:

HIDDENLAYER_PROJECT_ID - HiddenLayer project ID for policy routing
HIDDENLAYER_REQUESTER_ID - Identifier for tracking requests (default: "hiddenlayer-openai-integration")

# Required
export HIDDENLAYER_CLIENT_ID="your-client-id"
export HIDDENLAYER_CLIENT_SECRET="your-client-secret"

# Optional
export HIDDENLAYER_PROJECT_ID="your-project-id"
export HIDDENLAYER_REQUESTER_ID="your-app-name"

HiddenLayerParams

Configure HiddenLayer behavior using the HiddenLayerParams object:

from hiddenlayer_openai_guardrails import HiddenLayerParams

params = HiddenLayerParams(
    project_id="my-project",       # Optional: HiddenLayer project ID for policy routing
    model="gpt-4o-mini",            # Optional: Model name for tracking (auto-detected from agent if not set)
    requester_id="my-app-v1",      # Optional: Identifier for tracking requests (default: "hiddenlayer-openai-integration")
)

All fields are optional. If model is not provided, it will be automatically detected from the agent's model configuration.

Usage

Basic Agent with Guardrails

The Agent class mirrors agents.Agent but adds HiddenLayer guardrails to the agent and all tools. Guardrails automatically block malicious content:

from agents import Runner, function_tool
from agents.run import RunConfig
from hiddenlayer_openai_guardrails import Agent, HiddenLayerParams


@function_tool
def get_weather(city: str) -> str:
    """returns weather info for the specified city."""
    return f"The weather in {city} is sunny"


# Configure HiddenLayer parameters
params = HiddenLayerParams(
    project_id="my-project",  # optional: for policy routing
)

agent = Agent(
    name="Haiku agent",
    instructions="Always respond in haiku form",
    model="gpt-4o-mini",
    tools=[get_weather],  # tool input/output are screened by HiddenLayer
    hiddenlayer_params=params,  # optional: defaults will be used if not provided
)

result = Runner.run_sync(
    agent,
    "What's the weather in Toronto",
    run_config=RunConfig(tracing_disabled=True),
)
print(result.final_output)

Redacting Input and Output

Since OpenAI's guardrails can only block (not redact), this library provides helper functions for content redaction:

from agents import Runner
from hiddenlayer_openai_guardrails import (
    Agent,
    HiddenLayerParams,
    redact_input,
    redact_output,
    InputBlockedError,
    OutputBlockedError,
)

# Configure HiddenLayer parameters
params = HiddenLayerParams(project_id="my-project")

agent = Agent(
    name="Assistant",
    instructions="You are a helpful assistant.",
    hiddenlayer_params=params,
)

try:
    # Redact sensitive info from user input before processing
    safe_input = await redact_input(
        user_input,
        hiddenlayer_params=params,
    )

    # Run agent (guardrails will block malicious content)
    result = await Runner.run(agent, safe_input)

    # Redact sensitive info from output before showing to user
    safe_output = await redact_output(
        result.final_output,
        hiddenlayer_params=params,
    )
    print(safe_output)

except InputBlockedError:
    print("Input was blocked by HiddenLayer")
except OutputBlockedError:
    print("Output was blocked by HiddenLayer")

Safe Streaming Output

For streaming responses, use safe_stream to stream event objects while scanning the final output through HiddenLayer guardrails:

from agents import Runner
from hiddenlayer_openai_guardrails import Agent, HiddenLayerParams, safe_stream

# Configure HiddenLayer parameters
params = HiddenLayerParams(project_id="my-project")

agent = Agent(
    name="Assistant",
    instructions="Help users",
    hiddenlayer_params=params,
)
result = Runner.run_streamed(agent, user_input)

async for event in safe_stream(result, hiddenlayer_params=params):
    # event is an Agents SDK stream event (not plain text)
    print(event)

If HiddenLayer returns a BLOCK action for the final streamed output, safe_stream raises OutputBlockedError after streaming completes.

MCP Server Tools

When using MCP servers with the Agents SDK, HiddenLayer guardrails are automatically applied to dynamically discovered MCP tools:

from agents import Runner
from agents.mcp import MCPServerStreamableHttp
from agents.run import RunConfig
from hiddenlayer_openai_guardrails import Agent, HiddenLayerParams

servers = [
    MCPServerStreamableHttp(name="calculator", params={"url": "http://localhost:8000/mcp"}),
]

agent = Agent(
    name="Math agent",
    instructions="Use the calculator to answer math questions.",
    model="gpt-4o-mini",
    hiddenlayer_params=HiddenLayerParams(project_id="my-project"),
    mcp_servers=servers,
)

result = await Runner.run(
    agent,
    "What is 2 + 2?",
    run_config=RunConfig(tracing_disabled=True),
)
print(result.final_output)

MCP tool definitions are scanned through HiddenLayer at discovery time. Tools that violate policy are blocked and excluded from the agent (fail-closed). Scan results are cached per tool so repeated get_mcp_tools() calls don't re-scan the same definitions.

How it works

hiddenlayer_openai_guardrails.agents.Agent returns a regular agents.Agent configured with:
- Agent-level input/output guardrails that analyze user and assistant messages.
- Tool-level guardrails that inspect tool arguments before execution and tool output afterward.
Guardrails rely on AsyncHiddenLayer.interactions.analyze and will raise when HiddenLayer signals a blocking action.
Input guardrails scan one message per request, in order, and skip already-seen messages for the same conversation thread.
Tool guardrails currently enforce block-only behavior; REDACT actions are not applied in tool hooks.
Strict phase mapping is used: model-produced tool arguments scan as output, and tool results destined for the model scan as input.
MCP tools are scanned at discovery time — tool definitions that violate policy are excluded (fail-closed). Allowed tools receive the same input/output guardrails as regular tools. Per-tool scan results are cached so repeated discovery calls are efficient.

Development

# Install dependencies (uses uv)
uv sync

# Run unit tests (no network, mocked dependencies)
pytest tests -m unit

# Run live integration tests (requires credentials and OPENAI_API_KEY)
RUN_LIVE_INTEGRATION_TESTS=1 pytest tests -m integration

# Run all tests
pytest tests

Public API lives in src/hiddenlayer_openai_guardrails/agents.py (facade); implementation is split across:
- src/hiddenlayer_openai_guardrails/_hiddenlayer.py (HiddenLayer client + analyze calls)
- src/hiddenlayer_openai_guardrails/_guardrails.py (agent/tool/MCP guardrail wiring)
- src/hiddenlayer_openai_guardrails/_normalize.py (payload normalization)
- src/hiddenlayer_openai_guardrails/_analysis.py (response parsing)
- src/hiddenlayer_openai_guardrails/_redaction.py and src/hiddenlayer_openai_guardrails/_streaming.py
Tests are in tests/test_agents_unit.py and tests/test_agents_integration.py.

Project details

These details have not been verified by PyPI

Release history Release notifications | RSS feed

This version

0.4.0

Apr 8, 2026

0.3.0

Feb 27, 2026

0.2.0

Feb 10, 2026

0.1.0

Jan 13, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

hiddenlayer_openai_guardrails-0.4.0.tar.gz (9.3 kB view details)

Uploaded Apr 8, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

hiddenlayer_openai_guardrails-0.4.0-py3-none-any.whl (11.6 kB view details)

Uploaded Apr 8, 2026 Python 3

File details

Details for the file hiddenlayer_openai_guardrails-0.4.0.tar.gz.

File metadata

Download URL: hiddenlayer_openai_guardrails-0.4.0.tar.gz
Upload date: Apr 8, 2026
Size: 9.3 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for hiddenlayer_openai_guardrails-0.4.0.tar.gz
Algorithm	Hash digest
SHA256	`8b403f9e36887a5c996946c8e1db04c2484daff30cdd2afa6be6f5bd752ad0ad`
MD5	`c9da28ba62bea610f00f338765cc8784`
BLAKE2b-256	`d362fb34b1b368f7b9a869c1d01c34c14c7e5cc8d60d233782f2b0796df07b56`

See more details on using hashes here.

Provenance

The following attestation bundles were made for hiddenlayer_openai_guardrails-0.4.0.tar.gz:

Publisher: publish.yml on hiddenlayerai/hiddenlayer-openai-guardrails

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: hiddenlayer_openai_guardrails-0.4.0.tar.gz
- Subject digest: 8b403f9e36887a5c996946c8e1db04c2484daff30cdd2afa6be6f5bd752ad0ad
- Sigstore transparency entry: 1254459346
- Sigstore integration time: Apr 8, 2026
Source repository:
- Permalink: hiddenlayerai/hiddenlayer-openai-guardrails@b23863603b7b24aa7cf9bfad42c4201962b4c8c4
- Branch / Tag: refs/tags/v0.4.0
- Owner: https://github.com/hiddenlayerai
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@b23863603b7b24aa7cf9bfad42c4201962b4c8c4
- Trigger Event: release

File details

Details for the file hiddenlayer_openai_guardrails-0.4.0-py3-none-any.whl.

File metadata

Download URL: hiddenlayer_openai_guardrails-0.4.0-py3-none-any.whl
Upload date: Apr 8, 2026
Size: 11.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for hiddenlayer_openai_guardrails-0.4.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`e587703ef30270f575e5a9c6f2132ed7818e41cb0d974d286de303e465447a8f`
MD5	`c51f0c7a9f160e296f644e59d33c48ce`
BLAKE2b-256	`9b7d14c88928558e3675886fe8cacc9323b32e65bef092c967489ae13976013a`

See more details on using hashes here.

Provenance

The following attestation bundles were made for hiddenlayer_openai_guardrails-0.4.0-py3-none-any.whl:

Publisher: publish.yml on hiddenlayerai/hiddenlayer-openai-guardrails

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: hiddenlayer_openai_guardrails-0.4.0-py3-none-any.whl
- Subject digest: e587703ef30270f575e5a9c6f2132ed7818e41cb0d974d286de303e465447a8f
- Sigstore transparency entry: 1254459409
- Sigstore integration time: Apr 8, 2026
Source repository:
- Permalink: hiddenlayerai/hiddenlayer-openai-guardrails@b23863603b7b24aa7cf9bfad42c4201962b4c8c4
- Branch / Tag: refs/tags/v0.4.0
- Owner: https://github.com/hiddenlayerai
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@b23863603b7b24aa7cf9bfad42c4201962b4c8c4
- Trigger Event: release

hiddenlayer-openai-guardrails 0.4.0

Navigation

Verified details

Maintainers

Unverified details

Meta

Classifiers

Project description

HiddenLayer Guardrails for OpenAI Agents (Beta)

Installation

Configuration

Environment Variables

HiddenLayerParams

Usage

Basic Agent with Guardrails

Redacting Input and Output

Safe Streaming Output

MCP Server Tools

How it works

Development

Project details

Verified details

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

Provenance

File details

File metadata

File hashes

Provenance