
OpenAI Guardrails: A framework for building safe and reliable AI systems.


OpenAI Guardrails: Python (Preview)

This is the Python version of OpenAI Guardrails, a package for adding configurable safety and compliance guardrails to LLM applications. It provides a drop-in wrapper for OpenAI's Python client, enabling automatic input/output validation and moderation using a wide range of guardrails.

Most users can simply follow the guided configuration and installation instructions at guardrails.openai.com.

[Screenshot: OpenAI Guardrails configuration]

Installation

Install the openai-guardrails package from PyPI:

pip install openai-guardrails

Usage

Follow the configuration and installation instructions at guardrails.openai.com.

Local Development

Clone the repository and install locally:

# Clone the repository
git clone https://github.com/openai/openai-guardrails-python.git
cd openai-guardrails-python

# Install the package (editable), plus example extras if desired
pip install -e .
pip install -e ".[examples]"

Integration Details

Drop-in OpenAI Replacement

The easiest way to use Guardrails Python is as a drop-in replacement for the OpenAI client:

from pathlib import Path
from guardrails import GuardrailsOpenAI, GuardrailTripwireTriggered

# Use GuardrailsOpenAI instead of OpenAI
client = GuardrailsOpenAI(config=Path("guardrail_config.json"))

try:
    # Works with standard Chat Completions
    chat = client.chat.completions.create(
        model="gpt-5",
        messages=[{"role": "user", "content": "Hello world"}],
    )
    print(chat.choices[0].message.content)

    # Or with the Responses API
    resp = client.responses.create(
        model="gpt-5",
        input="What are the main features of your premium plan?",
    )
    print(resp.output_text)
except GuardrailTripwireTriggered as e:
    print(f"Guardrail triggered: {e}")
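Beyond printing the exception, many applications want to degrade gracefully when a tripwire fires, returning a canned refusal instead of crashing. The sketch below shows that pattern in plain Python; `StubTripwire` is a stand-in exception so the snippet runs without the package installed, and in real code you would catch `GuardrailTripwireTriggered` instead.

```python
# Graceful-fallback pattern around a guardrailed call.
# StubTripwire stands in for guardrails.GuardrailTripwireTriggered.

class StubTripwire(Exception):
    """Stand-in for GuardrailTripwireTriggered."""

def complete_with_fallback(call, fallback="Sorry, I can't help with that."):
    """Return the model's reply, or a canned refusal if a guardrail trips."""
    try:
        return call()
    except StubTripwire:
        return fallback

# A call that always trips, to exercise the fallback path.
def tripping_call():
    raise StubTripwire("Moderation")

print(complete_with_fallback(tripping_call))  # prints the fallback message
print(complete_with_fallback(lambda: "Hi there!"))  # prints "Hi there!"
```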

Agents SDK Integration

You can integrate guardrails with the OpenAI Agents SDK via GuardrailAgent:

import asyncio
from pathlib import Path
from agents import InputGuardrailTripwireTriggered, OutputGuardrailTripwireTriggered, Runner
from agents.run import RunConfig
from guardrails import GuardrailAgent

# Create agent with guardrails automatically configured
agent = GuardrailAgent(
    config=Path("guardrails_config.json"),
    name="Customer support agent",
    instructions="You are a customer support agent. You help customers with their questions.",
)

async def main():
    try:
        result = await Runner.run(agent, "Hello, can you help me?", run_config=RunConfig(tracing_disabled=True))
        print(result.final_output)
    except (InputGuardrailTripwireTriggered, OutputGuardrailTripwireTriggered):
        print("🛑 Guardrail triggered!")

if __name__ == "__main__":
    asyncio.run(main())

For more details, see docs/agents_sdk_integration.md.

Evaluation Framework

Evaluate guardrail performance on labeled datasets and run benchmarks.

Running Evaluations

# Basic evaluation
python -m guardrails.evals.guardrail_evals \
  --config-path guardrails_config.json \
  --dataset-path data.jsonl

# Benchmark mode (compare models, generate ROC curves, latency)
python -m guardrails.evals.guardrail_evals \
  --config-path guardrails_config.json \
  --dataset-path data.jsonl \
  --mode benchmark \
  --models gpt-5 gpt-5-mini gpt-4.1-mini

Dataset Format

Datasets must be in JSONL format, with each line containing a JSON object:

{
  "id": "sample_1",
  "data": "Text or conversation to evaluate",
  "expected_triggers": {
    "Moderation": true,
    "NSFW Text": false
  }
}
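Before running an evaluation, it can help to sanity-check the dataset file. The snippet below is a small, hypothetical validator (not part of the package) that enforces the three fields shown above using only the standard library:

```python
import json

REQUIRED_KEYS = {"id", "data", "expected_triggers"}

def validate_jsonl_line(line: str) -> dict:
    """Parse one dataset line and check it has the fields described above."""
    sample = json.loads(line)
    missing = REQUIRED_KEYS - sample.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    if not all(isinstance(v, bool) for v in sample["expected_triggers"].values()):
        raise ValueError("expected_triggers values must be booleans")
    return sample

line = '{"id": "sample_1", "data": "Hello", "expected_triggers": {"Moderation": false}}'
print(validate_jsonl_line(line)["id"])  # sample_1
```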

Programmatic Usage

import asyncio
from pathlib import Path

from guardrails.evals.guardrail_evals import GuardrailEval

# "evaluation" avoids shadowing Python's built-in eval()
evaluation = GuardrailEval(
    config_path=Path("guardrails_config.json"),
    dataset_path=Path("data.jsonl"),
    batch_size=32,
    output_dir=Path("results"),
)

asyncio.run(evaluation.run())
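The evaluation compares each guardrail's observed triggers against the expected_triggers labels in the dataset. The exact report format is defined by the framework, but the underlying per-guardrail precision/recall arithmetic looks like the following sketch (`per_guardrail_metrics` is illustrative, not a package API):

```python
from collections import defaultdict

def per_guardrail_metrics(samples):
    """Compute precision/recall per guardrail.

    samples: list of (expected_triggers, observed_triggers) dict pairs,
    each mapping guardrail name -> bool.
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
    for expected, observed in samples:
        for name, exp in expected.items():
            obs = observed.get(name, False)
            c = counts[name]
            if exp and obs:
                c["tp"] += 1        # correctly triggered
            elif not exp and obs:
                c["fp"] += 1        # triggered when it shouldn't have
            elif exp and not obs:
                c["fn"] += 1        # missed a case it should have caught
    return {
        name: {
            "precision": c["tp"] / (c["tp"] + c["fp"]) if c["tp"] + c["fp"] else 0.0,
            "recall": c["tp"] / (c["tp"] + c["fn"]) if c["tp"] + c["fn"] else 0.0,
        }
        for name, c in counts.items()
    }

samples = [
    ({"Moderation": True}, {"Moderation": True}),    # true positive
    ({"Moderation": False}, {"Moderation": True}),   # false positive
    ({"Moderation": True}, {"Moderation": False}),   # false negative
]
print(per_guardrail_metrics(samples))  # Moderation: precision 0.5, recall 0.5
```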

Project Structure

  • src/guardrails/ - Python source code
  • src/guardrails/checks/ - Built-in guardrail checks
  • src/guardrails/evals/ - Evaluation framework
  • examples/ - Example usage and sample configs

Examples

The package includes examples in the examples/ directory:

  • examples/basic/hello_world.py — Basic chatbot with guardrails using GuardrailsOpenAI
  • examples/basic/agents_sdk.py — Agents SDK integration with GuardrailAgent
  • examples/basic/local_model.py — Using local models with guardrails
  • examples/basic/structured_outputs_example.py — Structured outputs
  • examples/basic/pii_mask_example.py — PII masking
  • examples/basic/suppress_tripwire.py — Handling violations gracefully

Running Examples

Prerequisites

# From a local checkout, install the package editable with the example extras
pip install -e ".[examples]"

Run

python examples/basic/hello_world.py
python examples/basic/agents_sdk.py

Available Guardrails

The Python implementation includes the following built-in guardrails:

  • Moderation: Content moderation using OpenAI's moderation API
  • URL Filter: URL filtering and domain allowlist/blocklist
  • Contains PII: Personally Identifiable Information detection
  • Hallucination Detection: Detects hallucinated content using vector stores
  • Jailbreak: Detects jailbreak attempts
  • NSFW Text: Detects workplace-inappropriate content in model outputs
  • Off Topic Prompts: Ensures responses stay within business scope
  • Custom Prompt Check: Custom LLM-based guardrails
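To give a feel for what one of these checks does, here is a conceptual sketch of the domain-allowlist idea behind the URL Filter guardrail. This is not the package's implementation, just the core matching logic, using only the standard library:

```python
from urllib.parse import urlparse

def url_allowed(url: str, allowlist: set) -> bool:
    """True if the URL's host is an allowlisted domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in allowlist)

allow = {"openai.com"}
print(url_allowed("https://guardrails.openai.com/docs", allow))  # True
print(url_allowed("https://evil.example.com", allow))            # False
```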

For full details, advanced usage, and API reference, see: OpenAI Guardrails Documentation.

License

MIT License - see LICENSE file for details.

Disclaimers

Please note that Guardrails may use Third-Party Services such as the Presidio open-source framework, which are subject to their own terms and conditions and are not developed or verified by OpenAI.

Developers are responsible for implementing appropriate safeguards to prevent storage or misuse of sensitive or prohibited content (including but not limited to personal data, child sexual abuse material, or other illegal content). OpenAI disclaims liability for any logging or retention of such content by developers. Developers must ensure their systems comply with all applicable data protection and content safety laws, and should avoid persisting any blocked content generated or intercepted by Guardrails. Guardrails calls paid OpenAI APIs, and developers are responsible for associated charges.
