Guardrails that can be used to check inputs and outputs of functions and works well with Openlayer tracing.
Project description
Openlayer Guardrails
Open source guardrail implementations that work with Openlayer tracing.
Installation
pip install openlayer-guardrails
Usage
Standalone Usage
from openlayer_guardrails import PIIGuardrail
# Create guardrail
pii_guard = PIIGuardrail(
block_entities={"CREDIT_CARD", "US_SSN"},
redact_entities={"EMAIL_ADDRESS", "PHONE_NUMBER"}
)
# Check inputs manually
data = {"message": "My email is john@example.com and SSN is 123-45-6789"}
result = pii_guard.check_input(data)
if result.action.value == "block":
print(f"Blocked: {result.reason}")
elif result.action.value == "modify":
print(f"Modified data: {result.modified_data}")
With Openlayer Tracing
from openlayer_guardrails import PIIGuardrail
from openlayer.lib.tracing import trace
# Create guardrail
pii_guard = PIIGuardrail()
# Apply to traced functions
@trace(guardrails=[pii_guard])
def process_user_data(user_input: str):
return f"Processed: {user_input}"
# PII is automatically handled
result = process_user_data("My email is john@example.com")
# Output: "Processed: My email is [EMAIL-REDACTED]"
Toxicity Guardrail (Brazilian Portuguese)
Detects toxic content in Brazilian Portuguese using the ToxiGuardrailPT model.
pip install 'openlayer-guardrails[toxicity]'
from openlayer_guardrails import ToxicityPTGuardrail
# Create guardrail (default threshold=0.0; positive scores = safe, negative = toxic)
toxicity_guard = ToxicityPTGuardrail()
# Check inputs
result = toxicity_guard.check_input({"message": "Você é um idiota!"})
print(result.action) # GuardrailAction.BLOCK
# Check outputs with contextual scoring (sentence-pair encoding)
result = toxicity_guard.check_output(
output="Claro, aqui está a informação solicitada.",
inputs={"prompt": "Me ajude com meu trabalho."},
)
print(result.action) # GuardrailAction.ALLOW
Toxicity Guardrail (English)
Detects toxic content in English across six categories using unitary/toxic-bert.
pip install 'openlayer-guardrails[toxicity]'
from openlayer_guardrails import ToxicityENGuardrail
# Create guardrail (default threshold=0.5)
toxicity_guard = ToxicityENGuardrail()
# Check inputs
result = toxicity_guard.check_input({"message": "You are terrible and should die"})
print(result.action) # GuardrailAction.BLOCK
print(result.metadata["triggered_categories"])
# e.g. {'toxic': 0.98, 'severe_toxic': 0.72, 'insult': 0.89, 'threat': 0.81}
# Monitor only specific categories
guard = ToxicityENGuardrail(categories={"threat", "severe_toxic"})
Handling long texts
By default, all guardrails truncate inputs to 512 tokens for fast inference.
To evaluate the full text, enable chunking mode by setting max_length=None:
guard = ToxicityPTGuardrail(max_length=None) # or ToxicityENGuardrail(max_length=None)
In chunking mode, long texts are split into overlapping 512-token windows and each window is scored independently. The most toxic score across all windows is used. Latency scales linearly with the number of chunks.
Model Limitations
Prompt Injection Guardrail
| Property | Value |
|---|---|
| Model | meta-llama/Prompt-Guard-86M |
| Max tokens | 512 |
| Language | English |
| Parameters | 86M |
| Scope | Input-only (outputs are not checked) |
Texts longer than 512 tokens are truncated. Only the first 512 tokens are evaluated.
Toxicity Guardrail (PT-BR)
| Property | Value |
|---|---|
| Model | nicholasKluge/ToxiGuardrailPT |
| Max tokens | 512 |
| Language | Brazilian Portuguese |
| Parameters | 109M |
| Architecture | BERTimbau (bert-base-portuguese-cased) |
| Output type | Single scalar reward score (positive = safe, negative = toxic) |
| Reported accuracy | 70.36% (hatecheck-portuguese), 74.04% (told-br) |
| Scope | Input and output (output uses sentence-pair encoding for context) |
By default, texts longer than 512 tokens are truncated. Set max_length=None to enable chunking for full-text coverage. The model was trained on Brazilian Portuguese data and may not generalize well to European Portuguese or other languages.
Toxicity Guardrail (EN)
| Property | Value |
|---|---|
| Model | unitary/toxic-bert |
| Max tokens | 512 (chunking available via max_length=None) |
| Language | English |
| Parameters | 110M |
| Architecture | BERT (bert-base-uncased) |
| Output type | Multi-label probabilities across 6 categories |
| Categories | toxic, severe_toxic, obscene, threat, insult, identity_hate |
| Reported AUC | 0.98636 (Jigsaw Toxic Comment Challenge) |
| Scope | Input and output |
By default, texts longer than 512 tokens are truncated. Set max_length=None to enable chunking for full-text coverage. The model was trained on English data.
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file openlayer_guardrails-0.4.1.tar.gz.
File metadata
- Download URL: openlayer_guardrails-0.4.1.tar.gz
- Upload date:
- Size: 166.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.10.8 {"installer":{"name":"uv","version":"0.10.8","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
a60096da8ae4f2aa4152f0dbcb3b8ecc9da1c048aed45bc78b288999713a1760
|
|
| MD5 |
1fb29c6e72fb57c6958749a389efc79b
|
|
| BLAKE2b-256 |
3f4abbcf83832eb680e917bbf2a77112fb1dbd39a5bc2eb9b91a7e5b46254135
|
File details
Details for the file openlayer_guardrails-0.4.1-py3-none-any.whl.
File metadata
- Download URL: openlayer_guardrails-0.4.1-py3-none-any.whl
- Upload date:
- Size: 19.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.10.8 {"installer":{"name":"uv","version":"0.10.8","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
82d215eb9bf463f2e8ef860c50ccb8b6c9fb214e179c3cfc7e984573b2c86c3a
|
|
| MD5 |
02f57965923e5a715fbd9eefe707d3ba
|
|
| BLAKE2b-256 |
7cb9aef720bb837ebcc2d83f5316b27e30182bb825838deb497833d02b74e1fd
|