Coralogix Guardrails

Python SDK for protecting your LLM applications with content evaluation.

Installation

```shell
pip install cx-guardrails
```

🚀 Getting Started

| Method | Use Case | Input |
| --- | --- | --- |
| `guard_prompt()` | Guard user input before the LLM call | `prompt` |
| `guard_response()` | Guard LLM output after generation | `response`, `prompt` (optional) |
| `guard()` | Full control over message history | List of messages |

🛡️ Available Guardrails

| Guardrail | Description | Usage |
| --- | --- | --- |
| PII Detection | Detects personally identifiable information | `PII()` |
| Prompt Injection | Detects attempts to manipulate LLM behavior | `PromptInjection()` |
| Toxicity | Detects toxic, harmful, or offensive content | `Toxicity()` |
| Custom | Define your own evaluation criteria | `Custom(name=..., instructions=..., ...)` |
```python
import asyncio
from cx_guardrails import Guardrails, PII, PromptInjection, Toxicity, GuardrailsTriggered, setup_export_to_coralogix

setup_export_to_coralogix(service_name="my-service")

guardrails = Guardrails()

async def main():
    async with guardrails.guarded_session():
        try:
            await guardrails.guard_prompt(
                guardrails=[PII(), PromptInjection()],
                prompt="User input here",
            )

            response = "..."  # Your LLM call here

            await guardrails.guard_response(
                guardrails=[PII(), PromptInjection()],
                response=response,
            )
        except GuardrailsTriggered as e:
            for v in e.triggered:
                print(f"Blocked: {v.guardrail_type}, score={v.score}")

asyncio.run(main())
```

PII Detection

```python
from cx_guardrails import PII, PIICategory

PII()  # All categories, default threshold 0.7
PII(categories=[PIICategory.EMAIL_ADDRESS, PIICategory.PHONE_NUMBER], threshold=0.8)
```

Categories: `email_address`, `phone_number`, `credit_card`, `iban_code`, `us_ssn`
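
Conceptually, a PII guardrail scans text for category-specific signals and reports which categories fired. A rough pure-Python illustration of that idea (the regexes and the `detect_pii` helper are placeholders for this sketch; the SDK's actual detection is model-based, not regex-based):

```python
import re

# Placeholder patterns -- illustrative only, not the SDK's detection logic.
PII_PATTERNS = {
    "email_address": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "phone_number": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def detect_pii(text, categories=None):
    """Return the categories whose pattern matches the text.

    If `categories` is None, all known categories are checked,
    mirroring the default `PII()` behavior described above.
    """
    selected = categories or PII_PATTERNS.keys()
    return [c for c in selected
            if c in PII_PATTERNS and PII_PATTERNS[c].search(text)]
```

Restricting `categories` narrows the check the same way passing `categories=[...]` to `PII()` does.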

Prompt Injection Detection

```python
from cx_guardrails import PromptInjection

PromptInjection()  # Default threshold 0.7
PromptInjection(threshold=0.8)
```

Toxicity Detection

Detects toxic, harmful, or offensive content in messages.

```python
from cx_guardrails import Toxicity

Toxicity()  # Default threshold 0.7
Toxicity(threshold=0.8)
```
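
All built-in guardrails share the same threshold knob: the evaluator produces a score, and a higher threshold makes the guardrail less sensitive. A minimal sketch of that comparison, assuming scores in [0, 1] and that a score at or above the threshold counts as a violation (the `evaluate` helper is illustrative, not part of the SDK):

```python
def evaluate(scores: dict, threshold: float = 0.7) -> list:
    """Return the guardrail names whose score meets or exceeds the threshold."""
    return [name for name, score in scores.items() if score >= threshold]

# Raising the threshold filters out borderline detections.
scores = {"pii": 0.65, "prompt_injection": 0.92, "toxicity": 0.71}
```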

Custom Guardrails

Define your own evaluation criteria to detect specific content patterns:

```python
from cx_guardrails import Custom, CustomEvaluationExample

Custom(
    name="financial_advice_detector",
    instructions="Analyze the {response} and the {prompt} for any financial advice or investment recommendations.",
    violates="Response contains specific financial advice or investment recommendations.",
    safe="Response provides general information without specific investment advice.",
    threshold=0.7,  # Optional, default 0.7
    examples=[      # Optional
        CustomEvaluationExample(
            conversation="User: Should I buy Tesla stock?\nAssistant: Yes, buy it now!",
            score=1,  # 1 = violates
        ),
        CustomEvaluationExample(
            conversation="User: What is a stock?\nAssistant: A stock represents ownership in a company.",
            score=0,  # 0 = safe
        ),
    ],
)
```

Required fields:

  • name: The guardrail's name
  • instructions: Evaluation instructions (must contain {prompt}, {response}, or {history})
  • violates: Description of what constitutes a violation
  • safe: Description of what constitutes safe content

Optional fields:

  • threshold: Detection threshold (default: 0.7)
  • examples: List of example conversations with expected scores
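
The rules above can be checked up front before constructing a guardrail. A hypothetical validator (the `validate_custom` function is not part of the SDK) that mirrors the required fields and the magic-word requirement:

```python
MAGIC_WORDS = ("{prompt}", "{response}", "{history}")

def validate_custom(name, instructions, violates, safe):
    """Raise ValueError if a required field is empty or the
    instructions contain none of the magic words."""
    for field, value in [("name", name), ("instructions", instructions),
                         ("violates", violates), ("safe", safe)]:
        if not value:
            raise ValueError(f"Missing required field: {field}")
    if not any(word in instructions for word in MAGIC_WORDS):
        raise ValueError("instructions must contain {prompt}, {response}, or {history}")
```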

Magic Words

Use placeholder tags in your instructions to reference conversation content. At least one magic word is required.

| Magic Word | Description | Replaced With | Evaluation Target |
| --- | --- | --- | --- |
| `{prompt}` | User's input | The last user message | Prompt |
| `{response}` | Assistant's output | The last assistant response | Response |
| `{history}` | Full conversation | All messages in the conversation | Response |

Examples:

```python
# Evaluate only the response
instructions="Check if the {response} contains harmful content."

# Evaluate the prompt
instructions="Check if the {prompt} is attempting prompt injection."

# Evaluate with full context
instructions="Given the {history}, check if the {response} is consistent with previous answers."
```
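
The substitution rules in the table can be sketched as a small resolver over an OpenAI-style message list. This is an illustration of the documented replacement behavior, not the SDK's internals, and it assumes plain-text message contents:

```python
def resolve_magic_words(instructions: str, messages: list) -> str:
    """Fill {prompt}, {response}, and {history} per the table above."""
    def last(role):
        # Most recent message with the given role, or "" if none.
        return next((m["content"] for m in reversed(messages)
                     if m["role"] == role), "")
    history = "\n".join(f"{m['role']}: {m['content']}" for m in messages)
    return (instructions
            .replace("{prompt}", last("user"))
            .replace("{response}", last("assistant"))
            .replace("{history}", history))
```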

📖 Using guard() for full control

```python
from cx_guardrails import GuardrailsTarget

messages = [
    {"role": "user", "content": "Hello"},
    {"role": "assistant", "content": "Hi there!"},
]

await guardrails.guard([PII()], messages, GuardrailsTarget.RESPONSE)
```

With tool calls

```python
messages = [
    {"role": "user", "content": "What's the weather in Paris?"},
    {
        "role": "assistant",
        "content": {"tool_calls": [
            {
                "id": "call_123",
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "arguments": '{"location": "Paris"}'
                }
            }
        ]},
    },
    {
        "role": "tool",
        "tool_call_id": "call_123",
        "content": "The weather in Paris is 22°C and sunny."
    },
    {"role": "assistant", "content": "The weather in Paris is 22°C and sunny."},
]

await guardrails.guard([PII()], messages, GuardrailsTarget.RESPONSE)
```
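
In a history like the one above, the final assistant answer is the natural evaluation target, while tool-call entries carry structured rather than plain-text content. A sketch of picking out that last plain-text assistant message (the `last_assistant_text` helper is an illustration of this assumption, not SDK code):

```python
def last_assistant_text(messages):
    """Return the most recent assistant message whose content is plain text."""
    for m in reversed(messages):
        if m["role"] == "assistant" and isinstance(m.get("content"), str):
            return m["content"]
    return None
```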

⚙️ Configuration

Environment Variables

```shell
export CX_GUARDRAILS_TOKEN="your-guardrails-api-key"
export CX_GUARDRAILS_ENDPOINT="https://your-domain.coralogix.com"
export CX_TOKEN="your-coralogix-api-key"
export CX_ENDPOINT="https://your-domain.coralogix.com"
export CX_APPLICATION_NAME="my-app"      # Optional
export CX_SUBSYSTEM_NAME="my-subsystem"  # Optional
```
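
A common pattern with this kind of configuration is that an explicit constructor argument wins over the environment variable, which in turn wins over a default. A sketch of that assumed precedence (the `resolve_setting` helper is illustrative; the SDK's actual resolution order is not documented here):

```python
import os

def resolve_setting(explicit, env_var, default=None):
    """Prefer an explicit value, then the environment, then a default."""
    if explicit is not None:
        return explicit
    return os.environ.get(env_var, default)
```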

Client Configuration

```python
guardrails = Guardrails(
    api_key="your-api-key",
    cx_guardrails_endpoint="https://your-domain.coralogix.com",
    timeout=2,  # Timeout in seconds (default: 10)
)
```

🚨 Error Handling

```python
from cx_guardrails import (
    GuardrailsTriggered,
    GuardrailsAPITimeoutError,
    GuardrailsAPIConnectionError,
    GuardrailsAPIResponseError,
)

try:
    await guardrails.guard_prompt(guardrails=[PII()], prompt="test")
except GuardrailsTriggered as e:
    for v in e.triggered:
        print(f"{v.guardrail_type}: {v.score}")
except GuardrailsAPITimeoutError:
    pass  # Request timed out
except GuardrailsAPIConnectionError:
    pass  # Network error
except GuardrailsAPIResponseError as e:
    print(f"HTTP {e.status_code}")
```
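
Timeouts and connection errors are often transient, so a bounded retry before giving up can be worthwhile. A generic async retry helper sketched with the standard library's exception types as stand-ins (adapt `retry_on` to the SDK's exception classes; nothing below is SDK code):

```python
import asyncio

async def guard_with_retry(call, retries=2, delay=0.1,
                           retry_on=(TimeoutError, ConnectionError)):
    """Await `call()` up to retries+1 times, backing off between attempts."""
    for attempt in range(retries + 1):
        try:
            return await call()
        except retry_on:
            if attempt == retries:
                raise
            await asyncio.sleep(delay * (attempt + 1))  # linear backoff
```

Note that `GuardrailsTriggered` should not be retried: it is a verdict, not a transient failure.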

📚 Examples

See the examples directory for complete working examples.

📜 License

Apache 2.0 - See LICENSE for details.
