A comprehensive collection of AI guardrails built with DSPy for content moderation and security.

DSPy Guardrails

Table of Contents
  1. About
  2. Quick Start
  3. Usage
  4. Development
  5. Contributing
  6. License

About

DSPy Guardrails is a comprehensive suite of AI guardrails built with DSPy. Each guardrail is implemented as a self-contained module that can be used to test and validate different types of content moderation and security checks.

  • Modular design — Each guardrail type is implemented as a separate, self-contained module
  • Two-stage detection — A fast regex prefilter handles known patterns before any LLM call, so requests that match a known pattern never reach the model
  • Programmatic testing — Run guardrails directly in Python for fast iteration
  • Comprehensive coverage — Covers major content moderation and security scenarios
  • DSPy integration — Leverages DSPy's programmatic prompting for consistent, reliable results
  • Type safety — Full type hints and dataclass definitions for robust implementations

Quick Start

Install

Install dspy-guardrails with uv (recommended)

uv add dspy-guardrails

Install with pip (alternative)

pip install dspy-guardrails

Continue with the usage examples below.

Usage

Basic Usage

import dspy
from dspy_guardrails import guardrail

# Configure DSPy (required)
lm = dspy.LM("openrouter/google/gemini-2.5-flash-preview-09-2025")
guardrail.configure(lm=lm)

# Create and run guardrails
topic_guardrail = guardrail.Topic(topic_scopes=["AI", "Machine Learning"])
result = guardrail.Run(topic_guardrail, "I want to learn about neural networks")
print(f"Allowed: {result.is_allowed}")  # True

Multiple Guardrails

Assumes guardrail.configure(lm=lm) has already been called.

all_guardrails = [
    guardrail.Topic(topic_scopes=["AI"]),
    guardrail.Nsfw(),
    guardrail.Pii(),
]
result = guardrail.Run(all_guardrails, "Safe AI content")
print(f"All passed: {result.is_allowed}")  # True

Early Return

Stop execution on the first failing guardrail.

guardrails = [guardrail.Topic(topic_scopes=["AI"]), guardrail.Nsfw()]
result = guardrail.Run(guardrails, "Risky content", early_return=True)
print(result.is_allowed)
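
The early-return behaviour can be illustrated without the library. The following is a minimal sketch, not the package's implementation: `run_checks`, `SketchResult`, and the two toy predicates are hypothetical names introduced here, and the checks are plain functions rather than LLM-backed guardrails. It only shows that with `early_return=True` the loop stops after the first failing check.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical stand-in for a guardrail: a name plus a predicate that returns
# True when the text is allowed. Real guardrails in this package may call an LLM.
Check = Tuple[str, Callable[[str], bool]]


@dataclass
class SketchResult:
    is_allowed: bool
    executed: List[str]  # names of the checks that actually ran


def run_checks(checks: List[Check], text: str, early_return: bool = False) -> SketchResult:
    executed: List[str] = []
    allowed = True
    for name, predicate in checks:
        executed.append(name)
        if not predicate(text):
            allowed = False
            if early_return:
                break  # skip the remaining checks after the first failure
    return SketchResult(is_allowed=allowed, executed=executed)


checks: List[Check] = [
    ("no_digits", lambda t: not any(c.isdigit() for c in t)),
    ("short", lambda t: len(t) < 20),
]
stopped = run_checks(checks, "call 555", early_return=True)
full = run_checks(checks, "call 555", early_return=False)
print(stopped.executed)  # ['no_digits']
print(full.executed)     # ['no_digits', 'short']
```

Stopping early saves the cost of later checks, at the price of not learning whether they would also have failed.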

For more examples and patterns, see the complete quickstart guide and guardrail types. For the regex prefilter catalog, opt-out flags, and custom-pattern reference, see docs/REGEX_PREFILTERS.md.

Two-Stage Detection: Regex Prefilter + LLM

Most guardrails (Pii, PromptInjection, SecretKeys, Jailbreak, Keywords, Gibberish, Toxicity, Language, Topic) run a fast regex prefilter before calling the DSPy LLM. If the prefilter matches, the guardrail short-circuits with is_allowed=False and method="regex_prefilter" in the result metadata — no model call is made, so the request is handled deterministically and at zero API cost. Only requests that pass the prefilter are sent to the LLM.

# Assumes guardrail.configure(lm=lm) has already been called.
pii_guardrail = guardrail.Pii()
result = guardrail.Run(pii_guardrail, "Email me at user@example.com")
result.metadata["method"]    # "regex_prefilter" (fast) or "dspy" (LLM)
result.metadata["matches"]   # list of {slug, matched_text, ...} on a prefilter hit
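
The control flow described above can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the package's code: `two_stage_check`, the single `email` pattern, and `fake_llm` are hypothetical, and the returned dict only mirrors the `method` and `matches` fields mentioned above.

```python
import re
from typing import Callable, Dict, List

# Hypothetical one-entry pattern catalog; the package's real catalog is
# described in docs/REGEX_PREFILTERS.md and is not reproduced here.
PATTERNS: Dict[str, re.Pattern] = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
}


def two_stage_check(text: str, llm_check: Callable[[str], bool]) -> dict:
    # Stage 1: deterministic regex prefilter.
    matches: List[dict] = [
        {"slug": slug, "matched_text": m.group(0)}
        for slug, pattern in PATTERNS.items()
        for m in pattern.finditer(text)
    ]
    if matches:
        # Short-circuit: the model is never called on a prefilter hit.
        return {"is_allowed": False, "method": "regex_prefilter", "matches": matches}
    # Stage 2: only text that passed the prefilter reaches the model.
    return {"is_allowed": llm_check(text), "method": "dspy", "matches": []}


llm_calls: List[str] = []


def fake_llm(text: str) -> bool:
    """Stand-in for the LLM call; records each invocation and allows the text."""
    llm_calls.append(text)
    return True


hit = two_stage_check("Email me at user@example.com", fake_llm)
miss = two_stage_check("Hello there", fake_llm)
print(hit["method"], miss["method"], len(llm_calls))  # regex_prefilter dspy 1
```

Only the second call reaches `fake_llm`, which is the cost saving the prefilter is meant to provide.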

The prefilter is opt-out per guardrail (e.g. enable_regex_prefilter=False, enable_script_prefilter=False, enable_blocked_topic_prefilter=False) and accepts user-supplied custom patterns where the API supports them (Pii.custom_patterns, SecretKeys.custom_patterns). Custom patterns are ReDoS-screened at construction time.
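
How the package screens custom patterns is not documented here. As a rough idea of what construction-time screening can look like, the sketch below rejects patterns with a quantified group that itself contains a quantifier, a classic source of catastrophic backtracking. `screen_pattern` and the heuristic are assumptions made for illustration, and the heuristic is deliberately crude: it misses many risky patterns and is no substitute for proper ReDoS analysis.

```python
import re

# Crude heuristic: a group that contains + or * and is itself followed by + or *,
# e.g. (a+)+ or (\d*)*. Nested groups and other risky constructs are not covered.
_NESTED_QUANTIFIER = re.compile(r"\([^()]*[+*][^()]*\)[+*]")


def screen_pattern(pattern: str) -> re.Pattern:
    """Compile `pattern`, rejecting invalid or obviously risky expressions."""
    if _NESTED_QUANTIFIER.search(pattern):
        raise ValueError(f"pattern looks prone to catastrophic backtracking: {pattern!r}")
    return re.compile(pattern)  # raises re.error on invalid syntax


employee_id = screen_pattern(r"EMP-\d{6}")  # hypothetical custom PII pattern
print(bool(employee_id.search("badge EMP-123456")))  # True

try:
    screen_pattern(r"(a+)+$")
    rejected = False
except ValueError:
    rejected = True
print(rejected)  # True
```

Screening when the guardrail is constructed means a bad pattern fails fast at startup rather than stalling a request later.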

To run multiple guardrails concurrently in a single bulk check, pass parallel=True to guardrail.Run() (uses a ThreadPoolExecutor):

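# Same instantiations as in the Available Guardrails table.
pii_gr = guardrail.Pii()
secret_keys_gr = guardrail.SecretKeys(key_patterns=["sk-", "ghp_"], entropy_threshold=3.5)
prompt_injection_gr = guardrail.PromptInjection(injection_patterns=["ignore previous", "system override"])

result = guardrail.Run(
    [pii_gr, secret_keys_gr, prompt_injection_gr],
    "Email me at user@example.com",
    parallel=True,
)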
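
For readers unfamiliar with `ThreadPoolExecutor`, the fan-out pattern looks roughly like the sketch below. It is a standalone illustration, not the package's internals: `run_parallel` and the two toy checks are hypothetical, and each check is a plain function returning True when the text is allowed.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def run_parallel(checks: Dict[str, Callable[[str], bool]], text: str) -> Dict[str, bool]:
    """Run every check on `text` in its own worker thread and collect the verdicts."""
    with ThreadPoolExecutor(max_workers=max(1, len(checks))) as pool:
        # Submit all checks first so they can overlap, then wait for each result.
        futures = {name: pool.submit(fn, text) for name, fn in checks.items()}
        return {name: future.result() for name, future in futures.items()}


checks = {
    "no_email": lambda t: "@" not in t,
    "no_secret": lambda t: "sk-" not in t,
}
verdicts = run_parallel(checks, "Email me at user@example.com")
print(verdicts)                # {'no_email': False, 'no_secret': True}
print(all(verdicts.values()))  # False
```

Threads suit this workload because LLM-backed checks spend most of their time waiting on network I/O, so their latencies overlap instead of adding up.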

Available Guardrails

| Guardrail | Example variables | Example instantiation |
| --- | --- | --- |
| Topic | topic_scopes=["AI", "Machine Learning"], blocked_topics=["spam"] | guardrail.Topic(topic_scopes=["AI", "Machine Learning"], blocked_topics=["spam"]) |
| NSFW | sensitivity_level="high" | guardrail.Nsfw(sensitivity_level="high") |
| PII | allowed_pii_types=["email"] | guardrail.Pii(allowed_pii_types=["email"]) |
| Toxicity | toxicity_threshold=0.8 | guardrail.Toxicity(toxicity_threshold=0.8) |
| Tone | desired_tone="helpful", unwanted_tones=["sarcastic"] | guardrail.Tone(desired_tone="helpful", unwanted_tones=["sarcastic"]) |
| Grounding | grounding_threshold=0.8 | guardrail.Grounding(grounding_threshold=0.8) |
| Language | allowed_languages=["en", "es"] | guardrail.Language(allowed_languages=["en", "es"]) |
| Keywords | blocked_keywords=["password", "secret"], case_sensitive=False | guardrail.Keywords(blocked_keywords=["password", "secret"], case_sensitive=False) |
| Secret Keys | key_patterns=["sk-", "ghp_"], entropy_threshold=3.5 | guardrail.SecretKeys(key_patterns=["sk-", "ghp_"], entropy_threshold=3.5) |
| Gibberish | prob_threshold=0.7 | guardrail.Gibberish(prob_threshold=0.7) |
| Prompt Injection | injection_patterns=["ignore previous", "system override"] | guardrail.PromptInjection(injection_patterns=["ignore previous", "system override"]) |
| Jailbreak | detection_threshold=0.9 | guardrail.Jailbreak(detection_threshold=0.9) |

Development

Code Quality

This project uses several tools to maintain code quality:

  • Ruff: Linting and formatting
  • isort: Import sorting
  • pytest: Testing framework

Available commands:

# Run all quality checks
uv run poe clean

# Individual checks
uv run poe lint          # Ruff linting
uv run poe format        # Ruff formatting
uv run poe sort          # Import sorting

Testing

Run tests using pytest:

# Run all tests
uv run pytest

# Run specific test
uv run pytest path/to/test.py::test_name

Contributing

Quick workflow:

  1. Fork and branch: git checkout -b feature/name
  2. Make changes
  3. Run checks: uv run poe clean-full
  4. Commit and push
  5. Open a Pull Request

License

MIT (as declared in pyproject.toml).


Built by thememium
