A Python package for enforcing behavioural contracts in AI agents

Behavioural Contracts

A Python package for enforcing behavioural contracts in AI agents. This package provides a framework for defining, validating, and enforcing behavioural contracts that ensure AI agents operate within specified constraints and patterns.

Declarative OA fragments (validate_task_output)

Recommended Open Agent style: each YAML fragment declares a type and a config block. Parse the fragment and pass the resulting dict to validate_task_output(output, fragment) on each call. Legacy shapes (a nested response_contract, or plain applies_to + rules) continue to work unchanged.

Response / schema

- name: response_schema_v1
  type: response_contract
  description: "Expected task JSON shape"
  config:
    output_format:
      type: object
      required_fields: [summary, checklist]

Content rules (e.g. checklist)

- name: no_generic_checklist_items
  type: content_contract
  description: "Tighten checklist items"
  applies_to: checklist
  config:
    rules:
      - type: must_contain_action_verb
      - type: min_word_count
        value: 5
      - type: must_not_start_with
        values: [consider, think about, understand]

from behavioural_contracts import validate_task_output, normalize_task_contract_fragment

# validate_task_output normalizes declarative fragments automatically
validate_task_output(task_output_dict, fragment_dict)

# Optional: normalize a whole list before looping
for frag in map(normalize_task_contract_fragment, fragments):
    ...
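To make the rule types in the content fragment concrete, here is a minimal stand-alone sketch of how two of them could be evaluated against a single checklist item. It is not the package's implementation: check_checklist_item and its message wording are hypothetical, and must_contain_action_verb is left out because the README does not say how verbs are recognised.

```python
# Illustrative stand-in, NOT the package's implementation: it mirrors two of the
# rule types from the content fragment on a single checklist item.
def check_checklist_item(item: str, rules: list[dict]) -> list[str]:
    """Return a list of human-readable problems for one checklist item."""
    problems = []
    words = item.split()
    for rule in rules:
        rule_type = rule.get("type")
        if rule_type == "min_word_count":
            minimum = rule["value"]
            if len(words) < minimum:
                problems.append(
                    f"min_word_count: need at least {minimum} words, got {len(words)}"
                )
        elif rule_type == "must_not_start_with":
            lowered = item.strip().lower()
            for prefix in rule["values"]:
                if lowered.startswith(prefix.lower()):
                    problems.append(f"must_not_start_with: starts with '{prefix}'")
                    break
        # Other rule types (e.g. must_contain_action_verb) are skipped here
        # because the README does not state how they are evaluated.
    return problems


rules = [
    {"type": "min_word_count", "value": 5},
    {"type": "must_not_start_with", "values": ["consider", "think about", "understand"]},
]
print(check_checklist_item("Consider options", rules))
print(check_checklist_item("Rotate the leaked API key today", rules))
```

The first item trips both rules; the second passes both. In real use, validate_task_output performs the checks, so a helper like this is only useful for reasoning about what a fragment will reject.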

Severity and structured results (validate_task_output_detailed)

Each fragment may set an optional severity: low | medium | high (case-insensitive). Omitted or invalid values default to medium. Set it at the root of the YAML/dict fragment, next to name; this works for both declarative and legacy fragments.
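The defaulting rule for severity can be sketched in a few lines. normalize_severity is a hypothetical helper written for this illustration, not a function exported by the package; it only encodes the behaviour stated here (case-insensitive match, medium when the value is missing or invalid).

```python
ALLOWED_SEVERITIES = ("low", "medium", "high")


def normalize_severity(value) -> str:
    """Lower-case a severity value; fall back to 'medium' when missing or invalid."""
    if isinstance(value, str) and value.lower() in ALLOWED_SEVERITIES:
        return value.lower()
    return "medium"


print(normalize_severity("HIGH"))
print(normalize_severity(None))
print(normalize_severity("critical"))
```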

Use validate_task_output_detailed(output, spec) when you need one JSON-serializable row per validation unit (one schema row plus one row for each applies_to: checklist fragment). validate_task_output runs the same checks and evaluates every unit, then raises BehaviouralContractViolationError with a short summary if any row failed. Use validate_task_output for strict pipelines, and the detailed API for bce.validated / bce.violation style logs.

Example return value (abridged):

[
  {
    "contract": "__response_schema__",
    "status": "ok",
    "severity": "medium",
    "violations": []
  },
  {
    "contract": "no_generic_checklist_items",
    "status": "fail",
    "severity": "high",
    "violations": [
      {
        "rule_type": "min_word_count",
        "rule": "min_word_count",
        "message": "Checklist item [0] rule min_word_count: need at least 5 words, got 2",
        "item_index": 0,
        "path": "checklist",
        "detail": {"payload": 5},
        "severity": "high"
      }
    ]
  }
]
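As a rough illustration of how the strict behaviour relates to rows of this shape, the sketch below raises as soon as any row has status fail. ContractViolation and raise_if_any_failed are hypothetical stand-ins written for this example, and the summary wording is an assumption, not the message the package actually produces.

```python
class ContractViolation(Exception):
    """Hypothetical stand-in for the package's BehaviouralContractViolationError."""


def raise_if_any_failed(rows: list[dict]) -> None:
    """Raise with a short summary when at least one row has status 'fail'."""
    failed = [row for row in rows if row.get("status") == "fail"]
    if failed:
        names = ", ".join(row["contract"] for row in failed)
        raise ContractViolation(f"{len(failed)} contract(s) failed: {names}")


rows = [
    {"contract": "__response_schema__", "status": "ok",
     "severity": "medium", "violations": []},
    {"contract": "no_generic_checklist_items", "status": "fail",
     "severity": "high",
     "violations": [{"rule": "min_word_count", "item_index": 0}]},
]

try:
    raise_if_any_failed(rows)
except ContractViolation as exc:
    print(exc)
```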

Example log line for integrators:

bce.violation contract=no_generic_checklist_items severity=high rule=min_word_count idx=0 msg="need at least 5 words, got 2"
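A log line of this shape can be derived from detailed result rows with a small formatter. The sketch below is one possible approach, not part of the package: to_log_lines is a hypothetical name, the bce.validated line format is a guess (the README only shows a bce.violation line), and the short msg is taken as the text after the first ": " in the violation message.

```python
def to_log_lines(rows: list[dict]) -> list[str]:
    """Turn detailed validation rows into one log line per ok row or violation."""
    lines = []
    for row in rows:
        if row["status"] == "ok":
            # Assumed shape: the README does not show a bce.validated line.
            lines.append(
                f'bce.validated contract={row["contract"]} severity={row["severity"]}'
            )
            continue
        for violation in row["violations"]:
            # Keep only the text after the first ": " so msg stays short.
            short_msg = violation["message"].split(": ", 1)[-1]
            lines.append(
                f'bce.violation contract={row["contract"]} '
                f'severity={violation["severity"]} rule={violation["rule"]} '
                f'idx={violation["item_index"]} msg="{short_msg}"'
            )
    return lines


sample_rows = [
    {"contract": "__response_schema__", "status": "ok",
     "severity": "medium", "violations": []},
    {"contract": "no_generic_checklist_items", "status": "fail", "severity": "high",
     "violations": [{
         "rule_type": "min_word_count",
         "rule": "min_word_count",
         "message": "Checklist item [0] rule min_word_count: need at least 5 words, got 2",
         "item_index": 0,
         "path": "checklist",
         "detail": {"payload": 5},
         "severity": "high",
     }]},
]

for line in to_log_lines(sample_rows):
    print(line)
```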

✨ Phase 1 Security Enhancements (January 2025)

NEW SECURITY FEATURES:

  • 🛡️ Prompt Injection Detection - 16+ patterns, Base64/hex detection
  • 🧠 Context-Aware Validation - Prevents hallucination and contradictions
  • 🔒 Enhanced BCE Rules - Advanced security validation
  • 📋 Compliance Templates - GDPR, HIPAA, SOC2, ISO27001, PCI DSS
  • 🎯 GitHub Log Agent Fix - Prevents false confidence in empty data

📊 Phase 1.5 Data Persistence & Monitoring (January 2025)

NEW PERSISTENCE FEATURES:

  • 🗄️ SQLite/PostgreSQL Support - Local development → Azure production
  • 📈 Metrics Storage - Response times, token usage, cache performance
  • ⚠️ Violation Tracking - Security events with severity classification
  • 🎯 Performance Aggregation - Real-time compliance scoring
  • 🏥 System Health Monitoring - Active agents, violation trends
  • 🔍 Time-based Queries - Historical analysis and reporting

Enhanced Security Agent Example

from behavioural_contracts import behavioural_contract

@behavioural_contract({
    "version": "0.2.0",
    "description": "Enhanced Security Agent",
    "behavioural_flags": {
        "conservatism": "high",
        "temperature_control": {"mode": "strict", "range": [0.1, 0.5]}
    },
    "response_contract": {
        "output_format": {
            "required_fields": ["risk_assessment", "recommendations", "confidence_level"]
        },
        "safety_checks": {
            "harmful_content": True,
            "pii_protection": True
        }
    }
})
def security_agent(threat_description: str, severity: str) -> dict:
    return {
        "risk_assessment": f"Analyzing {severity} threat: {threat_description}",
        "recommendations": ["Immediate containment", "Escalate to security team"],
        "confidence_level": 0.85
    }

Compliance-Ready Agents

from behavioural_contracts.compliance_templates import create_compliant_agent

@create_compliant_agent("gdpr")
def gdpr_data_processor(data: str) -> dict:
    return {
        "compliance_status": "compliant",
        "data_processing_basis": "legitimate_interest", 
        "privacy_impact": "low",
        "recommendations": ["Data processed according to GDPR"]
    }

Data Persistence & Monitoring

from behavioural_contracts.persistence import SessionLocal, init_db, MetricsStore

# Initialize database (SQLite locally, PostgreSQL on Azure)
init_db()
session = SessionLocal()
store = MetricsStore(session)

# Record validation metrics
store.record_validation(
    agent_id="security-agent-v1",
    contract_id="compliance-contract",
    validation_time_ms=125.0,
    token_count=200,
    cache_hit=False,
    confidence_score=0.95
)

# Record security violations
store.record_violation(
    agent_id="security-agent-v1", 
    contract_id="compliance-contract",
    violation_type="prompt_injection",
    severity="high",
    confidence=0.85
)

# Query agent performance
metrics = store.get_agent_metrics("security-agent-v1")
print(f"Compliance score: {metrics['compliance_score']:.1%}")
print(f"Average response time: {metrics['avg_validation_time_ms']:.1f}ms")

# System health check
health = store.get_system_health()
print(f"System status: {health['status']}")

Interactive Testing

# Interactive demo
python demo/interactive_demo.py

# Test persistence layer
python demo/test_persistence_demo.py

# Run all tests with linting and summary report
python run_tests.py

# Live agent testing with real LLMs
python demo/live_agent_demo.py

# Modern linting and formatting
ruff check .        # Lint code
ruff format .       # Format code

Proven Results:

  • Fixes GitHub log agent hallucination (confidence 0.9 → 0.3)
  • 83% prompt injection detection accuracy
  • 100% compliance template validation
  • <50ms latency overhead

Installation

pip install behavioural-contracts

Quick Start

from behavioural_contracts import behavioural_contract, generate_contract

# Define your contract
contract_data = {
    "version": "1.1",
    "description": "Financial Analyst Agent",
    "policy": {
        "pii": False,
        "compliance_tags": ["EU-AI-ACT"],
        "allowed_tools": ["search", "summary"]
    },
    "behavioural_flags": {
        "conservatism": "moderate",
        "verbosity": "compact",
        "temperature_control": {
            "mode": "adaptive",
            "range": [0.2, 0.6]
        }
    },
    "response_contract": {
        "output_format": {
            "type": "object",
            "required_fields": [
                "decision", "confidence", "summary", "reasoning",
                "compliance_tags", "temperature_used"
            ],
            "on_failure": {
                "action": "fallback",
                "max_retries": 1,
                "fallback": {
                    "decision": "unknown",
                    "confidence": "low",
                    "summary": "Recommendation rejected due to validation failure.",
                    "reasoning": "The model's response failed validation checks."
                }
            }
        },
        "max_response_time_ms": 4000,
        "behaviour_signature": {
            "key": "decision",
            "expected_type": "string"
        }
    }
}

# Generate a formatted contract
contract = generate_contract(contract_data)

# Use the contract with your agent
@behavioural_contract(contract)
def analyst_agent(signal: dict, **kwargs):
    return {
        "decision": "BUY",
        "confidence": "high",
        "summary": "Strong buy signal based on technical indicators",
        "reasoning": "Multiple indicators show bullish momentum",
        "compliance_tags": ["EU-AI-ACT"],
        "temperature_used": 0.3  # Required field for temperature validation
    }
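The on_failure block in the contract above names a fallback action, a retry count, and a fallback payload. The README does not spell out how these interact, so the following is only one plausible reading, written as a stand-alone sketch: with_fallback is a hypothetical decorator (not the package's behavioural_contract), and it checks required fields only.

```python
import functools


def with_fallback(required_fields, fallback, max_retries=1):
    """Hypothetical decorator: retry, then return `fallback`, when fields are missing."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # One initial attempt plus up to `max_retries` retries.
            for _attempt in range(max_retries + 1):
                result = func(*args, **kwargs)
                if isinstance(result, dict) and all(f in result for f in required_fields):
                    return result
            # Every attempt failed validation: hand back a copy of the fallback.
            return dict(fallback)
        return wrapper
    return decorator


attempts = {"count": 0}
FALLBACK = {"decision": "unknown", "confidence": "low"}


@with_fallback(["decision", "confidence"], FALLBACK, max_retries=1)
def incomplete_agent(signal: dict) -> dict:
    attempts["count"] += 1
    return {"decision": "BUY"}  # "confidence" is missing on purpose


print(incomplete_agent({"ticker": "XYZ"}))
print(attempts["count"])
```

Returning a copy of the fallback keeps callers from mutating the shared template between calls.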

Key Features

1. Contract Generation

Generate properly formatted contracts from specification data:

from behavioural_contracts import generate_contract

# Basic contract
basic_contract = generate_contract({
    "version": "1.1",
    "description": "Simple Agent",
    "response_contract": {
        "output_format": {
            "required_fields": ["decision", "confidence", "temperature_used"]
        }
    }
})

# Contract with policy and response validation
policy_contract = generate_contract({
    "version": "1.1",
    "description": "Compliant Agent",
    "policy": {
        "pii": False,
        "compliance_tags": ["GDPR", "HIPAA"],
        "allowed_tools": ["search", "analyze"]
    },
    "response_contract": {
        "output_format": {
            "required_fields": [
                "decision", "confidence", "compliance_tags", "temperature_used"
            ]
        },
        "max_response_time_ms": 2000
    }
})

2. Contract Formatting

Format existing contracts to ensure proper value types:

from behavioural_contracts import format_contract

# Format a contract with mixed types
formatted = format_contract({
    "version": 1.1,  # Will be converted to string
    "description": "My Agent",
    "response_contract": {
        "output_format": {
            "required_fields": ["decision", "temperature_used"]
        },
        "max_response_time_ms": 1000
    }
})

3. Behavioural Contract Decorator

Use the decorator to enforce contracts on your agent functions:

from behavioural_contracts import behavioural_contract

# Using a dictionary
@behavioural_contract({
    "version": "1.1",
    "description": "Trading Agent",
    "policy": {
        "pii": False,
        "compliance_tags": ["FINRA"]
    },
    "response_contract": {
        "output_format": {
            "required_fields": [
                "decision", "confidence", "compliance_tags", "temperature_used"
            ]
        }
    }
})
def trading_agent(signal: dict, **kwargs):
    return {
        "decision": "BUY",
        "confidence": "high",
        "compliance_tags": ["FINRA"],
        "temperature_used": 0.3
    }

4. Response Validation

The contract system enforces response validation including:

  • Required fields
  • Temperature range validation
  • Response time limits
  • Compliance tag verification
  • PII detection
  • Tool usage validation

from behavioural_contracts import behavioural_contract

@behavioural_contract({
    "version": "1.1",
    "description": "Validated Agent",
    "behavioural_flags": {
        "temperature_control": {
            "range": [0.2, 0.6]
        }
    },
    "response_contract": {
        "output_format": {
            "required_fields": [
                "decision", "confidence", "temperature_used"
            ]
        },
        "max_response_time_ms": 1000
    }
})
def validated_agent(signal: dict, **kwargs):
    # Response will be validated for:
    # - All required fields present
    # - Temperature within range
    # - Response time under 1000ms
    return {
        "decision": "APPROVE",
        "confidence": "high",
        "temperature_used": 0.3
    }
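For intuition about the three checks listed in the comments above, here is a stand-alone sketch that applies them to a response dict. check_response is a hypothetical helper, not the decorator's internals; the problem messages and the idea of passing the elapsed time in explicitly are assumptions made for this illustration.

```python
def check_response(response: dict, contract: dict, elapsed_ms: float) -> list[str]:
    """Return a list of problems found in `response` under `contract`."""
    problems = []
    response_contract = contract.get("response_contract", {})

    # 1. Every required field must be present.
    required = response_contract.get("output_format", {}).get("required_fields", [])
    for field in required:
        if field not in response:
            problems.append(f"missing required field: {field}")

    # 2. temperature_used must fall inside the configured range, if both exist.
    temp_range = (
        contract.get("behavioural_flags", {})
        .get("temperature_control", {})
        .get("range")
    )
    if temp_range and "temperature_used" in response:
        low, high = temp_range
        value = response["temperature_used"]
        if not low <= value <= high:
            problems.append(f"temperature_used {value} outside range [{low}, {high}]")

    # 3. The call must finish within max_response_time_ms, if set.
    limit = response_contract.get("max_response_time_ms")
    if limit is not None and elapsed_ms > limit:
        problems.append(f"response took {elapsed_ms:.0f} ms, limit is {limit} ms")

    return problems


example_contract = {
    "behavioural_flags": {"temperature_control": {"range": [0.2, 0.6]}},
    "response_contract": {
        "output_format": {
            "required_fields": ["decision", "confidence", "temperature_used"]
        },
        "max_response_time_ms": 1000,
    },
}

good = {"decision": "APPROVE", "confidence": "high", "temperature_used": 0.3}
bad = {"decision": "APPROVE", "temperature_used": 0.9}

print(check_response(good, example_contract, elapsed_ms=120.0))
print(check_response(bad, example_contract, elapsed_ms=1500.0))
```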

Contract Structure

A behavioural contract consists of several key sections:

  1. Basic Information

    • version: Contract version
    • description: Agent description
  2. Policy Settings

    • pii: PII handling flag
    • compliance_tags: Required compliance tags
    • allowed_tools: List of allowed tools
  3. Behavioural Flags

    • conservatism: Agent conservatism level
    • verbosity: Output verbosity
    • temperature_control: Temperature settings
      • mode: Control mode (fixed/adaptive)
      • range: Allowed temperature range [min, max]
  4. Response Contract

    • output_format: Response structure requirements
      • type: Output type (usually "object")
      • required_fields: List of required fields
      • on_failure: Fallback configuration
    • max_response_time_ms: Maximum allowed response time
    • behaviour_signature: Key field to track for suspicious behaviour
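The sections listed above can be written down as typed dictionaries, which lets a type checker catch misspelled keys before a contract reaches the decorator. This is a sketch only: the class names are hypothetical and not exported by the package, and the value types are inferred from the examples in this README rather than from a published schema.

```python
from typing import TypedDict


class Policy(TypedDict, total=False):
    pii: bool
    compliance_tags: list[str]
    allowed_tools: list[str]


class TemperatureControl(TypedDict, total=False):
    mode: str
    range: list[float]  # [min, max]


class BehaviouralFlags(TypedDict, total=False):
    conservatism: str
    verbosity: str
    temperature_control: TemperatureControl


class OutputFormat(TypedDict, total=False):
    type: str
    required_fields: list[str]
    on_failure: dict


class ResponseContract(TypedDict, total=False):
    output_format: OutputFormat
    max_response_time_ms: int
    behaviour_signature: dict


class Contract(TypedDict, total=False):
    version: str
    description: str
    policy: Policy
    behavioural_flags: BehaviouralFlags
    response_contract: ResponseContract


typed_contract: Contract = {
    "version": "1.1",
    "description": "Simple Agent",
    "behavioural_flags": {"temperature_control": {"mode": "adaptive", "range": [0.2, 0.6]}},
    "response_contract": {
        "output_format": {"type": "object", "required_fields": ["decision", "confidence"]},
        "max_response_time_ms": 1000,
    },
}
print(sorted(typed_contract))
```

At runtime a TypedDict is an ordinary dict, so a value like typed_contract can be passed anywhere a plain contract dictionary is expected.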

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Overview

https://www.openagentstack.ai
