Skip to main content

Community Edition — Agent-Lightning RL integration for the Agent Governance Toolkit: governed training with policy enforcement

Project description

Agent Lightning — RL Training Governance

[!IMPORTANT] Community Preview — The agentmesh-lightning package on PyPI is a community preview release for testing and evaluation only. It is not an official Microsoft-signed release. Official signed packages will be available in a future release.

Train AI agents with RL while maintaining 0% policy violations.

Part of the Agent Governance Toolkit

CI License Python PyPI

🎯 Overview

This package provides governed RL training integration:

  • Agent-Lightning = Training/Optimization (the "brains")
  • Agent-OS = Governance/Safety (the "guardrails")

Result: Agents learn to be smart AND safe from the start.

Note: This package was extracted from agent_os.integrations.agent_lightning. The old import path still works via a backward-compatibility shim but new code should import from agent_lightning_gov directly.

🚀 Quick Start

pip install agent-lightning
# Optional: pip install agent-os-kernel  # for kernel integration
from agent_lightning_gov import GovernedRunner, PolicyReward
from agent_os import KernelSpace
from agent_os.policies import SQLPolicy, CostControlPolicy

# 1. Create governed kernel
kernel = KernelSpace(policy=[
    SQLPolicy(deny=["DROP", "DELETE"]),
    CostControlPolicy(max_cost_usd=100)
])

# 2. Create governed runner
runner = GovernedRunner(kernel)

# 3. Create policy-aware reward function
def base_accuracy(rollout):
    return rollout.task_output.accuracy if rollout.success else 0.0

reward_fn = PolicyReward(kernel, base_reward_fn=base_accuracy)

# 4. Train with Agent-Lightning
from agentlightning import Trainer
trainer = Trainer(
    runner=runner,
    reward_fn=reward_fn,
    algorithm="GRPO"
)

trainer.train(num_epochs=100)

📊 Key Benefits

Metric Without Agent-OS With Agent-OS
Policy Violations 12.3% 0.0%
Task Accuracy 76.4% 79.2%
Training Stability Variable Consistent

🔧 Components

GovernedRunner

Agent-Lightning runner that enforces policies during execution:

from agent_lightning_gov import GovernedRunner

runner = GovernedRunner(
    kernel,
    fail_on_violation=False,   # Continue but penalize
    log_violations=True,        # Log all violations
)

# Execute a task
rollout = await runner.step(task_input)
print(f"Violations: {len(rollout.violations)}")
print(f"Total penalty: {rollout.total_penalty}")

PolicyReward

Converts policy violations to RL penalties:

from agent_lightning_gov import PolicyReward, RewardConfig

config = RewardConfig(
    critical_penalty=-100.0,  # Harsh penalty for critical violations
    high_penalty=-50.0,
    medium_penalty=-10.0,
    low_penalty=-1.0,
    clean_bonus=5.0,          # Bonus for no violations
)

reward_fn = PolicyReward(kernel, config=config)

# Calculate reward
reward = reward_fn(rollout)  # Base reward + policy penalties

GovernedEnvironment

Gym-compatible training environment:

from agent_lightning_gov import GovernedEnvironment

env = GovernedEnvironment(
    kernel,
    config=EnvironmentConfig(
        max_steps=100,
        terminate_on_critical=True,
    )
)

# Standard Gym interface
state, info = env.reset()
while not env.terminated:
    action = agent.get_action(state)
    state, reward, terminated, truncated, info = env.step(action)

FlightRecorderEmitter

Export audit logs to LightningStore:

from agent_os import FlightRecorder
from agent_lightning_gov import FlightRecorderEmitter

recorder = FlightRecorder()
emitter = FlightRecorderEmitter(recorder)

# Export to LightningStore
emitter.emit_to_store(lightning_store)

# Or export to file for analysis
emitter.export_to_file("training_audit.json")

# Get violation summary
summary = emitter.get_violation_summary()
print(f"Violation rate: {summary['violation_rate']:.1%}")

Ecosystem

Agent Lightning is one of 7 packages in the Agent Governance Toolkit:

Package Role
Agent OS Policy engine — deterministic action evaluation
AgentMesh Trust infrastructure — identity, credentials, protocol bridges
Agent Runtime Execution supervisor — rings, sessions, sagas
Agent SRE Reliability — SLOs, circuit breakers, chaos testing
Agent Compliance Regulatory compliance — GDPR, HIPAA, SOX frameworks
Agent Marketplace Plugin lifecycle — discover, install, verify, sign
Agent Lightning RL training governance — governed runners, policy rewards (this package)

📋 License

MIT — see LICENSE.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

agentmesh_lightning-2.3.0.tar.gz (17.1 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

agentmesh_lightning-2.3.0-py3-none-any.whl (17.6 kB view details)

Uploaded Python 3

File details

Details for the file agentmesh_lightning-2.3.0.tar.gz.

File metadata

  • Download URL: agentmesh_lightning-2.3.0.tar.gz
  • Upload date:
  • Size: 17.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.12

File hashes

Hashes for agentmesh_lightning-2.3.0.tar.gz
Algorithm Hash digest
SHA256 ef2e1482047d1a117b9c34c7fc9183958b232e098b81511e92ce13eac83a821b
MD5 0ca7796a2007a7eb1e0c293f4e7ae331
BLAKE2b-256 642b35c61196410134c760b0a7b46d1450a980ffda7b273493accc515da9d3e6

See more details on using hashes here.

File details

Details for the file agentmesh_lightning-2.3.0-py3-none-any.whl.

File metadata

File hashes

Hashes for agentmesh_lightning-2.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 2ec60d439809e041d19ad0fe4100c02e3c05980cbd66076a58f44d2af48f475d
MD5 3126141ea628563954582352cb011dea
BLAKE2b-256 f1cb496604fdd83559381410a4e4ed5cc342b5e7eba7cd23ab72fbdb565ef105

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page