Public Preview — Agent-Lightning RL integration for the Agent Governance Toolkit: governed training with policy enforcement
Agent Lightning — RL Training Governance
[!IMPORTANT] Public Preview: The `agentmesh-lightning` package on PyPI is a Microsoft-signed public preview release. APIs may change before GA.
Train AI agents with RL while maintaining 0% policy violations.
Part of the Agent Governance Toolkit
🎯 Overview
This package provides governed RL training integration:
- Agent-Lightning = Training/Optimization (the "brains")
- Agent-OS = Governance/Safety (the "guardrails")
Result: Agents learn to be smart AND safe from the start.
Note: This package was extracted from `agent_os.integrations.agent_lightning`. The old import path still works via a backward-compatibility shim, but new code should import from `agent_lightning_gov` directly.
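For code that has to run against both old and new installs during a migration, one option is to probe the two import paths in order. The sketch below is only an illustration: `first_importable` is a hypothetical helper, not part of the package, and it relies solely on the standard library's `importlib`.

```python
import importlib


def first_importable(candidates):
    """Return the first module in `candidates` that imports cleanly, else None.

    Hypothetical helper for illustration; not part of agentmesh-lightning.
    """
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    return None


# Prefer the new module name; fall back to the legacy shim path.
gov = first_importable([
    "agent_lightning_gov",
    "agent_os.integrations.agent_lightning",
])

if gov is None:
    print("agentmesh-lightning does not appear to be installed")
else:
    print(f"Using governance module: {gov.__name__}")
```

Once the migration is done, a plain `from agent_lightning_gov import ...` is simpler and is what the note above recommends.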
🚀 Quick Start
pip install agentmesh-lightning
# Optional: pip install agent-os-kernel # for kernel integration
from agent_lightning_gov import GovernedRunner, PolicyReward
from agent_os import KernelSpace
from agent_os.policies import SQLPolicy, CostControlPolicy
# 1. Create governed kernel
kernel = KernelSpace(policy=[
    SQLPolicy(deny=["DROP", "DELETE"]),
    CostControlPolicy(max_cost_usd=100),
])
# 2. Create governed runner
runner = GovernedRunner(kernel)
# 3. Create policy-aware reward function
def base_accuracy(rollout):
    return rollout.task_output.accuracy if rollout.success else 0.0
reward_fn = PolicyReward(kernel, base_reward_fn=base_accuracy)
# 4. Train with Agent-Lightning
from agentlightning import Trainer
trainer = Trainer(
    runner=runner,
    reward_fn=reward_fn,
    algorithm="GRPO",
)
trainer.train(num_epochs=100)
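To make the `deny=["DROP", "DELETE"]` idea concrete, here is a minimal, standalone sketch of a keyword deny-list check. It is not how `SQLPolicy` is implemented (the source does not describe its internals, and a real policy engine would more likely parse the statement). `find_denied_keywords` is a hypothetical name used only for illustration.

```python
import re


def find_denied_keywords(sql, deny):
    """Return the denied keywords that occur as whole words in `sql`.

    Matching is case-insensitive. Illustrative only; not the SQLPolicy logic.
    """
    hits = []
    for keyword in deny:
        pattern = rf"\b{re.escape(keyword)}\b"
        if re.search(pattern, sql, flags=re.IGNORECASE):
            hits.append(keyword)
    return hits


deny = ["DROP", "DELETE"]
print(find_denied_keywords("SELECT * FROM users", deny))
print(find_denied_keywords("drop table users", deny))
```

Whole-word matching keeps a column named `dropped` from triggering a false positive, but plain keyword matching still cannot tell a string literal from a statement, which is one reason to treat this as a sketch only.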
📊 Key Benefits
| Metric | Without Agent-OS | With Agent-OS |
|---|---|---|
| Policy Violations | 12.3% | 0.0% |
| Task Accuracy | 76.4% | 79.2% |
| Training Stability | Variable | Consistent |
🔧 Components
GovernedRunner
Agent-Lightning runner that enforces policies during execution:
from agent_lightning_gov import GovernedRunner
runner = GovernedRunner(
    kernel,
    fail_on_violation=False,  # Continue but penalize
    log_violations=True,      # Log all violations
)
# Execute a task (runner.step is awaited, so call it from inside an async function)
rollout = await runner.step(task_input)
print(f"Violations: {len(rollout.violations)}")
print(f"Total penalty: {rollout.total_penalty}")
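The difference between failing on a violation and continuing with a penalty can be sketched without the toolkit. Everything below is hypothetical (`ToyRollout`, `run_step`, `PolicyViolationError`, the flat per-violation penalty, and its sign). It only illustrates the two behaviours that the `fail_on_violation` flag suggests, not the actual `GovernedRunner` logic.

```python
from dataclasses import dataclass, field


class PolicyViolationError(Exception):
    """Raised by the toy runner when fail_on_violation is True."""


@dataclass
class ToyRollout:
    output: str
    violations: list = field(default_factory=list)
    total_penalty: float = 0.0


def check_sql(action):
    """Toy policy check: flag any action containing the word DROP."""
    return ["DROP"] if "DROP" in action.upper().split() else []


def run_step(action, check, fail_on_violation=False, penalty_per_violation=10.0):
    """Run one step, then either raise or record a penalty for violations."""
    violations = check(action)
    if violations and fail_on_violation:
        raise PolicyViolationError(f"blocked: {violations}")
    return ToyRollout(
        output=action,
        violations=violations,
        # Assumption: the penalty is a positive magnitude, one unit per violation.
        total_penalty=penalty_per_violation * len(violations),
    )


rollout = run_step("DROP TABLE users", check_sql)
print(f"Violations: {len(rollout.violations)}")
print(f"Total penalty: {rollout.total_penalty}")
```

Continuing with a penalty keeps the episode in the training data, so the optimizer sees a negative signal for the violating action instead of a crashed rollout.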
PolicyReward
Converts policy violations to RL penalties:
from agent_lightning_gov import PolicyReward, RewardConfig
config = RewardConfig(
    critical_penalty=-100.0,  # Harsh penalty for critical violations
    high_penalty=-50.0,
    medium_penalty=-10.0,
    low_penalty=-1.0,
    clean_bonus=5.0,          # Bonus for no violations
)
reward_fn = PolicyReward(kernel, config=config)
# Calculate reward
reward = reward_fn(rollout) # Base reward + policy penalties
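As a rough illustration of "base reward + policy penalties", the sketch below adds one penalty per violation by severity and applies the clean bonus only when there are no violations. The additive rule, the `ToyRewardConfig` class, and the `shaped_reward` function are assumptions made for this example. The source does not spell out how `PolicyReward` combines the terms.

```python
from dataclasses import dataclass


@dataclass
class ToyRewardConfig:
    """Hypothetical stand-in mirroring the RewardConfig fields shown above."""
    critical_penalty: float = -100.0
    high_penalty: float = -50.0
    medium_penalty: float = -10.0
    low_penalty: float = -1.0
    clean_bonus: float = 5.0


def shaped_reward(base_reward, severities, config=None):
    """Combine a base reward with per-violation penalties (assumed additive)."""
    config = config or ToyRewardConfig()
    if not severities:
        # No violations: reward the clean rollout.
        return base_reward + config.clean_bonus
    penalty_by_severity = {
        "critical": config.critical_penalty,
        "high": config.high_penalty,
        "medium": config.medium_penalty,
        "low": config.low_penalty,
    }
    return base_reward + sum(penalty_by_severity[s] for s in severities)


print(shaped_reward(1.0, []))
print(shaped_reward(1.0, ["critical", "low"]))
```

With penalties this much larger than a typical task reward, a single critical violation dominates the signal, which matches the stated intent of a "harsh penalty".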
GovernedEnvironment
Gym-compatible training environment:
# EnvironmentConfig is assumed to be exported alongside GovernedEnvironment
from agent_lightning_gov import GovernedEnvironment, EnvironmentConfig
env = GovernedEnvironment(
    kernel,
    config=EnvironmentConfig(
        max_steps=100,
        terminate_on_critical=True,
    ),
)
# Standard Gym interface
state, info = env.reset()
while not env.terminated:
    action = agent.get_action(state)
    state, reward, terminated, truncated, info = env.step(action)
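The interplay of `max_steps` and `terminate_on_critical` can be seen in a toy environment that follows the same five-value `step` convention. `ToyGovernedEnv` and `run_episode` are hypothetical and have no dependency on the toolkit or on Gym. The reward values are arbitrary placeholders.

```python
class ToyGovernedEnv:
    """Toy environment: a 'critical' action ends the episode if configured to."""

    def __init__(self, max_steps=100, terminate_on_critical=True):
        self.max_steps = max_steps
        self.terminate_on_critical = terminate_on_critical
        self.steps = 0

    def reset(self):
        self.steps = 0
        return "start", {}

    def step(self, action):
        self.steps += 1
        critical = action == "critical"
        reward = -100.0 if critical else 1.0
        # terminated: the episode ended because of what happened in it.
        terminated = critical and self.terminate_on_critical
        # truncated: the episode was cut off by the step budget.
        truncated = (not terminated) and self.steps >= self.max_steps
        return f"step-{self.steps}", reward, terminated, truncated, {"critical": critical}


def run_episode(env, actions):
    """Play `actions` until the env terminates, truncates, or actions run out."""
    state, info = env.reset()
    total = 0.0
    for action in actions:
        state, reward, terminated, truncated, info = env.step(action)
        total += reward
        if terminated or truncated:
            break
    return total, env.steps


print(run_episode(ToyGovernedEnv(max_steps=3), ["ok"] * 10))
print(run_episode(ToyGovernedEnv(), ["ok", "critical", "ok"]))
```

Checking both `terminated` and `truncated` in the loop is the usual pattern for this interface, since a step budget ends an episode without marking it as terminated.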
FlightRecorderEmitter
Export audit logs to LightningStore:
from agent_os import FlightRecorder
from agent_lightning_gov import FlightRecorderEmitter
recorder = FlightRecorder()
emitter = FlightRecorderEmitter(recorder)
# Export to LightningStore
emitter.emit_to_store(lightning_store)
# Or export to file for analysis
emitter.export_to_file("training_audit.json")
# Get violation summary
summary = emitter.get_violation_summary()
print(f"Violation rate: {summary['violation_rate']:.1%}")
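For offline analysis of an exported audit file, a violation rate can be computed with a few lines of standard-library Python. The record layout used here (a JSON list of objects with a `violations` list) is an assumption for illustration. The actual schema written by `export_to_file` is not documented in this README.

```python
import json


def violation_rate(records):
    """Fraction of records that carry at least one violation."""
    if not records:
        return 0.0
    flagged = sum(1 for record in records if record.get("violations"))
    return flagged / len(records)


# Hypothetical audit records; a real file would be read with json.load().
sample = json.loads("""
[
  {"step": 1, "violations": []},
  {"step": 2, "violations": ["DROP"]},
  {"step": 3, "violations": []},
  {"step": 4, "violations": []}
]
""")

print(f"Violation rate: {violation_rate(sample):.1%}")
```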
Ecosystem
Agent Lightning is one of 7 packages in the Agent Governance Toolkit:
| Package | Role |
|---|---|
| Agent OS | Policy engine — deterministic action evaluation |
| AgentMesh | Trust infrastructure — identity, credentials, protocol bridges |
| Agent Runtime | Execution supervisor — rings, sessions, sagas |
| Agent SRE | Reliability — SLOs, circuit breakers, chaos testing |
| Agent Compliance | Regulatory compliance — GDPR, HIPAA, SOX frameworks |
| Agent Marketplace | Plugin lifecycle — discover, install, verify, sign |
| Agent Lightning | RL training governance — governed runners, policy rewards (this package) |
📋 License
MIT — see LICENSE.