
A kernel architecture for governing autonomous AI agents with Nexus Trust Exchange


Agent OS

A kernel architecture for governing autonomous AI agents





🎯 What You'll Build in 5 Minutes

from agent_os import KernelSpace, Policy

# 1. Define safety policies (not prompts - actual enforcement)
kernel = KernelSpace(policies=[
    Policy.no_destructive_sql(),      # Block DROP, DELETE without WHERE
    Policy.file_access("/workspace"), # Restrict file access
    Policy.rate_limit(100, "1m"),     # Max 100 calls/minute
])

# 2. Your agent code runs in user space
@kernel.register
async def data_analyst(query: str):
    result = await llm.generate(f"Analyze: {query}")
    return result

# 3. Kernel intercepts and validates EVERY action
result = await kernel.execute(data_analyst, "revenue by region")
# ✅ Safe queries execute
# ❌ "DROP TABLE users" → BLOCKED (not by prompt, by kernel)

Result: Defined policies are deterministically enforced by the kernel, not by hoping the LLM follows instructions.
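
To make "deterministic enforcement" concrete, here is a minimal sketch of the kind of check a rule like `Policy.no_destructive_sql()` might perform. It is not the library's implementation: the function name `is_destructive` and the regular expressions are assumptions chosen for illustration, and the statement splitting is deliberately naive (it does not handle semicolons inside string literals).

```python
import re

# Hypothetical patterns; the real Policy.no_destructive_sql() may differ.
_DROP = re.compile(r"^\s*DROP\s+(TABLE|DATABASE|SCHEMA)\b", re.IGNORECASE)
_DELETE = re.compile(r"^\s*DELETE\s+FROM\b", re.IGNORECASE)
_WHERE = re.compile(r"\bWHERE\b", re.IGNORECASE)


def is_destructive(sql: str) -> bool:
    """Return True for DROP statements and for DELETE without a WHERE clause."""
    # Naive split: good enough for a sketch, not for production SQL parsing.
    for statement in sql.split(";"):
        if _DROP.search(statement):
            return True
        if _DELETE.search(statement) and not _WHERE.search(statement):
            return True
    return False
```

The same input always produces the same decision, which is the property the text describes. Because this is a blocklist, a destructive statement the patterns do not anticipate would still pass.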


What is Agent OS?

Agent OS applies operating system concepts to AI agent governance. Instead of relying on prompts to enforce safety ("please don't do dangerous things"), it provides application-level middleware that intercepts and validates agent actions before execution.

Note: This is application-level enforcement (Python middleware), not OS kernel-level isolation. Agents run in the same process. For true isolation, run agents in containers.

┌─────────────────────────────────────────────────────────┐
│              USER SPACE (Agent Code)                    │
│   Your agent code runs here. The kernel intercepts      │
│   actions before they execute.                          │
├─────────────────────────────────────────────────────────┤
│              KERNEL SPACE                               │
│   Policy Engine │ Flight Recorder │ Signal Dispatch     │
│   Actions are checked against policies before execution │
└─────────────────────────────────────────────────────────┘

The Idea

Prompt-based safety asks the LLM to follow rules. The LLM decides whether to comply.

Kernel-based safety intercepts actions before execution. The policy engine decides, not the LLM.

This is the same principle operating systems use: applications request resources, the kernel grants or denies access based on permissions.
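
The request-then-grant-or-deny pattern can be sketched in a few lines of plain Python. Everything below (`MiniKernel`, `PolicyViolation`, `no_drop_table`, `run_sql`) is a hypothetical stand-in written for this illustration, not the Agent OS API: each policy is a function that returns a reason to block an action, or `None` to allow it.

```python
import asyncio
from typing import Awaitable, Callable, List, Optional

# A policy returns a human-readable reason to block, or None to allow.
Policy = Callable[[str], Optional[str]]


class PolicyViolation(Exception):
    """Raised when a policy rejects an action."""


class MiniKernel:
    """Hypothetical stand-in for a kernel: check every action, then run it."""

    def __init__(self, policies: List[Policy]):
        self.policies = policies
        self.audit_log: List[str] = []

    async def execute(self, action: Callable[[str], Awaitable[str]], payload: str) -> str:
        for policy in self.policies:
            reason = policy(payload)
            if reason is not None:
                self.audit_log.append(f"BLOCKED: {payload!r} ({reason})")
                raise PolicyViolation(reason)
        self.audit_log.append(f"ALLOWED: {payload!r}")
        return await action(payload)


def no_drop_table(payload: str) -> Optional[str]:
    return "DROP TABLE is not allowed" if "drop table" in payload.lower() else None


async def run_sql(sql: str) -> str:
    # Stand-in for a real tool call.
    return f"ran: {sql}"


async def demo() -> List[str]:
    kernel = MiniKernel([no_drop_table])
    results = [await kernel.execute(run_sql, "SELECT 1")]
    try:
        await kernel.execute(run_sql, "DROP TABLE users")
    except PolicyViolation as exc:
        results.append(f"blocked: {exc}")
    return results
```

Running `asyncio.run(demo())` returns one executed result and one blocked result. The decision is made by ordinary code, so the model's output cannot talk its way past it; code that calls `run_sql` directly, without going through `execute`, is not checked at all.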


Architecture

agent-os/
├── src/agent_os/             # Core Python package
│   ├── __init__.py           # Public API
│   ├── cli.py                # Command-line interface
│   └── integrations/         # Framework adapters
├── modules/                  # Kernel Modules (4-layer architecture)
│   ├── primitives/           # Layer 1: Base types and failures
│   ├── cmvk/                 # Layer 1: Cross-model verification
│   ├── emk/                  # Layer 1: Episodic memory kernel
│   ├── caas/                 # Layer 1: Context-as-a-Service
│   ├── amb/                  # Layer 2: Agent message bus
│   ├── iatp/                 # Layer 2: Inter-agent trust protocol
│   ├── atr/                  # Layer 2: Agent tool registry
│   ├── observability/        # Layer 2: Prometheus + OpenTelemetry
│   ├── control-plane/        # Layer 3: THE KERNEL (policies, signals)
│   ├── scak/                 # Layer 4: Self-correcting agent kernel
│   ├── mute-agent/           # Layer 4: Face/Hands architecture
│   └── mcp-kernel-server/    # Integration: MCP protocol support
├── extensions/               # IDE & AI Assistant Extensions
│   ├── vscode/               # VS Code extension
│   ├── mcp-server/           # Claude Desktop MCP Server
│   ├── copilot/              # GitHub Copilot integration
│   ├── jetbrains/            # IntelliJ/PyCharm plugin
│   ├── cursor/               # Cursor IDE extension
│   ├── chrome/               # Chrome extension
│   └── github-cli/           # gh CLI extension
├── examples/                 # Working examples
│   ├── quickstart/           # Start here: my_first_agent.py
│   ├── demo-app/             # Full demo application
│   ├── hello-world/          # Minimal example
│   └── [domain examples]/    # Real-world use cases
├── docs/                     # Documentation
├── tests/                    # Test suite (organized by layer)
├── notebooks/                # Jupyter tutorials
└── templates/                # Policy templates

Core Modules

| Module | Layer | Description |
|---|---|---|
| primitives | 1 | Base types and failure modes |
| cmvk | 2 | Cross-model verification (consensus across LLMs) |
| amb | 2 | Agent message bus (decoupled communication) |
| iatp | 2 | Inter-agent trust protocol (sidecar-based) |
| emk | 2 | Episodic memory kernel (append-only ledger) |
| control-plane | 3 | THE KERNEL - Policy engine, signals, VFS |
| observability | 3 | Prometheus metrics + OpenTelemetry tracing |
| scak | 4 | Self-correcting agent kernel |
| mute-agent | 4 | Decoupled reasoning/execution architecture |
| atr | 4 | Agent tool registry (runtime discovery) |
| caas | 4 | Context-as-a-Service (RAG routing) |
| mcp-kernel-server | Integration | MCP server for Claude Desktop |
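
The table describes cmvk as "consensus across LLMs". One simple reading of that idea is a majority vote over answers from several models, sketched below. This is an assumption about the general technique, not a description of how the cmvk module works; the function `consensus` and its `quorum` parameter are hypothetical.

```python
from collections import Counter
from typing import Optional, Sequence


def consensus(answers: Sequence[str], quorum: float = 0.5) -> Optional[str]:
    """Return the answer given by more than `quorum` of the models, else None."""
    if not answers:
        return None
    # Normalize lightly so trivial formatting differences do not split the vote.
    normalized = [a.strip().lower() for a in answers]
    answer, votes = Counter(normalized).most_common(1)[0]
    return answer if votes / len(normalized) > quorum else None
```

For example, `consensus(["Paris", "paris ", "Lyon"])` returns `"paris"`, while an even split such as `consensus(["a", "b"])` returns `None`, which a caller could treat as "unverified".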

IDE & CLI Extensions

| Extension | Description |
|---|---|
| vscode | VS Code extension with real-time policy checks |
| jetbrains | IntelliJ, PyCharm, WebStorm plugin |
| cursor | Cursor IDE extension (Composer integration) |
| copilot | GitHub Copilot safety layer |
| mcp-server | (NEW) MCP Server for Claude Desktop |
| github-cli | gh agent-os CLI extension |
| chrome | Chrome extension for web agents |

Install

pip install agent-os-kernel

Or with optional components:

pip install agent-os-kernel[cmvk]           # + cross-model verification
pip install agent-os-kernel[iatp]           # + inter-agent trust
pip install agent-os-kernel[observability]  # + Prometheus/OpenTelemetry
pip install agent-os-kernel[full]           # Everything

One-Command Quickstart

macOS/Linux:

curl -sSL https://raw.githubusercontent.com/imran-siddique/agent-os/main/scripts/quickstart.sh | bash

Windows (PowerShell):

iwr -useb https://raw.githubusercontent.com/imran-siddique/agent-os/main/scripts/quickstart.ps1 | iex

Quick Example

from agent_os import KernelSpace

# Create kernel with policy
kernel = KernelSpace(policy="strict")

@kernel.register
async def my_agent(task: str):
    # Your LLM code here (`llm` is a placeholder for your own client)
    return await llm.generate(task)

# Actions are checked against policies
result = await kernel.execute(my_agent, "analyze this data")

POSIX-Inspired Primitives

Agent OS borrows concepts from POSIX operating systems:

| Concept | POSIX | Agent OS |
|---|---|---|
| Process control | SIGKILL, SIGSTOP | AgentSignal.SIGKILL, AgentSignal.SIGSTOP |
| Filesystem | /proc, /tmp | VFS with /mem/working, /mem/episodic |
| IPC | Pipes (\|) | Typed IPC pipes between agents |
| Syscalls | open(), read() | kernel.execute() |

Signals

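from agent_os import SignalDispatcher, AgentSignal

# Assumption: SignalDispatcher needs no constructor arguments;
# the original snippet does not show how `dispatcher` is created.
dispatcher = SignalDispatcher()
agent_id = "agent-001"

dispatcher.signal(agent_id, AgentSignal.SIGSTOP)  # Pause
dispatcher.signal(agent_id, AgentSignal.SIGCONT)  # Resume
dispatcher.signal(agent_id, AgentSignal.SIGKILL)  # Terminate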
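
The pause/resume/terminate semantics can be modeled as a small state machine. The sketch below is a self-contained illustration, not the Agent OS dispatcher: `MiniDispatcher`, `Signal`, `State`, and the `spawn` method are hypothetical names, and the rule that a terminated agent ignores later signals is an assumption borrowed from how POSIX processes behave.

```python
from enum import Enum, auto
from typing import Dict


class Signal(Enum):
    SIGSTOP = auto()
    SIGCONT = auto()
    SIGKILL = auto()


class State(Enum):
    RUNNING = auto()
    STOPPED = auto()
    TERMINATED = auto()


class MiniDispatcher:
    """Hypothetical dispatcher that tracks one state per agent."""

    def __init__(self) -> None:
        self.states: Dict[str, State] = {}

    def spawn(self, agent_id: str) -> None:
        self.states[agent_id] = State.RUNNING

    def signal(self, agent_id: str, sig: Signal) -> State:
        state = self.states[agent_id]
        if state is State.TERMINATED:
            return state  # a terminated agent ignores further signals
        if sig is Signal.SIGKILL:
            state = State.TERMINATED
        elif sig is Signal.SIGSTOP:
            state = State.STOPPED
        elif sig is Signal.SIGCONT:
            state = State.RUNNING
        self.states[agent_id] = state
        return state
```

A real dispatcher would also have to make the agent's execution loop honor the state (for example by checking it between actions); this sketch only tracks the transitions.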

VFS (Virtual File System)

from agent_os import AgentVFS

vfs = AgentVFS(agent_id="agent-001")
vfs.write("/mem/working/task.txt", "Current task")
vfs.read("/policy/rules.yaml")  # Read-only from user space
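
The "read-only from user space" behavior can be illustrated with an in-memory stand-in. `MiniVFS` and `ReadOnlyPathError` below are hypothetical names for this sketch, and seeding policy files through the constructor is an assumption about how a kernel-side component might populate `/policy`.

```python
from typing import Dict


class ReadOnlyPathError(PermissionError):
    """Raised when user-space code writes to a read-only mount."""


class MiniVFS:
    """Hypothetical in-memory VFS with one read-only prefix."""

    READ_ONLY_PREFIXES = ("/policy/",)

    def __init__(self, agent_id: str, policy_files: Dict[str, str]):
        self.agent_id = agent_id
        # Policy files are seeded by the "kernel" side at construction time.
        self._files: Dict[str, str] = dict(policy_files)

    def write(self, path: str, data: str) -> None:
        if path.startswith(self.READ_ONLY_PREFIXES):
            raise ReadOnlyPathError(f"{path} is read-only from user space")
        self._files[path] = data

    def read(self, path: str) -> str:
        return self._files[path]
```

With this sketch, writing to `/mem/working/task.txt` succeeds, reading `/policy/rules.yaml` succeeds, and writing to `/policy/rules.yaml` raises `ReadOnlyPathError`.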

Framework Integrations

Wrap existing frameworks with Agent OS governance:

# LangChain
from agent_os.integrations import LangChainKernel
governed = LangChainKernel().wrap(my_chain)

# OpenAI Assistants
from agent_os.integrations import OpenAIKernel
governed = OpenAIKernel().wrap_assistant(assistant, client)

# Semantic Kernel
from agent_os.integrations import SemanticKernelWrapper
governed = SemanticKernelWrapper().wrap(sk_kernel)

# CrewAI
from agent_os.integrations import CrewAIKernel
governed = CrewAIKernel().wrap(my_crew)

See integrations documentation for full details.


How It Differs from Other Tools

Agent Frameworks (LangChain, CrewAI): Build agents. Agent OS governs them. Use together.

Safety Tools (NeMo Guardrails, LlamaGuard): Input/output filtering. Agent OS intercepts actions mid-execution.

| Tool | Focus | When it acts |
|---|---|---|
| LangChain/CrewAI | Building agents | N/A (framework) |
| NeMo Guardrails | Input/output filtering | Before/after LLM call |
| LlamaGuard | Content classification | Before/after LLM call |
| Agent OS | Action interception | During execution |

You can use them together:

from langchain.agents import AgentExecutor
from agent_os import KernelSpace

kernel = KernelSpace(policy="strict")

# `agent_executor` is assumed to be an AgentExecutor you have already constructed
@kernel.govern
async def my_langchain_agent(task: str):
    return agent_executor.invoke({"input": task})

Examples

The examples/ directory contains working demos:

Getting Started

| Demo | Description | Command |
|---|---|---|
| hello-world | Simplest example (15 lines) | `cd examples/hello-world && python agent.py` |
| chat-agent | Interactive chatbot with memory | `cd examples/chat-agent && python chat.py` |
| tool-using-agent | Agent with safe tools | `cd examples/tool-using-agent && python agent.py` |

Production Demos (with Observability)

| Demo | Description | Command |
|---|---|---|
| carbon-auditor | Multi-model verification | `cd examples/carbon-auditor && docker-compose up` |
| grid-balancing | Multi-agent coordination (100 agents) | `cd examples/grid-balancing && docker-compose up` |
| defi-sentinel | Real-time attack detection | `cd examples/defi-sentinel && docker-compose up` |
| pharma-compliance | Document analysis | `cd examples/pharma-compliance && docker-compose up` |

Each production demo includes:

  • Grafana dashboard on port 300X
  • Prometheus metrics on port 909X
  • Jaeger tracing on port 1668X

# Run carbon auditor with full observability
cd examples/carbon-auditor
cp .env.example .env  # Optional: add API keys
docker-compose up

# Open dashboards (`open` is the macOS launcher; on Linux use `xdg-open`)
open http://localhost:3000  # Grafana (admin/admin)
open http://localhost:16686 # Jaeger traces

Safe Tool Plugins

Agent OS includes pre-built safe tools for agents:

from atr.tools.safe import create_safe_toolkit

toolkit = create_safe_toolkit("standard")

# Available tools
http = toolkit["http"]        # Rate-limited HTTP with domain whitelisting
files = toolkit["files"]      # Sandboxed file reader
calc = toolkit["calculator"]  # Safe math (no eval)
json = toolkit["json"]        # Safe JSON/YAML parsing
dt = toolkit["datetime"]      # Timezone-aware datetime
text = toolkit["text"]        # Text processing

# Use a tool
result = await http.get("https://api.github.com/users/octocat")
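
"Safe math (no eval)" usually means parsing the expression and evaluating only an allowlist of node types. The sketch below shows that technique with the standard `ast` module; it is an illustration of the general approach, not the toolkit's calculator, and `safe_eval` is a hypothetical name. Exponentiation is left out on purpose so that an input like `9**9**9` cannot consume unbounded time.

```python
import ast
import operator

_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_eval(expression: str) -> float:
    """Evaluate +, -, *, / on numeric literals; reject everything else."""

    def walk(node: ast.AST) -> float:
        if isinstance(node, ast.Expression):
            return walk(node.body)
        # Only plain int/float literals are allowed (bool is excluded).
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](walk(node.operand))
        raise ValueError(f"unsupported expression: {ast.dump(node)}")

    return walk(ast.parse(expression, mode="eval"))
```

Names, attribute access, and function calls never reach an evaluator, so an input such as `__import__('os')` raises `ValueError` instead of running. Malformed input raises `SyntaxError` from `ast.parse`.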

See Creating Custom Tools for more.


Message Bus Adapters

Connect agents using various message brokers:

from amb_core.adapters import RedisBroker, KafkaBroker, NATSBroker

# Redis (low latency)
broker = RedisBroker(url="redis://localhost:6379")

# Kafka (high throughput)
broker = KafkaBroker(bootstrap_servers="localhost:9092")

# NATS (cloud-native)
broker = NATSBroker(servers=["nats://localhost:4222"])

# Also: AzureServiceBusBroker, AWSSQSBroker

See Message Bus Adapters Guide for details.
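
What these adapters have in common is topic-based publish/subscribe, which keeps senders and receivers decoupled. The in-process sketch below shows that contract without any external broker. `InMemoryBroker` and its `subscribe`/`publish` methods are hypothetical and are not claimed to match the `amb_core` adapter interface.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[dict], None]


class InMemoryBroker:
    """Hypothetical broker: routes messages by topic within one process."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> int:
        """Deliver `message` to every subscriber of `topic`; return the count."""
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(message)
        return len(handlers)
```

The publisher only knows the topic name, never the receiving agent, which is the decoupling a message bus provides. A stand-in like this can be handy in unit tests where running Redis, Kafka, or NATS is impractical.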


CLI Tool

Agent OS includes a CLI for terminal workflows:

# Check files for safety violations
agentos check src/app.py

# Check staged git files (pre-commit)
agentos check --staged

# Multi-model code review
agentos review src/app.py --cmvk

# Install git pre-commit hook
agentos install-hooks

# Initialize Agent OS in project
agentos init

MCP Integration (Claude Desktop)

Agent OS provides an MCP server for Claude Desktop integration:

# Install
pip install agent-os-kernel[mcp]

# Run MCP server
mcp-kernel-server --stdio

# Or add to claude_desktop_config.json:
{
  "mcpServers": {
    "agent-os": {
      "command": "mcp-kernel-server",
      "args": ["--stdio"]
    }
  }
}

Exposes tools: cmvk_verify, kernel_execute, iatp_sign, iatp_verify

See MCP server documentation.


Documentation

Tutorials

Interactive Notebooks

Learn by doing with Jupyter notebooks:

| Notebook | Description | Time |
|---|---|---|
| Hello Agent OS | Your first governed agent | 5 min |
| Episodic Memory | Agent memory that persists | 15 min |
| Time-Travel Debugging | Replay and debug decisions | 20 min |
| Cross-Model Verification | Detect hallucinations | 15 min |
| Multi-Agent Coordination | Trust between agents | 20 min |
| Policy Engine | Deep dive into policies | 15 min |

Reference


Status

This is a research project exploring kernel concepts for AI agent governance. The code is functional but evolving.

Core (Production-Ready)

The minimal trust boundary that's small enough to audit:

  • Policy Engine: Deterministic rule enforcement for defined patterns
  • Flight Recorder: SQLite-based audit logging (see known limitations below)
  • SDK Adapters: Intercept tool calls at SDK boundary (OpenAI, LangChain, CrewAI)

Extensions (Experimental)

Additional capabilities built on the core:

  • Cross-model verification (CMVK), Inter-agent trust (IATP)
  • Supervisor agents, Constraint graphs, Shadow mode
  • IDE extensions (VS Code, JetBrains, Copilot)
  • Observability (Prometheus, OpenTelemetry)
  • Message bus adapters (Redis, Kafka, NATS)

Known Architectural Limitations

Be aware of these design constraints:

| Limitation | Impact | Mitigation |
|---|---|---|
| Application-level only | Direct stdlib calls (subprocess, open) bypass kernel | Pair with container isolation for production |
| Blocklist-based policies | Novel attack patterns not in rules will pass | Add AST-level parsing (#32), use defense in depth |
| Shadow Mode single-step | Multi-step agent simulations diverge from reality | Use for single-turn validation only |
| No tamper-proof audit | Flight Recorder SQLite can be modified by compromised agent | Write to external sink for critical audits |
| Provider-coupled adapters | Each SDK needs separate adapter | Abstract interface planned (#47) |
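
The audit limitation is easier to see with a concrete example. The sketch below is not the Flight Recorder: it is a hypothetical SQLite log in which each entry stores a SHA-256 digest of the previous entry's hash plus its own content, so an in-place edit becomes detectable. The table layout and the function names `open_log`, `append`, and `verify` are assumptions.

```python
import hashlib
import json
import sqlite3

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, event: dict) -> str:
    payload = prev_hash + json.dumps(event, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def open_log(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS audit ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, event TEXT NOT NULL, hash TEXT NOT NULL)"
    )
    return conn


def append(conn: sqlite3.Connection, event: dict) -> str:
    row = conn.execute("SELECT hash FROM audit ORDER BY id DESC LIMIT 1").fetchone()
    entry_hash = _digest(row[0] if row else GENESIS, event)
    conn.execute(
        "INSERT INTO audit (event, hash) VALUES (?, ?)",
        (json.dumps(event, sort_keys=True), entry_hash),
    )
    conn.commit()
    return entry_hash


def verify(conn: sqlite3.Connection) -> bool:
    """Recompute the chain from the start; False means some entry was altered."""
    prev = GENESIS
    for event_json, stored in conn.execute("SELECT event, hash FROM audit ORDER BY id"):
        if _digest(prev, json.loads(event_json)) != stored:
            return False
        prev = stored
    return True
```

This gives tamper evidence, not tamper proofing: an agent with write access to the database could rewrite every hash consistently. That is why the mitigation in the table still applies, for instance by periodically copying the latest hash to an external sink the agent cannot reach.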

See GitHub Issues for the full roadmap.

  • Some integrations are basic wrappers

Troubleshooting

Common Issues

ModuleNotFoundError: No module named 'agent_os'

# Install from source
git clone https://github.com/imran-siddique/agent-os.git
cd agent-os
pip install -e .

Permission errors on Windows

# Run PowerShell as Administrator, or use --user flag
pip install --user -e .

Docker not working

# Build with Dockerfile (no Docker Compose needed for simple tests)
docker build -t agent-os .
docker run -it agent-os python examples/hello-world/agent.py

Tests failing with API errors

# Most tests work without API keys - mock mode is default
pytest tests/ -v

# For real LLM tests, set environment variables
export OPENAI_API_KEY=sk-...
export ANTHROPIC_API_KEY=sk-ant-...

FAQ

Q: How is this different from LangChain/CrewAI?
A: LangChain and CrewAI are frameworks for building agents. Agent OS is infrastructure for governing them. Use them together: wrap your LangChain agent with Agent OS for safety guarantees.

Q: What does "deterministic enforcement" mean?
A: When a policy matches an action, that action is blocked, and not by asking the LLM nicely. The middleware intercepts and stops it. However, this only works for patterns the policy engine knows about. Novel attacks that don't match defined rules will pass through.

Q: Do I need to rewrite my agents?
A: No. Agent OS provides integration wrappers for LangChain, CrewAI, AutoGen, OpenAI Assistants, and Semantic Kernel. Wrap your existing code and add governance.

Q: Does it work with local models (Ollama, llama.cpp)?
A: Yes. Agent OS is model-agnostic: it governs what agents do, not what LLM they use.

Q: How do I contribute?
A: See CONTRIBUTING.md for guidelines. Good first issues are labeled in GitHub.


Contributing

git clone https://github.com/imran-siddique/agent-os.git
cd agent-os
pip install -e ".[dev]"
pytest

License

MIT - See LICENSE


Exploring kernel concepts for AI agent safety.

