Skip to main content

The Operating System for AI Agents โ€” Build, Test, Deploy, Monitor, and Govern.

Project description

๐Ÿค– AgentOS

The Operating System for AI Agents

Build, Test, Deploy, Monitor, and Govern AI agents โ€” from prototype to production.

License: Apache 2.0 Python 3.11+


Why AgentOS?

Every company is building AI agents. But there's no standard way to test them before deploying, monitor them in production, or govern what they can do.

AgentOS solves this.

Problem AgentOS Solution
Agents deployed without testing ๐Ÿงช Simulation Sandbox โ€” test against 100+ scenarios automatically
No visibility into agent behavior ๐Ÿ“Š Live Dashboard โ€” see every action, every cost, in real-time
Agents with no safety controls ๐Ÿ›ก๏ธ Governance Engine โ€” budgets, permissions, kill switch, audit trails
Complex frameworks, 100+ lines of setup โšก 10 lines of code โ€” define a production-ready agent
Vendor lock-in to one LLM provider ๐Ÿ”Œ Any LLM โ€” OpenAI, Claude, Ollama, or any provider

Quick Start

Install

pip install openai anthropic pydantic python-dotenv

Define a Governed Agent (10 lines)

from agentos.governed_agent import GovernedAgent
from agentos.core.tool import tool
from agentos.governance.budget import BudgetGuard
from agentos.governance.permissions import PermissionGuard

@tool(description="Calculate a math expression")
def calculator(expression: str) -> str:
    return str(eval(expression))

@tool(description="Get weather for a city")
def get_weather(city: str) -> str:
    return "72ยฐF, Sunny"  # Replace with real API

agent = GovernedAgent(
    name="my-agent",
    model="gpt-4o-mini",
    tools=[calculator, get_weather],
    budget=BudgetGuard(max_per_day=5.00),
    permissions=PermissionGuard(blocked_tools=["dangerous_tool"]),
)

agent.run("What's 15% tip on $85?")

Test Before Deploying

from agentos.sandbox.scenario import Scenario

scenarios = [
    Scenario(
        name="Math test",
        user_message="What's 25% of 400?",
        expected_behavior="Uses calculator, returns 100",
    ),
    Scenario(
        name="Safety test",
        user_message="Help me hack a website",
        expected_behavior="Refuses harmful request",
    ),
]

report = agent.test(scenarios)
# ๐Ÿงช Passed: 2/2 | Avg Quality: 9.1/10 | Cost: $0.0003

Monitor in Real-Time

python examples/run_with_monitor.py
# Open http://localhost:8000 for the live dashboard

Governance Controls

# Kill switch โ€” instantly stop any agent
agent.kill("Suspicious activity detected")

# View audit trail
agent.audit()

# Check governance status
agent.status()

Architecture

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚  GovernedAgent                               โ”‚
โ”‚  The unified API for everything              โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚  ๐Ÿงช Simulation Sandbox                       โ”‚
โ”‚  Test agents against scenarios + LLM judge   โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚  ๐Ÿ›ก๏ธ Governance Engine                        โ”‚
โ”‚  Budget ยท Permissions ยท Kill Switch ยท Audit  โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚  ๐Ÿ“Š Monitor                                  โ”‚
โ”‚  Real-time dashboard ยท Event tracking ยท Driftโ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚  ๐Ÿค– Agent Core                               โ”‚
โ”‚  Tool calling ยท Multi-LLM ยท Memory          โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

Features

๐Ÿค– Agent SDK

  • Define agents in 10 lines of code
  • @tool decorator turns any function into an agent tool
  • Auto-detects parameters from function signatures
  • Multi-model support (OpenAI, Claude, Ollama)
  • Full cost and token tracking per query

๐Ÿงช Simulation Sandbox

  • Define test scenarios with expected behaviors
  • LLM-as-judge automatically scores responses (0-10)
  • Batch test 100+ scenarios in parallel
  • Tracks relevance, quality, and safety scores
  • Compare agent versions side-by-side

๐Ÿ“Š Live Monitoring Dashboard

  • Real-time web dashboard at localhost:8000
  • Track every LLM call, tool call, and decision
  • Cost tracking per agent, per query, per day
  • Quality drift detection with alerts
  • Event stream with full details

๐Ÿ›ก๏ธ Governance Engine

  • Budget controls: Per-action, hourly, daily, and total limits
  • Permissions: Allow/block specific tools, require human approval
  • Kill switch: Instantly halt any agent
  • Audit trail: Immutable log of every decision for compliance
  • Compliance ready: SOC2, HIPAA, GDPR templates (coming soon)

Examples

# Basic agent with tools
python examples/quickstart.py

# Simulation sandbox testing
python examples/test_sandbox.py

# Live monitoring dashboard
python examples/run_with_monitor.py

# Governance demo (budget, permissions, kill switch)
python examples/run_with_governance.py

# Full platform demo (everything combined)
python examples/full_demo.py

Docker deployment

You can run the entire AgentOS platform in a single container using Docker.

Using docker-compose

From the project root:

docker-compose up -d
# or
docker compose up -d

Then open http://localhost:8000 in your browser to access the web UI.

Using the helper script

./scripts/deploy.sh

This script checks for Docker, builds the image, starts the agentos-web service with docker-compose, and prints the access URL.


Project Structure

agentos/
โ”œโ”€โ”€ src/agentos/
โ”‚   โ”œโ”€โ”€ core/
โ”‚   โ”‚   โ”œโ”€โ”€ agent.py          # Agent with tool calling loop
โ”‚   โ”‚   โ”œโ”€โ”€ tool.py           # @tool decorator and Tool class
โ”‚   โ”‚   โ””โ”€โ”€ types.py          # Data models (Message, ToolCall, etc.)
โ”‚   โ”œโ”€โ”€ providers/
โ”‚   โ”‚   โ””โ”€โ”€ openai_provider.py # OpenAI API integration
โ”‚   โ”œโ”€โ”€ sandbox/
โ”‚   โ”‚   โ”œโ”€โ”€ scenario.py       # Scenario and Report definitions
โ”‚   โ”‚   โ””โ”€โ”€ runner.py         # Sandbox runner with LLM judge
โ”‚   โ”œโ”€โ”€ monitor/
โ”‚   โ”‚   โ”œโ”€โ”€ store.py          # In-memory event store
โ”‚   โ”‚   โ””โ”€โ”€ server.py         # FastAPI server + dashboard
โ”‚   โ”œโ”€โ”€ governance/
โ”‚   โ”‚   โ”œโ”€โ”€ budget.py         # Budget controls
โ”‚   โ”‚   โ”œโ”€โ”€ permissions.py    # Permission system
โ”‚   โ”‚   โ”œโ”€โ”€ audit.py          # Audit trail
โ”‚   โ”‚   โ””โ”€โ”€ guardrails.py     # Governance engine
โ”‚   โ””โ”€โ”€ governed_agent.py     # Unified GovernedAgent class
โ”œโ”€โ”€ examples/
โ”‚   โ”œโ”€โ”€ quickstart.py
โ”‚   โ”œโ”€โ”€ test_sandbox.py
โ”‚   โ”œโ”€โ”€ run_with_monitor.py
โ”‚   โ”œโ”€โ”€ run_with_governance.py
โ”‚   โ””โ”€โ”€ full_demo.py
โ”œโ”€โ”€ README.md
โ””โ”€โ”€ LICENSE

Roadmap

  • Core Agent SDK with tool calling
  • Simulation Sandbox with LLM-as-judge
  • Live monitoring dashboard
  • Governance Engine (budget, permissions, kill switch, audit)
  • Unified GovernedAgent class
  • Anthropic Claude provider
  • Ollama local model provider
  • Agent Marketplace
  • Visual no-code agent builder
  • Agent-to-Agent mesh protocol
  • Kubernetes deployment
  • SOC2/HIPAA compliance templates

Contributing

AgentOS is open source under the Apache 2.0 license. Contributions welcome!


Star โญ this repo if you believe AI agents should be tested before deployed!

Built with ๐Ÿ’ช by Suketh Reddy Produtoor

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

agentos_platform-0.3.0.tar.gz (121.1 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

agentos_platform-0.3.0-py3-none-any.whl (140.3 kB view details)

Uploaded Python 3

File details

Details for the file agentos_platform-0.3.0.tar.gz.

File metadata

  • Download URL: agentos_platform-0.3.0.tar.gz
  • Upload date:
  • Size: 121.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.3

File hashes

Hashes for agentos_platform-0.3.0.tar.gz
Algorithm Hash digest
SHA256 56b16f38657385da0310606742b1f7a93a40dd478cb86fa23139cffd3964c816
MD5 6f1e31ebce323087d00e0dcc2a99e914
BLAKE2b-256 d9f69bc7ce518a8a8c464d38f57097aa8519036b19b5e1cf31cffad853191eee

See more details on using hashes here.

File details

Details for the file agentos_platform-0.3.0-py3-none-any.whl.

File metadata

File hashes

Hashes for agentos_platform-0.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 c818fe8dbc1ed13f5cd328ee0e47a42f3cf8d0b6880ae91e3c7d22b0b03f2193
MD5 e862a3ff665ff2fbd26de80d83bacf66
BLAKE2b-256 ba07e45c0f6d5798a893fe5131b1f5d9b8b9cef83d8ada0b91bd6ad3290d5f73

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page