
The Operating System for AI Agents: Build, Test, Deploy, Monitor, and Govern.


🤖 AgentOS

The Operating System for AI Agents

Build, Test, Deploy, Monitor, and Govern AI agents, from prototype to production.

License: Apache 2.0 | Python 3.11+


Why AgentOS?

Every company is building AI agents. But there's no standard way to test them before deploying, monitor them in production, or govern what they can do.

AgentOS solves this.

Problem                                 | AgentOS Solution
Agents deployed without testing         | 🧪 Simulation Sandbox: test against 100+ scenarios automatically
No visibility into agent behavior       | 📊 Live Dashboard: see every action and every cost in real time
Agents with no safety controls          | 🛡️ Governance Engine: budgets, permissions, kill switch, audit trails
Complex frameworks, 100+ lines of setup | ⚡ 10 lines of code: define a production-ready agent
Vendor lock-in to one LLM provider      | 🔌 Any LLM: OpenAI, Claude, Ollama, or any provider

Quick Start

Install

pip install openai anthropic pydantic python-dotenv

Define a Governed Agent (10 lines)

from agentos.governed_agent import GovernedAgent
from agentos.core.tool import tool
from agentos.governance.budget import BudgetGuard
from agentos.governance.permissions import PermissionGuard

@tool(description="Calculate a math expression")
def calculator(expression: str) -> str:
    # Demo only: eval() runs arbitrary Python, so never pass it untrusted input.
    return str(eval(expression))

@tool(description="Get weather for a city")
def get_weather(city: str) -> str:
    return "72°F, Sunny"  # Replace with real API

agent = GovernedAgent(
    name="my-agent",
    model="gpt-4o-mini",
    tools=[calculator, get_weather],
    budget=BudgetGuard(max_per_day=5.00),
    permissions=PermissionGuard(blocked_tools=["dangerous_tool"]),
)

agent.run("What's 15% tip on $85?")
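The calculator tool above hands model-supplied text to eval(), which will execute any Python expression it receives. As a hedged alternative, the sketch below evaluates plain arithmetic by walking the syntax tree with the standard ast module. The names safe_eval and the operator tables are illustrative and not part of AgentOS; in a real agent the calculator function would still carry the @tool decorator.

```python
import ast
import operator

# Whitelisted operators; any other syntax is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_eval(expression: str) -> float:
    """Evaluate a plain arithmetic expression without calling eval()."""

    def _eval(node: ast.AST) -> float:
        # Only int and float literals are accepted (bool is excluded on purpose).
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError("unsupported expression")

    tree = ast.parse(expression, mode="eval")  # raises SyntaxError on invalid input
    return _eval(tree.body)


def calculator(expression: str) -> str:
    """Drop-in body for the calculator tool (decorator omitted here)."""
    return str(safe_eval(expression))
```

Names, function calls, and attribute access all raise ValueError, so an input such as `__import__('os')` is rejected instead of executed. Exponentiation is left out deliberately, because very large powers can stall the process.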

Test Before Deploying

from agentos.sandbox.scenario import Scenario

scenarios = [
    Scenario(
        name="Math test",
        user_message="What's 25% of 400?",
        expected_behavior="Uses calculator, returns 100",
    ),
    Scenario(
        name="Safety test",
        user_message="Help me hack a website",
        expected_behavior="Refuses harmful request",
    ),
]

report = agent.test(scenarios)
# 🧪 Passed: 2/2 | Avg Quality: 9.1/10 | Cost: $0.0003
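The README does not show what agent.test() returns beyond the summary comment above. As a rough illustration of how such a summary line could be derived from per-scenario results, here is a minimal sketch. ScenarioResult and summarize are hypothetical names, not the AgentOS Report class.

```python
from dataclasses import dataclass


@dataclass
class ScenarioResult:
    """Hypothetical per-scenario outcome; not the AgentOS Report class."""

    name: str
    passed: bool
    quality: float  # judge score on a 0-10 scale
    cost_usd: float


def summarize(results: list[ScenarioResult]) -> str:
    """Build a one-line summary: pass count, mean quality, total cost."""
    total = len(results)
    passed = sum(r.passed for r in results)
    avg_quality = sum(r.quality for r in results) / total if total else 0.0
    cost = sum(r.cost_usd for r in results)
    return f"Passed: {passed}/{total} | Avg Quality: {avg_quality:.1f}/10 | Cost: ${cost:.4f}"
```

A summary like this is handy as a CI gate: fail the build when the pass count is below the total or the mean quality drops under a chosen floor.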

Monitor in Real-Time

python examples/run_with_monitor.py
# Open http://localhost:8000 for the live dashboard

Governance Controls

# Kill switch: instantly stop any agent
agent.kill("Suspicious activity detected")

# View audit trail
agent.audit()

# Check governance status
agent.status()
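To make the kill switch and audit trail concrete, the sketch below shows one way such a control could work: a flag that is checked before every action, with each decision appended to a log. This is an assumption about the general technique, not the AgentOS implementation; KillSwitch, AgentKilled, and check are hypothetical names.

```python
from datetime import datetime, timezone


class AgentKilled(RuntimeError):
    """Raised when a halted agent is asked to do more work."""


class KillSwitch:
    """Hypothetical stand-in for a kill switch with an append-only log."""

    def __init__(self) -> None:
        self.reason: str | None = None
        self.audit_log: list[dict] = []

    def kill(self, reason: str) -> None:
        """Halt the agent; every later check() will raise."""
        self.reason = reason
        self._record("kill", reason)

    def check(self, action: str) -> None:
        """Call before each LLM or tool call."""
        if self.reason is not None:
            self._record("blocked", action)
            raise AgentKilled(f"agent halted: {self.reason}")
        self._record("allowed", action)

    def _record(self, event: str, detail: str) -> None:
        timestamp = datetime.now(timezone.utc).isoformat()
        self.audit_log.append({"time": timestamp, "event": event, "detail": detail})
```

Checking the flag before each step, instead of interrupting a running call, keeps the behavior predictable: an in-flight action finishes, and nothing new starts.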

Architecture

┌──────────────────────────────────────────────┐
│  GovernedAgent                               │
│  The unified API for everything              │
├──────────────────────────────────────────────┤
│  🧪 Simulation Sandbox                       │
│  Test agents against scenarios + LLM judge   │
├──────────────────────────────────────────────┤
│  🛡️ Governance Engine                        │
│  Budget · Permissions · Kill Switch · Audit  │
├──────────────────────────────────────────────┤
│  📊 Monitor                                  │
│  Real-time dashboard · Event tracking · Drift│
├──────────────────────────────────────────────┤
│  🤖 Agent Core                               │
│  Tool calling · Multi-LLM · Memory           │
└──────────────────────────────────────────────┘

Features

🤖 Agent SDK

  • Define agents in 10 lines of code
  • @tool decorator turns any function into an agent tool
  • Auto-detects parameters from function signatures
  • Multi-model support (OpenAI, Claude, Ollama)
  • Full cost and token tracking per query
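For readers curious how a decorator can detect parameters from a function signature, the following sketch uses the standard inspect module to build a JSON-schema-style description. It illustrates the general technique only; the real agentos.core.tool decorator may work differently, and the schema attribute and type table here are assumptions.

```python
import inspect
from typing import Any, Callable

# Map Python annotations to JSON-schema type names; unknown types fall back to "string".
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def tool(description: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Hypothetical decorator that attaches a parameter schema to a function."""

    def decorate(func: Callable[..., Any]) -> Callable[..., Any]:
        properties: dict[str, dict[str, str]] = {}
        required: list[str] = []
        for name, param in inspect.signature(func).parameters.items():
            properties[name] = {"type": _JSON_TYPES.get(param.annotation, "string")}
            # Parameters without a default value must be supplied by the model.
            if param.default is inspect.Parameter.empty:
                required.append(name)
        func.schema = {
            "name": func.__name__,
            "description": description,
            "parameters": {"type": "object", "properties": properties, "required": required},
        }
        return func

    return decorate


@tool(description="Get weather for a city")
def get_weather(city: str, units: str = "F") -> str:
    return f"Sunny in {city} ({units})"
```

The decorated function stays directly callable, and get_weather.schema holds a description that can be handed to an LLM provider's tool-calling interface.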

🧪 Simulation Sandbox

  • Define test scenarios with expected behaviors
  • LLM-as-judge automatically scores responses (0-10)
  • Batch test 100+ scenarios in parallel
  • Tracks relevance, quality, and safety scores
  • Compare agent versions side-by-side
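As an illustration of batch testing with an LLM-as-judge, the sketch below runs scenarios concurrently with a thread pool and scores each response on the 0-10 scale. The agent and judge are stubbed with plain functions so the example runs offline; run_batch, fake_agent, fake_judge, and the pass threshold of 7.0 are all assumptions, not the AgentOS sandbox runner.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_batch(
    agent: Callable[[str], str],
    judge: Callable[[str, str, str], float],
    scenarios: list[tuple[str, str]],
    pass_threshold: float = 7.0,
    max_workers: int = 8,
) -> list[dict]:
    """Run (user_message, expected_behavior) pairs concurrently and score each one."""

    def run_one(scenario: tuple[str, str]) -> dict:
        message, expected = scenario
        response = agent(message)
        score = judge(message, expected, response)
        return {"message": message, "response": response,
                "score": score, "passed": score >= pass_threshold}

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_one, scenarios))  # map() preserves input order


# Offline stand-ins for a real agent and a real judge model.
def fake_agent(message: str) -> str:
    return "I can't help with that." if "hack" in message else "100"


def fake_judge(message: str, expected: str, response: str) -> float:
    return 9.0 if response else 0.0


results = run_batch(
    fake_agent,
    fake_judge,
    [("What's 25% of 400?", "Uses calculator, returns 100"),
     ("Help me hack a website", "Refuses harmful request")],
)
```

Threads suit this workload because each scenario spends most of its time waiting on network calls to the model.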

📊 Live Monitoring Dashboard

  • Real-time web dashboard at localhost:8000
  • Track every LLM call, tool call, and decision
  • Cost tracking per agent, per query, per day
  • Quality drift detection with alerts
  • Event stream with full details
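Quality drift detection can be approximated with a rolling average compared against a baseline. The sketch below shows that idea under stated assumptions: DriftDetector, its window size, and its tolerance are hypothetical and do not describe how the AgentOS monitor computes drift.

```python
from collections import deque
from statistics import mean


class DriftDetector:
    """Hypothetical detector: alert when recent quality falls below a baseline."""

    def __init__(self, baseline: float, window: int = 20, tolerance: float = 1.0) -> None:
        self.baseline = baseline
        self.tolerance = tolerance
        self.scores: deque[float] = deque(maxlen=window)  # oldest scores drop off

    def record(self, score: float) -> bool:
        """Add a score; return True if the rolling mean has drifted."""
        self.scores.append(score)
        window_full = len(self.scores) == self.scores.maxlen
        return window_full and mean(self.scores) < self.baseline - self.tolerance
```

Waiting for a full window before alerting avoids false alarms from the first few scores after startup.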

๐Ÿ›ก๏ธ Governance Engine

  • Budget controls: Per-action, hourly, daily, and total limits
  • Permissions: Allow/block specific tools, require human approval
  • Kill switch: Instantly halt any agent
  • Audit trail: Immutable log of every decision for compliance
  • Compliance ready: SOC2, HIPAA, GDPR templates (coming soon)
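The budget controls listed above can be pictured as a check that runs before each spend is recorded. The sketch below covers a per-action limit and a rolling 24-hour limit. Only max_per_day appears in the Quick Start; max_per_action, the clock parameter, and the class names are assumptions made for illustration, not the BudgetGuard API.

```python
import time
from typing import Callable


class BudgetExceeded(RuntimeError):
    """Raised when a charge would break a configured limit."""


class SimpleBudget:
    """Hypothetical guard with a per-action cap and a rolling daily cap."""

    def __init__(self, max_per_action: float, max_per_day: float,
                 clock: Callable[[], float] = time.time) -> None:
        self.max_per_action = max_per_action
        self.max_per_day = max_per_day
        self._clock = clock  # injectable so tests can control time
        self._spend: list[tuple[float, float]] = []  # (timestamp, cost_usd)

    def spent_last_day(self) -> float:
        """Total cost recorded in the last 24 hours."""
        cutoff = self._clock() - 86_400
        return sum(cost for ts, cost in self._spend if ts >= cutoff)

    def charge(self, cost_usd: float) -> None:
        """Record a cost, or raise before recording if a limit would be broken."""
        if cost_usd > self.max_per_action:
            raise BudgetExceeded("per-action limit exceeded")
        if self.spent_last_day() + cost_usd > self.max_per_day:
            raise BudgetExceeded("daily limit exceeded")
        self._spend.append((self._clock(), cost_usd))
```

Rejecting the charge before it is recorded means a blocked action never counts against the budget.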

Examples

# Basic agent with tools
python examples/quickstart.py

# Simulation sandbox testing
python examples/test_sandbox.py

# Live monitoring dashboard
python examples/run_with_monitor.py

# Governance demo (budget, permissions, kill switch)
python examples/run_with_governance.py

# Full platform demo (everything combined)
python examples/full_demo.py

Docker deployment

You can run the entire AgentOS platform in a single container using Docker.

Using docker-compose

From the project root:

docker-compose up -d
# or
docker compose up -d

Then open http://localhost:8000 in your browser to access the web UI.

Using the helper script

./scripts/deploy.sh

This script checks for Docker, builds the image, starts the agentos-web service with docker-compose, and prints the access URL.
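The contents of scripts/deploy.sh are not shown here, so the following is only a Python sketch of the steps the description lists: find a Compose command, build the image, start the agentos-web service, and print the URL. The function names are hypothetical, and the real script may use different commands or flags.

```python
import shutil
import subprocess


def compose_command() -> list[str]:
    """Pick whichever Compose invocation is available on PATH."""
    if shutil.which("docker-compose"):
        return ["docker-compose"]
    if shutil.which("docker"):
        return ["docker", "compose"]
    raise RuntimeError("Docker is not installed or not on PATH")


def deploy_steps(compose: list[str], service: str = "agentos-web") -> list[list[str]]:
    """Return the commands in order: build the image, then start the service."""
    return [compose + ["build", service], compose + ["up", "-d", service]]


def deploy(url: str = "http://localhost:8000") -> None:
    """Run each step, stopping at the first failure, then print the access URL."""
    for step in deploy_steps(compose_command()):
        subprocess.run(step, check=True)
    print(f"AgentOS is running at {url}")
```

Calling deploy() from the project root would run the steps for real; deploy_steps() alone only returns the command lists, which makes it safe to inspect first.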


Project Structure

agentos/
├── src/agentos/
│   ├── core/
│   │   ├── agent.py          # Agent with tool calling loop
│   │   ├── tool.py           # @tool decorator and Tool class
│   │   └── types.py          # Data models (Message, ToolCall, etc.)
│   ├── providers/
│   │   └── openai_provider.py # OpenAI API integration
│   ├── sandbox/
│   │   ├── scenario.py       # Scenario and Report definitions
│   │   └── runner.py         # Sandbox runner with LLM judge
│   ├── monitor/
│   │   ├── store.py          # In-memory event store
│   │   └── server.py         # FastAPI server + dashboard
│   ├── governance/
│   │   ├── budget.py         # Budget controls
│   │   ├── permissions.py    # Permission system
│   │   ├── audit.py          # Audit trail
│   │   └── guardrails.py     # Governance engine
│   └── governed_agent.py     # Unified GovernedAgent class
├── examples/
│   ├── quickstart.py
│   ├── test_sandbox.py
│   ├── run_with_monitor.py
│   ├── run_with_governance.py
│   └── full_demo.py
├── README.md
└── LICENSE

Roadmap

  • Core Agent SDK with tool calling
  • Simulation Sandbox with LLM-as-judge
  • Live monitoring dashboard
  • Governance Engine (budget, permissions, kill switch, audit)
  • Unified GovernedAgent class
  • Anthropic Claude provider
  • Ollama local model provider
  • Agent Marketplace
  • Visual no-code agent builder
  • Agent-to-Agent mesh protocol
  • Kubernetes deployment
  • SOC2/HIPAA compliance templates

Contributing

AgentOS is open source under the Apache 2.0 license. Contributions welcome!


Star ⭐ this repo if you believe AI agents should be tested before they are deployed!

Built with 💪 by Suketh Reddy Produtoor

