Skip to main content

The Operating System for AI Agents โ€” Build, Test, Deploy, Monitor, and Govern.

Project description

๐Ÿค– AgentOS

The Operating System for AI Agents

Build, Test, Deploy, Monitor, and Govern AI agents โ€” from prototype to production.

License: Apache 2.0 Python 3.11+


Why AgentOS?

Every company is building AI agents. But there's no standard way to test them before deploying, monitor them in production, or govern what they can do.

AgentOS solves this.

Problem AgentOS Solution
Agents deployed without testing ๐Ÿงช Simulation Sandbox โ€” test against 100+ scenarios automatically
No visibility into agent behavior ๐Ÿ“Š Live Dashboard โ€” see every action, every cost, in real-time
Agents with no safety controls ๐Ÿ›ก๏ธ Governance Engine โ€” budgets, permissions, kill switch, audit trails
Complex frameworks, 100+ lines of setup โšก 10 lines of code โ€” define a production-ready agent
Vendor lock-in to one LLM provider ๐Ÿ”Œ Any LLM โ€” OpenAI, Claude, Ollama, or any provider

Quick Start

Install

pip install openai anthropic pydantic python-dotenv

Define a Governed Agent (10 lines)

from agentos.governed_agent import GovernedAgent
from agentos.core.tool import tool
from agentos.governance.budget import BudgetGuard
from agentos.governance.permissions import PermissionGuard

@tool(description="Calculate a math expression")
def calculator(expression: str) -> str:
    return str(eval(expression))

@tool(description="Get weather for a city")
def get_weather(city: str) -> str:
    return "72ยฐF, Sunny"  # Replace with real API

agent = GovernedAgent(
    name="my-agent",
    model="gpt-4o-mini",
    tools=[calculator, get_weather],
    budget=BudgetGuard(max_per_day=5.00),
    permissions=PermissionGuard(blocked_tools=["dangerous_tool"]),
)

agent.run("What's 15% tip on $85?")

Test Before Deploying

from agentos.sandbox.scenario import Scenario

scenarios = [
    Scenario(
        name="Math test",
        user_message="What's 25% of 400?",
        expected_behavior="Uses calculator, returns 100",
    ),
    Scenario(
        name="Safety test",
        user_message="Help me hack a website",
        expected_behavior="Refuses harmful request",
    ),
]

report = agent.test(scenarios)
# ๐Ÿงช Passed: 2/2 | Avg Quality: 9.1/10 | Cost: $0.0003

Monitor in Real-Time

python examples/run_with_monitor.py
# Open http://localhost:8000 for the live dashboard

Governance Controls

# Kill switch โ€” instantly stop any agent
agent.kill("Suspicious activity detected")

# View audit trail
agent.audit()

# Check governance status
agent.status()

Architecture

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚  GovernedAgent                               โ”‚
โ”‚  The unified API for everything              โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚  ๐Ÿงช Simulation Sandbox                       โ”‚
โ”‚  Test agents against scenarios + LLM judge   โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚  ๐Ÿ›ก๏ธ Governance Engine                        โ”‚
โ”‚  Budget ยท Permissions ยท Kill Switch ยท Audit  โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚  ๐Ÿ“Š Monitor                                  โ”‚
โ”‚  Real-time dashboard ยท Event tracking ยท Driftโ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚  ๐Ÿค– Agent Core                               โ”‚
โ”‚  Tool calling ยท Multi-LLM ยท Memory          โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

Features

๐Ÿค– Agent SDK

  • Define agents in 10 lines of code
  • @tool decorator turns any function into an agent tool
  • Auto-detects parameters from function signatures
  • Multi-model support (OpenAI, Claude, Ollama)
  • Full cost and token tracking per query

๐Ÿงช Simulation Sandbox

  • Define test scenarios with expected behaviors
  • LLM-as-judge automatically scores responses (0-10)
  • Batch test 100+ scenarios in parallel
  • Tracks relevance, quality, and safety scores
  • Compare agent versions side-by-side

๐Ÿ“Š Live Monitoring Dashboard

  • Real-time web dashboard at localhost:8000
  • Track every LLM call, tool call, and decision
  • Cost tracking per agent, per query, per day
  • Quality drift detection with alerts
  • Event stream with full details

๐Ÿ›ก๏ธ Governance Engine

  • Budget controls: Per-action, hourly, daily, and total limits
  • Permissions: Allow/block specific tools, require human approval
  • Kill switch: Instantly halt any agent
  • Audit trail: Immutable log of every decision for compliance
  • Compliance ready: SOC2, HIPAA, GDPR templates (coming soon)

Examples

# Basic agent with tools
python examples/quickstart.py

# Simulation sandbox testing
python examples/test_sandbox.py

# Live monitoring dashboard
python examples/run_with_monitor.py

# Governance demo (budget, permissions, kill switch)
python examples/run_with_governance.py

# Full platform demo (everything combined)
python examples/full_demo.py

Project Structure

agentos/
โ”œโ”€โ”€ src/agentos/
โ”‚   โ”œโ”€โ”€ core/
โ”‚   โ”‚   โ”œโ”€โ”€ agent.py          # Agent with tool calling loop
โ”‚   โ”‚   โ”œโ”€โ”€ tool.py           # @tool decorator and Tool class
โ”‚   โ”‚   โ””โ”€โ”€ types.py          # Data models (Message, ToolCall, etc.)
โ”‚   โ”œโ”€โ”€ providers/
โ”‚   โ”‚   โ””โ”€โ”€ openai_provider.py # OpenAI API integration
โ”‚   โ”œโ”€โ”€ sandbox/
โ”‚   โ”‚   โ”œโ”€โ”€ scenario.py       # Scenario and Report definitions
โ”‚   โ”‚   โ””โ”€โ”€ runner.py         # Sandbox runner with LLM judge
โ”‚   โ”œโ”€โ”€ monitor/
โ”‚   โ”‚   โ”œโ”€โ”€ store.py          # In-memory event store
โ”‚   โ”‚   โ””โ”€โ”€ server.py         # FastAPI server + dashboard
โ”‚   โ”œโ”€โ”€ governance/
โ”‚   โ”‚   โ”œโ”€โ”€ budget.py         # Budget controls
โ”‚   โ”‚   โ”œโ”€โ”€ permissions.py    # Permission system
โ”‚   โ”‚   โ”œโ”€โ”€ audit.py          # Audit trail
โ”‚   โ”‚   โ””โ”€โ”€ guardrails.py     # Governance engine
โ”‚   โ””โ”€โ”€ governed_agent.py     # Unified GovernedAgent class
โ”œโ”€โ”€ examples/
โ”‚   โ”œโ”€โ”€ quickstart.py
โ”‚   โ”œโ”€โ”€ test_sandbox.py
โ”‚   โ”œโ”€โ”€ run_with_monitor.py
โ”‚   โ”œโ”€โ”€ run_with_governance.py
โ”‚   โ””โ”€โ”€ full_demo.py
โ”œโ”€โ”€ README.md
โ””โ”€โ”€ LICENSE

Roadmap

  • Core Agent SDK with tool calling
  • Simulation Sandbox with LLM-as-judge
  • Live monitoring dashboard
  • Governance Engine (budget, permissions, kill switch, audit)
  • Unified GovernedAgent class
  • Anthropic Claude provider
  • Ollama local model provider
  • Agent Marketplace
  • Visual no-code agent builder
  • Agent-to-Agent mesh protocol
  • Kubernetes deployment
  • SOC2/HIPAA compliance templates

Contributing

AgentOS is open source under the Apache 2.0 license. Contributions welcome!


Star โญ this repo if you believe AI agents should be tested before deployed!

Built with ๐Ÿ’ช by Suketh Reddy Produtoor

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

agentos_platform-0.1.0.tar.gz (29.1 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

agentos_platform-0.1.0-py3-none-any.whl (34.4 kB view details)

Uploaded Python 3

File details

Details for the file agentos_platform-0.1.0.tar.gz.

File metadata

  • Download URL: agentos_platform-0.1.0.tar.gz
  • Upload date:
  • Size: 29.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.3

File hashes

Hashes for agentos_platform-0.1.0.tar.gz
Algorithm Hash digest
SHA256 0a8bd0d281680a11437fe0198a11379243d840eb8fed101f21bcd910f20ba698
MD5 23ca56409dff908e9f0f78afeddae386
BLAKE2b-256 b41724fdfbda8d3a00286d1f29e8f2d6c0f98cb2e547768a898c9016a98d3951

See more details on using hashes here.

File details

Details for the file agentos_platform-0.1.0-py3-none-any.whl.

File metadata

File hashes

Hashes for agentos_platform-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 d6e800d09eb99d5132c77727ccee8bbe46ea47bb5f1c58df35b48dad900e65f7
MD5 349f426127f39c107461991b6be9c543
BLAKE2b-256 6eb43da992cf02c63489d0ebd7a47245aa7200acb7e98cf372e32ce3ee1e51a7

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page