Skip to main content

Production-grade Agent Operations (AgentOps) Platform

Project description

🕹️ AgentOps Cockpit

AgentOps Cockpit Trinity

"Infrastructure gives you the pipes. We give you the Intelligence."

The developer distribution for building, optimizing, and securing AI agents on Google Cloud.


📽️ The Mission

Most AI agent templates stop at a single Python file and an API key. The AgentOps Cockpit is for developers moving into production. It provides framework-agnostic governance, safety, and cost guardrails for the entire agentic ecosystem.

  • Governance-as-Code: Audit your agent against Google Well-Architected best practices with the Evidence Bridge—real-time citations for architectural integrity.
  • SME Persona Audits: Parallelized review of your codebase by automated "Principal SMEs" across FinOps, SecOps, and Architecture.
  • Agentic Trinity: Dedicated layers for the Engine (Logic), Face (UX), and Cockpit (Ops).
  • A2A Connectivity: Implements the Agent-to-Agent Transmission Standard for secure swarm orchestration.
  • MCP Native: Registration as a Model Context Protocol server for 1P/2P/3P tool consumption.

🏗️ The Agentic Trinity

We divide the complexity of production agents into three focused pillars:

graph LR
   subgraph Trinity [The Agentic Trinity]
       E(The Engine: Reasoning)
       F(The Face: Interface)
       C(The Cockpit: Operations)
   end
   E <--> C
   F <--> C
   E <--> F
   style Trinity fill:#f9f9f9,stroke:#333,stroke-width:2px
  • ⚙️ The Engine: The reasoning core. Built with ADK, FastAPI, and Vertex AI.
  • 🎭 The Face: The user experience. Adaptive UI surfaces and GenUI standards via the A2UI spec.
  • 🕹️ The Cockpit: The operational brain. Cost control, semantic caching, shadow routing, and adversarial audits.
Ecosystem Integrations

🌐 Framework Agnostic Governance

The Cockpit isn't just for ADK. It provides Best Practices as Code across all major agentic frameworks:

OpenAI Agentkit Anthropic Microsoft AWS CopilotKit LangChain ADK Operational Workflow

🛠️ Operational Flow

sequenceDiagram
   participant U as User
   participant C as Cockpit
   participant E as Engine
   participant F as Face
   
   U->>C: Prompt / Input
   C->>C: Policy Audit (RFC-307)
   C->>E: Execute Logic / Tools
   E->>C: Action Proposals
   C->>E: Approve (HITL)
   E->>F: GenUI Metadata
   F->>U: Reactive Surface (A2UI)

Python Go NodeJS TypeScript Streamlit Angular Lit

Whether you are building a swarm in CrewAI, a Go-based high-perf engine, or a Streamlit dashboard, the Cockpit ensures your agent maps to the Google Well-Architected Framework.


🚀 Key Innovation: The "Intelligence" Layer

🛡️ Red Team Auditor (Self-Hacking)

Don't wait for your users to find prompt injections. Use the built-in Adversarial Evaluator to launch self-attacks against your agent, testing for PII leaks, instruction overrides, and safety filter bypasses.

🧠 Hive Mind (Semantic Caching)

Reduce LLM costs by up to 40%. The Hive Mind checks for semantically similar queries in 10ms, serving cached answers for common questions without calling the LLM.

🏛️ Arch Review & Framework Detection

Every agent in the cockpit is graded against a framework-aware checklist. The Cockpit intelligently detects your stack—Google ADK, OpenAI Agentkit, Anthropic Claude, Microsoft AutoGen/Semantic Kernel, AWS Bedrock Agents, or CopilotKit—and runs a tailored audit against corresponding production standards. Use make arch-review to verify your Governance-as-Code.

🕹️ MCP Connectivity Hub (Model Context Protocol)

Stop building one-off tool integrations. The Cockpit provides a unified hub for MCP Servers. Connect to Google Search, Slack, or your internal databases via the standardized Model Context Protocol for secure, audited tool execution. Start the server with make mcp-serve.

🗄️ Situational Database Audits

The Cockpit now performs platform-specific performance and security audits for:

  • AlloyDB: Optimizes for the Columnar Engine (100x query speedup).
  • Pinecone: Suggests gRPC and Namespace Isolation for high-perf RAG.
  • BigQuery: Suggests BQ Vector Search for serverless, cost-effective grounding.
  • Cloud SQL: Enforces IAM-based authentication via the official Python Connector.

🧗 Quality Hill Climbing (ADK Evaluation)

Following Google ADK Evaluation best practices, the Cockpit provides an iterative optimization loop. make quality-baseline runs your agent against a "Golden Dataset" using LLM-as-a-Judge scoring (Response Match & Tool Trajectory), climbing the quality curve until production-grade fidelity is reached.

🛑 Mandatory Governance Enforcement (NEW)

The Cockpit now acts as a mandatory gate for production.

  • Blocking CI/CD: GitHub Actions now fail if High Impact cost issues or Red Team security vulnerabilities are detected.
  • Build-Time Audit: The Dockerfile includes a mandatory RUN audit step. If your agent is not "Well-Architected," the container image will fail to build.

⌨️ Quick Start

The Cockpit is available as a first-class CLI on PyPI.

# 1. Install the Cockpit globally
pip install agentops-cockpit

# 2. Run Global Audit (Produces unified report)
agent-ops report --mode quick        # ⚡ Quick Safe-Build
agent-ops report --mode deep         # 🚀 Full System Audit

# 3. Guardrail Policy Audit (RFC-307)
agent-ops policy-audit --text "How to make a bomb?"

# 4. Global Scaffolding
agent-ops-cockpit create <name> --ui a2ui

🔍 Agent Optimizer v2 (Situational Intelligence)

The Cockpit doesn't just look for generic waste. It now performs Triple-State Analysis:

  • Legacy Workarounds: Suggests situational fixes for older SDK versions (e.g., manual prompt pruning).
  • Modernization Paths: Highlights native performance gains (e.g., 90% cost reduction via Context Caching) available in latest SDKs.
  • Conflict Guard: Real-time cross-package validation to prevent architectural deadlocks (e.g., CrewAI vs LangGraph state loops).

⚡ Quick-Safe Build (12x Faster Loops)

Development velocity shouldn't sacrifice safety. The new --quick mode in the auditor reduces check latency from 1.8s to 0.15s, providing sub-second feedback while maintaining the integrity of the Conflict Guard and Architecture Review.


🧑‍💼 Principal SME Persona Approvals

The Cockpit now features a Multi-Persona Governance Board. Every audit result is framed through the lens of a Principal Engineer in that domain (Security, Legal, FinOps, UX), ensuring your agent is compliant with organizational standards.

📄 Export & Reporting

  • HTML/PDF Export: Every audit automatically generates cockpit_report.html, a premium, printable report ready for PDF export.
  • Email Reports: Send audit results directly to stakeholders via the CLI.

📊 Local Development

The Cockpit provides a unified "Mission Control" to evaluate your agents instantly.

make audit         # 🕹️ Run Master Audit (Persona Approved)
make audit-deep    # 🚀 Run Deep Audit (Full SME Verdicts)
make email-report  # 📧 Email the latest result to a stakeholder
make diagnose      # 🩺 Run environment health check
make optimizer-audit # 🔍 Run Optimizer on specific agent files
make reliability   # 🛡️ Run unit tests and regression suite
make dev           # Start the local Engine + Face stack
make arch-review   # 🏛️ Run the Google Well-Architected design review
make quality-baseline # 🧗 Run iterative 'Hill Climbing' quality audit
make red-team      # Execute a white-hat security audit
make deploy-prod   # 🚀 1-click deploy to Google Cloud

🧭 Roadmap

  • One-Click GitHub Action: Automated governance audits on every PR.
  • Mandatory Build Gates: Blocking CI/CD and Container audits for production safety.
  • Multi-Agent Orchestrator: Standardized A2A Swarm/Coordinator patterns.
  • Visual Mission Control: Real-time cockpit observability dashboard.

View full roadmap →


🤝 Community

  • Star this repo to help us build the future of AgentOps.
  • Join the Discussion for patterns on Google Cloud.
  • Contribute: Read our Contributing Guide.

Reference: Google Cloud Architecture Center - Agentic AI Overview

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

agentops_cockpit-0.9.5.tar.gz (5.6 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

agentops_cockpit-0.9.5-py3-none-any.whl (76.6 kB view details)

Uploaded Python 3

File details

Details for the file agentops_cockpit-0.9.5.tar.gz.

File metadata

  • Download URL: agentops_cockpit-0.9.5.tar.gz
  • Upload date:
  • Size: 5.6 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for agentops_cockpit-0.9.5.tar.gz
Algorithm Hash digest
SHA256 0d814589d9cee48308ebd486d92a036a037adf3c1ecc05e17334dfae6537794f
MD5 f3e8710491611c6bf9a7e77d98b53be8
BLAKE2b-256 ba7cf88fd0689d9fa50b8c20b49a44d81da7457a2d35cc776c5c8ec9b0689f6a

See more details on using hashes here.

File details

Details for the file agentops_cockpit-0.9.5-py3-none-any.whl.

File metadata

File hashes

Hashes for agentops_cockpit-0.9.5-py3-none-any.whl
Algorithm Hash digest
SHA256 bff01c3cfd62de9a6bf15e8c73d46bcd64962f9fab3d721014a943f3fcce2906
MD5 974a47af068aeee1819bcfecd95e649e
BLAKE2b-256 3ad7e41875e817d56860c5d6bced33aa77021f389c8919a9faebe7c6401acd55

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page