Production-grade AI automation platform powered by Claude Opus 4.6

StateSet Computer Use Agent

Advanced AI Agent System for Autonomous Computer Operation

Python 3.10+ | Claude Opus 4.6 | License: Proprietary


🎯 Overview

StateSet Computer Use Agent is a production-grade AI automation platform powered by Claude Opus 4.6, designed for autonomous computer operation with human-level reliability and intelligence. The system uses multiple specialized AI agents that can see, understand, and interact with desktop environments to complete complex, long-running tasks.

Why StateSet Computer Use Agent?

  • 🧠 State-of-the-Art AI: Powered by Claude Opus 4.6 with extended thinking capabilities
  • ⚡ 30-50% Faster: Automatic parallel tool execution for independent operations
  • 💰 95% Cost Savings: Research-based context engineering reduces token usage dramatically
  • 🔄 Indefinite Conversations: Maintains EXCELLENT attention quality across unlimited context
  • 🏢 Production-Ready: Includes monitoring, billing, security, and graceful failure handling
  • 🎨 Multi-Agent: 7 specialized agents for different business workflows

🆕 What's New: Claude Opus 4.6 and the 4.5 Models

Updated model lineup

  • Claude Opus 4.6 – most intelligent Anthropic model ever shipped, now priced for day-to-day production agents.
  • Claude Sonnet 4.5 – best balance of capability and cost for complex coding or orchestration.
  • Claude Haiku 4.5 – fastest Haiku yet with near-frontier reasoning and the first Haiku model that supports extended thinking.

Opus 4.6 enhancements

  • Maximum intelligence across reasoning, debugging, and strategic planning tasks.
  • Thinking block preservation keeps the model’s reasoning context intact across turns for better long-running workflows (no extra flags required).
  • Computer-use excellence with the new zoom action in computer_20251124, enabling pixel-level inspection of dense UI or fine print before taking action.
  • Practical performance thanks to lower pricing plus automatic prompt caching, so advanced agents stay affordable.

Effort parameter (beta)

  • Claude Opus 4.6 is the only model that accepts an effort setting (low, medium, high).
  • We automatically attach the required effort-2025-11-24 beta header and forward the setting via output_config.
  • Configure it per run with:
    python main.py --effort medium "triage support tickets and summarize blockers"
    
    Use low for high-volume automations, medium for balanced cost/performance, and high (default) for maximal quality.
  • Works alongside the thinking token budget—effort controls overall token appetite while thinking_budget still caps meta-reasoning tokens.

🌟 Core Innovation: Context Engineering

Based on Anthropic's Latest Research - Full implementation of all 5 patterns from "Effective Context Engineering for AI Agents":

| Pattern | Implementation | Result |
| --- | --- | --- |
| Just-in-Time Retrieval | grep/head/tail instead of full file reads | 60-99% token savings |
| Dynamic Compaction | Adaptive clearing based on attention budget | Maintains quality as context grows |
| Structured Note-Taking | Persistent memory outside context window | Unlimited task complexity |
| Sub-Agent Compression | Exploration agents return concise summaries | 50k → 2k token summaries |
| Attention Budget Monitoring | Real-time quality tracking (EXCELLENT→CRITICAL) | Prevents quality degradation |

Impact: Enables indefinite conversations with EXCELLENT attention quality while achieving 95% cost reduction compared to naive implementations.
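To make the just-in-time retrieval pattern concrete, here is a minimal sketch of `head`- and `grep`-style helpers that return only the lines an agent asks for instead of loading a whole file into context. The helper names and signatures are illustrative assumptions and do not describe how the project implements this pattern.

```python
import re
from itertools import islice
from pathlib import Path


def head(path: str, n: int = 10) -> list[str]:
    """Return the first n lines of a file without reading the rest."""
    with Path(path).open(encoding="utf-8", errors="replace") as fh:
        return [line.rstrip("\n") for line in islice(fh, n)]


def grep(path: str, pattern: str, max_hits: int = 20) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs matching pattern, stopping at max_hits."""
    regex = re.compile(pattern)
    hits: list[tuple[int, str]] = []
    with Path(path).open(encoding="utf-8", errors="replace") as fh:
        for lineno, line in enumerate(fh, start=1):
            if regex.search(line):
                hits.append((lineno, line.rstrip("\n")))
                if len(hits) >= max_hits:
                    break  # stop early so large files stay cheap
    return hits
```

Both helpers stream the file, so the tokens sent to the model scale with the number of lines returned rather than with file size.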

Read More: Context Engineering Details →

🚀 Quick Start

Prerequisites

  • Ubuntu Linux 20.04+ (kernel 5.15.0+)
  • Python 3.10 or higher
  • Anthropic API key (available from the Anthropic Console)
  • X11 virtual display (we'll set this up)
  • (Optional) Node.js + npm if you want to use the StateSet CLI (@stateset-cli) from within agent tasks.

Installation

Option 1: pip install (recommended)

pip install stateset-cua

# Configure API key
export ANTHROPIC_API_KEY='your-key-here'

# Start virtual display (required for GUI automation)
Xvfb :1 -screen 0 1920x1080x24 &
export DISPLAY=:1

# Run your first agent
stateset-cua run "auto-close resolved tickets"

Option 2: From source

git clone https://github.com/stateset/stateset-computer-use-agent.git
cd stateset-computer-use-agent
pip install -e ".[dev]"

export ANTHROPIC_API_KEY='your-key-here'
Xvfb :1 -screen 0 1920x1080x24 &
export DISPLAY=:1

stateset-cua run "auto-close resolved tickets"

Need more help? See the comprehensive Getting Started Guide →

🤖 Available Agents

StateSet includes 7 specialized agents optimized for different business workflows:

| Agent | Purpose | Example Use Case |
| --- | --- | --- |
| AUTO_CLOSE | Support ticket automation | "auto-close all resolved tickets from last 24 hours" |
| SOCIAL_MEDIA | Content moderation & engagement | "social media hide inappropriate comments on Facebook" |
| LINKEDIN_MESSENGER | Professional outreach | "linkedin send connection requests to AI engineers in SF" |
| SLACK_SUPPORT | Customer support automation | "slack respond to all unanswered questions in #support" |
| SHOPIFY | E-commerce management | "shopify update inventory for out-of-stock products" |
| ONBOARDING | User onboarding workflows | "onboard new enterprise customer with custom rules" |
| STATESET_AGENTIC | General-purpose automation | "organize desktop files and create summary report" |

Multi-Agent Orchestration

Run multiple agents in parallel for complex workflows:

python main.py "auto-close tickets and social media monitoring and slack support"

How it works:

  • Automatic keyword detection selects appropriate agents
  • Parallel execution (not sequential) for independent tasks
  • Unified logging with [AGENT_TYPE] prefixes
  • Aggregated metrics and billing
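The keyword detection and parallel dispatch described above could look roughly like the following sketch. The keyword table, the fallback to the general-purpose agent, and the names `select_agents` and `run_agent_stub` are assumptions for illustration. The README only states that keywords select agents and that matches run in parallel.

```python
import asyncio

# Hypothetical keyword table; the real mapping is not shown in this README.
AGENT_KEYWORDS: dict[str, tuple[str, ...]] = {
    "AUTO_CLOSE": ("auto-close", "auto close"),
    "SOCIAL_MEDIA": ("social media",),
    "LINKEDIN_MESSENGER": ("linkedin",),
    "SLACK_SUPPORT": ("slack",),
    "SHOPIFY": ("shopify",),
    "ONBOARDING": ("onboard",),
}


def select_agents(instruction: str) -> list[str]:
    """Return every agent type whose keywords appear in the instruction."""
    text = instruction.lower()
    matched = [agent for agent, words in AGENT_KEYWORDS.items()
               if any(word in text for word in words)]
    # Assumed behavior: fall back to the general-purpose agent when nothing matches.
    return matched or ["STATESET_AGENTIC"]


async def run_agent_stub(agent_type: str, instruction: str) -> str:
    """Stand-in for a real agent run; yields control once and reports back."""
    await asyncio.sleep(0)
    return f"[{agent_type}] done"


async def run_all(instruction: str) -> list[str]:
    """Run all selected agents concurrently and collect their results in order."""
    agents = select_agents(instruction)
    return list(await asyncio.gather(*(run_agent_stub(a, instruction) for a in agents)))
```

Calling `asyncio.run(run_all("auto-close tickets and slack support"))` would dispatch two stub agents concurrently and return one prefixed line per agent.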

🛠️ Key Features

1. Computer Vision & Control

Agents can see and interact with any desktop application:

  • Screenshot Analysis: High-resolution screen capture with caching
  • Mouse Control: Click, drag, scroll with pixel-perfect precision
  • Keyboard Input: Type text, keyboard shortcuts, special keys
  • Adaptive Delays: Smart waiting based on action type (0.0s-0.6s)

2. Intelligent Parallel Execution

Automatic dependency analysis for safe parallelization:

# Before optimization (3 sequential API calls)
tool_use_1 = web_search("Claude AI")        # 2.5s
tool_use_2 = web_search("Anthropic")        # 2.5s
tool_use_3 = web_search("computer use")     # 2.5s
# Total: 7.5 seconds

# After optimization (1 parallel API call)
parallel_execution([
    web_search("Claude AI"),
    web_search("Anthropic"),
    web_search("computer use")
])
# Total: 2.5 seconds (3x faster!)

Performance: 30-50% speed improvement on real-world tasks
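A runnable version of the idea above, using `asyncio.gather` with a simulated search call, might look like this. `fake_web_search` and its 50 ms latency are placeholders invented for the example, not part of the project's tool suite.

```python
import asyncio
import time


async def fake_web_search(query: str, latency: float = 0.05) -> str:
    """Simulated I/O-bound tool call; sleeps instead of hitting the network."""
    await asyncio.sleep(latency)
    return f"results for {query!r}"


async def run_sequential(queries: list[str]) -> list[str]:
    """Await each call in turn: total time is about the sum of the latencies."""
    return [await fake_web_search(q) for q in queries]


async def run_parallel(queries: list[str]) -> list[str]:
    """Start all calls at once: total time is about the slowest single call."""
    return list(await asyncio.gather(*(fake_web_search(q) for q in queries)))


if __name__ == "__main__":
    queries = ["Claude AI", "Anthropic", "computer use"]
    for runner in (run_sequential, run_parallel):
        start = time.perf_counter()
        asyncio.run(runner(queries))
        print(f"{runner.__name__}: {time.perf_counter() - start:.2f}s")
```

`asyncio.gather` returns results in the order the calls were passed in, so downstream code sees the same list either way. This only helps when the calls are independent, which is why dependency analysis has to come first.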

Read More: Parallel Execution →

3. Advanced Tool Suite

Agents have access to powerful tools across multiple categories:

| Category | Tools | Description |
| --- | --- | --- |
| Computer | click, type, scroll, zoom, screenshot | Desktop interaction (computer_20251124) |
| Web | web_search, web_fetch | Internet access with citations |
| Code | code_execution | Sandboxed Python/Bash execution |
| Files | create, read, edit, search | File management with path protection |
| Memory | view, create, edit, delete, rename | Persistent agent memory with injection protection |
| Text Editor | str_replace, insert | Advanced file editing |
| Subagents | spawn_subagent | Spawn isolated sub-agents for task decomposition |
| MCP | mcp__<server>__<tool> | External tools via Model Context Protocol |
| CLI | stateset_cli | StateSet Node CLI integration |

Read More: Tool Reference →

4. Subagent Spawning & MCP Integration

Subagents implement Anthropic's sub-agent compression pattern -- the main agent can spawn specialized sub-agents that operate in isolated contexts and return compressed summaries (50k exploration down to 2k), achieving 95% context savings.

MCP connects external services (Slack, GitHub, Postgres, and more) as tools via the Model Context Protocol. Supports stdio, SSE, and HTTP transports with 8 pre-configured presets.

5. Structured JSON Output

Force Claude to return valid JSON matching a specified schema for reliable automation pipelines. Includes pre-defined schemas for ticket analysis, task results, code review, and entity extraction.
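As a simplified picture of schema-checked output, the sketch below parses a model reply and verifies required keys and types using only the standard library. The schema fields and the function name `parse_structured` are hypothetical. The project's pre-defined schemas are not reproduced in this README.

```python
import json
from typing import Any

# Hypothetical ticket-analysis schema: field name -> expected Python type.
TICKET_ANALYSIS_SCHEMA: dict[str, type] = {
    "ticket_id": str,
    "status": str,
    "should_close": bool,
    "confidence": float,
}


def parse_structured(raw: str, schema: dict[str, type]) -> dict[str, Any]:
    """Parse raw JSON text and check that every schema field is present and typed."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    for key, expected in schema.items():
        if key not in data:
            raise ValueError(f"missing key: {key}")
        value = data[key]
        if expected is float:
            # Accept ints for numeric fields, but not booleans (bool subclasses int).
            ok = isinstance(value, (int, float)) and not isinstance(value, bool)
        else:
            ok = isinstance(value, expected)
        if not ok:
            raise ValueError(f"wrong type for {key}: {type(value).__name__}")
    return data
```

Failing fast on a missing or mistyped field lets a pipeline retry the model call instead of passing malformed data downstream.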

6. Production-Grade Observability

Unified observability system combining all monitoring concerns:

  • Structured Logging: JSON-formatted with automatic request ID correlation
  • Prometheus Metrics: Agent duration, tool execution counts, API latency, cost tracking
  • OpenTelemetry Tracing: Distributed tracing with automatic span creation
  • Real-time Streaming: SSE and WebSocket endpoints for dashboard integration
  • Budget Warnings: Automatic alerts when token/cost budgets approach limits
  • Health Monitoring: API connectivity, display, memory, disk checks with circuit breakers

7. Security-First Design

Multiple layers of security hardened across the stack:

  • Prompt Injection Protection: 11-pattern content sanitizer in memory tool
  • Directory Traversal Prevention: resolve() + relative_to() + symlink detection
  • Agent Isolation: Separate memory directories per agent ID
  • Safe Tool Execution: Pre-execution validation via ToolExecutionGuard
  • Dashboard Auth: JWT-based with tenant isolation, rate limiting, security headers
  • Circuit Breakers: Fault tolerance for external API calls (5 failures → open → 60s recovery)
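The circuit-breaker behavior in the last bullet (5 failures, then open, then a 60-second recovery window) can be sketched as a small state machine. This is a generic illustration under those two stated numbers. The class name, the injectable clock, and the half-open probe handling are assumptions, not the code in agent/health.py.

```python
import time
from typing import Any, Callable


class CircuitBreakerSketch:
    """Closed -> open after N consecutive failures -> half-open after a cooldown."""

    def __init__(self, failure_threshold: int = 5, recovery_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.recovery_seconds = recovery_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at: float | None = None

    @property
    def state(self) -> str:
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self.recovery_seconds:
            return "half-open"  # cooldown elapsed: allow one probe call
        return "open"

    def call(self, func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        if self.state == "open":
            raise RuntimeError("circuit open; call rejected")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # Trip on the threshold, or re-open immediately if a probe fails.
            if self._failures >= self.failure_threshold or self._opened_at is not None:
                self._opened_at = self._clock()
            raise
        # Any success closes the circuit and resets the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

Passing the clock in as a parameter makes the recovery window testable without waiting 60 real seconds.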

Read More: Security Considerations →

📋 Example Usage

Basic Agent Execution

# Using convenience scripts
./start-autoclose-agent.sh
./start-socialmedia-agent.sh
./start-linkedin-agent.sh

# Custom instructions
python main.py "auto-close all tickets marked as resolved"
python main.py "social media hide comments containing profanity"
python main.py "linkedin message CTOs at Series A startups"

Advanced Workflows

# Multi-step workflow
python main.py "auto-close resolved tickets, then generate summary report"

# Conditional logic
python main.py "social media hide inappropriate comments only if flagged by 2+ users"

# Complex automation
python main.py "shopify find products with inventory < 10 and create reorder report"

Tool search & effort controls

  • Defer heavyweight tool schemas until Claude actually needs them:
    python main.py --tool-search bm25 --defer-tool agi_agent --defer-tool memory "run a quarterly revenue analysis"
    
  • Dial Claude Opus 4.6’s token appetite up or down with the --effort flag (we add the effort-2025-11-24 beta header for you):
    python main.py --effort low "gather 10 competitor pricing snapshots"
    

Monitoring & Debugging

# Real-time log filtering
python main.py "your task" 2>&1 | grep "\[AUTO_CLOSE\]"

# Save complete logs
python main.py "your task" 2>&1 | tee logs/run_$(date +%Y%m%d_%H%M%S).log

# View screenshots
ls -lh screenshots/AUTO_CLOSE/
eog screenshots/AUTO_CLOSE/screenshot_*.png

🏗️ Architecture

System Overview

┌─────────────────────────────────────────────────────────────────────┐
│                            CLI / Entry Points                        │
│  main.py (orchestrator)    start-*-agent.sh    stateset-cua CLI      │
│  --tool-version  --effort  --tool-search  --defer-tool  --agent-type │
└────────────────────────────────┬────────────────────────────────────┘
                                 │
                    ┌────────────┴────────────┐
                    ▼                         ▼
          ┌──────────────────┐      ┌──────────────────┐
          │  Agent Selection  │      │  Health Check     │
          │  (keyword match)  │      │  (API, display,   │
          │  get_active_agents│      │   memory, disk)   │
          └────────┬─────────┘      └──────────────────┘
                   │
        ┌──────────┼──────────┐         Parallel asyncio.Tasks
        ▼          ▼          ▼
  ┌───────────┐┌───────────┐┌───────────┐
  │ Agent 1   ││ Agent 2   ││ Agent N   │   run_agent() per agent type
  │ Loop      ││ Loop      ││ Loop      │   with adaptive config +
  │           ││           ││           │   model routing + skills
  └─────┬─────┘└─────┬─────┘└─────┬─────┘
        │             │             │
        └──────────┬──┘─────────────┘
                   ▼
  ┌─────────────────────────────────────────────────────────────┐
  │                    sampling_loop()                           │
  │                    agent/loop.py                             │
  │                                                             │
  │  ┌─────────────┐  ┌──────────────┐  ┌────────────────────┐ │
  │  │ System      │  │ Circuit      │  │ Context            │ │
  │  │ Prompt Init │  │ Breaker      │  │ Optimizer          │ │
  │  │ (StateSet   │  │ (API fault   │  │ (5 Anthropic       │ │
  │  │  APIs)      │  │  tolerance)  │  │  patterns)         │ │
  │  └─────────────┘  └──────────────┘  └────────────────────┘ │
  │                                                             │
  │  ┌──────────────────────────────────────────────────────┐   │
  │  │              Claude Opus 4.6 API Call                 │   │
  │  │  Providers: Anthropic | AWS Bedrock | Google Vertex   │   │
  │  │  Betas: prompt caching, context management,           │   │
  │  │         tool search, effort, web/code/files           │   │
  │  └──────────────────────┬───────────────────────────────┘   │
  │                         │                                   │
  │  ┌──────────────────────▼───────────────────────────────┐   │
  │  │           Parallel Tool Executor                      │   │
  │  │  DependencyAnalyzer → group independent calls         │   │
  │  │  asyncio.gather() for parallel, sequential otherwise  │   │
  │  ├───────────────────────────────────────────────────────┤   │
  │  │                                                       │   │
  │  │  Client-Side Tools        Server-Side Tools           │   │
  │  │  ┌──────────────────┐     ┌────────────────────┐      │   │
  │  │  │ ComputerTool     │     │ web_search         │      │   │
  │  │  │ BashTool         │     │ web_fetch          │      │   │
  │  │  │ EditTool         │     │ code_execution     │      │   │
  │  │  │ MemoryTool       │     │ files_api          │      │   │
  │  │  │ AGITool          │     │ tool_search        │      │   │
  │  │  │ SubagentTool     │     └────────────────────┘      │   │
  │  │  │ StateSetCLITool  │                                 │   │
  │  │  │ AskUserTool      │     MCP Tools (External)        │   │
  │  │  └──────────────────┘     ┌────────────────────┐      │   │
  │  │                           │ mcp__slack__*      │      │   │
  │  │                           │ mcp__github__*     │      │   │
  │  │                           │ mcp__postgres__*   │      │   │
  │  │                           └────────────────────┘      │   │
  │  └───────────────────────────────────────────────────────┘   │
  │                                                             │
  │  ┌─────────────┐  ┌──────────────┐  ┌────────────────────┐ │
  │  │ Stuck       │  │ Checkpoint   │  │ Subagent           │ │
  │  │ Detector    │  │ Manager      │  │ Manager            │ │
  │  │ (loop/cycle │  │ (resume      │  │ (isolated context, │ │
  │  │  detection) │  │  long tasks) │  │  Haiku compress)   │ │
  │  └─────────────┘  └──────────────┘  └────────────────────┘ │
  └─────────────────────────────────────────────────────────────┘
                   │
                   ▼
  ┌─────────────────────────────────────────────────────────────┐
  │                   Observability Layer                        │
  │  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌────────────┐ │
  │  │Structured│  │Prometheus│  │OpenTel   │  │Event Bus   │ │
  │  │Logging   │  │Metrics   │  │Tracing   │  │(SSE + WS)  │ │
  │  └──────────┘  └──────────┘  └──────────┘  └────────────┘ │
  └─────────────────────────────────────────────────────────────┘
                   │
                   ▼
  ┌─────────────────────────────────────────────────────────────┐
  │                    Dashboard (Web UI)                        │
  │  ┌─────────────────┐  ┌─────────┐  ┌────────────────────┐ │
  │  │ Next.js Frontend │  │ FastAPI │  │ Celery Worker      │ │
  │  │ React Query + SSE│  │ Backend │  │ (invokes           │ │
  │  │ :3000            │  │ :8000   │  │  sampling_loop)    │ │
  │  └─────────────────┘  └─────────┘  └────────────────────┘ │
  │  ┌──────────┐  ┌──────────┐  ┌──────────────────────────┐ │
  │  │PostgreSQL│  │  Redis   │  │ MinIO (S3 artifacts)     │ │
  │  └──────────┘  └──────────┘  └──────────────────────────┘ │
  └─────────────────────────────────────────────────────────────┘

How It Works: Request Flow

  1. CLI Invocation -- python main.py "auto-close tickets" parses runtime options (tool version, effort level, tool search, deferred tools) and enters continuous_loop().
  2. Agent Selection -- get_active_agents() matches keywords in the instruction to agent types. Multiple matches run in parallel as independent asyncio.Task instances.
  3. Adaptive Configuration -- Each agent gets a tailored config: thinking budget, max tokens, complexity score, and model selection via select_model_for_task().
  4. Sampling Loop -- sampling_loop() is the core conversation engine. It initializes tools from TOOL_GROUPS_BY_VERSION, fetches the system prompt from StateSet APIs, creates the API client, and enters the message loop.
  5. API Call -- Each iteration sends the conversation to Claude Opus 4.6 through a CircuitBreaker, with prompt caching on the 3 most recent turns and dynamic context management adapting compaction aggressiveness to current token usage.
  6. Tool Execution -- Claude's tool calls are analyzed by DependencyAnalyzer. Read-only, independent calls execute in parallel via asyncio.gather(); dependent calls execute sequentially. MCP tools are dispatched to their respective server connections.
  7. Context Optimization -- After each iteration, the ContextOptimizer tracks attention quality (EXCELLENT through CRITICAL) and applies compaction strategies. Messages are compressed when history exceeds 20 turns.
  8. Completion -- When Claude responds with no tool calls, the loop returns SamplingLoopResult. The orchestrator runs analyze_task_completion(), records metrics, and sends a Stripe billing event.
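Steps 4 through 8 can be condensed into a toy loop: call the model, run any requested tools, trim old history, and stop when a reply contains no tool calls. Everything here (`ModelReply`, `run_loop`, the naive trimming rule) is a simplified stand-in written for this README and should not be read as the contents of agent/loop.py.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class ModelReply:
    """Hypothetical model response: text plus zero or more (tool_name, args) calls."""
    text: str
    tool_calls: list[tuple[str, dict[str, Any]]] = field(default_factory=list)


def run_loop(model: Callable[[list[dict[str, str]]], ModelReply],
             tools: dict[str, Callable[..., Any]],
             instruction: str,
             max_history: int = 20,
             max_iterations: int = 50) -> tuple[str, list[dict[str, str]]]:
    messages = [{"role": "user", "content": instruction}]
    for _ in range(max_iterations):
        reply = model(messages)
        messages.append({"role": "assistant", "content": reply.text})
        if not reply.tool_calls:
            return reply.text, messages  # no tool calls means the task is complete
        for name, args in reply.tool_calls:
            result = tools[name](**args)
            messages.append({"role": "user", "content": f"[{name}] {result}"})
        if len(messages) > max_history:
            # Naive compaction: keep the original instruction plus the newest turns.
            messages = messages[:1] + messages[-(max_history - 1):]
    raise RuntimeError("iteration limit reached without completion")
```

A real loop would summarize dropped turns rather than discard them, and would route the model call through a circuit breaker. The iteration cap guards against a model that never stops requesting tools.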

Core Components

| Component | Responsibility | Location |
| --- | --- | --- |
| Orchestrator | CLI parsing, agent selection, parallel dispatch, billing | main.py |
| Agent Loop | Conversation loop, API calls, message management | agent/loop.py |
| Tool Collection | Tool registration, dispatch, deferred loading | agent/tools/collection.py |
| Tool Groups | Version-specific tool bundles (20241022, 20250124, 20251124, cli) | agent/tools/groups.py |
| Parallel Executor | Dependency analysis, safe parallel tool execution | agent/parallel_executor.py |
| Context Optimizer | JIT retrieval, compaction, attention budget, sub-agent compression | agent/context_optimizer.py |
| Subagent Manager | Spawn isolated sub-agents (explore, analyze, execute, research, code) | agent/subagent.py |
| MCP Client | Connect external tools via stdio/SSE/HTTP transports | agent/mcp_client.py |
| Structured Output | JSON schema validation, extraction, pre-defined schemas | agent/structured_output.py |
| Stuck Detector | Repeating actions, no visual changes, cycling detection | agent/stuck_detection.py |
| Checkpoint Manager | Save/resume long-running tasks, heartbeat monitoring | agent/checkpoint.py |
| Skill Manager | Skill profiles, agent-skill mapping, container resolution | agent/skill_manager.py |
| Health Checker | API connectivity, display, memory, disk checks | agent/health.py |
| Circuit Breaker | Fault tolerance for API calls (closed/open/half-open) | agent/health.py |
| Config | Centralized settings with YAML loading and env substitution | agent/config.py |
| Observability | Unified logging, Prometheus metrics, OpenTelemetry tracing, event streaming | agent/observability/ |
| Exception Hierarchy | Typed errors (retryable, non-retryable, budget, tool, resource) | agent/exceptions.py |

Tool Versions

The system ships with 4 tool bundles, selected via --tool-version:

| Version | Tools | Beta Flag | Display Required |
| --- | --- | --- | --- |
| computer_use_20251124 (default) | Computer (with zoom), Edit, Bash, Memory, AGI, CLI, AskUser | computer-use-2025-11-24 | Yes |
| computer_use_20250124 | Computer, Edit, Bash, Memory, AGI, CLI, AskUser | computer-use-2025-01-24 | Yes |
| computer_use_20241022 | Computer, Edit, Bash | computer-use-2024-10-22 | Yes |
| cli_20250124 | Bash, Edit, Memory, AGI, CLI, AskUser | None | No |

Additionally, SubagentTool is loaded dynamically at runtime (requires API key), and MCP tools are added from connected servers.

Subagent System

Implements Anthropic's sub-agent compression pattern. The main agent spawns isolated sub-agents that return compressed summaries instead of raw output (50k tokens of exploration compressed to 2k summary):

| Type | Model | Use Case | Max Turns | Timeout |
| --- | --- | --- | --- | --- |
| explore | Haiku 4.5 | Fast codebase/data exploration | 5 | 60s |
| analyze | Sonnet 4.5 | Deep analysis with thinking | 8 | 90s |
| execute | Sonnet 4.5 | Task execution with verification | 15 | 180s |
| research | Haiku 4.5 | Web search and synthesis | 8 | 120s |
| code | Sonnet 4.5 | Code generation and modification | 12 | 180s |
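The per-type limits above lend themselves to a small profile table with the timeout enforced by `asyncio.wait_for`. The numbers are taken from the table. The `SubagentProfile` dataclass and `run_with_profile` helper are illustrative names, and the `work` callable stands in for a real sub-agent run.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable


@dataclass(frozen=True)
class SubagentProfile:
    model: str
    max_turns: int
    timeout_seconds: float


SUBAGENT_PROFILES: dict[str, SubagentProfile] = {
    "explore": SubagentProfile("Haiku 4.5", 5, 60),
    "analyze": SubagentProfile("Sonnet 4.5", 8, 90),
    "execute": SubagentProfile("Sonnet 4.5", 15, 180),
    "research": SubagentProfile("Haiku 4.5", 8, 120),
    "code": SubagentProfile("Sonnet 4.5", 12, 180),
}


async def run_with_profile(kind: str,
                           work: Callable[[SubagentProfile], Awaitable[str]]) -> str:
    """Run a sub-agent coroutine under the timeout configured for its type."""
    profile = SUBAGENT_PROFILES[kind]
    # wait_for cancels the work and raises TimeoutError if the limit is exceeded.
    return await asyncio.wait_for(work(profile), timeout=profile.timeout_seconds)
```

Only the string returned by `work` flows back to the caller, which mirrors the idea that the parent agent sees a short summary rather than the sub-agent's full transcript.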

MCP Integration

Connect external tools via Model Context Protocol with 3 transport types (stdio, SSE, HTTP) and 8 pre-configured presets:

# Available presets: slack, github, postgres, filesystem, memory, brave-search, puppeteer, sqlite

Tools appear in the conversation as mcp__<server>__<tool> (e.g., mcp__slack__send_message).
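Given that naming convention, routing a tool call to the right server comes down to splitting the name. The helper below is a minimal sketch. `parse_mcp_tool_name` is a hypothetical name and not a function from the MCP client.

```python
def parse_mcp_tool_name(name: str) -> tuple[str, str]:
    """Split 'mcp__<server>__<tool>' into (server, tool)."""
    prefix = "mcp__"
    if not name.startswith(prefix):
        raise ValueError(f"not an MCP tool name: {name!r}")
    # Split on the first double underscore after the prefix; the tool part
    # may itself contain single or double underscores.
    server, sep, tool = name[len(prefix):].partition("__")
    if not sep or not server or not tool:
        raise ValueError(f"malformed MCP tool name: {name!r}")
    return server, tool
```

For example, `parse_mcp_tool_name("mcp__slack__send_message")` returns `("slack", "send_message")`.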

Dashboard

The web dashboard provides job management, real-time monitoring, and artifact storage:

Frontend (Next.js 14)          Backend (FastAPI)           Worker (Celery)
─────────────────────          ─────────────────           ───────────────
Dashboard home                 POST /api/jobs              Receives job from
Launch Task form         ──►   GET  /api/jobs              Redis queue
Live Runs (SSE)          ◄──   GET  /api/events/jobs       Calls sampling_loop()
Outputs browser                GET  /api/artifacts         Stores artifacts in S3
Template management            CRUD /api/templates         Records billing via Stripe
Usage metrics                  GET  /api/metrics/overview
                               CRUD /api/agi              PostgreSQL (persistence)
                               CRUD /api/skills           Redis (broker/backend)
                               GET  /api/observability    MinIO (S3 artifacts)

Deploy with Docker Compose (docker compose up -d): frontend on :3000, backend on :8000, with Postgres, Redis, and MinIO. Optional monitoring profile adds Prometheus, Grafana, and OpenTelemetry Collector.

Read More: Architecture Documentation →

📚 Documentation

Getting Started

Technical Deep-Dives

Feature Documentation

Advanced Topics

📊 Performance & Cost

Real-World Metrics

Based on production usage across 1,000+ agent runs:

| Metric | Before Optimization | After Optimization | Improvement |
| --- | --- | --- | --- |
| Avg Tokens/Task | 150,000 | 7,500 | 95% reduction |
| Avg Cost/Task | $2.25 | $0.11 | 95% savings |
| Avg Task Duration | 45s | 30s | 33% faster |
| Context Quality | Degrades >50k tokens | EXCELLENT at 500k+ | Indefinite |
| Parallel Speed | Sequential (baseline) | 30-50% faster | 1.5x speedup |
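The improvement column follows directly from the before and after figures. The short check below recomputes it. The helper names are made up for the example.

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage by which `after` is smaller than `before`."""
    return (before - after) / before * 100


def speedup(before: float, after: float) -> float:
    """How many times faster the optimized run is."""
    return before / after


if __name__ == "__main__":
    print(round(percent_reduction(150_000, 7_500), 1))  # tokens per task: 95.0
    print(round(percent_reduction(2.25, 0.11), 1))      # cost per task: 95.1
    print(round(percent_reduction(45, 30), 1))          # task duration: 33.3
    print(speedup(45, 30))                              # same change as a ratio: 1.5
```

Note that a 33% shorter duration and a 1.5x speedup describe the same change, since 45s / 30s = 1.5.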

Cost Breakdown (per 1M tokens)

| Operation | Input Cost | Output Cost | Typical Usage |
| --- | --- | --- | --- |
| Claude Opus 4.6 | $3.00 | $15.00 | Main model |
| Extended Thinking | $3.00 | $15.00 | Complex tasks only |
| Prompt Caching (hit) | $0.30 | $15.00 | 90% cost reduction |

Pro Tip: Enable prompt caching for system prompts to achieve an additional 90% savings on input tokens.
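To see how the rates in the table combine, here is a small estimator that takes the per-million-token prices listed above as given. The function name and the split of input tokens into cached and uncached portions are assumptions for illustration. Actual billing may include items not shown in this README, such as cache-write charges.

```python
# USD per 1M tokens, copied from the cost table above.
INPUT_RATE = 3.00
OUTPUT_RATE = 15.00
CACHE_HIT_INPUT_RATE = 0.30


def estimate_cost(input_tokens: int, output_tokens: int,
                  cached_input_tokens: int = 0) -> float:
    """Estimate the USD cost of one call from token counts."""
    uncached = input_tokens - cached_input_tokens
    if uncached < 0:
        raise ValueError("cached_input_tokens cannot exceed input_tokens")
    total = (uncached * INPUT_RATE
             + cached_input_tokens * CACHE_HIT_INPUT_RATE
             + output_tokens * OUTPUT_RATE)
    return total / 1_000_000
```

As a worked case, 100,000 input tokens and 5,000 output tokens come to about $0.375 with no caching, and about $0.132 if 90,000 of the input tokens are cache hits.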

🔧 Configuration

Environment Variables

# Required
ANTHROPIC_API_KEY=sk-ant-api03-...     # Claude API access
DISPLAY=:1                              # X11 display server

# Optional
STRIPE_API_KEY=sk_live_...             # Usage-based billing
WORKSPACE_PATH=/path/to/workspace       # Working directory

Agent Configuration

Agents are configured via StateSet API or directly in main.py:

AGENT_CONFIGS = {
    "AUTO_CLOSE": AgentConfig(
        agent_id="stateset_auto_close",
        agent_type="AUTO_CLOSE",
        name="Auto-Close Agent",
        description="Automatically closes support tickets",
        stripe_customer_id="cus_..."  # Optional: for billing
    ),
    # ... more agents
}

Provider Selection

Support for multiple Claude providers:

# Anthropic (default)
provider = APIProvider.ANTHROPIC

# AWS Bedrock
provider = APIProvider.BEDROCK

# Google Vertex AI
provider = APIProvider.VERTEX

Read More: Configuration Guide →

🛡️ Security Best Practices

Critical Security Considerations

  1. Never commit API keys: Use environment variables or .env files

    echo '.env' >> .gitignore
    export ANTHROPIC_API_KEY='...'
    
  2. Secure screenshot storage: May contain sensitive user data

    chmod 700 screenshots/
    # Implement automatic cleanup policy
    
  3. Validate agent actions: Use tool guard for risky operations

    # Pre-execution validation in agent/tool_guard.py
    
  4. Agent memory isolation: Separate storage per agent

    # Memory stored in /tmp/agent_memories/{agent_id}/
    
  5. Prompt injection protection: Sanitize user inputs

    # Implemented in agent/tools/memory.py
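For item 4, the `resolve()` + `relative_to()` + symlink check mentioned under Security-First Design can be sketched with `pathlib`. This is a generic illustration of the technique, not the code in agent/tools/memory.py. The function name `safe_memory_path` and its error messages are made up for the example.

```python
from pathlib import Path


def safe_memory_path(base_dir: str, agent_id: str, relative: str) -> Path:
    """Resolve a path inside base_dir/agent_id, rejecting escapes and symlinks."""
    base = Path(base_dir).resolve()
    root = (base / agent_id).resolve()
    if base not in root.parents:
        raise PermissionError(f"invalid agent id: {agent_id!r}")
    if Path(relative).is_absolute():
        raise PermissionError(f"absolute paths are not allowed: {relative!r}")

    candidate = root / relative
    resolved = candidate.resolve()
    try:
        resolved.relative_to(root)  # raises ValueError if outside the agent directory
    except ValueError:
        raise PermissionError(f"path escapes agent directory: {relative!r}") from None

    # Walk from the candidate up to the root and refuse any symlinked component.
    current = candidate
    while current != root:
        if current.is_symlink():
            raise PermissionError(f"symlink in path: {current}")
        current = current.parent
    return resolved
```

Checking the resolved path catches `..` sequences, while the explicit symlink walk rejects links even when they happen to point back inside the agent's own directory.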
    

Read More: Security Considerations →

🐛 Troubleshooting

Common Issues

Display not found error

Error: Error: Can't open display :1

Solution:

# Start virtual display
Xvfb :1 -screen 0 1920x1080x24 &
export DISPLAY=:1

# Verify
xdpyinfo | grep dimensions

Import errors for anthropic/pyautogui

Error: ModuleNotFoundError: No module named 'anthropic'

Solution:

# Ensure virtual environment is activated
source venv/bin/activate

# Reinstall dependencies
pip install -r requirements.txt

# Verify installation
pip list | grep anthropic

API authentication failures

Error: 401 Unauthorized

Solution:

# Verify API key format (starts with sk-ant-api03-)
echo $ANTHROPIC_API_KEY

# Test API connection
curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{"model":"claude-opus-4-6-20260131","max_tokens":1024,"messages":[{"role":"user","content":"test"}]}'

Read More: Troubleshooting Guide →

🚧 Roadmap

Implemented ✅

  • Multi-agent architecture with parallel execution
  • All 5 Anthropic context engineering patterns
  • Automatic tool parallelization (30-50% speedup)
  • Extended thinking support with effort control
  • Production monitoring and billing (Stripe)
  • Security hardening (prompt injection, path traversal, memory isolation)
  • Skills system for extensibility
  • Web dashboard with real-time SSE updates
  • Distributed tracing (OpenTelemetry integration)
  • Circuit breakers for API fault tolerance
  • Comprehensive exception hierarchy with typed errors
  • Subagent spawning for isolated task decomposition
  • MCP client integration (Slack, GitHub, Postgres, etc.)
  • Structured JSON output with schema validation
  • Stuck detection and recovery (repeating actions, cycling, stale loops)
  • Checkpoint system for resumable long-running tasks
  • Unified observability (logging, metrics, tracing, event streaming)
  • Tool search for deferred tool loading
  • Centralized configuration with YAML and env var support
  • 2,700+ unit tests

Planned 📋

  • Agent marketplace for community contributions
  • Kubernetes deployment templates
  • ML-based screenshot delay prediction
  • Enhanced cost optimization (LZ4 compression)

Read More: Future Improvements →

📈 Metrics & Monitoring

Built-in Metrics

Every agent run captures comprehensive metrics:

{
  "task_id": "task_20250324_142301",
  "agent_type": "AUTO_CLOSE",
  "duration_seconds": 32.5,
  "tokens_used": 8234,
  "estimated_cost": 0.12,
  "tools_executed": 15,
  "parallel_executions": 3,
  "success": true,
  "completion_indicators": ["ticket closed", "task finished"]
}

Stripe Billing Integration

Automatic usage-based billing:

# Meters configured at Stripe
meter_id = "computer_use_tokens"

# Events sent on task completion
{
  "event_name": "computer_use_tokens",
  "payload": {
    "stripe_customer_id": "cus_...",
    "value": 8234  # tokens used
  }
}

Read More: Metrics Guide →

🤝 Contributing

This is proprietary software. For internal contributors:

  1. Follow the development guidelines
  2. All changes require review from 2+ team members
  3. Ensure tests pass and coverage remains >80%
  4. Update documentation for user-facing changes

📞 Support

Resources

  • Documentation: Start with Getting Started Guide
  • Bug Reports: Create detailed issue reports with logs and screenshots
  • Feature Requests: Submit to product team with use cases

Contact

For support, contact the StateSet team:

  • Email: support@stateset.com
  • Slack: #computer-use-agents (internal)
  • Emergency: On-call rotation (internal)

📄 License

This project is proprietary software. All rights reserved.

Unauthorized copying, modification, distribution, or use of this software is strictly prohibited.

For licensing inquiries, contact: legal@stateset.com

🙏 Acknowledgments

Built with:


StateSet Computer Use Agent

Autonomous AI agents for the modern enterprise

Made with ❤️ by the StateSet team
