Production-grade AI automation platform powered by Claude Opus 4.6
StateSet Computer Use Agent
Advanced AI Agent System for Autonomous Computer Operation
Features • Quick Start • Documentation • Agents • Architecture
🎯 Overview
StateSet Computer Use Agent is a production-grade AI automation platform powered by Claude Opus 4.6, designed for autonomous computer operation with human-level reliability and intelligence. The system uses multiple specialized AI agents that can see, understand, and interact with desktop environments to complete complex, long-running tasks.
Why StateSet Computer Use Agent?
- 🧠 State-of-the-Art AI: Powered by Claude Opus 4.6 with extended thinking capabilities
- ⚡ 30-50% Faster: Automatic parallel tool execution for independent operations
- 💰 95% Cost Savings: Research-based context engineering reduces token usage dramatically
- 🔄 Indefinite Conversations: Maintains EXCELLENT attention quality across unlimited context
- 🏢 Production-Ready: Includes monitoring, billing, security, and graceful failure handling
- 🎨 Multi-Agent: 7 specialized agents for different business workflows
🆕 What's New in Claude 4.5
Updated model lineup
- Claude Opus 4.6 – most intelligent Anthropic model ever shipped, now priced for day-to-day production agents.
- Claude Sonnet 4.5 – best balance of capability and cost for complex coding or orchestration.
- Claude Haiku 4.5 – fastest Haiku yet with near-frontier reasoning and the first Haiku model that supports extended thinking.
Opus 4.6 enhancements
- Maximum intelligence across reasoning, debugging, and strategic planning tasks.
- Thinking block preservation keeps the model’s reasoning context intact across turns for better long-running workflows (no extra flags required).
- Computer-use excellence with the new zoom action in computer_20251124, enabling pixel-level inspection of dense UI or fine print before taking action.
- Practical performance thanks to lower pricing plus automatic prompt caching, so advanced agents stay affordable.
Effort parameter (beta)
- Claude Opus 4.6 is the only model that accepts an effort setting (low, medium, high).
- We automatically attach the required effort-2025-11-24 beta header and forward the setting via output_config.
- Configure it per run with:
python main.py --effort medium "triage support tickets and summarize blockers"
Use low for high-volume automations, medium for balanced cost/performance, and high (default) for maximal quality.
- Works alongside the thinking token budget: effort controls overall token appetite while thinking_budget still caps meta-reasoning tokens.
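As a rough illustration of the forwarding described above, the sketch below builds the extra header and body fields for a chosen effort level. The helper name build_effort_request and the exact nesting under output_config are assumptions made for this example; only the beta flag value, the output_config field name, and the three levels come from this README.

```python
VALID_EFFORT_LEVELS = ("low", "medium", "high")
EFFORT_BETA_HEADER = "effort-2025-11-24"  # beta flag named in this README


def build_effort_request(effort: str = "high") -> dict:
    """Return the extra header and body fields for a given effort level.

    Hypothetical helper: the nesting under ``output_config`` is an assumption.
    """
    if effort not in VALID_EFFORT_LEVELS:
        raise ValueError(f"effort must be one of {VALID_EFFORT_LEVELS}, got {effort!r}")
    return {
        "headers": {"anthropic-beta": EFFORT_BETA_HEADER},
        "body": {"output_config": {"effort": effort}},
    }
```

Validating the level before building the request means a typo in --effort fails locally instead of surfacing as an API error.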
🌟 Core Innovation: Context Engineering
Based on Anthropic's Latest Research - Full implementation of all 5 patterns from "Effective Context Engineering for AI Agents":
| Pattern | Implementation | Result |
|---|---|---|
| Just-in-Time Retrieval | grep/head/tail instead of full file reads | 60-99% token savings |
| Dynamic Compaction | Adaptive clearing based on attention budget | Maintains quality as context grows |
| Structured Note-Taking | Persistent memory outside context window | Unlimited task complexity |
| Sub-Agent Compression | Exploration agents return concise summaries | 50k → 2k token summaries |
| Attention Budget Monitoring | Real-time quality tracking (EXCELLENT→CRITICAL) | Prevents quality degradation |
Impact: Enables indefinite conversations with EXCELLENT attention quality while achieving 95% cost reduction compared to naive implementations.
Read More: Context Engineering Details →
🚀 Quick Start
Prerequisites
- Ubuntu Linux 20.04+ (kernel 5.15.0+)
- Python 3.10 or higher
- Anthropic API key (get one here)
- X11 virtual display (we'll set this up)
- (Optional) Node.js + npm if you want to use the StateSet CLI (@stateset-cli) from within agent tasks.
Installation
Option 1: pip install (recommended)
pip install stateset-cua
# Configure API key
export ANTHROPIC_API_KEY='your-key-here'
# Start virtual display (required for GUI automation)
Xvfb :1 -screen 0 1920x1080x24 &
export DISPLAY=:1
# Run your first agent
stateset-cua run "auto-close resolved tickets"
Option 2: From source
git clone https://github.com/stateset/stateset-computer-use-agent.git
cd stateset-computer-use-agent
pip install -e ".[dev]"
export ANTHROPIC_API_KEY='your-key-here'
Xvfb :1 -screen 0 1920x1080x24 &
export DISPLAY=:1
stateset-cua run "auto-close resolved tickets"
Need more help? See the comprehensive Getting Started Guide →
🤖 Available Agents
StateSet includes 7 specialized agents optimized for different business workflows:
| Agent | Purpose | Example Use Case |
|---|---|---|
| AUTO_CLOSE | Support ticket automation | "auto-close all resolved tickets from last 24 hours" |
| SOCIAL_MEDIA | Content moderation & engagement | "social media hide inappropriate comments on Facebook" |
| LINKEDIN_MESSENGER | Professional outreach | "linkedin send connection requests to AI engineers in SF" |
| SLACK_SUPPORT | Customer support automation | "slack respond to all unanswered questions in #support" |
| SHOPIFY | E-commerce management | "shopify update inventory for out-of-stock products" |
| ONBOARDING | User onboarding workflows | "onboard new enterprise customer with custom rules" |
| STATESET_AGENTIC | General-purpose automation | "organize desktop files and create summary report" |
Multi-Agent Orchestration
Run multiple agents in parallel for complex workflows:
python main.py "auto-close tickets and social media monitoring and slack support"
How it works:
- Automatic keyword detection selects appropriate agents
- Parallel execution (not sequential) for independent tasks
- Unified logging with [AGENT_TYPE] prefixes
- Aggregated metrics and billing
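The keyword detection and parallel dispatch listed above could look roughly like the sketch below. The keyword table, the fallback to STATESET_AGENTIC, and the stand-in run_agent coroutine are all assumptions for illustration; the project's real matching rules are not shown in this README.

```python
import asyncio

# Hypothetical keyword table; the real mapping lives in the project itself.
AGENT_KEYWORDS = {
    "AUTO_CLOSE": ("auto-close",),
    "SOCIAL_MEDIA": ("social media",),
    "SLACK_SUPPORT": ("slack",),
}
FALLBACK_AGENT = "STATESET_AGENTIC"  # assumed general-purpose default


def select_agents(instruction: str) -> list[str]:
    """Return every agent type whose keyword appears in the instruction."""
    text = instruction.lower()
    matched = [agent for agent, words in AGENT_KEYWORDS.items()
               if any(word in text for word in words)]
    return matched or [FALLBACK_AGENT]


async def run_agent(agent_type: str, instruction: str) -> str:
    """Stand-in for a real agent loop; it only tags its output with the agent type."""
    await asyncio.sleep(0)  # yield control, as a real agent would while waiting on I/O
    return f"[{agent_type}] done"


async def orchestrate(instruction: str) -> list[str]:
    """Run all matched agents concurrently and collect results in selection order."""
    agents = select_agents(instruction)
    return await asyncio.gather(*(run_agent(a, instruction) for a in agents))
```

Calling asyncio.run(orchestrate("auto-close tickets and slack support")) would start two agent coroutines concurrently rather than one after the other.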
🛠️ Key Features
1. Computer Vision & Control
Agents can see and interact with any desktop application:
- Screenshot Analysis: High-resolution screen capture with caching
- Mouse Control: Click, drag, scroll with pixel-perfect precision
- Keyboard Input: Type text, keyboard shortcuts, special keys
- Adaptive Delays: Smart waiting based on action type (0.0s-0.6s)
2. Intelligent Parallel Execution
Automatic dependency analysis for safe parallelization:
# Before optimization (3 sequential API calls)
tool_use_1 = web_search("Claude AI") # 2.5s
tool_use_2 = web_search("Anthropic") # 2.5s
tool_use_3 = web_search("computer use") # 2.5s
# Total: 7.5 seconds
# After optimization (1 parallel API call)
parallel_execution([
web_search("Claude AI"),
web_search("Anthropic"),
web_search("computer use")
])
# Total: 2.5 seconds (3x faster!)
Performance: 30-50% speed improvement on real-world tasks
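The comparison above is schematic. A runnable sketch of the same idea, using asyncio.gather with a simulated search call, is shown here; web_search is a local stand-in that only sleeps, and the 0.1s latency is an arbitrary value chosen for the demo.

```python
import asyncio
import time


async def web_search(query: str, latency: float = 0.1) -> str:
    """Simulated search call; the sleep stands in for network latency."""
    await asyncio.sleep(latency)
    return f"results for {query!r}"


async def run_sequential(queries: list[str]) -> list[str]:
    """Await each call before starting the next one."""
    return [await web_search(q) for q in queries]


async def run_parallel(queries: list[str]) -> list[str]:
    """Start all calls at once; gather() preserves the input order."""
    return await asyncio.gather(*(web_search(q) for q in queries))


async def main() -> None:
    queries = ["Claude AI", "Anthropic", "computer use"]
    for runner in (run_sequential, run_parallel):
        start = time.perf_counter()
        await runner(queries)
        print(f"{runner.__name__}: {time.perf_counter() - start:.2f}s")


if __name__ == "__main__":
    asyncio.run(main())
```

With three independent calls, the parallel run should take roughly one call's latency while the sequential run takes roughly three; real speedups depend on how many calls in a task are actually independent.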
Read More: Parallel Execution →
3. Advanced Tool Suite
Agents have access to powerful tools across multiple categories:
| Category | Tools | Description |
|---|---|---|
| Computer | click, type, scroll, zoom, screenshot | Desktop interaction (computer_20251124) |
| Web | web_search, web_fetch | Internet access with citations |
| Code | code_execution | Sandboxed Python/Bash execution |
| Files | create, read, edit, search | File management with path protection |
| Memory | view, create, edit, delete, rename | Persistent agent memory with injection protection |
| Text Editor | str_replace, insert | Advanced file editing |
| Subagents | spawn_subagent | Spawn isolated sub-agents for task decomposition |
| MCP | mcp__<server>__<tool> | External tools via Model Context Protocol |
| CLI | stateset_cli | StateSet Node CLI integration |
4. Subagent Spawning & MCP Integration
Subagents implement Anthropic's sub-agent compression pattern -- the main agent can spawn specialized sub-agents that operate in isolated contexts and return compressed summaries (50k exploration down to 2k), achieving 95% context savings.
MCP connects external services (Slack, GitHub, Postgres, and more) as tools via the Model Context Protocol. Supports stdio, SSE, and HTTP transports with 8 pre-configured presets.
5. Structured JSON Output
Force Claude to return valid JSON matching a specified schema for reliable automation pipelines. Includes pre-defined schemas for ticket analysis, task results, code review, and entity extraction.
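A minimal sketch of the idea, assuming a simplified schema that maps field names to Python types: parse the model's reply, then reject it if a required field is missing or has the wrong type. The schema and the parse_structured helper are hypothetical, and this check is far simpler than full JSON Schema validation.

```python
import json

# Hypothetical schema for illustration; the project's own schemas may differ.
TICKET_ANALYSIS_SCHEMA = {
    "ticket_id": str,
    "priority": str,
    "resolved": bool,
}


def parse_structured(raw: str, schema: dict[str, type]) -> dict:
    """Parse a JSON object and check required keys and value types."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for key, expected in schema.items():
        if key not in data:
            raise ValueError(f"missing required field: {key}")
        if not isinstance(data[key], expected):
            raise ValueError(f"field {key!r} should be {expected.__name__}")
    return data
```

Because every failure mode raises ValueError, a caller can wrap the parse in a single except clause and re-prompt the model when the output does not match.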
6. Production-Grade Observability
Unified observability system combining all monitoring concerns:
- Structured Logging: JSON-formatted with automatic request ID correlation
- Prometheus Metrics: Agent duration, tool execution counts, API latency, cost tracking
- OpenTelemetry Tracing: Distributed tracing with automatic span creation
- Real-time Streaming: SSE and WebSocket endpoints for dashboard integration
- Budget Warnings: Automatic alerts when token/cost budgets approach limits
- Health Monitoring: API connectivity, display, memory, disk checks with circuit breakers
7. Security-First Design
Multiple layers of security hardened across the stack:
- Prompt Injection Protection: 11-pattern content sanitizer in memory tool
- Directory Traversal Prevention: resolve() + relative_to() + symlink detection
- Agent Isolation: Separate memory directories per agent ID
- Safe Tool Execution: Pre-execution validation via ToolExecutionGuard
- Dashboard Auth: JWT-based with tenant isolation, rate limiting, security headers
- Circuit Breakers: Fault tolerance for external API calls (5 failures → open → 60s recovery)
Read More: Security Considerations →
📋 Example Usage
Basic Agent Execution
# Using convenience scripts
./start-autoclose-agent.sh
./start-socialmedia-agent.sh
./start-linkedin-agent.sh
# Custom instructions
python main.py "auto-close all tickets marked as resolved"
python main.py "social media hide comments containing profanity"
python main.py "linkedin message CTOs at Series A startups"
Advanced Workflows
# Multi-step workflow
python main.py "auto-close resolved tickets, then generate summary report"
# Conditional logic
python main.py "social media hide inappropriate comments only if flagged by 2+ users"
# Complex automation
python main.py "shopify find products with inventory < 10 and create reorder report"
Tool search & effort controls
- Defer heavyweight tool schemas until Claude actually needs them:
python main.py --tool-search bm25 --defer-tool agi_agent --defer-tool memory "run a quarterly revenue analysis"
- Dial Claude Opus 4.6’s token appetite up or down with the --effort flag (we add the effort-2025-11-24 beta header for you):
python main.py --effort low "gather 10 competitor pricing snapshots"
Monitoring & Debugging
# Real-time log filtering
python main.py "your task" 2>&1 | grep "\[AUTO_CLOSE\]"
# Save complete logs
python main.py "your task" 2>&1 | tee logs/run_$(date +%Y%m%d_%H%M%S).log
# View screenshots
ls -lh screenshots/AUTO_CLOSE/
eog screenshots/AUTO_CLOSE/screenshot_*.png
🏗️ Architecture
System Overview
┌─────────────────────────────────────────────────────────────────────┐
│ CLI / Entry Points │
│ main.py (orchestrator) start-*-agent.sh stateset-cua CLI │
│ --tool-version --effort --tool-search --defer-tool --agent-type │
└────────────────────────────────┬────────────────────────────────────┘
│
┌────────────┴────────────┐
▼ ▼
┌──────────────────┐ ┌──────────────────┐
│ Agent Selection │ │ Health Check │
│ (keyword match) │ │ (API, display, │
│ get_active_agents│ │ memory, disk) │
└────────┬─────────┘ └──────────────────┘
│
┌──────────┼──────────┐ Parallel asyncio.Tasks
▼ ▼ ▼
┌───────────┐┌───────────┐┌───────────┐
│ Agent 1 ││ Agent 2 ││ Agent N │ run_agent() per agent type
│ Loop ││ Loop ││ Loop │ with adaptive config +
│ ││ ││ │ model routing + skills
└─────┬─────┘└─────┬─────┘└─────┬─────┘
│ │ │
└──────────┬──┘─────────────┘
▼
┌─────────────────────────────────────────────────────────────┐
│ sampling_loop() │
│ agent/loop.py │
│ │
│ ┌─────────────┐ ┌──────────────┐ ┌────────────────────┐ │
│ │ System │ │ Circuit │ │ Context │ │
│ │ Prompt Init │ │ Breaker │ │ Optimizer │ │
│ │ (StateSet │ │ (API fault │ │ (5 Anthropic │ │
│ │ APIs) │ │ tolerance) │ │ patterns) │ │
│ └─────────────┘ └──────────────┘ └────────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ Claude Opus 4.6 API Call │ │
│ │ Providers: Anthropic | AWS Bedrock | Google Vertex │ │
│ │ Betas: prompt caching, context management, │ │
│ │ tool search, effort, web/code/files │ │
│ └──────────────────────┬───────────────────────────────┘ │
│ │ │
│ ┌──────────────────────▼───────────────────────────────┐ │
│ │ Parallel Tool Executor │ │
│ │ DependencyAnalyzer → group independent calls │ │
│ │ asyncio.gather() for parallel, sequential otherwise │ │
│ ├───────────────────────────────────────────────────────┤ │
│ │ │ │
│ │ Client-Side Tools Server-Side Tools │ │
│ │ ┌──────────────────┐ ┌────────────────────┐ │ │
│ │ │ ComputerTool │ │ web_search │ │ │
│ │ │ BashTool │ │ web_fetch │ │ │
│ │ │ EditTool │ │ code_execution │ │ │
│ │ │ MemoryTool │ │ files_api │ │ │
│ │ │ AGITool │ │ tool_search │ │ │
│ │ │ SubagentTool │ └────────────────────┘ │ │
│ │ │ StateSetCLITool │ │ │
│ │ │ AskUserTool │ MCP Tools (External) │ │
│ │ └──────────────────┘ ┌────────────────────┐ │ │
│ │ │ mcp__slack__* │ │ │
│ │ │ mcp__github__* │ │ │
│ │ │ mcp__postgres__* │ │ │
│ │ └────────────────────┘ │ │
│ └───────────────────────────────────────────────────────┘ │
│ │
│ ┌─────────────┐ ┌──────────────┐ ┌────────────────────┐ │
│ │ Stuck │ │ Checkpoint │ │ Subagent │ │
│ │ Detector │ │ Manager │ │ Manager │ │
│ │ (loop/cycle │ │ (resume │ │ (isolated context, │ │
│ │ detection) │ │ long tasks) │ │ Haiku compress) │ │
│ └─────────────┘ └──────────────┘ └────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ Observability Layer │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌────────────┐ │
│ │Structured│ │Prometheus│ │OpenTel │ │Event Bus │ │
│ │Logging │ │Metrics │ │Tracing │ │(SSE + WS) │ │
│ └──────────┘ └──────────┘ └──────────┘ └────────────┘ │
└─────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ Dashboard (Web UI) │
│ ┌─────────────────┐ ┌─────────┐ ┌────────────────────┐ │
│ │ Next.js Frontend │ │ FastAPI │ │ Celery Worker │ │
│ │ React Query + SSE│ │ Backend │ │ (invokes │ │
│ │ :3000 │ │ :8000 │ │ sampling_loop) │ │
│ └─────────────────┘ └─────────┘ └────────────────────┘ │
│ ┌──────────┐ ┌──────────┐ ┌──────────────────────────┐ │
│ │PostgreSQL│ │ Redis │ │ MinIO (S3 artifacts) │ │
│ └──────────┘ └──────────┘ └──────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
How It Works: Request Flow
- CLI Invocation -- python main.py "auto-close tickets" parses runtime options (tool version, effort level, tool search, deferred tools) and enters continuous_loop().
- Agent Selection -- get_active_agents() matches keywords in the instruction to agent types. Multiple matches run in parallel as independent asyncio.Task instances.
- Adaptive Configuration -- Each agent gets a tailored config: thinking budget, max tokens, complexity score, and model selection via select_model_for_task().
- Sampling Loop -- sampling_loop() is the core conversation engine. It initializes tools from TOOL_GROUPS_BY_VERSION, fetches the system prompt from StateSet APIs, creates the API client, and enters the message loop.
- API Call -- Each iteration sends the conversation to Claude Opus 4.6 through a CircuitBreaker, with prompt caching on the 3 most recent turns and dynamic context management adapting compaction aggressiveness to current token usage.
- Tool Execution -- Claude's tool calls are analyzed by DependencyAnalyzer. Read-only, independent calls execute in parallel via asyncio.gather(); dependent calls execute sequentially. MCP tools are dispatched to their respective server connections.
- Context Optimization -- After each iteration, the ContextOptimizer tracks attention quality (EXCELLENT through CRITICAL) and applies compaction strategies. Messages are compressed when history exceeds 20 turns.
- Completion -- When Claude responds with no tool calls, the loop returns SamplingLoopResult. The orchestrator runs analyze_task_completion(), records metrics, and sends a Stripe billing event.
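The Tool Execution step can be pictured with the simplified sketch below: consecutive read-only calls are batched and run concurrently, while any mutating call runs on its own, in order. The ToolCall dataclass, the read_only flag, and the batching rule are assumptions for illustration; the real DependencyAnalyzer may apply finer-grained rules.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class ToolCall:
    name: str
    read_only: bool


def group_calls(calls: list[ToolCall]) -> list[list[ToolCall]]:
    """Batch consecutive read-only calls; every mutating call gets its own batch."""
    batches: list[list[ToolCall]] = []
    for call in calls:
        if call.read_only and batches and all(c.read_only for c in batches[-1]):
            batches[-1].append(call)
        else:
            batches.append([call])
    return batches


async def execute(call: ToolCall) -> str:
    await asyncio.sleep(0)  # placeholder for the real tool invocation
    return f"{call.name}:ok"


async def run_batches(calls: list[ToolCall]) -> list[str]:
    """Run batches in order; calls inside a batch run concurrently."""
    results: list[str] = []
    for batch in group_calls(calls):
        results.extend(await asyncio.gather(*(execute(c) for c in batch)))
    return results
```

Keeping batches in their original order preserves the sequence the model asked for, so a write is never reordered ahead of a read that preceded it.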
Core Components
| Component | Responsibility | Location |
|---|---|---|
| Orchestrator | CLI parsing, agent selection, parallel dispatch, billing | main.py |
| Agent Loop | Conversation loop, API calls, message management | agent/loop.py |
| Tool Collection | Tool registration, dispatch, deferred loading | agent/tools/collection.py |
| Tool Groups | Version-specific tool bundles (20241022, 20250124, 20251124, cli) | agent/tools/groups.py |
| Parallel Executor | Dependency analysis, safe parallel tool execution | agent/parallel_executor.py |
| Context Optimizer | JIT retrieval, compaction, attention budget, sub-agent compression | agent/context_optimizer.py |
| Subagent Manager | Spawn isolated sub-agents (explore, analyze, execute, research, code) | agent/subagent.py |
| MCP Client | Connect external tools via stdio/SSE/HTTP transports | agent/mcp_client.py |
| Structured Output | JSON schema validation, extraction, pre-defined schemas | agent/structured_output.py |
| Stuck Detector | Repeating actions, no visual changes, cycling detection | agent/stuck_detection.py |
| Checkpoint Manager | Save/resume long-running tasks, heartbeat monitoring | agent/checkpoint.py |
| Skill Manager | Skill profiles, agent-skill mapping, container resolution | agent/skill_manager.py |
| Health Checker | API connectivity, display, memory, disk checks | agent/health.py |
| Circuit Breaker | Fault tolerance for API calls (closed/open/half-open) | agent/health.py |
| Config | Centralized settings with YAML loading and env substitution | agent/config.py |
| Observability | Unified logging, Prometheus metrics, OpenTelemetry tracing, event streaming | agent/observability/ |
| Exception Hierarchy | Typed errors (retryable, non-retryable, budget, tool, resource) | agent/exceptions.py |
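For readers unfamiliar with the Circuit Breaker row, here is a minimal sketch of the closed/open/half-open cycle using the thresholds stated earlier in this README (5 failures to open, 60s recovery). The class below is an illustration with an injectable clock, not the code in agent/health.py.

```python
import time


class CircuitBreaker:
    """Minimal closed/open/half-open breaker with an injectable clock."""

    def __init__(self, failure_threshold: int = 5, recovery_seconds: float = 60.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_seconds = recovery_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    @property
    def state(self) -> str:
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self.recovery_seconds:
            return "half-open"
        return "open"

    def call(self, func, *args, **kwargs):
        if self.state == "open":
            raise RuntimeError("circuit open; call rejected")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # A failed trial call in half-open, or too many failures, (re)opens the circuit.
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Any success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

Passing the clock in as a parameter keeps the recovery window testable without sleeping for a full minute.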
Tool Versions
The system ships with 4 tool bundles, selected via --tool-version:
| Version | Tools | Beta Flag | Display Required |
|---|---|---|---|
| computer_use_20251124 (default) | Computer (with zoom), Edit, Bash, Memory, AGI, CLI, AskUser | computer-use-2025-11-24 | Yes |
| computer_use_20250124 | Computer, Edit, Bash, Memory, AGI, CLI, AskUser | computer-use-2025-01-24 | Yes |
| computer_use_20241022 | Computer, Edit, Bash | computer-use-2024-10-22 | Yes |
| cli_20250124 | Bash, Edit, Memory, AGI, CLI, AskUser | None | No |
Additionally, SubagentTool is loaded dynamically at runtime (requires API key), and MCP tools are added from connected servers.
Subagent System
Implements Anthropic's sub-agent compression pattern. The main agent spawns isolated sub-agents that return compressed summaries instead of raw output (50k tokens of exploration compressed to 2k summary):
| Type | Model | Use Case | Max Turns | Timeout |
|---|---|---|---|---|
| explore | Haiku 4.5 | Fast codebase/data exploration | 5 | 60s |
| analyze | Sonnet 4.5 | Deep analysis with thinking | 8 | 90s |
| execute | Sonnet 4.5 | Task execution with verification | 15 | 180s |
| research | Haiku 4.5 | Web search and synthesis | 8 | 120s |
| code | Sonnet 4.5 | Code generation and modification | 12 | 180s |
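One way to carry the limits from the table above in code is a small lookup of frozen profiles, sketched here. The SubagentProfile dataclass and get_profile helper are hypothetical; only the model names, turn limits, and timeouts are taken from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubagentProfile:
    model: str
    max_turns: int
    timeout_seconds: int


# Values copied from the table above; the dataclass itself is illustrative.
SUBAGENT_PROFILES = {
    "explore": SubagentProfile("Haiku 4.5", 5, 60),
    "analyze": SubagentProfile("Sonnet 4.5", 8, 90),
    "execute": SubagentProfile("Sonnet 4.5", 15, 180),
    "research": SubagentProfile("Haiku 4.5", 8, 120),
    "code": SubagentProfile("Sonnet 4.5", 12, 180),
}


def get_profile(subagent_type: str) -> SubagentProfile:
    """Look up a profile, failing loudly on an unknown type."""
    try:
        return SUBAGENT_PROFILES[subagent_type]
    except KeyError:
        raise ValueError(f"unknown subagent type: {subagent_type!r}") from None
```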
MCP Integration
Connect external tools via Model Context Protocol with 3 transport types (stdio, SSE, HTTP) and 8 pre-configured presets:
# Available presets: slack, github, postgres, filesystem, memory, brave-search, puppeteer, sqlite
Tools appear in the conversation as mcp__<server>__<tool> (e.g., mcp__slack__send_message).
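The naming convention can be handled with two small helpers, sketched below under the assumption that server names never contain a double underscore. The function names are hypothetical and do not come from agent/mcp_client.py.

```python
def format_mcp_tool(server: str, tool: str) -> str:
    """Build the namespaced name, e.g. mcp__slack__send_message."""
    if not server or not tool:
        raise ValueError("server and tool must be non-empty")
    if "__" in server:
        raise ValueError("server name must not contain '__'")
    return f"mcp__{server}__{tool}"


def parse_mcp_tool(name: str) -> tuple[str, str]:
    """Split a namespaced name into (server, tool)."""
    prefix, sep, rest = name.partition("__")
    if prefix != "mcp" or not sep:
        raise ValueError(f"not an MCP tool name: {name!r}")
    server, sep, tool = rest.partition("__")
    if not server or not sep or not tool:
        raise ValueError(f"malformed MCP tool name: {name!r}")
    return server, tool
```

Splitting on the first separator after the prefix lets tool names keep any double underscores of their own.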
Dashboard
The web dashboard provides job management, real-time monitoring, and artifact storage:
Frontend (Next.js 14)        Backend (FastAPI)              Worker (Celery)
─────────────────────        ─────────────────              ───────────────
Dashboard home               POST /api/jobs                 Receives job from
Launch Task form     ──►     GET  /api/jobs                 Redis queue
Live Runs (SSE)      ◄──     GET  /api/events/jobs          Calls sampling_loop()
Outputs browser              GET  /api/artifacts            Stores artifacts in S3
Template management          CRUD /api/templates            Records billing via Stripe
Usage metrics                GET  /api/metrics/overview
                             CRUD /api/agi                  PostgreSQL (persistence)
                             CRUD /api/skills               Redis (broker/backend)
                             GET  /api/observability        MinIO (S3 artifacts)
Deploy with Docker Compose (docker compose up -d): frontend on :3000, backend on :8000, with Postgres, Redis, and MinIO. Optional monitoring profile adds Prometheus, Grafana, and OpenTelemetry Collector.
Read More: Architecture Documentation →
📚 Documentation
Getting Started
- Getting Started Guide - Step-by-step setup for beginners (10 min)
- Quick Start - Common commands and usage patterns (5 min)
- User Guide - Comprehensive reference (30 min)
Technical Deep-Dives
- Architecture - System design and component interaction
- Context Engineering - How we achieve 95% cost savings
- Parallel Execution - Automatic tool parallelization
- Memory System - Persistent agent memory
- Metrics & Billing - Usage tracking and cost management
Feature Documentation
- Tool Reference - Complete tool catalog
- Web Search - Internet search capabilities
- Web Fetch - HTTP requests and scraping
- Code Execution - Running Python/Bash code
- Files API - Document upload and management
Advanced Topics
- Long-Running Tasks - Multi-hour agent operations
- Skills System - Extending agents with custom skills
- Dashboard - Web-based monitoring UI
- AGI Integration - Advanced AI capabilities
📊 Performance & Cost
Real-World Metrics
Based on production usage across 1,000+ agent runs:
| Metric | Before Optimization | After Optimization | Improvement |
|---|---|---|---|
| Avg Tokens/Task | 150,000 | 7,500 | 95% reduction |
| Avg Cost/Task | $2.25 | $0.11 | 95% savings |
| Avg Task Duration | 45s | 30s | 33% faster |
| Context Quality | Degrades >50k tokens | EXCELLENT at 500k+ | Indefinite |
| Parallel Speed | Sequential (baseline) | 30-50% faster | 1.5x speedup |
Cost Breakdown (per 1M tokens)
| Operation | Input Cost | Output Cost | Typical Usage |
|---|---|---|---|
| Claude Opus 4.6 | $3.00 | $15.00 | Main model |
| Extended Thinking | $3.00 | $15.00 | Complex tasks only |
| Prompt Caching (hit) | $0.30 | $15.00 | 90% cost reduction |
Pro Tip: Enable prompt caching for system prompts to achieve an additional 90% savings on input tokens.
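The per-task costs in the Real-World Metrics table are consistent with a flat rate of $15.00 per million tokens. The short calculation below reproduces them under that assumption; the flat rate is inferred from the table's figures and is not a statement about how the project actually meters input and output tokens.

```python
PRICE_PER_MILLION_TOKENS = 15.00  # assumed flat rate that reproduces the table's figures


def task_cost(tokens: int, price_per_million: float = PRICE_PER_MILLION_TOKENS) -> float:
    """Cost in dollars for a task that consumes the given number of tokens."""
    return tokens * price_per_million / 1_000_000


before = task_cost(150_000)      # average tokens per task before optimization
after = task_cost(7_500)         # average tokens per task after optimization
reduction = 1 - after / before   # fractional saving

print(f"before=${before:.2f} after=${after:.2f} reduction={reduction:.0%}")
```

Since cost scales linearly with tokens here, the 95% token reduction and the 95% cost saving in the table are the same figure viewed two ways.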
🔧 Configuration
Environment Variables
# Required
ANTHROPIC_API_KEY=sk-ant-api03-... # Claude API access
DISPLAY=:1 # X11 display server
# Optional
STRIPE_API_KEY=sk_live_... # Usage-based billing
WORKSPACE_PATH=/path/to/workspace # Working directory
Agent Configuration
Agents are configured via StateSet API or directly in main.py:
AGENT_CONFIGS = {
"AUTO_CLOSE": AgentConfig(
agent_id="stateset_auto_close",
agent_type="AUTO_CLOSE",
name="Auto-Close Agent",
description="Automatically closes support tickets",
stripe_customer_id="cus_..." # Optional: for billing
),
# ... more agents
}
Provider Selection
Support for multiple Claude providers:
# Anthropic (default)
provider = APIProvider.ANTHROPIC
# AWS Bedrock
provider = APIProvider.BEDROCK
# Google Vertex AI
provider = APIProvider.VERTEX
Read More: Configuration Guide →
🛡️ Security Best Practices
Critical Security Considerations
- Never commit API keys: Use environment variables or .env files
echo '.env' >> .gitignore
export ANTHROPIC_API_KEY='...'
- Secure screenshot storage: May contain sensitive user data
chmod 700 screenshots/
# Implement automatic cleanup policy
- Validate agent actions: Use tool guard for risky operations
# Pre-execution validation in agent/tool_guard.py
- Agent memory isolation: Separate storage per agent
# Memory stored in /tmp/agent_memories/{agent_id}/
- Prompt injection protection: Sanitize user inputs
# Implemented in agent/tools/memory.py
Read More: Security Considerations →
🐛 Troubleshooting
Common Issues
Display not found error
Error: Error: Can't open display :1
Solution:
# Start virtual display
Xvfb :1 -screen 0 1920x1080x24 &
export DISPLAY=:1
# Verify
xdpyinfo | grep dimensions
Import errors for anthropic/pyautogui
Error: ModuleNotFoundError: No module named 'anthropic'
Solution:
# Ensure virtual environment is activated
source venv/bin/activate
# Reinstall dependencies
pip install -r requirements.txt
# Verify installation
pip list | grep anthropic
API authentication failures
Error: 401 Unauthorized
Solution:
# Verify API key format (starts with sk-ant-api03-)
echo $ANTHROPIC_API_KEY
# Test API connection
curl https://api.anthropic.com/v1/messages \
-H "x-api-key: $ANTHROPIC_API_KEY" \
-H "anthropic-version: 2023-06-01" \
-H "content-type: application/json" \
-d '{"model":"claude-opus-4-6-20260131","max_tokens":1024,"messages":[{"role":"user","content":"test"}]}'
Read More: Troubleshooting Guide →
🚧 Roadmap
Implemented ✅
- Multi-agent architecture with parallel execution
- All 5 Anthropic context engineering patterns
- Automatic tool parallelization (30-50% speedup)
- Extended thinking support with effort control
- Production monitoring and billing (Stripe)
- Security hardening (prompt injection, path traversal, memory isolation)
- Skills system for extensibility
- Web dashboard with real-time SSE updates
- Distributed tracing (OpenTelemetry integration)
- Circuit breakers for API fault tolerance
- Comprehensive exception hierarchy with typed errors
- Subagent spawning for isolated task decomposition
- MCP client integration (Slack, GitHub, Postgres, etc.)
- Structured JSON output with schema validation
- Stuck detection and recovery (repeating actions, cycling, stale loops)
- Checkpoint system for resumable long-running tasks
- Unified observability (logging, metrics, tracing, event streaming)
- Tool search for deferred tool loading
- Centralized configuration with YAML and env var support
- 2,700+ unit tests
Planned 📋
- Agent marketplace for community contributions
- Kubernetes deployment templates
- ML-based screenshot delay prediction
- Enhanced cost optimization (LZ4 compression)
Read More: Future Improvements →
📈 Metrics & Monitoring
Built-in Metrics
Every agent run captures comprehensive metrics:
{
"task_id": "task_20250324_142301",
"agent_type": "AUTO_CLOSE",
"duration_seconds": 32.5,
"tokens_used": 8234,
"estimated_cost": 0.12,
"tools_executed": 15,
"parallel_executions": 3,
"success": true,
"completion_indicators": ["ticket closed", "task finished"]
}
Stripe Billing Integration
Automatic usage-based billing:
# Meters configured at Stripe
meter_id = "computer_use_tokens"
# Events sent on task completion
{
"event_name": "computer_use_tokens",
"payload": {
"stripe_customer_id": "cus_...",
"value": 8234 # tokens used
}
}
🤝 Contributing
This is proprietary software. For internal contributors:
- Follow the development guidelines
- All changes require review from 2+ team members
- Ensure tests pass and coverage remains >80%
- Update documentation for user-facing changes
📞 Support
Resources
- Documentation: Start with Getting Started Guide
- Bug Reports: Create detailed issue reports with logs and screenshots
- Feature Requests: Submit to product team with use cases
Contact
For support, contact the StateSet team:
- Email: support@stateset.com
- Slack: #computer-use-agents (internal)
- Emergency: On-call rotation (internal)
📄 License
This project is proprietary software. All rights reserved.
Unauthorized copying, modification, distribution, or use of this software is strictly prohibited.
For licensing inquiries, contact: legal@stateset.com
🙏 Acknowledgments
Built with:
- Claude Opus 4.6 - Anthropic's most intelligent AI model
- PyAutoGUI - Desktop automation
- httpx - Modern HTTP client
- FastAPI - Dashboard backend
- Next.js - Dashboard frontend
- OpenTelemetry - Distributed tracing
- Prometheus - Metrics collection
- Model Context Protocol - External tool integration
- Research from Anthropic's "Effective Context Engineering"
StateSet Computer Use Agent
Autonomous AI agents for the modern enterprise
Documentation • Architecture • Support
Made with ❤️ by the StateSet team