Production-grade AI automation platform powered by Claude Opus 4.6

StateSet Computer Use Agent

Advanced AI Agent System for Autonomous Computer Operation

Features • Quick Start • Documentation • Agents • Architecture

🎯 Overview

StateSet Computer Use Agent is a production-grade AI automation platform powered by Claude Opus 4.6, designed for autonomous computer operation with human-level reliability and intelligence. The system uses multiple specialized AI agents that can see, understand, and interact with desktop environments to complete complex, long-running tasks.

Why StateSet Computer Use Agent?

🧠 State-of-the-Art AI: Powered by Claude Opus 4.6 with extended thinking capabilities
⚡ 30-50% Faster: Automatic parallel tool execution for independent operations
💰 95% Cost Savings: Research-based context engineering cuts average token usage from about 150,000 to 7,500 tokens per task (see Performance & Cost)
🔄 Indefinite Conversations: Maintains EXCELLENT attention quality across unlimited context
🏢 Production-Ready: Includes monitoring, billing, security, and graceful failure handling
🎨 Multi-Agent: 7 specialized agents for different business workflows

🆕 What's New in Claude 4.5

Updated model lineup

Claude Opus 4.6 – most intelligent Anthropic model ever shipped, now priced for day-to-day production agents.
Claude Sonnet 4.5 – best balance of capability and cost for complex coding or orchestration.
Claude Haiku 4.5 – fastest Haiku yet with near-frontier reasoning and the first Haiku model that supports extended thinking.

Opus 4.6 enhancements

Maximum intelligence across reasoning, debugging, and strategic planning tasks.
Thinking block preservation keeps the model’s reasoning context intact across turns for better long-running workflows (no extra flags required).
Computer-use excellence with the new zoom action in computer_20251124, enabling pixel-level inspection of dense UI or fine print before taking action.
Practical performance thanks to lower pricing plus automatic prompt caching, so advanced agents stay affordable.

Effort parameter (beta)

Claude Opus 4.6 is the only model that accepts an effort setting (low, medium, high).
We automatically attach the required effort-2025-11-24 beta header and forward the setting via output_config.
Configure it per run with:
```
python main.py --effort medium "triage support tickets and summarize blockers"
```
Use low for high-volume automations, medium for balanced cost/performance, and high (default) for maximal quality.
Works alongside the thinking token budget—effort controls overall token appetite while thinking_budget still caps meta-reasoning tokens.
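As a rough illustration of how an effort setting might be forwarded, the sketch below builds request keyword arguments. The beta header name and the `output_config` key come from the text above; the function name `build_effort_request`, the `betas` key, and the exact payload shape are assumptions, not the project's actual code.

```python
VALID_EFFORTS = ("low", "medium", "high")


def build_effort_request(effort: str = "high") -> dict:
    """Return hypothetical request kwargs carrying the effort setting."""
    if effort not in VALID_EFFORTS:
        raise ValueError(f"effort must be one of {VALID_EFFORTS}, got {effort!r}")
    return {
        "betas": ["effort-2025-11-24"],       # beta header named in this README
        "output_config": {"effort": effort},  # assumed payload shape
    }


request = build_effort_request("medium")
print(request["output_config"])
```

Validating the value before any API call keeps a typo such as `--effort med` from reaching the provider.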

🌟 Core Innovation: Context Engineering

Based on Anthropic's Latest Research - Full implementation of all 5 patterns from "Effective Context Engineering for AI Agents":

| Pattern | Implementation | Result |
| --- | --- | --- |
| Just-in-Time Retrieval | grep/head/tail instead of full file reads | 60-99% token savings |
| Dynamic Compaction | Adaptive clearing based on attention budget | Maintains quality as context grows |
| Structured Note-Taking | Persistent memory outside context window | Unlimited task complexity |
| Sub-Agent Compression | Exploration agents return concise summaries | 50k → 2k token summaries |
| Attention Budget Monitoring | Real-time quality tracking (EXCELLENT→CRITICAL) | Prevents quality degradation |

Impact: Enables indefinite conversations with EXCELLENT attention quality while achieving 95% cost reduction compared to naive implementations.
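To make the Just-in-Time Retrieval pattern concrete, here is a minimal sketch of `head`- and `grep`-style helpers that stream a file instead of loading it whole. The helper names, the `limit` parameter, and the demo log file are hypothetical; the project's real retrieval code is not shown in this README.

```python
import re
import tempfile
from itertools import islice
from pathlib import Path


def head(path: Path, n: int = 10) -> list[str]:
    """Return only the first n lines, like `head -n`."""
    with path.open(encoding="utf-8") as fh:
        return [line.rstrip("\n") for line in islice(fh, n)]


def grep(path: Path, pattern: str, limit: int = 20) -> list[str]:
    """Return up to `limit` matching lines, reading the file line by line."""
    regex = re.compile(pattern)
    with path.open(encoding="utf-8") as fh:
        matches = (line.rstrip("\n") for line in fh if regex.search(line))
        return list(islice(matches, limit))


# Demo on a throwaway 1,000-line log that contains four ERROR lines.
with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "run.log"
    log.write_text(
        "\n".join(f"line {i} {'ERROR' if i % 250 == 0 else 'ok'}" for i in range(1000)),
        encoding="utf-8",
    )
    first_lines = head(log, 3)
    error_lines = grep(log, r"ERROR")

print(first_lines)
print(len(error_lines), "matching lines")
```

Only the handful of returned lines would enter the model's context, which is where the token savings come from.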

🚀 Quick Start

Prerequisites

Ubuntu Linux 20.04+ (kernel 5.15.0+)
Python 3.10 or higher
Anthropic API key (get one here)
X11 virtual display (we'll set this up)
(Optional) Node.js + npm if you want to use the StateSet CLI (@stateset-cli) from within agent tasks.

Installation

Option 1: pip install (recommended)

```
pip install stateset-cua

# Configure API key
export ANTHROPIC_API_KEY='your-key-here'

# Start virtual display (required for GUI automation)
Xvfb :1 -screen 0 1920x1080x24 &
export DISPLAY=:1

# Run your first agent
stateset-cua run "auto-close resolved tickets"
```

Option 2: From source

```
git clone https://github.com/stateset/stateset-computer-use-agent.git
cd stateset-computer-use-agent
pip install -e ".[dev]"

export ANTHROPIC_API_KEY='your-key-here'
Xvfb :1 -screen 0 1920x1080x24 &
export DISPLAY=:1

stateset-cua run "auto-close resolved tickets"
```

Need more help? See the comprehensive Getting Started Guide →

🤖 Available Agents

StateSet includes 7 specialized agents optimized for different business workflows:

| Agent | Purpose | Example Use Case |
| --- | --- | --- |
| AUTO_CLOSE | Support ticket automation | "auto-close all resolved tickets from last 24 hours" |
| SOCIAL_MEDIA | Content moderation & engagement | "social media hide inappropriate comments on Facebook" |
| LINKEDIN_MESSENGER | Professional outreach | "linkedin send connection requests to AI engineers in SF" |
| SLACK_SUPPORT | Customer support automation | "slack respond to all unanswered questions in #support" |
| SHOPIFY | E-commerce management | "shopify update inventory for out-of-stock products" |
| ONBOARDING | User onboarding workflows | "onboard new enterprise customer with custom rules" |
| STATESET_AGENTIC | General-purpose automation | "organize desktop files and create summary report" |

Multi-Agent Orchestration

Run multiple agents in parallel for complex workflows:

```
python main.py "auto-close tickets and social media monitoring and slack support"
```

How it works:

Automatic keyword detection selects appropriate agents
Parallel execution (not sequential) for independent tasks
Unified logging with [AGENT_TYPE] prefixes
Aggregated metrics and billing
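The keyword detection and parallel dispatch described above could look roughly like the following sketch. The keyword map, `select_agents`, and `run_agent_stub` are hypothetical stand-ins; the README only states that keywords select agents and that independent agents run in parallel.

```python
import asyncio

# Hypothetical keyword map; the real matching rules are not shown in this README.
AGENT_KEYWORDS = {
    "AUTO_CLOSE": ("auto-close",),
    "SOCIAL_MEDIA": ("social media",),
    "SLACK_SUPPORT": ("slack",),
}


def select_agents(instruction: str) -> list[str]:
    """Return every agent type whose keyword appears in the instruction."""
    text = instruction.lower()
    return [
        agent
        for agent, words in AGENT_KEYWORDS.items()
        if any(word in text for word in words)
    ]


async def run_agent_stub(agent_type: str, instruction: str) -> str:
    """Stand-in for a real agent loop; it only logs with an [AGENT_TYPE] prefix."""
    await asyncio.sleep(0)  # yield control, as a real agent would during I/O
    print(f"[{agent_type}] handling: {instruction}")
    return agent_type


async def orchestrate(instruction: str) -> list[str]:
    agents = select_agents(instruction)
    # gather() runs the selected agents concurrently and keeps the input order.
    return await asyncio.gather(*(run_agent_stub(a, instruction) for a in agents))


finished = asyncio.run(
    orchestrate("auto-close tickets and social media monitoring and slack support")
)
```

Because each agent is its own coroutine, a slow agent does not block the others from making progress.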

🛠️ Key Features

1. Computer Vision & Control

Agents can see and interact with any desktop application:

Screenshot Analysis: High-resolution screen capture with caching
Mouse Control: Click, drag, scroll with pixel-perfect precision
Keyboard Input: Type text, keyboard shortcuts, special keys
Adaptive Delays: Smart waiting based on action type (0.0s-0.6s)

2. Intelligent Parallel Execution

Automatic dependency analysis for safe parallelization:

```
# Before optimization (3 sequential API calls)
tool_use_1 = web_search("Claude AI")        # 2.5s
tool_use_2 = web_search("Anthropic")        # 2.5s
tool_use_3 = web_search("computer use")     # 2.5s
# Total: 7.5 seconds

# After optimization (1 parallel API call)
parallel_execution([
    web_search("Claude AI"),
    web_search("Anthropic"),
    web_search("computer use")
])
# Total: 2.5 seconds (3x faster!)
```

Performance: 30-50% speed improvement on real-world tasks
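The comparison above can be reproduced in miniature with `asyncio.gather()`. This is a sketch under stated assumptions: `fake_search` is a hypothetical stand-in that sleeps instead of calling a real search tool, and the delays are scaled down to 0.05 s.

```python
import asyncio
import time


async def fake_search(query: str, delay: float = 0.05) -> str:
    """Stand-in for a network-bound tool call (no real search happens)."""
    await asyncio.sleep(delay)
    return f"results for {query}"


async def run_sequential(queries: list[str]) -> list[str]:
    # Each call waits for the previous one to finish.
    return [await fake_search(q) for q in queries]


async def run_parallel(queries: list[str]) -> list[str]:
    # All calls wait at the same time; results keep the input order.
    return list(await asyncio.gather(*(fake_search(q) for q in queries)))


def timed(coro) -> tuple[list[str], float]:
    start = time.perf_counter()
    result = asyncio.run(coro)
    return result, time.perf_counter() - start


queries = ["Claude AI", "Anthropic", "computer use"]
seq_results, seq_time = timed(run_sequential(queries))
par_results, par_time = timed(run_parallel(queries))
print(f"sequential: {seq_time:.2f}s, parallel: {par_time:.2f}s")
```

The speedup only applies to calls that are independent of one another; calls that depend on earlier results still have to run in order.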

3. Advanced Tool Suite

Agents have access to powerful tools across multiple categories:

| Category | Tools | Description |
| --- | --- | --- |
| Computer | click, type, scroll, zoom, screenshot | Desktop interaction (computer_20251124) |
| Web | web_search, web_fetch | Internet access with citations |
| Code | code_execution | Sandboxed Python/Bash execution |
| Files | create, read, edit, search | File management with path protection |
| Memory | view, create, edit, delete, rename | Persistent agent memory with injection protection |
| Text Editor | str_replace, insert | Advanced file editing |
| Subagents | spawn_subagent | Spawn isolated sub-agents for task decomposition |
| MCP | `mcp__<server>__<tool>` | External tools via Model Context Protocol |
| CLI | stateset_cli | StateSet Node CLI integration |

4. Subagent Spawning & MCP Integration

Subagents implement Anthropic's sub-agent compression pattern: the main agent can spawn specialized sub-agents that operate in isolated contexts and return compressed summaries (for example, 50k tokens of exploration reduced to a 2k-token summary), cutting context use by roughly 95%.

MCP connects external services (Slack, GitHub, Postgres, and more) as tools via the Model Context Protocol. Supports stdio, SSE, and HTTP transports with 8 pre-configured presets.

5. Structured JSON Output

Force Claude to return valid JSON matching a specified schema for reliable automation pipelines. Includes pre-defined schemas for ticket analysis, task results, code review, and entity extraction.
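A minimal sketch of the validation step, using only the standard library: parse the model's reply and check required fields and types. The `TICKET_SCHEMA` fields and the `parse_structured` helper are hypothetical; the project's pre-defined schemas are not reproduced in this README.

```python
import json

# Hypothetical schema: field name -> required Python type.
TICKET_SCHEMA = {"ticket_id": str, "status": str, "confidence": float}


def parse_structured(raw: str, schema: dict[str, type]) -> dict:
    """Parse a JSON object and check required keys and value types."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) if invalid
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for key, expected in schema.items():
        if key not in data:
            raise ValueError(f"missing field: {key}")
        if not isinstance(data[key], expected):
            raise ValueError(f"{key} should be {expected.__name__}")
    return data


reply = '{"ticket_id": "T-1042", "status": "resolved", "confidence": 0.93}'
ticket = parse_structured(reply, TICKET_SCHEMA)
print(ticket["status"])
```

Rejecting malformed replies at this boundary lets downstream pipeline steps assume well-typed data.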

6. Production-Grade Observability

Unified observability system combining all monitoring concerns:

Structured Logging: JSON-formatted with automatic request ID correlation
Prometheus Metrics: Agent duration, tool execution counts, API latency, cost tracking
OpenTelemetry Tracing: Distributed tracing with automatic span creation
Real-time Streaming: SSE and WebSocket endpoints for dashboard integration
Budget Warnings: Automatic alerts when token/cost budgets approach limits
Health Monitoring: API connectivity, display, memory, disk checks with circuit breakers
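The circuit breakers mentioned above follow the usual closed/open/half-open pattern. The sketch below is an illustrative version, not the project's `CircuitBreaker` class; it uses the thresholds this README quotes (5 failures, 60 s recovery), and the injectable `clock` parameter is an assumption added to make the demo deterministic.

```python
import time


class CircuitBreaker:
    """Closed -> open after N consecutive failures -> half-open after a timeout."""

    def __init__(self, failure_threshold=5, recovery_seconds=60.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_seconds = recovery_seconds
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    @property
    def state(self) -> str:
        if self.opened_at is None:
            return "closed"
        if self.clock() - self.opened_at >= self.recovery_seconds:
            return "half-open"
        return "open"

    def call(self, func, *args, **kwargs):
        if self.state == "open":
            raise RuntimeError("circuit open: call rejected")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed half-open probe, or too many failures, (re)opens the circuit.
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        self.failures = 0
        self.opened_at = None
        return result


def flaky():
    raise ConnectionError("API unavailable")


now = [0.0]  # fake clock so the demo does not need to wait 60 seconds
breaker = CircuitBreaker(clock=lambda: now[0])
for _ in range(5):
    try:
        breaker.call(flaky)
    except ConnectionError:
        pass
state_after_failures = breaker.state
now[0] = 60.0
state_after_wait = breaker.state
breaker.call(lambda: "ok")  # a successful probe closes the circuit again
state_after_success = breaker.state
```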

7. Security-First Design

Multiple layers of security hardened across the stack:

Prompt Injection Protection: 11-pattern content sanitizer in memory tool
Directory Traversal Prevention: resolve() + relative_to() + symlink detection
Agent Isolation: Separate memory directories per agent ID
Safe Tool Execution: Pre-execution validation via ToolExecutionGuard
Dashboard Auth: JWT-based with tenant isolation, rate limiting, security headers
Circuit Breakers: Fault tolerance for external API calls (5 failures → open → 60s recovery)
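The directory traversal check named above (`resolve()` + `relative_to()` + symlink detection) can be sketched as follows. The `safe_resolve` helper is hypothetical and only illustrates the technique; it is not the project's implementation.

```python
import tempfile
from pathlib import Path


def safe_resolve(base_dir: Path, user_path: str) -> Path:
    """Resolve user_path under base_dir, rejecting escapes and symlinked targets."""
    base = base_dir.resolve()
    candidate = base / user_path
    if candidate.is_symlink():
        raise PermissionError(f"symlinks are not allowed: {user_path}")
    resolved = candidate.resolve()  # collapses ".." and follows parent symlinks
    try:
        resolved.relative_to(base)  # raises ValueError if outside base
    except ValueError:
        raise PermissionError(f"path escapes base directory: {user_path}") from None
    return resolved


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    inside = safe_resolve(base, "notes/todo.md")
    try:
        safe_resolve(base, "../../etc/passwd")
        escaped = True
    except PermissionError:
        escaped = False

print(inside.name, "escaped:", escaped)
```

Resolving first and comparing afterwards matters: a plain string prefix check would be fooled by `..` segments or symlinked parent directories.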

📋 Example Usage

Basic Agent Execution

```
# Using convenience scripts
./start-autoclose-agent.sh
./start-socialmedia-agent.sh
./start-linkedin-agent.sh

# Custom instructions
python main.py "auto-close all tickets marked as resolved"
python main.py "social media hide comments containing profanity"
python main.py "linkedin message CTOs at Series A startups"
```

Advanced Workflows

```
# Multi-step workflow
python main.py "auto-close resolved tickets, then generate summary report"

# Conditional logic
python main.py "social media hide inappropriate comments only if flagged by 2+ users"

# Complex automation
python main.py "shopify find products with inventory < 10 and create reorder report"
```

Tool search & effort controls

Defer heavyweight tool schemas until Claude actually needs them:

```
python main.py --tool-search bm25 --defer-tool agi_agent --defer-tool memory "run a quarterly revenue analysis"
```

Dial Claude Opus 4.6’s token appetite up or down with the --effort flag (we add the effort-2025-11-24 beta header for you):
```
python main.py --effort low "gather 10 competitor pricing snapshots"
```

Monitoring & Debugging

```
# Real-time log filtering
python main.py "your task" 2>&1 | grep "\[AUTO_CLOSE\]"

# Save complete logs
python main.py "your task" 2>&1 | tee logs/run_$(date +%Y%m%d_%H%M%S).log

# View screenshots
ls -lh screenshots/AUTO_CLOSE/
eog screenshots/AUTO_CLOSE/screenshot_*.png
```

🏗️ Architecture

System Overview

┌─────────────────────────────────────────────────────────────────────┐
│                            CLI / Entry Points                        │
│  main.py (orchestrator)    start-*-agent.sh    stateset-cua CLI      │
│  --tool-version  --effort  --tool-search  --defer-tool  --agent-type │
└────────────────────────────────┬────────────────────────────────────┘
                                 │
                    ┌────────────┴────────────┐
                    ▼                         ▼
          ┌──────────────────┐      ┌──────────────────┐
          │  Agent Selection  │      │  Health Check     │
          │  (keyword match)  │      │  (API, display,   │
          │  get_active_agents│      │   memory, disk)   │
          └────────┬─────────┘      └──────────────────┘
                   │
        ┌──────────┼──────────┐         Parallel asyncio.Tasks
        ▼          ▼          ▼
  ┌───────────┐┌───────────┐┌───────────┐
  │ Agent 1   ││ Agent 2   ││ Agent N   │   run_agent() per agent type
  │ Loop      ││ Loop      ││ Loop      │   with adaptive config +
  │           ││           ││           │   model routing + skills
  └─────┬─────┘└─────┬─────┘└─────┬─────┘
        │             │             │
        └──────────┬──┘─────────────┘
                   ▼
  ┌─────────────────────────────────────────────────────────────┐
  │                    sampling_loop()                           │
  │                    agent/loop.py                             │
  │                                                             │
  │  ┌─────────────┐  ┌──────────────┐  ┌────────────────────┐ │
  │  │ System      │  │ Circuit      │  │ Context            │ │
  │  │ Prompt Init │  │ Breaker      │  │ Optimizer          │ │
  │  │ (StateSet   │  │ (API fault   │  │ (5 Anthropic       │ │
  │  │  APIs)      │  │  tolerance)  │  │  patterns)         │ │
  │  └─────────────┘  └──────────────┘  └────────────────────┘ │
  │                                                             │
  │  ┌──────────────────────────────────────────────────────┐   │
  │  │              Claude Opus 4.6 API Call                 │   │
  │  │  Providers: Anthropic | AWS Bedrock | Google Vertex   │   │
  │  │  Betas: prompt caching, context management,           │   │
  │  │         tool search, effort, web/code/files           │   │
  │  └──────────────────────┬───────────────────────────────┘   │
  │                         │                                   │
  │  ┌──────────────────────▼───────────────────────────────┐   │
  │  │           Parallel Tool Executor                      │   │
  │  │  DependencyAnalyzer → group independent calls         │   │
  │  │  asyncio.gather() for parallel, sequential otherwise  │   │
  │  ├───────────────────────────────────────────────────────┤   │
  │  │                                                       │   │
  │  │  Client-Side Tools        Server-Side Tools           │   │
  │  │  ┌──────────────────┐     ┌────────────────────┐      │   │
  │  │  │ ComputerTool     │     │ web_search         │      │   │
  │  │  │ BashTool         │     │ web_fetch          │      │   │
  │  │  │ EditTool         │     │ code_execution     │      │   │
  │  │  │ MemoryTool       │     │ files_api          │      │   │
  │  │  │ AGITool          │     │ tool_search        │      │   │
  │  │  │ SubagentTool     │     └────────────────────┘      │   │
  │  │  │ StateSetCLITool  │                                 │   │
  │  │  │ AskUserTool      │     MCP Tools (External)        │   │
  │  │  └──────────────────┘     ┌────────────────────┐      │   │
  │  │                           │ mcp__slack__*      │      │   │
  │  │                           │ mcp__github__*     │      │   │
  │  │                           │ mcp__postgres__*   │      │   │
  │  │                           └────────────────────┘      │   │
  │  └───────────────────────────────────────────────────────┘   │
  │                                                             │
  │  ┌─────────────┐  ┌──────────────┐  ┌────────────────────┐ │
  │  │ Stuck       │  │ Checkpoint   │  │ Subagent           │ │
  │  │ Detector    │  │ Manager      │  │ Manager            │ │
  │  │ (loop/cycle │  │ (resume      │  │ (isolated context, │ │
  │  │  detection) │  │  long tasks) │  │  Haiku compress)   │ │
  │  └─────────────┘  └──────────────┘  └────────────────────┘ │
  └─────────────────────────────────────────────────────────────┘
                   │
                   ▼
  ┌─────────────────────────────────────────────────────────────┐
  │                   Observability Layer                        │
  │  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌────────────┐ │
  │  │Structured│  │Prometheus│  │OpenTel   │  │Event Bus   │ │
  │  │Logging   │  │Metrics   │  │Tracing   │  │(SSE + WS)  │ │
  │  └──────────┘  └──────────┘  └──────────┘  └────────────┘ │
  └─────────────────────────────────────────────────────────────┘
                   │
                   ▼
  ┌─────────────────────────────────────────────────────────────┐
  │                    Dashboard (Web UI)                        │
  │  ┌─────────────────┐  ┌─────────┐  ┌────────────────────┐ │
  │  │ Next.js Frontend │  │ FastAPI │  │ Celery Worker      │ │
  │  │ React Query + SSE│  │ Backend │  │ (invokes           │ │
  │  │ :3000            │  │ :8000   │  │  sampling_loop)    │ │
  │  └─────────────────┘  └─────────┘  └────────────────────┘ │
  │  ┌──────────┐  ┌──────────┐  ┌──────────────────────────┐ │
  │  │PostgreSQL│  │  Redis   │  │ MinIO (S3 artifacts)     │ │
  │  └──────────┘  └──────────┘  └──────────────────────────┘ │
  └─────────────────────────────────────────────────────────────┘

How It Works: Request Flow

1. CLI Invocation -- python main.py "auto-close tickets" parses runtime options (tool version, effort level, tool search, deferred tools) and enters continuous_loop().
2. Agent Selection -- get_active_agents() matches keywords in the instruction to agent types. Multiple matches run in parallel as independent asyncio.Task instances.
3. Adaptive Configuration -- Each agent gets a tailored config: thinking budget, max tokens, complexity score, and model selection via select_model_for_task().
4. Sampling Loop -- sampling_loop() is the core conversation engine. It initializes tools from TOOL_GROUPS_BY_VERSION, fetches the system prompt from StateSet APIs, creates the API client, and enters the message loop.
5. API Call -- Each iteration sends the conversation to Claude Opus 4.6 through a CircuitBreaker, with prompt caching on the 3 most recent turns and dynamic context management adapting compaction aggressiveness to current token usage.
6. Tool Execution -- Claude's tool calls are analyzed by DependencyAnalyzer. Read-only, independent calls execute in parallel via asyncio.gather(); dependent calls execute sequentially. MCP tools are dispatched to their respective server connections.
7. Context Optimization -- After each iteration, the ContextOptimizer tracks attention quality (EXCELLENT through CRITICAL) and applies compaction strategies. Messages are compressed when history exceeds 20 turns.
8. Completion -- When Claude responds with no tool calls, the loop returns SamplingLoopResult. The orchestrator runs analyze_task_completion(), records metrics, and sends a Stripe billing event.
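The tool execution step can be sketched as a simple batching rule: consecutive read-only calls run together, and any mutating call runs on its own in order. This is a simplified assumption about how such an analyzer might work; `ToolCall`, `group_calls`, and the `read_only` flag are hypothetical names, not the project's `DependencyAnalyzer` API.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class ToolCall:
    name: str
    read_only: bool


def group_calls(calls: list[ToolCall]) -> list[list[ToolCall]]:
    """Batch consecutive read-only calls; every mutating call gets its own batch."""
    batches: list[list[ToolCall]] = []
    for call in calls:
        if call.read_only and batches and all(c.read_only for c in batches[-1]):
            batches[-1].append(call)
        else:
            batches.append([call])
    return batches


async def execute(call: ToolCall) -> str:
    await asyncio.sleep(0)  # placeholder for real tool work
    return f"{call.name}: done"


async def run_all(calls: list[ToolCall]) -> list[str]:
    results: list[str] = []
    for batch in group_calls(calls):
        # Calls inside a batch run concurrently; batches run one after another.
        results.extend(await asyncio.gather(*(execute(c) for c in batch)))
    return results


calls = [
    ToolCall("web_search", read_only=True),
    ToolCall("web_fetch", read_only=True),
    ToolCall("edit_file", read_only=False),
    ToolCall("screenshot", read_only=True),
]
batches = group_calls(calls)
results = asyncio.run(run_all(calls))
print([len(batch) for batch in batches])
```

Treating every mutating call as a barrier is conservative: it may miss some safe parallelism, but it never reorders a write relative to the reads around it.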

Core Components

| Component | Responsibility | Location |
| --- | --- | --- |
| Orchestrator | CLI parsing, agent selection, parallel dispatch, billing | `main.py` |
| Agent Loop | Conversation loop, API calls, message management | `agent/loop.py` |
| Tool Collection | Tool registration, dispatch, deferred loading | `agent/tools/collection.py` |
| Tool Groups | Version-specific tool bundles (20241022, 20250124, 20251124, cli) | `agent/tools/groups.py` |
| Parallel Executor | Dependency analysis, safe parallel tool execution | `agent/parallel_executor.py` |
| Context Optimizer | JIT retrieval, compaction, attention budget, sub-agent compression | `agent/context_optimizer.py` |
| Subagent Manager | Spawn isolated sub-agents (explore, analyze, execute, research, code) | `agent/subagent.py` |
| MCP Client | Connect external tools via stdio/SSE/HTTP transports | `agent/mcp_client.py` |
| Structured Output | JSON schema validation, extraction, pre-defined schemas | `agent/structured_output.py` |
| Stuck Detector | Repeating actions, no visual changes, cycling detection | `agent/stuck_detection.py` |
| Checkpoint Manager | Save/resume long-running tasks, heartbeat monitoring | `agent/checkpoint.py` |
| Skill Manager | Skill profiles, agent-skill mapping, container resolution | `agent/skill_manager.py` |
| Health Checker | API connectivity, display, memory, disk checks | `agent/health.py` |
| Circuit Breaker | Fault tolerance for API calls (closed/open/half-open) | `agent/health.py` |
| Config | Centralized settings with YAML loading and env substitution | `agent/config.py` |
| Observability | Unified logging, Prometheus metrics, OpenTelemetry tracing, event streaming | `agent/observability/` |
| Exception Hierarchy | Typed errors (retryable, non-retryable, budget, tool, resource) | `agent/exceptions.py` |

Tool Versions

The system ships with 4 tool bundles, selected via --tool-version:

| Version | Tools | Beta Flag | Display Required |
| --- | --- | --- | --- |
| `computer_use_20251124` (default) | Computer (with zoom), Edit, Bash, Memory, AGI, CLI, AskUser | `computer-use-2025-11-24` | Yes |
| `computer_use_20250124` | Computer, Edit, Bash, Memory, AGI, CLI, AskUser | `computer-use-2025-01-24` | Yes |
| `computer_use_20241022` | Computer, Edit, Bash | `computer-use-2024-10-22` | Yes |
| `cli_20250124` | Bash, Edit, Memory, AGI, CLI, AskUser | None | No |

Additionally, SubagentTool is loaded dynamically at runtime (requires API key), and MCP tools are added from connected servers.

Subagent System

Implements Anthropic's sub-agent compression pattern. The main agent spawns isolated sub-agents that return compressed summaries instead of raw output (50k tokens of exploration compressed to 2k summary):

| Type | Model | Use Case | Max Turns | Timeout |
| --- | --- | --- | --- | --- |
| `explore` | Haiku 4.5 | Fast codebase/data exploration | 5 | 60s |
| `analyze` | Sonnet 4.5 | Deep analysis with thinking | 8 | 90s |
| `execute` | Sonnet 4.5 | Task execution with verification | 15 | 180s |
| `research` | Haiku 4.5 | Web search and synthesis | 8 | 120s |
| `code` | Sonnet 4.5 | Code generation and modification | 12 | 180s |
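One way to hold the limits from the table above in code is a small lookup of frozen dataclasses. The values are copied from the table; the `SubagentProfile` structure and the `profile_for` helper are hypothetical and not taken from `agent/subagent.py`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubagentProfile:
    model: str
    max_turns: int
    timeout_seconds: int


# Values copied from the table above; the structure itself is an assumption.
SUBAGENT_PROFILES = {
    "explore": SubagentProfile("Haiku 4.5", max_turns=5, timeout_seconds=60),
    "analyze": SubagentProfile("Sonnet 4.5", max_turns=8, timeout_seconds=90),
    "execute": SubagentProfile("Sonnet 4.5", max_turns=15, timeout_seconds=180),
    "research": SubagentProfile("Haiku 4.5", max_turns=8, timeout_seconds=120),
    "code": SubagentProfile("Sonnet 4.5", max_turns=12, timeout_seconds=180),
}


def profile_for(kind: str) -> SubagentProfile:
    """Look up a profile, failing loudly on unknown sub-agent types."""
    try:
        return SUBAGENT_PROFILES[kind]
    except KeyError:
        raise ValueError(f"unknown subagent type: {kind!r}") from None


print(profile_for("explore"))
```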

MCP Integration

Connect external tools via Model Context Protocol with 3 transport types (stdio, SSE, HTTP) and 8 pre-configured presets:

```
# Available presets: slack, github, postgres, filesystem, memory, brave-search, puppeteer, sqlite
```

Tools appear in the conversation as mcp__<server>__<tool> (e.g., mcp__slack__send_message).
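For illustration, a tool name in this format can be split back into its server and tool parts as shown below. The `parse_mcp_tool_name` helper is hypothetical, and it assumes server names never contain a double underscore.

```python
def parse_mcp_tool_name(name: str) -> tuple[str, str]:
    """Split 'mcp__<server>__<tool>' into (server, tool)."""
    prefix = "mcp__"
    if not name.startswith(prefix):
        raise ValueError(f"not an MCP tool name: {name!r}")
    # partition() splits on the first '__', so tool names may contain '__'.
    server, sep, tool = name[len(prefix):].partition("__")
    if not sep or not server or not tool:
        raise ValueError(f"malformed MCP tool name: {name!r}")
    return server, tool


server, tool = parse_mcp_tool_name("mcp__slack__send_message")
print(server, tool)
```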

Dashboard

The web dashboard provides job management, real-time monitoring, and artifact storage:

```
Frontend (Next.js 14)          Backend (FastAPI)           Worker (Celery)
─────────────────────          ─────────────────           ───────────────
Dashboard home                 POST /api/jobs              Receives job from
Launch Task form         ──►   GET  /api/jobs              Redis queue
Live Runs (SSE)          ◄──   GET  /api/events/jobs       Calls sampling_loop()
Outputs browser                GET  /api/artifacts         Stores artifacts in S3
Template management            CRUD /api/templates         Records billing via Stripe
Usage metrics                  GET  /api/metrics/overview
                               CRUD /api/agi              PostgreSQL (persistence)
                               CRUD /api/skills           Redis (broker/backend)
                               GET  /api/observability    MinIO (S3 artifacts)
```

Deploy with Docker Compose (docker compose up -d): frontend on :3000, backend on :8000, with Postgres, Redis, and MinIO. Optional monitoring profile adds Prometheus, Grafana, and OpenTelemetry Collector.

Read More: Architecture Documentation →

📚 Documentation

Getting Started

Getting Started Guide - Step-by-step setup for beginners (10 min)
Quick Start - Common commands and usage patterns (5 min)
User Guide - Comprehensive reference (30 min)

Technical Deep-Dives

Architecture - System design and component interaction
Context Engineering - How we achieve 95% cost savings
Parallel Execution - Automatic tool parallelization
Memory System - Persistent agent memory
Metrics & Billing - Usage tracking and cost management

Feature Documentation

Tool Reference - Complete tool catalog
Web Search - Internet search capabilities
Web Fetch - HTTP requests and scraping
Code Execution - Running Python/Bash code
Files API - Document upload and management

Advanced Topics

Long-Running Tasks - Multi-hour agent operations
Skills System - Extending agents with custom skills
Dashboard - Web-based monitoring UI
AGI Integration - Advanced AI capabilities

📊 Performance & Cost

Real-World Metrics

Based on production usage across 1,000+ agent runs:

| Metric | Before Optimization | After Optimization | Improvement |
| --- | --- | --- | --- |
| Avg Tokens/Task | 150,000 | 7,500 | 95% reduction |
| Avg Cost/Task | $2.25 | $0.11 | 95% savings |
| Avg Task Duration | 45s | 30s | 33% faster |
| Context Quality | Degrades >50k tokens | EXCELLENT at 500k+ | Indefinite |
| Parallel Speed | Sequential (baseline) | 30-50% faster | 1.5x speedup |

Cost Breakdown (per 1M tokens)

| Operation | Input Cost | Output Cost | Typical Usage |
| --- | --- | --- | --- |
| Claude Opus 4.6 | $3.00 | $15.00 | Main model |
| Extended Thinking | $3.00 | $15.00 | Complex tasks only |
| Prompt Caching (hit) | $0.30 | $15.00 | 90% cost reduction |

Pro Tip: Enable prompt caching for system prompts to save about 90% on cached input tokens ($0.30 instead of $3.00 per 1M tokens).
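The figures in this section can be checked with simple arithmetic. In the sketch below, the rates and token counts come from the tables above. Pricing every token at the $15.00 rate is an inference: it is the only listed rate that reproduces the per-task costs, and the README does not state how those costs were computed.

```python
# Rates per 1M tokens, as listed in the cost breakdown table.
PRICE_PER_MILLION = {"input": 3.00, "output": 15.00, "cached_input": 0.30}


def cost(tokens: int, rate_per_million: float) -> float:
    """Cost in dollars for a token count at a per-million rate."""
    return tokens * rate_per_million / 1_000_000


# Assumption: all tokens priced at the output rate (see lead-in).
before = cost(150_000, PRICE_PER_MILLION["output"])
after = cost(7_500, PRICE_PER_MILLION["output"])

token_reduction = 1 - 7_500 / 150_000
cache_saving = 1 - PRICE_PER_MILLION["cached_input"] / PRICE_PER_MILLION["input"]

print(f"before ${before:.2f}, after ${after:.4f}")
print(f"token reduction {token_reduction:.0%}, cache saving {cache_saving:.0%}")
```

Under that assumption the results line up with the table: $2.25 before, about $0.11 after, a 95% token reduction, and a 90% saving on cached input.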

🔧 Configuration

Environment Variables

```
# Required
ANTHROPIC_API_KEY=sk-ant-api03-...     # Claude API access
DISPLAY=:1                              # X11 display server

# Optional
STRIPE_API_KEY=sk_live_...             # Usage-based billing
WORKSPACE_PATH=/path/to/workspace       # Working directory
```

Agent Configuration

Agents are configured via StateSet API or directly in main.py:

```
AGENT_CONFIGS = {
    "AUTO_CLOSE": AgentConfig(
        agent_id="stateset_auto_close",
        agent_type="AUTO_CLOSE",
        name="Auto-Close Agent",
        description="Automatically closes support tickets",
        stripe_customer_id="cus_..."  # Optional: for billing
    ),
    # ... more agents
}
```

Provider Selection

Support for multiple Claude providers:

```
# Anthropic (default)
provider = APIProvider.ANTHROPIC

# AWS Bedrock
provider = APIProvider.BEDROCK

# Google Vertex AI
provider = APIProvider.VERTEX
```

🛡️ Security Best Practices

Critical Security Considerations

Never commit API keys: Use environment variables or .env files

```
echo '.env' >> .gitignore
export ANTHROPIC_API_KEY='...'
```

Secure screenshot storage: May contain sensitive user data

```
chmod 700 screenshots/
# Implement automatic cleanup policy
```

Validate agent actions: Use tool guard for risky operations
```
# Pre-execution validation in agent/tool_guard.py
```

Agent memory isolation: Separate storage per agent

```
# Memory stored in /tmp/agent_memories/{agent_id}/
```

Prompt injection protection: Sanitize user inputs
```
# Implemented in agent/tools/memory.py
```

🐛 Troubleshooting

Common Issues

Display not found error

Error: Error: Can't open display :1

Solution:

```
# Start virtual display
Xvfb :1 -screen 0 1920x1080x24 &
export DISPLAY=:1

# Verify
xdpyinfo | grep dimensions
```

Import errors for anthropic/pyautogui

Error: ModuleNotFoundError: No module named 'anthropic'

Solution:

```
# Ensure virtual environment is activated
source venv/bin/activate

# Reinstall dependencies
pip install -r requirements.txt

# Verify installation
pip list | grep anthropic
```

API authentication failures

Error: 401 Unauthorized

Solution:

```
# Verify API key format (starts with sk-ant-api03-) without printing the whole key
echo "${ANTHROPIC_API_KEY:0:13}..."

# Test API connection
curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{"model":"claude-opus-4-6-20260131","max_tokens":1024,"messages":[{"role":"user","content":"test"}]}'
```

🚧 Roadmap

Implemented ✅

Multi-agent architecture with parallel execution
All 5 Anthropic context engineering patterns
Automatic tool parallelization (30-50% speedup)
Extended thinking support with effort control
Production monitoring and billing (Stripe)
Security hardening (prompt injection, path traversal, memory isolation)
Skills system for extensibility
Web dashboard with real-time SSE updates
Distributed tracing (OpenTelemetry integration)
Circuit breakers for API fault tolerance
Comprehensive exception hierarchy with typed errors
Subagent spawning for isolated task decomposition
MCP client integration (Slack, GitHub, Postgres, etc.)
Structured JSON output with schema validation
Stuck detection and recovery (repeating actions, cycling, stale loops)
Checkpoint system for resumable long-running tasks
Unified observability (logging, metrics, tracing, event streaming)
Tool search for deferred tool loading
Centralized configuration with YAML and env var support
2,700+ unit tests

Planned 📋

Agent marketplace for community contributions
Kubernetes deployment templates
ML-based screenshot delay prediction
Enhanced cost optimization (LZ4 compression)

📈 Metrics & Monitoring

Built-in Metrics

Every agent run captures comprehensive metrics:

```
{
  "task_id": "task_20250324_142301",
  "agent_type": "AUTO_CLOSE",
  "duration_seconds": 32.5,
  "tokens_used": 8234,
  "estimated_cost": 0.12,
  "tools_executed": 15,
  "parallel_executions": 3,
  "success": true,
  "completion_indicators": ["ticket closed", "task finished"]
}
```

Stripe Billing Integration

Automatic usage-based billing:

```
# Meter configured in Stripe
meter_id = "computer_use_tokens"

# Event sent on task completion
billing_event = {
    "event_name": meter_id,
    "payload": {
        "stripe_customer_id": "cus_...",
        "value": 8234,  # tokens used
    },
}
```

🤝 Contributing

This is proprietary software. For internal contributors:

Follow the development guidelines
All changes require review from 2+ team members
Ensure tests pass and coverage remains >80%
Update documentation for user-facing changes

📞 Support

Resources

Documentation: Start with Getting Started Guide
Bug Reports: Create detailed issue reports with logs and screenshots
Feature Requests: Submit to product team with use cases

Contact

For support, contact the StateSet team:

Email: support@stateset.com
Slack: #computer-use-agents (internal)
Emergency: On-call rotation (internal)

📄 License

Unauthorized copying, modification, distribution, or use of this software is strictly prohibited.

For licensing inquiries, contact: legal@stateset.com

🙏 Acknowledgments

Built with:

Claude Opus 4.6 - Anthropic's most intelligent AI model
PyAutoGUI - Desktop automation
httpx - Modern HTTP client
FastAPI - Dashboard backend
Next.js - Dashboard frontend
OpenTelemetry - Distributed tracing
Prometheus - Metrics collection
Model Context Protocol - External tool integration
Research from Anthropic's "Effective Context Engineering"

StateSet Computer Use Agent

Autonomous AI agents for the modern enterprise

Documentation • Architecture • Support

Made with ❤️ by the StateSet team

Release history

2.2.0

May 14, 2026

This version

2.0.3

Feb 17, 2026

2.0.1

Feb 15, 2026

2.0.0

Feb 15, 2026

Download files

Source Distribution

stateset_cua-2.0.3.tar.gz (351.9 kB view details)

Uploaded Feb 17, 2026 Source

Built Distribution

stateset_cua-2.0.3-py3-none-any.whl (380.8 kB view details)

Uploaded Feb 17, 2026 Python 3

File details

Details for the file stateset_cua-2.0.3.tar.gz.

File metadata

Download URL: stateset_cua-2.0.3.tar.gz
Upload date: Feb 17, 2026
Size: 351.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.16

File hashes

Hashes for stateset_cua-2.0.3.tar.gz
Algorithm	Hash digest
SHA256	`1ed8fc0a37e896b3f2c7f84ce7bed0c86d3a4f691ee9de12dcca914de06a7530`
MD5	`d1546d766c55f71cea8931484a98fab4`
BLAKE2b-256	`a8792ee7f8914ac8b9620963b238b689d3e6f97ce60ae00f4310cd4516147915`

File details

Details for the file stateset_cua-2.0.3-py3-none-any.whl.

File metadata

Download URL: stateset_cua-2.0.3-py3-none-any.whl
Upload date: Feb 17, 2026
Size: 380.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.16

File hashes

Hashes for stateset_cua-2.0.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`7919e3f047520941b12ed26e16214d32c5430f1e08d28c5a63ba4b0cff115e17`
MD5	`6f9131cdb609c53f9493058f5af780b8`
BLAKE2b-256	`05f626f74efd8085743d16b4828fea4a4cdd3964a38f5375a6f99c12b046abf1`
