
Python SDK for interacting with the Wuying AgentBay cloud runtime environment

Project description

AgentBay SDK for Python

Execute commands, operate files, and run code in cloud environments

📦 Installation

pip install wuying-agentbay-sdk

🚀 Prerequisites

Before using the SDK, you need to:

  1. Register an Alibaba Cloud account: https://aliyun.com
  2. Get API credentials: AgentBay Console
  3. Set environment variable: export AGENTBAY_API_KEY=your_api_key
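Because the SDK setup above depends on an environment variable, a script can fail fast when it is missing. The following is a minimal sketch: `require_api_key` is a hypothetical helper, not part of the SDK, and it assumes only the variable name `AGENTBAY_API_KEY` from step 3.

```python
import os
from typing import Mapping


def require_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the AgentBay API key, or raise if it is missing or blank."""
    key = env.get("AGENTBAY_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "AGENTBAY_API_KEY is not set; run "
            "`export AGENTBAY_API_KEY=your_api_key` first"
        )
    return key
```

Calling `require_api_key()` at the top of a script surfaces a configuration problem before any session is requested.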

🚀 Quick Start

Synchronous API (Default)

from agentbay import AgentBay

# Create session
agent_bay = AgentBay()
result = agent_bay.create()

if result.success:
    session = result.session

    # Execute command
    cmd_result = session.command.execute_command("ls -la")
    print(cmd_result.output)

    # File operations
    session.file_system.write_file("/tmp/test.txt", "Hello World")
    content = session.file_system.read_file("/tmp/test.txt")
    print(content.content)  # Hello World

    # Clean up
    agent_bay.delete(session)

Asynchronous API

import asyncio
from agentbay import AsyncAgentBay

async def main():
    # Create session
    agent_bay = AsyncAgentBay()
    result = await agent_bay.create()

    if result.success:
        session = result.session

        # Execute command
        cmd_result = await session.command.execute_command("ls -la")
        print(cmd_result.output)

        # File operations
        await session.file_system.write_file("/tmp/test.txt", "Hello World")
        content = await session.file_system.read_file("/tmp/test.txt")
        print(content.content)  # Hello World

        # Clean up
        await agent_bay.delete(session)

if __name__ == "__main__":
    asyncio.run(main())

🔄 Sync vs Async: Which to Choose?

The AgentBay Python SDK provides both synchronous and asynchronous APIs. Choose based on your application's needs:

| Feature | Sync API (AgentBay) | Async API (AsyncAgentBay) |
| --- | --- | --- |
| Import | from agentbay import AgentBay | from agentbay import AsyncAgentBay |
| Best for | Scripts, simple tools, CLI apps | Web servers (FastAPI/Django), high-concurrency apps |
| Blocking | Yes, blocks thread until complete | No, allows other tasks to run |
| Usage | client.create(...) | await client.create(...) |
| Concurrency | Sequential execution | Concurrent execution with asyncio |
| Learning Curve | Simpler, easier to start | Requires understanding of async/await |

When to Use Sync API

Use the synchronous API (AgentBay) when:

  • Simple scripts: One-off automation tasks or data processing scripts
  • CLI tools: Command-line applications with sequential operations
  • Learning: Getting started with AgentBay SDK
  • Debugging: Easier to debug with sequential execution flow

Example Use Case: A script that processes files sequentially

from agentbay import AgentBay, CreateSessionParams

agent_bay = AgentBay()
session = agent_bay.create(CreateSessionParams(image_id="code_latest")).session

# Process files one by one
for file_path in ["/tmp/file1.txt", "/tmp/file2.txt", "/tmp/file3.txt"]:
    content = session.file_system.read_file(file_path)
    processed = content.content.upper()
    session.file_system.write_file(file_path + ".processed", processed)

agent_bay.delete(session)
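The script above leaks its session if any file operation raises before `delete` is reached. One way to guard against that is a small context manager. This is a sketch under stated assumptions: `managed_session` is a hypothetical helper, not an SDK API, and it assumes failed results expose an `error_message` attribute.

```python
from contextlib import contextmanager


@contextmanager
def managed_session(agent_bay, *create_args):
    """Create a session, yield it, and always delete it afterwards.

    `agent_bay` is any object exposing create()/delete() as in the examples
    above; this helper is hypothetical and not part of the SDK.
    """
    result = agent_bay.create(*create_args)
    if not result.success:
        # Assumption: failed results carry an `error_message` attribute.
        message = getattr(result, "error_message", None)
        raise RuntimeError(message or "session creation failed")
    try:
        yield result.session
    finally:
        # Runs even if the body raises, so the cloud session is not leaked.
        agent_bay.delete(result.session)
```

With a real client this would read `with managed_session(AgentBay()) as session: ...`, with the file-processing loop placed inside the `with` block.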

When to Use Async API

Use the asynchronous API (AsyncAgentBay) when:

  • Web applications: FastAPI, Django, or other async web frameworks
  • High concurrency: Managing multiple sessions or operations simultaneously
  • Performance critical: Need to maximize throughput with I/O-bound operations
  • Real-time systems: Applications requiring non-blocking operations

Example Use Case: A web server handling multiple concurrent requests

import asyncio
from agentbay import AsyncAgentBay, CreateSessionParams

async def process_request(task_id: str, code: str):
    agent_bay = AsyncAgentBay()
    session = (await agent_bay.create(CreateSessionParams(image_id="code_latest"))).session

    try:
        result = await session.code.run_code(code, "python")
    finally:
        # Always release the cloud session, even if run_code raises
        await agent_bay.delete(session)
    return result

async def main():
    # Process multiple requests concurrently
    tasks = [
        process_request("task1", "print('Hello from task 1')"),
        process_request("task2", "print('Hello from task 2')"),
        process_request("task3", "print('Hello from task 3')"),
    ]

    results = await asyncio.gather(*tasks)
    for result in results:
        print(result.result)

if __name__ == "__main__":
    asyncio.run(main())
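`asyncio.gather` starts every task at once, so a large batch would open that many sessions simultaneously. If a cap is wanted, a semaphore can bound the number in flight. This is a sketch only: `gather_limited` is a hypothetical helper built on the standard library, and the default of 5 is an arbitrary illustration, not a documented AgentBay limit.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List, TypeVar

T = TypeVar("T")


async def gather_limited(
    jobs: Iterable[Callable[[], Awaitable[T]]], max_concurrent: int = 5
) -> List[T]:
    """Run zero-argument coroutine factories with at most `max_concurrent` in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def run(job: Callable[[], Awaitable[T]]) -> T:
        # Each job waits here until one of the slots is free.
        async with semaphore:
            return await job()

    # gather returns results in the same order as the input jobs.
    return await asyncio.gather(*(run(job) for job in jobs))
```

In the example above, the task list could become `[lambda: process_request("task1", "...")]` and be passed to `await gather_limited(jobs, max_concurrent=2)`. Factories are used instead of coroutine objects so that no work is scheduled before a slot is available.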

📖 Complete Documentation

🆕 New Users

🚀 Experienced Users

Choose Your Cloud Environment:

  • 🌐 Browser Use - Web scraping, browser testing, form automation
  • 🖥️ Computer Use - Windows desktop automation, UI testing
  • 📱 Mobile Use - Android UI testing, mobile app automation
  • 💻 CodeSpace - Code execution, development environments

🔧 Core Features Quick Reference

Session Management

Synchronous

from agentbay import AgentBay

agent_bay = AgentBay()

# Create session
result = agent_bay.create()
if result.success:
    session = result.session

# List sessions by labels with pagination
result = agent_bay.list(labels={"environment": "production"}, limit=10)
if result.success:
    session_ids = result.session_ids

# Delete session
delete_result = agent_bay.delete(session)

Asynchronous

import asyncio
from agentbay import AsyncAgentBay

async def main():
    agent_bay = AsyncAgentBay()

    # Create session
    result = await agent_bay.create()
    if result.success:
        session = result.session

    # List sessions by labels with pagination
    result = await agent_bay.list(labels={"environment": "production"}, limit=10)
    if result.success:
        session_ids = result.session_ids

    # Delete session
    delete_result = await agent_bay.delete(session)

asyncio.run(main())

File Operations

Synchronous

# Read/write files
session.file_system.write_file("/path/file.txt", "content")
content = session.file_system.read_file("/path/file.txt")
print(content.content)

# List directory
files = session.file_system.list_directory("/path")

Asynchronous

# Read/write files
await session.file_system.write_file("/path/file.txt", "content")
content = await session.file_system.read_file("/path/file.txt")
print(content.content)

# List directory
files = await session.file_system.list_directory("/path")

Command Execution

Synchronous

# Execute command
result = session.command.execute_command("python script.py")
print(result.output)

Asynchronous

# Execute command
result = await session.command.execute_command("python script.py")
print(result.output)

Code Execution

Synchronous

from agentbay import AgentBay, CreateSessionParams

agent_bay = AgentBay()
session = agent_bay.create(CreateSessionParams(image_id="code_latest")).session

# Run Python code
result = session.code.run_code("print('Hello World')", "python")
if result.success:
    print(result.result)  # Hello World

agent_bay.delete(session)

Asynchronous

import asyncio
from agentbay import AsyncAgentBay, CreateSessionParams

async def main():
    agent_bay = AsyncAgentBay()
    session = (await agent_bay.create(CreateSessionParams(image_id="code_latest"))).session

    # Run Python code
    result = await session.code.run_code("print('Hello World')", "python")
    if result.success:
        print(result.result)  # Hello World

    await agent_bay.delete(session)

asyncio.run(main())

Data Persistence

Synchronous

from agentbay import AgentBay, CreateSessionParams
from agentbay import ContextSync, SyncPolicy

agent_bay = AgentBay()

# Create context
context = agent_bay.context.get("my-project", create=True).context

# Create session with context
context_sync = ContextSync.new(context.id, "/tmp/data", SyncPolicy.default())
session = agent_bay.create(CreateSessionParams(context_syncs=[context_sync])).session

# Data in /tmp/data will be synchronized to the context
session.file_system.write_file("/tmp/data/config.json", '{"key": "value"}')

agent_bay.delete(session)

Asynchronous

import asyncio
from agentbay import AsyncAgentBay, CreateSessionParams
from agentbay import ContextSync, SyncPolicy

async def main():
    agent_bay = AsyncAgentBay()

    # Create context
    context = (await agent_bay.context.get("my-project", create=True)).context

    # Create session with context
    context_sync = ContextSync.new(context.id, "/tmp/data", SyncPolicy.default())
    session = (await agent_bay.create(CreateSessionParams(context_syncs=[context_sync]))).session

    # Data in /tmp/data will be synchronized to the context
    await session.file_system.write_file("/tmp/data/config.json", '{"key": "value"}')

    await agent_bay.delete(session)

asyncio.run(main())

🆘 Get Help

📄 License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Changelog

All notable changes to the Wuying AgentBay SDK will be documented in this file.

[0.13.0] - 2025-12-19

Core SDK capabilities (user-visible)

🐍 Python API architecture (migration required)

  • Sync/async split: Python SDK now provides separate sync and async APIs (AgentBay vs AsyncAgentBay, etc.).
  • Unified naming: Removed _async suffix from method names; async APIs are await-able with the same method names as sync.
  • Unified imports: Public imports were consolidated under agentbay.

🧑‍💻 Code Execution (run_code)

  • Rich outputs (backward compatible): Added EnhancedCodeExecutionResult to support multi-format outputs (HTML/Markdown/images/SVG/LaTeX/charts) while keeping compatibility with CodeExecutionResult. (See commit 46cffd02.)
  • Jupyter-like persistence: Added docs/examples for long-lived code execution workflows in CodeSpace.

🧾 Command execution

  • Structured outputs: Standardized command execution outputs to expose stdout and stderr as explicit fields across SDKs. (See commit c404ae48.)
  • Correctness: Fixed parsing of exit_code from command execution responses. (See commit 25d543cb.)
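With stdout, stderr, and the exit code exposed separately, callers can report failures more precisely than by printing a single output string. The sketch below assumes the result attributes are named `stdout`, `stderr`, and `exit_code`, following the wording of the entries above; `describe_command_result` is a hypothetical helper, and the sample object is a stand-in rather than a real SDK result.

```python
from types import SimpleNamespace
from typing import Any


def describe_command_result(result: Any) -> str:
    """Format a command result from its stdout/stderr/exit_code fields.

    Attribute names are assumptions taken from the changelog wording;
    getattr defaults keep the helper usable if a field is absent.
    """
    exit_code = getattr(result, "exit_code", None)
    stdout = getattr(result, "stdout", "") or ""
    stderr = getattr(result, "stderr", "") or ""
    status = "ok" if exit_code == 0 else f"failed (exit_code={exit_code})"
    lines = [f"status: {status}"]
    if stdout:
        lines.append(f"stdout: {stdout.rstrip()}")
    if stderr:
        lines.append(f"stderr: {stderr.rstrip()}")
    return "\n".join(lines)


# Stand-in object for illustration; a real result would come from
# session.command.execute_command(...).
fake = SimpleNamespace(exit_code=2, stdout="", stderr="ls: cannot access 'x'\n")
print(describe_command_result(fake))
```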

🤖 Agent module

  • BrowserUseAgent task API: Introduced/extended task-oriented APIs for BrowserUseAgent. (See commit c69b3596.)
  • execute_task refinement: Refined task execution API and updated docs/examples/tests accordingly. (See commit aa98c5d8.)

📁 Context filesystem

  • delete_file (all languages): Added delete_file support with tests and examples.
  • Internal context loading: Switched to GetAndLoadInternalContext to avoid incorrect context listing behavior.

🔄 Context sync API

  • sync_context → sync (breaking): Renamed context sync API for consistency across SDKs and removed the legacy alias.

Multi-region configuration

  • region_id support across SDKs: Added region selection support for Python/TypeScript/Go.

🖥️ Computer & window automation

  • Window API simplification (breaking): Removed timeout parameter from get_active_window across languages.
  • Typed screenshot results (TypeScript): computer.screenshot() returns a typed ScreenshotResult.
  • Cross-platform reliability: Added/expanded window management integration tests (Python + Go).

📱 Mobile automation

  • Mobile UI bounds: Added bounds_rect support for mobile UI elements (Python), including tests and examples.

☁️ OSS

  • Parameter naming: Standardized Python OSS client securityToken → security_token and updated tests/docs.

🐹 Go SDK

  • File transfer: Added file transfer support in Go SDK with integration tests.
  • Session deletion semantics: Adjusted delete session flow to align with async deletion API behavior.

Developer experience (docs & reference)

  • Async programming docs: Added/updated sync-vs-async comparison and v0.12.x → v0.13.x migration guide.
  • API reference: Regenerated API docs for Python/TypeScript/Go and improved doc generation metadata filtering.

Examples & cookbook (end-to-end usage)

  • Mobile login persistence cookbook: Added cookbook/mobile/app-login-persistence examples covering cross-session login state for multiple apps. (See commit 0e47976b.)
  • NL mobile control cookbook: Added a LangChain-based NL mobile control example with a web demo and tests.

Quality, CI, and release tooling

  • Smart integration test workflow: Added/iterated .aoneci/smart-integration-test.yaml (multi-language, improved reporting, AI analysis prompts, and stability improvements).
  • Examples inspection workflow: Added/iterated .aoneci/example-check.yaml (multi-language examples verification with AI analysis and DingTalk notifications).
  • KB preprocessing: Added/iterated preprocess_kb pipeline to generate knowledge-base-friendly docs in batches.
  • llms artifacts: Updated llms.txt / llms-full.txt generation workflow and content.

[0.12.0] - 2025-11-28

Added

📱 Mobile Simulation

  • Device Simulation Support: Added support for mobile device simulation features
    • Enhanced mobile testing capabilities
    • Support for simulating various mobile device characteristics

⏸️ Session Pause & Resume

  • Session Control: Added support for pausing and resuming sessions
    • pause() / pause_async(): Pause a running session to save resources
    • resume() / resume_async(): Resume a paused session
    • Available across Python, TypeScript, and Golang SDKs

Fixed

🐛 Bug Fixes

  • Browser Automation: Fixed browser-use timeout issues
  • Context Management: Fixed issue where get_session was creating a new context ID instead of returning the real one
  • Context Sync: Fixed sync call to correctly use context_id and path together
  • Session Control: Fixed return value handling when pause or resume backend operations fail
  • Local Testing: Fixed issue preventing browser tests from running locally
  • Python SDK: Simplified exception log output for cleaner logs
  • Infrastructure: Updated logo fallback path for Aone compatibility

Documentation

  • Use Cases: Improved Use Cases section UX in README
  • Examples: Updated Quick Start examples to use CodeSpace run_code
  • General: Regenerated API documentation and improved code examples
  • Golang: Added new examples for Golang SDK

[0.11.0] - 2025-11-18

Added

🖱️ Browser Fingerprint Persistence

  • Fingerprint Management: Support for browser fingerprint persistence and customization
    • Local fingerprint file synchronization
    • Custom fingerprint construction
    • Cross-session fingerprint persistence
    • Enhanced browser anti-detection capabilities

📸 Browser Screenshot Enhancement

  • Long Screenshot Support: Capture full-page screenshots beyond viewport
    • Support for scrolling screenshots
    • Automatic page stitching
    • Network idle waiting for complete rendering

🔄 Cross-Platform Context Sync

  • MappingPolicy: New policy for cross-platform context synchronization
    • Flexible path mapping between different platforms
    • Support for Windows/Linux/macOS path translation
    • Enhanced context portability

📚 Cookbook Examples

  • E-Commerce Inspector: Browser automation for e-commerce data extraction and analysis
  • AI Code Assistant: Code generation and execution in isolated sandbox environment
  • Data Analysis: Automated data processing and visualization with AI-driven insights

Changed

🔒 OSS Security Enhancement

  • Breaking Change: securityToken is now required for OSS operations
    • Enhanced security for object storage operations
    • Updated documentation and examples

⌨️ Key Normalization

  • Improved Case Compatibility: Better handling of key names in press_keys tool
    • Automatic case normalization for common keys
    • Support for both uppercase and lowercase key names
    • Consistent behavior across all SDKs

🧹 API Surface Cleanup

  • API Privatization: Internal APIs marked as private across all SDKs
    • Python: Private methods prefixed with _
    • TypeScript: Internal APIs marked as private
    • Golang: Internal packages and unexported functions
    • Cleaner public API documentation
    • Removed deprecated APIs and properties

📖 Documentation Overhaul

  • Comprehensive Documentation Enhancement: Major documentation improvements
    • Migrated examples from separate docs to source code
    • Metadata-driven documentation generation
    • Inline examples for all public APIs
    • Fixed broken links across all documentation
    • Simplified and clarified API examples
    • Enhanced API reference with comprehensive usage examples

🎯 Session Recovery

  • File Transfer Context: Automatic context creation for recovered sessions
    • Better handling of session recovery scenarios
    • Improved file transfer reliability

Fixed

🐛 Bug Fixes

  • SDK Version Reporting: Fixed version detection in SdkStats module
  • Context Sync: Removed invalid sync_id parameter in ContextSyncResult
  • Session Info: Handle NotFound errors gracefully with clear error messages
  • Mobile API: Aligned mobile API naming across SDKs with MCP tool specification
  • UIElement Bounds: Handle bounds string format correctly in Golang and TypeScript
  • Screenshot Timeout: Fixed timeout issues caused by network idle waiting
  • Documentation: Fixed RST code block rendering and markdown formatting issues

Removed

🗑️ Deprecated APIs

  • Cleanup: Removed all deprecated APIs and properties
    • Cleaner codebase
    • Reduced maintenance burden
    • Clear upgrade path for users

[0.10.0] - 2025-10-31

Added

🤖 AI Context Support

  • AI Coding Assistant Integration: Added llms.txt and llms-full.txt files for better AI assistant context
    • Comprehensive codebase documentation for AI tools
    • Enhanced development experience with Claude Code and similar assistants
    • Structured project information for better AI understanding

🔧 MCP Tool Enhancement

  • Unified MCP Tool API: New public API for MCP tool invocation across all SDKs
    • Python: session.call_mcp_tool(tool, args) for direct MCP tool calls
    • TypeScript: session.callMcpTool(tool, args) for direct MCP tool calls
    • Golang: session.CallMcpTool(tool, args) for direct MCP tool calls
    • Simplified tool invocation without manual server discovery
    • Better error handling and response parsing

📊 SDK Statistics & Tracking

  • SDK Stats Module: Automatic SDK usage tracking and version reporting
    • Auto-detection of SDK version from package metadata
    • Release tracking for Python, TypeScript, and Golang
    • Statistics collection for better service improvement

🗑️ Context Management Enhancement

  • Context Clear API: New API for clearing context data
    • Python: context.clear(context_id) with status polling
    • TypeScript: context.clear(contextId) with status polling
    • Golang: context.Clear(contextId) with status polling
    • Asynchronous clearing with status monitoring
    • Non-blocking operation with completion detection

🌐 Browser Automation Enhancement

  • Browser Type Selection: Support for different browser types

    • Choose between Chrome and Chromium browsers
    • browser_type option in BrowserOption across all SDKs (values: "chrome", "chromium", or None for default)
    • Default browser selection per image type
    • Browser-specific optimization and compatibility
  • Browser Navigation & Arguments: Enhanced browser initialization

    • default_navigate_url parameter for automatic page navigation
    • cmd_args parameter for custom browser command line arguments
    • Better control over browser startup behavior
  • Golang Browser Support: Added full browser automation for Golang SDK

    • session.Browser interface matching Python and TypeScript
    • Complete browser API implementation
    • Browser context and page management

📱 Mobile Enhancement

  • ADB Connection URL: Direct ADB connection support for mobile automation
    • Python: session.mobile.get_adb_url() returns ADB connection URL
    • TypeScript: session.mobile.getAdbUrl() returns ADB connection URL
    • Golang: session.Mobile.GetAdbUrl() returns ADB connection URL
    • Enable external ADB client connections
    • Enhanced mobile automation capabilities

📝 Enhanced Logging System

  • Comprehensive Logging Infrastructure: Unified logging across all SDKs
    • File logging support with log rotation
    • Environment-based log level configuration (AGENTBAY_LOG_LEVEL)
    • API call and response logging with sanitization
    • Process and thread information in async operations
    • Python: loguru-based logging with enhanced formatting
    • TypeScript: winston-based logging with color support
    • Golang: structured logging with context support

📋 Code Execution Enhancement

  • Code Execution Output Logging: Better visibility for code execution
    • Detailed output logging for run_code operations
    • Comprehensive integration tests for code execution
    • Better error reporting and debugging

📄 Data Persistence Enhancement

  • File Compression Support: Archive mode for data persistence
    • Compress files before upload to reduce storage costs
    • Archive mode configuration in context sync
    • Automatic decompression on download
    • Support for .tar.gz format

Changed

🔄 Session Link Access Model

  • Breaking Change: Session Link access changed from whitelist to paid subscription
    • Whitelist-based access deprecated
    • New subscription-based access model
    • Updated documentation with new access requirements

🎨 Browser Context Management

  • Browser Replay Context Sync: Fixed context synchronization behavior
    • Only sync browser replay context when sync_context is explicitly False
    • Better control over context persistence
    • Reduced unnecessary sync operations

📚 Documentation Reorganization

  • Documentation Structure Improvement: Better organized documentation
    • Updated API documentation links to new directory structure
    • Added custom images guide with comprehensive examples
    • Enhanced session management documentation
    • Added markdown link checker for quality assurance
    • Fixed 166 broken markdown links across documentation
    • Production environment recommendations for image types

Fixed

🐛 Browser Issues

  • Browser Image ID: Fixed browser image references
    • Corrected browser-latest to browser_latest across all SDKs
    • Fixed default browser type values (none/undefined/nil)
    • Fixed browser type example syntax errors
    • Fixed bad reference links in browser API documents
    • Fixed browser page creation to use context 0

🔧 TypeScript Issues

  • ESLint Compliance: Fixed TypeScript code quality issues
    • Fixed prefer-const errors in API client
    • Improved error handling consistency
    • Better code organization

📁 File System Issues

  • File Search Functionality: Fixed filesystem example issues
    • Corrected file search implementation
    • Better error handling in filesystem operations
    • Fixed OSS sync file deletion issues

📱 Mobile Issues

  • Mobile UI Element Methods: Aligned mobile.py with ui.py implementation
    • Consistent method signatures across modules
    • Added JSON parsing in get_clickable_ui_elements
    • Fixed screenshot saving in automation examples

🧪 Testing Issues

  • Test Case Quality: Improved test reliability
    • Fixed bad test case design for browser type switching
    • Fixed mobile getAdbUrl unit tests across SDKs
    • Improved pytest configuration and compatibility
    • Fixed BoolResult parameter order in tests

Documentation

  • Comprehensive Documentation Updates: Major improvements across all areas
    • Added comprehensive custom images guide
    • Updated data persistence documentation with archive mode
    • Added Windows application management examples
    • Added logging documentation for all SDKs
    • Removed outdated logging and mobile examples docs
    • Added production environment recommendations
    • Updated Session Link access documentation
    • Fixed markdown link issues and emoji anchor compatibility

[0.9.0] - 2025-10-15

Added

🎯 Platform-Specific Automation Modules

  • Computer Module: New dedicated module for desktop/Windows automation across all SDKs
    • Desktop UI interaction APIs: click_mouse(), move_mouse(), drag_mouse(), press_keys(), scroll()
    • Screen information: get_screen_size(), screenshot()
    • Integration with MCP tools for advanced automation
    • Python: session.computer.*, TypeScript: session.computer.*, Golang: session.Computer.*
  • Mobile Module: New dedicated module for Android device automation across all SDKs
    • Touch interaction: tap(), swipe(), long_press()
    • Input methods: input_text(), send_key() with KeyCode support
    • UI element detection: get_clickable_ui_elements()
    • Device configuration: configure() for setting device properties
    • Android-only support (mobile_latest image)
    • Python: session.mobile.*, TypeScript: session.mobile.*, Golang: session.Mobile.*

📊 Session Management Enhancement

  • Get Session API: Retrieve existing session information by session ID
    • Python: agentbay.get(session_id) returns SessionResult
    • TypeScript: agentBay.get(sessionId) returns SessionResult
    • Golang: agentBay.Get(sessionID) returns *SessionResult
    • Returns session object with VPC configuration, network info, and resource URL
    • Non-throwing error handling: check result.success and result.error_message
  • List Sessions API: Retrieve paginated list of sessions with label filtering
    • Python: agentbay.list(labels, page, limit) returns SessionListResult
    • TypeScript: agentBay.list(labels, page, limit) returns SessionListResult
    • Golang: agentBay.List(labels, page, limit) returns *SessionListResult
    • Support for page-based pagination (page numbers start from 1)
    • Label filtering with key-value pairs
    • Returns session IDs, total count, and pagination metadata
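Page-based listing lends itself to a small generator that walks pages until they run out. The following is a sketch, not the SDK's own iteration API: `iter_session_ids` is a hypothetical helper, the keyword names `labels`, `page`, and `limit` are taken from the signatures above, `session_ids` follows the README examples, and stopping on a short or empty page is an assumption about how the last page looks.

```python
from typing import Any, Dict, Iterator, Optional


def iter_session_ids(
    agent_bay: Any, labels: Optional[Dict[str, str]] = None, limit: int = 10
) -> Iterator[str]:
    """Yield session IDs page by page until a short or empty page is returned."""
    page = 1  # page numbers start from 1
    while True:
        result = agent_bay.list(labels=labels, page=page, limit=limit)
        if not result.success:
            message = getattr(result, "error_message", None)
            raise RuntimeError(message or "listing sessions failed")
        ids = list(result.session_ids or [])
        yield from ids
        if len(ids) < limit:
            # Assumption: a page shorter than `limit` is the last one.
            break
        page += 1
```

This version targets the synchronous client; an asynchronous client would need an `async def` generator that awaits `list(...)`.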

🗄️ Data Lifecycle Management

  • RecyclePolicy: Control context data retention lifecycle
    • Lifecycle options: 1day, 3days, 5days, 10days, 15days, 30days, 90days, 180days, 360days, forever
    • Path-specific policies: apply different lifecycles to different directories
    • Path validation: wildcard patterns not supported, use exact directory paths
    • Integration with ContextSync for automatic data cleanup
    • Python: RecyclePolicy(lifecycle=Lifecycle.LIFECYCLE_30DAYS, paths=["/data"])
    • TypeScript: new RecyclePolicy(Lifecycle.LIFECYCLE_30DAYS, ["/data"])
    • Golang: &RecyclePolicy{Lifecycle: Lifecycle_30Days, Paths: []string{"/data"}}

🔒 VPC Session Authentication Enhancement

  • Token-based Authentication: VPC sessions now support token authentication
    • Automatic token management for secure VPC session access
    • Token included in session creation response
    • Used for MCP tool calls and resource access in VPC environments

🔧 API Schema Validation

  • MCP Field Validation: Enhanced MCP tool parameter validation
    • Renamed schema to field_schema to align with MCP standard
    • Better error messages for invalid tool parameters
    • Improved type checking for tool inputs

Changed (Breaking Changes)

⚠️ API Response Structure Changes

  • SessionListResult.Sessions Removed: Breaking change in session list response
    • Removed: SessionListResult.Sessions field (contained full Session objects)
    • Use instead: SessionListResult.SessionIds (list of session ID strings)
    • Rationale: Reduce response payload size and improve performance
    • Migration: Use agentbay.get(session_id) to retrieve full Session objects for specific sessions
    • Before: result.Sessions[0].command.execute_command(...)
    • After:
      session_id = result.SessionIds[0]
      session_result = agentbay.get(session_id)
      session_result.session.command.execute_command(...)
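When several sessions are needed, the same migration can be applied in a loop. This is a sketch under assumptions: `fetch_sessions` is a hypothetical helper, it relies on the `get(session_id)` call and the `success` / `error_message` fields described above, and it collects failures instead of raising so that one missing session does not abort the rest.

```python
from typing import Any, Dict, Iterable, List, Tuple


def fetch_sessions(
    agent_bay: Any, session_ids: Iterable[str]
) -> Tuple[List[Any], Dict[str, str]]:
    """Resolve session IDs to Session objects via get(); collect failures by ID."""
    sessions: List[Any] = []
    errors: Dict[str, str] = {}
    for session_id in session_ids:
        result = agent_bay.get(session_id)
        if result.success:
            sessions.append(result.session)
        else:
            # Non-throwing error handling: record the message and continue.
            errors[session_id] = getattr(result, "error_message", None) or "unknown error"
    return sessions, errors
```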
      

🔄 Error Handling Consistency

  • Unified Error Handling: All APIs now use Result objects instead of raising exceptions
    • context.get() returns ContextResult (no longer raises AgentBayError)
    • context.create() returns ContextResult (no longer raises AgentBayError)
    • Migration Required: Replace try-except AgentBayError with if not result.success pattern
    • All error messages now follow [ErrorCode] Message format
    • Before:
      try:
          result = agentbay.context.get("my-context")
          context = result.context
      except AgentBayError as e:
          print(f"Error: {e}")
      
    • After:
      result = agentbay.context.get("my-context")
      if not result.success:
          print(f"Error: {result.error_message}")
      else:
          context = result.context
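Since error messages follow the `[ErrorCode] Message` format, callers that want to branch on the code can split it off. The sketch below is illustrative only: `split_error_message` is a hypothetical helper, and the regular expression assumes the code is the bracketed prefix at the start of the string.

```python
import re
from typing import Optional, Tuple

# Bracketed code at the start, then the free-form message.
_ERROR_RE = re.compile(r"^\[(?P<code>[^\]]+)\]\s*(?P<message>.*)$", re.DOTALL)


def split_error_message(error_message: str) -> Tuple[Optional[str], str]:
    """Split "[ErrorCode] Message" into (code, message); code is None if absent."""
    text = error_message.strip()
    match = _ERROR_RE.match(text)
    if match is None:
        return None, text
    return match.group("code"), match.group("message")
```

A caller would typically pass `result.error_message` after checking `if not result.success`.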
      

Deprecated

  • UI Module: All methods marked for removal in a future version
    • session.ui.click() → Use session.computer.click_mouse() or session.mobile.tap()
    • session.ui.type() → Use session.computer.press_keys() or session.mobile.input_text()
    • session.ui.mouse_move() → Use session.computer.move_mouse()
    • session.ui.screenshot() → Use session.computer.screenshot() or session.mobile.screenshot()
    • Migration guide: Use platform-specific computer or mobile modules
  • Window Module: Some methods deprecated
    • Deprecated methods will be replaced by Computer module equivalents in a future version
  • Context Fields:
    • ContextResult.state marked for removal in a future version
    • ContextResult.os_type marked for removal in a future version

Enhanced

  • Error Messages: Improved error reporting with structured [Code] Message format across all APIs
  • API-level Error Handling: Enhanced error parsing for context.get(), context.list(), and agentbay.create_session()
  • ContextResult.error_message: Added for consistent error reporting in context operations
  • ContextListResult.error_message: Added for consistent error reporting in list operations

Documentation

  • Documentation Restructure: Major documentation organization improvements
    • New Cookbook: Real-world examples and recipes for common scenarios
    • Restructured Guides:
      • Common Features split into Basics, Advanced, and Configuration sections
      • Computer Use guide updated with new Computer module APIs
      • Mobile Use guide updated with new Mobile module APIs and Android-only clarification
      • CodeSpace guide enhanced with code execution examples
    • API Reference Updates: Aligned with actual SDK implementation across all three languages
    • Migration Guides: Added for breaking changes and deprecated APIs
    • Net documentation changes: +993 lines across 28 new files, 16 modified files, 14 deleted files

Fixed

  • Documentation Accuracy: Fixed incorrect iOS support claim in Mobile Use guide (Android only)
  • API Documentation: Corrected method signatures and return types across all language SDKs
  • Example Code: All code examples verified against actual SDK implementation

[0.8.0] - 2025-09-19

Added

  • Context Sync with Callback Support: Enhanced context synchronization capabilities
    • Asynchronous Context Sync: Added callback-based asynchronous context sync functionality
    • Synchronous Context Sync: Support for blocking synchronous context sync operations
    • Flexible Sync Options: Two calling modes - with callback for async, without for sync
    • Context Sync Examples: Comprehensive examples demonstrating both sync modes
  • Enhanced Logging System: Upgraded to loguru-based logging infrastructure
    • File Logging: Support for logging to files for better debugging and analysis
    • Environment-based Log Levels: Configurable log levels via environment variables
    • Process & Thread Information: Enhanced log output with process and thread details for async operations
    • DEBUG Level Support: More detailed logging information at DEBUG level (default: INFO)
  • Data Persistence Examples: Comprehensive data persistence examples across all SDKs
    • Cross-language Examples: Updated examples for Python, TypeScript, and Golang
    • Context Management: Enhanced examples showing context binding and persistence
    • Session Recovery: Documentation and examples for session recovery scenarios
  • Browser-Use Documentation: Complete browser automation documentation suite
    • Core Features Guide: Comprehensive documentation for browser context, proxies, stealth mode, extensions, and captcha handling
    • Advanced Features Guide: In-depth coverage of PageUse Agent and AI-powered browser interactions
    • Code Examples: Practical examples demonstrating browser automation workflows
    • Integration Guide: Documentation for seamless integration with community tools and frameworks
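
The two context sync calling modes listed above (with a callback for async, without one for blocking sync) can be illustrated with a minimal, SDK-independent sketch. The names `sync_context` and `run_sync_job` are hypothetical stand-ins and are not the SDK's actual API; the sketch only shows the general pattern of one function serving both modes.

```python
import threading
from typing import Callable, List, Optional


def run_sync_job() -> bool:
    """Stand-in for the real synchronization work; always succeeds here."""
    return True


def sync_context(callback: Optional[Callable[[bool], None]] = None) -> Optional[bool]:
    """Hypothetical helper: block when no callback is given, else run in the background."""
    if callback is None:
        # Blocking mode: do the work and hand the result straight back.
        return run_sync_job()

    # Callback mode: do the work on a worker thread and report via the callback.
    worker = threading.Thread(target=lambda: callback(run_sync_job()))
    worker.start()
    return None


# Blocking call: the result is available as the return value.
blocking_result = sync_context()

# Callback call: an Event lets this example wait for the background result.
done = threading.Event()
callback_results: List[bool] = []


def on_done(success: bool) -> None:
    callback_results.append(success)
    done.set()


sync_context(on_done)
done.wait(timeout=5)
```

In this sketch the caller picks the mode purely by whether a callback is supplied, which mirrors the "two calling modes" wording above without claiming anything about how the SDK implements it.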

Changed

  • Session Management: Simplified session interface
    • Removed resourceUrl: Eliminated resourceUrl parameter from session creation for cleaner API
    • Streamlined Session Creation: Simplified session parameters and initialization
  • Build System: Modernized Python package management
    • Removed setup.py: Transitioned from setup.py to setup.cfg for cleaner package configuration
    • Poetry Integration: Enhanced CI/CD with Poetry-based publishing workflows
  • Example Improvements: Enhanced code quality and consistency across examples
    • Better Error Handling: Improved error handling patterns in SDK examples
    • Parameter Standardization: Consistent parameter setup across all examples
    • Code Cleanup: Improved readability and maintainability of browser examples

Fixed

  • Browser Cookie Persistence: Resolved browser cookie persistence issues in examples
    • CDP Session Management: Fixed CDP session variable handling in cookie persistence examples
    • Browser Connection: Improved browser connection stability for persistent sessions
  • Unit Test Reliability: Enhanced test stability and coverage
    • Test Case Updates: Updated filesystem test cases for better failure scenario handling
    • Test Consistency: Fixed unit test failures and improved test reliability
  • Session Parameter Handling: Fixed session creation parameter issues in automation examples

Documentation

  • Comprehensive Documentation Updates: Major improvements across all documentation areas
    • API Key Usage: Updated best practices for API key usage and security
    • Parallel Workflows: Added examples and documentation for parallel workflow execution
    • Automation Guides: Enhanced automation documentation with English translations
    • Application & Window Management: Added comprehensive documentation for application and window operations
    • Session Recovery: New documentation covering session recovery patterns and best practices
    • File Operations: Updated file operations guides with session usage examples
    • Best Practices: Enhanced best practices documentation for API key usage and file search results handling

[0.7.0] - 2025-09-02

Added

  • AI Browser Extension: New browser extension capabilities for enhanced automation
    • Python Extension Support: Added extension.py module for browser extension functionality
    • TypeScript Extension Support: Complete extension API implementation with examples
    • Extension Integration: Seamless integration with browser automation workflows
    • Extension Testing: Comprehensive test coverage for extension functionality
  • Enhanced File System API: Major improvements to file operations across all SDKs
    • Streamlined API: Updated method signatures for better consistency
    • Session Integration: Better integration with session management for file operations
    • Comprehensive Testing: Expanded test coverage with integration and unit tests
  • Documentation & Guides: Comprehensive documentation improvements
    • Large File Handling Guide: Detailed guide for handling large files efficiently
    • File Operations Guide: Updated comprehensive guide with session usage examples
    • API Documentation: Complete API reference updates across all SDKs
    • Usage Examples: Updated examples and documentation for better developer experience

Changed

  • File System API: Breaking changes to improve consistency and usability
    • Method Naming: Standardized method names across Python, TypeScript, and Golang SDKs
    • Return Types: Enhanced return types for better type safety and error handling
    • Session Context: Improved integration with session management for file operations
  • Documentation Structure: Major documentation reorganization
    • API Reference: Updated API documentation to match actual implementation
    • Command API: Updated method names and return value references across all documentation
    • Context Manager: Enhanced documentation with detailed return object information

Fixed

  • Browser Automation: Resolved browser-related issues
    • Page Variables: Fixed support for variables in page_use_act functionality
  • Python Package: Resolved Python-specific issues
    • Module Imports: Added missing __init__.py in agentbay models directory
    • API Examples: Fixed incorrect API usage examples in Python README
  • Test Infrastructure: Improved test reliability and organization
    • Test Organization: Moved test files to appropriate integration directories
    • Deprecated Tests: Removed outdated integration tests
    • Test Coverage: Enhanced test coverage for new features

Documentation

  • Comprehensive Updates: Major documentation improvements across all areas
    • Getting Started: Updated quickstart documentation and first session guides
    • API Reference: Complete API documentation updates for all modules
    • Examples: Updated SDK usage examples and documentation
    • Guides: New and updated guides for file operations and large file handling

[0.6.0] - 2025-08-23

Added

  • Browser Proxy Support: New proxy configuration for browser automation across Python and TypeScript
    • BrowserProxy Class: Support for custom and wuying proxy types
      • Custom proxy with server, username, password configuration
      • Wuying proxy with "restricted" and "polling" strategies
      • Configurable pool size for polling strategy
    • Proxy Integration: Seamless integration with browser initialization
      • Automatic proxy configuration during browser setup
      • Support for both authenticated and anonymous proxy connections
  • Browser Stealth Mode & Fingerprinting: Enhanced browser automation capabilities
    • Stealth Mode: Browser stealth mode support to avoid detection
    • Browser Fingerprinting: Configurable browser fingerprint options
      • Custom user agent, viewport, timezone, and locale settings
      • Enhanced privacy and anti-detection capabilities
  • Context File Management APIs: Complete file operations within contexts across all SDKs
    • File URL Operations: Presigned URL support for secure file access
      • get_file_download_url() / GetFileDownloadUrl() for secure file downloads
      • get_file_upload_url() / GetFileUploadUrl() for secure file uploads
    • File Management: Full CRUD operations for context files
      • list_files() / ListFiles() with pagination support for context file listing
      • delete_file() / DeleteFile() for context file deletion
      • Enhanced error handling and response parsing for all file operations
  • Session Management Enhancements: Improved session creation and management
    • Policy Support: Optional policy_id parameter in session creation
    • Session List Response Models: Enhanced session listing with proper response models
  • Browser Agent Improvements: Enhanced AI-powered browser interactions
    • Direct ObserveResult Support: PageUse act API can now accept ObserveResult directly
    • TypeScript Browser Agent: Merged browser agent functionality to TypeScript SDK
    • Enhanced Parameter Handling: Improved parameter type conversion and validation
  • Development & Testing Infrastructure: Enhanced development experience
    • Pre-commit Hooks: Added secret detection in pre-commit hooks
    • Local Benchmark Testing: Added local benchmark test support for PageUseAgent API
    • Page Task Framework: Initial version of page task framework
    • Auto Publishing: Enhanced CI/CD with auto publishing workflows
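
As a rough illustration of the proxy options described above (custom proxies with server, username, and password; wuying proxies with "restricted" or "polling" strategies and a pool size for polling), the following sketch models the settings as a plain dataclass. `ProxyConfig` and its field names are hypothetical and are not the SDK's `BrowserProxy` class; the rule that polling needs a positive pool size is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProxyConfig:
    """Hypothetical model of a browser proxy setting (not the SDK's BrowserProxy)."""

    proxy_type: str                  # "custom" or "wuying"
    server: Optional[str] = None     # custom proxy only
    username: Optional[str] = None   # custom proxy only; omit for anonymous access
    password: Optional[str] = None   # custom proxy only; omit for anonymous access
    strategy: Optional[str] = None   # wuying proxy only: "restricted" or "polling"
    pool_size: Optional[int] = None  # polling strategy only

    def validate(self) -> None:
        """Raise ValueError if the combination of fields is inconsistent."""
        if self.proxy_type == "custom":
            if not self.server:
                raise ValueError("custom proxy requires a server")
        elif self.proxy_type == "wuying":
            if self.strategy not in ("restricted", "polling"):
                raise ValueError("wuying proxy requires strategy 'restricted' or 'polling'")
            # Assumption: a polling pool must have at least one address.
            if self.strategy == "polling" and (self.pool_size is None or self.pool_size <= 0):
                raise ValueError("polling strategy requires a positive pool_size")
        else:
            raise ValueError(f"unknown proxy type: {self.proxy_type!r}")


# Example: an anonymous custom proxy and a polling wuying proxy.
ProxyConfig("custom", server="127.0.0.1:8080").validate()
ProxyConfig("wuying", strategy="polling", pool_size=3).validate()
```

Validating the combination up front, before a browser is initialized, is one way to surface configuration mistakes early.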

Changed

  • Browser Configuration: Enhanced browser setup with advanced options
    • Breaking Change: BrowserOption now supports proxy, stealth mode, and fingerprinting
    • Automatic proxy validation and configuration during browser initialization
    • Better error handling for proxy connection and stealth mode issues
  • Browser Agent API: Improved browser automation interface
    • Enhanced parameter naming and validation in ExtractOptions, ObserveOptions, and ActOptions
    • Better timeout handling with default MCP tool call timeout for PageUseAgent
    • Improved error handling and response parsing
  • Context API Enhancement: Improved context service implementation
    • Enhanced error handling and response parsing across all context operations
    • Better pagination support with ContextListParams for large context lists
    • Improved integration with session creation and context synchronization
  • Documentation & Examples: Updated documentation and examples
    • Updated Python examples with session params
    • Aligned TypeScript examples with latest API changes
    • Updated README and examples documentation
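
The pagination support mentioned above for large context lists can be pictured as a loop that follows a continuation token until none is returned. This is a generic sketch under assumptions: the `max_results` and `next_token` names and the token-based scheme are hypothetical, and `fake_fetch` is an in-memory stand-in rather than a real SDK call.

```python
from typing import Callable, Iterator, List, Optional, Tuple

# A page fetcher takes (max_results, next_token) and returns (items, next_token).
PageFetcher = Callable[[int, Optional[str]], Tuple[List[str], Optional[str]]]


def iter_all(fetch_page: PageFetcher, max_results: int = 10) -> Iterator[str]:
    """Yield every item by following next_token until the fetcher returns none."""
    token: Optional[str] = None
    while True:
        items, token = fetch_page(max_results, token)
        yield from items
        if not token:
            break


def fake_fetch(max_results: int, next_token: Optional[str]) -> Tuple[List[str], Optional[str]]:
    """In-memory stand-in for a paginated list call over five context names."""
    data = [f"ctx-{i}" for i in range(5)]
    start = int(next_token) if next_token else 0
    end = start + max_results
    return data[start:end], (str(end) if end < len(data) else None)


# Collect all five names, two per page.
all_contexts = list(iter_all(fake_fetch, max_results=2))
```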

Fixed

  • Browser VPC Environment: Resolved VPC-specific browser issues
    • Fixed browser endpoint URL population for VPC environment
    • Better CDP endpoint management for VPC and non-VPC sessions
  • Browser Agent API: Enhanced browser automation reliability
    • Fixed parameter type conversion in observation API
    • Fixed incorrect naming in browser agent methods and options
    • Improved error handling for browser agent operations
  • PageUseAgent Integration: Resolved PageUseAgent API issues
    • Fixed run_sudoku functionality
    • Aligned examples with latest PageUseAgent API revision
    • Fixed benchmark test integration with PageAgent
  • Context File Operations: Enhanced context file management reliability
    • Fixed presigned URL generation and expiration handling
    • Improved file listing pagination and response parsing
    • Better error handling for file upload/download operations
  • Cross-Platform Compatibility: Improved SDK consistency
    • Standardized method signatures across Python, TypeScript, and Golang
    • Fixed async/await patterns in TypeScript and Python implementations
    • Better error handling and response structure consistency

[0.5.0] - 2025-08-06

Added

  • Browser Automation & AI Browser: Comprehensive browser automation capabilities with AI-powered interactions
    • AIBrowser interface with Playwright integration
    • Browser Context for persistent data across sessions (cookies, local storage, session storage)
    • Browser Agent with natural language operations (Act, Observe, Extract)
    • Support for complex web automation tasks through Agent module integration
  • VPC Session Support: Enhanced security and networking capabilities
    • VPC-based session creation with is_vpc parameter
    • Specialized system tools (get_resource, system_screenshot, release_resource)
    • Browser tools availability in VPC sessions (cdp, pageuse-mcp-server, playwright)
    • Network isolation for sensitive operations
  • Agent Module: AI-powered task execution framework
    • Natural language task execution with ExecuteTask() method
    • Task status monitoring and management
    • Task termination capabilities
    • Integration with browser automation for complex web operations
  • Unified MCP Tool Interface: Standardized tool calling mechanism across all modules
    • CallMcpTool() method for unified tool invocation
    • Support for both VPC and non-VPC tool calling patterns
    • Automatic server discovery for tools
    • Enhanced error handling and debugging capabilities
  • Comprehensive Interface Architecture: Complete interface definitions for better modularity
    • FileSystemInterface, SessionInterface, WindowInterface, ApplicationInterface
    • CommandInterface, CodeInterface, UIInterface, OSSInterface
    • AgentBayInterface for main client operations
    • Mock generation support for all interfaces
  • Enhanced Context Management: Improved context synchronization and pagination
    • Context list pagination with ListContexts API
    • Enhanced context binding and persistence
    • Better error handling for context operations
  • Quality Assurance Infrastructure: Automated quality check scripts for all languages
    • One-click quality check scripts (quality-check.sh) for Python, TypeScript, and Golang
    • Comprehensive linting, formatting, and security scanning
    • Automated unit and integration testing
    • CI/CD integration support
  • Documentation & Examples: Comprehensive documentation and example improvements
    • Agent module documentation and examples for all languages
    • VPC session tutorials and API reference
    • Browser automation architecture documentation
    • AI-powered web interactions tutorial
    • Updated API reference with new modules and interfaces
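
The interface-based architecture and mock support listed above can be sketched in Python with `typing.Protocol`. The names `FileSystemLike`, `InMemoryFileSystem`, and `save_greeting` are hypothetical illustrations; the SDK's own `FileSystemInterface` may have different methods and return types.

```python
from typing import Dict, Protocol


class FileSystemLike(Protocol):
    """Hypothetical interface; the SDK's FileSystemInterface may differ."""

    def write_file(self, path: str, content: str) -> None: ...

    def read_file(self, path: str) -> str: ...


class InMemoryFileSystem:
    """Test double that satisfies FileSystemLike without any network access."""

    def __init__(self) -> None:
        self._files: Dict[str, str] = {}

    def write_file(self, path: str, content: str) -> None:
        self._files[path] = content

    def read_file(self, path: str) -> str:
        return self._files[path]


def save_greeting(fs: FileSystemLike, path: str) -> str:
    """Depends only on the interface, so a real or fake implementation works."""
    fs.write_file(path, "Hello World")
    return fs.read_file(path)


greeting = save_greeting(InMemoryFileSystem(), "/tmp/test.txt")
```

Because `save_greeting` is written against the interface, unit tests can pass in the in-memory double while production code passes in a real client.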

Changed

  • Architecture Refactoring: Major structural improvements for better maintainability
    • Interfaces-first design with dependency injection
    • Removal of circular dependencies between modules
    • Unified MCP tool calling across all components
  • Session Management: Enhanced session capabilities and VPC support
    • VPC session detection and handling
    • Network interface IP and HTTP port management for VPC sessions
    • Improved session lifecycle management
  • Error Handling: Improved error handling and logging across all SDKs
    • Better error messages for VPC configuration issues
    • Enhanced debugging information with request IDs
    • Sanitized error reporting for security
  • Testing Infrastructure: Significantly expanded test coverage
    • Comprehensive unit tests for all modules
    • Integration tests for VPC sessions and browser automation
    • API-level comprehensive testing for command and filesystem operations
  • Module Separation: Code execution moved from Command to dedicated Code module
    • RunCode method moved from Command module to Code module
    • Clear separation of concerns between shell commands and code execution
    • Improved API consistency across all language implementations

Fixed

  • VPC Session Compatibility: Fixed issues with VPC-based tool calling
    • Proper HTTP endpoint construction for VPC sessions
    • Network configuration validation
    • Server discovery for VPC environments
  • Browser Context Persistence: Fixed browser data persistence across sessions
    • Cookie synchronization improvements
    • Context cleanup on session termination
    • Resource management for browser instances
  • Interface Consistency: Fixed inconsistencies across language implementations
    • Standardized method signatures across Python, TypeScript, and Golang
    • Consistent error handling patterns
    • Unified response structures

Security

  • Enhanced Security Scanning: Improved security measures across all SDKs
    • Dependency vulnerability scanning with pip-audit, npm audit, and govulncheck
    • Code security analysis with bandit, snyk, and gosec
    • Secure credential handling for VPC sessions
  • Network Isolation: VPC sessions provide enhanced security through network isolation
    • Private network environments for sensitive operations
    • Controlled access policies
    • Secure resource access patterns

[0.4.0] - 2025-07-18

Added

  • Enhanced Configuration: Support for setting SDK configuration via three methods (environment variables, config files, code parameters) in Python, Golang, and TypeScript.
  • API Response Improvements: All API responses now include a request_id field for better traceability and debugging.
  • Session Label Pagination: listByLabels now supports pagination parameters.
  • GetLink API: Added support for specifying protocol type and port when retrieving links.
  • OSS Persistence: Added support for persistent storage with OSS.
  • Ticket Parameter: Some APIs now support a ticket parameter.
  • Context Sync & Management: Introduced context manager and context sync APIs and implementations.
  • Automated Quality Scripts: Added one-click quality check scripts (lint/format/security scan) and multi-language automated testing.
  • Comprehensive Unit & Integration Tests: Significantly expanded and improved tests for TypeScript, Go, and Python SDKs.
  • Documentation Restructure: API reference is now split by language, with many new tutorials and examples.
  • Type/Interface Enhancements: Many new interfaces and type definitions for better IDE support across Golang, TypeScript, and Python.
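
The three configuration methods mentioned above (environment variables, config files, code parameters) need a precedence order when they overlap. The sketch below assumes that later sources win: defaults, then file, then environment, then code. That order, the JSON file format, and the `EXAMPLE_TIMEOUT_S` variable name are all assumptions for illustration, not the SDK's documented behavior.

```python
import json
import os
from typing import Any, Dict, Mapping, Optional

# 60 seconds mirrors the default timeout noted for this release.
DEFAULTS: Dict[str, Any] = {"timeout_s": 60}


def load_config(
    config_path: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    **overrides: Any,
) -> Dict[str, Any]:
    """Merge settings; later sources win: defaults < file < environment < code."""
    env = os.environ if env is None else env
    config = dict(DEFAULTS)

    # 1. Config file (JSON here purely for illustration).
    if config_path and os.path.exists(config_path):
        with open(config_path, encoding="utf-8") as fh:
            config.update(json.load(fh))

    # 2. Environment variable (hypothetical name).
    if "EXAMPLE_TIMEOUT_S" in env:
        config["timeout_s"] = int(env["EXAMPLE_TIMEOUT_S"])

    # 3. Explicit code parameters, ignoring any left as None.
    config.update({k: v for k, v in overrides.items() if v is not None})
    return config


# Example: the code parameter overrides the environment value.
example = load_config(env={"EXAMPLE_TIMEOUT_S": "30"}, timeout_s=5)
```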

Changed

  • API Compatibility: Standardized some API parameters and response formats.
  • Error Handling: Improved error handling and logging across SDKs.
  • Default Timeout: Default timeout changed to 60 seconds.
  • Documentation: Major updates to README, Getting Started, API Reference, and Tutorials.
  • Code Structure: Refactored directory structure; examples, tests, and scripts are now modularized.

Fixed

  • Session Deletion/Management: Fixed issues with session deletion and state management.
  • File System: Fixed issues with large file chunking and read/write consistency.
  • Unit Tests: Fixed compatibility and edge cases in multi-language unit tests.
  • CI/CD: Fixed cross-platform line endings and environment variable loading issues.
  • API Responses: Fixed incomplete response structures in some APIs.

[0.3.0] - 2025-06-16

Added

  • Session Labels: Added support for organizing and filtering sessions with labels
    • Set and get session labels
    • List sessions by label filters
  • Code Execution: Added support for running code in multiple languages (Python, JavaScript, etc.)
  • Window Management:
    • Added support for listing, activating, and manipulating windows
    • Window resizing, focusing, and positioning
    • Get window properties and child windows
  • OSS Integration: Added Object Storage Service functionality
    • File upload/download to/from OSS buckets
    • Anonymous upload/download via URLs
    • OSS environment initialization
  • Context Management:
    • Create, list, and delete contexts
    • Bind sessions to contexts for persistence
    • Get context information
  • UI Enhancement:
    • Added support for screenshots
    • UI element detection and interaction
    • Mobile-specific operations (click, swipe, send keys)
  • Enhanced Image ID Support: Added ability to specify custom image IDs when creating sessions
  • Application Management: Added support for listing, launching, and stopping applications
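
Listing sessions by label filters, as described above, amounts to matching key/value pairs. The sketch below works on plain dictionaries and assumes a session matches only when every requested pair is present; the function names and the AND semantics are hypothetical, not taken from the SDK.

```python
from typing import Dict, List


def match_labels(session_labels: Dict[str, str], wanted: Dict[str, str]) -> bool:
    """True when every wanted key/value pair is present on the session."""
    return all(session_labels.get(key) == value for key, value in wanted.items())


def filter_sessions(sessions: Dict[str, Dict[str, str]], wanted: Dict[str, str]) -> List[str]:
    """Return IDs of sessions whose labels contain all wanted pairs."""
    return [sid for sid, labels in sessions.items() if match_labels(labels, wanted)]


# Example data: session ID mapped to its labels.
sessions = {
    "s-1": {"env": "dev", "team": "qa"},
    "s-2": {"env": "prod", "team": "qa"},
    "s-3": {"env": "dev"},
}
dev_qa = filter_sessions(sessions, {"env": "dev", "team": "qa"})
dev_only = filter_sessions(sessions, {"env": "dev"})
```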

Changed

  • Updated API client to support the latest AgentBay backend features
  • Improved error handling and reporting across all SDK components
  • Enhanced documentation with examples for new features
  • Enhanced TypeScript type definitions for better IDE support
  • Standardized response formats across all operations

Fixed

  • Various bug fixes and performance improvements
  • Type compatibility issues in filesystem operations
  • Session management edge cases
  • Command execution timeouts
  • File reading/writing inconsistencies

[0.1.0] - 2025-05-15

Added

  • Core SDK Implementation: Initial implementation for Python, TypeScript, and Golang
  • Session Management:
    • Create, list, and delete sessions
    • Get session information
  • Command Execution:
    • Execute basic shell commands
  • Basic File Operations:
    • Read and write files
    • Create and delete directories
    • List directory contents
  • Authentication: API key-based authentication
  • Configuration: Environment-based configuration
  • Documentation: Initial API reference and examples
