AI-driven mobile test automation framework with Appium MCP server, Page Object Model, and AWS Bedrock integration

Appium MCP - AI-Driven Mobile Test Automation

Professional mobile test automation framework with AI-powered test generation, Page Object Model support, and AWS Bedrock integration.

Supported Platforms: iOS | Android
Python Version: 3.8+
License: MIT

🚀 Quick Start (3 Steps)

1. Install the Package

pip install appium-mcp

2. Start Appium Server (in a separate terminal)

# Install Appium globally (one-time)
npm install -g appium

# Start Appium server
appium --address localhost --port 4723

3. Run Interactive Chatbot

appium-mcp-chatbot

✨ Features

  • Interactive Chatbot: Walk through your app testing in natural language
  • AI-Generated Tests: Automatically generate test code from your interactions
  • Page Object Model: Auto-generate clean, reusable page objects
  • YAML Workflows: Define test flows in simple YAML, AI handles execution
  • Multi-Platform: iOS and Android support
  • Element Discovery: Automatic element locator generation
  • Screenshot Capture: Automatic screenshots at each step
  • AWS Integration: Use Claude AI via Bedrock for test generation (optional)

📋 Prerequisites

Before you start, ensure you have:

  1. Python 3.8 or higher

    python --version
    
  2. Node.js 14+ and npm (for Appium)

    node --version
    npm --version
    

📦 Installation

From PyPI (Recommended)

# Create virtual environment
python -m venv myenv
source myenv/bin/activate  # On Windows: myenv\Scripts\activate

# Install appium-mcp
pip install appium-mcp

From Source (Development)

git clone https://github.com/youcanautomate-yca/ai-driven-mobile-automation.git
cd ai-driven-mobile-automation
python -m venv venv
source venv/bin/activate
pip install -e .

🔧 Setup Appium Server

Appium server must be running for appium-mcp to work.

Installation

# Install globally (one-time)
npm install -g appium

# Install drivers
appium driver install xcuitest   # iOS
appium driver install uiautomator2  # Android

Start Server

# Terminal 1: Start Appium (runs on port 4723)
appium

# You should see:
# [Appium] Welcome to Appium v2.x.x
# ...
# [Appium] Server listening on http://127.0.0.1:4723

🎯 CLI Commands

Once installed, you have access to these commands:

Interactive Chatbot (Easy!)

appium-mcp-chatbot

Perfect for beginners - guides you through test generation step-by-step.

Run YAML Workflows

appium-mcp-run-yaml my_workflow.yaml

Execute predefined test workflows from YAML files.

Generate Tests

appium-mcp-generate-tests workflow.json

Auto-generate test scripts from recorded interactions.

Start MCP Server

appium-mcp-server

Start the Model Context Protocol server for integration.

📝 YAML Workflow Examples

Example 1: Simple Login Test

Create a file login_test.yaml:

version: "1.0"
description: "Login test"
platform: "ios"
device_name: "iPhone 14"
bundle_id: "com.example.app"
app_path: "/path/to/app.app"

workflow:
  LoginScreen:
    - "Take a screenshot to see the app"
    - "Tap on the email field"
    - "Type user@example.com"
    - "Tap on the password field"
    - "Type MyPassword123"
    - "Tap the Sign In button"
    - "Wait 2 seconds for login to complete"
    
  HomeScreen:
    - "Take a screenshot to verify login success"
    - "Verify that Welcome message is visible"

Run it:

appium-mcp-run-yaml login_test.yaml

Example 2: E-Commerce Purchase Flow

Create purchase_flow.yaml:

version: "1.0"
description: "Complete purchase flow"
platform: "android"
device_name: "emulator"
app_package: "com.myapp"
app_activity: ".MainActivity"

workflow:
  HomePage:
    - "Take screenshot"
    - "Scroll down to see products"
    - "Tap on first product"
    
  ProductPage:
    - "Take screenshot"
    - "Tap on size selector"
    - "Choose size M"
    - "Tap Add to Cart button"
    - "Verify item added message"
    
  CartPage:
    - "Tap on shopping cart icon"
    - "Take screenshot"
    - "Tap Checkout button"
    
  CheckoutPage:
    - "Enter shipping address"
    - "Enter payment details"
    - "Tap Place Order"
    - "Verify order confirmation"

Run it:

appium-mcp-run-yaml purchase_flow.yaml

YAML File Structure

version: "1.0"                    # Workflow file version (required)
description: "Test description"   # What this test does
platform: "ios"                   # Target platform: "ios" or "android" (required)
device_name: "iPhone 14"         # Device or simulator name
bundle_id: "com.example.app"     # iOS bundle ID (iOS only)
app_package: "com.example.app"   # Android package (Android only)
app_activity: ".MainActivity"    # Android activity (Android only)
app_path: "/path/to/app.app"     # Path to app binary

workflow:                         # Test steps
  ScreenName:                     # Group steps by screen
    - "Natural language prompt"   # AI executes this
    - "Another step"
    - "..."
  
  NextScreen:
    - "Step 1"
    - "Step 2"
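
The structure above can be checked before a run. The following is a minimal sketch, not the package's own validation: it takes the already-parsed document as a plain dict (for example, the result of loading the YAML file) and returns a list of problems. The function name `validate_workflow` is hypothetical, and treating a missing `workflow` section as an error is an assumption; only `version` and `platform` are marked required above.

```python
from typing import Any, Dict, List

VALID_PLATFORMS = ("ios", "android")


def validate_workflow(doc: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a parsed workflow document."""
    errors: List[str] = []

    # 'version' and 'platform' are the keys marked "required" in the structure.
    for key in ("version", "platform"):
        if key not in doc:
            errors.append(f"missing required key: {key}")

    platform = doc.get("platform")
    if platform is not None and platform not in VALID_PLATFORMS:
        errors.append(f"platform must be 'ios' or 'android', got {platform!r}")

    # Assumption: a file without any workflow steps is reported as an error.
    workflow = doc.get("workflow")
    if not isinstance(workflow, dict) or not workflow:
        errors.append("workflow must map screen names to lists of prompts")
        return errors

    # Each screen groups a list of natural-language prompts (strings).
    for screen, steps in workflow.items():
        if not isinstance(steps, list) or not all(isinstance(s, str) for s in steps):
            errors.append(f"{screen}: steps must be a list of strings")

    return errors


example = {
    "version": "1.0",
    "platform": "ios",
    "workflow": {"LoginScreen": ["Take a screenshot", "Tap the Sign In button"]},
}
print(validate_workflow(example))  # prints []
```

Returning a list rather than raising on the first problem lets a caller show every issue in the file at once.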

How YAML Workflows Work

  1. Write prompts in natural language - No tool names needed
  2. AI inspects the current screen - Analyzes app UI
  3. AI finds the right elements - Uses element locators
  4. AI performs the action - Click, type, scroll, etc.
  5. Auto-generates page objects - Reusable code
  6. Auto-generates test code - Ready-to-run tests
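
The steps above amount to a loop over screens and prompts. Here is a rough sketch of that loop, assuming nothing about the real implementation: `execute` is a hypothetical callback standing in for the inspect, locate, and act stages, and continuing after a failed step is a design choice made for this example only.

```python
from typing import Callable, Dict, Iterator, List, Tuple

Workflow = Dict[str, List[str]]


def iter_steps(workflow: Workflow) -> Iterator[Tuple[str, int, str]]:
    """Yield (screen, step_number, prompt) in file order."""
    for screen, prompts in workflow.items():
        for number, prompt in enumerate(prompts, start=1):
            yield screen, number, prompt


def run_workflow(workflow: Workflow, execute: Callable[[str, str], str]) -> List[dict]:
    """Run every prompt through `execute` and record one result per step.

    `execute(screen, prompt)` is a hypothetical hook; it returns a short
    outcome string or raises if the step could not be performed.
    """
    results = []
    for screen, number, prompt in iter_steps(workflow):
        try:
            detail = execute(screen, prompt)
            status = "success"
        except Exception as exc:  # keep going so later steps are still reported
            detail = str(exc)
            status = "error"
        results.append(
            {"screen": screen, "step": number, "status": status, "detail": detail}
        )
    return results
```

Passing the executor in as a parameter keeps the loop testable without a device: a stub function can stand in for the AI-driven part.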

Example: Write one line:

"Tap on the email field and type test@example.com"

AI generates Python code:

class LoginPage:
    def enter_email(self, email):
        self.find_element("email_field").send_keys(email)

💻 Full Example: Running a Test

Step 1: Create YAML file

Save as test_app.yaml:

version: "1.0"
description: "Simple app test"
platform: "ios"
device_name: "iPhone 14"
bundle_id: "com.testapp"
app_path: "./app/TestApp.app"

workflow:
  Start:
    - "Take a screenshot"
    - "Tap the start button"
    - "Wait 1 second"
    - "Take final screenshot"

Step 2: Start Appium (Terminal 1)

appium

Step 3: Run the test (Terminal 2)

cd my_project
source venv/bin/activate
appium-mcp-run-yaml test_app.yaml

Step 4: View results

  • Generated test: generated_tests/test_app.py
  • Page objects: page_objects/StartPage.py
  • Screenshots: screenshots/

🤖 Python API Usage

Use appium-mcp as a Python library:

from appium_mcp import MobileAutomationFramework
from appium.options.ios import XCUITestOptions

# Initialize framework
framework = MobileAutomationFramework()

# Create options
options = XCUITestOptions()
options.device_name = "iPhone 14"
options.bundle_id = "com.example.app"

# Create session
driver = framework.create_session(options)

# Use driver like standard Appium
driver.find_element("xpath", "//XCUIElementTypeButton[@name='Login']").click()

# Generate test code from session
test_code = framework.generate_test("my_test")
print(test_code)

# Cleanup
driver.quit()

🆘 Troubleshooting

"Appium server not responding"

# Make sure Appium is running
appium

# Check if running on correct port
curl http://localhost:4723/status
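
The same check can be done from Python before starting a run. This is a small sketch using only the standard library; it assumes the server answers `/status` with the W3C WebDriver shape (`{"value": {"ready": true, ...}}`), and the helper names `is_ready` and `appium_is_up` are made up for this example.

```python
import json
import urllib.error
import urllib.request


def is_ready(payload) -> bool:
    """True when a /status payload reports a ready server."""
    if not isinstance(payload, dict):
        return False
    value = payload.get("value")
    return isinstance(value, dict) and value.get("ready") is True


def appium_is_up(url: str = "http://localhost:4723/status", timeout: float = 3.0) -> bool:
    """Fetch the status endpoint; any connection or parse failure counts as down."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return is_ready(json.loads(response.read().decode("utf-8")))
    except (urllib.error.URLError, OSError, ValueError):
        return False
```

Calling `appium_is_up()` returns `False` rather than raising when nothing is listening on the port, so it can guard a test run with a plain `if`.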

"Connection refused"

# Restart Appium
appium --port 4723

# Verify config
appium-mcp-run-yaml --help

"Element not found"

  • Take a screenshot first: "Take a screenshot"
  • Check the actual element name in the app
  • Use more specific natural language: "Tap on the blue Login button"

Import errors after installation

# Reinstall in clean environment
python -m venv fresh_env
source fresh_env/bin/activate
pip install appium-mcp

📄 License

MIT License - see LICENSE file for details

👥 Contributing

Contributions welcome! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Submit a pull request

Architecture

The server is organized into modular components:

ai-driven-mobile-automation/
├── server.py              # Main MCP server and tool registration
├── session_store.py       # Global driver and session management
├── command.py             # Core Appium command execution
├── logger.py              # Logging utilities
├── tools_session.py       # Session management tools
├── tools_interactions.py  # Element interaction tools
├── tools_navigations.py   # Navigation tools
├── tools_app_management.py # App management tools
├── tools_context.py       # Context switching tools
├── tools_ios.py           # iOS-specific tools
├── tools_test_generation.py # Test generation tools
├── tools_documentation.py  # Documentation/help tools
├── requirements.txt       # Python dependencies
└── README.md             # This file

Mirrored Tools from TypeScript Implementation

Session Management

  • select_platform - Select iOS or Android platform
  • select_device - Select target device
  • create_session - Create new Appium session
  • delete_session - Delete active session
  • open_notifications - Open notifications panel

Element Interactions

  • appium_click - Click on element
  • appium_find_element - Find element by strategy/selector
  • appium_double_tap - Double tap element
  • appium_long_press - Long press element
  • appium_drag_and_drop - Drag element to target
  • appium_press_key - Press key
  • appium_set_value - Set text value
  • appium_get_text - Get element text
  • appium_get_active_element - Get focused element
  • appium_screenshot - Take screenshot
  • appium_element_screenshot - Screenshot of element
  • appium_get_orientation - Get device orientation
  • appium_set_orientation - Set device orientation
  • appium_handle_alert - Handle alert dialogs

Navigation

  • appium_scroll - Scroll up/down/left/right
  • appium_scroll_to_element - Scroll until element visible
  • appium_swipe - Perform swipe gesture

App Management

  • appium_activate_app - Activate installed app
  • appium_install_app - Install app
  • appium_uninstall_app - Uninstall app
  • appium_terminate_app - Terminate running app
  • appium_list_apps - List installed apps
  • appium_is_app_installed - Check if app installed
  • appium_deep_link - Open deep link

Context Management

  • appium_get_contexts - Get available contexts
  • appium_switch_context - Switch between contexts

iOS Tools

  • appium_boot_simulator - Boot iOS simulator
  • appium_setup_wda - Setup WebDriverAgent
  • appium_install_wda - Install WebDriverAgent

Test Generation

  • appium_generate_locators - Generate element locators
  • appium_generate_tests - Generate test scripts

Documentation

  • appium_answer_appium - Answer Appium questions

YAML Workflow Automation

Simple, clean YAML workflows. Define screen names, write prompts, let AI handle everything else!

Quick Example

Create workflow.yml:

version: "1.0"
description: "Login and purchase"
platform: "ios"
device_name: "iPhone 16"
bundle_id: "com.example.ecommerce"

# Screen-based workflow - group prompts by screen
workflow:
  LoginScreen:
    - "Enter email user@example.com"
    - "Enter password mypassword"
    - "Click login button"
  
  HomeScreen:
    - "Click first product"
    - "View product details"
  
  CartScreen:
    - "Add item to cart"
    - "Click checkout"
  
  CheckoutScreen:
    - "Enter shipping address"
    - "Enter payment details"
    - "Complete purchase"
  
  OrderConfirmationScreen:
    - "Take screenshot"

Run Workflow

appium-mcp-run-yaml workflow.yml

What Happens Automatically

For each prompt:

  1. ✅ AI inspects the current screen
  2. ✅ Analyzes page source to find elements
  3. ✅ Determines the right element to interact with
  4. ✅ Performs the action (click, type, scroll, etc.)
  5. ✅ Creates page object with discovered elements
  6. ✅ Generates test code with reusable methods

Auto-Generated Page Objects

From this YAML:

LoginScreen:
  - "Enter email user@example.com"
  - "Enter password mypassword"
  - "Click login button"

AI generates:

class LoginScreen:
    def login(self, email, password):
        self.email_field.send_keys(email)
        self.password_field.send_keys(password)
        self.login_button.click()
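
The generated class above refers to `self.email_field` and friends without showing where they come from. One way such a page object could resolve its elements is sketched below. The locator values are hypothetical placeholders, and the only driver call assumed is `find_element(by, value)`, as on a standard Appium driver. `FakeDriver` exists only so the example runs without a device.

```python
class LoginScreen:
    # Hypothetical locators; real values would come from element discovery.
    LOCATORS = {
        "email_field": ("accessibility id", "email"),
        "password_field": ("accessibility id", "password"),
        "login_button": ("accessibility id", "login"),
    }

    def __init__(self, driver):
        self.driver = driver

    def _find(self, name):
        """Look up an element by its logical name."""
        by, value = self.LOCATORS[name]
        return self.driver.find_element(by, value)

    def login(self, email, password):
        self._find("email_field").send_keys(email)
        self._find("password_field").send_keys(password)
        self._find("login_button").click()


class FakeElement:
    """Stand-in element that records what was done to it."""

    def __init__(self, log, value):
        self.log = log
        self.value = value

    def send_keys(self, text):
        self.log.append(("send_keys", self.value, text))

    def click(self):
        self.log.append(("click", self.value))


class FakeDriver:
    """Stand-in driver exposing only find_element(by, value)."""

    def __init__(self):
        self.log = []

    def find_element(self, by, value):
        return FakeElement(self.log, value)


driver = FakeDriver()
LoginScreen(driver).login("user@example.com", "mypassword")
print(driver.log)
```

Keeping the locators in one table means a UI change touches a single dictionary entry rather than every method that uses the element.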

Why This Approach?

  • No Tool Names - Just write natural language
  • No Element Selectors - AI finds them automatically
  • No Manual Page Objects - Generated automatically
  • No ID/XPath Maintenance - AI updates as UI changes
  • Clean & Readable - Anyone can write the YAML

Advanced Features

  • Error Recovery - AI retries with different approaches
  • Screenshots - Automatic at each step
  • Element Waiting - AI waits for elements to appear
  • Scroll Handling - AI scrolls to find elements
  • Multi-Platform - iOS and Android

Usage Example

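import asyncio

# Uses the official MCP Python SDK (pip install mcp)
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    # Launch server.py as a subprocess and talk to it over stdio
    params = StdioServerParameters(command="python", args=["server.py"])

    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            # Complete the MCP handshake before calling tools
            await session.initialize()

            # Call a tool
            result = await session.call_tool(
                "select_platform",
                {"platform": "ios"},
            )
            print(result)

asyncio.run(main())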

Tool Schemas

Each tool includes:

  • name: Unique tool identifier
  • description: Human-readable description
  • inputSchema: JSON schema for parameters
  • execute: Async function that implements the tool
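
As an illustration of those four fields, a tool definition could look roughly like the sketch below. The `Tool` dataclass and the body of `_select_platform` are assumptions made for this example, not the code in `tools_session.py`; only the tool name, its description, and the `{"platform": "ios"}` argument shape come from this README.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Awaitable, Callable, Dict


@dataclass
class Tool:
    name: str                      # Unique tool identifier
    description: str               # Human-readable description
    inputSchema: Dict[str, Any]    # JSON schema for parameters
    execute: Callable[[Dict[str, Any]], Awaitable[Dict[str, Any]]]


async def _select_platform(args: Dict[str, Any]) -> Dict[str, Any]:
    # Hypothetical implementation: validate the argument and report a status.
    platform = args.get("platform")
    if platform not in ("ios", "android"):
        return {"status": "error", "message": f"Unsupported platform: {platform!r}"}
    return {"status": "success", "message": f"Platform set to {platform}"}


select_platform = Tool(
    name="select_platform",
    description="Select iOS or Android platform",
    inputSchema={
        "type": "object",
        "properties": {"platform": {"type": "string", "enum": ["ios", "android"]}},
        "required": ["platform"],
    },
    execute=_select_platform,
)

print(asyncio.run(select_platform.execute({"platform": "ios"})))
```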

Error Handling

All tools return a status JSON response:

{
    "status": "success|error|warning",
    "message": "Human-readable message",
    ...tool-specific fields
}
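
A small helper can keep every tool's response in that shape. This is a sketch under assumptions: `make_response` is a hypothetical name, and `session_id` in the usage line is an invented tool-specific field.

```python
import json
from typing import Any

ALLOWED_STATUSES = ("success", "error", "warning")


def make_response(status: str, message: str, **fields: Any) -> str:
    """Serialize a status response, rejecting unknown status values."""
    if status not in ALLOWED_STATUSES:
        raise ValueError(f"status must be one of {ALLOWED_STATUSES}, got {status!r}")
    body = {"status": status, "message": message}
    body.update(fields)  # tool-specific fields go alongside status and message
    return json.dumps(body)


print(make_response("success", "Session created", session_id="abc123"))
```

Rejecting unknown status values at construction time means a typo such as "sucess" fails in the tool rather than in whatever client parses the response.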

Logging

Logs are written to stderr with format:

[TOOL START] tool_name: arguments
[TOOL END] tool_name
[TOOL ERROR] tool_name: error message
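
One way to produce those three lines consistently is a decorator around each tool's async function. The sketch below is an assumption about how this could be wired, not the contents of `logger.py`; in particular, it does not print `[TOOL END]` after an error, which may differ from the real behavior.

```python
import asyncio
import functools
import sys


def logged(tool_name):
    """Wrap an async tool so its start, end, and errors are logged to stderr."""

    def decorator(func):
        @functools.wraps(func)
        async def wrapper(arguments):
            print(f"[TOOL START] {tool_name}: {arguments}", file=sys.stderr)
            try:
                result = await func(arguments)
            except Exception as exc:
                print(f"[TOOL ERROR] {tool_name}: {exc}", file=sys.stderr)
                raise
            print(f"[TOOL END] {tool_name}", file=sys.stderr)
            return result

        return wrapper

    return decorator


@logged("select_platform")
async def select_platform(arguments):
    # Hypothetical tool body used only to exercise the decorator.
    return {"status": "success", "message": f"Platform set to {arguments['platform']}"}


asyncio.run(select_platform({"platform": "ios"}))
```

Writing to stderr keeps stdout free for the MCP protocol traffic when the server runs over stdio.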

Differences from TypeScript Implementation

  1. Async/Await: The Python version uses async/await where the TypeScript version uses Promises
  2. Module Organization: Each tool group lives in a single Python file instead of separate TypeScript modules
  3. Error Handling: Python try/except blocks instead of TypeScript try/catch
  4. Type System: Uses Python type hints instead of TypeScript types

Contributing

To add new tools:

  1. Create a function in the appropriate tools_*.py file
  2. Register it in server.py in the register_tools() function
  3. Document the tool parameters and return values

License

Same as parent appium-mcp project.
