agent-android

Android ADB automation CLI tool - Designed for AI Agents

agent-android is an ADB-based CLI tool for automating Android devices. It follows the design philosophy of agent-browser and gives AI Agents a simple but capable interface for Android automation.

✨ Features

  • 🤖 AI Friendly - Snapshot + Ref mode, designed specifically for AI
  • 🎯 Simple & Intuitive - Natural language descriptions, no programming knowledge needed
  • ⚡ Fast & Efficient - Pure Python implementation, lightweight dependencies
  • 🔧 Flexible & Powerful - Dual interface: CLI + Python API
  • 🌍 Multi-language - Full Chinese natural language processing
  • 📦 Easy Integration - Simple API, easy to integrate into AI workflows
  • 🚀 Multi-device Support - Parallel operations on multiple Android devices
  • ⏱️ Smart Wait - Automatic element waiting, no manual sleep needed

🚀 Quick Start

Installation

Method 1: Install via PyPI (Recommended)

# Install from PyPI
pip install agent-android

# Verify installation
agent-android --version

Method 2: Install from Source

# Clone repository
git clone https://github.com/Fast2x/agent-android.git
cd agent-android

# Install dependencies
pip install -r requirements.txt

# Ensure ADB is available
adb devices

Requirements

  • Python 3.7+
  • ADB (Android Debug Bridge)
  • Android device or emulator
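
The ADB requirement can also be checked from Python before any automation runs. The following is a minimal sketch that relies only on the standard adb devices output format; parse_adb_devices and list_ready_devices are illustrative helper names, not part of agent-android:

```python
import shutil
import subprocess


def parse_adb_devices(output: str) -> dict:
    """Map serial -> state from the text printed by `adb devices`."""
    devices = {}
    for line in output.splitlines():
        line = line.strip()
        # Skip blanks, the "List of devices attached" header, and daemon notices.
        if not line or line.startswith("List of devices") or line.startswith("*"):
            continue
        parts = line.split()
        if len(parts) >= 2:
            devices[parts[0]] = parts[1]
    return devices


def list_ready_devices() -> list:
    """Return serials in the `device` state, or [] if adb is not on PATH."""
    if shutil.which("adb") is None:
        return []
    result = subprocess.run(
        ["adb", "devices"], capture_output=True, text=True, check=False
    )
    states = parse_adb_devices(result.stdout)
    return [serial for serial, state in states.items() if state == "device"]


# Hypothetical sample output, used here so the parser can be tried offline.
sample = "List of devices attached\nemulator-5554\tdevice\nR58M123ABC\tunauthorized\n"
print(parse_adb_devices(sample))
```

A device listed as unauthorized usually still needs the USB debugging prompt accepted on the device itself.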

Verify Installation

# Check package info
pip show agent-android

# Test Python API (IMPORTANT: test outside project directory)
cd /tmp
python3 -c "from core.android import create_android_device; print('✅ Installation successful!')"

Note: The package installs its code as the top-level Python module core. Import it with from core.android import ..., not import agent_android.

Basic Usage

Linux / macOS (from a source checkout; after a pip install, run agent-android without the leading ./):

# Connect to device
./agent-android connect

# Start app
./agent-android start_app com.example.app

# Get UI snapshot (with element refs)
./agent-android snapshot -i

# Operate on elements using refs
./agent-android tap @e1

# Take screenshot
./agent-android screenshot screen.png

# Disconnect
./agent-android disconnect

Windows:

REM Use batch wrapper
agent-android.bat connect
agent-android.bat start_app com.example.app
agent-android.bat snapshot -i
agent-android.bat tap @e1

REM Or use Python directly
python agent-android connect

More Operations

# Swipe screen
./agent-android swipe 100,200 300,400

# Input text
./agent-android input "Hello World"

# Press home key
./agent-android press home

# Start/stop app
./agent-android start_app com.example.app
./agent-android stop_app com.example.app

# Get element text
./agent-android get text @e1

# Export UI dump
./agent-android dump ui_dump.xml
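
The exported dump can be inspected offline. The sketch below assumes the file follows the standard uiautomator XML layout (node elements carrying text and bounds="[left,top][right,bottom]" attributes); find_centers and the sample XML are illustrative and not part of agent-android:

```python
import re
import xml.etree.ElementTree as ET

# uiautomator encodes bounds as "[left,top][right,bottom]".
BOUNDS_RE = re.compile(r"\[(-?\d+),(-?\d+)\]\[(-?\d+),(-?\d+)\]")


def find_centers(xml_text: str, text_contains: str) -> list:
    """Return (text, x, y) for every node whose text contains the substring."""
    hits = []
    for node in ET.fromstring(xml_text).iter("node"):
        label = node.get("text", "")
        match = BOUNDS_RE.fullmatch(node.get("bounds", ""))
        if text_contains in label and match:
            left, top, right, bottom = map(int, match.groups())
            hits.append((label, (left + right) // 2, (top + bottom) // 2))
    return hits


# Hypothetical two-node dump standing in for ui_dump.xml.
sample_dump = """<hierarchy rotation="0">
  <node class="android.widget.FrameLayout" text="" bounds="[0,0][1080,1920]">
    <node class="android.widget.Button" text="Settings" bounds="[100,200][300,400]" />
  </node>
</hierarchy>"""

print(find_centers(sample_dump, "Settings"))
```

The returned coordinates are the kind of x,y pair a coordinate-based tap expects.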

📖 Core Features

1. Snapshot + Ref Mode (Core)

# Get UI snapshot, auto-generate element refs
./agent-android snapshot -i

# Output:
# - TextView "Learning" [id=com.app:id/tabTV] [ref=e1]
# - Button "Settings" [id=com.app:id/settings] [clickable] [ref=e2]
# - ImageView "Search" [id=com.app:id/search] [ref=e3]

# Use ref to tap
./agent-android tap @e2
./agent-android get text @e1

Advantages:

  • ✅ Deterministic - refs point to elements precisely
  • ✅ Fast - no need to reparse UI
  • ✅ AI friendly - snapshot provides complete context
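
An agent usually needs the snapshot as data rather than text. As a sketch, assuming the snapshot lines look exactly like the sample output above (the real format may carry more attributes), they can be turned into a ref lookup table; parse_snapshot is an illustrative helper, not part of agent-android:

```python
import re

# One snapshot line: - <Class> "<text>" [key=value] [flag] ...
LINE_RE = re.compile(r'^-\s+(?P<cls>\w+)\s+"(?P<text>[^"]*)"(?P<attrs>.*)$')
ATTR_RE = re.compile(r"\[([^\]=]+)(?:=([^\]]*))?\]")


def parse_snapshot(snapshot: str) -> dict:
    """Map "@ref" -> element description for every line that carries a ref."""
    refs = {}
    for raw in snapshot.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue
        attrs = {}
        for attr in ATTR_RE.finditer(match.group("attrs")):
            key, value = attr.group(1), attr.group(2)
            # Bare flags such as [clickable] become True.
            attrs[key] = True if value is None else value
        ref = attrs.pop("ref", None)
        if ref:
            refs["@" + ref] = {
                "class": match.group("cls"),
                "text": match.group("text"),
                **attrs,
            }
    return refs


snapshot_text = """
- TextView "Learning" [id=com.app:id/tabTV] [ref=e1]
- Button "Settings" [id=com.app:id/settings] [clickable] [ref=e2]
- ImageView "Search" [id=com.app:id/search] [ref=e3]
"""

refs = parse_snapshot(snapshot_text)
clickable = [ref for ref, element in refs.items() if element.get("clickable")]
print(clickable)
```

Filtering on the clickable flag gives the agent a short list of candidate targets for the tap command.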

2. Natural Language Control

from core.android import create_android_device
from core.nlp_icon_helper import NLPIconHelper

device = create_android_device()
nlp = NLPIconHelper(device)

# Use natural language to control
nlp.tap_by_nlp("Click settings button")
nlp.tap_by_nlp("Click menu icon in top right")
nlp.tap_by_nlp("Click learning tab at bottom")

device.close()

Supported Keywords:

  • Position: top-left, top-right, bottom, center, left, right
  • Type: icon, button, text, input
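
To illustrate how the keywords above might map onto a command, here is a hypothetical sketch. It does not reproduce how NLPIconHelper works internally; extract_hints and its matching rules are assumptions made for illustration only:

```python
# Keyword lists copied from the "Supported Keywords" section.
POSITIONS = ["top-left", "top-right", "bottom", "center", "left", "right"]
TYPES = ["icon", "button", "text", "input"]


def extract_hints(command: str) -> dict:
    """Pick the first position keyword and the first type keyword in a command."""
    normalized = (
        command.lower()
        .replace("top left", "top-left")
        .replace("top right", "top-right")
    )
    words = normalized.split()
    position = next((p for p in POSITIONS if p in words), None)
    kind = next((t for t in TYPES if t in words), None)
    return {"position": position, "type": kind}


for command in (
    "Click settings button",
    "Click menu icon in top right",
    "Click learning tab at bottom",
):
    print(command, "->", extract_hints(command))
```

Matching on whole words keeps "top-right" from also counting as "right".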

3. Python API

from core.android import create_android_device

device = create_android_device()

# App management
device.start_app("com.example.app")
device.stop_app("com.example.app")

# Touch operations
device.tap(500, 1000)
device.swipe(500, 1000, 500, 500)

# Input operations
device.input_text("Hello World")
device.press_home()
device.press_back()

# Element finding
element = device.find_element({
    "strategy": "text_contains",
    "value": "Settings"
})
device.tap(element['center']['x'], element['center']['y'])

# Screenshot
device.screenshot("screen.png")

device.close()

4. Multi-Device Management

# Connect all devices
./agent-android multi-connect

# List connected devices
./agent-android multi-list

# Parallel screenshot all devices
./agent-android multi-screenshot

# Parallel tap on all devices
./agent-android multi-tap 500 1000

# Parallel start app on all devices
./agent-android multi-start-app com.example.app

# Disconnect all devices
./agent-android multi-disconnect
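
The parallel fan-out behind commands like these can be sketched with the standard library. This is not agent-android's implementation; run_on_all is an illustrative helper, and fake_screenshot stands in for a real per-device action:

```python
from concurrent.futures import ThreadPoolExecutor


def run_on_all(serials, action, max_workers=8):
    """Run action(serial) concurrently; return {serial: result or exception}."""
    results = {}
    if not serials:
        return results
    with ThreadPoolExecutor(max_workers=min(max_workers, len(serials))) as pool:
        futures = {serial: pool.submit(action, serial) for serial in serials}
        for serial, future in futures.items():
            try:
                results[serial] = future.result()
            except Exception as exc:
                # Record the failure so one bad device does not hide the others.
                results[serial] = exc
    return results


def fake_screenshot(serial):
    """Stand-in action; a real one would talk to the device with this serial."""
    if serial == "offline-1":
        raise RuntimeError("device offline")
    return f"screenshot_{serial}.png"


outcome = run_on_all(["emulator-5554", "emulator-5556", "offline-1"], fake_screenshot)
for serial, result in outcome.items():
    print(serial, "->", result)
```

Collecting exceptions per device, rather than raising on the first failure, lets a test run report which devices succeeded.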

5. Smart Wait

# Wait for element to appear
./agent-android wait-for id com.app:id/button

# Wait for text (10 second timeout)
./agent-android wait-for-text "Welcome" --timeout 10000

# Wait for app to start
./agent-android wait-for-app com.example.app
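
The idea behind these wait commands is poll-until-true with a deadline. A minimal sketch of that pattern, with the timeout in milliseconds as in the example above; wait_until is an illustrative helper, and text_visible stands in for a real UI check:

```python
import time


def wait_until(condition, timeout_ms=10000, interval_ms=500):
    """Poll condition() until it returns a truthy value or the timeout elapses."""
    deadline = time.monotonic() + timeout_ms / 1000
    while True:
        value = condition()
        if value:
            return value
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval_ms / 1000)


attempts = {"count": 0}


def text_visible():
    """Stand-in check that succeeds on the third poll."""
    attempts["count"] += 1
    return "Welcome" if attempts["count"] >= 3 else None


found = wait_until(text_visible, timeout_ms=2000, interval_ms=10)
print(found, attempts["count"])
```

Using a monotonic clock keeps the timeout correct even if the system time changes while waiting.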

📚 Command Reference

Device Management

agent-android devices                     # List devices
agent-android connect [--serial <id>]     # Connect to device
agent-android disconnect                  # Disconnect

Touch Operations

agent-android tap <selector>              # Tap element ref or coordinates
agent-android swipe <start> <end>         # Swipe screen

Input Operations

agent-android input <text>                # Input text
agent-android press <key>                 # Press key (home/back/enter)

Screenshot & Snapshot

agent-android screenshot [path]           # Take screenshot
agent-android snapshot [-i]               # UI snapshot (-i: interactive elements only)

App Management

agent-android start_app <package>         # Start app
agent-android stop_app <package>          # Stop app

Smart Wait

agent-android wait-for <strategy> <value> # Wait for element
agent-android wait-for-text <text>        # Wait for text (--timeout in ms)
agent-android wait-for-app <package>      # Wait for app start

Multi-Device

agent-android multi-connect [--max <n>]   # Connect all devices
agent-android multi-list                  # List connected devices
agent-android multi-screenshot [path]     # Screenshot all devices
agent-android multi-tap <x> <y>           # Tap all devices
agent-android multi-start-app <package>   # Start app on all devices
agent-android multi-disconnect            # Disconnect all devices

Global Options

--session <name>                         # Session name
--json                                   # JSON output
--debug                                  # Debug mode

🎯 Use Cases

1. AI Agents Control Android Devices

# AI can use natural language directly
# (nlp is the NLPIconHelper instance from the Natural Language Control section)
nlp.tap_by_nlp("Click settings button")
nlp.tap_by_nlp("Click back button")

2. UI Automation Testing

# Automated testing workflow
# (device comes from create_android_device(), as in the Python API section)
import time

device.start_app("com.example.app")
time.sleep(2)  # give the app time to launch

element = device.find_element({"strategy": "text", "value": "Login"})
device.tap_element(element)
device.input_text("test@example.com")
device.press_enter()

# Verify results
assert device.wait_for_text("Welcome", timeout=5000)

3. Data Collection

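# Collect app data
# (device comes from create_android_device(), as in the Python API section)
import os
import time

os.makedirs("data", exist_ok=True)  # screenshots are written here
device.start_app("com.example.app")

while True:
    # Take screenshot
    device.screenshot(f"data/screenshot_{int(time.time())}.png")

    # Get UI data
    ui_dump = device.get_ui_dump()

    # Check if done
    if device.find_element({"strategy": "text", "value": "Complete"}):
        break

    # Pause between samples to avoid a busy loop and duplicate file names
    time.sleep(1)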

4. Task Automation

# Automate repetitive tasks
device.start_app("com.example.app")

# Login
device.tap(500, 1000)
device.input_text("username")
device.tap(500, 1200)
device.input_text("password")
device.tap(500, 1400)

# Navigate through app
device.swipe(500, 2000, 500, 500)

5. Multi-Device Testing

from core.multi_device import create_multi_device_manager

manager = create_multi_device_manager()
manager.connect_all()

# Test app on all devices
manager.parallel_start_app("com.example.app")
manager.parallel_screenshot("test_{device_id}.png")
manager.parallel_tap(500, 1000)

manager.disconnect_all()

📄 License

Apache License 2.0 - see LICENSE file

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

📞 Support

For issues and questions, see the project repository: https://github.com/Fast2x/agent-android

🎉 Summary

agent-android is a complete, production-ready Android automation tool designed for AI Agents.

Core Highlights

  1. Snapshot + Ref Mode - Matches agent-browser design
  2. Natural Language Control - Describe the target element in plain language
  3. Multi-Device Support - Parallel execution across connected devices
  4. Smart Wait - Automatic element waiting
  5. Production Ready - All features tested
  6. Open Source - Apache 2.0 license
  7. Cross-Platform - Windows | Linux | macOS
  8. PyPI Published - Easy installation via pip

Installation

Via PyPI (Recommended):

pip install agent-android

From Source:

git clone https://github.com/Fast2x/agent-android.git
cd agent-android
./agent-android connect

PyPI Package: https://pypi.org/project/agent-android/

Made with ❤️ for AI Agents

Download files

Source Distribution

agent_android-1.4.0.tar.gz (61.9 kB)

  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.0
  • SHA256: de3cbdb767c7323549f023f7a089770b9b51e1b75321b0c8ac31241705b86764
  • MD5: 4f9347199e550550a6021fe2f4119ef0
  • BLAKE2b-256: 251c3055607923f6352aaa7a0d60c5422baa746e9341b9bce43ef57979677904

Built Distribution

agent_android-1.4.0-py3-none-any.whl (50.8 kB)

  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.0
  • SHA256: 80d8c9e076726f9ceac5fcc0aaa1e077899f8b2b9ea2b2667b94f68e62590fbd
  • MD5: 5ddb7b0c6b21f226fc29ca987ec8bbec
  • BLAKE2b-256: 1be897bf92a34eb36a4786940af3e720db2f3175e51b66105f107a51b971cab7
