Android ADB automation CLI tool - Designed for AI Agents
agent-android
agent-android is an ADB-based Android device automation CLI tool that fully mirrors the design philosophy of agent-browser, providing powerful and simple Android automation capabilities for AI Agents.
✨ Features
- 🤖 AI Friendly - Snapshot + Ref mode, designed specifically for AI
- 🎯 Simple & Intuitive - Natural language descriptions, no programming knowledge needed
- ⚡ Fast & Efficient - Pure Python implementation, lightweight dependencies
- 🔧 Flexible & Powerful - Dual interface: CLI + Python API
- 🌍 Multi-language - Full Chinese natural language processing
- 📦 Easy Integration - Simple API, easy to integrate into AI workflows
- 🚀 Multi-device Support - Parallel operations on multiple Android devices
- ⏱️ Smart Wait - Automatic element waiting, no manual sleep needed
🚀 Quick Start
Installation
Method 1: Install via PyPI (Recommended)
# Install from PyPI
pip install agent-android
# Verify installation
agent-android --version
Method 2: Install from Source
# Clone repository
git clone https://github.com/Fast2x/agent-android.git
cd agent-android
# Install dependencies
pip install -r requirements.txt
# Ensure ADB is available
adb devices
Requirements
- Python 3.7+
- ADB (Android Debug Bridge)
- Android device or emulator
Verify Installation
# Check package info
pip show agent-android
# Test Python API (IMPORTANT: test outside project directory)
cd /tmp
python3 -c "from core.android import create_android_device; print('✅ Installation successful!')"
Note: The package installs a top-level Python module named core. Import it with from core.android import ...; import agent_android will not work.
Basic Usage
Linux / macOS (from a source checkout; after a pip install, run agent-android without the ./ prefix):
# Connect to device
./agent-android connect
# Start app
./agent-android start_app com.example.app
# Get UI snapshot (with element refs)
./agent-android snapshot -i
# Operate on elements using refs
./agent-android tap @e1
# Take screenshot
./agent-android screenshot screen.png
# Disconnect
./agent-android disconnect
Windows:
REM Use batch wrapper
agent-android.bat connect
agent-android.bat start_app com.example.app
agent-android.bat snapshot -i
agent-android.bat tap @e1
REM Or use Python directly
python agent-android connect
More Operations
# Swipe screen
./agent-android swipe 100,200 300,400
# Input text
./agent-android input "Hello World"
# Press home key
./agent-android press home
# Start/stop app
./agent-android start_app com.example.app
./agent-android stop_app com.example.app
# Get element text
./agent-android get text @e1
# Export UI dump
./agent-android dump ui_dump.xml
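The exported dump can be inspected with the Python standard library. The sketch below assumes the file has the layout produced by Android's `uiautomator dump`, where each `node` carries a `text` attribute and a `bounds` attribute of the form `[x1,y1][x2,y2]`. The helper names `center_of` and `find_text_contains` are illustrative and not part of agent-android.

```python
import re
import xml.etree.ElementTree as ET

# Assumed bounds format from uiautomator dumps: "[x1,y1][x2,y2]"
BOUNDS_RE = re.compile(r"\[(-?\d+),(-?\d+)\]\[(-?\d+),(-?\d+)\]")


def center_of(bounds):
    """Return the integer center (x, y) of a bounds string."""
    match = BOUNDS_RE.fullmatch(bounds)
    if match is None:
        raise ValueError(f"unexpected bounds format: {bounds!r}")
    x1, y1, x2, y2 = map(int, match.groups())
    return (x1 + x2) // 2, (y1 + y2) // 2


def find_text_contains(xml_text, needle):
    """Return the first node (in document order) whose text contains needle."""
    root = ET.fromstring(xml_text)
    for node in root.iter("node"):
        if needle in node.get("text", ""):
            return {"text": node.get("text"), "center": center_of(node.get("bounds"))}
    return None


# Hand-written sample in the assumed format, not real device output.
SAMPLE = """<hierarchy rotation="0">
  <node text="" class="android.widget.FrameLayout" bounds="[0,0][1080,1920]">
    <node text="Settings" class="android.widget.Button" bounds="[100,200][300,400]" />
  </node>
</hierarchy>"""

print(find_text_contains(SAMPLE, "Settings"))
# {'text': 'Settings', 'center': (200, 300)}
```

For a file on disk, `ET.parse("ui_dump.xml").getroot()` can replace the `ET.fromstring` call.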
📖 Core Features
1. Snapshot + Ref Mode (Core)
# Get UI snapshot, auto-generate element refs
./agent-android snapshot -i
# Output:
# - TextView "Learning" [id=com.app:id/tabTV] [ref=e1]
# - Button "Settings" [id=com.app:id/settings] [clickable] [ref=e2]
# - ImageView "Search" [id=com.app:id/search] [ref=e3]
# Use ref to tap
./agent-android tap @e2
./agent-android get text @e1
Advantages:
- ✅ Deterministic - refs point to elements precisely
- ✅ Fast - no need to reparse UI
- ✅ AI friendly - snapshot provides complete context
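An agent that consumes the plain-text snapshot needs to map each ref back to its element. The sketch below parses lines in the shape shown in the sample output above; that shape is an assumption based on this README alone, and the `--json` global option may offer a more robust format. `parse_snapshot` is a hypothetical helper, not part of agent-android.

```python
import re

# Assumed line shape, taken from the sample output in this README:
#   - Button "Settings" [id=com.app:id/settings] [clickable] [ref=e2]
LINE_RE = re.compile(r'^\s*-\s+(?P<cls>[\w.$]+)\s+"(?P<text>[^"]*)"(?P<attrs>.*)$')
ATTR_RE = re.compile(r"\[([^\]=]+)(?:=([^\]]*))?\]")


def parse_snapshot(output):
    """Return {ref: element_info} for every snapshot line that carries a ref."""
    elements = {}
    for line in output.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue
        attrs = {}
        for attr in ATTR_RE.finditer(match.group("attrs")):
            value = attr.group(2)
            # Bare flags such as [clickable] have no value; store them as True.
            attrs[attr.group(1)] = True if value is None else value
        ref = attrs.get("ref")
        if isinstance(ref, str):
            elements[ref] = {
                "class": match.group("cls"),
                "text": match.group("text"),
                "id": attrs.get("id"),
                "clickable": attrs.get("clickable") is True,
            }
    return elements


SAMPLE = '''- TextView "Learning" [id=com.app:id/tabTV] [ref=e1]
- Button "Settings" [id=com.app:id/settings] [clickable] [ref=e2]
- ImageView "Search" [id=com.app:id/search] [ref=e3]'''

elements = parse_snapshot(SAMPLE)
clickable_refs = [ref for ref, info in elements.items() if info["clickable"]]
print(clickable_refs)  # ['e2']
```

The resulting refs can then be passed to the CLI as `@e2`, as in the tap command above.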
2. Natural Language Control
from core.android import create_android_device
from core.nlp_icon_helper import NLPIconHelper
device = create_android_device()
nlp = NLPIconHelper(device)
# Use natural language to control
nlp.tap_by_nlp("Click settings button")
nlp.tap_by_nlp("Click menu icon in top right")
nlp.tap_by_nlp("Click learning tab at bottom")
device.close()
Supported Keywords:
- Position: top-left, top-right, bottom, center, left, right
- Type: icon, button, text, input
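To illustrate how the keywords listed above could be pulled out of a description, here is a minimal sketch. It is not the implementation of `NLPIconHelper`, whose internals are not described here; `extract_hints` is a hypothetical function, and it handles the English keywords only.

```python
import re

# Keyword lists copied from the "Supported Keywords" section.
POSITIONS = ("top-left", "top-right", "bottom", "center", "left", "right")
TYPES = ("icon", "button", "text", "input")


def extract_hints(description):
    """Return the first position and type keyword found in a description."""
    # Lowercase and keep alphabetic words only, so "top-right" and
    # "top right" both normalize to "top right".
    words = " ".join(re.findall(r"[a-z]+", description.lower()))
    padded = f" {words} "
    position = None
    # Check longer keywords first so "top right" wins over plain "right".
    for keyword in sorted(POSITIONS, key=len, reverse=True):
        if f" {keyword.replace('-', ' ')} " in padded:
            position = keyword
            break
    element_type = next((t for t in TYPES if f" {t} " in padded), None)
    return {"position": position, "type": element_type}


print(extract_hints("Click menu icon in top right"))
# {'position': 'top-right', 'type': 'icon'}
```

A real matcher would still need to combine these hints with the UI hierarchy, for example by filtering elements whose center falls in the named screen region.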
3. Python API
from core.android import create_android_device
device = create_android_device()
# App management
device.start_app("com.example.app")
device.stop_app("com.example.app")
# Touch operations
device.tap(500, 1000)
device.swipe(500, 1000, 500, 500)
# Input operations
device.input_text("Hello World")
device.press_home()
device.press_back()
# Element finding
element = device.find_element({
    "strategy": "text_contains",
    "value": "Settings"
})
# Only tap when a matching element was found
if element:
    device.tap(element['center']['x'], element['center']['y'])
# Screenshot
device.screenshot("screen.png")
device.close()
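If a step in the middle of a script raises, the final `device.close()` is never reached. A small sketch using `contextlib.closing` from the standard library shows one way to guarantee cleanup. `FakeDevice` stands in for the object returned by `create_android_device()` so the example runs without hardware; the sketch assumes only that the real device exposes `close()`, as shown above.

```python
from contextlib import closing


class FakeDevice:
    """Stand-in for a real device object; records calls instead of using ADB."""

    def __init__(self):
        self.closed = False
        self.taps = []

    def tap(self, x, y):
        self.taps.append((x, y))

    def close(self):
        self.closed = True


def run_session(device, actions):
    """Run actions(device) and always call device.close() afterwards."""
    with closing(device) as dev:
        actions(dev)


def failing_actions(dev):
    dev.tap(500, 1000)
    raise RuntimeError("element not found")


device = FakeDevice()
try:
    run_session(device, failing_actions)
except RuntimeError:
    pass

print(device.taps, device.closed)  # [(500, 1000)] True
```

With the real API the same pattern would read `with closing(create_android_device()) as device: ...`.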
4. Multi-Device Management
# Connect all devices
./agent-android multi-connect
# List connected devices
./agent-android multi-list
# Parallel screenshot all devices
./agent-android multi-screenshot
# Parallel tap on all devices
./agent-android multi-tap 500 1000
# Parallel start app on all devices
./agent-android multi-start-app com.example.app
# Disconnect all devices
./agent-android multi-disconnect
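The multi-* commands fan one operation out to every connected device. The sketch below shows the general idea with plain `adb` and a thread pool. It is an illustration under stated assumptions, not how agent-android is implemented: `parse_adb_devices`, `adb_tap`, and `run_on_all` are hypothetical helpers, and the sample text imitates `adb devices` output.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def parse_adb_devices(output):
    """Return serials of devices in the 'device' state from `adb devices` output."""
    serials = []
    for line in output.splitlines():
        parts = line.split()
        # Header and daemon lines never have "device" as their second field.
        if len(parts) >= 2 and parts[1] == "device":
            serials.append(parts[0])
    return serials


def adb_tap(serial, x, y):
    """Tap (x, y) on one device via `adb -s <serial> shell input tap`."""
    cmd = ["adb", "-s", serial, "shell", "input", "tap", str(x), str(y)]
    return subprocess.run(cmd, capture_output=True, text=True).returncode


def run_on_all(serials, action, max_workers=8):
    """Run action(serial) concurrently; map each serial to a result or exception."""
    if not serials:
        return {}
    results = {}
    with ThreadPoolExecutor(max_workers=min(max_workers, len(serials))) as pool:
        futures = {serial: pool.submit(action, serial) for serial in serials}
        for serial, future in futures.items():
            try:
                results[serial] = future.result()
            except Exception as exc:  # keep one failing device from hiding the rest
                results[serial] = exc
    return results


# Hand-written sample, not captured from a real host.
SAMPLE_OUTPUT = (
    "List of devices attached\n"
    "emulator-5554\tdevice\n"
    "R58M123ABC\tunauthorized\n"
    "192.168.1.5:5555\tdevice\n"
)

serials = parse_adb_devices(SAMPLE_OUTPUT)
print(serials)  # ['emulator-5554', '192.168.1.5:5555']

# With real devices attached: run_on_all(serials, lambda s: adb_tap(s, 500, 1000))
demo = run_on_all(serials, lambda s: f"ok:{s}")
```

Threads suit this workload because each task mostly waits on an `adb` subprocess rather than on Python computation.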
5. Smart Wait
# Wait for element to appear
./agent-android wait-for id com.app:id/button
# Wait for text (10 second timeout)
./agent-android wait-for-text "Welcome" --timeout 10000
# Wait for app to start
./agent-android wait-for-app com.example.app
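Waiting of this kind usually comes down to polling a condition until a deadline. The sketch below shows that pattern with a timeout in milliseconds, matching the unit used by `--timeout` above. `wait_until` is a hypothetical helper and the default values are assumptions, not agent-android's actual settings.

```python
import time


def wait_until(predicate, timeout_ms=5000, interval_ms=200):
    """Poll predicate() until it returns a truthy value or the timeout expires."""
    deadline = time.monotonic() + timeout_ms / 1000
    while True:
        if predicate():
            return True
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        time.sleep(min(interval_ms / 1000, remaining))


# Demo: the condition becomes true on the third check.
checks = {"count": 0}


def element_visible():
    checks["count"] += 1
    return checks["count"] >= 3


print(wait_until(element_visible, timeout_ms=2000, interval_ms=10))  # True
```

Using `time.monotonic()` rather than `time.time()` keeps the deadline correct if the system clock is adjusted while waiting.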
📚 Command Reference
Device Management
agent-android devices # List devices
agent-android connect [--serial <id>] # Connect to device
agent-android disconnect # Disconnect
Touch Operations
agent-android tap <selector> # Tap element ref or coordinates
agent-android swipe <start> <end> # Swipe screen
Input Operations
agent-android input <text> # Input text
agent-android press <key> # Press key (home/back/enter)
Screenshot & Snapshot
agent-android screenshot [path] # Take screenshot
agent-android snapshot [-i] # UI snapshot (-i: interactive elements only)
App Management
agent-android start_app <package> # Start app
agent-android stop_app <package> # Stop app
Smart Wait
agent-android wait-for <strategy> <value> # Wait for element
agent-android wait-for-text <text> # Wait for text
agent-android wait-for-app <package> # Wait for app start
Multi-Device
agent-android multi-connect [--max <n>] # Connect all devices
agent-android multi-list # List connected devices
agent-android multi-screenshot [path] # Screenshot all devices
agent-android multi-tap <x> <y> # Tap all devices
agent-android multi-start-app <package> # Start app on all devices
agent-android multi-disconnect # Disconnect all devices
Global Options
--session <name> # Session name
--json # JSON output
--debug # Debug mode
🎯 Use Cases
1. AI Agents Control Android Devices
# AI can use natural language directly
nlp.tap_by_nlp("Click settings button")
nlp.tap_by_nlp("Click back button")
2. UI Automation Testing
# Automated testing workflow
import time
from core.android import create_android_device

device = create_android_device()
device.start_app("com.example.app")
time.sleep(2)  # give the app time to launch
element = device.find_element({"strategy": "text", "value": "Login"})
device.tap_element(element)
device.input_text("test@example.com")
device.press_enter()
# Verify results
assert device.wait_for_text("Welcome", timeout=5000)
device.close()
3. Data Collection
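# Collect app data
import os
import time
from core.android import create_android_device

device = create_android_device()
os.makedirs("data", exist_ok=True)  # screenshot directory must exist
device.start_app("com.example.app")
while True:
    # Take screenshot
    device.screenshot(f"data/screenshot_{int(time.time())}.png")
    # Get UI data
    ui_dump = device.get_ui_dump()
    # Check if done
    if device.find_element({"strategy": "text", "value": "Complete"}):
        break
    time.sleep(1)  # avoid two screenshots sharing the same timestamp
device.close()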
4. Task Automation
# Automate repetitive tasks
device.start_app("com.example.app")
# Login
device.tap(500, 1000)
device.input_text("username")
device.tap(500, 1200)
device.input_text("password")
device.tap(500, 1400)
# Navigate through app
device.swipe(500, 2000, 500, 500)
5. Multi-Device Testing
from core.multi_device import create_multi_device_manager
manager = create_multi_device_manager()
manager.connect_all()
# Test app on all devices
manager.parallel_start_app("com.example.app")
manager.parallel_screenshot("test_{device_id}.png")
manager.parallel_tap(500, 1000)
manager.disconnect_all()
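The `{device_id}` placeholder in the screenshot path suggests one output file per device. Network-attached devices have serials such as `192.168.1.5:5555`, and a colon is not valid in Windows file names. Whether agent-android sanitizes the serial is not stated here, so the following is a sketch of how such a template could be expanded safely; `screenshot_path` is a hypothetical helper.

```python
import re


def screenshot_path(template, device_id):
    """Expand {device_id} in template, replacing characters unsafe in file names."""
    safe_id = re.sub(r"[^A-Za-z0-9._-]", "_", device_id)
    # str.replace leaves any other braces in the template untouched.
    return template.replace("{device_id}", safe_id)


print(screenshot_path("test_{device_id}.png", "192.168.1.5:5555"))
# test_192.168.1.5_5555.png
```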
📄 License
Apache License 2.0 - see LICENSE file
🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
📞 Support
For issues and questions:
- GitHub Issues: https://github.com/Fast2x/agent-android/issues
🎉 Summary
agent-android is a complete, production-ready Android automation tool designed for AI Agents.
Core Highlights
- ✅ Snapshot + Ref Mode - Matches agent-browser design
- ✅ Natural Language Control - Industry-first innovation
- ✅ Multi-Device Support - Parallel execution, 10x efficiency
- ✅ Smart Wait - Automatic element waiting
- ✅ Production Ready - All features tested
- ✅ Open Source - Apache 2.0 license
- ✅ Cross-Platform - Windows | Linux | macOS
- ✅ PyPI Published - Easy installation via pip
Installation
Via PyPI (Recommended):
pip install agent-android
From Source:
git clone https://github.com/Fast2x/agent-android.git
cd agent-android
./agent-android connect
PyPI Package: https://pypi.org/project/agent-android/
Made with ❤️ for AI Agents
File details
Details for the file agent_android-1.4.0.tar.gz.
File metadata
- Download URL: agent_android-1.4.0.tar.gz
- Upload date:
- Size: 61.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | de3cbdb767c7323549f023f7a089770b9b51e1b75321b0c8ac31241705b86764 |
| MD5 | 4f9347199e550550a6021fe2f4119ef0 |
| BLAKE2b-256 | 251c3055607923f6352aaa7a0d60c5422baa746e9341b9bce43ef57979677904 |
File details
Details for the file agent_android-1.4.0-py3-none-any.whl.
File metadata
- Download URL: agent_android-1.4.0-py3-none-any.whl
- Upload date:
- Size: 50.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 80d8c9e076726f9ceac5fcc0aaa1e077899f8b2b9ea2b2667b94f68e62590fbd |
| MD5 | 5ddb7b0c6b21f226fc29ca987ec8bbec |
| BLAKE2b-256 | 1be897bf92a34eb36a4786940af3e720db2f3175e51b66105f107a51b971cab7 |