
An AI-powered mobile task automation assistant with support for both Android and HarmonyOS.


Phone Copilot


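Introduction

Phone Copilot is an AI-powered mobile assistant framework modeled on Open-AutoGLM. It understands the contents of the phone screen multimodally and completes tasks for the user through automated operations.

The system controls Android and HarmonyOS devices through ADB or HDC, perceives the screen with a vision-language model, and uses planning to generate and execute a sequence of operations. Users only need to describe what they want in natural language, for example "打开高德地图,导航至南京南高铁站" ("Open Amap and navigate to Nanjing South Railway Station"), and Phone Copilot parses the intent, interprets the current screen, plans the next action, and carries the task through to completion.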
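The perceive, plan, and act cycle described above can be pictured as a simple loop. The following is a minimal sketch, not Phone Copilot's actual implementation: `Action`, `run_loop`, and the three callables are hypothetical names introduced only for illustration, and the real package may structure this differently.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Action:
    """One UI step proposed by the model (hypothetical structure)."""
    kind: str                       # e.g. "tap", "type", "finish"
    argument: Optional[str] = None  # e.g. coordinates or text to enter


def run_loop(
    task: str,
    capture_screen: Callable[[], bytes],
    plan_next: Callable[[str, bytes, list], Action],
    execute: Callable[[Action], None],
    max_steps: int = 100,
) -> list[Action]:
    """Repeat perceive -> plan -> act until the planner finishes or steps run out."""
    history: list[Action] = []
    for _ in range(max_steps):
        screenshot = capture_screen()                  # perceive: grab the current screen
        action = plan_next(task, screenshot, history)  # plan: ask the model for one step
        history.append(action)
        if action.kind == "finish":                    # the planner reports the task is done
            break
        execute(action)                                # act: send the step to the device
    return history
```

In this sketch, `capture_screen` and `execute` stand in for whatever ADB or HDC calls the device layer makes, and `plan_next` stands in for a request to the vision-language model. The `max_steps` cap mirrors the idea of stopping an agent that never reaches a finish action.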

https://github.com/user-attachments/assets/e60e118e-7cec-4bba-9e8a-38054069cc9f

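Prerequisites

  1. Python >= 3.12
  2. Android devices require adb to be installed and added to your environment variables (PATH). HarmonyOS NEXT devices require hdc to be installed and added to your environment variables (PATH).
  3. Developer mode must be enabled on the phone, with the USB debugging option turned on.
  4. To enter Chinese text on an Android device, ADB Keyboard must also be installed.
  5. Deploy the model locally with vLLM, or use a remote model API. AutoGLM-Phone-9B is recommended.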
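Before running a task, it can help to confirm that the tools listed above are reachable. The sketch below is an illustrative check, not part of Phone Copilot: `missing_tools` and `parse_adb_devices` are hypothetical helper names. It relies only on the standard library and on the usual `adb devices` output format, in which each connected device appears as a serial followed by its state.

```python
import shutil
import sys


def python_is_supported(minimum=(3, 12)) -> bool:
    """Check the running interpreter against the minimum version."""
    return sys.version_info >= minimum


def missing_tools(tools=("adb", "hdc")) -> list[str]:
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def parse_adb_devices(output: str) -> list[str]:
    """Extract serials in the 'device' state from `adb devices` output.

    Devices shown as 'unauthorized' or 'offline' are skipped; an
    'unauthorized' entry usually means the USB debugging prompt on the
    phone has not been accepted yet.
    """
    serials = []
    for line in output.splitlines():
        parts = line.split()
        # Header and daemon status lines never have "device" as the second field.
        if len(parts) == 2 and parts[1] == "device":
            serials.append(parts[0])
    return serials
```

To use it against a real device, pass the text printed by `adb devices` to `parse_adb_devices`, for example via `subprocess.run(["adb", "devices"], capture_output=True, text=True).stdout`. Only one of adb or hdc is needed, depending on the platform, so a missing tool is not necessarily an error.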

Quick Start

uv (recommended)

pip install uv

# Use the ModelScope API: https://modelscope.cn/models/ZhipuAI/AutoGLM-Phone-9B
uvx phone-copilot "打开高德地图,导航至南京南高铁站" --base-url "https://api-inference.modelscope.cn/v1" --model "ZhipuAI/AutoGLM-Phone-9B" --api-key "YOUR_TOKEN" --json "demo.json" --html "demo.html"

pip

pip install phone-copilot

# Use the ModelScope API: https://modelscope.cn/models/ZhipuAI/AutoGLM-Phone-9B
phone-copilot "打开高德地图,导航至南京南高铁站" --base-url "https://api-inference.modelscope.cn/v1" --model "ZhipuAI/AutoGLM-Phone-9B" --api-key "YOUR_TOKEN" --json "demo.json" --html "demo.html"

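Command-Line Arguments

| Argument | Required | Default | Description |
| --- | --- | --- | --- |
| task (positional) | | - | Natural-language description of the task to run. Example: "打开高德地图,导航至南京南高铁站" |
| --base-url | | - | Base URL of the model API (OpenAI-compatible). Example: https://api-inference.modelscope.cn/v1, or http://localhost:8000/v1 for a local deployment |
| --model | | - | Name of the model to use. Example: ZhipuAI/AutoGLM-Phone-9B |
| --api-key | | empty string | Auth token / API key for the model API. Can be omitted for a local deployment or when no authentication is needed |
| --device, -d | | auto-select | Target device ID. If not specified, the first connected device is used |
| --lang, -l | | zh | System prompt language: zh or en |
| --max-steps | | 100 | Maximum number of steps the agent may take before stopping |
| --ccf | | 1000 | Coordinate/compression factor, used to adapt to the coordinate systems of different models. Typically 0 for the Qwen series and 1000 for the AutoGLM series |
| --json | | - | Path for saving a JSON report of the run. The .json suffix is applied automatically (for example, passing demo also saves to demo.json) |
| --html | | - | Path for saving an HTML report of the run. The .html suffix is applied automatically |
| --adb-keyboard | | false | Enable ADB Keyboard on Android (for Chinese input and similar). ADB Keyboard is installed automatically and set as the current keyboard |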
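Two rows in the table describe small transformations that are easier to see in code. The sketch below is an interpretation, not the package's implementation: it assumes that a positive --ccf value means the model reports coordinates on a 0 to ccf grid that must be scaled to the screen size, and that 0 means the coordinates are already pixels. It also assumes the report suffix is applied by replacing any existing extension. `to_pixels` and `with_report_suffix` are hypothetical names.

```python
from pathlib import Path


def to_pixels(x: float, y: float, width: int, height: int, ccf: int = 1000) -> tuple[int, int]:
    """Map model coordinates to screen pixels (assumed meaning of --ccf)."""
    if ccf <= 0:
        # Assumption: a factor of 0 means the model already emits absolute pixels.
        return int(x), int(y)
    # Assumption: coordinates lie on a 0..ccf grid and scale linearly with the screen.
    return int(x * width / ccf), int(y * height / ccf)


def with_report_suffix(path: str, suffix: str) -> Path:
    """Return `path` with `suffix` enforced, e.g. 'demo' -> 'demo.json'."""
    p = Path(path)
    return p if p.suffix == suffix else p.with_suffix(suffix)


print(to_pixels(500, 500, width=1080, height=2400))   # (540, 1200)
print(with_report_suffix("demo", ".json"))            # demo.json
```

Under these assumptions, a model that answers on a 1000-unit grid produces the same tap position on any screen resolution, which is one plausible reason for making the factor configurable per model family.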

Running from Code

Using the high-level API

from pathlib import Path
from phone_copilot.api import run_task

agent = run_task(
    task='打开高德地图,导航至南京南高铁站',
    base_url='https://api-inference.modelscope.cn/v1',
    model='ZhipuAI/AutoGLM-Phone-9B',
    api_key='YOUR_TOKEN'
)

agent.export_html(Path('demo.html'))
agent.export_json(Path('demo.json'))
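To avoid hard-coding the token in a script, the arguments to `run_task` can be collected from environment variables first. This is a minimal sketch: the variable names `PHONE_COPILOT_BASE_URL`, `PHONE_COPILOT_MODEL`, and `PHONE_COPILOT_API_KEY` and the helper `build_run_kwargs` are assumptions made for illustration and are not read by the package itself. Only `run_task`, `export_html`, and `export_json` come from the example above.

```python
import os
from pathlib import Path


def build_run_kwargs(task: str, env=os.environ) -> dict:
    """Collect run_task keyword arguments, reading the token from the environment."""
    return {
        "task": task,
        # Hypothetical variable names; the defaults match the example above.
        "base_url": env.get("PHONE_COPILOT_BASE_URL", "https://api-inference.modelscope.cn/v1"),
        "model": env.get("PHONE_COPILOT_MODEL", "ZhipuAI/AutoGLM-Phone-9B"),
        "api_key": env.get("PHONE_COPILOT_API_KEY", ""),
    }


def main() -> None:
    # Imported here so build_run_kwargs can be used without the package installed.
    from phone_copilot.api import run_task

    agent = run_task(**build_run_kwargs("打开高德地图,导航至南京南高铁站"))
    agent.export_html(Path("demo.html"))
    agent.export_json(Path("demo.json"))
```

Calling `main()` with a device connected runs the same task as the example above. The empty-string fallback for the key matches the default listed for --api-key, which suits a local deployment that needs no authentication.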

Using Agent and ModelClient

from pathlib import Path

from phone_copilot.agent import Agent
from phone_copilot.device import detect_device
from phone_copilot.model_client import ModelClient

agent = Agent(
    device=detect_device(),
    model_client=ModelClient(
        base_url='https://api-inference.modelscope.cn/v1',
        model='ZhipuAI/AutoGLM-Phone-9B',
        api_key='YOUR_TOKEN'
    ),
)

agent.run(task='打开高德地图,导航至南京南高铁站')

agent.export_html(Path('demo.html'))
agent.export_json(Path('demo.json'))
