Programmatically control Android like a Human
humdroid
Programmatically control Android like a Human using computer vision.
This project was meant to automate certain tasks in games, such as Battlecats, as easily as possible. Because it uses images to scan for and detect UI elements, writing scripts becomes much easier: only images need to be added, not hardcoded positions.
Uses a three-step process:
- py-scrcpy-client to very quickly capture the screen of the device. Has the option for image compression of the screen to reduce the search area.
  - adb + ffmpeg can also be used and is an option in adbAPI.bash. It is much faster than the usual adb shell screencap, as PNG conversion is done on your computer's processor, not the device's.
- OpenCV (CUDA + OpenGL optional) to find specific items based on template matching.
- py-scrcpy-client to very quickly click and swipe the screen.
  - The alternative to this is to use adb shell input [command] to send touch commands. This approach is much, much slower, but it can be done. From experience, adb shell input takes around 1s at best, while py-scrcpy-client takes around 50ms at worst.
  - Another alternative is monkeyrunner. This is even faster than py-scrcpy-client due to being built on Java. However, the tool is now outdated and requires a full installation of the Android SDK, which is extremely difficult to run on embedded devices.
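To make the middle step concrete, here is a minimal pure-Python sketch of what template matching does: slide the template over the screenshot and score every position. It is only an illustration of the idea, not the code the C++ server runs. The function name match_template and the choice of a normalized cross-correlation score are assumptions made for this example, and the OpenCV method the server actually uses may differ.

```python
def match_template(image, template):
    """Slide template over image and return (x, y, score) of the best spot.

    image and template are 2D lists of non-negative grayscale values.
    The score is a normalized cross-correlation: close to 1.0 means the
    window under the template looks the same as the template.
    NOTE: illustrative helper, not part of humdroid or OpenCV.
    """
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    t_norm = sum(v * v for row in template for v in row) ** 0.5
    best = (0, 0, 0.0)
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            dot = 0.0   # correlation between window and template
            w_sq = 0.0  # squared magnitude of the window
            for dy in range(th):
                for dx in range(tw):
                    p = image[y + dy][x + dx]
                    dot += p * template[dy][dx]
                    w_sq += p * p
            denom = t_norm * w_sq ** 0.5
            score = dot / denom if denom else 0.0
            if score > best[2]:
                best = (x, y, score)
    return best
```

The returned x and y are the top-left corner of the best window. A caller would compare the score against a threshold, which is the role minConfidence plays in the server API.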
Program Structure
Python's OpenCV bindings run about as fast as C++ OpenCV because they call into C++ under the hood, yet humdroid uses a C++ server for matching. Why? Honestly, I didn't think the Python bindings would be that fast, so I made the foolish error of overcomplicating this entire program. Either way, the C++ server should be faster than a Python equivalent, as it uses loops to filter out results, which is something Python is not fast at. Nonetheless, the current structure of the program revolves around inter-process communication between the C++ server and the script, using TCP sockets and JSON messages.
Localhost port 6069 is reserved for receiving messages from the Python script, and localhost port 6070 for output from the C++ server. Due to C sockets being incredibly difficult to work with, the C++ server essentially crashes when a client disconnects and must be restarted. In addition, every JSON message is delimited by $ in case multiple messages arrive at the same time.
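The $ delimiter matters because TCP is a byte stream: one read may contain several messages or only part of one. Below is a minimal sketch of how a client could frame outgoing messages and split incoming data. The names encode_message and MessageBuffer are hypothetical and not part of humdroid. The sketch also assumes that $ never appears inside a JSON message and that the server expects the delimiter on the messages it receives as well.

```python
import json

DELIMITER = b"$"  # separator between JSON messages, as described above


def encode_message(payload: dict) -> bytes:
    """Serialize one message and append the '$' delimiter."""
    return json.dumps(payload).encode("utf-8") + DELIMITER


class MessageBuffer:
    """Collects received bytes and returns complete JSON messages."""

    def __init__(self) -> None:
        self._pending = b""

    def feed(self, data: bytes) -> list:
        """Add newly received bytes and return every message completed so far."""
        self._pending += data
        # Everything before the last delimiter is complete; the remainder
        # is kept until more data arrives.
        *complete, self._pending = self._pending.split(DELIMITER)
        return [json.loads(chunk) for chunk in complete if chunk.strip()]
```

In use, the bytes from encode_message would go to sendall on a socket connected to localhost port 6069, and whatever recv returns from the port 6070 connection would be passed to feed.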
Therefore:
- opencv - C++ OpenCV server
- humdroid - Python library for interacting with scrcpy via py-scrcpy-client and the opencv server.
Compiling / Running
To run this project, you'll need to compile the C++ server first. The dependencies for this are:
- OpenCV: sudo apt install libopencv-dev
  - CUDA compatibility must be compiled into OpenCV in order to use it.
- sudo apt install adb
- pip3 install scrcpy-client adbutils
- sudo apt install ffmpeg if using the alternative method for screenshots
After that, run git submodule update --init.
In opencv, run bash compile.bash to compile the server, humdroid_cpu. If you have CUDA installed and your version of OpenCV supports it, then humdroid_gpu will be built.
To use scrcpy, humdroid_cpu or humdroid_gpu must be running in the background in a separate terminal. If you want to use the included CVServer to start the server, either server has to be in PATH. This can be done easily by running install.bash in opencv.
Server API - Input Port 6069:
Any messages sent to the server have to be sent separately, and they cannot be combined in a single message.
Load A Template
Loads a single template image into the server. id is an integer used to mark the image easily without comparing by string, and group is used to categorize the template image. path has to be an absolute path to the image.
{
"loadTemplate" : {
"path" : "/home/user/template.png",
"id" : 3,
"group": 0
}
}
Compare By ID
Uses the template marked with the specific ID and tries to see if it is somewhere in the image provided. If multiple templates have the same ID, they will all be matched. photo is the absolute path to the photo and id is the ID of the template. minConfidence is a double from 0.0 to 1.0 describing the minimum confidence a template match must have. For example, a minConfidence of 0.95 means that the algorithm must be 95% sure the template is where it says it is.
{
"compareID" : {
"photo" : "/home/user/photo.png",
"minConfidence" : 0.95,
"id" : 3
}
}
Compare by Group
Tries to see if any of the templates in a group are somewhere in the image provided. photo is the absolute path to the photo and group is the group number to check. minConfidence is a double from 0.0 to 1.0 describing the minimum confidence a template match must have. For example, a minConfidence of 0.95 means that the algorithm must be 95% sure the templates are where it says they are.
{
"compareGroup" : {
"photo" : "/home/user/photo.png",
"minConfidence" : 0.95,
"group" : 0
}
}
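The three input messages above can be built from Python dictionaries rather than written by hand. The helpers below are a sketch only: load_template, compare_id and compare_group are hypothetical names, not the humdroid library's API. They also check the two constraints stated above, absolute paths and a minConfidence between 0.0 and 1.0, before anything is sent.

```python
import os


def _absolute(path: str) -> str:
    """Reject relative paths, since the server expects absolute ones."""
    if not os.path.isabs(path):
        raise ValueError(f"path must be absolute: {path!r}")
    return path


def _confidence(value: float) -> float:
    """Reject confidence thresholds outside 0.0 to 1.0."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("minConfidence must be between 0.0 and 1.0")
    return float(value)


def load_template(path: str, template_id: int, group: int) -> dict:
    return {"loadTemplate": {"path": _absolute(path),
                             "id": template_id,
                             "group": group}}


def compare_id(photo: str, template_id: int, min_confidence: float) -> dict:
    return {"compareID": {"photo": _absolute(photo),
                          "minConfidence": _confidence(min_confidence),
                          "id": template_id}}


def compare_group(photo: str, group: int, min_confidence: float) -> dict:
    return {"compareGroup": {"photo": _absolute(photo),
                             "minConfidence": _confidence(min_confidence),
                             "group": group}}
```

Each returned dictionary is one message. Since messages cannot be combined, each would be serialized with json.dumps and sent on its own.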
Server API - Output Port 6070:
Matches
This message is sent out every time a compare message is sent. Responses are guaranteed to arrive in the order the compare messages were sent in. id is the ID matched, x is the x-coordinate from the top left of the photo compared, y is the y-coordinate from the top left of the photo compared (positive goes downwards), confidence is the confidence of the algorithm that the template is in that specific spot, and origin describes which point of the template x and y refer to. By default, origin will always be "center", though it could be "topleft" as well.
It is important to mention that only one match will be returned. If the template is in multiple spots of the image, it is up to OpenCV to decide what specific match gets returned.
{
"matches" : [
{
"id" : 3,
"x" : 370,
"y" : 640,
"confidence" : 0.966553,
"origin" : "center"
}
]
}
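A script reading port 6070 mostly needs a point to tap. The sketch below parses a matches message and converts a match to the center of the template regardless of the reported origin. Match, parse_matches and tap_point are hypothetical names, and the template size passed to tap_point is something the caller is assumed to know. Whether the server sends an empty matches list when nothing is found is also an assumption, though the code handles that case.

```python
import json
from dataclasses import dataclass


@dataclass
class Match:
    id: int
    x: int
    y: int
    confidence: float
    origin: str


def parse_matches(raw: str) -> list:
    """Turn one 'matches' JSON message into a list of Match objects."""
    message = json.loads(raw)
    return [
        Match(m["id"], m["x"], m["y"], m["confidence"],
              m.get("origin", "center"))
        for m in message.get("matches", [])
    ]


def tap_point(match: Match, template_width: int, template_height: int):
    """Return the center of the matched template as (x, y)."""
    if match.origin == "topleft":
        return (match.x + template_width // 2,
                match.y + template_height // 2)
    return (match.x, match.y)  # origin "center": already the middle
```

The resulting coordinates are relative to the top left of the compared photo, so they apply to the device screen only if the photo was captured at the resolution the touch commands expect.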