A package for mobile app adaptation and testing

Project description

CogniSIM: Cross-platform Mobile LLM Agents

For cross-platform LLM agentic testing.

Control iOS and Android devices and get LLM-readable state from them.

Documentation

For full documentation, visit mobileadapt.revyl.ai.

Key Features

  • Android Support: Works seamlessly with Android devices and emulators.

  • iOS Support: Works with iOS devices and simulators.

  • Appium Integration: Leverages the power of Appium for reliable mobile automation.

  • LLM Agent Compatibility: Designed to work seamlessly with language model agents.

  • iOS Support: Coming soon!

How does it work?

We use Appium under the hood to control the device and collect the UI. We then use a custom UI parser to convert the UI to a string that can be used by the LLM.
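The capture step described above can be sketched roughly as follows. This is an illustration built on the Appium Python client, not CogniSIM's own API: `DeviceState`, `capture_state`, and `connect_android` are hypothetical names, and the server URL assumes a local Appium 2 server on its default port.

```python
from dataclasses import dataclass


@dataclass
class DeviceState:
    ui_xml: str            # raw UI hierarchy reported by the driver
    screenshot_png: bytes  # PNG bytes of the current screen


def capture_state(driver) -> DeviceState:
    """Collect the UI hierarchy and a screenshot from a WebDriver-style driver."""
    return DeviceState(
        ui_xml=driver.page_source,
        screenshot_png=driver.get_screenshot_as_png(),
    )


def connect_android(server_url: str = "http://127.0.0.1:4723"):
    """Open an Appium session (assumes a running server and a booted emulator)."""
    # Imported lazily so capture_state can be used without Appium installed.
    from appium import webdriver
    from appium.options.android import UiAutomator2Options

    return webdriver.Remote(server_url, options=UiAutomator2Options())
```

With a live session, `capture_state(connect_android())` would return the hierarchy and screenshot that the later parsing steps consume.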

The UI is parsed with a UI parser, then a set of marks is generated for the screenshot, and we send that to the LLM. For example, the parsed UI might look like this:

<html>
  <button id=0>None</button>
  <button id=1 class="home_button">Open the home page</button>
  <button id=2 class="optional_toolbar_button">New tab</button>
  <button id=3 class="tab_switcher_button">Switch or close tabs</button>
  <button id=4 class="menu_button">Customize and control Google Chrome</button>
  <input id=5 class="url_bar">revyl.ai</input>
  <img id=6 class="location_bar_status_icon" alt="Connection is secure" />
  <p id=7>None</p>
  <img id=8 class="toolbar_hairline" alt="None" />
  <button id=9>Dismiss banner</button>
  <p id=10>Revyl is in private beta →</p>
  <p id=11>None</p>
  <button id=12>Menu</button>
  <p id=13>Revyl</p>
  <button id=14>None</button>
  <button id=15>None</button>
  <p id=16>None</p>
  <p id=17>AI Native Proactive Observability</p>
  <p id=18>Catch bugs</p>
  <p id=19>they happen using agentic E2E testing and OpenTelemetry's Tracing. Book a demo</p>
  <p id=20>before</p>
  <p id=21>now</p>
  <p id=22>!</p>
  <button id=23>Join the waitlist →</button>
  <p id=24>Book a demo</p>
  <button id=25>None</button>
  <p id=26>TRUSTED AND BUILT BY ENGINEERS AT</p>
  <button id=27>Uber</button>
  <button id=28>Salesforce</button>
  <p id=29>VendorPM</p>
</html>

This structured representation of the UI elements is then used by the LLM to understand and interact with the mobile interface.

Each id maps to one element in the UI.
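To make the id-to-element mapping concrete, here is a minimal sketch of such a parser using only the standard library. It is not CogniSIM's actual parser: `parse_ui` is a hypothetical name, the input is assumed to be an Android UI hierarchy with `text`, `content-desc`, and `clickable` attributes, and the tag choice (`button` for clickable nodes, `p` otherwise) is a simplification.

```python
import xml.etree.ElementTree as ET
from html import escape


def parse_ui(xml_source: str):
    """Return (html_like_string, {id: attributes}) for an Android UI hierarchy."""
    root = ET.fromstring(xml_source)
    lines, elements = [], {}
    for node in root.iter():
        attrs = node.attrib
        label = attrs.get("content-desc") or attrs.get("text")
        clickable = attrs.get("clickable") == "true"
        if not label and not clickable:
            continue  # skip unlabeled layout containers
        tag = "button" if clickable else "p"
        idx = len(elements)          # ids are assigned in document order
        elements[idx] = dict(attrs)  # keep attributes so the id can be resolved later
        lines.append(f"  <{tag} id={idx}>{escape(label or 'None')}</{tag}>")
    return "<html>\n" + "\n".join(lines) + "\n</html>", elements
```

When the LLM answers with an id, the second return value lets the caller look up the original element, for example to find its on-screen bounds.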

We also create a set-of-mark prompt for the given state.

Here's an example of a set-of-mark image generated for the UI state:

Set of Mark Example

This image shows the UI elements with their corresponding IDs overlaid on the screenshot. This visual representation helps the LLM understand the layout and structure of the interface, making it easier to interact with specific elements.
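The geometry behind such an overlay can be sketched as below. This covers only the step of turning element bounds into labeled boxes. Drawing them onto the screenshot would need an imaging library and is left out. The names `parse_bounds` and `build_marks` are hypothetical, and the `[x1,y1][x2,y2]` bounds format is the one Android UI hierarchies use; other platforms may report position and size differently.

```python
import re

# Android reports element bounds as "[left,top][right,bottom]".
_BOUNDS = re.compile(r"\[(-?\d+),(-?\d+)\]\[(-?\d+),(-?\d+)\]")


def parse_bounds(bounds: str):
    """Convert a bounds string into an (x1, y1, x2, y2) tuple of ints."""
    match = _BOUNDS.fullmatch(bounds)
    if match is None:
        raise ValueError(f"unexpected bounds string: {bounds!r}")
    x1, y1, x2, y2 = map(int, match.groups())
    return x1, y1, x2, y2


def build_marks(elements):
    """Map {id: attributes} to a list of marks with a box and its center point."""
    marks = []
    for idx, attrs in sorted(elements.items()):
        x1, y1, x2, y2 = parse_bounds(attrs["bounds"])
        marks.append({
            "id": idx,
            "box": (x1, y1, x2, y2),
            "center": ((x1 + x2) // 2, (y1 + y2) // 2),
        })
    return marks
```

Each mark carries the id to print on the screenshot and a center point that could double as a tap target for that element.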

Quick Start

Create an iOS or Android simulator and make sure Appium is installed.

For macOS, install Appium using Homebrew:

brew install appium

For all other operating systems, install Appium using npm:

npm i -g appium

To install the mobileadapt package:

poetry add mobileadapt

or if you have pip installed:

pip install mobileadapt

For detailed instructions on getting started with Mobileadapt, please refer to our Quickstart Guide.
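Before running anything against a device, it can help to confirm that the Appium server is reachable. The sketch below is one way to do that with the standard library; `appium_is_running` is a hypothetical helper, and the URL assumes a local Appium 2 server on its default port (older setups may serve under a `/wd/hub` base path).

```python
import json
import urllib.request


def appium_is_running(server_url: str = "http://127.0.0.1:4723",
                      timeout: float = 2.0) -> bool:
    """Return True if a WebDriver-style /status endpoint answers at server_url."""
    status_url = f"{server_url.rstrip('/')}/status"
    try:
        with urllib.request.urlopen(status_url, timeout=timeout) as resp:
            payload = json.load(resp)
    except (OSError, ValueError):
        # Connection refused, timeout, HTTP error, or a non-JSON reply.
        return False
    return isinstance(payload, dict) and "value" in payload
```

If this returns False, start the server by running `appium` in a separate terminal and try again.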

Prerequisites

  • Android Virtual Device (for Android adaptation)
  • iOS Simulator and Xcode (for iOS adaptation - coming soon)
  • macOS or Linux (recommended)

Local Development

  1. Clone the repository:

    git clone https://github.com/RevylAI/Mobileadapt/ && cd Mobileadapt/deploy
    
  2. Start the server:

    ./scripts/setup.sh
    

Roadmap

  • iOS Support
  • Abstract over drivers other than Appium
  • Recording interactions
  • Screen sharing via websocket to host recording

Contributing

We welcome contributions to the Mobileadapt project! If you'd like to contribute, please check our Contribution Guidelines.

License

Mobileadapt is released under the MIT License. See the LICENSE file for more details.

Citations

@misc{revylai2024mobileadapt,
  title        = {Cognisim},
  author       = {Anam Hira and Landseer Enga and Aarib Sarker and Wasif Sarker and Hanzel Hira and Sushan Leel},
  year         = {2024},
  howpublished = {GitHub},
  url          = {https://github.com/RevylAI/Mobileadapt}
}
