Autonomous AI Testing Agent with multi-agent architecture

HAINDY

License: MIT | Python 3.11+

Give coding agents computer use.

HAINDY lets coding tools like Claude Code, Codex CLI, and OpenCode interact with real desktop and mobile apps by seeing the screen, clicking, typing, scrolling, and validating flows. Use it when your agent needs to work with a real UI instead of a DOM or selector tree.

pip install haindy
haindy setup

After setup, open your coding agent and use the haindy skill against a live app.

Agent integration

HAINDY ships with bundled skills that haindy setup installs automatically for detected AI CLIs.

Supported CLIs:

  • Claude Code
  • Codex CLI
  • OpenCode

If one of those CLIs is installed, haindy setup copies the HAINDY skill into the agent's skill directory and points the agent at the setup flow. For other coding agents, you can still use HAINDY by prompting them directly with the examples below.

Try it now

Open your coding agent and use the haindy skill against a running desktop app, Android emulator or device, or iOS simulator or device.

If you are running an app locally, try prompts like:

  • "Use HAINDY to do exploratory testing on my app."
  • "Use HAINDY to test creating a new account."
  • "Use HAINDY to check whether the login flow works end to end."
  • "Use HAINDY to explore the settings screen and report whether notifications can be toggled."

Your agent will start the session, interact with the UI, and return screenshots and structured results.

CLI usage

HAINDY can also be driven directly from the command line when you want explicit command-by-command control.

Start a session, read the returned session_id, and pass it explicitly to later commands:

haindy session new --desktop
haindy screenshot --session <SESSION_ID>
haindy act "click the Login button" --session <SESSION_ID>
haindy session close --session <SESSION_ID>

For mobile:

haindy session new --android --android-serial emulator-5554
haindy session new --ios --ios-udid <UDID>

For session hygiene:

haindy session prune --older-than 7

Every command returns structured JSON. A typical response looks like this:

{
  "session_id": "8f4d2c1e-7c2d-4d92-a0bc-3d0a9c6c1b5e",
  "run_id": null,
  "command": "screenshot",
  "status": "success",
  "response": "Screenshot captured.",
  "screenshot_path": "/absolute/path/to/screenshot.png",
  "meta": {
    "exit_reason": "completed",
    "duration_ms": 0,
    "actions_taken": 0
  }
}
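
When scripting around the CLI, the envelope above can be consumed with a few lines of Python. The following is a minimal sketch, not part of HAINDY: the helper names `parse_envelope`, `run_haindy`, and `capture_screenshot_once` are hypothetical, it assumes the JSON envelope is written to stdout, and it treats any status other than "success" as a failure because "success" is the only value shown in the example.

```python
import json
import subprocess
from typing import Any, Callable, Sequence


def parse_envelope(raw: str) -> dict[str, Any]:
    """Parse a HAINDY JSON envelope and raise if the command did not succeed."""
    envelope = json.loads(raw)
    if envelope.get("status") != "success":
        # Assumption: only "success" is documented, so anything else is an error.
        raise RuntimeError(
            f"haindy {envelope.get('command')} failed: {envelope.get('response')}"
        )
    return envelope


def run_haindy(
    args: Sequence[str], runner: Callable[..., Any] = subprocess.run
) -> dict[str, Any]:
    """Run `haindy <args>` and return the parsed envelope.

    Assumption: the envelope is printed to stdout.
    """
    completed = runner(["haindy", *args], capture_output=True, text=True)
    return parse_envelope(completed.stdout)


def capture_screenshot_once() -> str:
    """Start a desktop session, take one screenshot, and close the session."""
    session_id = run_haindy(["session", "new", "--desktop"])["session_id"]
    try:
        shot = run_haindy(["screenshot", "--session", session_id])
        return shot["screenshot_path"]
    finally:
        # Always close the session, even if the screenshot command failed.
        run_haindy(["session", "close", "--session", session_id])
```

The `runner` parameter exists only so the wrapper can be exercised without a live session, for example by passing a stub in place of `subprocess.run`.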

Under the hood, each action goes through a computer-use AI provider (OpenAI, Google Gemini, or Anthropic Claude) that takes a screenshot, reasons about the UI, and performs real OS-level input -- mouse, keyboard, scroll -- against the actual application. No DOM hooks, no selectors, no browser automation.

Supported platforms

| Platform | Automation method |
| --- | --- |
| Linux/X11 | uinput + xdotool + ffmpeg |
| macOS | pynput + mss |
| Windows | pynput + mss |
| Android | ADB |
| iOS | idb |

act vs test vs explore

  • act -- execute a single action ("click the submit button", "type hello into the search field")
  • test -- dispatch a multi-step scenario with outcome validation, then poll test-status
  • explore -- dispatch an open-ended goal, then poll explore-status

Use test when the scenario is backed by written requirements, a test plan, wireframes, or other explicit documentation and you want structured execution plus validation. Use explore when the goal is clear but the path is not, or when you are working from product knowledge rather than supporting docs. Use act when you want tight step-by-step control and to inspect the screen after each command.

For execute-mode commands such as act, HAINDY reports a failure if the computer-use provider returns no executable action for the requested step.

Session variables

Store values your agent can reference across commands:

haindy session set USERNAME alice@example.com --session <ID>
haindy session set PASSWORD --value-file credentials.txt --secret --session <ID>
haindy session vars --session <ID>

Run tests from requirements

HAINDY also includes a pipeline of specialized AI agents that can plan and execute tests autonomously from a requirements file.

Requirements -> Scope Triage -> Test Planner -> Situational Agent -> Test Runner -> Report

haindy run --plan requirements.txt --context context.txt
haindy run --mobile --plan requirements.txt --context context.txt    # Android
haindy run --ios --plan requirements.txt --context context.txt       # iOS

This produces an HTML report with screenshots, pass/fail results, and a JSONL execution log.

Configuration

Credentials

haindy auth login openai        # stored in system keychain
haindy auth login openai-codex  # OAuth-based login
haindy auth login google
haindy auth login anthropic
haindy auth status              # verify

Providers

HAINDY uses two providers independently: one for planning/analysis, one for computer-use actions.

haindy provider set openai                   # planning/analysis
haindy provider set-computer-use google      # computer-use

Settings file

Create ~/.haindy/settings.json for persistent non-secret configuration:

{
  "agent": { "provider": "openai" },
  "computer_use": { "provider": "google" },
  "openai": { "model": "gpt-5.5", "computer_use_model": "gpt-5.5" },
  "google": { "model": "gemini-3-flash-preview", "computer_use_model": "gemini-3-flash-preview" },
  "anthropic": { "model": "claude-opus-4-7", "computer_use_model": "claude-opus-4-7" },
  "execution": {
    "automation_backend": "desktop",
    "actions_action_timeout_seconds": 600
  },
  "desktop": { "keyboard_layout": "auto" },
  "logging": { "level": "INFO" }
}

Environment variables override all other sources. Timeout settings use seconds. In settings.json, use execution.actions_action_timeout_seconds; the older execution.actions_action_timeout_ms key is only accepted as a legacy read-time alias. Linux desktop keyboard layout defaults to auto, which detects the active XKB layout and currently supports us and es; set desktop.keyboard_layout explicitly to override detection. See .env.example for the full legacy env-var list.
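
To illustrate the relationship between the two timeout keys, here is a minimal sketch of reading the value from a parsed settings.json. The function name `action_timeout_seconds` is hypothetical, and the choice to prefer the seconds key when both are present is an assumption; how HAINDY itself resolves the keys is not documented here.

```python
from typing import Any, Mapping, Optional


def action_timeout_seconds(settings: Mapping[str, Any]) -> Optional[float]:
    """Return the action timeout in seconds from a parsed settings.json mapping.

    Prefers execution.actions_action_timeout_seconds and falls back to the
    legacy execution.actions_action_timeout_ms key (assumed precedence).
    """
    execution = settings.get("execution", {})
    if "actions_action_timeout_seconds" in execution:
        return float(execution["actions_action_timeout_seconds"])
    if "actions_action_timeout_ms" in execution:
        # Legacy alias: the value is in milliseconds, so convert to seconds.
        return float(execution["actions_action_timeout_ms"]) / 1000.0
    return None
```

With the example settings file above, this returns 600.0; a legacy file containing "actions_action_timeout_ms": 600000 yields the same value.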

Custom OpenAI endpoints

To point HAINDY at a proxy, gateway, or alternate OpenAI-compatible endpoint, set openai.base_url (or HAINDY_OPENAI_BASE_URL). It applies only to non-Computer-Use API-key calls and is ignored under openai-codex OAuth.

{ "openai": { "base_url": "https://your-relay.example.com/v1" } }

Compatibility: HAINDY uses the OpenAI Responses API (POST /v1/responses), not Chat Completions. The endpoint must implement the Responses API. Known compatible options:

  • OpenAI itself behind a transparent proxy or regional gateway (Cloudflare AI Gateway, Helicone, Portkey passthrough, internal corporate proxies).
  • Azure OpenAI Service.
  • vLLM started with --enable-responses-api (experimental, for self-hosted OSS models).
  • LiteLLM Proxy: runs locally and translates the Responses API to any Chat-Completions backend (Mistral, z.ai, Together, Groq, OpenRouter, Ollama, etc.). For Chat-Completions-only providers, this is the recommended bridge. When using LiteLLM as the bridge, set litellm_settings: drop_params: true so that LiteLLM silently drops OpenAI-only fields (e.g. reasoning_effort) that the backend does not accept. Note that HAINDY's prompts are tuned for OpenAI's response style; reasoning models that emit analysis preambles may break HAINDY's structured-output parsing.

Chat-Completions-only providers will not work if pointed to directly.
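
As a rough illustration of which URL a custom endpoint must serve, the sketch below resolves the base URL with the environment variable taking precedence over settings.json, then derives the Responses API path from it. The function names are hypothetical, and the fallback to https://api.openai.com/v1 is an assumption about the default rather than something stated above.

```python
import os
from typing import Any, Mapping

# Assumption: OpenAI's public API base is used when nothing is configured.
DEFAULT_OPENAI_BASE_URL = "https://api.openai.com/v1"


def resolve_openai_base_url(
    settings: Mapping[str, Any], env: Mapping[str, str] = os.environ
) -> str:
    """Pick the base URL: HAINDY_OPENAI_BASE_URL first, then openai.base_url."""
    env_value = env.get("HAINDY_OPENAI_BASE_URL")
    if env_value:
        return env_value
    return settings.get("openai", {}).get("base_url") or DEFAULT_OPENAI_BASE_URL


def responses_endpoint(base_url: str) -> str:
    """Return the Responses API URL that the endpoint must accept POSTs on."""
    return base_url.rstrip("/") + "/responses"
```

For the example relay above, `responses_endpoint` gives https://your-relay.example.com/v1/responses. An endpoint that only answers on /v1/chat/completions will not satisfy this.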

A separate openai.cu_base_url (HAINDY_OPENAI_CU_BASE_URL) overrides the OpenAI Computer Use client's base URL. Warning: Computer Use additionally requires the computer_use_preview tool on top of the Responses API. Almost no relay implements this. Only set it if your endpoint explicitly supports OpenAI Computer Use.

Artifact storage

Runtime data artifacts default to ~/.haindy/data/projects/<project-id>/, where <project-id> is derived from the resolved current working directory. This includes traces, model-call logs, screenshots, and replay/coordinate/planning caches. Tool-call session state stays separate under ~/.haindy/sessions/<session-id>/.

Set HAINDY_DATA_DIR or storage.data_dir to use an exact custom data root without project scoping. Specific path overrides such as HAINDY_MODEL_LOG_PATH, HAINDY_DESKTOP_SCREENSHOT_DIR, and storage.planning_cache_path still win. If you copied old env vars that point at data/..., HAINDY will keep writing to ./data until those overrides are removed.
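
The data-root rules above can be summarized in a short sketch. This is an illustration only: `resolve_data_root` is a hypothetical helper, the project id is passed in because its derivation from the working directory is not described here, and the more specific path overrides (such as HAINDY_MODEL_LOG_PATH) are deliberately left out.

```python
import os
from pathlib import Path
from typing import Any, Mapping, Optional


def resolve_data_root(
    settings: Mapping[str, Any],
    project_id: str,
    env: Mapping[str, str] = os.environ,
    home: Optional[Path] = None,
) -> Path:
    """Return the directory where runtime data artifacts would be written.

    A custom root (HAINDY_DATA_DIR, then storage.data_dir) is used exactly as
    given, without project scoping. Otherwise the project-scoped default
    under ~/.haindy/data/projects/ applies.
    """
    custom = env.get("HAINDY_DATA_DIR") or settings.get("storage", {}).get("data_dir")
    if custom:
        return Path(custom).expanduser()
    base = home if home is not None else Path.home()
    return base / ".haindy" / "data" / "projects" / project_id
```

The environment variable is checked first because environment variables override all other sources.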

Platform prerequisites

| Platform | Requirements |
| --- | --- |
| Linux/X11 | ffmpeg, xdotool, xclip, /dev/uinput access |
| macOS | Grant Accessibility + Screen Recording to your terminal (System Settings > Privacy & Security) |
| Windows | Python 3.11+, optional adb, long paths enabled, and run unelevated targets unless HAINDY is also elevated |
| Android | adb installed, device/emulator reachable |
| iOS (macOS) | brew install idb-companion, device paired |

haindy doctor checks all of these for you. See docs/RUNBOOK.md for detailed setup.

Development

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.lock
pip install -e ".[dev]"
ruff check .          # lint
ruff format --check . # format check
mypy haindy           # type check
pytest                # tests

Architecture

| Directory | Purpose |
| --- | --- |
| haindy/tool_call_mode/ | Tool-call CLI, daemon, IPC, session state |
| haindy/agents/computer_use/ | Multi-provider computer-use session orchestrator |
| haindy/agents/ | Scope triage, test planner, situational, action, and test runner agents |
| haindy/linux/ | Linux/X11 automation (uinput, xdotool, ffmpeg) |
| haindy/macos/ | macOS automation (pynput, mss) |
| haindy/windows/ | Windows automation (pynput, mss) |
| haindy/mobile/ | Android (ADB) and iOS (idb) automation |
| haindy/config/ | Settings, env vars, settings file loader |
| haindy/orchestration/ | Multi-agent workflow coordination |
| haindy/monitoring/ | JSONL logging, HTML report generation |

Reporting issues

Both batch and tool-call modes emit a pre-filled GitHub issue URL with run context (HAINDY version, platform, command, exit reason, truncated error). Batch mode prints it at the end of a run; tool-call mode adds a feedback_url field to the JSON envelope on failure. Set HAINDY_NO_FEEDBACK_URL=1 to suppress it.

Contributing

See CONTRIBUTING.md for development guidelines and how to submit changes.

License

MIT
