Fast, native MCP server and CLI for iOS Simulator automation
iosef
A fast, native Swift CLI and MCP server for controlling iOS Simulator — tap, swipe, type, screenshot, and read the accessibility tree, all without idb or any companion app.
Architecture
iosef start --local --device "X" → creates/boots simulator, saves state to .iosef/state.json
iosef tap / type / view → reads state.json, performs action, exits
iosef mcp → long-running MCP server (stdio), same capabilities
iosef stop → shuts down simulator, deletes device, cleans up state
Each CLI invocation is a short-lived process. The simulator runs independently and persists between commands.
Installation
swift build -c release
# or
./scripts/build.sh # builds + installs to ~/.local/bin/iosef
Requirements: Swift 6.1+, macOS 13+, Xcode with an iOS simulator runtime installed.
Usage
Start/stop the simulator
iosef start --local --device "my-sim" # Create + boot simulator, local session
iosef start --device "iPhone 16 Pro" --device-type "iPhone 16 Pro" --runtime "iOS 18.4"
iosef connect "iPhone 16" --local # Associate with an existing simulator
iosef status # Show simulator name, UDID, state, session info
iosef status --json # Machine-readable output
iosef stop # Shut down, delete device, remove session
Inspect the UI
iosef describe_all # Dump the full accessibility tree
iosef describe_all --depth 2 # Limit tree depth
iosef describe_all --json | jq '.. | objects | select(.role == "button")'
iosef describe_point --x 200 --y 400 # What's at this coordinate?
iosef describe_point --x 200 --y 400 --json | jq '.content[0].text'
iosef view # Screenshot to temp file (prints path)
iosef view --output /tmp/screen.png # Screenshot to specific file
iosef view --output /tmp/screen.jpg --type jpeg
Interact
iosef tap --x 200 --y 400 # Tap at coordinates
iosef tap --x 100 --y 300 --duration 0.5 # Long-press
iosef type --text "Hello World" # Type into the focused field
iosef swipe --x-start 200 --y-start 600 --x-end 200 --y-end 200
iosef swipe --x-start 200 --y-start 600 --x-end 200 --y-end 200 --duration 0.3
Selector commands
Search and interact by --role, --name, and --identifier. Multiple criteria combine with AND logic.
iosef find --role AXButton # All buttons
iosef find --name "Sign In" # Elements matching label (substring, case-insensitive)
iosef find --role AXStaticText --name "count" # Combine selectors
iosef exists --name "Sign In" # Prints "true"/"false", exit 0/1
iosef count --role AXButton # How many buttons?
iosef text --name "Tap count" # Extract text content from first match
iosef tap_element --name "Sign In" # Find + tap in one step
iosef tap_element --role AXButton --name "Submit"
iosef tap_element --name "Menu" --duration 0.5 # Long-press by selector
iosef input --role AXTextField --text "hello" # Find + tap + type in one step
iosef input --name "Search" --text "query"
iosef wait --name "Welcome" # Poll until element appears (default 10s)
iosef wait --role AXButton --name "Continue" --timeout 5
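Because exists reports its result through the exit code, selector commands compose naturally in shell conditionals. The helper below is a small sketch, not part of iosef: the name tap_if_present is hypothetical, and it only chains the exists and tap_element subcommands listed above.

```shell
# Hypothetical helper: tap an element only if it is currently on screen.
# All arguments are passed through unchanged as selector flags
# (--role, --name, --identifier).
tap_if_present() {
  if iosef exists "$@" >/dev/null; then
    iosef tap_element "$@"
  else
    # exists exited non-zero: nothing matched (or the tool failed), so skip.
    echo "skipped: no element matching $*" >&2
  fi
}
```

For example, `tap_if_present --name "Allow"` could dismiss a permission prompt when one is shown and do nothing otherwise.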
App management
iosef install_app --app-path /path/to/MyApp.app # Install .app or .ipa bundle
iosef launch_app --bundle-id com.apple.mobilesafari
iosef launch_app --bundle-id com.example.myapp --terminate-running
Logging
iosef log_show --process SpringBoard --last 5s
iosef log_show --predicate 'subsystem == "com.apple.UIKit"' --last 3s
iosef log_show --level debug --last 1m
iosef log_stream --process SpringBoard --duration 3
iosef log_stream --predicate 'process == "MyApp"' --duration 10
MCP server
iosef mcp # Start MCP server on stdio
Every CLI subcommand is also available as an MCP tool. Configure in your MCP client:
{
  "mcpServers": {
    "iosef": {
      "command": "iosef",
      "args": ["mcp"]
    }
  }
}
Why this exists
- CLI-first: Every tool is also a CLI subcommand, so agents can string calls together, pipe and filter outputs, and save context window space vs. multiple MCP round-trips.
- Performance: Stays in-process as much as possible rather than shelling out, for faster hot-loop operations like tapping and screenshotting.
- Screenshots in iOS point space: Screenshots are resized to match iOS point coordinates, so visual agents can tap where they see — no coordinate translation needed, even without an accessibility tree.
- Semantic interface: Accept simulator names instead of UDIDs. Tap by accessibility label instead of just coordinates. Query the AX tree with selectors (--role, --name, --identifier).
- Scriptable verification: Rodney-inspired tools like wait and exists make it easy to write verifiable interaction replays with showboat.
- Simple install: Single Swift package, no idb, no companion app.
Coordinate system
All commands use iOS points. Screenshots are coordinate-aligned: 1 pixel = 1 iOS point.
The accessibility tree reports positions as (center±half-size) — the center value is the tap target. For example, an element at (195±39, 420±22) has its center at (195, 420) and spans from x=156 to x=234 and y=398 to y=442. Use the center values directly with tap --x 195 --y 420.
This means visual agents can tap exactly where they see elements in screenshots, with no coordinate translation layer.
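The center±half-size arithmetic can be unpacked with a few lines of shell. This is an illustrative sketch only: ax_bounds is a hypothetical helper, and it assumes integer point values and exactly the "(cx±hx, cy±hy)" layout from the example above.

```shell
# Hypothetical helper: turn "(195±39, 420±22)" into
# "cx cy x_min x_max y_min y_max". Assumes integer values.
ax_bounds() {
  local pos xpart ypart cx hx cy hy
  # Drop parentheses and spaces, leaving "195±39,420±22".
  pos=$(printf '%s' "$1" | tr -d '() ')
  xpart="${pos%%,*}"          # "195±39"
  ypart="${pos#*,}"           # "420±22"
  cx="${xpart%%±*}"; hx="${xpart#*±}"
  cy="${ypart%%±*}"; hy="${ypart#*±}"
  echo "$cx $cy $((cx - hx)) $((cx + hx)) $((cy - hy)) $((cy + hy))"
}
```

With the example element, `ax_bounds "(195±39, 420±22)"` prints `195 420 156 234 398 442`; the first two fields are the values to pass to tap --x and --y.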
Directory-scoped sessions
By default, iosef stores state globally in ~/.iosef/. You can instead create a session scoped to the current directory with --local:
iosef start --local --device "my-sim" # State stored in ./.iosef/state.json
iosef tap_element --name "Sign In" # Auto-detects local session
iosef stop # Cleans up local session
Auto-detection: When neither --local nor --global is specified, iosef checks for ./.iosef/state.json in the current directory. If found, it uses the local session; otherwise it falls back to the global ~/.iosef/ session.
# Force global even when a local session exists
iosef --global tap --x 200 --y 400
# Force local
iosef --local status
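The lookup order described above can be written out as a short shell function. This is a sketch of the documented behavior, not iosef's actual implementation; resolve_session_dir is a hypothetical name.

```shell
# Sketch of the documented session lookup; not iosef's real code.
# Usage: resolve_session_dir [--local|--global]
resolve_session_dir() {
  case "${1:-}" in
    --local)  echo "$PWD/.iosef" ;;      # forced local session
    --global) echo "$HOME/.iosef" ;;     # forced global session
    *)
      # Auto-detection: prefer a local session if its state file exists.
      if [ -f "$PWD/.iosef/state.json" ]; then
        echo "$PWD/.iosef"
      else
        echo "$HOME/.iosef"
      fi
      ;;
  esac
}
```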
Device resolution (when --device is omitted):
- state.json device field (local session, then global)
- VCS root directory name (git or jj)
- Any booted simulator
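To see which name the VCS-root step would likely produce for a checkout, the root directory name can be derived by hand. The function below is an assumption-laden sketch: vcs_root_name is a hypothetical helper, and the order in which iosef itself consults git and jj is not stated in this document.

```shell
# Hypothetical helper: print the directory name of the enclosing
# git or jj repository root, or return 1 if neither is found.
# Checking git before jj is an arbitrary choice made for this sketch.
vcs_root_name() {
  local root
  if root=$(git rev-parse --show-toplevel 2>/dev/null); then
    basename "$root"
  elif root=$(jj root 2>/dev/null); then
    basename "$root"
  else
    return 1
  fi
}
```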
Add .iosef/ to your .gitignore to keep session state out of version control.
Exit codes
| Exit code | Meaning |
|---|---|
| 0 | Success |
| 1 | Check failed (exists returned false) or tool error |
| 2 | Bad arguments or usage error |
This lets scripts distinguish a check that did not pass (exit 1) from a command that was invoked incorrectly (exit 2). Note that exit 1 also covers tool errors.
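A script can branch on these codes with a case statement. The wrapper below is a minimal sketch; classify_exit and its output labels are made up for illustration, and only the three exit codes from the table are taken from iosef.

```shell
# Hypothetical wrapper: run a command and print a label for its exit code.
classify_exit() {
  local rc=0
  "$@" >/dev/null 2>&1 || rc=$?
  case "$rc" in
    0) echo "pass" ;;
    1) echo "check-failed-or-tool-error" ;;
    2) echo "usage-error" ;;
    *) echo "unexpected:$rc" ;;
  esac
}
```

For example, `classify_exit iosef exists --name "Sign In"` would print `pass` when the element is present.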
Shell scripting examples
#!/bin/bash
set -euo pipefail
FAIL=0
check() {
  if ! "$@"; then
    echo "FAIL: $*"
    FAIL=1
  fi
}
# Boot and launch
iosef start --local --device "test-sim"
iosef install_app --app-path ./build/MyApp.app
iosef launch_app --bundle-id com.example.myapp
# Wait for the app to load
iosef wait --name "Welcome" --timeout 15
# Log in
iosef input --name "Email" --text "user@example.com"
iosef input --name "Password" --text "hunter2"
iosef tap_element --name "Sign In"
# Verify we landed on the dashboard
iosef wait --name "Dashboard" --timeout 10
check iosef exists --name "Dashboard"
check iosef exists --role AXButton --name "Settings"
# Take a screenshot for the record
iosef view --output /tmp/dashboard.png
iosef stop
if [ "$FAIL" -ne 0 ]; then
  echo "Some checks failed"
  exit 1
fi
echo "All checks passed"
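One caveat with the script above: under set -e, a failing step such as a wait timeout ends the script before iosef stop runs, leaving the simulator behind. One possible way to guarantee cleanup is sketched below; with_simulator is a hypothetical helper built only from the start and stop subcommands, not something iosef provides.

```shell
# Hypothetical helper: start a local session, run a command, and always
# stop the simulator afterwards, returning the command's exit code.
# Usage: with_simulator <device-name> <command> [args...]
with_simulator() {
  local device="$1"
  shift
  local rc=0
  iosef start --local --device "$device" || return $?
  "$@" || rc=$?          # capture failure instead of aborting
  iosef stop || true     # cleanup runs whether or not the command failed
  return "$rc"
}
```

A typical call might be `with_simulator "test-sim" ./ui-checks.sh`, where ui-checks.sh is a placeholder for your own script. If you pass a shell function instead of a script, note that errexit is suppressed inside it, so chain its steps with &&.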
Configuration
| Environment Variable | Default | Description |
|---|---|---|
| IOSEF_DEFAULT_OUTPUT_DIR | ~/Downloads | Default directory for screenshots |
| IOSEF_TIMEOUT | (none) | Override default timeout (seconds) |
| IOSEF_FILTERED_TOOLS | (none) | Comma-separated MCP tool names to hide |
State is stored in ~/.iosef/state.json (global) or ./.iosef/state.json (local). See Directory-scoped sessions.
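As an illustration, these variables could be set in a shell profile or at the top of a test script. The values are examples only, and the tool names passed to IOSEF_FILTERED_TOOLS assume that MCP tool names match the CLI subcommand names.

```shell
# Example values only; adjust to your project.
# Save screenshots next to the project instead of ~/Downloads.
export IOSEF_DEFAULT_OUTPUT_DIR="$PWD/screenshots"
# Timeout override, in seconds.
export IOSEF_TIMEOUT=20
# Hide the logging tools from MCP clients (names assumed to match
# the log_show and log_stream subcommands).
export IOSEF_FILTERED_TOOLS="log_show,log_stream"
```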
Commands reference
| Command | Arguments | Description |
|---|---|---|
| start | [--device N] [--device-type T] [--runtime R] [--local\|--global] | Create/boot simulator, set up session |
| stop | | Shut down, delete simulator, remove session |
| connect | <name-or-udid> [--local\|--global] | Associate with an existing simulator |
| status | | Show simulator and session status |
| install_app | --app-path <path> | Install .app or .ipa bundle |
| launch_app | --bundle-id <id> [--terminate-running] | Launch app by bundle identifier |
| describe_all | [--depth N] | Dump full accessibility tree |
| describe_point | --x X --y Y | Get accessibility element at coordinates |
| view | [--output <path>] [--type png\|jpeg\|tiff\|bmp\|gif] | Capture screenshot |
| tap | --x X --y Y [--duration S] | Tap at coordinates (long-press with duration) |
| type | --text <text> | Type into focused field |
| swipe | --x-start X --y-start Y --x-end X --y-end Y [--delta N] [--duration S] | Swipe between two points |
| find | [--role R] [--name N] [--identifier I] | Find elements by selector |
| exists | [--role R] [--name N] [--identifier I] | Check if element exists (exit 1 if not) |
| count | [--role R] [--name N] [--identifier I] | Count matching elements |
| text | [--role R] [--name N] [--identifier I] | Extract text from first match |
| tap_element | [--role R] [--name N] [--identifier I] [--duration S] | Find + tap in one step |
| input | [--role R] [--name N] [--identifier I] --text <text> | Find + tap + type in one step |
| wait | [--role R] [--name N] [--identifier I] [--timeout S] | Wait for element to appear |
| log_show | [--last T] [--process P\|--predicate P] [--style S] [--level L] | Show recent log entries |
| log_stream | [--duration S] [--process P\|--predicate P] [--style S] [--level L] | Stream live log entries |
| mcp | | Start MCP server (stdio transport) |
Global flags
| Flag | Description |
|---|---|
| --device <name-or-udid> | Target simulator (auto-detected if omitted) |
| --local | Use directory-scoped session (./.iosef/) |
| --global | Use global session (~/.iosef/) |
| --verbose | Enable diagnostic logging to stderr |
| --json | Output results as JSON |
| --version | Print version and exit |
| -h, --help | Show help |
How it works
- IndigoHID for touch injection: taps and swipes are sent as HID events directly to the simulator, bypassing simctl for lower latency
- CoreSimulator private APIs for device lifecycle: boot, shutdown, install, and launch without shelling out
- AXP accessibility bridge to read the full accessibility tree from the simulator's UI, powering selector commands and coordinate discovery
- MCP over stdio: the mcp subcommand exposes every CLI tool as an MCP tool, using JSON-RPC over stdin/stdout
- Short-lived CLI, persistent simulator: each invocation reads state.json, connects, acts, and exits; the simulator keeps running
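To make the JSON-RPC framing concrete, the sketch below prints the three newline-delimited messages a generic MCP client sends over stdio to initialize a session and list the available tools. The function name, the client name, and the protocol version string are illustrative assumptions; whether iosef accepts this particular version is not confirmed here.

```shell
# Hypothetical helper: emit a minimal MCP handshake followed by a
# tools/list request, one JSON-RPC message per line.
mcp_list_tools_requests() {
  printf '%s\n' \
    '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"example-client","version":"0.0.0"}}}' \
    '{"jsonrpc":"2.0","method":"notifications/initialized"}' \
    '{"jsonrpc":"2.0","id":2,"method":"tools/list"}'
}
```

Piping the output into the server, as in `mcp_list_tools_requests | iosef mcp`, may be enough to inspect the tool list by hand, assuming the server answers pending requests before exiting at end of input.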
Acknowledgments
This project draws inspiration from:
- joshuayoes/ios-simulator-mcp — the original iOS Simulator MCP server that motivated this rewrite
- facebook/idb — Meta's iOS development bridge, whose approach to simulator interaction informed the design
- ldomaradzki/xctree — a useful reference for working with the simulator's accessibility tree
- simonw/rodney — whose CLI design and goal of usage with showboat for executable demos inspired the scripting-oriented tools
License
MIT — see LICENSE for details.