Python automation toolkit for VRChat (Windows / Linux)
vrcpilot
Python automation toolkit for the VRChat desktop client on Windows / Linux. It can launch, focus, capture, OCR, detect image templates, and synthesize input through both a typed Python API and the vrcpilot CLI.
Features
- Process control: launch VRChat through Steam (`vrcpilot.launch`), detect running PIDs, and terminate the process.
- Window control: focus / unfocus the VRChat window and check its foreground state (Win32 / X11 / XWayland).
- Screen capture: `Capture` for streaming (mp4 / y4m sinks) and `take_screenshot` for one-off captures that round-trip through YAML.
- OCR: pluggable `OCREngine` ABC with the default `RapidOCREngine`. `ocr()` returns word-level results in both window-local and desktop-absolute coordinates.
- Image-template detection: `TemplateDetectEngine` using OpenCV `TM_CCOEFF_NORMED`. Detections use the same coordinate schema as OCR.
- Synthetic input: keyboard / mouse input via `pydirectinput` on Windows and `inputtino` + `/dev/uinput` on Linux. Input is sent only while VRChat is focused.
- Non-ASCII text injection: `vrcpilot.clipboard` sends arbitrary Unicode strings through clipboard + Ctrl+V.
- CLI front-end: subcommands such as `vrcpilot launch / screenshot / ocr / detect / mouse / keyboard / paste / capture / ...`, with tab completion via `argcomplete`.
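To make the `TM_CCOEFF_NORMED` score mentioned in the feature list concrete, here is a minimal pure-Python sketch of the formula on tiny grayscale grids. It is an illustration only: `ccoeff_normed` is a hypothetical helper, not part of vrcpilot, and the real engine relies on OpenCV rather than loops like these.

```python
from math import sqrt


def ccoeff_normed(image, template):
    """Return (best_score, (x, y)) for a small grayscale image and template.

    Both arguments are lists of rows of numbers. Each candidate patch and
    the template have their mean subtracted, then the normalized
    cross-correlation is computed (the TM_CCOEFF_NORMED formula).
    """
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    t_flat = [v for row in template for v in row]
    t_mean = sum(t_flat) / len(t_flat)
    t_zero = [v - t_mean for v in t_flat]
    t_norm = sum(v * v for v in t_zero)

    best_score, best_pos = -1.0, (0, 0)
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            patch = [image[y + dy][x + dx] for dy in range(th) for dx in range(tw)]
            p_mean = sum(patch) / len(patch)
            p_zero = [v - p_mean for v in patch]
            p_norm = sum(v * v for v in p_zero)
            denom = sqrt(t_norm * p_norm)
            # Flat patches have zero variance; treat them as "no match".
            score = sum(a * b for a, b in zip(t_zero, p_zero)) / denom if denom else 0.0
            if score > best_score:
                best_score, best_pos = score, (x, y)
    return best_score, best_pos


image = [
    [0, 0, 0, 0],
    [0, 9, 1, 0],
    [0, 1, 9, 0],
    [0, 0, 0, 0],
]
template = [[9, 1], [1, 9]]
score, pos = ccoeff_normed(image, template)
print(round(score, 3), pos)  # the exact copy at (1, 1) scores 1.0
```

Scores fall in the range -1 to 1, which is why a threshold close to 1 selects only near-exact matches.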
Installation
Python 3.12 or later is required.
On Linux, install inputtino-python into the same Python environment before installing vrcpilot. See the Linux requirements below for the native build packages and /dev/uinput permissions. uv tool install creates an isolated environment; on Linux, use the --with inputtino-python example below.
```bash
# Linux only: install inputtino before vrcpilot
pip install "inputtino-python @ git+https://github.com/games-on-whales/inputtino.git@stable#subdirectory=bindings/python"

# Library + CLI
pip install vrcpilot

# Install with OCR support
pip install "vrcpilot[ocr]"

# Install as an isolated CLI tool
uv tool install vrcpilot

# Install as an isolated CLI tool on Linux
uv tool install --with "inputtino-python @ git+https://github.com/games-on-whales/inputtino.git@stable#subdirectory=bindings/python" vrcpilot

# Install from source for development
git clone https://github.com/MLShukai/vrcpilot
cd vrcpilot
uv sync --all-extras
```
Pre-release builds (`0.X.Yrc1`, `0.X.Ya1`, etc.) are excluded from `pip install` by default. To opt in to a pre-release, use `pip install --pre vrcpilot` or `uv tool install --prerelease=allow vrcpilot` (and the same `--prerelease=allow` flag for the Linux `uv tool install --with inputtino-python` variant above).
Platform Requirements
Windows
No additional system packages are required. pywin32 and pydirectinput are installed automatically as dependencies.
Linux
An X11 or XWayland session is required. Wayland-native sessions are not supported. In that environment, focus() / unfocus() emit a RuntimeWarning and return False.
Check your session type with:
```bash
echo $XDG_SESSION_TYPE   # x11 or wayland
echo $DISPLAY            # OK if this has a value, including through XWayland
```
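The same check can be scripted. The sketch below is a heuristic built only on the two environment variables shown above; `describe_session` is a hypothetical helper, not a vrcpilot function, and it assumes that a non-empty `DISPLAY` under a Wayland session means XWayland is reachable.

```python
import os


def describe_session(env=None):
    """Classify the desktop session from XDG_SESSION_TYPE and DISPLAY.

    Returns one of "x11", "wayland+xwayland", "wayland-only" or "unknown".
    Heuristic only: it inspects environment variables and does not try to
    connect to a display server.
    """
    env = os.environ if env is None else env
    session = env.get("XDG_SESSION_TYPE", "")
    has_display = bool(env.get("DISPLAY"))
    if session == "x11":
        return "x11"
    if session == "wayland":
        # A non-empty DISPLAY under Wayland is assumed to mean XWayland.
        return "wayland+xwayland" if has_display else "wayland-only"
    return "unknown"


print(describe_session())
```

Only the `"wayland-only"` result corresponds to the unsupported case described above.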
inputtino-python is built natively from git, so install the following system packages before pip install:
```bash
sudo apt-get install -y cmake build-essential pkg-config libevdev-dev
sudo usermod -aG input "$USER"   # write access to /dev/uinput; log out and back in
```
If the uinput kernel module is disabled, load it with sudo modprobe uinput.
Also note that the distribution name differs from the import name. On PyPI it is inputtino-python; in Python, import it as inputtino.
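Before sending synthetic input on Linux, it can help to confirm that `/dev/uinput` exists and is writable by the current user. The following is a minimal sketch using only the standard library; `uinput_status` is a hypothetical helper and does not reflect how vrcpilot or inputtino check permissions internally.

```python
import os


def uinput_status(path="/dev/uinput"):
    """Report whether the uinput device node is usable by this process.

    "missing"          -> the node does not exist (module probably not loaded)
    "no-write-access"  -> the node exists but this user cannot write to it
    "ok"               -> the node exists and is writable
    """
    if not os.path.exists(path):
        return "missing"
    if not os.access(path, os.W_OK):
        return "no-write-access"
    return "ok"


print(uinput_status())
```

A `"missing"` result points at `sudo modprobe uinput`, while `"no-write-access"` points at the `input` group membership described above.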
macOS
Not supported.
Quick Start (CLI)
The CLI is the quickest entry point for driving VRChat. The basic pipeline is: screenshot emits a Screenshot as YAML, then ocr / detect consume it from stdin or --screenshot.
When using OCR / detect results as click targets, always use display_pos.bbox (not the window-local pos). In multi-monitor environments, or when the window origin is not the top-left corner of the full display, passing pos directly will shift the coordinates.
```bash
# Launch VRChat in desktop mode and wait until startup completes
vrcpilot launch --no-vr --screen-width 1280 --screen-height 720 --wait-timeout 60

# Screenshot -> OCR -> save visualization PNG in one line
vrcpilot screenshot | vrcpilot ocr --viz /tmp/viz.png > /tmp/ocr.yaml

# Pass the same pipeline to image-template detection
vrcpilot screenshot | vrcpilot detect -q assets/button.png > /tmp/det.yaml

# Move the mouse and click (desktop-absolute coordinates)
vrcpilot mouse move 1183 514
vrcpilot mouse click left

# Press a key (--duration defaults to 0.1s, the lower bound VRChat reliably accepts)
vrcpilot keyboard press w --duration 1.0

# Input non-ASCII text (clipboard + Ctrl+V)
vrcpilot paste "こんにちは、VRChat!"

# Terminate (idempotent)
vrcpilot terminate
```
See vrcpilot --help and vrcpilot <subcommand> --help for all options.
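The warning about window-local versus desktop-absolute coordinates can be illustrated with a little arithmetic. The sketch below assumes boxes are `(x, y, w, h)` tuples and that the window origin on the desktop is known; `BBox`, `to_display` and `center` are hypothetical names for illustration, not vrcpilot types.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    x: int
    y: int
    w: int
    h: int


def to_display(local: BBox, window_origin: tuple[int, int]) -> BBox:
    """Shift a window-local box by the window's top-left desktop position."""
    ox, oy = window_origin
    return BBox(local.x + ox, local.y + oy, local.w, local.h)


def center(box: BBox) -> tuple[int, int]:
    """Integer center point of a box, suitable as a click target."""
    return box.x + box.w // 2, box.y + box.h // 2


local = BBox(100, 50, 40, 20)
# Example: the window sits on a second monitor, offset from the desktop origin.
absolute = to_display(local, window_origin=(1920, 120))
print(center(local))     # (120, 60)
print(center(absolute))  # (2040, 180)
```

Clicking at the window-local center here would land 1920 pixels to the left of the intended target, which is the shift the note above warns about.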
Quick Start (Python API)
```python
from time import sleep

import vrcpilot

# launch() waits up to wait_timeout seconds (default 30s) until VRChat's PID appears.
# None means VRChat was not detected within that time.
pid = vrcpilot.launch(no_vr=True, screen_width=1280, screen_height=720)
if pid is None:
    raise RuntimeError("VRChat did not start before launch() timed out")
sleep(45)  # extra warm-up wait: shaders / avatar loading / network sync

try:
    # Capture one frame (None on a recoverable failure)
    shot = vrcpilot.take_screenshot()
    if shot is None:
        raise RuntimeError("could not capture the VRChat screen")

    # OCR all visible words (uses a cached RapidOCREngine when engine is omitted)
    result = vrcpilot.ocr(shot)
    for word in result.words:
        print(word.text, result.display_bbox(word))

    # Move the cursor to the center of the first word and left-click
    if result.words:
        x, y, w, h = result.display_bbox(result.words[0])
        vrcpilot.mouse.move(int(x + w / 2), int(y + h / 2))
        vrcpilot.mouse.click(vrcpilot.MouseButton.LEFT)

    # Press a key
    vrcpilot.keyboard.press(vrcpilot.Key.W, duration=1.0)
finally:
    vrcpilot.terminate()
```
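The fixed `sleep(45)` in the example is the simplest warm-up strategy. As an alternative, a small polling helper can retry a probe until it succeeds or a deadline passes. This is a generic sketch: `wait_until` is a hypothetical helper that is not part of vrcpilot, and which probe signals "ready" is left to the caller.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_until(
    probe: Callable[[], Optional[T]],
    timeout: float = 60.0,
    interval: float = 1.0,
) -> Optional[T]:
    """Call probe() until it returns a non-None value or timeout elapses.

    Returns the first non-None value, or None if the deadline passes.
    The probe is always called at least once.
    """
    deadline = time.monotonic() + timeout
    while True:
        value = probe()
        if value is not None:
            return value
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)


# Stand-in probe that becomes ready on the third attempt.
attempts = iter([None, None, "ready"])
print(wait_until(lambda: next(attempts), timeout=5, interval=0))  # ready
```

Since `take_screenshot()` is documented above as returning `None` on a recoverable failure, it could serve as a probe, although a successful capture alone does not prove that shaders or avatars have finished loading.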
CLI Subcommands
| Subcommand | Purpose |
|---|---|
| `launch` | Start VRChat through Steam. Supports `--no-vr`, `--screen-{width,height}`, `--wait-timeout`, and more |
| `pid` | List running VRChat PIDs, one per line |
| `terminate` | Terminate VRChat (idempotent) |
| `focus` | Bring the VRChat window to the foreground |
| `unfocus` | Send the VRChat window to the bottom of the z-order |
| `screenshot` | Capture one frame and emit a Screenshot YAML to stdout (PNG path or inline base64) |
| `capture` | Record at a fixed FPS. Saves to file with `-o file.mp4`; otherwise emits y4m to stdout |
| `mouse` | `move` / `click` / `scroll` (desktop-absolute coordinates) |
| `keyboard` | `press` (`--duration` defaults to 0.1s) |
| `paste` | Input text through clipboard + Ctrl+V (non-ASCII safe) |
| `ocr` | Run OCR on a Screenshot YAML (stdin pipe or `--screenshot <path>`) |
| `detect` | Template-search a Screenshot YAML with a query image. `-q query.png` / `--threshold` / `--top-k` |
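The subcommands can also be driven from a script through `subprocess`. The sketch below builds a `detect` command line from the flags listed in the table and pipes a screenshot into it. `detect_command` and `screenshot_then` are hypothetical helpers, and the value formats passed to `--threshold` and `--top-k` are assumptions; check `vrcpilot detect --help` for the accepted forms.

```python
import subprocess


def detect_command(query, threshold=None, top_k=None):
    """Build the argv list for `vrcpilot detect` using flags from the table."""
    cmd = ["vrcpilot", "detect", "-q", str(query)]
    if threshold is not None:
        cmd += ["--threshold", str(threshold)]
    if top_k is not None:
        cmd += ["--top-k", str(top_k)]
    return cmd


def screenshot_then(consumer_cmd):
    """Run `vrcpilot screenshot` and feed its YAML to consumer_cmd on stdin.

    Returns the consumer's stdout as text. Raises CalledProcessError if
    either command exits with a non-zero status.
    """
    shot = subprocess.run(["vrcpilot", "screenshot"], check=True, capture_output=True)
    done = subprocess.run(consumer_cmd, input=shot.stdout, check=True, capture_output=True)
    return done.stdout.decode("utf-8")


print(detect_command("assets/button.png", threshold=0.9, top_k=3))
```

With VRChat running, `screenshot_then(detect_command("assets/button.png"))` would mirror the shell pipeline `vrcpilot screenshot | vrcpilot detect -q assets/button.png`.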
Shell Completion
vrcpilot supports tab completion through argcomplete. The following items can be completed:
- Subcommands (`launch` / `pid` / `terminate` / `focus` / `unfocus` / `screenshot` / `capture` / `mouse` / `keyboard` / `paste` / `ocr` / `detect`)
- Options (`--steam-path`, etc.)
- Options that take file paths (`.exe` for `--steam-path`, `.png` for `--query`, etc.)
Requirements
- Install for development with `uv sync`, or install with `uv tool install vrcpilot`, and make sure `register-python-argcomplete` is available on PATH.
- If you do not want to add it to your global PATH, replace `register-python-argcomplete ...` in the commands below with `uv run register-python-argcomplete ...`.
One-Line Setup (Development Repository)
Right after cloning, source / dot-source the bundled bootstrap script to run "create venv -> activate -> register completion" in a single step.
- bash: `. ./clicomp.sh`
- pwsh: `. .\CliComp.ps1`
The script performs the following steps:
- Activate an existing `.venv`, if present
- Run `just setup` if `vrcpilot` is not on PATH, then activate again
- Register `vrcpilot` completion in the current session with `register-python-argcomplete`
If you run it in a subshell, such as bash clicomp.sh or .\CliComp.ps1, neither the venv nor completion settings will remain in the parent shell. Be sure to source / dot-source it (the script rejects normal execution). To make it persistent, add the following line to your shell startup file.
```bash
# ~/.bashrc
. /path/to/vrcpilot/clicomp.sh
```

```powershell
# $PROFILE
. C:\path\to\vrcpilot\CliComp.ps1
```
Bash / Git Bash
To enable completion for the current session only:
```bash
eval "$(register-python-argcomplete vrcpilot)"
```
To make it persistent, add the line above to ~/.bashrc (or ~/.bash_profile in Git Bash).
PowerShell
Both Windows PowerShell 5.1 and pwsh 7.x are supported, though pwsh 7.x is recommended for development.
To enable completion for the current session only:
```powershell
register-python-argcomplete --shell powershell vrcpilot | Out-String | Invoke-Expression
```
To make it persistent, add the Invoke-Expression line above to your PowerShell profile.
```powershell
code $PROFILE   # notepad $PROFILE is also fine
# Append the Invoke-Expression line above to the end of the file and save it
# Open a new session, or reload with `. $PROFILE`
```
Troubleshooting
If completion does not work, see the argcomplete documentation: https://kislyuk.github.io/argcomplete/.
Documentation
- Tutorial / playbook: `docs/usage.md`, a task-based walkthrough (launch -> observe -> click -> teardown)
- CLI reference: `docs/cli.md`, covering all subcommands, flags, and exit codes. Same content as `vrcpilot --help` / `vrcpilot <subcommand> --help`
- Python API reference: `docs/python-api.md`, covering every symbol exposed as `vrcpilot.<name>`
- Changelog: `CHANGELOG.md`
- Contributing guide: `CONTRIBUTING.md`
License
Published under the MIT license.