ai-relay

WebSocket relay that bridges AI coding agent CLIs (Claude Code, Codex, Gemini CLI, Snowflake Cortex, and more) to any web interface — stream reasoning, tool calls, and file changes in real time.

Install

pip install ai-relay

Quick start

One-shot mode (local dev)

# Start the relay server (default: ws://0.0.0.0:8765)
ai-relay --port 8765

Server mode (container / daemon)

# Persistent server — each WebSocket connection becomes one independent agent session
ai-relay serve --port 9000

Then connect from OhWise Lab (or any WebSocket client) and send a handshake:

{"tool": "claude", "folder": "/path/to/project", "model": "claude-sonnet-4-6"}

The relay streams structured JSON events over WebSocket and forwards your messages to the selected backend. Claude Code and Gemini use native JSONL process protocols, Codex uses the app-server JSON-RPC protocol, and Snowflake Cortex uses HTTP/SSE. The PTY bridge is retained only for generic/legacy CLI tools.
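The connect-then-handshake flow described above can be sketched as a small client. This is a minimal sketch, not part of ai-relay: it assumes the third-party `websockets` package, and the helper names `build_handshake` and `stream_events` are hypothetical. The handshake fields are the ones shown in the example above.

```python
import json


def build_handshake(tool: str, folder: str, model: str) -> str:
    """Serialize the handshake fields from the example above as JSON."""
    return json.dumps({"tool": tool, "folder": folder, "model": model})


async def stream_events(uri: str, handshake: str) -> None:
    """Connect, send the handshake first, then print every event received."""
    # Third-party dependency (pip install websockets), imported lazily so
    # build_handshake stays usable without it.
    import websockets

    async with websockets.connect(uri) as ws:
        await ws.send(handshake)
        async for raw in ws:
            # The relay is described as streaming structured JSON events.
            print(json.loads(raw))
```

To try it against a relay started with `ai-relay serve --port 9000`, one could run `asyncio.run(stream_events("ws://localhost:9000", build_handshake("claude", "/path/to/project", "claude-sonnet-4-6")))`; the address is an assumption about a local setup.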

Running in a container

ai-relay serve is designed to run inside a Docker container as a persistent daemon:

FROM python:3.11-slim
RUN pip install ai-relay
# Install your AI CLI here (e.g. npm install -g @anthropic-ai/claude-code)
CMD ["ai-relay", "serve", "--port", "9000"]

Each incoming WebSocket connection spawns an independent agent session. Multiple clients can connect simultaneously.

Event types

| Type | Description |
|---|---|
| session_start | Process spawned |
| session_end | Process exited (includes exit_code) |
| stdout / stderr | Raw output lines |
| reasoning | Agent thinking/planning text |
| tool_call | Agent invoking a tool (Read, Edit, Bash…) |
| tool_result | Result of a tool call |
| file_diff | File created or edited |
| response | Final answer text |
| assistant_message | Native structured assistant message |
| user_message | Native structured user/tool-result message |
| stream_event | Native streaming event |
| status | Native status/control event |
| permission_request | Tool permission prompt from a structured backend |
| permission_cancelled | Pending permission prompt was cancelled |
| control_response | Native control response acknowledgment |
| tool_progress | Native tool progress event |
| quota_warning | API quota / rate limit detected |
| context_warning | Context window nearing limit (includes context_pct) |
| context_compacted | Context was compacted |
| error | Relay or process error |
| input_ack | Relay confirms your message was sent to the process |
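A client will usually branch on the event type. The sketch below shows one way to do that for two of the events listed above. It assumes each event is a JSON object that names its type in a `"type"` key, which the list above does not state explicitly; `describe_event` is a hypothetical helper, not an ai-relay function.

```python
import json


def describe_event(raw: str) -> str:
    """Summarize one relay event (a JSON string) as a single line."""
    event = json.loads(raw)
    # Assumption: the event type lives in a "type" key.
    kind = event.get("type", "unknown")
    if kind == "session_end":
        # exit_code is documented as part of session_end.
        return f"session_end: exit_code={event.get('exit_code')}"
    if kind == "context_warning":
        # context_pct is documented as part of context_warning.
        return f"context_warning: context_pct={event.get('context_pct')}"
    # Every other event type is passed through by name only.
    return kind
```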

Sending commands

Send JSON over WebSocket:

{"text": "refactor the authentication module to use JWT"}

Claude Code also accepts structured web-client messages:

{"type": "user_message", "content": "refactor the authentication module to use JWT"}

Permission responses:

{"type": "permission_response", "request_id": "req", "behavior": "allow", "updatedInput": {"command": "git status"}}

Codex permission responses can also use:

{"type": "permission_response", "request_id": "req", "allow": true}

Interrupt the active structured turn:

{"type": "interrupt"}
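The message shapes above are small enough to build with a few helpers. This is a sketch only: the function names are hypothetical, and treating `updatedInput` as optional is an assumption, since the example above always includes it.

```python
import json
from typing import Any, Dict, Optional


def text_message(text: str) -> str:
    """Plain text message, as in the first example above."""
    return json.dumps({"text": text})


def permission_response(
    request_id: str,
    behavior: str,
    updated_input: Optional[Dict[str, Any]] = None,
) -> str:
    """Reply to a permission_request event."""
    message: Dict[str, Any] = {
        "type": "permission_response",
        "request_id": request_id,
        "behavior": behavior,
    }
    # Assumption: updatedInput may be left out when nothing was changed.
    if updated_input is not None:
        message["updatedInput"] = updated_input
    return json.dumps(message)


def interrupt() -> str:
    """Interrupt the active structured turn."""
    return json.dumps({"type": "interrupt"})
```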

Codex uses codex app-server --listen stdio:// and keeps a persistent thread behind the WebSocket session:

{"tool": "codex", "folder": "/path/to/project", "model": "gpt-5.2"}

Gemini CLI uses headless stream-json mode. Each text message starts one Gemini turn:

{"tool": "gemini", "folder": "/path/to/project", "model": "gemini-2.5-flash"}

Snowflake Cortex uses API configuration in the handshake.

Cortex chat mode:

{
  "tool": "cortex",
  "mode": "chat",
  "model": "claude-sonnet-4-5",
  "snowflake": {
    "account_url": "https://<account>.snowflakecomputing.com",
    "token_env": "SNOWFLAKE_PAT"
  }
}

Cortex Analyst mode:

{
  "tool": "cortex",
  "mode": "analyst",
  "snowflake": {
    "account_url": "https://<account>.snowflakecomputing.com",
    "token_env": "SNOWFLAKE_PAT",
    "semantic_view": "DB.SCHEMA.VIEW"
  }
}
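The two Cortex handshakes differ only in `mode`, `model`, and `semantic_view`, so one builder can cover both. This is a sketch that mirrors the examples above; `cortex_handshake` is a hypothetical name, and which fields the relay actually requires in each mode is not stated here.

```python
import json
from typing import Any, Dict, Optional


def cortex_handshake(
    mode: str,
    account_url: str,
    token_env: str,
    model: Optional[str] = None,
    semantic_view: Optional[str] = None,
) -> str:
    """Build a Cortex handshake shaped like the examples above."""
    snowflake: Dict[str, Any] = {"account_url": account_url, "token_env": token_env}
    if semantic_view is not None:
        # Shown above only for analyst mode.
        snowflake["semantic_view"] = semantic_view
    handshake: Dict[str, Any] = {"tool": "cortex", "mode": mode}
    if model is not None:
        # Shown above only for chat mode.
        handshake["model"] = model
    handshake["snowflake"] = snowflake
    return json.dumps(handshake, indent=2)
```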

To send a CLI slash command (e.g. /compact, /clear), pass it as the text of a regular message:

{"text": "/compact"}

Supported tools

| Tool | Adapter | tool value |
|---|---|---|
| Claude Code | ClaudeCodeAdapter | "claude" / "claude-code" |
| OpenAI Codex | CodexAdapter | "codex" |
| Gemini CLI | GeminiAdapter | "gemini" |
| Snowflake Cortex | CortexAdapter | "cortex" |
| Any CLI | GenericAdapter | "generic" |
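A client can check a `tool` value before connecting by mirroring the mapping above. This is a client-side lookup for illustration only, not the relay's own dispatch code; `ADAPTER_BY_TOOL` and `adapter_name` are hypothetical names, and the adapter names are kept as plain strings.

```python
# Tool values and adapter names copied from the list above.
ADAPTER_BY_TOOL = {
    "claude": "ClaudeCodeAdapter",
    "claude-code": "ClaudeCodeAdapter",
    "codex": "CodexAdapter",
    "gemini": "GeminiAdapter",
    "cortex": "CortexAdapter",
    "generic": "GenericAdapter",
}


def adapter_name(tool: str) -> str:
    """Return the adapter name for a handshake tool value, or raise ValueError."""
    try:
        return ADAPTER_BY_TOOL[tool]
    except KeyError:
        raise ValueError(f"unsupported tool value: {tool!r}") from None
```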

Configuration reference (ohwise-lab-ctrl)

When using ohwise-lab alongside ai-relay, all settings are controlled via environment variables — nothing is hardcoded:

| Variable | Default | Description |
|---|---|---|
| LAB_MODE | single | single = subprocesses in lab-ctrl; multi = per-user Docker containers |
| LAB_WORKSPACE_ROOT | /var/ohwise-lab-workspaces | Host path for user workspace volumes |
| LAB_IMAGE | ohwise-lab-ctrl:local | Docker image for user containers (must have ai-relay + CLIs installed) |
| LAB_NETWORK | lab-network | Docker network user containers join. For Compose: <project>_default |
| LAB_CONTAINER_PORT | 9000 | Internal port ai-relay serve listens on inside user containers |
| LAB_CONTAINER_USER | labuser | OS user inside user containers |
| LAB_CONTAINER_HOME | /home/<LAB_CONTAINER_USER> | Home dir inside user containers |
| LAB_CONTAINER_STARTUP_DELAY | 1.5 | Seconds to wait for ai-relay to be ready after container start |
| LAB_CONTAINER_WS_TIMEOUT | 15 | Seconds to wait for WebSocket connection to user container |
| LAB_IDLE_TIMEOUT_SECS | 1800 | Seconds of inactivity before a user container is eligible for cleanup |
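The variables and defaults above can be read into a typed object in a few lines. This is a sketch of how such settings might be loaded, not ohwise-lab-ctrl's actual code; `LabConfig` and `load_lab_config` are hypothetical names, and the numeric types are inferred from the defaults.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class LabConfig:
    mode: str
    workspace_root: str
    image: str
    network: str
    container_port: int
    container_user: str
    container_home: str
    startup_delay: float
    ws_timeout: float
    idle_timeout_secs: int


def load_lab_config(env: Mapping[str, str] = os.environ) -> LabConfig:
    """Read the LAB_* variables, falling back to the documented defaults."""
    user = env.get("LAB_CONTAINER_USER", "labuser")
    return LabConfig(
        mode=env.get("LAB_MODE", "single"),
        workspace_root=env.get("LAB_WORKSPACE_ROOT", "/var/ohwise-lab-workspaces"),
        image=env.get("LAB_IMAGE", "ohwise-lab-ctrl:local"),
        network=env.get("LAB_NETWORK", "lab-network"),
        container_port=int(env.get("LAB_CONTAINER_PORT", "9000")),
        container_user=user,
        # The home directory default follows the container user.
        container_home=env.get("LAB_CONTAINER_HOME", f"/home/{user}"),
        startup_delay=float(env.get("LAB_CONTAINER_STARTUP_DELAY", "1.5")),
        ws_timeout=float(env.get("LAB_CONTAINER_WS_TIMEOUT", "15")),
        idle_timeout_secs=int(env.get("LAB_IDLE_TIMEOUT_SECS", "1800")),
    )
```

Passing a plain dict instead of `os.environ` makes the defaults easy to inspect, for example `load_lab_config({"LAB_MODE": "multi"})`.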

Python API

from ai_relay import RelayServer

server = RelayServer(host="0.0.0.0", port=8765)
server.run()

Changelog

See CHANGELOG.md for release notes.

License

MIT
