AI-native Linux — desktop to server. 187 tools, semantic compositor, three-tier architecture.

Aulinx

AI-native Linux. Desktop to server.

Other AI agents look at your screen. Aulinx IS the screen.

What is it? | How it works | Getting started | Compositor | Tools | Roadmap


What is Aulinx?

Aulinx is an AI layer for Linux that works at three levels:

┌─────────────────────────────────────────────────────────┐
│  Tier 3: Aulinx Compositor                              │
│  Custom Wayland compositor with semantic scene graph     │
│  AI sees every pixel because it rendered them            │
├─────────────────────────────────────────────────────────┤
│  Tier 2: Aulinx Desktop                                 │
│  AT-SPI GUI control on any desktop (GNOME/KDE/Sway)     │
│  Click buttons, read menus, type text — semantically     │
├─────────────────────────────────────────────────────────┤
│  Tier 1: Aulinx Core                                    │
│  Files, git, process, network, docker, system, packages  │
│  Works headless — servers, WSL, Docker, SSH              │
└─────────────────────────────────────────────────────────┘

187 tools across all tiers. A Wayland compositor (Rust) with 33 IPC commands. Semantic desktop understanding — not screenshots, not OCR.

aulinx > why is my computer slow right now?

  > process_list(sort_by=cpu)

  ┌─ Result (9ms) ────────────────────────────────────┐
  │ firefox (42% CPU), code (18% CPU), slack (8% CPU)  │
  └────────────────────────────────────────────────────┘

  Firefox is consuming 42% of your CPU with 47 tabs open.
  Want me to kill background processes?

aulinx > search for "wayland compositor" in Firefox

  > atspi_set_text(app_name=firefox, element_name=Search, text=wayland compositor)

  ┌─ Result (40ms) ──────────────────────────────────┐
  │ "Set text on 'Search': 'wayland compositor'"      │
  └────────────────────────────────────────────────────┘

Unlike other AI desktop agents that use screenshots, Aulinx reads the actual UI structure — semantic, not pixel-based. No OCR needed.

How It Works

Aulinx has two deployment modes:

Mode 1: Agent on any desktop          Mode 2: Full AI compositor
(works today on GNOME/KDE/Sway)       (custom Wayland compositor)

┌─────────────────────────────┐      ┌─────────────────────────────┐
│  CLI / Web UI / Voice / MCP │      │  CLI / Web UI / Voice / MCP │
├─────────────────────────────┤      ├─────────────────────────────┤
│  Agent (187 tools + LLM)    │      │  Agent (187 tools + LLM)    │
├─────────────────────────────┤      ├─────────────────────────────┤
│  aulinx-semanticd (daemon)  │      │  aulinx-compositor (Rust)   │
│  AT-SPI → Scene Graph → IPC │      │  Smithay + Scene Graph      │
├─────────────────────────────┤      │  + Input Injection + IPC    │
│  GNOME / KDE / Sway / Xfce  │      │  Wayland compositor IS the  │
│  Your existing desktop       │      │  AI-native desktop          │
└─────────────────────────────┘      └─────────────────────────────┘

The scene graph is the key abstraction — a structured representation of every window, UI element, and action on your desktop. Both modes expose the same IPC protocol:

{"method": "scene.windows"}         // list all windows with metadata
{"method": "scene.find", "params": {"role": "button", "name": "Save"}}
{"method": "input.type", "params": {"text": "hello"}}
{"method": "input.key", "params": {"combo": "ctrl+s"}}
{"method": "window.focus", "params": {"window_id": 1}}
{"method": "window.close", "params": {"window_id": 1}}
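
To make the abstraction concrete, here is a minimal Python sketch of a scene graph and a lookup that mirrors what scene.find is described as doing. The class names (Window, Element), their fields, and the exact-match semantics are assumptions for illustration; the real types live in the Rust library and may differ.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Element:
    """One UI element in the scene graph (hypothetical shape)."""
    role: str                       # e.g. "button", "menu item", "text"
    name: str                       # accessible label, e.g. "Save"
    actions: list[str] = field(default_factory=list)


@dataclass
class Window:
    """One top-level window and the elements it contains (hypothetical shape)."""
    window_id: int
    app_id: str
    title: str
    elements: list[Element] = field(default_factory=list)


def find(windows: list[Window], role: Optional[str] = None,
         name: Optional[str] = None) -> list[tuple[int, Element]]:
    """Return (window_id, element) pairs whose role and name match exactly.

    A filter left as None matches everything.
    """
    matches = []
    for window in windows:
        for element in window.elements:
            if role is not None and element.role != role:
                continue
            if name is not None and element.name != name:
                continue
            matches.append((window.window_id, element))
    return matches
```

Calling find(windows, role="button", name="Save") on such a graph returns the Save button together with the id of the window that owns it, which is the information an agent needs before sending window.focus or an input command.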

Getting Started

Mode 1: Agent on your existing desktop

Works on any Linux desktop (GNOME, KDE, Sway, Xfce). No custom compositor needed.

Prerequisites

  • Linux with a running desktop (Wayland or X11)
  • Python 3.10+
  • Ollama with a model that supports tool calling
  • python3-pyatspi for GUI control (apt install python3-pyatspi)

Install

git clone https://github.com/aulinx/aulinx.git
cd aulinx
pip install -e .
ollama pull qwen2.5:14b

Run

# Interactive mode
aulinx

# One-shot command
aulinx -c "what windows do I have open?"

# Use a specific model
aulinx -m qwen2.5:14b

# Start the web UI (run the frontend in a second terminal if --serve stays in the foreground)
aulinx --serve
cd ui && npm install && npm run dev
# Open http://localhost:5173

# Resume last conversation
aulinx --resume

# Background daemon with global hotkey (Super+Space)
aulinx --daemon

# Voice input mode (requires faster-whisper)
aulinx --voice

# MCP server for Claude Desktop
aulinx --mcp

# Check system dependencies
aulinx --doctor

Docker (test with a full desktop)

docker compose -f docker/docker-compose.yml up
# Open http://localhost:6080/vnc.html (password: aulinx)
# Inside container: aulinx -m qwen2.5:14b --base-url http://host.docker.internal:11434

Mode 2: AI compositor (Rust)

The custom Wayland compositor with built-in AI understanding.

Prerequisites

  • Linux with Wayland support
  • Rust 1.75+ (nightly recommended)
  • System libs: libwayland-dev libinput-dev libudev-dev libgbm-dev libxkbcommon-dev libseat-dev

Build

cd compositor
cargo build -p aulinx-compositor -p aulinx-semanticd

Run

# Inside an existing Wayland session (opens as a window)
WAYLAND_DISPLAY=wayland-0 ./target/debug/aulinx-compositor

# Launch apps inside the compositor
WAYLAND_DISPLAY=wayland-1 foot    # terminal
WAYLAND_DISPLAY=wayland-1 firefox # browser

# Connect an AI agent via IPC
python3 test_client.py

Compositor

The Aulinx compositor is a Wayland compositor built on Smithay with a semantic scene graph baked in. Every window, UI element, and layout change is exposed as structured data over a Unix socket IPC.

What makes it different

Feature     | Traditional WM  | Aulinx Compositor
Window info | EWMH/IPC hacks  | Semantic scene graph
UI elements | Not accessible  | AT-SPI bridge built-in
AI input    | xdotool/ydotool | Native keyboard injection
Events      | Poll-based      | Push subscriptions
Data format | Mixed protocols | Single JSON-RPC API
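
The difference between polling and push subscriptions comes down to diffing: the server compares successive snapshots of the scene and emits an event only when something changed. A rough Python sketch of that idea follows. The event names (window.opened, window.closed, window.changed) and the snapshot shape are hypothetical and not taken from the Aulinx protocol.

```python
def diff_windows(before: dict[int, dict], after: dict[int, dict]) -> list[dict]:
    """Compare two snapshots keyed by window id and return change events.

    Event names are illustrative placeholders, not the real protocol.
    """
    events = []
    # Ids present only in the new snapshot were opened.
    for window_id in sorted(after.keys() - before.keys()):
        events.append({"event": "window.opened", "window_id": window_id})
    # Ids present only in the old snapshot were closed.
    for window_id in sorted(before.keys() - after.keys()):
        events.append({"event": "window.closed", "window_id": window_id})
    # Ids in both snapshots changed if their metadata differs.
    for window_id in sorted(before.keys() & after.keys()):
        if before[window_id] != after[window_id]:
            events.append({"event": "window.changed", "window_id": window_id})
    return events
```

A subscriber then receives only these events instead of re-reading the whole window list on a timer.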

IPC protocol

# Connect to the compositor's IPC socket
# Default: $XDG_RUNTIME_DIR/aulinx/semantic.sock

# Query windows
echo '{"jsonrpc":"2.0","id":1,"method":"scene.windows","params":{}}' | \
  socat - UNIX-CONNECT:$XDG_RUNTIME_DIR/aulinx/semantic.sock

# Response:
# {"result": [{"id": 1, "app_id": "foot", "title": "foot", 
#   "geometry": {"x": 0, "y": 0, "width": 1280, "height": 800}}]}

# Inject text into the focused window
echo '{"jsonrpc":"2.0","id":2,"method":"input.type","params":{"text":"hello"}}' | ...

# Inject key combos
echo '{"jsonrpc":"2.0","id":3,"method":"input.key","params":{"combo":"ctrl+s"}}' | ...

# Focus a window
echo '{"jsonrpc":"2.0","id":4,"method":"window.focus","params":{"window_id":1}}' | ...

# Close a window
echo '{"jsonrpc":"2.0","id":5,"method":"window.close","params":{"window_id":1}}' | ...

# Subscribe to window events (open/close/focus)
echo '{"jsonrpc":"2.0","id":6,"method":"scene.subscribe","params":{"filter":"*"}}' | ...
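
The same requests can be sent from Python with only the standard library. This is a minimal sketch, assuming the server reads one newline-terminated JSON-RPC request per connection and answers with one newline-terminated JSON object; the README does not specify the framing, so adjust it if the server behaves differently. The helper names call and default_socket_path are hypothetical.

```python
import json
import os
import socket


def default_socket_path() -> str:
    """Build the documented default path: $XDG_RUNTIME_DIR/aulinx/semantic.sock."""
    # Fallback to the conventional per-user runtime dir is an assumption.
    runtime_dir = os.environ.get("XDG_RUNTIME_DIR", f"/run/user/{os.getuid()}")
    return os.path.join(runtime_dir, "aulinx", "semantic.sock")


def call(method, params=None, request_id=1, path=None, timeout=5.0):
    """Send one JSON-RPC 2.0 request over a Unix socket and return the decoded reply."""
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": method,
        "params": params or {},
    }
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        sock.connect(path or default_socket_path())
        # Assumed framing: one JSON object per line.
        sock.sendall(json.dumps(request).encode("utf-8") + b"\n")
        buffer = b""
        while not buffer.endswith(b"\n"):
            chunk = sock.recv(4096)
            if not chunk:          # peer closed the connection
                break
            buffer += chunk
    if not buffer:
        raise ConnectionError("server closed the connection without replying")
    return json.loads(buffer)
```

With a compositor or aulinx-semanticd running, call("scene.windows") would correspond to the first socat example above, and call("input.type", {"text": "hello"}, request_id=2) to the second.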

Architecture

aulinx-compositor (Rust, ~7,900 LOC)
├── Smithay Wayland compositor (winit + DRM backends)
├── Semantic bridge (window → scene graph sync)
├── IPC server (JSON-RPC over Unix socket)
├── Input injection (xkbcommon keymap → keyboard events)
└── Tiling layout (equal-width horizontal split)

aulinx-semantic (Rust library)
├── Scene graph (windows, elements, actions)
├── AT-SPI source (reads GNOME/KDE UI trees)
├── Direct source (compositor integration)
├── Diff engine (push events on changes)
└── Query engine (scene.windows, scene.find, etc.)
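
The "equal-width horizontal split" in the tree above is simple arithmetic: divide the output width by the number of windows and place them side by side. A Python sketch of that calculation follows; the function name is hypothetical, and it ignores gaps and ratios, which the compositor may apply on top. Handing leftover pixels to the leftmost windows is also an assumption.

```python
def tile_equal_width(output_width: int, output_height: int, window_count: int) -> list[dict]:
    """Split an output into side-by-side columns of (nearly) equal width."""
    if window_count <= 0:
        return []
    # Integer division leaves up to window_count - 1 spare pixels;
    # give one extra pixel to each of the leftmost windows.
    base_width, spare = divmod(output_width, window_count)
    rects = []
    x = 0
    for index in range(window_count):
        width = base_width + (1 if index < spare else 0)
        rects.append({"x": x, "y": 0, "width": width, "height": output_height})
        x += width
    return rects
```

For a 1280x800 output, one window gets the full area and three windows get widths of 427, 427, and 426 pixels starting at x = 0, 427, and 854.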

Tools

187 tools across 43 modules. Selected highlights below; run /tools for the full list:

Category      | Tools | Count
Window        | list, get_focused | 2
AT-SPI        | get_tree, find_elements, do_action, read_text, set_text, screenshot | 6
Files         | read, write, edit, move, trash, list, search | 7
Text          | count, grep, replace, head, tail | 5
Git           | status, log, diff, commit, branch, stash | 6
Apps          | launch, list_running | 2
Process       | list, kill | 2
Services      | list, status, start, stop, restart | 5
Network       | status, wifi_list, wifi_connect, wifi_disconnect | 4
Audio         | get_volume, set_volume, mute | 3
Display       | list, brightness | 2
Power         | status, profile, suspend, shutdown | 4
Theme         | get, set_dark, wallpaper_set | 3
Bluetooth     | status, scan, connect, disconnect, toggle | 5
Input         | key_combo, type_text | 2
Session       | who_am_i, uptime, disk_usage, env_get | 4
Packages      | search, install, list_installed | 3
XDG           | open, default_app_get, default_app_set, mime_type_of | 4
Timer         | set_timer, cancel_timer, list_timers | 3
Clipboard     | get, set | 2
Notifications | send | 1
Memory        | store, get, delete, list_namespaces | 4
D-Bus         | list_services, introspect, call | 3
OCR           | screenshot_ocr, image_ocr | 2
DateTime      | now, convert, calendar_show | 3
System        | info, shell_exec | 2
Workflow      | context_get, wait, audit_recent | 3
Workflows     | create, list, run, delete, toggle | 5
Long Memory   | remember, recall, recall_recent, forget, memory_count | 5
Server        | journal_logs, docker_ps, docker_logs, port_list, firewall_status, cron_list, disk_health, system_logs_summary | 8
Compositor    | summary, describe, ascii, suggest, status, config, ping, windows, focused, find_window, element_at, screenshot, annotated_screenshot, window_count, type, key, click, drag, scroll, spawn, focus, close, minimize, swap_master, set_ratio, set_gap, batch, diff, wait_for, run_and_type | 30

The table lists representative tools per category; remaining tools (clipboard, archive, calc, schedule, sysadmin, productivity, AI, and more) bring the total to 187.

Permission Tiers

Tier         | Behavior
Read         | Always auto-allowed
Low-risk     | Auto-allowed, logged
Mutate       | Confirms first time per session, then auto
Destructive  | Always confirms
Irreversible | Always confirms with extra warning
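
The tier rules above can be read as a small decision procedure. The Python sketch below is one possible reading, not the project's implementation. It assumes the "first time per session" rule for the Mutate tier is tracked per tool, which the README does not state, and the names Tier and PermissionGate are hypothetical.

```python
from enum import Enum


class Tier(Enum):
    READ = "read"
    LOW_RISK = "low-risk"
    MUTATE = "mutate"
    DESTRUCTIVE = "destructive"
    IRREVERSIBLE = "irreversible"


class PermissionGate:
    """Decides whether a tool call must be confirmed by the user."""

    def __init__(self):
        # Mutate-tier tools the user has already confirmed in this session.
        self._confirmed = set()

    def needs_confirmation(self, tool: str, tier: Tier) -> bool:
        if tier in (Tier.READ, Tier.LOW_RISK):
            return False                      # auto-allowed
        if tier is Tier.MUTATE:
            return tool not in self._confirmed
        return True                           # destructive / irreversible: always ask

    def record_confirmation(self, tool: str, tier: Tier) -> None:
        # Only mutate-tier confirmations are remembered; higher tiers ask every time.
        if tier is Tier.MUTATE:
            self._confirmed.add(tool)
```

A caller would check needs_confirmation before each tool call, prompt the user when it returns True, and call record_confirmation once the user agrees.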

Slash Commands

/tools    - List all available tools
/context  - Show current desktop context
/history  - Browse past conversation sessions
/audit    - Show recent tool calls with timing
/doctor   - Check system dependencies
/clear    - Clear conversation history
/help     - Show help

Configuration

Configuration lives in ~/.config/aulinx/config.toml (auto-created on first run):

[llm]
model = "qwen2.5:14b"
base_url = "http://localhost:11434"
temperature = 0.3

[permissions]
# Override tool permission tiers
# shell_exec = "mutate"  # uncomment to lower confirmation requirement

Roadmap

Released:

  • v0.1–v0.3: 92→103 tools + CLI + web UI + tests + audit + long-term memory + daemon + voice + MCP + plugins
  • v0.4.0: Semantic compositor — Wayland compositor with scene graph, 20 IPC commands, input injection, DRM/udev backend
  • v0.5.0 (current): Multi-provider LLM (Ollama/OpenAI/Anthropic/Gemini/Qwen), ReAct planner, error recovery, OSWorld benchmark harness, hybrid perception, action grounding, dynamic tool selection, task decomposition, sandboxed execution, history summarization, learning from outcomes, multi-agent delegation, Python SDK, autonomous mode, portal-first screen capture

Planned:

  • v1.0: Daily-drivable compositor, full OSWorld-Verified benchmark run, cross-platform stubs, one-command install

See CHANGELOG.md for the detailed per-release history.

Name

Au (gold, element 79) + linx (Linux / lynx). The gold standard of AI-powered Linux.

Contributing

See CONTRIBUTING.md for development setup, code style, and how to add new tools.

# Python agent
pip install -e ".[dev]"
make test   # run tests
make lint   # check code style

# Rust compositor
cd compositor
cargo build
cargo test

License

MIT
