
Run local open-source AI models (Stable Diffusion, LLMs, TTS, STT, chatbots) in a lightweight Python GUI


AI Runner

Support development. Send crypto: 0x02030569e866e22C9991f55Db0445eeAd2d646c8

Your new favorite local AI platform

AI Runner is an all-in-one, offline-first desktop application, headless server, and Python library for local LLMs, TTS, STT, and image generation.



🐞 Report Bug · ✨ Request Feature · 🛡️ Report Vulnerability · 📖 Wiki


✨ Key Features

| Feature | Description |
|---------|-------------|
| 🗣️ Voice Chat | Real-time conversations with LLMs using espeak or OpenVoice |
| 🤖 Custom AI Agents | Configurable personalities, moods, and RAG-enhanced knowledge |
| 🎨 Visual Workflows | Drag-and-drop LangGraph workflow builder with runtime execution |
| 🖼️ Image Generation | Stable Diffusion (SD 1.5, SDXL) and FLUX models with drawing tools, LoRA, inpainting, and filters |
| 🔒 Privacy First | Runs locally with no external APIs by default; configurable guardrails |
| ⚡ Fast Generation | Uses GGUF and quantization for faster inference and lower VRAM usage |

🌍 Language Support

Language TTS LLM STT GUI
English
Japanese
Spanish/French/Chinese/Korean

⚙️ System Requirements

|  | Minimum | Recommended |
|---|---------|-------------|
| OS | Ubuntu 22.04, Windows 10 | Ubuntu 22.04 (Wayland) |
| CPU | Ryzen 2700K / i7-8700K | Ryzen 5800X / i7-11700K |
| RAM | 16 GB | 32 GB |
| GPU | NVIDIA RTX 3060 | NVIDIA RTX 5080 |
| Storage | 22 GB - 100 GB+ (actual usage varies; SSD recommended) | 100 GB+ |

💾 Installation

Docker (Recommended)

GUI Mode:

xhost +local:docker && docker compose run --rm airunner

Headless API Server:

docker compose run --rm --service-ports airunner --headless

Note: --service-ports is required to expose port 8080 for the API.

The headless server exposes an HTTP API on port 8080 with endpoints:

  • GET /health - Health check and service status
  • POST /llm - LLM inference
  • POST /art - Image generation

Manual Installation (Ubuntu/Debian)

Python 3.13+ required. We recommend using pyenv and venv.

  1. Install system dependencies:

    sudo apt update && sudo apt install -y \
      build-essential cmake git curl wget \
      nvidia-cuda-toolkit pipewire libportaudio2 libxcb-cursor0 \
      espeak espeak-ng-espeak qt6-qpa-plugins qt6-wayland \
      mecab libmecab-dev mecab-ipadic-utf8 libxslt-dev mkcert
    
  2. Create data directory:

    mkdir -p ~/.local/share/airunner
    
  3. Install AI Runner:

    pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
    pip install airunner[all_dev]
    
  4. Install llama-cpp-python with CUDA (Python 3.13, RTX 5080):

    CMAKE_ARGS="-DGGML_CUDA=on -DGGML_CUDA_ARCHITECTURES=90" FORCE_CMAKE=1 \
      pip install --no-binary=:all: --no-cache-dir "llama-cpp-python==0.3.16"

  • Uses GGML_CUDA (the older CUBLAS flag is deprecated).
  • 90 targets CUDA compute capability 9.0; if you are unsure of your GPU's compute capability, drop -DGGML_CUDA_ARCHITECTURES and let the build auto-detect.
  • On Python 3.12 you may instead use the prebuilt wheel: --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cu121 "llama-cpp-python==0.3.16+cu121".
  5. Run:

    airunner
    

For detailed instructions, see the Installation Wiki.


🤖 Models

AI Runner downloads essential TTS/STT models automatically. LLM and image models must be configured:

| Category | Model | Size |
|----------|-------|------|
| LLM (default) | Llama 3.1 8B Instruct (4-bit) | ~4 GB |
| Image | Stable Diffusion 1.5 | ~2 GB |
| Image | SDXL 1.0 | ~6 GB |
| Image | FLUX.1 Dev/Schnell (GGUF) | 8-12 GB |
| TTS | OpenVoice | 654 MB |
| STT | Whisper Tiny | 155 MB |

LLM Providers: Local (HuggingFace), Ollama, OpenRouter, OpenAI

Art Models: Place your models in ~/.local/share/airunner/art/models/
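For example, a downloaded checkpoint can simply be copied into that directory (the .safetensors filename below is a placeholder for whatever file you downloaded):

```shell
# Copy a downloaded checkpoint into AI Runner's art model directory.
mkdir -p ~/.local/share/airunner/art/models
cp ~/Downloads/my-checkpoint.safetensors ~/.local/share/airunner/art/models/
```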


🛠️ CLI Commands

| Command | Description |
|---------|-------------|
| airunner | Launch GUI |
| airunner-headless | Start headless API server |
| airunner-hf-download | Download/manage models from HuggingFace |
| airunner-civitai-download | Download models from CivitAI |
| airunner-build-ui | Rebuild UI from .ui files |
| airunner-tests | Run test suite |
| airunner-generate-cert | Generate SSL certificate |

Note: To download models, use Tools → Download Models from the main application menu, or use airunner-hf-download / airunner-civitai-download from the command line.


🖥️ Headless Server

AI Runner can run as a headless HTTP API server, enabling remote access to LLM, image generation, TTS, and STT capabilities. This is useful for:

  • Running AI services on a remote server
  • Integration with other applications via REST API
  • VS Code integration as an Ollama/OpenAI replacement
  • Automated pipelines and scripting

Quick Start

# Start with defaults (port 8080, LLM only)
airunner-headless

# Start with a specific LLM model
airunner-headless --model /path/to/Qwen2.5-7B-Instruct-4bit

# Run as Ollama replacement for VS Code (port 11434)
airunner-headless --ollama-mode

# Don't preload models - load on first request
airunner-headless --no-preload

Command Line Options

| Option | Description |
|--------|-------------|
| --host HOST | Host address to bind to (default: 0.0.0.0) |
| --port PORT | Port to listen on (default: 8080, or 11434 in ollama-mode) |
| --ollama-mode | Run as Ollama replacement on port 11434 |
| --model, -m PATH | Path to LLM model to load |
| --art-model PATH | Path to Stable Diffusion model to load |
| --tts-model PATH | Path to TTS model to load |
| --stt-model PATH | Path to STT model to load |
| --enable-llm | Enable LLM service |
| --enable-art | Enable Stable Diffusion/art service |
| --enable-tts | Enable TTS service |
| --enable-stt | Enable STT service |
| --no-preload | Don't preload models at startup |

Environment Variables

| Variable | Description |
|----------|-------------|
| AIRUNNER_LLM_MODEL_PATH | Path to LLM model |
| AIRUNNER_ART_MODEL_PATH | Path to art model |
| AIRUNNER_TTS_MODEL_PATH | Path to TTS model |
| AIRUNNER_STT_MODEL_PATH | Path to STT model |
| AIRUNNER_NO_PRELOAD | Set to 1 to disable model preloading |
| AIRUNNER_LLM_ON | Enable LLM service (1 or 0) |
| AIRUNNER_SD_ON | Enable Stable Diffusion (1 or 0) |
| AIRUNNER_TTS_ON | Enable TTS service (1 or 0) |
| AIRUNNER_STT_ON | Enable STT service (1 or 0) |
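As a sketch, the environment variables above can express the same configuration as the CLI flags, set before launching (the model path is a placeholder):

```shell
# Roughly equivalent to: airunner-headless --enable-llm --enable-tts --no-preload
export AIRUNNER_LLM_ON=1
export AIRUNNER_TTS_ON=1
export AIRUNNER_SD_ON=0
export AIRUNNER_STT_ON=0
export AIRUNNER_NO_PRELOAD=1
export AIRUNNER_LLM_MODEL_PATH=/path/to/your/model
airunner-headless
```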

API Endpoints

Native AIRunner Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | /health | Health check and service status |
| POST | /llm | LLM text generation (streaming) |
| POST | /llm/generate | LLM text generation |
| POST | /art | Image generation |
| POST | /tts | Text-to-speech |
| POST | /stt | Speech-to-text |

Ollama-Compatible Endpoints (port 11434)

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | /api/tags | List available models |
| GET | /api/version | Get version info |
| GET | /api/ps | List running models |
| POST | /api/generate | Text generation |
| POST | /api/chat | Chat completion |
| POST | /api/show | Show model info |

OpenAI-Compatible Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | /v1/models | List models |
| POST | /v1/chat/completions | Chat completion with tool support |
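As a sketch, a request body in the standard OpenAI chat-completions shape (the field names follow the OpenAI API; "local-model" is a placeholder name, since the headless server answers with whichever model it has loaded):

```python
import json

# Standard OpenAI-style chat-completions payload.
payload = {
    "model": "local-model",  # placeholder; the server uses its loaded model
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"},
    ],
    "temperature": 0.7,
    "max_tokens": 100,
}
body = json.dumps(payload)
```

Send this body in a POST to http://localhost:8080/v1/chat/completions with a Content-Type: application/json header.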

Example: LLM Request

curl -X POST http://localhost:8080/llm \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "What is the capital of France?",
    "stream": true,
    "temperature": 0.7,
    "max_tokens": 100
  }'

Example: Image Generation (Art)

# Requires: airunner-headless --enable-art
curl -X POST http://localhost:8080/art \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "A beautiful sunset over mountains",
    "negative_prompt": "blurry, low quality",
    "width": 512,
    "height": 512,
    "steps": 20,
    "seed": 42
  }'
# Returns: {"images": ["base64_png_data..."], "count": 1, "seed": 42}
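The images field carries one base64-encoded PNG string per generated image. A minimal Python sketch for turning a response of that shape into files (the response dict below is a synthetic stand-in for the parsed JSON body, and the filenames are illustrative):

```python
import base64

# Stand-in for a parsed /art response body.
response = {"images": ["iVBORw0KGgo="], "count": 1, "seed": 42}

# Decode each base64 string back into raw PNG bytes.
images = [base64.b64decode(b64_png) for b64_png in response["images"]]

# Write each image out, naming files by seed and index.
for i, png_bytes in enumerate(images):
    with open(f"art_{response['seed']}_{i}.png", "wb") as f:
        f.write(png_bytes)
```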

Example: Text-to-Speech (TTS)

# Requires: airunner-headless --enable-tts
curl -X POST http://localhost:8080/tts \
  -H "Content-Type: application/json" \
  -d '{"text": "Hello, world!"}'
# Returns: {"status": "queued", "message": "Text queued for speech synthesis"}
# Audio plays through system speakers

Example: Speech-to-Text (STT)

# Requires: airunner-headless --enable-stt
# Audio must be base64-encoded WAV (16kHz mono recommended)
curl -X POST http://localhost:8080/stt \
  -H "Content-Type: application/json" \
  -d '{"audio": "UklGRi4AAABXQVZFZm10IBAAAAABAAEA..."}'
# Returns: {"transcription": "Hello world", "status": "success"}
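The audio field is base64-encoded WAV data. A sketch of preparing that field in Python, here generating a short silent 16 kHz mono clip in memory in place of a real recording:

```python
import base64
import io
import wave

# Build a 0.5 s silent 16 kHz mono 16-bit WAV in memory; this stands in
# for a real recording (e.g. bytes read from a file with open(path, "rb")).
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)      # mono
    w.setsampwidth(2)      # 16-bit samples
    w.setframerate(16000)  # 16 kHz
    w.writeframes(b"\x00\x00" * 8000)  # 8000 silent frames = 0.5 s

# The /stt endpoint expects the WAV bytes base64-encoded in "audio".
payload = {"audio": base64.b64encode(buf.getvalue()).decode("ascii")}
```

POST payload as JSON to http://localhost:8080/stt, as in the curl example above.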

Example: Ollama Mode with VS Code

  1. Start the headless server in Ollama mode:

    airunner-headless --ollama-mode --model /path/to/your/model
    
  2. Configure VS Code Continue extension to use http://localhost:11434

  3. The server will respond to Ollama API calls, allowing seamless integration.

Auto-Loading Models

When --no-preload is used, models are automatically loaded on the first request to the corresponding endpoint. This is useful for:

  • Reducing startup time
  • Running multiple services without loading all models upfront
  • Memory-constrained environments

📦 Model Management

Download Models

# List available models
airunner-hf-download

# List only LLM models
airunner-hf-download list --type llm

# Download a model (GGUF by default)
airunner-hf-download qwen3-8b

# Download full safetensors version
airunner-hf-download --full qwen3-8b

# Download any HuggingFace model
airunner-hf-download Qwen/Qwen3-8B

# List downloaded models
airunner-hf-download --downloaded

Delete Models

# Delete a model (with confirmation)
airunner-hf-download --delete Qwen3-8B

# Delete without confirmation (for scripts)
airunner-hf-download --delete Qwen3-8B --force

Download from CivitAI

# Download a model from CivitAI URL
airunner-civitai-download https://civitai.com/models/995002/70s-sci-fi-movie

# Download a specific version
airunner-civitai-download https://civitai.com/models/995002?modelVersionId=1880417

# Download to a custom directory
airunner-civitai-download <url> --output-dir /path/to/models

# Use API key for authentication (for gated models)
airunner-civitai-download <url> --api-key your_api_key

# Or set CIVITAI_API_KEY environment variable
export CIVITAI_API_KEY=your_api_key
airunner-civitai-download <url>

🔒 HTTPS Configuration

AI Runner's local server uses HTTPS by default. Certificates are auto-generated in ~/.local/share/airunner/certs/.

For browser-trusted certificates, install mkcert:

sudo apt install libnss3-tools
mkcert -install

⚖️ Colorado AI Act Notice

Effective February 1, 2026, the Colorado AI Act (SB 24-205) regulates high-risk AI systems.

Your Responsibility: If you use AI Runner for decisions with legal or significant effects on individuals (employment screening, loan eligibility, insurance, housing), you may be classified as a deployer of a high-risk AI system and must:

  • Implement a risk management policy
  • Complete impact assessments
  • Provide consumer notice and appeal mechanisms
  • Report algorithmic discrimination to the Colorado Attorney General

AI Runner's Design: AI Runner is designed with privacy as a core principle—it runs entirely locally with no external data transmission by default. However, certain optional features connect to external services:

  • Model Downloads: Connecting to HuggingFace or CivitAI to download models
  • Web Search / Deep Research: Search queries sent to DuckDuckGo; web pages scraped for research
  • Weather Prompt: Location coordinates sent to Open-Meteo API if enabled
  • External LLM Providers: Prompts sent to OpenRouter or OpenAI if configured

We recommend using a VPN when using features that connect to external services. See our full Privacy Policy for details.


🧪 Testing

# Run headless-safe tests
pytest src/airunner/utils/tests/

# Run display-required tests (Qt/GUI)
xvfb-run -a pytest src/airunner/utils/tests/xvfb_required/

Contributing

See CONTRIBUTING.md and the Development Wiki.

Documentation



