Skip to main content

Open-source personal AI assistant constructor with voice control and persona system

Project description

OpenOcto ๐Ÿ™

Open-source personal AI assistant constructor with voice control and persona system.

Hold [Space] โ†’ speak โ†’ get a voice response. Fully local audio processing. Your voice never leaves the device.

Features

  • Wake word detection โ€” say "Hi Octo" or "Hey Octo" to activate hands-free (powered by openWakeWord)
  • Push-to-talk voice input (hold Space)
  • Local STT via whisper.cpp โ€” auto-detects language (30+ languages supported)
  • Local TTS via piper-tts โ€” natural voices in English, Spanish, French, and more
  • Pluggable AI backends โ€” Claude (native API), Claude Max Proxy (use your subscription), OpenAI, and any OpenAI-compatible provider
  • Persona system โ€” character, voice, and system prompt as a single package
  • Cross-platform: macOS, Linux, Windows

Quick Start

One command to install, configure, and download models:

macOS / Linux:

curl -sSL https://raw.githubusercontent.com/openocto-dev/openocto/main/install.sh | bash

Windows (PowerShell):

irm https://raw.githubusercontent.com/openocto-dev/openocto/main/install.ps1 | iex

Already cloned the repo? Same script works locally:

./install.sh           # macOS / Linux
.\scripts\install.ps1          # Windows

The installer automatically detects whether you're inside the project or need to clone it, then:

  1. Creates a virtual environment and installs dependencies
  2. Launches the setup wizard โ€” choose AI backend, enter API key, download models

After setup:

openocto start

macOS: If prompted, grant microphone access and Accessibility permissions to your terminal app (System Settings โ†’ Privacy & Security).

Usage

๐Ÿ™ OpenOcto v0.1.0 | Persona: Octo | AI: claude
   Hold [Space] to speak, [Ctrl+C] to quit

You [en]: What's the capital of France?
Octo: The capital of France is Paris.

You [en]: What's the weather like in Tokyo?
Octo: I don't have access to real-time weather data, but you can check weather.com or ask me anything else.

CLI Commands

openocto start                        # start assistant (auto-selects user if only one)
openocto start --user Dmitry          # start as a specific user (skips prompt)
openocto start --persona octo         # specify persona
openocto start --ai claude-proxy      # use Claude subscription (via proxy)
openocto start --ai openai            # use OpenAI
openocto setup                        # re-run the setup wizard
openocto config show                  # show resolved configuration
openocto user list                    # list all users
openocto user add "Anna"              # add a new user
openocto user add "Anna" --default    # add and set as default
openocto user delete "Anna"           # delete user and all their data
openocto user delete "Anna" -y        # delete without confirmation
openocto user default "Anna"          # set default user
openocto --version

Multi-user

If multiple users are set up, openocto start will prompt you to choose:

๐Ÿ‘ค Multiple users โ€” who are you?
  1. Dmitry (last active)
  2. Anna

Enter number [1]:

To skip the prompt, pass --user:

openocto start --user Anna

Each user has their own conversation history per persona.

Requirements

  • Python 3.10+
  • macOS (Apple Silicon or Intel), Linux, or Windows
  • Microphone and speakers

macOS (fresh install)

A clean macOS doesn't include Python or Git. Install them before running the installer:

# 1. Install Xcode Command Line Tools (includes Git)
xcode-select --install

# 2. Install Homebrew (package manager)
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

# 3. Install Python
brew install python@3.13

Linux (Debian/Ubuntu)

sudo apt update && sudo apt install -y python3 python3-venv python3-pip git

Configuration

OpenOcto looks for configuration in this order:

  1. config/default.yaml (built-in defaults)
  2. ~/.openocto/config.yaml (your overrides โ€” created by openocto setup)
  3. Environment variables (${ANTHROPIC_API_KEY}, etc.)
  4. CLI flags

Example ~/.openocto/config.yaml:

persona: "octo"

ai:
  default_backend: "claude"
  claude:
    model: "claude-opus-4-6"

stt:
  model_size: "medium"    # better accuracy on Apple Silicon M4

tts:
  models:
    en: "en_US-amy-medium"

AI Backends

Backend Config key API Key Notes
Claude (Anthropic API) claude ANTHROPIC_API_KEY Native SDK, default
Claude Max Proxy claude-proxy Not needed Uses Claude subscription
OpenAI openai OPENAI_API_KEY OpenAI-compatible
Z.AI zai ZAI_API_KEY OpenAI-compatible

Claude Max Proxy (use your Claude subscription)

If you have a Claude Pro/Max subscription, you can use it instead of an API key:

# Install and start the proxy (requires Claude Code CLI to be authenticated)
npx claude-max-proxy

# In another terminal
openocto start --ai claude-proxy

The proxy runs at http://localhost:3456/v1 and bridges OpenAI-format requests through your authenticated Claude session.

Adding custom providers

Any OpenAI-compatible provider can be added in ~/.openocto/config.yaml:

ai:
  default_backend: "gemini"
  providers:
    gemini:
      api_key: "${GEMINI_API_KEY}"
      model: "gemini-2.5-pro"
      base_url: "https://generativelanguage.googleapis.com/v1beta/openai"

    deepseek:
      api_key: "${DEEPSEEK_API_KEY}"
      model: "deepseek-chat"
      base_url: "https://api.deepseek.com/v1"

    ollama:
      model: "llama3:8b"
      base_url: "http://localhost:11434/v1"
      no_auth: true    # local services don't need an API key

Whisper Models

Model Size Speed (M2) Accuracy
tiny 75MB Very fast Low
base 142MB Fast Medium
small 466MB ~2s/10s audio Good (default)
medium 1.5GB ~3s/10s audio High

Wake Word

Enable hands-free activation in ~/.openocto/config.yaml:

wakeword:
  enabled: true
  model: octo_v0.1     # responds to "Hi Octo", "Hey Octo", "Ok Octo"
  threshold: 0.5       # lower = more sensitive (0.1โ€“0.9)

The octo_v0.1 model is downloaded automatically on first run from openocto-dev/openocto-models.

You can also use any built-in openWakeWord model:

wakeword:
  enabled: true
  model: alexa_v0.1    # built-in, no download needed

Train your own wake word

Want a custom wake word? Use openocto-wakeword โ€” a toolkit for training ONNX wake word models on Apple Silicon (Mac M1/M2/M3/M4), no CUDA required.

Personas

Personas live in the personas/ directory. Each persona is a folder with:

personas/
โ””โ”€โ”€ octo/
    โ”œโ”€โ”€ persona.yaml       # name, voice config, personality
    โ””โ”€โ”€ system_prompt.md   # instructions for the AI

Creating a Custom Persona

# personas/mypersona/persona.yaml
name: "mypersona"
display_name: "My Persona"
description: "My custom assistant"

voice:
  engine: "piper"
  models:
    en: "en_US-amy-medium"
  length_scale: 1.0

personality:
  tone: "friendly"        # warm, professional, playful, serious
  verbosity: "balanced"   # brief, balanced, detailed
  formality: "informal"   # formal, informal, casual
<!-- personas/mypersona/system_prompt.md -->
You are [Name], a helpful assistant.
Always respond in the same language the user speaks.
Keep responses concise โ€” they will be spoken aloud.
openocto start --persona mypersona

Testing the Microphone

openocto test mic

Development

pip install -e ".[dev]"
python -m pytest tests/ -v

Project Structure

openocto/
โ”œโ”€โ”€ openocto/
โ”‚   โ”œโ”€โ”€ app.py              # Main orchestrator
โ”‚   โ”œโ”€โ”€ config.py           # Configuration loader (Pydantic)
โ”‚   โ”œโ”€โ”€ setup_wizard.py     # Interactive setup wizard
โ”‚   โ”œโ”€โ”€ event_bus.py        # Async pub/sub
โ”‚   โ”œโ”€โ”€ state_machine.py    # Pipeline state machine
โ”‚   โ”œโ”€โ”€ audio/              # Capture and playback
โ”‚   โ”œโ”€โ”€ stt/                # Speech-to-Text (whisper.cpp)
โ”‚   โ”œโ”€โ”€ tts/                # Text-to-Speech (piper-tts)
โ”‚   โ”œโ”€โ”€ vad/                # Voice Activity Detection (Silero)
โ”‚   โ”œโ”€โ”€ ai/                 # AI backends (Claude, OpenAI-compat)
โ”‚   โ”œโ”€โ”€ persona/            # Persona loader
โ”‚   โ””โ”€โ”€ utils/              # Model downloader, keyboard listener
โ”œโ”€โ”€ personas/octo/          # Default persona
โ”œโ”€โ”€ config/default.yaml     # Default configuration
โ”œโ”€โ”€ install.sh              # macOS/Linux installer
โ”œโ”€โ”€ install.ps1             # Windows installer
โ”œโ”€โ”€ tests/                  # Unit tests
โ””โ”€โ”€ pyproject.toml

Brand

"OpenOcto" name, logo, mascot, and persona character designs are trademarks and copyrighted works of the OpenOcto project author. All character artwork ยฉ 2026 OpenOcto Contributors. All rights reserved. See BRAND.md for usage guidelines.

License

Business Source License 1.1 โ€” free for personal and non-commercial use. Converts to Apache 2.0 on 2030-03-30.

Website: openocto.dev Maintainer: Dmitry Rman (@Dmitry-rman)

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

openocto_dev-0.1.1.tar.gz (99.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

openocto_dev-0.1.1-py3-none-any.whl (69.2 kB view details)

Uploaded Python 3

File details

Details for the file openocto_dev-0.1.1.tar.gz.

File metadata

  • Download URL: openocto_dev-0.1.1.tar.gz
  • Upload date:
  • Size: 99.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for openocto_dev-0.1.1.tar.gz
Algorithm Hash digest
SHA256 6fbdba05812c1491b94c9a540fdb69969d592b209f4d9a7f262585cdcf6e84e2
MD5 0cc089afec5a6f4e5046829e7da43e53
BLAKE2b-256 85888050126ab59943f46b78b3f11ee0bd04c5139c1fa4270e5461428c5f3d69

See more details on using hashes here.

File details

Details for the file openocto_dev-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: openocto_dev-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 69.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for openocto_dev-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 d6b680758c675ef01f3520285925f2fd4f518969017e5df35cd5e55e904ab4dd
MD5 48e47581d4881b4764f4bbb468fde417
BLAKE2b-256 baaf6dc9a5d48ed46fcc1876f037dd6d5bd443d1137c2571e0431f72bcab63ab

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page