Skip to main content

Open-source personal AI assistant constructor with voice control and persona system

Project description

OpenOcto ๐Ÿ™

Open-source personal AI assistant constructor with voice control and persona system.

Hold [Space] โ†’ speak โ†’ get a voice response. Fully local audio processing. Your voice never leaves the device. Wake word detection ("Hey Hestia") coming in Phase 2.

Features

  • Push-to-talk voice input (hold Space)
  • Local STT via whisper.cpp โ€” auto-detects language (30+ languages supported)
  • Local TTS via piper-tts โ€” natural voices in English, Spanish, French, and more
  • Pluggable AI backends โ€” Claude (native API), Claude Max Proxy (use your subscription), OpenAI, and any OpenAI-compatible provider
  • Persona system โ€” character, voice, and system prompt as a single package
  • Cross-platform: macOS, Linux, Windows

Quick Start

One command to install, configure, and download models:

macOS / Linux:

curl -sSL https://raw.githubusercontent.com/openocto-dev/openocto/main/scripts/install.sh | bash

Windows (PowerShell):

irm https://raw.githubusercontent.com/openocto-dev/openocto/main/scripts/install.ps1 | iex

Already cloned the repo? Same script works locally:

./scripts/install.sh           # macOS / Linux
.\scripts\install.ps1          # Windows

The installer automatically detects whether you're inside the project or need to clone it, then:

  1. Creates a virtual environment and installs dependencies
  2. Launches the setup wizard โ€” choose AI backend, enter API key, download models

After setup:

openocto start

macOS: If prompted, grant microphone access and Accessibility permissions to your terminal app (System Settings โ†’ Privacy & Security).

Usage

๐Ÿ™ OpenOcto v0.1.0 | Persona: Octo | AI: claude
   Hold [Space] to speak, [Ctrl+C] to quit

You [en]: What's the capital of France?
Octo: The capital of France is Paris.

You [en]: What's the weather like in Tokyo?
Octo: I don't have access to real-time weather data, but you can check weather.com or ask me anything else.

CLI Commands

openocto start                        # start assistant (auto-selects user if only one)
openocto start --user Dmitry          # start as a specific user (skips prompt)
openocto start --persona octo         # specify persona
openocto start --ai claude-proxy      # use Claude subscription (via proxy)
openocto start --ai openai            # use OpenAI
openocto setup                        # re-run the setup wizard
openocto config show                  # show resolved configuration
openocto --version

Multi-user

If multiple users are set up, openocto start will prompt you to choose:

๐Ÿ‘ค Multiple users โ€” who are you?
  1. Dmitry (last active)
  2. Anna

Enter number [1]:

To skip the prompt, pass --user:

openocto start --user Anna

Each user has their own conversation history per persona.

Requirements

  • Python 3.10+
  • macOS (Apple Silicon or Intel), Linux, or Windows
  • Microphone and speakers

Configuration

OpenOcto looks for configuration in this order:

  1. config/default.yaml (built-in defaults)
  2. ~/.openocto/config.yaml (your overrides โ€” created by openocto setup)
  3. Environment variables (${ANTHROPIC_API_KEY}, etc.)
  4. CLI flags

Example ~/.openocto/config.yaml:

persona: "octo"

ai:
  default_backend: "claude"
  claude:
    model: "claude-opus-4-6"

stt:
  model_size: "medium"    # better accuracy on Apple Silicon M4

tts:
  models:
    en: "en_US-amy-medium"

AI Backends

Backend Config key API Key Notes
Claude (Anthropic API) claude ANTHROPIC_API_KEY Native SDK, default
Claude Max Proxy claude-proxy Not needed Uses Claude subscription
OpenAI openai OPENAI_API_KEY OpenAI-compatible
Z.AI zai ZAI_API_KEY OpenAI-compatible

Claude Max Proxy (use your Claude subscription)

If you have a Claude Pro/Max subscription, you can use it instead of an API key:

# Install and start the proxy (requires Claude Code CLI to be authenticated)
npx claude-max-proxy

# In another terminal
openocto start --ai claude-proxy

The proxy runs at http://localhost:3456/v1 and bridges OpenAI-format requests through your authenticated Claude session.

Adding custom providers

Any OpenAI-compatible provider can be added in ~/.openocto/config.yaml:

ai:
  default_backend: "gemini"
  providers:
    gemini:
      api_key: "${GEMINI_API_KEY}"
      model: "gemini-2.5-pro"
      base_url: "https://generativelanguage.googleapis.com/v1beta/openai"

    deepseek:
      api_key: "${DEEPSEEK_API_KEY}"
      model: "deepseek-chat"
      base_url: "https://api.deepseek.com/v1"

    ollama:
      model: "llama3:8b"
      base_url: "http://localhost:11434/v1"
      no_auth: true    # local services don't need an API key

Whisper Models

Model Size Speed (M2) Accuracy
tiny 75MB Very fast Low
base 142MB Fast Medium
small 466MB ~2s/10s audio Good (default)
medium 1.5GB ~3s/10s audio High

Personas

Personas live in the personas/ directory. Each persona is a folder with:

personas/
โ””โ”€โ”€ octo/
    โ”œโ”€โ”€ persona.yaml       # name, voice config, personality
    โ””โ”€โ”€ system_prompt.md   # instructions for the AI

Creating a Custom Persona

# personas/mypersona/persona.yaml
name: "mypersona"
display_name: "My Persona"
description: "My custom assistant"

voice:
  engine: "piper"
  models:
    en: "en_US-amy-medium"
  length_scale: 1.0

personality:
  tone: "friendly"        # warm, professional, playful, serious
  verbosity: "balanced"   # brief, balanced, detailed
  formality: "informal"   # formal, informal, casual
<!-- personas/mypersona/system_prompt.md -->
You are [Name], a helpful assistant.
Always respond in the same language the user speaks.
Keep responses concise โ€” they will be spoken aloud.
openocto start --persona mypersona

Testing the Microphone

openocto test mic

Development

pip install -e ".[dev]"
python -m pytest tests/ -v

Project Structure

openocto/
โ”œโ”€โ”€ openocto/
โ”‚   โ”œโ”€โ”€ app.py              # Main orchestrator
โ”‚   โ”œโ”€โ”€ config.py           # Configuration loader (Pydantic)
โ”‚   โ”œโ”€โ”€ setup_wizard.py     # Interactive setup wizard
โ”‚   โ”œโ”€โ”€ event_bus.py        # Async pub/sub
โ”‚   โ”œโ”€โ”€ state_machine.py    # Pipeline state machine
โ”‚   โ”œโ”€โ”€ audio/              # Capture and playback
โ”‚   โ”œโ”€โ”€ stt/                # Speech-to-Text (whisper.cpp)
โ”‚   โ”œโ”€โ”€ tts/                # Text-to-Speech (piper-tts)
โ”‚   โ”œโ”€โ”€ vad/                # Voice Activity Detection (Silero)
โ”‚   โ”œโ”€โ”€ ai/                 # AI backends (Claude, OpenAI-compat)
โ”‚   โ”œโ”€โ”€ persona/            # Persona loader
โ”‚   โ””โ”€โ”€ utils/              # Model downloader, keyboard listener
โ”œโ”€โ”€ personas/octo/          # Default persona
โ”œโ”€โ”€ config/default.yaml     # Default configuration
โ”œโ”€โ”€ scripts/
โ”‚   โ”œโ”€โ”€ install.sh          # macOS/Linux installer
โ”‚   โ””โ”€โ”€ install.ps1         # Windows installer
โ”œโ”€โ”€ tests/                  # Unit tests
โ””โ”€โ”€ pyproject.toml

Brand

"OpenOcto" name, logo, mascot, and persona character designs are trademarks and copyrighted works of the OpenOcto project author. All character artwork ยฉ 2026 OpenOcto Contributors. All rights reserved. See BRAND.md for usage guidelines.

License

Business Source License 1.1 โ€” free for personal and non-commercial use. Converts to Apache 2.0 on 2030-03-30.

Website: openocto.dev Author: Rocket Dev Maintainer: Dmitry Rman (@Dmitry-rman)

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

openocto_dev-0.1.0.tar.gz (79.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

openocto_dev-0.1.0-py3-none-any.whl (62.0 kB view details)

Uploaded Python 3

File details

Details for the file openocto_dev-0.1.0.tar.gz.

File metadata

  • Download URL: openocto_dev-0.1.0.tar.gz
  • Upload date:
  • Size: 79.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for openocto_dev-0.1.0.tar.gz
Algorithm Hash digest
SHA256 556353beb2e1bf6225c13359d62b8cb4bb93cb1cd41a672cfbcb2358718a843c
MD5 9a761d5b8a0a36b7cf2d2335277c9f38
BLAKE2b-256 c68839d2864ad9f87fe47c39dbe900daaf11bcb967c6ca91caadc5334cbda584

See more details on using hashes here.

File details

Details for the file openocto_dev-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: openocto_dev-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 62.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for openocto_dev-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 abdb9689287e970d79ab102aedeb79b05e6260460add5e8b92f7064217213901
MD5 e9bcc9c188f8a179d0a5771419de1ed1
BLAKE2b-256 b91d602b86bfa0a01135791c9a83c30d1e29c6a4d834203f79f46023808ed63d

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page