Open-source personal AI assistant constructor with voice control and persona system

These details have not been verified by PyPI

Project description

OpenOcto 🐙

Open-source personal AI assistant constructor with voice control and persona system.

Hold [Space] → speak → get a voice response. Fully local audio processing. Your voice never leaves the device. Wake word detection ("Hey Hestia") coming in Phase 2.

Features

Push-to-talk voice input (hold Space)
Local STT via whisper.cpp — auto-detects language (30+ languages supported)
Local TTS via piper-tts — natural voices in English, Spanish, French, and more
Pluggable AI backends — Claude (native API), Claude Max Proxy (use your subscription), OpenAI, and any OpenAI-compatible provider
Persona system — character, voice, and system prompt as a single package
Cross-platform: macOS, Linux, Windows

Quick Start

One command to install, configure, and download models:

macOS / Linux:

curl -sSL https://raw.githubusercontent.com/openocto-dev/openocto/main/scripts/install.sh | bash

Windows (PowerShell):

irm https://raw.githubusercontent.com/openocto-dev/openocto/main/scripts/install.ps1 | iex

Already cloned the repo? Same script works locally:

./scripts/install.sh           # macOS / Linux
.\scripts\install.ps1          # Windows

The installer automatically detects whether you're inside the project or need to clone it, then:

Creates a virtual environment and installs dependencies
Launches the setup wizard — choose AI backend, enter API key, download models

After setup:

openocto start

macOS: If prompted, grant microphone access and Accessibility permissions to your terminal app (System Settings → Privacy & Security).

Usage

🐙 OpenOcto v0.1.0 | Persona: Octo | AI: claude
   Hold [Space] to speak, [Ctrl+C] to quit

You [en]: What's the capital of France?
Octo: The capital of France is Paris.

You [en]: What's the weather like in Tokyo?
Octo: I don't have access to real-time weather data, but you can check weather.com or ask me anything else.

CLI Commands

openocto start                        # start assistant (auto-selects user if only one)
openocto start --user Dmitry          # start as a specific user (skips prompt)
openocto start --persona octo         # specify persona
openocto start --ai claude-proxy      # use Claude subscription (via proxy)
openocto start --ai openai            # use OpenAI
openocto setup                        # re-run the setup wizard
openocto config show                  # show resolved configuration
openocto --version

Multi-user

If multiple users are set up, openocto start will prompt you to choose:

👤 Multiple users — who are you?
  1. Dmitry (last active)
  2. Anna

Enter number [1]:

To skip the prompt, pass --user:

openocto start --user Anna

Each user has their own conversation history per persona.

Requirements

Python 3.10+
macOS (Apple Silicon or Intel), Linux, or Windows
Microphone and speakers

Configuration

OpenOcto looks for configuration in this order:

config/default.yaml (built-in defaults)
~/.openocto/config.yaml (your overrides — created by openocto setup)
Environment variables (${ANTHROPIC_API_KEY}, etc.)
CLI flags

Example ~/.openocto/config.yaml:

persona: "octo"

ai:
  default_backend: "claude"
  claude:
    model: "claude-opus-4-6"

stt:
  model_size: "medium"    # better accuracy on Apple Silicon M4

tts:
  models:
    en: "en_US-amy-medium"

AI Backends

Backend	Config key	API Key	Notes
Claude (Anthropic API)	`claude`	`ANTHROPIC_API_KEY`	Native SDK, default
Claude Max Proxy	`claude-proxy`	Not needed	Uses Claude subscription
OpenAI	`openai`	`OPENAI_API_KEY`	OpenAI-compatible
Z.AI	`zai`	`ZAI_API_KEY`	OpenAI-compatible

Claude Max Proxy (use your Claude subscription)

If you have a Claude Pro/Max subscription, you can use it instead of an API key:

# Install and start the proxy (requires Claude Code CLI to be authenticated)
npx claude-max-proxy

# In another terminal
openocto start --ai claude-proxy

The proxy runs at http://localhost:3456/v1 and bridges OpenAI-format requests through your authenticated Claude session.

Adding custom providers

Any OpenAI-compatible provider can be added in ~/.openocto/config.yaml:

ai:
  default_backend: "gemini"
  providers:
    gemini:
      api_key: "${GEMINI_API_KEY}"
      model: "gemini-2.5-pro"
      base_url: "https://generativelanguage.googleapis.com/v1beta/openai"

    deepseek:
      api_key: "${DEEPSEEK_API_KEY}"
      model: "deepseek-chat"
      base_url: "https://api.deepseek.com/v1"

    ollama:
      model: "llama3:8b"
      base_url: "http://localhost:11434/v1"
      no_auth: true    # local services don't need an API key

Whisper Models

Model	Size	Speed (M2)	Accuracy
`tiny`	75MB	Very fast	Low
`base`	142MB	Fast	Medium
`small`	466MB	~2s/10s audio	Good (default)
`medium`	1.5GB	~3s/10s audio	High

Personas

Personas live in the personas/ directory. Each persona is a folder with:

personas/
└── octo/
    ├── persona.yaml       # name, voice config, personality
    └── system_prompt.md   # instructions for the AI

Creating a Custom Persona

# personas/mypersona/persona.yaml
name: "mypersona"
display_name: "My Persona"
description: "My custom assistant"

voice:
  engine: "piper"
  models:
    en: "en_US-amy-medium"
  length_scale: 1.0

personality:
  tone: "friendly"        # warm, professional, playful, serious
  verbosity: "balanced"   # brief, balanced, detailed
  formality: "informal"   # formal, informal, casual

<!-- personas/mypersona/system_prompt.md -->
You are [Name], a helpful assistant.
Always respond in the same language the user speaks.
Keep responses concise — they will be spoken aloud.

openocto start --persona mypersona

Testing the Microphone

openocto test mic

Development

pip install -e ".[dev]"
python -m pytest tests/ -v

Project Structure

openocto/
├── openocto/
│   ├── app.py              # Main orchestrator
│   ├── config.py           # Configuration loader (Pydantic)
│   ├── setup_wizard.py     # Interactive setup wizard
│   ├── event_bus.py        # Async pub/sub
│   ├── state_machine.py    # Pipeline state machine
│   ├── audio/              # Capture and playback
│   ├── stt/                # Speech-to-Text (whisper.cpp)
│   ├── tts/                # Text-to-Speech (piper-tts)
│   ├── vad/                # Voice Activity Detection (Silero)
│   ├── ai/                 # AI backends (Claude, OpenAI-compat)
│   ├── persona/            # Persona loader
│   └── utils/              # Model downloader, keyboard listener
├── personas/octo/          # Default persona
├── config/default.yaml     # Default configuration
├── scripts/
│   ├── install.sh          # macOS/Linux installer
│   └── install.ps1         # Windows installer
├── tests/                  # Unit tests
└── pyproject.toml

Brand

"OpenOcto" name, logo, mascot, and persona character designs are trademarks and copyrighted works of the OpenOcto project author. All character artwork © 2026 OpenOcto Contributors. All rights reserved. See BRAND.md for usage guidelines.

License

Business Source License 1.1 — free for personal and non-commercial use. Converts to Apache 2.0 on 2030-03-30.

Website: openocto.dev Author: Rocket Dev Maintainer: Dmitry Rman (@Dmitry-rman)

Project details

These details have not been verified by PyPI

Release history Release notifications | RSS feed

0.1.1

Apr 4, 2026

This version

0.1.0

Apr 1, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

openocto_dev-0.1.0.tar.gz (79.5 kB view details)

Uploaded Apr 1, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

openocto_dev-0.1.0-py3-none-any.whl (62.0 kB view details)

Uploaded Apr 1, 2026 Python 3

File details

Details for the file openocto_dev-0.1.0.tar.gz.

File metadata

Download URL: openocto_dev-0.1.0.tar.gz
Upload date: Apr 1, 2026
Size: 79.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for openocto_dev-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`556353beb2e1bf6225c13359d62b8cb4bb93cb1cd41a672cfbcb2358718a843c`
MD5	`9a761d5b8a0a36b7cf2d2335277c9f38`
BLAKE2b-256	`c68839d2864ad9f87fe47c39dbe900daaf11bcb967c6ca91caadc5334cbda584`

See more details on using hashes here.

File details

Details for the file openocto_dev-0.1.0-py3-none-any.whl.

File metadata

Download URL: openocto_dev-0.1.0-py3-none-any.whl
Upload date: Apr 1, 2026
Size: 62.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for openocto_dev-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`abdb9689287e970d79ab102aedeb79b05e6260460add5e8b92f7064217213901`
MD5	`e9bcc9c188f8a179d0a5771419de1ed1`
BLAKE2b-256	`b91d602b86bfa0a01135791c9a83c30d1e29c6a4d834203f79f46023808ed63d`

See more details on using hashes here.

openocto-dev 0.1.0

Navigation

Verified details

Maintainers

Unverified details

Meta

Project description

OpenOcto 🐙

Features

Quick Start

Usage

CLI Commands

Multi-user

Requirements

Configuration

AI Backends

Claude Max Proxy (use your Claude subscription)

Adding custom providers

Whisper Models

Personas

Creating a Custom Persona

Testing the Microphone

Development

Project Structure

Brand

License

Project details

Verified details

Maintainers

Unverified details

Meta

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes