Chuchote

Local-first, hands-free voice assistant for Ollama. Talk to a local LLM and get a spoken answer — wake word → speech-to-text → local reasoning → text-to-speech, running entirely on your machine. No cloud, ever.

Status: Phase 5 — the CLI MVP is complete. Wake word → speech capture (Silero VAD end-of-turn) → faster-whisper → Ollama → Piper → speaker, with memory across restarts, barge-in to interrupt a reply, overlapped synthesis, a config file, and an install script. Push-to-talk remains a fallback (--ptt). See CLAUDE.md for what's next.

Pipeline

wake word → capture (Silero VAD end-of-turn) → faster-whisper (STT)
          → Ollama (reasoning) → Piper (TTS) → speakers  ↺

The wake word (openWakeWord) and VAD (Silero) both run locally on onnxruntime.
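
As a rough illustration of how one turn flows through the stages above, here is a minimal sketch. It is not Chuchote's actual code: `run_turn` and its three callables are hypothetical stand-ins for faster-whisper, Ollama, and Piper, so the control flow can be read (and tested) without any audio or model dependencies.

```python
from typing import Callable


def run_turn(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    reason: Callable[[str], str],
    speak: Callable[[str], None],
) -> str:
    """Run one captured utterance through STT -> LLM -> TTS and return the reply.

    All three callables are hypothetical placeholders for the real stages.
    """
    text = transcribe(audio)   # speech-to-text (faster-whisper in the real pipeline)
    if not text.strip():
        return ""              # nothing intelligible was captured; skip the turn
    reply = reason(text)       # local reasoning (Ollama in the real pipeline)
    speak(reply)               # text-to-speech (Piper in the real pipeline)
    return reply
```

With stub stages, `run_turn(b"...", lambda a: "hello", str.upper, print)` prints and returns `HELLO`.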

Requirements

  • Python 3.10+
  • Ollama running locally with a model pulled (e.g. ollama pull llama3.2)
  • A Piper voice (.onnx + .onnx.json) — drop it in ./voices/
  • A working microphone and speakers

Install

python -m venv .venv
# Windows:  .venv\Scripts\activate
# macOS/Linux:  source .venv/bin/activate
pip install -e .

Or use the install script, which also downloads a default Piper voice and writes a config file:

./scripts/install.ps1     # Windows (PowerShell)
./scripts/install.sh      # macOS/Linux

Run

Make sure Ollama is running (ollama serve) and a voice model is in ./voices/:

chuchote start
# or override defaults:
chuchote start --model llama3.2 --wake-word hey_jarvis --voice voices/en_US-lessac-medium.onnx

By default Chuchote is always listening for the wake word (hey_jarvis). Say it, then speak your request. Silero VAD detects when you've stopped; your request is then transcribed, passed to the model, and the reply is spoken back (starting as soon as the first sentence is ready). Press Ctrl+C to quit.
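
Speaking "as soon as the first sentence is ready" implies cutting the model's streamed output into sentences as it arrives. The sketch below shows one simple way to do that. It is an assumption about the approach, not Chuchote's implementation; the `sentences` helper and its regex are hypothetical, and it will split too eagerly on abbreviations such as "e.g. ".

```python
import re
from typing import Iterable, Iterator

# A sentence boundary: '.', '!' or '?' followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def sentences(tokens: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences from a stream of text fragments."""
    buffer = ""
    for token in tokens:
        buffer += token
        parts = _SENTENCE_END.split(buffer)
        # Every part except the last is a finished sentence.
        for complete in parts[:-1]:
            if complete.strip():
                yield complete.strip()
        buffer = parts[-1]  # keep the unfinished tail for the next fragment
    if buffer.strip():
        yield buffer.strip()  # flush whatever is left when the stream ends
```

Each yielded sentence could be handed to the synthesizer while the model is still generating the rest of the reply.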

The wake-word models download automatically on first run. Other built-in words include alexa, hey_mycroft, and hey_rhasspy (--wake-word); raise --wake-threshold if you get false triggers. A short tone confirms the wake word — silence it with --no-chime.

Barge-in. While Chuchote is speaking, say the wake word again to cut it off and start a new turn (--barge-in wake, the default — echo-robust, works with open speakers). With headphones you can use --barge-in vad so any speech interrupts; --barge-in off lets every reply finish. In push-to-talk mode, pressing the PTT key cuts off the reply (then hold it to speak again).
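
The three barge-in modes boil down to a small decision rule. The function below is a hypothetical sketch of that rule as described in the paragraph above; the name `should_interrupt` and its arguments are illustrative and not part of Chuchote's API.

```python
def should_interrupt(mode: str, wake_heard: bool, speech_heard: bool) -> bool:
    """Decide whether to cut off the current reply for a given --barge-in mode."""
    if mode == "wake":
        return wake_heard      # only the wake word interrupts (robust to speaker echo)
    if mode == "vad":
        return speech_heard    # any detected speech interrupts (best with headphones)
    if mode == "off":
        return False           # replies always play to the end
    raise ValueError(f"unknown barge-in mode: {mode!r}")
```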

Custom wake words

--wake-word accepts a file path as well as a built-in name, so you can use any openWakeWord-compatible model:

chuchote start --wake-word path/to/hey_chuchote.onnx

To create one for your own phrase, use openWakeWord's automatic training notebook — see Training New Models in their README. It generates synthetic speech for your phrase and trains a model in roughly an hour, no voice recordings needed; download the resulting .onnx and point --wake-word (or wake_model in your config) at it.

Note: the training notebook runs on Google Colab (a free cloud notebook) — that's one-time, dev-machine tooling, like downloading a Piper voice. The resulting model runs fully locally; nothing about the assistant touches the network.

Push-to-talk fallback

Prefer holding a key? Skip the wake word entirely:

chuchote start --ptt            # hold space to talk, release to send
chuchote start --ptt --ptt-key ctrl

Checking your setup

Not sure everything's wired up? Run the preflight check:

chuchote doctor

It verifies Ollama is reachable and the model is pulled, a Piper voice is present, your mic and speakers are detected, and (in wake mode) the wake-word deps are installed — reporting each as [ ok ] / [fail] and exiting non-zero if anything's wrong.
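
For a sense of what the Ollama part of such a check might involve, here is a minimal sketch. It assumes Ollama is on its default address (`http://localhost:11434`) and uses its `/api/tags` endpoint, which lists locally pulled models. The helper names `fetch_tags` and `check_model` are hypothetical, and this is not necessarily how `chuchote doctor` does it.

```python
import json
import urllib.request
from typing import Callable

OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"  # assumed default Ollama address


def fetch_tags(url: str = OLLAMA_TAGS_URL) -> dict:
    """Ask the local Ollama server which models are pulled."""
    with urllib.request.urlopen(url, timeout=3) as resp:
        return json.load(resp)


def check_model(model: str, fetch: Callable[[], dict] = fetch_tags) -> tuple[bool, str]:
    """Return (ok, message) in the same [ ok ] / [fail] style as the doctor output."""
    try:
        tags = fetch()
    except OSError as exc:  # urllib's URLError is a subclass of OSError
        return False, f"[fail] Ollama not reachable: {exc}"
    names = [m.get("name", "") for m in tags.get("models", [])]
    # Ollama reports names as "name:tag"; a bare name matches any tag.
    if any(n == model or n.split(":")[0] == model for n in names):
        return True, f"[ ok ] model {model} is pulled"
    return False, f"[fail] model {model} not found; try: ollama pull {model}"
```

Passing `fetch` as a parameter keeps the check testable without a running server.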

Languages

Chuchote isn't English-only — whisper understands ~99 languages and Piper has voices for dozens. To run it in another language, three things need to line up:

  1. Recognition — set language and use a multilingual whisper model (the plain names, not the .en ones — base on low-RAM machines, small for better accuracy):
    language = "fr"
    whisper_model = "base"
    
  2. Speech — download a Piper voice for that language from VOICES.md into your voices dir (or point piper_voice at it). Piper voices are one language each.
  3. Reasoning — pick an Ollama model that's good in your language (most modern ones are multilingual; e.g. qwen2.5 is strong for Chinese). Chuchote already asks the model to reply in whatever language you speak.

Then chuchote doctor will confirm the pieces match. Common starting points:

Language          language   Example Piper voice
French            fr         fr_FR-siwis-medium
German            de         de_DE-thorsten-medium
Spanish           es         es_ES-davefx-medium
Italian           it         it_IT-paola-medium
Portuguese (BR)   pt         pt_BR-faber-medium
Dutch             nl         nl_NL-mls-medium
Chinese           zh         zh_CN-huayan-medium
Russian           ru         ru_RU-dmitri-medium

Adding any other language: find its voice on VOICES.md, set language to whisper's code for it, keep a multilingual whisper_model, and you're set. The wake word stays an English phrase (openWakeWord's built-ins are English) — or use --ptt.
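
The "three things need to line up" rule can be expressed as a small consistency check. The sketch below is hypothetical (the `language_problems` helper is not part of Chuchote) and relies on two conventions stated above: English-only whisper models end in `.en`, and Piper voice names start with a locale such as `fr_FR`.

```python
from pathlib import PurePath


def language_problems(language: str, whisper_model: str, piper_voice: str) -> list[str]:
    """Return human-readable mismatches between language, whisper model, and voice."""
    problems = []
    # English-only whisper models carry a ".en" suffix (base.en, small.en, ...).
    if language not in ("en", "auto") and whisper_model.endswith(".en"):
        problems.append(f"whisper model {whisper_model} is English-only")
    # Piper voice files are named like fr_FR-siwis-medium.onnx.
    voice_lang = PurePath(piper_voice).name.split("_", 1)[0].lower()
    if language != "auto" and voice_lang != language:
        problems.append(f"voice language {voice_lang!r} does not match {language!r}")
    return problems
```

An empty list means the recognition and speech settings agree; the choice of Ollama model still has to be judged by hand.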

Memory

Chuchote remembers the conversation. Each exchange is stored in a SQLite database (memory.db in your per-user data dir) and the most recent turns are fed back into the model's context every turn, so it stays coherent across turns and restarts.

chuchote start --forget   # start a session with a clean slate
chuchote forget           # erase all saved memory

Tune how much history is injected via history_messages in chuchote/config.py.
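
To make the mechanism concrete, here is a minimal sketch of a SQLite-backed conversation memory. The table name, columns, and function names are assumptions made for illustration, not Chuchote's actual schema; only the `history_messages` limit comes from the text above.

```python
import sqlite3


def open_memory(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the memory database. The schema here is hypothetical."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS turns ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "role TEXT NOT NULL, content TEXT NOT NULL)"
    )
    return db


def remember(db: sqlite3.Connection, role: str, content: str) -> None:
    """Append one message; 'with db' commits the insert."""
    with db:
        db.execute("INSERT INTO turns (role, content) VALUES (?, ?)", (role, content))


def recent(db: sqlite3.Connection, history_messages: int) -> list[dict]:
    """Return the newest messages, oldest first, ready to prepend to the prompt."""
    rows = db.execute(
        "SELECT role, content FROM turns ORDER BY id DESC LIMIT ?",
        (history_messages,),
    ).fetchall()
    return [{"role": role, "content": content} for role, content in reversed(rows)]


def forget(db: sqlite3.Connection) -> None:
    """Erase all saved memory."""
    with db:
        db.execute("DELETE FROM turns")
```

Pointing `open_memory` at a file path instead of `:memory:` is what would let history survive a restart.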

Configuration

Persist your settings in a config file instead of passing flags every time:

chuchote init              # writes a commented config.toml to your config dir

Edit the file (its path is printed by init; typically %APPDATA%\chuchote\ on Windows or ~/.config/chuchote/ elsewhere), uncomment what you want to change, and it's picked up on the next chuchote start. Point at a specific file with --config PATH. Precedence is defaults < config file < flags.
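
The precedence rule (defaults < config file < flags) amounts to a layered merge. The sketch below illustrates the idea with plain dictionaries. The `resolve` helper is hypothetical, and treating unset flags as `None` is an assumption about how the CLI parser reports them.

```python
# Defaults taken from the flag table; only two keys are shown for brevity.
DEFAULTS = {"whisper_model": "base.en", "language": "auto"}


def resolve(defaults: dict, file_values: dict, flag_values: dict) -> dict:
    """Merge settings so that flags beat the config file, which beats defaults."""
    merged = dict(defaults)
    merged.update(file_values)
    # Flags the user did not pass arrive as None and must not clobber earlier layers.
    merged.update({k: v for k, v in flag_values.items() if v is not None})
    return merged
```

For example, a config file that sets `language = "fr"` and `whisper_model = "base"`, combined with `--whisper-model small` on the command line, resolves to French recognition with the `small` model.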

Flags (each overrides a sensible default; see chuchote/config.py for the full list and the VAD tunables):

Flag               Default                   Meaning
--model            llama3.2                  Ollama model
--whisper-model    base.en                   faster-whisper model (small.en = more accurate, needs ~1 GB free RAM)
--language         auto                      recognition language (en, fr, de, zh, …); needs a multilingual model
--voice            first .onnx in ./voices   Piper voice model
--wake-word        hey_jarvis                wake word model (alexa, hey_mycroft, hey_rhasspy)
--wake-threshold   0.5                       wake sensitivity 0..1 (higher = fewer false triggers)
--no-chime         off                       disable the wake-word acknowledgement tone
--barge-in         wake                      interrupt a reply: wake / vad (headphones) / off
--ptt              off                       use push-to-talk instead of the wake word
--ptt-key          space                     push-to-talk key to hold (with --ptt)
--forget           off                       clear conversation memory before starting
--no-banner        off                       don't print the startup banner
--config           per-user config dir       path to a config file

Development

Run the test suite (covers the pure logic — sentence chunking, memory, config precedence, banner styling — with no audio/model deps needed):

pip install -e ".[dev]"
pytest

Tests also run in CI (GitHub Actions) on Linux and Windows, Python 3.10–3.13.

Releasing

Publishing a GitHub release triggers the publish.yml workflow, which builds and uploads to PyPI via trusted publishing — no API token. One-time setup on pypi.org: add a trusted publisher for this repo (owner Cjayy77, repo chuchote, workflow publish.yml, environment pypi), and create a matching pypi environment in the repo's GitHub settings.

License

MIT.
