Local AI voice assistant for macOS — speaks in your cloned voice on live calls, fully on-device

Saymo — Local AI Voice Assistant

Fully local AI voice assistant for macOS. Speaks into any live call in your cloned voice — no cloud APIs required.

Saymo composes short, natural speech from optional data sources (tracker, notes, text files), synthesizes it with voice cloning, and routes audio into the active call through a virtual microphone. Everything — language model, speech-to-text, text-to-speech — runs on-device.

Local: Ollama + faster-whisper + Coqui XTTS v2 (or Piper / macOS say as fallback).
Voice cloning: 5-minute sample → your voice, fine-tuning optional.
Routing: BlackHole virtual mic → any browser-based call app.
Call automation: Chrome-driven mute/unmute for 8 providers (Glip, Zoom, Google Meet, MS Teams, Telegram, Yandex Telemost, VK Teams, MTS Link).
Listening mode: auto-detects when your name is called, answers questions from provided context.
User-configurable prompts and vocabulary — no source edits required.

Project status: early public alpha. Expect rough edges. Contributions welcome.

Requirements

macOS on Apple Silicon (M1/M2/M3/M4), run from a native arm64 terminal (not under Rosetta)
Python 3.11+
Homebrew
Google Chrome
~10 GB free disk space

Quick install

git clone https://github.com/mshegolev/saymo && cd saymo
cp config.example.yaml config.yaml   # fill in your details
./install.sh

The installer handles brew deps, Python packages (via uv or pip), an Ollama check, a Piper voice model, and Chrome permissions.

First-time setup

saymo setup                        # Interactive wizard: name, devices, profiles
saymo record-voice -d 300          # Record a 5-minute voice sample
saymo test-devices                 # Verify audio devices
saymo test-tts "Привет, это тест"  # Check that TTS works

One-time audio routing

┌─────────────────────────────────────────────────────────────┐
│                   Audio MIDI Setup                          │
│  Create "Multi-Output Device":                              │
│    ✓ Your headphones   (master, no drift correction)        │
│    ✓ BlackHole 16ch    (drift correction ON)                │
│                                                             │
│  In your call app:                                          │
│    Microphone → BlackHole 2ch                               │
│    Speakers   → Multi-Output Device                         │
└─────────────────────────────────────────────────────────────┘
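Before a call it can help to confirm that every device from the diagram above is visible to the system. The sketch below is illustrative only: `find_device` and `check_routing` are hypothetical helpers, not part of Saymo, and the device list is hard-coded sample data in the shape that `sounddevice.query_devices()` returns (dicts with `name`, `max_input_channels`, `max_output_channels`). The comments on what each device is used for are assumptions read off the diagram.

```python
from typing import Optional

# Sample data only. On a real machine a similar list could come from
# sounddevice.query_devices() (third-party package, assumed installed).
SAMPLE_DEVICES = [
    {"name": "MacBook Pro Microphone", "max_input_channels": 1, "max_output_channels": 0},
    {"name": "BlackHole 2ch", "max_input_channels": 2, "max_output_channels": 2},
    {"name": "BlackHole 16ch", "max_input_channels": 16, "max_output_channels": 16},
    {"name": "Multi-Output Device", "max_input_channels": 0, "max_output_channels": 2},
]


def find_device(devices: list[dict], name: str, need_output: bool = False) -> Optional[int]:
    """Return the index of the first device whose name contains `name`
    and that has at least one channel in the requested direction."""
    key = "max_output_channels" if need_output else "max_input_channels"
    for index, device in enumerate(devices):
        if name.lower() in device["name"].lower() and device[key] > 0:
            return index
    return None


def check_routing(devices: list[dict]) -> list[str]:
    """Return the names of devices from the diagram that are missing."""
    required = [
        ("BlackHole 2ch", True),        # assumed: speech is played into the virtual mic
        ("BlackHole 16ch", False),      # assumed: call audio is captured from here
        ("Multi-Output Device", True),  # the call app's speaker target
    ]
    return [name for name, out in required if find_device(devices, name, out) is None]


if __name__ == "__main__":
    missing = check_routing(SAMPLE_DEVICES)
    print("missing devices:", missing or "none")
```

In day-to-day use `saymo test-devices` is the supported way to verify devices; the sketch only shows the kind of check involved.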

Daily usage

# Before the call: prepare text + cached audio
saymo prepare -p personal
saymo prepare-responses         # pre-synthesize the Q&A library for live mode
saymo review                    # optional: check generated audio

# During the call
saymo speak -p personal         # manual trigger, instant playback
saymo auto -p personal          # listen for your name, speak when called
saymo auto -p personal --mic    # same, but from laptop mic (for testing)

# Extras
saymo dashboard                 # interactive TUI

Call providers

saymo auto works with the Chrome-based call apps listed below. The provider is selected by meetings.<profile>.provider in config:

`provider:`	Service
`glip` (default)	RingCentral Glip
`zoom`	Zoom
`google_meet`	Google Meet
`ms_teams`	Microsoft Teams
`telegram`	Telegram calls (web)
`telemost`	Yandex Telemost
`vk_teams`	VK Teams
`mts_link`	MTS Link

Run saymo list-plugins to see everything available in your install.

Live Q&A mode

When your name is called and the surrounding transcript looks like a question, saymo auto consults a pre-synthesized response library and plays the best-matching cached variant, with no network hop and no synthesis lag. Populate the library once with saymo prepare-responses. Built-in intents cover status (for example "как дела", Russian for "how are things"), blockers, ETA, testing stage, and review. Extend them with your own wording via config.responses.library.
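The lookup described above can be pictured as a small keyword match. This is a minimal sketch under stated assumptions, not Saymo's actual matcher: `LIBRARY`, `tokenize`, and `match_intent` are hypothetical names, the keyword sets are invented for illustration, and the real schema of config.responses.library may differ (see config.example.yaml).

```python
import re
from typing import Optional

# Hypothetical library: intent name -> trigger keywords.
LIBRARY = {
    "status": {"status", "how", "going", "update"},
    "blockers": {"blockers", "blocked", "blocking"},
    "eta": {"eta", "when", "deadline", "ready"},
}


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def match_intent(
    transcript: str,
    library: dict[str, set[str]],
    min_hits: int = 1,
) -> Optional[str]:
    """Return the intent with the most keyword hits, or None on a cache miss."""
    words = tokenize(transcript)
    best, best_hits = None, 0
    for intent, keywords in library.items():
        hits = len(words & keywords)
        if hits > best_hits:
            best, best_hits = intent, hits
    return best if best_hits >= min_hits else None


if __name__ == "__main__":
    print(match_intent("Alex, any blockers on your side?", LIBRARY))
    print(match_intent("Nice weather today", LIBRARY))
```

Returning `None` models a cache miss, at which point the caller decides what to play instead.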

On a cache miss, you can opt into a live fallback: Ollama composes an answer from your standup summary and Jira context, the TTS engine synthesizes it, and Saymo plays it back. This adds a few seconds of latency but can answer questions the library does not cover. Enable it in config:

responses:
  live_fallback: true

Without live_fallback (default), a cache miss falls back to the generic standup audio — quiet, reliable, no LLM dependency.
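The three outcomes (cache hit, live fallback, generic standup audio) reduce to a short decision function. The sketch below is an assumption about the control flow, not Saymo's code: `choose_audio` and its parameters are hypothetical, and `synthesize_live` stands in for the Ollama plus TTS step.

```python
from typing import Callable, Optional


def choose_audio(
    cached: Optional[bytes],
    standup_audio: bytes,
    live_fallback: bool,
    synthesize_live: Callable[[], bytes],
) -> bytes:
    """Pick what to play when a question is detected.

    cached          -- pre-synthesized answer, or None on a cache miss
    standup_audio   -- generic standup audio used as the default fallback
    live_fallback   -- mirrors responses.live_fallback in config
    synthesize_live -- hypothetical callable that composes and synthesizes
                       an answer on the fly (slow, needs the LLM)
    """
    if cached is not None:
        return cached               # cache hit: no synthesis lag
    if live_fallback:
        return synthesize_live()    # opt-in: a few seconds of latency
    return standup_audio            # default: no LLM dependency


if __name__ == "__main__":
    audio = choose_audio(None, b"standup", False, lambda: b"live")
    print(audio)
```

Passing the live step as a callable keeps the slow path lazy, so it only runs when it is actually selected.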

Configurable prompts

All LLM prompts are templates loaded from config.yaml → prompts.* at runtime, with sensible generic defaults in source. To customize voice/tone:

prompts:
  standup_ru: |
    Ты — помощник для ежедневных встреч. Составь отчёт на русском...
    {yesterday_notes}
    {today_notes}
  qa_system_ru: |
    Ты — {user_name}, {user_role}. Отвечай кратко, 1-3 предложения...

See config.example.yaml for all available keys and the default set.
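One way to picture "config overrides, source defaults" is a dictionary lookup followed by placeholder substitution. This sketch assumes the templates are filled with Python's `str.format`, which the README does not confirm. `DEFAULT_PROMPTS` and `render_prompt` are hypothetical, and the default texts are placeholders rather than Saymo's real defaults. Only the placeholder names (`{yesterday_notes}`, `{today_notes}`, `{user_name}`, `{user_role}`) come from the example above.

```python
# Placeholder defaults for illustration; the real defaults live in Saymo's source.
DEFAULT_PROMPTS = {
    "standup_ru": "Summarize:\n{yesterday_notes}\n{today_notes}",
    "qa_system_ru": "You are {user_name}, {user_role}. Answer briefly.",
}


def render_prompt(key: str, config_prompts: dict[str, str], **values: str) -> str:
    """Use the template from config if present, else the built-in default,
    then substitute the {placeholders} with the given values."""
    template = config_prompts.get(key, DEFAULT_PROMPTS[key])
    # str.format raises KeyError if the template names a value not supplied,
    # which surfaces typos in custom templates early.
    return template.format(**values)


if __name__ == "__main__":
    custom = {"qa_system_ru": "{user_name} here ({user_role}). One sentence only."}
    print(render_prompt("qa_system_ru", custom, user_name="Alex", user_role="QA engineer"))
    print(render_prompt("qa_system_ru", {}, user_name="Alex", user_role="QA engineer"))
```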

Project-specific vocabulary

Add your own abbreviations or fuzzy name expansions to the TTS normalizer through config, not source:

vocabulary:
  abbreviations:
    MYAPI: "май-эй-пи-ай"
    K8S: "кубернетес"
  fuzzy_expansions:
    Alex: ["Alex", "Алекс", "Саша", "Саня"]
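To make the two keys concrete, here is a minimal sketch of how such a vocabulary could be applied. It is not Saymo's normalizer: `expand_abbreviations` and `name_mentioned` are hypothetical, and treating `fuzzy_expansions` as a list of name variants to look for in transcribed speech is one plausible reading of the config, stated here as an assumption.

```python
import re

# Values copied from the config example above.
ABBREVIATIONS = {"MYAPI": "май-эй-пи-ай", "K8S": "кубернетес"}
FUZZY_EXPANSIONS = {"Alex": ["Alex", "Алекс", "Саша", "Саня"]}


def expand_abbreviations(text: str, abbreviations: dict[str, str]) -> str:
    """Replace whole-word abbreviations with their spoken form (case-sensitive)."""
    if not abbreviations:
        return text
    # Longest keys first so a longer abbreviation wins over a shorter prefix.
    keys = sorted(abbreviations, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, keys)) + r")\b")
    return pattern.sub(lambda m: abbreviations[m.group(1)], text)


def name_mentioned(transcript: str, name: str, expansions: dict[str, list[str]]) -> bool:
    """Return True if any configured variant of `name` appears as a word."""
    variants = expansions.get(name, [name])
    words = set(re.findall(r"\w+", transcript.lower()))
    return any(variant.lower() in words for variant in variants)


if __name__ == "__main__":
    print(expand_abbreviations("Деплой MYAPI в K8S готов", ABBREVIATIONS))
    print(name_mentioned("Саша, что по задаче?", "Alex", FUZZY_EXPANSIONS))
```

Matching on whole words avoids rewriting an abbreviation that happens to appear inside a longer identifier.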

Architecture

┌───────────────┐   ┌──────────────┐   ┌────────────────┐   ┌──────────────┐
│ Source plugin │──▶│ LLM composer │──▶│ Text normalizer│──▶│  TTS engine  │
│  (optional)   │   │   (Ollama)   │   │   (abbrevs,    │   │  (XTTS clone │
│               │   │              │   │    numbers)    │   │  / Piper)    │
└───────────────┘   └──────────────┘   └────────────────┘   └──────┬───────┘
                                                                   │
┌──────────────┐   ┌──────────────┐   ┌────────────────┐           │
│Call provider │◀──│ Auto trigger │◀──│  STT (Whisper) │       Audio bytes
│(mute/unmute) │   │(name detect) │   │ (capture call) │           │
└──────┬───────┘   └──────────────┘   └────────────────┘           │
       │                                                           │
       ▼                                                           ▼
  BlackHole 2ch ─────────────────────────────────────────── Audio output + monitor
  (virtual mic)

Details in docs/PRD.md and ADRs under docs/adr/.
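The top row of the diagram is a linear pipeline, which can be sketched as function composition. This is an illustration under stated assumptions, not Saymo's internal API: `run_pipeline` and the stage signatures are hypothetical, and the stub stages in the usage example stand in for Ollama, the normalizer, and the TTS engine.

```python
from typing import Callable, Optional


def run_pipeline(
    notes: Optional[str],
    compose: Callable[[str], str],
    normalize: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> bytes:
    """Run source text through composer, normalizer, and TTS in order."""
    source_text = notes or ""       # the source plugin is optional
    script = compose(source_text)   # LLM composer (Ollama in the diagram)
    spoken = normalize(script)      # text normalizer: abbreviations, numbers
    return synthesize(spoken)       # TTS engine returns audio bytes


if __name__ == "__main__":
    # Stub stages for demonstration only.
    audio = run_pipeline(
        "fixed login bug",
        compose=lambda text: f"Yesterday I {text}.",
        normalize=lambda text: text.replace("bug", "defect"),
        synthesize=lambda text: text.encode("utf-8"),
    )
    print(audio)
```

Keeping each stage behind a plain callable is one way to let an engine such as Piper be swapped in for XTTS without touching the rest of the flow.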

Security & privacy

Everything runs on-device by default. Cloud TTS / STT providers are optional and disabled in the example config.
Voice samples and secrets are listed in .gitignore, so they are not committed to the repository by accident.
Prompts, vocabulary, trigger phrases are all in your config file — source stays generic.

License

MIT — see LICENSE.

Acknowledgements

Coqui TTS for XTTS v2.
Ollama for local LLM hosting.
faster-whisper for transcription.
BlackHole for virtual audio routing.

Release history

Version	Released
0.10.3	Apr 22, 2026
0.10.2 (this version)	Apr 22, 2026
0.10.1	Apr 22, 2026
0.10.0	Apr 22, 2026
0.9.0	Apr 22, 2026
0.8.0	Apr 22, 2026
0.7.1	Apr 19, 2026
0.7.0	Apr 19, 2026

Download files

Source Distribution

saymo-0.10.2.tar.gz (122.8 kB), uploaded Apr 22, 2026, Source

Built Distribution

saymo-0.10.2-py3-none-any.whl (141.6 kB), uploaded Apr 22, 2026, Python 3

File details

Details for the file saymo-0.10.2.tar.gz.

File metadata

Download URL: saymo-0.10.2.tar.gz
Upload date: Apr 22, 2026
Size: 122.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.1

File hashes

Hashes for saymo-0.10.2.tar.gz
Algorithm	Hash digest
SHA256	`adb46cb2572315a0397b700741787a2e4d46130cadd908bb490def992b5252f4`
MD5	`3f6ea4e87b312319dbae155dc8968bcc`
BLAKE2b-256	`abc570d88ffebaa3876df009133cda740bcae7bd4eaf541eda83d2ed42bcb787`
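A downloaded archive can be checked against the SHA256 digest in the table above with the standard library. The sketch below is a generic example, not a Saymo tool: `sha256_of` and `verify` are hypothetical helpers, and the file path in the usage section assumes the sdist was saved to the current directory.

```python
import hashlib
from pathlib import Path

# SHA256 digest for saymo-0.10.2.tar.gz, copied from the table above.
EXPECTED_SHA256 = "adb46cb2572315a0397b700741787a2e4d46130cadd908bb490def992b5252f4"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large archives do not load into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: Path, expected: str) -> bool:
    """Return True if the file's SHA256 matches the expected hex digest."""
    return sha256_of(path) == expected.lower()


if __name__ == "__main__":
    archive = Path("saymo-0.10.2.tar.gz")  # assumed download location
    if archive.exists():
        print("hash OK" if verify(archive, EXPECTED_SHA256) else "hash MISMATCH")
    else:
        print(f"{archive} not found")
```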

File details

Details for the file saymo-0.10.2-py3-none-any.whl.

File metadata

Download URL: saymo-0.10.2-py3-none-any.whl
Upload date: Apr 22, 2026
Size: 141.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.1

File hashes

Hashes for saymo-0.10.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`c12fa825c50b6eeaf2eea97e4cfb38d19438fea0eec77a0ec79a1fb9aaeab2df`
MD5	`e408ffa093859d121e1f6f13fc74fc3b`
BLAKE2b-256	`dd111c882ef380fe1cace2518cfc4a400abe4de002896d353e5ac819ee35e943`
