macOS menu bar app that captures voice, transcribes, and enhances text with AI
Vaani · वाणी
वाणी — Sanskrit for speech, voice; the goddess of language and learning.
Voice to polished text, right at your cursor — anywhere on macOS.
Vaani is a macOS menu bar app that listens when you hold a hotkey, transcribes your speech with OpenAI Whisper, enhances it with Claude AI, and pastes the result directly at your cursor. No switching apps, no copy-pasting — just speak and the text appears.
The Problem
Typing is slow. Dictation on macOS gives you raw, unedited speech dumps: filler words, broken sentences, no punctuation. Third-party dictation tools tend to be expensive subscriptions or cloud-locked, or they produce output that still needs manual cleanup.
Professionals who write a lot — engineers, writers, product managers, support teams — spend significant time translating their thoughts into polished prose. The gap between what you think and what ends up on screen is real friction.
Why We Built Vaani
We wanted voice to be a first-class input method, not an afterthought. The goal: speak naturally, get back something you'd actually send or commit. Vaani sits silently in your menu bar and activates on a global hotkey — no window to focus, no app to switch to. It works in any text field, terminal, IDE, browser, Slack, email client, or document editor.
The name "Vaani" (वाणी) comes from Sanskrit, where it means speech or voice and also names the goddess of language and learning.
Quick Start
curl -sSL https://raw.githubusercontent.com/ankushbhardwxj/vaani/main/install.sh | sh
vaani start
The install script installs Vaani via pip, sets up the vaani command on your PATH, and downloads required language models.
On first launch, Vaani walks you through entering your API keys and granting macOS permissions. Then:
- Hold Alt (or your configured hotkey) and speak
- Release: Vaani transcribes and enhances your speech
- Polished text appears at your cursor in ~2–4 seconds
API keys needed: OpenAI (transcription) · Anthropic (enhancement). Keys are stored in macOS Keychain — never written to disk in plaintext.
How It Works
Hold hotkey → Mic capture → VAD trims silence → Gain normalization
→ OpenAI Whisper (transcribe) → Claude Haiku (enhance) → Paste at cursor
Every step runs in the background. The menu bar icon shows your current state: idle, recording, or processing.
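The three states behind the menu bar icon can be modeled as a small lock-protected state machine. The following is a minimal sketch, not Vaani's actual code: the `State` enum, the `advance` method, and the choice to reject out-of-order transitions are all assumptions made for illustration.

```python
import threading
from enum import Enum


class State(Enum):
    IDLE = "idle"
    RECORDING = "recording"
    PROCESSING = "processing"


# Allowed transitions for the idle -> recording -> processing -> idle cycle.
_NEXT = {
    State.IDLE: State.RECORDING,
    State.RECORDING: State.PROCESSING,
    State.PROCESSING: State.IDLE,
}


class StateMachine:
    """Hypothetical sketch; the real class may look different."""

    def __init__(self):
        self._state = State.IDLE
        self._lock = threading.Lock()

    @property
    def state(self):
        with self._lock:
            return self._state

    def advance(self, expected):
        """Move to the next state only if the current state is `expected`.

        Returns True on success and False if the transition was rejected,
        for example a key press that arrives while still processing.
        """
        with self._lock:
            if self._state is not expected:
                return False
            self._state = _NEXT[expected]
            return True


sm = StateMachine()
sm.advance(State.IDLE)       # hotkey pressed  -> RECORDING
sm.advance(State.RECORDING)  # hotkey released -> PROCESSING
print(sm.state.value)        # processing
```

Guarding every transition with the expected current state is one way to keep hotkey callbacks and background worker threads from stepping on each other.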
Pipeline Detail
| Step | Technology | What it does |
|---|---|---|
| Audio capture | sounddevice / PortAudio | Streams mic input at 16kHz mono |
| Voice activity detection | Silero VAD (PyTorch) | Strips silence; handles whisper-level audio |
| Gain normalization | RMS-based | Amplifies quiet audio so VAD works on whispers |
| Transcription | OpenAI Whisper API | Accurate STT across accents and background noise |
| Enhancement | Anthropic Claude Haiku | Polishes grammar, tone, and structure |
| Output | pynput + pbcopy/pbpaste | Saves clipboard → pastes → restores clipboard |
| Name formatting | spaCy NER (en_core_web_sm) | Detects person names, optionally prefixes with @ |
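To make the gain normalization row concrete, here is a rough sketch of RMS-based amplification using only the standard library. The function names, the `target_rms` and `max_gain` values, and the decision to never attenuate loud input are illustrative assumptions, not Vaani's actual parameters.

```python
import math


def rms(samples):
    """Root-mean-square level of float samples in [-1.0, 1.0]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def normalize_gain(samples, target_rms=0.1, max_gain=20.0):
    """Amplify quiet audio so its RMS approaches target_rms.

    target_rms and max_gain are made-up values for this sketch.
    The gain is capped so near-silent input is not blown up into noise,
    audio that is already loud enough is left alone, and the result is
    clipped to the valid [-1.0, 1.0] range.
    """
    level = rms(samples)
    if level == 0.0:
        return list(samples)
    gain = min(max(target_rms / level, 1.0), max_gain)
    return [max(-1.0, min(1.0, s * gain)) for s in samples]


# One second of a very quiet 220 Hz tone at 16 kHz.
quiet = [0.01 * math.sin(2 * math.pi * 220 * n / 16000) for n in range(16000)]
louder = normalize_gain(quiet)
print(round(rms(quiet), 4), round(rms(louder), 4))
```

Capping the gain is a common safeguard: without it, a recording that contains only background hiss would be scaled up to the same level as speech.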
Enhancement Modes
Switch modes from the menu bar dropdown:
| Mode | What it does |
|---|---|
| Cleanup | Fix grammar and remove filler words with minimal rewrites |
| Professional | Formal rewrite for business communication, emails, and docs |
| Casual | Friendly, conversational tone for chats and informal writing |
| Bullets | Convert your speech into organized bullet points |
Requirements
- macOS 12+ (uses native menu bar, Keychain, clipboard APIs)
- Python 3.10+
- API Keys: OpenAI · Anthropic
macOS Permissions
Grant these in System Settings → Privacy & Security on first run (on macOS 12, the same panes live under System Preferences → Security & Privacy):
| Permission | Why |
|---|---|
| Microphone | Audio recording (auto-prompted on first use) |
| Accessibility | Simulating Cmd+V to paste text |
| Input Monitoring | Detecting the global hotkey from any app |
Configuration
Config file: ~/.vaani/config.yaml
hotkey: "alt" # Global hotkey to hold while speaking
active_mode: professional # Default enhancement mode
sounds_enabled: true # Audio feedback on start/stop
vad_threshold: 0.05 # Lower = more sensitive (good for whispers)
sample_rate: 16000 # Audio sample rate (Hz)
max_recording_seconds: 600 # Auto-stop after 10 minutes
stt_model: whisper-1 # OpenAI transcription model
llm_model: claude-haiku-4-5-20251001 # Anthropic enhancement model
microphone_device: null # null = system default, or set device index
paste_restore_delay_ms: 100 # How long to wait before restoring clipboard
launch_at_login: false # Start Vaani automatically on login
Configuration reloads automatically when the file changes — no restart needed.
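One simple way to get this kind of automatic reload is to compare the file's modification time on each read. The sketch below illustrates that idea only; how Vaani detects changes is not documented here, and `ReloadingConfig`, `parse_flat_config`, and the trimmed `DEFAULTS` dict are hypothetical names. The tiny parser is a stand-in for a real YAML library and handles only flat `key: value` lines.

```python
import os
import tempfile

# A small subset of the defaults listed above, for illustration.
DEFAULTS = {"hotkey": "alt", "active_mode": "professional", "sample_rate": 16000}


def parse_flat_config(text):
    """Parse flat `key: value  # comment` lines.

    Stand-in for a real YAML parser: no nesting, no booleans or floats,
    and a '#' inside a quoted value would be treated as a comment.
    """
    values = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        key, raw = line.split(":", 1)
        raw = raw.strip().strip('"')
        values[key.strip()] = int(raw) if raw.isdigit() else raw
    return values


class ReloadingConfig:
    """Re-reads the file whenever its modification time changes."""

    def __init__(self, path):
        self.path = path
        self._mtime = None
        self.values = dict(DEFAULTS)

    def get(self, key):
        mtime = os.stat(self.path).st_mtime_ns
        if mtime != self._mtime:
            with open(self.path, encoding="utf-8") as fh:
                # File values override the defaults.
                self.values = {**DEFAULTS, **parse_flat_config(fh.read())}
            self._mtime = mtime
        return self.values[key]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "config.yaml")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("active_mode: casual  # overrides the default\n")
    cfg = ReloadingConfig(path)
    print(cfg.get("active_mode"), cfg.get("sample_rate"))  # casual 16000
```

Polling the modification time keeps the sketch dependency-free; a file-system watcher would react faster but needs an extra library.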
Custom Prompts
Override any prompt by creating files in ~/.vaani/prompts/:
~/.vaani/prompts/
├── system.txt            # Override the base system prompt
├── context.txt           # Add personal context (your writing style, name, role)
└── modes/
    ├── cleanup.txt       # Override the cleanup mode prompt
    ├── professional.txt
    ├── casual.txt
    └── bullets.txt
User prompts take priority over built-in defaults. Use context.txt to tell Vaani about you — your name, your company, common terms you use — so the output matches your voice.
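The "user file wins over built-in default" rule can be sketched as a lookup with a fallback. This is an illustration under assumptions: the function names and the placeholder built-in prompts are invented here, and appending `context.txt` after the mode prompt is only one plausible way to combine them.

```python
import tempfile
from pathlib import Path

# Placeholder built-in prompts; the real defaults ship with the package.
BUILTIN_MODE_PROMPTS = {
    "cleanup": "Fix grammar and remove filler words.",
    "professional": "Rewrite formally for business communication.",
}


def load_mode_prompt(mode, prompts_dir):
    """Return the user's override for `mode` if present, else the built-in."""
    override = Path(prompts_dir) / "modes" / f"{mode}.txt"
    if override.is_file():
        return override.read_text(encoding="utf-8").strip()
    return BUILTIN_MODE_PROMPTS[mode]


def build_prompt(mode, prompts_dir):
    """Append context.txt, when it exists, to the resolved mode prompt."""
    parts = [load_mode_prompt(mode, prompts_dir)]
    context = Path(prompts_dir) / "context.txt"
    if context.is_file():
        parts.append(context.read_text(encoding="utf-8").strip())
    return "\n\n".join(parts)


with tempfile.TemporaryDirectory() as tmp:
    modes = Path(tmp) / "modes"
    modes.mkdir()
    (modes / "cleanup.txt").write_text("Only fix typos.", encoding="utf-8")
    print(load_mode_prompt("cleanup", tmp))       # user override is used
    print(load_mode_prompt("professional", tmp))  # falls back to the built-in
```

In real use, `prompts_dir` would point at `~/.vaani/prompts`, for example via `Path.home() / ".vaani" / "prompts"`.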
Privacy
| Data | Where it goes | How it's stored |
|---|---|---|
| Audio | Sent to OpenAI for transcription | Not stored by Vaani |
| Transcribed text | Sent to Anthropic for enhancement | Not stored by Vaani |
| API keys | macOS Keychain only | Never written to disk in plaintext |
| Transcription history | Local SQLite database | Encrypted with Fernet (AES-128-CBC with HMAC-SHA256) |
No data is retained on Vaani's servers because there are no Vaani servers. All cloud calls go directly from your machine to OpenAI and Anthropic under your API account, subject to their data retention policies.
Architecture
┌─────────────────────────────────────────────────────────────┐
│  Main Thread (macOS requirement)                            │
│  ┌────────────────┐  ┌───────────────────────────────────┐  │
│  │ HotkeyListener │  │ VaaniMenuBar (rumps event loop)   │  │
│  │ (pynput)       │  │ Status icon · Mode selector       │  │
│  └───────┬────────┘  └───────────────────────────────────┘  │
│          │ on_press / on_release                            │
└──────────┼──────────────────────────────────────────────────┘
           │
           ▼
┌─────────────────────────────────────────────────────────────┐
│  StateMachine: IDLE → RECORDING → PROCESSING → IDLE         │
└─────────────────────────────────────────────────────────────┘
           │
           ▼ (daemon threads)
┌───────────────────────────────────────────────────────────────────────────┐
│ AudioRecorder → process_audio() → transcribe() → enhance() → paste_text() │
│ sounddevice     Silero VAD        Whisper API    Claude      pynput       │
│                 + gain norm       + WAV encode   Haiku       + pbcopy     │
└───────────────────────────────────────────────────────────────────────────┘
           │
           ▼
┌─────────────────────────────┐
│  HistoryStore (SQLite)      │
│  Fernet-encrypted records   │
└─────────────────────────────┘
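The diagram lists a WAV encode step before the Whisper API call. As a rough illustration of what that step might involve, the sketch below packs float samples into an in-memory 16-bit mono WAV using only the standard library. The `encode_wav` name and the 16-bit sample width are assumptions; the source only states 16 kHz mono.

```python
import io
import struct
import wave


def encode_wav(samples, sample_rate=16000):
    """Encode float samples in [-1.0, 1.0] as 16-bit mono PCM WAV bytes."""
    # Clip each sample, scale to the int16 range, and pack little-endian.
    pcm = struct.pack(
        f"<{len(samples)}h",
        *(int(max(-1.0, min(1.0, s)) * 32767) for s in samples),
    )
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 2 bytes per sample = 16-bit (assumed)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buffer.getvalue()


one_second = encode_wav([0.0] * 16000)
print(len(one_second))  # 32044: 32000 bytes of audio plus a 44-byte header
```

Under the same 16-bit assumption, a recording at the 600-second maximum would come to about 600 × 16000 × 2 = 19.2 MB of audio data.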
Uninstall
pip uninstall vaani
rm -rf ~/.vaani
sudo rm -f /usr/local/bin/vaani
Development
git clone https://github.com/ankushbhardwxj/vaani
cd vaani
python -m venv .venv && source .venv/bin/activate
pip install -e ".[test]"
python -m spacy download en_core_web_sm
# Run tests
pytest
# Run in foreground (with live logs)
vaani start --foreground
Running Tests
pytest # Run all tests
pytest -v # Verbose output
pytest tests/test_audio.py # Single module
License
MIT — see LICENSE for details.
File details
Details for the file vaani-0.2.4.tar.gz.
File metadata
- Download URL: vaani-0.2.4.tar.gz
- Upload date:
- Size: 199.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e6c38342fc48081171b91fe26d7722bb2aaa6cf0a92220610b873af3cf82d2eb |
| MD5 | 6bbed24a42b0117d8825726557cb0475 |
| BLAKE2b-256 | 2885782db249be0b16ee9211edfe6893b98e3342113543a8e02531a50864fb62 |
File details
Details for the file vaani-0.2.4-py3-none-any.whl.
File metadata
- Download URL: vaani-0.2.4-py3-none-any.whl
- Upload date:
- Size: 179.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 15346ecfe2fe3d04fe094548f01de9d38059a07617dd4cc9ce9e940013093398 |
| MD5 | 4b574f44ee78c1a306ebbfc1385e5395 |
| BLAKE2b-256 | eaabcf4f5646c948c88655b558adc66509b12b2e4f281e437bc4375b737e61ee |