Free, open-source, fully offline AI dictation for Windows and macOS. Hold a hotkey, speak, get text at the cursor. 100% local. Powered by Whisper.

Whisper Local

Free, Open-Source, 100% Offline AI Dictation for Windows & macOS

Press a hotkey. Speak. Your words appear at the cursor. No cloud. No subscription. No telemetry. Powered by OpenAI Whisper.

Quick Start · Features · vs. Wispr Flow / Dragon · Voice Commands · Contributing

Whisper Local — press, speak, type

Want a real screen-recording demo here? See docs/demo-recording.md — drop a docs/demo.gif in and uncomment the line below.

Whisper Local is a free, open-source, fully offline alternative to Wispr Flow, Dragon, and Otter for power users who want AI dictation without sending audio to the cloud. Built on faster-whisper (CTranslate2), it delivers push-to-talk speech-to-text in any application — chat apps, code editors, browsers, terminals, design tools, anywhere a cursor blinks. Self-hosted, hackable, MIT-licensed.

Looking for: Wispr Flow alternative, offline voice typing, local Whisper dictation, free Dragon NaturallySpeaking alternative, privacy-first speech-to-text, Windows voice dictation without cloud, macOS push-to-talk transcription. You found it.


🌟 Why this exists

Most AI dictation tools are great — until you check the privacy policy. Your audio goes to a server, gets processed, and (sometimes) stored. You pay a monthly fee or get cut off.

Whisper Local exists because you shouldn't have to choose between accuracy and privacy.

  • 🔒 Your voice never leaves your machine — not even metadata
  • 🆓 Free forever — no account, no API key, no subscription
  • 🔌 Works offline, air-gapped, after the internet is gone
  • 🛠️ Fork it, hack it, ship your own version — MIT licensed
  • 💡 Same Whisper model quality as cloud services, running on your own GPU

This is a community tool, not a product. There's no support SLA, no roadmap committee, no marketing. If it's useful to you, great. If something's broken, PRs are welcome.

A note from the maintainer: I built this for myself, then realised it might help others. So I'm releasing it for anyone who wants it — no strings attached. Use it. Fork it. Rebrand it. Ship your own version. The only thing I ask is that you keep the LICENSE attribution intact (to Pin Wang, the original upstream author, and to me as the fork maintainer). If you build something cool on top of it, I'd love to hear about it via a Discussion — but you don't owe anyone anything.

Rohit Burani


✨ Why Whisper Local?

Feature Whisper Local Wispr Flow Dragon / Dragon Anywhere Otter.ai Windows Speech Recognition
Runs 100% offline ❌ (Anywhere)
Audio never leaves your machine partial
Free / open source ❌ ($$$/yr) ❌ ($$/mo)
Modern AI accuracy (Whisper) partial
Works in any app via hotkey partial partial
Customisable voice commands partial
Push-to-talk + auto-paste + auto-send partial
GPU acceleration (NVIDIA & AMD) n/a n/a n/a
AI rephrase / transforms (Ollama)
Hackable / MIT licensed
No account required

🎯 Features

  • 🎙️ Global push-to-talk hotkey — start recording from any app with Ctrl+Win (Windows) or Fn+Ctrl (macOS)
  • Pre-roll buffer + warmup — captures the 500 ms before you press the key and pre-loads Whisper at boot, so the first word is never clipped and the first recording feels instant
  • 🔵 Floating level overlay — a small pill at the screen edge shows you're being heard, with the transcript appearing next to the level bar (Wispr Flow–style). Optional real-time streaming preview shows words as you speak.
  • 📝 Inline voice formatting — say "comma", "period", "question mark", "new paragraph", "open quote", etc. mid-sentence
  • 🤖 AI rephrase — dedicated Ctrl+Shift+Win hotkey: select text, hold, speak your instruction, release — local Ollama rewrites it in place
  • 🌐 Translation mode — speak any language, get English; tray → Profile → Translate
  • 🔁 Continuous dictation mode — for long-form notes, the app auto-restarts recording after each delivery
  • 📋 Fallback window — if no text field is focused, the transcript appears in a small window (pre-selected, copy button, already on clipboard)
  • Pause-all hotkey: Ctrl+Alt+Win disables every Whisper Local hotkey until you press it again
  • 📋 Auto-paste at cursor — transcript lands wherever you're typing, optionally followed by Enter (auto-send)
  • 🔒 100 % local & private — no network calls during use; Whisper models cached on disk
  • 🚀 GPU acceleration — NVIDIA CUDA and AMD ROCm supported, CPU works out of the box
  • 🗣️ Voice commands — say a trigger phrase to send a hotkey, type pre-written text, or run a shell command
  • 🔁 Hot-reload — edit commands.yaml and your change applies on the next transcription, no restart
  • 🩺 Built-in diagnostics: whisper-local --doctor checks audio devices, model cache, hotkeys, and recent errors
  • 🎛️ Profiles — switch between Dictation / Chat / Code / Notes presets from the tray
  • 🪟 Per-app rules — different behaviour per foreground app (auto-send in Slack, copy-only in VS Code, suppress in 1Password)
  • 🧹 Optional LLM cleanup — pipe transcripts through a local Ollama model for punctuation / capitalisation polish (off by default, fully local)
  • 📜 Recent transcriptions — last 10 results in the tray menu, click to copy back
  • 🔧 Settings backup/restore: --export-settings / --import-settings for portability
  • 🖥️ Settings UI: whisper-local --settings opens a GUI settings window (no YAML editing required)
  • 📜 Transcript history: whisper-local --history opens a searchable log of everything you've dictated
  • 🔔 Opt-in update notifications — daily GitHub release check, fully offline by default (update_check.enabled: true to opt in)
  • 🎚️ Noise suppression — spectral gating via noisereduce, off by default (pip install 'whisper-local[noise]')
  • 🩺 --selftest — one-command sanity check (mic, model, transcription, clipboard) — perfect for first-launch
  • 🎯 Hotkey cheat sheet: whisper-local --cheat-sheet (or the tray menu) shows your currently configured hotkeys at a glance
  • 📦 --bundle-logs — zip up redacted logs + diagnostics for bug reports with one command
  • 🌐 Local OpenAI-compatible API: whisper-local --serve exposes POST /v1/audio/transcriptions on localhost:7777 for Cursor, Open WebUI, or anything else that speaks the OpenAI Whisper API
  • 🛡️ Auto-recovery — silently reconnects when a USB mic is unplugged mid-recording
  • 🛡️ Crash reports — uncaught errors write a self-contained dump to disk
  • 🪟 System tray UI — model selection, mic selection, profile switch, diagnostics
  • 🍎 Cross-platform — Windows 10+, macOS
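The pre-roll buffer mentioned in the feature list can be pictured as a fixed-size ring of audio blocks that is filled while the app is idle and drained when the hotkey is pressed. The following is only a minimal sketch of that idea, not the project's actual code: the `PreRollBuffer` class, the 16 kHz sample rate, and the 10 ms block size are all assumptions made for illustration.

```python
from collections import deque

SAMPLE_RATE = 16_000   # assumed sample rate in Hz
PRE_ROLL_MS = 500      # pre-roll length taken from the feature list
BLOCK_SIZE = 160       # assumed block length: 10 ms of audio per block


class PreRollBuffer:
    """Hypothetical helper that keeps only the most recent pre-roll window."""

    def __init__(self, sample_rate=SAMPLE_RATE, pre_roll_ms=PRE_ROLL_MS,
                 block_size=BLOCK_SIZE):
        samples = sample_rate * pre_roll_ms // 1000
        # Number of whole blocks needed to cover the pre-roll window.
        self._blocks = deque(maxlen=max(1, samples // block_size))

    def push(self, block):
        """Store one captured block; the oldest block falls off when full."""
        self._blocks.append(list(block))

    def drain(self):
        """Return the buffered samples in order and clear the buffer."""
        samples = [s for block in self._blocks for s in block]
        self._blocks.clear()
        return samples


# Simulate 2 seconds of idle capture, then a hotkey press.
buf = PreRollBuffer()
for i in range(200):                      # 200 blocks x 10 ms = 2 s
    buf.push([i] * BLOCK_SIZE)
recording = buf.drain()                   # the recording starts with the pre-roll
print(len(recording) / SAMPLE_RATE)       # 0.5
```

A `deque` with `maxlen` discards the oldest block automatically, so memory stays constant no matter how long the app sits idle.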

🚀 Quick Start

Install (Python 3.11–3.13)

git clone https://github.com/drajb/whisper-local.git
cd whisper-local
pip install -e .

Launch

  • Terminal: whisper-local (or wl for short)
  • Double-click: whisper-local.cmd (Windows)
  • Start Menu / Startup: create a shortcut to whisper-local.cmd and drop it in %APPDATA%\Microsoft\Windows\Start Menu\Programs\Startup\ for autostart

First launch downloads the tiny Whisper model (~75 MB) into your HuggingFace cache. After that, everything runs offline.

Use it

Action | Windows | macOS
Hold to record | Ctrl+Win | Fn+Ctrl
Stop & paste | release key (push-to-talk) or Ctrl | release or Fn
Stop & auto-send (Enter) | Alt | Option
Cancel | Esc | Shift
Voice command mode | Alt+Win | Fn+Command

Verify everything works

whisper-local --doctor

Runs through Python version, dependencies, config validation, audio devices, model cache, hotkey backend, and recent log errors. Exit 0 = clean.
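Because a clean run exits with 0, the check is easy to script, for example from a login script or a CI job. The sketch below is one possible wrapper; `doctor_is_clean` is a hypothetical helper name, and the stand-in commands at the bottom exist only so the example runs on a machine without Whisper Local installed.

```python
import subprocess
import sys


def doctor_is_clean(command=("whisper-local", "--doctor")):
    """Run the diagnostics command and report whether it exited with 0."""
    try:
        result = subprocess.run(list(command), capture_output=True, text=True)
    except FileNotFoundError:
        # The executable is not on PATH, so the check cannot pass.
        return False
    return result.returncode == 0


# Stand-in commands so the sketch runs anywhere; call doctor_is_clean()
# with no arguments to run the real diagnostics.
ok = doctor_is_clean([sys.executable, "-c", "raise SystemExit(0)"])
bad = doctor_is_clean([sys.executable, "-c", "raise SystemExit(3)"])
print(ok, bad)  # True False
```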


🗣️ Voice Commands

Speak a trigger to run keyboard shortcuts, type snippets, or launch programs. Defined in:

  • Windows: %APPDATA%\whisperkey\commands.yaml
  • macOS: ~/.whisperkey/commands.yaml

commands:
  # Send a keyboard shortcut
  - trigger: "undo"
    hotkey: "ctrl+z"

  # Deliver pre-written text
  - trigger: "my email"
    type: "user@example.com"

  # Run a shell command
  - trigger: "open notepad"
    run: 'notepad.exe'

Edits hot-reload — no app restart required. See docs/voice-commands.md for the full guide.

⚠️ Voice commands with run: execute through your system shell with your user privileges. Only add commands you trust.
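To make the three action types concrete, here is a small sketch of how a transcript might be matched against the triggers from the example file. The matching rule (exact match after lowercasing and stripping punctuation) and the function names are assumptions for illustration; the real matcher may be more lenient or work differently.

```python
import string

# Mirrors the commands.yaml example above as plain Python data.
COMMANDS = [
    {"trigger": "undo", "hotkey": "ctrl+z"},
    {"trigger": "my email", "type": "user@example.com"},
    {"trigger": "open notepad", "run": "notepad.exe"},
]


def normalize(text):
    """Lowercase, drop punctuation, and collapse whitespace."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(cleaned.split())


def match_command(transcript, commands=COMMANDS):
    """Return (action, value) for the first trigger equal to the transcript."""
    spoken = normalize(transcript)
    for command in commands:
        if normalize(command["trigger"]) == spoken:
            for action in ("hotkey", "type", "run"):
                if action in command:
                    return action, command[action]
    return None


print(match_command("Undo."))          # ('hotkey', 'ctrl+z')
print(match_command("My email"))       # ('type', 'user@example.com')
print(match_command("hello world"))    # None
```

Normalising both sides matters because Whisper usually returns capitalised, punctuated text ("Undo.") while triggers are written in lowercase.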


⚡ GPU Acceleration

On first launch, Whisper Local detects your GPU and offers one-press install of the required runtime libraries. Supports NVIDIA CUDA and AMD ROCm.

For manual setup or AMD RDNA 1, see docs/gpu-setup.md.


🌐 Local OpenAI-Compatible API

Whisper Local doubles as a drop-in local replacement for the OpenAI Whisper API — fully offline. Point any tool that speaks POST /v1/audio/transcriptions at it (Cursor, VS Code Continue, Open WebUI, n8n, custom scripts, anything else).

whisper-local --serve            # listens on http://127.0.0.1:7777
whisper-local --serve --serve-port 8080
# Same request shape as the OpenAI transcription endpoint:
curl -X POST http://127.0.0.1:7777/v1/audio/transcriptions \
  -F file=@audio.wav -F model=whisper-1 -F response_format=text

Same Whisper model you use for dictation. Same GPU. No API key. No rate limit. No outgoing traffic.
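The same request can be sent from Python with only the standard library. This is a sketch under the assumption that the server accepts the multipart fields shown in the curl example (`file`, `model`, `response_format`); `build_multipart` and `transcribe` are hypothetical helper names, not part of Whisper Local.

```python
import urllib.request
import uuid


def build_multipart(fields, file_field, filename, file_bytes):
    """Encode form fields plus one file as multipart/form-data."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (f'--{boundary}\r\n'
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f'{value}\r\n').encode()
        )
    parts.append(
        (f'--{boundary}\r\n'
         f'Content-Disposition: form-data; name="{file_field}"; '
         f'filename="{filename}"\r\n'
         f'Content-Type: application/octet-stream\r\n\r\n').encode()
        + file_bytes + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(path, base_url="http://127.0.0.1:7777"):
    """POST an audio file to the local server and return the plain-text reply."""
    with open(path, "rb") as handle:
        audio = handle.read()
    body, content_type = build_multipart(
        {"model": "whisper-1", "response_format": "text"},
        "file", "audio.wav", audio,
    )
    request = urllib.request.Request(
        f"{base_url}/v1/audio/transcriptions",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return response.read().decode("utf-8")


# Requires `whisper-local --serve` to be running in another terminal:
# print(transcribe("audio.wav"))
```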


🎛️ Profiles

Switch between presets from the tray icon → Profile:

Profile | Behaviour
Dictation | General-purpose voice typing, auto-paste on
Chat | Push-to-talk, auto-paste + auto-send via Alt
Code | Copy-only mode for editors, never auto-sends
Notes | Quiet copy-to-clipboard, voice commands disabled

Edit or add new profiles in %APPDATA%\whisperkey\profiles.yaml.


🪟 Per-app rules

Different apps want different behaviour. Whisper Local detects the foreground window before delivering each transcription and matches it against rules in %APPDATA%\whisperkey\app_rules.yaml:

rules:
  # Chat apps: send the message immediately
  - match: ["slack.exe", "discord.exe"]
    auto_send: true

  # Code editors: never auto-send, copy only
  - match: ["code.exe", "cursor.exe"]
    auto_paste: false

  # Password managers: skip delivery entirely
  - match: ["1password.exe", "bitwarden.exe"]
    suppress: true

Hot-reloads — edit and the next transcription picks it up.
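As a rough illustration of how such rules could be resolved, the sketch below overlays the first matching rule on a set of baseline delivery settings. The baseline values, the case-insensitive comparison, and the first-match-wins policy are all assumptions; `resolve_delivery` is a hypothetical name.

```python
# Mirrors the app_rules.yaml example above as plain Python data.
RULES = [
    {"match": ["slack.exe", "discord.exe"], "auto_send": True},
    {"match": ["code.exe", "cursor.exe"], "auto_paste": False},
    {"match": ["1password.exe", "bitwarden.exe"], "suppress": True},
]

# Hypothetical baseline used when no rule matches.
BASELINE = {"auto_paste": True, "auto_send": False, "suppress": False}


def resolve_delivery(process_name, rules=RULES, baseline=BASELINE):
    """Return delivery settings for the given foreground process name."""
    settings = dict(baseline)
    name = process_name.lower()
    for rule in rules:
        if name in (m.lower() for m in rule["match"]):
            # Overlay every key except the match list itself.
            settings.update({k: v for k, v in rule.items() if k != "match"})
            break  # first matching rule wins (an assumption)
    return settings


print(resolve_delivery("Slack.exe"))
# {'auto_paste': True, 'auto_send': True, 'suppress': False}
print(resolve_delivery("1password.exe")["suppress"])   # True
print(resolve_delivery("notepad.exe"))
# {'auto_paste': True, 'auto_send': False, 'suppress': False}
```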

🧹 Optional LLM cleanup

If you have Ollama running locally, Whisper Local can pipe each transcript through a small local model for punctuation and capitalisation polish. Off by default and fully local — set postprocess.ollama.enabled: true in user_settings.yaml to enable.

postprocess:
  capitalize_first: true        # works without Ollama
  ensure_punctuation: true      # works without Ollama
  strip_filler_words: true      # works without Ollama
  ollama:
    enabled: false              # set true to opt in
    endpoint: http://localhost:11434
    model: llama3.2
    timeout: 5
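The three options marked "works without Ollama" are simple text transforms. The sketch below shows one plausible way they could behave; the filler-word list and the exact rules are assumptions, since the project's own implementation is not shown here.

```python
import re

# Hypothetical filler list; the project's actual list is not documented here.
FILLERS = ("um", "uh", "erm")


def postprocess(text, capitalize_first=True, ensure_punctuation=True,
                strip_filler_words=True):
    """Apply the three Ollama-free cleanup steps to a transcript."""
    if strip_filler_words:
        # Remove whole-word fillers plus an optional trailing comma and spaces.
        pattern = r"\b(?:" + "|".join(FILLERS) + r")\b,?\s*"
        text = re.sub(pattern, "", text, flags=re.IGNORECASE)
    text = " ".join(text.split())          # collapse leftover whitespace
    if capitalize_first and text:
        text = text[0].upper() + text[1:]
    if ensure_punctuation and text and text[-1] not in ".!?":
        text += "."
    return text


print(postprocess("um so the build uh failed again"))
# So the build failed again.
```

The word-boundary anchors keep the filler pattern from touching words that merely contain a filler, such as "drum".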

⚙️ Configuration

Local settings live at:

  • Windows: %APPDATA%\whisperkey\user_settings.yaml
  • macOS: ~/.whisperkey/user_settings.yaml

Delete the file and restart to reset to defaults. Highlights:

Option | Default | Notes
whisper.model | tiny | Any model from whisper.models. Larger = more accurate, slower
whisper.device | cpu | cpu or cuda (NVIDIA/AMD)
whisper.compute_type | int8 | int8/float16/float32
whisper.language | auto | Auto-detect or specific language code
whisper.hotwords | [] | Words the model should favour: names, jargon
hotkey.recording_hotkey | ctrl+win | Configurable
hotkey.recording_mode | toggle | toggle or push_to_talk
vad.vad_realtime_enabled | true | Auto-stop on silence
clipboard.auto_paste | true | false = copy only
clipboard.delivery_method | paste | paste (Ctrl+V) or type (direct injection)
voice_commands.enabled | true | Enable command mode
audio.host | null | WASAPI recommended on Windows for low latency

Full reference: config.defaults.yaml.
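Since deleting the user file resets everything to defaults, it is natural to think of user_settings.yaml as a set of overrides layered on top of config.defaults.yaml. The sketch below illustrates that layering with a recursive merge; how the app actually combines the two files is an assumption here, and `merge_settings` is a hypothetical name.

```python
import copy

# A few defaults taken from the table above.
DEFAULTS = {
    "whisper": {"model": "tiny", "device": "cpu", "compute_type": "int8"},
    "clipboard": {"auto_paste": True, "delivery_method": "paste"},
}


def merge_settings(defaults, overrides):
    """Recursively overlay overrides on a copy of defaults."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_settings(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Example user override: switch to the GPU with half precision.
user = {"whisper": {"device": "cuda", "compute_type": "float16"}}
effective = merge_settings(DEFAULTS, user)
print(effective["whisper"])
# {'model': 'tiny', 'device': 'cuda', 'compute_type': 'float16'}
```

With this layering, a user file only needs to list the keys that differ from the defaults.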


🛠️ CLI Reference

whisper-local                      # Run the app (or use `wl`)
whisper-local --setup              # Interactive setup wizard (model, mode, mic)
whisper-local --doctor             # Run diagnostics
whisper-local --stats              # Transcription history & time saved
whisper-local --version            # Print version
whisper-local --quit               # Stop the running instance
whisper-local --export-settings DIR        # Back up user_settings + commands
whisper-local --import-settings DIR        # Restore from a backup
whisper-local --export-transcripts FILE    # Dump history (.txt/.md/.csv)
whisper-local --import-vocab FOLDER        # Mine a folder for hotwords
whisper-local --settings           # Open the settings GUI (no YAML editing required)
whisper-local --history            # Browse and search transcript history
whisper-local --cheat-sheet        # Show your currently configured hotkeys
whisper-local --selftest           # Run an automated self-test (mic, model, transcription)
whisper-local --bundle-logs        # Create a redacted diagnostic zip for bug reports
whisper-local --serve              # Run a local OpenAI-compatible Whisper API on :7777
whisper-local --test               # Run a separate test instance (own mutex)

Launching while an instance is already running takes over — the old one is replaced cleanly, no manual quit needed.


🏗️ How it works

┌─────────────────────┐  ┌──────────────────┐  ┌─────────────────────┐
│  global-hotkeys /   │  │   sounddevice +  │  │  faster-whisper /   │
│  NSEvent (macOS)    │─▶│  500ms ring buf  │─▶│  ctranslate2 (GPU)  │
└─────────────────────┘  │  + TEN VAD       │  └──────────┬──────────┘
                         └──────────────────┘             │
                                                          ▼
                         ┌──────────────────┐  ┌─────────────────────┐
                         │  Voice command   │◀─│  Transcribed text   │
                         │  matcher         │  │                     │
                         └──────────────────┘  └──────────┬──────────┘
                                                          ▼
                                                ┌─────────────────────┐
                                                │  ctypes SendInput / │
                                                │  Quartz CGEvent     │
                                                │  → cursor           │
                                                └─────────────────────┘

🔒 Privacy pledge

Whisper Local makes the following network calls and no others:

  1. First launch only: downloads the Whisper model from huggingface.co into your local cache.
  2. GPU onboarding (opt-in): if you accept the GPU setup prompt, pip install pulls CUDA / ROCm runtime packages from PyPI / repo.radeon.com.
  3. Update check (opt-in, off by default): if you set update_check.enabled: true, the app checks GitHub for a new release once a day.

After setup, and with update checks left at their default (off), there is zero network traffic. Confirm by running whisper-local --doctor and inspecting the source: every network entry point lives in onboarding.py and is gated behind explicit user prompts.


📦 Tech stack

faster-whisper · ctranslate2 · sounddevice · ten-vad · pyperclip · pystray · ruamel.yaml · playsound3
Windows-only: global-hotkeys · pywin32 · ctypes SendInput
macOS-only: pyobjc-framework-Quartz · pyobjc-framework-ApplicationServices


📚 Documentation

Hit a wall? Run whisper-local --doctor or whisper-local --selftest first — they catch 90% of issues.


🤝 Contributing

Contributions of all kinds are welcome — bug fixes, new features, docs improvements, or just opening an issue with a clear reproduction. This project is maintained on a best-effort basis with no SLA; please be patient with response times.

git clone https://github.com/drajb/whisper-local.git
cd whisper-local
pip install -e .
python -m unittest tests.test_smoke   # smoke suite; should report OK

See CONTRIBUTING.md for the full guide and CODE_OF_CONDUCT.md for community standards. By contributing you agree your code will be MIT licensed. Found a security issue? See SECURITY.md — please don't open a public issue.

Good first issues are tagged here. The full credit list is in AUTHORS.md.


☕ Support

Whisper Local is free and always will be. If it saves you time or a monthly subscription, consider starring the repo and sharing it with people who'd find it useful; it helps the project grow. No pressure.


🙏 Credit

Forked from whisper-key-local by Pin Wang — huge thanks to the original work that made this fork possible. The full list of credits, including every open-source library Whisper Local builds on, is in AUTHORS.md.

MIT licensed; original copyright preserved in LICENSE.


⭐ If you find this useful, please star the repo — it helps others discover it.

Maintained by Rohit Burani (@drajb)

Website · GitHub · Discussions · Report a bug · Request a feature

Tags: whisper · dictation · speech-to-text · voice-typing · transcription · ai-dictation · local-ai · offline · push-to-talk · voice-recognition · accessibility · faster-whisper · privacy · self-hosted · wispr-flow-alternative · dragon-naturallyspeaking-alternative · otter-alternative · ollama · voice-commands · windows · macos · python
