VoiceScript
You speak, it types — clean output on your clipboard in seconds, no matter what app you're in.
VoiceScript records your voice via microphone, transcribes it locally and offline using OpenAI Whisper (via faster-whisper), then sends the raw transcript to Claude (Anthropic) for cleanup. The polished result lands on your clipboard, ready to paste anywhere. A translucent HUD overlay sits at the bottom of your screen showing current state: idle, recording, or processing. Five output profiles let you shape the same spoken words into a plain transcript, a professional email, a Slack message, structured meeting notes, or clean code comments — all without leaving your keyboard.
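The flow described above (record, transcribe, clean up, copy) can be sketched as a chain of four stages. This is only an illustration of how the stages compose, not VoiceScript's actual code: `run_pipeline` and its parameters are hypothetical names, and the stages are stand-in lambdas so the sketch runs without a microphone, a Whisper model, or an API key.

```python
from typing import Callable


def run_pipeline(
    record: Callable[[], bytes],
    transcribe: Callable[[bytes], str],
    clean_up: Callable[[str, str], str],
    copy_to_clipboard: Callable[[str], None],
    profile: str = "transcript",
) -> str:
    """Run one dictation: record, transcribe, clean up, copy."""
    audio = record()                  # microphone capture
    raw = transcribe(audio)           # local speech-to-text
    polished = clean_up(raw, profile)  # cleanup for the active profile
    copy_to_clipboard(polished)       # result is ready to paste
    return polished


# Stand-in stages (assumptions, not the real implementations).
clipboard: list[str] = []
result = run_pipeline(
    record=lambda: b"\x00\x01",
    transcribe=lambda audio: "um so the build is uh green",
    clean_up=lambda raw, profile: raw.replace("um ", "").replace("uh ", ""),
    copy_to_clipboard=clipboard.append,
)
print(result)
```

In the real tool the second stage would be faster-whisper and the third a call to Claude; only the order of the stages is taken from the description above.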
Requirements
- Python 3.11 or newer
- ANTHROPIC_API_KEY: get a key at https://console.anthropic.com/
- System browser engine for the HUD overlay (per OS):
  - Linux (Debian/Ubuntu): sudo apt install python3-gi-cairo libwebkit2gtk-4.1-0
  - Linux (Fedora/RHEL): sudo dnf install webkit2gtk4.1
  - Windows: Edge WebView2, pre-installed on Windows 10 and later; if missing, download from Microsoft
  - macOS: No additional install required (native WebKit)
- First run downloads the Whisper large-v3 model (~3 GB); this is a one-time download, cached locally
Installation
From Test PyPI
pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ voicescript
From source
git clone https://github.com/your-org/voicescript.git
cd voicescript
pip install -e .
First Run
1. Set your API key:
export ANTHROPIC_API_KEY=your-key-here
2. Quick test (standalone, no daemon):
voicescript record
Speak, then press Enter to stop. The transcript is cleaned up by Claude and copied to your clipboard. Use this mode to verify your setup before running the daemon.
3. Start the daemon:
voicescript start
The daemon runs in the background, listens for F9, and shows the HUD overlay.
4. Trigger a recording:
Press F9 to start recording. Press F9 again to stop. The result is copied to your clipboard.
On Wayland, use the CLI command instead (see Wayland Note):
voicescript trigger
5. Cycle output profiles:
Hold F9 to cycle to the next profile. Or from the command line:
voicescript profile next
6. Check daemon status:
voicescript status
7. Stop the daemon:
voicescript stop
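The tap-versus-hold behaviour of F9 in steps 4 and 5 can be pictured as a small state machine. The sketch below is an assumption about how such logic might look, not the daemon's implementation: the 0.5 second threshold, the function names, the action labels, and the choice to ignore taps during processing are all hypothetical.

```python
HOLD_THRESHOLD_S = 0.5  # assumed value; the real threshold is not documented


def classify_press(pressed_at: float, released_at: float,
                   threshold: float = HOLD_THRESHOLD_S) -> str:
    """Label a key press as a 'tap' or a 'hold' from its duration in seconds."""
    return "hold" if released_at - pressed_at >= threshold else "tap"


def handle_gesture(state: str, gesture: str) -> tuple[str, str]:
    """Return (new_state, action) for a gesture in a given HUD state."""
    if gesture == "hold":
        return state, "cycle_profile"          # hold never changes the state
    if state == "idle":
        return "recording", "start_recording"  # first tap starts recording
    if state == "recording":
        return "processing", "stop_recording"  # second tap stops it
    return state, "ignore"                     # taps while processing: ignored here


print(handle_gesture("idle", classify_press(10.0, 10.1)))
print(handle_gesture("idle", classify_press(10.0, 11.0)))
```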
Output Profiles
VoiceScript shapes your spoken words into five distinct output formats. Tap F9 to record, hold F9 to cycle profiles (or use voicescript profile next).
| Profile | Icon | Description |
|---|---|---|
| Transcript | 📝 | Light cleanup only — speech preserved verbatim, filler words and stutters removed |
| Email | 📧 | Polite professional email with greeting and closing signature |
| Slack | 💬 | Casual, informal message — no formal greetings or closings |
| Meeting Notes | 📋 | Bullet-point structured notes, organised by topic |
| Code Comment | 💻 | Clean technical documentation suitable for inline code comments |
All profiles preserve code-switching between Polish and English — no word is ever translated.
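Cycling to the "next" profile amounts to stepping through a fixed list and wrapping around at the end. A minimal sketch, assuming the order matches the table above and that the last profile wraps back to the first; the identifiers `slack`, `meeting_notes`, and `code_comment` are guesses (only `transcript` and `email` appear in the configuration examples).

```python
# Assumed order and identifiers; see the note above.
PROFILES = ["transcript", "email", "slack", "meeting_notes", "code_comment"]


def next_profile(active: str) -> str:
    """Return the profile that follows `active`, wrapping at the end of the list."""
    index = PROFILES.index(active)
    return PROFILES[(index + 1) % len(PROFILES)]


print(next_profile("transcript"))
print(next_profile("code_comment"))
```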
Configuration
Config file location: ~/.config/voicescript/config.toml
The file is created automatically on first run with the defaults shown below. Edit it with any text editor.
[transcription]
model = "large-v3"
[state]
active_profile = "transcript"
Config keys
[transcription]
| Key | Default | Description |
|---|---|---|
| model | "large-v3" | Whisper model size. Options: tiny, base, small, medium, large-v2, large-v3, large-v3-turbo. Smaller models are faster but less accurate. |
[state]
| Key | Default | Description |
|---|---|---|
| active_profile | "transcript" | Last-used profile. Updated automatically when you cycle profiles; you do not need to set this manually. |
Profile prompt overrides
You can replace any profile's Claude system prompt with your own. Add a [profiles.PROFILENAME] section:
[profiles.email]
prompt = """
Write this as a very brief 2-sentence email. No greeting, no closing.
Output ONLY the email body.
Text: {raw}
"""
The {raw} placeholder is replaced with the Whisper transcript at runtime. If you omit {raw}, the transcript will not be passed to Claude.
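The effect of the {raw} placeholder can be illustrated with plain string substitution. This is a sketch of the idea only: `render_prompt` is a hypothetical helper, and using `str.replace` (so that other braces in a prompt are left alone) is an assumption about the substitution, not a documented detail.

```python
def render_prompt(template: str, raw: str) -> str:
    """Insert the transcript wherever the template contains {raw}."""
    return template.replace("{raw}", raw)


with_placeholder = "Write a brief email.\nText: {raw}"
without_placeholder = "Write a brief email."

print(render_prompt(with_placeholder, "ship it friday"))
# Without {raw} the transcript never reaches the prompt text.
print(render_prompt(without_placeholder, "ship it friday"))
```

A quick `"{raw}" in prompt` check before saving a custom prompt catches the second case.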
Wayland Note
On native Wayland sessions, the global F9 hotkey is not available. This is a platform limitation — no stable Python library can capture global hotkeys on Wayland without elevated permissions.
Workarounds:
- Assign F9 in your desktop environment's keyboard settings to run:
  voicescript trigger
  In GNOME: Settings → Keyboard → Custom Shortcuts. In KDE: System Settings → Shortcuts → Custom Shortcuts.
- Allow evdev-based hotkey capture (requires logout):
  sudo usermod -aG input $USER
  Log out and back in. The daemon will detect this and use the evdev backend automatically.
- Cycle profiles from the command line at any time:
  voicescript profile next
VoiceScript prints a diagnostic at daemon startup if it detects a native Wayland session.
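If you want to check for a native Wayland session yourself, the usual signals are the `XDG_SESSION_TYPE` and `WAYLAND_DISPLAY` environment variables. The sketch below is one common heuristic; how VoiceScript performs its own detection is not documented here, so treat `is_native_wayland` as a hypothetical helper.

```python
import os
from collections.abc import Mapping


def is_native_wayland(env: Mapping[str, str] = os.environ) -> bool:
    """Guess whether the current desktop session is native Wayland."""
    session_type = env.get("XDG_SESSION_TYPE", "").lower()
    if session_type:
        # Set by the login manager on most modern Linux desktops.
        return session_type == "wayland"
    # Fallback: the compositor's socket name is exported to clients.
    return bool(env.get("WAYLAND_DISPLAY"))


print(is_native_wayland({"XDG_SESSION_TYPE": "wayland"}))
print(is_native_wayland({"XDG_SESSION_TYPE": "x11"}))
```

Passing the environment as a mapping keeps the function easy to test; call it with no arguments to inspect the real session.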
Troubleshooting
"Error: ANTHROPIC_API_KEY is not set"
export ANTHROPIC_API_KEY=your-key-here
Add this line to your shell profile (~/.bashrc, ~/.zshrc) to make it permanent.
"No audio device found" or microphone not working
Check that your microphone is connected and that your OS has granted permission to access it. Verify available devices:
python -c "import sounddevice; print(sounddevice.query_devices())"
"pywebview failed to import" or HUD not appearing
Install the system browser engine for your OS (see Requirements above). On Linux:
# Debian/Ubuntu
sudo apt install python3-gi-cairo libwebkit2gtk-4.1-0
# Fedora/RHEL
sudo dnf install webkit2gtk4.1
HUD window opens but stays blank or crashes
Check the HUD log:
cat ~/.cache/voicescript/hud.log
Common cause: missing system WebKit or WebView2. Reinstall the system browser engine.
"Daemon is not running" when using start/stop/status/trigger
The daemon process may have exited unexpectedly. Restart it:
voicescript stop
voicescript start
First run takes a long time
The Whisper large-v3 model (~3 GB) is downloaded on first use. This is a one-time download cached at ~/.cache/huggingface/. Subsequent runs load the model from disk.
Screenshots
The HUD overlay sits at the bottom of your screen and shows the current state:
| State | Description |
|---|---|
| Idle | Translucent bar showing active profile name and icon |
| Recording | Animated red pulse indicating audio capture in progress |
| Processing | Spinning indicator while Whisper transcribes and Claude cleans up |
License
MIT — see LICENSE.