VoiceScript

You speak, it types — clean output on your clipboard in seconds, no matter what app you're in.

VoiceScript records your voice via microphone, transcribes it locally and offline using OpenAI Whisper (via faster-whisper), then sends the raw transcript to Claude (Anthropic) for cleanup. The polished result lands on your clipboard, ready to paste anywhere. A translucent HUD overlay sits at the bottom of your screen showing current state: idle, recording, or processing. Five output profiles let you shape the same spoken words into a plain transcript, a professional email, a Slack message, structured meeting notes, or clean code comments — all without leaving your keyboard.
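The flow described above can be pictured as four stages chained together. The sketch below is a hypothetical illustration, not VoiceScript's actual code: the name `run_pipeline` and the four stage callables are assumptions, and the stand-in stages replace the real microphone, Whisper, Claude, and clipboard calls so the example runs anywhere.

```python
from typing import Callable


def run_pipeline(
    record: Callable[[], bytes],
    transcribe: Callable[[bytes], str],
    clean_up: Callable[[str], str],
    copy_to_clipboard: Callable[[str], None],
) -> str:
    """Chain the four stages: record, transcribe, clean up, copy."""
    audio = record()               # microphone capture
    raw = transcribe(audio)        # local, offline speech-to-text
    polished = clean_up(raw)       # cleanup according to the active profile
    copy_to_clipboard(polished)    # result is ready to paste
    return polished


def fake_clean_up(raw: str) -> str:
    """Stand-in for the Claude step: drop two filler words, add a full stop."""
    words = [w for w in raw.split() if w not in {"um", "uh"}]
    return " ".join(words).capitalize() + "."


# Stand-in stages so the sketch runs without a microphone or an API key.
clipboard: list[str] = []
result = run_pipeline(
    record=lambda: b"\x00\x01",
    transcribe=lambda audio: "um so the build is uh green",
    clean_up=fake_clean_up,
    copy_to_clipboard=clipboard.append,
)
print(result)  # So the build is green.
```

Passing the stages in as callables keeps the order of operations visible and makes each stage easy to swap out when testing.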

Requirements

  • Python 3.11 or newer
  • ANTHROPIC_API_KEY — get a key at https://console.anthropic.com/
  • System browser engine for the HUD overlay (per OS):
    • Linux (Debian/Ubuntu): sudo apt install python3-gi-cairo libwebkit2gtk-4.1-0
    • Linux (Fedora/RHEL): sudo dnf install webkit2gtk4.1
    • Windows: Edge WebView2 — pre-installed on Windows 10 and later; if missing, download from Microsoft
    • macOS: No additional install required (native WebKit)
  • First run downloads the Whisper large-v3 model (~3 GB) — one-time, cached locally
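The first two requirements can be checked before launching anything. This is a minimal sketch, assuming a hypothetical helper named `preflight`; it is not part of VoiceScript and does not check the system browser engine or the model download.

```python
import os
import sys


def preflight(version_info=sys.version_info, env=os.environ) -> list[str]:
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if tuple(version_info[:2]) < (3, 11):
        problems.append("Python 3.11 or newer is required")
    if not env.get("ANTHROPIC_API_KEY"):
        problems.append("ANTHROPIC_API_KEY is not set")
    return problems


# Check the current interpreter and environment.
for problem in preflight():
    print(problem)
```

The arguments default to the running interpreter and environment, and can be overridden to test other combinations.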

Installation

From Test PyPI

pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ voicescript

From source

git clone https://github.com/your-org/voicescript.git
cd voicescript
pip install -e .

First Run

1. Set your API key:

export ANTHROPIC_API_KEY=your-key-here

2. Quick test (standalone, no daemon):

voicescript record

Speak, then press Enter to stop. The transcript is cleaned up by Claude and copied to your clipboard. This mode is useful for verifying your setup before running the daemon.

3. Start the daemon:

voicescript start

The daemon runs in the background, listens for F9, and shows the HUD overlay.

4. Trigger a recording:

Press F9 to start recording. Press F9 again to stop. The result is copied to your clipboard.

On Wayland, use the CLI command instead (see Wayland Note):

voicescript trigger

5. Cycle output profiles:

Hold F9 to cycle to the next profile, or run this from the command line:

voicescript profile next

6. Check daemon status:

voicescript status

7. Stop the daemon:

voicescript stop
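Steps 4 and 5 above use the same key for two actions, so the daemon has to tell a short press of F9 from a long one. One common way to do that is to compare the press duration against a threshold. The sketch below assumes a hypothetical `classify_press` helper and a made-up threshold of 0.5 seconds; the value VoiceScript actually uses is not documented here.

```python
HOLD_THRESHOLD_S = 0.5  # hypothetical value; the real threshold is not documented


def classify_press(pressed_at: float, released_at: float,
                   threshold: float = HOLD_THRESHOLD_S) -> str:
    """Return "hold" for a long press and "tap" for a short one."""
    if released_at < pressed_at:
        raise ValueError("release cannot precede press")
    return "hold" if released_at - pressed_at >= threshold else "tap"


print(classify_press(10.0, 10.1))  # tap
print(classify_press(10.0, 11.0))  # hold
```

A tap would then toggle recording, and a hold would cycle to the next profile.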

Output Profiles

VoiceScript shapes your spoken words into five distinct output formats. Tap F9 to record, hold F9 to cycle profiles (or use voicescript profile next).

| Profile | Icon | Description |
|---|---|---|
| Transcript | 📝 | Light cleanup only: wording preserved as spoken, with filler words and stutters removed |
| Email | 📧 | Polite professional email with greeting and closing signature |
| Slack | 💬 | Casual, informal message with no formal greetings or closings |
| Meeting Notes | 📋 | Bullet-point structured notes, organised by topic |
| Code Comment | 💻 | Clean technical documentation suitable for inline code comments |

All profiles preserve code-switching between Polish and English — no word is ever translated.
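Cycling through profiles amounts to stepping through a fixed list and wrapping around at the end. In the sketch below, only the identifiers `transcript` and `email` appear elsewhere in this README. The other three names, the cycling order (taken from the order in which the profiles are listed), and the `next_profile` helper are assumptions for illustration.

```python
# "transcript" and "email" appear in the configuration examples;
# the remaining identifiers are guesses made for this sketch.
PROFILES = ["transcript", "email", "slack", "meeting_notes", "code_comment"]


def next_profile(current: str, profiles: list[str] = PROFILES) -> str:
    """Return the profile after `current`, wrapping around at the end."""
    index = profiles.index(current)
    return profiles[(index + 1) % len(profiles)]


print(next_profile("transcript"))    # email
print(next_profile("code_comment"))  # transcript
```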

Configuration

Config file location: ~/.config/voicescript/config.toml

The file is created automatically on first run with the defaults shown below. Edit it with any text editor.

[transcription]
model = "large-v3"

[state]
active_profile = "transcript"

Config keys

[transcription]

| Key | Default | Description |
|---|---|---|
| model | "large-v3" | Whisper model size. Options: tiny, base, small, medium, large-v2, large-v3, large-v3-turbo. Smaller models are faster but less accurate. |

[state]

| Key | Default | Description |
|---|---|---|
| active_profile | "transcript" | Last-used profile. Updated automatically when you cycle profiles. You do not need to set this manually. |

Profile prompt overrides

You can replace any profile's Claude system prompt with your own. Add a [profiles.PROFILENAME] section:

[profiles.email]
prompt = """
Write this as a very brief 2-sentence email. No greeting, no closing.
Output ONLY the email body.

Text: {raw}
"""

The {raw} placeholder is replaced with the Whisper transcript at runtime. If you omit {raw}, the transcript will not be passed to Claude.
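Because a missing placeholder silently drops the transcript, it can be worth checking a custom prompt before saving it. The sketch below shows one way the substitution could work. The helpers `render_prompt` and `has_placeholder` are hypothetical, and using `str.replace` is an assumption about the mechanism, not a description of VoiceScript's code.

```python
PLACEHOLDER = "{raw}"


def has_placeholder(template: str) -> bool:
    """Report whether the template will receive the transcript."""
    return PLACEHOLDER in template


def render_prompt(template: str, transcript: str) -> str:
    """Insert the transcript wherever the placeholder appears."""
    # str.replace leaves any other braces in the template alone,
    # whereas str.format would try to treat them as fields.
    return template.replace(PLACEHOLDER, transcript)


template = "Write a brief email.\n\nText: {raw}"
print(has_placeholder(template))               # True
print(render_prompt(template, "see you at 3"))
print(has_placeholder("Write a brief email."))  # False
```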

Wayland Note

On native Wayland sessions, the global F9 hotkey is not available. This is a platform limitation — no stable Python library can capture global hotkeys on Wayland without elevated permissions.

Workarounds:

  1. Assign F9 in your desktop environment's keyboard settings to run:

    voicescript trigger
    

    In GNOME: Settings → Keyboard → Custom Shortcuts. In KDE: System Settings → Shortcuts → Custom Shortcuts.

  2. Allow evdev-based hotkey capture (requires logout):

    sudo usermod -aG input $USER
    

    Log out and back in. The daemon will detect this and use the evdev backend automatically.

  3. Cycle profiles from the command line at any time:

    voicescript profile next
    

VoiceScript prints a diagnostic at daemon startup if it detects a native Wayland session.
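To check in advance whether this note applies to your session, you can inspect the environment variables that desktop sessions normally set. The sketch below is a best-effort check under that assumption; `is_native_wayland` is a hypothetical helper, and the daemon's own detection may use different logic.

```python
import os


def is_native_wayland(env=os.environ) -> bool:
    """Best-effort check for a native Wayland session."""
    session_type = env.get("XDG_SESSION_TYPE", "").lower()
    if session_type:
        return session_type == "wayland"
    # Fall back to WAYLAND_DISPLAY when the session type is not set.
    return bool(env.get("WAYLAND_DISPLAY"))


if is_native_wayland():
    print("Wayland session: bind a shortcut to 'voicescript trigger'")
else:
    print("No native Wayland session detected")
```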

Troubleshooting

"Error: ANTHROPIC_API_KEY is not set"

export ANTHROPIC_API_KEY=your-key-here

Add this line to your shell profile (~/.bashrc or ~/.zshrc) to make it permanent.

"No audio device found" or microphone not working

Check that your microphone is connected and that your OS has granted permission to access it. Verify available devices:

python -c "import sounddevice; print(sounddevice.query_devices())"

"pywebview failed to import" or HUD not appearing

Install the system browser engine for your OS (see Requirements above). On Linux:

# Debian/Ubuntu
sudo apt install python3-gi-cairo libwebkit2gtk-4.1-0

# Fedora/RHEL
sudo dnf install webkit2gtk4.1

HUD window opens but stays blank or crashes

Check the HUD log:

cat ~/.cache/voicescript/hud.log

Common cause: missing system WebKit or WebView2. Reinstall the system browser engine.

"Daemon is not running" when using start/stop/status/trigger

The daemon process may have exited unexpectedly. Restart it:

voicescript stop
voicescript start

First run takes a long time

The Whisper large-v3 model (~3 GB) is downloaded on first use. This is a one-time download cached at ~/.cache/huggingface/. Subsequent runs load the model from disk.

Screenshots

The HUD overlay sits at the bottom of your screen and shows the current state:

| State | Description |
|---|---|
| Idle | Translucent bar showing active profile name and icon |
| Recording | Animated red pulse indicating audio capture in progress |
| Processing | Spinning indicator while Whisper transcribes and Claude cleans up |
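The three states form a simple cycle. The sketch below models it as a small state machine, assuming the HUD only ever moves from idle to recording to processing and back to idle. `HudState` and `advance` are hypothetical names, and error paths such as a cancelled recording are left out.

```python
from enum import Enum


class HudState(Enum):
    IDLE = "idle"
    RECORDING = "recording"
    PROCESSING = "processing"


# Assumed cycle: idle -> recording -> processing -> idle.
TRANSITIONS = {
    HudState.IDLE: HudState.RECORDING,
    HudState.RECORDING: HudState.PROCESSING,
    HudState.PROCESSING: HudState.IDLE,
}


def advance(state: HudState) -> HudState:
    """Return the state that follows `state` in the cycle."""
    return TRANSITIONS[state]


state = HudState.IDLE
for _ in range(3):
    state = advance(state)
    print(state.value)  # recording, then processing, then idle
```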

License

MIT — see LICENSE.
