VSpell
A simple, command-line voice spelling and transcription tool. VSpell records a short audio clip, transcribes it with Whisper through the faster-whisper library, and copies the resulting text directly to your clipboard.
Features
- Fast Transcription: Quickly record and transcribe audio using the highly efficient faster-whisper library.
- Clipboard Integration: Transcribed text is automatically copied to the clipboard for immediate pasting.
- Silence Detection: Avoids processing and transcribing empty audio clips, saving time and resources.
- Noise Calibration: Includes a one-time calibration step to accurately distinguish speech from ambient noise.
- Model Selection: Choose from different Whisper model sizes (tiny, base, small, medium, large) to balance speed and accuracy.
- Audio Playback: Listen to your last recording to verify what was captured.
Installation
Before installing, ensure you have the necessary system dependencies for audio recording.
For macOS:
brew install ffmpeg
For Debian/Ubuntu:
sudo apt-get install ffmpeg
To install VSpell, clone this repository and install the package using pip.
git clone https://github.com/vibe-technologies/vspell.git
cd vspell
pip install .
This will install the necessary dependencies and make the vspell command available in your terminal.
First-Time Setup: Calibration
For VSpell to work effectively, it needs to know what "silence" sounds like in your environment. Run the calibration command once before you start using it.
Find a quiet moment and run:
vspell --calibrate
Remain silent for the 2-second duration. This will measure your ambient noise level and set a threshold for silence detection. This value is saved in ~/.config/vspell/vspell_config.json. You can re-run this anytime your environment changes (e.g., you get a new microphone or move to a noisier room).
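As a rough illustration of what a calibration step like this could compute, the sketch below derives a threshold from the root-mean-square (RMS) level of an ambient recording. The function names, the use of RMS, and the 1.5 safety margin are all assumptions made for the example; they are not taken from VSpell's source.

```python
import math
from typing import Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square amplitude of a block of audio samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def calibrate_threshold(ambient: Sequence[float], margin: float = 1.5) -> float:
    """Hypothetical rule: ambient RMS scaled by a safety margin.

    The margin keeps ordinary room noise just below the threshold so
    that only louder input (speech) is treated as non-silent.
    """
    return rms(ambient) * margin
```

A noisier room yields a higher ambient RMS and therefore a higher threshold, which is why re-running calibration after a change of environment matters.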
Usage
Once calibrated, using VSpell is simple.
Main Command
Just run the vspell command. It will listen for 2 seconds, transcribe what it hears, and copy the result to your clipboard.
vspell
Listening for 2 seconds...
Transcribing…
Transcribed: Hello, world.
Text copied to clipboard.
If you say nothing, it will detect the silence and stop.
vspell
Listening for 2 seconds...
No speech detected — nothing transcribed.
Command-Line Options
usage: vspell [-h] [--calibrate] [--punctuate] [--playback [PLAYBACK]] [--duration DURATION] [--model MODEL]
VSpell - Voice spelling tool
options:
  -h, --help            show this help message and exit
  --calibrate           Calibrate ambient noise threshold
  --punctuate           Retain punctuation and original casing in transcribed text (default is to remove punctuation and lowercase)
  --playback [PLAYBACK]
                        Playback recorded audio with optional volume multiplier (default=1.0)
  --duration DURATION   Recording duration in seconds
  --model MODEL         Whisper model size [tiny, base, small, medium, large] (default=medium)
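For readers who want to see how an interface like this maps onto Python's argparse, here is a minimal sketch that accepts the same flags. It is a reconstruction from the help text only: the build_parser name, the float types, and the way the defaults are wired are assumptions, and VSpell's real parser may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options."""
    parser = argparse.ArgumentParser(
        prog="vspell", description="VSpell - Voice spelling tool"
    )
    parser.add_argument("--calibrate", action="store_true",
                        help="Calibrate ambient noise threshold")
    parser.add_argument("--punctuate", action="store_true",
                        help="Retain punctuation and original casing")
    # nargs="?" lets --playback appear with or without a value;
    # const is used when the flag is given with no value.
    parser.add_argument("--playback", nargs="?", const=1.0, type=float,
                        default=None,
                        help="Playback volume multiplier (default=1.0)")
    # Assumed type: the docs only say "seconds", with a 2-second default.
    parser.add_argument("--duration", type=float, default=2.0,
                        help="Recording duration in seconds")
    parser.add_argument("--model", default="medium", metavar="MODEL",
                        choices=["tiny", "base", "small", "medium", "large"],
                        help="Whisper model size (default=medium)")
    return parser
```

With this layout, `--playback` on its own resolves to 1.0, `--playback 1.5` resolves to 1.5, and omitting the flag leaves the value as None so the program can tell that playback was not requested.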
Examples:
- Record for 5 seconds:
  vspell --duration 5
- Use a different model for higher accuracy (e.g., large):
  vspell --model large
- Play back the last recording at 1.5x volume:
  vspell --playback 1.5
- Transcribe text with punctuation and original casing:
  vspell --punctuate
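The difference between the default output and --punctuate can be pictured with a small text-normalization sketch. The normalize function below is hypothetical and only approximates the documented behavior (remove punctuation and lowercase by default); the exact rules VSpell applies are not stated in this README.

```python
import string


def normalize(text: str, punctuate: bool = False) -> str:
    """Hypothetical post-processing of a transcription.

    With punctuate=True the text is returned as transcribed (trimmed).
    Otherwise ASCII punctuation is removed, the text is lowercased,
    and runs of whitespace are collapsed to single spaces.
    """
    if punctuate:
        return text.strip()
    stripped = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(stripped.lower().split())


print(normalize("Hello, world."))                  # hello world
print(normalize("Hello, world.", punctuate=True))  # Hello, world.
```

Note that this simple approach also drops apostrophes, so "don't" becomes "dont"; a real implementation might treat such characters differently.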
How It Works
- Record: When you run vspell, it records audio from your default microphone for a set duration (default is 2 seconds) into a temporary .wav file.
- Analyze: It checks the audio's amplitude against the calibrated silence threshold. If it's below the threshold, the program exits.
- Transcribe: If speech is detected, the audio is passed to the faster-whisper model for transcription. The first time you use a model, it will be downloaded and cached locally in ~/.cache/huggingface/hub.
- Copy: The resulting text is copied to your system's clipboard.
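The four steps above can be sketched as a single function with the recording, transcription, and clipboard stages passed in as callables. This is only an outline of the control flow: run_once, is_silent, and the RMS-based amplitude check are assumptions for illustration, and the real tool works with a .wav file and faster-whisper rather than in-memory lists.

```python
import math
from typing import Callable, Optional, Sequence


def is_silent(samples: Sequence[float], threshold: float) -> bool:
    """Assumed check: RMS level below the calibrated threshold."""
    if not samples:
        return True
    level = math.sqrt(sum(s * s for s in samples) / len(samples))
    return level < threshold


def run_once(
    record: Callable[[], Sequence[float]],
    transcribe: Callable[[Sequence[float]], str],
    copy: Callable[[str], None],
    threshold: float,
) -> Optional[str]:
    """Record, analyze, transcribe, copy. Returns None on silence."""
    samples = record()                 # 1. Record
    if is_silent(samples, threshold):  # 2. Analyze
        return None                    #    skip transcription entirely
    text = transcribe(samples)         # 3. Transcribe
    copy(text)                         # 4. Copy
    return text
```

Checking for silence before transcription means the comparatively expensive model call is never made for an empty clip.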
Configuration
VSpell creates a configuration directory at ~/.config/vspell.
- ~/.config/vspell/vspell_config.json: Stores the silence_threshold determined during calibration.
- ~/.config/vspell/input.wav: The temporary audio file of your last recording.
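If you want to inspect or script around the stored threshold, a config file like this can be read and written with the standard library. The sketch below assumes the file is a flat JSON object with a silence_threshold key, as the description above suggests; the helper names are hypothetical, and any other keys VSpell may store are not handled.

```python
import json
from pathlib import Path
from typing import Optional

CONFIG_PATH = Path.home() / ".config" / "vspell" / "vspell_config.json"


def save_threshold(value: float, path: Path = CONFIG_PATH) -> None:
    """Write the threshold, creating the config directory if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps({"silence_threshold": value}))


def load_threshold(path: Path = CONFIG_PATH) -> Optional[float]:
    """Return the stored threshold, or None if missing or unreadable."""
    try:
        return float(json.loads(path.read_text())["silence_threshold"])
    except (FileNotFoundError, KeyError, TypeError, ValueError):
        return None
```

Returning None for a missing or malformed file gives the caller a clear signal that calibration has not been run yet.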