WhisperBox - Record and transcribe audio with ease
WhisperBox
A powerful command-line tool for transcribing and analyzing audio recordings with AI assistance. Record meetings, lectures, or any audio directly from your terminal and get instant transcriptions with summaries, sentiment analysis, and topic detection.
Available in two versions:
- Free: Command-line interface (CLI) version - Open source and MIT licensed
- Paid: GUI version with native desktop interface - One-time $10 purchase to support development
The GUI version offers the same features in a graphical interface, for those who prefer not to use the terminal. Purchases help support ongoing development of free and open source AI tools.
Features
- Live audio recording through terminal
- Multiple transcription models via Whisper AI
- AI-powered analysis including:
  - Meeting summaries and action items
  - Keynote presentation generation
  - Speech quality feedback
  - Modify voice input for any application
- Support for multiple AI providers:
  - Anthropic
  - OpenAI
  - Groq
  - Ollama (local models)
- Export to Markdown
- Rich terminal UI with color-coded output
- Configurable audio settings and output formats
Prerequisites
- Python 3.10 or higher
- FFmpeg (required for audio processing)
- Poetry (for dependency management)
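As a quick sanity check before installing, the prerequisites above can be verified from Python. This is a minimal sketch and not part of WhisperBox: the function name `check_prerequisites` is hypothetical, and it only tests whether the interpreter meets the 3.10 minimum and whether `ffmpeg` and `poetry` executables can be found on the PATH.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 10), tools=("ffmpeg", "poetry")):
    """Return a dict mapping each prerequisite to True/False.

    Hypothetical helper for illustration; not part of WhisperBox.
    """
    results = {"python": sys.version_info[:2] >= min_python}
    for tool in tools:
        # shutil.which returns the executable's path, or None if not found
        results[tool] = shutil.which(tool) is not None
    return results


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```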
Installation
- Install FFmpeg if not already installed:
# On macOS using Homebrew
brew install ffmpeg
# On Ubuntu/Debian
sudo apt-get install ffmpeg
- Install BlackHole for system audio capture (macOS only):
brew install blackhole-2ch
- Install PortAudio for audio capture:
# On macOS using Homebrew
brew install portaudio
- Install WhisperBox with pip:
pip install whisperbox
Usage
Setup
The first time you run the app, you will go through the setup wizard.
wb
Then select the Whisper model you want to use. Smaller models run faster and download more quickly, while larger models are more accurate. Download times vary with your internet speed.
Then select the AI provider you want to use. Ollama runs locally and does not require an API key.
Then select the model you want to use from that provider.
Finally, you will have the option to view the config file location so you can customize additional settings. This directory also contains the Whisper models you downloaded and the data directory that holds all your recordings and transcriptions.
Basic Transcription
- Start recording:
wb
- Press Enter to stop recording when finished.
Advanced Options
- Specify a profile:
wb --profile monologue_to_keynote
- Specify a Whisper model:
wb --model large
- Enable full analysis (summary, sentiment, intent, topics):
wb --analyze
- Enable verbose output:
wb --verbose
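When WhisperBox is launched from another tool, the flags above can be assembled programmatically. The sketch below is only an illustration: `build_wb_command` is a hypothetical helper, it uses only the four flags listed above, and it assumes they can be combined in a single invocation, which this README does not state.

```python
import shlex


def build_wb_command(profile=None, model=None, analyze=False, verbose=False):
    """Build the argument list for a `wb` invocation (hypothetical helper)."""
    cmd = ["wb"]
    if profile:
        cmd += ["--profile", profile]
    if model:
        cmd += ["--model", model]
    if analyze:
        cmd.append("--analyze")
    if verbose:
        cmd.append("--verbose")
    return cmd


cmd = build_wb_command(profile="monologue_to_keynote", model="large", verbose=True)
print(shlex.join(cmd))  # wb --profile monologue_to_keynote --model large --verbose
```

The resulting list could be passed to `subprocess.run(cmd)`. Because recording stops when Enter is pressed, the process would need to keep access to the terminal's standard input.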
Configuration
The config.yaml file allows you to customize:
- API settings for AI providers
- Audio recording parameters
- Transcription settings
- Output formats and directories
- Display preferences
- AI prompt templates
See the example config.yaml for all available options.
Extending WhisperBox
WhisperBox can be customized to handle your recordings exactly how you want. There are two main ways to extend it:
- Profiles: Define what to do with your recordings
- Scripts: Create custom actions for your profiles
Creating Custom Profiles
A profile is a simple YAML file that tells WhisperBox:
- What to do with your recording
- How to process the transcript
- Where to send the results
To create a profile:
- Create a new .yaml file in the profiles/ folder (e.g., my_profile.yaml)
- Add these three main sections:
# The name that appears in WhisperBox
name: my_profile
# Instructions for processing your recording
prompt: >
  Here's where you tell WhisperBox what to do with your recording.
  For example: "Create a summary with these key points..."
  The recording will appear here: {transcript}
# What to do with the results
actions:
  - script: output_to_markdown # Save as a file
  - script: copy_to_clipboard # Copy to clipboard
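To make the role of the {transcript} placeholder concrete, the sketch below shows one way such a prompt could be filled in. It is an assumption for illustration, not WhisperBox's actual implementation, and `render_prompt` is a hypothetical name. It uses plain string replacement rather than `str.format`, so any other braces in a prompt are left untouched.

```python
PLACEHOLDER = "{transcript}"


def render_prompt(prompt_template, transcript):
    """Insert the transcript into a profile prompt (hypothetical helper)."""
    if PLACEHOLDER not in prompt_template:
        # Without the placeholder the model would never see the recording
        raise ValueError(f"prompt is missing the {PLACEHOLDER} placeholder")
    return prompt_template.replace(PLACEHOLDER, transcript)


template = "Create a summary with these key points.\nThe recording: {transcript}"
print(render_prompt(template, "We agreed to ship on Friday."))
```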
Built-in Actions
WhisperBox comes with several ready-to-use actions:
- output_to_markdown: Save as a Markdown file
- copy_to_clipboard: Copy to your clipboard
- output_to_terminal: Show in the terminal
- send_post_request: Send to a webhook URL
You can use multiple actions in a single profile:
actions:
  # Save the file
  - script: output_to_markdown
    config:
      filename: meeting_notes.md
  # Also copy it to clipboard
  - script: copy_to_clipboard
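One way to picture a multi-action profile is as a loop that hands the same text to each script in order, together with that entry's config. The sketch below illustrates the idea with stand-in functions. The `REGISTRY` dict, `run_actions`, and the `output_to_list` action are hypothetical; how WhisperBox actually locates and calls scripts is not described here.

```python
def output_to_terminal(text, config):
    """Stand-in action: print the text."""
    print(text)


def output_to_list(text, config):
    """Stand-in action: append the text to a list supplied via config."""
    config["sink"].append(text)


# Hypothetical lookup table from script name to callable
REGISTRY = {
    "output_to_terminal": output_to_terminal,
    "output_to_list": output_to_list,
}


def run_actions(text, actions):
    """Run every action of a profile in the order it is listed."""
    executed = []
    for action in actions:
        name = action["script"]
        if name not in REGISTRY:
            raise KeyError(f"unknown action script: {name}")
        # An action without a config block gets an empty dict
        REGISTRY[name](text, action.get("config", {}))
        executed.append(name)
    return executed


collected = []
actions = [
    {"script": "output_to_terminal"},
    {"script": "output_to_list", "config": {"sink": collected}},
]
print(run_actions("Meeting notes...", actions))
```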
Creating Custom Scripts
Want to do something more custom? You can create your own action scripts:
- Create a new Python file in the scripts/ folder (e.g., my_script.py)
- Add a run_action function that handles the text:
def run_action(text, config):
    """
    text: The processed recording
    config: Any settings from your profile
    """
    print("I got the text:", text)
    # Do whatever you want with the text here!
- Use it in your profile:
actions:
  - script: my_script
    config:
      any_setting: value
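As a slightly fuller illustration of a custom script, the sketch below appends each result to a text file. It assumes `config` arrives as a dict (or `None` when the profile gives no settings), which matches the YAML above but is not confirmed here. The `path` setting and its default, notes.txt, are hypothetical names chosen for this example.

```python
from datetime import datetime
from pathlib import Path


def run_action(text, config):
    """
    text: The processed recording
    config: Any settings from your profile (assumed to be a dict or None)
    """
    settings = config or {}
    # "path" is a hypothetical setting; fall back to notes.txt if absent
    target = Path(settings.get("path", "notes.txt"))
    target.parent.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y-%m-%d %H:%M")
    # Append so earlier results are kept
    with target.open("a", encoding="utf-8") as handle:
        handle.write(f"[{stamp}]\n{text}\n\n")
    return target
```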
Example: Meeting Summary Profile
Here's a complete example that creates meeting summaries:
name: meeting_summary
prompt: >
  Create a clear summary of this meeting with:
  1. Key topics and decisions
  2. Action items and who's responsible
  3. Important deadlines or dates
  Meeting transcript: {transcript}
actions:
  # Save as a file
  - script: output_to_markdown
    config:
      filename: summary.md
  # Also copy to clipboard
  - script: copy_to_clipboard
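Before testing a new profile with a recording, it can help to check its basic shape. The sketch below works on a profile that has already been loaded into a Python dict (for example with a YAML parser) and reports missing pieces. The rules it enforces, such as requiring {transcript} in the prompt, are assumptions drawn from the examples in this section, not WhisperBox's own validation, and `validate_profile` is a hypothetical name.

```python
def validate_profile(profile):
    """Return a list of problems found in a profile dict (empty if none)."""
    problems = []
    if not profile.get("name"):
        problems.append("missing 'name'")
    prompt = profile.get("prompt") or ""
    if "{transcript}" not in prompt:
        problems.append("'prompt' does not contain {transcript}")
    actions = profile.get("actions") or []
    if not actions:
        problems.append("no 'actions' listed")
    for index, action in enumerate(actions):
        # Every action entry needs at least a script name
        if not isinstance(action, dict) or "script" not in action:
            problems.append(f"action {index} has no 'script'")
    return problems


meeting_summary = {
    "name": "meeting_summary",
    "prompt": "Create a clear summary of this meeting. Meeting transcript: {transcript}",
    "actions": [
        {"script": "output_to_markdown", "config": {"filename": "summary.md"}},
        {"script": "copy_to_clipboard"},
    ],
}
print(validate_profile(meeting_summary))  # []
print(validate_profile({"name": "draft", "prompt": "Summarize this"}))
```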
Tips
- Test your profiles with short recordings first
- Check the profiles/ folder for more examples
- Use multiple actions to create powerful workflows
- Keep your prompts clear and specific
Contributing
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- Whisper AI for the transcription models
- Rich for the terminal UI
- All the AI providers supported by this tool
Authors
- Ty Fiero tyfierodev@gmail.com
- Mike Bird tooluseai@gmail.com
File details
Details for the file whisperbox-1.0.7.tar.gz.
File metadata
- Download URL: whisperbox-1.0.7.tar.gz
- Upload date:
- Size: 34.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 95b91432cbd9815ec1db10075e5e42c6263a659bc0203edc5e174ac5ffd63828 |
| MD5 | b4d8293a031ada4075f2078ed9fa1141 |
| BLAKE2b-256 | 8b979fdf1afc7e7ad73c899b3876a9492f2021e794b615277c88ad6dcac0dbd2 |
File details
Details for the file whisperbox-1.0.7-py3-none-any.whl.
File metadata
- Download URL: whisperbox-1.0.7-py3-none-any.whl
- Upload date:
- Size: 47.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 32857af0108feba823b9404c2fdcafb120f1bf08d9712ae257783b86a090f0f2 |
| MD5 | 4c2099a1778f04109a9325744fdd6c02 |
| BLAKE2b-256 | 9b44367b3d59691721a3c4db66bbcb0a788a0e09db45aa333df00610df3053da |