WhisperBox - Record and transcribe audio with ease

WhisperBox

A powerful command-line tool for transcribing and analyzing audio recordings with AI assistance. Record meetings, lectures, or any audio directly from your terminal and get instant transcriptions with summaries, sentiment analysis, and topic detection.

Features

  • Live audio recording through terminal
  • Multiple transcription models via Whisper AI
  • AI-powered analysis including:
    • Text summarization
    • Sentiment analysis
    • Intent detection
    • Topic extraction
  • Support for multiple AI providers:
    • Anthropic Claude
    • OpenAI GPT-4
    • Groq
    • Ollama (local models)
  • Export to Markdown
  • Rich terminal UI with color-coded output
  • Configurable audio settings and output formats

Prerequisites

  • Python 3.10 or higher
  • FFmpeg (required for audio processing)
  • Poetry (for dependency management)
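
The prerequisites above can be checked before installing. The following is a minimal sketch, not part of WhisperBox itself; the helper name `check_prerequisites` is hypothetical, and it only assumes that FFmpeg and Poetry expose executables named `ffmpeg` and `poetry` on your PATH.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)  # minimum version from the Prerequisites list


def check_prerequisites(required_tools=("ffmpeg", "poetry")):
    """Return a list of human-readable problems; an empty list means all good.

    Hypothetical helper for illustration, not part of WhisperBox.
    """
    problems = []
    if sys.version_info < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required, "
            f"found {sys.version_info.major}.{sys.version_info.minor}"
        )
    for tool in required_tools:
        # shutil.which returns None when the executable is not on PATH
        if shutil.which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("Missing:", problem)
```

Running the script prints one line per missing prerequisite and prints nothing when everything is in place.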

Installation

  1. Clone the repository:
git clone https://github.com/tooluseai/whisperbox.git
cd whisperbox
  2. Install dependencies using Poetry:
poetry install
  3. Install FFmpeg if not already installed:
# On macOS using Homebrew
brew install ffmpeg

# On Ubuntu/Debian
sudo apt-get install ffmpeg
  4. Install BlackHole (macOS only):
brew install blackhole-2ch
  5. Configure your API keys:
    • Copy config.yaml to create your local configuration
    • Add your API keys for the services you plan to use:
      • OpenAI
      • Anthropic
      • Groq
    • Alternatively, set them as environment variables:
      • OPENAI_API_KEY
      • ANTHROPIC_API_KEY
      • GROQ_API_KEY
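
To confirm which keys are visible to your shell before running the app, a small check along these lines may help. It is a sketch only: the variable names come from the list above, while the helper `configured_providers` is hypothetical and says nothing about how WhisperBox itself reads its configuration.

```python
import os

# Environment variable names taken from the list above.
API_KEY_VARS = {
    "OpenAI": "OPENAI_API_KEY",
    "Anthropic": "ANTHROPIC_API_KEY",
    "Groq": "GROQ_API_KEY",
}


def configured_providers(environ=None):
    """Return the provider names whose API key variable is set and non-blank.

    Hypothetical helper for illustration; pass a dict to test without
    touching the real environment.
    """
    env = os.environ if environ is None else environ
    return [
        name
        for name, var in API_KEY_VARS.items()
        if env.get(var, "").strip()
    ]


if __name__ == "__main__":
    found = configured_providers()
    print("Keys found for:", ", ".join(found) if found else "none")
```

Ollama is absent from the mapping because it runs locally and does not require an API key.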

Usage

Setup

The first time you run the app, you will go through the setup wizard.

poetry run wb

Then select the Whisper model you want to use. Smaller models run faster and download more quickly, but larger models are more accurate. Download times will vary depending on your internet speed.

Then select the AI provider you want to use. Ollama runs locally and does not require an API key.

Then select the model you want to use.

Finally, you will have the option to view the config file location so you can customize additional settings. This directory also contains the Whisper models you downloaded, along with your meetings and monologues.

Basic Transcription

  1. Start recording:
poetry run wb
  2. Press Enter to stop recording when finished.

Advanced Options

  • Specify a profile:
poetry run wb --profile monologue_to_keynote
  • Specify a Whisper model:
poetry run wb --model large
  • Enable full analysis (summary, sentiment, intent, topics):
poetry run wb --analyze
  • Enable verbose output:
poetry run wb --verbose
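
If you launch WhisperBox from a script, the options above can be assembled programmatically. This is a rough sketch under two assumptions: that the flags can be combined in a single invocation (the source only shows them one at a time), and that the helper names `build_wb_command` and `run_wb` are hypothetical wrappers rather than part of the tool.

```python
import subprocess


def build_wb_command(profile=None, model=None, analyze=False, verbose=False):
    """Build the argument list for `poetry run wb` from the documented flags.

    Hypothetical helper; combining several flags at once is an assumption.
    """
    cmd = ["poetry", "run", "wb"]
    if profile:
        cmd += ["--profile", profile]
    if model:
        cmd += ["--model", model]
    if analyze:
        cmd.append("--analyze")
    if verbose:
        cmd.append("--verbose")
    return cmd


def run_wb(**options):
    """Run WhisperBox in the current terminal and return its exit code."""
    return subprocess.run(build_wb_command(**options)).returncode


if __name__ == "__main__":
    # Show the command instead of running it, so this is safe to try anywhere.
    print(" ".join(build_wb_command(model="large", analyze=True)))
```

Passing the arguments as a list avoids shell quoting issues with profile names.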

Configuration

The config.yaml file allows you to customize:

  • API settings for AI providers
  • Audio recording parameters
  • Transcription settings
  • Output formats and directories
  • Display preferences
  • AI prompt templates

See the example config.yaml for all available options.

Project Structure

whisperbox/
├── pyproject.toml      # Poetry project configuration
├── config.yaml         # Application configuration
├── main.py             # Entry point
└── src/
    ├── ai_service.py   # AI provider integrations
    ├── config.py       # Configuration management
    ├── transcribe.py   # Core transcription logic
    └── audio.py        # Audio recording utilities

Contributing

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the MIT License. See the LICENSE file for details.

Acknowledgments

  • Whisper AI for the transcription models
  • Rich for the terminal UI
  • All the AI providers supported by this tool
