Real-time audio transcription for video streaming with Firefox browser integration
🎙️ LiveCaption
LiveCaption captures system audio and transcribes it in real time using state-of-the-art Whisper models. Perfect for Japanese anime, streaming content, and multilingual videos.
✨ Features
- Real-time transcription - See subtitles as you watch
- Firefox integration - One-click recording from browser toolbar
- Multiple Whisper models:
  - Kotoba Whisper v2.0 - Best for Japanese (recommended)
  - Whisper Large v3 - Best for English and 99+ other languages
  - Anime Whisper - Specialized for anime/games
- SRT output - Standard subtitle format for video players
- Voice Activity Detection - Accurate timestamp alignment
- Command-line & browser modes - Use from terminal or Firefox extension
📋 System Requirements
- Operating System: Linux (tested on Fedora, Ubuntu)
- Browser: Firefox 91+ (for browser extension)
- Audio System: PipeWire or PulseAudio
- Python: 3.9 or higher
- RAM: 8GB minimum, 16GB recommended
- GPU: NVIDIA GPU recommended (~4GB VRAM), CPU mode available
🚀 Installation
Method 1: pip (Recommended)
pip install livecaption
livecaption-setup
Note: After pip install, you must run livecaption-setup to register the Firefox native messaging host.
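For context, registering a Firefox native messaging host on Linux generally means placing a small JSON manifest under ~/.mozilla/native-messaging-hosts/. The sketch below shows roughly what such a step involves. The host name, executable path, and extension ID are hypothetical placeholders, not values taken from LiveCaption, and livecaption-setup may do this differently.

```python
import json
from pathlib import Path


def build_manifest(host_name: str, host_path: str, extension_id: str) -> dict:
    """Return a Firefox native messaging host manifest as a dict.

    All three arguments are illustrative; LiveCaption's real values
    are not documented in this README.
    """
    return {
        "name": host_name,
        "description": "Native messaging host (illustrative)",
        "path": host_path,  # Firefox expects an absolute path on Linux
        "type": "stdio",
        "allowed_extensions": [extension_id],
    }


def write_manifest(manifest: dict, home: Path) -> Path:
    """Write the manifest to the per-user Firefox location under `home`."""
    target_dir = home / ".mozilla" / "native-messaging-hosts"
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / f"{manifest['name']}.json"
    target.write_text(json.dumps(manifest, indent=2))
    return target
```

Passing the home directory explicitly keeps the sketch easy to try against a scratch directory before touching a real Firefox profile.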
Method 2: From Source
git clone https://github.com/b-tok/LiveCaption.git
cd LiveCaption
./install.sh
First-Time Model Download
- AI models are 1-6GB each and download on first use
- First download takes 5-30 minutes depending on internet speed
- Models are cached locally for subsequent runs
📖 Usage
Browser Extension
- Install Firefox Extension:
  - Firefox Add-ons Store (recommended)
  - Or download .xpi from GitHub Releases
- Click the LiveCaption icon in Firefox toolbar
- Select your settings:
  - Model: kotoba-v2.0 for Japanese, large-v3 for English
  - Audio Source: Usually auto-detected
  - Output File: Where to save the SRT (default: ~/Documents/LiveCaption/recording_<timestamp>.srt)
- Click "Start Recording" and play your video
- Click "Stop Recording" to save the SRT file
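The default output file name above contains a timestamp placeholder. A minimal sketch of how such a path could be built is shown here. The function name and the timestamp format (%Y%m%d_%H%M%S) are assumptions, since the README only shows <timestamp>.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def default_output_path(now: Optional[datetime] = None,
                        base_dir: str = "~/Documents/LiveCaption") -> Path:
    """Build a recording_<timestamp>.srt path inside base_dir.

    The timestamp layout is an assumption, not LiveCaption's documented format.
    """
    now = now or datetime.now()
    stamp = now.strftime("%Y%m%d_%H%M%S")
    return Path(base_dir).expanduser() / f"recording_{stamp}.srt"
```

A sortable timestamp like this keeps recordings in chronological order when listed by name.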
Command Line
# Basic usage (Japanese content)
livecaption --model kotoba-v2.0 --output subtitles.srt
# English/multilingual content
livecaption --model large-v3 --output subtitles.srt
# Anime/games (Japanese)
livecaption --model anime-whisper --output anime.srt
# List all available models
livecaption --list-models
# Get help
livecaption --help
Workflow:
- Run the command
- Start playing audio (YouTube, Netflix, local video, etc.)
- Press Ctrl+C to stop recording
- Find your subtitles in the output file
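The output file uses the SRT format: numbered blocks, each with a start --> end time range in HH:MM:SS,mmm form followed by the subtitle text. The sketch below illustrates that layout. The Segment type and function names are hypothetical and do not reflect LiveCaption's internals.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Segment:
    """One transcribed span (hypothetical type for illustration)."""
    start: float  # seconds from the start of the recording
    end: float    # seconds from the start of the recording
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: Iterable[Segment]) -> str:
    """Render segments as SRT text, with blocks separated by a blank line."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        time_range = f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}"
        blocks.append(f"{index}\n{time_range}\n{seg.text.strip()}\n")
    return "\n".join(blocks)
```

Working in integer milliseconds avoids rounding drift when splitting a time into hours, minutes, and seconds.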
⚙️ Models
| Model | Best For | Size | Languages | Recommended Use |
|---|---|---|---|---|
| kotoba-v2.0 | Japanese | ~4GB | Japanese | Best for Japanese content |
| large-v3 | Multilingual | ~6GB | 99+ languages | Best for English/other languages |
| anime-whisper | Anime/Games | ~4GB | Japanese | Anime, visual novels, games |
| kotoba-v1.0 | Japanese | ~4GB | Japanese | Older, more stable |
| medium | Fast | ~3GB | Multilingual | Faster but less accurate |
Recommendation: Use kotoba-v2.0 for Japanese, large-v3 for everything else.
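The recommendation can be expressed as a small lookup, which may be handy when scripting the CLI. This is only a sketch: the function name and the accepted language spellings are assumptions, while the model names come from the table above.

```python
def recommend_model(language: str, anime: bool = False) -> str:
    """Pick a model name following the README's recommendation.

    `language` accepts "ja" or "japanese" (an assumption for this sketch);
    any other value falls back to the multilingual model.
    """
    lang = language.strip().lower()
    if lang in ("ja", "japanese"):
        # anime-whisper is listed for anime, visual novels, and games
        return "anime-whisper" if anime else "kotoba-v2.0"
    return "large-v3"
```

The returned string can then be passed to the --model option shown in the Command Line section.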
🗑️ Uninstallation
# Complete uninstall (recommended)
livecaption-uninstall
# Alternative method
python -m livecaption.uninstaller
Note: Manually remove the Firefox extension from about:addons if installed.
📝 Configuration
Settings are stored in ~/.config/livecaption/config.json:
{
  "language": "ja",
  "model": "kotoba-v2.0",
  "device": "auto",
  "output_dir": "~/Documents/LiveCaption",
  "chunk_duration": 30.0
}
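A script that wants to read these settings could load the file and fall back to defaults for missing keys. The sketch below assumes the values in the example above are the defaults, which the README does not state explicitly. The load_config helper is hypothetical and not part of the livecaption package.

```python
import json
import os
from pathlib import Path

# Assumed defaults, copied from the example config above.
DEFAULTS = {
    "language": "ja",
    "model": "kotoba-v2.0",
    "device": "auto",
    "output_dir": "~/Documents/LiveCaption",
    "chunk_duration": 30.0,
}


def load_config(path: Path) -> dict:
    """Merge the JSON file at `path` (if present) over DEFAULTS."""
    config = dict(DEFAULTS)
    if path.exists():
        config.update(json.loads(path.read_text()))
    # Expand "~" so the directory can be used directly for file I/O.
    config["output_dir"] = os.path.expanduser(config["output_dir"])
    # Normalize to float so an integer in the file (e.g. 15) also works.
    config["chunk_duration"] = float(config["chunk_duration"])
    return config
```

For the location given above, the call would be load_config(Path("~/.config/livecaption/config.json").expanduser()).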
📄 License
MIT License - see LICENSE file for details.
🙏 Acknowledgments
- Kotoba-Whisper - Japanese-optimized Whisper
- Anime-Whisper - Anime-specialized model
- OpenAI Whisper - Base transcription model
- Silero VAD - Voice activity detection
File details
Details for the file livecaption-1.0.1.tar.gz.
File metadata
- Download URL: livecaption-1.0.1.tar.gz
- Upload date:
- Size: 31.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d286086d916e00e660d9757f9a8b7ed48eacad6e35da2e99bbb1636f71ed05e5 |
| MD5 | 1b1785451d52ef255d633ede76dd5e66 |
| BLAKE2b-256 | 64464fdb1ff0092a0ca5a3da5a8d857bffb932e61906ba1b1488b7d1c9dda9e3 |
File details
Details for the file livecaption-1.0.1-py3-none-any.whl.
File metadata
- Download URL: livecaption-1.0.1-py3-none-any.whl
- Upload date:
- Size: 31.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7d2bc55da2c41e4ca1d07fe346d21713114031f57e33a3bd9d3c7de04a505408 |
| MD5 | 439760dc6982c0f00d9adf9e4a71b863 |
| BLAKE2b-256 | f42a8f77ccadc624ebdbc21b7bf9170aaaa1db997d54796f33f04bd0a607cc1e |