🎙️ LiveCaption

Real-time audio transcription for video streaming with Firefox browser integration

LiveCaption captures system audio and transcribes it in real time using Whisper models. It is well suited to Japanese anime, streaming content, and multilingual videos.

✨ Features

  • Real-time transcription - See subtitles as you watch
  • Firefox integration - One-click recording from browser toolbar
  • Multiple Whisper models:
    • Kotoba Whisper v2.0 - Best for Japanese (recommended)
    • Whisper Large v3 - Best for English and 99+ other languages
    • Anime Whisper - Specialized for anime/games
  • SRT output - Standard subtitle format for video players
  • Voice Activity Detection - Accurate timestamp alignment
  • Command-line & browser modes - Use from terminal or Firefox extension
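To make the SRT output feature concrete, here is a minimal sketch of how one subtitle cue is laid out in the standard SubRip format (index, time range, text). The function names are hypothetical and this is not taken from LiveCaption's source; it only illustrates the file format a video player expects.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def format_cue(index: int, start: float, end: float, text: str) -> str:
    """Build one SRT cue: a 1-based index, a time range, then the text."""
    return f"{index}\n{format_timestamp(start)} --> {format_timestamp(end)}\n{text}\n"


if __name__ == "__main__":
    # Cues are separated by a blank line in the final file.
    cues = [format_cue(1, 0.0, 2.5, "こんにちは"), format_cue(2, 2.5, 5.0, "Hello")]
    print("\n".join(cues))
```

Each cue's time range is where Voice Activity Detection matters: the start and end values would come from detected speech boundaries rather than fixed intervals.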

📋 System Requirements

  • Operating System: Linux (tested on Fedora, Ubuntu)
  • Browser: Firefox 91+ (for browser extension)
  • Audio System: PipeWire or PulseAudio
  • Python: 3.9 or higher
  • RAM: 8GB minimum, 16GB recommended
  • GPU: NVIDIA GPU recommended (~4GB VRAM), CPU mode available
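The requirements above can be roughly checked before installing. The sketch below is an assumption about what is worth checking, not part of LiveCaption: it looks for the Python version and for the common command-line tools that ship with PipeWire (pw-cli), PulseAudio (pactl), and the NVIDIA driver (nvidia-smi). A missing tool does not prove the component is absent, and pactl may also be present on PipeWire systems that provide a PulseAudio compatibility layer.

```python
import shutil
import sys


def check_environment() -> dict:
    """Return a rough, best-effort view of the local environment."""
    return {
        # LiveCaption requires Python 3.9 or higher.
        "python_ok": sys.version_info >= (3, 9),
        # Presence of these binaries is only a hint, not a guarantee.
        "pipewire_tools": shutil.which("pw-cli") is not None,
        "pulseaudio_tools": shutil.which("pactl") is not None,
        "nvidia_driver": shutil.which("nvidia-smi") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'yes' if ok else 'no'}")
```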

🚀 Installation

Method 1: pip (Recommended)

pip install livecaption
livecaption-setup

Note: After pip install, you must run livecaption-setup to register the Firefox native messaging host.
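For context on what registering a native messaging host involves: Firefox looks for a small JSON manifest that names the host program and the extensions allowed to talk to it. The sketch below builds such a manifest and the per-user location Firefox reads on Linux. The host name, executable path, and extension ID shown are hypothetical placeholders, not the values livecaption-setup actually writes.

```python
import json
from pathlib import Path
from typing import Optional


def build_host_manifest(host_name: str, host_path: str, extension_id: str) -> dict:
    """Build a Firefox native messaging host manifest as a dict."""
    return {
        "name": host_name,
        "description": "Native messaging host (illustrative example)",
        "path": host_path,                    # absolute path to the host executable
        "type": "stdio",                      # messages are exchanged over stdin/stdout
        "allowed_extensions": [extension_id],
    }


def manifest_path(host_name: str, home: Optional[Path] = None) -> Path:
    """Per-user manifest location that Firefox checks on Linux."""
    home = home or Path.home()
    return home / ".mozilla" / "native-messaging-hosts" / f"{host_name}.json"


if __name__ == "__main__":
    # All three values below are placeholders for illustration only.
    manifest = build_host_manifest(
        "livecaption_host", "/usr/local/bin/livecaption-host", "livecaption@example.org"
    )
    print(manifest_path("livecaption_host"))
    print(json.dumps(manifest, indent=2))
```

If the extension reports that it cannot reach the host, checking whether a manifest exists in that directory is a reasonable first step.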

Method 2: From Source

git clone https://github.com/b-tok/LiveCaption.git
cd LiveCaption
./install.sh

First-Time Model Download

  • AI models are 1-6GB each and download on first use
  • First download takes 5-30 minutes depending on internet speed
  • Models are cached locally for subsequent runs
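As a rough sanity check on the download times above, the arithmetic can be sketched as follows. The function is illustrative only and assumes a steady connection speed with no protocol overhead, using 1 GB = 1000 MB.

```python
def estimate_download_minutes(size_gb: float, speed_mbps: float) -> float:
    """Estimate download time in minutes for a file of size_gb at speed_mbps."""
    if speed_mbps <= 0:
        raise ValueError("speed_mbps must be positive")
    megabits = size_gb * 1000 * 8  # GB -> MB -> megabits
    return megabits / speed_mbps / 60


if __name__ == "__main__":
    for size in (1, 4, 6):
        minutes = estimate_download_minutes(size, 30)
        print(f"{size} GB at 30 Mbps: about {minutes:.0f} minutes")
```

At 30 Mbps, a 1 GB model takes roughly 4 to 5 minutes and a 6 GB model roughly 27 minutes, which is consistent with the 5-30 minute range quoted above.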

📖 Usage

Browser Extension

  1. Install Firefox Extension:

  2. Click the LiveCaption icon in Firefox toolbar

  3. Select your settings:

    • Model: kotoba-v2.0 for Japanese, large-v3 for English
    • Audio Source: Usually auto-detected
    • Output File: Where to save the SRT (default: ~/Documents/LiveCaption/recording_<timestamp>.srt)
  4. Click "Start Recording" and play your video

  5. Click "Stop Recording" to save the SRT file
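The default output location mentioned in step 3 follows the pattern recording_<timestamp>.srt. The sketch below shows one way such a path could be built; the exact timestamp format LiveCaption uses is not documented here, so the YYYYMMDD_HHMMSS format is an assumption, and the function name is hypothetical.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def default_output_path(
    now: Optional[datetime] = None,
    base_dir: str = "~/Documents/LiveCaption",
) -> Path:
    """Build a timestamped SRT path under base_dir ("~" is expanded)."""
    now = now or datetime.now()
    stamp = now.strftime("%Y%m%d_%H%M%S")  # assumed format, not confirmed by the project
    return Path(base_dir).expanduser() / f"recording_{stamp}.srt"


if __name__ == "__main__":
    print(default_output_path())
```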

Command Line

# Basic usage (Japanese content)
livecaption --model kotoba-v2.0 --output subtitles.srt

# English/multilingual content
livecaption --model large-v3 --output subtitles.srt

# Anime/games (Japanese)
livecaption --model anime-whisper --output anime.srt

# List all available models
livecaption --list-models

# Get help
livecaption --help

Workflow:

  1. Run the command
  2. Start playing audio (YouTube, Netflix, local video, etc.)
  3. Press Ctrl+C to stop recording
  4. Find your subtitles in the output file
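The workflow above can also be scripted. The following sketch starts the livecaption command with the --model and --output flags shown earlier, waits for a fixed duration, and then sends SIGINT, which is the signal a terminal delivers on Ctrl+C. It assumes the CLI treats a programmatic SIGINT the same way as an interactive Ctrl+C; the helper names are hypothetical.

```python
import signal
import subprocess
from typing import List


def build_command(model: str, output: str) -> List[str]:
    """Assemble the command line using only the documented flags."""
    return ["livecaption", "--model", model, "--output", output]


def record_for(seconds: float, model: str, output: str) -> int:
    """Record for a fixed number of seconds, then stop as Ctrl+C would."""
    proc = subprocess.Popen(build_command(model, output))
    try:
        proc.wait(timeout=seconds)
    except subprocess.TimeoutExpired:
        proc.send_signal(signal.SIGINT)  # same signal as pressing Ctrl+C
        proc.wait()
    return proc.returncode


if __name__ == "__main__":
    # Records ten minutes of Japanese audio; requires livecaption on PATH.
    record_for(600, "kotoba-v2.0", "subtitles.srt")
```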

⚙️ Models

| Model         | Best For     | Size | Languages     | Recommended Use                  |
|---------------|--------------|------|---------------|----------------------------------|
| kotoba-v2.0   | Japanese     | ~4GB | Japanese      | Best for Japanese content        |
| large-v3      | Multilingual | ~6GB | 99+ languages | Best for English/other languages |
| anime-whisper | Anime/Games  | ~4GB | Japanese      | Anime, visual novels, games      |
| kotoba-v1.0   | Japanese     | ~4GB | Japanese      | Older, more stable               |
| medium        | Fast         | ~3GB | Multilingual  | Faster but less accurate         |

Recommendation: Use kotoba-v2.0 for Japanese, large-v3 for everything else.
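The recommendation can be written as a small selection rule. This helper is hypothetical and simply encodes the guidance above, with an optional switch for anime or game audio; the returned strings are the model names accepted by --model.

```python
def pick_model(language: str, anime: bool = False) -> str:
    """Choose a model name from an ISO 639-1 language code."""
    if language.lower() == "ja":
        # Japanese: the specialized model for anime/games, otherwise Kotoba v2.0.
        return "anime-whisper" if anime else "kotoba-v2.0"
    # Everything else falls back to the multilingual model.
    return "large-v3"


if __name__ == "__main__":
    print(pick_model("ja"))
    print(pick_model("en"))
```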

🗑️ Uninstallation

# Complete uninstall (recommended)
livecaption-uninstall

# Alternative method
python -m livecaption.uninstaller

Note: Manually remove the Firefox extension from about:addons if installed.

📝 Configuration

Settings are stored in ~/.config/livecaption/config.json:

{
  "language": "ja",
  "model": "kotoba-v2.0",
  "device": "auto",
  "output_dir": "~/Documents/LiveCaption",
  "chunk_duration": 30.0
}
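For scripts that need to read these settings, a minimal loader might look like the sketch below. It assumes that keys missing from the file fall back to the values shown above and that "~" in output_dir should be expanded; neither behavior is confirmed by the project, and the function name is hypothetical.

```python
import json
from pathlib import Path

# Assumed defaults, copied from the example configuration above.
DEFAULTS = {
    "language": "ja",
    "model": "kotoba-v2.0",
    "device": "auto",
    "output_dir": "~/Documents/LiveCaption",
    "chunk_duration": 30.0,
}


def load_config(path: str = "~/.config/livecaption/config.json") -> dict:
    """Load settings from path, filling in defaults for missing keys."""
    config = dict(DEFAULTS)
    config_file = Path(path).expanduser()
    if config_file.exists():
        config.update(json.loads(config_file.read_text(encoding="utf-8")))
    # Expand "~" so the directory can be used directly.
    config["output_dir"] = str(Path(config["output_dir"]).expanduser())
    return config


if __name__ == "__main__":
    print(json.dumps(load_config(), indent=2))
```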

📄 License

MIT License - see LICENSE file for details.

🙏 Acknowledgments
