Real-time audio transcription for video streaming with Firefox browser integration

🎙️ LiveCaption

LiveCaption captures system audio and transcribes it in real time using Whisper models. It is well suited to Japanese anime, streaming content, and multilingual videos.

✨ Features

  • Real-time transcription - See subtitles as you watch
  • Firefox integration - One-click recording from browser toolbar
  • Multiple Whisper models:
    • Kotoba Whisper v2.0 - Best for Japanese (recommended)
    • Whisper Large v3 - Best for English and other languages (99+ supported)
    • Anime Whisper - Specialized for anime/games
  • SRT output - Standard subtitle format for video players
  • Voice Activity Detection - Accurate timestamp alignment
  • Command-line & browser modes - Use from terminal or Firefox extension
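
As an illustration of the SRT output feature, the sketch below formats one subtitle cue in the standard SubRip layout (index, time range, text, blank line). The function names are hypothetical and are not taken from LiveCaption's code; the sketch only shows what the format looks like.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def srt_cue(index: int, start: float, end: float, text: str) -> str:
    """Build one SRT cue: index line, time range, text, then a blank line."""
    return f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n\n"


if __name__ == "__main__":
    print(srt_cue(1, 0.0, 2.5, "こんにちは"), end="")
```

Video players generally expect cues to be numbered from 1 and separated by a blank line, which is why the cue string ends with two newlines.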

📋 System Requirements

  • Operating System: Linux (tested on Fedora, Ubuntu)
  • Browser: Firefox 91+ (for browser extension)
  • Audio System: PipeWire or PulseAudio
  • Python: 3.9 or higher
  • RAM: 8GB minimum, 16GB recommended
  • GPU: NVIDIA GPU recommended (~4GB VRAM), CPU mode available
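
A rough pre-install check for some of the requirements above might look like the following. This is a sketch, not part of LiveCaption: the function name is hypothetical, and finding `pactl`, `pw-cli`, or `nvidia-smi` on the PATH is only a hint that the corresponding audio system or GPU driver is present.

```python
import shutil
import sys


def check_environment(min_python=(3, 9)):
    """Return a dict of basic checks mirroring the listed requirements."""
    return {
        # The README asks for Python 3.9 or higher.
        "python_ok": sys.version_info >= min_python,
        # pactl is the PulseAudio command-line tool; pw-cli belongs to PipeWire.
        "audio_tool_found": any(
            shutil.which(tool) is not None for tool in ("pactl", "pw-cli")
        ),
        # nvidia-smi on PATH suggests an NVIDIA driver is installed.
        "nvidia_smi_found": shutil.which("nvidia-smi") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'yes' if ok else 'no'}")
```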

🚀 Installation

Method 1: pip (Recommended)

pip install livecaption
livecaption-setup

Note: After pip install, you must run livecaption-setup to register the Firefox native messaging host.
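
For context, Firefox finds a native messaging host through a small JSON manifest; on Linux the per-user location is `~/.mozilla/native-messaging-hosts/<name>.json`. The sketch below builds such a manifest. The host name, executable path, and extension ID are placeholders chosen for illustration, and what `livecaption-setup` actually writes may differ.

```python
import json
from pathlib import Path


def build_host_manifest(host_name: str, host_path: str, extension_id: str) -> dict:
    """Build a Firefox native messaging host manifest as a dict."""
    return {
        "name": host_name,
        "description": "LiveCaption native messaging host (illustrative)",
        "path": host_path,  # absolute path to the host executable
        "type": "stdio",    # the only type Firefox supports
        "allowed_extensions": [extension_id],
    }


def manifest_location(host_name: str, home: Path) -> Path:
    """Per-user manifest path that Firefox searches on Linux."""
    return home / ".mozilla" / "native-messaging-hosts" / f"{host_name}.json"


if __name__ == "__main__":
    # All three values below are hypothetical placeholders.
    manifest = build_host_manifest(
        "livecaption_host", "/usr/local/bin/livecaption-host", "livecaption@example.org"
    )
    print(manifest_location("livecaption_host", Path.home()))
    print(json.dumps(manifest, indent=2))
```

The example only prints the manifest; it does not write anything to disk, so it is safe to run alongside a real installation.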

Method 2: From Source

git clone https://github.com/b-tok/LiveCaption.git
cd LiveCaption
./install.sh

First-Time Model Download

  • Models are 1-6GB each and are downloaded on first use
  • The first download takes 5-30 minutes, depending on connection speed
  • Models are cached locally for subsequent runs
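
To see where the download-time range comes from, the arithmetic can be sketched as below. The helper is hypothetical and assumes a steady connection, with 1 GB counted as 8000 megabits.

```python
def download_minutes(size_gb: float, speed_mbps: float) -> float:
    """Estimated download time in minutes, taking 1 GB as 8000 megabits."""
    return size_gb * 8000 / speed_mbps / 60


if __name__ == "__main__":
    for size_gb in (1, 6):
        print(f"{size_gb} GB at 30 Mbit/s: {download_minutes(size_gb, 30):.1f} min")
```

At about 30 Mbit/s, a 1 GB model takes roughly 4.4 minutes and a 6 GB model roughly 27 minutes, which is in line with the range quoted above.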

📖 Usage

Browser Extension

  1. Install Firefox Extension:

  2. Click the LiveCaption icon in Firefox toolbar

  3. Select your settings:

    • Model: kotoba-v2.0 for Japanese, large-v3 for English
    • Audio Source: Usually auto-detected
    • Output File: Where to save the SRT (default: ~/Documents/LiveCaption/recording_<timestamp>.srt)
  4. Click "Start Recording" and play your video

  5. Click "Stop Recording" to save the SRT file
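
The default output location follows the pattern `~/Documents/LiveCaption/recording_<timestamp>.srt`. A minimal sketch of building such a path is shown below; the function name is hypothetical and the timestamp format (`YYYYMMDD_HHMMSS`) is an assumption, since the README does not state which format LiveCaption uses.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def default_output_path(
    now: Optional[datetime] = None,
    base_dir: str = "~/Documents/LiveCaption",
) -> Path:
    """Build recording_<timestamp>.srt inside base_dir."""
    now = now or datetime.now()
    stamp = now.strftime("%Y%m%d_%H%M%S")  # assumed timestamp format
    return Path(base_dir).expanduser() / f"recording_{stamp}.srt"


if __name__ == "__main__":
    print(default_output_path())
```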

Command Line

# Basic usage (Japanese content)
livecaption --model kotoba-v2.0 --output subtitles.srt

# English/multilingual content
livecaption --model large-v3 --output subtitles.srt

# Anime/games (Japanese)
livecaption --model anime-whisper --output anime.srt

# List all available models
livecaption --list-models

# Get help
livecaption --help

Workflow:

  1. Run the command
  2. Start playing audio (YouTube, Netflix, local video, etc.)
  3. Press Ctrl+C to stop recording
  4. Find your subtitles in the output file
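
The workflow above can also be scripted. The sketch below starts `livecaption` with the `--model` and `--output` flags shown earlier, waits for a fixed duration, then sends SIGINT, which is the signal a terminal delivers on Ctrl+C. The helper names are hypothetical, and the sketch assumes the CLI finishes writing the SRT file when interrupted this way.

```python
import signal
import subprocess
import time


def build_command(model: str, output: str) -> list:
    """Assemble the livecaption command line from the documented flags."""
    return ["livecaption", "--model", model, "--output", output]


def record_for(seconds: float, model: str = "kotoba-v2.0",
               output: str = "subtitles.srt") -> int:
    """Run livecaption for a fixed time, then stop it as Ctrl+C would."""
    proc = subprocess.Popen(build_command(model, output))
    try:
        time.sleep(seconds)
    finally:
        proc.send_signal(signal.SIGINT)  # same signal as Ctrl+C in a terminal
    return proc.wait()


if __name__ == "__main__":
    # Record ten minutes of English/multilingual audio.
    exit_code = record_for(600, model="large-v3", output="subtitles.srt")
    print("livecaption exited with", exit_code)
```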

⚙️ Models

Model         | Best For     | Size | Languages     | Recommended Use
------------- | ------------ | ---- | ------------- | --------------------------------
kotoba-v2.0   | Japanese     | ~4GB | Japanese      | Best for Japanese content
large-v3      | Multilingual | ~6GB | 99+ languages | Best for English/other languages
anime-whisper | Anime/Games  | ~4GB | Japanese      | Anime, visual novels, games
kotoba-v1.0   | Japanese     | ~4GB | Japanese      | Older, more stable
medium        | Fast         | ~3GB | Multilingual  | Faster but less accurate

Recommendation: Use kotoba-v2.0 for Japanese, large-v3 for everything else.
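
The recommendation can be expressed as a small lookup. This is an illustrative helper, not a LiveCaption function; it only encodes the guidance from the table, using the model names accepted by `--model`.

```python
def recommend_model(language: str, anime: bool = False) -> str:
    """Pick a model name following the README's recommendation."""
    if language.lower() in ("ja", "japanese"):
        # anime-whisper is described as specialized for anime and games.
        return "anime-whisper" if anime else "kotoba-v2.0"
    # large-v3 is the suggested choice for everything else.
    return "large-v3"


if __name__ == "__main__":
    print(recommend_model("ja"))              # kotoba-v2.0
    print(recommend_model("ja", anime=True))  # anime-whisper
    print(recommend_model("en"))              # large-v3
```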

🗑️ Uninstallation

# Complete uninstall (recommended)
livecaption-uninstall

# Alternative method
python -m livecaption.uninstaller

Note: Manually remove the Firefox extension from about:addons if installed.

📝 Configuration

Settings are stored in ~/.config/livecaption/config.json:

{
  "language": "ja",
  "model": "kotoba-v2.0",
  "device": "auto",
  "output_dir": "~/Documents/LiveCaption",
  "chunk_duration": 30.0
}
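
A script that needs these settings could read the file as sketched below. The loader is hypothetical, and treating the example values above as fallback defaults is an assumption made for illustration; LiveCaption's own defaults may differ.

```python
import json
from pathlib import Path

# Fallback values copied from the example config above (assumed defaults).
DEFAULTS = {
    "language": "ja",
    "model": "kotoba-v2.0",
    "device": "auto",
    "output_dir": "~/Documents/LiveCaption",
    "chunk_duration": 30.0,
}


def load_config(path="~/.config/livecaption/config.json") -> dict:
    """Return DEFAULTS overridden by any keys found in the config file."""
    path = Path(path).expanduser()
    config = dict(DEFAULTS)
    if path.is_file():
        config.update(json.loads(path.read_text(encoding="utf-8")))
    return config


if __name__ == "__main__":
    config = load_config()
    print(config["model"], Path(config["output_dir"]).expanduser())
```

Note that `output_dir` contains a `~`, so it has to be expanded before being used as a filesystem path.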

📄 License

MIT License - see LICENSE file for details.

🙏 Acknowledgments
