Skip to main content

Live subtitles for your life

Project description

LiveSRT: Live Speech-to-Text & Translation

LiveSRT is a modular tool for real-time speech-to-text transcription and translation. It captures audio from your microphone (or a file), streams it to state-of-the-art AI transcription providers, and uses Large Language Models (LLMs) to correct and translate the output on the fly, displaying the results in a rich Terminal User Interface (TUI).

📺 Demo

Here's a quick demonstration of LiveSRT in action:

asciicast

✨ Features

  • Live Transcription: Real-time speech-to-text using top-tier providers.
  • Live Translation: Translate speech instantly using LLMs (Local or Remote).
  • Rich TUI: A dedicated terminal interface to view live transcripts and translations side-by-side.
  • Intelligent Post-processing: Uses LLMs to clean up stutters, fix ASR errors, and separate speakers.
  • Audio Sources: Support for microphones and audio file replay (via ffmpeg).
  • Configurable: Uses a YAML configuration file for reproducible setups.

🔌 Supported Providers

Transcription (ASR)

  • AssemblyAI (Streaming API) - Default
  • ElevenLabs (Realtime Speech-to-Text)
  • Speechmatics (Realtime API)

Translation (LLMs)

  • Local LLMs: Runs locally via llama.cpp (e.g., Ministral, Qwen).
  • Remote LLMs: Support for Groq, Mistral, Google Gemini, DeepInfra, Ollama and OpenRouter.

🚀 Quick Start

As this is a PyPI package, you can run it directly without any installation using uvx (or install via pip).

1. Initialization

First, create a default configuration file. This allows you to select your audio source, transcription backend, and translation settings.

uvx livesrt init-config
# Created config.yml

2. Authentication

Set the API key for your chosen provider (default is AssemblyAI). Keys are stored securely in your system keyring.

uvx livesrt set-token assembly_ai

3. Run

Start the application using the configuration from config.yml.

uvx livesrt run

To enable translation (if disabled in config), you can use the flag:

uvx livesrt run --translate

⚙ Configuration

LiveSRT relies on a config.yml file. You can generate a template using livesrt init-config.

Key Configuration Sections:

  • Audio: Select mic or file. If using a microphone, find your device index using livesrt list-microphones.
  • Transcription: Choose between assembly_ai, elevenlabs, or speechmatics.
  • Translation: Toggle enabled/disabled, choose local-llm or remote-llm, and set source/target languages.
  • API Keys: Manage namespaces for multiple environments.

📝 Command Reference

All commands start with livesrt. Use --help on any command for more details.

livesrt init-config

Creates a default config.yml in the current directory.

  • --output, -o: Path to the output file (default: config.yml).

livesrt run [OPTIONS]

Runs the main application using the loaded configuration.

  • --config, -c: Path to the configuration file (default: config.yml).
  • --translate / --no-translate: Override the translation setting in the config.

livesrt set-token <provider> [OPTIONS]

Sets the API token for a specific provider securely.

  • <provider> choices:
    • ASR: assembly_ai, elevenlabs, speechmatics
    • LLM: groq, mistral, google, deepinfra, openrouter, ollama
  • --api-key, -k: (Optional) Your secret API key. If omitted, you are prompted securely.

livesrt list-microphones

Lists all available input microphone devices and their IDs. Use the resulting ID to update the device_index in your config.yml.

💡 Usage Scenarios

Using a specific microphone

  1. List devices: uvx livesrt list-microphones.
  2. Edit config.yml: Set audio.device_index to the desired ID.
  3. Run: uvx livesrt run.

Debugging with a file

Simulate a live stream using an audio file (requires ffmpeg):

  1. Edit config.yml:
    audio:
        source_type: file
        file_path: "./interview.wav"
    
  2. Run: uvx livesrt run.

Live Translation with Remote LLM

To offload processing to a fast remote API (e.g., Groq):

  1. Set the key: uvx livesrt set-token groq.
  2. Edit config.yml:
    translation:
        enabled: true
        engine: remote-llm
        remote_llm:
            lang_to: Spanish
            model: groq/llama-3.3-70b-versatile
    
  3. Run: uvx livesrt run.

🛠 Development

To set up a local development environment:

uv sync

Development Commands

The Makefile contains helpers for common tasks:

  • make format: Formats the code using ruff format.
  • make lint: Lints the code using ruff check --fix.
  • make types: Performs static type checking using mypy.
  • make prettier: Formats Markdown and source files using prettier.
  • make clean: Runs all formatters, linters, and type checkers.

🏗 Code Structure

  • src/livesrt/cli.py: Entry point and CLI logic using click.
  • src/livesrt/containers.py: Dependency Injection container used to wire components based on configuration.
  • src/livesrt/tui.py: The Textual-based UI implementation.
  • src/livesrt/transcribe/: Audio capture and ASR logic.
    • transcripters/: Implementations for AssemblyAI, ElevenLabs, Speechmatics.
    • audio_sources/: Mic (pyaudio) and File (ffmpeg) sources.
  • src/livesrt/translate/: Translation logic.
    • local_llm.py: Wraps llama_cpp for local inference.
    • remote_llm.py: Wraps httpx for OpenAI-compatible APIs.
    • base.py: Handles conversation context and tool-use for accurate translations.

📜 License

This project is licensed under the WTFPL (Do What The Fuck You Want To Public License).

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

livesrt-0.1.0.tar.gz (31.2 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

livesrt-0.1.0-py3-none-any.whl (43.3 kB view details)

Uploaded Python 3

File details

Details for the file livesrt-0.1.0.tar.gz.

File metadata

  • Download URL: livesrt-0.1.0.tar.gz
  • Upload date:
  • Size: 31.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: uv/0.9.17 {"installer":{"name":"uv","version":"0.9.17","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for livesrt-0.1.0.tar.gz
Algorithm Hash digest
SHA256 2287491a43679a2b6f8ac2d71a8235fe1e24254efb8e8ff2b68e3c6713b377f1
MD5 dc1072b59e4fb9b043f48c1403211509
BLAKE2b-256 f5696a2704cb030ad2144cb5efd7a95129a8c5e9cdfd2fda1298236a89dbe5b2

See more details on using hashes here.

File details

Details for the file livesrt-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: livesrt-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 43.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: uv/0.9.17 {"installer":{"name":"uv","version":"0.9.17","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for livesrt-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 b09b0290de270a6c276fdcff48efa487773f45ffe3708431e1884411fae66b0e
MD5 b17422fee43166eb7e6c9d120f9a8aa5
BLAKE2b-256 7bd3408a170cd522d46514aef52981442cd4787368b422750a873d2fa0ab0ffa

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page