
Real-time microphone transcription with Deepgram using Python.


livetranscriber

A single-file helper with minimal external dependencies that streams microphone audio to Deepgram for real-time speech-to-text. This is available as a package on PyPI.

Features

  • Simple API - single LiveTranscriber class.
  • Configurable - every Deepgram LiveOptions parameter can be overridden via keyword arguments; sensible Nova-3 defaults are provided.
  • Mandatory callback - you must supply a function, which is invoked for every final transcript chunk (empty and interim chunks are ignored).
  • Output capture - optional output_path writes each final transcript line to disk.
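  • Pause / resume - call pause() or resume() (for example, from your callback) to suspend or restart writing to output_path; the callback keeps receiving transcripts while paused.
  • Graceful shutdown - Ctrl-C or stop() shuts everything down and releases resources.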

Installation

Install the package directly from PyPI using pip:

pip install livetranscriber

Alternatively, if you are working with the source code or a specific requirements file, you can install the dependencies listed in requirements.txt:

deepgram-sdk>=4,<5
numpy>=1.24  # build-time requirement of sounddevice
sounddevice>=0.4

Install with uv (preferred) or plain pip:

uv venv .venv && source .venv/bin/activate
uv pip install -r requirements.txt

or

pip install -r requirements.txt

Python version: Python 3.11 is required.

Environment Setup

Export your Deepgram API key (see https://console.deepgram.com). For persistent access, add the following line to your shell profile file (e.g., ~/.zshrc, ~/.bashrc, or ~/.profile) and restart your terminal or source the file:

export DEEPGRAM_API_KEY="dg_…"
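
Before starting a session it can help to fail early when the key is missing. The following is a minimal sketch using only the standard library; `require_api_key` is a hypothetical helper name and is not part of livetranscriber.

```python
import os


def require_api_key(env_var: str = "DEEPGRAM_API_KEY") -> str:
    """Return the API key from the environment or raise a clear error."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; export it in your shell profile first."
        )
    return key
```

The returned value could then be passed explicitly as `api_key=` when constructing `LiveTranscriber`, although the package reads the environment variable itself when `api_key` is omitted.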

Example Usage

Here are examples demonstrating how to use the livetranscriber package.

Minimal Example

A basic example showing the essential setup:

from livetranscriber import LiveTranscriber

def simple_callback(text: str):
    print("NEW >", text)

tr = LiveTranscriber(callback=simple_callback)
tr.run()
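
If you want to keep the transcripts in memory rather than print them, the callback can accumulate them. This is a sketch that assumes LiveTranscriber accepts any callable taking a single str (the documentation only says "a function"); `TranscriptCollector` is a hypothetical name introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TranscriptCollector:
    """Callable that stores every final transcript it receives."""

    lines: list[str] = field(default_factory=list)

    def __call__(self, text: str) -> None:
        self.lines.append(text)

    def full_text(self) -> str:
        """Join the collected chunks into a single string."""
        return " ".join(self.lines)


collector = TranscriptCollector()

# With the package installed, the collector would be passed as the callback:
#     tr = LiveTranscriber(callback=collector)
#     tr.run()
# Here the callback is invoked by hand to show what it accumulates.
collector("hello world")
collector("second chunk")
print(collector.full_text())
```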

Comprehensive Example

A more detailed example demonstrating various features like output to file and pause/resume:

import time
from livetranscriber import LiveTranscriber

def comprehensive_callback(text: str):
    print("Transcript received:", text)

    # Example: pause writing to output_path when a specific phrase is detected
    # (the callback still receives transcripts, so 'resume recording' is heard)
    if "pause recording" in text.lower():
        print("Status: PAUSING...")
        transcriber.pause()
        print("Status: RECORDING PAUSED. Say 'resume recording' to continue.")

    # Example: resume writing to output_path when another phrase is detected
    if "resume recording" in text.lower():
        print("Status: RESUMING...")
        transcriber.resume()
        print("Status: RECORDING RESUMED.")

    # Example: Stop transcription if a stop phrase is detected
    if "stop recording" in text.lower():
        print("Status: STOPPING...")
        transcriber.stop()

# Instantiate with various options
output_file = "transcript_output.txt"
transcriber = LiveTranscriber(
    callback=comprehensive_callback,
    output_path=output_file,  # Output transcript to a file
    model="nova-3-general",   # Specify a model
    language="en-US",         # Specify a language
    punctuate=True,           # Enable punctuation
    smart_format=True,        # Enable smart formatting (like numbers)
)

try:
    print(f"Starting transcription. Transcript will also be saved to {output_file}")
    print("Instructions: Press Ctrl+C to stop, or say 'pause recording', 'resume recording', or 'stop recording'.")
    transcriber.run() # Blocks until stop() is called or Ctrl-C is pressed
except KeyboardInterrupt:
    print("\nInterrupted by user. Stopping.")
finally:
    print("Transcription session ended.")

API

LiveTranscriber Class

High-level wrapper around Deepgram live transcription.

Parameters:

  • callback: A function that will be invoked for every final transcript. Must accept a single str argument. May be sync or async.
  • output_path (Optional): Path to a text file that will receive each final transcript line (UTF-8).
  • api_key (Optional): Your Deepgram API key. If omitted, the DEEPGRAM_API_KEY environment variable is used; failing both raises RuntimeError.
  • keepalive (Optional): If True (default) the WebSocket client sends keepalive pings.
  • **live_options_overrides (Optional): Any keyword argument that matches a LiveOptions field overrides the built-in defaults. For example, punctuate=False.
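
The way keyword overrides layer on top of defaults can be pictured as a dictionary merge. The sketch below only illustrates that idea and is not the package's implementation: `ASSUMED_DEFAULTS` and `merge_live_options` are hypothetical, and the default values are borrowed from the comprehensive example rather than from the package's actual defaults.

```python
# Hypothetical defaults for illustration; the package's real defaults may differ.
ASSUMED_DEFAULTS = {
    "model": "nova-3-general",
    "language": "en-US",
    "punctuate": True,
    "smart_format": True,
}


def merge_live_options(**overrides) -> dict:
    """Return the assumed defaults with any keyword overrides applied on top."""
    return {**ASSUMED_DEFAULTS, **overrides}


# Later keys win, so the overrides replace the matching defaults.
options = merge_live_options(punctuate=False, language="en-GB")
print(options)
```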

Methods:

  • run(): Start transcribing and block until stop() is called or Ctrl-C is pressed.
  • stop(): Public request to shut down; may be called from any thread.
  • pause(): Pause writing transcripts to output_path. Note that the callback function will continue to receive transcription data while paused.
  • resume(): Resume writing transcripts to output_path.
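
Because stop() may be called from any thread, a session can be ended on a timer. The following sketch uses `threading.Timer` from the standard library; `stop_after` and `FakeTranscriber` are hypothetical names, and the fake object stands in for a real LiveTranscriber so the example runs without a microphone or API key.

```python
import threading


def stop_after(transcriber, seconds: float) -> threading.Timer:
    """Schedule transcriber.stop() on a background thread after `seconds`."""
    timer = threading.Timer(seconds, transcriber.stop)
    timer.daemon = True  # do not keep the process alive just for the timer
    timer.start()
    return timer


class FakeTranscriber:
    """Stand-in used only to demonstrate the helper without a microphone."""

    def __init__(self) -> None:
        self.stopped = threading.Event()

    def stop(self) -> None:
        self.stopped.set()


fake = FakeTranscriber()
timer = stop_after(fake, 0.05)
timer.join()  # wait for the timer thread to fire
print("stopped:", fake.stopped.is_set())
```

With a real instance, you would call `stop_after(tr, 30)` just before `tr.run()`, so that the blocking run() call returns after roughly 30 seconds.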

Dependencies

  • deepgram-sdk
  • numpy
  • sounddevice
