A simple CLI tool to transcribe large audio files using OpenAI Whisper with chunking support.

Audio Transcript CLI

A robust Python tool to transcribe large audio files (MP3, WAV, M4A, etc.) using OpenAI's Whisper model. It automatically chunks long audio files to avoid memory issues, making it perfect for transcribing long meetings, podcasts, or lectures on consumer hardware or Google Colab.

Features

  • Automatic Chunking: Splits large audio files into 30-second segments (customizable) to reduce the risk of out-of-memory (OOM) errors.
  • GPU Acceleration: Automatically utilizes CUDA or MPS (Apple Silicon) if available.
  • Format Support: Supports all audio formats compatible with ffmpeg (MP3, WAV, M4A, FLAC, etc.) at 16 kHz.
  • Easy CLI: Simple command-line interface for quick usage.
  • Python API: Importable functions for integration into your own Python scripts.
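
To make the chunking feature concrete, here is a minimal sketch of how a recording's duration could be divided into fixed-length segments. The function name chunk_bounds is hypothetical and is not part of this package; it only illustrates the arithmetic implied by a 30,000 ms chunk length, not the package's actual implementation.

```python
from typing import List, Tuple


def chunk_bounds(total_ms: int, chunk_ms: int = 30_000) -> List[Tuple[int, int]]:
    """Return (start, end) offsets in milliseconds covering [0, total_ms).

    Hypothetical helper for illustration only; not part of audio-transcript-cli.
    """
    if chunk_ms <= 0:
        raise ValueError("chunk_ms must be positive")
    # The last chunk is clipped so it never runs past the end of the audio.
    return [
        (start, min(start + chunk_ms, total_ms))
        for start in range(0, total_ms, chunk_ms)
    ]


# A 70-second file with the default 30-second chunk length
print(chunk_bounds(70_000))
# [(0, 30000), (30000, 60000), (60000, 70000)]
```

Each segment can then be transcribed on its own and the resulting text joined in order, which keeps memory use bounded by the chunk length rather than the file length.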

Installation

You can install the package directly via pip:

pip install audio-transcript-cli

System Requirements

This package requires FFmpeg to process audio files.

  • macOS: brew install ffmpeg
  • Ubuntu/Debian: sudo apt-get install ffmpeg
  • Windows: Download FFmpeg and add it to your PATH.
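
If you want to confirm that FFmpeg is visible before starting a long job, a quick check like the following may help. It uses only the Python standard library; the function name ffmpeg_available is a hypothetical helper, not something this package provides.

```python
import shutil


def ffmpeg_available() -> bool:
    """Return True if an `ffmpeg` executable is found on PATH.

    Hypothetical helper for illustration; not part of audio-transcript-cli.
    """
    return shutil.which("ffmpeg") is not None


if not ffmpeg_available():
    print("FFmpeg not found on PATH; install it before transcribing.")
```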

Usage

Command Line Interface

Transcribe a file directly from your terminal:

transcribe-audio path/to/audio/meeting.mp3

Options:

| Flag | Description | Default |
|------|-------------|---------|
| --model | Whisper model size (tiny, base, small, medium, large, large-v2), given as a model ID such as openai/whisper-medium. | openai/whisper-large-v2 |
| --device | Device to run on (cuda, cpu, mps). Auto-detected if omitted. | Auto |
| --chunk-size | Chunk length in milliseconds. | 30000 (30s) |
| --output, -o | Output text filename. | transcript.txt |

Example:

transcribe-audio podcast.mp3 --model openai/whisper-medium -o podcast_transcript.txt
# Supports M4A files too
transcribe-audio voice_note.m4a --model openai/whisper-base
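
The --device option is described as auto-detected. The sketch below shows one common way such detection is done with PyTorch. It assumes PyTorch is the backend and that CUDA is preferred over MPS, neither of which is confirmed here; pick_device is a hypothetical name, and the package's own logic may differ.

```python
def pick_device() -> str:
    """Pick "cuda", "mps", or "cpu" based on what PyTorch reports.

    Hypothetical helper; the preference order is an assumption.
    """
    try:
        import torch
    except ImportError:
        # Without PyTorch there is nothing to probe, so fall back to CPU.
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # Older PyTorch builds have no MPS backend, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


print(pick_device())
```

Passing --device explicitly overrides whatever detection the tool performs, which is useful when a GPU is present but you want a CPU run.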

Python API

Use the transcriber in your own code:

from audio_transcript import transcribe

# Transcribe a file
result = transcribe(
    audio_path="interview.mp3",
    model_name="openai/whisper-medium",
    chunk_length_ms=30000,
    device="cuda",  # or "cpu", "mps"
)

print(result)

# Save to file (explicit UTF-8 so non-ASCII text is written consistently)
with open("transcript.txt", "w", encoding="utf-8") as f:
    f.write(result)
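
For several recordings, the same call can be wrapped in a loop. The sketch below is a hypothetical helper, not part of the package: it takes the transcription function as an argument and assumes only what the example above shows, namely that it accepts audio_path and returns the transcript as a string.

```python
from pathlib import Path
from typing import Callable, List


def transcribe_folder(
    folder: str,
    transcribe_fn: Callable[..., str],
    pattern: str = "*.mp3",
) -> List[Path]:
    """Transcribe every file matching `pattern` and write a .txt next to it.

    Hypothetical helper; `transcribe_fn` is expected to behave like
    `audio_transcript.transcribe` (keyword `audio_path`, returns str).
    """
    written = []
    for audio in sorted(Path(folder).glob(pattern)):
        text = transcribe_fn(audio_path=str(audio))
        out = audio.with_suffix(".txt")  # meeting.mp3 -> meeting.txt
        out.write_text(text, encoding="utf-8")
        written.append(out)
    return written
```

With the package installed, this could be called as transcribe_folder("recordings", transcribe) after from audio_transcript import transcribe, where "recordings" is a placeholder directory name.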

Running on Google Colab

Use the following commands in a Colab notebook cell to run the transcriber:

# 1. Install system dependencies and the package
!apt-get install -y ffmpeg
!pip install git+https://github.com/azmatsiddique/audio-transcript-cli.git

# 2. Upload your file (drag and drop to the file pane on the left)

# 3. Run the transcription
!transcribe-audio "your_file.mp3" --output "transcript.txt" --device cuda

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contact

Azmat Siddique
Email: azmat.siddique.98@gmail.com
GitHub: azmatsiddique
