A simple CLI tool to transcribe large audio files using OpenAI Whisper with chunking support.
Audio Transcript CLI
A robust Python tool to transcribe large audio files (MP3, WAV, M4A, etc.) using OpenAI's Whisper model. It automatically chunks long audio files to avoid memory issues, making it perfect for transcribing long meetings, podcasts, or lectures on consumer hardware or Google Colab.
Features
- Automatic Chunking: Splits large audio files into 30-second segments (customizable) to prevent out-of-memory (OOM) errors.
- GPU Acceleration: Automatically utilizes CUDA or MPS (Apple Silicon) if available.
- Format Support: Supports all audio formats compatible with ffmpeg (MP3, WAV, M4A, FLAC, etc.) at 16kHz.
- Easy CLI: Simple command-line interface for quick usage.
- Python API: Importable functions for integration into your own Python scripts.
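The chunking idea can be illustrated with a minimal sketch. This is not the package's own code: the helper names chunk_bounds and join_transcripts are hypothetical, and the sketch only assumes that chunk lengths are expressed in milliseconds, as with the --chunk-size option.

```python
def chunk_bounds(duration_ms: int, chunk_ms: int = 30_000) -> list[tuple[int, int]]:
    """Return (start, end) offsets in milliseconds covering the whole recording.

    Hypothetical helper for illustration; the last chunk may be shorter.
    """
    if duration_ms < 0:
        raise ValueError("duration_ms must be non-negative")
    if chunk_ms <= 0:
        raise ValueError("chunk_ms must be positive")
    return [
        (start, min(start + chunk_ms, duration_ms))
        for start in range(0, duration_ms, chunk_ms)
    ]


def join_transcripts(parts: list[str]) -> str:
    """Join per-chunk transcripts into one text, skipping empty chunks."""
    return " ".join(p.strip() for p in parts if p.strip())


# A 95-second recording yields three full chunks and one 5-second remainder.
print(chunk_bounds(95_000))
```

Only one chunk needs to be held in memory and passed to the model at a time, which is why peak memory use stays roughly constant regardless of the recording's length.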
Installation
You can install the package directly via pip:
pip install audio-transcript-cli
System Requirements
This package requires FFmpeg to process audio files.
- macOS: brew install ffmpeg
- Ubuntu/Debian: sudo apt-get install ffmpeg
- Windows: Download FFmpeg and add it to PATH.
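To confirm that FFmpeg is reachable before transcribing, a small check along these lines may help. The function name ffmpeg_available is a hypothetical helper, not part of this package; it relies only on the standard library's shutil.which.

```python
import shutil


def ffmpeg_available(executable: str = "ffmpeg") -> bool:
    """Return True if the given executable can be found on PATH."""
    return shutil.which(executable) is not None


if __name__ == "__main__":
    if ffmpeg_available():
        print("ffmpeg found")
    else:
        print("ffmpeg not found on PATH; install it first")
```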
Usage
Command Line Interface
Transcribe a file directly from your terminal:
transcribe-audio path/to/audio/meeting.mp3
Options:
| Flag | Description | Default |
|---|---|---|
| --model | Whisper model size (tiny, base, small, medium, large, large-v2), passed as in the examples, e.g. openai/whisper-medium | openai/whisper-large-v2 |
| --device | Device to run on (cuda, cpu, mps). Auto-detected. | Auto |
| --chunk-size | Chunk length in milliseconds. | 30000 (30s) |
| --output, -o | Output text filename. | transcript.txt |
Example:
transcribe-audio podcast.mp3 --model openai/whisper-medium -o podcast_transcript.txt
# Supports M4A files too
transcribe-audio voice_note.m4a --model openai/whisper-base
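How the "Auto" default for --device is resolved is not documented here. A common pattern, sketched below under the assumption that PyTorch is the backend, is to prefer CUDA, then MPS, then fall back to the CPU. The names pick_device and detect_device are hypothetical.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Choose a device name, preferring CUDA, then MPS, then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Probe PyTorch for accelerators; fall back to "cpu" if torch is missing."""
    try:
        import torch  # assumption: PyTorch is the underlying framework
    except ImportError:
        return "cpu"
    # Older PyTorch builds have no MPS backend, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    mps_ok = mps is not None and mps.is_available()
    return pick_device(torch.cuda.is_available(), mps_ok)


print(detect_device())
```

Keeping the priority logic in pick_device, separate from the hardware probing, makes it testable on machines without a GPU.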
Python API
Use the transcriber in your own code:
from audio_transcript import transcribe

# Transcribe a file
result = transcribe(
    audio_path="interview.mp3",
    model_name="openai/whisper-medium",
    chunk_length_ms=30000,
    device="cuda",  # or "cpu", "mps"
)
print(result)

# Save to file
with open("transcript.txt", "w") as f:
    f.write(result)
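For several recordings, the same call can be wrapped in a loop. The sketch below is a hypothetical helper (transcribe_folder is not part of the package). It assumes only what the example above shows: the transcribe function accepts an audio_path keyword and returns the transcript as a string. The function is passed in as an argument so the helper can be tried with a stand-in.

```python
from pathlib import Path
from typing import Callable


def transcribe_folder(
    folder,
    transcribe_fn: Callable[..., str],
    pattern: str = "*.mp3",
    **kwargs,
) -> list[Path]:
    """Transcribe every matching file and write a .txt file next to it.

    Extra keyword arguments (e.g. model_name, device) are forwarded unchanged.
    Returns the paths of the transcript files that were written.
    """
    written = []
    for audio in sorted(Path(folder).glob(pattern)):
        text = transcribe_fn(audio_path=str(audio), **kwargs)
        out = audio.with_suffix(".txt")
        out.write_text(text, encoding="utf-8")
        written.append(out)
    return written
```

With the package installed, a call might look like transcribe_folder("recordings", transcribe, model_name="openai/whisper-base"), where "recordings" is a placeholder directory name.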
Running on Google Colab
Use the following commands in a Colab notebook cell to run the transcriber:
# 1. Install system dependencies and the package
!apt-get install -y ffmpeg
!pip install git+https://github.com/azmatsiddique/audio-transcript-cli.git
# 2. Upload your file (drag and drop to the file pane on the left)
# 3. Run the transcription
!transcribe-audio "your_file.mp3" --output "transcript.txt" --device cuda
License
This project is licensed under the MIT License - see the LICENSE file for details.
Contact
Azmat Siddique
Email: azmat.siddique.98@gmail.com
GitHub: azmatsiddique