Audio Transcribe
A command-line tool for transcribing audio files using OpenAI's Whisper model.
Features
- Transcribe individual audio files with customizable options
- Batch process multiple audio files in a directory
- Support for various audio formats (MP3, WAV, M4A, OGG, FLAC)
- Multiple Whisper model sizes (tiny, base, small, medium, large)
- Progress indicators with rich terminal output
- Optional JSON output with detailed transcription results
Installation
# Clone the repository
git clone https://github.com/samurmaykrr/audio_transcribe.git
cd audio_transcribe
# Install the package
pip install -e .
Usage
Single File Transcription
# Basic usage
audio-transcribe transcribe path/to/audio.mp3
# Specify output directory
audio-transcribe transcribe path/to/audio.mp3 -o output/directory
# Use a different model size
audio-transcribe transcribe path/to/audio.mp3 -m large
# Specify language (auto-detects if not specified)
audio-transcribe transcribe path/to/audio.mp3 -l en
# Save detailed results in JSON format
audio-transcribe transcribe path/to/audio.mp3 -j
# Translate to English
audio-transcribe transcribe path/to/audio.mp3 -t translate
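The flags above line up with the Python API of openai-whisper. The following is a minimal sketch of how a single-file run could be wired, assuming the CLI forwards `-m`, `-l`, and `-t` to `whisper.load_model` and `model.transcribe`. The helper names `build_options` and `transcribe_file`, and the `base` default, are hypothetical and not taken from this package's source.

```python
from typing import Any, Dict, Optional

VALID_TASKS = ("transcribe", "translate")


def build_options(language: Optional[str] = None,
                  task: str = "transcribe") -> Dict[str, Any]:
    """Build keyword arguments for Whisper's transcribe() (hypothetical helper)."""
    if task not in VALID_TASKS:
        raise ValueError(f"task must be one of {VALID_TASKS}, got {task!r}")
    options: Dict[str, Any] = {"task": task}
    # Leaving "language" out lets Whisper auto-detect the spoken language.
    if language is not None:
        options["language"] = language
    return options


def transcribe_file(path: str, model_size: str = "base",
                    language: Optional[str] = None,
                    task: str = "transcribe") -> str:
    """Transcribe one file and return its text (assumed wiring, not the package's code)."""
    import whisper  # imported lazily so build_options works without the model installed

    model = whisper.load_model(model_size)
    result = model.transcribe(path, **build_options(language, task))
    return result["text"].strip()
```

For example, `transcribe_file("path/to/audio.mp3", model_size="large", task="translate")` would roughly correspond to combining `-m large` with `-t translate`.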
Batch Processing
# Process all audio files in a directory
audio-transcribe batch path/to/audio/directory
# Specify output directory
audio-transcribe batch path/to/audio/directory -o output/directory
# Use a different model size
audio-transcribe batch path/to/audio/directory -m large
# Process specific file extensions
audio-transcribe batch path/to/audio/directory -e mp3 wav
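Batch mode needs to pick out the files whose extension matches the `-e` values. A minimal sketch of that selection step, assuming a non-recursive scan and case-insensitive matching; `find_audio_files` and `DEFAULT_EXTENSIONS` are hypothetical names, and the real tool may behave differently.

```python
from pathlib import Path
from typing import Iterable, List

# Formats listed in this README; used when no extensions are given.
DEFAULT_EXTENSIONS = ("mp3", "wav", "m4a", "ogg", "flac")


def find_audio_files(directory: str,
                     extensions: Iterable[str] = DEFAULT_EXTENSIONS) -> List[Path]:
    """Return the audio files directly inside `directory`, sorted by path."""
    # Normalise "mp3", ".mp3", and "MP3" to the same ".mp3" suffix.
    wanted = {"." + ext.lower().lstrip(".") for ext in extensions}
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in wanted
    )
```

Calling `find_audio_files("path/to/audio/directory", ["mp3", "wav"])` mirrors the `-e mp3 wav` example above.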
Configuration Options
Model Sizes
- tiny: Fastest, lowest accuracy
- base: Good balance of speed and accuracy
- small: Better accuracy, slower than base
- medium: High accuracy, slower processing
- large: Best accuracy, slowest processing
Supported Audio Formats
- MP3
- WAV
- M4A
- OGG
- FLAC
Output Options
- Text file output (default)
- Optional JSON output with timestamps and confidence scores
- Custom output directory specification
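The output options above can be sketched as a small writer for a Whisper result dictionary. This assumes the result has the usual `text`, `language`, and `segments` keys, that output files are named after the audio file's stem, and that `avg_logprob` stands in for the confidence score. The function name `save_result` and the JSON layout are hypothetical.

```python
import json
from pathlib import Path
from typing import Any, Dict, Optional


def save_result(result: Dict[str, Any], audio_path: str,
                output_dir: Optional[str] = None,
                save_json: bool = False) -> Path:
    """Write <stem>.txt (and optionally <stem>.json); return the text file path."""
    source = Path(audio_path)
    # Assumption: fall back to the audio file's own directory when none is given.
    target_dir = Path(output_dir) if output_dir else source.parent
    target_dir.mkdir(parents=True, exist_ok=True)

    text = result["text"].strip()
    text_path = target_dir / (source.stem + ".txt")
    text_path.write_text(text + "\n", encoding="utf-8")

    if save_json:
        # Keep only timestamps, text, and the per-segment log-probability.
        segments = [
            {
                "start": seg["start"],
                "end": seg["end"],
                "text": seg["text"].strip(),
                "avg_logprob": seg.get("avg_logprob"),
            }
            for seg in result.get("segments", [])
        ]
        payload = {"language": result.get("language"), "text": text, "segments": segments}
        json_path = target_dir / (source.stem + ".json")
        json_path.write_text(json.dumps(payload, ensure_ascii=False, indent=2),
                             encoding="utf-8")
    return text_path
```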
Requirements
- Python 3.7 or higher
- Dependencies:
  - openai-whisper
  - typer
  - rich
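A quick way to confirm that an environment meets these requirements is a small check script. This is only a sketch: `missing_requirements` is a hypothetical helper, not part of the package, and it relies on the fact that openai-whisper is imported under the module name `whisper`.

```python
import importlib.util
import sys
from typing import List, Sequence, Tuple

# Import names for the dependencies listed above.
REQUIRED_MODULES = ("whisper", "typer", "rich")


def missing_requirements(min_python: Tuple[int, int] = (3, 7),
                         modules: Sequence[str] = REQUIRED_MODULES) -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems: List[str] = []
    if sys.version_info < min_python:
        problems.append(
            "Python %d.%d or higher is required" % (min_python[0], min_python[1])
        )
    for name in modules:
        # find_spec returns None when a top-level module cannot be found.
        if importlib.util.find_spec(name) is None:
            problems.append(f"missing module: {name}")
    return problems


if __name__ == "__main__":
    for problem in missing_requirements():
        print(problem)
```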
License
This project is licensed under the MIT License - see the LICENSE.md file for details.