A library for transcribing audio files using Whisper models
Whisper Transcriber
A Python library for transcribing audio files using Whisper models with intelligent silence detection and segmentation.
Installation
pip install whisper-transcriber
Requirements
- Python 3.7 or higher
- ffmpeg and ffprobe installed on your system
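Before installing, it can help to confirm that both tools are on your PATH. The following is a minimal sketch using only the Python standard library; the helper name `missing_tools` is made up for illustration and is not part of whisper-transcriber.

```python
import shutil


def missing_tools(tools=("ffmpeg", "ffprobe")):
    """Return the names of required executables that are not found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing required tools: " + ", ".join(missing))
    else:
        print("ffmpeg and ffprobe are available.")
```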
Features
- Intelligent silence detection for natural segmentation
- Adaptive audio analysis for optimal threshold detection
- High-quality transcription using Whisper models
- Support for various audio formats
- Optional SRT subtitle output
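To give a rough idea of what silence-based segmentation with an adaptive threshold can look like, here is a small pure-Python sketch. It is not the library's implementation: the function names, the energy-based approach, and the threshold heuristic are all assumptions chosen for illustration. Only the parameter names `min_segment`, `max_segment`, and `silence_duration` mirror the options documented under Usage.

```python
import math


def frame_rms(samples, frame_len):
    """RMS energy of consecutive non-overlapping frames."""
    return [
        math.sqrt(sum(x * x for x in samples[i:i + frame_len]) / frame_len)
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def adaptive_threshold(energies, ratio=0.1):
    """Place the threshold a fraction of the way from the noise floor
    (1st percentile) to the typical level (median). Heuristic assumption."""
    ordered = sorted(energies)
    floor = ordered[int(0.01 * (len(ordered) - 1))]
    typical = ordered[len(ordered) // 2]
    return floor + ratio * (typical - floor)


def find_silences(energies, threshold, frame_sec, silence_duration):
    """Return (start_sec, end_sec) spans that stay below the threshold
    for at least silence_duration seconds."""
    silences, start = [], None
    # The infinite sentinel closes a silent run that reaches the end.
    for i, energy in enumerate(energies + [float("inf")]):
        if energy < threshold:
            if start is None:
                start = i
        elif start is not None:
            if (i - start) * frame_sec >= silence_duration:
                silences.append((start * frame_sec, i * frame_sec))
            start = None
    return silences


def split_points(silences, total_sec, min_segment, max_segment):
    """Cut at silence midpoints, skipping cuts that would make a segment
    shorter than min_segment and forcing cuts past max_segment."""
    cuts, last = [], 0.0
    for start, end in silences:
        mid = (start + end) / 2
        while mid - last > max_segment:  # no usable silence in time
            last += max_segment
            cuts.append(last)
        if mid - last >= min_segment:
            cuts.append(mid)
            last = mid
    while total_sec - last > max_segment:
        last += max_segment
        cuts.append(last)
    return cuts


# Synthetic example: 6 s tone, 0.5 s silence, 6 s tone at a 1 kHz sample rate.
rate, frame_len = 1000, 20
signal = [
    0.0 if 6000 <= n < 6500 else 0.5 * math.sin(2 * math.pi * 100 * n / rate)
    for n in range(12500)
]
energies = frame_rms(signal, frame_len)
threshold = adaptive_threshold(energies)
silences = find_silences(energies, threshold, frame_len / rate, silence_duration=0.2)
cuts = split_points(silences, len(signal) / rate, min_segment=5, max_segment=15)
print(cuts)
```

In this toy signal the single pause is longer than the 0.2 s silence duration and both halves exceed the 5 s minimum, so one cut is placed in the middle of the pause.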
Usage
Command Line
# Basic usage
whisper-transcribe audio_file.mp3
# Advanced usage
whisper-transcribe audio_file.mp3 -m openai/whisper-small \
    --min-segment 5 \
    --max-segment 15 \
    --silence-duration 0.2 \
    --sample-rate 16000 \
    --batch-size 8 \
    --normalize \
    --hf-token YOUR_HF_TOKEN \
    --no-timestamps
Available Arguments:
- input: Input audio file or directory (required)
- -o, --output: Output file path (optional)
- -m, --model: Whisper model to use (default: openai/whisper-small)
- --hf-token: HuggingFace API token
- --min-segment: Minimum segment length in seconds (default: 5)
- --max-segment: Maximum segment length in seconds (default: 15)
- --silence-duration: Minimum silence duration in seconds (default: 0.2)
- --sample-rate: Audio sample rate (default: 16000)
- --batch-size: Batch size for transcription (default: 8)
- --normalize: Normalize audio volume
- --no-text-normalize: Skip text normalization
- --no-timestamps: Don't print timestamps during processing
Python Library
from whisper_transcriber import WhisperTranscriber
# Initialize the transcriber
transcriber = WhisperTranscriber(model_name="openai/whisper-small", hf_token="YOUR_HF_TOKEN")
# Transcribe an audio file
results = transcriber.transcribe(
    "audio_file.mp3",
    min_segment=5,
    max_segment=15,
    silence_duration=0.2,
    sample_rate=16000,
    batch_size=8,
    normalize=True,
    normalize_text=True,
    print_timestamps=True
)
# Optionally save the transcription to an SRT file by providing an output path
results = transcriber.transcribe(
    "audio_file.mp3",
    output="transcript.srt"
)
# Access the transcription results
for i, segment in enumerate(results):
    print(f"Segment {i+1}: {segment['transcript']}")
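For readers unfamiliar with the SRT format produced by the output option, the sketch below shows how timed segments map to SRT text. It is an illustration only, not the library's writer: the `transcript` key comes from the example above, while the `start` and `end` keys (in seconds) and the helper names are hypothetical.

```python
def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments):
    """Render segments as SRT text.

    Assumes each segment is a dict with 'transcript' plus hypothetical
    'start' and 'end' keys holding times in seconds.
    """
    blocks = []
    for index, seg in enumerate(segments, start=1):
        time_line = f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}"
        blocks.append(f"{index}\n{time_line}\n{seg['transcript'].strip()}\n")
    # SRT cues are separated by a blank line.
    return "\n".join(blocks)


example = [
    {"start": 0.0, "end": 4.2, "transcript": "Hello and welcome."},
    {"start": 4.2, "end": 9.75, "transcript": "This is a second cue."},
]
print(to_srt(example))
```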
License
MIT