A tool for transcribing audio files with optional speaker diarization
Audio Transcriber
Audio Transcriber is a Python tool for transcribing audio files with optional speaker diarization. It provides both a GUI and a programmatic interface.
Features
- Transcribe audio files to text
- Optional speaker diarization
- User-friendly GUI
- Export results to HTML
Installation
Prerequisites
- Python 3.7 or higher
Setup
We use uv for managing virtual environments and package installation. Follow these steps to set up the project:
On macOS and Linux:
# Download the setup script
curl -O https://raw.githubusercontent.com/yourusername/audio_transcriber/main/setup.sh
# Make the script executable
chmod +x setup.sh
# Run the setup script
./setup.sh
On Windows:
# Download the setup script
Invoke-WebRequest -Uri https://raw.githubusercontent.com/yourusername/audio_transcriber/main/setup.ps1 -OutFile setup.ps1
# Set execution policy to run the script
Set-ExecutionPolicy -Scope Process -ExecutionPolicy Bypass
# Run the setup script
.\setup.ps1
These scripts will:
- Install uv if it's not already installed
- Create a virtual environment
- Activate the virtual environment
- Install all required packages
Usage
GUI
To run the GUI:
python examples/simple_gui.py
Programmatic Usage
from audio_transcriber import initialize_models, transcribe_audio, diarize_audio
# Initialize models
initialize_models()
# Transcribe audio
transcription = transcribe_audio("path/to/your/audio/file.mp3")
print(transcription)
# Diarize audio (if available)
diarization = diarize_audio("path/to/your/audio/file.mp3")
for turn, _, speaker in diarization.itertracks(yield_label=True):
    print(f"Speaker {speaker}: {turn.start:.2f} - {turn.end:.2f}")
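The feature list mentions HTML export, but this README does not show the export API. As a minimal sketch, assuming the speaker turns have already been reduced to plain `(speaker, start, end, text)` tuples, the standard library is enough to render them as a page. The name `turns_to_html` and the tuple layout are hypothetical and are not part of `audio_transcriber`.

```python
from html import escape


def turns_to_html(turns, title="Transcript"):
    """Render (speaker, start, end, text) tuples as a small HTML page.

    Hypothetical helper for illustration; not part of audio_transcriber.
    """
    rows = []
    for speaker, start, end, text in turns:
        # Escape user-visible strings so characters like < and & stay literal.
        rows.append(
            "<p><strong>{}</strong> [{:.2f} - {:.2f}] {}</p>".format(
                escape(str(speaker)), start, end, escape(text)
            )
        )
    body = "\n".join(rows)
    return (
        '<!DOCTYPE html>\n<html><head><meta charset="utf-8">'
        "<title>{}</title></head>\n<body>\n{}\n</body></html>\n"
    ).format(escape(title), body)


if __name__ == "__main__":
    # Made-up sample data, only to show the expected input shape.
    sample = [
        ("SPEAKER_00", 0.0, 2.5, "Hello & welcome."),
        ("SPEAKER_01", 2.5, 4.0, "Thanks for having me."),
    ]
    print(turns_to_html(sample))
```

Writing the returned string to a `.html` file with `open(path, "w", encoding="utf-8")` gives a file any browser can open.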
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Notes
While FFmpeg is not a direct requirement for this project, some underlying libraries may use it for certain audio processing tasks. If you encounter any issues with audio file handling, consider installing FFmpeg as an additional step.
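To check quickly whether FFmpeg is available before debugging audio loading problems, a small standard-library probe is enough. This is a sketch, assuming FFmpeg would be installed under the executable name `ffmpeg` and found through `PATH`; the helper name `find_ffmpeg` is hypothetical.

```python
import shutil


def find_ffmpeg():
    """Return the full path to the ffmpeg executable, or None if it is not on PATH."""
    return shutil.which("ffmpeg")


if __name__ == "__main__":
    path = find_ffmpeg()
    if path is None:
        print("FFmpeg not found on PATH; some audio formats may fail to load.")
    else:
        print(f"FFmpeg found at {path}")
```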
Download files
File details
Details for the file transcribify-0.1.0.tar.gz.
File metadata
- Download URL: transcribify-0.1.0.tar.gz
- Upload date:
- Size: 7.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.11.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4910232c6acb1b79c89dffe65467500d84f4e7c974fd8b18845276ef77055982 |
| MD5 | 9ce85e31aa9f5a733ff4b30879131e08 |
| BLAKE2b-256 | 191592328430ef1ec56fa150fbc35daa8affe9136f406a4680832a7d2a0fc9ff |
File details
Details for the file transcribify-0.1.0-py3-none-any.whl.
File metadata
- Download URL: transcribify-0.1.0-py3-none-any.whl
- Upload date:
- Size: 8.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.11.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0a075f7993f1802032eaa6fb5a3104eecebae47bc8557cab0e79758d7f035d50 |
| MD5 | 3f75ad79c8f0cfd7b4007622df87e7dd |
| BLAKE2b-256 | c1ffb7acf8a7b82efbd317f623c11cc822f5f8b4e1676b1fcb434fc687ac8be3 |
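A downloaded file can be checked against the SHA256 digests listed above before installing it. The following is a minimal sketch using only `hashlib`; the helper names `sha256_of_file` and `matches_published_hash` are illustrative, and the local file path is whatever you saved the download as.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_published_hash("transcribify-0.1.0-py3-none-any.whl", "0a075f7993f1802032eaa6fb5a3104eecebae47bc8557cab0e79758d7f035d50")` should return `True` for an intact copy of the wheel.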