Privacy-first desktop transcription app using local Whisper models
Local Transcription Studio
Privacy-first audio/video transcription powered by OpenAI's Whisper—100% local, no cloud uploads, no subscriptions.
What is this?
Local Transcription Studio is a desktop application that transcribes audio and video files on your machine using OpenAI's Whisper model. Your files never leave your computer. The app handles speaker diarization, generates timestamps, and exports transcripts in multiple formats (SRT, VTT, TXT)—filling the gap between expensive cloud services and command-line tools.
Features
- 100% Local Processing – Whisper runs on your machine; no data uploaded to the cloud
- Drag-and-Drop Interface – Drop media files directly onto the app to transcribe
- Speaker Diarization – Identify and label different speakers in conversations
- Multiple Export Formats – Save transcripts as SRT, VTT, or TXT
- Timestamps – Accurate timing for every phrase in the transcript
- No Subscriptions – One-time setup, unlimited transcription
- Privacy by Design – Your audio stays private; no tracking or data collection
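To make the timestamp and export features concrete, here is a minimal sketch of how timed segments could be rendered as SRT. It is not the app's actual exporter. The segment shape (dicts with `start`, `end`, `text`, and an optional `speaker` key) and both function names are assumptions chosen for illustration.

```python
def format_srt_timestamp(seconds: float) -> str:
    """Convert a time in seconds to the SRT form HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Render segment dicts as numbered SRT cues separated by blank lines."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_srt_timestamp(seg["start"])
        end = format_srt_timestamp(seg["end"])
        text = seg["text"].strip()
        # "speaker" is a hypothetical key; prefix the label only when present.
        speaker = seg.get("speaker")
        if speaker:
            text = f"{speaker}: {text}"
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    return "\n".join(blocks)
```

SRT separates seconds from milliseconds with a comma, while VTT uses a period, so the two formats need slightly different timestamp helpers.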
Quick Start
Installation
1. Clone the repository:
    git clone https://github.com/yourusername/local-transcription-studio.git
    cd local-transcription-studio
2. Install dependencies:
    pip install -r requirements.txt
3. Set up environment variables:
    cp .env.example .env
4. Start the application:
    python -m local_transcription_studio.main
The app opens in your default browser at http://localhost:5000.
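If the page does not open, it can help to check whether anything is listening on port 5000. The helper below is a small sketch using only the standard library. The function name is hypothetical, and the default host and port simply mirror the URL above.

```python
import socket


def is_listening(host: str = "127.0.0.1", port: int = 5000, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0
```

Calling `is_listening()` after starting the app should return `True`. A `False` result suggests the server did not start or is bound to a different port.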
Usage
- Load Media – Drag and drop an audio or video file (MP3, WAV, MP4, etc.) into the interface
- Configure – (Optional) Adjust settings such as language and speaker detection sensitivity
- Transcribe – Click "Transcribe" and wait for processing to complete
- Export – Download your transcript in your preferred format (SRT, VTT, or TXT)
Example Workflow
1. Drop video.mp4 onto the app
2. Select "English" language
3. Enable speaker diarization
4. Click Transcribe (~2-5 min depending on file length)
5. Export as SRT for video editing, or TXT for sharing
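For readers who want to script the same flow, the sketch below uses the open-source `openai-whisper` Python package directly. It is an illustration under assumptions, not the app's internal code: `transcribe_file`, `format_vtt_timestamp`, and `segments_to_vtt` are hypothetical names, and the `base` model is an arbitrary choice. Speaker diarization (step 3) is left out because Whisper itself does not label speakers.

```python
def transcribe_file(path: str, language: str = "en", model_name: str = "base"):
    """Run Whisper locally and return its list of segment dicts."""
    # Imported lazily: needs the openai-whisper package and ffmpeg on PATH.
    import whisper

    model = whisper.load_model(model_name)
    result = model.transcribe(path, language=language)
    # Each segment carries "start" and "end" (seconds) plus "text".
    return result["segments"]


def format_vtt_timestamp(seconds: float) -> str:
    """Convert a time in seconds to the WebVTT form HH:MM:SS.mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def segments_to_vtt(segments) -> str:
    """Render segment dicts as a WebVTT document."""
    lines = ["WEBVTT", ""]
    for seg in segments:
        start = format_vtt_timestamp(seg["start"])
        end = format_vtt_timestamp(seg["end"])
        lines.append(f"{start} --> {end}")
        lines.append(seg["text"].strip())
        lines.append("")  # a blank line ends each cue
    return "\n".join(lines)
```

A typical call would be `segments_to_vtt(transcribe_file("video.mp4"))`, with the result written to a `.vtt` file next to the source video.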
Tech Stack
- Backend: Python, Flask
- Transcription Engine: OpenAI Whisper
- Frontend: HTML5, JavaScript
- Audio Processing: ffmpeg
- Testing: pytest
- Packaging: setuptools
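As an illustration of the audio-processing step, the sketch below builds an ffmpeg command that drops any video stream and writes 16 kHz mono WAV, the sample rate Whisper works with internally. Whether the app preprocesses files this way is an assumption. The function names and the `_16k.wav` output naming are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List, Optional


def build_ffmpeg_command(source: str, target: str, sample_rate: int = 16000) -> List[str]:
    """Build an ffmpeg argument list: no video, one channel, resampled audio."""
    return [
        "ffmpeg",
        "-y",               # overwrite the target if it already exists
        "-i", source,
        "-vn",              # ignore video streams
        "-ac", "1",         # downmix to mono
        "-ar", str(sample_rate),
        target,
    ]


def extract_audio(source: str, target: Optional[str] = None) -> str:
    """Run ffmpeg and return the path of the extracted WAV file."""
    if target is None:
        src = Path(source)
        # Hypothetical naming scheme that never collides with the input file.
        target = str(src.with_name(src.stem + "_16k.wav"))
    subprocess.run(build_ffmpeg_command(source, target), check=True)
    return target
```

Passing the arguments as a list rather than a shell string avoids quoting problems with file names that contain spaces.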
License
MIT – See LICENSE for details.
Want to learn more? Check out OVERVIEW.md for architecture details or MONETIZATION.md for business model information.
File details
Details for the file local_transcription_studio-0.1.0.tar.gz.
File metadata
- Download URL: local_transcription_studio-0.1.0.tar.gz
- Upload date:
- Size: 15.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.25
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 93dd0e54084dfad637df0ee669c57c0887130dd5eafe217f0d23a50df3bb006e |
| MD5 | 43cf873e8f2af81f092af013fce95cf4 |
| BLAKE2b-256 | 49e67acb431e0f38f3df90558026ecfb17d1f684965f6e2e177c19e6db786435 |
File details
Details for the file local_transcription_studio-0.1.0-py3-none-any.whl.
File metadata
- Download URL: local_transcription_studio-0.1.0-py3-none-any.whl
- Upload date:
- Size: 17.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.25
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | fe09edf13a43586f3fafa25a01e0c5aaed71ae81196f2b8b21c3235f59f7a681 |
| MD5 | bb4e146afea0f46354cc318d31eabcf9 |
| BLAKE2b-256 | 99c151569e9bf6c7c49a2d6130305777cb5539c2539477c1c2bbdf3528914823 |
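The published digests can be used to confirm that a downloaded file is intact. The following is a minimal sketch using Python's standard `hashlib`. The function names are hypothetical, and the local file path is whatever location the archive was saved to.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_published_hash("local_transcription_studio-0.1.0.tar.gz", "93dd0e54084dfad637df0ee669c57c0887130dd5eafe217f0d23a50df3bb006e")` should return `True` for an unmodified copy of the source distribution.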