Privacy-first desktop transcription app using local Whisper models

Local Transcription Studio

Privacy-first audio/video transcription powered by OpenAI's Whisper—100% local, no cloud uploads, no subscriptions.

What is this?

Local Transcription Studio is a desktop application that transcribes audio and video files on your machine using OpenAI's Whisper model. Your files never leave your computer. The app handles speaker diarization, generates timestamps, and exports transcripts in multiple formats (SRT, VTT, TXT)—filling the gap between expensive cloud services and command-line tools.
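For readers curious what local transcription looks like in code, here is a minimal sketch that calls the open-source `openai-whisper` Python package directly. It assumes that package and ffmpeg are installed. The function names `transcribe_locally` and `summarize_segments` are hypothetical and are not taken from this project's source. Note that the `whisper` library downloads model weights the first time a model is loaded; after that, inference runs on the local machine.

```python
from __future__ import annotations

from typing import Any


def transcribe_locally(path: str, model_name: str = "base") -> dict[str, Any]:
    """Run Whisper on a local media file and return its result dictionary."""
    # Imported lazily so the helper below works without the package installed.
    import whisper  # provided by the openai-whisper package

    model = whisper.load_model(model_name)
    # The result contains "text" plus a list of "segments" with start/end times.
    return model.transcribe(path)


def summarize_segments(result: dict[str, Any]) -> list[str]:
    """Render each segment as '[start-end] text', with times in seconds."""
    lines = []
    for seg in result.get("segments", []):
        lines.append(f"[{seg['start']:.2f}-{seg['end']:.2f}] {seg['text'].strip()}")
    return lines
```

Calling `summarize_segments(transcribe_locally("video.mp4"))` would give one line per recognized phrase; how the app itself wraps Whisper may differ.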

Features

  • 100% Local Processing – Whisper runs on your machine; no data uploaded to the cloud
  • Drag-and-Drop Interface – Drop media files directly onto the app to transcribe
  • Speaker Diarization – Identify and label different speakers in conversations
  • Multiple Export Formats – Save transcripts as SRT, VTT, or TXT
  • Timestamps – Accurate timing for every phrase in the transcript
  • No Subscriptions – One-time setup, unlimited transcription
  • Privacy by Design – Your audio stays private; no tracking or data collection
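Whisper on its own produces timed text segments but no speaker labels, so speaker diarization needs a separate step. How this app performs that step is not described here. The sketch below only illustrates one common way to combine the two outputs: each transcript segment gets the speaker whose turn overlaps it the most. The `Turn` class, `label_segments` function, and speaker names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """Hypothetical diarization output: one speaker talking from start to end."""
    speaker: str
    start: float
    end: float


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds shared by two time intervals (0.0 if they are disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def label_segments(segments, turns, unknown="UNKNOWN"):
    """Attach a 'speaker' key to each segment, chosen by largest time overlap."""
    labeled = []
    for seg in segments:
        best, best_overlap = unknown, 0.0
        for turn in turns:
            shared = overlap(seg["start"], seg["end"], turn.start, turn.end)
            if shared > best_overlap:
                best, best_overlap = turn.speaker, shared
        # Copy the segment so the caller's data is left untouched.
        labeled.append({**seg, "speaker": best})
    return labeled
```

Segments that overlap no turn at all keep the `unknown` label rather than being guessed.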

Quick Start

Installation

  1. Clone the repository:

    git clone https://github.com/yourusername/local-transcription-studio.git
    cd local-transcription-studio
    
  2. Install dependencies:

    pip install -r requirements.txt
    
  3. Set up environment variables:

    cp .env.example .env
    
  4. Start the application:

    python -m local_transcription_studio.main
    

The app runs as a local web server and opens in your default browser at http://localhost:5000.

Usage

  1. Load Media – Drag and drop an audio or video file (MP3, WAV, MP4, etc.) into the interface
  2. Configure – (Optional) Adjust settings such as language and speaker detection sensitivity
  3. Transcribe – Click "Transcribe" and wait for processing to complete
  4. Export – Download your transcript in your preferred format (SRT, VTT, or TXT)
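The three export formats differ mainly in how timestamps are written. As a sketch, assuming segments shaped like Whisper's output (dictionaries with `start`, `end`, and `text`), the conversion could look like the following. The function names are hypothetical and not taken from this project's code.

```python
def format_timestamp(seconds: float, decimal_marker: str) -> str:
    """Format seconds as HH:MM:SS<marker>mmm (',' for SRT, '.' for VTT)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{decimal_marker}{ms:03d}"


def to_srt(segments) -> str:
    """SRT: numbered cues, comma before milliseconds, blank line between cues."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"], ",")
        end = format_timestamp(seg["end"], ",")
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


def to_vtt(segments) -> str:
    """WebVTT: 'WEBVTT' header, dot before milliseconds, cue numbers optional."""
    cues = ["WEBVTT\n"]
    for seg in segments:
        start = format_timestamp(seg["start"], ".")
        end = format_timestamp(seg["end"], ".")
        cues.append(f"{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(cues)


def to_txt(segments) -> str:
    """Plain text: one segment per line, no timing information."""
    return "\n".join(seg["text"].strip() for seg in segments) + "\n"
```

SRT suits video editors that expect numbered cues, VTT is the format browsers read for HTML5 video captions, and TXT drops timing entirely for easy sharing.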

Example Workflow

1. Drop video.mp4 onto the app
2. Select "English" language
3. Enable speaker diarization
4. Click Transcribe (roughly 2-5 minutes, depending on file length)
5. Export as SRT for video editing, or TXT for sharing

Tech Stack

  • Backend: Python, Flask
  • Transcription Engine: OpenAI Whisper
  • Frontend: HTML5, JavaScript
  • Audio Processing: ffmpeg
  • Testing: pytest
  • Packaging: setuptools
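Because ffmpeg is an external command-line tool rather than a Python package, `pip install` will not provide it. A small pre-flight check like the sketch below can confirm it is on the PATH before transcribing. The function name `describe_ffmpeg` is hypothetical; whether the app performs a similar check at startup is not stated here.

```python
import shutil
import subprocess


def describe_ffmpeg() -> str:
    """Return ffmpeg's version line, or a hint if it is not installed."""
    executable = shutil.which("ffmpeg")  # None when ffmpeg is not on PATH
    if executable is None:
        return "ffmpeg not found on PATH; install it before transcribing."
    completed = subprocess.run(
        [executable, "-version"],
        capture_output=True,
        text=True,
        check=False,
    )
    lines = completed.stdout.splitlines()
    # The first line of `ffmpeg -version` names the installed version.
    return lines[0] if lines else f"ffmpeg found at {executable}"


if __name__ == "__main__":
    print(describe_ffmpeg())
```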

License

MIT – See LICENSE for details.


Want to learn more? Check out OVERVIEW.md for architecture details or MONETIZATION.md for business model information.
