SoberMind Offline Session Transcriber with Speaker Diarization

Project description

SoberMind Session Transcriber

An offline-first, private speech-to-text script that uses OpenAI's Whisper models for local transcription, with optional pyannote.audio integration for speaker diarization (labelling who spoke when).

1. System Requirements & Setup

The script runs entirely on your machine, so recordings of your therapy sessions and their transcripts stay on your device. Network access is needed only to download model weights the first time a model is used.

Step A: Install FFmpeg

The transcription backend requires ffmpeg to decode audio files:

- Windows: Install ffmpeg with Chocolatey (choco install ffmpeg), or download it from the official website and add its bin directory to your system PATH.
- macOS: brew install ffmpeg
- Linux (Debian/Ubuntu): sudo apt install ffmpeg
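
Before the first run it can help to confirm that ffmpeg is reachable from Python. The following is a minimal sketch using only the standard library; the helper names `find_ffmpeg` and `ffmpeg_version` are illustrative and not part of this package.

```python
import shutil
import subprocess


def find_ffmpeg(executable="ffmpeg"):
    """Return the full path to the executable if it is on PATH, else None."""
    return shutil.which(executable)


def ffmpeg_version(path):
    """Return the first line printed by `ffmpeg -version` (empty if none)."""
    result = subprocess.run(
        [path, "-version"], capture_output=True, text=True, check=True
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else ""


def describe_ffmpeg():
    """Build a short human-readable status message."""
    path = find_ffmpeg()
    if path is None:
        return "ffmpeg not found on PATH; install it as described above."
    return f"Found {path}: {ffmpeg_version(path)}"
```

Calling `print(describe_ffmpeg())` reports either the detected version or a reminder to install ffmpeg.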

Step B: Install Python Packages

Install the required packages in your Python environment:

pip install openai-whisper torch

2. Multi-Speaker Diarization (Optional)

To separate speakers (e.g. distinguishing between Speaker 0 and Speaker 1):

1. Install the diarization dependencies:
```
pip install pyannote.audio
```
2. Go to Hugging Face and accept the user agreements for these models (requires a free account):
   - pyannote/speaker-diarization-3.1
   - pyannote/segmentation-3.0
3. Generate a user access token with read permission on your Hugging Face settings page.
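
Once the agreements are accepted, the token lets pyannote.audio download the gated pipeline. The sketch below assumes pyannote.audio 3.1, where `Pipeline.from_pretrained` accepts `use_auth_token` (newer releases may name this parameter `token`). Reading the token from an `HF_TOKEN` environment variable is an illustrative convention, not something the script is known to do, and both helper names are hypothetical.

```python
import os


def resolve_hf_token(cli_token=None, env_var="HF_TOKEN"):
    """Prefer an explicit token; otherwise fall back to the environment."""
    token = cli_token or os.environ.get(env_var)
    if not token:
        raise ValueError(f"No Hugging Face token given and {env_var} is unset.")
    return token.strip()


def load_diarization_pipeline(token):
    """Load the gated diarization pipeline listed in the steps above."""
    # Imported lazily so this module still loads without pyannote.audio.
    from pyannote.audio import Pipeline

    return Pipeline.from_pretrained(
        "pyannote/speaker-diarization-3.1",
        use_auth_token=token,  # assumption: parameter name in pyannote.audio 3.1
    )
```

Keeping the token in an environment variable avoids leaving it in shell history; it can still be passed to the script explicitly with --hf-token.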

3. Usage Reference

Standard Transcription (No Speaker Separation)

Runs fully offline once the selected Whisper model has been downloaded:

python transcribe.py path/to/session.mp3
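
The internals of transcribe.py are not shown here, but the underlying openai-whisper calls look roughly like the sketch below. `transcribe_file` and `format_timestamp` are hypothetical helper names; `whisper.load_model` and `model.transcribe` are the library's own functions.

```python
def format_timestamp(seconds):
    """Render a duration in seconds as HH:MM:SS (fractions are dropped)."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def transcribe_file(audio_path, model_name="base"):
    """Transcribe one file locally and return Whisper's list of segments."""
    # Imported lazily; requires `pip install openai-whisper`.
    import whisper

    model = whisper.load_model(model_name)  # fetches weights on first use
    result = model.transcribe(audio_path)
    return result["segments"]  # each segment has "start", "end" and "text"


def print_transcript(audio_path, model_name="base"):
    """Print one timestamped line per segment."""
    for seg in transcribe_file(audio_path, model_name):
        print(f"[{format_timestamp(seg['start'])}] {seg['text'].strip()}")
```

For example, `print_transcript("path/to/session.mp3")` prints the transcript line by line with start times.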

Transcribe with Multi-Speaker Diarization

Splits conversation segments by speaker automatically:

python transcribe.py path/to/session.mp3 --hf-token "YOUR_HF_TOKEN"
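
One common way to combine a transcript with diarization output is to give each transcript segment the speaker whose turns overlap it the most. The sketch below illustrates that idea under assumed data shapes (Whisper-style segment dicts and `(start, end, speaker)` tuples); it is not necessarily how this script merges the two.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(segments, turns, unknown="UNKNOWN"):
    """Label each transcript segment with the speaker who overlaps it most.

    segments: dicts with "start", "end" and "text" keys.
    turns: (start, end, speaker) tuples produced by a diarizer.
    """
    labelled = []
    for seg in segments:
        totals = {}
        for t_start, t_end, speaker in turns:
            shared = overlap(seg["start"], seg["end"], t_start, t_end)
            if shared > 0:
                totals[speaker] = totals.get(speaker, 0.0) + shared
        # Fall back to a placeholder when no turn overlaps the segment.
        best = max(totals, key=totals.get) if totals else unknown
        labelled.append({**seg, "speaker": best})
    return labelled
```

Segments that fall in a gap between diarized turns keep the placeholder label, which makes them easy to spot and correct by hand.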

Options

--model: Size of the Whisper model to load (tiny, base, small, medium, large). Defaults to base, which balances speed and accuracy on a typical laptop. Larger models are more accurate but slower and need more memory.
--output: Base name for the output files.
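
For readers who want to wrap or extend the command line, the options above map naturally onto `argparse`. This is a sketch of an equivalent parser, not the script's actual source; the positional name `audio` and the help strings are assumptions.

```python
import argparse

WHISPER_SIZES = ["tiny", "base", "small", "medium", "large"]


def build_parser():
    """Build a parser mirroring the documented command-line options."""
    parser = argparse.ArgumentParser(
        prog="transcribe.py",
        description="Transcribe an audio file locally with Whisper.",
    )
    parser.add_argument("audio", help="Path to the audio file to transcribe.")
    parser.add_argument(
        "--model",
        choices=WHISPER_SIZES,
        default="base",
        help="Whisper model size (default: base).",
    )
    parser.add_argument(
        "--output", default=None, help="Base name for the output files."
    )
    parser.add_argument(
        "--hf-token",
        default=None,
        help="Hugging Face token; enables speaker diarization.",
    )
    return parser
```

With this layout, `build_parser().parse_args(["session.mp3", "--model", "small"])` yields a namespace whose `hf_token` is `None`, so diarization stays off unless a token is supplied.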

Each run writes the transcript in two formats:

- .md: A structured Markdown dialogue.
- .txt: A timestamped plain-text dialogue.
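
The exact layout of the two files is not specified here. As a rough illustration only, the sketch below renders speaker-labelled segments into a hypothetical Markdown layout and a hypothetical timestamped text layout, then writes both next to a shared base name. All function names and both layouts are assumptions.

```python
from pathlib import Path


def render_markdown(segments):
    """Render segments as a Markdown dialogue, one paragraph per turn."""
    lines = ["# Session transcript", ""]
    for seg in segments:
        lines.append(f"**{seg['speaker']}:** {seg['text'].strip()}")
        lines.append("")
    return "\n".join(lines)


def render_text(segments):
    """Render segments as timestamped plain text, one line per turn."""
    rows = []
    for seg in segments:
        minutes, seconds = divmod(int(seg["start"]), 60)
        rows.append(
            f"[{minutes:02d}:{seconds:02d}] {seg['speaker']}: {seg['text'].strip()}"
        )
    return "\n".join(rows) + "\n"


def write_outputs(segments, base_name):
    """Write <base_name>.md and <base_name>.txt; return both paths."""
    md_path = Path(f"{base_name}.md")
    txt_path = Path(f"{base_name}.txt")
    md_path.write_text(render_markdown(segments), encoding="utf-8")
    txt_path.write_text(render_text(segments), encoding="utf-8")
    return md_path, txt_path
```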

4. Web-Based GUI Dashboard

To review and edit transcripts interactively, launch the local GUI server:

python gui_server.py [port]

Default Port: 8080
Local Address: http://localhost:8080
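
The optional port argument can be handled with a few lines of validation. The sketch below shows one way to do it; `parse_port` and `local_address` are hypothetical names, and gui_server.py may parse its argument differently.

```python
DEFAULT_PORT = 8080


def parse_port(argv, default=DEFAULT_PORT):
    """Return the port given as argv[1], or the default when it is absent."""
    if len(argv) < 2:
        return default
    port = int(argv[1])  # raises ValueError for non-numeric input
    if not 1 <= port <= 65535:
        raise ValueError(f"Port out of range: {port}")
    return port


def local_address(port):
    """Build the URL to open in a browser on the same machine."""
    return f"http://localhost:{port}"
```

For instance, `local_address(parse_port(["gui_server.py", "9000"]))` gives `http://localhost:9000`.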

GUI Features:

Drag-and-Drop Form: Drop in your audio file, enter your Hugging Face token, and choose a Whisper model size.
Live Console Log: Follow status updates and model downloads in a scrollable log panel.
Dialogue Workspace:
- Edit transcribed text blocks in place.
- Speaker Renamer: Rename the default speaker codes (e.g. SPEAKER_00 to Me, SPEAKER_01 to Dr. Jameson); the new names replace the codes throughout the dialogue.
- Export Controls: Copy the formatted Markdown dialogue with one click, or download it as a JSON file.
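
The renaming and JSON export steps are simple to reproduce outside the GUI. The sketch below assumes the dialogue is a list of dicts with "speaker" and "text" keys, which is a guess at the data shape rather than the GUI's documented format; the function names are hypothetical.

```python
import json


def rename_speakers(dialogue, mapping):
    """Return a copy of the dialogue with speaker codes replaced.

    Codes missing from the mapping are left unchanged.
    """
    return [
        {**entry, "speaker": mapping.get(entry["speaker"], entry["speaker"])}
        for entry in dialogue
    ]


def to_json(dialogue):
    """Serialize the dialogue as readable JSON, keeping non-ASCII text."""
    return json.dumps(dialogue, indent=2, ensure_ascii=False)
```

Because `rename_speakers` returns a new list, the original speaker codes remain available if a rename needs to be undone.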

Project details

Release history

0.0.2 (this version): May 29, 2026
0.0.1: May 29, 2026

Download files

Source Distribution

sobertranscribe-0.0.2.tar.gz (6.2 kB)

Uploaded May 29, 2026 Source

Built Distribution

sobertranscribe-0.0.2-py3-none-any.whl (6.6 kB)

Uploaded May 29, 2026 Python 3

File details

Details for the file sobertranscribe-0.0.2.tar.gz.

File metadata

Download URL: sobertranscribe-0.0.2.tar.gz
Upload date: May 29, 2026
Size: 6.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.10

File hashes

Hashes for sobertranscribe-0.0.2.tar.gz
Algorithm	Hash digest
SHA256	`45190b340800c7a55849e7fd743063f50d882a698d9b203029f308fb817f385f`
MD5	`9fff94809acf0bfbf8d5d8a88b27a6a8`
BLAKE2b-256	`c8961b445e527eb1f5dd782e407a854b84c9474b11c901eed0417b665fcf8fbe`
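
A downloaded archive can be checked against the SHA256 digest listed above with the standard library. This is a minimal sketch; the helper names are illustrative, and the file path passed in is wherever the archive was saved.

```python
import hashlib

# SHA256 digest listed above for sobertranscribe-0.0.2.tar.gz.
EXPECTED_SDIST_SHA256 = (
    "45190b340800c7a55849e7fd743063f50d882a698d9b203029f308fb817f385f"
)


def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large archives need little memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected):
    """Return True when the file's SHA256 equals the expected hex digest."""
    return sha256_of_file(path) == expected.strip().lower()
```

For example, `matches_digest("sobertranscribe-0.0.2.tar.gz", EXPECTED_SDIST_SHA256)` should return `True` for an unmodified download.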

File details

Details for the file sobertranscribe-0.0.2-py3-none-any.whl.

File metadata

Download URL: sobertranscribe-0.0.2-py3-none-any.whl
Upload date: May 29, 2026
Size: 6.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.10

File hashes

Hashes for sobertranscribe-0.0.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`8359e0c254240ac0f42cabb6c2bfbe90b3cc2418c5314cfbfd3d89ebd1258c31`
MD5	`ae23f89664e6f3ab2e5f720becf4fad2`
BLAKE2b-256	`afacbcfc189c33ce571196fa0d8f205f365526b097bdd5da46bd7ff741658029`
