Local-first transcription tool for Korean meetings (audio file -> txt)
Project description
mynah ๐ฆ
Local-first transcription for Korean meetings.
Record directly or drop an audio file โ get a clean.txtin minutes.
No cloud. No per-meeting cost. Runs entirely on your Mac.
Screenshots
Features
- Record directly in the TUI โ mic capture with pause/resume, auto-transcribes on stop
- Or drop an existing file โ m4a, mp3, wav, flac, webm, mp4 supported
- Korean-first โ Whisper large-v3, fixed
kolanguage, VAD always on to prevent hallucinations on silence - English code-switching โ three-layer defense (initial_prompt + hotwords + post-process replacements) keeps project names and technical terms intact
- Speaker diarization โ
SPEAKER_01:labels per segment (optional, requires HF token) - Word-level timestamps โ
[HH:MM:SS]prefix per segment (optional) - Denoising โ Demucs vocals stem strips HVAC and keyboard noise (optional)
- Glossary & replacements โ editable inside the TUI; glossary fed into Whisper as context and hotwords
- TUI for daily use, CLI for scripting โ both first-class, same pipeline
- System health check โ
mynah --doctorverifies all dependencies and shows actionable fixes - Offline, always โ no API calls after the first model download (~3 GB, one-time)
Output formats
| Flags | Format |
|---|---|
| (none) | continuous text |
--timestamps |
[HH:MM:SS] sentence... |
--diarize |
SPEAKER_NN: sentence... |
--diarize --timestamps |
[HH:MM:SS] SPEAKER_NN: sentence... |
Output filename: <input>.txt. On collision: <input> (1).txt, <input> (2).txt, etc. โ never silently overwrites.
Install
Prerequisites
| Requirement | Version | Install |
|---|---|---|
| macOS (Apple Silicon) | 14+ | โ |
| Python | 3.10 โ 3.12 | brew install python@3.12 |
| ffmpeg | any | brew install ffmpeg |
| pipx | any | brew install pipx && pipx ensurepath |
โ ๏ธ Python 3.13+ is not yet supported โ the ML stack (torch, torchaudio, ctranslate2) does not have stable wheels for 3.13+.
One-line install
pipx install mynah-stt
# Optional: with denoising (Demucs) support
# pipx install 'mynah-stt[denoise]'
# Optional: speaker diarization needs a free HuggingFace token
mynah --setup
Package vs. command name: the PyPI distribution is
mynah-stt, but the CLI you actually run ismynah(and Python imports areimport mynah). The-sttsuffix only shows up at install time โ same pattern aspip install Pillowโimport PIL.
First run: Whisper large-v3 (~3 GB) downloads automatically on the first transcription and is cached for all subsequent runs.
Quick start
mynah # open TUI โ Record โ Stop โ transcript ready
mynah meeting.m4a # CLI: transcribe existing file
mynah meeting.m4a --diarize # with speaker labels
mynah meeting.m4a --timestamps # with time codes
mynah --doctor # check system dependencies
TUI
Run mynah with no arguments to open the TUI.
Main screen โ press R or Space to start recording, F to open an existing audio file, S for settings, G for glossary, Q to quit.
Recording screen โ microphone captures at 16 kHz mono. The level meter shows live input amplitude. Space to pause/resume. S to stop and start transcription automatically.
Settings (S from anywhere) โ toggle speaker diarization, word-level timestamps, and denoising with on/off switches; switch the Whisper model and input language; set your HuggingFace token inline without leaving the TUI.
Result screen โ transcript is auto-copied to the clipboard the moment it appears. Open the file or reveal it in Finder directly from this screen.
CLI
mynah <audio_file> [flags]
| Flag | Description |
|---|---|
--diarize |
Speaker diarization (SPEAKER_NN: labels) |
--timestamps |
Word-level timestamps ([HH:MM:SS] prefixes) |
--denoise |
Denoise with Demucs (requires mynah-stt[denoise]) |
--model |
large-v3 (default) or large-v3-turbo |
--lang |
ko (default), en, auto |
--setup |
Interactive HuggingFace token wizard |
--doctor |
System dependency health check |
--edit-glossary |
Open the TUI glossary editor |
--edit-replacements |
Open the TUI replacements editor |
Glossary โ the biggest quality lever
Add the project names, people, and technical terms that appear in your meetings. mynah feeds them to Whisper as context (initial_prompt + hotwords), which keeps code-switched proper nouns in their original form.
# Edit from the TUI (G key) or directly:
~/.config/mynah/glossary.txt # one term per line
Example entries: Whisper, Slack, Q3 OKR, ํ๊ธธ๋, CTranslate2
Replacements (safety net)
Post-processing find/replace rules for known mistakes:
# ~/.config/mynah/replacements.toml
[[rule]]
from = "์ฌ๋"
to = "Slack"
[[rule]]
from = "์์คํผ"
to = "Whisper"
Set regex = true on any rule to use Python regex syntax.
Configuration files
| Path | Purpose |
|---|---|
~/.config/mynah/config.toml |
Last-used options (auto-saved) |
~/.config/mynah/glossary.txt |
Domain vocabulary (one term per line) |
~/.config/mynah/replacements.toml |
Post-processing find/replace rules |
Performance
Estimates for a 1-hour Korean meeting on MacBook Pro M5:
| Pipeline | Time |
|---|---|
Transcription only (large-v3) |
~15 min |
| + Speaker diarization | ~25 min |
| + Diarization + timestamps + denoising | ~35 min |
large-v3-turbo runs ~3โ5ร faster with a small accuracy trade-off.
The pipeline runs on CPU + int8 quantization (CTranslate2 / faster-whisper). On M5 this is fast enough that GPU/Neural Engine paths are not necessary.
Troubleshooting
Run the built-in health check first:
mynah --doctor
Common issues and fixes:
| Error | Fix |
|---|---|
ffmpeg not found |
brew install ffmpeg |
bad value(s) in fds_to_keep |
pipx runpip mynah-stt install --upgrade ctranslate2 |
list_audio_backends missing |
pipx runpip mynah-stt install --upgrade torchaudio |
| Python 3.13+ not supported | brew install python@3.12 then pipx install mynah-stt --python /opt/homebrew/bin/python3.12 --force |
| Microphone not captured | macOS System Settings โ Sound โ Input โ select the correct device |
Development
git clone https://github.com/rlawjdghksdlqslek/mynah-stt.git
cd mynah-stt
# Install with dev extras (editable mode โ code changes take effect immediately)
pipx install -e '.[dev]'
# Unit tests โ no model download required
pytest tests/
# Lint
ruff check .
The codebase is organized so that the core pipeline (mynah/core/) has no UI dependencies โ TUI and CLI are thin wrappers over pipeline.run(). Unit-testable pure modules (format, glossary, replacements, settings, model_cache, record) cover the logic that doesn't need ML model downloads.
mynah/
โโโ core/ # pipeline orchestration + ML wrappers
โโโ config/ # settings, glossary, replacements (no ML deps)
โโโ cli.py # argparse entry point
โโโ tui/ # Textual app + screens
โ โโโ screens/ # main, record, progress, result, settings, editor
โ โโโ app.css # color theme
โโโ app.py # dispatches TUI vs CLI based on argv
License
Apache-2.0. See LICENSE.
Acknowledgments
- OpenAI Whisper โ speech recognition model
- WhisperX โ alignment + diarization wrapper
- faster-whisper โ CTranslate2-accelerated Whisper inference
- pyannote-audio โ speaker diarization
- Textual โ TUI framework
- Demucs โ audio source separation
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file mynah_stt-0.1.0.tar.gz.
File metadata
- Download URL: mynah_stt-0.1.0.tar.gz
- Upload date:
- Size: 39.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.13
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
b2f54acde3e04f392c843d57304dd14032d3a2d6a450fccdf56603084401200c
|
|
| MD5 |
b1e53ccd076253f60b0857ff14e4481f
|
|
| BLAKE2b-256 |
d78cb89fdb87f799ccb4f1ae510d2e811df287d631cfe3c46a28844ceaa98e8b
|
File details
Details for the file mynah_stt-0.1.0-py3-none-any.whl.
File metadata
- Download URL: mynah_stt-0.1.0-py3-none-any.whl
- Upload date:
- Size: 41.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.13
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
f9cea0e0d813c746bcb70e360c1c402635de422198e20ffa783aef39686d560a
|
|
| MD5 |
17c593e9108b836b186e906ff47bdd5f
|
|
| BLAKE2b-256 |
f45f689923f86de15019ce67f54fa60954ec8442039b02ad33c73077215dbf6a
|