Skip to main content

Local-first transcription tool for Korean meetings (audio file -> txt)

Project description

mynah ๐Ÿฆ

Local-first transcription for Korean meetings.
Record directly or drop an audio file โ€” get a clean .txt in minutes.
No cloud. No per-meeting cost. Runs entirely on your Mac.

Python 3.10โ€“3.12 License Apache-2.0 Platform macOS Apple Silicon


Screenshots

Main screen

Recording screen Settings screen

Progress screen Result screen

Features

  • Record directly in the TUI โ€” mic capture with pause/resume, auto-transcribes on stop
  • Or drop an existing file โ€” m4a, mp3, wav, flac, webm, mp4 supported
  • Korean-first โ€” Whisper large-v3, fixed ko language, VAD always on to prevent hallucinations on silence
  • English code-switching โ€” three-layer defense (initial_prompt + hotwords + post-process replacements) keeps project names and technical terms intact
  • Speaker diarization โ€” SPEAKER_01: labels per segment (optional, requires HF token)
  • Word-level timestamps โ€” [HH:MM:SS] prefix per segment (optional)
  • Denoising โ€” Demucs vocals stem strips HVAC and keyboard noise (optional)
  • Glossary & replacements โ€” editable inside the TUI; glossary fed into Whisper as context and hotwords
  • TUI for daily use, CLI for scripting โ€” both first-class, same pipeline
  • System health check โ€” mynah --doctor verifies all dependencies and shows actionable fixes
  • Offline, always โ€” no API calls after the first model download (~3 GB, one-time)

Output formats

Flags Format
(none) continuous text
--timestamps [HH:MM:SS] sentence...
--diarize SPEAKER_NN: sentence...
--diarize --timestamps [HH:MM:SS] SPEAKER_NN: sentence...

Output filename: <input>.txt. On collision: <input> (1).txt, <input> (2).txt, etc. โ€” never silently overwrites.


Install

Prerequisites

Requirement Version Install
macOS (Apple Silicon) 14+ โ€”
Python 3.10 โ€“ 3.12 brew install python@3.12
ffmpeg any brew install ffmpeg
pipx any brew install pipx && pipx ensurepath

โš ๏ธ Python 3.13+ is not yet supported โ€” the ML stack (torch, torchaudio, ctranslate2) does not have stable wheels for 3.13+.

One-line install

pipx install mynah-stt

# Optional: with denoising (Demucs) support
# pipx install 'mynah-stt[denoise]'

# Optional: speaker diarization needs a free HuggingFace token
mynah --setup

Package vs. command name: the PyPI distribution is mynah-stt, but the CLI you actually run is mynah (and Python imports are import mynah). The -stt suffix only shows up at install time โ€” same pattern as pip install Pillow โ†’ import PIL.

First run: Whisper large-v3 (~3 GB) downloads automatically on the first transcription and is cached for all subsequent runs.


Quick start

mynah                               # open TUI โ†’ Record โ†’ Stop โ†’ transcript ready
mynah meeting.m4a                   # CLI: transcribe existing file
mynah meeting.m4a --diarize         # with speaker labels
mynah meeting.m4a --timestamps      # with time codes
mynah --doctor                      # check system dependencies

TUI

Run mynah with no arguments to open the TUI.

Main screen โ€” press R or Space to start recording, F to open an existing audio file, S for settings, G for glossary, Q to quit.

Recording screen โ€” microphone captures at 16 kHz mono. The level meter shows live input amplitude. Space to pause/resume. S to stop and start transcription automatically.

Settings (S from anywhere) โ€” toggle speaker diarization, word-level timestamps, and denoising with on/off switches; switch the Whisper model and input language; set your HuggingFace token inline without leaving the TUI.

Result screen โ€” transcript is auto-copied to the clipboard the moment it appears. Open the file or reveal it in Finder directly from this screen.


CLI

mynah <audio_file> [flags]
Flag Description
--diarize Speaker diarization (SPEAKER_NN: labels)
--timestamps Word-level timestamps ([HH:MM:SS] prefixes)
--denoise Denoise with Demucs (requires mynah-stt[denoise])
--model large-v3 (default) or large-v3-turbo
--lang ko (default), en, auto
--setup Interactive HuggingFace token wizard
--doctor System dependency health check
--edit-glossary Open the TUI glossary editor
--edit-replacements Open the TUI replacements editor

Glossary โ€” the biggest quality lever

Add the project names, people, and technical terms that appear in your meetings. mynah feeds them to Whisper as context (initial_prompt + hotwords), which keeps code-switched proper nouns in their original form.

# Edit from the TUI (G key) or directly:
~/.config/mynah/glossary.txt   # one term per line

Example entries: Whisper, Slack, Q3 OKR, ํ™๊ธธ๋™, CTranslate2

Replacements (safety net)

Post-processing find/replace rules for known mistakes:

# ~/.config/mynah/replacements.toml
[[rule]]
from = "์Šฌ๋ž™"
to = "Slack"

[[rule]]
from = "์œ„์Šคํผ"
to = "Whisper"

Set regex = true on any rule to use Python regex syntax.


Configuration files

Path Purpose
~/.config/mynah/config.toml Last-used options (auto-saved)
~/.config/mynah/glossary.txt Domain vocabulary (one term per line)
~/.config/mynah/replacements.toml Post-processing find/replace rules

Performance

Estimates for a 1-hour Korean meeting on MacBook Pro M5:

Pipeline Time
Transcription only (large-v3) ~15 min
+ Speaker diarization ~25 min
+ Diarization + timestamps + denoising ~35 min

large-v3-turbo runs ~3โ€“5ร— faster with a small accuracy trade-off.

The pipeline runs on CPU + int8 quantization (CTranslate2 / faster-whisper). On M5 this is fast enough that GPU/Neural Engine paths are not necessary.


Troubleshooting

Run the built-in health check first:

mynah --doctor

Common issues and fixes:

Error Fix
ffmpeg not found brew install ffmpeg
bad value(s) in fds_to_keep pipx runpip mynah-stt install --upgrade ctranslate2
list_audio_backends missing pipx runpip mynah-stt install --upgrade torchaudio
Python 3.13+ not supported brew install python@3.12 then pipx install mynah-stt --python /opt/homebrew/bin/python3.12 --force
Microphone not captured macOS System Settings โ†’ Sound โ†’ Input โ†’ select the correct device

Development

git clone https://github.com/rlawjdghksdlqslek/mynah-stt.git
cd mynah-stt

# Install with dev extras (editable mode โ€” code changes take effect immediately)
pipx install -e '.[dev]'

# Unit tests โ€” no model download required
pytest tests/

# Lint
ruff check .

The codebase is organized so that the core pipeline (mynah/core/) has no UI dependencies โ€” TUI and CLI are thin wrappers over pipeline.run(). Unit-testable pure modules (format, glossary, replacements, settings, model_cache, record) cover the logic that doesn't need ML model downloads.

mynah/
โ”œโ”€โ”€ core/         # pipeline orchestration + ML wrappers
โ”œโ”€โ”€ config/       # settings, glossary, replacements (no ML deps)
โ”œโ”€โ”€ cli.py        # argparse entry point
โ”œโ”€โ”€ tui/          # Textual app + screens
โ”‚   โ”œโ”€โ”€ screens/  # main, record, progress, result, settings, editor
โ”‚   โ””โ”€โ”€ app.css   # color theme
โ””โ”€โ”€ app.py        # dispatches TUI vs CLI based on argv

License

Apache-2.0. See LICENSE.

Acknowledgments

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

mynah_stt-0.1.0.tar.gz (39.9 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

mynah_stt-0.1.0-py3-none-any.whl (41.5 kB view details)

Uploaded Python 3

File details

Details for the file mynah_stt-0.1.0.tar.gz.

File metadata

  • Download URL: mynah_stt-0.1.0.tar.gz
  • Upload date:
  • Size: 39.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.13

File hashes

Hashes for mynah_stt-0.1.0.tar.gz
Algorithm Hash digest
SHA256 b2f54acde3e04f392c843d57304dd14032d3a2d6a450fccdf56603084401200c
MD5 b1e53ccd076253f60b0857ff14e4481f
BLAKE2b-256 d78cb89fdb87f799ccb4f1ae510d2e811df287d631cfe3c46a28844ceaa98e8b

See more details on using hashes here.

File details

Details for the file mynah_stt-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: mynah_stt-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 41.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.13

File hashes

Hashes for mynah_stt-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 f9cea0e0d813c746bcb70e360c1c402635de422198e20ffa783aef39686d560a
MD5 17c593e9108b836b186e906ff47bdd5f
BLAKE2b-256 f45f689923f86de15019ce67f54fa60954ec8442039b02ad33c73077215dbf6a

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page