TNT voice-to-text TUI with Qwen3-ASR on the Apple GPU via MLX
TNT 🧨
Terminal voice-to-text. Tap Space, speak, tap Space — your words land in the transcript and on the clipboard.
Qwen3-ASR-1.7B runs in-process on the Apple GPU via mlx-speech: the model loads once, stays resident, and transcribes a short take in about a second. Fully local — no cloud, no runtime network calls. The microphone is captured natively through AVFoundation by a small Swift helper process, so a misbehaving audio stack can never trap the mic: TNT just kills the helper and macOS releases it.
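The helper-process isolation described above can be sketched in a few lines of Python. This is a minimal illustration, not TNT's actual code: `HelperProcess` is a hypothetical name, and the child command is a stand-in Python sleeper rather than the real Swift helper. The point it shows is that whatever the child holds is released by the OS once the child is terminated or killed.

```python
import subprocess
import sys


class HelperProcess:
    """Owns a child process that would hold the microphone (hypothetical sketch)."""

    def __init__(self, argv):
        self.argv = argv
        self.proc = None

    def start(self):
        self.proc = subprocess.Popen(self.argv)

    def stop(self, grace_seconds=1.0):
        """Ask politely first, then kill; return the child's exit code."""
        if self.proc is None:
            return None
        self.proc.terminate()
        try:
            self.proc.wait(timeout=grace_seconds)
        except subprocess.TimeoutExpired:
            # The child ignored the request, so force it down.
            self.proc.kill()
            self.proc.wait()
        code = self.proc.returncode
        self.proc = None
        return code


# Stand-in for the real helper: a child that would otherwise run for a minute.
STAND_IN = [sys.executable, "-c", "import time; time.sleep(60)"]
```

Calling `stop()` on a started `HelperProcess(STAND_IN)` returns promptly even though the child was meant to sleep for a minute, which is the property the parent relies on to reclaim the device.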
[!NOTE] Using Termux on Android? Use the preserved legacy/android-termux-qwen0.6b branch instead of master. It is a legacy proot setup and may need device-specific fixes; validate it locally and adapt it with your own tools or agentic AI workflow.
git fetch origin
git switch --track origin/legacy/android-termux-qwen0.6b
Features
- In-process GPU inference — pure MLX, no PyTorch
- Resident model — loads once in the background at startup; every take is warm
- Native mic capture — AVFoundation via an isolated Swift helper process; the mic can always be reclaimed
- English, Chinese, and mixed speech — language auto-detected, or forced via env var
- Live braille oscilloscope — real audio levels while you record
- Clipboard-first — new transcriptions auto-copy; click any past entry to copy it again
- Responsive TUI — side-rail layout on wide terminals, stacked on narrow ones
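The "resident model" idea from the list above, load once in the background and reuse for every take, might look roughly like this. It is a sketch under assumptions: `ResidentModel` and the `loader` callable are hypothetical names, and no real MLX call is made here.

```python
import threading


class ResidentModel:
    """Loads a model once on a background thread and keeps it for later calls."""

    def __init__(self, loader):
        self._loader = loader  # hypothetical callable that returns the loaded model
        self._model = None
        self._error = None
        self._ready = threading.Event()

    def start(self):
        """Kick off loading without blocking the caller (for example, the UI)."""
        threading.Thread(target=self._load, daemon=True).start()

    def _load(self):
        try:
            self._model = self._loader()
        except Exception as exc:  # surface load failures to whoever calls get()
            self._error = exc
        finally:
            self._ready.set()

    def get(self, timeout=None):
        """Return the resident model, waiting for the initial load if needed."""
        if not self._ready.wait(timeout):
            raise TimeoutError("model is still loading")
        if self._error is not None:
            raise self._error
        return self._model
```

Every call to `get()` after the first load returns the same object, so later takes skip the load cost entirely.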
Setup
[!IMPORTANT] Requires an Apple Silicon Mac (M1 or later), Python 3.13+, uv, and the Xcode command line tools (xcode-select --install). The mic capture helper is compiled from Swift on first launch and cached.
git clone https://github.com/appautomaton/tnt-asr.git
cd tnt-asr
uv sync
./bootstrap-mlx-asr.sh /path/to/qwen3-asr-1.7b-bf16-mlx
uv run tnt
Or install from PyPI (automaton-tnt):
uv tool install automaton-tnt
TNT_MLX_MODEL=/path/to/qwen3-asr-1.7b-bf16-mlx tnt
(Instead of exporting TNT_MLX_MODEL, you can symlink the checkpoint at ~/.local/share/tnt/qwen3-asr-mlx.)
Model checkpoint
TNT expects a converted Qwen3-ASR-1.7B MLX checkpoint (BF16). A ready-to-use one is published at appautomaton/qwen3-asr-1.7b-bf16-mlx (~4.7 GB) — download it however you prefer, then point the bootstrap script at it:
./bootstrap-mlx-asr.sh /path/to/qwen3-asr-1.7b-bf16-mlx
This symlinks the checkpoint to bin/qwen3-asr-mlx and validates that the required files are present. Alternatively, convert the upstream Qwen/Qwen3-ASR-1.7B weights yourself with mlx-speech's scripts/convert/qwen3_asr.py.
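For readers who want to see what the symlink-and-validate step amounts to, here is a rough Python equivalent. It is not the bootstrap script itself: `link_checkpoint` is a hypothetical name, and the list of required files is an assumption (only `config.json` is checked here as a placeholder), since the script's real list is not stated above.

```python
from pathlib import Path

# Assumption: placeholder only. The real script's required-file list is not given.
REQUIRED_FILES = ("config.json",)


def link_checkpoint(checkpoint: Path, link: Path, required=REQUIRED_FILES) -> Path:
    """Validate a checkpoint directory, then point `link` at it."""
    checkpoint = Path(checkpoint).expanduser().resolve()
    missing = [name for name in required if not (checkpoint / name).is_file()]
    if missing:
        raise FileNotFoundError(f"{checkpoint} is missing: {', '.join(missing)}")

    link = Path(link)
    link.parent.mkdir(parents=True, exist_ok=True)
    if link.is_symlink():
        link.unlink()  # replace a stale link
    elif link.exists():
        # Refuse to clobber a real file or directory.
        raise FileExistsError(f"{link} exists and is not a symlink")
    link.symlink_to(checkpoint, target_is_directory=True)
    return link
```

Validating before linking means a bad path fails loudly at setup time instead of at the first transcription.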
Configuration
| Environment variable | Default | Description |
|---|---|---|
| TNT_MLX_MODEL | bin/qwen3-asr-mlx, else ~/.local/share/tnt/qwen3-asr-mlx | Path to the converted MLX checkpoint |
| TNT_MLX_LANGUAGE | auto | Chinese, English, or auto. Use Chinese to keep mixed Chinese/English speech from being translated to English |
| TNT_INPUT_DEVICE | system default | Microphone, by index or name |
| TNT_CAPTURE_BACKEND | auto | macOS always uses native AVFoundation (needs the Xcode command line tools: xcode-select --install); other platforms use PortAudio. portaudio is rejected on macOS |
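The rules in the table can be read as a small resolution routine. The sketch below is one plausible reading, not TNT's implementation: the function names are hypothetical, the "else" in the model default is interpreted as "fall back when bin/qwen3-asr-mlx does not exist", language values are treated as case-sensitive, and the returned backend label "avfoundation" is made up for illustration.

```python
import os
import sys
from pathlib import Path


def resolve_model_path(env=None, cwd=None, home=None) -> Path:
    """TNT_MLX_MODEL wins; otherwise try the two documented default locations."""
    env = os.environ if env is None else env
    cwd = Path.cwd() if cwd is None else Path(cwd)
    home = Path.home() if home is None else Path(home)
    explicit = env.get("TNT_MLX_MODEL")
    if explicit:
        return Path(explicit).expanduser()
    local = cwd / "bin" / "qwen3-asr-mlx"
    if local.exists():
        return local
    return home / ".local" / "share" / "tnt" / "qwen3-asr-mlx"


def resolve_language(env=None) -> str:
    """Accept only the three documented values (assumed case-sensitive)."""
    env = os.environ if env is None else env
    value = env.get("TNT_MLX_LANGUAGE", "auto")
    if value not in ("auto", "Chinese", "English"):
        raise ValueError(f"unsupported TNT_MLX_LANGUAGE: {value!r}")
    return value


def resolve_backend(env=None, platform=None) -> str:
    """macOS always gets the native backend; portaudio is rejected there."""
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform
    requested = env.get("TNT_CAPTURE_BACKEND", "auto")
    if platform == "darwin":
        if requested == "portaudio":
            raise ValueError("portaudio is rejected on macOS")
        return "avfoundation"  # hypothetical label for the native backend
    return "portaudio"
```

Passing `env`, `cwd`, `home`, and `platform` explicitly keeps the functions easy to test without touching the real environment.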
Keybindings
| Key | Action |
|---|---|
| Space | Start / stop recording, or hold to record until release; cancels during transcription |
| c | Copy the last transcript entry |
| mouse click | Copy the clicked transcript entry |
| x | Clear the transcript |
| q | Quit |
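The Space key's behavior implies a three-state machine. The following is a simplified sketch covering only tap mode (hold-to-record is left out); `State`, `TakeMachine`, and the `take_id` counter are hypothetical names, not taken from app.py.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    RECORDING = auto()
    TRANSCRIBING = auto()


class TakeMachine:
    """Tap-mode state machine: Space starts, stops, or cancels a take."""

    def __init__(self):
        self.state = State.IDLE
        self.take_id = 0  # bumped so results from finished or cancelled takes are ignored

    def press_space(self):
        if self.state is State.IDLE:
            self.state = State.RECORDING
        elif self.state is State.RECORDING:
            self.state = State.TRANSCRIBING
        else:
            # Cancel: invalidate the in-flight take and go back to idle.
            self.take_id += 1
            self.state = State.IDLE
        return self.state

    def finish(self, take_id, text):
        """Accept a result only if it belongs to the current, uncancelled take."""
        if self.state is not State.TRANSCRIBING or take_id != self.take_id:
            return None
        self.take_id += 1
        self.state = State.IDLE
        return text
```

A transcription worker would capture `take_id` when it starts and hand it back to `finish()`; a late result from a cancelled take then returns `None` and never reaches the transcript.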
Project structure
src/tnt/
├── app.py              # Textual TUI, state machine, keybindings
├── audio.py            # Recorder protocol, backend selection, PortAudio (non-macOS)
├── avf_audio.py        # Native AVFoundation capture via helper process (macOS)
├── mic_helper.swift    # AVFoundation helper source, compiled on demand
├── async_threads.py    # Daemon-thread helpers for blocking work
├── transcriber.py      # In-process MLX Qwen3-ASR transcription
└── widgets/
    ├── transcript.py   # Scrollable transcript log
    └── status.py       # Braille oscilloscope + state rail
bin/
└── qwen3-asr-mlx       # Symlink to converted MLX checkpoint (gitignored)
[!TIP] The inference path expects 16 kHz mono PCM WAV; the recorder produces exactly that. Cancelling a transcription abandons its result — the in-process generation cannot be killed mid-flight and quietly finishes in the background.
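If you feed the inference path audio from somewhere other than TNT's recorder, a quick format check with Python's standard `wave` module can catch mismatches early. This is an illustrative sketch: the helper names are hypothetical, and the 16-bit sample width is an assumption, since the tip above only specifies 16 kHz mono PCM.

```python
import wave


def is_16k_mono_pcm(path) -> bool:
    """True if the file is uncompressed 16 kHz mono WAV with 16-bit samples."""
    with wave.open(str(path), "rb") as wav:
        return (
            wav.getframerate() == 16000
            and wav.getnchannels() == 1
            and wav.getsampwidth() == 2  # assumption: 16-bit samples
            and wav.getcomptype() == "NONE"
        )


def write_16k_mono_pcm(path, samples: bytes) -> None:
    """Write raw little-endian 16-bit samples as a 16 kHz mono WAV file."""
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(16000)
        wav.writeframes(samples)
```

One second of silence is `bytes(32000)` (16000 samples at 2 bytes each), which makes a convenient fixture for trying both helpers.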
Related projects
- mlx-speech — our MLX-native speech runtime that powers TNT (PyPI)
- qwen3-asr-1.7b-bf16-mlx — our BF16 MLX checkpoint that TNT runs (converted from Qwen3-ASR-1.7B); more on Hugging Face
License
MIT. See LICENSE.
Download files
File details
Details for the file automaton_tnt-0.1.1.tar.gz.
File metadata
- Download URL: automaton_tnt-0.1.1.tar.gz
- Upload date:
- Size: 53.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 02e14e2f18898f757d004bcfe1c9fd78f3aff3f5b70b92aac758ffdcb98a80d0 |
| MD5 | a45271b1dfa29b80e1ea2aabf2c8e061 |
| BLAKE2b-256 | a9a6e50624414ba637a5c030816d801abfb845babef5383eb65050a9a3650bf4 |
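To check a downloaded file against the published SHA256 digest, something like the following should work. It uses only the standard library; `sha256_of` and `matches` are hypothetical helper names, and the file path is whatever you saved the download as.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20) -> str:
    """Hash a file in chunks so large downloads do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex: str) -> bool:
    """Compare the file's digest with a published one, ignoring case and spaces."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For the source distribution, `matches("automaton_tnt-0.1.1.tar.gz", "02e14e2f18898f757d004bcfe1c9fd78f3aff3f5b70b92aac758ffdcb98a80d0")` should return `True` for an intact download.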
Provenance
The following attestation bundles were made for automaton_tnt-0.1.1.tar.gz:
Publisher: workflow.yaml on appautomaton/tnt-asr
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: automaton_tnt-0.1.1.tar.gz
- Subject digest: 02e14e2f18898f757d004bcfe1c9fd78f3aff3f5b70b92aac758ffdcb98a80d0
- Sigstore transparency entry: 1792508915
- Sigstore integration time:
- Permalink: appautomaton/tnt-asr@46e8ebb65826c4541c64532b005864931ad102c4
- Branch / Tag: refs/tags/v0.1.1
- Owner: https://github.com/appautomaton
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: workflow.yaml@46e8ebb65826c4541c64532b005864931ad102c4
- Trigger Event: push
File details
Details for the file automaton_tnt-0.1.1-py3-none-any.whl.
File metadata
- Download URL: automaton_tnt-0.1.1-py3-none-any.whl
- Upload date:
- Size: 28.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 05f306746f765e63384691dd1e3012e4ef42536c3449f159437ba3d9fc5f0f18 |
| MD5 | 4c63b0edc683bee148d804d3bbb464b0 |
| BLAKE2b-256 | d8b7e59e21748384d5c88d4979f8875d254ffd316ca277339c36b4908ceb230c |
Provenance
The following attestation bundles were made for automaton_tnt-0.1.1-py3-none-any.whl:
Publisher: workflow.yaml on appautomaton/tnt-asr
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: automaton_tnt-0.1.1-py3-none-any.whl
- Subject digest: 05f306746f765e63384691dd1e3012e4ef42536c3449f159437ba3d9fc5f0f18
- Sigstore transparency entry: 1792509063
- Sigstore integration time:
- Permalink: appautomaton/tnt-asr@46e8ebb65826c4541c64532b005864931ad102c4
- Branch / Tag: refs/tags/v0.1.1
- Owner: https://github.com/appautomaton
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: workflow.yaml@46e8ebb65826c4541c64532b005864931ad102c4
- Trigger Event: push