# hark 😇

100% offline, Whisper-powered voice notes from your terminal.
## Use Cases

- **Voice-to-LLM pipelines** — `hark | llm` turns speech into AI prompts instantly
- **Meeting minutes** — Transcribe calls with speaker identification (`--diarize`)
- **System audio capture** — Record what you hear, not just what you say (`--input speaker`)
- **Private by design** — No cloud, no API keys, no data leaves your machine
## Features

- 🎙️ **Record** - Press space to start, Ctrl+C to stop
- 🔊 **Multi-source** - Capture microphone, system audio, or both
- ✨ **Transcribe** - Powered by faster-whisper
- 🗣️ **Diarize** - Identify who said what with WhisperX
- 🔒 **Local** - 100% offline, no cloud required
- 📄 **Flexible** - Output as plain text, markdown, or SRT subtitles
## Installation

```shell
pipx install hark-cli
```
### System Dependencies

Ubuntu/Debian:

```shell
sudo apt install portaudio19-dev
```

macOS:

```shell
brew install portaudio
```
### Optional: Vulkan Acceleration

For GPU-accelerated transcription via Vulkan (AMD/Intel GPUs):

Ubuntu/Debian:

```shell
sudo apt install libvulkan1 vulkan-tools mesa-vulkan-drivers
```

Then set the device in your config or use `--device vulkan`.
## Quick Start

```shell
# Record and print to stdout
hark

# Save to file
hark notes.txt

# Use larger model for better accuracy
hark --model large-v3 meeting.md

# Transcribe in German
hark --lang de notes.txt

# Output as SRT subtitles
hark --format srt captions.srt

# Capture system audio (e.g., online meetings)
hark --input speaker meeting.txt

# Capture both microphone and system audio (stereo: L=mic, R=speaker)
hark --input both conversation.txt
```
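The SRT subtitles mentioned above follow the standard SubRip layout: a numbered block with a `HH:MM:SS,mmm` time range per segment. A minimal sketch of how timed segments map onto that format (the segment tuples here are illustrative, not hark's internal types):

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms_total = int(round(seconds * 1000))
    s, ms = divmod(ms_total, 1000)
    h, rem = divmod(s, 3600)
    m, s = divmod(rem, 60)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def to_srt(segments) -> str:
    """Render (start, end, text) tuples as an SRT subtitle document."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, 1):
        blocks.append(f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n")
    return "\n".join(blocks)

print(to_srt([(0.0, 2.5, "Hello everyone."), (2.5, 5.0, "Let's get started.")]))
```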
## Configuration

Hark uses a YAML config file at `~/.config/hark/config.yaml`. CLI flags override config file settings.

```yaml
# ~/.config/hark/config.yaml
recording:
  sample_rate: 16000
  channels: 1           # Use 2 for --input both
  max_duration: 600
  input_source: mic     # mic, speaker, or both

whisper:
  model: base           # tiny, base, small, medium, large, large-v2, large-v3
  language: auto        # auto, en, de, fr, es, ...
  device: auto          # auto, cpu, cuda, vulkan

preprocessing:
  noise_reduction:
    enabled: true
    strength: 0.5       # 0.0-1.0
  normalization:
    enabled: true
  silence_trimming:
    enabled: true

output:
  format: plain         # plain, markdown, srt
  timestamps: false

diarization:
  hf_token: null            # HuggingFace token (required for --diarize)
  local_speaker_name: null  # Your name in stereo mode, or null for SPEAKER_00
```
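The precedence rule above ("CLI flags override config file settings") amounts to a shallow merge in which only flags the user actually passed win, while unset flags fall back to the file. A hypothetical sketch of that behavior (function and flag names are illustrative, not hark's internals):

```python
def effective_settings(config: dict, cli_flags: dict) -> dict:
    """Merge config-file values with CLI flags; flags left unset (None) fall through."""
    merged = dict(config)
    merged.update({key: value for key, value in cli_flags.items() if value is not None})
    return merged

# `--model large-v3` was passed; language and device were not.
settings = effective_settings(
    {"model": "base", "language": "auto", "device": "auto"},
    {"model": "large-v3", "language": None, "device": None},
)
# settings["model"] comes from the flag; the rest keep their config values.
```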
## Audio Input Sources

Hark supports three input modes via `--input` or `recording.input_source`:

| Mode | Description |
|---|---|
| `mic` | Microphone only (default) |
| `speaker` | System audio only (loopback capture) |
| `both` | Microphone + system audio as stereo (L=mic, R=speaker) |
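In `both` mode the two sources land on separate stereo channels, so downstream tooling can recover each track by de-interleaving the samples. A minimal pure-Python sketch of that convention (illustrative only, not hark's code):

```python
def split_stereo(interleaved):
    """De-interleave stereo samples [L, R, L, R, ...] into the two tracks.

    Per the convention above, the left channel carries the microphone and
    the right channel carries the system audio (speaker).
    """
    mic = interleaved[0::2]
    speaker = interleaved[1::2]
    return mic, speaker

mic, speaker = split_stereo([1, 10, 2, 20, 3, 30])
# mic == [1, 2, 3], speaker == [10, 20, 30]
```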
### System Audio Capture (Linux)

System audio capture uses PulseAudio/PipeWire monitor sources. To verify that your system supports it:

```shell
pactl list sources | grep -i monitor
```

You should see output like:

```
Name: alsa_output.pci-0000_00_1f.3.analog-stereo.monitor
Description: Monitor of Built-in Audio
```
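If you want to pick a monitor source programmatically rather than eyeball the `pactl` output, the `Name:` lines are easy to scrape. A small sketch (this helper is illustrative and not part of hark):

```python
def find_monitor_sources(pactl_output: str) -> list:
    """Return monitor source names found in `pactl list sources` output."""
    names = []
    for line in pactl_output.splitlines():
        if "Name:" in line:
            name = line.split("Name:", 1)[1].strip()
            if name.endswith(".monitor"):
                names.append(name)
    return names

sample = """\
Name: alsa_output.pci-0000_00_1f.3.analog-stereo.monitor
Description: Monitor of Built-in Audio
"""
find_monitor_sources(sample)
# ['alsa_output.pci-0000_00_1f.3.analog-stereo.monitor']
```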
## Speaker Diarization

Identify who said what in multi-speaker recordings using WhisperX.

### Setup

1. Install diarization dependencies:

   ```shell
   pipx inject hark-cli whisperx
   # Or with pip: pip install hark-cli[diarization]
   ```

2. Get a HuggingFace token (required for pyannote models):

   - Create an account at https://huggingface.co
   - Accept the required model licenses
   - Create a token at https://huggingface.co/settings/tokens

3. Add the token to your config:

   ```yaml
   # ~/.config/hark/config.yaml
   diarization:
     hf_token: "hf_xxxxxxxxxxxxx"
   ```
### Usage

The `--diarize` flag enables speaker identification. It requires `--input speaker` or `--input both`.

```shell
# Transcribe a meeting with speaker identification
hark --diarize --input speaker meeting.txt

# Specify expected number of speakers (improves accuracy)
hark --diarize --speakers 3 --input speaker meeting.md

# Skip interactive speaker naming for batch processing
hark --diarize --no-interactive --input speaker meeting.txt

# Stereo mode: separate local user from remote speakers
hark --diarize --input both conversation.md

# Combine with other options
hark --diarize --input speaker --format markdown --model large-v3 meeting.md
```

| Flag | Description |
|---|---|
| `--diarize` | Enable speaker identification |
| `--speakers N` | Hint for expected speaker count (improves clustering) |
| `--no-interactive` | Skip post-transcription speaker naming prompt |

Note: Diarization adds processing time. For a 5-minute recording, expect ~1-2 minutes on GPU or ~5-10 minutes on CPU.
### Output Format

With diarization enabled, output includes speaker labels and timestamps.

Plain text:

```
[00:02] [SPEAKER_01] Hello everyone, let's get started.
[00:05] [SPEAKER_02] Thanks for joining. Let me share my screen.
```

Markdown:

```markdown
# Meeting Transcript

**SPEAKER_01** (00:02)
Hello everyone, let's get started.

**SPEAKER_02** (00:05)
Thanks for joining. Let me share my screen.

---

_2 speakers detected • Duration: 5:23 • Language: en (98% confidence)_
```
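The plain-text lines above are regular enough to post-process with a one-line pattern. A sketch of parsing them back into (seconds, speaker, text) tuples (the helper is illustrative, not a hark API):

```python
import re

# Matches lines of the form "[MM:SS] [SPEAKER] text"
LINE = re.compile(r"\[(\d{2}):(\d{2})\] \[([^\]]+)\] (.*)")

def parse_line(line: str):
    """Parse one diarized plain-text line, or return None if it doesn't match."""
    match = LINE.match(line)
    if match is None:
        return None
    minutes, seconds, speaker, text = match.groups()
    return int(minutes) * 60 + int(seconds), speaker, text

parse_line("[00:05] [SPEAKER_02] Thanks for joining. Let me share my screen.")
# (5, 'SPEAKER_02', 'Thanks for joining. Let me share my screen.')
```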
### Interactive Naming

After transcription, hark will prompt you to identify speakers:

```
Detected 2 speaker(s) to identify.

SPEAKER_01 said: "Hello everyone, let's get started."
Who is this? [name/skip/done]: Alice

SPEAKER_02 said: "Thanks for joining. Let me share my screen."
Who is this? [name/skip/done]: Bob
```

Use `--no-interactive` to skip this prompt.
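The names collected at this prompt replace the `SPEAKER_XX` labels in the final output. The substitution itself is a simple label-for-name swap, sketched here for the plain-text format (illustrative, not hark's actual code):

```python
def apply_speaker_names(transcript: str, names: dict) -> str:
    """Swap bracketed diarization labels for the names entered at the prompt."""
    for label, name in names.items():
        transcript = transcript.replace(f"[{label}]", f"[{name}]")
    return transcript

named = apply_speaker_names(
    "[00:02] [SPEAKER_01] Hello everyone, let's get started.",
    {"SPEAKER_01": "Alice"},
)
# named == "[00:02] [Alice] Hello everyone, let's get started."
```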
## Known Issues

Slow diarization? The pyannote models may default to CPU inference. For GPU acceleration:

```shell
pip install --force-reinstall onnxruntime-gpu
```

See WhisperX #499 for details.
## Development

```shell
git clone https://github.com/FPurchess/hark.git
cd hark
uv sync --extra test
uv run pre-commit install
uv run pytest
```
## License