micoracle

Hands-free voice input for Claude Code, Codex CLI, and any terminal — cross-platform

Stop typing your AI prompts. Just say them.

Hands-free voice input for Claude Code, Codex CLI, and any terminal — macOS · Linux · Windows

Say "Micoracle, refactor this function" → transcribed → pasted into your terminal → Enter pressed. No push-to-talk. No cloud required.

Demo

https://github.com/user-attachments/assets/8ab4fc80-8557-4b4e-9149-d6dfad434f70

Quick Install

git clone https://github.com/thepradip/micoracle.git
cd micoracle
pip install -r requirements.txt

Then pick your platform and run:

./run_hands_free.sh        # macOS / Linux
run_hands_free.bat         # Windows

Need a specific STT backend? Jump to the full install guide below.

Why micoracle?

Without micoracle	With micoracle
Stop → think → type prompt → Enter	Say the prompt. Done.
Push-to-talk or browser extension	Always-on wake-word listener
Cloud-only transcription	100% offline on Apple Silicon & CPU
Locked to one tool	Works with any terminal app

Works with Claude Code · OpenAI Codex CLI · OpenCode · iTerm2 · Warp · VS Code terminal · Windows Terminal

Features

	Feature	Detail
🌐	Cross-platform	Auto-selects macOS (AppleScript), Linux (xdotool / wtype), or Windows (pywin32 + pyautogui)
🎙️	11 STT backends	MLX Whisper · faster-whisper · OpenAI · Azure · OpenAI Realtime · 60dB · ElevenLabs · Deepgram · AssemblyAI · Groq · Gladia
🔊	4 TTS backends	macOS `say` · pyttsx3 · OpenAI TTS · Azure Speech TTS
🔉	Continuous listening	WebRTC VAD + 300 ms preroll buffer — wake words are never clipped at onset
💬	Wake-word gate	`"Claude, …"` / `"Codex, …"` / `"Micoracle, …"` with fuzzy mishear tolerance
⏱️	Two-step follow-up	Say wake word alone → hear "listening" → speak prompt within 8 s
💰	Cost-guard	Cloud STT backends activate only after wake-word — continuous listening is always local & free
🚫	Hallucination filter	Whisper artifacts like "Thank you." / "Amen." silently dropped
🔒	Target-aware dispatch	macOS / Windows reactivate the startup target; Linux dispatches to focused window
📋	Clipboard-conscious	Original clipboard contents restored immediately after each dispatch
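
The hallucination filter row above can be pictured with a small sketch. This is not the project's code: the phrase set holds only the two examples named in the table, and `normalize`, `is_hallucination`, and `HALLUCINATED_PHRASES` are hypothetical names chosen for illustration.

```python
import string

# Hypothetical phrase list: only the two examples from the feature table.
# The project's real list is not shown in this README.
HALLUCINATED_PHRASES = {"thank you", "amen"}


def normalize(text: str) -> str:
    """Lowercase the text and trim surrounding whitespace and punctuation."""
    return text.strip().strip(string.punctuation + " ").lower()


def is_hallucination(transcript: str) -> bool:
    """Return True when the whole transcript is a known Whisper artifact."""
    return normalize(transcript) in HALLUCINATED_PHRASES
```

Matching the whole normalized utterance, rather than searching for a substring, keeps a real prompt such as "Thank you for the refactor" from being dropped.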

STT Backends

Local (free, offline)

Backend	`--stt-backend`	Best for	Install
MLX Whisper	`mlx`	Apple Silicon — fastest on-device	`pip install mlx-whisper`
faster-whisper	`faster`	Cross-platform CPU / CUDA	`pip install faster-whisper`

Cloud (post-wake-word only — never billed for continuous listening)

Backend	`--command-stt-backend`	Latency	Extra install	Key env var
OpenAI Whisper	`openai`	~1 s	`pip install openai`	`OPENAI_API_KEY`
Azure Whisper	`azure`	~1 s	`pip install openai`	`AZURE_OPENAI_KEY`
OpenAI Realtime	`realtime`	~600 ms	`pip install openai websockets`	`OPENAI_API_KEY`
60dB.ai	`60db`	~600 ms	(none — stdlib only)	`SIXTYDB_API_KEY`
ElevenLabs Scribe	`elevenlabs`	~400 ms	(none — stdlib only)	`ELEVENLABS_API_KEY`
Deepgram Nova-2	`deepgram`	~250 ms	(none — stdlib only)	`DEEPGRAM_API_KEY`
Groq Whisper	`groq`	~200 ms	(none — stdlib only)	`GROQ_API_KEY`
AssemblyAI	`assemblyai`	~3–5 s	(none — stdlib only)	`ASSEMBLYAI_API_KEY`
Gladia	`gladia`	~3–5 s	(none — stdlib only)	`GLADIA_API_KEY`

Cost-guard rule: only `mlx`, `faster`, and `auto` are allowed for continuous listening. A cloud backend passed as `--stt-backend` is automatically demoted to `--command-stt-backend`, and a local backend handles the mic stream instead.
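
As a rough illustration of the cost-guard rule, the sketch below resolves the two backend roles from the two flag values. The function name `resolve_backends` is hypothetical, and two behaviours are assumptions rather than documented facts: the listening role falls back to `auto` after a demotion, and an explicitly chosen command backend wins over the demoted one.

```python
from typing import Optional, Tuple

LOCAL_BACKENDS = {"auto", "mlx", "faster"}
CLOUD_BACKENDS = {
    "openai", "azure", "realtime", "60db", "elevenlabs",
    "deepgram", "groq", "assemblyai", "gladia",
}


def resolve_backends(
    stt_backend: str = "auto",
    command_stt_backend: Optional[str] = None,
) -> Tuple[str, str]:
    """Return (listening_backend, command_backend) after the cost-guard."""
    if stt_backend in CLOUD_BACKENDS:
        # Demote the cloud backend to the command role (assumption: an
        # explicit command backend, if given, takes precedence).
        if command_stt_backend is None:
            command_stt_backend = stt_backend
        stt_backend = "auto"  # assumption: fall back to automatic local choice
    elif stt_backend not in LOCAL_BACKENDS:
        raise ValueError(f"unknown STT backend: {stt_backend!r}")
    if command_stt_backend is None:
        # The CLI reference gives "same as --stt-backend" as the default.
        command_stt_backend = stt_backend
    return stt_backend, command_stt_backend
```

With this sketch, `resolve_backends("groq")` yields `("auto", "groq")`, so the microphone stream never reaches a paid API.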

Wake Words

Say	Example
`Claude, …`	"Claude, explain this function"
`Codex, …`	"Codex, refactor to async"
`Micoracle, …`	"Micoracle, write a SQL query"

All three support fuzzy mishear tolerance: common STT mishearings such as "Mic Oracle", "Mick Oracle", "meek oracle", and "Lord" (for "Claude") are matched automatically.
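
One way such fuzzy matching could work is sketched below with the standard library's `difflib`. The alias table contains only the variants listed above, the 0.8 threshold is an assumed value, and `match_wake_word` is a hypothetical name; the project's actual matcher may differ.

```python
import re
from difflib import SequenceMatcher
from typing import Optional, Tuple

# Canonical wake word -> spoken variants (illustrative, taken from the README).
WAKE_ALIASES = {
    "claude": ["claude", "lord"],
    "codex": ["codex"],
    "micoracle": ["micoracle", "mic oracle", "mick oracle", "meek oracle"],
}
THRESHOLD = 0.8  # assumed similarity cut-off


def match_wake_word(transcript: str) -> Optional[Tuple[str, str]]:
    """Return (canonical wake word, remaining prompt) or None if no match."""
    tokens = list(re.finditer(r"[A-Za-z0-9']+", transcript))
    # Try a one-word prefix first, then a two-word prefix ("mick oracle").
    for n in (1, 2):
        if len(tokens) < n:
            continue
        spoken = " ".join(t.group().lower() for t in tokens[:n])
        for canonical, aliases in WAKE_ALIASES.items():
            for alias in aliases:
                if alias.count(" ") + 1 != n:
                    continue  # compare only aliases with the same word count
                if SequenceMatcher(None, spoken, alias).ratio() >= THRESHOLD:
                    rest = transcript[tokens[n - 1].end():]
                    return canonical, rest.lstrip(" ,.:;!?-").strip()
    return None
```

An empty prompt in the result corresponds to the wake word being spoken alone.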

Two-step mode: say the wake word alone → hear "listening" → speak your command within 8 s.
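
The two-step window can be modelled as a small timer-backed gate. The sketch below is an assumption about how such a gate might be structured, not the project's implementation; `FollowUpGate` and its methods are hypothetical, and only the 8 s window comes from the text above.

```python
import time
from typing import Callable, Optional

FOLLOW_UP_WINDOW_S = 8.0  # window length stated in the README


class FollowUpGate:
    """Tracks whether a bare wake word is still waiting for its prompt."""

    def __init__(self, window_s: float = FOLLOW_UP_WINDOW_S,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._window_s = window_s
        self._clock = clock
        self._armed_at: Optional[float] = None

    def handle(self, wake_detected: bool, prompt: str) -> Optional[str]:
        """Return the prompt to dispatch, or None if nothing should be sent."""
        prompt = prompt.strip()
        if wake_detected:
            if prompt:
                self._armed_at = None
                return prompt  # one-shot: wake word and prompt together
            self._armed_at = self._clock()  # caller would play "listening"
            return None
        armed = (self._armed_at is not None
                 and self._clock() - self._armed_at <= self._window_s)
        self._armed_at = None if not armed else self._armed_at
        if armed and prompt:
            self._armed_at = None
            return prompt  # follow-up arrived inside the window
        return None
```

Injecting the clock keeps the gate testable without real waiting.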

Platform & Backend Matrix

Platform	STT (listening)	STT (command)	TTS	Focus & paste
macOS Apple Silicon	`mlx`	your choice	`say`	AppleScript
macOS Intel	`faster`	your choice	`say`	AppleScript
Linux X11	`faster`	your choice	`pyttsx3`	`xdotool type`
Linux Wayland	`faster`	your choice	`pyttsx3`	`wtype` + `wl-copy`
Windows 10/11	`faster`	your choice	`pyttsx3`	pywin32 + pyautogui
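
The matrix above could be expressed as a lookup keyed on the running platform. The following sketch is illustrative only: `default_profile` is a hypothetical helper, and detecting Apple Silicon via `platform.machine() == "arm64"` and Wayland via the `WAYLAND_DISPLAY` variable are assumptions about how detection might be done.

```python
import os
import platform
from typing import Dict, Mapping, Optional


def default_profile(system: Optional[str] = None,
                    machine: Optional[str] = None,
                    env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Pick listening STT, TTS, and dispatch method for the current platform."""
    system = system or platform.system()
    machine = machine or platform.machine()
    env = os.environ if env is None else env
    if system == "Darwin":
        stt = "mlx" if machine == "arm64" else "faster"
        return {"stt": stt, "tts": "say", "dispatch": "applescript"}
    if system == "Linux":
        wayland = bool(env.get("WAYLAND_DISPLAY"))  # assumed detection method
        dispatch = "wtype" if wayland else "xdotool"
        return {"stt": "faster", "tts": "pyttsx3", "dispatch": dispatch}
    if system == "Windows":
        return {"stt": "faster", "tts": "pyttsx3",
                "dispatch": "pywin32+pyautogui"}
    raise RuntimeError(f"unsupported platform: {system}")
```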

Install

Step 1 — Core dependencies (all platforms)

git clone https://github.com/thepradip/micoracle.git
cd micoracle
pip install -r requirements.txt

Step 2 — System packages

macOS:

brew install portaudio

Linux (X11):

sudo apt install xdotool portaudio19-dev python3-dev

Linux (Wayland):

sudo apt install wtype wl-clipboard portaudio19-dev python3-dev

Step 3 — Pick a local STT backend (for continuous listening)

Platform	Command
macOS Apple Silicon	`pip install mlx-whisper`
macOS Intel / Linux / Windows	`pip install faster-whisper`

Step 4 — Pick a cloud STT backend (for commands, optional)

Backend	Command	Notes
OpenAI Whisper	`pip install openai`	Set `OPENAI_API_KEY`
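Azure Whisper	`pip install openai`	Set `AZURE_OPENAI_ENDPOINT` and `AZURE_OPENAI_KEY` (plus `AZURE_WHISPER_DEPLOYMENT` if the deployment is not named `whisper`)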
OpenAI Realtime	`pip install openai websockets`	Set `OPENAI_API_KEY`
60dB.ai	(none)	Set `SIXTYDB_API_KEY`
ElevenLabs Scribe	(none)	Set `ELEVENLABS_API_KEY`
Deepgram Nova-2	(none)	Set `DEEPGRAM_API_KEY`
Groq Whisper	(none)	Set `GROQ_API_KEY`
AssemblyAI	(none)	Set `ASSEMBLYAI_API_KEY`
Gladia	(none)	Set `GLADIA_API_KEY`

Step 5 — Pick a TTS backend (optional, for status cues)

Backend	Best for	Install
`say`	macOS (built-in)	nothing
`pyttsx3`	Linux / Windows offline	`pip install pyttsx3`; on Linux also `sudo apt install espeak`
`openai`	Cloud (OpenAI TTS)	`pip install openai`
`azure`	Cloud (Azure Speech)	Set `AZURE_SPEECH_KEY` and `AZURE_SPEECH_REGION`

Step 6 — Windows dispatch packages

pip install pyperclip pyautogui pywin32 psutil

Step 7 — Configure

cp .env.example .env

Recommended .env for Apple Silicon + 60dB commands:

VOICE_AGENT_STT_BACKEND=mlx
VOICE_AGENT_COMMAND_STT_BACKEND=60db
SIXTYDB_API_KEY=sk_live_...

Recommended .env for Apple Silicon + Groq commands (fastest):

VOICE_AGENT_STT_BACKEND=mlx
VOICE_AGENT_COMMAND_STT_BACKEND=groq
GROQ_API_KEY=gsk_...
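
Before launching, it can help to confirm that the key for the chosen command backend is actually present. The sketch below parses `.env`-style text with the standard library and reports the missing key; `parse_env` and `missing_key` are hypothetical helpers, and the project may load its configuration differently. The key names are copied from the cloud backend table.

```python
from typing import Dict, Optional

# Key variable per cloud command backend, copied from the cloud backend table.
REQUIRED_KEY = {
    "openai": "OPENAI_API_KEY",
    "azure": "AZURE_OPENAI_KEY",
    "realtime": "OPENAI_API_KEY",
    "60db": "SIXTYDB_API_KEY",
    "elevenlabs": "ELEVENLABS_API_KEY",
    "deepgram": "DEEPGRAM_API_KEY",
    "groq": "GROQ_API_KEY",
    "assemblyai": "ASSEMBLYAI_API_KEY",
    "gladia": "GLADIA_API_KEY",
}


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_key(values: Dict[str, str]) -> Optional[str]:
    """Return the unset key variable for the command backend, if any."""
    backend = values.get("VOICE_AGENT_COMMAND_STT_BACKEND", "")
    key = REQUIRED_KEY.get(backend)
    return key if key and not values.get(key) else None
```

Local backends have no entry in the table, so `missing_key` returns `None` for them.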

Quickstart

# Focus Claude Code, Codex CLI, or any terminal — then launch:
./run_hands_free.sh          # macOS / Linux
run_hands_free.bat           # Windows

One-shot: "Micoracle, write a Python hello world." → pasted with Enter.

Two-step: "Micoracle." → hear "listening" → say prompt within 8 s → pasted.

Override backends at launch:

./run_hands_free.sh --stt-backend mlx --command-stt-backend groq

Pin to a specific app (required on Wayland):

./run_hands_free.sh --target-app gnome-terminal

CLI Reference

Flag	Default	Description
`--device <id|name>`	system default mic	Audio input device
`--list-devices`	—	Print available input devices and exit
`--target-app <name>`	frontmost app at startup	Lock the dispatch target
`--stt-backend`	`auto`	Local STT for continuous listening: `auto` / `mlx` / `faster`
`--command-stt-backend`	same as `--stt-backend`	Cloud STT for commands after wake-word: `openai` / `azure` / `realtime` / `60db` / `elevenlabs` / `deepgram` / `groq` / `assemblyai` / `gladia`
`--tts-backend`	`auto`	`auto` / `say` / `pyttsx3` / `openai` / `azure` / `none`
`--no-speak`	—	Alias for `--tts-backend none`
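
For readers who want to see how the flags in this table fit together, here is a minimal `argparse` sketch that mirrors them. It is a reconstruction from the table, not the project's parser: `build_parser` and the `prog` name are hypothetical, and accepting cloud names for `--stt-backend` reflects the cost-guard rule rather than a confirmed choice list.

```python
import argparse

LOCAL = ["auto", "mlx", "faster"]
CLOUD = ["openai", "azure", "realtime", "60db", "elevenlabs",
         "deepgram", "groq", "assemblyai", "gladia"]
TTS = ["auto", "say", "pyttsx3", "openai", "azure", "none"]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the flags listed in the CLI reference table."""
    parser = argparse.ArgumentParser(prog="hands_free_voice")  # hypothetical
    parser.add_argument("--device", default=None,
                        help="input device id or name fragment")
    parser.add_argument("--list-devices", action="store_true",
                        help="print available input devices and exit")
    parser.add_argument("--target-app", default=None,
                        help="lock the dispatch target")
    parser.add_argument("--stt-backend", default="auto",
                        choices=LOCAL + CLOUD)
    # None stands for "same as --stt-backend" and is resolved later.
    parser.add_argument("--command-stt-backend", default=None,
                        choices=LOCAL + CLOUD)
    parser.add_argument("--tts-backend", default="auto", choices=TTS)
    # --no-speak is an alias: it writes "none" into the same destination.
    parser.add_argument("--no-speak", dest="tts_backend",
                        action="store_const", const="none",
                        default=argparse.SUPPRESS)
    return parser
```

Using `default=argparse.SUPPRESS` on the alias ensures it only changes `tts_backend` when the flag is actually passed.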

Environment Variables

See .env.example for the full commented list.

Core

Variable	Purpose
`VOICE_AGENT_STT_BACKEND`	Local STT for listening (`auto` / `mlx` / `faster`)
`VOICE_AGENT_COMMAND_STT_BACKEND`	Cloud STT for commands (`60db` / `groq` / `deepgram` / `elevenlabs` / `assemblyai` / `gladia` / `openai` / `azure` / `realtime`)
`VOICE_AGENT_TTS_BACKEND`	TTS for status cues (`auto` / `say` / `pyttsx3` / `openai` / `azure` / `none`)
`VOICE_AGENT_TARGET_APP`	Default dispatch target app name
`VOICE_AGENT_INPUT_DEVICE`	Default microphone device (name fragment or numeric id)

Local STT knobs

Variable	Purpose
`VOICE_AGENT_MLX_REPO`	MLX Whisper HuggingFace repo (Apple Silicon)
`VOICE_AGENT_FASTER_MODEL`	faster-whisper model (`tiny.en` / `base.en` / `small.en` / `medium.en` / `large-v3`)
`VOICE_AGENT_FASTER_DEVICE`	faster-whisper device (`auto` / `cpu` / `cuda`)
`VOICE_AGENT_FASTER_COMPUTE`	faster-whisper compute type (`int8` / `float16` / `int8_float16`)

Cloud STT keys & options

Variable	Backend	Purpose
`OPENAI_API_KEY`	`openai` / `realtime`	OpenAI API key
`VOICE_AGENT_OPENAI_STT_MODEL`	`openai`	Model name (default: `whisper-1`)
`VOICE_AGENT_REALTIME_MODEL`	`realtime`	Realtime model (default: `gpt-4o-transcribe`)
`AZURE_OPENAI_ENDPOINT`	`azure`	Azure OpenAI endpoint URL
`AZURE_OPENAI_KEY`	`azure`	Azure OpenAI key
`AZURE_WHISPER_DEPLOYMENT`	`azure`	Deployment name (default: `whisper`)
`SIXTYDB_API_KEY`	`60db`	60dB.ai API key
`VOICE_AGENT_SIXTYDB_LANGUAGE`	`60db`	Language code (default: `en`)
`ELEVENLABS_API_KEY`	`elevenlabs`	ElevenLabs API key
`VOICE_AGENT_ELEVENLABS_MODEL`	`elevenlabs`	Model (default: `scribe_v2`)
`VOICE_AGENT_ELEVENLABS_LANGUAGE`	`elevenlabs`	Language code (default: `en`)
`DEEPGRAM_API_KEY`	`deepgram`	Deepgram API key
`VOICE_AGENT_DEEPGRAM_MODEL`	`deepgram`	Model (default: `nova-2`)
`VOICE_AGENT_DEEPGRAM_LANGUAGE`	`deepgram`	Language code (default: `en`)
`ASSEMBLYAI_API_KEY`	`assemblyai`	AssemblyAI API key
`VOICE_AGENT_ASSEMBLYAI_LANGUAGE`	`assemblyai`	Language code (default: `en`)
`GROQ_API_KEY`	`groq`	Groq API key
`VOICE_AGENT_GROQ_MODEL`	`groq`	Model (default: `whisper-large-v3-turbo`)
`VOICE_AGENT_GROQ_LANGUAGE`	`groq`	Language code (default: `en`)
`GLADIA_API_KEY`	`gladia`	Gladia API key

TTS keys & options

Variable	Purpose
`VOICE_AGENT_TTS_VOICE`	macOS `say` voice name (e.g. `Samantha`)
`VOICE_AGENT_OPENAI_TTS_VOICE`	OpenAI TTS voice (`alloy` / `echo` / `fable` / `onyx` / `nova` / `shimmer`)
`AZURE_SPEECH_KEY`	Azure Speech TTS key
`AZURE_SPEECH_REGION`	Azure Speech TTS region (e.g. `eastus`)
`VOICE_AGENT_AZURE_TTS_VOICE`	Azure TTS voice (default: `en-US-AriaNeural`)
`HF_HUB_ENABLE_HF_TRANSFER`	Set to `1` for faster HuggingFace model downloads (affects local STT model downloads, not TTS)

Architecture

How it works

1. You speak a command, e.g. "Micoracle, refactor this function".
2. micoracle listens for real speech: WebRTC VAD ignores background noise.
3. The wake word is checked locally: local STT transcribes the utterance, and only utterances addressed to Claude, Codex, or Micoracle pass the gate.
4. Command STT fires: if a cloud backend is configured, it re-transcribes the utterance for higher accuracy (the paid API is called only here).
5. The clean prompt is sent: it is pasted into the target app and Enter is pressed.
6. A status cue plays, e.g. "listening", "sent", or "error".

Module overview

Module	Responsibility
`hands_free_voice.py`	Main entry point — mic capture, VAD wiring, wake-word gate, dual-backend dispatch loop
`segmenter.py`	`VADSegmenter` — frame-by-frame VAD state machine, preroll ring buffer
`stt.py`	`STTBackend` ABC + 10 implementations + shared HTTP helpers + OS-aware auto factory
`tts.py`	`TTSBackend` ABC + 4 implementations + auto factory
`platform_adapter.py`	`MacAdapter` / `LinuxAdapter` / `WindowsAdapter` + factory

VAD state machine

IDLE ──(speech frames ≥ 4)──▶ CAPTURING ──(silence ≥ 840 ms OR 18 s cap)──▶ EMIT utterance ──▶ IDLE
 ▲                                 │
 └──(speech_run decays on silence)─┘
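
The diagram can be read as the following frame-by-frame loop. This is a sketch under stated assumptions, not the project's `VADSegmenter`: it assumes 30 ms frames (so 840 ms is 28 frames, 18 s is 600 frames, and the 300 ms preroll is 10 frames), and it takes a precomputed `is_speech` flag instead of calling WebRTC VAD. `SketchSegmenter` is a hypothetical name.

```python
from collections import deque
from typing import Deque, List, Optional

FRAME_MS = 30                              # assumed frame length
START_FRAMES = 4                           # speech frames needed to start
END_SILENCE_FRAMES = 840 // FRAME_MS       # 28 frames of trailing silence
MAX_FRAMES = 18_000 // FRAME_MS            # 600 frames, the 18 s cap
PREROLL_FRAMES = 300 // FRAME_MS           # 10 frames kept before onset


class SketchSegmenter:
    """IDLE -> CAPTURING -> emit utterance -> IDLE, one frame at a time."""

    def __init__(self) -> None:
        self.preroll: Deque[bytes] = deque(maxlen=PREROLL_FRAMES)
        self.captured: List[bytes] = []
        self.capturing = False
        self.speech_run = 0
        self.silence_run = 0

    def push(self, frame: bytes, is_speech: bool) -> Optional[bytes]:
        """Feed one frame; return the finished utterance or None."""
        if not self.capturing:
            self.preroll.append(frame)
            # speech_run grows on speech and decays by one on silence.
            self.speech_run = (self.speech_run + 1 if is_speech
                               else max(0, self.speech_run - 1))
            if self.speech_run >= START_FRAMES:
                self.capturing = True
                self.captured = list(self.preroll)  # include the preroll
                self.preroll.clear()
                self.silence_run = 0
            return None
        self.captured.append(frame)
        self.silence_run = 0 if is_speech else self.silence_run + 1
        if (self.silence_run >= END_SILENCE_FRAMES
                or len(self.captured) >= MAX_FRAMES):
            utterance = b"".join(self.captured)
            self.captured = []
            self.capturing = False
            self.speech_run = 0
            self.silence_run = 0
            return utterance
        return None
```

Seeding the capture with the preroll ring buffer is what keeps the first syllable of the wake word inside the emitted utterance.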

Troubleshooting

No input devices shown. Grant microphone permission to your terminal. macOS: Privacy & Security → Microphone. Linux: check PulseAudio / PipeWire. Windows: Settings → Privacy → Microphone.

Wake word never fires. Confirm the right mic with --list-devices. Say the wake word slowly — fuzzy matching covers common mishears, but very low mic gain can strip initial consonants.

Cloud backend not activating. Check that the API key env var is set in .env. Run with --command-stt-backend <name> to test explicitly.

[dispatch error] on Wayland. Wayland blocks programmatic window focus. Pass --target-app <name> and keep that window focused manually.

Windows: keystrokes go to the wrong window. Focus-stealing prevention can block SetForegroundWindow. Give the target window focus manually before speaking, or use AutoHotkey.

macOS: keystrokes ignored. Accessibility + Automation permissions missing. System Settings → Privacy & Security → Accessibility / Automation.

Privacy & Security

Local backends run fully on-device: MLX Whisper and faster-whisper make no network calls while transcribing (model weights are downloaded from HuggingFace on first use)
Cloud backends upload audio only after wake-word — continuous listening never touches cloud APIs
Clipboard temporarily overwritten per dispatch — original contents restored immediately
No telemetry. No analytics. No phone-home.
Accessibility permissions are powerful — review the source before granting
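
The clipboard behaviour described above follows a save, overwrite, restore pattern. The sketch below shows that pattern with injected `read` and `write` callables so it stays independent of any clipboard library; `borrowed_clipboard` is a hypothetical name and this is not the project's dispatch code.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def borrowed_clipboard(read: Callable[[], str],
                       write: Callable[[str], None],
                       text: str) -> Iterator[None]:
    """Put text on the clipboard for the duration of the block, then restore."""
    original = read()          # remember what the user had copied
    write(text)                # place the prompt for the paste keystroke
    try:
        yield
    finally:
        write(original)        # restore even if the paste step raises
```

In real use the paste keystroke would be sent inside the `with` block; a short delay before restoring may be needed if the target app reads the clipboard asynchronously.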

Future Scope

Stronger Linux target locking: closer to macOS / Windows target reactivation behaviour
Packaged installers: smoother setup with platform-specific dependency checks
Tray / menu bar control: pause, resume, backend selection, target status
Custom wake words: user-defined beyond the built-in three
Command history: optional local log of recent accepted prompts
Google Gemini STT: cloud transcription backend

License

Acknowledgements

MLX Whisper · faster-whisper · py-webrtcvad
60dB.ai · ElevenLabs · Deepgram · Groq · AssemblyAI · Gladia
sounddevice · soundfile
xdotool · wtype · pyautogui
pyttsx3

Release history

Version	Released
1.4.0 (this version)	May 9, 2026
1.3.6	Apr 29, 2026
1.3.5	Apr 29, 2026
1.3.3	Apr 29, 2026
1.3.2	Apr 29, 2026
1.0.6	Apr 28, 2026
1.0.3	Apr 28, 2026
1.0.2	Apr 28, 2026
1.0.1	Apr 28, 2026
1.0.0	Apr 28, 2026
