Offline voice assistant for controlling YouTube Music Desktop and Spotify
REX - Offline Voice-Controlled Music Assistant
REX is a lightweight, streaming voice assistant that runs transcription locally and controls your music player (YouTube Music Desktop or Spotify). It uses native audio capture via sounddevice, Silero VAD to chunk utterances, Faster-Whisper for ASR, and a regex router to map text to actions.
Quick Start (3 steps)
# 1. Install REX
pipx install rex-voice-assistant
# 2. Run the setup wizard
rex setup
# 3. Start REX
rex
That's it! The setup wizard will guide you through configuring your music service.
Tech Stack
| Stage | Tech | What it does |
|---|---|---|
| Audio capture | sounddevice (PortAudio) | Streams 16 kHz mono PCM from the default mic |
| Voice activity | Silero VAD (PyTorch, TorchScript) | Groups frames into utterances |
| Transcription | Faster-Whisper (CTranslate2 backend) | Speech to text on CPU or CUDA |
| Command routing | Regex matcher (rex_main/matcher.py) | Maps recognized text to handlers |
| Media control | YTMusic Desktop Companion API / Spotipy | Sends actions to YTMD or Spotify |
| Config & secrets | ~/.rex/config.yaml + keyring | Configuration and secure secret storage |
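The "Voice activity" stage groups audio frames into utterances. The sketch below illustrates that grouping step only and is not REX's actual code. It assumes a per-frame speech decision is already available: the `is_speech` callback is a hypothetical stand-in for the Silero VAD call, and `frame_ms` and `silence_timeout_ms` are illustrative parameter names and defaults.

```python
from typing import Callable, Iterable, Iterator, List


def group_utterances(
    frames: Iterable[bytes],
    is_speech: Callable[[bytes], bool],  # hypothetical stand-in for the VAD model
    frame_ms: int = 32,                  # assumed frame duration
    silence_timeout_ms: int = 250,       # assumed end-of-utterance timeout
) -> Iterator[List[bytes]]:
    """Yield one list of frames per utterance.

    An utterance starts at the first speech frame and ends once
    `silence_timeout_ms` of consecutive non-speech frames have passed.
    """
    max_silent = max(1, silence_timeout_ms // frame_ms)
    current: List[bytes] = []
    silent = 0
    for frame in frames:
        if is_speech(frame):
            current.append(frame)
            silent = 0
        elif current:
            # Short pauses stay inside the utterance until the timeout hits.
            current.append(frame)
            silent += 1
            if silent >= max_silent:
                yield current[:-silent]  # drop the trailing silence
                current, silent = [], 0
    if current:
        # Stream ended mid-utterance: flush what was collected.
        yield current[: len(current) - silent]
```

A shorter timeout closes utterances sooner, which lowers latency but risks splitting a command at a natural pause.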
CLI Commands
rex # Start the voice assistant
rex setup # Interactive setup wizard
rex status # Show configuration and service connectivity
rex test ytmd # Test YouTube Music Desktop connection
rex test spotify # Test Spotify connection
rex dashboard # Run metrics dashboard standalone
rex migrate --from-env # Import settings from .env file
Options for rex command:
--model Whisper model (tiny|base|small|medium|large, default: small.en)
--device Force device (cuda|cpu, default: auto)
--beam Beam size for decoding (default: 1)
--log-file Path to log file
--debug Enable verbose logging
--dashboard Enable metrics dashboard at http://localhost:8080
--low-latency Faster response time (250ms VAD timeout, may cut speech short)
Prerequisites
Windows 10/11
- Python 3.10+ (tested with 3.12)
  winget install Python.Python.3.12
- A microphone: any USB or built-in microphone will work
- Optional: NVIDIA GPU for 5-10x faster transcription
  - Recent NVIDIA driver (no manual CUDA installation needed)
  - The setup wizard will offer to install CUDA PyTorch automatically
Media Service Setup
YouTube Music Desktop (YTMD)
- Install YTMD: https://ytmdesktop.app
- In YTMD Settings, enable:
  - "Companion server"
  - "Allow browser communication"
  - "Enable companion authorization"
- Run rex setup and follow the prompts to authenticate
Spotify
- Create an app at https://developer.spotify.com/dashboard
- Set Redirect URI to http://127.0.0.1:8888/callback
- Run rex setup and enter your Client ID and Secret
Voice Commands
| Phrase (examples) | Action |
|---|---|
| "play music", "stop music" | Play/pause |
| "next", "last/previous", "restart" | Track navigation |
| "volume up/down", "volume N" | Volume control |
| "search by " | Play first search hit |
| "switch to spotify" | Switch backend to Spotify |
| "switch to youtube music" | Switch backend to YTMD |
| "like", "dislike" | Thumbs up/down current track |
| "clip that", "save clip" | Save clip (SteelSeries GG) |
Add custom commands by editing rex_main/matcher.py and rex_main/commands.py.
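As a rough illustration of what a regex router does, the sketch below maps transcribed text to an action name plus any captured arguments. It does not reproduce the contents of rex_main/matcher.py: the `ROUTES` table, the `route` function, and the action names are all hypothetical.

```python
import re
from typing import Optional

# (compiled pattern, action name) pairs. All names here are hypothetical.
# More specific patterns come first because the first match wins.
ROUTES: list[tuple[re.Pattern[str], str]] = [
    (re.compile(r"\b(?:play|stop) music\b"), "play_pause"),
    (re.compile(r"\bvolume (?P<level>\d{1,3})\b"), "set_volume"),
    (re.compile(r"\bvolume up\b"), "volume_up"),
    (re.compile(r"\bvolume down\b"), "volume_down"),
    (re.compile(r"\bnext\b"), "next_track"),
]


def route(text: str) -> Optional[tuple[str, dict]]:
    """Return (action, named groups) for the first matching pattern, else None."""
    # Lowercase and strip punctuation, since ASR output often ends with "." or "!".
    normalized = re.sub(r"[^\w\s]", "", text.lower()).strip()
    for pattern, action in ROUTES:
        match = pattern.search(normalized)
        if match:
            return action, match.groupdict()
    return None
```

With a table like this, adding a command means adding one pattern and one handler for the new action name.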
SteelSeries Moments Clipping
REX integrates with SteelSeries GG Moments for voice-activated clipping:
- Install SteelSeries GG and enable Moments
- Run rex setup to register REX with GameSense
- Enable REX autoclipping in GG: Moments → Settings → Apps → REX Voice Assistant
- Say "clip that" while Moments is recording to save a clip
Configuration
REX stores configuration in ~/.rex/:
~/.rex/
config.yaml # Main configuration
secrets.yaml # Fallback secret storage (if keyring unavailable)
logs/ # Log files
models/ # Cached Whisper models
Environment Variable Overrides
| Variable | Description |
|---|---|
| REX_MODEL | Override Whisper model |
| REX_DEVICE | Force CPU/GPU (cpu/cuda) |
| REX_SERVICE | Active service (ytmd/spotify/none) |
| YTMD_TOKEN | YTMD authorization token |
| YTMD_HOST | YTMD host (default: localhost) |
| YTMD_PORT | YTMD port (default: 9863) |
| SPOTIPY_CLIENT_ID | Spotify client ID |
| SPOTIPY_CLIENT_SECRET | Spotify client secret |
| SPOTIPY_REDIRECT_URI | Spotify OAuth redirect URI |
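One way to picture how environment variables take precedence over file-based settings is the sketch below. The variable names come from the table above, but the config key names, the `ENV_TO_KEY` mapping, and the `apply_env_overrides` function are assumptions made for illustration; the real layout of config.yaml is not documented here.

```python
import os
from typing import Any, Mapping, Optional

# Environment variable -> config key. The key names are hypothetical.
ENV_TO_KEY = {
    "REX_MODEL": "model",
    "REX_DEVICE": "device",
    "REX_SERVICE": "service",
    "YTMD_HOST": "ytmd_host",
    "YTMD_PORT": "ytmd_port",
}


def apply_env_overrides(
    config: Mapping[str, Any],
    environ: Optional[Mapping[str, str]] = None,
) -> dict:
    """Return a copy of `config` with any set environment variables applied."""
    env = os.environ if environ is None else environ
    merged = dict(config)
    for var, key in ENV_TO_KEY.items():
        value = env.get(var)
        if value:  # unset or empty variables leave the file value untouched
            merged[key] = int(value) if key == "ytmd_port" else value
    return merged
```

Passing `environ` explicitly keeps the function easy to test without touching the real process environment.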
Troubleshooting
No audio input detected:
- Check Windows sound settings for default microphone
- Run rex status to see the detected audio device
- Try running rex setup and use the audio test
YTMD connection errors:
- Run rex test ytmd to check connectivity
- Verify Companion Server is enabled in YTMD settings
- Re-run rex setup to get a new token
Spotify device not found:
- Open the Spotify desktop app before running REX
- Run rex test spotify to check the connection
- Re-authenticate if needed
CUDA not being used:
- Run rex setup; it will detect your GPU and offer to install CUDA PyTorch
- Or install manually:
  pipx runpip rex-voice-assistant install torch --index-url https://download.pytorch.org/whl/cu124 --force-reinstall
- Verify: rex should auto-detect the GPU and log "CUDA detected, using GPU acceleration"
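To check independently whether the installed PyTorch build can see the GPU, a sketch like the following may help. It shows one common way to resolve an "auto" device setting using `torch.cuda.is_available()`. How REX itself performs detection is not described here, and `pick_device` is a hypothetical helper name.

```python
def pick_device(requested: str = "auto") -> str:
    """Resolve 'auto' to 'cuda' or 'cpu'; pass explicit choices through."""
    if requested in ("cpu", "cuda"):
        return requested
    try:
        import torch  # treated as optional in this sketch
    except ImportError:
        return "cpu"
    # A CPU-only PyTorch build returns False here even when a GPU is present.
    return "cuda" if torch.cuda.is_available() else "cpu"
```

If this returns "cpu" on a machine with an NVIDIA GPU, the installed PyTorch wheel is likely a CPU-only build, which is the case the reinstall command above addresses.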
Development
# Clone and install in development mode
git clone https://github.com/David-Antolick/rex_voice_assistant.git
cd rex_voice_assistant
pip install -e ".[dev]"
# Run tests
pytest
# Run directly
python -m rex_main.rex --debug
Roadmap
- Dynamic hotword ("Hey Rex") with OpenWakeWord
- Discord integration (waiting for RPC API access)
- Application controls (open/close apps)
- Performance optimizations
Contributing
PRs welcome. Please keep changes small and document new config flags in this README. For larger features, open an issue to discuss design.
File details
Details for the file rex_voice_assistant-0.2.0.tar.gz.
File metadata
- Download URL: rex_voice_assistant-0.2.0.tar.gz
- Upload date:
- Size: 57.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 10dc7d80b87902a92c14a7c850ed4c61587d668a3a78d76ffc54206dba90ac38 |
| MD5 | 10f99187420edc91f0819e0a90394083 |
| BLAKE2b-256 | 4265e7fced6f66ef6e34a288dffb1d5712ed3c0c96e9babc7adaf06ba67200a0 |
File details
Details for the file rex_voice_assistant-0.2.0-py3-none-any.whl.
File metadata
- Download URL: rex_voice_assistant-0.2.0-py3-none-any.whl
- Upload date:
- Size: 61.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f939f1844991b04531b41ae2c855742cf36b761f2f37e1d22e4fb39231bac762 |
| MD5 | bd10038505baef3884e7bc6bf50a1fe4 |
| BLAKE2b-256 | 83e18dc07ed5e3656766c670c79739ed3fcd01198ed136da39b78336ecf33dab |