Private Assistant Communications Satellite

A high-performance, open source voice interaction satellite optimized for edge devices like the Raspberry Pi.

The Private Assistant Communications Satellite is a latency-optimized edge component that provides voice interaction for the private assistant ecosystem: it listens for wake words and lets the other components interact with the user by speech. Designed for resource-constrained devices such as the Raspberry Pi Zero 2W and Raspberry Pi 4, it delivers real-time wake word detection, voice activity detection, and seamless integration with speech processing APIs.
Key Features
- Real-time Wake Word Detection - micro-wake-word TFLite integration with customizable models
- Voice Activity Detection (VAD) - Silero VAD for accurate speech boundary detection
- Speech Processing Integration - Async STT/TTS API communication
- MQTT Ecosystem Integration - Low-latency message passing with the assistant ecosystem
- Edge Device Optimized - Multi-threaded architecture minimizing audio processing latency
- Flexible Configuration - YAML-based configuration with environment variable support
- Robust Error Handling - Graceful degradation and automatic recovery
Architecture Overview
The satellite uses a simple state machine architecture for stability and performance:
┌─────────────────┐      ┌─────────────────┐
│   Main Thread   │      │   MQTT Thread   │
│ (State Machine) │      │  (Low Latency)  │
├─────────────────┤      ├─────────────────┤
│ • LISTENING     │      │ • MQTT Client   │
│ • RECORDING     │◄────►│ • Message Queue │
│ • WAITING       │      │ • Event Loop    │
│ • SPEAKING      │      │ • Async I/O     │
└─────────────────┘      └─────────────────┘
State Machine Flow:
- LISTENING: Monitors audio for wake word detection
- RECORDING: Records user speech after wake word trigger
- WAITING: Processes STT API and waits for response
- SPEAKING: Plays TTS audio response
Key Design Benefits:
- Simplified Threading: Only MQTT runs in separate thread for network I/O
- Predictable Behavior: Clear state transitions eliminate race conditions
- Resource Efficiency: Callback-based audio I/O with efficient buffering
- Stable Operation: No queue overflows or threading conflicts
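The state flow above can be sketched as a minimal transition table. This is illustrative only; names like `SatelliteState` and `next_state` are hypothetical, not classes from the package (the actual logic lives in satellite.py):

```python
from enum import Enum, auto


class SatelliteState(Enum):
    LISTENING = auto()
    RECORDING = auto()
    WAITING = auto()
    SPEAKING = auto()


# Transition table mirroring the flow described above.
TRANSITIONS = {
    SatelliteState.LISTENING: SatelliteState.RECORDING,  # wake word detected
    SatelliteState.RECORDING: SatelliteState.WAITING,    # VAD detects end of speech
    SatelliteState.WAITING: SatelliteState.SPEAKING,     # assistant response arrived
    SatelliteState.SPEAKING: SatelliteState.LISTENING,   # playback finished
}


def next_state(state: SatelliteState) -> SatelliteState:
    """Advance the satellite one step around the fixed cycle."""
    return TRANSITIONS[state]
```

Because every state has exactly one successor, the main loop never has to arbitrate between competing transitions, which is what eliminates the race conditions mentioned above.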
Quick Start
System Requirements
Minimum Hardware:
- Raspberry Pi Zero 2W (1GB RAM) or better
- USB microphone or I2S audio HAT
- Speaker or headphones for audio feedback
- MicroSD card (16GB+ recommended)
Recommended Hardware:
- Raspberry Pi 4 (2GB+ RAM) for optimal performance
- Quality USB microphone with noise cancellation
- Dedicated audio HAT for better audio quality
Installation
1. System Dependencies (Raspberry Pi OS)
# Update system packages
sudo apt update && sudo apt upgrade -y
# Install audio system dependencies
sudo apt-get install -y \
libasound2-dev \
libportaudio2 \
libportaudiocpp0 \
portaudio19-dev \
libsndfile1-dev \
python3.12-dev \
git
# Install UV package manager
curl -LsSf https://astral.sh/uv/install.sh | sh
source $HOME/.cargo/env
2. Install Satellite
# Clone repository
git clone https://github.com/stkr22/private-assistant-comms-satellite-py.git
cd private-assistant-comms-satellite-py
# Install with development dependencies
uv sync --group dev
# Install runtime dependencies only (deployment)
uv sync
3. Quick Configuration
# Generate configuration template
uv run comms-satellite config-template
# Edit configuration for your setup
nano local_config.yaml
4. Test Installation
# Test without audio (verify dependencies)
uv run python -c "import private_assistant_comms_satellite; print('Installation successful')"
# Test with audio (requires microphone)
uv run comms-satellite --help
Configuration
Essential Configuration
The satellite requires several key configuration parameters. Generate a template with:
uv run comms-satellite config-template
Minimum Required Configuration:
# Wake word settings
wakework_detection_threshold: 0.97
wakeword_model_path: "./okay_nabu.tflite"
# API endpoints - REQUIRED
speech_transcription_api: "http://your-stt-server:8000/transcribe"
speech_synthesis_api: "http://your-tts-server:8080/synthesizeSpeech"
# Device identification
client_id: "living_room_satellite" # Unique name for this device
room: "livingroom"
# MQTT broker - REQUIRED
mqtt_server_host: "your-mqtt-broker.local"
mqtt_server_port: 1883
# Audio device indices (use 'python -c "import sounddevice; print(sounddevice.query_devices())"' to list devices)
input_device_index: 1 # Your microphone
output_device_index: 1 # Your speaker
Performance Tuning by Device
Raspberry Pi Zero 2W Configuration
# Optimize for minimal CPU usage
chunk_size: 1024 # Larger chunks = less CPU overhead
samplerate: 16000 # Standard rate, good quality/performance balance
vad_threshold: 0.7 # Higher threshold = fewer false positives
vad_trigger: 2 # Require 2 chunks before activation
max_command_input_seconds: 10 # Shorter timeout saves memory
Raspberry Pi 4 Configuration
# Optimize for lower latency
chunk_size: 512 # Smaller chunks = lower latency
samplerate: 16000 # Can handle higher if needed
vad_threshold: 0.6 # More sensitive detection
vad_trigger: 1 # Immediate activation
max_command_input_seconds: 15 # Longer timeout for complex commands
Audio Device Configuration
# List available audio devices
python -c "import sounddevice as sd; print(sd.query_devices())"
# Test microphone input
arecord -l # List recording devices
arecord -D hw:1,0 -f cd test.wav # Test recording
# Test speaker output
aplay -l # List playback devices
aplay -D hw:1,0 /usr/share/sounds/alsa/Front_Left.wav # Test playback
Usage
Basic Usage
# Start with default configuration
uv run comms-satellite local_config.yaml
# Start with custom configuration
uv run comms-satellite /path/to/my_config.yaml
# Use environment variable
export PRIVATE_ASSISTANT_API_CONFIG_PATH="/path/to/config.yaml"
uv run comms-satellite
Command Line Options
# Show version
uv run comms-satellite version
# Generate configuration template
uv run comms-satellite config-template
# Help
uv run comms-satellite --help
Systemd Service (Recommended for Production)
Create a systemd service for automatic startup:
# Create service file
sudo tee /etc/systemd/system/satellite.service > /dev/null <<EOF
[Unit]
Description=Private Assistant Communications Satellite
After=network.target sound.target
Wants=network.target
[Service]
Type=simple
User=pi
Group=audio
WorkingDirectory=/home/pi/private-assistant-comms-satellite-py
Environment=PRIVATE_ASSISTANT_API_CONFIG_PATH=/home/pi/satellite-config.yaml
ExecStart=/home/pi/.local/bin/uv run comms-satellite
Restart=always
RestartSec=10
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=multi-user.target
EOF
# Enable and start service
sudo systemctl daemon-reload
sudo systemctl enable satellite.service
sudo systemctl start satellite.service
# Check status
sudo systemctl status satellite.service
sudo journalctl -fu satellite.service
API Integration
Speech-to-Text (STT) API
The satellite expects an STT service that accepts audio data and returns transcribed text.
Expected API Contract:
POST /transcribe
Content-Type: multipart/form-data
Authorization: Bearer <optional-token>
Body:
- file: audio.raw (raw PCM audio data, 16kHz, 16-bit, mono)
Response:
{
"text": "transcribed speech text",
"message": "success"
}
Configuration:
speech_transcription_api: "http://your-stt-server:8000/transcribe"
speech_transcription_api_token: "optional-bearer-token"
Text-to-Speech (TTS) API
The satellite expects a TTS service that accepts text and returns audio data.
Expected API Contract:
POST /synthesizeSpeech
Content-Type: application/json
Authorization: Bearer <optional-token>
Body:
{
"text": "text to synthesize",
"sample_rate": 16000
}
Response: audio/wav (PCM audio data)
Configuration:
speech_synthesis_api: "http://your-tts-server:8080/synthesizeSpeech"
speech_synthesis_api_token: "optional-bearer-token"
MQTT Integration
Topic Structure
The satellite uses a hierarchical MQTT topic structure:
assistant/comms_bridge/all/{client_id}/
├── input    # Satellite publishes recognized speech here
└── output   # Satellite subscribes for responses to speak

assistant/comms_bridge/broadcast   # System-wide announcements
Topic Configuration:
# Default topics (auto-generated)
client_id: "living_room_satellite"
# Results in: assistant/comms_bridge/all/living_room_satellite/
# Custom topic overrides
base_topic_overwrite: "custom/satellite/living_room"
input_topic_overwrite: "custom/satellite/living_room/speech_input"
output_topic_overwrite: "custom/satellite/living_room/speech_output"
Message Formats
Input Messages (Satellite → Assistant):
{
"id": "uuid4-string",
"text": "recognized speech text",
"room": "livingroom",
"output_topic": "assistant/comms_bridge/all/living_room_satellite/output"
}
Output Messages (Assistant → Satellite):
{
"text": "response text to speak",
"id": "uuid4-string"
}
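The input format can be produced with a small helper. This is a sketch; `build_input_message` is a hypothetical name (the satellite publishes via aiomqtt and shares message formats through Private Assistant Commons):

```python
import json
import uuid


def build_input_message(text: str, room: str, client_id: str) -> str:
    """Serialize a recognized-speech message in the input format above."""
    return json.dumps({
        "id": str(uuid.uuid4()),  # fresh UUID4 per utterance
        "text": text,
        "room": room,
        # default output topic derived from the client_id, matching the
        # hierarchical topic structure shown earlier
        "output_topic": f"assistant/comms_bridge/all/{client_id}/output",
    })
```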
Performance Optimization
Latency Optimization
Key Performance Metrics:
- Wake word detection: ~100-200ms
- Voice activity detection: ~50-100ms per chunk
- STT API call: 1-5 seconds (network dependent)
- TTS API call: 0.5-2 seconds (network dependent)
Optimization Strategies:
- Audio Buffer Tuning:
# Lower latency (higher CPU usage)
chunk_size: 256
# Higher latency (lower CPU usage)
chunk_size: 1024
- VAD Sensitivity Tuning:
# More sensitive (faster activation, more false positives)
vad_threshold: 0.5
vad_trigger: 1
# Less sensitive (slower activation, fewer false positives)
vad_threshold: 0.8
vad_trigger: 3
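How the two knobs interact can be sketched as a consecutive-chunk counter: `vad_threshold` gates each chunk's speech probability, and `vad_trigger` is how many qualifying chunks must arrive in a row before recording activates. Illustrative only; the actual Silero VAD integration lives in silero_vad.py and may differ:

```python
class VadTrigger:
    """Fire once `trigger` consecutive chunks exceed `threshold`."""

    def __init__(self, threshold: float = 0.6, trigger: int = 1) -> None:
        self.threshold = threshold
        self.trigger = trigger
        self._run = 0  # length of the current run of speech-like chunks

    def update(self, speech_probability: float) -> bool:
        """Feed one chunk's VAD probability; return True once activated."""
        if speech_probability >= self.threshold:
            self._run += 1
        else:
            self._run = 0  # any quiet chunk resets the run
        return self._run >= self.trigger
```

With `trigger: 3`, a single noisy chunk cannot activate recording, which is why raising it reduces false positives at the cost of slower activation.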
- Network Optimization:
# Local APIs for best performance
speech_transcription_api: "http://localhost:8000/transcribe"
speech_synthesis_api: "http://localhost:8080/synthesizeSpeech"
Memory Usage Optimization
For Memory-Constrained Devices (Pi Zero 2W):
max_command_input_seconds: 8 # Limit recording buffer
samplerate: 16000 # Don't increase sample rate unnecessarily
chunk_size: 1024 # Larger chunks = fewer allocations
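These settings bound the recording buffer directly. Assuming 16-bit mono PCM (as in the STT contract above), the worst case is duration times sample rate times sample width:

```python
def recording_buffer_bytes(seconds: int, samplerate: int = 16000,
                           bytes_per_sample: int = 2) -> int:
    """Worst-case raw PCM recording buffer size (mono audio)."""
    return seconds * samplerate * bytes_per_sample


# 8 s at 16 kHz, 16-bit mono -> 256000 bytes (~250 KiB)
print(recording_buffer_bytes(8))
```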
Monitor Memory Usage:
# Monitor system memory
watch -n 1 'free -h && ps aux | grep satellite | head -5'
# Monitor satellite process specifically
top -p $(pgrep -f comms-satellite)
Troubleshooting
Common Issues
Audio Device Problems
# Check audio devices
python -c "import sounddevice as sd; print(f'Devices: {len(sd.query_devices())}')"
# Test microphone
arecord -D hw:1,0 -f S16_LE -r 16000 -c 1 test.wav
# Check ALSA configuration
cat /proc/asound/cards
MQTT Connection Issues
# Test MQTT connectivity
mosquitto_pub -h your-mqtt-broker -t "test/topic" -m "test message"
mosquitto_sub -h your-mqtt-broker -t "assistant/comms_bridge/all/+/+"
# Check network connectivity
ping your-mqtt-broker
telnet your-mqtt-broker 1883
API Integration Issues
# Test STT API manually
curl -X POST -F "file=@test.wav" \
  -H "Authorization: Bearer your-token" \
  http://your-stt-server:8000/transcribe
# Test TTS API manually
curl -X POST -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-token" \
  -d '{"text":"hello world","sample_rate":16000}' \
  http://your-tts-server:8080/synthesizeSpeech > test_output.wav
Performance Issues
# Check CPU usage
htop
# Check system load
uptime
# Monitor audio dropouts
dmesg | grep -i audio
# Check for thermal throttling (Raspberry Pi)
vcgencmd measure_temp
vcgencmd get_throttled
Debug Mode
Enable detailed logging by setting environment variable:
export PYTHONPATH=src
export LOG_LEVEL=DEBUG
uv run comms-satellite config.yaml
Development
Development Setup
# Clone repository
git clone https://github.com/stkr22/private-assistant-comms-satellite-py.git
cd private-assistant-comms-satellite-py
# Install development dependencies
uv sync --group dev
# Install pre-commit hooks
uv run pre-commit install
Code Quality Tools
# Type checking
uv run mypy src/
# Linting
uv run ruff check .
# Code formatting
uv run ruff format .
# Run tests
uv run pytest
# Run tests with coverage
uv run pytest --cov=private_assistant_comms_satellite --cov-report=html
Project Structure
private-assistant-comms-satellite-py/
├── src/private_assistant_comms_satellite/
│   ├── __init__.py              # Package initialization
│   ├── cli.py                   # Command-line interface
│   ├── main.py                  # Application entry point
│   ├── satellite.py             # Core Satellite class (main logic)
│   ├── micro_wake_word.py       # MicroWakeWord streaming detector
│   ├── silero_vad.py            # Voice Activity Detection
│   └── utils/
│       ├── config.py            # Configuration management
│       ├── mqtt_utils.py        # MQTT client wrapper
│       └── speech_recognition_tools.py  # STT/TTS API integration
├── tests/                       # Test suite
├── docs/                        # Documentation
├── assets/                      # Wake word models and audio files
└── pyproject.toml               # Project configuration
Contributing
1. Fork the repository
2. Create a feature branch: git checkout -b feature/amazing-feature
3. Follow code style: run uv run ruff format . and uv run ruff check .
4. Add tests: ensure new features have appropriate test coverage
5. Commit changes: use the conventional commit format, e.g. feat: add amazing feature [AI]
6. Push to the branch: git push origin feature/amazing-feature
7. Create a Pull Request
Code Style Guidelines:
- Follow the existing code style (enforced by Ruff)
- Add type hints to all public functions
- Include docstrings for public classes and methods
- Add AIDEV-* anchor comments for significant code sections
- Keep functions focused and under 50 lines when possible
Additional Resources
- Architecture Documentation - Detailed system design
- API Reference - Complete API documentation
- Performance Guide - In-depth optimization techniques
- Troubleshooting Guide - Extended troubleshooting
- Contributing Guide - Detailed contribution guidelines
License
This project is licensed under the GNU General Public License v3.0 - see the LICENSE file for details.
Acknowledgments
- micro-wake-word - Wake word detection models (TFLite)
- Silero VAD - Voice activity detection
- Private Assistant Commons - Shared utilities and message formats
- sounddevice - Python audio I/O library with NumPy integration
- aiomqtt - Async MQTT client library