
The satellite is an open source library for the private assistant ecosystem, built to run on edge devices. It lets other ecosystem components interact with the user through speech and listens for a wake word to activate.

Project description

Private Assistant Communications Satellite


A high-performance voice interaction satellite optimized for edge devices like Raspberry Pi

The Private Assistant Communications Satellite is a latency-optimized edge device component that provides voice interaction capabilities for the private assistant ecosystem. Designed specifically for resource-constrained devices like Raspberry Pi Zero 2W and Raspberry Pi 4, it delivers real-time wake word detection, voice activity detection, and seamless integration with speech processing APIs.

🎯 Key Features

  • 🔊 Real-time Wake Word Detection - micro-wake-word TFLite integration with customizable models
  • 🎤 Voice Activity Detection (VAD) - Silero VAD for accurate speech boundary detection
  • 🗣️ Speech Processing Integration - Async STT/TTS API communication
  • 📡 MQTT Ecosystem Integration - Low-latency message passing with the assistant ecosystem
  • ⚡ Edge Device Optimized - Multi-threaded architecture minimizing audio processing latency
  • 🔧 Flexible Configuration - YAML-based configuration with environment variable support
  • 🛡️ Robust Error Handling - Graceful degradation and automatic recovery

๐Ÿ—๏ธ Architecture Overview

The satellite uses a simple state machine architecture for stability and performance:

┌─────────────────┐    ┌─────────────────┐
│   Main Thread   │    │   MQTT Thread   │
│ (State Machine) │    │  (Low Latency)  │
├─────────────────┤    ├─────────────────┤
│ • LISTENING     │    │ • MQTT Client   │
│ • RECORDING     │◄──►│ • Message Queue │
│ • WAITING       │    │ • Event Loop    │
│ • SPEAKING      │    │ • Async I/O     │
└─────────────────┘    └─────────────────┘

State Machine Flow:

  • LISTENING: Monitors audio for wake word detection
  • RECORDING: Records user speech after wake word trigger
  • WAITING: Sends recorded audio to the STT API and waits for the response
  • SPEAKING: Plays TTS audio response

Key Design Benefits:

  • Simplified Threading: Only MQTT runs in a separate thread, handling network I/O
  • Predictable Behavior: Clear state transitions eliminate race conditions
  • Resource Efficiency: Callback-based audio I/O with efficient buffering
  • Stable Operation: No queue overflows or threading conflicts
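The state flow above can be sketched as a small table-driven state machine. This is an illustrative sketch only; the state and event names are assumptions, not the actual satellite.py implementation.

```python
from enum import Enum, auto


class State(Enum):
    """The four satellite states described above."""
    LISTENING = auto()
    RECORDING = auto()
    WAITING = auto()
    SPEAKING = auto()


def next_state(state: State, event: str) -> State:
    """Advance the state machine on a single event.

    Event names are hypothetical; unknown events leave the state unchanged,
    which is what makes the transitions predictable and race-free.
    """
    transitions = {
        (State.LISTENING, "wake_word"): State.RECORDING,
        (State.RECORDING, "speech_end"): State.WAITING,
        (State.WAITING, "response"): State.SPEAKING,
        (State.SPEAKING, "playback_done"): State.LISTENING,
    }
    return transitions.get((state, event), state)
```

Because every transition is an explicit entry in one table, the main thread can never end up in an inconsistent state, regardless of when MQTT messages arrive.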

🚀 Quick Start

System Requirements

Minimum Hardware:

  • Raspberry Pi Zero 2W (1GB RAM) or better
  • USB microphone or I2S audio HAT
  • Speaker or headphones for audio feedback
  • MicroSD card (16GB+ recommended)

Recommended Hardware:

  • Raspberry Pi 4 (2GB+ RAM) for optimal performance
  • Quality USB microphone with noise cancellation
  • Dedicated audio HAT for better audio quality

Installation

1. System Dependencies (Raspberry Pi OS)

# Update system packages
sudo apt update && sudo apt upgrade -y

# Install audio system dependencies
sudo apt-get install -y \
    libasound2-dev \
    libportaudio2 \
    libportaudiocpp0 \
    portaudio19-dev \
    libsndfile1-dev \
    python3.12-dev \
    git

# Install UV package manager
curl -LsSf https://astral.sh/uv/install.sh | sh
source $HOME/.cargo/env

2. Install Satellite

# Clone repository
git clone https://github.com/stkr22/private-assistant-comms-satellite-py.git
cd private-assistant-comms-satellite-py

# Install dependencies (development)
uv sync --group dev

# Install for deployment (all dependencies included)
uv sync

3. Quick Configuration

# Generate configuration template
uv run comms-satellite config-template

# Edit configuration for your setup
nano local_config.yaml

4. Test Installation

# Test without audio (verify dependencies)
uv run python -c "import private_assistant_comms_satellite; print('✅ Installation successful')"

# Test with audio (requires microphone)
uv run comms-satellite --help

โš™๏ธ Configuration

Essential Configuration

The satellite requires several key configuration parameters. Generate a template with:

uv run comms-satellite config-template

Minimum Required Configuration:

# Wake word settings
wakeword_detection_threshold: 0.97
wakeword_model_path: "./okay_nabu.tflite"

# API endpoints - REQUIRED
speech_transcription_api: "http://your-stt-server:8000/transcribe"
speech_synthesis_api: "http://your-tts-server:8080/synthesizeSpeech"

# Device identification
client_id: "living_room_satellite"  # Unique name for this device
room: "livingroom"

# MQTT broker - REQUIRED
mqtt_server_host: "your-mqtt-broker.local"
mqtt_server_port: 1883

# Audio device indices (use 'python -c "import sounddevice; print(sounddevice.query_devices())"' to list devices)
input_device_index: 1   # Your microphone
output_device_index: 1  # Your speaker

Performance Tuning by Device

Raspberry Pi Zero 2W Configuration

# Optimize for minimal CPU usage
chunk_size: 1024        # Larger chunks = less CPU overhead
samplerate: 16000       # Standard rate, good quality/performance balance
vad_threshold: 0.7      # Higher threshold = fewer false positives
vad_trigger: 2          # Require 2 chunks before activation
max_command_input_seconds: 10  # Shorter timeout saves memory

Raspberry Pi 4 Configuration

# Optimize for lower latency
chunk_size: 512         # Smaller chunks = lower latency
samplerate: 16000       # Can handle higher if needed
vad_threshold: 0.6      # More sensitive detection
vad_trigger: 1          # Immediate activation
max_command_input_seconds: 15  # Longer timeout for complex commands
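As a quick sanity check on these profiles, the per-chunk audio latency follows directly from chunk_size and samplerate:

```python
def chunk_latency_ms(chunk_size: int, samplerate: int) -> float:
    """Duration of one audio chunk in milliseconds: samples / (samples per second)."""
    return chunk_size / samplerate * 1000

# Pi Zero 2W profile: 1024 samples at 16 kHz -> 64 ms per chunk
# Pi 4 profile:        512 samples at 16 kHz -> 32 ms per chunk
```

Halving chunk_size halves the per-chunk latency but doubles how often the audio callback and VAD run, which is why the Pi Zero profile trades latency for CPU headroom.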

Audio Device Configuration

# List available audio devices
python -c "import sounddevice as sd; print(sd.query_devices())"

# Test microphone input
arecord -l  # List recording devices
arecord -D hw:1,0 -f cd test.wav  # Test recording

# Test speaker output
aplay -l   # List playback devices
aplay -D hw:1,0 /usr/share/sounds/alsa/Front_Left.wav  # Test playback

🎮 Usage

Basic Usage

# Start with default configuration
uv run comms-satellite local_config.yaml

# Start with custom configuration
uv run comms-satellite /path/to/my_config.yaml

# Use environment variable
export PRIVATE_ASSISTANT_API_CONFIG_PATH="/path/to/config.yaml"
uv run comms-satellite

Command Line Options

# Show version
uv run comms-satellite version

# Generate configuration template
uv run comms-satellite config-template

# Help
uv run comms-satellite --help

Systemd Service (Recommended for Production)

Create a systemd service for automatic startup:

# Create service file
sudo tee /etc/systemd/system/satellite.service > /dev/null <<EOF
[Unit]
Description=Private Assistant Communications Satellite
After=network.target sound.target
Wants=network.target

[Service]
Type=simple
User=pi
Group=audio
WorkingDirectory=/home/pi/private-assistant-comms-satellite-py
Environment=PRIVATE_ASSISTANT_API_CONFIG_PATH=/home/pi/satellite-config.yaml
ExecStart=/home/pi/.local/bin/uv run comms-satellite
Restart=always
RestartSec=10
StandardOutput=journal
StandardError=journal

[Install]
WantedBy=multi-user.target
EOF

# Enable and start service
sudo systemctl daemon-reload
sudo systemctl enable satellite.service
sudo systemctl start satellite.service

# Check status
sudo systemctl status satellite.service
sudo journalctl -fu satellite.service

🔌 API Integration

Speech-to-Text (STT) API

The satellite expects an STT service that accepts audio data and returns transcribed text.

Expected API Contract:

POST /transcribe
Content-Type: multipart/form-data
Authorization: Bearer <optional-token>

Body:
- file: audio.raw (raw PCM audio data, 16kHz, 16-bit, mono)

Response:
{
  "text": "transcribed speech text",
  "message": "success"
}

Configuration:

speech_transcription_api: "http://your-stt-server:8000/transcribe"
speech_transcription_api_token: "optional-bearer-token"

Text-to-Speech (TTS) API

The satellite expects a TTS service that accepts text and returns audio data.

Expected API Contract:

POST /synthesizeSpeech
Content-Type: application/json
Authorization: Bearer <optional-token>

Body:
{
  "text": "text to synthesize",
  "sample_rate": 16000
}

Response: audio/wav (PCM audio data)

Configuration:

speech_synthesis_api: "http://your-tts-server:8080/synthesizeSpeech"
speech_synthesis_api_token: "optional-bearer-token"
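Two small helpers illustrate this contract: one serializes the JSON request body, the other estimates playback length from the returned audio. The duration formula assumes raw 16-bit mono PCM (2 bytes per sample, ignoring any WAV header); helper names are illustrative.

```python
import json


def build_tts_request(text: str, sample_rate: int = 16000) -> bytes:
    """Serialize the JSON body expected by the TTS endpoint above."""
    return json.dumps({"text": text, "sample_rate": sample_rate}).encode()


def pcm_duration_seconds(audio: bytes, sample_rate: int = 16000) -> float:
    """Playback duration of raw 16-bit mono PCM audio."""
    return len(audio) / (2 * sample_rate)
```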

📡 MQTT Integration

Topic Structure

The satellite uses a hierarchical MQTT topic structure:

assistant/comms_bridge/all/{client_id}/
├── input          # Satellite publishes recognized speech here
└── output         # Satellite subscribes for responses to speak

assistant/comms_bridge/broadcast  # System-wide announcements

Topic Configuration:

# Default topics (auto-generated)
client_id: "living_room_satellite"
# Results in: assistant/comms_bridge/all/living_room_satellite/

# Custom topic overrides
base_topic_overwrite: "custom/satellite/living_room"
input_topic_overwrite: "custom/satellite/living_room/speech_input"
output_topic_overwrite: "custom/satellite/living_room/speech_output"

Message Formats

Input Messages (Satellite → Assistant):

{
  "id": "uuid4-string",
  "text": "recognized speech text",
  "room": "livingroom",
  "output_topic": "assistant/comms_bridge/all/living_room_satellite/output"
}

Output Messages (Assistant → Satellite):

{
  "text": "response text to speak",
  "id": "uuid4-string"
}
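Building the input payload from the default topic scheme can be sketched as follows (the function name is illustrative, but the field names and topic layout follow the formats above):

```python
import json
import uuid


def build_input_message(text: str, room: str, client_id: str) -> str:
    """Serialize a satellite input message for the default topic scheme."""
    output_topic = f"assistant/comms_bridge/all/{client_id}/output"
    return json.dumps({
        "id": str(uuid.uuid4()),       # unique per utterance
        "text": text,                  # recognized speech
        "room": room,                  # where the satellite lives
        "output_topic": output_topic,  # where the assistant should reply
    })
```

Carrying the output_topic inside the message lets the assistant reply to exactly the satellite that asked, without tracking device state server-side.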

๐ŸŽ›๏ธ Performance Optimization

Latency Optimization

Key Performance Metrics:

  • Wake word detection: ~100-200ms
  • Voice activity detection: ~50-100ms per chunk
  • STT API call: 1-5 seconds (network dependent)
  • TTS API call: 0.5-2 seconds (network dependent)

Optimization Strategies:

  1. Audio Buffer Tuning:
# Lower latency (higher CPU usage)
chunk_size: 256

# Higher latency (lower CPU usage)
chunk_size: 1024

  2. VAD Sensitivity Tuning:
# More sensitive (faster activation, more false positives)
vad_threshold: 0.5
vad_trigger: 1

# Less sensitive (slower activation, fewer false positives)
vad_threshold: 0.8
vad_trigger: 3

  3. Network Optimization:
# Local APIs for best performance
speech_transcription_api: "http://localhost:8000/transcribe"
speech_synthesis_api: "http://localhost:8080/synthesizeSpeech"
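The effect of the vad_threshold / vad_trigger pair above can be illustrated with a small debounce sketch: recording only starts once N consecutive chunks exceed the speech-probability threshold (an illustrative model of the behavior, not the actual Silero VAD integration):

```python
def should_activate(speech_probs: list[float], threshold: float, trigger: int) -> bool:
    """Return True once `trigger` consecutive chunks exceed `threshold`."""
    consecutive = 0
    for prob in speech_probs:
        consecutive = consecutive + 1 if prob >= threshold else 0
        if consecutive >= trigger:
            return True
    return False
```

With vad_trigger: 2, a single noisy chunk (e.g. a door slam) is ignored, while two consecutive speech-like chunks start recording, which is exactly the sensitivity/false-positive trade-off described above.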

Memory Usage Optimization

For Memory-Constrained Devices (Pi Zero 2W):

max_command_input_seconds: 8    # Limit recording buffer
samplerate: 16000              # Don't increase sample rate unnecessarily
chunk_size: 1024               # Larger chunks = fewer allocations
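The max_command_input_seconds limit bounds the recording buffer directly; for 16-bit mono PCM the math is simple (2 bytes per sample):

```python
def recording_buffer_bytes(seconds: int, samplerate: int = 16000) -> int:
    """Upper bound on the raw recording buffer for 16-bit mono PCM."""
    return seconds * samplerate * 2

# 8 s at 16 kHz -> 256,000 bytes (~250 KiB), comfortable even on a Pi Zero 2W
```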

Monitor Memory Usage:

# Monitor system memory
watch -n 1 'free -h && ps aux | grep satellite | head -5'

# Monitor satellite process specifically
top -p $(pgrep -f comms-satellite)

🔧 Troubleshooting

Common Issues

Audio Device Problems

# Check audio devices
python -c "import sounddevice as sd; print(f'Devices: {len(sd.query_devices())}')"

# Test microphone
arecord -D hw:1,0 -f S16_LE -c 1 -r 16000 test.wav

# Check ALSA configuration
cat /proc/asound/cards

MQTT Connection Issues

# Test MQTT connectivity
mosquitto_pub -h your-mqtt-broker -t "test/topic" -m "test message"
mosquitto_sub -h your-mqtt-broker -t "assistant/comms_bridge/all/+/+"

# Check network connectivity
ping your-mqtt-broker
telnet your-mqtt-broker 1883

API Integration Issues

# Test STT API manually
curl -X POST -F "file=@test.wav" \
  -H "Authorization: Bearer your-token" \
  http://your-stt-server:8000/transcribe

# Test TTS API manually
curl -X POST -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-token" \
  -d '{"text":"hello world","sample_rate":16000}' \
  http://your-tts-server:8080/synthesizeSpeech > test_output.wav

Performance Issues

# Check CPU usage
htop

# Check system load
uptime

# Monitor audio dropouts
dmesg | grep -i audio

# Check for thermal throttling (Raspberry Pi)
vcgencmd measure_temp
vcgencmd get_throttled

Debug Mode

Enable detailed logging by setting environment variable:

export PYTHONPATH=src
export LOG_LEVEL=DEBUG
uv run comms-satellite config.yaml

๐Ÿ‘จโ€๐Ÿ’ป Development

Development Setup

# Clone repository
git clone https://github.com/stkr22/private-assistant-comms-satellite-py.git
cd private-assistant-comms-satellite-py

# Install development dependencies
uv sync --group dev

# Install pre-commit hooks
uv run pre-commit install

Code Quality Tools

# Type checking
uv run mypy src/

# Linting
uv run ruff check .

# Code formatting
uv run ruff format .

# Run tests
uv run pytest

# Run tests with coverage
uv run pytest --cov=private_assistant_comms_satellite --cov-report=html

Project Structure

private-assistant-comms-satellite-py/
├── src/private_assistant_comms_satellite/
│   ├── __init__.py              # Package initialization
│   ├── cli.py                   # Command-line interface
│   ├── main.py                  # Application entry point
│   ├── satellite.py             # Core Satellite class (main logic)
│   ├── micro_wake_word.py       # MicroWakeWord streaming detector
│   ├── silero_vad.py            # Voice Activity Detection
│   └── utils/
│       ├── config.py            # Configuration management
│       ├── mqtt_utils.py        # MQTT client wrapper
│       └── speech_recognition_tools.py  # STT/TTS API integration
├── tests/                       # Test suite
├── docs/                        # Documentation
├── assets/                      # Wake word models and audio files
└── pyproject.toml               # Project configuration

Contributing

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/amazing-feature
  3. Follow code style: Run uv run ruff format . and uv run ruff check .
  4. Add tests: Ensure new features have appropriate test coverage
  5. Commit changes: Use conventional commit format: feat: add amazing feature [AI]
  6. Push to branch: git push origin feature/amazing-feature
  7. Create Pull Request

Code Style Guidelines:

  • Follow the existing code style (enforced by Ruff)
  • Add type hints to all public functions
  • Include docstrings for public classes and methods
  • Add AIDEV-* anchor comments for significant code sections
  • Keep functions focused and under 50 lines when possible


📄 License

This project is licensed under the GNU General Public License v3.0 - see the LICENSE file for details.

๐Ÿ™ Acknowledgments

  • micro-wake-word - Wake word detection models (TFLite)
  • Silero VAD - Voice activity detection
  • Private Assistant Commons - Shared utilities and message formats
  • sounddevice - Python audio I/O library with NumPy integration
  • aiomqtt - Async MQTT client library

Project details


Download files


Source Distribution

private_assistant_comms_satellite-1.7.0.tar.gz (43.1 kB)


Built Distribution

private_assistant_comms_satellite-1.7.0-py3-none-any.whl

File details

Details for the file private_assistant_comms_satellite-1.7.0.tar.gz.

File metadata

File hashes

Hashes for private_assistant_comms_satellite-1.7.0.tar.gz:

  SHA256      a35c389f9e09fab1be1178162b86e674a75e864836b2860a0c998edcfbbf4a6a
  MD5         f4584f88e9c5e1eab163396d9227a177
  BLAKE2b-256 a80ea3ab5672e3c790b4e86b16303039c2d902518d7692c0ff7cc62cf086d00e


Provenance

The following attestation bundles were made for private_assistant_comms_satellite-1.7.0.tar.gz:

Publisher: release-to-pypi.yml on stkr22/private-assistant-comms-satellite-py

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file private_assistant_comms_satellite-1.7.0-py3-none-any.whl.

File metadata

File hashes

Hashes for private_assistant_comms_satellite-1.7.0-py3-none-any.whl:

  SHA256      40929ff236ac9b6822a2fb3665389581546a09fdf5e265b8d64fec07e7c8138c
  MD5         9f985cb85526ac9a23c9ef81581d223e
  BLAKE2b-256 af899b2f8eaf28b40cdba1cf54eeb2228be8ac211358bc94c77e7361f87b7ca1


Provenance

The following attestation bundles were made for private_assistant_comms_satellite-1.7.0-py3-none-any.whl:

Publisher: release-to-pypi.yml on stkr22/private-assistant-comms-satellite-py

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
