# SciTeX Audio (scitex-audio)

Text-to-Speech with multiple backends for scientific workflows

Full Documentation · `pip install scitex-audio`
## Problem
Scientific workflows increasingly use AI agents that run on remote servers or headless environments. These agents need to communicate results audibly — for experiment completion notifications, error alerts, or accessibility — but have no direct access to audio hardware.
## Solution
SciTeX Audio provides a unified TTS interface with automatic backend fallback and smart local/remote routing. It works on local machines, remote servers (via relay), and WSL environments with automatic audio path detection.
| Backend | Quality | Cost | Internet | Offline | Default Speed |
|---|---|---|---|---|---|
| ElevenLabs | High | Paid | Required | No | 1.2x |
| LuxTTS | High | Free | First download | Yes | 2.0x |
| Google TTS | Good | Free | Required | No | 1.5x |
| System TTS | Basic | Free | No | Yes | 150 wpm |
Table 1. Supported TTS backends. The fallback order (elevenlabs → luxtts → gtts → pyttsx3) ensures the best available quality is always used.
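The fallback is a first-match scan over the preferred order. A minimal sketch, assuming a hypothetical `pick_backend` helper (the real package exposes `available_backends()` rather than this function):

```python
# Documented fallback order, best quality first
FALLBACK_ORDER = ["elevenlabs", "luxtts", "gtts", "pyttsx3"]

def pick_backend(available):
    """Return the first backend in the fallback order that is installed.

    Hypothetical helper for illustration; `available` would come from
    something like scitex_audio.available_backends().
    """
    for name in FALLBACK_ORDER:
        if name in available:
            return name
    raise RuntimeError("no TTS backend available")

print(pick_backend(["gtts", "pyttsx3"]))  # -> gtts
```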
## Installation
Requires Python >= 3.10.
```shell
pip install scitex-audio
```
Install with specific backends:
```shell
pip install scitex-audio[gtts]        # Google TTS
pip install scitex-audio[pyttsx3]     # System TTS (+ apt install espeak-ng)
pip install scitex-audio[elevenlabs]  # ElevenLabs
pip install scitex-audio[luxtts]      # LuxTTS (voice cloning, offline)
pip install scitex-audio[all]         # Everything
```
## Quick Start
```python
from scitex_audio import speak, available_backends

# Check what's available
print(available_backends())  # e.g., ['gtts', 'pyttsx3']

# Speak with auto-selected backend
speak("Hello from SciTeX Audio!")

# Choose a specific backend
speak("Bonjour", backend="gtts", voice="fr")

# Save without playing
speak("Save this", output_path="output.mp3", play=False)
```
## Three Interfaces
### Python API
```python
import scitex_audio

scitex_audio.speak("Hello!")                          # auto backend
scitex_audio.speak("Fast", backend="gtts", speed=1.5)
scitex_audio.available_backends()                     # list backends
scitex_audio.check_wsl_audio()                        # WSL audio status
scitex_audio.generate_bytes("As bytes")               # raw MP3 bytes
scitex_audio.stop_speech()                            # kill playback

tts = scitex_audio.get_tts("gtts")                    # get engine
tts.speak("With engine", voice="fr")
```
### CLI Commands
```shell
scitex-audio --help-recursive          # Show all commands
scitex-audio speak "Hello world"       # Speak text
scitex-audio speak "Bonjour" -b gtts -v fr
scitex-audio backends                  # List backends
scitex-audio check                     # Audio status (WSL)
scitex-audio stop                      # Stop playback
scitex-audio relay --port 31293        # Start relay server
scitex-audio list-python-apis          # List Python API tree
scitex-audio mcp list-tools            # List MCP tools
```
### MCP Server for AI Agents
AI agents can speak through the MCP protocol for notifications and accessibility.
| Tool | Description |
|---|---|
| `audio_speak` | Convert text to speech with backend fallback |
| `list_backends` | List available TTS backends and status |
| `check_audio_status` | Check WSL audio connectivity |
| `announce_context` | Announce current directory and git branch |
Table 2. Four MCP tools available for AI-assisted audio. All tools accept JSON parameters and return JSON results.
```shell
scitex-audio mcp start
```
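Under the hood, MCP tool invocations ride on JSON-RPC 2.0 `tools/call` requests. A sketch of what a call to `audio_speak` might look like on the wire; the envelope below follows the MCP spec, but in practice your MCP client builds it for you, so treat this as illustrative only:

```python
import json

def build_tool_call(tool, arguments, request_id=1):
    """Build a JSON-RPC 2.0 `tools/call` request as used by MCP clients.

    Sketch only: transport framing and response handling are the
    client's job; you would rarely construct this envelope by hand.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }

request = build_tool_call("audio_speak", {"text": "Run finished", "backend": "gtts"})
print(json.dumps(request, indent=2))
```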
## Remote Audio Relay
When agents run on remote servers (NAS, cloud, HPC), they have no speakers. The relay server solves this: the local machine (with speakers) runs a lightweight HTTP server, and the remote agent sends speech requests over an SSH tunnel.
### Architecture
```
Remote Server (NAS/Cloud)            Local Machine (has speakers)
┌─────────────────────┐              ┌──────────────────────┐
│ AI Agent            │              │ Relay Server         │
│ speak("Hello")      │              │ scitex-audio relay   │
│   ↓                 │              │   ↓                  │
│ POST /speak ────────┼── SSH ──────→│ TTS engine           │
│ (port 31293)        │   tunnel     │   ↓                  │
│                     │              │ 🔊 Speakers          │
└─────────────────────┘              └──────────────────────┘
```
### Setup
**Step 1: Start the relay on the local machine (has speakers)**

```shell
scitex-audio relay --port 31293
```
**Step 2: SSH tunnel from local to remote**

Add to your `~/.ssh/config` (note that ssh_config does not allow inline comments, so the comment goes on its own line):

```
# Audio relay
Host my-server
    HostName 192.168.0.69
    User myuser
    RemoteForward 31293 127.0.0.1:31293
```

Then connect with `ssh my-server`. The tunnel maps remote port 31293 back to your local relay.
**Step 3: Configure the remote environment**

On the remote server, set:

```shell
export SCITEX_AUDIO_MODE=remote
export SCITEX_AUDIO_RELAY_PORT=31293
# URL is auto-detected from the tunnel (localhost:31293)
```
Now `speak("Hello")` on the remote server plays audio on your local speakers.
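Conceptually, the relay target is derived from the environment; a sketch of plausible resolution logic (the function name and precedence order are assumptions for illustration, not the package's actual code):

```python
import os

def resolve_relay_url(env=os.environ):
    """Sketch: derive the relay URL from the documented variables.

    Assumed precedence (not necessarily the package's actual logic):
    an explicit SCITEX_AUDIO_RELAY_URL wins, otherwise host and port
    are combined, with the tunnel default of localhost:31293.
    """
    url = env.get("SCITEX_AUDIO_RELAY_URL")
    if url:
        return url
    host = env.get("SCITEX_AUDIO_RELAY_HOST", "localhost")
    port = env.get("SCITEX_AUDIO_RELAY_PORT", "31293")
    return f"http://{host}:{port}"

print(resolve_relay_url({"SCITEX_AUDIO_RELAY_PORT": "31293"}))  # -> http://localhost:31293
```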
### Relay Endpoints
| Endpoint | Method | Description |
|---|---|---|
| `/speak` | POST | Play TTS (`{"text": "...", "backend": "gtts"}`) |
| `/health` | GET | Health check (returns `{"status": "ok"}`) |
| `/list_backends` | GET | List available backends |
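From the remote side, `/speak` is a plain HTTP POST. A stdlib-only client sketch using only the documented payload fields (`speak_via_relay` is a hypothetical helper, not part of the package's API):

```python
import json
import urllib.request

def build_speak_payload(text, backend=None):
    """JSON body for POST /speak, using only the documented fields."""
    payload = {"text": text}
    if backend:
        payload["backend"] = backend
    return json.dumps(payload).encode()

def speak_via_relay(text, backend=None, base_url="http://localhost:31293"):
    """Send a speech request to the relay; sketch with minimal error handling."""
    req = urllib.request.Request(
        base_url + "/speak",
        data=build_speak_payload(text, backend),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.read()

# With the relay running on localhost:31293:
# speak_via_relay("Experiment complete", backend="gtts")
```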
### Auto-Start Relay (Shell Profile)
```shell
# ~/.bashrc or ~/.bash_profile (local machine with speakers)
_start_audio_relay() {
    local port="${SCITEX_AUDIO_RELAY_PORT:-31293}"
    # Skip if already running
    curl -sf "http://localhost:$port/health" >/dev/null 2>&1 && return
    # Start in background
    scitex-audio relay --port "$port" --force &>/dev/null &
    disown
}
_start_audio_relay
```
## Environment Variables
| Variable | Default | Description |
|---|---|---|
| `SCITEX_AUDIO_MODE` | `auto` | Audio mode: `local`, `remote`, or `auto` |
| `SCITEX_AUDIO_RELAY_URL` | (auto) | Full relay URL (e.g., `http://localhost:31293`) |
| `SCITEX_AUDIO_RELAY_HOST` | (none) | Relay host (combined with port to build URL) |
| `SCITEX_AUDIO_RELAY_PORT` | `31293` | Relay server port |
| `SCITEX_AUDIO_HOST` | `0.0.0.0` | Relay server bind host |
| `SCITEX_AUDIO_ELEVENLABS_API_KEY` | (none) | ElevenLabs API key |
| `SCITEX_DIR` | `~/.scitex` | Base directory for audio cache files |
| `SCITEX_CLOUD` | (none) | Set to `true` for browser relay mode (OSC escape) |
Table 3. Environment variables. Port 31293 encodes "sa-i-te-ku-su" (サイテクス) in Japanese phone keypad mapping.
### Mode Resolution
- `auto` (default): Checks local audio availability. If audio is suspended or unavailable and a relay URL is detected, routes to the relay; otherwise uses local playback.
- `local`: Always use the local TTS engine and speakers.
- `remote`: Always send speech to the relay server. Fails if the relay is unreachable.
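The resolution rules above can be sketched as a small decision function. The signature is an assumption for illustration; the package decides this internally from `SCITEX_AUDIO_MODE` and its own audio probing:

```python
def resolve_mode(mode, local_audio_ok, relay_url):
    """Sketch of the documented mode resolution (hypothetical signature).

    mode:            "local", "remote", or "auto"
    local_audio_ok:  whether local audio hardware is usable
    relay_url:       detected relay URL, or None
    """
    if mode == "local":
        return "local"
    if mode == "remote":
        if not relay_url:
            raise RuntimeError("remote mode requires a reachable relay")
        return "relay"
    # auto: prefer local playback, fall back to the relay when configured
    if local_audio_ok or not relay_url:
        return "local"
    return "relay"

print(resolve_mode("auto", False, "http://localhost:31293"))  # -> relay
```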
## Part of SciTeX
SciTeX Audio is part of SciTeX. When used inside the orchestrator package scitex, audio integrates with the session system for automatic experiment notifications:
```python
import scitex

@scitex.session
def main(CONFIG=scitex.INJECTED):
    data = scitex.io.load("input.csv")
    result = process(data)
    scitex.io.save(result, "output.csv")
    scitex.audio.speak("Experiment complete")  # notify via TTS
    return 0
```
The SciTeX ecosystem follows the Four Freedoms for researchers:
Four Freedoms for Research
- The freedom to run your research anywhere — your machine, your terms.
- The freedom to study how every step works — from raw data to final manuscript.
- The freedom to redistribute your workflows, not just your papers.
- The freedom to modify any module and share improvements with the community.
AGPL-3.0 — because research infrastructure deserves the same freedoms as the software it runs on.