
SciTeX Audio (scitex-audio)


Text-to-Speech with multiple backends for scientific workflows

PyPI version Documentation Tests License: AGPL-3.0

Full Documentation · pip install scitex-audio


Problem

Scientific workflows increasingly rely on AI agents that run in parallel on remote servers or in headless environments. Researchers may prefer auditory feedback over text for experiment-completion notifications, error alerts, or accessibility, yet these machines offer no direct access to audio hardware.

Solution

SciTeX Audio provides a unified TTS interface with automatic backend fallback and smart local/remote routing to speakers on your desktop. It works on local machines, remote servers (via relay), and WSL environments with automatic audio path detection.

Backend     Quality  Cost  Internet        Offline  Default Speed
ElevenLabs  High     Paid  Required        No       1.2x
LuxTTS      High     Free  First download  Yes      2.0x
Google TTS  Good     Free  Required        No       1.5x
System TTS  Basic    Free  No              Yes      150 wpm

Table 1. Supported TTS backends. The fallback order (elevenlabs → luxtts → gtts → pyttsx3) ensures the best available quality is always used.
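
The documented fallback order can be sketched as a simple priority scan. This is a hedged illustration, not the library's internal code; `FALLBACK_ORDER` and `first_available` are hypothetical names:

```python
# Minimal sketch of backend fallback: walk the priority list and
# return the first backend that is actually installed.
FALLBACK_ORDER = ["elevenlabs", "luxtts", "gtts", "pyttsx3"]

def first_available(installed):
    """Return the highest-quality backend present in `installed`."""
    for name in FALLBACK_ORDER:
        if name in installed:
            return name
    raise RuntimeError("no TTS backend available")

print(first_available({"gtts", "pyttsx3"}))  # gtts
```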

Installation

Requires Python >= 3.10.

pip install scitex-audio

Install with specific backends:

pip install scitex-audio[gtts]         # Google TTS
pip install scitex-audio[pyttsx3]      # System TTS (+ apt install espeak-ng)
pip install scitex-audio[elevenlabs]   # ElevenLabs
pip install scitex-audio[luxtts]       # LuxTTS (voice cloning, offline)
pip install scitex-audio[all]          # Everything
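
To check which optional backends are importable in your environment, a quick standard-library probe works. The module names below are assumptions based on the extras above (PyPI package names and import names can differ):

```python
# Probe which optional backend packages can be imported, without
# actually importing them (find_spec only inspects the path).
import importlib.util

CANDIDATES = {"gtts": "gtts", "pyttsx3": "pyttsx3", "elevenlabs": "elevenlabs"}

installed = [
    extra for extra, module in CANDIDATES.items()
    if importlib.util.find_spec(module) is not None
]
print(installed)  # e.g., ['gtts', 'pyttsx3']
```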

Quick Start

from scitex_audio import speak, available_backends

# Check what's available
print(available_backends())  # e.g., ['gtts', 'pyttsx3']

# Speak with auto-selected backend
speak("Hello from SciTeX Audio!")

# Choose a specific backend
speak("Bonjour", backend="gtts", voice="fr")

# Save without playing
speak("Save this", output_path="output.mp3", play=False)

Four Interfaces

Python API
import scitex_audio

scitex_audio.speak("Hello!")                         # auto backend
scitex_audio.speak("Fast", backend="gtts", speed=1.5)
scitex_audio.available_backends()                    # list backends
scitex_audio.check_wsl_audio()                       # WSL audio status
scitex_audio.generate_bytes("As bytes")              # raw MP3 bytes
scitex_audio.stop_speech()                           # kill playback

tts = scitex_audio.get_tts("gtts")                   # get engine
tts.speak("With engine", voice="fr")

Full API reference

CLI Commands
scitex-audio --help-recursive             # Show all commands
scitex-audio speak "Hello world"          # Speak text
scitex-audio speak "Bonjour" -b gtts -v fr
scitex-audio backends                     # List backends
scitex-audio check                        # Audio status (WSL)
scitex-audio stop                         # Stop playback
scitex-audio relay --port 31293           # Start relay server
scitex-audio list-python-apis             # List Python API tree
scitex-audio mcp list-tools               # List MCP tools

Full CLI reference

MCP Server — for AI Agents

AI agents can speak through the MCP protocol for notifications and accessibility. The agent generates text, the MCP server synthesizes speech on the host machine, and audio plays through the user's local speakers — even when the agent runs on a remote server (via the relay on port 31293).

Tool                Description
audio_speak         Convert text to speech with backend fallback
list_backends       List available TTS backends and status
check_audio_status  Check WSL audio connectivity
announce_context    Announce current directory and git branch

Table 2. Four MCP tools available for AI-assisted audio. All tools accept JSON parameters and return JSON results.
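
As a rough illustration, an audio_speak invocation could carry a JSON payload like the following. The argument names ("text", "backend") follow the /speak relay payload documented below; the exact MCP schema may differ:

```python
# Hypothetical tool-call payload an agent might send over MCP.
import json

request = {
    "name": "audio_speak",
    "arguments": {"text": "Experiment finished", "backend": "gtts"},
}
print(json.dumps(request))
```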

Claude Code Setup

Add .mcp.json to your project root. Use SCITEX_AUDIO_ENV_SRC to load all configuration from a .src file — this keeps .mcp.json static across environments:

{
  "mcpServers": {
    "scitex-audio": {
      "command": "scitex-audio",
      "args": ["mcp", "start"],
      "env": {
        "SCITEX_AUDIO_ENV_SRC": "${SCITEX_AUDIO_ENV_SRC}"
      }
    }
  }
}

Then switch environments via your shell profile:

# Local machine (has speakers)
export SCITEX_AUDIO_ENV_SRC=~/.scitex/audio/local.src

# Remote server (no speakers, uses relay)
export SCITEX_AUDIO_ENV_SRC=~/.scitex/audio/remote.src

Generate a template .src file:

scitex-audio env-template -o ~/.scitex/audio/local.src

Example local.src:

export SCITEX_AUDIO_MODE=local
export SCITEX_AUDIO_RELAY_PORT=31293
export SCITEX_AUDIO_ELEVENLABS_API_KEY="your-key-here"

Example remote.src:

export SCITEX_AUDIO_MODE=remote
export SCITEX_AUDIO_RELAY_PORT=31293
export SCITEX_AUDIO_ELEVENLABS_API_KEY="your-key-here"

Or install globally:

scitex-audio mcp install

Full MCP specification

Skills — for AI Agent Discovery

Skills provide workflow-oriented guides that AI agents query to discover capabilities and usage patterns.

scitex-audio skills list              # List available skill pages
scitex-audio skills get SKILL         # Show main skill page
scitex-dev skills export --package scitex-audio  # Export to Claude Code
Skill                    Content
quick-start              Basic usage, first call, return values
available-backends       All TTS backends, capabilities, install commands
smart-routing            Auto/local/remote modes, relay server, SSH tunneling
cli-commands             Complete CLI reference
mcp-tools-for-ai-agents  MCP tools and installation
common-workflows         Notification patterns, multi-backend, save audio

Remote Audio Relay

When agents run on remote servers (NAS, cloud, HPC), they have no speakers. The relay server solves this: the local machine (with speakers) runs a lightweight HTTP server, and the remote agent sends speech requests over an SSH tunnel.

Architecture

Remote Server (NAS/Cloud)          Local Machine (has speakers)
┌─────────────────────┐            ┌──────────────────────┐
│ AI Agent            │            │ Relay Server         │
│   speak("Hello")    │            │   scitex-audio relay │
│     ↓               │            │     ↓                │
│ POST /speak ────────┼── SSH ─────┼→ TTS engine          │
│   (port 31293)      │  tunnel    │     ↓                │
│                     │            │   🔊 Speakers        │
└─────────────────────┘            └──────────────────────┘

Setup

Step 1: Start relay on local machine (has speakers)

scitex-audio relay --port 31293

Step 2: SSH tunnel from local to remote

Add to your ~/.ssh/config:

Host my-server
  # Audio relay: map remote port 31293 back to the local relay
  HostName 192.168.0.69
  User myuser
  RemoteForward 31293 127.0.0.1:31293

Then connect: ssh my-server. The tunnel maps remote port 31293 back to your local relay.

Step 3: Configure remote environment

On the remote server, set:

export SCITEX_AUDIO_MODE=remote
export SCITEX_AUDIO_RELAY_PORT=31293
# URL is auto-detected from the tunnel (localhost:31293)

Now speak("Hello") on the remote server plays audio on your local speakers.

Relay Endpoints

Endpoint        Method  Description
/speak          POST    Play TTS ({"text": "...", "backend": "gtts"})
/health         GET     Health check (returns {"status": "ok"})
/list_backends  GET     List available backends
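
These endpoints can be exercised from Python with only the standard library. A minimal sketch, assuming a relay listening on localhost:31293 (the request is built but only sent when a relay is actually running):

```python
# Build a POST /speak request against a local relay server.
import json
import urllib.request

RELAY = "http://localhost:31293"

def speak_request(text, backend="gtts"):
    """Build (but do not send) a POST /speak request."""
    body = json.dumps({"text": text, "backend": backend}).encode()
    return urllib.request.Request(
        f"{RELAY}/speak",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = speak_request("Job done")
print(req.full_url, req.method)
# To send (requires a running relay):
#   urllib.request.urlopen(req)
```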

Auto-Start Relay (Shell Profile)

Add to your shell profile (e.g., ~/.bashrc) on the local machine (the one with speakers):

export SCITEX_AUDIO_RELAY_PORT=31293
export SCITEX_AUDIO_MODE=local

_audio_health_ok() {
    local port="${SCITEX_AUDIO_RELAY_PORT:-31293}"
    curl -sf --connect-timeout 2 "http://localhost:$port/health" >/dev/null 2>&1
}

_start_audio_relay() {
    local port="${SCITEX_AUDIO_RELAY_PORT:-31293}"

    # Already healthy -> skip
    _audio_health_ok && return

    # Start relay in background
    scitex-audio relay --port "$port" --force &>/dev/null &
    disown
}

_start_audio_relay

On the remote server (NAS, cloud, HPC), add to the shell profile:

export SCITEX_AUDIO_RELAY_PORT=31293
export SCITEX_AUDIO_MODE=remote
# Relay URL is auto-detected from the SSH RemoteForward tunnel

With this setup, every SSH session automatically has audio routed back to your speakers. AI agents (Claude Code, etc.) running on the remote server call speak() and the audio plays locally.

Environment Variables

Variable                         Default    Description
SCITEX_AUDIO_MODE                auto       Audio mode: local, remote, or auto
SCITEX_AUDIO_RELAY_URL           (auto)     Full relay URL (e.g., http://localhost:31293)
SCITEX_AUDIO_RELAY_HOST          (none)     Relay host (combined with port to build URL)
SCITEX_AUDIO_RELAY_PORT          31293      Relay server port
SCITEX_AUDIO_HOST                0.0.0.0    Relay server bind host
SCITEX_AUDIO_ELEVENLABS_API_KEY  (none)     ElevenLabs API key
SCITEX_DIR                       ~/.scitex  Base directory for audio cache files
SCITEX_CLOUD                     (none)     Set to true for browser relay mode (OSC escape)

Table 3. Environment variables. Port 31293 encodes "sa-i-te-ku-su" (サイテクス) in Japanese phone keypad mapping.

Mode Resolution

  • auto (default): Checks local audio availability. If suspended/unavailable and relay URL detected, routes to relay. Otherwise uses local.
  • local: Always use local TTS engine and speakers.
  • remote: Always send speech to relay server. Fails if relay unreachable.
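
The resolution rules above can be summarized in a few lines. A hedged sketch; `resolve_mode` and its arguments are hypothetical stand-ins for the library's internal checks:

```python
# Sketch of the documented mode-resolution rules.
def resolve_mode(mode, local_audio_ok, relay_url):
    if mode == "local":
        return "local"
    if mode == "remote":
        # remote mode fails hard when no relay is reachable
        if relay_url is None:
            raise RuntimeError("relay unreachable")
        return "relay"
    # mode == "auto": prefer local audio, fall back to a detected relay
    if local_audio_ok:
        return "local"
    if relay_url is not None:
        return "relay"
    return "local"

print(resolve_mode("auto", False, "http://localhost:31293"))  # relay
```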

Part of SciTeX

SciTeX Audio is part of SciTeX.

The SciTeX system follows the Four Freedoms for Research below, inspired by the Free Software Definition:

Four Freedoms for Research

  1. The freedom to run your research anywhere — your machine, your terms.
  2. The freedom to study how every step works — from raw data to final manuscript.
  3. The freedom to redistribute your workflows, not just your papers.
  4. The freedom to modify any module and share improvements with the community.

AGPL-3.0 — because we believe research infrastructure deserves the same freedoms as the software it runs on.

Acknowledgements

  • LuxTTS — open-source, offline TTS engine with voice cloning support


