SalTTS - Real-time, controllable, adaptive neural TTS system for AI VTubers

SalTTS is a production-ready text-to-speech system designed for real-time AI VTuber applications, with end-to-end latency under 300 ms.

Features

  • Dual Generator Architecture: Neural (Flow/Diffusion/GAN) + Parametric (WORLD vocoder)
  • Real-time Streaming: <300ms end-to-end latency with KV caching
  • Adaptive Mixing: RL-based (Soft Actor-Critic) with HNSW memory
  • Multi-modal Fusion: 3-channel NMDB integration
  • High Quality: Target MOS 4.0+, character consistency 0.85+

Architecture

NMDB (3 channels) → Multi-Modal Fusion → Prosody Generation →
Dual Generators (A+B) → RL Adaptive Mixing → Post-Processing → Audio
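
The README does not say how the mixing stage combines the outputs of generators A and B. Purely as an illustration, one simple possibility is a per-sample convex combination controlled by a single weight, which an RL policy could then choose. The function name `mix_generators` and the scalar `weight` are assumptions made for this sketch, not part of the SalTTS API.

```python
from typing import List, Sequence


def mix_generators(
    neural: Sequence[float],
    parametric: Sequence[float],
    weight: float,
) -> List[float]:
    """Blend two equal-length waveforms sample by sample.

    weight=1.0 returns the neural output unchanged,
    weight=0.0 returns the parametric output unchanged.
    (Hypothetical helper; the real mixing stage may differ.)
    """
    if len(neural) != len(parametric):
        raise ValueError("generator outputs must have the same length")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    return [weight * a + (1.0 - weight) * b for a, b in zip(neural, parametric)]
```

Keeping the weight in [0, 1] means the mixed signal never exceeds the larger of the two inputs in magnitude, so no extra clipping step is needed in this simplified setup.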

Installation

Using Poetry (Recommended)

poetry install

Using pip

pip install -e .

Quick Start

from saltts.inference import SalTTSEngine

# Initialize engine
engine = SalTTSEngine(config_dir="config")

# Generate speech
audio = engine.synthesize(
    text="Hello, this is SalTTS!",
    emotion="neutral",
    speaker_id=0
)

# Save audio
engine.save_audio(audio, "output.wav")

Configuration

Configuration files are located in config/:

  • model.yaml - Model architectures
  • runtime.yaml - Runtime settings
  • training.yaml - Training hyperparameters
  • nmdb_channels.yaml - NMDB configuration
  • deployment.yaml - Deployment settings
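
Since the engine is pointed at a configuration directory, it can help to confirm that all five files listed above are present before starting it. The following is a minimal sketch using only the standard library; the helper name `missing_config_files` is hypothetical, and whether SalTTS actually requires every file is an assumption.

```python
from pathlib import Path
from typing import List

# File names taken from the list above.
EXPECTED_CONFIG_FILES = [
    "model.yaml",
    "runtime.yaml",
    "training.yaml",
    "nmdb_channels.yaml",
    "deployment.yaml",
]


def missing_config_files(config_dir: str = "config") -> List[str]:
    """Return the expected config file names that are absent from config_dir."""
    root = Path(config_dir)
    return [name for name in EXPECTED_CONFIG_FILES if not (root / name).is_file()]
```

An empty return value means every expected file was found; otherwise the list names what is missing, in the same order as above.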

Training

# Stage 1: Train generators
python scripts/train_stage1.py

# Stage 2: Train transformer
python scripts/train_stage2.py

# Stage 3: Train integration
python scripts/train_stage3.py

# Stage 4: RL training
python scripts/train_stage4_rl.py
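
The stage numbering suggests the scripts are meant to run in order, though the README does not state whether each stage depends on the previous one. Assuming it does, a small driver like the sketch below runs them sequentially and stops at the first failure. The script paths come from the commands above; `run_stages` is a hypothetical helper, and any command-line arguments the scripts may accept are not covered here.

```python
import subprocess
import sys
from typing import Sequence

# Script paths as listed in the Training section.
STAGE_SCRIPTS = [
    "scripts/train_stage1.py",
    "scripts/train_stage2.py",
    "scripts/train_stage3.py",
    "scripts/train_stage4_rl.py",
]


def run_stages(
    scripts: Sequence[str] = STAGE_SCRIPTS,
    python: str = sys.executable,
) -> None:
    """Run each training script in order with the given interpreter.

    check=True makes subprocess.run raise CalledProcessError on a
    non-zero exit code, so later stages are skipped after a failure.
    """
    for script in scripts:
        print(f"Running {script}")
        subprocess.run([python, script], check=True)
```

Calling `run_stages()` from the repository root would then execute all four stages with the current Python interpreter.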

Performance

  • End-to-end latency: <300ms (target <200ms)
  • Real-Time Factor: <0.5
  • MOS score: 4.0+
  • Character consistency: 0.85+
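
The README does not describe how these figures are measured. Real-Time Factor is conventionally defined as synthesis time divided by the duration of the audio produced, so a value below 1.0 means faster than real time. The sketch below applies that definition; `real_time_factor` and `measure` are hypothetical helpers, and `synthesize` stands in for any zero-argument callable that returns a sequence of audio samples.

```python
import time
from typing import Callable, Sequence, Tuple


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent synthesizing / duration of the audio produced."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def measure(
    synthesize: Callable[[], Sequence[float]],
    sample_rate: int,
) -> Tuple[float, float]:
    """Time one synthesis call and return (elapsed_seconds, rtf)."""
    start = time.perf_counter()
    samples = synthesize()
    elapsed = time.perf_counter() - start
    audio_seconds = len(samples) / sample_rate
    return elapsed, real_time_factor(elapsed, audio_seconds)
```

For example, 1 second of processing for 4 seconds of audio gives an RTF of 0.25. Note that this times a whole utterance; for a streaming system, end-to-end latency would more naturally be measured as the time until the first audio chunk is available.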

License

MIT License - see LICENSE file for details

Download files

Source Distribution

saltts-0.1.0.tar.gz (28.5 kB)

Uploaded Source

Built Distribution

saltts-0.1.0-py3-none-any.whl (43.6 kB)

Uploaded Python 3

File details

Details for the file saltts-0.1.0.tar.gz.

File metadata

  • Download URL: saltts-0.1.0.tar.gz
  • Upload date:
  • Size: 28.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for saltts-0.1.0.tar.gz
Algorithm    Hash digest
SHA256       aaab4e8e5d88228adba214fe54993dca5339d693a72ba7463217c791bbdfddfb
MD5          39975b8d2bb2e90a0b325f27bd0bc5df
BLAKE2b-256  6c4b67bc8a8f0126b1a0dcda4cb5034d1b1ee6a88683ecf2cf0a4a0414015203

File details

Details for the file saltts-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: saltts-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 43.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for saltts-0.1.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       e5e4ef62396e6c723198cc752ce15467ec5a0b55b897f49af65de13a46c2478d
MD5          f29aa56c2c77735612bf069be3c19628
BLAKE2b-256  76fe1084dffea75eae546c9b5b5556130f228a8205246d0fa9e456e9e6944be3
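
The published digests can be used to check a downloaded file before installing it. Below is a minimal sketch using Python's standard `hashlib` module; the helper names `sha256_of` and `matches_sha256` are illustrative, and the local file path is whatever location the archive was saved to.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For instance, `matches_sha256("saltts-0.1.0-py3-none-any.whl", "<SHA256 value listed for that file>")` should return True for an intact download of the wheel.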