SalTTS - Real-time, controllable, adaptive neural TTS system for AI VTubers
SalTTS is a production-ready text-to-speech system for real-time AI VTuber applications, with end-to-end latency under 300 ms.
Features
- Dual Generator Architecture: Neural (Flow/Diffusion/GAN) + Parametric (WORLD vocoder)
- Real-time Streaming: <300ms end-to-end latency with KV caching
- Adaptive Mixing: RL-based (Soft Actor-Critic) with HNSW memory
- Multi-modal Fusion: 3-channel NMDB integration
- High Quality: Target MOS 4.0+, character consistency 0.85+
Architecture
NMDB (3 channels) → Multi-Modal Fusion → Prosody Generation →
Dual Generators (A+B) → RL Adaptive Mixing → Post-Processing → Audio
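The mixing stage in the pipeline above can be pictured as a blend of the two generator outputs. The sketch below is illustrative only: the helper name `mix_generators`, the scalar weight `alpha`, and the linear crossfade are all assumptions, not SalTTS's documented implementation. In the real system the weight would presumably be chosen by the RL policy.

```python
from typing import List, Sequence


def mix_generators(
    neural: Sequence[float],
    parametric: Sequence[float],
    alpha: float,
) -> List[float]:
    """Linearly blend two equal-length waveforms.

    alpha = 1.0 keeps only the neural output, alpha = 0.0 only the
    parametric one. Hypothetical helper, not part of the saltts API.
    """
    if len(neural) != len(parametric):
        raise ValueError("waveforms must have the same length")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    # Sample-by-sample weighted sum of generator A and generator B.
    return [alpha * n + (1.0 - alpha) * p for n, p in zip(neural, parametric)]
```

A single scalar weight keeps the example small; a per-frame or per-band weight would follow the same pattern.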
Installation
Using Poetry (Recommended)
poetry install
Using pip
pip install -e .
Quick Start
from saltts.inference import SalTTSEngine
# Initialize engine
engine = SalTTSEngine(config_dir="config")
# Generate speech
audio = engine.synthesize(
    text="Hello, this is SalTTS!",
    emotion="neutral",
    speaker_id=0,
)
# Save audio
engine.save_audio(audio, "output.wav")
Configuration
Configuration files are located in config/:
- model.yaml - Model architectures
- runtime.yaml - Runtime settings
- training.yaml - Training hyperparameters
- nmdb_channels.yaml - NMDB configuration
- deployment.yaml - Deployment settings
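Before starting the engine it can help to confirm that the configuration directory is complete. The following is a minimal sketch using only the standard library; `missing_config_files` is a hypothetical helper, not part of the saltts package, and it checks only that the five files named above exist, not that their contents are valid.

```python
from pathlib import Path
from typing import List

# File names taken from the configuration list above.
EXPECTED_CONFIG_FILES = (
    "model.yaml",
    "runtime.yaml",
    "training.yaml",
    "nmdb_channels.yaml",
    "deployment.yaml",
)


def missing_config_files(config_dir: str = "config") -> List[str]:
    """Return the expected config files that are absent from config_dir."""
    root = Path(config_dir)
    return [name for name in EXPECTED_CONFIG_FILES if not (root / name).is_file()]
```

An empty return value means all five files are present.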
Training
# Stage 1: Train generators
python scripts/train_stage1.py
# Stage 2: Train transformer
python scripts/train_stage2.py
# Stage 3: Train integration
python scripts/train_stage3.py
# Stage 4: RL training
python scripts/train_stage4_rl.py
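The four stages can also be chained from one small driver script. This is a sketch under two assumptions: that the stages must run strictly in the order listed, and that each script signals failure with a non-zero exit code. The names `stage_commands` and `run_all_stages` are hypothetical and not part of the project.

```python
import subprocess
import sys
from typing import List

# Script paths as listed above; the strict ordering is an assumption.
STAGE_SCRIPTS = [
    "scripts/train_stage1.py",
    "scripts/train_stage2.py",
    "scripts/train_stage3.py",
    "scripts/train_stage4_rl.py",
]


def stage_commands(python: str = sys.executable) -> List[List[str]]:
    """Build one argv list per training stage."""
    return [[python, script] for script in STAGE_SCRIPTS]


def run_all_stages() -> None:
    """Run each stage in order and stop at the first failure.

    check=True makes subprocess.run raise CalledProcessError when a
    stage exits with a non-zero status, so later stages are skipped.
    """
    for cmd in stage_commands():
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, check=True)
```

Calling `run_all_stages()` from the repository root runs the same four commands shown above.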
Performance
- End-to-end latency: <300ms (target <200ms)
- Real-Time Factor: <0.5
- MOS score: 4.0+
- Character consistency: 0.85+
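Real-Time Factor is conventionally the time spent synthesizing divided by the duration of the audio produced, so a value below 1.0 means synthesis is faster than playback. The sketch below shows one way to measure it; `real_time_factor` and `timed` are hypothetical helpers, not part of the saltts API.

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """Return synthesis time divided by the duration of the generated audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def timed(fn: Callable[[], T]) -> Tuple[T, float]:
    """Call fn() and return its result together with the elapsed wall-clock seconds."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start
```

To apply this to SalTTS, wrap the `engine.synthesize(...)` call in `timed` and compute the audio duration from the sample count and the output sample rate. The sample rate is not stated here, so it has to be taken from the configuration.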
License
MIT License - see LICENSE file for details