
OuteAI Text-to-Speech (TTS)

Project description

OuteTTS

🌐 Website | 🤗 Hugging Face | 💬 Discord | 𝕏 X (Twitter) | 📰 Blog


Compatibility

OuteTTS supports the following backends:

| Backend | Type | Installation |
|---|---|---|
| Llama.cpp Python Bindings | Python | ✅ Installed by default |
| Llama.cpp Server | Python | ✅ Installed by default |
| Llama.cpp Server Async (Batched) | Python | ✅ Installed by default |
| Hugging Face Transformers | Python | ✅ Installed by default |
| ExLlamaV2 & ExLlamaV2 Async (Batched) | Python | ❌ Requires manual installation |
| VLLM (Batched, experimental support) | Python | ❌ Requires manual installation |
| Transformers.js | JavaScript | NPM package |
| Llama.cpp Directly | C++ | External library |

Batched RTF Benchmarks

Tested with NVIDIA L40S GPU

[Figure: batched RTF (real-time factor) benchmark chart]

Installation

OuteTTS Installation Guide

OuteTTS now installs the llama.cpp Python bindings by default, so choose the installation command that matches your hardware. For more detailed instructions on building llama.cpp, refer to the following resources: llama.cpp Build and llama.cpp Python.

Pip:

Transformers + llama.cpp CPU:

    pip install outetts --upgrade

Transformers + llama.cpp CUDA (NVIDIA GPUs), for systems with NVIDIA GPUs and CUDA installed:

    CMAKE_ARGS="-DGGML_CUDA=on" pip install outetts --upgrade

Transformers + llama.cpp ROCm/HIP (AMD GPUs), for systems with AMD GPUs and ROCm installed (add -DAMDGPU_TARGETS for your GPU architecture):

    CMAKE_ARGS="-DGGML_HIPBLAS=on" pip install outetts --upgrade

Transformers + llama.cpp Vulkan (cross-platform GPU), for systems with Vulkan support:

    CMAKE_ARGS="-DGGML_VULKAN=on" pip install outetts --upgrade

Transformers + llama.cpp Metal (Apple Silicon/Mac), for macOS systems with Apple Silicon or compatible GPUs:

    CMAKE_ARGS="-DGGML_METAL=on" pip install outetts --upgrade

Usage

📚 Documentation

For a complete usage guide, refer to the interface documentation here:

Documentation

Basic Usage

[!TIP] Currently, only one default English voice is available for testing.

You can easily create your own speaker profiles in just a few lines by following this guide:

👉 Creating Custom Speaker Profiles

import outetts

# Initialize the interface
interface = outetts.Interface(
    config=outetts.ModelConfig.auto_config(
        model=outetts.Models.VERSION_1_0_SIZE_1B,
        # For llama.cpp backend
        backend=outetts.Backend.LLAMACPP,
        quantization=outetts.LlamaCppQuantization.FP16
        # For transformers backend
        # backend=outetts.Backend.HF,
    )
)

# Load the default speaker profile
speaker = interface.load_default_speaker("EN-FEMALE-1-NEUTRAL")

# Or create your own speaker profiles in seconds and reuse them instantly
# speaker = interface.create_speaker("path/to/audio.wav")
# interface.save_speaker(speaker, "speaker.json")
# speaker = interface.load_speaker("speaker.json")

# Generate speech
output = interface.generate(
    config=outetts.GenerationConfig(
        text="Hello, how are you doing?",
        speaker=speaker,
    )
)

# Save to file
output.save("output.wav")

Usage Recommendations for OuteTTS version 1.0

[!IMPORTANT] Important Sampling Considerations

When using OuteTTS version 1.0, it is crucial to use the settings specified in the Sampling Configuration section. The repetition penalty implementation is particularly important: this model requires the penalty to be applied over a recent window of 64 tokens, rather than across the entire context window. Penalizing the entire context will cause the model to produce broken or low-quality output.

To address this limitation, all necessary samplers and patches for all backends are set up automatically in the outetts library. If using a custom implementation, ensure you correctly implement these requirements.
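
For illustration, the sketch below shows one way a windowed repetition penalty can be expressed as a Hugging Face transformers LogitsProcessor for a custom pipeline. The class name and the exact penalty rule are assumptions for this example; it is not the outetts implementation, which already handles this for every supported backend.

import torch
from transformers import LogitsProcessor

class WindowedRepetitionPenalty(LogitsProcessor):
    """Hypothetical example: penalize only tokens seen in the last `window` tokens."""

    def __init__(self, penalty: float = 1.1, window: int = 64):
        self.penalty = penalty
        self.window = window

    def __call__(self, input_ids, scores):
        recent = input_ids[:, -self.window:]          # restrict to the 64-token recent window
        picked = torch.gather(scores, 1, recent)
        # Standard repetition-penalty rule: shrink positive logits, grow negative ones.
        picked = torch.where(picked < 0, picked * self.penalty, picked / self.penalty)
        return scores.scatter(1, recent, picked)

# Usage sketch with transformers:
# model.generate(..., logits_processor=[WindowedRepetitionPenalty(penalty=1.1, window=64)])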

Speaker Reference

The model is designed to be used with a speaker reference. Without one, it generates random vocal characteristics, often leading to lower-quality output. The model inherits the referenced speaker's emotion, style, and accent, so when generating speech in another language with the same speaker, the output may retain the original accent. For example, if you use a Japanese speaker reference and continue speech in English, the model may keep a Japanese accent.

Multilingual Application

It is recommended to create a speaker profile in the language you intend to use. This helps achieve the best results in that specific language, including tone, accent, and linguistic features.

While the model supports cross-lingual speech, it still relies on the reference speaker. If the speaker has a distinct accent—such as British English—other languages may carry that accent as well.
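
As a concrete illustration (file names are placeholders), building the profile from a recording in the target language keeps accent and prosody consistent for that language:

# Build a speaker profile from a recording in the language you plan to generate.
speaker_ja = interface.create_speaker("reference_japanese.wav")  # placeholder path to a Japanese clip
interface.save_speaker(speaker_ja, "speaker_ja.json")            # reuse the profile in later runs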

Optimal Audio Length

  • Best Performance: Generate audio around 42 seconds in a single run (approximately 8,192 tokens). It is recommended not to approach the limit of this window when generating; the best results are usually obtained up to about 7,000 tokens.
  • Context Reduction with Speaker Reference: If the speaker reference is 10 seconds long, the effective context is reduced to approximately 32 seconds.
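
A rough way to budget this, using only the figures above (a back-of-the-envelope sketch, not an exact outetts token accounting):

# Approximate audio-token budget: ~8,192 tokens cover ~42 seconds of audio.
TOKENS_PER_SECOND = 8192 / 42   # ~195 audio tokens per second
CONTEXT_TOKENS = 8192

def remaining_seconds(reference_seconds: float) -> float:
    """Approximate seconds of new audio that still fit after the speaker reference."""
    used = reference_seconds * TOKENS_PER_SECOND
    return max(CONTEXT_TOKENS - used, 0.0) / TOKENS_PER_SECOND

print(round(remaining_seconds(10)))  # -> 32, matching the example above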

Temperature Setting Recommendations

Testing shows that a temperature of 0.4 is an ideal starting point for accuracy (with the sampling settings below). However, some voice references may benefit from higher temperatures for enhanced expressiveness or slightly lower temperatures for more precise voice replication.

Verifying Speaker Encoding

If the cloned voice quality is subpar, check the encoded speaker sample.

interface.decode_and_save_speaker(speaker=your_speaker, path="speaker.wav")

The DAC audio reconstruction model is lossy, and samples with clipping, excessive loudness, or unusual vocal features may introduce encoding issues that impact output quality.
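
A full round-trip check might look like this (paths are placeholders); both calls appear in the usage examples above:

# Encode a reference clip, then decode it back and listen to the result.
speaker = interface.create_speaker("path/to/audio.wav")
interface.decode_and_save_speaker(speaker=speaker, path="speaker_check.wav")
# If speaker_check.wav already sounds distorted, the reference clip (clipping,
# excessive loudness, unusual vocal features) is the problem, not the generation step.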

Sampling Configuration

For optimal results with this TTS model, use the following sampling settings.

| Parameter | Value |
|---|---|
| Temperature | 0.4 |
| Repetition Penalty | 1.1 |
| Repetition Range | 64 |
| Top-k | 40 |
| Top-p | 0.9 |
| Min-p | 0.05 |
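
Expressed in code, the same values might be passed as follows. This is an illustrative sketch: SamplerConfig and its field names are assumed from the interface documentation and may differ between versions, and the 64-token repetition range is applied automatically by the library's backend patches.

output = interface.generate(
    config=outetts.GenerationConfig(
        text="Hello, how are you doing?",
        speaker=speaker,
        # SamplerConfig and its field names are assumptions; verify against your
        # installed outetts version and the interface documentation.
        sampler_config=outetts.SamplerConfig(
            temperature=0.4,
            repetition_penalty=1.1,
            top_k=40,
            top_p=0.9,
            min_p=0.05,
        ),
    )
)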

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

outetts-0.4.4.tar.gz (307.2 kB)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

outetts-0.4.4-py3-none-any.whl (415.7 kB)

Uploaded Python 3

File details

Details for the file outetts-0.4.4.tar.gz.

File metadata

  • Download URL: outetts-0.4.4.tar.gz
  • Upload date:
  • Size: 307.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.12.7

File hashes

Hashes for outetts-0.4.4.tar.gz

| Algorithm | Hash digest |
|---|---|
| SHA256 | 7f9ad451967b1857079c2ea827a06585ab4d3a3f6d457368873e7af0e618b238 |
| MD5 | 921e3e0ee75318e3da73e8ca0c70ab15 |
| BLAKE2b-256 | 78904b5f5468a879b49c14bcc30b74e595f087c5ad374479167f336caa457154 |

See more details on using hashes here.

File details

Details for the file outetts-0.4.4-py3-none-any.whl.

File metadata

  • Download URL: outetts-0.4.4-py3-none-any.whl
  • Upload date:
  • Size: 415.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.12.7

File hashes

Hashes for outetts-0.4.4-py3-none-any.whl

| Algorithm | Hash digest |
|---|---|
| SHA256 | 68815ccf063c33c11ed62292dc41fa7c0e5d37e28976b9504a0e82f0972cc32c |
| MD5 | 202de8f0e3698370100d3330db408d91 |
| BLAKE2b-256 | 1bea980e108960caeacc6de1cbf5d5ba61a74b86869ac56bb1321eb18706f1b0 |

See more details on using hashes here.
