Text to speech using F5-TTS library

Sinapsis F5-TTS

Templates for advanced text-to-speech synthesis with F5-TTS

🐍 Installation • 🚀 Features • 📚 Usage example • 🌐 Webapp • 📙 Documentation • 🔍 License

The Sinapsis F5-TTS package provides a template for integrating, configuring, and running text-to-speech (TTS) synthesis powered by F5-TTS.

🐍 Installation

Install using your preferred package manager. We strongly recommend uv, although any other package manager should work too. If you need to install uv, see the official documentation.

Example with uv:

  uv pip install sinapsis-f5-tts --extra-index-url https://pypi.sinapsis.tech

or with raw pip:

  pip install sinapsis-f5-tts --extra-index-url https://pypi.sinapsis.tech

[!IMPORTANT] Templates in each package may require extra dependencies. For development, we recommend installing the package with all the optional dependencies:

with uv:

  uv pip install "sinapsis-f5-tts[all]" --extra-index-url https://pypi.sinapsis.tech

or with raw pip:

  pip install "sinapsis-f5-tts[all]" --extra-index-url https://pypi.sinapsis.tech

🚀 Features

Templates Supported

This module includes a template for text-to-speech synthesis using the F5TTS model:

F5TTSInference: Converts text to speech using the F5TTS model with voice cloning capabilities. The template processes text packets from the input container, generates corresponding audio using F5TTS, and adds the resulting audio packets to the container.

Attributes

model: Model name to use for inference (default: "F5TTS_v1_Base")
model_cfg: Optional path to model configuration file
ckpt_file: Optional path to model checkpoint file
vocab_file: Optional path to vocabulary file
ref_audio: Path to reference audio file for voice cloning (default: "artifacts/town.mp3")
ref_text: Reference text corresponding to the reference audio (default: empty string)
vocoder_name: Vocoder to use for waveform generation, either "vocos" or "bigvgan" (default: "vocos")
load_vocoder_from_local: Whether to load vocoder from local path (default: False)
nfe_step: Number of function evaluation steps for diffusion; higher values give better quality but slower inference (default: 32)
cfg_strength: Classifier-free guidance strength; higher values give more stable output but less expressivity (default: 2.0)
cross_fade_duration: Duration of the cross-fade between audio segments, in seconds (default: 0.15)
speed: Speed factor for generated speech; values > 1 make speech faster, values < 1 make it slower (default: 1.0)
sway_sampling_coef: Coefficient for sway sampling (default: -1.0)
target_rms: Target RMS value for audio normalization (default: None)
fix_duration: Fixed duration for generated audio in seconds (default: None)
remove_silence: Whether to remove silence from generated audio (default: False)
save_chunk: Whether to save individual audio chunks (default: False)
device: Device to use for inference, e.g., "cuda", "cpu" (default: None, auto-detect)
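
The attribute list above can be sanity-checked before a config is handed to an agent. The following is a minimal sketch, not part of the package: `F5TTSAttributes` and `validate` are hypothetical names that mirror a subset of the documented attributes and defaults. Only the two vocoder options come from the list above; the numeric range checks are assumptions about what sensible values look like.

```python
from dataclasses import dataclass
from typing import List, Optional

# The two vocoder options named in the attribute list.
ALLOWED_VOCODERS = {"vocos", "bigvgan"}


@dataclass
class F5TTSAttributes:
    """Hypothetical mirror of a subset of the documented attributes and defaults."""

    model: str = "F5TTS_v1_Base"
    ref_audio: str = "artifacts/town.mp3"
    ref_text: str = ""
    vocoder_name: str = "vocos"
    nfe_step: int = 32
    cfg_strength: float = 2.0
    cross_fade_duration: float = 0.15
    speed: float = 1.0
    device: Optional[str] = None  # None means auto-detect


def validate(attrs: F5TTSAttributes) -> List[str]:
    """Return a list of problems; an empty list means the values look sane."""
    problems = []
    if attrs.vocoder_name not in ALLOWED_VOCODERS:
        problems.append(f"vocoder_name must be one of {sorted(ALLOWED_VOCODERS)}")
    # The range checks below are assumptions, not rules stated by the package.
    if attrs.nfe_step <= 0:
        problems.append("nfe_step must be a positive integer")
    if attrs.speed <= 0:
        problems.append("speed must be greater than 0")
    if attrs.cross_fade_duration < 0:
        problems.append("cross_fade_duration must not be negative")
    return problems
```

Calling `validate(F5TTSAttributes())` with the documented defaults returns an empty list, while an unsupported vocoder name produces a single message.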

[!TIP] Use CLI command sinapsis info --example-template-config TEMPLATE_NAME to produce an example Agent config for the Template specified in TEMPLATE_NAME.

For example, for F5TTSInference use sinapsis info --example-template-config F5TTSInference to produce an example config like:

agent:
  name: my_test_agent
templates:
- template_name: InputTemplate
  class_name: InputTemplate
  attributes: {}
- template_name: F5TTSInference
  class_name: F5TTSInference
  template_input: InputTemplate
  attributes:
    model: F5TTS_v1_Base
    model_cfg: null
    ckpt_file: null
    vocab_file: null
    ref_audio: '`replace_me:<class ''str''>`'
    ref_text: ' '
    vocoder_name: vocos
    load_vocoder_from_local: false
    nfe_step: 32
    cfg_strength: 2.0
    cross_fade_duration: 0.15
    speed: 1.0
    sway_sampling_coef: -1.0
    target_rms: null
    fix_duration: null
    remove_silence: false
    save_chunk: false
    device: null
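
The generated example marks required values with a `replace_me` placeholder (here, `ref_audio`). As a rough sketch, the helper below lists attributes that still carry that placeholder. `find_placeholders` is a hypothetical function, and the dictionary is a hand-copied subset of the attributes above rather than output parsed from the CLI.

```python
from typing import Any, Dict, List


def find_placeholders(attributes: Dict[str, Any], marker: str = "replace_me") -> List[str]:
    """Return the attribute names whose string value still contains the marker."""
    return [
        name
        for name, value in attributes.items()
        if isinstance(value, str) and marker in value
    ]


# Subset of the generated attributes, copied by hand for illustration.
generated = {
    "model": "F5TTS_v1_Base",
    "ref_audio": "`replace_me:<class 'str'>`",
    "ref_text": " ",
    "vocoder_name": "vocos",
    "device": None,
}

print(find_placeholders(generated))  # → ['ref_audio']
```

Once `ref_audio` is set to a real path, the helper returns an empty list.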

📚 Usage example

This example illustrates how to use the F5TTSInference template for text-to-speech synthesis. It converts text input into speech using F5-TTS and saves the resulting audio file locally.

Config

agent:
  name: f5tts_agent
  description: "Agent that generates speech from text using the F5TTS model."

templates:
- template_name: InputTemplate
  class_name: InputTemplate
  attributes: {}

- template_name: TextInput
  class_name: TextInput
  template_input: InputTemplate
  attributes:
    source: "user_input"
    text: "A bottle of water with a soup"

- template_name: F5TTSInference
  class_name: F5TTSInference
  template_input: TextInput
  attributes:
    model: "F5TTS_v1_Base"
    ref_audio: "artifacts/small.mp3"
    ref_text: " "
    vocoder_name: "vocos"
    nfe_step: 32
    cfg_strength: 2.0
    cross_fade_duration: 0.15
    speed: 1.0
    sway_sampling_coef: -1

- template_name: SaveGeneratedAudio
  class_name: AudioWriterSoundfile
  template_input: F5TTSInference
  attributes:
    save_dir: "f5_tts"
    root_dir: "artifacts"
    extension: "wav"

This configuration defines an agent and a sequence of templates for converting text to speech using F5-TTS.
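
Each template in the config names its upstream template through `template_input`, so a typo there breaks the sequence. The sketch below checks that every `template_input` refers to a template defined earlier in the list. `check_template_chain` is a hypothetical helper, and treating definition order as significant is an assumption of this check, not a documented rule of sinapsis.

```python
from typing import Dict, List


def check_template_chain(templates: List[Dict[str, str]]) -> List[str]:
    """Return errors for template_input values that do not name an earlier template."""
    seen = set()
    errors = []
    for entry in templates:
        name = entry["template_name"]
        source = entry.get("template_input")
        if source is not None and source not in seen:
            errors.append(f"{name}: template_input '{source}' is not defined earlier")
        seen.add(name)
    return errors


# The four templates from the usage example, without their attributes.
templates = [
    {"template_name": "InputTemplate", "class_name": "InputTemplate"},
    {"template_name": "TextInput", "class_name": "TextInput",
     "template_input": "InputTemplate"},
    {"template_name": "F5TTSInference", "class_name": "F5TTSInference",
     "template_input": "TextInput"},
    {"template_name": "SaveGeneratedAudio", "class_name": "AudioWriterSoundfile",
     "template_input": "F5TTSInference"},
]

print(check_template_chain(templates))  # → []
```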

[!IMPORTANT] The TextInput and AudioWriterSoundfile templates come from the sinapsis-data-readers and sinapsis-data-writers packages, respectively. To run this example, make sure both packages are installed.

To run the config, use the CLI:

sinapsis run name_of_config.yml

🌐 Webapp

The webapp included in this project showcases the modularity of the F5TTS template for speech generation tasks.

[!IMPORTANT] To run the app you first need to clone this repository:

git clone git@github.com:Sinapsis-ai/sinapsis-speech.git
cd sinapsis-speech

[!NOTE] If you'd like to enable external app sharing in Gradio, export GRADIO_SHARE_APP=True

[!IMPORTANT] F5TTS requires a reference audio file for voice cloning. Make sure you have a reference audio file in the artifacts directory.

🐳 Docker

[!IMPORTANT] This Docker image depends on the sinapsis-nvidia:base image. Please refer to the official sinapsis instructions to Build with Docker.

Build the sinapsis-speech image:

docker compose -f docker/compose.yaml build

Start the app container:

docker compose -f docker/compose_apps.yaml up -d sinapsis-f5tts

Check the logs:

docker logs -f sinapsis-f5tts

The logs will display the URL to access the webapp, e.g.:

Running on local URL:  http://127.0.0.1:7860

NOTE: The URL may be different; check the log output for the correct address.
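
Because the address can differ between runs, it may be convenient to pull it out of the log text programmatically. This is a small sketch that assumes the log line has the exact "Running on local URL:" form shown above; `extract_local_url` is a hypothetical helper, not something shipped with the webapp.

```python
import re
from typing import Optional

# Matches the log line shown above and captures the URL that follows it.
URL_LINE = re.compile(r"Running on local URL:\s+(\S+)")


def extract_local_url(log_text: str) -> Optional[str]:
    """Return the first local URL found in the log text, or None if absent."""
    match = URL_LINE.search(log_text)
    return match.group(1) if match else None


print(extract_local_url("Running on local URL:  http://127.0.0.1:7860"))
# → http://127.0.0.1:7860
```

The input could be the captured output of the docker logs command above.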

To stop the app:

docker compose -f docker/compose_apps.yaml down

💻 UV

To run the webapp using the uv package manager, follow these steps:

Sync the virtual environment:

uv sync --frozen

Install the wheel:

uv pip install "sinapsis-speech[all]" --extra-index-url https://pypi.sinapsis.tech

Run the webapp:

uv run webapps/packet_tts_apps/f5_tts_app.py

The terminal will display the URL to access the webapp, e.g.:

Running on local URL:  http://127.0.0.1:7860

NOTE: The URL may vary; check the terminal output for the correct address.

📙 Documentation

Documentation is available on the sinapsis website.

Tutorials for different projects within sinapsis are available on the sinapsis tutorials page.

🔍 License

This project is licensed under the AGPLv3 license, which encourages open collaboration and sharing. For more details, please refer to the LICENSE file.

For commercial use, please refer to our official Sinapsis website for information on obtaining a commercial license.

Release history

0.1.12: Apr 22, 2026
0.1.11: Dec 9, 2025 (this version)
0.1.10: Aug 29, 2025
0.1.9: Aug 25, 2025
0.1.8: Aug 5, 2025
0.1.7: Aug 4, 2025
0.1.6: Jul 24, 2025
0.1.5: May 2, 2025
0.1.4: Apr 30, 2025
0.1.3: Apr 30, 2025
0.1.2: Apr 29, 2025
0.1.1: Apr 24, 2025
0.1.0: Apr 14, 2025

Download files

Source Distribution

sinapsis_f5_tts-0.1.11.tar.gz (24.9 kB), uploaded Dec 9, 2025, Source

Built Distribution

sinapsis_f5_tts-0.1.11-py3-none-any.whl (22.7 kB), uploaded Dec 9, 2025, Python 3

File details

Details for the file sinapsis_f5_tts-0.1.11.tar.gz.

File metadata

Download URL: sinapsis_f5_tts-0.1.11.tar.gz
Upload date: Dec 9, 2025
Size: 24.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.5.16

File hashes

Hashes for sinapsis_f5_tts-0.1.11.tar.gz
Algorithm	Hash digest
SHA256	`cc47f7b55cf180e8ff1883d0e32876e015f58c79cc4d4fe1adf4de664aeffe7a`
MD5	`e5a2abfce82ab5193b3ed3ef517398bc`
BLAKE2b-256	`71c2b7317d0a55adfebe27f90139b164fedd0135d0aa073612e1fed648152f03`

See more details on using hashes here.
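
A downloaded archive can be compared against the SHA256 digest listed above using only the Python standard library. This is a minimal sketch: `sha256_of` and `matches_published_hash` are hypothetical helper names, and the file path in the usage note is an assumption about where the download was saved.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives do not need to fit in memory."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex: str) -> bool:
    """Compare the file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches_published_hash("sinapsis_f5_tts-0.1.11.tar.gz", "cc47f7b55cf180e8ff1883d0e32876e015f58c79cc4d4fe1adf4de664aeffe7a")` should return True for an intact copy of the source distribution saved in the current directory.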

File details

Details for the file sinapsis_f5_tts-0.1.11-py3-none-any.whl.

File metadata

Download URL: sinapsis_f5_tts-0.1.11-py3-none-any.whl
Upload date: Dec 9, 2025
Size: 22.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.5.16

File hashes

Hashes for sinapsis_f5_tts-0.1.11-py3-none-any.whl
Algorithm	Hash digest
SHA256	`f47952f3bec8648d7ed8106188bb361cce43c9d627251b6423ef3c9ee241d799`
MD5	`b519515a42d8f0c850d0f329d4d8ba9a`
BLAKE2b-256	`01e9f7b827b674757dd2e65f01c6cc47bd328f171ed0b865e13d82235a23ec87`

