Text to speech using the Zonos model

Sinapsis Zonos

Templates for advanced speech synthesis using Zonos

🐍 Installation • 🚀 Features • 📚 Usage example • 🌐 Webapp • 📙 Documentation • 🔍 License

This Sinapsis Zonos package provides a single template for integrating, configuring, and running text-to-speech (TTS) and voice cloning functionalities powered by Zonos. It supports multilingual speech, emotional modulation, and real-time audio generation.

🐍 Installation

[!IMPORTANT] Sinapsis projects require Python 3.10 or higher.

Install using your preferred package manager. We strongly recommend using uv. To install uv, refer to the official documentation.

Install with uv:

  uv pip install sinapsis-zonos --extra-index-url https://pypi.sinapsis.tech

Or with raw pip:

  pip install sinapsis-zonos --extra-index-url https://pypi.sinapsis.tech

[!IMPORTANT] Templates in each package may require additional dependencies. For development, we recommend installing the package with all the optional dependencies:

With uv:

  uv pip install sinapsis-zonos[all] --extra-index-url https://pypi.sinapsis.tech

Or with raw pip:

  pip install sinapsis-zonos[all] --extra-index-url https://pypi.sinapsis.tech

[!NOTE] Zonos depends on the eSpeak library for phonemization. How you install it depends on your OS. For Linux:

apt install -y espeak-ng
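
The two prerequisites above (Python 3.10 or higher, and eSpeak) can be checked before installing the package. The following is a minimal sketch using only the standard library. `preflight_problems` is a hypothetical helper, not part of Sinapsis, and it assumes that the `espeak-ng` package exposes an executable of the same name on `PATH`.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)  # minimum version stated in the installation note


def preflight_problems() -> list:
    """Return a list of human-readable problems; an empty list means the checks passed."""
    problems = []
    if sys.version_info < MIN_PYTHON:
        found = ".".join(str(part) for part in sys.version_info[:3])
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ is required, found {found}"
        )
    # Assumption: the apt package installs an `espeak-ng` executable on PATH.
    if shutil.which("espeak-ng") is None:
        problems.append("espeak-ng executable not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in preflight_problems():
        print(f"WARNING: {problem}")
```

An empty result only means that both checks passed; it does not confirm that Zonos can load the library at runtime.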

🚀 Features

Templates Supported

ZonosTTS: Template for converting text to speech or performing voice cloning based on the presence of an audio sample.
Attributes
- cfg_scale(Optional): Controls randomness and creativity in speech generation (default: 2.0, range: 1.0–5.0). Higher values introduce more variation in speech output.
- denoised_speaker(Optional): If True, applies denoising to the speaker embedding to reduce background noise (default: False).
- dnsmos(Optional): Denoising strength for hybrid models (default: 4.0, range: 1.0–5.0).
- emotions(Optional): Emotion configuration to fine-tune the emotional tone of the generated speech (default: {}). Accepts an Emotions object with weights for various emotions.
- fmax(Optional): Maximum frequency cutoff in Hz for audio generation (default: 22050, range: 0–24000).
- language(Optional): Language code used for synthesis (default: en-us).
- model(Optional): The Zonos model identifier to use (default: Zyphra/Zonos-v0.1-transformer). Options: Zyphra/Zonos-v0.1-transformer and Zyphra/Zonos-v0.1-hybrid.
- output_folder(Optional): The folder where generated audio files will be saved (default: SINAPSIS_CACHE_DIR/zonos/audios).
- pitch_std(Optional): Standard deviation for pitch variation, which influences pitch naturalness (default: 20.0, range: 0–300).
- prefix_audio(Optional): Path to an audio file used for prefix conditioning (e.g., whispering or prosody control) (default: None).
- randomized_seed(Optional): If True, a random seed is used for each generation (default: True).
- sampling_params(Optional): Controls sampling behavior for speech synthesis. Accepts a SamplingParams object with fields like top_p, top_k, min_p, linear, conf, and quad.
- seed(Optional): Random seed used for deterministic generation. If randomized_seed is False, this value ensures repeatable output (default: 420).
- speaker_audio(Optional): Path to a reference audio file used to extract speaker characteristics for voice cloning (default: None).
- speaking_rate(Optional): Speaking rate in syllables per second (default: 15.0, range: 5–30).
- unconditional_keys(Optional): A set of keys (e.g., {vqscore_8, dnsmos_ovrl}) that disable speaker conditioning when generating speech.
- vq_score(Optional): VQ score threshold used by hybrid models to determine decoding style (default: 0.7, range: 0.5–0.8).
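
The numeric ranges listed above can be checked before a config is handed to the template. This is a minimal sketch, not the package's own validation: `out_of_range` and `DOCUMENTED_RANGES` are hypothetical names, the bounds are copied from the attribute list, and whether ZonosTTS itself rejects out-of-range values is not stated here.

```python
# Ranges copied from the attribute list above: name -> (minimum, maximum).
DOCUMENTED_RANGES = {
    "cfg_scale": (1.0, 5.0),
    "dnsmos": (1.0, 5.0),
    "fmax": (0.0, 24000.0),
    "pitch_std": (0.0, 300.0),
    "speaking_rate": (5.0, 30.0),
    "vq_score": (0.5, 0.8),
}


def out_of_range(attributes: dict) -> dict:
    """Return {name: value} for numeric attributes outside their documented range."""
    bad = {}
    for name, (low, high) in DOCUMENTED_RANGES.items():
        if name in attributes and not low <= attributes[name] <= high:
            bad[name] = attributes[name]
    return bad


attributes = {"cfg_scale": 2.0, "fmax": 24000, "pitch_std": 45.0, "speaking_rate": 40.0}
print(out_of_range(attributes))  # {'speaking_rate': 40.0}
```

Attributes that are absent from the input are skipped, since every attribute is optional and falls back to its default.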

[!TIP] Use CLI command sinapsis info --example-template-config TEMPLATE_NAME to produce an example Agent config for the Template specified in TEMPLATE_NAME.

For example, for ZonosTTS use sinapsis info --example-template-config ZonosTTS to produce an example config like:

Config

agent:
  name: my_test_agent
templates:
- template_name: InputTemplate
  class_name: InputTemplate
  attributes: {}
- template_name: ZonosTTS
  class_name: ZonosTTS
  template_input: InputTemplate
  attributes:
    cfg_scale: 2.0
    denoised_speaker: false
    dnsmos: 4.0
    emotions:
      happiness: 0
      sadness: 0
      disgust: 0
      fear: 0
      surprise: 0
      anger: 0
      other: 0
      neutral: 0
    fmax: 22050.0
    language: en-us
    model: Zyphra/Zonos-v0.1-transformer
    output_folder: ~/.cache/sinapsis/zonos/audios
    pitch_std: 20.0
    prefix_audio: null
    randomized_seed: true
    sampling_params:
      min_p: 0.0
      top_k: 0
      top_p: 0.0
      linear: 0.0
      conf: 0.0
      quad: 0.0
    seed: 420
    speaker_audio: null
    speaking_rate: 15.0
    unconditional_keys: !!set
      dnsmos_ovrl: null
      vqscore_8: null
    vq_score: 0.7
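
The attribute descriptions say that `seed` only gives repeatable output when `randomized_seed` is false. That rule can be sketched as follows. `resolve_seed` is a hypothetical helper written for illustration; how ZonosTTS actually draws a random seed, and the range it draws from, are assumptions.

```python
import random


def resolve_seed(randomized_seed: bool = True, seed: int = 420) -> int:
    """Pick the seed for one generation (illustrative only, not the package's code)."""
    if randomized_seed:
        # Assumption: any fresh 32-bit integer is acceptable as a seed.
        return random.SystemRandom().randrange(2**32)
    # With randomization off, the configured seed is used as-is.
    return seed


print(resolve_seed(randomized_seed=False, seed=420))  # 420
```

With the defaults shown in the config above (`randomized_seed: true`), the configured `seed: 420` would be ignored under this reading.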

📚 Usage example

This example shows how to use the ZonosTTS template to convert text into speech. The generated audio is based on the input text and is saved locally as a file.

Config

agent:
  name: text_to_speech
  description: text to speech agent using Zonos

templates:

- template_name: InputTemplate
  class_name: InputTemplate
  attributes: {}

- template_name: TextInput
  class_name: TextInput
  template_input: InputTemplate
  attributes:
    text: This is a test of Sinapsis Zonos text-to-speech template.

- template_name: ZonosTTS
  class_name: ZonosTTS
  template_input: TextInput
  attributes:
    model: Zyphra/Zonos-v0.1-transformer
    language: en-us
    emotions:
      happiness: 0.3077
      sadness: 0.0256
      disgust: 0.0256
      fear: 0.0256
      surprise: 0.0256
      anger: 0.0256
      other: 0.2564
      neutral: 0.3077
    fmax: 24000
    pitch_std: 45.0
    speaking_rate: 15.0
    cfg_scale: 2.0
    sampling_params:
      linear: 0.5
      conf: 0.4
      quad: 0
    randomized_seed: True
    denoised_speaker: False
    unconditional_keys:
      - dnsmos_ovrl
      - vqscore_8

This configuration defines an agent and a sequence of templates for speech synthesis with Zonos.

[!IMPORTANT] The TextInput template belongs to the sinapsis-data-readers package. To run this example, make sure that package is installed.

To run the config, use the CLI:

sinapsis run name_of_config.yml
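
The emotion weights in the example config sum to roughly 1. One way to produce weights of that shape from arbitrary non-negative scores is to normalize them, as in the sketch below. `normalize_emotions` is a hypothetical helper, the raw scores are invented for illustration, and whether ZonosTTS requires normalized weights is not stated in this document.

```python
EMOTIONS = (
    "happiness", "sadness", "disgust", "fear",
    "surprise", "anger", "other", "neutral",
)


def normalize_emotions(raw: dict) -> dict:
    """Scale non-negative emotion scores so they sum to 1, rounded to 4 decimals."""
    weights = {name: float(raw.get(name, 0.0)) for name in EMOTIONS}
    if any(value < 0 for value in weights.values()):
        raise ValueError("emotion weights must be non-negative")
    total = sum(weights.values())
    if total == 0:
        raise ValueError("at least one emotion weight must be positive")
    return {name: round(value / total, 4) for name, value in weights.items()}


# Hypothetical raw scores, chosen so the result matches the example config.
raw = {"happiness": 12, "sadness": 1, "disgust": 1, "fear": 1,
       "surprise": 1, "anger": 1, "other": 10, "neutral": 12}
print(normalize_emotions(raw))
```

Because each weight is rounded to four decimals, the normalized values may sum to slightly less or more than 1 (0.9998 for the scores above).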

🌐 Webapp

The webapps included in this project showcase the modularity of the templates, in this case for speech generation tasks.

[!IMPORTANT] To run the app you first need to clone this repository:

git clone git@github.com:Sinapsis-ai/sinapsis-speech.git
cd sinapsis-speech

[!NOTE] If you'd like to enable external app sharing in Gradio, export GRADIO_SHARE_APP=True

🐳 Build with Docker

[!IMPORTANT] This Docker image depends on the sinapsis-nvidia:base image. For detailed instructions, please refer to the Sinapsis README.

Build the Docker image:

docker compose -f docker/compose.yaml build

Start the app container:

docker compose -f docker/compose_apps.yaml up -d sinapsis-zonos

Check the logs:

docker logs -f sinapsis-zonos

The logs will display the URL to access the webapp, e.g.:

Running on local URL:  http://127.0.0.1:7860

[!NOTE] The URL may be different; check the log output for the correct address.
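
Since the address can change, a script that needs it can read it from the log text instead of hard-coding it. The sketch below assumes the log line has exactly the format shown above; `find_local_url` is a hypothetical helper, and other Gradio versions may word the line differently.

```python
import re

# Matches the log line format shown above; treat the exact wording as an assumption.
LOCAL_URL = re.compile(r"Running on local URL:\s+(https?://\S+)")


def find_local_url(log_text: str):
    """Return the first local URL found in the log text, or None if there is none."""
    match = LOCAL_URL.search(log_text)
    return match.group(1) if match else None


print(find_local_url("Running on local URL:  http://127.0.0.1:7860"))  # http://127.0.0.1:7860
```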

To stop the app:

docker compose -f docker/compose_apps.yaml down

💻 UV

To run the webapp using the uv package manager, follow these steps:

Sync the virtual environment:

uv sync --frozen

Install the wheel:

uv pip install sinapsis-speech[all] --extra-index-url https://pypi.sinapsis.tech

Run the webapp:

uv run webapps/generic_tts_apps/zonos_tts_app.py

The terminal will display the URL to access the webapp (e.g.):

Running on local URL:  http://127.0.0.1:7860

[!NOTE] The URL may vary; check the terminal output for the correct address.

📙 Documentation

Documentation is available on the Sinapsis website.

Tutorials for different projects within Sinapsis are available on the Sinapsis tutorials page.

🔍 License

This project is licensed under the AGPLv3 license, which encourages open collaboration and sharing. For more details, please refer to the LICENSE file.

For commercial use, please refer to our official Sinapsis website for information on obtaining a commercial license.

Release history

0.1.11

Dec 9, 2025

0.1.10

Aug 29, 2025

0.1.9

Aug 21, 2025

0.1.8

Aug 5, 2025

0.1.7

Jul 24, 2025

0.1.6

May 2, 2025

0.1.5

Apr 30, 2025

This version

0.1.4

Apr 30, 2025

0.1.3

Apr 29, 2025

0.1.2

Apr 25, 2025

0.1.1

Apr 24, 2025

0.1.0

Apr 14, 2025

Download files

Source Distribution

sinapsis_zonos-0.1.4.tar.gz (25.8 kB)

Uploaded Apr 30, 2025 Source

Built Distribution

sinapsis_zonos-0.1.4-py3-none-any.whl (23.7 kB)

Uploaded Apr 30, 2025 Python 3

File details

Details for the file sinapsis_zonos-0.1.4.tar.gz.

File metadata

Download URL: sinapsis_zonos-0.1.4.tar.gz
Upload date: Apr 30, 2025
Size: 25.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.5.16

File hashes

Hashes for sinapsis_zonos-0.1.4.tar.gz
Algorithm	Hash digest
SHA256	`71a711a4e75ffb4b67480e556257131222063205c3d93a01c958eecb727eaa72`
MD5	`cd9d51f94a17d045dbc4639fe994b5de`
BLAKE2b-256	`de099049a4ef4a9c67d7b70a8f85f3269a9330bdf0688082dcac387ce2055393`
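
A downloaded file can be compared against the SHA256 digest listed above with the standard library. This is a minimal sketch: `sha256_of` and `matches_published` are hypothetical helpers, and the script assumes the source distribution was saved in the current directory under its published file name.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published one, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Assumption: the sdist was downloaded into the current directory.
    sdist = Path("sinapsis_zonos-0.1.4.tar.gz")
    published = "71a711a4e75ffb4b67480e556257131222063205c3d93a01c958eecb727eaa72"
    if sdist.exists():
        print("OK" if matches_published(sdist, published) else "MISMATCH")
```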

File details

Details for the file sinapsis_zonos-0.1.4-py3-none-any.whl.

File metadata

Download URL: sinapsis_zonos-0.1.4-py3-none-any.whl
Upload date: Apr 30, 2025
Size: 23.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.5.16

File hashes

Hashes for sinapsis_zonos-0.1.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`d5f55ae3d574ec58b301c41810115c288f7e67f0efccc6c9accac785a69b7feb`
MD5	`0e4009b86837db44ca9aad0e8432a0fc`
BLAKE2b-256	`963e9107f633eb68d2ec6b9f4ea7cd6579da7779cd116e2a207a5c1c164bef89`
