Text to speech using Orpheus TTS model

Sinapsis Orpheus-CPP

Templates for advanced text-to-speech synthesis with Orpheus-TTS

🐍 Installation • 🚀 Features • 📚 Usage example • 🌐 Webapp • 📙 Documentation • 🔍 License

This Sinapsis Orpheus-CPP package provides a template for seamlessly integrating, configuring, and running text-to-speech (TTS) functionalities powered by Orpheus-TTS.

🐍 Installation

Install using your favourite package manager. We strongly encourage the use of uv, although any other package manager should work too. If you need to install uv, please see the official documentation.

Example with uv:

  uv pip install sinapsis-orpheus-cpp --extra-index-url https://pypi.sinapsis.tech

or with raw pip:

  pip install sinapsis-orpheus-cpp --extra-index-url https://pypi.sinapsis.tech

[!IMPORTANT] Templates in each package may require extra dependencies. For development, we recommend installing the package with all the optional dependencies:

with uv:

  uv pip install sinapsis-orpheus-cpp[all] --extra-index-url https://pypi.sinapsis.tech

or with raw pip:

  pip install sinapsis-orpheus-cpp[all] --extra-index-url https://pypi.sinapsis.tech
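After installing, it can be useful to confirm that the distribution is visible in the active environment. The sketch below uses only the standard library; the helper name installed_version is hypothetical and not part of the package.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("sinapsis-orpheus-cpp")
    print(found if found else "sinapsis-orpheus-cpp is not installed in this environment")
```

Run it with the same interpreter that uv or pip installed into; a different virtual environment will report the package as missing.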

🚀 Features

Templates Supported

This module includes a template for text-to-speech synthesis using the Orpheus TTS model:

OrpheusTTS: Advanced text-to-speech synthesis template powered by Orpheus TTS, delivering human-like speech with natural intonation, emotion, and rhythm that surpasses state-of-the-art closed-source models. The template supports expressive speech synthesis through emotive tags including <laugh>, <chuckle>, <sigh>, <cough>, <sniffle>, <groan>, <yawn>, and <gasp> for enhanced vocal expressions. Additionally, it provides multi-language support when configured with the appropriate Hugging Face model path, making it versatile for global applications.
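Because a mistyped emotive tag may simply be read aloud or ignored, it can help to check input text before synthesis. The following is a minimal sketch that assumes tags are matched exactly as listed above; the helper unknown_tags is hypothetical and not provided by the template.

```python
import re

# Emotive tags listed in the OrpheusTTS template description.
SUPPORTED_TAGS = {"laugh", "chuckle", "sigh", "cough", "sniffle", "groan", "yawn", "gasp"}

# Matches simple angle-bracket tags such as <laugh>.
TAG_PATTERN = re.compile(r"<([A-Za-z_]+)>")


def unknown_tags(text: str) -> list[str]:
    """Return tags found in `text` that are not in SUPPORTED_TAGS, in order of appearance."""
    return [tag for tag in TAG_PATTERN.findall(text) if tag not in SUPPORTED_TAGS]


if __name__ == "__main__":
    print(unknown_tags("That was unexpected <gasp> but also funny <giggle>"))
```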

Attributes
  • n_gpu_layers: Number of model layers to offload to GPU (-1 = use all layers, 0 = CPU only) (default: -1)
  • n_threads: Number of CPU threads to use for model inference (0 = auto-detect) (default: 0)
  • n_ctx: Context window size (maximum number of tokens, 0 = use model's maximum) (default: 8192)
  • model_id: Hugging Face model repository ID (required)
  • model_variant: Specific GGUF file to download from the repository (default: None)
  • cache_dir: Directory to store downloaded models and cache files (default: SINAPSIS_CACHE_DIR)
  • verbose: Enable verbose logging for model operations (default: False)
  • voice_id: Voice identifier for speech synthesis (required)
  • batch_size: Batch size for model inference (default: 1)
  • max_tokens: Maximum number of tokens to generate for speech (default: 2048)
  • temperature: Sampling temperature for token generation (default: 0.8)
  • top_p: Nucleus sampling probability threshold (default: 0.95)
  • top_k: Top-k sampling parameter (default: 40)
  • min_p: Minimum probability threshold for token selection (default: 0.05)
  • pre_buffer_size: Duration in seconds of audio to generate before yielding the first chunk (default: 1.5)
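To give an intuition for how temperature, top_k, top_p and min_p interact, here is a generic sketch of these filters on a toy token distribution. It is not the template's or the backend's implementation: the function name, the order in which filters are applied, and the choice not to renormalize between steps are assumptions made for illustration, and min_p is interpreted as relative to the most probable token.

```python
import math


def filter_candidates(logits, temperature=0.8, top_k=40, top_p=0.95, min_p=0.05):
    """Return {token: probability} for tokens that survive the filters, renormalized.

    Assumes temperature > 0 and a non-empty `logits` mapping.
    """
    # Temperature scaling followed by a numerically stable softmax.
    scaled = {tok: logit / temperature for tok, logit in logits.items()}
    peak = max(scaled.values())
    weights = {tok: math.exp(val - peak) for tok, val in scaled.items()}
    total = sum(weights.values())
    ranked = sorted(
        ((tok, w / total) for tok, w in weights.items()),
        key=lambda item: item[1],
        reverse=True,
    )

    # top-k: keep only the k most probable tokens.
    ranked = ranked[:top_k]

    # top-p (nucleus): keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for tok, prob in ranked:
        kept.append((tok, prob))
        cumulative += prob
        if cumulative >= top_p:
            break

    # min-p: drop tokens less likely than min_p times the most probable token.
    threshold = min_p * kept[0][1]
    kept = [(tok, prob) for tok, prob in kept if prob >= threshold]

    norm = sum(prob for _, prob in kept)
    return {tok: prob / norm for tok, prob in kept}


if __name__ == "__main__":
    toy_logits = {"a": 5.0, "b": 4.0, "c": 0.0, "d": -5.0}
    print(filter_candidates(toy_logits))
```

Lower temperature or smaller top_k and top_p narrow the candidate set, which tends to make generation more deterministic; higher values allow more variation.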

[!TIP] Use the CLI command sinapsis info --example-template-config TEMPLATE_NAME to produce an example Agent config for the Template specified in TEMPLATE_NAME.

For example, for OrpheusTTS, run sinapsis info --example-template-config OrpheusTTS to produce an example config like:

agent:
  name: my_test_agent
templates:
- template_name: InputTemplate
  class_name: InputTemplate
  attributes: {}
- template_name: OrpheusTTS
  class_name: OrpheusTTS
  template_input: InputTemplate
  attributes:
    n_gpu_layers: -1
    n_threads: 0
    n_ctx: 8192
    model_id: '`replace_me:<class ''str''>`'
    model_variant: null
    cache_dir: ~/sinapsis_cache
    verbose: false
    voice_id: '`replace_me:<class ''str''>`'
    batch_size: 1
    max_tokens: 2048
    temperature: 0.8
    top_p: 0.95
    top_k: 40
    min_p: 0.05
    pre_buffer_size: 1.5
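The generated config leaves required attributes as replace_me placeholders. As a rough sketch, a small helper can list the ones that still need values before the config is run. The helper find_placeholders is hypothetical, and the config is written here as a Python dict (abbreviated) so that no YAML library is needed.

```python
def find_placeholders(node, path=""):
    """Yield dotted paths of string values that still contain the `replace_me` marker."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from find_placeholders(value, f"{path}.{key}" if path else str(key))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_placeholders(value, f"{path}[{index}]")
    elif isinstance(node, str) and "replace_me" in node:
        yield path


# Abbreviated OrpheusTTS entry from the generated config, as a Python dict.
orpheus_entry = {
    "template_name": "OrpheusTTS",
    "class_name": "OrpheusTTS",
    "template_input": "InputTemplate",
    "attributes": {
        "n_gpu_layers": -1,
        "model_id": "`replace_me:<class 'str'>`",
        "model_variant": None,
        "voice_id": "`replace_me:<class 'str'>`",
        "temperature": 0.8,
    },
}

if __name__ == "__main__":
    for dotted_path in find_placeholders(orpheus_entry):
        print("needs a value:", dotted_path)
```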

📚 Usage example

This example illustrates how to use the OrpheusTTS template for text-to-speech synthesis. It converts text input into speech using Orpheus-TTS and saves the resulting audio file locally.

Config
agent:
  name: orpheus_tts_agent
  description: "Agent that generates speech from text using the Orpheus TTS model."

templates:
- template_name: InputTemplate
  class_name: InputTemplate
  attributes: {}

- template_name: TextInput
  class_name: TextInput
  template_input: InputTemplate
  attributes:
    source: "user_input"
    text: "Hi, I'm Tara. Welcome to Orpheus text-to-speech system! I can speak in a very natural way."

- template_name: OrpheusTTS
  class_name: OrpheusTTS
  template_input: TextInput
  attributes:
    n_gpu_layers: -1
    n_ctx: 4096
    model_id: "isaiahbjork/orpheus-3b-0.1-ft-Q4_K_M-GGUF"
    voice_id: "tara"
    temperature: 0.8
    top_p: 0.95
    top_k: 40
    min_p: 0.05
    pre_buffer_size: 1.5
    max_tokens: 2048

- template_name: SaveGeneratedAudio
  class_name: AudioWriterSoundfile
  template_input: OrpheusTTS
  attributes:
    save_dir: "orpheus_tts"
    root_dir: "artifacts"
    extension: "wav"

This configuration defines an agent and a sequence of templates for converting text to speech using Orpheus-TTS.
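Each template names its upstream template through template_input, so a typo in one of those names breaks the chain. The sketch below checks that every template_input refers to a declared template_name. The helper undefined_inputs is hypothetical, and the pipeline is reduced to the linking fields of the config above.

```python
def undefined_inputs(templates):
    """Return (template_name, template_input) pairs whose input is not a declared template."""
    names = {entry["template_name"] for entry in templates}
    return [
        (entry["template_name"], entry["template_input"])
        for entry in templates
        if "template_input" in entry and entry["template_input"] not in names
    ]


# Linking fields of the example config, as Python dicts.
pipeline = [
    {"template_name": "InputTemplate"},
    {"template_name": "TextInput", "template_input": "InputTemplate"},
    {"template_name": "OrpheusTTS", "template_input": "TextInput"},
    {"template_name": "SaveGeneratedAudio", "template_input": "OrpheusTTS"},
]

if __name__ == "__main__":
    problems = undefined_inputs(pipeline)
    print("chain is consistent" if not problems else problems)
```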

[!IMPORTANT] The TextInput and AudioWriterSoundfile templates correspond to sinapsis-data-writers. To use this example, make sure the required packages are installed.

To run the config, use the CLI:

sinapsis run name_of_config.yml

🌐 Webapp

The webapp included in this project showcases the modularity of the Orpheus TTS template for speech generation tasks.

[!IMPORTANT] To run the app you first need to clone this repository:

git clone git@github.com:Sinapsis-ai/sinapsis-speech.git
cd sinapsis-speech

[!NOTE] If you'd like to enable external app sharing in Gradio, export GRADIO_SHARE_APP=True
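For readers curious how a flag such as GRADIO_SHARE_APP might be consumed, the following sketch reads a boolean from the environment. How the webapp actually parses the variable is not shown in this document; the helper env_flag and its set of accepted values are assumptions.

```python
import os

# Values treated as "enabled"; this set is an assumption for the sketch.
TRUTHY = {"1", "true", "yes", "on"}


def env_flag(name: str, default: bool = False) -> bool:
    """Interpret an environment variable as a boolean, falling back to `default` if unset."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in TRUTHY


if __name__ == "__main__":
    print("share enabled:", env_flag("GRADIO_SHARE_APP"))
```

A value obtained this way could then be passed to the share argument of Gradio's launch method.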

🐳 Docker

[!IMPORTANT] This Docker image depends on the sinapsis-nvidia:base image. Please refer to the official sinapsis instructions to Build with Docker.

  1. Build the sinapsis-speech image:
docker compose -f docker/compose.yaml build
  2. Start the app container:
docker compose -f docker/compose_apps.yaml up -d sinapsis-orpheus-tts
  3. Check the logs:
docker logs -f sinapsis-orpheus-tts
  4. The logs will display the URL to access the webapp, e.g.:
Running on local URL:  http://127.0.0.1:7860

NOTE: The URL may be different; check the output of the logs.

  5. To stop the app:
docker compose -f docker/compose_apps.yaml down

💻 UV

To run the webapp using the uv package manager, follow these steps:

  1. Export the environment variables needed to build the Python bindings for llama-cpp with CUDA support:
export CMAKE_ARGS="-DGGML_CUDA=on"
export FORCE_CMAKE="1"
  2. Export CUDACXX:
export CUDACXX=$(command -v nvcc)
  3. Sync the virtual environment:
uv sync --frozen
  4. Install the wheel:
uv pip install sinapsis-speech[all] --extra-index-url https://pypi.sinapsis.tech
  5. Run the webapp:
uv run webapps/packet_tts_apps/orpheus_tts_app.py
  6. The terminal will display the URL to access the webapp, e.g.:
Running on local URL:  http://127.0.0.1:7860

NOTE: The URL may vary; check the terminal output for the correct address.
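If nvcc is not on PATH, the CUDACXX export above silently yields an empty value and the build may fall back to CPU or fail later. As a hedged sketch, the same three variables can be prepared from Python with an explicit check. The helper llama_cpp_build_env is hypothetical and not part of this project.

```python
import os
import shutil


def llama_cpp_build_env(base_env=None):
    """Return a copy of the environment with the CUDA build variables from the steps above set.

    Raises FileNotFoundError when nvcc cannot be located on PATH.
    """
    env = dict(os.environ if base_env is None else base_env)
    nvcc = shutil.which("nvcc", path=env.get("PATH"))
    if nvcc is None:
        raise FileNotFoundError("nvcc not found on PATH; install the CUDA toolkit or adjust PATH")
    env["CMAKE_ARGS"] = "-DGGML_CUDA=on"
    env["FORCE_CMAKE"] = "1"
    env["CUDACXX"] = nvcc
    return env


if __name__ == "__main__":
    try:
        build_env = llama_cpp_build_env()
        print("CUDACXX =", build_env["CUDACXX"])
    except FileNotFoundError as error:
        print(error)
```

The returned mapping could be passed as the env argument of subprocess.run when invoking uv sync --frozen from a script.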

📙 Documentation

Documentation is available on the sinapsis website.

Tutorials for different projects within sinapsis are available on the sinapsis tutorials page.

🔍 License

This project is licensed under the AGPLv3 license, which encourages open collaboration and sharing. For more details, please refer to the LICENSE file.

For commercial use, please refer to our official Sinapsis website for information on obtaining a commercial license.
