A powerful, Transformer-based text-to-speech (TTS) tool.

Project description

str2speech

Overview

str2speech is a simple command-line tool and Python module for converting text to speech with Transformer-based text-to-speech (TTS) models. It supports multiple models and voice presets and writes the generated speech to a .wav file.

Latest

Added support for Zyphra Zonos. Just try this out:

from str2speech.speaker import Speaker

speaker = Speaker("Zyphra/Zonos-v0.1-transformer")
speaker.text_to_speech("Hello, this is a test!", "output.wav")

This expects you to have Zonos installed. If you don't have it yet, run the following in a notebook environment such as Jupyter or Google Colab (in a regular shell, drop the leading !):

!apt install espeak-ng
!git clone https://github.com/Zyphra/Zonos.git
!cd Zonos && pip install -e .
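
Before constructing a Zonos-backed Speaker, it can help to confirm that both prerequisites are present. The sketch below is not part of str2speech. It assumes that the Zonos repository installs an importable module named zonos and that espeak-ng provides an espeak-ng executable on PATH; the helper name missing_zonos_prerequisites is hypothetical.

```python
import importlib.util
import shutil


def missing_zonos_prerequisites(module_name="zonos", binary_name="espeak-ng"):
    """Return a list of prerequisites that could not be found.

    Both default names are assumptions about how Zonos and espeak-ng
    are installed; adjust them if your setup differs.
    """
    missing = []
    # find_spec returns None when a top-level module cannot be located.
    if importlib.util.find_spec(module_name) is None:
        missing.append(f"python module: {module_name}")
    # shutil.which returns None when the executable is not on PATH.
    if shutil.which(binary_name) is None:
        missing.append(f"executable: {binary_name}")
    return missing


if __name__ == "__main__":
    problems = missing_zonos_prerequisites()
    if problems:
        print("Missing:", ", ".join(problems))
    else:
        print("Zonos prerequisites look available.")
```

An empty list only means the module and executable were found, not that Zonos itself runs correctly.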

Features

  • Supports multiple TTS models, including suno/bark-small, suno/bark, various facebook/mms-tts models, and Zyphra/Zonos-v0.1-transformer.
  • Allows selection of voice presets.
  • Supports text input via command-line arguments or files.
  • Outputs speech in .wav format.
  • Works with both CPU and GPU.

Installation

To install str2speech, first make sure you have pip installed, then run:

pip install str2speech

Usage

Command Line

Once installed, run the str2speech command from the command line:

str2speech --text "Hello, world!" --output hello.wav

Options

  • --text (-t): The text to convert to speech.
  • --file (-f): A file containing text to convert to speech.
  • --voice (-v): The voice preset to use (optional, defaults to a predefined voice).
  • --output (-o): The output .wav file name (optional, defaults to output.wav).
  • --model (-m): The TTS model to use (optional, defaults to suno/bark-small).

Example:

str2speech --file input.txt --output speech.wav --model suno/bark
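
To make the option table concrete, here is a minimal sketch of how these flags and defaults could be expressed with argparse. This is not the actual str2speech source: build_parser is a hypothetical name, and details such as whether --text and --file are mutually exclusive are not stated above, so the sketch leaves both optional.

```python
import argparse


def build_parser():
    """Hypothetical parser that mirrors the documented options and defaults."""
    parser = argparse.ArgumentParser(prog="str2speech")
    parser.add_argument("--text", "-t", help="Text to convert to speech.")
    parser.add_argument("--file", "-f", help="File containing the text to convert.")
    # The documented default voice is "a predefined voice"; None stands in for it here.
    parser.add_argument("--voice", "-v", default=None, help="Voice preset (optional).")
    parser.add_argument("--output", "-o", default="output.wav", help="Output .wav file name.")
    parser.add_argument("--model", "-m", default="suno/bark-small", help="TTS model to use.")
    return parser


# Parse the same arguments as the example above, using the short flags.
args = build_parser().parse_args(["-f", "input.txt", "-o", "speech.wav", "-m", "suno/bark"])
print(args.file, args.output, args.model)
```

The short and long forms are interchangeable, so -f input.txt and --file input.txt parse to the same value.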

API Usage

You can also use str2speech as a Python module:

from str2speech.speaker import Speaker

speaker = Speaker()
speaker.text_to_speech("Hello, this is a test.", "test.wav")
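
The --file behaviour of the command line can be approximated from Python with a small wrapper. The sketch below relies only on the text_to_speech(text, output_path) call shown above; the helper name speak_file is hypothetical, and the wrapper accepts any object with that method so it can be tried without downloading a model.

```python
from pathlib import Path


def speak_file(speaker, text_path, wav_path):
    """Read text from text_path and pass it to speaker.text_to_speech.

    `speaker` is expected to expose text_to_speech(text, output_path),
    as the Speaker object in the example above does.
    """
    text = Path(text_path).read_text(encoding="utf-8").strip()
    if not text:
        raise ValueError(f"{text_path} contains no text to convert")
    speaker.text_to_speech(text, wav_path)
    return wav_path


# Usage with str2speech (requires the package and a model download):
# from str2speech.speaker import Speaker
# speak_file(Speaker(), "input.txt", "speech.wav")
```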

Available Models

The following models are supported:

  • suno/bark-small (default)
  • suno/bark
  • facebook/mms-tts-eng
  • facebook/mms-tts-deu
  • facebook/mms-tts-fra
  • facebook/mms-tts-spa
  • Zyphra/Zonos-v0.1-transformer
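
The facebook/mms-tts model names end in a three-letter language code, so choosing one by language can be sketched as a lookup. The helper below is illustrative only: mms_model_for and the language-name keys are hypothetical, and it covers just the four MMS models listed above.

```python
# Language codes taken from the model names in the list above.
MMS_LANGUAGE_CODES = {
    "english": "eng",
    "german": "deu",
    "french": "fra",
    "spanish": "spa",
}


def mms_model_for(language):
    """Return the listed facebook/mms-tts model name for a language name."""
    code = MMS_LANGUAGE_CODES.get(language.strip().lower())
    if code is None:
        known = ", ".join(sorted(MMS_LANGUAGE_CODES))
        raise ValueError(f"No listed MMS model for {language!r}; known: {known}")
    return f"facebook/mms-tts-{code}"


print(mms_model_for("German"))
```

The returned string can be passed wherever a model name is accepted, for example as the --model value.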

Tested With These Dependencies

  • transformers==4.49.0
  • torch==2.5.1+cu124
  • numpy==2.2.3
  • scipy==1.13.1
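
To see how a local environment compares with these tested versions, a small check using the standard library's importlib.metadata can help. This is a convenience sketch, not something str2speech ships; compare_with_tested is a hypothetical name, and a mismatch does not necessarily mean the package will fail.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the list above.
TESTED_VERSIONS = {
    "transformers": "4.49.0",
    "torch": "2.5.1+cu124",
    "numpy": "2.2.3",
    "scipy": "1.13.1",
}


def compare_with_tested(tested=TESTED_VERSIONS):
    """Map each package to (installed_version_or_None, tested_version, matches)."""
    report = {}
    for name, expected in tested.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (installed, expected, installed == expected)
    return report


if __name__ == "__main__":
    for name, (installed, expected, ok) in compare_with_tested().items():
        status = "ok" if ok else "differs"
        print(f"{name}: installed={installed} tested={expected} ({status})")
```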

License

This project is licensed under the GNU General Public License v3 (GPLv3).

Download files

Source Distribution

str2speech-0.1.5.tar.gz (16.8 kB view details)

Uploaded Source

Built Distribution

str2speech-0.1.5-py3-none-any.whl (17.3 kB view details)

Uploaded Python 3

File details

Details for the file str2speech-0.1.5.tar.gz.

File metadata

  • Download URL: str2speech-0.1.5.tar.gz
  • Upload date:
  • Size: 16.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.8

File hashes

Hashes for str2speech-0.1.5.tar.gz
Algorithm Hash digest
SHA256 88af8ee668f791c9f6a579f620e5eb1d159303c2c1794a6f86de36a9837cdcbe
MD5 a835ab3ffda00ef1e3fd9db1fbba524b
BLAKE2b-256 0a79d55126e761bfb69c304656a1dbe69c7cdced46dab76765fade9b6bd59bc4
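
A downloaded archive can be checked against the SHA256 digest listed above with Python's hashlib. The sketch below assumes the file was saved locally under its published name; the function names sha256_of and matches_expected are hypothetical.

```python
import hashlib

# SHA256 digest listed above for str2speech-0.1.5.tar.gz.
EXPECTED_SHA256 = "88af8ee668f791c9f6a579f620e5eb1d159303c2c1794a6f86de36a9837cdcbe"


def sha256_of(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True when the file's SHA256 digest equals the expected one."""
    return sha256_of(path) == expected.lower()


# Usage, assuming the archive is in the current directory:
# print(matches_expected("str2speech-0.1.5.tar.gz"))
```

pip can perform the same check at install time when a requirements file pins hashes with --hash=sha256:... entries.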

File details

Details for the file str2speech-0.1.5-py3-none-any.whl.

File metadata

  • Download URL: str2speech-0.1.5-py3-none-any.whl
  • Upload date:
  • Size: 17.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.8

File hashes

Hashes for str2speech-0.1.5-py3-none-any.whl
Algorithm Hash digest
SHA256 38cbb8d76bec368433b9cb0c3c8852e8d7d12177cfe330f752db0c65ed288f74
MD5 bdadabe00f475de755cdeb1215c8e18d
BLAKE2b-256 19d5d8782e554585d72b74c463147441f005ecdf194febeaaf02b3e59ae991c7
