Minimum dependency inference library for OptiSpeech TTS models
ospeech
About OptiSpeech
OptiSpeech is meant to be an efficient, lightweight, and fast text-to-speech model for on-device use.
Install
This package can be installed using pip:
$ pip install ospeech
If you want to run the ospeech command from anywhere, try:
$ pipx install ospeech
Most models are trained on IPA-phonemized text. To use these models, install ospeech with the espeak feature, which pulls in piper-phonemize:
$ pip install ospeech[espeak]
If you want a gradio interface, install with the gradio feature:
$ pip install ospeech[gradio]
Usage
Obtaining models
$ ospeech-models --help
usage: ospeech-models [-h] {ls,dl} ...

List and download ospeech models from HuggingFace.

positional arguments:
  {ls,dl}
    ls        List available models
    dl        Download ospeech models from HuggingFace

options:
  -h, --help  show this help message and exit
To list available models:
$ ospeech-models ls
Lang | Speaker | ID
---------------------------------------------------------------------
en-us | lightspeech-hfc-female | en-us-lightspeech-hfc-female
en-us | convnext-tts-hfc-female | en-us-convnext-tts-hfc-female
---------------------------------------------------------------------
Using the model ID, use the following command to download a model:
$ ospeech-models dl en-us-lightspeech-hfc-female .
Downloading `en-us-lightspeech-hfc-female.onnx`
Downloading: 100%| | 38/38 [00:02<?, ?MB/s]
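The listing printed by `ospeech-models ls` can also be consumed from a script, for example to pick a model ID before calling `ospeech-models dl`. The following is a minimal sketch, assuming the output keeps the pipe-separated three-column layout shown above; `parse_model_listing` is a hypothetical helper and not part of ospeech.

```python
def parse_model_listing(listing: str) -> list[dict[str, str]]:
    """Parse text shaped like the `ospeech-models ls` table into dicts.

    Assumption: rows look like `lang | speaker | id`, as in the sample
    output in this README. The real output may differ between versions.
    """
    models = []
    for line in listing.splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        # Skip rule lines, blank lines, and anything that is not a 3-column row.
        if len(cells) != 3 or not all(cells):
            continue
        # Skip the header row.
        if cells == ["Lang", "Speaker", "ID"]:
            continue
        lang, speaker, model_id = cells
        models.append({"lang": lang, "speaker": speaker, "id": model_id})
    return models


# Sample text copied from the listing shown in this README.
SAMPLE_LISTING = """\
Lang | Speaker | ID
---------------------------------------------------------------------
en-us | lightspeech-hfc-female | en-us-lightspeech-hfc-female
en-us | convnext-tts-hfc-female | en-us-convnext-tts-hfc-female
---------------------------------------------------------------------
"""

models = parse_model_listing(SAMPLE_LISTING)
for model in models:
    print(model["lang"], model["id"])
```

In practice the text would come from capturing the command's standard output rather than from a hard-coded sample.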
Command line usage
$ ospeech --help
usage: ospeech [-h] [--d-factor D_FACTOR] [--p-factor P_FACTOR] [--e-factor E_FACTOR] [--no-split] [--cuda]
               onnx_path text output_dir

ONNX inference of OptiSpeech

positional arguments:
  onnx_path            Path to the exported OptiSpeech ONNX model
  text                 Text to speak
  output_dir           Directory to write generated audio to.

options:
  -h, --help           show this help message and exit
  --d-factor D_FACTOR  Scale to control speech rate.
  --p-factor P_FACTOR  Scale to control pitch.
  --e-factor E_FACTOR  Scale to control energy.
  --no-split           Don't split input text into sentences.
  --cuda               Use GPU for inference
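When driving the command from another program, it can help to assemble the argument list in one place. The sketch below maps the flags from the help text above onto keyword arguments. `build_ospeech_command` is a hypothetical helper, not something ospeech ships, and the value 1.2 is an arbitrary illustration, since the help text does not say how the scale is interpreted.

```python
import shlex


def build_ospeech_command(
    onnx_path,
    text,
    output_dir,
    d_factor=None,
    p_factor=None,
    e_factor=None,
    split=True,
    cuda=False,
):
    """Build an argv list for the `ospeech` CLI (hypothetical helper)."""
    command = ["ospeech"]
    # Only pass the scale options that were explicitly requested.
    for flag, value in (
        ("--d-factor", d_factor),
        ("--p-factor", p_factor),
        ("--e-factor", e_factor),
    ):
        if value is not None:
            command += [flag, str(value)]
    if not split:
        command.append("--no-split")
    if cuda:
        command.append("--cuda")
    # Positional arguments, in the order given by the usage line.
    command += [str(onnx_path), text, str(output_dir)]
    return command


command = build_ospeech_command(
    "en-us-lightspeech-hfc-female.onnx",
    "OptiSpeech is awesome!",
    "out",
    d_factor=1.2,
    split=False,
)
# Show the shell-quoted form of the command without running it.
print(shlex.join(command))

# To run it for real (requires ospeech to be installed):
#   import subprocess
#   subprocess.run(command, check=True)
```

Passing a list to `subprocess.run` avoids shell quoting problems with punctuation in the text to speak.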
If you want to run with the gradio interface:
$ ospeech-gradio --help
usage: ospeech-gradio [-h] [-s] [--host HOST] [--port PORT] [--char-limit CHAR_LIMIT] onnx_file_path

positional arguments:
  onnx_file_path        Path to model ONNX file

options:
  -h, --help            show this help message and exit
  -s, --share           Generate gradio share link
  --host HOST           Host to serve the app on.
  --port PORT           Port to serve the app on.
  --char-limit CHAR_LIMIT
                        Input text character limit.
Python API
import soundfile as sf
from ospeech import OptiSpeechONNXModel

model_path = "./optispeech-en-us-lightspeech.onnx"
sentence = "OptiSpeech is awesome!"

model = OptiSpeechONNXModel.from_onnx_file_path(model_path)
model_inputs = model.prepare_input(sentence)
outputs = model.synthesise(model_inputs)

for (idx, wav) in enumerate(outputs):
    # Wav is a float array
    sf.write(f"output-{idx}.wav", wav, model.sample_rate)
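The `sf.write` call depends on soundfile. Where that extra dependency is unwanted, a float array can also be written with Python's standard `wave` module. This is a minimal sketch, assuming each `wav` is a mono, one-dimensional sequence of floats in the range [-1.0, 1.0]; the source only states that it is a float array. `float_to_pcm16` and `write_wav` are hypothetical helpers, and the 22050 Hz test tone stands in for real model output.

```python
import array
import math
import os
import sys
import tempfile
import wave


def float_to_pcm16(samples):
    """Convert floats (assumed to lie in [-1.0, 1.0]) to signed 16-bit PCM."""
    pcm = array.array("h")
    for sample in samples:
        # Clip out-of-range values so they do not overflow 16 bits.
        clipped = max(-1.0, min(1.0, float(sample)))
        pcm.append(int(round(clipped * 32767)))
    return pcm


def write_wav(path, samples, sample_rate):
    """Write mono 16-bit PCM audio using only the standard library."""
    pcm = float_to_pcm16(samples)
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV files store samples as little-endian
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)  # mono (assumption)
        wav_file.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm.tobytes())


# Stand-in for model output: one second of a 440 Hz tone at a placeholder rate.
demo_rate = 22050
tone = [0.5 * math.sin(2 * math.pi * 440 * n / demo_rate) for n in range(demo_rate)]
demo_path = os.path.join(tempfile.gettempdir(), "ospeech-demo-tone.wav")
write_wav(demo_path, tone, demo_rate)
```

With a loaded model, the call inside the loop would become `write_wav(f"output-{idx}.wav", wav, model.sample_rate)`.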
Licence
Copyright (c) Musharraf Omer. MIT Licence. See LICENSE for more details.