speech_neuron
A text-to-speech server to convert text to speech using the Kokoro-TTS models and FastAPI.
Quick Start
Run:
pip install speech_neuron
Create a config.yaml file as described in the Configuration section below.
Create a main.py file with the following content:
import os
import yaml
from fastapi import FastAPI
import uvicorn
from speech_neuron import SpeechNeuronServer, NodeConfig
CONFIG_PATH = os.environ.get("NODE_CONFIG_PATH", "config.yaml")
# Load the node configuration and close the file handle afterwards
with open(CONFIG_PATH, "r") as f:
    config = NodeConfig(**yaml.safe_load(f))
app = FastAPI()
speech_neuron = SpeechNeuronServer(config)
app.include_router(speech_neuron.router)
if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000)
Create a client file client.py with the following content (it imports requests, sounddevice and soundfile, so install them first):
import requests
import io
import sounddevice as sd
import soundfile as sf
from datetime import datetime
HOST = "http://127.0.0.1:8000"  # <--- Change to your server IP
url = f"{HOST}/node/speech"
start = datetime.now()
response = requests.get(
    url,
    params={
        "text": """Anyway, it was the Saturday of the football game with Saxon Hall.
The game with Saxon Hall was supposed to be a very big deal around Pencey.
""",
        "voice": "af_bella",
        "speed": 1.1,
        "split_pattern": r"\n+",
    },
    stream=True,
)
# Fail early on HTTP errors, then read the streamed response into memory
response.raise_for_status()
audio_buffer = io.BytesIO()
for chunk in response.iter_content(chunk_size=4096):
    if chunk:
        audio_buffer.write(chunk)
# Play the audio once the whole response has been received
audio_buffer.seek(0) # Reset buffer for reading
data, samplerate = sf.read(audio_buffer)
sd.play(data, samplerate)
sd.wait() # Wait for audio to finish playing
print(f"Time taken: {datetime.now() - start}")
Run:
python main.py &
And then run:
python client.py
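If you only want to save the audio rather than play it, a client without sounddevice or soundfile can be enough. The following is a minimal sketch using only the standard library. It assumes the same /node/speech endpoint and query parameters as client.py above. The helper names speech_url and save_speech are hypothetical and not part of speech_neuron.

```python
import shutil
from urllib.parse import urlencode
from urllib.request import urlopen


def speech_url(host, text, voice="af_bella", speed=1.0, split_pattern=r"\n+"):
    """Build the GET URL for the /node/speech endpoint used by client.py."""
    query = urlencode(
        {
            "text": text,
            "voice": voice,
            "speed": speed,
            "split_pattern": split_pattern,
        }
    )
    return f"{host.rstrip('/')}/node/speech?{query}"


def save_speech(url, path, timeout=60):
    """Copy the response body to a file chunk by chunk instead of buffering it."""
    with urlopen(url, timeout=timeout) as response, open(path, "wb") as out:
        shutil.copyfileobj(response, out)
    return path
```

With the server running, save_speech(speech_url("http://127.0.0.1:8000", "Hello there."), "hello.wav") should write the response to hello.wav, assuming the server is configured with format: wav.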
Configuration
Create a config.yaml file with the following content:
name: "speech_node"
# "kokoro-v1.0.fp16-gpu.onnx",
# "kokoro-v1.0.fp16.onnx",
# "kokoro-v1.0.int8.onnx",
# "kokoro-v1.0.onnx"
model_name: kokoro-v1.0.int8.onnx
voices_name: voices-v1.0.bin
response:
  # TODO: type: stream
  sample_rate: 24000
  format: wav
  compression_level: 0
pipeline:
  model:
    device: cpu # cpu or cuda
    use_transformer: true
  # Model configuration
  # 'a' = American English
  # 'b' = British English
  # 'e' = Spanish
  # 'f' = French
  # 'h' = Hindi
  # 'i' = Italian
  # 'p' = Portuguese
  # 'j' = Japanese
  # 'z' = Chinese
  language_code: en-us
  # Request defaults
  speed: 1.0 # Can be set during request
  voice: "af_heart" # Can be set during request
  split_pattern: "\n" # Can be set during request
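A quick sanity check on the parsed configuration can catch typos before the server loads a model. The sketch below works on the dictionary returned by yaml.safe_load and only inspects keys whose nesting is unambiguous in the example above (model_name, response.sample_rate, pipeline.model.device). The function check_config is hypothetical and not part of speech_neuron; NodeConfig may already perform its own validation.

```python
# Model file names listed in the comments of the example config.yaml
KNOWN_MODELS = {
    "kokoro-v1.0.fp16-gpu.onnx",
    "kokoro-v1.0.fp16.onnx",
    "kokoro-v1.0.int8.onnx",
    "kokoro-v1.0.onnx",
}


def check_config(cfg):
    """Return a list of problems found; an empty list means none were found."""
    problems = []

    if cfg.get("model_name") not in KNOWN_MODELS:
        problems.append(f"unknown model_name: {cfg.get('model_name')!r}")

    # 'or {}' guards against sections that are present but empty (parsed as None)
    model = (cfg.get("pipeline") or {}).get("model") or {}
    if model.get("device") not in ("cpu", "cuda"):
        problems.append(f"device must be 'cpu' or 'cuda', got {model.get('device')!r}")

    sample_rate = (cfg.get("response") or {}).get("sample_rate")
    if not isinstance(sample_rate, int) or sample_rate <= 0:
        problems.append(f"sample_rate must be a positive integer, got {sample_rate!r}")

    return problems
```

In main.py this could run on the loaded dictionary before NodeConfig is constructed, for example by printing each entry of check_config(raw_config) and exiting if the list is not empty.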
Dependencies
Linux
Ubuntu
sudo apt update
sudo apt install libglslang-dev
Manjaro
sudo pacman -S ffmpeg glslang
# Check for version mismatch
find /usr -name "libglslang-default-resource-limits.so*"
# If version mismatch
sudo ln -s /usr/lib/libglslang-default-resource-limits.so.15 /usr/lib/libglslang-default-resource-limits.so.14
# Check for version mismatch
find /usr -name "libSPIRV.so*"
# If version mismatch
sudo ldconfig
If the NVIDIA GPU is not working, reload the nvidia_uvm kernel module:
sudo modprobe -r nvidia_uvm
sudo modprobe nvidia_uvm
macOS
brew install ffmpeg
brew install glslang
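To see which of the system dependencies above are visible from Python, a small check like the following can help. It is a sketch under assumptions: the library names "glslang" and "SPIRV" are inferred from the file names in the Manjaro commands, ffmpeg is only installed by the Manjaro and macOS steps, and ctypes.util.find_library may fail to locate a library that is installed in a non-standard path. The function missing_dependencies is hypothetical.

```python
import shutil
from ctypes.util import find_library


def missing_dependencies(
    libraries=("glslang", "SPIRV"),
    tools=("ffmpeg",),
    find_lib=find_library,
    find_tool=shutil.which,
):
    """Return the shared libraries and command-line tools that could not be found."""
    # find_library returns None when the linker cannot resolve the name
    missing = [f"lib:{name}" for name in libraries if find_lib(name) is None]
    # shutil.which returns None when the executable is not on PATH
    missing += [f"tool:{name}" for name in tools if find_tool(name) is None]
    return missing
```

Calling print(missing_dependencies()) lists whatever could not be located; an empty list suggests the packages are installed and discoverable.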