Python client for Zyphra APIs
Zyphra Python Client
A Python client library for interacting with Zyphra's text-to-speech API.
Installation
pip install zyphra
Quick Start
from zyphra import ZyphraClient
# Initialize the client
client = ZyphraClient(api_key="your-api-key")
# Generate speech and save to file
output_path = client.audio.speech.create(
    text="Hello, world!",
    speaking_rate=15,
    model="zonos-v0.1-transformer",  # Default model
    output_path="output.webm"
)
# Or get audio data as bytes
audio_data = client.audio.speech.create(
    text="Hello, world!",
    speaking_rate=15
)
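When `output_path` is omitted, the call returns the audio as bytes, so writing it to disk is left to the caller. A minimal sketch of that step, assuming the return value is a plain `bytes` object; `save_audio` is a hypothetical helper and not part of the zyphra package:

```python
from pathlib import Path


def save_audio(audio_data: bytes, path: str) -> Path:
    """Write raw audio bytes to `path`, creating parent directories if needed."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(audio_data)
    return target
```

For example, `save_audio(audio_data, "clips/hello.webm")` would store the bytes returned by the second call above.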
Features
- Text-to-speech generation with customizable parameters
- Support for multiple languages and audio formats
- Voice cloning capabilities
- Multiple TTS models with specialized capabilities
- Both synchronous and asynchronous operations
- Streaming support for audio responses
- Built-in type hints and validation
- Support for default and custom voice selection
Requirements
- Python 3.8+
- aiohttp for async operations
- pydantic for data validation
- requests for synchronous operations
Detailed Usage
Synchronous Client
from zyphra import ZyphraClient
with ZyphraClient(api_key="your-api-key") as client:
    # Save directly to file
    output_path = client.audio.speech.create(
        text="Hello, world!",
        speaking_rate=15,
        model="zonos-v0.1-transformer",
        output_path="output.webm"
    )
    # Get audio data as bytes
    audio_data = client.audio.speech.create(
        text="Hello, world!",
        speaking_rate=15
    )
Asynchronous Client
import asyncio
from zyphra import AsyncZyphraClient

# `async with` and `await` must run inside a coroutine
async def main():
    async with AsyncZyphraClient(api_key="your-api-key") as client:
        audio_data = await client.audio.speech.create(
            text="Hello, world!",
            speaking_rate=15,
            model="zonos-v0.1-transformer"
        )

asyncio.run(main())
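The asynchronous client is most useful when several clips are generated at once. The sketch below runs a batch of requests concurrently, assuming only the `client.audio.speech.create(text=..., speaking_rate=...)` call shown above. `synthesize_all` and its `max_concurrent` limit are hypothetical; this README does not state any rate limits, so the default of 4 is an arbitrary choice.

```python
import asyncio
from typing import Any, List, Sequence


async def synthesize_all(
    client: Any, texts: Sequence[str], max_concurrent: int = 4
) -> List[bytes]:
    """Generate one clip per text, with at most `max_concurrent` requests in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def synthesize(text: str) -> bytes:
        async with semaphore:
            # Same call as in the asynchronous example above
            return await client.audio.speech.create(text=text, speaking_rate=15)

    # gather preserves input order, so results[i] corresponds to texts[i]
    return await asyncio.gather(*(synthesize(t) for t in texts))
```

Inside an `async with AsyncZyphraClient(...) as client:` block, this could be called as `clips = await synthesize_all(client, ["First line.", "Second line."])`.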
Supported TTS Models
The API supports the following TTS models:
- zonos-v0.1-transformer (Default): A standard transformer-based TTS model suitable for most applications.
  - Emotion and pitch_std parameters available
- zonos-v0.1-hybrid: An advanced model with:
  - Better support for certain languages (especially Japanese)
  - Support for the speaker_noised denoising parameter
  - Improved voice quality in some scenarios
Advanced Options
The text-to-speech API supports various parameters to control the output:
from typing import Optional, Literal
from pydantic import BaseModel

# Define supported models
SupportedModel = Literal['zonos-v0.1-transformer', 'zonos-v0.1-hybrid']

# Defined first because TTSParams refers to it
class EmotionWeights(BaseModel):
    happiness: float = 0.6
    sadness: float = 0.05
    disgust: float = 0.05
    fear: float = 0.05
    surprise: float = 0.05
    anger: float = 0.05
    other: float = 0.5
    neutral: float = 0.6

# Optional fields default to None here; the documented default for each is given in its comment
class TTSParams(BaseModel):
    text: str  # The text to convert to speech (required)
    speaker_audio: Optional[str] = None  # Base64 audio for voice cloning
    speaking_rate: Optional[float] = None  # Speaking rate (5-35, default: 15.0)
    fmax: Optional[int] = None  # Frequency max (0-24000, default: 22050)
    pitch_std: Optional[float] = None  # Pitch standard deviation (0-500, default: 45.0) (transformer model only)
    emotion: Optional[EmotionWeights] = None  # Emotional weights (transformer model only)
    language_iso_code: Optional[str] = None  # Language code (e.g., "en-us", "fr-fr")
    mime_type: Optional[str] = None  # Output audio format (e.g., "audio/webm")
    model: Optional[SupportedModel] = None  # TTS model (default: 'zonos-v0.1-transformer')
    speaker_noised: Optional[bool] = None  # Denoises to improve stability (hybrid model only, default: True)
    default_voice_name: Optional[str] = None  # Name of a default voice to use
    voice_name: Optional[str] = None  # Name of one of the user's voices to use
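The ranges and model restrictions listed above can be checked locally before a request is sent. The following is a sketch under stated assumptions: `check_tts_params` is a hypothetical helper, the ranges are treated as inclusive, and this README does not say whether the API rejects or silently ignores a parameter that does not apply to the chosen model.

```python
from typing import Any, Dict, List

# Ranges copied from the parameter comments above (assumed inclusive).
RANGES = {
    "speaking_rate": (5, 35),
    "fmax": (0, 24000),
    "pitch_std": (0, 500),
}
SUPPORTED_MODELS = ("zonos-v0.1-transformer", "zonos-v0.1-hybrid")
DEFAULT_MODEL = "zonos-v0.1-transformer"
TRANSFORMER_ONLY = {"pitch_std", "emotion"}
HYBRID_ONLY = {"speaker_noised"}


def check_tts_params(params: Dict[str, Any]) -> List[str]:
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    if not params.get("text"):
        problems.append("text is required")
    for name, (low, high) in RANGES.items():
        value = params.get(name)
        if value is not None and not low <= value <= high:
            problems.append(f"{name}={value} is outside {low}-{high}")
    model = params.get("model") or DEFAULT_MODEL
    if model not in SUPPORTED_MODELS:
        problems.append(f"unknown model {model!r}")
    else:
        # Flag parameters documented as belonging to the other model
        not_applicable = HYBRID_ONLY if model == DEFAULT_MODEL else TRANSFORMER_ONLY
        for name in sorted(not_applicable):
            if params.get(name) is not None:
                problems.append(f"{name} does not apply to {model}")
    return problems
```

A caller might build the keyword arguments as a dict, run `check_tts_params(kwargs)`, and only call `client.audio.speech.create(**kwargs)` when the returned list is empty.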
Supported Languages
The text-to-speech API supports the following languages:
- English (US) - en-us
- French - fr-fr
- German - de
- Japanese - ja (recommended to use with the zonos-v0.1-hybrid model)
- Korean - ko
- Mandarin Chinese - cmn
Supported Audio Formats
The API supports multiple output formats through the mime_type parameter:
- WebM (default) - audio/webm
- Ogg - audio/ogg
- WAV - audio/wav
- MP3 - audio/mp3 or audio/mpeg
- MP4/AAC - audio/mp4 or audio/aac
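Because the output format is chosen through `mime_type`, the file name passed as `output_path` should carry a matching extension. A small sketch of that mapping follows; `output_name` is a hypothetical helper, and the extension chosen for each MIME type (for instance `.m4a` for `audio/mp4`) is a conventional assumption rather than something this README specifies.

```python
# MIME types from the list above, mapped to conventional file extensions.
# The extension choices are assumptions, not part of the zyphra package.
EXTENSIONS = {
    "audio/webm": ".webm",
    "audio/ogg": ".ogg",
    "audio/wav": ".wav",
    "audio/mp3": ".mp3",
    "audio/mpeg": ".mp3",
    "audio/mp4": ".m4a",
    "audio/aac": ".aac",
}


def output_name(stem: str, mime_type: str = "audio/webm") -> str:
    """Build a file name whose extension matches the requested MIME type."""
    try:
        return stem + EXTENSIONS[mime_type]
    except KeyError:
        raise ValueError(f"unsupported mime_type: {mime_type!r}") from None
```

The result can be passed as `output_path` alongside the same `mime_type` value.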
Language and Format Examples
# Generate French speech in MP3 format
audio_data = client.audio.speech.create(
    text="Bonjour le monde!",
    language_iso_code="fr-fr",
    mime_type="audio/mp3",
    speaking_rate=15
)
# Generate Japanese speech in WAV format with hybrid model (recommended)
audio_data = client.audio.speech.create(
    text="こんにちは世界!",
    language_iso_code="ja",
    mime_type="audio/wav",
    speaking_rate=15,
    model="zonos-v0.1-hybrid"  # Better for Japanese
)
Using Default and Custom Voices
You can use pre-defined default voices or your own custom voices:
# Using a default voice
audio_data = client.audio.speech.create(
    text="This uses a default voice.",
    default_voice_name="american_female",
    speaking_rate=15
)
Available Default Voices
The following default voices are available:
- american_female - Standard American English female voice
- american_male - Standard American English male voice
- anime_girl - Stylized anime girl character voice
- british_female - British English female voice
- british_male - British English male voice
- energetic_boy - Energetic young male voice
- energetic_girl - Energetic young female voice
- japanese_female - Japanese female voice
- japanese_male - Japanese male voice
Using Custom Voices
You can use your own custom voices that have been created and stored in your account:
# Using a custom voice you've created and stored
audio_data = client.audio.speech.create(
    text="This uses your custom voice.",
    voice_name="my_custom_voice",
    speaking_rate=15
)
Note: When using custom voices, the voice_name parameter should exactly match the name as it appears in your voices list on playground.zyphra.com/audio. The name is case-sensitive.
Model-Specific Parameters
For the hybrid model (zonos-v0.1-hybrid), you can utilize additional parameters:
# Using the hybrid model with its specific parameters
audio_data = client.audio.speech.create(
    text="This uses the hybrid model with special parameters.",
    model="zonos-v0.1-hybrid",
    speaker_noised=True,  # Denoises to improve stability
    speaking_rate=15
)
Emotion Control
You can adjust the emotional tone of the speech:
from zyphra.models.audio import EmotionWeights
# Create custom emotion weights
emotions = EmotionWeights(
    happiness=0.8,  # Increase happiness
    neutral=0.3,  # Decrease neutrality
    # Other emotions use default values
)
# Generate speech with emotional tone
audio_data = client.audio.speech.create(
    text="This is a happy message!",
    emotion=emotions,
    speaking_rate=15,
    model="zonos-v0.1-transformer"
)
Voice Cloning
You can clone voices by providing a reference audio file:
import base64
# Read and encode audio file
with open("reference_voice.wav", "rb") as f:
    audio_base64 = base64.b64encode(f.read()).decode('utf-8')
# Generate speech with cloned voice
audio_data = client.audio.speech.create(
    text="This will use the cloned voice",
    speaker_audio=audio_base64,
    speaking_rate=15,
    model="zonos-v0.1-transformer"
)
Error Handling
from zyphra import ZyphraError
try:
    client.audio.speech.create(
        text="Hello, world!",
        speaking_rate=15,
        model="zonos-v0.1-transformer"
    )
except ZyphraError as e:
    print(f"Error: {e.status_code} - {e.response_text}")
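Since `ZyphraError` exposes a `status_code`, transient failures can be retried while other errors are raised immediately. The sketch below is one possible approach, not part of the zyphra package: `call_with_retry` is a hypothetical helper, and the set of retryable status codes is an assumption, as this README does not document which codes the API returns.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    func: Callable[[], T],
    error_type: Type[Exception],
    retry_statuses: Tuple[int, ...] = (429, 500, 502, 503, 504),  # assumed transient
    attempts: int = 3,
    base_delay: float = 1.0,
) -> T:
    """Call `func`, retrying with exponential backoff on selected status codes."""
    for attempt in range(attempts):
        try:
            return func()
        except error_type as exc:
            status = getattr(exc, "status_code", None)
            last_attempt = attempt == attempts - 1
            if status not in retry_statuses or last_attempt:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt
            time.sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

With the client from the example above, a call could look like `call_with_retry(lambda: client.audio.speech.create(text="Hello, world!", speaking_rate=15), ZyphraError)`.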
Available Models
Speech Models
- zonos-v0.1-transformer: Default transformer-based TTS model
- zonos-v0.1-hybrid: Advanced hybrid TTS model with enhanced language support
License
MIT License