Python client for Zyphra APIs

Zyphra Python Client

A Python client library for interacting with Zyphra's text-to-speech API.

Installation

pip install zyphra

Quick Start

from zyphra import ZyphraClient

# Initialize the client
client = ZyphraClient(api_key="your-api-key")

# Generate speech and save to file
output_path = client.audio.speech.create(
    text="Hello, world!",
    speaking_rate=15,
    model="zonos-v0.1-transformer",  # Default model
    output_path="output.webm"
)

# Or get audio data as bytes
audio_data = client.audio.speech.create(
    text="Hello, world!",
    speaking_rate=15
)
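When output_path is omitted, the call above returns the audio as bytes. A minimal sketch of writing those bytes to disk yourself follows; save_audio is a hypothetical helper for illustration and is not part of the zyphra package.

```python
from pathlib import Path


def save_audio(audio_data: bytes, path: str) -> Path:
    """Write raw audio bytes to disk and return the resulting path.

    Hypothetical helper, not part of the zyphra package.
    """
    target = Path(path)
    # Create the parent directory first so the write does not fail on it.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(audio_data)
    return target
```

With the client from the Quick Start, this could be used as save_audio(audio_data, "out/hello.webm").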

Features

Text-to-speech generation with customizable parameters
Support for multiple languages and audio formats
Voice cloning capabilities
Multiple TTS models with specialized capabilities
Both synchronous and asynchronous operations
Streaming support for audio responses
Built-in type hints and validation
Support for default and custom voice selection

Requirements

Python 3.8+
aiohttp for async operations
pydantic for data validation
requests for synchronous operations

Detailed Usage

Synchronous Client

from zyphra import ZyphraClient

with ZyphraClient(api_key="your-api-key") as client:
    # Save directly to file
    output_path = client.audio.speech.create(
        text="Hello, world!",
        speaking_rate=15,
        model="zonos-v0.1-transformer",
        output_path="output.webm"
    )
    
    # Get audio data as bytes
    audio_data = client.audio.speech.create(
        text="Hello, world!",
        speaking_rate=15
    )

Asynchronous Client

import asyncio

from zyphra import AsyncZyphraClient

async def main():
    # "async with" is only valid inside a coroutine
    async with AsyncZyphraClient(api_key="your-api-key") as client:
        audio_data = await client.audio.speech.create(
            text="Hello, world!",
            speaking_rate=15,
            model="zonos-v0.1-transformer"
        )

asyncio.run(main())
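One reason to prefer the asynchronous client is issuing several requests at once. The sketch below assumes only that the client exposes audio.speech.create as an awaitable, as shown above; synthesize_many is a hypothetical helper, and whether the API limits concurrent requests is not stated here.

```python
import asyncio
from typing import Any, List, Sequence


async def synthesize_many(
    client: Any, texts: Sequence[str], **params: Any
) -> List[bytes]:
    """Request speech for several texts concurrently.

    Hypothetical helper: `client` is expected to expose an awaitable
    `audio.speech.create`, like the asynchronous client shown above.
    """
    tasks = [client.audio.speech.create(text=t, **params) for t in texts]
    # gather returns results in the same order as the input texts.
    return await asyncio.gather(*tasks)
```

Inside an async function this could be called as await synthesize_many(client, ["Hello", "Goodbye"], speaking_rate=15).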

Supported TTS Models

The API supports the following TTS models:

zonos-v0.1-transformer (default): A standard transformer-based TTS model suitable for most applications.
- Supports the emotion and pitch_std parameters
zonos-v0.1-hybrid: An advanced model with:
- Better support for certain languages (especially Japanese)
- Support for the speaker_noised denoising parameter
- Improved voice quality in some scenarios

Advanced Options

The text-to-speech API supports various parameters to control the output:

from typing import Optional, Literal
from pydantic import BaseModel

# Define supported models
SupportedModel = Literal['zonos-v0.1-transformer', 'zonos-v0.1-hybrid']

# Defined first because TTSParams refers to it
class EmotionWeights(BaseModel):
    happiness: float = 0.6
    sadness: float = 0.05
    disgust: float = 0.05
    fear: float = 0.05
    surprise: float = 0.05
    anger: float = 0.05
    other: float = 0.5
    neutral: float = 0.6

# Reference listing of the accepted parameters. Optional fields are shown
# with a None default, meaning "not set"; the documented defaults are in the comments.
class TTSParams(BaseModel):
    text: str                                 # The text to convert to speech (required)
    speaker_audio: Optional[str] = None       # Base64 audio for voice cloning
    speaking_rate: Optional[float] = None     # Speaking rate (5-35, default: 15.0)
    fmax: Optional[int] = None                # Frequency max (0-24000, default: 22050)
    pitch_std: Optional[float] = None         # Pitch standard deviation (0-500, default: 45.0) (transformer model only)
    emotion: Optional[EmotionWeights] = None  # Emotional weights (transformer model only)
    language_iso_code: Optional[str] = None   # Language code (e.g., "en-us", "fr-fr")
    mime_type: Optional[str] = None           # Output audio format (e.g., "audio/webm")
    model: Optional[SupportedModel] = None    # TTS model (default: 'zonos-v0.1-transformer')
    speaker_noised: Optional[bool] = None     # Denoises to improve stability (hybrid model only, default: True)
    default_voice_name: Optional[str] = None  # Name of a default voice to use
    voice_name: Optional[str] = None          # Name of one of the user's voices to use
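The numeric ranges listed above can be checked locally before a request is sent. This is a sketch only: check_ranges is a hypothetical helper, and treating the bounds as inclusive is an assumption, since the listing does not say whether the endpoints are allowed.

```python
from typing import Optional


def check_ranges(
    speaking_rate: Optional[float] = None,
    fmax: Optional[int] = None,
    pitch_std: Optional[float] = None,
) -> None:
    """Raise ValueError if a value falls outside its documented range.

    Hypothetical helper; bounds are assumed to be inclusive.
    """
    limits = {
        "speaking_rate": (speaking_rate, 5, 35),
        "fmax": (fmax, 0, 24000),
        "pitch_std": (pitch_std, 0, 500),
    }
    for name, (value, low, high) in limits.items():
        # None means the parameter was not set, so there is nothing to check.
        if value is not None and not (low <= value <= high):
            raise ValueError(f"{name}={value} is outside [{low}, {high}]")
```

Calling check_ranges(speaking_rate=15, fmax=22050) passes silently, while check_ranges(speaking_rate=40) raises ValueError.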

Supported Languages

The text-to-speech API supports the following languages:

English (US) - en-us
French - fr-fr
German - de
Japanese - ja (the zonos-v0.1-hybrid model is recommended)
Korean - ko
Mandarin Chinese - cmn
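The language list and the Japanese recommendation above can be captured in a small lookup. This is an illustrative sketch: SUPPORTED_LANGUAGES and recommended_model are hypothetical names, and lowercasing the code before the lookup is an assumption about how callers might pass it.

```python
SUPPORTED_LANGUAGES = {
    "en-us": "English (US)",
    "fr-fr": "French",
    "de": "German",
    "ja": "Japanese",
    "ko": "Korean",
    "cmn": "Mandarin Chinese",
}


def recommended_model(language_iso_code: str) -> str:
    """Return the model suggested for a language code (hypothetical helper)."""
    code = language_iso_code.lower()
    if code not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language code: {language_iso_code!r}")
    # Only Japanese is documented as working better with the hybrid model;
    # every other language falls back to the default transformer model.
    return "zonos-v0.1-hybrid" if code == "ja" else "zonos-v0.1-transformer"
```

The result can be passed straight to the model parameter, for example model=recommended_model("ja").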

Supported Audio Formats

The API supports multiple output formats through the mime_type parameter:

WebM (default) - audio/webm
Ogg - audio/ogg
WAV - audio/wav
MP3 - audio/mp3 or audio/mpeg
MP4/AAC - audio/mp4 or audio/aac
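When saving audio yourself, it helps to derive the file extension from the requested MIME type. The mapping below is a sketch: the MIME types come from the list above, while the extensions and the output_filename helper are illustrative choices, not something the library defines.

```python
MIME_TO_EXTENSION = {
    "audio/webm": ".webm",
    "audio/ogg": ".ogg",
    "audio/wav": ".wav",
    "audio/mp3": ".mp3",
    "audio/mpeg": ".mp3",
    "audio/mp4": ".mp4",
    "audio/aac": ".aac",
}


def output_filename(stem: str, mime_type: str = "audio/webm") -> str:
    """Build a file name whose extension matches the MIME type.

    Hypothetical helper; the extension choices are assumptions.
    """
    try:
        return stem + MIME_TO_EXTENSION[mime_type]
    except KeyError:
        raise ValueError(f"Unsupported mime_type: {mime_type!r}") from None
```

For instance, output_filename("greeting", "audio/wav") yields a name suitable for the output_path parameter.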

Language and Format Examples

# Generate French speech in MP3 format
audio_data = client.audio.speech.create(
    text="Bonjour le monde!",
    language_iso_code="fr-fr",
    mime_type="audio/mp3",
    speaking_rate=15
)

# Generate Japanese speech in WAV format with hybrid model (recommended)
audio_data = client.audio.speech.create(
    text="こんにちは世界！",
    language_iso_code="ja",
    mime_type="audio/wav",
    speaking_rate=15,
    model="zonos-v0.1-hybrid"  # Better for Japanese
)

Using Default and Custom Voices

You can use pre-defined default voices or your own custom voices:

# Using a default voice
audio_data = client.audio.speech.create(
    text="This uses a default voice.",
    default_voice_name="american_female",
    speaking_rate=15
)

Available Default Voices

The following default voices are available:

american_female - Standard American English female voice
american_male - Standard American English male voice
anime_girl - Stylized anime girl character voice
british_female - British English female voice
british_male - British English male voice
energetic_boy - Energetic young male voice
energetic_girl - Energetic young female voice
japanese_female - Japanese female voice
japanese_male - Japanese male voice
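A typo in default_voice_name is easy to make, so a local check against the list above can catch it before the request is sent. This is a sketch; DEFAULT_VOICES and resolve_default_voice are hypothetical names, and the list reflects only the voices documented here.

```python
from difflib import get_close_matches

DEFAULT_VOICES = (
    "american_female",
    "american_male",
    "anime_girl",
    "british_female",
    "british_male",
    "energetic_boy",
    "energetic_girl",
    "japanese_female",
    "japanese_male",
)


def resolve_default_voice(name: str) -> str:
    """Return `name` if it is a documented default voice, else raise.

    Hypothetical helper, not part of the zyphra package.
    """
    if name in DEFAULT_VOICES:
        return name
    # Offer the closest documented name as a hint, if there is one.
    close = get_close_matches(name, DEFAULT_VOICES, n=1)
    hint = f" Did you mean '{close[0]}'?" if close else ""
    raise ValueError(f"Unknown default voice '{name}'.{hint}")
```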

Using Custom Voices

You can use your own custom voices that have been created and stored in your account:

# Using a custom voice you've created and stored
audio_data = client.audio.speech.create(
    text="This uses your custom voice.",
    voice_name="my_custom_voice",
    speaking_rate=15
)

Note: When using custom voices, the voice_name parameter should exactly match the name as it appears in your voices list on playground.zyphra.com/audio. The name is case-sensitive.

Model-Specific Parameters

The hybrid model (zonos-v0.1-hybrid) accepts the additional speaker_noised parameter:

# Using the hybrid model with its specific parameters
audio_data = client.audio.speech.create(
    text="This uses the hybrid model with special parameters.",
    model="zonos-v0.1-hybrid",
    speaker_noised=True,    # Denoises to improve stability
    speaking_rate=15
)
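Since pitch_std and emotion are documented as transformer-only and speaker_noised as hybrid-only, one option is to drop the parameters a model does not support before making the call. This is a sketch under that reading; params_for_model is a hypothetical helper, and how the API itself reacts to an unsupported parameter is not stated here.

```python
from typing import Any, Dict

TRANSFORMER_ONLY = {"pitch_std", "emotion"}
HYBRID_ONLY = {"speaker_noised"}


def params_for_model(model: str, params: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `params` without options the model does not support.

    Hypothetical helper, not part of the zyphra package.
    """
    if model == "zonos-v0.1-transformer":
        unsupported = HYBRID_ONLY
    elif model == "zonos-v0.1-hybrid":
        unsupported = TRANSFORMER_ONLY
    else:
        raise ValueError(f"Unknown model: {model!r}")
    cleaned = {k: v for k, v in params.items() if k not in unsupported}
    # Always send the model explicitly so the filtering and the request agree.
    cleaned["model"] = model
    return cleaned
```

The result can be unpacked into the call, for example client.audio.speech.create(**params_for_model("zonos-v0.1-hybrid", params)).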

Emotion Control

You can adjust the emotional tone of the speech:

from zyphra.models.audio import EmotionWeights

# Create custom emotion weights
emotions = EmotionWeights(
    happiness=0.8,  # Increase happiness
    neutral=0.3,    # Decrease neutrality
    # Other emotions use default values
)

# Generate speech with emotional tone
audio_data = client.audio.speech.create(
    text="This is a happy message!",
    emotion=emotions,
    speaking_rate=15,
    model="zonos-v0.1-transformer"
)

Voice Cloning

You can clone voices by providing a reference audio file:

import base64

# Read and encode audio file
with open("reference_voice.wav", "rb") as f:
    audio_base64 = base64.b64encode(f.read()).decode('utf-8')

# Generate speech with cloned voice
audio_data = client.audio.speech.create(
    text="This will use the cloned voice",
    speaker_audio=audio_base64,
    speaking_rate=15,
    model="zonos-v0.1-transformer"
)
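The read-and-encode step above can be wrapped in a small reusable function. This is a minimal sketch; encode_reference_audio is a hypothetical name, and any limits on the length or format of the reference audio are not covered here.

```python
import base64
from pathlib import Path


def encode_reference_audio(path: str) -> str:
    """Read an audio file and return its contents as a Base64 string.

    Hypothetical helper, not part of the zyphra package.
    """
    raw = Path(path).read_bytes()
    # b64encode returns bytes; decode so the value can be sent as text.
    return base64.b64encode(raw).decode("utf-8")
```

The returned string is what the speaker_audio parameter expects, for example speaker_audio=encode_reference_audio("reference_voice.wav").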

Error Handling

from zyphra import ZyphraError

try:
    client.audio.speech.create(
        text="Hello, world!",
        speaking_rate=15,
        model="zonos-v0.1-transformer"
    )
except ZyphraError as e:
    print(f"Error: {e.status_code} - {e.response_text}")
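For transient failures, a caller may want to retry instead of only printing the error. The sketch below is a generic retry loop with exponential backoff; call_with_retries is a hypothetical helper, and which errors are actually worth retrying (for example, particular status_code values) is an assumption left to the caller.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(
    fn: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    attempts: int = 3,
    base_delay: float = 0.5,
) -> T:
    """Call `fn`, retrying on the given exception types.

    Hypothetical helper, not part of the zyphra package.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            # Out of attempts: let the last error propagate to the caller.
            if attempt == attempts - 1:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next try.
            time.sleep(base_delay * (2 ** attempt))
    raise ValueError("attempts must be at least 1")
```

With the client above, this could be used as call_with_retries(lambda: client.audio.speech.create(text="Hello, world!"), (ZyphraError,)).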

Available Models

Speech Models

zonos-v0.1-transformer: Default transformer-based TTS model
zonos-v0.1-hybrid: Advanced hybrid TTS model with enhanced language support

License

MIT License

Release history

0.1.6 - Apr 7, 2025 (this version)
0.1.5 - Apr 2, 2025
0.1.4 - Mar 4, 2025
0.1.3 - Mar 4, 2025
0.1.2 - Feb 5, 2025
0.1.1 - Feb 5, 2025
0.1.0 - Feb 3, 2025

