PlayHT API SDK

pyht is a Python SDK for PlayHT's AI Text-to-Speech API. PlayHT builds conversational voice AI models for realtime use cases. With pyht, you can easily convert text into high-quality audio streams with humanlike voices.

Currently, the library supports only streaming text-to-speech. For the rest of the PlayHT API, such as Voice Cloning, see the PlayHT docs.

Features

  • Stream text-to-speech in real time, synchronously or asynchronously.
  • Use PlayHT's pre-built voices or your own custom voice clones.
  • Stream text from an LLM and generate the audio stream in real time.
  • Supports WAV, MP3, Mulaw, FLAC, and OGG audio formats, as well as raw audio.
  • Supports 8 kHz, 16 kHz, 24 kHz, 44.1 kHz, and 48 kHz sample rates.

Requirements

  • Python 3.8+
  • aiohttp
  • filelock
  • grpc
  • requests
  • websockets

Installation

You can install the pyht SDK using pip:

pip install pyht

Usage

You can use the pyht SDK by creating a Client instance and calling its tts method. Here's a simple example:

import os

from dotenv import load_dotenv  # separate install: pip install python-dotenv
from pyht import Client
from pyht.client import TTSOptions

# Load PLAY_HT_USER_ID and PLAY_HT_API_KEY from a local .env file, if present.
load_dotenv()

client = Client(
    user_id=os.getenv("PLAY_HT_USER_ID"),
    api_key=os.getenv("PLAY_HT_API_KEY"),
)
options = TTSOptions(voice="s3://voice-cloning-zero-shot/775ae416-49bb-4fb6-bd45-740f205d20a1/jennifersaad/manifest.json")
for chunk in client.tts("Hi, I'm Jennifer from Play. How can I help you today?", options):
    # do something with the audio chunk
    print(type(chunk))
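
A common next step is to save the streamed audio to disk. The sketch below assumes each chunk yielded by client.tts is a bytes object (the example above only prints the chunk type, so check this against your installed version). write_chunks is a hypothetical helper, not part of pyht.

```python
from pathlib import Path
from typing import Iterable, Union


def write_chunks(chunks: Iterable[bytes], path: Union[str, Path]) -> int:
    """Write audio chunks to `path` as they arrive; return the total bytes written."""
    total = 0
    with open(path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)
            total += len(chunk)
    return total


# Hypothetical usage with the client and options from the example above
# (MP3 is the documented default format):
# write_chunks(client.tts("Hello from Play!", options), "hello.mp3")
```

Because the helper accepts any iterable of bytes, it can be exercised with a plain list before any credentials are configured.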

It is also possible to stream text instead of submitting it as a string all at once:

for chunk in client.stream_tts_input(some_iterable_text_stream, options):
    # do something with the audio chunk
    print(type(chunk))
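
Here some_iterable_text_stream is a placeholder for any iterable of strings. One possible source, sketched below, is a generator that buffers text fragments (for example, LLM tokens) and yields them one sentence at a time. sentence_stream and its punctuation rule are illustrative assumptions, not part of pyht.

```python
from typing import Iterable, Iterator


def sentence_stream(tokens: Iterable[str]) -> Iterator[str]:
    """Buffer incoming text fragments and yield them one sentence at a time."""
    buffer = ""
    for token in tokens:
        buffer += token
        # Emit once the buffered text ends with sentence punctuation.
        if buffer.rstrip().endswith((".", "!", "?")):
            yield buffer.strip()
            buffer = ""
    # Flush whatever is left when the stream ends.
    if buffer.strip():
        yield buffer.strip()


# Hypothetical usage with the client and options from the first example:
# for chunk in client.stream_tts_input(sentence_stream(llm_tokens), options): ...
```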

An asyncio version of the client is also available:

import asyncio
import os

from pyht import AsyncClient
from pyht.client import TTSOptions

async def main():
    client = AsyncClient(
        user_id=os.getenv("PLAY_HT_USER_ID"),
        api_key=os.getenv("PLAY_HT_API_KEY"),
    )
    options = TTSOptions(voice="s3://voice-cloning-zero-shot/775ae416-49bb-4fb6-bd45-740f205d20a1/jennifersaad/manifest.json")
    # "async for" is only valid inside a coroutine, hence the main() wrapper.
    async for chunk in client.tts("Hi, I'm Jennifer from Play. How can I help you today?", options):
        # do something with the audio chunk
        print(type(chunk))

asyncio.run(main())
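
When the whole clip is needed at once rather than chunk by chunk, the async stream can be drained into memory. This is a minimal sketch, assuming the chunks are bytes objects. collect_audio is a hypothetical helper, and fake_tts is a stand-in for the asynchronous tts call so that the block runs without credentials.

```python
import asyncio
from typing import AsyncIterable


async def collect_audio(chunks: AsyncIterable[bytes]) -> bytes:
    """Drain an async stream of audio chunks into a single bytes object."""
    parts = []
    async for chunk in chunks:
        parts.append(chunk)
    return b"".join(parts)


async def fake_tts():
    """Stand-in for the async tts stream; yields two fixed chunks."""
    for piece in (b"RIFF", b"data"):
        await asyncio.sleep(0)  # hand control back to the event loop, as real I/O would
        yield piece


audio = asyncio.run(collect_audio(fake_tts()))
```

Buffering the full clip gives up the latency benefit of streaming, so this suits short utterances or offline processing.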

The tts method takes the following arguments:

  • text: The text to be converted to speech.
    • a string or list of strings.
  • options: The options to use for the TTS request.
    • a TTSOptions object (see below).
  • voice_engine: The voice engine to use for the TTS request.
    • Play3.0-mini-http (default): Our latest multilingual model, streaming audio over HTTP. (Note the Play prefix, not PlayHT as in earlier voice engines.)
    • Play3.0-mini-ws: Our latest multilingual model, streaming audio over WebSockets. (Note the Play prefix, not PlayHT as in earlier voice engines.)
    • PlayHT2.0-turbo: Our legacy English-only model, streaming audio over gRPC.
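
Since each engine name also implies a transport, a small lookup can keep the choice in one place. The engine names below come from the list above. The transport keys and the engine_for helper are made up for illustration, and passing voice_engine as a keyword argument is an assumption based on the argument name.

```python
# Engine names as documented above; the dictionary keys are illustrative only.
ENGINE_BY_TRANSPORT = {
    "http": "Play3.0-mini-http",
    "ws": "Play3.0-mini-ws",
    "grpc": "PlayHT2.0-turbo",
}


def engine_for(transport: str = "http") -> str:
    """Return the voice engine name for a transport, defaulting to HTTP."""
    try:
        return ENGINE_BY_TRANSPORT[transport]
    except KeyError:
        expected = ", ".join(sorted(ENGINE_BY_TRANSPORT))
        raise ValueError(f"unknown transport {transport!r}; expected one of: {expected}") from None


# Hypothetical usage (assumes `voice_engine` is accepted as a keyword argument):
# for chunk in client.tts("Hello from Play!", options, voice_engine=engine_for("ws")): ...
```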

TTSOptions

The TTSOptions class is used to specify the options for the TTS request. It has the following members, with these supported values:

  • voice: The voice to use for the TTS request; a string.
    • A URL pointing to a Play voice manifest file.
  • format: The format of the audio to be returned; a Format enum value.
    • FORMAT_MP3 (default)
    • FORMAT_WAV
    • FORMAT_MULAW
    • FORMAT_FLAC
    • FORMAT_OGG
    • FORMAT_RAW
  • sample_rate: The sample rate of the audio to be returned; an integer.
    • 8000
    • 16000
    • 24000
    • 44100
    • 48000
  • quality: DEPRECATED; use sample_rate to adjust audio quality.
  • speed: The speed of the audio to be returned, a float (default 1.0).
  • seed: Random seed to use for audio generation, an integer (default None, will be randomly generated).
  • The following options are inference-time hyperparameters of the text-to-speech model; if unset, the model will use default values chosen by PlayHT.
    • temperature: The temperature of the model, a float.
    • top_p: The top_p of the model, a float.
    • text_guidance: The text_guidance of the model, a float.
    • voice_guidance: The voice_guidance of the model, a float.
    • style_guidance (Play3.0-mini-http and Play3.0-mini-ws only): The style_guidance of the model, a float.
    • repetition_penalty: The repetition_penalty of the model, a float.
  • disable_stabilization (PlayHT2.0-turbo only): Disable the audio stabilization process, a boolean (default False).
  • language (Play3.0-mini-http and Play3.0-mini-ws only): The language of the text to be spoken, a Language enum value or None (default English).
    • AFRIKAANS
    • ALBANIAN
    • AMHARIC
    • ARABIC
    • BENGALI
    • BULGARIAN
    • CATALAN
    • CROATIAN
    • CZECH
    • DANISH
    • DUTCH
    • ENGLISH
    • FRENCH
    • GALICIAN
    • GERMAN
    • GREEK
    • HEBREW
    • HINDI
    • HUNGARIAN
    • INDONESIAN
    • ITALIAN
    • JAPANESE
    • KOREAN
    • MALAY
    • MANDARIN
    • POLISH
    • PORTUGUESE
    • RUSSIAN
    • SERBIAN
    • SPANISH
    • SWEDISH
    • TAGALOG
    • THAI
    • TURKISH
    • UKRAINIAN
    • URDU
    • XHOSA
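
Several of the options above apply to only some voice engines, and sample_rate accepts only the listed values. The sketch below checks a set of keyword arguments against those documented constraints before they are passed on. check_options is a hypothetical helper, and how the SDK itself reacts to a mismatched option is not stated here. The sketch also assumes that TTSOptions accepts its members as keyword arguments, as the voice= usage earlier suggests.

```python
SUPPORTED_SAMPLE_RATES = {8000, 16000, 24000, 44100, 48000}

# Options documented above as engine-specific, mapped to the engines that accept them.
PLAY3_ENGINES = {"Play3.0-mini-http", "Play3.0-mini-ws"}
ENGINE_ONLY_OPTIONS = {
    "style_guidance": PLAY3_ENGINES,
    "language": PLAY3_ENGINES,
    "disable_stabilization": {"PlayHT2.0-turbo"},
}


def check_options(voice_engine: str, **options) -> dict:
    """Return `options` unchanged if they fit `voice_engine`; raise ValueError otherwise."""
    rate = options.get("sample_rate")
    if rate is not None and rate not in SUPPORTED_SAMPLE_RATES:
        raise ValueError(f"unsupported sample_rate: {rate}")
    for name, engines in ENGINE_ONLY_OPTIONS.items():
        if name in options and voice_engine not in engines:
            raise ValueError(f"{name} is not supported by {voice_engine}")
    return options


kwargs = check_options(
    "Play3.0-mini-http",
    voice="s3://voice-cloning-zero-shot/775ae416-49bb-4fb6-bd45-740f205d20a1/jennifersaad/manifest.json",
    sample_rate=24000,
    speed=1.0,
)
# Hypothetical next step: options = TTSOptions(**kwargs)
```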

Command-Line Demo

You can run the provided demo from the command line.

Note: This demo depends on the following additional packages:

pip install numpy soundfile

Then run the demo:

python demo/main.py --user $PLAY_HT_USER_ID --key $PLAY_HT_API_KEY --text "Hello from Play!"

To run with the asyncio client, use the --async flag:

python demo/main.py --user $PLAY_HT_USER_ID --key $PLAY_HT_API_KEY --text "Hello from Play!" --async

To run with the HTTP API, which uses our latest Play3.0-mini model, use the --http flag:

python demo/main.py --user $PLAY_HT_USER_ID --key $PLAY_HT_API_KEY --text "Hello from Play!" --http

To run with the WebSockets API, which also uses our latest Play3.0-mini model, use the --ws flag:

python demo/main.py --user $PLAY_HT_USER_ID --key $PLAY_HT_API_KEY --text "Hello from Play!" --ws

The HTTP and WebSockets APIs can also be used with the async client:

python demo/main.py --user $PLAY_HT_USER_ID --key $PLAY_HT_API_KEY --text "Hello from Play!" --http --async
python demo/main.py --user $PLAY_HT_USER_ID --key $PLAY_HT_API_KEY --text "Hello from Play!" --ws --async

Alternatively, you can run the demo in interactive mode:

python demo/main.py --user $PLAY_HT_USER_ID --key $PLAY_HT_API_KEY --interactive

In interactive mode, you can input text lines to generate and play audio on-the-fly. An empty line will exit the interactive session.

Get an API Key

To get started with the pyht SDK, you'll need your API Secret Key and User ID. Follow these steps to obtain them:

  1. Access the API Page: Navigate to the API Access page.

  2. Generate Your API Secret Key:

    • Click the "Generate Secret Key" button under the "Secret Key" section.
    • Your API Secret Key will be displayed. Ensure you copy it and store it securely.
  3. Locate Your User ID: Find and copy your User ID, which can be found on the same page under the "User ID" section.

Keep your API Secret Key confidential. It's crucial not to share it with anyone or include it in publicly accessible code repositories.
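
One way to keep the key out of source code is to read both values from environment variables and fail early if either is missing. This is a minimal sketch: the variable names match the usage examples above, and load_credentials is a hypothetical helper rather than part of pyht.

```python
import os
from typing import Mapping, Tuple

REQUIRED_VARS = ("PLAY_HT_USER_ID", "PLAY_HT_API_KEY")


def load_credentials(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (user_id, api_key) from `env`; raise if either is missing or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variable(s): " + ", ".join(missing))
    return env["PLAY_HT_USER_ID"], env["PLAY_HT_API_KEY"]


# Hypothetical usage:
# user_id, api_key = load_credentials()
# client = Client(user_id=user_id, api_key=api_key)
```

Failing at startup gives a clearer error than passing None to the client and waiting for the first request to be rejected.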

