Skip to main content

Orca Text-to-Speech Engine.

Project description

Orca Binding for Python

Orca Text-to-Speech Engine

Made in Vancouver, Canada by Picovoice

Orca is an on-device text-to-speech engine producing high-quality, realistic, spoken audio with zero latency. Orca is:

  • Private; All voice processing runs locally.
  • Cross-Platform:
    • Linux (x86_64), macOS (x86_64, arm64), Windows (x86_64)
    • Android and iOS
    • Chrome, Safari, Firefox, and Edge
    • Raspberry Pi (5, 4, 3) and NVIDIA Jetson Nano

Compatibility

  • Python 3.7+
  • Runs on Linux (x86_64), macOS (x86_64, arm64), Windows (x86_64), Raspberry Pi (5, 4, 3), and NVIDIA Jetson Nano.

Installation

pip3 install pvorca

AccessKey

Orca requires a valid Picovoice AccessKey at initialization. AccessKey acts as your credentials when using Orca SDKs. You can get your AccessKey for free. Make sure to keep your AccessKey secret. Signup or Login to Picovoice Console to get your AccessKey.

Usage

Create an instance of the Orca engine:

import pvorca

orca = pvorca.create(access_key='${ACCESS_KEY}')

Replace the ${ACCESS_KEY} with your AccessKey obtained from Picovoice Console.

You can synthesize speech by calling one of the synthesize methods:

# Return raw PCM
pcm = orca.synthesize(text='${TEXT}')

# Save the generated audio to a WAV file directly
orca.synthesize_to_file(text='${TEXT}', path='${OUTPUT_PATH}')

Replace ${TEXT} with the text to be synthesized and ${OUTPUT_PATH} with the path to save the generated audio as a single-channel 16-bit PCM WAV file.
When done make sure to explicitly release the resources with orca.delete().

Text input

Orca accepts the 26 lowercase (a-z) and 26 uppercase (A-Z) letters of the English alphabet, as well as common punctuation marks. You can get a list of all supported characters by calling the valid_characters() method provided in the Orca SDK you are using. Pronunciations of characters or words not supported by this list can be achieved with custom pronunciations.

Custom pronunciations

Orca allows to embed custom pronunciations in the text via the syntax: {word|pronunciation}.
The pronunciation is expressed in ARPAbet phonemes, for example:

  • "This is a {custom|K AH S T AH M} pronunciation"
  • "{read|R IY D} this as {read|R EH D}, please."
  • "I {live|L IH V} in {Sevilla|S EH V IY Y AH}. We have great {live|L AY V} sports!"

Voices

Orca can synthesize speech with various voices, each of which is characterized by a model file located in lib/common. To create an instance of the engine with a specific voice, use:

orca = pvorca.create(access_key='${ACCESS_KEY}', model_path='${MODEL_PATH}')

and replace ${MODEL_PATH} with the path to the model file with the desired voice.

Speech control

Orca allows for keyword arguments to be provided to the synthesize methods to control the synthesized speech:

  • speech_rate: Controls the speed of the generated speech. Valid values are within [0.7, 1.3]. A higher (lower) value produces speech that is faster (slower). The default is 1.0.

Orca properties

To obtain the set of valid characters, call orca.valid_characters.
To retrieve the maximum number of characters allowed, call orca.max_character_limit.
The sample rate of Orca is orca.sample_rate.

Demos

pvorcademo provides command-line utilities for synthesizing audio using Orca.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pvorca-0.1.4.tar.gz (21.7 MB view hashes)

Uploaded Source

Built Distribution

pvorca-0.1.4-py3-none-any.whl (21.7 MB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page