Orca Streaming Text-to-Speech Engine Python Demo

Made in Vancouver, Canada by Picovoice

Orca

Orca is an on-device streaming text-to-speech engine that is designed for use with LLMs, enabling zero-latency voice assistants. Orca is:

  • Private: All voice processing runs locally.
  • Cross-Platform:
    • Linux (x86_64), macOS (x86_64, arm64), Windows (x86_64)
    • Android and iOS
    • Chrome, Safari, Firefox, and Edge
    • Raspberry Pi (5, 4, 3) and NVIDIA Jetson Nano

Compatibility

  • Python 3.8+
  • Runs on Linux (x86_64), macOS (x86_64, arm64), Windows (x86_64), Raspberry Pi (5, 4, 3), and NVIDIA Jetson Nano.

Installation

pip3 install pvorcademo

AccessKey

Orca requires a valid Picovoice AccessKey at initialization. The AccessKey acts as your credentials when using Orca SDKs. You can get your AccessKey for free. Make sure to keep your AccessKey secret. Sign up or log in to Picovoice Console to get your AccessKey.
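Because the AccessKey must stay secret, one option is to keep it out of source files and read it from an environment variable at startup. The following is a minimal sketch of that idea only. The variable name `PICOVOICE_ACCESS_KEY` and the helper `load_access_key` are hypothetical and are not part of the Orca SDK or the demos; Orca itself does not read any environment variable.

```python
import os


def load_access_key(env_var: str = "PICOVOICE_ACCESS_KEY") -> str:
    """Return the AccessKey stored in an environment variable.

    The variable name is an assumption made for this sketch.
    """
    # Strip surrounding whitespace that often sneaks in when copying a key.
    access_key = os.environ.get(env_var, "").strip()
    if not access_key:
        raise RuntimeError(f"Set {env_var} to your Picovoice AccessKey.")
    return access_key
```

The returned string can then be supplied wherever the demos expect `${ACCESS_KEY}`.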

Usage

Orca supports two modes of operation: streaming and single synthesis.

In the streaming synthesis mode, Orca processes an incoming text stream in real time and generates audio in parallel. This is demonstrated in the Orca streaming demo.

In the single synthesis mode, the text is synthesized in a single call to the Orca engine.

Streaming synthesis demo

In this demo, we simulate a response from a language model by creating a text stream from user-defined text. We stream that text to Orca and play the synthesized audio as soon as it is generated.
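To illustrate what "creating a text stream from user-defined text" can look like, here is a rough sketch of a generator that emits the text piece by piece with a small delay, much as a language model emits tokens. This is not the demo's actual implementation: the function name `simulate_text_stream`, the word-level chunking, and the default delay are all assumptions made for illustration. Passing each chunk on to Orca is left out.

```python
import time
from typing import Iterator


def simulate_text_stream(text: str, delay_seconds: float = 0.05) -> Iterator[str]:
    """Yield `text` in word-sized chunks, pausing before each one."""
    words = text.split(" ")
    for index, word in enumerate(words):
        # Re-attach the separating space so the chunks join back into `text`.
        chunk = word if index == len(words) - 1 else word + " "
        if chunk:
            time.sleep(delay_seconds)  # stand-in for model generation time
            yield chunk
```

A consumer would iterate over the generator and hand each chunk to the synthesizer as it arrives, instead of waiting for the full text.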

To run it, execute the following:

orca_demo_streaming --access_key ${ACCESS_KEY} --text_to_stream ${TEXT}

Replace ${ACCESS_KEY} with your AccessKey obtained from Picovoice Console and ${TEXT} with your text to be streamed to Orca. Please note that this demo was not tested on macOS.

Single synthesis demo

To synthesize speech in a single call to Orca and without audio playback, run the following:

orca_demo --access_key ${ACCESS_KEY} --text ${TEXT} --output_path ${WAV_OUTPUT_PATH}

Replace ${ACCESS_KEY} with your AccessKey obtained from Picovoice Console, ${TEXT} with your text to be synthesized, and ${WAV_OUTPUT_PATH} with a path to a .wav file where the generated audio will be stored as a single-channel, 16-bit PCM .wav file.
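To confirm that the output file has the format described above, it can be inspected with Python's standard `wave` module. This is a small sketch that is independent of Orca; the helper names `describe_wav` and `is_mono_16bit` are hypothetical.

```python
import wave


def describe_wav(path: str) -> dict:
    """Return the basic format fields of a PCM .wav file."""
    with wave.open(path, "rb") as wav_file:
        sample_rate = wav_file.getframerate()
        return {
            "channels": wav_file.getnchannels(),
            # getsampwidth() reports bytes per sample, so convert to bits.
            "bits_per_sample": wav_file.getsampwidth() * 8,
            "sample_rate": sample_rate,
            "duration_seconds": wav_file.getnframes() / sample_rate,
        }


def is_mono_16bit(path: str) -> bool:
    """Check for the single-channel, 16-bit layout the demo is said to write."""
    info = describe_wav(path)
    return info["channels"] == 1 and info["bits_per_sample"] == 16
```

Calling `is_mono_16bit` on the path passed as `${WAV_OUTPUT_PATH}` should return `True` if the file matches the stated format.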
