Rhino Speech-to-Intent engine.
Made in Vancouver, Canada by Picovoice
Rhino is Picovoice's Speech-to-Intent engine. It directly infers intent from spoken commands within a given context of interest, in real time. For example, given a spoken command:
Can I have a small double-shot espresso?
Rhino infers that the user would like to order a drink and emits the following inference result:
{
  "isUnderstood": "true",
  "intent": "orderBeverage",
  "slots": {
    "beverage": "espresso",
    "size": "small",
    "numberOfShots": "2"
  }
}
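As a rough illustration of how an application might act on such a result, the sketch below takes a dictionary shaped like the example above and turns it into an order summary. The function name `describe_order` and the fallback values are hypothetical and not part of Rhino; the dictionary layout simply mirrors the example.

```python
def describe_order(result):
    """Turn a result shaped like the example above into a short order summary."""
    # The example encodes the flag as the string "true"; accept a bool as well.
    understood = str(result.get("isUnderstood", "false")).lower() == "true"
    if not understood:
        return "Sorry, I did not understand that."
    if result.get("intent") != "orderBeverage":
        return "Unsupported intent: {}".format(result.get("intent"))
    slots = result.get("slots", {})
    # A speaker may omit slots, so fall back to (hypothetical) defaults.
    size = slots.get("size", "regular")
    shots = slots.get("numberOfShots", "1")
    beverage = slots.get("beverage", "coffee")
    return "{} {} with {} shot(s)".format(size, beverage, shots)


example = {
    "isUnderstood": "true",
    "intent": "orderBeverage",
    "slots": {"beverage": "espresso", "size": "small", "numberOfShots": "2"},
}
print(describe_order(example))
```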
Rhino is:
- using deep neural networks trained in real-world environments.
- compact and computationally efficient, making it perfect for IoT.
- self-service. Developers and designers can train custom models using Picovoice Console.
Compatibility
- Python 3.8+
- Runs on Linux (x86_64), macOS (x86_64, arm64), Windows (x86_64), and Raspberry Pi (Zero, 3, 4, 5).
Installation
pip3 install pvrhino
AccessKey
Rhino requires a valid Picovoice AccessKey at initialization. AccessKey acts as your credentials when using Rhino SDKs. You can get your AccessKey for free. Make sure to keep your AccessKey secret. Sign up or log in to Picovoice Console to get your AccessKey.
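One common way to keep the AccessKey out of source code is to read it from an environment variable. The sketch below assumes a variable named `PICOVOICE_ACCESS_KEY`; that name and the helper `load_access_key` are illustrative choices, not something the Rhino SDK defines.

```python
import os


def load_access_key(env_var="PICOVOICE_ACCESS_KEY"):
    """Read the AccessKey from an environment variable instead of hard-coding it."""
    # env_var is a hypothetical name; use whatever your deployment provides.
    access_key = os.environ.get(env_var, "").strip()
    if not access_key:
        raise RuntimeError(
            "Set the {} environment variable to your AccessKey".format(env_var)
        )
    return access_key
```

The returned string can then be passed as the `access_key` argument when creating the engine.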
Usage
Create an instance of the engine:
import pvrhino
access_key = "${ACCESS_KEY}" # AccessKey obtained from Picovoice Console (https://console.picovoice.ai/)
handle = pvrhino.create(access_key=access_key, context_path='/absolute/path/to/context')
Where context_path is the absolute path to the Speech-to-Intent context, created either using Picovoice Console or taken from the default contexts available on Rhino's GitHub repository.
The sensitivity of the engine can be tuned using the sensitivity parameter. It is a floating-point number within [0, 1]. A higher sensitivity value results in fewer misses at the cost of (potentially) increasing the erroneous inference rate.
import pvrhino
access_key = "${ACCESS_KEY}" # AccessKey obtained from Picovoice Console (https://console.picovoice.ai/)
handle = pvrhino.create(access_key=access_key, context_path='/absolute/path/to/context', sensitivity=0.25)
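If the sensitivity comes from user input or a configuration file, it may be worth checking the [0, 1] range before creating the engine. The helper below is a minimal sketch; `checked_sensitivity` is a hypothetical name and is not part of pvrhino.

```python
def checked_sensitivity(value):
    """Return value as a float, or raise ValueError if it lies outside [0, 1]."""
    sensitivity = float(value)
    # The chained comparison is also False for NaN, so NaN is rejected too.
    if not 0.0 <= sensitivity <= 1.0:
        raise ValueError(
            "sensitivity must be within [0, 1], got {}".format(sensitivity)
        )
    return sensitivity
```

The validated value can then be passed as `sensitivity=checked_sensitivity(user_value)`.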
When initialized, the valid sample rate is given by handle.sample_rate. Expected frame length (number of audio samples in an input array) is handle.frame_length. The engine accepts 16-bit linearly-encoded PCM and operates on single-channel audio.
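To make the frame requirement concrete, the sketch below splits a buffer of raw 16-bit PCM bytes into fixed-length frames of integer samples. It assumes little-endian, single-channel data already recorded at the engine's sample rate; `pcm_frames` is a hypothetical helper, and the frame length of 4 in the demo is made up for brevity (with Rhino, use handle.frame_length).

```python
import struct


def pcm_frames(pcm_bytes, frame_length):
    """Yield lists of 16-bit samples, each exactly frame_length long."""
    bytes_per_frame = frame_length * 2  # 2 bytes per 16-bit sample
    # Drop any trailing partial frame, since a full frame is expected.
    usable = len(pcm_bytes) - len(pcm_bytes) % bytes_per_frame
    for start in range(0, usable, bytes_per_frame):
        chunk = pcm_bytes[start:start + bytes_per_frame]
        # "<h" means little-endian signed 16-bit integer (an assumption here).
        yield list(struct.unpack("<{}h".format(frame_length), chunk))


# Demo: 10 samples with a frame length of 4 gives two full frames.
raw = struct.pack("<10h", *range(10))
frames = list(pcm_frames(raw, 4))
print(frames)
```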
def get_next_audio_frame():
    # return the next frame of audio: handle.frame_length 16-bit samples
    pass

while True:
    is_finalized = handle.process(get_next_audio_frame())
    if is_finalized:
        inference = handle.get_inference()
        if not inference.is_understood:
            # add code to handle unsupported commands
            pass
        else:
            intent = inference.intent
            slots = inference.slots
            # add code to take action based on inferred intent and slot values
When done, resources have to be released explicitly:
handle.delete()
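Because the release is explicit, an exception raised during processing could skip the call to delete(). One possible pattern, sketched here, wraps any object exposing a delete() method in a context manager so cleanup always runs. The names `released` and `FakeEngine` are hypothetical; `FakeEngine` only stands in for a real handle so the example runs without the SDK.

```python
from contextlib import contextmanager


@contextmanager
def released(engine):
    """Yield engine and call its delete() method on exit, even after an error."""
    try:
        yield engine
    finally:
        engine.delete()


class FakeEngine:
    """Stand-in for a Rhino handle, used only to demonstrate the pattern."""

    def __init__(self):
        self.deleted = False

    def delete(self):
        self.deleted = True


fake = FakeEngine()
with released(fake) as engine:
    pass  # process audio frames here
print(fake.deleted)
```

With a real engine, the same wrapper would be used as `with released(pvrhino.create(...)) as handle:`.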
Non-English Contexts
In order to run inference on non-English contexts you need to use the corresponding model file. The model files for all supported languages are available here.
Demos
pvrhinodemo provides command-line utilities for processing real-time audio (i.e. microphone) and files using Rhino.