Octopus Speech-to-Index engine.
Octopus
Made in Vancouver, Canada by Picovoice
Octopus is Picovoice's Speech-to-Index engine. It indexes speech directly, without relying on a text representation. This acoustic-only approach improves accuracy by removing the out-of-vocabulary limitation and eliminating the problem of competing hypotheses (e.g. homophones).
Compatibility
- Python 3
- Runs on Linux (x86_64), Mac (x86_64), Windows (x86_64)
Installation
pip3 install pvoctopus
Usage
Create an instance of the engine:
import pvoctopus
access_key = "" # AccessKey provided by Picovoice Console (https://picovoice.ai/console/)
handle = pvoctopus.create(access_key=access_key)
Octopus consists of two steps: Indexing and Searching. Indexing transforms audio data into a Metadata object that searches can be run against.
Octopus indexing has two modes of operation: indexing PCM audio data, or indexing an audio file.
When indexing PCM audio data, the valid audio sample rate is given by handle.pcm_sample_rate.
The engine accepts 16-bit linearly-encoded PCM and operates on single-channel audio:
audio_data = [..]
metadata = handle.index(audio_data)
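One way to obtain audio_data is to read a WAV file with Python's standard wave module. The sketch below is illustrative and not part of the pvoctopus API: read_pcm is a hypothetical helper, and it assumes the file already holds 16-bit, single-channel audio at the rate reported by handle.pcm_sample_rate.

```python
import struct
import wave


def read_pcm(path, expected_sample_rate):
    """Read a 16-bit mono WAV file into a list of integer samples.

    Hypothetical helper, not part of pvoctopus. `expected_sample_rate`
    would normally be the value of handle.pcm_sample_rate.
    """
    with wave.open(path, "rb") as wav_file:
        if wav_file.getnchannels() != 1:
            raise ValueError("audio must be single-channel")
        if wav_file.getsampwidth() != 2:
            raise ValueError("audio must be 16-bit PCM")
        if wav_file.getframerate() != expected_sample_rate:
            raise ValueError("unexpected sample rate")
        raw = wav_file.readframes(wav_file.getnframes())
    # WAV stores 16-bit samples as little-endian signed integers.
    return list(struct.unpack("<%dh" % (len(raw) // 2), raw))
```

The returned list could then be passed to the engine, for example handle.index(read_pcm(path, handle.pcm_sample_rate)). Audio in another format or sample rate would need converting first, or could go through file indexing instead.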
Similarly, files can be indexed by passing in the absolute path of the audio file. Supported file formats are mp3, flac, wav and opus:
audio_file_path = "/path/to/my/audiofile.wav"
metadata = handle.index_file(audio_file_path)
Once the Metadata object has been created, it can be used for searching:
search_term = 'picovoice'
matches = handle.search(metadata, [search_term])
Multiple search terms can be given:
matches = handle.search(metadata, ['picovoice', 'Octopus', 'rhino'])
The matches object is a dictionary where the key is the phrase, and the value is a list of Match objects.
The Match object contains the start_sec, end_sec and probability of each match:
matches = handle.search(metadata, ['avocado'])
avocado_matches = matches['avocado']
for match in avocado_matches:
    print(f"Match for `avocado`: {match.start_sec} -> {match.end_sec} ({match.probability})")
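Because each match carries a confidence value, a caller may want to drop weak matches and order the rest by time. The following is a minimal sketch of that post-processing, not something the engine provides. confident_matches is a hypothetical helper, the Match namedtuple is a stand-in that assumes the third field is spelled probability, and the 0.5 threshold is an arbitrary illustration.

```python
from collections import namedtuple

# Stand-in for the Match objects returned by a search; only the three
# fields named in the text above are assumed.
Match = namedtuple("Match", ["start_sec", "end_sec", "probability"])


def confident_matches(matches, min_probability=0.5):
    """Keep matches at or above the threshold, sorted by start time.

    `matches` maps each phrase to a list of Match-like objects. Phrases
    left with no matches are dropped from the result.
    """
    filtered = {}
    for phrase, phrase_matches in matches.items():
        kept = sorted(
            (m for m in phrase_matches if m.probability >= min_probability),
            key=lambda m: m.start_sec,
        )
        if kept:
            filtered[phrase] = kept
    return filtered


# Made-up values, for illustration only.
example = {
    "avocado": [Match(12.4, 13.0, 0.91), Match(3.2, 3.9, 0.35), Match(1.0, 1.6, 0.77)],
    "rhino": [Match(5.0, 5.5, 0.2)],
}
for phrase, kept in confident_matches(example, min_probability=0.5).items():
    for m in kept:
        print(f"{phrase}: {m.start_sec} -> {m.end_sec} ({m.probability})")
```

A suitable threshold depends on the audio and on how costly false positives are, so it is left as a parameter.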
The Metadata object can be cached or stored to skip the indexing step on subsequent searches. This can be done with the to_bytes() and from_bytes() methods:
metadata_bytes = metadata.to_bytes()
# ... Write & load `metadata_bytes` from cache/filesystem/etc.
cached_metadata = pvoctopus.OctopusMetadata.from_bytes(metadata_bytes)
matches = handle.search(cached_metadata, ['avocado'])
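The write and load step can be sketched with a small on-disk cache keyed by a hash of the audio file, so that a changed file is re-indexed and an unchanged one is not. The helpers cache_path_for, save_metadata_bytes and load_metadata_bytes, and the .octopus extension, are all hypothetical names. The sketch assumes to_bytes() returns a bytes object.

```python
import hashlib
from pathlib import Path


def cache_path_for(audio_file_path, cache_dir):
    """Derive a cache file path from the SHA-256 of the audio file's contents.

    Reads the whole file into memory, which is fine for a sketch but may
    need chunked hashing for very large recordings.
    """
    digest = hashlib.sha256(Path(audio_file_path).read_bytes()).hexdigest()
    return Path(cache_dir) / f"{digest}.octopus"


def save_metadata_bytes(metadata_bytes, cache_path):
    """Write serialized metadata to disk, creating the directory if needed."""
    path = Path(cache_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(metadata_bytes)


def load_metadata_bytes(cache_path):
    """Return the cached bytes, or None if nothing has been cached yet."""
    path = Path(cache_path)
    return path.read_bytes() if path.is_file() else None
```

With these in place, a caller would check load_metadata_bytes first, pass any result to from_bytes(), and fall back to indexing followed by save_metadata_bytes(metadata.to_bytes(), ...) on a miss.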
When done, both the metadata and handle resources must be released explicitly:
metadata.delete()
handle.delete()
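Since release is explicit, an exception raised between creation and delete() would leak the resource. One possible guard is a small context manager, sketched below. released is a hypothetical helper that works with any object exposing a delete() method. It is not part of pvoctopus, and the usage shown in the comments is only an illustration.

```python
from contextlib import contextmanager


@contextmanager
def released(resource):
    """Yield `resource` and call its delete() method on exit, even on error."""
    try:
        yield resource
    finally:
        resource.delete()


# Illustrative usage (requires pvoctopus and a valid AccessKey):
#
# with released(pvoctopus.create(access_key=access_key)) as handle:
#     with released(handle.index_file(audio_file_path)) as metadata:
#         matches = handle.search(metadata, ['avocado'])
```

Nesting the two blocks releases the metadata first and the handle second, the same order as the explicit calls above.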
Hashes for pvoctopus-1.0.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 533e54316f963ce8022967224d69b459669a68e38f8707bb8344d04203081561
MD5 | 918f506655e1ae12559541f538e7a65e
BLAKE2b-256 | c5edcdaa07aaceeebe8bfbf6282850c0342184070ddbd843018f2928ac53fb9c