
Leopard Speech-to-Text Engine.


Leopard Binding for Python

Leopard Speech-to-Text Engine

Made in Vancouver, Canada by Picovoice

Leopard is an on-device speech-to-text engine. Leopard is:

  • Private: all voice processing runs locally.
  • Accurate
  • Compact and Computationally-Efficient
  • Cross-Platform:
    • Linux (x86_64), macOS (x86_64, arm64), Windows (x86_64)
    • Android and iOS
    • Chrome, Safari, Firefox, and Edge
    • Raspberry Pi (5, 4, 3) and NVIDIA Jetson Nano

Compatibility

  • Python 3.7+
  • Runs on Linux (x86_64), macOS (x86_64, arm64), Windows (x86_64), Raspberry Pi (4, 3), and NVIDIA Jetson Nano.

Installation

pip3 install pvleopard

AccessKey

Leopard requires a valid Picovoice AccessKey at initialization. The AccessKey acts as your credentials when using Leopard SDKs. You can get your AccessKey for free; make sure to keep it secret. Sign up or log in to Picovoice Console to get your AccessKey.
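One common way to keep the AccessKey out of source code is to read it from an environment variable. The sketch below assumes a variable named PICOVOICE_ACCESS_KEY; both that name and the helper load_access_key are illustrative and not part of the Leopard SDK.

```python
import os


def load_access_key(env_var="PICOVOICE_ACCESS_KEY"):
    """Return the AccessKey stored in an environment variable.

    The variable name is an assumption made for this example; the Leopard
    SDK itself does not read any environment variable.
    """
    access_key = os.environ.get(env_var, "").strip()
    if not access_key:
        raise RuntimeError(
            "Set the %s environment variable to your Picovoice AccessKey" % env_var)
    return access_key
```

The returned string can then be passed as the access_key argument of pvleopard.create.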

Usage

Create an instance of the engine and transcribe an audio file:

import pvleopard

leopard = pvleopard.create(access_key='${ACCESS_KEY}')

transcript, words = leopard.process_file('${AUDIO_FILE_PATH}')
print(transcript)
for word in words:
    print(
      "{word=\"%s\" start_sec=%.2f end_sec=%.2f confidence=%.2f speaker_tag=%d}"
      % (word.word, word.start_sec, word.end_sec, word.confidence, word.speaker_tag))

Replace ${ACCESS_KEY} with the AccessKey obtained from Picovoice Console and ${AUDIO_FILE_PATH} with the path to an audio file.

Finally, when done, be sure to explicitly release the resources:

leopard.delete()
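To make sure delete() is called even when processing raises an exception, the engine can be wrapped in a small context manager. This is a minimal sketch, not part of the SDK: the helper name released is hypothetical, and it only assumes the wrapped object exposes a delete() method, as the Leopard instance above does.

```python
from contextlib import contextmanager


@contextmanager
def released(engine):
    """Yield `engine` and call its delete() method on exit, even on error."""
    try:
        yield engine
    finally:
        engine.delete()


# Usage with Leopard (requires pvleopard and a valid AccessKey):
#
#     import pvleopard
#     with released(pvleopard.create(access_key='${ACCESS_KEY}')) as leopard:
#         transcript, words = leopard.process_file('${AUDIO_FILE_PATH}')
```

A plain try/finally around process_file achieves the same result; the context manager only saves repetition when several scripts share the pattern.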

Language Model

The Leopard Python SDK comes preloaded with a default English language model (.pv file). Default models for other supported languages can be found in lib/common.

Create custom language models using the Picovoice Console. Here you can train language models with custom vocabulary and boost words in the existing vocabulary.

Pass in the .pv file via the model_path argument:

leopard = pvleopard.create(
    access_key='${ACCESS_KEY}',
    model_path='${MODEL_FILE_PATH}')

Word Metadata

Along with the transcript, Leopard returns metadata for each transcribed word. Available metadata items are:

  • Start Time: Indicates when the word started in the transcribed audio. Value is in seconds.
  • End Time: Indicates when the word ended in the transcribed audio. Value is in seconds.
  • Confidence: Leopard's confidence that the transcribed word is accurate. It is a number within [0, 1].
  • Speaker Tag: If speaker diarization is enabled on initialization, the speaker tag is a non-negative integer identifying unique speakers, with 0 reserved for unknown speakers. If speaker diarization is not enabled, the value will always be -1.
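The metadata above lends itself to simple post-processing, such as flagging uncertain words or merging words into per-speaker segments. The sketch below is illustrative only: Word is a stand-in namedtuple whose field names mirror the attributes used in the usage example, the helpers low_confidence and group_by_speaker are hypothetical, the 0.5 threshold is an arbitrary choice, and the sample values are made up.

```python
from collections import namedtuple

# Stand-in for the word objects returned by Leopard, so the example runs
# without the SDK. Field names mirror the attributes used in the usage example.
Word = namedtuple(
    "Word", ["word", "start_sec", "end_sec", "confidence", "speaker_tag"])


def low_confidence(words, threshold=0.5):
    """Return the words whose confidence is below `threshold` (arbitrary default)."""
    return [w for w in words if w.confidence < threshold]


def group_by_speaker(words):
    """Merge consecutive words with the same speaker tag into segments.

    Returns a list of (speaker_tag, start_sec, end_sec, text) tuples.
    """
    segments = []
    for w in words:
        if segments and segments[-1][0] == w.speaker_tag:
            # Same speaker as the previous word: extend the current segment.
            tag, start, _, text = segments[-1]
            segments[-1] = (tag, start, w.end_sec, text + " " + w.word)
        else:
            segments.append((w.speaker_tag, w.start_sec, w.end_sec, w.word))
    return segments


# Made-up sample data for illustration, not real Leopard output.
sample = [
    Word("hello", 0.00, 0.40, 0.95, 1),
    Word("there", 0.45, 0.80, 0.42, 1),
    Word("hi", 1.10, 1.30, 0.88, 2),
]
for tag, start, end, text in group_by_speaker(sample):
    print("speaker %d [%.2f-%.2f]: %s" % (tag, start, end, text))
```

With real output, the words list returned by process_file can be passed to these helpers directly. When diarization is not enabled every speaker tag is -1, so group_by_speaker returns a single segment.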

Demos

pvleoparddemo provides command-line utilities for processing audio using Leopard.

