Leopard Speech-to-Text Engine.

Development Status
- 5 - Production/Stable
Intended Audience
- Developers
License
- OSI Approved :: Apache Software License
Operating System
- OS Independent
Programming Language
- Python :: 3
Topic
- Multimedia :: Sound/Audio :: Speech

Project description

Leopard Binding for Python

Leopard Speech-to-Text Engine

Made in Vancouver, Canada by Picovoice

Leopard is an on-device speech-to-text engine. Leopard is:

Private: all voice processing runs locally.
Accurate
Compact and Computationally-Efficient
Cross-Platform:
- Linux (x86_64), macOS (x86_64, arm64), Windows (x86_64)
- Android and iOS
- Chrome, Safari, Firefox, and Edge
- Raspberry Pi (5, 4, 3) and NVIDIA Jetson Nano

Compatibility

Python 3.7+
Runs on Linux (x86_64), macOS (x86_64, arm64), Windows (x86_64), Raspberry Pi (4, 3), and NVIDIA Jetson Nano.

Installation

pip3 install pvleopard

AccessKey

Leopard requires a valid Picovoice AccessKey at initialization. The AccessKey acts as your credentials when using Leopard SDKs. You can get your AccessKey for free; make sure to keep it secret. Sign up or log in to Picovoice Console to get your AccessKey.

Usage

Create an instance of the engine and transcribe an audio file:

import pvleopard

# Create the engine with the AccessKey obtained from Picovoice Console.
leopard = pvleopard.create(access_key='${ACCESS_KEY}')

# Transcribe the file: returns the full transcript and per-word metadata.
transcript, words = leopard.process_file('${AUDIO_FILE_PATH}')
print(transcript)
for word in words:
    print(
        "{word=\"%s\" start_sec=%.2f end_sec=%.2f confidence=%.2f speaker_tag=%d}"
        % (word.word, word.start_sec, word.end_sec, word.confidence, word.speaker_tag))

Replace ${ACCESS_KEY} with your AccessKey obtained from Picovoice Console and ${AUDIO_FILE_PATH} with the path to an audio file.

Finally, when done, be sure to explicitly release the resources:

leopard.delete()
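
Because delete() should run even when transcription raises an exception, one option is to wrap the engine in a context manager. The following is a minimal sketch and not part of the pvleopard API: leopard_session is a hypothetical helper, and it assumes only that the factory passed in (for example pvleopard.create) returns an object with a delete() method.

```python
from contextlib import contextmanager


@contextmanager
def leopard_session(create_fn, **create_kwargs):
    """Yield an engine built by create_fn and always release it afterwards.

    Hypothetical helper: create_fn is expected to be pvleopard.create, or any
    factory whose result exposes a delete() method.
    """
    engine = create_fn(**create_kwargs)
    try:
        yield engine
    finally:
        # Runs on normal exit and when the body of the with-block raises.
        engine.delete()


# Intended usage (requires pvleopard and a valid AccessKey):
#
#   import pvleopard
#   with leopard_session(pvleopard.create, access_key='${ACCESS_KEY}') as leopard:
#       transcript, words = leopard.process_file('${AUDIO_FILE_PATH}')
```

Passing the factory in as an argument keeps the helper free of a hard dependency on pvleopard, so it can be exercised with a stand-in object.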

Language Model

The Leopard Python SDK comes preloaded with a default English language model (.pv file). Default models for other supported languages can be found in lib/common.

Create custom language models using the Picovoice Console. Here you can train language models with custom vocabulary and boost words in the existing vocabulary.

Pass in the .pv file via the model_path argument:

leopard = pvleopard.create(
    access_key='${ACCESS_KEY}',
    model_path='${MODEL_FILE_PATH}')

Word Metadata

Along with the transcript, Leopard returns metadata for each transcribed word. Available metadata items are:

Start Time: Indicates when the word started in the transcribed audio. Value is in seconds.
End Time: Indicates when the word ended in the transcribed audio. Value is in seconds.
Confidence: Leopard's confidence that the transcribed word is accurate. It is a number within [0, 1].
Speaker Tag: If speaker diarization is enabled on initialization, the speaker tag is a non-negative integer identifying unique speakers, with 0 reserved for unknown speakers. If speaker diarization is not enabled, the value will always be -1.
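
The metadata fields above lend themselves to simple post-processing. The sketch below merges consecutive words with the same speaker tag into turns and picks out low-confidence words. It assumes only the attributes printed in the Usage example (word, start_sec, end_sec, confidence, speaker_tag); the helper names, the Word stand-in tuple, the sample data, and the 0.5 threshold are illustrative and not part of pvleopard.

```python
from collections import namedtuple
from itertools import groupby

# Stand-in with the same attribute names as the word objects Leopard returns;
# used here only so the sketch runs without the engine.
Word = namedtuple("Word", "word start_sec end_sec confidence speaker_tag")


def speaker_turns(words):
    """Merge consecutive words with the same speaker_tag into turns.

    Returns a list of (speaker_tag, start_sec, end_sec, text) tuples.
    """
    turns = []
    for tag, group in groupby(words, key=lambda w: w.speaker_tag):
        group = list(group)
        text = " ".join(w.word for w in group)
        turns.append((tag, group[0].start_sec, group[-1].end_sec, text))
    return turns


def low_confidence(words, threshold=0.5):
    """Return the words whose confidence is below threshold (arbitrary default)."""
    return [w for w in words if w.confidence < threshold]


# Made-up sample data for illustration only.
sample = [
    Word("hello", 0.0, 0.4, 0.95, 1),
    Word("there", 0.5, 0.9, 0.40, 1),
    Word("hi", 1.2, 1.5, 0.90, 2),
]
for tag, start, end, text in speaker_turns(sample):
    print("speaker %d [%.2f-%.2f]: %s" % (tag, start, end, text))
```

With real output, the words list returned by process_file could be passed to these helpers in place of sample. When diarization is not enabled every tag is -1, so speaker_turns would return a single turn.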

Demos

pvleoparddemo provides command-line utilities for processing audio using Leopard.

Release history

Version	Release date
2.0.2 (this version)	Feb 6, 2024
2.0.1	Nov 30, 2023
2.0.0	Nov 24, 2023
1.2.2	Apr 11, 2023
1.2.1	Mar 27, 2023
1.1.5	Mar 16, 2023
1.1.4	Dec 13, 2022
1.1.3	Aug 4, 2022
1.1.2	Aug 4, 2022
1.1.1	Aug 4, 2022
1.1.0	Aug 4, 2022
1.0.7	Jul 25, 2022
1.0.6	May 12, 2022
1.0.5	Apr 11, 2022
1.0.4	Mar 11, 2022
1.0.3	Feb 28, 2022
1.0.2	Jan 17, 2022
1.0.1	Jan 11, 2022
1.0.0	Jan 10, 2022
0.9.2	Jan 9, 2022
0.9.1	Jan 9, 2022
0.9.0	Jan 6, 2022

Download files

Source Distribution

pvleopard-2.0.2.tar.gz (42.2 MB), uploaded Feb 6, 2024

Built Distribution

pvleopard-2.0.2-py3-none-any.whl (42.2 MB), uploaded Feb 6, 2024, Python 3

Hashes for pvleopard-2.0.2.tar.gz
Algorithm	Hash digest
SHA256	`fccb8773a54179925e70eed7960fde2e939aed0a2c009b2f6b96eeae39fb6f80`
MD5	`48c34d4051617c9bd61162cb73f76a5d`
BLAKE2b-256	`38ba7990652a6719cf732bb146660db074be9f9a9f11f6f19c65f3fbd3a9fb30`

Hashes for pvleopard-2.0.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`170429cc70ed7417e04a28b4098c33388014825344cefdc8026ac766bd8f8f65`
MD5	`3243233efe6f1f9d1fca7416631ff86b`
BLAKE2b-256	`99c869097fb8922895cf6686355e87998debcd106167680f8d5398eb34d1257f`
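
A downloaded file can be checked against the SHA256 digests listed above using Python's standard hashlib module. This is a minimal sketch: sha256_of and matches_published are hypothetical helper names, and the file path in the usage comment assumes the archive sits in the current directory.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, expected_hex):
    """Compare a file's SHA-256 with a published digest, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()


# Intended usage, with the digest copied from the table above:
#
#   matches_published(
#       "pvleopard-2.0.2.tar.gz",
#       "fccb8773a54179925e70eed7960fde2e939aed0a2c009b2f6b96eeae39fb6f80")
```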