Python wrapper for the deepspeech library
Spych
Pronounced: Speech
A Python wrapper that makes the DeepSpeech Python package easy to use directly from Python code (without the DeepSpeech CLI)
Documentation for Spych Functions
Spych - https://connor-makowski.github.io/spych/core.html
Spych Wake - https://connor-makowski.github.io/spych/wake.html
Key Features
- Simplified access to pretrained DeepSpeech models for offline and free speech transcription
Setup
Make sure you have Python 3.6.x through 3.8.x installed on your system. You can download Python from python.org.
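The supported version range can be checked at runtime before importing spych. The following is a minimal sketch; is_supported_python is a hypothetical helper name and not part of spych, and the bounds are simply the ones stated in the setup requirement above.

```python
import sys

# Bounds taken from the setup requirement above (3.6.x through 3.8.x).
MIN_VERSION = (3, 6)
MAX_VERSION = (3, 8)


def is_supported_python(version_info=None):
    """Return True if the major.minor version is within the supported range.

    is_supported_python is a hypothetical helper, not part of spych.
    """
    if version_info is None:
        version_info = sys.version_info
    major_minor = tuple(version_info[:2])
    return MIN_VERSION <= major_minor <= MAX_VERSION


if __name__ == "__main__":
    if not is_supported_python():
        print("Warning: spych expects Python 3.6.x through 3.8.x")
```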
Installation
- Install SoX
- On Debian/Ubuntu
sudo apt install sox
- On Mac (via homebrew)
brew install sox
- On Windows (WSL recommended)
- Install Spych
pip install spych
- Get the DeepSpeech model and scorer files:
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.9.3/deepspeech-0.9.3-models.pbmm
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.9.3/deepspeech-0.9.3-models.scorer
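The two downloads can also be scripted from Python, for example as part of a setup script. This is a sketch using only the standard library; model_url and download_models are hypothetical helper names, and the URL pattern is the one used in the curl commands above.

```python
import os
import urllib.request

# URL pattern copied from the curl commands above.
BASE_URL = "https://github.com/mozilla/DeepSpeech/releases/download"


def model_url(version, extension):
    """Build the release URL for a model file, e.g. extension='pbmm' or 'scorer'."""
    return f"{BASE_URL}/v{version}/deepspeech-{version}-models.{extension}"


def download_models(version="0.9.3", extensions=("pbmm", "scorer"), target_dir="."):
    """Download each model file into target_dir, skipping files that already exist."""
    paths = []
    for extension in extensions:
        url = model_url(version, extension)
        path = os.path.join(target_dir, url.rsplit("/", 1)[-1])
        if not os.path.exists(path):
            urllib.request.urlretrieve(url, path)
        paths.append(path)
    return paths
```

Calling download_models() fetches both files into the current directory. The files are large, so skipping files that are already present avoids repeated transfers.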
Examples
Transcribe Existing Audio File
from spych import spych
spych_obj=spych(model_file='deepspeech-0.9.3-models.pbmm', scorer_file='deepspeech-0.9.3-models.scorer')
# Convert the audio file to text
print('Transcription:')
print(spych_obj.stt(audio_file='test.wav'))
- Note: A .wav file at the same sample rate as your selected DeepSpeech models is processed the fastest
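To see whether a .wav file already matches the model, its sample rate can be read with the standard library wave module. This sketch assumes the 0.9.3 models expect 16000 Hz audio; treat that value as an assumption and confirm it for the models you use. The helper names are hypothetical and not part of spych.

```python
import wave

# Assumed sample rate for the DeepSpeech 0.9.3 models; verify for your own models.
ASSUMED_MODEL_SAMPLE_RATE = 16000


def wav_sample_rate(path):
    """Return the sample rate (frames per second) stored in a .wav file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def matches_model_rate(path, model_rate=ASSUMED_MODEL_SAMPLE_RATE):
    """Return True if the file's sample rate equals the rate the model expects."""
    return wav_sample_rate(path) == model_rate
```

If matches_model_rate returns False, the file can be resampled beforehand (for example with SoX) so that the conversion cost is paid once rather than on every transcription.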
Record and Transcribe
from spych import spych
spych_obj=spych(model_file='deepspeech-0.9.3-models.pbmm', scorer_file='deepspeech-0.9.3-models.scorer')
# Record using your default microphone for 3 seconds
print('Recording...')
my_audio_buffer=spych_obj.record(duration=3)
print('Recording Finished')
# Convert the audio buffer to text
print('You said:')
print(spych_obj.stt(my_audio_buffer))
Process a Function After Hearing a Wake Word (Example Wake Word: computer)
from spych import spych, spych_wake
model_file='deepspeech-0.9.3-models.pbmm'
scorer_file='deepspeech-0.9.3-models.scorer'
spych_object=spych(model_file=model_file, scorer_file=scorer_file)
def my_function():
    print("Listening...")
    audio_buffer=spych_object.record(duration=3)
    print("You said:",spych_object.stt(audio_buffer=audio_buffer))
listener=spych_wake(spych_object=spych_object, on_wake_fn=my_function, wake_word="computer")
# Alternatively, you can specify a model and scorer file to initialize a wake object in the spych_wake class
# listener=spych_wake(model_file=model_file, scorer_file=scorer_file, on_wake_fn=my_function, wake_word="computer")
listener.start()
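Once the wake word fires, the transcribed text can be routed to different actions. The sketch below shows one way to do that with a plain dictionary lookup; dispatch_command and the example phrases are hypothetical and not part of spych. It only operates on the string returned by stt, so it could be called from inside an on_wake_fn such as my_function above.

```python
def dispatch_command(transcript, commands):
    """Run the first handler whose phrase appears in the transcript.

    commands maps a lowercase phrase to a zero-argument callable.
    Returns the matched phrase, or None if nothing matched.
    """
    normalized = transcript.lower().strip()
    for phrase, handler in commands.items():
        if phrase in normalized:
            handler()
            return phrase
    return None


if __name__ == "__main__":
    # Example phrases are placeholders for illustration only.
    example_commands = {
        "lights on": lambda: print("Turning lights on"),
        "lights off": lambda: print("Turning lights off"),
    }
    dispatch_command("Lights on please", example_commands)
```

Substring matching is deliberately loose here because transcriptions often include extra words around the command; stricter matching may suit short or easily confused phrases better.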
Modifying the DeepSpeech Model
- Initialized spych objects contain a fully functional DeepSpeech.Model object inside of them
- You can modify this for each spych object any time after initialization
- DeepSpeech.Model options are documented here
Example:
from spych import spych
spych_obj=spych(model_file='deepspeech-0.9.3-models.pbmm')
spych_obj.model.enableExternalScorer('deepspeech-0.9.3-models.scorer')
spych_obj.model.addHotWord('activate',10.0)
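When several hot words are needed, they can be registered in a loop. This is a sketch that assumes the model object exposes addHotWord(word, boost) as in the example above; add_hot_words is a hypothetical helper, and the words and boost values are placeholders.

```python
def add_hot_words(model, hot_words):
    """Register each word with its boost on a DeepSpeech model object.

    model is expected to provide addHotWord(word, boost), as spych_obj.model
    does in the example above. hot_words maps a word to a numeric boost.
    Returns the list of words that were registered.
    """
    added = []
    for word, boost in hot_words.items():
        model.addHotWord(word, float(boost))
        added.append(word)
    return added


# Usage (placeholder words and boosts):
# add_hot_words(spych_obj.model, {"activate": 10.0, "computer": 5.0})
```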
Raspberry Pi 4 Setup
- Install system requirements
sudo apt install sox git python3-pip python3-scipy python3-numpy python3-pyaudio libatlas3-base
- Install python requirements
pip3 install spych
- Get the DeepSpeech model and scorer files (note: the Pi must use the .tflite model file)
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.9.3/deepspeech-0.9.3-models.tflite
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.9.3/deepspeech-0.9.3-models.scorer
- Use the examples above, replacing the original model file name with the .tflite one
- Depending on the memory available on your Pi, you may need to omit the scorer file.
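Scripts shared between a desktop and a Pi can select the model file from the CPU architecture and pass the scorer only when it is present. This is a minimal sketch, assuming platform.machine() reports an ARM value such as armv7l or aarch64 on the Pi; the helper names are hypothetical and not part of spych.

```python
import os
import platform


def pick_model_file(machine=None, version="0.9.3"):
    """Return the .tflite model name on ARM machines, otherwise the .pbmm one."""
    if machine is None:
        machine = platform.machine()
    # Assumption: the Pi reports an ARM machine string such as armv7l or aarch64.
    is_arm = machine.lower().startswith(("arm", "aarch64"))
    extension = "tflite" if is_arm else "pbmm"
    return f"deepspeech-{version}-models.{extension}"


def pick_scorer_file(path="deepspeech-0.9.3-models.scorer"):
    """Return the scorer path if the file exists, otherwise None."""
    return path if os.path.exists(path) else None
```

The results can be passed as model_file and scorer_file when creating a spych object. Whether spych accepts scorer_file=None is an assumption, so leave the argument out entirely when pick_scorer_file returns None. Other ARM machines (for example Apple Silicon, which reports arm64) would also match this check, so adjust it if that matters for your setup.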
Download files
Source Distribution
spych-0.0.5.tar.gz (9.8 kB)
File details
Details for the file spych-0.0.5.tar.gz.
File metadata
- Download URL: spych-0.0.5.tar.gz
- Upload date:
- Size: 9.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.0 CPython/3.8.10
File hashes
Algorithm | Hash digest
---|---
SHA256 | 278436437fb2ad0ff78853f11d4e7d1df1b4b7406138d889a4da4aa43b6151ba
MD5 | 9ca30ae34d06ae28088bf86fc00151a3
BLAKE2b-256 | b6d7449d3adfe3017bae6945c0260651046764ef0ccbd5b20fad434d6204c786