Python interface to CMU Sphinxbase and Pocketsphinx libraries
Pocketsphinx Python
Pocketsphinx is a part of the CMU Sphinx Open Source Toolkit For Speech Recognition.
This package provides a Python interface to the CMU Sphinxbase and Pocketsphinx libraries, created with SWIG and Setuptools.
Supported platforms
- Windows
- Linux
- Mac OS X
Installation
# Make sure we have up-to-date versions of pip, setuptools and wheel
python -m pip install --upgrade pip setuptools wheel
pip install --upgrade pocketsphinx
More binary distributions for manual installation are available here.
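After installing, it can be handy to confirm that the package is visible to the interpreter you intend to use. The snippet below is a minimal sketch using only the standard library; `is_installed` is a hypothetical helper name, not part of pocketsphinx.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("pocketsphinx installed:", is_installed("pocketsphinx"))
```

Because `find_spec` only locates the module, this check does not load the native extension, so it will not catch a broken build.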
Usage
LiveSpeech
It's an iterator class for continuous recognition or keyword search from a microphone.
from pocketsphinx import LiveSpeech
for phrase in LiveSpeech(): print(phrase)
An example of a keyword search:
from pocketsphinx import LiveSpeech
speech = LiveSpeech(lm=False, keyphrase='forward', kws_threshold=1e-20)
for phrase in speech:
    print(phrase.segments(detailed=True))
With your model and dictionary:
import os
from pocketsphinx import LiveSpeech, get_model_path
model_path = get_model_path()
speech = LiveSpeech(
    verbose=False,
    sampling_rate=16000,
    buffer_size=2048,
    no_search=False,
    full_utt=False,
    hmm=os.path.join(model_path, 'en-us'),
    lm=os.path.join(model_path, 'en-us.lm.bin'),
    dic=os.path.join(model_path, 'cmudict-en-us.dict')
)
for phrase in speech:
    print(phrase)
AudioFile
It's an iterator class for continuous recognition or keyword search from a file.
from pocketsphinx import AudioFile
for phrase in AudioFile(): print(phrase) # => "go forward ten meters"
An example of a keyword search:
from pocketsphinx import AudioFile
audio = AudioFile(lm=False, keyphrase='forward', kws_threshold=1e-20)
for phrase in audio:
    print(phrase.segments(detailed=True)) # => "[('forward', -617, 63, 121)]"
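The detailed segments are plain tuples, so giving the fields names can make downstream code easier to read. The sketch below assumes the field order word, prob, start_frame, end_frame (as in the comment in the comprehensive example in this README) and a frame rate of 100 frames per second, which is an assumption about the decoder's default rather than something stated here. `Segment`, `to_segments` and `span_seconds` are hypothetical names.

```python
from collections import namedtuple

# Field order assumed to be: word, prob, start_frame, end_frame.
Segment = namedtuple("Segment", "word prob start_frame end_frame")

FRAME_RATE = 100  # assumption: 100 frames per second


def to_segments(raw_segments):
    """Wrap each raw 4-tuple in a Segment for attribute access."""
    return [Segment(*raw) for raw in raw_segments]


def span_seconds(segment, frame_rate=FRAME_RATE):
    """Return (start, end) in seconds, treating end_frame as inclusive."""
    rate = float(frame_rate)
    return (segment.start_frame / rate, (segment.end_frame + 1) / rate)


hits = to_segments([("forward", -617, 63, 121)])
for hit in hits:
    start, end = span_seconds(hit)
    print("%s: %.2fs to %.2fs" % (hit.word, start, end))  # => forward: 0.63s to 1.22s
```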
With your model and dictionary:
import os
from pocketsphinx import AudioFile, get_model_path, get_data_path
model_path = get_model_path()
data_path = get_data_path()
config = {
    'verbose': False,
    'audio_file': os.path.join(data_path, 'goforward.raw'),
    'buffer_size': 2048,
    'no_search': False,
    'full_utt': False,
    'hmm': os.path.join(model_path, 'en-us'),
    'lm': os.path.join(model_path, 'en-us.lm.bin'),
    'dict': os.path.join(model_path, 'cmudict-en-us.dict')
}
audio = AudioFile(**config)
for phrase in audio:
    print(phrase)
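The examples above read a headerless .raw file. If your recording is a WAV file, one possible way to produce such a file is to copy out its PCM frames with the standard `wave` module. This is a sketch under the assumption that the decoder expects 16-bit mono PCM at the configured sampling rate (16000 by default, per this README); `wav_to_raw` is a hypothetical helper, not part of pocketsphinx.

```python
import wave


def wav_to_raw(wav_path, raw_path, expected_rate=16000):
    """Copy the PCM frames of a WAV file into a headerless .raw file.

    Raises ValueError if the audio is not mono, 16-bit, or at the expected rate.
    Returns the number of bytes written.
    """
    with wave.open(wav_path, "rb") as wav:
        if wav.getnchannels() != 1:
            raise ValueError("expected mono audio")
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        if wav.getframerate() != expected_rate:
            raise ValueError("expected %d Hz audio" % expected_rate)
        frames = wav.readframes(wav.getnframes())
    with open(raw_path, "wb") as raw:
        raw.write(frames)
    return len(frames)
```

The resulting path could then be passed as `audio_file` in the config shown above. The helper does no resampling, so audio at another rate has to be converted beforehand.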
Pocketsphinx
It's a simple and flexible proxy class to pocketsphinx.Decoder.
from pocketsphinx import Pocketsphinx
print(Pocketsphinx().decode()) # => "go forward ten meters"
A more comprehensive example:
from __future__ import print_function
import os
from pocketsphinx import Pocketsphinx, get_model_path, get_data_path
model_path = get_model_path()
data_path = get_data_path()
config = {
    'hmm': os.path.join(model_path, 'en-us'),
    'lm': os.path.join(model_path, 'en-us.lm.bin'),
    'dict': os.path.join(model_path, 'cmudict-en-us.dict')
}
ps = Pocketsphinx(**config)
ps.decode(
    audio_file=os.path.join(data_path, 'goforward.raw'),
    buffer_size=2048,
    no_search=False,
    full_utt=False
)
print(ps.segments()) # => ['<s>', '<sil>', 'go', 'forward', 'ten', 'meters', '</s>']
print('Detailed segments:', *ps.segments(detailed=True), sep='\n') # => [
# word, prob, start_frame, end_frame
# ('<s>', 0, 0, 24)
# ('<sil>', -3778, 25, 45)
# ('go', -27, 46, 63)
# ('forward', -38, 64, 116)
# ('ten', -14105, 117, 152)
# ('meters', -2152, 153, 211)
# ('</s>', 0, 212, 260)
# ]
print(ps.hypothesis()) # => go forward ten meters
print(ps.probability()) # => -32079
print(ps.score()) # => -7066
print(ps.confidence()) # => 0.04042641466841839
print(*ps.best(count=10), sep='\n') # => [
# ('go forward ten meters', -28034)
# ('go for word ten meters', -28570)
# ('go forward and majors', -28670)
# ('go forward and meters', -28681)
# ('go forward and readers', -28685)
# ('go forward ten readers', -28688)
# ('go forward ten leaders', -28695)
# ('go forward can meters', -28695)
# ('go forward and leaders', -28706)
# ('go for work ten meters', -28722)
# ]
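The integer scores in the output above are log-domain values. As a rough illustration of how they relate to the `confidence()` figure, the sketch below converts a log score to a linear probability, assuming a log base of 1.0001 (an assumption about the decoder's default, not something stated in this README). `log_to_linear` is a hypothetical helper.

```python
import math

LOG_BASE = 1.0001  # assumption: the decoder's default log base


def log_to_linear(log_value, base=LOG_BASE):
    """Convert an integer log-domain score to a linear-domain value."""
    return math.exp(log_value * math.log(base))


# With the probability() value printed above, the result is close to 0.04,
# in line with the confidence() value shown in the same example.
print(log_to_linear(-32079))
```

Small differences from the decoder's own figure are to be expected, since the library does its arithmetic internally with its own precision.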
Default config
If you don't pass any arguments when creating an instance of the Pocketsphinx, AudioFile or LiveSpeech class, the following default values are used:
verbose = False
logfn = /dev/null or nul
audio_file = site-packages/pocketsphinx/data/goforward.raw
audio_device = None
sampling_rate = 16000
buffer_size = 2048
no_search = False
full_utt = False
hmm = site-packages/pocketsphinx/model/en-us
lm = site-packages/pocketsphinx/model/en-us.lm.bin
dict = site-packages/pocketsphinx/model/cmudict-en-us.dict
Any other option must be passed into the config as is, without the leading - symbol.
To disable the default language model or dictionary, set the corresponding option to False:
lm = False
dict = False
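If you already have options written in command-line style (for example `-hmm`), the naming rule above can be applied mechanically. The following is a small sketch; `normalize_options` is a hypothetical helper and the paths are placeholders.

```python
def normalize_options(options):
    """Strip the leading '-' from option names, leaving values untouched."""
    return {name.lstrip("-"): value for name, value in options.items()}


# Command-line style names; False disables the default language model.
cli_style = {"-hmm": "model/en-us", "-lm": False, "-keyphrase": "forward"}
kwargs = normalize_options(cli_style)
print(kwargs)
```

The resulting dictionary could then be unpacked into a constructor, for example `Pocketsphinx(**kwargs)`.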
Verbose
Send output to stdout:
from pocketsphinx import Pocketsphinx
ps = Pocketsphinx(verbose=True)
ps.decode()
print(ps.hypothesis())
Send output to file:
from pocketsphinx import Pocketsphinx
ps = Pocketsphinx(verbose=True, logfn='pocketsphinx.log')
ps.decode()
print(ps.hypothesis())
Compatibility
Parent classes are still available:
import os
from pocketsphinx import DefaultConfig, Decoder, get_model_path, get_data_path
model_path = get_model_path()
data_path = get_data_path()
# Create a decoder with a certain model
config = DefaultConfig()
config.set_string('-hmm', os.path.join(model_path, 'en-us'))
config.set_string('-lm', os.path.join(model_path, 'en-us.lm.bin'))
config.set_string('-dict', os.path.join(model_path, 'cmudict-en-us.dict'))
decoder = Decoder(config)
# Decode streaming data
buf = bytearray(1024)
with open(os.path.join(data_path, 'goforward.raw'), 'rb') as f:
    decoder.start_utt()
    while f.readinto(buf):
        decoder.process_raw(buf, False, False)
    decoder.end_utt()
print('Best hypothesis segments:', [seg.word for seg in decoder.seg()])
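One detail of the streaming loop above: `readinto` returns the number of bytes actually read, and the last read of a file is usually shorter than the buffer. A possible refinement, sketched here with the standard library only, is to slice each chunk to that length so that leftover bytes from the previous read are not passed on. `iter_chunks` is a hypothetical helper.

```python
import io


def iter_chunks(stream, size=1024):
    """Yield successive chunks from a binary stream.

    One buffer is reused for reading; each yielded chunk is a copy trimmed
    to the number of bytes actually read.
    """
    buf = bytearray(size)
    view = memoryview(buf)
    while True:
        n = stream.readinto(buf)
        if not n:
            break
        yield bytes(view[:n])


chunks = list(iter_chunks(io.BytesIO(b"\x01" * 2500), size=1024))
print([len(c) for c in chunks])  # => [1024, 1024, 452]
```

In the decoding loop, each chunk could then be passed to `decoder.process_raw(chunk, False, False)` between `start_utt()` and `end_utt()`.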
Install development version
Install requirements
Windows requirements:
Ubuntu requirements:
sudo apt-get install -qq python python-dev python-pip build-essential swig git libpulse-dev
Install with pip
pip install https://github.com/bambocher/pocketsphinx-python/archive/master.zip
Install with distutils
git clone --recursive https://github.com/bambocher/pocketsphinx-python
cd pocketsphinx-python
python setup.py install
Projects using pocketsphinx-python
- SpeechRecognition - Library for performing speech recognition, with support for several engines and APIs, online and offline.
License