Python interface to CMU Sphinxbase and Pocketsphinx libraries
Pocketsphinx Python
Pocketsphinx is a part of the CMU Sphinx Open Source Toolkit For Speech Recognition.
This package provides a Python interface to the CMU Sphinxbase and Pocketsphinx libraries, created with SWIG and Setuptools.
Supported platforms
- Windows
- Linux
- Mac OS X
Installation
# Make sure we have up-to-date versions of pip, setuptools and wheel
python -m pip install --upgrade pip setuptools wheel
pip install --upgrade pocketsphinx
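To confirm that the installation succeeded without opening a microphone or loading a model, a quick check such as the following may help. It is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical and not part of pocketsphinx.

```python
import importlib.util


def is_installed(module_name):
    """Return True if `module_name` can be located on the import path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == '__main__':
    # 'pocketsphinx' is the import name used throughout this README.
    print('pocketsphinx installed:', is_installed('pocketsphinx'))
```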
More binary distributions for manual installation are available here.
Usage
LiveSpeech
It's an iterator class for continuous recognition or keyword search from a microphone.
from pocketsphinx import LiveSpeech
for phrase in LiveSpeech(): print(phrase)
An example of a keyword search:
from pocketsphinx import LiveSpeech
speech = LiveSpeech(lm=False, keyphrase='forward', kws_threshold=1e+20)
for phrase in speech:
    print(phrase.segments(detailed=True))
With your model and dictionary:
import os
from pocketsphinx import LiveSpeech, get_model_path
model_path = get_model_path()
speech = LiveSpeech(
    verbose=False,
    sampling_rate=16000,
    buffer_size=2048,
    no_search=False,
    full_utt=False,
    hmm=os.path.join(model_path, 'en-us'),
    lm=os.path.join(model_path, 'en-us.lm.bin'),
    dic=os.path.join(model_path, 'cmudict-en-us.dict')
)
for phrase in speech:
    print(phrase)
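In practice the loop body usually does more than print. The sketch below maps recognized text to actions; it relies only on `str(phrase)` returning the hypothesis text, as the `print(phrase)` calls in these examples suggest. The `COMMANDS` table and the `dispatch` helper are hypothetical and not part of pocketsphinx, and the loop is shown with plain strings so that it runs without a microphone.

```python
# Hypothetical command table: recognized text prefix -> action label.
COMMANDS = {
    'go forward': 'MOVE_FORWARD',
    'stop': 'STOP',
}


def dispatch(phrases, commands=COMMANDS):
    """Yield the action for each phrase whose text starts with a known command."""
    for phrase in phrases:
        text = str(phrase).strip().lower()
        for prefix, action in commands.items():
            if text.startswith(prefix):
                yield action
                break


if __name__ == '__main__':
    # Stand-in for `LiveSpeech()`; any iterable of objects with __str__ works.
    for action in dispatch(['go forward ten meters', 'hello', 'stop']):
        print(action)
```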
AudioFile
It's an iterator class for continuous recognition or keyword search from a file.
from pocketsphinx import AudioFile
for phrase in AudioFile(): print(phrase) # => "go forward ten meters"
An example of a keyword search:
from pocketsphinx import AudioFile
audio = AudioFile(lm=False, keyphrase='forward', kws_threshold=1e+20)
for phrase in audio:
    print(phrase.segments(detailed=True)) # => "[('forward', -617, 63, 121)]"
With your model and dictionary:
import os
from pocketsphinx import AudioFile, get_model_path, get_data_path
model_path = get_model_path()
data_path = get_data_path()
config = {
    'verbose': False,
    'audio_file': os.path.join(data_path, 'goforward.raw'),
    'buffer_size': 2048,
    'no_search': False,
    'full_utt': False,
    'hmm': os.path.join(model_path, 'en-us'),
    'lm': os.path.join(model_path, 'en-us.lm.bin'),
    'dict': os.path.join(model_path, 'cmudict-en-us.dict')
}
audio = AudioFile(**config)
for phrase in audio:
    print(phrase)
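The bundled `goforward.raw` is headerless audio, so recordings stored as WAV generally need their header stripped before being passed as `audio_file`. The sketch below does that with the standard `wave` module, assuming the model expects 16 kHz, 16-bit, mono PCM (the 16 kHz figure matches the `sampling_rate` default; the sample width and channel count are assumptions about the bundled en-us model). The helper `wav_to_raw` is hypothetical and not part of pocketsphinx.

```python
import wave


def wav_to_raw(wav_path, raw_path, rate=16000):
    """Copy the PCM frames of a WAV file into a headerless .raw file.

    Raises ValueError if the audio is not mono, 16-bit, at `rate` Hz,
    since no resampling or channel mixing is attempted here.
    Returns the number of bytes written.
    """
    with wave.open(wav_path, 'rb') as wav:
        if wav.getnchannels() != 1:
            raise ValueError('expected mono audio')
        if wav.getsampwidth() != 2:
            raise ValueError('expected 16-bit samples')
        if wav.getframerate() != rate:
            raise ValueError('expected %d Hz audio' % rate)
        frames = wav.readframes(wav.getnframes())
    with open(raw_path, 'wb') as raw:
        raw.write(frames)
    return len(frames)
```

The resulting path could then be used as the `audio_file` value in a config like the one above.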
Pocketsphinx
It's a simple and flexible proxy class to pocketsphinx.Decoder.
from pocketsphinx import Pocketsphinx
print(Pocketsphinx().decode()) # => "go forward ten meters"
A more comprehensive example:
from __future__ import print_function
import os
from pocketsphinx import Pocketsphinx, get_model_path, get_data_path
model_path = get_model_path()
data_path = get_data_path()
config = {
    'hmm': os.path.join(model_path, 'en-us'),
    'lm': os.path.join(model_path, 'en-us.lm.bin'),
    'dict': os.path.join(model_path, 'cmudict-en-us.dict')
}
ps = Pocketsphinx(**config)
ps.decode(
    audio_file=os.path.join(data_path, 'goforward.raw'),
    buffer_size=2048,
    no_search=False,
    full_utt=False
)
print(ps.segments()) # => ['<s>', '<sil>', 'go', 'forward', 'ten', 'meters', '</s>']
print('Detailed segments:', *ps.segments(detailed=True), sep='\n') # => [
# word, prob, start_frame, end_frame
# ('<s>', 0, 0, 24)
# ('<sil>', -3778, 25, 45)
# ('go', -27, 46, 63)
# ('forward', -38, 64, 116)
# ('ten', -14105, 117, 152)
# ('meters', -2152, 153, 211)
# ('</s>', 0, 212, 260)
# ]
print(ps.hypothesis()) # => go forward ten meters
print(ps.probability()) # => -32079
print(ps.score()) # => -7066
print(ps.confidence()) # => 0.04042641466841839
print(*ps.best(count=10), sep='\n') # => [
# ('go forward ten meters', -28034)
# ('go for word ten meters', -28570)
# ('go forward and majors', -28670)
# ('go forward and meters', -28681)
# ('go forward and readers', -28685)
# ('go forward ten readers', -28688)
# ('go forward ten leaders', -28695)
# ('go forward can meters', -28695)
# ('go forward and leaders', -28706)
# ('go for work ten meters', -28722)
# ]
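The start and end values in the detailed segments are frame indices rather than times. The sketch below converts them to seconds, assuming the decoder's default frame rate of 100 frames per second (an assumption; the result changes if the frame-rate option is overridden). The helper `segments_to_seconds` is hypothetical and not part of pocketsphinx, and the sample data is copied from the output above.

```python
def segments_to_seconds(segments, frame_rate=100):
    """Convert (word, prob, start_frame, end_frame) tuples to times in seconds."""
    return [
        (word, start / float(frame_rate), end / float(frame_rate))
        for word, _prob, start, end in segments
    ]


if __name__ == '__main__':
    # Sample tuples taken from the detailed segments listed above.
    detailed = [
        ('go', -27, 46, 63),
        ('forward', -38, 64, 116),
        ('ten', -14105, 117, 152),
        ('meters', -2152, 153, 211),
    ]
    for word, start, end in segments_to_seconds(detailed):
        print('%-8s %.2f s -> %.2f s' % (word, start, end))
```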
Default config
If you don't pass any arguments when creating an instance of the Pocketsphinx, AudioFile or LiveSpeech class, the following default values are used:
verbose = False
logfn = /dev/null or nul
audio_file = site-packages/pocketsphinx/data/goforward.raw
audio_device = None
sampling_rate = 16000
buffer_size = 2048
no_search = False
full_utt = False
hmm = site-packages/pocketsphinx/model/en-us
lm = site-packages/pocketsphinx/model/en-us.lm.bin
dict = site-packages/pocketsphinx/model/cmudict-en-us.dict
Any other option must be passed into the config as is, without the leading - symbol.
If you want to disable the default language model or dictionary, set the corresponding option to False:
lm = False
dict = False
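The relationship between keyword arguments and dash-prefixed decoder options can be pictured as follows. This is only an illustrative sketch, not the library's actual implementation: the helper `to_decoder_options` is hypothetical, the defaults are a subset of the list above with shortened paths, and treating `False` as "leave the option unset" is an assumption about what disabling means.

```python
# Subset of the defaults listed above (paths shortened for illustration).
DEFAULTS = {
    'hmm': 'model/en-us',
    'lm': 'model/en-us.lm.bin',
    'dict': 'model/cmudict-en-us.dict',
}


def to_decoder_options(**kwargs):
    """Merge keyword arguments over DEFAULTS and prefix each name with '-'."""
    merged = dict(DEFAULTS)
    merged.update(kwargs)
    return {
        '-' + name: value
        for name, value in merged.items()
        if value is not False  # assumption: False means "leave this option unset"
    }


if __name__ == '__main__':
    options = to_decoder_options(lm=False, keyphrase='forward', kws_threshold=1e+20)
    for name in sorted(options):
        print(name, '=', options[name])
```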
Verbose
Send output to stdout:
from pocketsphinx import Pocketsphinx
ps = Pocketsphinx(verbose=True)
ps.decode()
print(ps.hypothesis())
Send output to file:
from pocketsphinx import Pocketsphinx
ps = Pocketsphinx(verbose=True, logfn='pocketsphinx.log')
ps.decode()
print(ps.hypothesis())
Compatibility
Parent classes are still available:
import os
from pocketsphinx import DefaultConfig, Decoder, get_model_path, get_data_path
model_path = get_model_path()
data_path = get_data_path()
# Create a decoder with a certain model
config = DefaultConfig()
config.set_string('-hmm', os.path.join(model_path, 'en-us'))
config.set_string('-lm', os.path.join(model_path, 'en-us.lm.bin'))
config.set_string('-dict', os.path.join(model_path, 'cmudict-en-us.dict'))
decoder = Decoder(config)
# Decode streaming data
buf = bytearray(1024)
with open(os.path.join(data_path, 'goforward.raw'), 'rb') as f:
    decoder.start_utt()
    while f.readinto(buf):
        decoder.process_raw(buf, False, False)
    decoder.end_utt()
print('Best hypothesis segments:', [seg.word for seg in decoder.seg()])
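One detail worth noting in a streaming loop of this kind is that `readinto` may fill only part of the buffer on the final read, leaving bytes from the previous chunk in the remainder. The sketch below shows one way to yield only the bytes actually read; the helper `iter_chunks` is hypothetical, and it is demonstrated on an in-memory stream so that it runs without the decoder.

```python
import io


def iter_chunks(stream, size=1024):
    """Yield successive chunks of at most `size` bytes, trimmed to the bytes read."""
    buf = bytearray(size)
    while True:
        n = stream.readinto(buf)
        if not n:
            break
        # Copy only the filled part so stale bytes are never passed on.
        yield bytes(buf[:n])


if __name__ == '__main__':
    data = io.BytesIO(b'\x00' * 2500)
    print([len(chunk) for chunk in iter_chunks(data)])
```

Each chunk could then be handed to `decoder.process_raw(chunk, False, False)` between `start_utt()` and `end_utt()`, assuming the decoder accepts `bytes` as it does `bytearray`.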
Install development version
Install requirements
Windows requirements:
Ubuntu requirements:
sudo apt-get install -qq python python-dev python-pip build-essential swig git libpulse-dev
Install with pip
pip install https://github.com/bambocher/pocketsphinx-python/archive/master.zip
Install with distutils
git clone --recursive https://github.com/bambocher/pocketsphinx-python
cd pocketsphinx-python
python setup.py install
Projects using pocketsphinx-python
- SpeechRecognition - Library for performing speech recognition, with support for several engines and APIs, online and offline.
License