Python interface to CMU Sphinxbase and Pocketsphinx libraries
Pocketsphinx Python
Pocketsphinx is a part of the CMU Sphinx Open Source Toolkit For Speech Recognition.
This package provides a Python interface to the CMU Sphinxbase and Pocketsphinx libraries, created with SWIG and Setuptools.
Supported platforms
- Windows
- Linux
- Mac OS X
Installation
# Make sure we have up-to-date versions of pip, setuptools and wheel
python -m pip install --upgrade pip setuptools wheel
pip install --upgrade pocketsphinx
More binary distributions for manual installation are available here.
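To confirm that the install succeeded before involving a microphone or audio files, a quick import check can help. This is a minimal sketch for Python 3 using only the standard library; the helper name `is_installed` is hypothetical and not part of pocketsphinx.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the import system can locate a top-level module."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if is_installed("pocketsphinx"):
        print("pocketsphinx is importable")
    else:
        print("pocketsphinx not found; try: pip install --upgrade pocketsphinx")
```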
Usage
LiveSpeech
It's an iterator class for continuous recognition or keyword search from a microphone.
from pocketsphinx import LiveSpeech
for phrase in LiveSpeech(): print(phrase)
An example of a keyword search:
from pocketsphinx import LiveSpeech
speech = LiveSpeech(lm=False, keyphrase='forward', kws_threshold=1e+20)
for phrase in speech:
    print(phrase.segments(detailed=True))
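Keyword search is often used to trigger actions. The following is a minimal sketch of dispatching recognized keyphrases to handlers. It assumes only that each item yielded by the iterator converts to the recognized text via `str()`, as the `print(phrase)` examples suggest. `dispatch` and the handler table are hypothetical names, and a plain list stands in for a LiveSpeech instance so the sketch runs without a microphone.

```python
def dispatch(phrases, handlers):
    """Call the handler registered for each recognized phrase.

    `phrases` is any iterable whose items convert to text with str();
    a LiveSpeech instance is assumed to fit, but a list works too.
    Returns the list of handler results, skipping unknown phrases.
    """
    results = []
    for phrase in phrases:
        text = str(phrase).strip().lower()
        handler = handlers.get(text)
        if handler is not None:
            results.append(handler())
    return results


# Hypothetical handlers keyed by keyphrase.
handlers = {
    "forward": lambda: "moving forward",
    "stop": lambda: "stopping",
}

# A plain list stands in for LiveSpeech(lm=False, keyphrase='forward', ...).
print(dispatch(["forward", "unknown words", "stop"], handlers))
```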
With your model and dictionary:
import os
from pocketsphinx import LiveSpeech, get_model_path
model_path = get_model_path()
speech = LiveSpeech(
    verbose=False,
    sampling_rate=16000,
    buffer_size=2048,
    no_search=False,
    full_utt=False,
    hmm=os.path.join(model_path, 'en-us'),
    lm=os.path.join(model_path, 'en-us.lm.bin'),
    dict=os.path.join(model_path, 'cmudict-en-us.dict')
)
for phrase in speech:
    print(phrase)
AudioFile
It's an iterator class for continuous recognition or keyword search from a file.
from pocketsphinx import AudioFile
for phrase in AudioFile(): print(phrase) # => "go forward ten meters"
An example of a keyword search:
from pocketsphinx import AudioFile
audio = AudioFile(lm=False, keyphrase='forward', kws_threshold=1e+20)
for phrase in audio:
    print(phrase.segments(detailed=True)) # => "[('forward', -617, 63, 121)]"
With your model and dictionary:
import os
from pocketsphinx import AudioFile, get_model_path, get_data_path
model_path = get_model_path()
data_path = get_data_path()
config = {
    'verbose': False,
    'audio_file': os.path.join(data_path, 'goforward.raw'),
    'buffer_size': 2048,
    'no_search': False,
    'full_utt': False,
    'hmm': os.path.join(model_path, 'en-us'),
    'lm': os.path.join(model_path, 'en-us.lm.bin'),
    'dict': os.path.join(model_path, 'cmudict-en-us.dict')
}
audio = AudioFile(**config)
for phrase in audio:
    print(phrase)
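The bundled goforward.raw is headerless audio, and the default sampling_rate is 16000. If your recording is a WAV file, its header has to be stripped and its format checked first. The sketch below uses the standard-library wave module. The 16-bit mono requirement is an assumption based on the usual Pocketsphinx input format, and `wav_to_raw` is a hypothetical helper, not part of this package.

```python
import wave


def wav_to_raw(wav_path, raw_path, expected_rate=16000):
    """Copy the PCM frames of a WAV file into a headerless .raw file.

    Raises ValueError if the file is not mono, 16-bit, at expected_rate
    (assumed requirements; adjust them to match your acoustic model).
    Returns the number of bytes written.
    """
    with wave.open(wav_path, "rb") as wav:
        if wav.getnchannels() != 1:
            raise ValueError("expected mono audio")
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        if wav.getframerate() != expected_rate:
            raise ValueError("expected %d Hz audio" % expected_rate)
        frames = wav.readframes(wav.getnframes())
    with open(raw_path, "wb") as raw:
        raw.write(frames)
    return len(frames)
```

The resulting path can then be supplied as the audio_file option in a config like the one above.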
Pocketsphinx
It's a simple and flexible proxy class to pocketsphinx.Decoder.
from pocketsphinx import Pocketsphinx
print(Pocketsphinx().decode()) # => "go forward ten meters"
A more comprehensive example:
from __future__ import print_function
import os
from pocketsphinx import Pocketsphinx, get_model_path, get_data_path
model_path = get_model_path()
data_path = get_data_path()
config = {
    'hmm': os.path.join(model_path, 'en-us'),
    'lm': os.path.join(model_path, 'en-us.lm.bin'),
    'dict': os.path.join(model_path, 'cmudict-en-us.dict')
}
ps = Pocketsphinx(**config)
ps.decode(
    audio_file=os.path.join(data_path, 'goforward.raw'),
    buffer_size=2048,
    no_search=False,
    full_utt=False
)
print(ps.segments()) # => ['<s>', '<sil>', 'go', 'forward', 'ten', 'meters', '</s>']
print('Detailed segments:', *ps.segments(detailed=True), sep='\n') # => [
# word, prob, start_frame, end_frame
# ('<s>', 0, 0, 24)
# ('<sil>', -3778, 25, 45)
# ('go', -27, 46, 63)
# ('forward', -38, 64, 116)
# ('ten', -14105, 117, 152)
# ('meters', -2152, 153, 211)
# ('</s>', 0, 212, 260)
# ]
print(ps.hypothesis()) # => go forward ten meters
print(ps.probability()) # => -32079
print(ps.score()) # => -7066
print(ps.confidence()) # => 0.04042641466841839
print(*ps.best(count=10), sep='\n') # => [
# ('go forward ten meters', -28034)
# ('go for word ten meters', -28570)
# ('go forward and majors', -28670)
# ('go forward and meters', -28681)
# ('go forward and readers', -28685)
# ('go forward ten readers', -28688)
# ('go forward ten leaders', -28695)
# ('go forward can meters', -28695)
# ('go forward and leaders', -28706)
# ('go for work ten meters', -28722)
# ]
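The start_frame and end_frame values in the detailed segments are frame indices, not times. A minimal sketch of converting them to seconds follows. It assumes a frame rate of 100 frames per second (an assumption about the decoder default; it no longer holds if you override the frame-rate option) and treats end_frame as inclusive. `segments_to_seconds` is a hypothetical helper.

```python
FRAMES_PER_SECOND = 100  # assumed default frame rate


def segments_to_seconds(segments, frate=FRAMES_PER_SECOND):
    """Convert (word, prob, start_frame, end_frame) tuples to
    (word, start_seconds, end_seconds), skipping <s>, </s> and <sil>."""
    timed = []
    for word, _prob, start_frame, end_frame in segments:
        if word.startswith("<"):
            continue  # sentence markers and silence carry no text
        start = start_frame / float(frate)
        end = (end_frame + 1) / float(frate)  # end frame treated as inclusive
        timed.append((word, start, end))
    return timed


# Sample values copied from the detailed segments shown above.
sample = [
    ("<s>", 0, 0, 24),
    ("<sil>", -3778, 25, 45),
    ("go", -27, 46, 63),
    ("forward", -38, 64, 116),
]
for word, start, end in segments_to_seconds(sample):
    print("%-8s %.2f-%.2f s" % (word, start, end))
```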
Default config
If you don't pass any arguments when creating an instance of the Pocketsphinx, AudioFile or LiveSpeech class, the following default values are used:
verbose = False
logfn = /dev/null or nul
audio_file = site-packages/pocketsphinx/data/goforward.raw
audio_device = None
sampling_rate = 16000
buffer_size = 2048
no_search = False
full_utt = False
hmm = site-packages/pocketsphinx/model/en-us
lm = site-packages/pocketsphinx/model/en-us.lm.bin
dict = site-packages/pocketsphinx/model/cmudict-en-us.dict
Any other option must be passed into the config as is, without the leading - symbol.
To disable the default language model or dictionary, set the corresponding option to False:
lm = False
dict = False
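One way to picture these rules: keyword options are layered over the defaults, lm or dict set to False is dropped, and each remaining name corresponds to the dashed option name used with the parent Decoder classes. The code below is an illustrative sketch only, not the package's actual implementation; `build_options` and the shortened default paths are hypothetical.

```python
import os

# Illustrative defaults; the real paths point into site-packages.
DEFAULTS = {
    "logfn": os.devnull,  # '/dev/null' on POSIX, 'nul' on Windows
    "hmm": "model/en-us",
    "lm": "model/en-us.lm.bin",
    "dict": "model/cmudict-en-us.dict",
}


def build_options(**overrides):
    """Merge overrides into DEFAULTS and return sorted ('-name', value) pairs.

    lm=False or dict=False removes that option entirely.
    """
    merged = dict(DEFAULTS)
    merged.update(overrides)
    return sorted(
        ("-" + name, value)
        for name, value in merged.items()
        if not (name in ("lm", "dict") and value is False)
    )


for option in build_options(lm=False, keyphrase="forward"):
    print(option)
```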
Verbose
Send output to stdout:
from pocketsphinx import Pocketsphinx
ps = Pocketsphinx(verbose=True)
ps.decode()
print(ps.hypothesis())
Send output to file:
from pocketsphinx import Pocketsphinx
ps = Pocketsphinx(verbose=True, logfn='pocketsphinx.log')
ps.decode()
print(ps.hypothesis())
Compatibility
Parent classes are still available:
import os
from pocketsphinx import DefaultConfig, Decoder, get_model_path, get_data_path
model_path = get_model_path()
data_path = get_data_path()
# Create a decoder with a certain model
config = DefaultConfig()
config.set_string('-hmm', os.path.join(model_path, 'en-us'))
config.set_string('-lm', os.path.join(model_path, 'en-us.lm.bin'))
config.set_string('-dict', os.path.join(model_path, 'cmudict-en-us.dict'))
decoder = Decoder(config)
# Decode streaming data
buf = bytearray(1024)
with open(os.path.join(data_path, 'goforward.raw'), 'rb') as f:
    decoder.start_utt()
    while f.readinto(buf):
        decoder.process_raw(buf, False, False)
    decoder.end_utt()
print('Best hypothesis segments:', [seg.word for seg in decoder.seg()])
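In the streaming loop above, f.readinto(buf) returns the number of bytes actually read, and the final read is usually shorter than the buffer, which leaves bytes from the previous read in the tail. A minimal sketch of a chunk generator that yields only the freshly read bytes follows. `iter_chunks` is a hypothetical helper, and io.BytesIO stands in for the audio file so the sketch runs on its own.

```python
import io


def iter_chunks(fileobj, size=1024):
    """Yield successive chunks of at most `size` bytes from a binary file.

    Reuses one buffer via readinto() and slices it to the number of bytes
    actually read, so a short final read carries no leftover data.
    """
    buf = bytearray(size)
    while True:
        n = fileobj.readinto(buf)
        if not n:
            break
        yield bytes(buf[:n])


# io.BytesIO stands in for open(..., 'rb') on a raw audio file.
fake_audio = io.BytesIO(b"\x00" * 2500)
print([len(chunk) for chunk in iter_chunks(fake_audio)])
```

Each chunk could then be handed to decoder.process_raw(chunk, False, False) between start_utt() and end_utt(), assuming the binding accepts bytes objects the same way it accepts the bytearray above.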
Install development version
Install requirements
Windows requirements:
Ubuntu requirements:
sudo apt-get install -qq python python-dev python-pip build-essential swig git libpulse-dev
Install with pip
pip install https://github.com/bambocher/pocketsphinx-python/archive/master.zip
Install with distutils
git clone --recursive https://github.com/bambocher/pocketsphinx-python
cd pocketsphinx-python
python setup.py install
Projects using pocketsphinx-python
- SpeechRecognition - Library for performing speech recognition, with support for several engines and APIs, online and offline.
License