Official Python bindings for PocketSphinx

PocketSphinx 5.0.0 release candidate 5

This is PocketSphinx, one of Carnegie Mellon University's open source large vocabulary, speaker-independent continuous speech recognition engines.

Although this was at one point a research system, active development has largely ceased and it has become very, very far from the state of the art. I am making a release, because people are nonetheless using it, and there are a number of historical errors in the build system and API which needed to be corrected.

The version number is strangely large because there was a "release" that people are using called 5prealpha, and we will use proper semantic versioning from now on.

Please see the LICENSE file for terms of use.

Installation

You should be able to install this with pip for recent platforms and versions of Python:

pip3 install pocketsphinx

Alternatively, you can compile it from the source tree. I highly suggest doing this in a virtual environment (replace ~/ve_pocketsphinx with the virtual environment you wish to create). From the top-level directory:

python3 -m venv ~/ve_pocketsphinx
. ~/ve_pocketsphinx/bin/activate
pip3 install .

On GNU/Linux and maybe other platforms, you must have PortAudio installed for the LiveSpeech class to work (we may add a fall-back to sox in the near future). On Debian-like systems this can be achieved by installing the libportaudio2 package:

sudo apt-get install libportaudio2

Usage

See the examples directory for a number of examples of using the library from Python. You can also read the documentation for the Python API or the C API.

It also mostly supports the same APIs as the previous pocketsphinx-python module, as described below.

LiveSpeech

An iterator class for continuous recognition or keyword search from a microphone. For example, to do speech-to-text with the default (some kind of US English) model:

from pocketsphinx import LiveSpeech

for phrase in LiveSpeech():
    print(phrase)

Or to do keyword search:

from pocketsphinx import LiveSpeech

speech = LiveSpeech(keyphrase='forward', kws_threshold=1e-20)
for phrase in speech:
    print(phrase.segments(detailed=True))

With your model and dictionary:

from pocketsphinx import LiveSpeech, get_model_path

speech = LiveSpeech(
    sampling_rate=16000,  # optional
    hmm=get_model_path('en-us'),
    lm=get_model_path('en-us.lm.bin'),
    dic=get_model_path('cmudict-en-us.dict')
)

for phrase in speech:
    print(phrase)

AudioFile

This is an iterator class for continuous recognition or keyword search from a file. Currently it supports only raw, single-channel, 16-bit PCM data in native byte order.

from pocketsphinx import AudioFile

for phrase in AudioFile("goforward.raw"):
    print(phrase)  # => "go forward ten meters"
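Because AudioFile expects headerless PCM, audio stored as WAV has to be stripped of its header first. The following is a minimal sketch using only the Python standard library; the helper name wav_to_raw is hypothetical and not part of pocketsphinx, and it assumes the input is already uncompressed, single-channel, 16-bit PCM. It returns the sampling rate so that you can compare it with the rate your model expects.

```python
import array
import sys
import wave


def wav_to_raw(wav_path, raw_path):
    """Write the samples of a mono 16-bit PCM WAV file to a headerless raw file.

    Hypothetical helper (not part of pocketsphinx). Returns the sampling rate.
    """
    with wave.open(wav_path, "rb") as wav:
        if wav.getnchannels() != 1:
            raise ValueError("expected single-channel audio")
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        rate = wav.getframerate()
        samples = array.array("h", wav.readframes(wav.getnframes()))
    # WAV sample data is little-endian; swap bytes on big-endian machines
    # so that the output is in native byte order.
    if sys.byteorder == "big":
        samples.byteswap()
    with open(raw_path, "wb") as raw:
        samples.tofile(raw)
    return rate
```

The resulting file can then be passed to AudioFile in place of "goforward.raw". Resampling and channel mixing are deliberately left out; a tool such as sox is better suited to those.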

An example of a keyword search:

from pocketsphinx import AudioFile

audio = AudioFile("goforward.raw", keyphrase='forward', kws_threshold=1e-20)
for phrase in audio:
    print(phrase.segments(detailed=True)) # => "[('forward', -617, 63, 121)]"
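The tuples printed above are easier to work with once their fields have names. This sketch assumes the field order is (word, score, start frame, end frame), which the example output suggests but this README does not state; Hit and label_hits are hypothetical names introduced only for illustration.

```python
from collections import namedtuple

# Assumed field order, inferred from the example output above.
Hit = namedtuple("Hit", ["word", "score", "start_frame", "end_frame"])


def label_hits(detailed_segments):
    """Wrap each detailed segment tuple in a Hit for attribute access."""
    return [Hit(*fields) for fields in detailed_segments]


hits = label_hits([("forward", -617, 63, 121)])
for hit in hits:
    print(hit.word, hit.start_frame, hit.end_frame)
```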

With your model and dictionary:

from pocketsphinx import AudioFile, get_model_path

config = {
    'verbose': False,
    'audio_file': 'goforward.raw',
    'hmm': get_model_path('en-us'),
    'lm': get_model_path('en-us.lm.bin'),
    'dict': get_model_path('cmudict-en-us.dict')
}

audio = AudioFile(**config)
for phrase in audio:
    print(phrase)

Convert frames into time coordinates:

from pocketsphinx import AudioFile

# Frames per Second
fps = 100

for phrase in AudioFile(frate=fps):  # frate (default=100)
    print('-' * 28)
    print('| %5s |  %3s  |   %4s   |' % ('start', 'end', 'word'))
    print('-' * 28)
    for s in phrase.seg():
        print('| %4ss | %4ss | %8s |' % (s.start_frame / fps, s.end_frame / fps, s.word))
    print('-' * 28)

# ----------------------------
# | start |  end  |   word   |
# ----------------------------
# |  0.0s | 0.24s | <s>      |
# | 0.25s | 0.45s | <sil>    |
# | 0.46s | 0.63s | go       |
# | 0.64s | 1.16s | forward  |
# | 1.17s | 1.52s | ten      |
# | 1.53s | 2.11s | meters   |
# | 2.12s |  2.6s | </s>     |
# ----------------------------
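The conversion in the example above is a division of the frame index by the frame rate. The sketch below isolates it so it can be tried without a decoder. Segment is a hypothetical stand-in for the objects yielded by phrase.seg(), modelling only the three attributes used in the example, and the frame numbers are taken from the table above.

```python
from collections import namedtuple

# Hypothetical stand-in for the segment objects yielded by phrase.seg();
# only the attributes used in the example above are modelled.
Segment = namedtuple("Segment", ["word", "start_frame", "end_frame"])


def segment_times(segments, frate=100):
    """Return (word, start_seconds, end_seconds) for each segment."""
    return [(s.word, s.start_frame / frate, s.end_frame / frate) for s in segments]


segs = [Segment("go", 46, 63), Segment("forward", 64, 116)]
for word, start, end in segment_times(segs):
    print(f"{word:8s} {start:.2f}s {end:.2f}s")
```

If you pass a different frate to AudioFile, pass the same value here, otherwise the times will be scaled incorrectly.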

Authors

PocketSphinx is ultimately based on Sphinx-II, which in turn was based on some older systems at Carnegie Mellon University. These were released as free software under a BSD-like license thanks to the efforts of Kevin Lenzo. Much of the decoder in particular was written by Ravishankar Mosur (look for "rkm" in the comments), but various other people contributed as well; see the AUTHORS file for more details.

David Huggins-Daines (the author of this document) is guilty^H^H^H^H^Hresponsible for creating PocketSphinx, which added various speed and memory optimizations, fixed-point computation, JSGF support, portability to various platforms, and a somewhat coherent API. He then disappeared for a while.

Nickolay Shmyrev took over maintenance for quite a long time afterwards, and a lot of code was contributed by Alexander Solovets, Vyacheslav Klimkov, and others. The pocketsphinx-python module was originally written by Dmitry Prazdnichnov.

Currently this is maintained by David Huggins-Daines again.
