Official Python bindings for PocketSphinx

PocketSphinx 5.0.0 release candidate 3

This is PocketSphinx, one of Carnegie Mellon University's open source large vocabulary, speaker-independent continuous speech recognition engines.

Although this was once a research system, active development has largely ceased, and it is now very, very far from the state of the art. I am making a release because people are nonetheless using it, and because a number of historical errors in the build system and API needed to be corrected.

The version number is strangely large because there was a "release" that people are using called 5prealpha, and we will use proper semantic versioning from now on.

Please see the LICENSE file for terms of use.

Installation

You should be able to install this with pip for recent platforms and versions of Python:

pip3 install pocketsphinx

Alternatively, you can compile it from the source tree. I highly suggest doing this in a virtual environment (replace ~/ve_pocketsphinx with the virtual environment you wish to create). From the top-level directory:

python3 -m venv ~/ve_pocketsphinx
. ~/ve_pocketsphinx/bin/activate
pip3 install .

On GNU/Linux and maybe other platforms, you must have PortAudio installed for the LiveSpeech class to work (we may add a fall-back to sox in the near future). On Debian-like systems this can be achieved by installing the libportaudio2 package:

sudo apt-get install libportaudio2
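
To confirm that both pieces are in place, you can ask Python itself. This is a minimal sketch using only the standard library. Note that check_environment is a hypothetical helper name, not part of PocketSphinx, and the library name "portaudio" passed to ctypes.util.find_library is an assumption that may not match every platform's naming:

```python
import ctypes.util
import importlib.metadata


def check_environment():
    """Report whether pocketsphinx and PortAudio appear to be installed."""
    try:
        version = importlib.metadata.version("pocketsphinx")
    except importlib.metadata.PackageNotFoundError:
        version = None
    # find_library returns None when no matching shared library is found.
    # The name "portaudio" is an assumption, not taken from this README.
    portaudio = ctypes.util.find_library("portaudio")
    return {"pocketsphinx": version, "portaudio": portaudio}


if __name__ == "__main__":
    for name, found in check_environment().items():
        print(f"{name}: {found or 'not found'}")
```

A "not found" result for PortAudio only matters if you intend to use LiveSpeech; the rest of the module should not need it.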

Usage

See the examples directory for a number of examples of using the library from Python. You can also read the documentation for the Python API or the C API.

It also mostly supports the same APIs as the previous pocketsphinx-python module, as described below.

LiveSpeech

An iterator class for continuous recognition or keyword search from a microphone. For example, to do speech-to-text with the default (some kind of US English) model:

from pocketsphinx import LiveSpeech

for phrase in LiveSpeech():
    print(phrase)

Or to do keyword search:

from pocketsphinx import LiveSpeech

speech = LiveSpeech(keyphrase='forward', kws_threshold=1e-20)
for phrase in speech:
    print(phrase.segments(detailed=True))

With your own model and dictionary:

from pocketsphinx import LiveSpeech, get_model_path

speech = LiveSpeech(
    sampling_rate=16000,  # optional
    hmm=get_model_path('en-us'),
    lm=get_model_path('en-us.lm.bin'),
    dic=get_model_path('cmudict-en-us.dict')
)

for phrase in speech:
    print(phrase)

AudioFile

This is an iterator class for continuous recognition or keyword search from a file. Currently it supports only raw, single-channel, 16-bit PCM data in native byte order.

from pocketsphinx import AudioFile

for phrase in AudioFile("goforward.raw"):
    print(phrase)  # => "go forward ten meters"
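
Since only headerless PCM is accepted, a WAV file has to be stripped of its header first. The following is a rough sketch using the standard library's wave and array modules. The function name wav_to_raw is hypothetical, and the code assumes an uncompressed WAV file, which stores samples in little-endian order:

```python
import array
import sys
import wave


def wav_to_raw(wav_path, raw_path):
    """Write the samples of a mono 16-bit WAV file as headerless PCM.

    Returns the sampling rate read from the WAV header.
    """
    with wave.open(wav_path, "rb") as wav:
        if wav.getnchannels() != 1:
            raise ValueError("expected a single-channel file")
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        rate = wav.getframerate()
        frames = wav.readframes(wav.getnframes())

    samples = array.array("h")
    samples.frombytes(frames)
    # WAV data is little-endian; swap on big-endian hosts for native order.
    if sys.byteorder == "big":
        samples.byteswap()
    with open(raw_path, "wb") as raw:
        samples.tofile(raw)
    return rate
```

The returned sampling rate is worth checking against what your model expects, since the raw file no longer carries that information.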

An example of a keyword search:

from pocketsphinx import AudioFile

audio = AudioFile("goforward.raw", keyphrase='forward', kws_threshold=1e-20)
for phrase in audio:
    print(phrase.segments(detailed=True)) # => "[('forward', -617, 63, 121)]"

With your own model and dictionary:

from pocketsphinx import AudioFile, get_model_path

config = {
    'verbose': False,
    'audio_file': 'goforward.raw',
    'hmm': get_model_path('en-us'),
    'lm': get_model_path('en-us.lm.bin'),
    'dict': get_model_path('cmudict-en-us.dict')
}

audio = AudioFile(**config)
for phrase in audio:
    print(phrase)
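
When pointing at your own model files, a mistyped path is an easy mistake to make, so it may help to check the configuration before building the decoder. This is a small sketch under that assumption. The helper missing_model_files is hypothetical, not part of PocketSphinx, and it simply tests whether each path exists on disk:

```python
import os


def missing_model_files(config, keys=("hmm", "lm", "dict")):
    """Return the config keys whose paths do not exist on disk."""
    return [
        key for key in keys
        if key in config and not os.path.exists(config[key])
    ]


if __name__ == "__main__":
    # Placeholder paths for illustration only.
    config = {"verbose": False, "hmm": "/no/such/model", "lm": os.curdir}
    print(missing_model_files(config))
```

An empty list means every listed path was found; keys that are absent from the dictionary are skipped rather than reported.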

Convert frame indices into time coordinates:

from pocketsphinx import AudioFile

# Frames per Second
fps = 100

for phrase in AudioFile(frate=fps):  # frate (default=100)
    print('-' * 28)
    print('| %5s |  %3s  |   %4s   |' % ('start', 'end', 'word'))
    print('-' * 28)
    for s in phrase.seg():
        print('| %4ss | %4ss | %8s |' % (s.start_frame / fps, s.end_frame / fps, s.word))
    print('-' * 28)

# ----------------------------
# | start |  end  |   word   |
# ----------------------------
# |  0.0s | 0.24s | <s>      |
# | 0.25s | 0.45s | <sil>    |
# | 0.46s | 0.63s | go       |
# | 0.64s | 1.16s | forward  |
# | 1.17s | 1.52s | ten      |
# | 1.53s | 2.11s | meters   |
# | 2.12s |  2.6s | </s>     |
# ----------------------------
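
The conversion itself is just a division by the frame rate, so it can be tried without a decoder. In this sketch the Segment namedtuple is a stand-in for the objects yielded by phrase.seg(), modelling only the three attributes used above, and frame_to_seconds and format_segments are hypothetical helper names. It also pads times to two decimals so the columns line up:

```python
from collections import namedtuple

# Stand-in for the segment objects yielded by phrase.seg(); only the three
# attributes used in the example above are modelled here.
Segment = namedtuple("Segment", "word start_frame end_frame")


def frame_to_seconds(frame, frate=100):
    """Convert a frame index to seconds at the given frame rate."""
    return frame / frate


def format_segments(segments, frate=100):
    """Return one fixed-width line per segment with times in seconds."""
    return [
        "| {:5.2f}s | {:5.2f}s | {:<8} |".format(
            frame_to_seconds(s.start_frame, frate),
            frame_to_seconds(s.end_frame, frate),
            s.word,
        )
        for s in segments
    ]


if __name__ == "__main__":
    demo = [Segment("go", 46, 63), Segment("forward", 64, 116)]
    print("\n".join(format_segments(demo)))
```

If you change frate when constructing the decoder, pass the same value here, otherwise the times will be scaled incorrectly.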

Authors

PocketSphinx is ultimately based on Sphinx-II, which in turn was based on some older systems at Carnegie Mellon University. These were released as free software under a BSD-like license thanks to the efforts of Kevin Lenzo. Much of the decoder in particular was written by Ravishankar Mosur (look for "rkm" in the comments), but various other people contributed as well; see the AUTHORS file for more details.

David Huggins-Daines (the author of this document) is guilty^H^H^H^H^Hresponsible for creating PocketSphinx, which added various speed and memory optimizations, fixed-point computation, JSGF support, portability to various platforms, and a somewhat coherent API. He then disappeared for a while.

Nickolay Shmyrev took over maintenance for quite a long time afterwards, and a lot of code was contributed by Alexander Solovets, Vyacheslav Klimkov, and others. The pocketsphinx-python module was originally written by Dmitry Prazdnichnov.

Currently this is maintained by David Huggins-Daines again.
