Official Python bindings for PocketSphinx

PocketSphinx 5.0.0 release candidate 4

This is PocketSphinx, one of Carnegie Mellon University's open-source, large-vocabulary, speaker-independent continuous speech recognition engines.

Although this was at one point a research system, active development has largely ceased and it has become very, very far from the state of the art. I am making a release, because people are nonetheless using it, and there are a number of historical errors in the build system and API which needed to be corrected.

The version number is strangely large because there was a "release" that people are using called 5prealpha, and we will use proper semantic versioning from now on.

Please see the LICENSE file for terms of use.

Installation

You should be able to install this with pip for recent platforms and versions of Python:

pip3 install pocketsphinx

Alternatively, you can compile it from the source tree. I highly suggest doing this in a virtual environment (replace ~/ve_pocketsphinx with the virtual environment you wish to create), from the top-level directory:

python3 -m venv ~/ve_pocketsphinx
. ~/ve_pocketsphinx/bin/activate
pip3 install .

On GNU/Linux and maybe other platforms, you must have PortAudio installed for the LiveSpeech class to work (we may add a fall-back to sox in the near future). On Debian-like systems this can be achieved by installing the libportaudio2 package:

sudo apt-get install libportaudio2
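
To check quickly whether a PortAudio shared library is visible on the system before trying LiveSpeech, something like the following sketch may help. It only asks the standard library's `ctypes.util.find_library` for a library named `portaudio`. It is not necessarily how pocketsphinx itself locates PortAudio, and the helper name `find_portaudio` is made up for this illustration.

```python
import ctypes.util


def find_portaudio():
    """Return the name or path the system reports for PortAudio, or None.

    Hypothetical helper: it only queries the platform's library lookup and
    does not load the library or involve pocketsphinx in any way.
    """
    return ctypes.util.find_library("portaudio")


location = find_portaudio()
if location is None:
    print("PortAudio not found; LiveSpeech will probably not work")
else:
    print("PortAudio found:", location)
```

A result of `None` suggests the package above (or your platform's equivalent) still needs to be installed.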

Usage

See the examples directory for a number of examples of using the library from Python. You can also read the documentation for the Python API or the C API.

It also mostly supports the same APIs as the previous pocketsphinx-python module, as described below.

LiveSpeech

An iterator class for continuous recognition or keyword search from a microphone. For example, to do speech-to-text with the default (some kind of US English) model:

from pocketsphinx import LiveSpeech
for phrase in LiveSpeech(): print(phrase)

Or to do keyword search:

from pocketsphinx import LiveSpeech

speech = LiveSpeech(keyphrase='forward', kws_threshold=1e-20)
for phrase in speech:
    print(phrase.segments(detailed=True))

With your own model and dictionary:

from pocketsphinx import LiveSpeech, get_model_path

speech = LiveSpeech(
    sampling_rate=16000,  # optional
    hmm=get_model_path('en-us'),
    lm=get_model_path('en-us.lm.bin'),
    dic=get_model_path('cmudict-en-us.dict')
)

for phrase in speech:
    print(phrase)

AudioFile

This is an iterator class for continuous recognition or keyword search from a file. Currently it supports only raw, single-channel, 16-bit PCM data in native byte order.

from pocketsphinx import AudioFile
for phrase in AudioFile("goforward.raw"): print(phrase) # => "go forward ten meters"
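
Because only headerless PCM is accepted, a WAV file has to be stripped of its header first. The following is a minimal sketch using only the standard `wave` module, assuming the input is already mono and 16-bit; it does no resampling. The function name `wav_to_raw` and the file names are hypothetical.

```python
import sys
import wave


def wav_to_raw(wav_path, raw_path):
    """Write the samples of a mono 16-bit WAV file to a headerless raw file.

    Returns the sample rate found in the WAV header so the caller can check
    that it matches what the acoustic model expects.
    """
    with wave.open(wav_path, "rb") as wav:
        if wav.getnchannels() != 1:
            raise ValueError("expected single-channel audio")
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        rate = wav.getframerate()
        frames = wav.readframes(wav.getnframes())

    if sys.byteorder == "big":
        # WAV samples are little-endian; swap the two bytes of each sample
        # so the output is in native byte order on big-endian machines.
        swapped = bytearray(frames)
        swapped[0::2], swapped[1::2] = frames[1::2], frames[0::2]
        frames = bytes(swapped)

    with open(raw_path, "wb") as raw:
        raw.write(frames)
    return rate
```

For example, `wav_to_raw("speech.wav", "speech.raw")` would produce a file that can then be passed to `AudioFile`, provided the sample rate is one the model supports.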

An example of a keyword search:

from pocketsphinx import AudioFile

audio = AudioFile("goforward.raw", keyphrase='forward', kws_threshold=1e-20)
for phrase in audio:
    print(phrase.segments(detailed=True)) # => "[('forward', -617, 63, 121)]"
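
The tuples in that output are easier to work with when the fields are named. The sketch below assumes the order is (word, score, start frame, end frame); that order is inferred from the sample output and is not stated in this document, so treat it as an assumption. `Hit` and `to_hits` are hypothetical names.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    # Field order is an ASSUMPTION inferred from the sample output above.
    word: str
    score: int
    start_frame: int
    end_frame: int


def to_hits(detailed_segments):
    """Wrap each 4-tuple from segments(detailed=True) in a named record."""
    return [Hit(*segment) for segment in detailed_segments]


# The list literal stands in for phrase.segments(detailed=True).
hits = to_hits([("forward", -617, 63, 121)])
for hit in hits:
    print(hit.word, hit.start_frame, hit.end_frame)
```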

With your own model and dictionary:

from pocketsphinx import AudioFile, get_model_path

config = {
    'verbose': False,
    'audio_file': 'goforward.raw',
    'hmm': get_model_path('en-us'),
    'lm': get_model_path('en-us.lm.bin'),
    'dict': get_model_path('cmudict-en-us.dict')
}

audio = AudioFile(**config)
for phrase in audio:
    print(phrase)

Convert frame indices into time coordinates (seconds):

from pocketsphinx import AudioFile

# Frames per Second
fps = 100

for phrase in AudioFile(frate=fps):  # frate (default=100)
    print('-' * 28)
    print('| %5s |  %3s  |   %4s   |' % ('start', 'end', 'word'))
    print('-' * 28)
    for s in phrase.seg():
        print('| %4ss | %4ss | %8s |' % (s.start_frame / fps, s.end_frame / fps, s.word))
    print('-' * 28)

# ----------------------------
# | start |  end  |   word   |
# ----------------------------
# |  0.0s | 0.24s | <s>      |
# | 0.25s | 0.45s | <sil>    |
# | 0.46s | 0.63s | go       |
# | 0.64s | 1.16s | forward  |
# | 1.17s | 1.52s | ten      |
# | 1.53s | 2.11s | meters   |
# | 2.12s |  2.6s | </s>     |
# ----------------------------
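
The table above mixes "0.0s", "0.24s" and "2.6s" because the quotient is printed as is. The sketch below is one way to get a fixed number of decimals, under the assumption that each segment exposes `word`, `start_frame` and `end_frame` as in the loop above. `Segment` is only a stand-in for the objects yielded by `phrase.seg()`, and the helper names are hypothetical.

```python
from collections import namedtuple

# Stand-in for the segment objects used above (assumed attribute names).
Segment = namedtuple("Segment", "word start_frame end_frame")


def frames_to_seconds(frame, frate=100):
    """Convert a frame index to seconds, given the frame rate in frames/s."""
    return frame / frate


def format_row(seg, frate=100):
    """Format one table row with times rounded to two decimal places."""
    start = frames_to_seconds(seg.start_frame, frate)
    end = frames_to_seconds(seg.end_frame, frate)
    return "| {:5.2f}s | {:5.2f}s | {:<8} |".format(start, end, seg.word)


for seg in [Segment("go", 46, 63), Segment("forward", 64, 116)]:
    print(format_row(seg))
```

The same `frate` value passed to `AudioFile` should be passed here, since the conversion is only correct when both agree.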

Authors

PocketSphinx is ultimately based on Sphinx-II, which in turn was based on some older systems at Carnegie Mellon University. These were released as free software under a BSD-like license thanks to the efforts of Kevin Lenzo. Much of the decoder in particular was written by Ravishankar Mosur (look for "rkm" in the comments), but various other people contributed as well; see the AUTHORS file for more details.

David Huggins-Daines (the author of this document) is guilty^H^H^H^H^Hresponsible for creating PocketSphinx, which added various speed and memory optimizations, fixed-point computation, JSGF support, portability to various platforms, and a somewhat coherent API. He then disappeared for a while.

Nickolay Shmyrev took over maintenance for quite a long time afterwards, and a lot of code was contributed by Alexander Solovets, Vyacheslav Klimkov, and others. The pocketsphinx-python module was originally written by Dmitry Prazdnichnov.

Currently this is maintained by David Huggins-Daines again.
