An even smaller speech recognizer

Reason this release was yanked:

horrible bug

SoundSwallower: an even smaller speech recognizer

"Time and change have a voice; eternity is silent. The human ear is always searching for one or the other."
Leena Krohn, Datura, or a delusion we all see

SoundSwallower is a refactored version of PocketSphinx intended primarily for embedding in web applications. The goal is not to provide a fast implementation of large-vocabulary continuous speech recognition, but rather to provide a small implementation of simple, useful speech technologies.

With that in mind, the current version is limited to finite-state grammar recognition. In addition, the eternally problematic and badly designed audio library and all other external dependencies have been removed.

Compiling SoundSwallower

Currently SoundSwallower can be built in several different ways. To build the C shared library, run CMake in the standard way:

mkdir build
cd build
cmake ..
make
make test
make install

Note that this isn't terribly useful as there is no command-line frontend. You probably want to target JavaScript or Python.

Installing the Python module and CLI

The SoundSwallower command-line interface is a Python module (soundswallower.cli) and can be installed using pip. Doing this in a virtualenv is highly recommended. You can install it from PyPI:

pip install soundswallower

Or compile from source:

pip install .

For development, you can install it in-place, but please make sure to remove any existing global installation:

pip uninstall soundswallower
pip install -e .
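
Since installing into a virtualenv is recommended, a quick check from Python can confirm that one is active before running pip. This is a minimal sketch using only the standard library; the function name in_virtual_environment is made up for illustration and is not part of SoundSwallower.

```python
import sys


def in_virtual_environment():
    """Return True if the running interpreter appears to be inside a virtualenv/venv."""
    # The venv module makes sys.prefix differ from sys.base_prefix;
    # older versions of the virtualenv tool set sys.real_prefix instead.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or base_prefix != sys.prefix


if not in_virtual_environment():
    print("Warning: no virtual environment seems to be active")
```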

The command-line interface supports JSGF grammars and word-level forced alignment for one or more input files, for example:

soundswallower --align tests/data/goforward.txt tests/data/goforward.wav
soundswallower --align-text "go forward ten meters" tests/data/goforward.wav
soundswallower --grammar tests/data/goforward.gram tests/data/goforward.wav

Note that multiple input files are not particularly useful for --align or --align-text as they will simply (try to) align the same text to each file. The output results (a list of time-aligned words) can be written to a JSON file with --output.
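
The same alignment can be driven from a script by calling the command line through subprocess. The sketch below uses only the flags shown above (--align-text and --output) and assumes the soundswallower executable is on the PATH. The helper names are hypothetical, and the exact layout of the JSON output is not described here, so the result is returned as parsed without assuming any particular keys.

```python
import json
import subprocess
from pathlib import Path


def build_align_command(text, wav_paths, output=None):
    """Build the argument list for a text alignment run (hypothetical helper)."""
    cmd = ["soundswallower", "--align-text", text]
    if output is not None:
        cmd += ["--output", str(output)]
    # Input audio files go last, as in the examples above.
    cmd += [str(path) for path in wav_paths]
    return cmd


def run_alignment(text, wav_path, output):
    """Run the CLI and return whatever JSON it wrote to `output`."""
    subprocess.run(build_align_command(text, [wav_path], output), check=True)
    return json.loads(Path(output).read_text(encoding="utf-8"))
```

For example, run_alignment("go forward ten meters", "tests/data/goforward.wav", "goforward.json") would run the second command above and load its results.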

See also the full documentation of the Python API.

Compiling to JavaScript/WebAssembly

To build the JavaScript library, use CMake with Emscripten:

cd js
emcmake cmake ..
emmake make

This will create js/soundswallower.js and js/soundswallower.wasm in the jsbuild directory, which you can then include in your projects. Demo applications can be seen at https://github.com/dhdaines/alignment-demo and https://github.com/dhdaines/soundswallower-demo.

For more details on the JavaScript implementation and API, see js/README.md.

See also the documentation of the JavaScript API.

Creating binary distributions for Python

To build the Python extension, I suggest using build, as it ensures that everything is done in a totally clean environment. Run this from the top-level directory:

python -m build

In all cases the resulting binary wheel (found in dist) is self-contained and should not need any other components aside from the system libraries. To create wheels that are compatible with multiple Linux distributions, see the instructions in README.manylinux.md.
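
To see which interpreters and platforms a wheel in dist targets, its file name can be split into the standard wheel fields (distribution, version, optional build tag, Python tag, ABI tag, platform tag). The following is a small sketch of that parsing; WheelTags and parse_wheel_filename are names made up for this example and are not part of SoundSwallower or build.

```python
from typing import List, NamedTuple


class WheelTags(NamedTuple):
    distribution: str
    version: str
    python_tags: List[str]
    abi_tags: List[str]
    platform_tags: List[str]


def parse_wheel_filename(filename):
    """Split a wheel file name into its name, version and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename}")
    parts = filename[: -len(".whl")].split("-")
    # Five fields, or six when an optional build tag is present.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected number of fields in: {filename}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    # Each tag field may hold several "."-separated alternatives.
    return WheelTags(
        parts[0],
        parts[1],
        python_tag.split("."),
        abi_tag.split("."),
        platform_tag.split("."),
    )


tags = parse_wheel_filename("soundswallower-0.3.1-cp310-cp310-win_amd64.whl")
print(tags.python_tags, tags.platform_tags)
```

A wheel built for several manylinux variants simply lists more than one entry in platform_tags.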

Compiling on Windows in Visual Studio Code

The method for building distributions noted above will also work on Windows, from within a Conda environment, provided you have Visual Studio or the Visual Studio Build Tools installed. This is somewhat magic.

If you don't have Conda, then what you will need to do is:

  • Install the Visual Studio Build Tools. Unfortunately, a direct link does not seem to exist, but you can find them on Microsoft's downloads page. The 2019 version is probably the best one to use as it is compatible with all recent versions of Windows.

  • Install the version of Python you wish to use.

  • Launch the Visual Studio command-line prompt from the Start menu. Note that if your Python is 64-bit (recommended), you must be sure to launch the "x64 Native Tools Command Prompt".

  • Create and activate a virtual environment using your Python binary, which may or may not be in your AppData directory:

      %USERPROFILE%\AppData\Local\Programs\Python\Python310\python -m venv py310
      py310\scripts\activate
    
  • Now you can build wheels with pip, using the same method mentioned above.
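
Because the choice of command prompt depends on whether your Python is 64-bit, it can help to check this from the interpreter itself. This is a minimal sketch using the standard library only; python_bitness is a hypothetical helper name.

```python
import struct


def python_bitness():
    """Return the pointer width of the running interpreter in bits (32 or 64)."""
    # "P" is the struct format code for a native pointer (void *).
    return struct.calcsize("P") * 8


print(f"This Python is {python_bitness()}-bit")
```

If this reports 64, use the x64 prompt described in the steps above.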

Download files

Source Distribution

soundswallower-0.3.1.tar.gz (10.3 MB)

Uploaded: Source

Built Distributions

soundswallower-0.3.1-cp310-cp310-win_amd64.whl (9.5 MB)

Uploaded: CPython 3.10, Windows x86-64

soundswallower-0.3.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (9.6 MB)

Uploaded: CPython 3.10, manylinux: glibc 2.17+ x86-64

soundswallower-0.3.1-cp39-cp39-win_amd64.whl (9.5 MB)

Uploaded: CPython 3.9, Windows x86-64

soundswallower-0.3.1-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (9.6 MB)

Uploaded: CPython 3.9, manylinux: glibc 2.17+ x86-64, manylinux: glibc 2.5+ x86-64

soundswallower-0.3.1-cp38-cp38-win_amd64.whl (9.5 MB)

Uploaded: CPython 3.8, Windows x86-64

soundswallower-0.3.1-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (9.6 MB)

Uploaded: CPython 3.8, manylinux: glibc 2.17+ x86-64, manylinux: glibc 2.5+ x86-64

soundswallower-0.3.1-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (9.6 MB)

Uploaded: CPython 3.7m, manylinux: glibc 2.17+ x86-64, manylinux: glibc 2.5+ x86-64

File details

Details for the file soundswallower-0.3.1.tar.gz.

File metadata

  • Download URL: soundswallower-0.3.1.tar.gz
  • Upload date:
  • Size: 10.3 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.0 CPython/3.8.10

File hashes

Hashes for soundswallower-0.3.1.tar.gz
Algorithm Hash digest
SHA256 1fb4324ee0030544c3e882f0b04fdbd8e18b13d8f6201abb6e86b18720301b49
MD5 2f7937bbbd27639e68523b5ff3ca560c
BLAKE2b-256 026e7d4add7c5875be6e72b31f9e370aa8af86f2836534f9f5cbeef742eb6de9
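
A downloaded file can be checked against the published SHA256 digest with Python's hashlib. The sketch below is one way to do that, not an official verification tool; sha256_of and matches_published are illustrative names, and the path passed in is assumed to be wherever you saved the download.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, expected_hex):
    """Return True if the file's SHA256 digest equals the published one."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For the source distribution, the expected value would be the SHA256 digest listed above for soundswallower-0.3.1.tar.gz.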

File details

Details for the file soundswallower-0.3.1-cp310-cp310-win_amd64.whl.

File hashes

Hashes for soundswallower-0.3.1-cp310-cp310-win_amd64.whl
Algorithm Hash digest
SHA256 37b01ae0460b1f6a3cb2a5d96370fd7b742bf050777cda1d235126ae5473a7a9
MD5 0fead7cbec566cfdd98a99cc641e66a2
BLAKE2b-256 ec29711355f4f44184889bef7822aa292cf31d1c086ac643b8d6c73aca411095

File details

Details for the file soundswallower-0.3.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for soundswallower-0.3.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 674804b2de95d94d48cbffe62d7780b94f0bd871655c2188fc83f01b16c11486
MD5 5321476c79bc8a2badf41fd27d290f01
BLAKE2b-256 97af0da45b07312077bcf55c692bce743b5909991847bdc3a8f9447b42abb5c5

File details

Details for the file soundswallower-0.3.1-cp39-cp39-win_amd64.whl.

File hashes

Hashes for soundswallower-0.3.1-cp39-cp39-win_amd64.whl
Algorithm Hash digest
SHA256 8169eb230b4d90cef905bc5e509d747f970d7eb06b32372fcf36a755688c1336
MD5 b5c24d1ab1f1c8b3fcc270d036098ac8
BLAKE2b-256 69aa297440c314c4cb8e86a2e490caa2530ce492b8a291ec45185b065b8aed3d

File details

Details for the file soundswallower-0.3.1-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for soundswallower-0.3.1-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 dff7323e26ab59cc49bd910ba98255eb02444240af6ae62e2bb374d39275bf53
MD5 155281097d5158c9a31ae3b6bdcf389a
BLAKE2b-256 9c5bd6fe01e977bc46170ef46967711dd9d277f11024c6d666636b968ff00430

File details

Details for the file soundswallower-0.3.1-cp38-cp38-win_amd64.whl.

File hashes

Hashes for soundswallower-0.3.1-cp38-cp38-win_amd64.whl
Algorithm Hash digest
SHA256 4ec41ac5b75e9d1a48b42be4afadb2e23cfacb8bbd430ea1ca6e9cd5205efafb
MD5 770f548f093a0005940a4aac455149c7
BLAKE2b-256 da9653d0497c529253a3bcdd2a8672b88bea71fa11a58b8ce8b58df8c1a5dc3b

File details

Details for the file soundswallower-0.3.1-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for soundswallower-0.3.1-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 e26a30fad3016bbb503a25b4859e77669a4a0ac4c12016bac84d35ca58fd35f2
MD5 91e592b160f5b814ae77741cf7943eb7
BLAKE2b-256 ca254c5120bae18363559d7b0cee0a8b824e2e64a38ae7a91710a0cba4383984

File details

Details for the file soundswallower-0.3.1-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for soundswallower-0.3.1-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 0bb3ee277883447a0e5c06b74c79055b329d907f5aebb2038a864a5f9e8ead52
MD5 62908fffade8e7c7db7d0b46830e4393
BLAKE2b-256 a46240010a9f9fab06cc1fc21542ccaa99d8eccd4c98edac1aae48127615ffd9
