Control your computer by voice!

Features

  • Define your own personal commands in your home directory (outside the Voca source tree).

  • Different commands are available for each app, in addition to globally available commands.

  • When you edit a command file, your new commands are available as soon as you save. No need to fiddle with reloading.

  • If there’s a fatal error in a command file, don’t worry – Voca simply uses a backup from the last time that file worked.

  • Your commands are executed asynchronously, so you never need to wait for one to finish before executing the next.

  • Get immediate visual feedback during an utterance: in eager mode, Voca can start acting on your commands as soon as it hears the first word of your utterance. Switch to strict mode and Voca will wait until the end of your utterance.

  • Voca uses a modern parser, so your grammar can be arbitrarily complex.

  • Use any speech engine you like: Voca takes its input as newline-separated JSON on stdin.

  • Voca generates detailed structured logs you can use for debugging or analyzing your command history.

  • Voca provides adapters for current Caster and Dragonfly commands, so you can keep using commands you like – just install Voca alongside Caster. More plugins and adapters for other systems can be added.

  • Voca has a pluggable architecture. Install independent plugins for controlling your apps, without needing to fork the main repository.

  • Voca uses Python 3.7+, so all the newest Python features are available.

  • Voca is continuously tested in CI, and maintains test coverage checks.

  • Free and open source, licensed GPLv3.
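
Because Voca reads newline-separated JSON on stdin, a different speech engine could in principle be connected with a small shim. The sketch below is only an illustration and is not part of Voca: the helper names make_transcript_line and extract_transcript are hypothetical, and the field layout is copied from the sample message shown in the Usage section. Whether the manager requires every one of those fields is an assumption.

```python
import json
import sys
import uuid


def make_transcript_line(text, final=True):
    """Build one newline-terminated JSON message for a recognized phrase.

    The layout mirrors the sample message in the Usage section. It is
    assumed, not confirmed, that the manager needs all of these fields.
    """
    message = {
        "status": 0,
        "segment": 0,
        "result": {"hypotheses": [{"transcript": text}], "final": final},
        "id": str(uuid.uuid4()),  # a fresh random id for each message
    }
    # json.dumps never emits raw newlines, so each message stays on one line.
    return json.dumps(message) + "\n"


def extract_transcript(line):
    """Return the transcript of the first hypothesis in one JSON line."""
    message = json.loads(line)
    return message["result"]["hypotheses"][0]["transcript"]


# Emit a single message on stdout, as a speech-engine shim might.
sys.stdout.write(make_transcript_line("say bravo"))
```

Saved under a hypothetical name such as shim.py, the output could be piped into the manager in the same way as the listener's output in the Usage section.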

Limitations

  • Nobody has used it at all, so I don’t know if it’s useful.

  • Voca does not provide a speech engine; it requires input from an existing one like Dragon, Kaldi, or DeepSpeech.

  • Multiple platforms are planned, and the basic outline is there, but tests are not currently passing on OSX or Windows. Linux is working.

  • The documentation is minimal.

  • The Caster commands for specific programming languages aren’t yet implemented.

  • There are several ways to define commands, but a single obvious way would be better.

Installation

Some of Voca’s dependencies are not yet available on PyPI, so it can’t be installed directly with pip. In bash, run these commands:

    # Clone the repository.
    git clone git@github.com:python-voca/voca.git
    # Change working directory to inside the repository.
    cd voca
    # Create a virtual environment with Python 3.7.
    virtualenv -p python3.7 venv
    # Install the dependencies into the virtual environment.
    venv/bin/pip install -r dev-requirements.txt -r requirements.txt
    # Install Voca into the virtual environment.
    venv/bin/pip install --no-deps -e .
    # Activate the virtual environment to add its packages onto your PATH.
    source venv/bin/activate

Usage

  • Start the Kaldi speech server with Docker by running ./run-kaldi-server.sh, or use another server, such as the online servers at http://voxhub.io/silvius. (Voca is not affiliated with Silvius, but is compatible.)

  • Detect which audio device you’re using as the microphone by running voca --mic with different --device numbers until one of them shows output.

  • Send audio to the server and receive transcripts on stdout by running voca listen -d 2, replacing 2 with your microphone’s device number from the previous step. Try saying something and check that you get JSON output. Cancel this process with control-c.

  • Check that the manager is working by sending it a transcript. The -i option says which command module you want to load.

    voca manage -i voca.plugins.basic   <<EOF
    {"status": 0, "segment": 0, "result": {"hypotheses": [{"transcript": "say bravo"}], "final": true}, "id": "eec37b79-f55e-4bf8-9afe-01f278902599"}
    EOF

    It should type the letter b on your screen. Cancel this process with control-c.

  • Start the listener and manager, piping the listener’s transcripts into the manager.

    voca listen -d <device_number>  | voca manage -i voca.plugins.basic

    Speak into your microphone and say charlie. It should type the letter c on your screen. Cancel this process with control-c.

  • Find the location of your config directory in the output of voca --help, and add new commands in any .py file at {config_dir}/user_modules/*.py. Then run voca manage -i user_modules.my_module (replacing my_module with the name of your file, excluding the .py suffix).

  • Try using the Caster commands.

    pip install --no-deps castervoice # Exclude Caster dependencies like wxpython.
    voca listen -d <device_number>  | VOCA_PATCH_CASTER=1 voca manage

    For example, in Visual Studio Code, say new file. It should open a new file in the editor by automatically pressing control-n.

  • Structured logs are stored in {config_dir}/logs/. Examine them with eliot-tree --color=always -l0 {filepath} | less -SR. They’ll show how your commands flowed through the program, and will display the full grammar that was active during each command.
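
For analyzing command history beyond what eliot-tree displays, the logs can also be read programmatically. The sketch below assumes the log files use Eliot's usual format of one JSON object per line, with task_uuid and task_level fields on every message; the function names group_by_task and summarize are hypothetical, and the action type names in any sample data are made up rather than taken from Voca.

```python
import json
from collections import defaultdict


def group_by_task(lines):
    """Group Eliot messages by task_uuid and order each group by task_level."""
    tasks = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between messages
        message = json.loads(line)
        tasks[message["task_uuid"]].append(message)
    for messages in tasks.values():
        # task_level is a list of integers, so list comparison gives
        # the order in which the messages were logged within a task.
        messages.sort(key=lambda m: m["task_level"])
    return dict(tasks)


def summarize(tasks):
    """Map each task_uuid to the action or message types it logged, in order."""
    return {
        task_id: [m.get("action_type") or m.get("message_type") for m in messages]
        for task_id, messages in tasks.items()
    }
```

For example, passing an open log file to group_by_task and the result to summarize gives a compact per-task overview that can be filtered or counted with ordinary Python.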

Documentation

Prerequisites:

  • A speech engine, e.g. a Kaldi/Silvius server, run via the included Docker script or accessed on the Silvius website

  • Microphone

  • Python 3

Development

  • git clone this repo and cd inside

  • To start the kaldi server and workers in docker, plus a client listening to your mic, run ./run-kaldi-server.sh

  • ./pycli init will create a virtualenv and install the package into it

  • Run ./venv/bin/voca manage to start the manager process, which accepts commands on stdin. The manager will start its workers.

To run all the tests, run:

tox

To combine the coverage data from all the tox environments, run:

Windows

set PYTEST_ADDOPTS=--cov-append
tox

Other

PYTEST_ADDOPTS=--cov-append tox
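
The two variants above differ only in how the environment variable is set. As a platform-independent alternative, a small Python wrapper can set PYTEST_ADDOPTS and then start tox. This is a sketch, not a script shipped with Voca; the function names tox_environment and run_tox are hypothetical.

```python
import os
import subprocess


def tox_environment(base_env=None):
    """Return a copy of the environment with --cov-append in PYTEST_ADDOPTS.

    Any options already present in PYTEST_ADDOPTS are kept.
    """
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("PYTEST_ADDOPTS", "")
    env["PYTEST_ADDOPTS"] = (existing + " --cov-append").strip()
    return env


def run_tox():
    """Run tox with the modified environment and return its exit code.

    Requires tox to be installed and on PATH.
    """
    return subprocess.call(["tox"], env=tox_environment())
```

Calling run_tox() from the repository root should then behave like the shell commands above on either platform.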

Changelog

0.1.0 (2019-04-30)

  • First release on PyPI.
