Control your computer by voice!

Features

  • Define your own personal commands in your home directory (outside the Voca source tree).

  • Different commands are available for each app, in addition to globally available commands.

  • When you edit a command file, your new commands are available as soon as you save. No need to fiddle with reloading.

  • If there’s a fatal error in a command file, don’t worry – Voca simply uses a backup from the last time that file worked.

  • Your commands are executed asynchronously, so you never need to wait for one to finish before executing the next.

  • Get immediate visual feedback during an utterance – in eager mode, Voca can start acting on your commands as soon as it hears the first word of your utterance. Switch to strict mode and Voca will wait until the end of your utterance.

  • Voca uses a modern parser, so your grammar can be arbitrarily complex.

  • Use any speech engine you like – Voca takes its input as newline-separated JSON on stdin.

  • Voca generates detailed structured logs you can use for debugging or analyzing your command history.

  • Voca provides adapters for current Caster and Dragonfly commands, so you can keep using commands you like – just install Voca alongside Caster. More plugins and adapters for other systems can be added.

  • Voca has a pluggable architecture. Install independent plugins for controlling your apps, without needing to fork the main repository.

  • Voca uses Python 3.7+, so all the newest Python features are available.

  • Voca is continuously tested in CI, and maintains test coverage checks.

  • Free and open source, licensed GPLv3.
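
As a rough illustration of the newline-separated JSON input mentioned above, the sketch below wraps plain-text transcripts from some other speech engine in one JSON message per line. The message shape is copied from the example transcript in the Usage section. The function names (make_message, encode_stream, run_filter) are hypothetical and not part of Voca. Which fields Voca actually requires, and what status, segment and id mean, are assumptions here.

```python
import json
import sys
import uuid


def make_message(transcript, segment=0, final=True):
    """Wrap a transcript in the message shape shown in the Usage example.

    Assumption: the meaning of "status", "segment" and "id" is guessed
    from that single example; check them against your speech engine.
    """
    return {
        "status": 0,
        "segment": segment,
        "result": {"hypotheses": [{"transcript": transcript}], "final": final},
        "id": str(uuid.uuid4()),
    }


def encode_stream(lines):
    """Yield one newline-terminated JSON document per non-empty input line."""
    segment = 0
    for line in lines:
        text = line.strip()
        if not text:
            continue  # Skip blank lines instead of emitting empty transcripts.
        # json.dumps without indent never emits a newline, so each
        # message occupies exactly one line.
        yield json.dumps(make_message(text, segment=segment)) + "\n"
        segment += 1


def run_filter(source=None, sink=None):
    """Copy transcripts from source to sink, flushing after every message."""
    source = sys.stdin if source is None else source
    sink = sys.stdout if sink is None else sink
    for encoded in encode_stream(source):
        sink.write(encoded)
        sink.flush()
```

Calling run_filter() from a small script would turn that script into a filter whose output could be piped into voca manage. It flushes after each message so that the consumer does not wait on a full buffer.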

Limitations

  • Nobody has used it at all, so I don’t know if it’s useful.

  • Voca does not provide a speech engine; it requires input from an existing one like Dragon, Kaldi, or DeepSpeech.

  • Multiple platforms are planned, and the basic outline is there, but tests are not currently passing on macOS or Windows. Linux is working.

  • The documentation is minimal.

  • The Caster commands for specific programming languages aren’t yet implemented.

  • There are several ways to define commands, but a single obvious way would be better.

Installation

Some of Voca’s dependencies are not yet available on PyPI, so it can’t be installed directly with pip. In bash, run these commands:

# Clone the repository.
git clone git@github.com:python-voca/voca.git
# Change working directory to inside the repository.
cd voca
# Create a virtual environment with Python 3.7.
virtualenv -p python3.7 venv
# Install the dependencies into the virtual environment.
venv/bin/pip install -r dev-requirements.txt -r requirements.txt
# Install Voca into the virtual environment.
venv/bin/pip install --no-deps -e .
# Activate the virtual environment to add its packages onto your PATH.
source venv/bin/activate

Usage

  • Start the speech server with Docker by running ./run-kaldi-server.sh, or use another server, such as the online servers at http://voxhub.io/silvius. (Voca is not affiliated with Silvius, but is compatible.)

  • Detect which audio device you’re using as the microphone by running voca --mic with different --device numbers until one of them shows output.

  • Send audio to the server and receive transcripts on stdout by running voca listen -d 2, replacing 2 with your microphone’s device number from the previous step. Try saying something and check that you get JSON output. Cancel this process with control-c.

  • Check that the manager is working by sending it a transcript. The -i option says which command module you want to load.

    voca manage -i voca.plugins.basic   <<EOF
    {"status": 0, "segment": 0, "result": {"hypotheses": [{"transcript": "say bravo"}], "final": true}, "id": "eec37b79-f55e-4bf8-9afe-01f278902599"}
    EOF

    It should type the letter b on your screen. Cancel this process with control-c.

  • Start the listener and manager, piping the listener’s transcripts into the manager.

    voca listen -d <device_number>  | voca manage -i voca.plugins.basic

    Speak into your microphone and say charlie. It should type the letter c on your screen. Cancel this process with control-c.

  • See the location of your config directory in voca --help, and add new commands in any .py file at {config_dir}/user_modules/*.py. Run voca manage -i user_modules.my_module, replacing my_module with the name of your file, excluding the .py suffix.

  • Try using the Caster commands.

    pip install --no-deps castervoice # Exclude Caster dependencies like wxpython.
    voca listen -d <device_number>  | VOCA_PATCH_CASTER=1 voca manage

    For example, in Visual Studio Code, say new file. It should open a new file in the editor by automatically pressing control-n.

  • Structured logs are stored in {config_dir}/logs/. Examine them with eliot-tree --color=always -l0 {filepath} | less -SR. They’ll show how your commands flowed through the program, and will display the full grammar that was active during each command.
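
For a quick summary without eliot-tree, the log files can also be read directly. The sketch below assumes the files under {config_dir}/logs/ are standard Eliot output, one JSON object per line, with the usual Eliot fields action_type and action_status. The function name summarize is hypothetical, and the action types Voca actually emits are not listed here, so treat this as a starting point rather than a description of Voca's logs.

```python
import collections
import json


def summarize(lines):
    """Count started and failed actions per Eliot action_type.

    Assumption: each line is one Eliot record serialized as a JSON object.
    """
    started = collections.Counter()
    failed = collections.Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # Ignore lines that are not valid JSON.
        if not isinstance(record, dict):
            continue  # Eliot records are JSON objects.
        action = record.get("action_type")
        if action is None:
            continue  # Plain messages carry message_type instead.
        status = record.get("action_status")
        if status == "started":
            started[action] += 1
        elif status == "failed":
            failed[action] += 1
    return started, failed
```

To use it, open a log file and pass the file object to summarize. An action type with a high failure count is a reasonable place to start reading in eliot-tree.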

Documentation

Prerequisites:

  • A speech engine, e.g. a Kaldi/Silvius server, run via the included Docker script or used through its website

  • Microphone

  • Python 3

Development

  • git clone this repo and cd inside

  • To start the kaldi server and workers in docker, plus a client listening to your mic, run ./run-kaldi-server.sh

  • ./pycli init will create a virtualenv and install the package into it

  • Run ./venv/bin/voca manage to start the manager process, which accepts commands on stdin. The manager will start its workers.

To run all the tests, run:

tox

Note: to combine the coverage data from all the tox environments, run:

Windows

set PYTEST_ADDOPTS=--cov-append
tox

Other

PYTEST_ADDOPTS=--cov-append tox

Changelog

0.1.0 (2019-04-30)

  • First release on PyPI.
